This package contains the downloads associated with

  Susan Howlett and Mark Dras (2011) "Clause Restructuring For SMT Not
  Absolutely Helpful" in Proceedings of the 49th Annual Meeting of the
  Association for Computational Linguistics: Human Language Technologies

Package last updated: 22 April 2011

This code is provided for your use free of charge and without warranty.
Please cite the above publication if you use this code in your research.

Please address questions and comments to Suzy Howlett (suzy@showlett.id.au).

==========

Package Contents
----------------

1. README

2. HD10.README - A copy of the README file from the Howlett and Dras
   (2010) code bundle

3. Code:
   analysis/
      ACL11oracles
      oracle.py
   parsing/
      distributed_parser
      parse_german
      parse_german_lc
   preprocessing/
      Collins_baseline
      Collins_baseline_lc
   reordering/
      Collins_rules.py
      Collins_rules_test.py

4. Notes:
   notes/
      ger_tiger_10pc.txt
      ger_tiger_25pc.txt
      ger_tiger_half.txt
      ger_tiger_half_lc.txt
      ger_tiger_lc.txt
      moses-diff.txt
      oracles-output.txt

5. Experiment Management System configuration files:
   configs/
      [40 configuration files: see below]

==========

README Contents
---------------

1. System setup
   1.1 Moses
      1.1.1 Modifications for our cluster
   1.2 Howlett and Dras (2010) system
   1.3 Berkeley parser
2. Data
   2.1 Replicating Collins et al. (2005)
   2.2 Other experiments
3. Changes from Howlett and Dras (2010)
   3.1 Modifications to scripts
   3.2 Additional scripts
   3.3 Additional parsing models
   3.4 Experiment configuration files

==========

1. SYSTEM SETUP

1.1 Moses

We use revision 3799 from the Moses subversion repository, with the SRILM
toolkit. For installation instructions, see the Moses website:
http://www.statmt.org/moses/.

1.1.1 Modifications for our cluster

The experiments in this paper were run across a cluster using TORQUE for
job scheduling. The TORQUE qsub command differs from that used in Moses;
the accompanying file notes/moses-diff.txt contains the diff between our
baseline and the repository revision.
These changes should not affect the performance of the system.

Our cluster setup also affects the configuration files for the Moses
Experiment Management System, specifically the "qsub-settings" variables.
These settings instruct the cluster to assign 8 or 16 CPUs to each job.
This many CPUs was not actually needed; it was a workaround in our setup
to ensure that not too many jobs were assigned to each cluster node.
These changes should likewise not affect the results of the systems.

1.2 Howlett and Dras (2010) system

Much of the code is reused from our earlier work, Howlett and Dras (2010),
cited in this paper. The scripts are free-standing and no installation is
required. This code bundle includes a copy of the README file from the
earlier paper's code package, which describes the usage of each script.

From the earlier paper's code package, we use the scripts
analysis/oracle.py, parsing/distributed_parser, parsing/parse_german,
preprocessing/Collins_baseline, and reordering/Collins_rules.py. We have
made small changes to some of the scripts, and created some additional
scripts based on them. This package contains our new scripts and modified
versions, along with copies of the original scripts that we used
unchanged. We outline our changes in Section 3.

We use the Howlett and Dras (2010) parsing model, ger_tiger.gr. In
addition, we create five new parsing models by the same method. These are
ger_tiger_lc.gr (lowercased), ger_tiger_half.gr (using 50% parsing data),
ger_tiger_half_lc.gr (using 50% parsing data, lowercased),
ger_tiger_25pc.gr (using 25% parsing data) and ger_tiger_10pc.gr (using
10% parsing data).

1.3 Berkeley parser

We use BerkeleyParser.jar from revision 14 of the Berkeley parser
subversion repository, as in our earlier work.

==========

2. DATA

2.1 Replicating Collins et al. (2005)

A copy of the data used in Collins et al. (2005) was provided by Michael
Collins. This data came already tokenised and lowercased.
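The "qsub-settings" change described in Section 1.1.1 can be illustrated
with a fragment like the following. This is a hedged sketch only: it
assumes TORQUE's standard resource-request syntax, and the placement and
exact values in our actual configuration files may differ.

```ini
# Illustrative EMS configuration fragment (not copied from our configs).
# "-l nodes=1:ppn=8" is standard TORQUE syntax requesting 8 CPUs on a
# single node, which keeps the scheduler from packing too many jobs
# onto one machine.
[GENERAL]
qsub-settings = "-l nodes=1:ppn=8"
```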
2.2 Other experiments

All remaining experiments use data from the 2009 and 2010 Workshops on
Statistical Machine Translation, available from
http://www.statmt.org/wmt09/translation-task.html and
http://www.statmt.org/wmt10/translation-task.html. From the 2009
Workshop, we use the parallel corpus training data
(training-parallel.tar) and the additional development sets
(additional-dev.tgz). From the 2010 Workshop, we use the development sets
(dev.tgz).

==========

3. CHANGES FROM HOWLETT AND DRAS (2010)

3.1 Modifications to scripts

parsing/distributed_parser
parsing/parse_german

We modified these scripts so that the parsing model and the number of
batches for parsing can be specified as command-line arguments.

reordering/Collins_rules.py

We changed the label() and function() methods to convert their return
values to uppercase. This enables us to use the same script to reorder
the output of our lowercased parsing models.

3.2 Additional scripts

parsing/parse_german_lc

This script is similar to parsing/parse_german, except that it processes
the Collins et al. (2005) data, which is plain text, tokenised and
lowercased. The primary differences are that tokenisation and unwrapping
from SGML format are not required, and that *lrb* and *rrb* are used
instead of *LRB* and *RRB* to match the lowercased parsing model.

preprocessing/Collins_baseline_lc

Likewise, this script is similar to preprocessing/Collins_baseline,
except that it processes the Collins et al. (2005) data. The primary
differences are that the TRAIN/TUNE/TEST distinction is no longer needed,
plus the same changes used in creating parsing/parse_german_lc.

analysis/ACL11oracles

This script calls the analysis/oracle.py script repeatedly to produce the
outputs of all of the oracle experiments we run in the paper.

notes/oracles-output.txt

For reference, this file contains the output from our run of the
analysis/ACL11oracles script.
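The uppercase conversion in reordering/Collins_rules.py described in
Section 3.1 can be sketched as follows. This is an illustrative
reconstruction, not the actual code: the function names come from the
description above, but the hyphen-separated label format (e.g. "vp-oc"
for a VP with grammatical function OC) is an assumption made for the
example.

```python
# Illustrative sketch of the kind of change described in Section 3.1:
# uppercasing return values so that rules written against cased node
# labels still match the output of the lowercased parsing models.

def label(node_label):
    """Return the syntactic category of a node, e.g. 'vp-oc' -> 'VP'."""
    # Strip any grammatical-function suffix, then uppercase.
    return node_label.split("-")[0].upper()

def function(node_label):
    """Return the grammatical function of a node, e.g. 'vp-oc' -> 'OC'."""
    parts = node_label.split("-")
    return parts[1].upper() if len(parts) > 1 else ""
```

With this change, a lowercased parse label such as "vp-oc" and its cased
counterpart "VP-OC" yield the same category and function, so a single
reordering script serves both parsing models.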
3.3 Additional parsing models

notes/ger_tiger_lc.txt
notes/ger_tiger_half.txt
notes/ger_tiger_half_lc.txt
notes/ger_tiger_10pc.txt
notes/ger_tiger_25pc.txt

These five files describe the training method and performance figures for
the five additional parsing models we created. The grammars themselves
are not included, as they can be quite large (up to 7.5MB each). The
grammars may be downloaded from http://www.showlett.id.au/.

3.4 Experiment configuration files

We use the Moses Experiment Management System (EMS) to run the
experiments in the paper. The 40 files in the configs directory are the
configuration files used.

configs/config.baseline-collins

Replicating the baseline system of Collins et al. (2005). Evaluation is
with the multi-bleu script only.

configs/config.reordered-collins
configs/config.reordered-collins-half

Replicating the reordered system of Collins et al. (2005), using the
full, lowercased parsing model and the 50% data, lowercased parsing
model, respectively. These systems can only be run after the
corresponding baseline, as they reuse its language model and recasing
model. Evaluation is with the multi-bleu script only.

configs/config.baseline-wmt09*

The various configurations of the baseline system tried in the paper.
Each configuration file runs the evaluation on both the Europarl and news
test sets (test2008, newstest2009), using both the NIST BLEU and
multi-bleu scripts.

configs/config.reordered-wmt09*

The reordered systems corresponding to each of the baseline systems
above. They similarly run the evaluation on both test2008 and
newstest2009 with both NIST BLEU and multi-bleu, and can only be run
after the corresponding baselines, as they reuse their language models
and recasing models.

configs/config.reordered-wmt09*-half
configs/config.reordered-wmt09*-25pc
configs/config.reordered-wmt09*-10pc

As for configs/config.reordered-wmt09*, except that these experiments use
the 50%, 25% or 10% data parsing model instead of the full parsing model.
configs/config.oracles

This configuration file evaluates all of the oracle outputs (generated by
analysis/ACL11oracles) with the multi-bleu script only.