readme update for dev pull request
seominjoon committed Nov 28, 2016
1 parent 843f88d commit e841109
Showing 1 changed file (README.md) with 27 additions and 10 deletions.

# Bi-directional Attention Flow for Machine Comprehension

Follow three simple steps.
For more advanced usage, see below.
- This is the original implementation of [Bi-directional Attention Flow for Machine Comprehension][paper] (Seo et al., 2016).
- The CodaLab worksheet for the [SQuAD Leaderboard][squad] submission is available [here][worksheet].

## 0. Requirements
#### General
## 2. Training

The model was trained with an NVIDIA Titan X (Pascal architecture, 2016).
The model requires at least 12GB of GPU RAM.
If your GPU RAM is smaller than 12GB, you can either decrease batch size (performance might degrade),
or you can use multi GPU (see below).
The training converges at ~18k steps, and it took ~4s per step (i.e. ~20 hours).
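
If memory is tight, the simplest knob is the `--batch_size` flag used elsewhere in this README; a minimal sketch (the value below is illustrative, not a tuned recommendation):
```
# Illustrative value only: pick the largest batch size that fits your GPU RAM.
python -m basic.cli --mode train --noload --batch_size 30
```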

Before training, it is recommended to first run the following command to verify that everything is in order and memory is sufficient:
```
python -m basic.cli --mode train --noload --debug
```

Then to fully train, run:
```
python -m basic.cli --mode train --noload
```

You can speed up the training process with optimization flags:
```
python -m basic.cli --mode train --noload --len_opt --cluster
```
You can still omit them, but training will be much slower.


## 3. Testing
To test, run:
```
python -m basic.cli --mode test
```

As with training, you can pass the optimization flags to speed up testing (5 minutes on dev data):
```
python -m basic.cli --mode test --len_opt --cluster
```

This command loads the model most recently saved during training and begins testing on the test data.
Note that testing requires more memory per example, so decrease `--batch_size` if you run out of GPU memory.
After the process ends, it prints F1 and EM scores, and also outputs a JSON file (`$PWD/out/basic/00/answer/test-####.json`,
where `####` is the step at which the model was saved).
Note that the printed scores are not official (our scoring scheme is a bit harsher).
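
To obtain official scores, you can run the official SQuAD v1.1 evaluation script on the answer file. A sketch, assuming you have `evaluate-v1.1.py` and the dev set JSON in your working directory (these paths are assumptions):
```
# Assumed paths: substitute your actual dev set and answer file,
# and replace #### with the step at which your model was saved.
python evaluate-v1.1.py dev-v1.1.json out/basic/00/answer/test-####.json
```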

You can also train on [multiple GPU cards][multi-gpu]. This can be easily done by running:
```
python -m basic.cli --mode train --noload --num_gpus 3 --batch_size 20
```

Similarly, you can speed up your testing by running:
```
python -m basic.cli --mode test --num_gpus 3 --batch_size 20
```
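
The multi-GPU flags should compose with the optimization flags shown earlier; for example (a sketch, assuming the flags combine freely):
```
python -m basic.cli --mode test --num_gpus 3 --batch_size 20 --len_opt --cluster
```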


[multi-gpu]: https://www.tensorflow.org/versions/r0.11/tutorials/deep_cnn/index.html#training-a-model-using-multiple-gpu-cards
[save]: #
[squad]: http://stanford-qa.com
[paper]: https://arxiv.org/abs/1611.01603
[worksheet]: https://worksheets.codalab.org/worksheets/0x37a9b8c44f6845c28866267ef941c89d/
