Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.
SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 new, unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD2.0 is a challenging natural language understanding task for existing models, and we release SQuAD2.0 to the community as the successor to SQuAD1.1. We are optimistic that this new dataset will encourage the development of reading comprehension systems that know what they don't know.
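To make the answerable/unanswerable split concrete, here is a minimal sketch that counts both kinds of question in a SQuAD-style JSON file. It assumes the v2.0 layout of `data` → `paragraphs` → `qas`, with an `is_impossible` flag on each question; the toy record is hand-made for illustration and is not real dataset content, and `load_squad` is a hypothetical helper name.

```python
import json
from collections import Counter


def load_squad(path: str) -> dict:
    """Read a SQuAD-style JSON file from disk (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_questions(squad: dict) -> Counter:
    """Count answerable and unanswerable questions in a SQuAD-style dict."""
    counts = Counter()
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # Assumption: v2.0 marks unanswerable questions with
                # `is_impossible`; files without the field are treated
                # as fully answerable.
                if qa.get("is_impossible", False):
                    counts["unanswerable"] += 1
                else:
                    counts["answerable"] += 1
    return counts


# A tiny hand-made example in the same shape (not real dataset content).
toy = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "The quick brown fox jumps over the lazy dog.",
            "qas": [
                {"id": "q1", "question": "What does the fox jump over?",
                 "answers": [{"text": "the lazy dog", "answer_start": 31}],
                 "is_impossible": False},
                {"id": "q2", "question": "What does the cat jump over?",
                 "answers": [], "is_impossible": True},
            ],
        }],
    }],
}

counts = count_questions(toy)
print(counts["answerable"], counts["unanswerable"])
```

On a downloaded file, the same function would be called as `count_questions(load_squad(path))`.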
Explore SQuAD2.0 and model predictions
SQuAD2.0 paper (Rajpurkar & Jia et al. '18)
SQuAD1.1, the previous version of the SQuAD dataset, contains 100,000+ question-answer pairs on 500+ articles.
Explore SQuAD1.1 and model predictions
SQuAD1.0 paper (Rajpurkar et al. '16)
We've built a few resources to help you get started with the dataset.
Download a copy of the dataset (distributed under the CC BY-SA 4.0 license):
To evaluate your models, we have also made available the evaluation script we will use for official evaluation, along with a sample prediction file that the script will take as input. To run the evaluation, use `python evaluate-v2.0.py <path_to_dev-v2.0> <path_to_predictions>`.
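As an illustration of what a prediction file might look like, here is a minimal sketch. It assumes the file is a single JSON object mapping each question ID to the predicted answer string, with an empty string for questions the model declines to answer; check the sample prediction file to confirm the exact format. The question IDs, answers, and the `write_predictions` helper are hypothetical.

```python
import json
import os
import tempfile


def write_predictions(predictions: dict, path: str) -> None:
    """Write {question_id: answer_text} as JSON; "" means abstain (assumed)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, ensure_ascii=False)


# Hypothetical question IDs and answers, for illustration only.
predictions = {
    "q1": "the lazy dog",  # answer span copied from the paragraph
    "q2": "",              # empty string: the model abstains
}

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "predictions.json")
write_predictions(predictions, path)
print(path)
```

The resulting file would then be passed as `<path_to_predictions>` in the command above.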
Once you have built a model that works to your expectations on the dev set, you can submit it to get official scores on the dev set and a hidden test set. To preserve the integrity of test results, we do not release the test set to the public. Instead, we require you to submit your model so that we can run it on the test set for you. Here's a tutorial walking you through official evaluation of your model:
Submission Tutorial
Because SQuAD is an ongoing effort, we expect the dataset to evolve.
To keep up to date with major changes to the dataset, please subscribe:
Ask us questions at our Google group or at pranavsr@stanford.edu and robinjia@stanford.edu.
SQuAD2.0 tests the ability of a system to not only answer reading comprehension questions, but also abstain when presented with a question that cannot be answered based on the provided paragraph. How will your system compare to humans on this task?
Rank and date | Model and organization | EM | F1 |
---|---|---|---|
- | Human Performance Stanford University (Rajpurkar & Jia et al. '18) | 86.831 | 89.452 |
1 Jan 15, 2019 | BERT + MMFT + ADA (ensemble) Microsoft Research Asia | 85.082 | 87.615 |
2 Jan 10, 2019 | BERT + Synthetic Self-Training (ensemble) Google AI Language https://github.com/google-research/bert | 84.292 | 86.967 |
3 Dec 13, 2018 | BERT finetune baseline (ensemble) Anonymous | 83.536 | 86.096 |
4 Dec 16, 2018 | Lunet + Verifier + BERT (ensemble) Layer 6 AI NLP Team | 83.469 | 86.043 |
4 Dec 21, 2018 | PAML+BERT (ensemble model) PINGAN GammaLab | 83.457 | 86.122 |
5 Jan 10, 2019 | BERT + Synthetic Self-Training (single model) Google AI Language https://github.com/google-research/bert | 82.972 | 85.810 |
5 Feb 16, 2019 | Bert-raw (ensemble) None | 83.175 | 85.635 |
5 Dec 15, 2018 | Lunet + Verifier + BERT (single model) Layer 6 AI NLP Team | 82.995 | 86.035 |
5 Jan 14, 2019 | BERT + MMFT + ADA (single model) Microsoft Research Asia | 83.040 | 85.892 |
6 Feb 15, 2019 | BERT + NeurQuRI (ensemble) 2SAH | 82.803 | 85.703 |
7 Dec 16, 2018 | PAML+BERT (single model) PINGAN GammaLab | 82.577 | 85.603 |
8 Nov 16, 2018 | AoA + DA + BERT (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 82.374 | 85.310 |
8 Jan 13, 2019 | Bert-raw (ensemble) None | 82.577 | 84.884 |
9 Dec 12, 2018 | BERT finetune baseline (single model) Anonymous | 82.126 | 84.820 |
9 Dec 10, 2018 | Candi-Net+BERT (ensemble) 42Maru NLP Team | 82.126 | 84.624 |
10 Feb 15, 2019 | BERT + NeurQuRI (single model) 2SAH | 81.257 | 84.342 |
11 Nov 16, 2018 | AoA + DA + BERT (single model) Joint Laboratory of HIT and iFLYTEK Research | 81.178 | 84.251 |
12 Dec 19, 2018 | Candi-Net+BERT (single model) 42Maru NLP Team | 80.659 | 83.562 |
13 Jan 22, 2019 | BERT + NeurQuRI (single model) 2SAH | 80.591 | 83.391 |
13 Jan 07, 2019 | Bert-raw (single model) None | 80.512 | 83.539 |
14 Jan 07, 2019 | BERT + NeurQuRI (single model) 2SAH | 80.343 | 83.221 |
14 Dec 05, 2018 | Candi-Net+BERT (single model) 42Maru NLP Team | 80.388 | 82.908 |
14 Nov 08, 2018 | BERT (single model) Google AI Language | 80.005 | 83.061 |
15 Feb 12, 2019 | BERT + Sparse-Transformer single model | 79.948 | 83.023 |
16 Dec 06, 2018 | NEXYS_BASE (single model) NEXYS, DGIST R7 | 79.779 | 82.912 |
16 Feb 16, 2019 | Bert-raw (single model) None | 80.343 | 83.243 |
17 Dec 03, 2018 | PwP+BERT (single model) AITRICS | 80.117 | 83.189 |
18 Feb 02, 2019 | SJRC (single model) Shanghai Jiao Tong University | 79.711 | 82.842 |
18 Feb 01, 2019 | {bert-finetuning} (single model) ksai | 79.632 | 82.852 |
19 Nov 09, 2018 | L6Net + BERT (single model) Layer 6 AI | 79.181 | 82.259 |
20 Jan 08, 2019 | Bert-raw-full (single model) None | 78.301 | 81.350 |
21 Dec 14, 2018 | BERT+AC(single model) Hithink RoyalFlush | 78.052 | 81.174 |
22 Nov 06, 2018 | SLQA+BERT (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158 | 77.003 | 80.209 |
23 Jan 05, 2019 | synss (single model ) bert_finetune | 76.055 | 79.329 |
24 Dec 18, 2018 | ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai | 74.746 | 78.227 |
24 Nov 05, 2018 | MIR-MRC(F-Net) (single model) Kangwon National University, Natural Language Processing Lab. & ForceWin, KP Lab. | 74.791 | 77.988 |
25 Sep 13, 2018 | nlnet (single model) Microsoft Research Asia | 74.272 | 77.052 |
26 Dec 20, 2018 | {Anonymous} (single model) Anonymous | 73.234 | 76.790 |
26 Dec 29, 2018 | MMIPN Single | 73.505 | 76.424 |
27 Oct 12, 2018 | YARCS (ensemble) IBM Research AI | 72.670 | 75.507 |
28 Nov 14, 2018 | BERT+Answer Verifier (single model) Pingan Tech Olatop Lab | 71.666 | 75.457 |
28 Oct 13, 2018 | RNANetSimple (ensemble) Anonymous | 72.580 | 75.075 |
29 Sep 17, 2018 | Unet (ensemble) Fudan University & Liulishuo Lab https://arxiv.org/abs/1810.06638 | 71.417 | 74.869 |
29 Aug 28, 2018 | SLQA+ (single model) Alibaba DAMO NLP http://www.aclweb.org/anthology/P18-1158 | 71.462 | 74.434 |
29 Aug 15, 2018 | Reinforced Mnemonic Reader + Answer Verifier (single model) NUDT https://arxiv.org/abs/1808.05759 | 71.767 | 74.295 |
30 Sep 14, 2018 | SAN (ensemble model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556 | 71.316 | 73.704 |
30 Jan 19, 2019 | {BERT-base} (single-model) Anonymous | 70.763 | 74.449 |
31 Oct 13, 2018 | RNANetSimple (single model) Anonymous | 70.718 | 73.403 |
32 Sep 26, 2018 | Multi-Level Attention Fusion(MLAF) (single model) Chonbuk National University, Cognitive Computing Lab. | 69.476 | 72.857 |
33 Sep 14, 2018 | Unet (single model) Fudan University & Liulishuo Lab | 69.262 | 72.642 |
33 Aug 21, 2018 | FusionNet++ (ensemble) Microsoft Business Applications Group AI Research https://arxiv.org/abs/1711.07341 | 70.300 | 72.484 |
34 Dec 20, 2018 | DocQA + NeurQuRI (single model) 2SAH | 68.766 | 71.662 |
35 Aug 21, 2018 | SAN (single model) Microsoft Business Applications AI Research https://arxiv.org/abs/1712.03556 | 68.653 | 71.439 |
35 Aug 25, 2018 | ARRR (single model) anonymous | 68.653 | 71.124 |
36 Jul 13, 2018 | VS^3-NET (single model) Kangwon National University in South Korea | 67.897 | 70.884 |
36 Jun 24, 2018 | KACTEIL-MRC(GFN-Net) (single model) Kangwon National University, Natural Language Processing Lab. | 68.213 | 70.878 |
36 Sep 13, 2018 | BiDAF++ with pair2vec (single model) UW and FAIR | 68.021 | 71.583 |
37 Jan 01, 2019 | EBB-Net (single model) Enliple AI | 66.610 | 70.303 |
38 Jun 25, 2018 | KakaoNet2 (single model) Kakao NLP Team | 65.719 | 69.381 |
39 Jul 11, 2018 | abcNet (single model) Fudan University & Liulishuo AI Lab | 65.256 | 69.206 |
39 Sep 13, 2018 | BiDAF++ (single model) UW and FAIR | 65.651 | 68.866 |
40 Jun 27, 2018 | BSAE AddText (single model) reciTAL.ai | 63.338 | 67.422 |
41 Aug 14, 2018 | eeAttNet (single model) BBD NLP Team https://www.bbdservice.com | 63.327 | 66.633 |
41 May 30, 2018 | BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence [modified by Stanford] | 63.372 | 66.251 |
42 Nov 27, 2018 | Tree-LSTM + BiDAF + ELMo (single model) Carnegie Mellon University | 57.707 | 62.341 |
42 May 30, 2018 | BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence [modified by Stanford] | 59.332 | 62.305 |
43 May 30, 2018 | BiDAF-No-Answer (single model) University of Washington [modified by Stanford] | 59.174 | 62.093 |
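
The EM and F1 columns can be illustrated with a short sketch. This is a simplified reading of how such metrics are typically computed for SQuAD, not a substitute for the official evaluation script: answers are normalized (lowercased, punctuation and English articles removed, whitespace collapsed), EM checks string equality, and F1 measures token overlap. It assumes an unanswerable question is represented by an empty gold answer.

```python
import collections
import re
import string


def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # If either side is empty (for example, an unanswerable question),
    # score 1 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The lazy dog.", "lazy dog"))
print(f1_score("lazy brown dog", "the lazy dog"))
print(f1_score("", ""))
```

In this sketch a correct abstention (empty prediction, empty gold answer) scores 1 on both metrics, while any non-empty prediction for an unanswerable question scores 0. Leaderboard-style numbers would then be averages over all questions, expressed as percentages.
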
Here are the Exact Match (EM) and F1 scores evaluated on the test set of SQuAD1.1.
Rank and date | Model and organization | EM | F1 |
---|---|---|---|
- | Human Performance Stanford University (Rajpurkar et al. '16) | 82.304 | 91.221 |
1 Oct 05, 2018 | BERT (ensemble) Google AI Language https://arxiv.org/abs/1810.04805 | 87.433 | 93.160 |
2 Oct 05, 2018 | BERT (single model) Google AI Language https://arxiv.org/abs/1810.04805 | 85.083 | 91.835 |
2 Sep 09, 2018 | nlnet (ensemble) Microsoft Research Asia | 85.356 | 91.202 |
2 Sep 26, 2018 | nlnet (ensemble) Microsoft Research Asia | 85.954 | 91.677 |
3 Jul 11, 2018 | QANet (ensemble) Google Brain & CMU | 84.454 | 90.490 |
4 Jul 08, 2018 | r-net (ensemble) Microsoft Research Asia | 84.003 | 90.147 |
5 Mar 19, 2018 | QANet (ensemble) Google Brain & CMU | 83.877 | 89.737 |
5 Sep 09, 2018 | nlnet (single model) Microsoft Research Asia | 83.468 | 90.133 |
5 Jun 20, 2018 | MARS (ensemble) YUANFUDAO research NLP | 83.982 | 89.796 |
6 Sep 01, 2018 | MARS (single model) YUANFUDAO research NLP | 83.185 | 89.547 |
7 Jan 22, 2018 | Hybrid AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 82.482 | 89.281 |
7 Jun 21, 2018 | MARS (single model) YUANFUDAO research NLP | 83.122 | 89.224 |
7 Jun 20, 2018 | QANet (single) Google Brain & CMU | 82.471 | 89.306 |
7 Mar 06, 2018 | QANet (ensemble) Google Brain & CMU | 82.744 | 89.045 |
7 Feb 19, 2018 | Reinforced Mnemonic Reader + A2D (ensemble model) Microsoft Research Asia & NUDT | 82.849 | 88.764 |
8 Feb 27, 2018 | QANet (single model) Google Brain & CMU | 82.209 | 88.608 |
8 Feb 02, 2018 | Reinforced Mnemonic Reader (ensemble model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 82.283 | 88.533 |
8 Jan 03, 2018 | r-net+ (ensemble) Microsoft Research Asia | 82.650 | 88.493 |
8 May 09, 2018 | MARS (single model) YUANFUDAO research NLP | 82.587 | 88.880 |
9 Jan 05, 2018 | SLQA+ (ensemble) Alibaba iDST NLP | 82.440 | 88.607 |
10 Apr 23, 2018 | r-net (single model) Microsoft Research Asia | 81.391 | 88.170 |
10 Dec 22, 2017 | AttentionReader+ (ensemble) Tencent DPDAC NLP | 81.790 | 88.163 |
10 Dec 23, 2018 | MMIPN Single | 81.580 | 88.948 |
11 Dec 17, 2018 | ARSG-BERT (single model) TRINITI RESEARCH LABS, Active.ai https://active.ai | 81.307 | 88.909 |
11 Dec 17, 2017 | r-net (ensemble) Microsoft Research Asia http://aka.ms/rnet | 82.136 | 88.126 |
11 May 09, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 81.538 | 88.130 |
12 Apr 03, 2018 | KACTEIL-MRC(GF-Net+) (ensemble) Kangwon National University, Natural Language Processing Lab. | 81.496 | 87.557 |
12 May 09, 2018 | Reinforced Mnemonic Reader + A2D + DA (single model) Microsoft Research Asia & NUDT | 81.401 | 88.122 |
13 Feb 12, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 80.489 | 87.454 |
13 Nov 17, 2017 | BiDAF + Self Attention + ELMo (ensemble) Allen Institute for Artificial Intelligence | 81.003 | 87.432 |
13 Feb 27, 2018 | QANet (single model) Google Brain & CMU | 80.929 | 87.773 |
14 Feb 19, 2018 | Reinforced Mnemonic Reader + A2D (single model) Microsoft Research Asia & NUDT | 80.919 | 87.492 |
15 Apr 12, 2018 | AVIQA+ (ensemble) aviqa team | 80.615 | 87.311 |
16 Jan 13, 2018 | SLQA+ single model | 80.436 | 87.021 |
17 Jan 12, 2018 | EAZI+ (ensemble) Yiwise NLP Group | 80.426 | 86.912 |
17 Jan 04, 2018 | {EAZI} (ensemble) Yiwise NLP Group | 80.436 | 86.912 |
18 Mar 20, 2018 | DNET (ensemble) QA geeks | 80.164 | 86.721 |
18 Jan 22, 2018 | Hybrid AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research | 80.027 | 87.288 |
19 Feb 12, 2018 | BiDAF + Self Attention + ELMo + A2D (single model) Microsoft Research Asia & NUDT | 79.996 | 86.711 |
20 Jan 29, 2018 | Reinforced Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 79.545 | 86.654 |
20 Feb 23, 2018 | MAMCN+ (single model) Samsung Research | 79.692 | 86.727 |
20 Apr 10, 2018 | Unnamed submission by null | 80.027 | 86.612 |
21 Dec 28, 2017 | SLQA+ (single model) Alibaba iDST NLP | 79.199 | 86.590 |
21 Jan 03, 2018 | r-net+ (single model) Microsoft Research Asia | 79.901 | 86.536 |
22 Dec 05, 2017 | SAN (ensemble model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556 | 79.608 | 86.496 |
23 Oct 17, 2017 | Interactive AoA Reader+ (ensemble) Joint Laboratory of HIT and iFLYTEK | 79.083 | 86.450 |
23 Nov 05, 2018 | MIR-MRC(F-Net) (single model) ForceWin, KP Lab. | 79.083 | 86.288 |
24 Jun 01, 2018 | MDReader single model | 79.031 | 86.006 |
24 Feb 01, 2018 | Unnamed submission by null | 78.999 | 86.151 |
25 Oct 24, 2017 | FusionNet (ensemble) Microsoft Business AI Solutions Team https://arxiv.org/abs/1711.07341 | 78.978 | 86.016 |
26 Oct 22, 2017 | DCN+ (ensemble) Salesforce Research https://arxiv.org/abs/1711.00106 | 78.852 | 85.996 |
27 Nov 03, 2017 | BiDAF + Self Attention + ELMo (single model) Allen Institute for Artificial Intelligence | 78.580 | 85.833 |
27 Mar 29, 2018 | KACTEIL-MRC(GF-Net+) (single model) Kangwon National University, Natural Language Processing Lab. | 78.664 | 85.780 |
28 May 09, 2018 | KakaoNet (single model) Kakao NLP Team | 78.401 | 85.724 |
29 Nov 30, 2017 | SLQA(ensemble) Alibaba iDST NLP | 78.328 | 85.682 |
29 Jan 02, 2018 | Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504 | 78.433 | 85.517 |
29 Mar 19, 2018 | aviqa (ensemble) aviqa team | 78.496 | 85.469 |
29 Jun 01, 2018 | MDReader0 single model | 78.171 | 85.543 |
29 Jan 03, 2018 | MEMEN (single model) Zhejiang University https://arxiv.org/abs/1707.09098 | 78.234 | 85.344 |
29 Sep 18, 2018 | BiDAF++ with pair2vec (single model) UW and FAIR | 78.223 | 85.535 |
30 Jan 29, 2018 | test single | 78.087 | 85.348 |
31 Jul 25, 2017 | Interactive AoA Reader (ensemble) Joint Laboratory of HIT and iFLYTEK Research | 77.845 | 85.297 |
32 Jan 10, 2018 | Unnamed submission by null | 77.436 | 85.130 |
32 Mar 20, 2018 | DNET (single model) QA geeks | 77.646 | 84.905 |
33 Sep 18, 2018 | BiDAF++ (single model) UW and FAIR | 77.573 | 84.858 |
33 Dec 13, 2017 | RaSoR + TR + LM (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609 | 77.583 | 84.163 |
33 Apr 10, 2018 | Unnamed submission by null | 77.489 | 84.735 |
33 Dec 06, 2017 | AttentionReader+ (single) Tencent DPDAC NLP | 77.342 | 84.925 |
34 Nov 06, 2017 | Conductor-net (ensemble) CMU https://arxiv.org/abs/1710.10504 | 76.996 | 84.630 |
34 Sep 26, 2018 | {gqa} (single model) FAIR | 77.090 | 83.931 |
34 Dec 21, 2017 | Jenga (ensemble) Facebook AI Research | 77.237 | 84.466 |
34 Jan 23, 2018 | MARS (single model) YUANFUDAO research NLP | 76.859 | 84.739 |
35 Nov 01, 2017 | SAN (single model) Microsoft Business AI Solutions Team https://arxiv.org/abs/1712.03556 | 76.828 | 84.396 |
36 Oct 13, 2017 | r-net (single model) Microsoft Research Asia http://aka.ms/rnet | 76.461 | 84.265 |
36 Dec 19, 2017 | FRC (single model) in review | 76.240 | 84.599 |
36 May 14, 2018 | VS^3-NET (single model) Kangwon National University in South Korea | 76.775 | 84.491 |
37 Oct 22, 2017 | Conductor-net (ensemble) CMU | 76.146 | 83.991 |
38 Sep 08, 2017 | FusionNet (single model) Microsoft Business AI Solutions team https://arxiv.org/abs/1711.07341 | 75.968 | 83.900 |
38 Oct 18, 2018 | KAR (single model) York University https://arxiv.org/abs/1809.03449 | 76.125 | 83.538 |
39 Jul 14, 2017 | smarnet (ensemble) Eigen Technology & Zhejiang University | 75.989 | 83.475 |
39 Oct 22, 2017 | Interactive AoA Reader+ (single model) Joint Laboratory of HIT and iFLYTEK | 75.821 | 83.843 |
39 Mar 15, 2018 | AVIQA-v2 (single model) aviqa team | 75.926 | 83.305 |
40 Oct 05, 2018 | Unnamed submission by null | 74.950 | 83.294 |
40 Aug 18, 2017 | RaSoR + TR (single model) Tel-Aviv University https://arxiv.org/abs/1712.03609 | 75.789 | 83.261 |
41 Oct 23, 2017 | DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106 | 75.087 | 83.081 |
42 Feb 13, 2018 | SSR-BiDAF ensemble model | 74.541 | 82.477 |
42 Nov 01, 2017 | Mixed model (ensemble) Sean | 75.265 | 82.769 |
43 Jan 02, 2018 | Conductor-net (single model) CMU https://arxiv.org/abs/1710.10504 | 74.405 | 82.742 |
43 Nov 17, 2017 | two-attention-self-attention (ensemble) guotong1988 | 75.223 | 82.716 |
43 May 21, 2017 | MEMEN (ensemble) Eigen Technology & Zhejiang University https://arxiv.org/abs/1707.09098 | 75.370 | 82.658 |
44 Mar 09, 2017 | ReasoNet (ensemble) MSR Redmond https://arxiv.org/abs/1609.05284 | 75.034 | 82.552 |
45 Aug 14, 2018 | eeAttNet (single model) BBD NLP Team https://www.bbdservice.com | 74.604 | 82.501 |
45 Jul 10, 2017 | DCN+ (single model) Salesforce Research https://arxiv.org/abs/1711.00106 | 74.866 | 82.806 |
45 Feb 06, 2018 | Jenga (single model) Facebook AI Research | 74.373 | 82.845 |
45 Oct 27, 2017 | Unnamed submission by null | 74.489 | 82.312 |
45 Oct 31, 2017 | SLQA (single model) Alibaba iDST NLP | 74.489 | 82.815 |
46 Jul 14, 2017 | Mnemonic Reader (ensemble) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 74.268 | 82.371 |
47 Dec 23, 2017 | S^3-Net (ensemble) Kangwon National University in South Korea | 74.121 | 82.342 |
48 Jul 29, 2017 | SEDT (ensemble model) CMU https://arxiv.org/abs/1703.00572 | 74.090 | 81.761 |
49 Jul 06, 2017 | SSAE (ensemble) Tsinghua University | 74.080 | 81.665 |
49 Dec 14, 2017 | Jenga (single model) Facebook AI Research | 73.303 | 81.754 |
49 Nov 06, 2017 | Conductor-net (single) CMU https://arxiv.org/abs/1710.10504 | 73.240 | 81.933 |
49 Jul 25, 2017 | Interactive AoA Reader (single model) Joint Laboratory of HIT and iFLYTEK Research | 73.639 | 81.931 |
49 Apr 22, 2017 | SEDT+BiDAF (ensemble) CMU https://arxiv.org/abs/1703.00572 | 73.723 | 81.530 |
49 Jan 24, 2017 | Multi-Perspective Matching (ensemble) IBM Research https://arxiv.org/abs/1612.04211 | 73.765 | 81.257 |
49 Feb 22, 2017 | BiDAF (ensemble) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603 | 73.744 | 81.525 |
50 May 01, 2017 | jNet (ensemble) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617 | 73.010 | 81.517 |
51 Oct 22, 2017 | Conductor-net (single) CMU | 72.590 | 81.415 |
51 Apr 17, 2018 | Unnamed submission by null | 72.831 | 80.622 |
51 Nov 16, 2017 | two-attention-self-attention (single model) guotong1988 | 72.600 | 81.011 |
51 Apr 12, 2017 | T-gating (ensemble) Peking University | 72.758 | 81.001 |
51 Apr 17, 2018 | Unnamed submission by null | 72.831 | 80.622 |
51 Sep 20, 2017 | BiDAF + Self Attention (single model) Allen Institute for Artificial Intelligence https://arxiv.org/abs/1710.10723 | 72.139 | 81.048 |
52 Dec 15, 2017 | S^3-Net (single model) Kangwon National University in South Korea | 71.908 | 81.023 |
52 Mar 03, 2018 | AVIQA (single model) aviqa team | 72.485 | 80.550 |
53 Nov 06, 2017 | attention+self-attention (single model) guotong1988 | 71.698 | 80.462 |
54 Nov 01, 2016 | Dynamic Coattention Networks (ensemble) Salesforce Research https://arxiv.org/abs/1611.01604 | 71.625 | 80.383 |
55 Jul 14, 2017 | smarnet (single model) Eigen Technology & Zhejiang University https://arxiv.org/abs/1710.02772 | 71.415 | 80.160 |
56 Jul 14, 2017 | Mnemonic Reader (single model) NUDT and Fudan University https://arxiv.org/abs/1705.02798 | 70.995 | 80.146 |
56 Apr 13, 2017 | QFASE NUS | 71.898 | 79.989 |
57 Apr 22, 2018 | MAMCN (single model) Samsung Research | 70.985 | 79.939 |
57 Oct 27, 2017 | M-NET (single) UFL | 71.016 | 79.835 |
58 Mar 24, 2017 | jNet (single model) USTC & National Research Council Canada & York University https://arxiv.org/abs/1703.04617 | 70.607 | 79.821 |
58 May 23, 2018 | AttReader (single) College of Computer & Information Science, SouthWest University, Chongqing, China | 71.373 | 79.725 |
59 Apr 02, 2017 | Ruminating Reader (single model) New York University https://arxiv.org/abs/1704.07415 | 70.639 | 79.456 |
59 May 13, 2017 | RaSoR (single model) Google NY, Tel-Aviv University https://arxiv.org/abs/1611.01436 | 70.849 | 78.741 |
59 Mar 14, 2017 | Document Reader (single model) Facebook AI Research https://arxiv.org/abs/1704.00051 | 70.733 | 79.353 |
59 Dec 28, 2016 | FastQAExt German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816 | 70.849 | 78.857 |
59 Mar 08, 2017 | ReasoNet (single model) MSR Redmond https://arxiv.org/abs/1609.05284 | 70.555 | 79.364 |
60 Apr 14, 2017 | Multi-Perspective Matching (single model) IBM Research https://arxiv.org/abs/1612.04211 | 70.387 | 78.784 |
61 Aug 30, 2017 | SimpleBaseline (single model) Technical University of Vienna | 69.600 | 78.236 |
61 Feb 05, 2018 | SSR-BiDAF single model | 69.443 | 78.358 |
62 Apr 12, 2017 | SEDT+BiDAF (single model) CMU https://arxiv.org/abs/1703.00572 | 68.478 | 77.971 |
63 Jun 25, 2017 | PQMN (single model) KAIST & AIBrain & Crosscert | 68.331 | 77.783 |
64 Apr 12, 2017 | T-gating (single model) Peking University | 68.132 | 77.569 |
64 Jul 29, 2017 | SEDT (single model) CMU https://arxiv.org/abs/1703.00572 | 68.163 | 77.527 |
65 Nov 28, 2016 | BiDAF (single model) Allen Institute for AI & University of Washington https://arxiv.org/abs/1611.01603 | 67.974 | 77.323 |
65 Jan 22, 2018 | FABIR Single Model https://arxiv.org/abs/1810.09580 | 67.744 | 77.605 |
65 Dec 28, 2016 | FastQA German Research Center for Artificial Intelligence https://arxiv.org/abs/1703.04816 | 68.436 | 77.070 |
65 Feb 22, 2018 | Unnamed submission by null | 68.425 | 77.077 |
65 Feb 22, 2018 | Unnamed submission by null | 68.478 | 77.220 |
66 Sep 19, 2017 | AllenNLP BiDAF (single model) Allen Institute for AI http://allennlp.org/ | 67.618 | 77.151 |
66 Oct 26, 2016 | Match-LSTM with Ans-Ptr (Boundary) (ensemble) Singapore Management University https://arxiv.org/abs/1608.07905 | 67.901 | 77.022 |
67 Feb 05, 2017 | Iterative Co-attention Network Fudan University | 67.502 | 76.786 |
68 Nov 01, 2016 | Dynamic Coattention Networks (single model) Salesforce Research https://arxiv.org/abs/1611.01604 | 66.233 | 75.896 |
68 Jan 03, 2018 | newtest single model | 66.527 | 75.787 |
69 Feb 24, 2018 | Unnamed submission by null | 65.992 | 75.469 |
70 Jan 10, 2018 | Unnamed submission by null | 64.796 | 74.272 |
71 Dec 09, 2017 | Unnamed submission by ravioncodalab | 64.439 | 73.921 |
71 Oct 26, 2016 | Match-LSTM with Bi-Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905 | 64.744 | 73.743 |
72 Feb 19, 2017 | Attentive CNN context with LSTM NLPR, CASIA | 63.306 | 73.463 |
72 Sep 21, 2017 | OTF dict+spelling (single) University of Montreal https://arxiv.org/abs/1706.00286 | 64.083 | 73.056 |
73 Sep 21, 2017 | OTF spelling (single) University of Montreal https://arxiv.org/abs/1706.00286 | 62.897 | 72.016 |
73 Nov 02, 2016 | Fine-Grained Gating Carnegie Mellon University https://arxiv.org/abs/1611.01724 | 62.446 | 73.327 |
73 Sep 21, 2017 | OTF spelling+lemma (single) University of Montreal https://arxiv.org/abs/1706.00286 | 62.604 | 71.968 |
74 Sep 28, 2016 | Dynamic Chunk Reader IBM https://arxiv.org/abs/1610.09996 | 62.499 | 70.956 |
75 Aug 27, 2016 | Match-LSTM with Ans-Ptr (Boundary) Singapore Management University https://arxiv.org/abs/1608.07905 | 60.474 | 70.695 |
76 Sep 11, 2018 | Unnamed submission by Will_Wu | 59.058 | 69.436 |
77 Jan 10, 2018 | Unnamed submission by null | 58.764 | 69.276 |
78 Aug 27, 2016 | Match-LSTM with Ans-Ptr (Sentence) Singapore Management University https://arxiv.org/abs/1608.07905 | 54.505 | 67.748 |
79 Nov 14, 2018 | Unnamed submission by jinhyuklee | 52.544 | 62.780 |
80 Oct 26, 2018 | Unnamed submission by minjoon | 52.533 | 62.757 |