Automatic Translation Error Analysis
Members
Mark Fishel, Dan Zeman, Maja Popović, Jan Berka, Ondřej Bojar, Joachim van den Bogaert, Suhel Jaber, Arianna Bisazza, Sabine Hunsicker, Martin Popel
Links
Project Guide
Addicter Home
MTM Slides
Hjerson Home
MTM Slides
Summary
- Cross-evaluate Addicter and Hjerson on each other's datasets
- Get hands dirty with applying both systems to a dataset of your choice
- Improve both systems
Some analysis
- Hjerson is also evaluated on ranking error types and systems (less strict than prec/rec, still very useful)
- Hjerson overuses lex (and reord); Addicter overuses miss and extra much more seriously, so Addicter needs a less restrictive alignment (and Hjerson's alignment could possibly be restricted a bit)
Result summary:
- applying both to WMT'11 en-de (Sabine)
- applying both to IWSLT'11 ar-en (Arianna, Suhel)
- evaluating both on WMT'09 de-en
- Berkeley alignment (Arianna)
- greedy smarter alignment (Martin)
- evaluating Hjerson on WMT'09 en-cz (Maja)
- lemma-WER Hjerson (Maja)
- friendlier Addicter (Dan)
De->En data set, overall accuracy, Addicter and Hjerson:
flexible human error analysis
MT system | jane | pbt | rpbt | mix |
berk-addicter | 41.1/41.0 | 48.0/48.4 | 46.8/46.8 | 45.3/45.2 |
berkwmt-addicter | 51.2/51.2 | 55.4/56.8 | 55.0/54.9 | 53.9/54.2 |
greedy-addicter | 49.7/50.7 | 55.9/55.9 | 54.4/53.9 | 53.3/53.4 |
hmm-addicter | 48.9/48.8 | 55.4/53.2 | 54.4/51.1 | 52.9/51.0 |
default-hjerson | 51.8/53.6 | 56.1/57.3 | 54.3/57.3 | 54.0/56.0 |
lemmawer-hjerson | 51.7/53.7 | 56.2/57.7 | 53.7/57.3 | 53.9/56.1 |
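The page does not give the formula for overall accuracy. One reading that fits the numbers is the share of tokens on which the automatic and manual labels agree, i.e. the diagonal of a confusion table divided by its total. The sketch below assumes that reading; the counts are copied from the berk addicter confusion tables in the detailed De->En results, and the function name is illustrative only.

```python
# Confusion counts for "berk addicter" (rows: automatic label, columns: manual label).
# Reference side, label order: -, infl, lex, miss, reord
REF = [
    [1659, 9, 50, 34, 44],
    [53, 21, 1, 7, 2],
    [781, 5, 282, 148, 35],
    [764, 11, 180, 274, 45],
    [529, 2, 12, 74, 69],
]
# Hypothesis side, label order: -, ext, infl, lex, reord
HYP = [
    [1669, 13, 9, 52, 53],
    [638, 70, 6, 93, 40],
    [54, 3, 22, 3, 2],
    [859, 57, 13, 262, 45],
    [567, 30, 3, 14, 87],
]


def overall_accuracy(matrix):
    """Percentage of tokens whose automatic label equals the manual label."""
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * agree / total


print(f"{overall_accuracy(REF):.1f}/{overall_accuracy(HYP):.1f}")
```

This prints 45.3/45.2, which matches the berk-addicter entry in the mix column. That suggests, without confirming, that each pair in the table is reference-side/hypothesis-side accuracy.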
En->Cz data set, overall accuracy, Hjerson:
free human error analysis
MT system | bojar 1/2 | tectomt 1/2 | google 1/2 | pctrans 1/2 |
standard WER | 46.9/46.5 | 44.0/44.2 | 45.3/44.9 | 43.3/43.1 |
lemma WER | 48.3/48.0 | 44.6/45.0 | 46.3/46.1 | 43.9/43.9 |
Detailed results
En->Cz data set, Hjerson:
Rank correlations (annotator1/annotator2/Hjerson)
| infl | reord | miss | ext | lex | rho |
bojar | 333/320/459 | 72/66/474 | 149/134/313 | 147/142/240 | 379/309/1527 | 0.400/0.100 |
tectomt | 310/319/450 | 85/71/450 | 122/108/312 | 207/166/226 | 612/528/1813 | 0.475/0.475 |
google | 360/341/494 | 81/74/542 | 80/64/175 | 190/172/368 | 369/319/1523 | 0.700/0.500 |
pctrans | 351/341/428 | 95/89/506 | 69/57/237 | 168/160/326 | 467/412/1786 | 0.700/0.700 |
rho | 0.400/0.150 | 0.400/0.800 | 0.800/0.800 | -0.200/0.400 | 0.800/1.000 |
lemma-WER:
| infl | reord | miss | ext | lex | rho |
bojar | 333/320/459 | 72/66/424 | 149/134/374 | 147/142/296 | 379/309/1462 | 0.700/0.500 |
tectomt | 310/319/450 | 85/71/447 | 122/108/367 | 207/166/278 | 612/528/1751 | 0.600/0.600 |
google | 360/341/494 | 81/74/506 | 80/64/216 | 190/172/420 | 369/319/1473 | 0.700/0.500 |
pctrans | 351/341/428 | 95/89/482 | 69/57/273 | 168/160/354 | 467/412/1741 | 0.700/0.700 |
rho | 0.400/0.150 | 0.400/0.800 | 0.800/0.800 | -0.200/0.400 | 0.800/0.600 |
Confusions (hypothesis only):
bojar 1/2 | infl | reord | ext | lex | x |
infl | 154/135 | 14/6 | 8/7 | 9/10 | 282/253 |
reord | 9/6 | 21/20 | 10/7 | 19/22 | 417/368 |
ext | 20/20 | 8/8 | 21/24 | 47/35 | 149/139 |
lex | 149/161 | 40/34 | 97/97 | 284/227 | 968/855 |
x | 19/18 | 10/13 | 11/7 | 33/29 | 1527/1371 |
tectomt 1/2 | infl | reord | ext | lex | x |
infl | 118/119 | 19/11 | 9/7 | 29/21 | 287/260 |
reord | 9/10 | 26/22 | 18/13 | 30/28 | 397/350 |
ext | 23/22 | 11/6 | 40/20 | 68/57 | 90/104 |
lex | 168/178 | 48/49 | 129/117 | 459/409 | 1041/897 |
x | 11/11 | 9/12 | 12/9 | 42/32 | 1235/1112 |
google 1/2 | infl | reord | ext | lex | x |
infl | 170/145 | 17/9 | 13/9 | 9/13 | 297/274 |
reord | 15/9 | 20/20 | 13/12 | 14/15 | 482/417 |
ext | 29/30 | 10/10 | 41/37 | 59/55 | 237/215 |
lex | 148/158 | 31/31 | 113/106 | 275/226 | 1034/904 |
x | 17/16 | 14/14 | 10/8 | 32/25 | 1598/1433 |
pctrans 1/2 | infl | reord | ext | lex | x |
infl | 120/125 | 16/11 | 10/12 | 10/12 | 280/239 |
reord | 8/3 | 29/24 | 8/7 | 11/18 | 452/387 |
ext | 38/33 | 10/12 | 46/38 | 55/48 | 181/176 |
lex | 186/188 | 49/44 | 101/101 | 376/319 | 1117/978 |
x | 9/11 | 15/19 | 4/2 | 27/29 | 1368/1215 |
lemma-WER:
bojar 1/2 | infl | reord | ext | lex | x |
infl | 154/135 | 14/6 | 8/7 | 9/10 | 279/250 |
reord | 5/5 | 20/18 | 8/7 | 16/20 | 377/325 |
ext | 23/24 | 9/10 | 35/29 | 52/41 | 184/181 |
lex | 145/156 | 39/32 | 82/92 | 277/221 | 922/800 |
x | 24/20 | 11/15 | 14/7 | 38/31 | 1581/1430 |
tectomt 1/2 | infl | reord | ext | lex | x |
infl | 118/119 | 19/11 | 9/7 | 29/21 | 287/260 |
reord | 6/9 | 26/23 | 18/13 | 27/27 | 372/324 |
ext | 20/23 | 12/6 | 40/29 | 77/60 | 127/133 |
lex | 167/171 | 47/49 | 129/107 | 449/406 | 1001/866 |
x | 18/18 | 9/11 | 12/10 | 46/33 | 1263/1140 |
google 1/2 | infl | reord | ext | lex | x |
infl | 170/145 | 16/8 | 13/9 | 9/13 | 297/274 |
reord | 13/9 | 21/20 | 11/9 | 13/17 | 450/385 |
ext | 29/31 | 11/10 | 52/51 | 68/62 | 268/244 |
lex | 145/156 | 34/31 | 110/90 | 265/217 | 991/862 |
x | 22/17 | 14/15 | 14/13 | 34/25 | 1642/1478 |
pctrans 1/2 | infl | reord | ext | lex | x |
infl | 119/125 | 16/11 | 10/12 | 10/11 | 279/238 |
reord | 7/3 | 28/23 | 7/5 | 13/19 | 428/370 |
ext | 28/28 | 6/10 | 56/50 | 64/49 | 199/195 |
lex | 194/192 | 48/46 | 90/89 | 366/317 | 1094/953 |
x | 13/12 | 16/20 | 6/4 | 26/30 | 1398/1239 |
Recall:
bojar 1, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 43.87/43.87 | 2.56/1.42 | 5.70/6.55 | 42.45/41.31 | 5.41/6.84 |
hum reord | 15.05/15.05 | 22.58/21.51 | 8.60/9.68 | 43.01/41.94 | 10.75/11.83 |
hum ext | 5.44/5.44 | 6.80/5.44 | 14.29/23.81 | 65.99/55.78 | 7.48/9.52 |
hum lex | 2.30/2.30 | 4.85/4.08 | 11.99/13.27 | 72.45/70.66 | 8.42/9.69 |
hum x | 8.44/8.35 | 12.47/11.28 | 4.46/5.50 | 28.96/27.58 | 45.68/47.29 |
bojar 2, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 39.71/39.71 | 1.76/1.47 | 5.88/7.06 | 47.35/45.88 | 5.29/5.88 |
hum reord | 7.41/7.41 | 24.69/22.22 | 9.88/12.35 | 41.98/39.51 | 16.05/18.52 |
hum ext | 4.93/4.93 | 4.93/4.93 | 16.90/20.42 | 68.31/64.79 | 4.93/4.93 |
hum lex | 3.10/3.10 | 6.81/6.19 | 10.84/12.69 | 70.28/68.42 | 8.98/9.60 |
hum x | 8.47/8.37 | 12.32/10.88 | 4.66/6.06 | 28.63/26.79 | 45.91/47.89 |
tectomt 1, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 35.87/35.87 | 2.74/1.82 | 6.99/6.08 | 51.06/50.76 | 3.34/5.47 |
hum reord | 16.81/16.81 | 23.01/23.01 | 9.73/10.62 | 42.48/41.59 | 7.96/7.96 |
hum ext | 4.33/4.33 | 8.65/8.17 | 19.23/23.08 | 62.02/57.21 | 5.77/7.21 |
hum lex | 4.62/4.62 | 4.78/4.30 | 10.83/12.26 | 73.09/71.50 | 6.69/7.32 |
hum x | 9.41/9.41 | 13.02/12.20 | 2.95/4.16 | 34.13/32.82 | 40.49/41.41 |
tectomt 2, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 35.00/35.00 | 2.94/2.65 | 6.47/6.76 | 52.35/50.29 | 3.24/5.29 |
hum reord | 11.00/11.00 | 22.00/23.00 | 6.00/6.00 | 49.00/49.00 | 12.00/11.00 |
hum ext | 4.22/4.22 | 7.83/7.83 | 12.05/17.47 | 70.48/64.46 | 5.42/6.02 |
hum lex | 3.84/3.84 | 5.12/4.94 | 10.42/10.97 | 74.77/74.22 | 5.85/6.03 |
hum x | 9.55/9.55 | 12.85/11.90 | 3.82/4.88 | 32.94/31.80 | 40.84/41.87 |
google 1, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 44.85/44.85 | 3.96/3.43 | 7.65/7.65 | 39.05/38.26 | 4.49/5.80 |
hum reord | 17.71/16.67 | 20.83/21.88 | 10.42/11.46 | 37.50/35.42 | 13.54/14.58 |
hum ext | 6.84/6.50 | 6.84/5.50 | 21.58/26.00 | 59.47/55.00 | 5.26/7.00 |
hum lex | 2.31/2.31 | 3.60/3.34 | 15.17/17.48 | 70.69/68.12 | 8.23/8.74 |
hum x | 8.14/8.14 | 13.21/12.34 | 6.50/7.35 | 28.34/27.17 | 43.80/45.01 |
google 2, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 40.50/40.50 | 2.51/2.51 | 8.38/8.66 | 44.13/43.58 | 4.47/4.75 |
hum reord | 10.71/9.52 | 23.81/23.81 | 11.90/11.90 | 36.90/36.90 | 16.67/17.86 |
hum ext | 5.23/5.23 | 6.98/5.23 | 21.51/29.65 | 61.63/52.33 | 4.65/7.56 |
hum lex | 3.89/3.89 | 4.49/5.09 | 16.47/18.56 | 67.66/64.97 | 7.49/7.49 |
hum x | 8.45/8.45 | 12.86/11.87 | 6.63/7.52 | 27.88/26.58 | 44.19/45.58 |
pctrans 1, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 33.24/32.96 | 2.22/1.94 | 10.53/7.76 | 51.52/53.74 | 2.49/3.60 |
hum reord | 13.45/14.04 | 24.37/24.56 | 8.40/5.26 | 41.18/42.11 | 12.61/14.04 |
hum ext | 5.92/5.92 | 4.73/4.14 | 27.22/33.14 | 59.76/53.25 | 2.37/3.55 |
hum lex | 2.09/2.09 | 2.30/2.71 | 11.48/13.36 | 78.50/76.41 | 5.64/5.43 |
hum x | 8.24/8.21 | 13.30/12.60 | 5.33/5.86 | 32.87/32.20 | 40.26/41.14 |
pctrans 2, full/lemma | aut infl | aut reord | aut ext | aut lex | aut x |
hum infl | 34.72/34.72 | 0.83/0.83 | 9.17/7.78 | 52.22/53.33 | 3.06/3.33 |
hum reord | 10.00/10.00 | 21.82/20.91 | 10.91/9.09 | 40.00/41.82 | 17.27/18.18 |
hum ext | 7.50/7.50 | 4.38/3.12 | 23.75/31.25 | 63.12/55.62 | 1.25/2.50 |
hum lex | 2.82/2.58 | 4.23/4.46 | 11.27/11.50 | 74.88/74.41 | 6.81/7.04 |
hum x | 7.98/7.95 | 12.92/12.35 | 5.88/6.51 | 32.65/31.82 | 40.57/41.37 |
Precision:
bojar 1, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 32.98/33.19 | 3.00/3.02 | 1.71/1.72 | 1.93/1.94 | 60.39/60.13 |
aut reord | 1.89/1.17 | 4.41/4.69 | 2.10/1.88 | 3.99/3.76 | 87.61/88.50 |
aut ext | 8.16/7.59 | 3.27/2.97 | 8.57/11.55 | 19.18/17.16 | 60.82/60.73 |
aut lex | 9.69/9.90 | 2.60/2.66 | 6.31/5.60 | 18.47/18.91 | 62.94/62.94 |
aut x | 1.19/1.44 | 0.62/0.66 | 0.69/0.84 | 2.06/2.28 | 95.44/94.78 |
bojar 2, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 32.85/33.09 | 1.46/1.47 | 1.70/1.72 | 2.43/2.45 | 61.56/61.27 |
aut reord | 1.42/1.33 | 4.73/4.80 | 1.65/1.87 | 5.20/5.33 | 87.00/86.67 |
aut ext | 8.85/8.42 | 3.54/3.51 | 10.62/10.18 | 15.49/14.39 | 61.50/63.51 |
aut lex | 11.72/11.99 | 2.47/2.46 | 7.06/7.07 | 16.52/16.99 | 62.23/61.49 |
aut x | 1.25/1.33 | 0.90/1.00 | 0.49/0.47 | 2.02/2.06 | 95.34/95.14 |
tectomt 1, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 25.54/25.54 | 4.11/4.11 | 1.95/1.95 | 6.28/6.28 | 62.12/62.12 |
aut reord | 1.88/1.34 | 5.42/5.80 | 3.75/3.79 | 6.25/6.03 | 82.71/83.04 |
aut ext | 9.91/7.04 | 4.74/4.23 | 17.24/16.90 | 29.31/27.11 | 38.79/44.72 |
aut lex | 9.11/9.37 | 2.60/2.64 | 6.99/6.67 | 24.88/25.18 | 56.42/56.14 |
aut x | 0.84/1.33 | 0.69/0.67 | 0.92/1.11 | 3.21/3.40 | 94.35/93.49 |
tectomt 2, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 28.47/28.47 | 2.63/2.63 | 1.67/1.67 | 5.02/5.02 | 62.20/62.20 |
aut reord | 2.36/2.27 | 5.20/5.81 | 3.07/3.28 | 6.62/6.82 | 82.74/81.82 |
aut ext | 10.53/9.16 | 2.87/2.39 | 9.57/11.55 | 27.27/23.90 | 49.76/52.99 |
aut lex | 10.79/10.69 | 2.97/3.06 | 7.09/6.69 | 24.79/25.39 | 54.36/54.16 |
aut x | 0.94/1.49 | 1.02/0.91 | 0.77/0.83 | 2.72/2.72 | 94.56/94.06 |
google 1, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 33.60/33.66 | 3.36/3.17 | 2.57/2.57 | 1.78/1.78 | 58.70/58.81 |
aut reord | 2.76/2.56 | 3.68/4.13 | 2.39/2.17 | 2.57/2.56 | 88.60/88.58 |
aut ext | 7.71/6.78 | 2.66/2.57 | 10.90/12.15 | 15.69/15.89 | 63.03/62.62 |
aut lex | 9.22/9.39 | 2.24/2.20 | 7.04/7.12 | 17.12/17.15 | 64.38/64.14 |
aut x | 1.02/1.27 | 0.78/0.81 | 0.60/0.81 | 1.92/1.97 | 95.69/95.13 |
google 2, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 32.22/32.29 | 2.00/1.78 | 2.00/2.00 | 2.89/2.90 | 60.89/61.02 |
aut reord | 1.90/2.05 | 4.23/4.55 | 2.54/2.05 | 3.17/3.86 | 88.16/87.50 |
aut ext | 8.65/7.79 | 2.88/2.51 | 10.66/12.81 | 15.85/15.58 | 61.96/61.31 |
aut lex | 11.09/11.50 | 2.18/2.29 | 7.44/6.64 | 15.86/16.00 | 63.44/63.57 |
aut x | 1.07/1.10 | 0.94/0.97 | 0.53/0.84 | 1.67/1.61 | 95.79/95.48 |
pctrans 1, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 27.52/27.42 | 3.67/3.69 | 2.29/2.30 | 2.29/2.30 | 64.22/64.29 |
aut reord | 1.57/1.45 | 5.71/5.80 | 1.57/1.45 | 2.17/2.69 | 88.98/88.61 |
aut ext | 11.52/7.93 | 3.03/1.70 | 13.94/15.86 | 16.67/18.13 | 54.85/56.37 |
aut lex | 10.17/10.83 | 2.68/2.68 | 5.52/5.02 | 20.56/20.42 | 61.07/61.05 |
aut x | 0.63/0.89 | 1.05/1.10 | 0.28/0.41 | 1.90/1.78 | 96.13/95.82 |
pctrans 2, full/lemma | hum infl | hum reord | hum ext | hum lex | hum x |
aut infl | 31.33/31.49 | 2.76/2.77 | 3.01/3.02 | 3.01/2.77 | 59.90/59.95 |
aut reord | 0.68/0.71 | 5.47/5.48 | 1.59/1.19 | 4.10/4.52 | 88.15/88.10 |
aut ext | 10.75/8.43 | 3.91/3.01 | 12.38/15.06 | 15.64/14.76 | 57.33/58.73 |
aut lex | 11.53/12.02 | 2.70/2.88 | 6.20/5.57 | 19.57/19.85 | 60.00/59.67 |
aut x | 0.86/0.92 | 1.49/1.53 | 0.16/0.31 | 2.27/2.30 | 95.22/94.94 |
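The recall and precision tables appear to be the confusion counts normalised in two directions: dividing each count by its row total (everything Hjerson put in that class) reproduces a precision row, and dividing by its column total (everything the annotator put in that class) reproduces a recall row. A sketch under that assumption, using the standard-WER confusion counts for bojar, annotator 1; the function names are illustrative and not part of Hjerson.

```python
CLASSES = ["infl", "reord", "ext", "lex", "x"]

# Rows: Hjerson label; columns: annotator 1 label (bojar, standard WER).
BOJAR_1 = [
    [154, 14, 8, 9, 282],
    [9, 21, 10, 19, 417],
    [20, 8, 21, 47, 149],
    [149, 40, 97, 284, 968],
    [19, 10, 11, 33, 1527],
]


def row_percentages(matrix):
    """Scale every row so that it sums to 100 (all-zero rows stay zero)."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * c / total if total else 0.0 for c in row])
    return result


def column_percentages(matrix):
    """Transpose, then scale: one output row per original column."""
    return row_percentages([list(col) for col in zip(*matrix)])


precision = row_percentages(BOJAR_1)   # precision[i][j]: aut class i, hum class j
recall = column_percentages(BOJAR_1)   # recall[j][i]: hum class j, aut class i

print("aut infl:", [round(v, 2) for v in precision[0]])
print("hum infl:", [round(v, 2) for v in recall[0]])
```

The two printed rows can be compared against the first value of each pair in the "aut infl" precision row and the "hum infl" recall row for bojar 1.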
De->En data set, detailed results (confusion counts, precision, recall):
Evaluating berk addicter (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1659 | 9 | 50 | 34 | 44 | | - | 1669 | 13 | 9 | 52 | 53 |
infl | 53 | 21 | 1 | 7 | 2 | | ext | 638 | 70 | 6 | 93 | 40 |
lex | 781 | 5 | 282 | 148 | 35 | | infl | 54 | 3 | 22 | 3 | 2 |
miss | 764 | 11 | 180 | 274 | 45 | | lex | 859 | 57 | 13 | 262 | 45 |
reord | 529 | 2 | 12 | 74 | 69 | | reord | 567 | 30 | 3 | 14 | 87 |
precision | 0.92 | 0.25 | 0.23 | 0.22 | 0.10 | | precision | 0.93 | 0.08 | 0.26 | 0.21 | 0.12 |
recall | 0.44 | 0.44 | 0.54 | 0.51 | 0.35 | | recall | 0.44 | 0.40 | 0.42 | 0.62 | 0.38 |
f1-score | 0.59 | 0.32 | 0.32 | 0.30 | 0.16 | | f1-score | 0.60 | 0.14 | 0.32 | 0.32 | 0.19 |
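The precision, recall and f1-score rows follow from the counts above them: with automatic labels on the rows and manual labels on the columns, precision is the diagonal cell divided by its row sum, and recall is the diagonal cell divided by its column sum. A minimal sketch using the reference-side berk addicter counts; the function name is illustrative, and scoring an empty class as 0 is an assumption made here for convenience.

```python
LABELS = ["-", "infl", "lex", "miss", "reord"]

# Rows: automatic label; columns: manual label (berk addicter, reference side).
COUNTS = [
    [1659, 9, 50, 34, 44],
    [53, 21, 1, 7, 2],
    [781, 5, 282, 148, 35],
    [764, 11, 180, 274, 45],
    [529, 2, 12, 74, 69],
]


def per_class_scores(labels, counts):
    """Return {label: (precision, recall, f1)} for a square confusion matrix."""
    scores = {}
    for i, label in enumerate(labels):
        hit = counts[i][i]
        auto_total = sum(counts[i])                    # row sum
        manual_total = sum(row[i] for row in counts)   # column sum
        precision = hit / auto_total if auto_total else 0.0
        recall = hit / manual_total if manual_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


scores = per_class_scores(LABELS, COUNTS)
for label in LABELS:
    p, r, f = scores[label]
    print(f"{label:6s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

For infl this gives 21/84 = 0.25 precision and 21/48 = 0.44 recall, as in the table.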
Evaluating berkwmt addicter (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1934 | 6 | 41 | 25 | 49 | | - | 1931 | 3 | 9 | 47 | 65 |
infl | 93 | 28 | 6 | 4 | 2 | | ext | 570 | 103 | 9 | 93 | 34 |
lex | 531 | 3 | 259 | 94 | 13 | | infl | 96 | 3 | 29 | 1 | 3 |
miss | 655 | 9 | 211 | 337 | 24 | | lex | 580 | 35 | 2 | 274 | 14 |
reord | 439 | 0 | 10 | 68 | 107 | | reord | 475 | 22 | 1 | 8 | 111 |
precision | 0.94 | 0.21 | 0.29 | 0.27 | 0.17 | | precision | 0.94 | 0.13 | 0.22 | 0.30 | 0.18 |
recall | 0.53 | 0.61 | 0.49 | 0.64 | 0.55 | | recall | 0.53 | 0.62 | 0.58 | 0.65 | 0.49 |
f1-score | 0.68 | 0.31 | 0.36 | 0.38 | 0.26 | | f1-score | 0.68 | 0.21 | 0.32 | 0.41 | 0.26 |
Evaluating default hjerson (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1816 | 0 | 11 | 31 | 32 | | - | 1854 | 20 | 0 | 21 | 38 |
infl | 59 | 42 | 7 | 9 | 5 | | ext | 101 | 36 | 0 | 16 | 9 |
lex | 827 | 4 | 394 | 216 | 24 | | infl | 61 | 1 | 45 | 6 | 9 |
miss | 205 | 0 | 97 | 202 | 17 | | lex | 876 | 73 | 3 | 361 | 41 |
reord | 599 | 0 | 15 | 29 | 117 | | reord | 589 | 20 | 2 | 19 | 130 |
precision | 0.96 | 0.34 | 0.27 | 0.39 | 0.15 | | precision | 0.96 | 0.22 | 0.37 | 0.27 | 0.17 |
recall | 0.52 | 0.91 | 0.75 | 0.41 | 0.60 | | recall | 0.53 | 0.24 | 0.90 | 0.85 | 0.57 |
f1-score | 0.67 | 0.50 | 0.40 | 0.40 | 0.25 | | f1-score | 0.68 | 0.23 | 0.52 | 0.41 | 0.26 |
Evaluating greedy addicter (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1941 | 8 | 36 | 23 | 48 | | - | 1947 | 10 | 9 | 34 | 56 |
infl | 105 | 34 | 5 | 8 | 1 | | ext | 398 | 77 | 0 | 95 | 14 |
lex | 553 | 2 | 258 | 92 | 3 | | infl | 105 | 3 | 36 | 4 | 3 |
miss | 463 | 1 | 201 | 330 | 16 | | lex | 591 | 42 | 2 | 262 | 8 |
reord | 663 | 1 | 26 | 98 | 127 | | reord | 689 | 55 | 4 | 32 | 146 |
precision | 0.94 | 0.22 | 0.28 | 0.33 | 0.14 | | precision | 0.95 | 0.13 | 0.24 | 0.29 | 0.16 |
recall | 0.52 | 0.74 | 0.49 | 0.60 | 0.65 | | recall | 0.52 | 0.41 | 0.71 | 0.61 | 0.64 |
f1-score | 0.67 | 0.34 | 0.36 | 0.42 | 0.23 | | f1-score | 0.67 | 0.20 | 0.36 | 0.39 | 0.25 |
Evaluating hmm addicter (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1947 | 6 | 7 | 14 | 52 | | - | 1939 | 9 | 7 | 8 | 63 |
infl | 111 | 35 | 8 | 9 | 2 | | ext | 1143 | 138 | 3 | 402 | 50 |
lex | 0 | 0 | 0 | 0 | 0 | | infl | 116 | 2 | 37 | 5 | 4 |
miss | 1166 | 4 | 502 | 451 | 40 | | lex | 0 | 0 | 0 | 0 | 0 |
reord | 304 | 1 | 11 | 20 | 101 | | reord | 310 | 2 | 3 | 12 | 110 |
precision | 0.96 | 0.21 | 0.00 | 0.21 | 0.23 | | precision | 0.96 | 0.08 | 0.23 | 0.00 | 0.25 |
recall | 0.55 | 0.76 | 0.00 | 0.91 | 0.52 | | recall | 0.55 | 0.91 | 0.74 | 0.00 | 0.48 |
f1-score | 0.70 | 0.33 | 0.00 | 0.34 | 0.32 | | f1-score | 0.70 | 0.15 | 0.35 | 0.00 | 0.33 |
Evaluating lemmawer hjerson (ref / hyp tables; left: auto / top: manual):
| - | infl | lex | miss | reord | | | - | ext | infl | lex | reord |
- | 1822 | 0 | 11 | 29 | 33 | | - | 1861 | 20 | 0 | 20 | 38 |
infl | 59 | 42 | 7 | 9 | 5 | | ext | 102 | 35 | 0 | 18 | 10 |
lex | 814 | 4 | 394 | 228 | 24 | | infl | 61 | 1 | 45 | 6 | 8 |
miss | 214 | 0 | 97 | 189 | 17 | | lex | 870 | 74 | 3 | 359 | 40 |
reord | 597 | 0 | 15 | 32 | 116 | | reord | 587 | 20 | 2 | 20 | 131 |
precision | 0.96 | 0.34 | 0.27 | 0.37 | 0.15 | | precision | 0.96 | 0.21 | 0.37 | 0.27 | 0.17 |
recall | 0.52 | 0.91 | 0.75 | 0.39 | 0.59 | | recall | 0.53 | 0.23 | 0.90 | 0.85 | 0.58 |
f1-score | 0.67 | 0.50 | 0.40 | 0.38 | 0.24 | | f1-score | 0.69 | 0.22 | 0.53 | 0.41 | 0.27 |
De->En data set, detailed results (error ranking):
Ranking evaluation for addicter, hmm (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/69 | 98/173 | 109/695 | 67/645 | 173/0 | -0.100 |
pbt | 13/55 | 62/138 | 203/749 | 54/554 | 193/0 | 0.300 |
rpbt | 16/57 | 35/132 | 175/719 | 29/537 | 158/0 | 0.300 |
sys rank | 1.00 | 1.00 | 1.00 | 1.00 | 0.50 |
Ranking evaluation for addicter, greedy (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/66 | 98/380 | 109/293 | 67/243 | 173/326 | 0.700 |
pbt | 13/49 | 62/319 | 203/378 | 54/183 | 193/293 | 0.900 |
rpbt | 16/54 | 35/305 | 175/340 | 29/158 | 158/308 | 1.000 |
sys rank | 1.00 | 1.00 | 1.00 | 1.00 | -0.50 |
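The rho column and the sys rank row are consistent with Spearman's rank correlation between manual and automatic counts: across error types for one system (rho), and across systems for one error type (sys rank). The page does not name the coefficient, so that is an assumption. A minimal sketch with average ranks for ties, checked against the jane row and the lex column of the greedy Addicter table; the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Pearson correlation of the rank vectors; NaN if either side is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


# jane row, error types in table order: infl, reord, miss, ext, lex
jane_manual = [17, 98, 109, 67, 173]
jane_auto = [66, 380, 293, 243, 326]
print(round(spearman(jane_manual, jane_auto), 3))

# lex column, systems in table order: jane, pbt, rpbt
lex_manual = [173, 193, 158]
lex_auto = [326, 293, 308]
print(round(spearman(lex_manual, lex_auto), 3))
```

With only five error types or three systems per correlation, a single swapped rank moves the coefficient a lot, so these values are best read as coarse indicators.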
Ranking evaluation for addicter, berk (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/31 | 98/289 | 109/400 | 67/350 | 173/452 | 0.900 |
pbt | 13/26 | 62/244 | 203/441 | 54/246 | 193/412 | 0.900 |
rpbt | 16/28 | 35/228 | 175/433 | 29/251 | 158/409 | 0.900 |
sys rank | 1.00 | 1.00 | 1.00 | 0.50 | 0.50 |
Ranking evaluation for addicter, berkwmt (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/54 | 98/263 | 109/375 | 67/325 | 173/328 | 0.800 |
pbt | 13/42 | 62/210 | 203/448 | 54/253 | 193/292 | 0.900 |
rpbt | 16/47 | 35/210 | 175/413 | 29/231 | 158/299 | 0.900 |
sys rank | 1.00 | 0.86 | 1.00 | 1.00 | -0.50 |
Ranking evaluation for hjerson, default (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/47 | 98/289 | 109/142 | 67/85 | 173/490 | 0.900 |
pbt | 13/32 | 62/248 | 203/192 | 54/31 | 193/498 | 0.600 |
rpbt | 16/43 | 35/223 | 175/187 | 29/46 | 158/477 | 0.700 |
sys rank | 1.00 | 1.00 | 1.00 | 0.50 | 1.00 |
Ranking evaluation for hjerson, lemmawer (manual/automatic counts)
system | infl | reord | miss | ext | lex | rho |
jane | 17/47 | 98/289 | 109/140 | 67/86 | 173/491 | 0.900 |
pbt | 13/32 | 62/246 | 203/194 | 54/33 | 193/494 | 0.700 |
rpbt | 16/43 | 35/225 | 175/183 | 29/46 | 158/479 | 0.700 |
sys rank | 1.00 | 1.00 | 1.00 | 0.50 | 1.00 |