Re-evaluating the Role of Bleu in Machine Translation Research
Chris Callison-Burch
|
Miles Osborne
|
Philipp Koehn
|
Paper Details:
Month: April
Year: 2006
Location: Trento, Italy
Venue: EACL
Citations
Diagnostic Evaluation of Machine Translation Systems Using Automatically Constructed Linguistic Check-Points
Ming Zhou
|
Bo Wang
|
Shujie Liu
|
Mu Li
|
Dongdong Zhang
|
Tiejun Zhao
|
Choosing the Right Translation: A Syntactically Informed Classification Approach
Simon Zwarts
|
Mark Dras
|
Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings
Francisco Guzmán
|
Houda Bouamor
|
Ramy Baly
|
Nizar Habash
|
A Systematic Comparison of Training Criteria for Statistical Machine Translation
Richard Zens
|
Saša Hasan
|
Hermann Ney
|
Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
David Chiang
|
Steve DeNeefe
|
Yee Seng Chan
|
Hwee Tou Ng
|
Feasibility of Human-in-the-loop Minimum Error Rate Training
Omar F. Zaidan
|
Chris Callison-Burch
|
Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon’s Mechanical Turk
Chris Callison-Burch
|
Automatic Evaluation of Translation Quality for Distant Language Pairs
Hideki Isozaki
|
Tsutomu Hirao
|
Kevin Duh
|
Katsuhito Sudoh
|
Hajime Tsukada
|
Corroborating Text Evaluation Results with Heterogeneous Measures
Enrique Amigó
|
Julio Gonzalo
|
Jesús Giménez
|
Felisa Verdejo
|
A Human Judgement Corpus and a Metric for Arabic MT Evaluation
Houda Bouamor
|
Hanan Alshikhabobakr
|
Behrang Mohit
|
Kemal Oflazer
|
Human Effort and Machine Learnability in Computer Aided Translation
Spence Green
|
Sida I. Wang
|
Jason Chuang
|
Jeffrey Heer
|
Sebastian Schuster
|
Christopher D. Manning
|
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
Chia-Wei Liu
|
Ryan Lowe
|
Iulian Serban
|
Mike Noseworthy
|
Laurent Charlin
|
Joelle Pineau
|
Measuring the behavioral impact of machine translation quality improvements with A/B testing
Ben Russell
|
Duncan Gillespie
|
Further Investigation into Reference Bias in Monolingual Evaluation of Machine Translation
Qingsong Ma
|
Yvette Graham
|
Timothy Baldwin
|
Qun Liu
|
Identifying Well-formed Natural Language Questions
Manaal Faruqui
|
Dipanjan Das
|
Towards a Better Metric for Evaluating Question Generation Systems
Preksha Nema
|
Mitesh M. Khapra
|
A Comparison of Merging Strategies for Translation of German Compounds
Sara Stymne
|
Improving Evaluation of Document-level Machine Translation Quality Estimation
Yvette Graham
|
Qingsong Ma
|
Timothy Baldwin
|
Qun Liu
|
Carla Parra
|
Carolina Scarton
|
Heterogeneous Automatic MT Evaluation Through Non-Parametric Metric Combinations
Jesús Giménez
|
Lluís Màrquez
|
Automatic Evaluation of Information Ordering: Kendall’s Tau
Mirella Lapata
|
An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems
Ehud Reiter
|
Anja Belz
|
A Structured Review of the Validity of BLEU
Ehud Reiter
|
Applying Automated Metrics to Speech Translation Dialogs
Sherri Condon
|
Jon Phillips
|
Christy Doran
|
John Aberdeen
|
Dan Parvaz
|
Beatrice Oshika
|
Greg Sanders
|
Craig Schlenoff
|
Towards Heterogeneous Automatic MT Error Analysis
Jesús Giménez
|
Lluís Màrquez
|
Sensitivity of Automated MT Evaluation Metrics on Higher Quality MT Output: BLEU vs Task-Based Evaluation Methods
Bogdan Babych
|
Anthony Hartley
|
BLEU+: a Tool for Fine-Grained BLEU Computation
A. Cüneyd Tantuǧ
|
Kemal Oflazer
|
Ilknur Durgar El-Kahlout
|
Can we Evaluate the Quality of Generated Text?
David Hardcastle
|
Donia Scott
|
Improved Statistical Machine Translation Using Paraphrases
Chris Callison-Burch
|
Philipp Koehn
|
Miles Osborne
|
A Comparison of Pivot Methods for Phrase-Based Statistical Machine Translation
Masao Utiyama
|
Hitoshi Isahara
|
Graph-based Learning for Statistical Machine Translation
Andrei Alexandrescu
|
Katrin Kirchhoff
|
Exploiting Named Entity Classes in CCG Surface Realization
Rajakrishnan Rajkumar
|
Michael White
|
Dominic Espinosa
|
The Best Lexical Metric for Phrase-Based Statistical MT System Optimization
Daniel Cer
|
Christopher D. Manning
|
Daniel Jurafsky
|
Text Alignment for Real-Time Crowd Captioning
Iftekhar Naim
|
Daniel Gildea
|
Walter Lasecki
|
Jeffrey P. Bigham
|
Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
Zhen Yang
|
Wei Chen
|
Feng Wang
|
Bo Xu
|
Combination of Arabic Preprocessing Schemes for Statistical Machine Translation
Fatiha Sadat
|
Nizar Habash
|
GLEU: Automatic Evaluation of Sentence-Level Fluency
Andrew Mutton
|
Mark Dras
|
Stephen Wan
|
Robert Dale
|
A Re-examination of Machine Learning Approaches for Sentence-Level MT Evaluation
Joshua Albrecht
|
Rebecca Hwa
|
Enriching Morphologically Poor Languages for Statistical Machine Translation
Eleftherios Avramidis
|
Philipp Koehn
|
Robust Machine Translation Evaluation with Entailment Features
Sebastian Padó
|
Michel Galley
|
Dan Jurafsky
|
Christopher D. Manning
|
The Contribution of Linguistic Features to Automatic Machine Translation Evaluation
Enrique Amigó
|
Jesús Giménez
|
Julio Gonzalo
|
Felisa Verdejo
|
Comparing Objective and Subjective Measures of Usability in a Human-Robot Dialogue System
Mary Ellen Foster
|
Manuel Giuliani
|
Alois Knoll
|
The Backtranslation Score: Automatic MT Evaluation at the Sentence Level without Reference Translations
Reinhard Rapp
|
Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation
Michael Bloodgood
|
Chris Callison-Burch
|
Evaluating Machine Translations Using mNCD
Marcus Dobrinkat
|
Tero Tapiovaara
|
Jaakko Väyrynen
|
Kimmo Kettunen
|
PORT: a Precision-Order-Recall MT Evaluation Metric for Tuning
Boxing Chen
|
Roland Kuhn
|
Samuel Larkin
|
Improving machine translation by training against an automatic semantic frame based evaluation metric
Chi-kiu Lo
|
Karteek Addanki
|
Markus Saers
|
Dekai Wu
|
A New Syntactic Metric for Evaluation of Machine Translation
Melania Duma
|
Cristina Vertan
|
Wolfgang Menzel
|
XMEANT: Better semantic MT evaluation without reference translations
Chi-kiu Lo
|
Meriem Beloucif
|
Markus Saers
|
Dekai Wu
|
Language Models for Image Captioning: The Quirks and What Works
Jacob Devlin
|
Hao Cheng
|
Hao Fang
|
Saurabh Gupta
|
Li Deng
|
Xiaodong He
|
Geoffrey Zweig
|
Margaret Mitchell
|
deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets
Michel Galley
|
Chris Brockett
|
Alessandro Sordoni
|
Yangfeng Ji
|
Michael Auli
|
Chris Quirk
|
Margaret Mitchell
|
Jianfeng Gao
|
Bill Dolan
|
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Xin Wang
|
Wenhu Chen
|
Yuan-Fang Wang
|
William Yang Wang
|
Re-evaluating Machine Translation Results with Paraphrase Support
Liang Zhou
|
Chin-Yew Lin
|
Eduard Hovy
|
Contextual Bitext-Derived Paraphrases in Automatic MT Evaluation
Karolina Owczarzak
|
Declan Groves
|
Josef Van Genabith
|
Andy Way
|
Manual and Automatic Evaluation of Machine Translation between European Languages
Philipp Koehn
|
Christof Monz
|
Dependency-Based Automatic Evaluation for Machine Translation
Karolina Owczarzak
|
Josef van Genabith
|
Andy Way
|
CCG Supertags in Factored Statistical Machine Translation
Alexandra Birch
|
Miles Osborne
|
Philipp Koehn
|
Labelled Dependencies in Machine Translation Evaluation
Karolina Owczarzak
|
Josef van Genabith
|
Andy Way
|
(Meta-) Evaluation of Machine Translation
Chris Callison-Burch
|
Cameron Fordyce
|
Philipp Koehn
|
Christof Monz
|
Josh Schroeder
|
Statistical Post-Editing on SYSTRAN’s Rule-Based Translation System
Loïc Dugast
|
Jean Senellart
|
Philipp Koehn
|
Linguistic Features for Automatic Evaluation of Heterogenous MT Systems
Jesús Giménez
|
Lluís Màrquez
|
Further Meta-Evaluation of Machine Translation
Chris Callison-Burch
|
Cameron Fordyce
|
Philipp Koehn
|
Christof Monz
|
Josh Schroeder
|
Can we Relearn an RBMT System?
Loïc Dugast
|
Jean Senellart
|
Philipp Koehn
|
Automated Metrics That Agree With Human Judgements On Generated Output for an Embodied Conversational Agent
Mary Ellen Foster
|
Machine Translation Evaluation with Textual Entailment Features
Sebastian Padó
|
Michel Galley
|
Daniel Jurafsky
|
Christopher D. Manning
|
Statistical Post Editing and Dictionary Extraction: Systran/Edinburgh Submissions for ACL-WMT2009
Loic Dugast
|
Jean Senellart
|
Philipp Koehn
|
A Quantitative Analysis of Reordering Phenomena
Alexandra Birch
|
Phil Blunsom
|
Miles Osborne
|
Chunk-Based Verb Reordering in VSO Sentences for Arabic-English Statistical Machine Translation
Arianna Bisazza
|
Marcello Federico
|
The DCU Dependency-Based Metric in WMT-MetricsMATR 2010
Yifan He
|
Jinhua Du
|
Andy Way
|
Josef van Genabith
|
Structured vs. Flat Semantic Role Representations for Machine Translation Evaluation
Chi-kiu Lo
|
Dekai Wu
|
Evaluating Sentence Compression: Pitfalls and Suggested Remedies
Courtney Napoles
|
Benjamin Van Durme
|
Chris Callison-Burch
|
A Lightweight Evaluation Framework for Machine Translation Reordering
David Talbot
|
Hideto Kazawa
|
Hiroshi Ichikawa
|
Jason Katz-Brown
|
Masakazu Seno
|
Franz Och
|
AMBER: A Modified BLEU, Enhanced Ranking Metric
Boxing Chen
|
Roland Kuhn
|
Regression and Ranking based Optimisation for Sentence Level MT Evaluation
Xingyi Song
|
Trevor Cohn
|
Design of a hybrid high quality machine translation system
Bogdan Babych
|
Kurt Eberle
|
Johanna Geiß
|
Mireia Ginestí-Rosell
|
Anthony Hartley
|
Reinhard Rapp
|
Serge Sharoff
|
Martin Thomas
|
Fully Automatic Semantic MT Evaluation
Chi-kiu Lo
|
Anand Karthik Tumuluru
|
Dekai Wu
|
DFKI’s SMT System for WMT 2012
David Vilar
|
Discourse Structure and Computation: Past, Present and Future
Bonnie Webber
|
Aravind Joshi
|
Unsupervised vs. supervised weight estimation for semantic MT evaluation metrics
Chi-kiu Lo
|
Dekai Wu
|
A Finite-State Approach to Phrase-Based Statistical Machine Translation
Jorge González
|
MEANT at WMT 2013: A Tunable, Accurate yet Inexpensive Semantic Frame Based MT Evaluation Metric
Chi-kiu Lo
|
Dekai Wu
|
DiscoTK: Using Discourse Structure for Machine Translation Evaluation
Shafiq Joty
|
Francisco Guzmán
|
Lluís Màrquez
|
Preslav Nakov
|
Better Semantic Frame Based MT Evaluation via Inversion Transduction Grammars
Dekai Wu
|
Chi-kiu Lo
|
Meriem Beloucif
|
Markus Saers
|
A Comparison of MT Methods for Closely Related Languages: a Case Study on Czech - Slovak Language Pair
Vladislav Kuboň
|
Jernej Vičič
|
Lexical Access Preference and Constraint Strategies for Improving Multiword Expression Association within Semantic MT Evaluation
Dekai Wu
|
Chi-kiu Lo
|
Markus Saers
|
A Domain-Restricted, Rule Based, English-Hindi Machine Translation System Based on Dependency Parsing
Pratik Desai
|
Amit Sangodkar
|
Om P. Damani
|
Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation
Christian Hardmeier
|
Preslav Nakov
|
Sara Stymne
|
Jörg Tiedemann
|
Yannick Versley
|
Mauro Cettolo
|
Measuring ‘Registerness’ in Human and Machine Translation: A Text Classification Approach
Ekaterina Lapshinova-Koltunski
|
Mihaela Vela
|
Semantic Tuples for Evaluation of Image to Sentence Generation
Lily D. Ellebracht
|
Arnau Ramisa
|
Pranava Swaroop Madhyastha
|
Jose Cordero-Rama
|
Francesc Moreno-Noguer
|
Ariadna Quattoni
|
MT Tuning on RED: A Dependency-Based Evaluation Metric
Liangyou Li
|
Hui Yu
|
Qun Liu
|
Exploiting portability to build an RBMT prototype for a new source language
Nora Aranberri
|
Gorka Labaka
|
Arantza Díaz de Ilarraza
|
Kepa Sarasola
|
An Awkward Disparity between BLEU / RIBES Scores and Human Judgements in Machine Translation
Liling Tan
|
Jon Dehdari
|
Josef van Genabith
|
DFKI’s system for WMT16 IT-domain task, including analysis of systematic errors
Eleftherios Avramidis
|
Aljoscha Burchardt
|
Vivien Macketanz
|
Ankit Srivastava
|
CobaltF: A Fluent Metric for MT Evaluation
Marina Fomicheva
|
Núria Bel
|
Lucia Specia
|
Iria da Cunha
|
Anton Malinovskiy
|
Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations
Aaron Smith
|
Christian Hardmeier
|
Joerg Tiedemann
|
North-Sámi to Finnish rule-based machine translation system
Tommi Pirinen
|
Francis M. Tyers
|
Trond Trosterud
|
Ryan Johnson
|
Kevin Unhammer
|
Tiina Puolakainen
|
Data-driven Morphology and Sociolinguistics for Early Modern Dutch
Marijn Schraagen
|
Marjo van Koppen
|
Feike Dietz
|
The Helsinki Neural Machine Translation System
Robert Östling
|
Yves Scherrer
|
Jörg Tiedemann
|
Gongbo Tang
|
Tommi Nieminen
|
UHH Submission to the WMT17 Metrics Shared Task
Melania Duma
|
Wolfgang Menzel
|
bleu2vec: the Painfully Familiar Metric on Continuous Vector Space Steroids
Andre Tättar
|
Mark Fishel
|
A Call for Clarity in Reporting BLEU Scores
Matt Post
|
Exploring gap filling as a cheaper alternative to reading comprehension questionnaires when evaluating machine translation for gisting
Mikel L. Forcada
|
Carolina Scarton
|
Lucia Specia
|
Barry Haddow
|
Alexandra Birch
|
Field Of Study
Task
Word Sense Disambiguation
Language Generation
Machine Translation
Language
English
Similar Papers
Cross Language Dependency Parsing using a Bilingual Lexicon
Hai Zhao
|
Yan Song
|
Chunyu Kit
|
Guodong Zhou
|
Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models
Arne Mauser
|
Saša Hasan
|
Hermann Ney
|
Integrating Graph-Based and Transition-Based Dependency Parsers
Joakim Nivre
|
Ryan McDonald
|
A treebank-based study on the influence of Italian word order on parsing performance
Anita Alicante
|
Cristina Bosco
|
Anna Corazza
|
Alberto Lavelli
|
Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language
Nizar Habash
|
Jun Hu
|