Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
Jason R. Smith | Chris Quirk | Kristina Toutanova
Paper Details:
Month: June
Year: 2010
Location: Los Angeles, California
Venue: NAACL
Citations
Langforia: Language Pipelines for Annotating Large Collections of Documents
Marcus Klang | Pierre Nugues
Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction
Ahmad Aghaebrahimian
Extracting Parallel Sentences with Bidirectional Recurrent Neural Networks to Improve Machine Translation
Francis Grégoire | Philippe Langlais
Monolingual Marginal Matching for Translation Model Adaptation
Ann Irvine | Chris Quirk | Hal Daumé III
An Iterative Link-based Method for Parallel Web Page Mining
Le Liu | Yu Hong | Jun Lu | Jun Lang | Heng Ji | Jianmin Yao
Improving Statistical Machine Translation with a Multilingual Paraphrase Database
Ramtin Mehdizadeh Seraj | Maryam Siahbani | Anoop Sarkar
Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition
Tao Ge | Qing Dou | Heng Ji | Lei Cui | Baobao Chang | Zhifang Sui | Furu Wei | Ming Zhou
Toward Statistical Machine Translation without Parallel Corpora
Alexandre Klementiev | Ann Irvine | Chris Callison-Burch | David Yarowsky
Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs
Fernando Alva-Manchego | Joachim Bingel | Gustavo Paetzold | Carolina Scarton | Lucia Specia
Cross-Lingual Named Entity Recognition via Wikification
Chen-Tse Tsai | Stephen Mayhew | Dan Roth
Dual Subtitles as Parallel Corpora
Shikun Zhang | Wang Ling | Chris Dyer
Constructing a Chinese—Japanese Parallel Corpus from Wikipedia
Chenhui Chu | Toshiaki Nakazawa | Sadao Kurohashi
Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor’s Love Affair
Nikola Ljubešić | Miquel Esplà-Gomis | Antonio Toral | Sergio Ortiz Rojas | Filip Klubička
Linking, Searching, and Visualizing Entities in Wikipedia
Marcus Klang | Pierre Nugues
Text Simplification from Professionally Produced Corpora
Carolina Scarton | Gustavo Paetzold | Lucia Specia
A Multilingual Dataset for Evaluating Parallel Sentence Extraction from Comparable Corpora
Pierre Zweigenbaum | Serge Sharoff | Reinhard Rapp
Transliteration Mining Using Large Training and Test Sets
Ali El Kahki | Kareem Darwish | Mohamed Abdul-Wahab | Ahmed Taei
Why Not Grab a Free Lunch? Mining Large Corpora for Parallel Sentences to Improve Translation Modeling
Ferhan Ture | Jimmy Lin
Statistical Machine Translation in Low Resource Settings
Ann Irvine
Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
Kriste Krstovski | David Smith
Ten Pairs to Tag – Multilingual POS Tagging via Coarse Mapping between Embeddings
Yuan Zhang | David Gaddy | Regina Barzilay | Tommi Jaakkola
Identifying Semantic Divergences in Parallel Text without Annotations
Yogarshi Vyas | Xing Niu | Marine Carpuat
Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced from Comparable Corpora
Sree Harsha Ramesh | Krishna Prasad Sankaranarayanan
Crowdsourcing Translation: Professional Quality from Non-Professionals
Omar F. Zaidan | Chris Callison-Burch
Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
Sungchul Kim | Kristina Toutanova | Hwanjo Yu
ACCURAT Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora
Mārcis Pinnis | Radu Ion | Dan Ştefănescu | Fangzhong Su | Inguna Skadiņa | Andrejs Vasiļjevs | Bogdan Babych
Microblogs as Parallel Corpora
Wang Ling | Guang Xiang | Chris Dyer | Alan Black | Isabel Trancoso
Dirt Cheap Web-Scale Parallel Text from the Common Crawl
Jason R. Smith | Herve Saint-Amand | Magdalena Plamada | Philipp Koehn | Chris Callison-Burch | Adam Lopez
Are Two Heads Better than One? Crowdsourced Translation via a Two-Step Collaboration of Non-Professional Translators and Editors
Rui Yan | Mingkun Gao | Ellie Pavlick | Chris Callison-Burch
Learning Translational and Knowledge-based Similarities from Relevance Rankings for Cross-Language Retrieval
Shigehiko Schamoni | Felix Hieber | Artem Sokolov | Stefan Riezler
Set-Theoretic Alignment for Comparable Corpora
Thierry Etchegoyhen | Andoni Azpeitia
Two Ways to Use a Noisy Parallel News Corpus for Improving Statistical Machine Translation
Souhir Gahbiche-Braham | Hélène Bonneau-Maynard | François Yvon
Paraphrase Fragment Extraction from Monolingual Comparable Corpora
Rui Wang | Chris Callison-Burch
Extracting Parallel Phrases from Comparable Data
Sanjika Hewavitharana | Stephan Vogel
Identifying Parallel Documents from a Large Bilingual Collection of Texts: Application to Parallel Article Extraction in Wikipedia.
Alexandre Patry | Philippe Langlais
A Minimally Supervised Approach for Detecting and Ranking Document Translation Pairs
Kriste Krstovski | David A. Smith
Crisis MT: Developing A Cookbook for MT in Crisis Situations
William Lewis | Robert Munro | Stephan Vogel
Measuring Comparability of Documents in Non-Parallel Corpora for Efficient Extraction of (Semi-)Parallel Translation Equivalents
Fangzhong Su | Bogdan Babych
Combining Bilingual and Comparable Corpora for Low Resource Machine Translation
Ann Irvine | Chris Callison-Burch
Evaluating (and Improving) Sentence Alignment under Noisy Conditions
Omar Zaidan | Vishal Chowdhary
Chinese–Japanese Parallel Sentence Extraction from Quasi–Comparable Corpora
Chenhui Chu | Toshiaki Nakazawa | Sadao Kurohashi
Improving MT System Using Extracted Parallel Fragments of Text from Comparable Corpora
Rajdeep Gupta | Santanu Pal | Sivaji Bandyopadhyay
Mining for Domain-specific Parallel Text from Wikipedia
Magdalena Plamadă | Martin Volk
Automatic Building and Using Parallel Resources for SMT from Comparable Corpora
Santanu Pal | Partha Pakray | Sudip Kumar Naskar
Hallucinating Phrase Translations for Low Resource MT
Ann Irvine | Chris Callison-Burch
Crowdsourcing High-Quality Parallel Data Extraction from Twitter
Wang Ling | Luís Marujo | Chris Dyer | Alan W. Black | Isabel Trancoso
Consistent Improvement in Translation Quality of Chinese-Japanese Technical Texts by Adding Additional Quasi-parallel Training Data
Wei Yang | Yves Lepage
A Factory of Comparable Corpora from Wikipedia
Alberto Barrón-Cedeño | Cristina España-Bonet | Josu Boldoba | Lluís Màrquez
AUT Document Alignment Framework for BUCC Workshop Shared Task
Atefeh Zafarian | Amir Pouya Agha Sadeghi | Fatemeh Azadi | Sonia Ghiasifard | Zeinab Ali Panahloo | Somayeh Bakhshaei | Seyyed Mohammad Mohammadzadeh Ziabary
LINA: Identifying Comparable Documents from Wikipedia
Emmanuel Morin | Amir Hazem | Florian Boudin | Elizaveta Loginova-Clouet
Pairing Wikipedia Articles Across Languages
Marcus Klang | Pierre Nugues
Users and Data: The Two Neglected Children of Bilingual Natural Language Processing Research
Philippe Langlais
Weighted Set-Theoretic Alignment of Comparable Sentences
Andoni Azpeitia | Thierry Etchegoyhen | Eva Martínez Garcia
zNLP: Identifying Parallel Sentences in Chinese-English Comparable Corpora
Zheng Zhang | Pierre Zweigenbaum
Learning Bilingual Projections of Embeddings for Vocabulary Expansion in Machine Translation
Pranava Swaroop Madhyastha | Cristina España-Bonet
Detecting Cross-Lingual Semantic Divergence for Neural Machine Translation
Marine Carpuat | Yogarshi Vyas | Xing Niu
URL: http://research.microsoft.com/en-
Field Of Study
Task: Machine Translation
Language: Multilingual, Chinese, English, Japanese, Spanish, French
Dataset: News, Encyclopedia
Similar Papers
Neural Morphological Tagging of Lemma Sequences for Machine Translation
Costanza Conforti | Matthias Huck | Alexander Fraser
Enriching Word Vectors with Subword Information
Piotr Bojanowski | Edouard Grave | Armand Joulin | Tomas Mikolov
Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data
Mona Diab | Mahmoud Ghoneim | Abdelati Hawwari | Fahad AlGhamdi | Nada AlMarwani | Mohamed Al-Badrashiny
Semi-supervised Structured Prediction with Neural CRF Autoencoder
Xiao Zhang | Yong Jiang | Hao Peng | Kewei Tu | Dan Goldwasser
A Survey of Arabic Named Entity Recognition and Classification
Khaled Shaalan