Language Identification: The Long and the Short of the Matter

Timothy Baldwin | Marco Lui

Paper Details:

Month: June
Year: 2010
Location: Los Angeles, California
Venue: NAACL

Citations

Word Level Language Identification in Online Multilingual Communication
Dong Nguyen | A. Seza Doğruöz

A Graph-based Approach for Contextual Text Normalization
Cagil Sönmez | Arzucan Özgür

Yet Another Language Identifier
Martin Majliš

LanideNN: Multilingual Language Identification on Text Stream
Tom Kocmi | Ondřej Bojar

What’s in a Domain? Learning Domain-Robust Text Representations using Adversarial Training
Yitong Li | Timothy Baldwin | Trevor Cohn

Lexical Normalisation of Short Text Messages: Makn Sens a #twitter
Bo Han | Timothy Baldwin

langid.py: An Off-the-shelf Language Identification Tool
Marco Lui | Timothy Baldwin

Automatic Detection and Language Identification of Multilingual Documents
Marco Lui | Jey Han Lau | Timothy Baldwin

Accurate Language Identification of Twitter Messages
Marco Lui | Timothy Baldwin

Short-Term Projects, Long-Term Benefits: Four Student NLP Projects for Low-Resource Languages
Alexis Palmer | Michaela Regneri

Code Mixing: A Challenge for Language Identification in the Language of Social Media
Utsab Barman | Amitava Das | Joachim Wagner | Jennifer Foster

Language Identification in Code-Switching Scenario
Naman Jain | Riyaz Ahmad Bhat

Language variety identification in Spanish tweets
Wolfgang Maier | Carlos Gómez-Rodríguez

Identifying Languages at the Word Level in Code-Mixed Indian Social Media Text
Amitava Das | Björn Gambäck

Improved Sentence-Level Arabic Dialect Classification
Christoph Tillmann | Saab Mansour | Yaser Al-Onaizan

Using Maximum Entropy Models to Discriminate between Similar Languages and Varieties
Jordi Porta | José-Luis Sancho

A Simple Baseline for Discriminating Similar Languages
Matthew Purver

A Language Detection System for Short Chats in Mobile Games
Pidong Wang | Nikhil Bojja | Shivasankari Kannan

Discriminating between Similar Languages Using PPM
Victoria Bobicev

Byte-based Language Identification with Deep Convolutional Networks
Johannes Bjerva

Language and Dialect Discrimination Using Compression-Inspired Language Models
Paul McNamee

An Unsupervised Morphological Criterion for Discriminating Similar Languages
Adrien Barbaresi

Tuning Bayes Baseline for Dialect Detection
Hector-Hugo Franco-Penya | Liliana Mamani Sanchez

N-gram and Neural Language Models for Discriminating Similar Languages
Andre Cianflone | Leila Kosseim

Discriminating between Similar Languages using Weighted Subword Features
Adrien Barbaresi

Towards Normalising Konkani-English Code-Mixed Social Media Text
Akshata Phadte | Gaurish Thakkar

Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
Adrien Barbaresi

Field Of Study

Task

Language Identification | Text Categorization

Language

Multilingual | Chinese | Japanese

Dataset

News

Similar Papers

A Bilingual Attention Network for Code-switched Emotion Prediction

Natural Language Processing for Dialectical Arabic: A Survey

Abdulhadi Shoufan | Sumaya Alameri

Sentiment after Translation: A Case-Study on Arabic Social Media Posts

Mohammad Salameh | Saif Mohammad | Svetlana Kiritchenko

Semi-supervised Structured Prediction with Neural CRF Autoencoder

A Survey of Arabic Named Entity Recognition and Classification

Khaled Shaalan