NLPExplorer

Automatically Constructing a Normalisation Dictionary for Microblogs

Bo Han | Paul Cook | Timothy Baldwin |

Paper Details:

Month: July
Year: 2012
Location: Jeju Island, Korea
Venue: CoNLL | EMNLP |

Citations

Morphological Analysis for Japanese Noisy Text based on Character-level and Word-level Normalization
Itsumi Saito | Kugatsu Sadamitsu | Hisako Asano | Yoshihiro Matsuo |

Paraphrasing 4 Microblog Normalization
Wang Ling | Chris Dyer | Alan W Black | Isabel Trancoso |

Accurate Word Segmentation and POS Tagging for Japanese Microblogs: Corpus Annotation and Joint Modeling with Lexical Normalization
Nobuhiro Kaji | Masaru Kitsuregawa |

A Graph-based Approach for Contextual Text Normalization
Cagil Sönmez | Arzucan Özgür |

Motivating Personality-aware Machine Translation
Shachar Mirkin | Scott Nowson | Caroline Brun | Julien Perez |

The Language of Place: Semantic Value from Geospatial Context
Anne Cocos | Chris Callison-Burch |

DIMSIM: An Accurate Chinese Phonetic Similarity Algorithm Based on Learned High Dimensional Encoding
Min Li | Marina Danilevsky | Sara Noeman | Yunyao Li |

Creating dialect sub-corpora by clustering: a case in Japanese for an adaptive method
Yo Sato | Kevin Heffernan |

What to do about bad language on the internet
Jacob Eisenstein |

An In-depth Analysis of the Effect of Text Normalization in Social Media
Tyler Baldwin | Yunyao Li |

Mining Informal Language from Chinese Microtext: Joint Word Recognition and Segmentation
Aobo Wang | Min-Yen Kan |

Social Text Normalization using Contextual Graph Random Walks
Hany Hassan | Arul Menezes |

Normalizing tweets with edit scripts and recurrent neural embeddings
Grzegorz Chrupała |

Improving Text Normalization via Unsupervised Model and Discriminative Reranking
Chen Li | Yang Liu |

Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings
Luchen Tan | Haotian Zhang | Charles Clarke | Mark Smucker |

Mining Cross-Cultural Differences and Similarities in Social Media
Bill Yuchen Lin | Frank F. Xu | Kenny Zhu | Seung-won Hwang |

Improving Topic Models with Latent Feature Word Representations
Dat Quoc Nguyen | Richard Billingsley | Lan Du | Mark Johnson |

Unsupervised Word Usage Similarity in Social Media Texts
Spandana Gella | Paul Cook | Bo Han |

CDTDS: Predicting Paraphrases in Twitter via Support Vector Regression
Rafael Michael Karampatsis |

Gathering and Generating Paraphrases from Twitter with Application to Normalization
Wei Xu | Alan Ritter | Ralph Grishman |

Mining Lexical Variants from Microblogs: An Unsupervised Multilingual Approach
Alejandro Mosquera | Paloma Moreda Pozo |

Experiments to Improve Named Entity Recognition on Turkish Tweets
Dilek Küçük | Ralf Steinberger |

Code Mixing: A Challenge for Language Identification in the Language of Social Media
Utsab Barman | Amitava Das | Joachim Wagner | Jennifer Foster |

DCU-UVT: Word-Level Language Classification with Code-Mixed Data
Utsab Barman | Joachim Wagner | Grzegorz Chrupała | Jennifer Foster |

Identifying Languages at the Word Level in Code-Mixed Indian Social Media Text
Amitava Das | Björn Gambäck |

Unsupervised Text Normalization Using Distributed Representations of Words and Phrases
Vivek Kumar Rangarajan Sridhar |

NCSU_SAS_WOOKHEE: A Deep Contextual Long-Short Term Memory Model for Text Normalization
Wookhee Min | Bradford Mott |

USZEGED: Correction Type-sensitive Normalization of English Tweets Using Efficiently Indexed n-gram Statistics
Gábor Berend | Ervin Tasnádi |

Exploring Word Embeddings for Unsupervised Textual User-Generated Content Normalization
Thales Felipe Costa Bertaglia | Maria das Graças Volpe Nunes |

Unraveling the English-Bengali Code-Mixing Phenomenon
Arunavha Chanda | Dipankar Das | Chandan Mazumdar |

Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description
Arunavha Chanda | Dipankar Das | Chandan Mazumdar |

Evaluating hypotheses in geolocation on a very large sample of Twitter
Bahar Salehi | Anders Søgaard |

Language Identification and Analysis of Code-Switched Social Media Text
Deepthi Mave | Suraj Maharjan | Thamar Solorio |

Field Of Study

Task

Language Identification | Sentiment Analysis | Machine Translation |

Language

English

Dataset

Social Media | Twitter |

Similar Papers

Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP

Edoardo Maria Ponti | Roi Reichart | Anna Korhonen | Ivan Vulić |

Enriching Word Vectors with Subword Information

Piotr Bojanowski | Edouard Grave | Armand Joulin | Tomas Mikolov |

Knowledge-Rich Morphological Priors for Bayesian Language Models

Victor Chahuneau | Noah A. Smith | Chris Dyer |

A Comparative Study of Minimally Supervised Morphological Segmentation

A Survey of Arabic Named Entity Recognition and Classification

Khaled Shaalan |