langid.py: An Off-the-shelf Language Identification Tool
Marco Lui | Timothy Baldwin
Paper Details:
Month: July
Year: 2012
Location: Jeju Island, Korea
Venue: ACL
Citations
URL
Word Level Language Identification in Online Multilingual Communication
Dong Nguyen | A. Seza Doğruöz
A Graph-based Approach for Contextual Text Normalization
Cagil Sönmez | Arzucan Özgür
Non-linear Mapping for Improved Identification of 1300+ Languages
Ralf Brown
One Sense per Tweeter ... and Other Lexical Semantic Tales of Twitter
Spandana Gella | Paul Cook | Timothy Baldwin
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
Olutobi Owoputi | Brendan O’Connor | Chris Dyer | Kevin Gimpel | Nathan Schneider | Noah A. Smith
Dirt Cheap Web-Scale Parallel Text from the Common Crawl
Jason R. Smith | Herve Saint-Amand | Magdalena Plamada | Philipp Koehn | Chris Callison-Burch | Adam Lopez
Task Alternation in Parallel Sentence Retrieval for Twitter Translation
Felix Hieber | Laura Jehl | Stefan Riezler
Crawling microblogging services to gather language-classified URLs. Workflow and case study
Adrien Barbaresi
A Stacking-based Approach to Twitter User Geolocation Prediction
Bo Han | Paul Cook | Timothy Baldwin
Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models
Jey Han Lau | Paul Cook | Diana McCarthy | Spandana Gella | Timothy Baldwin
Real-Time Detection, Tracking, and Monitoring of Automatically Discovered Events in Social Media
Miles Osborne | Sean Moran | Richard McCreadie | Alexander Von Lunen | Martin Sykora | Elizabeth Cano | Neil Ireson | Craig Macdonald | Iadh Ounis | Yulan He | Tom Jackson | Fabio Ciravegna | Ann O’Brien
Unsupervised Word Usage Similarity in Social Media Texts
Spandana Gella | Paul Cook | Bo Han
Learning the Peculiar Value of Actions
Daniel Dahlmeier
Sentiment analysis on Italian tweets
Valerio Basile | Malvina Nissim
Tunable Distortion Limits and Corpus Cleaning for SMT
Sara Stymne | Christian Hardmeier | Jörg Tiedemann | Joakim Nivre
Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources
Adrien Barbaresi
Focused Web Corpus Crawling
Roland Schäfer | Adrien Barbaresi | Felix Bildhauer
Accurate Language Identification of Twitter Messages
Marco Lui | Timothy Baldwin
Language variety identification in Spanish tweets
Wolfgang Maier | Carlos Gómez-Rodríguez
A Report on the DSL Shared Task 2014
Marcos Zampieri | Liling Tan | Nikola Ljubešić | Jörg Tiedemann
Exploring Methods and Resources for Discriminating Similar Languages
Marco Lui | Ned Letcher | Oliver Adams | Long Duong | Paul Cook | Timothy Baldwin
http://numpy.scipy.org
http://www.wsgi.org
http://www.wikipedia.org
http://www.csse.unimelb.edu.au/research/lt/resources/langid/
http://code.google.com/p/language-detection/
http://odur.let.rug.nl/vannoord/TextCat/
http://code.google.com/p/chromium-compact-language-detector/
http://www.twitter.com
http://semiocast.com/downloads/
http://boston.lti.cs
Field Of Study
Task: Language Identification, Machine Translation, Text Categorization, Biomedical
Language: English
Dataset: News, Encyclopedia, Social Media, Twitter, Blogs
Similar Papers
One-Shot Neural Cross-Lingual Transfer for Paradigm Completion
Katharina Kann | Ryan Cotterell | Hinrich Schütze
A treebank-based study on the influence of Italian word order on parsing performance
Anita Alicante | Cristina Bosco | Anna Corazza | Alberto Lavelli
Integrating Graph-Based and Transition-Based Dependency Parsers
Joakim Nivre | Ryan McDonald
Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models
Arne Mauser | Saša Hasan | Hermann Ney
Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language
Nizar Habash | Jun Hu