Crawling microblogging services to gather language-classified URLs. Workflow and case study
Adrien Barbaresi
Paper Details:
Month: August
Year: 2013
Location: Sofia, Bulgaria
Venue: ACL
Citations
Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources
Adrien Barbaresi
Efficient construction of metadata-enhanced web corpora
Adrien Barbaresi
URL
http://www.comscore.com/Press
http://www.alexa.com/topsites/category/Top/News
http://www.reddit.com/r/norge+oslo+norskenyheter
http://www.abisource.com/projects/enchant/
http://w3techs.com/technologies/overview/content
https://github.com/adbar/microblog-explorer
Field Of Study
Task: Language Identification, Text Categorization
Language: Chinese, English, Hindi, Japanese, Spanish, French, Arabic
Dataset: News, Social Media, Twitter
Similar Papers
Expectation-Regulated Neural Model for Event Mention Extraction
Ching-Yun Chang, Zhiyang Teng, Yue Zhang
Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora
Andrius Mudinas, Dell Zhang, Mark Levene
A Corpus of Corporate Annual and Social Responsibility Reports: 280 Million Tokens of Balanced Organizational Writing
Sebastian G.M. Händschke, Sven Buechel, Jan Goldenstein, Philipp Poschmann, Tinghui Duan, Peter Walgenbach, Udo Hahn
Argumentation Mining in User-Generated Web Discourse
Ivan Habernal, Iryna Gurevych
A Joint Model of Conversational Discourse Latent Topics on Microblogs
Jing Li, Yan Song, Zhongyu Wei, Kam-Fai Wong