Evaluating Models of Latent Document Semantics in the Presence of OCR Errors
Daniel Walker, William B. Lund, Eric K. Ringger
Paper Details:
Month: October
Year: 2010
Location: Cambridge, MA
Venue: EMNLP
SIG: SIGDAT
Citations
URL
Measuring Contextual Fitness Using Error Contexts Extracted from the Wikipedia Revision History
Torsten Zesch
Topic Modeling on Historical Newspapers
Tze-I Yang, Andrew Torget, Rada Mihalcea
http://finereader.abbyy.com
http://code.google.com/p/tesseract-ocr
http://www.research.att.com/lewis
http://mallet.cs.umass.edu
http://www.nuance.com/imaging/products/omnipage.asp
http://nlp.cs.byu.edu/techreports/BYUNLP-
Field of Study
Task: Information Retrieval, OCR, Text Categorization
Approach: Topic Modeling
Language: English
Dataset: News
Similar Papers
Expectation-Regulated Neural Model for Event Mention Extraction
Ching-Yun Chang, Zhiyang Teng, Yue Zhang
Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora
Andrius Mudinas, Dell Zhang, Mark Levene
A Joint Model of Conversational Discourse Latent Topics on Microblogs
Jing Li, Yan Song, Zhongyu Wei, Kam-Fai Wong
A Corpus of Corporate Annual and Social Responsibility Reports: 280 Million Tokens of Balanced Organizational Writing
Sebastian G.M. Händschke, Sven Buechel, Jan Goldenstein, Philipp Poschmann, Tinghui Duan, Peter Walgenbach, Udo Hahn
Argumentation Mining in User-Generated Web Discourse
Ivan Habernal, Iryna Gurevych