Introduction to the Special Issue on the Web as Corpus

Adam Kilgarriff | Gregory Grefenstette

Year: 2003

Venue: CL

Using Web-Search Results to Measure Word-Group Similarity
Ann Gledson | John Keane

Web-based and combined language models: a case study on noun compound identification
Carlos Ramisch | Aline Villavicencio | Christian Boitet

Learning to Find English to Chinese Transliterations on the Web
Jian-Cheng Wu | Jason S. Chang

A Figure of Merit for the Evaluation of Web-Corpus Randomness
Massimiliano Ciaramita | Marco Baroni

Large Linguistically-Processed Web Corpora for Multiple Languages
Marco Baroni | Adam Kilgarriff

Using the Structure of a Conceptual Network in Computing Semantic Relatedness
Iryna Gurevych

Harvesting the Bitexts of the Laws of Hong Kong From the Web
Chunyu Kit | Xiaoyue Liu | KingKui Sin | Jonathan J. Webster

Orthographic Errors in Web Pages: Toward Cleaner Web Corpora
Christoph Ringlstetter | Klaus U. Schulz | Stoyan Mihov

A Lightweight and Efficient Tool for Cleaning Web Pages
Stefan Evert

Targeting Chinese Nominal Compounds in Corpora
Weiruo Qu | Christoph Ringlstetter | Randy Goebel

Using the Web as a Linguistic Resource to Automatically Correct Lexico-Syntactic Errors
Matthieu Hermet | Alain Désilets | Stan Szpakowicz

Cross-Corpus Evaluation of Word Alignment
Sylwia Ozdowska

Using the Web to Disambiguate Acronyms
Eiichiro Sumita | Fumiaki Sugaya

Near-Synonym Choice in an Intelligent Thesaurus
Diana Inkpen

The Effect of Corpus Size on Case Frame Acquisition for Discourse Analysis
Ryohei Sasano | Daisuke Kawahara | Sadao Kurohashi

Creating Multilingual Translation Lexicons with Regional Variations Using Web Corpora
Pu-Jen Cheng | Wen-Hsiang Lu | Jei-Wen Teng | Lee-Feng Chien

Phoneme-to-Text Transcription System with an Infinite Vocabulary
Shinsuke Mori | Daisuke Takuma | Gakuto Kurata

Chinese-English Term Translation Mining Based on Semantic Prediction
Gaolin Fang | Hao Yu | Fumihito Nishino

Creating Robust Supervised Classifiers via Web-Scale N-Gram Data
Shane Bergsma | Emily Pitler | Dekang Lin

Biases in Predicting the Human Language Model
Alex B. Fine | Austin F. Frank | T. Florian Jaeger | Benjamin Van Durme

Attribute-Based and Value-Based Clustering: An Evaluation
Abdulrahman Almuhareb | Massimo Poesio

Measuring Non-native Speakers’ Proficiency of English by Using a Test with Automatically-Generated Fill-in-the-Blank Questions
Eiichiro Sumita | Fumiaki Sugaya | Seiichi Yamamoto

Automated Multiword Expression Prediction for Grammar Engineering
Yi Zhang | Valia Kordoni | Aline Villavicencio | Marco Idiart

Annotated Web as corpus
Paul Rayson | James Walkerdine | William H. Fletcher | Adam Kilgarriff

Corporator: A tool for creating RSS-based specialized corpora
Cédrick Fairon

Web corpus mining by instance of Wikipedia
Rüdiger Gleim | Alexander Mehler | Matthias Dehmer

Landmark Classification for Route Directions
Aidan Furlan | Timothy Baldwin | Alex Klippel

Towards Domain-Independent Deep Linguistic Processing: Ensuring Portability and Re-Usability of Lexicalised Grammars
Kostadin Cholakov | Valia Kordoni | Yi Zhang

NoWaC: a large web-based corpus for Norwegian
Emiliano Raul Guevara

Web Based Manipuri Corpus for Multiword NER and Reduplicated MWEs Identification using SVM
Thoudam Doren Singh | Sivaji Bandyopadhyay

Were the clocks striking or surprising? Using WSD to improve MT performance
Špela Vintar | Darja Fišer | Aljoša Vrščaj

A modular open-source focused crawler for mining monolingual and bilingual corpora from the web
Vassilis Papavassiliou | Prokopis Prokopidis | Gregor Thurmair