NLPExplorer

Contextual Dependencies in Unsupervised Word Segmentation

Sharon Goldwater | Thomas L. Griffiths | Mark Johnson |

Paper Details:

Month: July
Year: 2006
Location: Sydney, Australia
Venue: ACL | COLING |

Citations

Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation
Jia Xu | Jianfeng Gao | Kristina Toutanova | Hermann Ney |

Unsupervised phonemic Chinese word segmentation using Adaptor Grammars
Mark Johnson | Katherine Demuth |

The Infinite PCFG Using Hierarchical Dirichlet Processes
Percy Liang | Slav Petrov | Michael Jordan | Dan Klein |

Sampling Alignment Structure under a Bayesian Translation Model
John DeNero | Alexandre Bouchard-Côté | Dan Klein |

Unsupervised Tokenization for Machine Translation
Tagyoung Chung | Daniel Gildea |

Bayesian Learning of Phrasal Tree-to-String Templates
Ding Liu | Daniel Gildea |

An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL
Valentin Zhikov | Hiroya Takamura | Manabu Okumura |

Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model
Markus Dreyer | Jason Eisner |

A Bayesian Model for Learning SCFGs with Discontiguous Rules
Abby Levenberg | Chris Dyer | Phil Blunsom |

Exploring Representations from Unlabeled Data with Co-training for Chinese Word Segmentation
Longkai Zhang | Houfeng Wang | Xu Sun | Mairgup Mansur |

Joint Learning of Chinese Words, Terms and Keywords
Ziqiang Cao | Sujian Li | Heng Ji |

Do we need bigram alignment models? On the effect of alignment quality on transduction accuracy in G2P
Steffen Eger |

Unsupervised Segmentation of Phoneme Sequences based on Pitman-Yor Semi-Markov Model using Phoneme Length Context
Ryu Takeda | Kazunori Komatani |

Substring Frequency Features for Segmentation of Japanese Katakana Words with Unlabeled Corpora
Yoshinari Fujinuma | Alvin Grissom II |

Punctuation as Implicit Annotations for Chinese Word Segmentation
Zhongguo Li | Maosong Sun |

A New Unsupervised Approach to Word Segmentation
Hanshi Wang | Jian Zhu | Shiping Tang | Xiaozhong Fan |

Unsupervised Event Coreference Resolution
Cosmin Adrian Bejan | Sanda Harabagiu |

Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists
Taraka Rama |

A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
Matti Varjokallio | Mikko Kurimo |

Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars
Mark Johnson | Sharon Goldwater |

Inducing Compact but Accurate Tree-Substitution Grammars
Trevor Cohn | Sharon Goldwater | Phil Blunsom |

Online EM for Unsupervised Models
Percy Liang | Dan Klein |

Type-Based MCMC
Percy Liang | Michael I. Jordan | Dan Klein |

Painless Unsupervised Learning with Features
Taylor Berg-Kirkpatrick | Alexandre Bouchard-Côté | John DeNero | Dan Klein |

Combining multiple information types in Bayesian word segmentation
Gabriel Doyle | Roger Levy |

Learning Document-Level Semantic Properties from Free-Text Annotations
S.R.K. Branavan | Harr Chen | Jacob Eisenstein | Regina Barzilay |

Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure
Mark Johnson |

Unsupervised Multilingual Learning for Morphological Segmentation
Benjamin Snyder | Regina Barzilay |

A Gibbs Sampler for Phrasal Synchronous Grammar Induction
Phil Blunsom | Trevor Cohn | Chris Dyer | Miles Osborne |

A Note on the Implementation of Hierarchical Dirichlet Processes
Phil Blunsom | Trevor Cohn | Sharon Goldwater | Mark Johnson |

Bayesian Synchronous Tree-Substitution Grammar Induction and Its Application to Sentence Compression
Elif Yamangil | Stuart M. Shieber |

Blocked Inference in Bayesian Tree Substitution Grammars
Trevor Cohn | Phil Blunsom |

A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction
Phil Blunsom | Trevor Cohn |

Unsupervized Word Segmentation: the Case for Mandarin Chinese
Pierre Magistry | Benoît Sagot |

Empirical Study of Unsupervised Chinese Word Segmentation Methods for SMT on Large-scale Corpora
Xiaolin Wang | Masao Utiyama | Andrew Finch | Eiichiro Sumita |

Inducing Word and Part-of-Speech with Pitman-Yor Hidden Semi-Markov Models
Kei Uchiumi | Hiroshi Tsukahara | Daichi Mochihashi |

Joint Word Segmentation and Phonetic Category Induction
Micha Elsner | Stephanie Antetomaso | Naomi Feldman |

Nonparametric Bayesian Semi-supervised Word Segmentation
Ryo Fujii | Ryo Domoto | Daichi Mochihashi |

Unsupervised Word Segmentation for Sesotho Using Adaptor Grammars
Mark Johnson |

Integration of Multiple Bilingually-Learned Segmentation Schemes into Statistical Machine Translation
Michael Paul | Andrew Finch | Eiichiro Sumita |

Modeling Syntactic Context Improves Morphological Segmentation
Yoong Keok Lee | Aria Haghighi | Regina Barzilay |

A Regularized Compression Method to Unsupervised Word Segmentation
Ruey-Cheng Chen | Chiung-Min Tsai | Jieh Hsiang |

The Study of Effect of Length in Morphological Segmentation of Agglutinative Languages
Loganathan Ramasamy | Zdeněk Žabokrtský | Sowmya Vajjala |

Integrating Dictionaries into an Unsupervised Model for Myanmar Word Segmentation
Ye Kyaw Thu | Andrew Finch | Eiichiro Sumita | Yoshinori Sagisaka |

Generalization in Artificial Language Learning: Modelling the Propensity to Generalize
Raquel G. Alhama | Willem Zuidema |

A Dataset for Sanskrit Word Segmentation
Amrith Krishna | Pavan Kumar Satuluri | Pawan Goyal |

URL: http://www.speech.sri.com/people/anand/

Field Of Study

Task

Word Segmentation

Language

Chinese Child Language

Dataset

Child Language

Similar Papers

Training a Natural Language Generator From Unaligned Data

Ondřej Dušek | Filip Jurčíček |

Extending Statistical Machine Translation with Discriminative and Trigger-Based Lexicon Models

Arne Mauser | Saša Hasan | Hermann Ney |

Integrating Graph-Based and Transition-Based Dependency Parsers

Joakim Nivre | Ryan McDonald |

A treebank-based study on the influence of Italian word order on parsing performance

Anita Alicante | Cristina Bosco | Anna Corazza | Alberto Lavelli |

Improving Arabic-Chinese Statistical Machine Translation using English as Pivot Language

Nizar Habash | Jun Hu |