NLPExplorer

Tomaž Erjavec

Number of Papers:- 50

Number of Citations:- 74

First ACL Paper:- 1990

Latest ACL Paper:- 2024

Venues:-

COLING

EMNLP

ParlaCLARIN

ALW

EAMT

NLP+CSS

MTSummit

LaTeCH

BSNLP

LAW

RANLP

LREC

ACL

Multilingual Power and Ideology identification in the Parliament: a reference dataset and simple baselines ParlaCLARIN WS

ParlaMint II: The Show Must Go On LREC ParlaCLARIN

Dealing with Abbreviations in the Slovenian Biographical Lexicon EMNLP

Angel Daza | Antske Fokkens | Tomaž Erjavec |

The siParl corpus of Slovene parliamentary proceedings LREC ParlaCLARIN WS

Andrej Pancur | Tomaž Erjavec |

Gigafida 2.0: The Reference Corpus of Written Standard Slovene LREC

Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing ACL WS

Improving UD processing via satellite resources for morphology WS

Kaja Dobrovoljc | Tomaž Erjavec | Nikola Ljubešić |

CLARIN’s Key Resource Families LREC

Darja Fišer | Jakob Lenardič | Tomaž Erjavec |

Datasets of Slovene and Croatian Moderated News Comments EMNLP WS

Nikola Ljubešić | Tomaž Erjavec | Darja Fišer |

Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing BSNLP WS

The Universal Dependencies Treebank for Slovenian BSNLP WS

Kaja Dobrovoljc | Tomaž Erjavec | Simon Krek |

Adapting a State-of-the-Art Tagger for South Slavic Languages to Non-Standard Text BSNLP WS

Nikola Ljubešić | Tomaž Erjavec | Darja Fišer |

Language-independent Gender Prediction on Twitter NLP+CSS WS

Nikola Ljubešić | Darja Fišer | Tomaž Erjavec |

Legal Framework, Dataset and Annotation Schema for Socially Unacceptable Online Discourse Practices in Slovene ALW WS

Darja Fišer | Tomaž Erjavec | Nikola Ljubešić |

Corpus vs. Lexicon Supervision in Morphosyntactic Tagging: the Case of Slovene LREC

Nikola Ljubešić | Tomaž Erjavec |

Corpus-Based Diacritic Restoration for South Slavic Languages LREC

Nikola Ljubešić | Tomaž Erjavec | Darja Fišer |

Predicting the Level of Text Standardness in User-generated Content RANLP

sloWCrowd: A crowdsourcing tool for lexicographic tasks LREC

Darja Fišer | Aleš Tavčar | Tomaž Erjavec |

TweetCaT: a tool for building Twitter corpora of smaller languages LREC

Nikola Ljubešić | Darja Fišer | Tomaž Erjavec |

Modernizing historical Slovene words with character-based SMT WS

Yves Scherrer | Tomaž Erjavec |

The goo300k corpus of historical Slovene LREC

Tomaž Erjavec |

Lexicon Construction and Corpus Annotation of Historical Language with the CoBaLT Editor LaTeCH WS

Tom Kenter | Tomaž Erjavec | Maja Žorga Dulmin | Darja Fišer |

OWL/DL formalization of the MULTEXT-East morphosyntactic specifications LAW WS

Christian Chiarcos | Tomaž Erjavec |

Automatic linguistic annotation of historical language: ToTrTaLe and XIX century Slovene LaTeCH WS

Tomaž Erjavec |

MULTEXT-East Version 4: Multilingual Morphosyntactic Specifications, Lexicons and Corpora LREC

Tomaž Erjavec |

The JOS Linguistically Tagged Corpus of Slovene LREC

Tomaž Erjavec | Darja Fišer | Simon Krek | Nina Ledinek |

Experimental Deployment of a Grid Virtual Organization for Human Language Technologies LREC

Jan Jona Javoršek | Tomaž Erjavec |

The JOS Morphosyntactically Tagged Corpus of Slovene LREC

Tomaž Erjavec | Simon Krek |

Designing and Evaluating a Russian Tagset LREC

Towards a Slovene Dependency Treebank LREC

The English-Slovene ACQUIS corpus LREC

Tomaž Erjavec |

Building Slovene WordNet LREC

Tomaž Erjavec | Darja Fišer |

The JRC-Acquis: A Multilingual Aligned Parallel Corpus with 20+ Languages LREC

Making an XML-based Japanese-Slovene Learners’ Dictionary LREC

Tomaž Erjavec | Kristina Hmeljak Sangawa | Irena Srdanović | Anton ml. Vahčič |

MULTEXT-East Version 3: Multilingual Morphosyntactic Specifications, Lexicons and Corpora LREC

Tomaž Erjavec |

Migrating Language Resources from SGML to XML: The Text Encoding Initiative Recommendations LREC

Towards an International Standard on Feature Structure Representation LREC

Encoding Biomedical Resources in TEI: The Case of the GENIA Corpus WS

Stretching TEI: Converting the Genia Corpus WS

The MULTEXT-East Morphosyntactic Specification for Slavic Languages WS

Sense Discrimination with Parallel Corpora WS

Nancy Ide | Tomaž Erjavec | Dan Tufis |

The TELRI tool catalogue: structure and prospects WS

Tomaž Erjavec | Tamás Váradi |

Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets LREC

Sašo Džeroski | Tomaž Erjavec | Jakub Zavrel |

Corpora of Slovene Spoken Language for Multi-lingual Applications LREC

The Concede Model for Lexical Databases LREC

Tomaž Erjavec | Roger Evans | Nancy Ide | Adam Kilgarriff |

Slovene–English Datasets for MT EAMT

Tomaž Erjavec |

The ELAN Slovene-English aligned corpus MTSummit

Tomaž Erjavec |

Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages COLING

Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages ACL COLING

An Integrated System for Morphological Analysis of the Slovene Language COLING

Tomaž Erjavec | Peter Tancig |