NLPExplorer
Adrien Barbaresi
Number of Papers:- 17
Number of Citations:- 19
First ACL Paper:- 2011
Latest ACL Paper:- 2022
Venues:-
CMLC
VarDial
ACL
JEP/TALN/RECITAL
LREC
WAC
COLING
IJCNLP
WS
Co-Authors:-
Alexander Geyken
Egon Stemle
Felix Bildhauer
Gaël Lejeune
Harald Lüngen
Similar Authors:-
Arnaud Vallee
Xavier Briffault
J L Gauvain
Marion Richard
Laurent Devos
2022
2021
2020
2018
2017
2016
2014
2013
2011
Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-10)
CMLC
LREC
Piotr Bański |
Adrien Barbaresi |
Simon Clematide |
Marc Kupietz |
Harald Lüngen |
Trafilatura: A Web Scraping Library and Command-Line Tool for Text Discovery and Extraction
ACL
IJCNLP
Adrien Barbaresi |
Proceedings of the 8th Workshop on Challenges in the Management of Large Corpora
CMLC
LREC
WS
Piotr Bański |
Adrien Barbaresi |
Simon Clematide |
Marc Kupietz |
Harald Lüngen |
Ines Pisetta |
Proceedings of the 12th Web as Corpus Workshop
LREC
WAC
WS
Adrien Barbaresi |
Felix Bildhauer |
Roland Schäfer |
Egon Stemle |
Out-of-the-Box and into the Ditch? Multilingual Evaluation of Generic Text Extraction Tools
LREC
WAC
WS
Adrien Barbaresi |
Gaël Lejeune |
Que recèlent les données textuelles issues du web ? (What do text data from the Web have to hide ?)
JEP/TALN/RECITAL
Adrien Barbaresi |
Gaël Lejeune |
Bien choisir son outil d’extraction de contenu à partir du Web (Choosing the appropriate tool for Web Content Extraction)
JEP/TALN/RECITAL
Gaël Lejeune |
Adrien Barbaresi |
A corpus of German political speeches from the 21st century
LREC
Adrien Barbaresi |
A database of German definitory contexts from selected web sources
LREC
Adrien Barbaresi |
Lothar Lemnitzer |
Alexander Geyken |
Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
COLING
VarDial
WS
Adrien Barbaresi |
Discriminating between Similar Languages using Weighted Subword Features
VarDial
WS
Adrien Barbaresi |
Efficient construction of metadata-enhanced web corpora
WAC
WS
Adrien Barbaresi |
An Unsupervised Morphological Criterion for Discriminating Similar Languages
VarDial
WS
Adrien Barbaresi |
Finding Viable Seed URLs for Web Corpora: A Scouting Approach and Comparative Study of Available Sources
WAC
WS
Adrien Barbaresi |
Focused Web Corpus Crawling
WAC
WS
Roland Schäfer |
Adrien Barbaresi |
Felix Bildhauer |
Crawling microblogging services to gather language-classified URLs. Workflow and case study
ACL
Adrien Barbaresi |
La complexité linguistique Méthode d’analyse
JEP/TALN/RECITAL
Adrien Barbaresi |