Melatagia Yonta, Paulin; Mbouopda, Michael Franklin
Named Entity Recognition (NER) is a fundamental task in many NLP applications that seek to identify and classify expressions such as person, location, and organization names. Many NER systems have been developed, but the annotated data needed for good performance is not available for low-resource languages, such as Cameroonian languages. In this ...
Ortiz Suárez, Pedro Javier; Dupont, Yoann; Lejeune, Gaël; Tian, Tian
In this article, we present the approaches developed by the Sorbonne-INRIA for NER (SinNer) team for the CLEF-HIPE 2020 challenge on Named Entity Processing on old newspapers. The challenge proposed various tasks for three languages; among them, we focused on Named Entity Recognition in French and German texts. The best system we proposed ranked thir...
Boros, Emanuela; Linhares Pontes, Elvys; Cabrera-Diego, Luis Adrián; Hamdi, Ahmed; Moreno, José; Sidère, Nicolas; Doucet, Antoine
This paper summarizes the participation of the L3i laboratory of the University of La Rochelle in the Identifying Historical People, Places, and other Entities (HIPE) evaluation campaign of CLEF 2020. Our participation relies on two neural models, one for named entity recognition and classification (NERC) and another one for entity linking (EL). We...
Helali, Mossad; Kleinbauer, Thomas; Klakow, Dietrich
Despite their success in a multitude of tasks, neural models trained on natural language have been shown to memorize the intricacies of their training data, posing a potential privacy threat. In this work, we propose a metric to quantify unintended memorization in neural discriminative sequence models. The proposed metric, named d-exposure (discri...
Hong, Zhi; Tchoua, Roselyne; Chard, Kyle; Foster, Ian
Published in
Computational Science – ICCS 2020
The automated extraction of claims from scientific papers via computer is difficult due to the ambiguity and variability inherent in natural language. Even apparently simple tasks, such as isolating reported values for physical quantities (e.g., “the melting point of X is Y”) can be complicated by such factors as domain-specific conventions about h...
Tikhomirov, Mikhail; Loukachevitch, N.; Sirotina, Anastasiia; Dobrov, Boris
Published in
Natural Language Processing and Information Systems
The paper presents the results of applying the BERT representation model in the named entity recognition task for the cybersecurity domain in Russian. Several variants of the model were investigated. The best results were obtained using the BERT model, trained on the target collection of information security texts. We also explored a new form of da...
Ortiz Suárez, Pedro Javier; Dupont, Yoann; Muller, Benjamin; Romary, Laurent; Sagot, Benoît
The French TreeBank developed at the University Paris 7 is the main source of morphosyntactic and syntactic annotations for French. However, it does not include explicit information related to named entities, which are among the most useful pieces of information for several natural language processing tasks and applications. Moreover, no large-scale French c...
Azevedo, Pedro; Leite, Bernardo; Cardoso, Henrique Lopes; Silva, Daniel Castro; Reis, Luís Paulo
Published in
Artificial Intelligence Applications and Innovations
Question Answering (QA) and Question Generation (QG) have been the subject of intensive study in recent years, and much progress has been made in both areas. However, work on combining these two topics mainly focuses on how QG can be used to improve QA results. Through existing Natural Language Processing (NLP) techniques, we have implemented a tool ...
Lenas, Erik
Natural language processing (NLP) is a vibrant area of research with many practical applications today, such as sentiment analysis, text labeling, question answering, machine translation and automatic text summarization. At the moment, research is mainly focused on the English language, although many other languages are trying to catch up. This wo...
Gritta, Milan; Pilehvar, Mohammad Taher; Collier, Nigel
Published in
Language Resources and Evaluation
Empirical methods in geoparsing have thus far lacked a standard evaluation framework describing the task, metrics and data used to compare state-of-the-art systems. Evaluation is further made inconsistent, even unrepresentative of real-world usage, by the lack of distinction between the different types of toponyms, which necessitates new guidelines,...