Díaz Martínez, Capitolina; Díaz García, Pablo; Navarro Sustaeta, Pablo
Social events become big data, and the analysis of big data becomes knowledge about society. If big data is biased, the bias is transmitted to the analysis and to our knowledge. We propose here a tool to discover gender biases and, potentially, eliminate them from big data before analysis. We use neural network analysis and word embeddings. This i...
Dobreva, Radina; Zhou, Jie; Bawden, Rachel
Current approaches to machine translation (MT) either translate sentences in isolation, disregarding the context they appear in, or model context at the level of the full document, without a notion of any internal structure the document may have. In this work we consider the fact that documents are rarely homogeneous blocks of text, but rather cons...
Fernández-Bellon, Darío; Kane, Adam
Published in
Conservation letters
In urbanized societies that are increasingly disconnected from nature, communicating ecological and species awareness is crucial to reverse the global environmental crisis. However, our understanding of the effectiveness of this process is limited. We present a framework for describing how such awareness may be transferred and test it on the popular...
Couegnas, Nicolas; Badir, Sémir
International audience
Adewumi, Oluwatosin; Liwicki, Foteini; Liwicki, Marcus
In this work, we show that the difference in performance of embeddings from differently sourced data for a given language can be due to other factors besides data size. Natural language processing (NLP) tasks usually perform better with embeddings from bigger corpora. However, broadness of covered domain and noise can play important roles. We evalu...
Obregón Sierra, Ángel; González Fernández, Natalia
Since the beginning of the 21st century, there has been a change in the way people connect to the Internet, interacting more with the creators of web sites and spending more time connecting to several tools that have been called Web 2.0, such as social networks, wikis and blogs. One of the best-known wikis is Wikipedia, a free online encyclopedia col...
Sanz-Lorente, María; Ruiz-Belda, Paula; Wanden-Berghe, Carmina; Sanz-Valero, Javier
This study sought to check the correctness of the bibliographic references in the entries on sexually transmitted diseases (STDs) in the Spanish edition of Wikipedia and to test their validity for accessing the source document. From the results obtained, it could be concluded that the error rate of the references in the entries...
Jemielniak, Dariusz
Published in
GigaScience
Wikipedia is by far the largest online encyclopedia, and the number of errors it contains is on par with professional sources even in specialized topics such as biology or medicine. Yet, the academic world is still treating it with great skepticism because of the types of inaccuracies present there, the widespread plagiarism from Wikipedia, and...
Qiu, Riyi; Hadzikadic, Mirsad; Yu, Sha; Yao, Lixia
Published in
Health informatics journal
Data on disease burden are often used for assessing population health, evaluating the effectiveness of interventions, formulating health policies, and planning future resource allocation. We investigated whether Internet usage and social media data, specifically the search volume on Google, page view count on Wikipedia, and disease mentioning frequ...
Lages, José