Smaïli, Kamel; Anissa, Hamza; David, Langlois; Djegdjiga, Amazouz
This article presents BOUTEF, an original and comprehensive corpus of fake news. It encompasses content in Algerian and Tunisian dialects, Modern Standard Arabic (MSA), French, and English, featuring instances of code-switching between these languages. Moreover, for the Algerian and Tunisian dialects, we have preserved both Latin and Arabic scripts in...
Alnassan, Abidrabbo
Based on an annotated multimedia corpus, the television series Marāyā 2013, we examine the question of "automatic standardization" of Arabic dialects for machine translation. Here we distinguish between rule-based machine translation and statistical machine translation. Machine translation from Arabic most of the time takes standard or modern Arabic a...
Tachicart, Ridouane; Bouzoubaa, Karim; Harrat, Salima; Smaïli, Kamel
Morphological analysis is a crucial stage in natural language processing. For Arabic, many attempts have been made to build morphological analyzers. Despite the increasing attention paid to Arabic dialects recently, only a small number of morphological analyzers have been built for them compared to MSA. In addition, those tools often cover a ...
Guellil, Imane; Adeel, Ahsan; Azouaou, Faical; Benali, Fodil; Hachani, Ala-Eddine; Dashtipour, Kia; Gogate, Mandar; Ieracitano, Cosimo; Kashani, Reza; Hussain, Amir
...
Published in
SN Computer Science
In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. The approach is based on a sentiment corpus, constructed automatically and reviewed manually by native speakers of the Algerian dialect. It consists of constructing and applying a set of deep learning algorithms to classify the sentiment of ...
Meftouh, K.; Harrat, S; Smaïli, Kamel
PADIC is a multidialectal parallel Arabic corpus. It initially comprised five Arabic dialects, three from the Maghreb and two from the Middle East, in addition to standard Arabic. In this paper, we present an augmented version of PADIC with a Moroccan dialect. We also give an evaluation, using the σ–index, of the computerization level of the ...
Harrat, Salima; Meftouh, Karima; Smaïli, Kamel
Natural Language Processing for Arabic dialects has grown considerably in recent years. Indeed, several works have been proposed dealing with all aspects of Natural Language Processing. However, some AD varieties have received more attention and have a growing collection of resources. Other varieties, such as Maghrebi, still lag behind in that respect. Mag...
Abidi, Karima; Menacer, Mohamed Amine; Smaili, Kamel
This paper addresses the issue of comparability of comments extracted from YouTube. The comments concern spoken Algerian, which could be either local Arabic, Modern Standard Arabic or French. This diversity of expression gives rise to a large number of problems concerning the data processing. In this article, several methods of alignment will be proposed and test...
Harrat, Salima; Meftouh, Karima; Smaïli, Kamel
Arabic dialects, also called colloquial Arabic or vernaculars, are spoken varieties of Standard Arabic. These dialects have mixed forms with many variations due to the influence of ancient local tongues and other languages such as European ones. Many of these dialects are mutually incomprehensible. Arabic dialects were not written until recently and were...
Siino, François; Catusse, Myriam
Catherine Miller is a sociolinguist and research director at the CNRS. Born in 1955 and trained in ethnology and linguistics, she has worked in particular on language contact phenomena in the Arabic-speaking world, first in South Sudan and then in Egypt. From 2008 onward, she continued her research in the Moroccan field during a stay...
Harrat, Salima; Meftouh, Karima; Abbas, Mourad; Hidouci, Walid-Khaled; Smaïli, Kamel
Arabic is the official language of all Arab countries; it is used for official speech, newspapers, public administration and school. In parallel, for everyday communication, non-official talks, songs and movies, Arab people use their dialects, which are inspired by Standard Arabic and differ from one Arab country to another. These linguistic...