Renuka Devi, D.; Sasikala, S.
Published in
Journal of Big Data
Feature selection is mainly used to reduce the processing load of data mining models. To shorten the time needed to process voluminous data, parallel processing is carried out with the MapReduce (MR) technique. However, with the existing algorithms, the performance of the classifiers needs substantial improvement. The MR method, which is recommended in this...
Sveen, Atle Frenvik
Published in
Journal of Big Data
The no-schema approach of NoSQL document stores is a tempting solution for importing heterogeneous geospatial data into a spatial database. However, this approach means sacrificing the benefits of RDBMSes, such as existing integrations and the ACID principle. Previous comparisons of the document-store and table-based layout for storing geospatial data...
Wu, Lianren; Li, Jinjie; Qi, Jiayin
Published in
Journal of Big Data
This paper provides a quantitative temporal and spatial analysis of the popularity dynamics of hot topics in a micro-blogging system. First, the popularity time series of 1167 hot topics were counted and calculated in Excel. Second, using MATLAB, the popularity time series were clustered into six clusters by K-spectral centroid...
Torabzadehkashi, Mahdi; Rezaei, Siavash; HeydariGorji, Ali; Bobarshad, Hosein; Alves, Vladimir; Bagherzadeh, Nader
Published in
Journal of Big Data
In the era of big data applications, the demand for more sophisticated data centers and high-performance data processing mechanisms is increasing drastically. Data are originally stored in storage systems. To process data, application servers need to fetch them from storage devices, which imposes the cost of moving data to the system. This cost has...
Al-Molhem, Nour Raeef; Rahal, Yasser; Dakkak, Mustapha
Published in
Journal of Big Data
Many systems can be represented as networks or graph collections of nodes joined by edges. The social structures in these networks can be investigated using graph theory through a process called social network analysis (SNA). In this paper, networks and SNA concepts were applied using Telecom data such as call detail records (CDRs) and customers da...
Hashemi, Mahdi
Published in
Journal of Big Data
The input to a machine learning model is a one-dimensional feature vector. However, in recent learning models, such as convolutional and recurrent neural networks, two- and three-dimensional feature tensors can also be fed to the model. During training, the machine adjusts its internal parameters to project each feature tensor close to its tar...
Fikri, Noussair; Rida, Mohamed; Abghour, Noureddine; Moussaid, Khalid; El Omri, Amina
Published in
Journal of Big Data
In this paper, we propose an adaptive, real-time approach to resolve latency problems and semantic heterogeneity in real-time financial data integration. Because of constraints we faced in some projects that require real-time integration and analysis of massive financial data, we decided to follow a new approach by combining a hybrid finan...
Sarker, Iqbal H.
Published in
Journal of Big Data
Smartphones are considered among the most essential and highly personal devices in our current world. Due to the popularity of context-aware technology and recent developments in smartphones, these devices can collect and process raw contextual data about users’ surrounding environment and their corresponding behavioral activitie...
Yadav, Sumedh; Bode, Mathis
Published in
Journal of Big Data
A scalable graphical method is presented for selecting and partitioning datasets for the training phase of a classification task. For the heuristic, a clustering algorithm is required to keep its computational cost in reasonable proportion to the task itself. This step is followed by the construction of an information graph of the underlying classifica...
Ciritoglu, Hilmi Egemen; Murphy, John; Thorpe, Christina
Published in
Journal of Big Data
The Hadoop distributed file system (HDFS) is responsible for storing very large data-sets reliably on clusters of commodity machines. The HDFS takes advantage of replication to serve data requested by clients with high throughput. Data replication is a trade-off between better data availability and higher disk usage. Recent studies propose differen...