Support Vector Machine (SVM) aggregation modelling for spatio-temporal air pollution analysis

  • Ali, Shahid
Publication Date
Mar 04, 2019
Unitec Research Bank

RESEARCH QUESTIONS:

1. Dealing with long-term historical spatio-temporal data is always challenging. SVM ensembles and other methods are used to handle such data, but they suffer from slow processing and low accuracy, especially as the data size increases. Can large data be processed efficiently, and can model accuracy be increased, compared to the SVM ensemble method? Air pollution data are available in very large volumes that need to be stored and processed, and the problem is compounded when processing long-term historical data. The meaning of the data can change over time based on events and on the locations from which the data were captured. Dealing with such long-term spatio-temporal data is indeed challenging.

2. Air pollution is a spatio-temporal problem and the data are distributed across multiple locations, which is difficult for the SVM ensemble and other techniques to manage. How can the distributed nature of spatio-temporal air pollution data be handled efficiently and with better classification than the SVM ensemble method? Air pollution data are physically distributed, decentralised and collected across various monitoring stations. For example, the Auckland region has 19 air pollution monitoring stations. One can design a computational system for analysing the data of a single monitoring station, but designing a system that processes the distributed data of all those stations is a complex task, since the data are available in huge volumes. Centralised data analysis, in turn, leads to processing and resource challenges.

3. Air pollution data often contain missing values, and any analysis of such data will not give a true picture of the underlying problem. How accurate can the analysis of air pollution data with missing values be, compared to the SVM ensemble method? As air pollution varies regionally, it is comparatively easy to compute air pollution levels for a single location, but difficult to derive regional air pollution data from the computations of multiple monitoring stations. Region-specific information will be useful for formulating a data aggregation strategy.

4. Can SVM aggregation and knowledge fusion over spatio-temporal dimensions be applied to predict air pollution more accurately than the SVM ensemble method? Analysing the results of long-term historical spatio-temporal data is a tedious and time-consuming task. Fusing the spatial and temporal dimensions via the same SVM representation is achievable, but it remains a complex task and therefore warrants a specific research question. We envisage this question to be more focused on prediction, and any solution to it will be significant.

This research addresses the spatio-temporal air pollution analysis problem. Existing air pollution studies often simplify the problem and fail to consider that air pollution is both a spatial and a temporal problem. More specifically, previous approaches are optimal for temporally rich data; environmental data, however, are more likely to be collected over a large geographical area and at different periods of time. This research proposes an approach based on a decentralised computational technique named the Scalable SVM Ensemble Learning Method (SSELM) for classifying hourly air pollution data collected in Auckland in 2010. Special consideration is given to the distributed ensemble in order to resolve the spatio-temporal data collection problem. The proposed approach has been compared with SVM ensemble learning for air pollution analysis in the Auckland region. Experiments demonstrated that the proposed SSELM approach outperforms SVM ensemble learning in both efficiency and accuracy.
