Processing oceanographic data by Python libraries NumPy, SciPy and Pandas

  • Lemenkova, Polina
Publication Date
Apr 08, 2019
The study area is the Mariana Trench in the western Pacific Ocean. The aim of the data analysis is to assess how various geological and tectonic factors may affect the geomorphological shape of the Mariana Trench. Statistical analysis of data sets in marine geology and oceanography requires an adequate strategy for processing large volumes of data. In this context, the research proposes a methodology that couples GIS-based geospatial data analysis with Python-based statistical processing. The Quantum GIS part of the methodology produces an optimized representative sampling dataset consisting of 25 cross-section profiles with a total of 12,590 bathymetric observation points. The sampled profiles are located across the Mariana Trench. The second part of the methodology consists of statistical data processing in the high-level programming language Python, using the libraries Pandas, NumPy and SciPy. The data processing also involves subsampling two auxiliary masked data frames from the initial large data set, each containing only the target variables: sediment thickness, slope angle in degrees, and bathymetric observation points across four tectonic plates: Pacific, Philippine, Mariana, and Caroline. Finally, the data were analyzed by several approaches: 1) Kernel Density Estimation (KDE) for analysis of the probability distribution of the data; 2) stacked area charts for visualization of the data range across various segments of the trench; 3) a spatial series of radar charts; 4) stacked bar plots showing the data distribution by tectonic plate; 5) stacked bar charts for correlation of sediment thickness by profile versus distance from the igneous volcanic areas; 6) circular pie plots visualizing data distribution across the 25 profiles; 7) scatterplot matrices for correlation analysis between marine geologic variables. The results show a distinct correlation between the geologic, tectonic and oceanographic variables.
Six Python scripts are provided in full so that this research can be repeated.
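
The masking and KDE steps described in the abstract might look roughly like the sketch below. It is not one of the six scripts from the publication: the table is synthetic, and the column names (`profile`, `plate`, `depth_m`, `sediment_thickness_m`, `slope_deg`) are hypothetical stand-ins chosen for illustration. It only shows one plausible way to build a masked data frame restricted to the target variables with Pandas and to estimate a density with SciPy's `gaussian_kde`.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Synthetic stand-in for the sampled table. Column names and values are
# hypothetical; they are NOT the data or naming used in the study.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "profile": rng.integers(1, 26, size=n),  # profile ids 1..25
    "plate": rng.choice(["Pacific", "Philippine", "Mariana", "Caroline"], size=n),
    "depth_m": rng.normal(-5000.0, 1500.0, size=n),
    "sediment_thickness_m": rng.gamma(2.0, 50.0, size=n),
    "slope_deg": rng.uniform(0.0, 45.0, size=n),
})

# Masked subsample: a boolean mask selects the rows for one plate, and the
# column list keeps only the target variables.
target_cols = ["depth_m", "sediment_thickness_m", "slope_deg"]
mask = df["plate"] == "Mariana"
mariana = df.loc[mask, target_cols]

# Kernel Density Estimation of the depth values in the subsample,
# evaluated on a regular grid spanning the observed range.
depth = mariana["depth_m"].to_numpy()
kde = gaussian_kde(depth)
grid = np.linspace(depth.min(), depth.max(), 200)
density = kde(grid)
```

The resulting `grid` and `density` arrays can be passed to any plotting library to draw the density curve. Repeating the mask for each plate would give one curve per plate.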
