Estimating educational outcomes from students’ short texts on social media

Authors
  • Smirnov, Ivan1
  • 1 National Research University Higher School of Economics, Moscow, Russia
Type
Published Article
Journal
EPJ Data Science
Publisher
Springer Berlin Heidelberg
Publication Date
Sep 01, 2020
Volume
9
Issue
1
Identifiers
DOI: 10.1140/epjds/s13688-020-00245-8
Source
Springer Nature
License
Green

Abstract

Digital traces have become an essential source of data in the social sciences because they provide new insights into human behavior and allow studies to be conducted on a larger scale. One particular area of interest is the estimation of various user characteristics from their texts on social media. Although it has been established that basic categorical attributes can be effectively predicted from social media posts, the extent to which this applies to more complex continuous characteristics is less well understood. In this research, we used data from a nationally representative panel of students to predict their educational outcomes, as measured by standardized tests, from short texts on the popular Russian social networking site VK. We combined unsupervised learning of word embeddings on a large corpus of VK posts with a simple supervised model trained on individual posts. The resulting model was able to distinguish between posts written by high- and low-performing students with an accuracy of 94%. We then applied the model to reproduce the ranking of 914 high schools from 3 cities and of the 100 largest universities in Russia. We also showed that the same model could predict academic performance from tweets as well as from VK posts. Finally, we explored predictors of high and low academic performance to obtain insights into the factors associated with different educational outcomes.
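
The pipeline outlined in the abstract (word embeddings, a simple model applied to individual posts, then a ranking of institutions) might be sketched as follows. This is only a toy illustration and not the author's implementation: the two-dimensional embeddings, the linear weights, the averaging of word vectors, and the use of the mean post score to rank groups are all assumptions made for the example, and every name in it is hypothetical.

```python
from statistics import mean

# Hypothetical 2-dimensional word embeddings with made-up values.
# In the paper, embeddings are learned without supervision on a large VK corpus.
EMBEDDINGS = {
    "physics": (1.0, 0.0),
    "reading": (0.5, 0.5),
    "party": (0.0, 1.0),
}

# Hypothetical linear model. A real model would be fit on posts
# labeled with their authors' standardized test scores.
WEIGHTS = (2.0, -1.0)
BIAS = 0.0


def embed_post(text):
    """Average the vectors of known words (assumed representation); None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return tuple(mean(dim) for dim in zip(*vectors))


def score_post(text):
    """Predicted score for a single post, or None if it cannot be embedded."""
    vec = embed_post(text)
    if vec is None:
        return None
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, vec))


def score_group(posts):
    """Mean predicted score over the posts of one group (e.g. one school)."""
    scores = [s for s in map(score_post, posts) if s is not None]
    return mean(scores) if scores else None


def rank_groups(posts_by_group):
    """Order groups from highest to lowest mean predicted score, skipping unscorable groups."""
    scored = {g: score_group(p) for g, p in posts_by_group.items()}
    return sorted(
        (g for g, s in scored.items() if s is not None),
        key=lambda g: scored[g],
        reverse=True,
    )
```

For instance, `rank_groups({"A": ["physics reading"], "B": ["party"]})` places group `A` ahead of group `B` under these made-up weights. Scoring each post separately and aggregating afterwards keeps the supervised step at the level of individual posts, as the abstract describes.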
