A bootstrapping approach to social media quantification

Authors
  • Daughton, Ashlynn R.1
  • Paul, Michael J.2
  • 1 Los Alamos National Laboratory, Los Alamos, USA
  • 2 University of Colorado, Boulder, USA
Type
Published Article
Journal
Social Network Analysis and Mining
Publisher
Springer Vienna
Publication Date
Aug 09, 2021
Volume
11
Issue
1
Identifiers
DOI: 10.1007/s13278-021-00760-0
Source
Springer Nature
Disciplines
  • Original Article
License
Green

Abstract

This work considers the use of classifiers in a downstream aggregation task: estimating class proportions, such as the percentage of reviews for a movie that express positive sentiment. We derive the bias and variance of the class proportion estimator when classification error is taken into account, in order to determine how best to trade off different error types when tuning a classifier for these tasks. Additionally, we propose a method for constructing confidence intervals that correctly adjusts for classification error when estimating these statistics. We conduct experiments on four document classification tasks, comparing our methods to prior approaches across classifier thresholds, sample sizes, and label distributions. Prior approaches have focused on providing the most accurate point estimate, whereas this work focuses on constructing correct confidence intervals that appropriately account for classifier error. Compared to the prior approaches, our methods provide lower error and more accurate confidence intervals.
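
To make the setting in the abstract concrete, the following is a minimal sketch of how classification error biases a proportion estimate and how a bootstrap interval can carry that error through. It is not the authors' procedure, which the abstract does not spell out. It assumes a binary classifier whose true and false positive rates are estimated on a labeled validation set, uses the standard adjusted-count correction, and builds a percentile interval by resampling both the validation set and the unlabeled target predictions. All function and variable names are hypothetical.

```python
import random


def expected_classify_and_count(p, tpr, fpr):
    """Expected fraction predicted positive when the true positive fraction is p.

    Unless tpr == 1 and fpr == 0, this differs from p, which is the bias
    of simply counting classifier outputs.
    """
    return p * tpr + (1.0 - p) * fpr


def adjusted_count(q, tpr, fpr):
    """Invert the relation above to recover p from the observed fraction q.

    The result is clipped to [0, 1]. Assumes the classifier beats chance.
    """
    if tpr <= fpr:
        raise ValueError("classifier must satisfy tpr > fpr")
    return min(1.0, max(0.0, (q - fpr) / (tpr - fpr)))


def rates(labels, preds):
    """True and false positive rates from paired 0/1 labels and predictions."""
    pos = [yhat for y, yhat in zip(labels, preds) if y == 1]
    neg = [yhat for y, yhat in zip(labels, preds) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    return sum(pos) / len(pos), sum(neg) / len(neg)


def bootstrap_ci(target_preds, val_labels, val_preds,
                 n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the adjusted proportion.

    Assumption of this sketch: resampling the validation set as well as the
    target predictions lets uncertainty in tpr and fpr widen the interval.
    """
    rng = random.Random(seed)
    val = list(zip(val_labels, val_preds))
    estimates = []
    for _ in range(n_boot):
        t = rng.choices(target_preds, k=len(target_preds))
        v = rng.choices(val, k=len(val))
        try:
            tpr, fpr = rates([y for y, _ in v], [yh for _, yh in v])
            estimates.append(adjusted_count(sum(t) / len(t), tpr, fpr))
        except ValueError:
            continue  # skip degenerate resamples
    if not estimates:
        raise ValueError("no usable bootstrap resamples")
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[min(len(estimates) - 1,
                       int((1 - alpha / 2) * len(estimates)))]
    return lo, hi
```

In this sketch, a classifier with a true positive rate of 0.8 and a false positive rate of 0.1 applied to a collection that is 30% positive is expected to label 31% of items positive. `adjusted_count` undoes that shift, and `bootstrap_ci` reports an interval around the corrected value rather than around the raw count.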
