Affordable Access

Algorithms for incremental learning of decision trees from time-changing data streams

Publication Date
  • Computer And Information Science
  • Computer Science


This work is detailed presentation of the main ideas behind state-of-the-art algorithms for online learning of decision trees and other models from time-changing data streams. We begin by proving Hoeffding inequality, which we then use to derive a general method for scaling up machine learning algorithms. We apply this method to scale up classical decision tree learning algorithm, and prove theoretical guarantees for one of the learners. We implement scaled up decision tree learners and, after giving a rough description of the imple-mentation, illustrate usage on a simple, bootstrapped dataset. We then turn to methods for assessing stream learning algorithm performance and comparing two algorithms on a single data stream. We later apply these methods when performing experiments on a real-world electricity-demand scenario for New York state electricity data, demonstrating usage of both our implementation and aforementioned evaluation methods. Original contribution are simple formulas and algorithms for computing entropy and Gini index on time-changing data streams.

There are no comments yet on this publication. Be the first to share your thoughts.


Seen <100 times