Data Mining Techniques for Effective Flow-based Analysis of Multi-Gigabit Network Traffic

Publication Date
  • Computer Science
  • Design
  • Mathematics


Abstract: This paper describes a novel approach to traffic capture and analysis in high-speed networks. First, a format for the representation of captured packets is defined that (i) limits the amount of data stored and (ii) enables efficient processing. Then, data mining techniques, widely studied and deployed for extracting relevant information from extremely large databases, are applied as a means to effectively process the significant amount of captured data. The paper provides a first evaluation of the proposed approach in terms of its ability to extract relevant information and its computational complexity. This evaluation is based on the first experiments run on the prototype implementation of the proposed approach within the Analyzer traffic capture and analysis tool.

I. INTRODUCTION

One of the most critical issues in keeping a network under control is capturing and analyzing its traffic. The complexity of these tasks increases as networks become faster. Traffic capture and analysis goes through the steps depicted in Figure 1, all of which are critical when operating at high data rates. Some equipment vendors, such as Endace [1], offer network interfaces specifically designed to support packet capture at high data rates (e.g., 10 Gbps), thereby facilitating the realization of the first step in Figure 1.

[Figure 1. Basic steps in network traffic capture and analysis. Diagram labels: Capture; On-line Processing; Dump results; Disk; On-line monitoring and analysis; Off-line Processing; Off-line analysis.]

The time required to receive a minimum-size Ethernet frame at 10 Gbps is less than 70 ns, which leaves a multi-GHz processor a few hundred clock cycles to handle a captured packet. This makes the realization of the second step critical. However, the deployment of multi-processor machines that concurrently process multiple packets increases the time availabl
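
The per-packet time budget quoted in the introduction can be checked with a short calculation. The sketch below is not taken from the paper. It assumes standard Ethernet framing (a 64-byte minimum frame plus 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap), a hypothetical 3 GHz clock as a stand-in for "multi-GHz", and idealized linear scaling when several processors handle packets concurrently. The function names are illustrative only.

```python
# Back-of-the-envelope per-packet time budget on an Ethernet link.
# All constants are standard Ethernet framing values; the 3 GHz clock
# and the linear multi-processor scaling are assumptions for illustration.

MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum idle gap between frames, in byte times


def frame_time_ns(link_bps, frame_bytes=MIN_FRAME_BYTES, include_overhead=True):
    """Time in nanoseconds that one frame occupies the link."""
    wire_bytes = frame_bytes
    if include_overhead:
        wire_bytes += PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES
    return wire_bytes * 8 / link_bps * 1e9


def cycles_per_packet(link_bps, cpu_hz, n_cpus=1):
    """Clock cycles available per minimum-size packet.

    Assumes packets are spread evenly over n_cpus processors, so each
    processor sees 1/n_cpus of the packet rate (an idealized model).
    """
    return frame_time_ns(link_bps) * 1e-9 * cpu_hz * n_cpus


if __name__ == "__main__":
    link = 10e9   # 10 Gbps
    clock = 3e9   # hypothetical 3 GHz processor
    print(f"frame time with overhead:    {frame_time_ns(link):.1f} ns")
    print(f"frame time without overhead: {frame_time_ns(link, include_overhead=False):.1f} ns")
    print(f"cycles per packet, 1 CPU:    {cycles_per_packet(link, clock):.0f}")
    print(f"cycles per packet, 4 CPUs:   {cycles_per_packet(link, clock, n_cpus=4):.0f}")
```

Under these assumptions the frame time is 67.2 ns with framing overhead (51.2 ns for the 64-byte frame alone), both below the 70 ns bound, and a single 3 GHz processor has roughly 200 cycles per packet. Spreading packets over several processors multiplies that budget, which is the effect the last sentence above refers to.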
