Internet Business Intelligence

Source
HAL-UPMC
License
Unknown

Abstract

Business Intelligence (BI) refers to computer-based techniques used in spotting, digging out, and analyzing business data. It is mainly concerned with how to dig out business data. Such business data resides in online web databases that can be searched through their web query interfaces. The deep web (often called the hidden web or invisible web) is composed of all such web databases. With the evolution of the "deep web", more and more researchers are paying attention to the "integration" of web databases. Achieving this goal, however, requires a complex system in which many applications work together. We are interested in an automatic extraction system that gets the formulas or the lists of results from websites in the specific domain of government procurement. To tackle this challenge, we propose a solution that creates a unified interface for querying resources in a predefined domain. In this paper, we discuss the automatic extraction system in several steps. First, a web query interface crawler that can execute JavaScript guarantees coverage of the web databases. Second, the query interface extractor and the interface integrator allow us to query all of the web databases found through a global query interface. Third, the result page extractor and the result integrator give a unified presentation. Lastly, a feedback method is developed to gather the result accuracy, and a statistical model is built to improve the performance of steps 2 and 3. We assume that our system is dynamic: the more we use it, the more precise the results we get.

Keywords: schema matching; web-database integration;
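
The interplay between the interface integrator (step 2) and the feedback method can be illustrated with a minimal sketch. The abstract does not specify the statistical model, so everything below is an assumption made for illustration: the class name `FieldMatcher`, the word-overlap (Jaccard) similarity between field labels, and the Laplace-smoothed feedback weight are all hypothetical stand-ins, not the authors' method.

```python
from collections import defaultdict


def jaccard(label_a, label_b):
    """Word-level Jaccard similarity between two field labels (assumed measure)."""
    words_a = set(label_a.lower().split())
    words_b = set(label_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


class FieldMatcher:
    """Hypothetical matcher from a global query field to a local form field."""

    def __init__(self):
        # (global_field, local_field) -> [confirmed, rejected] feedback counts
        self.feedback = defaultdict(lambda: [0, 0])

    def score(self, global_field, local_field):
        confirmed, rejected = self.feedback[(global_field, local_field)]
        # Laplace-smoothed share of positive feedback; 0.5 when none exists.
        weight = (confirmed + 1) / (confirmed + rejected + 2)
        return jaccard(global_field, local_field) * weight

    def best_match(self, global_field, local_fields):
        # Ties are resolved in favour of the first candidate listed.
        return max(local_fields, key=lambda f: self.score(global_field, f))

    def record(self, global_field, local_field, correct):
        # User feedback on one proposed match updates the counts.
        self.feedback[(global_field, local_field)][0 if correct else 1] += 1
```

Under these assumptions, two local fields with equal label similarity start out tied, and recorded feedback then shifts the choice toward the match that users confirmed, which is one simple way a system could become more precise the more it is used.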
