Vision Based Deep Web Data Extraction on Nested Query Result Records

Provided by: International Journal of Modern Engineering Research (IJMER)

Topic: Security

Format: PDF

Web data extraction software is required by web analysis services such as Google and Amazon. To analyze web data, these services must crawl the websites of the internet, visiting every page of every site. However, a typical web page contains far more code than actual data. In this paper, the authors propose a novel vision-based technique for deep web data extraction on nested Query Result Records. The technique extracts data from web pages using visual cues such as font styles, font sizes, and cascading style sheets. After extraction, the data is aligned into a table using alignment algorithms.
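
The abstract does not give the alignment algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that records have already been segmented and that each text fragment carries a font size and a bold flag. The names `TextItem`, `visual_signature`, and `align_records` are hypothetical. Fragments that look alike (same font size and weight) are placed in the same table column, and repeated fragments inside one record, as can happen with nested records, share a cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextItem:
    """A text fragment with the visual attributes assumed for this sketch."""
    text: str
    font_size: int
    bold: bool


def visual_signature(item):
    """Return the visual key used to decide which column an item belongs to."""
    return (item.font_size, item.bold)


def align_records(records):
    """Align records into a table with one column per visual signature.

    Columns are ordered by first appearance. Items in one record that share
    a signature are joined into a single cell; missing values become "".
    """
    columns = []
    for record in records:
        for item in record:
            sig = visual_signature(item)
            if sig not in columns:
                columns.append(sig)

    table = []
    for record in records:
        cells = {sig: [] for sig in columns}
        for item in record:
            cells[visual_signature(item)].append(item.text)
        table.append(["; ".join(cells[sig]) for sig in columns])
    return columns, table


# Made-up sample data: the second record has an extra field and a repeated one.
records = [
    [TextItem("Book A", 16, True), TextItem("$10", 12, False)],
    [
        TextItem("Book B", 16, True),
        TextItem("Jane Doe", 12, True),
        TextItem("$12", 12, False),
        TextItem("$9 used", 12, False),
    ],
]

columns, table = align_records(records)
for row in table:
    print(row)
```

Keying columns on font attributes alone is a deliberate simplification. A fuller approach would likely also consider position on the rendered page and the styles resolved from the cascading style sheets.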