Web Data Extraction with VB.NET
by Denis Susac

Description

The explosive growth in the amount of data available online has driven the development of various technologies for intelligent information retrieval. This data takes many different forms, ranging from the totally unstructured (as in text files) to the highly formatted (as in relational database systems). Most web content is designed for presentation rather than information extraction, so it is semi-structured and has no fixed schema. In this article, author Denis Susac explains how to implement an extensible web application for information extraction using standard components and techniques. The solution uses relatively simple XSLT stylesheets to transform retrieved data into data-centric XML, and it requires no code or database design changes when the source HTML document changes. The extraction logic is encapsulated in external XSL files, which are easy to develop and maintain. The approach also provides an efficient solution for developing various text mining, comparison shopping, business intelligence, and decision support systems.