What is web scraping?
It is the process of extracting structured information from unstructured or semi-structured web data sources. Web extraction is also referred to as web data mining or web scraping.
Web scraping (or web extraction) is done by writing a program or script, in any programming language, that processes the unstructured or semi-structured HTML pages of a target website, or other text-based web documents, and converts the data it finds into a structured format. A scraper connects to a website and requests a page exactly as your browser would. The web server sends back the HTML page, and the scraper then extracts the specific information it needs from that page.
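The request-then-extract flow described above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the names `LinkExtractor`, `extract_links`, `fetch_html`, and `SAMPLE_HTML`, the `example-scraper/0.1` user agent string, and the choice to extract links are all assumptions made for this example. The sample page is an inline string so the extraction step can be run without network access.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collects the URL and visible text of every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # structured output: one dict per link
        self._href = None    # href of the <a> currently being read, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.links.append({"url": self._href, "text": text})
            self._href = None


def extract_links(html):
    """Turn semi-structured HTML into a list of {"url", "text"} records."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_html(url):
    """Request a page much as a browser would and return its HTML as text."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical page content standing in for a server response.
SAMPLE_HTML = """
<html><body>
  <h1>Products</h1>
  <ul>
    <li><a href="/items/1">Red mug</a></li>
    <li><a href="/items/2">Blue <b>plate</b></a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML):
        print(link["url"], link["text"])
```

To scrape a live page, pass the result of `fetch_html(url)` to `extract_links` instead of `SAMPLE_HTML`. Real sites often need more care than this sketch shows, such as error handling, rate limiting, and respecting the site's terms of use and robots.txt.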
Web data mining is often equated with web content mining or web text mining, because content, and text in particular, is the most widely researched area of the internet. Extracting data from HTML pages is one instance of web data mining. Web data mining tasks fall into three main types: web content mining, web structure mining, and web usage mining.