Web Scraping

There are two terms in web searching, namely Web Crawling and Web Scraping. The two terms have a difference. If Web Crawling looks for the location/address of the site, Web Scrapping retrieves the content or site content.

Google for example, when searching, this engine will crawl sites around the world based on keywords. Next the user will get a list of sites suggested by Google based on their ranking. Furthermore, the content will be searched manually by the user from sites suggested by Google or other search engines. The combination of crawling and scraping results in a system that in addition to finding the location of sites based on keywords, also retrieves the content of these sites.

The following video is an example of how a simple application scrapes information from four sample sites based on keywords and then the results are stored in csv (Jurnal link). Stored results can be further utilized, for example for sentiment analysis (Jurnal link).