Web Scraping using Beautiful Soup

Data Source: World Population dataset

Libraries and Packages: BeautifulSoup, Pandas, Requests


Steps

Objective: To extract a dataset from a webpage (open source)

1. Importing the libraries

2. Web scraping

3. Storing the extracted data


1. Importing the Libraries

The process begins by importing the BeautifulSoup and Pandas libraries (after a pip install, if needed), along with Requests, which is used to send the HTTP request and get back the HTML page.
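A minimal sketch of the imports described above (the install command is only needed if the packages are not already present):

```python
# Install first if needed:
#   pip install beautifulsoup4 pandas requests
from bs4 import BeautifulSoup  # parsing the HTML
import pandas as pd            # building and saving the DataFrame
import requests                # fetching the web page
```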

2. Web Scraping


  • Assign the URL and request the page.
  • Use BeautifulSoup to parse the HTML, and print it to view all the tags.
  • Inspect the elements: go to the webpage and use the browser's inspect tool to identify the tags where the data of interest is embedded. In this case, we will be scraping the columns separately.
  • Create a list for each column and a dictionary; append each column's data to its list and add the lists to the dictionary.
  • In doing this, all the scraped data is collected in a dictionary, ready to be converted into a DataFrame.
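The steps above can be sketched as follows. Since the actual page is not reproduced here, the snippet parses a small inline HTML table standing in for the World Population page; the commented-out URL and the tag names are assumptions, not the original source:

```python
import requests
from bs4 import BeautifulSoup

# In the real notebook the page would be fetched and parsed like this
# (placeholder URL, not the actual data source):
# page = requests.get("https://example.com/world-population")
# soup = BeautifulSoup(page.text, "html.parser")

# Small inline stand-in for the page's HTML table:
html = """
<table>
  <tr><th>Country</th><th>Population</th></tr>
  <tr><td>China</td><td>1425887337</td></tr>
  <tr><td>India</td><td>1417173173</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
# print(soup.prettify())  # view all the tags to find where the data sits

# Scrape each column separately into a list, then build the dictionary.
rows = soup.find_all("tr")[1:]  # skip the header row
countries = [row.find_all("td")[0].get_text() for row in rows]
populations = [int(row.find_all("td")[1].get_text()) for row in rows]
data = {"Country": countries, "Population": populations}
```

The dictionary `data` now holds one list per column, which is exactly the shape `pandas.DataFrame` expects.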

3. Storing the Extracted Data

The data in the dictionary can now be saved by converting it into a DataFrame using the Pandas library. The data has now been saved and can be opened in CSV or XLSX format using MS Excel, as shown above.
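A short sketch of this final step; the dictionary contents and the output filename here are illustrative:

```python
import pandas as pd

# Illustrative dictionary of scraped columns (stand-in values):
data = {"Country": ["China", "India"],
        "Population": [1425887337, 1417173173]}

# Convert the dictionary into a DataFrame and save it as a CSV file,
# which can then be opened in MS Excel.
df = pd.DataFrame(data)
df.to_csv("world_population.csv", index=False)
```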