Artificial Corner

AI & Python #5: Web Scraping in Python with Beautiful Soup

Getting started with web scraping in Python (Part 1).

The PyCoach
Feb 22, 2024

Hi!

I prepared two tutorials to help you get started with Beautiful Soup and Selenium. In these tutorials, we’ll scrape a simple website from scratch so that you can see for yourself how these two libraries differ.

In this article, we’ll extract football data from all the FIFA World Cups played from 1930 to 2022. That’s around one thousand games.

Here are six of these games (we’ll extract some of this data).

[Image: six World Cup matches as listed on Wikipedia]

To extract this data, we’ll scrape Wikipedia using Python and Beautiful Soup. The data we want is spread across multiple Wikipedia pages, so we’ll first extract data from one page and then write a for loop to extract data from all of them.
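Here is a minimal sketch of that plan: scrape one page, then loop over every tournament. The URL pattern and the helper names (build_url, scrape_all, scrape_page) are my assumptions for illustration, not code from this tutorial. The single-page scraper is passed in as a function, so the loop can be tried out before the real scraper exists.

```python
# World Cups are held every four years starting in 1930,
# except 1942 and 1946, when no tournament took place.
YEARS = [year for year in range(1930, 2023, 4) if year not in (1942, 1946)]

# Assumption: each tournament has an English Wikipedia page with this URL pattern.
BASE_URL = "https://en.wikipedia.org/wiki/{year}_FIFA_World_Cup"


def build_url(year):
    """Return the (assumed) Wikipedia URL for one World Cup."""
    return BASE_URL.format(year=year)


def scrape_all(years, scrape_page):
    """Run a single-page scraper on every year and collect the results.

    scrape_page is a hypothetical function that takes a URL and returns
    a list of dictionaries, one per match found on that page.
    """
    all_matches = []
    for year in years:
        for match in scrape_page(build_url(year)):
            # Tag each match with the tournament year it came from.
            all_matches.append({"year": year, **match})
    return all_matches


def fake_scrape_page(url):
    """Stand-in scraper with placeholder values (not real match data)."""
    return [{"home": "Team A", "away": "Team B"}]


matches = scrape_all(YEARS, fake_scrape_page)
print(len(YEARS), "tournaments,", len(matches), "placeholder matches")
```

Swapping fake_scrape_page for a real Beautiful Soup scraper leaves the loop unchanged.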

Let’s install the libraries.

Installing the libraries

In this tutorial, we’ll use Beautiful Soup (imported as bs4) to scrape websites, lxml to parse HTML documents, and requests to send HTTP requests to the target website.
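To make the division of labor concrete, here is a rough sketch of how the three libraries fit together: requests downloads the page, lxml parses it, and Beautiful Soup lets us search the result. The function names and the sample HTML are made up for illustration. The sample string stands in for a downloaded page so the parsing step runs without a network call.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page with requests and return its HTML as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def parse_html(html):
    """Parse HTML with the lxml parser and return a BeautifulSoup object."""
    return BeautifulSoup(html, "lxml")


# Made-up HTML, used here instead of calling fetch_html on a real URL.
SAMPLE_HTML = "<html><body><h1>Sample page</h1><p class='score'>2-1</p></body></html>"

soup = parse_html(SAMPLE_HTML)
heading = soup.find("h1").get_text()
score = soup.find("p", class_="score").get_text()
print(heading, score)
```

On a real page, you would pass the output of fetch_html to parse_html instead of the sample string.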

Here are the commands you need to run in the terminal to install these libraries (Beautiful Soup is published on PyPI as beautifulsoup4 and imported as bs4).

pip install beautifulsoup4
pip install lxml
pip install requests

In addition to the previous libraries, we’ll install pandas to better manage the data we’re going to extract.

pip install pandas
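If you want to confirm that everything installed correctly before moving on, a quick check like the one below can help. This is an optional sketch of my own, not a step from the tutorial. It relies only on the standard library and on the import names bs4, lxml, requests, and pandas.

```python
from importlib.util import find_spec

# Import names of the libraries used in this tutorial.
REQUIRED = ["bs4", "lxml", "requests", "pandas"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All libraries are installed.")
```

If a name is reported as missing, rerun the corresponding pip install command in the same environment you use to run Python.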

Now let’s start coding!

Part 1: Scraping data from one World Cup

© 2025 Frank Andrade