As we’ve seen in previous articles, Python is the language of choice for AI. In this article, I present the four stages of learning Python for AI and machine learning. I’ve left many links to resources that will come in handy.
Python is the most popular language in the AI community due to its simplicity, flexibility, and data science libraries such as Pandas, NumPy, and scikit-learn. In this article, we will see the Python stuff you need for AI and machine learning and discover what stage you’re in.
I’ll describe each stage and give you tips on how to master them so that you can move to the next stage.
Stage 1: The Basics of Python
This stage is for anyone who is learning the basics of Python.
At this level, you should at least know basic concepts such as data types and variables. Knowing the most popular options to store data (lists, dictionaries, and tuples) is a must at this level. You should also be able to use conditional statements and control flow tools. This includes if/else statements, boolean operations, and different types of loops (for, while, and nested).
Conditional statements, control flow, and loops open the door for a large variety of things you can do with Python, so use them and stay curious to develop a strong foundation necessary for the next stage.
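To make these concepts concrete, here is a minimal sketch that combines a dictionary, a list, tuples, if/elif/else, a boolean operation, and both kinds of loops. The names and scores are made up for illustration.

```python
# Hypothetical sample data: names and scores are invented for this sketch.
scores = {"Ana": 82, "Ben": 47, "Chloe": 65}   # dictionary: name -> score
passing_mark = 50                               # integer variable
grades = []                                     # list that will hold tuples

for name, score in scores.items():              # for loop over a dictionary
    if score >= 80 and score <= 100:            # boolean operation
        label = "excellent"
    elif score >= passing_mark:
        label = "pass"
    else:
        label = "fail"
    grades.append((name, label))                # tuple: an immutable pair

# while loop: count the entries that come before the first "fail"
index = 0
while index < len(grades) and grades[index][1] != "fail":
    index += 1

print(grades)
print(index)
```

Try changing the scores or the passing mark and predicting the result before you run the cell.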
One last important thing at this level is to start getting familiar with Jupyter Notebook. A Jupyter notebook lets us combine code with equations, visualizations, and text in a single document.
Topics: Data types, variables, lists, dictionaries, tuples, conditions, operators, control flow (if / else), loops, iterables, functions, file I/O operations (read, write to text files), and common methods.
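Functions and file I/O from the list above can be sketched as follows. The file name and its contents are hypothetical, and the file is written to a temporary directory so nothing on your machine is overwritten.

```python
import os
import tempfile


def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


# Write to a text file in a temporary directory (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Python is fun\nAI needs data\n")

# Read the file back, one line at a time.
lines = []
with open(path, encoding="utf-8") as f:
    for line in f:
        lines.append(line.strip())

# Call the function on every line that was read.
counts = []
for line in lines:
    counts.append(word_count(line))

print(lines)
print(counts)
```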
How to master this level? As I mentioned before, solving problems that involve conditional statements, control flow, and loops will help you master stage 1.
Projects for beginners:
Stage 2: Python for Data Analysis
This is what I call the “essential Python stuff to work with data.” This means having at least a basic understanding of libraries used for data analysis such as Pandas, NumPy, Matplotlib, and Seaborn.
Using those libraries to solve common data analysis tasks such as data cleaning, exploratory data analysis (EDA) through visualizations, and feature engineering is important at this level.
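As a rough illustration of those three tasks, here is a small Pandas sketch. The dataset, its column names, and the choice to fill missing prices with the median are all assumptions made for the example, not a recommended recipe.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; column names and values are made up.
df = pd.DataFrame({
    "city": ["Lima", "lima ", "Quito", None],
    "price": [100.0, np.nan, 250.0, 80.0],
    "rooms": [2, 3, 5, 1],
})

# Data cleaning: normalize text, drop rows without a city, fill missing prices.
df["city"] = df["city"].str.strip().str.title()
df = df.dropna(subset=["city"]).copy()
df["price"] = df["price"].fillna(df["price"].median())

# EDA: a quick summary statistic per city.
summary = df.groupby("city")["price"].mean()

# Feature engineering: derive a new column from existing ones.
df["price_per_room"] = df["price"] / df["rooms"]

print(summary)
print(df)
```

In a real project the EDA step would usually also include plots, for example with Matplotlib or Seaborn.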
Data analysis tutorials:
If you’re able to understand the code in the tutorials above, then you’re at this stage.
Regarding the stuff you already knew in stage 1, there’s still room for improvement, especially for the stuff you would frequently use for data analysis. Some of them are list comprehension, lambda, zip(), f-string, and the with statement.
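Here is a short sketch that uses each of these features once. The sample values are made up, and io.StringIO stands in for a real file so the example has no side effects.

```python
import io

names = ["ana", "ben", "chloe"]      # hypothetical sample values
ages = [31, 25, 40]

# List comprehension: build a new list in one expression.
capitalized = [name.title() for name in names]

# zip(): iterate over two sequences in parallel.
people = list(zip(capitalized, ages))

# lambda: a small anonymous function, used here as a sort key.
by_age = sorted(people, key=lambda person: person[1])

# f-string: embed expressions inside a string literal.
lines = [f"{name} is {age} years old" for name, age in by_age]

# with statement: the resource is closed automatically when the block ends.
with io.StringIO() as buffer:
    buffer.write("\n".join(lines))
    report = buffer.getvalue()

print(report)
```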
Last but not least, acquiring skills necessary for data collection like web scraping will come in handy. Below are guides to learning web scraping from scratch.
Web scraping guides:
Topics: Most of the methods/functions used in Pandas, NumPy, Matplotlib, Seaborn, and web scraping libraries (Selenium and Scrapy). List comprehension, lambda, zip(), f-string, the with statement, and any other stuff that helps you write better code.
How to master this level? Solving Python projects. At this stage, projects usually involve all the data analysis libraries mentioned before. Make sure you start projects that have topics you’re interested in (that’s more fun!)
Projects for stage 2:
Stage 3: Python for Statistics & Math
In stage 3, different fields come together, so your Python project becomes an ML project. You already know how to clean data and conduct EDA from stage 2, but you’re also expected to know the fundamental statistics and math behind ML.
Statistics is crucial to make sure the data you are using to train a model is not biased. For example, using Matplotlib and Seaborn to plot histograms and boxplots will help you identify outliers. In addition to that, you should know how to apply most statistical concepts to a project. You should know how to deal with imbalanced data, split data into training and test sets, and formulate a problem and hypothesis.
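Two of these ideas can be sketched in a few lines. The example below flags outliers with the 1.5 × IQR rule (the rule a standard boxplot uses for its whiskers) and then splits an imbalanced dataset while preserving the class ratio. The data is invented, and stratified splitting is only one of several ways to handle imbalance.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical measurements with one obvious outlier (values are made up).
values = np.array([10, 12, 11, 13, 12, 11, 95])

# 1.5 * IQR rule: points beyond the whiskers of a boxplot count as outliers.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]

# Imbalanced labels: 90 samples of class 0 and 10 samples of class 1.
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)

# stratify=y keeps the 90/10 class ratio in both the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(outliers)
print(y_train.mean(), y_test.mean())
```

Without stratify, a small test set could end up with very few (or zero) minority-class samples, which makes evaluation unreliable.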
Some topics in math you should know are functions and matrices. This stuff is implemented in Python through NumPy. NumPy supports large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on these arrays.
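For instance, a few common matrix operations and an element-wise function look like this in NumPy. The matrix values are arbitrary, and the sigmoid function is just one illustrative choice of mathematical function.

```python
import numpy as np

# A 2x2 matrix and a vector (values chosen only for illustration).
A = np.array([[2.0, 0.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])

y = A @ x                    # matrix-vector product
A_t = A.T                    # transpose
A_inv = np.linalg.inv(A)     # inverse (A is invertible: its determinant is 6)


def sigmoid(z):
    """Apply the logistic function element-wise to an array."""
    return 1.0 / (1.0 + np.exp(-z))


activations = sigmoid(np.array([-1.0, 0.0, 1.0]))

print(y)
print(activations)
```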
Another important thing you should understand is how machine learning algorithms work. There’s a lot of math and statistics behind those algorithms, so make sure you understand them before learning the Python code that lets you build them.
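One way to check your understanding is to implement a simple algorithm by hand before reaching for a library. The sketch below fits a line with gradient descent on the mean squared error. The data, learning rate, and number of iterations are assumptions picked so that this toy problem converges.

```python
import numpy as np

# Hypothetical data generated from y = 2x + 1 with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0          # parameters to learn (slope and intercept)
learning_rate = 0.05     # step size, chosen by hand for this toy problem

for _ in range(2000):
    error = (w * x + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # Move each parameter a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)
```

If the derivative in the loop makes sense to you, the library versions of linear models will feel much less like a black box.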
Guides for stage 3:
Topics: Imbalanced data, train/test splitting, machine learning algorithms, arrays/matrices (NumPy), data visualization (Matplotlib/Seaborn). Above all, you should know how to apply statistics and math to a project.
How to master this level? Solving projects such as sentiment analysis, credit card fraud detection, and customer churn prediction.
Stage 4: Python for Machine Learning
The last stage is all about developing machine learning models. The scikit-learn library is a good place to start. Some basic things you should be able to do with this library are text representation (bag of words, CountVectorizer, TF-IDF), model selection, evaluation, and parameter tuning. This project covers all these topics. If you’re able to understand the code, then you’re at this level.
Other important libraries for data scientists at this level are Keras and TensorFlow. Keras features several of the building blocks and tools necessary for creating a neural network, such as neural layers, activation and cost functions, and objectives. TensorFlow is one of the best libraries available for working with machine learning in Python. It makes building machine learning models easy for beginners and professionals alike.
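To give a feel for how those pieces fit together in scikit-learn, here is a minimal sketch on a tiny, made-up sentiment dataset. The corpus, the choice of logistic regression, and the parameter grid are assumptions for illustration. A real project would use far more data and a proper held-out test set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus: 1 = positive review, 0 = negative review.
texts = [
    "great movie loved it", "wonderful acting great plot",
    "loved the story", "great fun wonderful cast",
    "terrible movie hated it", "awful acting terrible plot",
    "hated the story", "terrible boring awful cast",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Text representation (TF-IDF) and the classifier live in one pipeline.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression()),
])

# Parameter tuning and model selection: try a few values of the
# regularization strength C using 2-fold cross-validation.
search = GridSearchCV(pipeline, {"clf__C": [0.1, 1.0, 10.0]}, cv=2)
search.fit(texts, labels)

# Evaluation on unseen sentences (also made up for this sketch).
predictions = search.predict(["loved it great cast", "boring and terrible"])
accuracy = accuracy_score([1, 0], predictions)

print(search.best_params_)
print(predictions, accuracy)
```

Swapping TfidfVectorizer for CountVectorizer in the pipeline gives the plain bag-of-words representation instead.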
Guides for stage 4:
Topics: Text representation, model selection, evaluation, and parameter tuning, among others.
How to master this level and beyond? This will depend on the area you’re interested in. Find an area you like and learn the necessary libraries you need for it. For example, if you’re into NLP, learning NLTK and solving projects like building a movie recommender system or a chatbot would help you get started in this area.