AI & Python #39: Here's What Most Python and Data Science Courses Don't Teach You
And free resources to learn them on your own.
There’s no perfect course in this world, but do you know what most of them don’t teach you?
The not-so-obvious skills and tools you’ll need at work!
Most courses are great for getting you started with a programming language or refreshing your tech skills, but they rarely include modules on the skills you need to work efficiently with other programmers and non-technical coworkers.
Here are three things you probably didn’t learn in an online course (plus resources to learn them on your own).
Software Engineering Practices
As a data scientist, you’ll be writing code in languages such as SQL, Python, and R. Most data science courses will help you get started with coding, but they rarely teach or emphasize good practices.
Good practices such as writing clean, modular code and optimizing it are standard in software engineering. Believe it or not, they will also help you become a better data scientist.
Here’s a simple example of how to optimize your code:
# Imagine we have an array of random exam scores and we want the average score of those who failed the exam (score < 70)
# Below are two ways of solving this problem (one using a loop and the other using vector operations)
import time
import numpy as np
random_scores = np.random.randint(1, 100, size=10000001)
# SOL1: solving problem using a for loop
start = time.time()
count_failed = 0
sum_failed = 0
for score in random_scores:
    if score < 70:
        sum_failed += score
        count_failed += 1
print(sum_failed/count_failed)
print(f'Duration: {time.time() - start} seconds')
# SOL2: solving problem using vector operations
start = time.time()
mean_failed = (random_scores[random_scores < 70]).mean()
print(mean_failed)
print(f'Duration: {time.time() - start} seconds')
That’s only a basic example with a few lines of code! The longer the script, the more important it is to follow these good practices.
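The same problem can also illustrate the clean and modular code points mentioned earlier. Below is a minimal sketch, not the only way to do it: the function name `mean_below_threshold` and its `threshold` parameter are my own illustrative choices, and returning NaN when nobody failed is an assumption about how you'd want that edge case handled.

```python
import numpy as np


def mean_below_threshold(scores: np.ndarray, threshold: int = 70) -> float:
    """Return the mean of all scores strictly below `threshold`.

    Returns NaN when no score is below the threshold (an assumed
    convention for this sketch) instead of dividing by zero.
    """
    failed = scores[scores < threshold]  # vectorized filter, as in SOL2
    if failed.size == 0:
        return float("nan")
    return float(failed.mean())


# Small, hand-checkable example: only 50 and 60 are below 70
example_scores = np.array([50, 60, 90, 100])
print(mean_below_threshold(example_scores))  # 55.0
```

Wrapping the logic in a named function with a docstring means a coworker can reuse it with a different passing score, and you can test it on a tiny array before running it on millions of rows.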
What can you do to learn good software engineering practices?
You can search for guides or tutorials on good software engineering practices for data science, or read a book that covers all these points and gives you the big picture.
Here are 5 Python books that every beginner should read to go beyond the basics. I’d recommend checking out the book “Clean Code in Python” in particular because it covers most of the points mentioned above.