AI & Python #32: Software Engineering Practices Anyone Learning Python Should Know
Best practices that you should learn.
Let’s face it, we sometimes don’t pay attention to things like writing efficient code, code structure, and maintainability.
But we should!
One day you’ll be part of projects that involve working with other people. This is why we need to write robust code and follow the good practices that software engineers rely on.
Here are some software engineering best practices that you should know.
Write Clean Code
Writing clean code means writing readable, simple, and concise code. Clean code is the foundation of a script that is easy to maintain.
Believe me, you’ll make the lives of your teammates easier by writing code that is easy to understand. Simple is better than complex, so don’t write complex code that even you might struggle to understand.
Here’s an example of writing clean code:
# Imagine we want to write a program that categorizes a task based on its execution time
# Below are two ways of writing this code (one is cleaner than the other)
# bad
t = end - start_time # get execution time
c = category(t) # get category of task
print(f'Task Duration: {t} seconds, Category: {c}')
# good
execution_time = end_time - start_time # get execution time
category = categorize_task(execution_time) # get category of task
print(f'Task Duration: {execution_time} seconds, Category: {category}')
Below are the good practices followed in the previous example:
Use meaningful variable names: Make your variable names explanatory and descriptive. A variable named end isn’t as explanatory as end_time, and a boolean variable named single isn’t as descriptive as is_single. Don’t use abbreviations in a variable name that no one will understand (e.g., t and c), and don’t write variable names so long that no one will remember them.
Use indentation and whitespace properly: There are many conventions here, like using four spaces for indentation or separating sections with a blank line. They might be hard to remember; fortunately, IDEs like PyCharm will flag code that breaks these conventions and show you how to fix it.
Follow PEP 8 conventions as much as possible when naming objects (e.g., which case convention to use, when to use an underscore, etc.).
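To see these practices in a script you can actually run, here’s a minimal sketch of the example above. Keep in mind that the categorize_task function, its thresholds, and the labels it returns are assumptions made up for illustration; the original snippet doesn’t define them.

```python
import time

# Hypothetical thresholds (in seconds), chosen only for illustration
SHORT_TASK_LIMIT = 1.0
MEDIUM_TASK_LIMIT = 10.0


def categorize_task(execution_time):
    """Return a label for a task based on how long it took to run."""
    if execution_time < SHORT_TASK_LIMIT:
        return "short"
    if execution_time < MEDIUM_TASK_LIMIT:
        return "medium"
    return "long"


start_time = time.perf_counter()
total = sum(range(100_000))  # stand-in for the real task
end_time = time.perf_counter()

execution_time = end_time - start_time
category = categorize_task(execution_time)
is_short_task = category == "short"  # boolean names read well with an "is_" prefix

print(f"Task Duration: {execution_time:.4f} seconds, Category: {category}")
```

Notice that the names carry most of the meaning on their own, so the code needs very few comments.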
Write Modular Code
Modular code means writing code that can be separated into functions and modules.
A program that can be broken into modules helps when debugging. As a program grows in size, it’s a good practice to split your code into modules. This lets you easily pinpoint the source of errors.
Also, modular code will help you avoid repetition and write efficient and reusable code.
Here’s some advice to start writing modular code:
Don’t repeat yourself: If you’re writing the same block of code more than once to accomplish the same task, consider creating a function or a for loop instead.
# Imagine we have a list named "numbers" and we want to add a given number to each element of the list and store the results in a new list
# Below are two ways of writing this code (one avoids repetition)
numbers = [10, 20, 30, 40, 50]
# bad
numbers1 = []
for number in numbers:
    numbers1.append(number+1)
numbers5 = []
for number in numbers:
    numbers5.append(number+5)
# better
def sum_number(my_list, n):
    return [number + n for number in my_list]
sum_1 = sum_number(numbers, 1)
sum_5 = sum_number(numbers, 5)
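Minimize the number of functions, classes, and modules: Splitting code into more pieces than it needs makes it harder to follow, so break things up only where it actually helps.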
Single responsibility principle: A class/function should have only one responsibility. If they do more than one thing, consider refactoring them to two or more classes/functions.
Use modules
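Here’s a rough sketch of what the single responsibility principle can look like in practice. The function names and the sample input are hypothetical, picked only to illustrate the idea: each function does exactly one thing (parsing, computing, or formatting), so each one can be tested and reused on its own.

```python
def load_scores(raw_text):
    """Parse a comma-separated string into a list of integer scores."""
    return [int(value) for value in raw_text.split(",") if value.strip()]


def average(scores):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(scores) / len(scores)


def format_report(mean_score):
    """Build the text shown to the user."""
    return f"Average score: {mean_score:.1f}"


scores = load_scores("70, 85, 90")
report = format_report(average(scores))
print(report)
```

If the input format changes tomorrow, only load_scores needs to change. Once a script grows, functions like these could be moved into their own file and imported wherever they’re needed.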
Optimize Your Code
Writing code that works is good, but you know what’s better? Writing efficient code that runs fast and consumes little memory and storage. This is why you should optimize your code (even if it already does the job).
Writing efficient code isn’t easy. This is a skill that you’ll learn over time. That said, here’s some advice that will help you start writing more efficient code today.
Vectorize your operations: Use vector operations (NumPy) over loops whenever possible.
Inspect the running time of every operation: This will help you find bottlenecks in your script.
Know your data structures and which methods are faster
Let’s see an example of how vectorizing your operations can help optimize the performance of your script:
# Imagine we have an array of random exam scores and we want to get the average score of those who failed the exam (score<70)
# Below are two ways of solving this problem (one using loops and the other using vector operations)
import time
import numpy as np
random_scores = np.random.randint(1, 100, size=10000001)
# SOL1: solving problem using a for loop
start = time.time()
count_failed = 0
sum_failed = 0
for score in random_scores:
    if score < 70:
        sum_failed += score
        count_failed += 1
print(sum_failed/count_failed)
print(f'Duration: {time.time() - start} seconds')
# SOL2: solving problem using vector operations
start = time.time()
mean_failed = (random_scores[random_scores < 70]).mean()
print(mean_failed)
print(f'Duration: {time.time() - start} seconds')
If you run the snippet above, you’ll see that both solutions get the same result; however, solution 1 took around 2.57 seconds to get the job done, while solution 2 needed only 0.06 seconds (exact timings will vary from machine to machine).
Although this little tweak saved us only about 2.5 seconds, at a larger scale it can make a big difference in performance.
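The other two tips, timing your operations and knowing your data structures, can be tried out in a few lines too. Here’s a minimal sketch that uses the standard library’s timeit module to compare a membership check on a list against the same check on a set. The variable names and sizes are arbitrary choices for illustration, and the timings you get will depend on your machine.

```python
import timeit

item_count = 100_000
items_list = list(range(item_count))
items_set = set(items_list)
missing_item = -1  # not in the collection, so the list has to check every element

# Run each membership check 100 times and measure the total time
list_seconds = timeit.timeit(lambda: missing_item in items_list, number=100)
set_seconds = timeit.timeit(lambda: missing_item in items_set, number=100)

print(f"list lookup: {list_seconds:.5f} seconds")
print(f"set lookup:  {set_seconds:.5f} seconds")
```

A list checks its elements one by one, while a set uses hashing to jump straight to the answer, so the set should come out well ahead here. Measuring instead of guessing is how you find out which part of a script is the real bottleneck.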