# The ultimate Python style guidelines

Coding guidelines help engineering teams to write consistent code which is easy to read and understand for all team members.

Python has an excellent style guide called PEP8. It covers most of the situations you will run into while writing Python. I like PEP8, and I believe a lot of effort and thought went into it. On the other hand, PEP8 is a generic guideline rather than a set of strict rules, as it allows different approaches to achieve similar goals. That can be a problem for teams with mixed skill levels that use methodologies where all team members are equal and there is always room to argue. It is much easier to write code or do a code review against a strictly defined, practical style guide. Establishing such guidelines can be difficult, but it is very beneficial for the whole team if it is done the right way.

Throughout my career, I have defined one for myself, and I want to share it.

The final goal of this guide is to have code that is clean, consistent, and efficient. Remember: code is read more often than it is written, and it is written for people to read and only incidentally for machines to execute. Some parts of the guide are opinionated and meant to be strictly followed to preserve consistency when writing new code.

## Code Layout

• Use 4 spaces instead of tabs
• Maximum line length is 120 characters
• 2 blank lines between classes and functions
• 1 blank line between methods within a class
• No blank line following a def line
• No whitespace inside parentheses, brackets or braces.
# good
spam(ham[1], {eggs: 2}, [])

# bad
spam( ham[ 1 ], { eggs: 2 }, )

• Surround binary operators with a single space on either side for assignment, comparison, and booleans.
# good
x == 1

# bad
x<1

• Never use spaces around = when passing keyword arguments or defining a default parameter value.
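
A minimal sketch of this rule. The function connect and its parameters are hypothetical names that I made up purely for illustration.

```python
# Hypothetical function: the names are illustrative only.
def connect(host, port=5432, timeout=10):  # no spaces around = in defaults
    return f'{host}:{port} (timeout={timeout}s)'


# good: no spaces around = in keyword arguments
address = connect('localhost', port=6543, timeout=5)

# bad (valid Python, but against this guide):
# address = connect('localhost', port = 6543, timeout = 5)
```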

• Use blank lines to separate logical steps inside functions/methods wherever justified

• Move function arguments to a new line with an indentation, if they do not fit into the specified line length

# Good
def long_function_name(var_one, var_two, var_three,
                       var_four):
    print(var_one)

# Good
def long_function_name(
    var_one,
    var_two,
    var_three,
    var_four
):
    print(var_one)

• Put each logical condition on its own line if the whole expression does not fit into the maximum line length. This helps the reader understand the condition by reading from top to bottom. Bad formatting hurts readability and understanding.
# Good
if (this_is_one_thing
        and that_is_another_thing
        or that_is_third_thing
        or that_is_yet_another_thing
        and one_more_thing
):
    do_something()

• Split long strings using implicit concatenation inside parentheses, not backslash (\) continuations, since it is much more readable.
raise AttributeError(
    'Here is a multiline error message '
    'shortened for clarity.'
)

• Place a class' __init__ at the beginning of each class
• Use named arguments to improve readability and avoid dummy mistakes in the future
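
Here is a small sketch of what I mean by named arguments. The helper create_user and its parameters are hypothetical and exist only for this example.

```python
# Hypothetical helper used only to illustrate the rule.
def create_user(name, is_admin=False, is_active=True):
    return {'name': name, 'is_admin': is_admin, 'is_active': is_active}


# Bad: the reader has to look up what True and False stand for
user = create_user('alice', True, False)

# Good: every value is labeled at the call site
user = create_user(name='alice', is_admin=True, is_active=False)
```

Both calls build the same user, but the second one still reads correctly if someone later reorders or adds parameters.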

• Never use the positional-only parameter syntax (/) introduced in Python 3.8. Disallowing keyword arguments is considered a code smell.
• Do not terminate your lines with semicolons, and do not use semicolons to put two statements on the same line.
• Chained method calls should be broken up across multiple lines
(df.write
    .format('jdbc')
    .option('url', 'jdbc:postgresql:dbserver')
    .option('dbtable', 'schema.tablename')
    .save()
)


## Naming

• Use snake_case for modules, variables, attributes, functions, and method names, not CamelCase
• Use CamelCase for class names and factories
• Names should make clear what a variable, class, or function contains or does. If a developer cannot come up with one clear name, then something is wrong with the implementation (see SRP)
• Don't include the type of a variable in its name. E.g. use senders instead of senderlist

## Formatting

• Use double quotes (") around strings that are used for interpolation or that are intended for the end user to see, and use single quotes (') everywhere else.
CONFIG = {
    'db_name': "db",
    'port': 4321,
}
MESSAGES = {
    'en': "Hello %s",
}

def welcome(language, name):
    """Return a language-appropriate greeting"""
    return MESSAGES[language] % name

• Add trailing commas in sequences of items only when the closing container token ], ), or } does not appear on the same line as the final element
# good
x = [1, 2, 3]
# good
y = [
    0,
    1,
    4,
    6,
]
z = {
    'a': 1,
    'b': 2,
}

# bad
y = [
    0,
    1,
    4,
    6
]
z = {
    'a': 1,
    'b': 2
}

• To format strings, use the format function or, if you are using Python 3.6+, f-strings:
# Bad
print('var: %s' % var)

# Good
print('var: {}'.format(var))

# Good for Python 3.6+
print(f'var: {var}')

• Always start a new block on a new line
# Bad
if flag: return None

# Good
if flag:
    return None

• Set your IDE to normalize inconsistent line endings

## Commenting

• First of all, if code needs comments to explain how it works, consider refactoring or rewriting it. The best comment is the code itself
• Describe complex, possibly incomprehensible points and side effects in the comments
• Start every comment with a single space after the #
#bad comment
# good comment

• If a piece of code is poorly understood, or is likely to need refactoring or a change in the future, mark it with a TODO note containing your last name and the Jira ticket number (if you have one): @TODO(lastnameN)
def get_ancestors_ids(self):
    # @TODO(mysurnameN): do a cache reset while saving and changing the category tree
    cache_name = '{0}_ancestors_{1}'.format(self._meta.model_name, self.pk)
    cached_ids = cache.get(cache_name)
    if cached_ids:
        return cached_ids

    ids = [c.pk for c in self.get_ancestors(include_self=True)]
    cache.set(cache_name, ids, timeout=3600)

    return ids


## Type annotations

Type annotations in function signatures and module-scope variables are required. This is good documentation and can also be used with mypy for type checking and error checking. Whenever possible, annotations should be in the source. Use pyi files for third-party or extension modules.
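
A minimal sketch of an annotated module-scope variable and function signature. DEFAULT_RETRIES and find_sender are hypothetical names chosen for this example, not part of any real project.

```python
from typing import List, Optional

# Module-scope variable with an annotation (hypothetical name)
DEFAULT_RETRIES: int = 3


def find_sender(senders: List[str], prefix: str) -> Optional[str]:
    """Return the first sender that starts with prefix, or None."""
    for sender in senders:
        if sender.startswith(prefix):
            return sender
    return None
```

With annotations like these in the source, mypy can flag a call such as find_sender('alice', 1) before the code ever runs.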

## Docstrings

• All docstrings should be written in RST format (Sphinx)
• Write docstrings for each method that is more complicated than hello world. In the docstring, summarize the function/method behavior and document its arguments, return value(s), side effects, exceptions raised, and restrictions
• Wrap docstrings with triple double quotes (""")
• The docstrings and the description of the arguments must be indented
def some_method(name, state=None):
    """This function does something

    :param name: The name to use
    :type name: str
    :param state: Current state to be in (optional, default: None)
    :type state: bool
    :returns: int -- the return code
    :raises: AttributeError, KeyError
    """
    ...
    return 0

• Similarly to branching, do not write methods on one line in any case:
def do_something(self):
    print('Something')



## Exceptions

• Use more specific exceptions, not Exception. Make errors obvious.
• Exceptions should be written only where they are really needed. No need to write them in cases where you can use a simple if statement
• Minimize the amount of code in a try/except block. The larger the body of the try, the more likely that an exception will be raised by a line of code that you didn’t expect to raise an exception.
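
The three rules above can be sketched together like this. The function load_port and the config layout are assumptions made up for this example.

```python
import json


def load_port(raw_config: str) -> int:
    """Parse a JSON config string and return its port (hypothetical helper)."""
    try:
        # Only the call that is expected to fail lives inside try
        config = json.loads(raw_config)
    except json.JSONDecodeError as error:
        # Catch the specific exception instead of a broad Exception
        raise ValueError(f'Config is not valid JSON: {error}') from error

    # A simple if statement is enough here, no try/except needed
    if 'port' not in config:
        raise KeyError('Config must define a port')

    return int(config['port'])
```

Because the try body is a single call, an unexpected error elsewhere in the function cannot be silently swallowed by the except clause.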

## Imports

• Avoid circular imports: do not import modules that are more specialized than the one you are editing
• Relative imports are forbidden (PEP-8 only “highly discourages” them). Where absolutely needed, the from __future__ import absolute_import syntax should be used (see PEP-328)
• Never use * in imports. Always be explicit about what you're importing. Namespaces make code easier to read so please use them
• Break long imports using parentheses and indent by 4 spaces. Include the trailing comma after the last import and place the closing bracket on a separate line
from my_pkg.utils import (
    some_utility_method_1, some_utility_method_2, some_utility_method_3,
    some_utility_method_4, some_utility_method_5,
)

• Imports should be written in the following order, with the groups separated by blank lines:
1. built-in (standard library) modules
2. third-party modules
3. modules of the current project
import os
import logging
import typing as T

import pandas as pd
import pyspark
import pyspark.sql

from my_pkg.config import DBConfig

• Even a file meant to be used as an executable should be importable and a mere import should not have the side effect of executing the program’s main functionality. The main functionality should be in a main() function. That way the code can be imported as a module for testing or reused in the future
def main():
    ...

if __name__ == '__main__':
    main()


## Unit-tests

• All unit tests should be written using the pytest framework.
• There is no need to write a huge unit test with a bunch of assertions. Each unit test should check only one specific thing
# Bad
def test_smth():
    result = f()
    assert isinstance(result, list)
    assert result[0] == 1
    assert result[1] == 2
    assert result[2] == 3
    assert result[3] == 4

# Good
def test_smth_type():
    result = f()
    assert isinstance(result, list), 'Result should be list'

def test_smth_values():
    result = f()
    assert set(result) == set(expected), f'Result should be {set(expected)}'

• The name of the test must clearly express what is being tested.
• assert should be followed by a message explaining what it is checking.

## It is a bad idea to...

use constructions like:

• global variables,
• iterators where they can be replaced by vectorized operations,
• lambda where it is not required,
• map and lambda where it can be replaced by a simple list comprehension,
• multiple nested maps and lambdas,
• nested functions, they are hard to test and debug.
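
To show what I mean by the map and lambda points, here is a small sketch with made-up data. Both versions produce the same lists.

```python
numbers = [1, 2, 3, 4]

# Harder to read: map with a lambda
squares_with_map = list(map(lambda n: n * n, numbers))

# Preferred: a simple list comprehension
squares = [n * n for n in numbers]

# Harder to read: nested map/filter with lambdas
even_squares_with_map = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

# Preferred: one comprehension with a condition
even_squares = [n * n for n in numbers if n % 2 == 0]
```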

## Conclusion

You may well disagree with some of my guidelines here (save your breath). The guide is not ideal, and I am constantly working on it. In the next post, we will talk about how to create a style guide for your team, as well as tooling that keeps the process consistent across the team.

By the way, what style guide are you following?
