
Python fundamentals explained

Python is a high-level, interpreted language designed for readability first. Its whitespace-based syntax, rich standard library, and enormous package ecosystem made it the default choice for data science, machine learning pipelines, DevOps automation, and rapid API prototyping. Unlike compiled languages that optimize for raw throughput, Python optimizes for developer throughput — you ship working code quickly, then reach for C extensions, NumPy, or Rust bindings when a hot path needs speed. This guide walks through core syntax, the collections every script uses, how modules and virtual environments isolate dependencies, async I/O basics, and the patterns that separate tutorial scripts from code you can maintain in production.

What Python is (and where it fits)

Python runs through an interpreter (CPython is the reference implementation) that compiles source to bytecode and executes it on a virtual machine. That indirection costs CPU cycles compared to Go or Rust, but buys flexibility: REPL-driven exploration, dynamic typing, and batteries-included modules for HTTP, JSON, dates, and file I/O without hunting third-party libraries.
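
To see the compile-to-bytecode step for yourself, the standard-library dis module can disassemble a function. This is a minimal sketch: add_fee is a hypothetical function used only for illustration, and the opcodes printed vary between CPython versions, so no expected listing is shown.

```python
import dis


def add_fee(lamports, fee):
    # Hypothetical example function; any function would do
    return lamports + fee


code_obj = add_fee.__code__       # code object produced when the def was compiled
raw_bytecode = code_obj.co_code   # the bytecode itself, stored as bytes

# Print a human-readable listing of the instructions the virtual machine runs
dis.dis(add_fee)
```

The listing is useful for curiosity and occasional debugging; day-to-day code rarely needs to look at bytecode.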

Strong fits: ETL scripts, Jupyter notebooks, ML training and evaluation, internal admin tools, FastAPI/Flask microservices, test harnesses, and glue between databases and queues. Python dominates the machine learning stack because libraries like PyTorch and scikit-learn wrap heavily optimized native code — your Python orchestrates; C++ and CUDA do the math.

Weak fits: latency-sensitive trading engines, mobile apps, browser front ends, and CPU-bound services without native acceleration. For those, teams often prototype in Python and then rewrite hot paths in Rust, or move orchestration to Node.js when the workload is I/O-heavy JSON APIs rather than numeric arrays.

Python 2 vs 3 (still relevant in legacy code)

Python 2 reached end of life on January 1, 2020. All new projects should use Python 3 (3.12+ on most current distros). When migrating legacy code, watch for print used as a statement, integer division that truncates (7 / 2 gives 3 in Python 2 but 3.5 in Python 3), and the unicode vs str split, which Python 3 replaces with a unified text model: str for text, bytes for binary data.
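
The three migration points look like this under Python 3. This is a small illustrative sketch; the variable names are placeholders, not part of any real codebase.

```python
# print is a function call in Python 3, not a statement
print("slot", 42)

# The / operator always performs true division; // is explicit floor division
true_div = 7 / 2
floor_div = 7 // 2

# str holds Unicode text; bytes holds raw encoded data
text = "caf\u00e9"
data = text.encode("utf-8")        # str -> bytes
round_trip = data.decode("utf-8")  # bytes -> str
```

Here text has four characters while data has five bytes, because the accented letter takes two bytes in UTF-8. Keeping that distinction explicit is what the unified text model buys you.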

Syntax essentials: indentation, types, and control flow

Python uses indentation (typically four spaces) instead of braces to delimit blocks. Mixing tabs and spaces is a classic source of IndentationError — configure your editor to insert spaces only.

# Variables are names bound to objects; no explicit declaration
count = 42
price = 19.99
label = "SOL"
active = True

# f-strings (Python 3.6+) are the idiomatic formatting choice
message = f"Balance: {price:.2f} {label}"

# Control flow
if count > 0:
    print(message)
elif count == 0:
    print("empty")
else:
    print("negative")

for i in range(3):
    print(i)

while active:
    active = False

Python is dynamically typed: types attach to objects, not variable names. Optional type hints (PEP 484) document intent and enable static checkers like mypy without changing runtime behavior:

def total(lamports: int, fee: int) -> int:
    return lamports + fee

Hints are not enforced at runtime — they are documentation plus tooling hooks. In larger codebases, combine hints with dataclasses or Pydantic models for structured validation at API boundaries.
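
A short sketch of what "not enforced" means in practice, and one way a dataclass can add a runtime check. The Fee class and its validation rule are assumptions made for illustration; Pydantic would express the same idea with less code.

```python
from dataclasses import dataclass


def total(lamports: int, fee: int) -> int:
    return lamports + fee


# The hints say int, but nothing stops strings at runtime: this concatenates
unchecked = total("12", "3")


@dataclass(frozen=True)
class Fee:
    """Hypothetical boundary type that validates its own input."""
    lamports: int

    def __post_init__(self):
        # Explicit runtime check, because the annotation alone checks nothing
        if not isinstance(self.lamports, int):
            raise TypeError("lamports must be an int")
        if self.lamports < 0:
            raise ValueError("lamports must be non-negative")


valid_fee = Fee(5000)
```

A static checker such as mypy would flag the total("12", "3") call before the code runs; the dataclass check catches bad data that only appears at runtime.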

Core data structures

Four built-in collections cover most day-to-day work. Choosing the right one affects readability and performance more than micro-optimizing loops.

Lists — ordered, mutable sequences

blocks = ["genesis", "slot-1", "slot-2"]
blocks.append("slot-3")
first, *rest = blocks  # unpacking

Tuples — ordered, immutable sequences

Use tuples for fixed-shape records and dictionary keys. Immutability makes them hashable when all elements are hashable.
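
A minimal sketch of both uses. Slot is a hypothetical record type chosen for illustration.

```python
from typing import NamedTuple


class Slot(NamedTuple):
    """Hypothetical fixed-shape record: a tuple with named fields."""
    number: int
    leader: str


slot = Slot(42, "alice")
number, leader = slot             # unpacks like any tuple

# Tuples of hashable values can be dictionary keys
status_by_slot = {slot: "confirmed", Slot(43, "bob"): "pending"}
grid = {(0, 0): "origin", (1, 2): "validator"}

# A tuple that contains a list is not hashable
try:
    hash((1, [2, 3]))
    list_tuple_hashable = True
except TypeError:
    list_tuple_hashable = False
```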

Dictionaries — key-value maps

balances = {"alice": 1.5, "bob": 0.25}
balances.get("carol", 0)  # default instead of KeyError
for wallet, sol in balances.items():
    print(wallet, sol)

Sets — unique unordered collections

Sets excel at membership tests and deduplication. Intersection and union operations implement common “in both lists” logic in one line.
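
For example, with two hypothetical lists of wallet names:

```python
holders_a = ["alice", "bob", "carol", "bob"]
holders_b = ["carol", "dave", "alice"]

set_a = set(holders_a)             # duplicates collapse: bob appears once
set_b = set(holders_b)

in_both = set_a & set_b            # intersection
in_either = set_a | set_b          # union
only_a = set_a - set_b             # difference
has_dave = "dave" in set_b         # average O(1) membership test

# Sets do not keep order; dict.fromkeys deduplicates while preserving it
deduped_in_order = list(dict.fromkeys(holders_a))
```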

List comprehensions and dict comprehensions replace many map/filter patterns with readable one-liners: [x * 2 for x in range(10) if x % 2 == 0]. Prefer comprehensions for simple transforms; fall back to explicit loops when logic branches heavily.
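
A brief sketch of the list and dict forms side by side, plus the explicit loop that a comprehension replaces. The balances data is made up for illustration.

```python
balances = {"alice": 1.5, "bob": 0.25, "carol": 0.0}

# List comprehension: transform plus filter in one expression
doubled_evens = [x * 2 for x in range(10) if x % 2 == 0]

# Dict comprehension: keep only funded wallets
funded = {name: sol for name, sol in balances.items() if sol > 0}

# The equivalent explicit loop, better once the logic grows branches
funded_loop = {}
for name, sol in balances.items():
    if sol > 0:
        funded_loop[name] = sol
```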

Functions, classes, and modules

Functions and scope

def greet(name: str, *, polite: bool = True) -> str:
    prefix = "Hello" if polite else "Hey"
    return f"{prefix}, {name}"

# *args collects positional extras; **kwargs collects keyword extras
def log_event(level: str, *args, **kwargs):
    print(level, args, kwargs)

Watch the mutable default argument trap: def f(items=[]) shares one list across calls. Use None as the default and create a fresh list inside the function body.
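
The trap and its fix, as a small sketch. The function names add_bad and add_good are placeholders.

```python
def add_bad(item, items=[]):
    # The default list is created once, when the def statement runs
    items.append(item)
    return items


def add_good(item, items=None):
    # None is a safe sentinel; a fresh list is built on every call
    if items is None:
        items = []
    items.append(item)
    return items


first_bad = add_bad("a")
second_bad = add_bad("b")      # same list object as first_bad

first_good = add_good("a")
second_good = add_good("b")    # independent list
```

After these calls, first_bad and second_bad are the same object and contain both items, while each add_good call returned its own one-element list.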

Classes (without over-engineering)

Python supports classical inheritance, but many scripts need only dataclasses or simple namespaces:

from dataclasses import dataclass

@dataclass
class Transfer:
    sender: str
    recipient: str
    lamports: int

Reserve full class hierarchies for frameworks (Django models, SQLAlchemy ORM) or domain objects with real behavior. Flat is better than nested when a dict plus functions would suffice.

Modules and the import system

One file is one module; a folder with __init__.py is a package. Imports resolve through sys.path. Use absolute imports in application code (from app.services import billing) and avoid circular imports by pushing shared types into a thin models module.
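
The sketch below builds a throwaway package in a temporary directory to show the folder, __init__.py, and sys.path pieces working together. The names demo_pkg, billing, and fee are hypothetical; a real project would keep its packages on disk rather than generate them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_pkg"            # hypothetical package name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")    # marks the folder as a package
    (pkg_dir / "billing.py").write_text(
        "def fee(lamports):\n    return lamports // 100\n"
    )

    sys.path.insert(0, tmp)          # imports search sys.path entries in order
    importlib.invalidate_caches()    # make sure the new files are noticed
    try:
        from demo_pkg import billing     # absolute import: package.module
        computed_fee = billing.fee(5000)
        module_name = billing.__name__
    finally:
        sys.path.remove(tmp)             # leave sys.path as we found it
```

In application code you would not edit sys.path by hand; installing the project into the virtual environment puts it on the path for you.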

Virtual environments, pip, and dependency hygiene

Never install packages globally on a shared server. A virtual environment (python -m venv .venv) creates an isolated site-packages directory and its own python binary. Activate it with source .venv/bin/activate on Linux/macOS or .venv\Scripts\activate on Windows.
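
If you are unsure whether a script is running inside a virtual environment, the interpreter can tell you. A minimal sketch, assuming an environment created by the venv module; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtualenv())
```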

pip installs from PyPI. Pin versions in requirements.txt or use poetry / uv for lockfiles similar to npm’s package-lock.json. Reproducible builds demand pinned transitive dependencies — “latest” in production is how security patches become surprise breakages.

  • pip install -r requirements.txt — install exact versions in CI
  • pip freeze > requirements.txt — snapshot after deliberate upgrades
  • Separate dev dependencies (pytest, black) from runtime deps
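
To make "pinned" concrete, the sketch below lists the packages installed in the current environment in name==version form using the standard-library importlib.metadata module. It only approximates what pip freeze prints and is meant for inspection, not as a replacement for a lockfile tool; freeze_like is a hypothetical helper name.

```python
from importlib import metadata


def freeze_like() -> list:
    """Return installed distributions as 'name==version' strings."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:                       # skip entries with broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


for pin in freeze_like():
    print(pin)
```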

For applications talking to Postgres or SQLite, pair Python with the patterns in our SQL fundamentals guide — parameterized queries, connection pooling, and transaction boundaries matter regardless of language.

Concurrency: threads, processes, and asyncio

In standard CPython builds, the Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time per process. Threads still help for I/O-bound work (waiting on network or disk) because the GIL is released during blocking system calls. For CPU-bound parallelism, use the multiprocessing module or native libraries that release the GIL internally.
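
A rough sketch of threads helping with I/O-bound work. Here time.sleep stands in for a blocking network or disk call (an assumption made so the example needs no network), and fake_io is a hypothetical function.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id: int) -> int:
    # Simulated blocking call; the GIL is released while the thread sleeps
    time.sleep(0.2)
    return task_id


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map preserves input order even though the tasks overlap in time
    results = list(pool.map(fake_io, range(4)))
elapsed = time.perf_counter() - start

print(results, f"{elapsed:.2f}s")
```

Run one after another, the four calls would take about 0.8 seconds; with four threads they overlap and finish in roughly the time of one. For CPU-bound functions the same pattern with ProcessPoolExecutor sidesteps the GIL, at the cost of pickling arguments between processes.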

asyncio provides cooperative multitasking with async def and await — ideal for thousands of concurrent HTTP clients or WebSocket connections on one process, similar in spirit to Node’s event loop:

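import asyncio

import httpx  # third-party HTTP client: pip install httpx

async def fetch(url: str) -> int:
    # Each call opens and closes its own client; share one client when making many requests
    async with httpx.AsyncClient() as client:
        r = await client.get(url)
        return r.status_code

async def main():
    # gather runs both requests concurrently and returns results in argument order
    codes = await asyncio.gather(
        fetch("https://solana.garden/"),
        fetch("https://solana.garden/guides/"),
    )
    print(codes)

asyncio.run(main())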

Do not make blocking calls inside async functions unless you offload them with asyncio.to_thread (Python 3.9+): a synchronous requests.get inside async def stalls the entire event loop.
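
A minimal sketch of offloading a blocking call, assuming Python 3.9 or newer. blocking_lookup uses time.sleep as a stand-in for a synchronous library call, and heartbeat is a hypothetical coroutine that shows the loop staying responsive in the meantime.

```python
import asyncio
import time


def blocking_lookup(name: str) -> str:
    # Stand-in for a synchronous call such as a blocking HTTP request
    time.sleep(0.2)
    return name.upper()


async def heartbeat(ticks: list) -> None:
    # Keeps running only if the event loop is not stalled
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.perf_counter())


async def main() -> tuple:
    ticks = []
    # to_thread runs the blocking function in a worker thread
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_lookup, "alice"),
        heartbeat(ticks),
    )
    return result, len(ticks)


result, tick_count = asyncio.run(main())
print(result, tick_count)
```

Calling blocking_lookup directly inside main would freeze the heartbeat for the full 0.2 seconds; with to_thread the ticks are recorded while the lookup is still in progress.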

Standard library highlights

Before reaching for PyPI, check the stdlib; it is larger than what most languages ship with:

  • pathlib — object-oriented file paths instead of string concatenation
  • json, csv, sqlite3 — data interchange and embedded DB
  • datetime — prefer timezone-aware datetime objects; see our datetime and timezones guide for UTC storage rules
  • subprocess — spawn shell commands safely with argument lists, not shell=True unless necessary
  • logging — structured logs beat print in every service
  • unittest (stdlib) / pytest (third-party but ubiquitous) — see software testing fundamentals
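
Several of the modules above combine naturally. The sketch below writes and reads a JSON file with pathlib, stores a timezone-aware UTC timestamp, and runs a child process with an argument list. The file name and payload fields are made up for illustration.

```python
import json
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # pathlib: build paths with / instead of string concatenation
    report = Path(tmp) / "reports" / "daily.json"
    report.parent.mkdir(parents=True)

    # datetime: timezone-aware UTC timestamp, serialized as ISO 8601
    payload = {"created_at": datetime.now(timezone.utc).isoformat(), "rows": 3}
    report.write_text(json.dumps(payload), encoding="utf-8")
    loaded = json.loads(report.read_text(encoding="utf-8"))

created = datetime.fromisoformat(loaded["created_at"])

# subprocess: pass arguments as a list so no shell parsing is involved
completed = subprocess.run(
    [sys.executable, "-c", "print('ok')"],
    capture_output=True,
    text=True,
    check=True,      # raise CalledProcessError on a non-zero exit code
)
child_output = completed.stdout.strip()
```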

Building APIs and services

FastAPI (async-native, automatic OpenAPI docs) and Flask (minimal, synchronous by default) are the common choices for HTTP APIs. Design routes with the same discipline as any backend: nouns not verbs in paths, correct HTTP status codes, pagination on list endpoints, and idempotency keys on payment-like operations. Our REST API design guide applies directly; Python is just the implementation language.
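
Pagination logic is independent of the framework, so it can live in a plain function that a FastAPI or Flask handler calls. This is one possible sketch: the paginate name and the response field names (items, page, per_page, total, has_next) are assumptions, not a standard.

```python
def paginate(items: list, page: int = 1, per_page: int = 20) -> dict:
    """Return one page of items plus the metadata a client needs to continue."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")

    start = (page - 1) * per_page
    chunk = items[start:start + per_page]   # slicing past the end yields []
    total = len(items)
    total_pages = (total + per_page - 1) // per_page   # ceiling division

    return {
        "items": chunk,
        "page": page,
        "per_page": per_page,
        "total": total,
        "has_next": page < total_pages,
    }


first_page = paginate(list(range(45)), page=1, per_page=20)
last_page = paginate(list(range(45)), page=3, per_page=20)
```

For large tables you would push the offset and limit into the database query rather than slice a list in memory; the shape of the response stays the same.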

Deploy behind Gunicorn or Uvicorn workers, put nginx or a cloud load balancer in front for TLS termination, and run health checks on /health. Containerize with Docker once local venv workflows stabilize.

Common pitfalls and how to avoid them

  • Mutable defaults — use None and instantiate inside the function.
  • Shadowing builtins — never name a variable list, dict, or id.
  • Implicit string concatenation bugs — missing commas in tuples/lists create accidental string joins.
  • Swallowing exceptions — bare except: hides failures; catch specific types and log context.
  • Notebook → production without refactor — notebooks are for exploration; extract functions, add tests, pin deps before shipping.
  • Ignoring the GIL on CPU work — profile; vectorize with NumPy or move hot loops to Rust/Cython.
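
Two of these pitfalls are easy to show in a few lines. The list contents and the parse_amount helper are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)

# Missing comma: adjacent string literals are joined into one element
networks = ["mainnet", "testnet" "devnet"]
fixed_networks = ["mainnet", "testnet", "devnet"]


def parse_amount(raw):
    """Parse a numeric string, catching only the failure we expect."""
    try:
        return float(raw)
    except ValueError:
        # Log context instead of silently swallowing the error
        logger.warning("could not parse amount %r", raw)
        return None


good = parse_amount("1.5")
bad = parse_amount("abc")
```

The networks list silently ends up with two elements instead of three. In parse_amount, a TypeError from passing None still propagates, which is the point of catching a specific exception type: unexpected failures stay visible.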

Production checklist

  • Pin Python minor version in CI and production (e.g. 3.12.x).
  • Use venv or containers; never sudo pip install on servers.
  • Enable ruff or flake8 plus black for consistent style.
  • Run pytest in CI; aim for tests on business logic, not 100% line coverage theater.
  • Configure logging with JSON formatters for aggregation; no secrets in log messages.
  • Set request timeouts on all outbound HTTP; retry with exponential backoff on idempotent reads.
  • Document entrypoints (__main__.py or CLI via typer/click) so operators know how to start the service.
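
The retry item in the checklist can be sketched with the standard library alone. This is one possible shape, not a definitive implementation: retry, its parameters, and flaky_read are hypothetical names, and the wrapped operation is assumed to be idempotent and to set its own timeout (for example urllib.request.urlopen(url, timeout=5)).

```python
import random
import time


def retry(operation, attempts=4, base_delay=0.5, max_delay=8.0,
          retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Call operation(), retrying listed exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many clients failing together
            sleep(delay + random.uniform(0, delay / 2))


# Usage with a simulated operation that fails twice, then succeeds
calls = {"count": 0}


def flaky_read():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []
outcome = retry(flaky_read, sleep=delays.append)   # record delays, do not wait
```

Injecting the sleep function keeps the example fast and makes the backoff schedule easy to test. Libraries such as tenacity offer the same behavior with more options if you prefer not to maintain your own helper.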
