The everyday language for data work, backend APIs, automation, and AI — built on a small set of ideas worth understanding deeply.
The most common source of subtle bugs in Python code
Most Python bugs in production come from unexpected mutation — a function silently modifies a list that was passed in, a mutable default argument accumulates state across calls, or two variables thought to be independent are actually pointing at the same object. Understanding how Python handles values and references prevents an entire category of bugs.
Interviewers use mutability questions as a proxy for whether you've truly internalised the language rather than just used it — they're quick to ask and reveal a lot.
Python variables are labels, not boxes. When you write x = [1, 2, 3], you're sticking a label x on a list object. If you then write y = x, you've put a second label on the same object. Changing the object through either label affects both.
Immutable objects (int, str, tuple) can't change in place — Python creates a new object instead. Mutable objects (list, dict, set) change in place, so all labels pointing at them see the change.
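A minimal sketch of the label model described above. The variable names are illustrative only.

```python
# Two labels on one list: a change made through either name is visible through both.
x = [1, 2, 3]
y = x              # second label, same object
y.append(4)
print(x)           # [1, 2, 3, 4]
print(x is y)      # True

# Immutable objects: "changing" one rebinds the label to a new object.
a = "hi"
b = a
b += "!"           # builds a new str; a still points at the old one
print(a, b)        # hi hi!
```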
Default arguments are evaluated once when the function is defined. A mutable default persists across every call. Use None as a sentinel and create the mutable inside the function body.
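The trap and the None-sentinel fix can be sketched side by side. The function names `append_buggy` and `append_fixed` are hypothetical, chosen only for this illustration.

```python
def append_buggy(item, bucket=[]):      # the default list is created once, at def time
    bucket.append(item)
    return bucket

def append_fixed(item, bucket=None):    # None sentinel: a fresh list on each call
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_buggy(1)
second = append_buggy(2)
print(second)                            # [1, 2]
print(first is second)                   # True: both calls returned the same default list
print(append_fixed(1), append_fixed(2))  # [1] [2]
```

The shared list lives on the function object itself, which is why `append_buggy.__defaults__` shows the accumulated state.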
is checks identity (same object in memory). == checks value equality. For nested structures, a shallow copy only copies the outer container — inner objects are still shared references.
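A short sketch of identity versus equality, and of what a shallow copy does and does not duplicate. The `grid` data is made up for the example.

```python
import copy

a = [1, 2]
b = [1, 2]
print(a == b, a is b)        # True False: equal values, different objects

grid = [[0, 0], [0, 0]]
shallow = copy.copy(grid)    # new outer list, same inner lists
deep = copy.deepcopy(grid)   # inner lists are copied too

grid[0][0] = 9
print(shallow[0][0])         # 9: the inner list is shared
print(deep[0][0])            # 0: fully independent
```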
Use lst=None and initialise the list inside the function. This reveals whether a candidate understands that default values are part of the function object, not re-evaluated per call.
How Python resolves names and how functions capture their environment
Functions are first-class citizens in Python — they can be passed as arguments, returned from other functions, and stored in variables. This makes closures and higher-order functions natural, but also introduces scope bugs that are hard to debug if you don't know the rules.
The late-binding closure trap is a classic interview question precisely because it looks wrong to anyone who doesn't understand how Python looks up free variables at call time rather than definition time.
When Python sees a variable name, it searches four scopes in order: Local → Enclosing → Global → Built-in (LEGB). It finds the first match and stops. This lookup happens at runtime, not when the function is defined.
A closure remembers the enclosing scope it was created in — but it remembers the variable (the label), not the value at creation time. If the enclosing variable later changes, the closure sees the new value.
*args collects extra positional arguments as a tuple; **kwargs collects keyword arguments as a dict. Use *iterable and **mapping at the call site to unpack them back out.
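Collecting and unpacking can be sketched as follows. `describe` and `area` are hypothetical helpers used only to show the syntax.

```python
def describe(*args, **kwargs):
    # args arrives as a tuple, kwargs as a dict
    return args, kwargs

positional, keyword = describe(1, 2, mode="fast")
print(positional)   # (1, 2)
print(keyword)      # {'mode': 'fast'}

def area(width, height, scale=1):
    return width * height * scale

dims = [3, 4]
options = {"scale": 2}
print(area(*dims, **options))   # 24: unpacked at the call site
```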
Closures capture the variable, not its value. The default-argument trick captures the current value at definition time because default arguments are evaluated immediately.
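A minimal sketch of late binding and the default-argument workaround described above. `make_reporter` is a hypothetical name for the illustration.

```python
def make_reporter():
    level = "info"
    def report():
        return level       # looked up when report() runs, not when it is defined
    level = "debug"        # rebinding after report was created
    return report

print(make_reporter()())   # debug

# Every lambda closes over the same variable i, which ends at 2.
late = [lambda: i for i in range(3)]
print([f() for f in late])     # [2, 2, 2]

# A default argument is evaluated at creation time, freezing the current value.
bound = [lambda i=i: i for i in range(3)]
print([f() for f in bound])    # [0, 1, 2]
```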
lambda i=i: i captures the current value at lambda creation time via the default argument mechanism.
Functions that wrap functions: the pattern behind routes, caching, and retries
Decorators appear everywhere in real Python: Flask/FastAPI routes (@app.get), pytest fixtures (@pytest.fixture), caching (@lru_cache), retry logic, access control, and timing. You need to know how to write one, not just recognise the @ symbol.
Understanding decorators also unlocks a mental model: functions are objects, and returning a function from a function is completely normal. This is the same thinking needed for closures and higher-order design patterns.
A decorator is a function that takes a function and returns a (usually enhanced) function. The @decorator syntax is pure sugar — @log_calls above def process(...) is identical to writing process = log_calls(process) immediately after the function definition.
The inner wrapper function is what gets called instead of the original. It can run code before and after the original, modify arguments or return values, or skip the original entirely.
Always use @functools.wraps(func) on the wrapper. Without it, the decorated function loses its __name__, __doc__, and signature — breaking introspection and debugging tools.
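One way the log_calls decorator mentioned above might look. The body is an assumption, since the text names the decorator but does not show it.

```python
import functools

def log_calls(func):
    @functools.wraps(func)             # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__} args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        print(f"{func.__name__} returned {result!r}")
        return result
    return wrapper

@log_calls                              # same as: process = log_calls(process)
def process(x, y=1):
    """Add two numbers."""
    return x + y

print(process(2, y=3))     # 5, after two log lines
print(process.__name__)    # process (it would be "wrapper" without functools.wraps)
```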
A decorator with arguments needs one extra nesting level. Three layers: outer function accepts config → returns the decorator → decorator wraps the function.
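The three layers can be sketched with a small retry decorator. `retry`, `flaky`, and the `attempts` counter are hypothetical names for this example, not part of any library.

```python
import functools

def retry(times):                       # layer 1: accepts configuration
    def decorator(func):                # layer 2: the actual decorator
        @functools.wraps(func)
        def wrapper(*args, **kwargs):   # layer 3: runs in place of func
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == times:
                        raise           # out of attempts: let the error propagate
        return wrapper
    return decorator

attempts = {"count": 0}

@retry(times=3)
def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("not yet")
    return "ok"

print(flaky())              # ok
print(attempts["count"])    # 3
```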
@cache memoizes pure functions — identical arguments return the cached result. Requires hashable arguments. When stacking, the decorator closest to the function applies first.
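A short sketch of memoization and stacking order, assuming Python 3.9+ for functools.cache. `add_one` and `double` are hypothetical decorators chosen so that the order of application changes the result.

```python
import functools

@functools.cache
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))                      # 832040
print(fib.cache_info().hits > 0)    # True: repeated subproblems came from the cache

def add_one(func):
    @functools.wraps(func)
    def wrapper(x):
        return func(x) + 1
    return wrapper

def double(func):
    @functools.wraps(func)
    def wrapper(x):
        return func(x) * 2
    return wrapper

@double      # applied second, so it wraps the result of add_one
@add_one     # closest to the function, applied first
def value(x):
    return x

print(value(3))   # 8, i.e. (3 + 1) * 2
```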
Idiomatic iteration — from concise syntax to memory-efficient pipelines
Comprehensions are the most visible marker of Python fluency — any code review in a Python shop will flag a for loop building a list when a comprehension would do. Generators go further: they're the idiomatic way to process large datasets without loading everything into memory, which matters enormously in production data pipelines.
Understanding the memory difference between a list comprehension and a generator expression is a proxy for whether you think about memory at all — a signal interviewers actively look for in data-heavy roles.
A list comprehension builds the entire result upfront and holds it in memory. A generator produces one item at a time, pausing between — it uses O(1) memory regardless of the number of items. The syntax is nearly identical: square brackets vs parentheses.
Think of a generator as a recipe, not a meal. The recipe doesn't cook all the food at once — it gives instructions you follow one step at a time. Only when you ask for the next item does it compute it.
yield turns a function into a generator. Each next() call runs up to the next yield, pauses, and returns the yielded value. Local state is preserved between yields.
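The memory difference and the pause-and-resume behaviour of yield can be sketched like this. `countdown` is a hypothetical generator function used for illustration.

```python
import sys

data = range(1_000_000)
as_list = [x * 2 for x in data]     # every item is held in memory
as_gen = (x * 2 for x in data)      # nothing is computed yet
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))   # True
print(sum(x * 2 for x in data))     # 999999000000, no intermediate list

def countdown(n):
    while n > 0:
        yield n        # pause here; n is preserved until the next call
        n -= 1

gen = countdown(3)
print(next(gen))       # 3
print(list(gen))       # [2, 1]: resumes where it left off
```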
What is the difference between [x*2 for x in data] and (x*2 for x in data)? When would you choose each?
sum(x*2 for x in data) computes the sum without materialising the list. For large files or data pipelines, generators are usually preferable.
The right tool for common patterns: knowing these separates fluent Python from homework Python
Knowing Counter vs building a frequency dict by hand, defaultdict vs a guarded dict.get, deque vs a list used as a queue — these choices show whether you know the language's stdlib. They also matter for performance: list.insert(0, x) is O(n); deque.appendleft(x) is O(1).
The built-in functions (enumerate, zip, any, all) remove boilerplate loops and make intent immediate. Interviewers notice when candidates reach for these vs writing manual counters and index tracking.
Python's collections module gives you containers optimised for specific access patterns: Counter is a dict that counts, defaultdict is a dict that fills in a default instead of raising KeyError on a missing key, and deque is a list-like sequence optimised for appends and pops at both ends.
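A brief tour of these containers and the built-ins mentioned above. All sample data is invented for the example.

```python
from collections import Counter, defaultdict, deque

words = "the cat and the hat and the bat".split()
counts = Counter(words)
print(counts.most_common(2))        # [('the', 3), ('and', 2)]

fruits = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = defaultdict(list)       # a missing key starts as an empty list
for fruit in fruits:
    by_letter[fruit[0]].append(fruit)
print(by_letter["a"])               # ['apple', 'avocado']

queue = deque([1, 2, 3])
queue.appendleft(0)                 # O(1), unlike list.insert(0, x)
queue.append(4)
print(queue.popleft())              # 0
print(list(queue))                  # [1, 2, 3, 4]

names = ["ada", "bob"]
scores = [90, 72]
ranked = [(rank, name, score)
          for rank, (name, score) in enumerate(zip(names, scores), start=1)]
print(ranked)                       # [(1, 'ada', 90), (2, 'bob', 72)]
print(all(s >= 70 for s in scores), any(s >= 95 for s in scores))   # True False
```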
Counter(words).most_common(3) does it in one line. Counter builds the frequency map directly from the iterable, and most_common(n) returns the n highest-count pairs using a heap, so it's O(n log k) not O(n log n). The manual approach (build a dict, sort by value, slice) works but is three times more code for no benefit. Knowing Counter is the expected answer for any frequency/ranking question in Python.
Python's object model: how syntax maps to method calls
Most Python syntax is a method call in disguise. len(x) calls x.__len__(). a + b calls a.__add__(b). for item in x calls x.__iter__(). Understanding this unlocks the ability to write Pythonic APIs: classes that slot naturally into built-in syntax like len, in, with, and iteration.
OOP design questions appear in system design rounds. The composition-vs-inheritance distinction matters when designing extensible pipelines — a common topic in ML engineering interviews.
Python's object model is protocol-based. To make your class iterable, implement __iter__; to make it an iterator as well, also implement __next__. To make it comparable for equality, implement __eq__. Python checks for these methods and uses them; it doesn't care about the class hierarchy. This is structural (duck) typing, not nominal typing.
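A minimal sketch of a class that plugs into built-in syntax purely by defining the right methods. `Playlist` is a hypothetical class for illustration.

```python
class Playlist:
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):                 # enables len(playlist)
        return len(self._tracks)

    def __iter__(self):                # enables for-loops and comprehensions
        return iter(self._tracks)

    def __contains__(self, track):     # enables the "in" operator
        return track in self._tracks

mix = Playlist(["intro", "verse", "outro"])
print(len(mix))                     # 3
print("verse" in mix)               # True
print([t.upper() for t in mix])     # ['INTRO', 'VERSE', 'OUTRO']
```

Note that Playlist inherits from nothing special; the built-ins only look for the methods.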
Prefer composition over inheritance: give a class a reference to another object rather than inheriting from it. Inheritance creates tight coupling ("is-a"); composition creates flexibility ("has-a") and makes testing easier.
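A small sketch of the "has-a" approach. `Reporter`, `JsonFormatter`, and `UpperFormatter` are hypothetical classes invented for this example.

```python
import json

class JsonFormatter:
    def format(self, record):
        return json.dumps(record, sort_keys=True)

class UpperFormatter:
    def format(self, record):
        return str(record).upper()

class Reporter:
    """Holds a formatter ("has-a") instead of inheriting from one ("is-a")."""
    def __init__(self, formatter):
        self.formatter = formatter

    def report(self, record):
        return self.formatter.format(record)

print(Reporter(JsonFormatter()).report({"b": 1, "a": 2}))   # {"a": 2, "b": 1}
print(Reporter(UpperFormatter()).report("done"))            # DONE
```

Swapping in a stub formatter for tests needs no subclassing, which is the testing benefit mentioned above.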
If you define __eq__ without __hash__, Python sets __hash__ = None and instances become unhashable. Define both together whenever instances need to be hashable (used in sets or as dict keys).
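The effect can be seen directly. `PointNoHash` and `Point` are hypothetical classes for this sketch.

```python
class PointNoHash:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, PointNoHash):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

print(PointNoHash.__hash__)     # None: defining __eq__ alone removed hashing

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same fields that decide equality.
        return hash((self.x, self.y))

visited = {Point(1, 2): "start"}
print(visited[Point(1, 2)])     # start: equal points hash alike
```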
You define __eq__ on a class. A colleague tries to use instances as dict keys and gets a TypeError. Why, and how do you fix it?
When a class defines __eq__ without defining __hash__, Python automatically sets __hash__ = None, making instances unhashable. This is intentional: if two objects are equal, they must have the same hash. Without a custom __hash__, that invariant can't be guaranteed. Fix: define __hash__ using the same fields that determine equality, e.g. return hash((self.x, self.y)). If the object is mutable, it generally shouldn't be hashable at all, because mutating it would break dictionary lookup.
Trading flexibility for memory: when instances number in the millions
By default, every Python instance carries a __dict__: a hash map just for that object's attributes. For small programs this is fine. For classes instantiated millions of times (graph nodes, feature vectors, event records in a stream processor), the per-instance dict overhead is significant: typically quoted as 200–400 bytes per object (the exact figure varies with the CPython version and the number of attributes), spent on flexibility you may never use.
__slots__ is one of the few Python optimisations worth knowing by name. It comes up in interviews about memory-efficient data structures and high-throughput systems.
__slots__ tells Python: "this class will only ever have these exact attributes — no dynamic addition allowed." Python replaces the per-instance __dict__ with a fixed array of slots — like a C struct. No hash map, no rehashing, no wasted capacity.
The tradeoff: you gain memory and access speed, but lose the ability to add arbitrary attributes at runtime. It's a deliberate choice to lock down the class contract in exchange for efficiency.
The savings compound at scale: at roughly 200 bytes saved per instance, 1 million instances of a slotted class (WithSlots) use about 200 MB less than the same class with a per-instance __dict__ (WithDict). Attribute access is also slightly faster since Python uses fixed offsets rather than a hash lookup.
For __slots__ to remove __dict__ through an inheritance chain, every class in the hierarchy must define it. A single class without __slots__ re-introduces a __dict__ on instances of that class and of everything that inherits from it.
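The text refers to WithDict and WithSlots without showing them, so the definitions below are an assumption about what such classes would look like. The sketch shows the behavioural difference rather than byte counts, which depend on the interpreter version.

```python
class WithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class WithSlots:
    __slots__ = ("x", "y")          # fixed attribute set, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

d = WithDict(1, 2)
s = WithSlots(1, 2)
print(hasattr(d, "__dict__"), hasattr(s, "__dict__"))   # True False

d.z = 3                             # fine: stored in d.__dict__
try:
    s.z = 3                         # no slot named z
except AttributeError as exc:
    print("rejected:", exc)

class Child(WithSlots):             # forgot __slots__, so __dict__ comes back
    pass

print(hasattr(Child(1, 2), "__dict__"))                 # True
```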
Instances have no __dict__ when the class defines __slots__, which breaks code that uses instance.__dict__ directly.
Weak references require "__weakref__" in the slots tuple explicitly.
Dataclasses can generate __slots__ via @dataclass(slots=True) (Python 3.10+), the cleanest modern approach.
What is __slots__, and what do you give up by using it?
Mention @dataclass(slots=True) in Python 3.10+ as the cleanest way to get this benefit alongside auto-generated __init__, __repr__, and __eq__.
The GIL, threading, asyncio, and multiprocessing: picking the right tool
The GIL explains why multithreading doesn't speed up CPU-bound Python work — a fact that surprises many developers. Picking the wrong concurrency model can make performance worse, not better, and this comes up in system design rounds when discussing parallel data fetching, serving ML models under load, or writing efficient ETL pipelines.
The GIL (Global Interpreter Lock) is a mutex inside CPython that allows only one thread to execute Python bytecode at a time. For IO-bound work (waiting on network/disk), threads release the GIL during the wait — multiple threads genuinely help. For CPU-bound work, threads fight over the GIL and don't help at all.
ThreadPoolExecutor and ProcessPoolExecutor (both in concurrent.futures) expose the same map / submit API, so switching from threads to processes is often just changing one class name.
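A minimal sketch of the shared executor API. `fetch` and `run_all` are hypothetical helpers, and time.sleep stands in for real network or disk waits. Only the thread pool is run here; the process pool line is left as a comment because it needs a picklable function and, on some platforms, an `if __name__ == "__main__":` guard.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def fetch(delay):
    time.sleep(delay)       # simulated IO wait; the GIL is released while sleeping
    return delay

def run_all(executor_cls, func, items, workers=4):
    # The same code works for either executor class.
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(func, items))

start = time.perf_counter()
results = run_all(ThreadPoolExecutor, fetch, [0.1] * 4)
elapsed = time.perf_counter() - start
print(results)              # [0.1, 0.1, 0.1, 0.1]
print(f"{elapsed:.2f}s")    # expected to be near one wait, not the sum of four

# For CPU-bound work, only the class name changes:
# run_all(ProcessPoolExecutor, cpu_heavy_function, items)
```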
Coroutines, the event loop, tasks, and writing correct async code
Async Python is now mainstream — FastAPI, SQLAlchemy async, Redis async clients, and most modern ML serving frameworks are async by default. Understanding how the event loop, coroutines, and tasks actually work is essential for writing correct async code. Getting it wrong produces bugs that are extremely hard to debug: blocking the event loop freezes the entire server, not just one request.
Interviews for backend or ML engineering roles increasingly expect you to know the difference between a coroutine and a task, what await actually does, and how to run CPU-bound work without blocking the loop.
A coroutine is a function that can pause and resume. await pauses the current coroutine and gives control back to the event loop, which runs other ready coroutines. When the awaited operation completes, the event loop resumes the original coroutine. One thread, many coroutines, no OS context switching.
Think of the event loop as a restaurant manager: when a waiter (coroutine) is waiting for the kitchen (IO), the manager assigns them to another table (another coroutine). No waiter just stands idle — they all cooperate to keep things moving.
An async function called without await returns a coroutine object; it does not run. This is a common mistake: forgetting await means the operation never executes, and no exception is raised (CPython only emits a RuntimeWarning that the coroutine was never awaited).
Any synchronous blocking call inside an async function (file reads, time.sleep, CPU-bound computation, sync database drivers) freezes the entire event loop for its duration. Every other request stalls. Use asyncio.to_thread or run_in_executor to move it off the loop.
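The points above can be sketched as follows, assuming Python 3.9+ for asyncio.to_thread. `blocking_io` and `ticker` are hypothetical functions for the illustration.

```python
import asyncio
import time

def blocking_io(seconds):
    time.sleep(seconds)             # would freeze the loop if called directly
    return "blocking done"

async def ticker(log):
    for i in range(3):
        log.append(f"tick {i}")
        await asyncio.sleep(0.01)   # yields control back to the event loop

async def main():
    log = []
    tick_task = asyncio.create_task(ticker(log))        # scheduled to run concurrently
    result = await asyncio.to_thread(blocking_io, 0.1)  # runs in a worker thread
    await tick_task
    return result, log

result, log = asyncio.run(main())
print(result)   # blocking done
print(log)      # ['tick 0', 'tick 1', 'tick 2']

# Calling an async function without await only creates a coroutine object.
coro = ticker([])
print(asyncio.iscoroutine(coro))    # True: nothing has run yet
coro.close()                        # tidy up to avoid the "never awaited" warning
```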
What happens when you call a blocking function (e.g. time.sleep(5)) inside an async function? How do you fix it?
Use await asyncio.sleep(5) for sleeping, and await asyncio.to_thread(blocking_fn, arg) for any blocking function that can't be replaced. The to_thread approach runs the blocking call in a thread pool while the event loop continues serving other coroutines. For CPU-bound work, use ProcessPoolExecutor via run_in_executor instead, since threads still share the GIL.
Guaranteed setup and teardown: the Pythonic way to manage any resource
Context managers are how Python guarantees resource cleanup even when exceptions occur. Open files, database connections, locks, GPU memory, network sessions, temp directories — all should be managed with with. Without it, a single exception leaves resources leaked.
Writing your own context managers unlocks patterns that appear everywhere in production code: database transaction rollback, profiling blocks, suppressing specific exceptions, and async resource management. It's also a go-to interview question because the protocol is small but the design implications are large.
The with statement is a guaranteed try/finally with a cleaner name and a reusable protocol. It calls __enter__ on entry (setup, returns the resource) and __exit__ on exit (cleanup) — regardless of whether the block raised an exception.
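A minimal class-based sketch of the protocol. `Resource` is a hypothetical class that records events in place of real setup and cleanup.

```python
class Resource:
    def __init__(self, name):
        self.name = name
        self.events = []

    def __enter__(self):
        self.events.append("open")       # setup
        return self                      # bound to the name after "as"

    def __exit__(self, exc_type, exc, tb):
        self.events.append("close")      # cleanup, runs even on error
        return False                     # False: do not suppress the exception

res = Resource("demo")
try:
    with res as r:
        r.events.append("work")
        raise RuntimeError("boom")
except RuntimeError:
    pass

print(res.events)    # ['open', 'work', 'close']
```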
Think of it as a contract: "I promise to clean up after myself, even if something goes wrong inside." The context manager holds that promise so every call site doesn't have to remember to.
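With @contextlib.contextmanager, a generator function becomes a context manager: code before yield acts as __enter__, and code in finally after yield acts as __exit__. This is much less boilerplate than writing a full class. The try/finally ensures cleanup even if the block raises.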
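The generator-based form can be sketched with contextlib.contextmanager. `tracked` is a hypothetical helper that logs entry and exit.

```python
from contextlib import contextmanager

@contextmanager
def tracked(log, name):
    log.append(f"enter {name}")      # plays the role of __enter__
    try:
        yield name                   # value bound by "as"
    finally:
        log.append(f"exit {name}")   # plays the role of __exit__, runs on error too

log = []
try:
    with tracked(log, "job") as label:
        log.append(f"running {label}")
        raise ValueError("fail inside block")
except ValueError:
    pass

print(log)    # ['enter job', 'running job', 'exit job']
```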
Async context managers use async with and implement __aenter__ / __aexit__ (both async). The @asynccontextmanager decorator works the same as its sync equivalent but with async def and await inside.
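An async equivalent, as a sketch. `connection` is a hypothetical resource, and asyncio.sleep(0) stands in for real awaitable setup and teardown.

```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def connection(log):
    log.append("connect")
    await asyncio.sleep(0)           # stand-in for awaiting a real connect call
    try:
        yield "conn"
    finally:
        await asyncio.sleep(0)       # stand-in for awaiting a real close call
        log.append("disconnect")

async def main():
    log = []
    async with connection(log) as conn:
        log.append(f"using {conn}")
    return log

print(asyncio.run(main()))    # ['connect', 'using conn', 'disconnect']
```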
What does returning True from __exit__ do? Write a context manager that suppresses KeyError silently.
Returning True from __exit__ tells Python to swallow the exception raised inside the block. The idiomatic way is with suppress(KeyError): ... from contextlib. The contextlib.suppress implementation does exactly this: it checks exc_type and returns True if it matches.
Annotations for documentation, tooling, and catching bugs before runtime
Modern Python codebases — especially at FAANG and AI companies — use type hints extensively. They power IDE autocompletion, enable static analysis (mypy, pyright), and make function signatures self-documenting. They catch type errors at development time that would otherwise surface as cryptic runtime bugs.
Interviewers in Python-heavy roles increasingly expect annotated code and may ask about Optional, Protocol, or generics. Writing annotated code signals production hygiene even when a codebase doesn't strictly enforce types.
Type hints are annotations: Python stores them but does not enforce them at runtime. They exist for humans and for tools. Think of them as a machine-readable docstring that tools can verify. Protocol is Python's structural typing ("any class that has these methods"), making duck typing explicit and tool-checkable.
Protocol is duck typing made tool-checkable. Any class with the required methods satisfies the protocol — no inheritance, no registration. This is how Python's own built-in protocols work: anything with __len__ is a Sized.
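A short sketch of a Protocol in use. `SupportsClose`, `TempFile`, and `shutdown` are hypothetical names; the `runtime_checkable` decorator is added only so the example can show an isinstance check, since static checkers do not need it.

```python
from collections.abc import Sized
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...

class TempFile:                      # no inheritance from SupportsClose
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

def shutdown(resource: SupportsClose) -> None:
    resource.close()

tmp = TempFile()
shutdown(tmp)                        # accepted by mypy/pyright: the shape matches
print(tmp.closed)                            # True
print(isinstance(tmp, SupportsClose))        # True

class Box:
    def __len__(self) -> int:
        return 0

print(isinstance(Box(), Sized))              # True: having __len__ is enough
```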
Failing safely — exceptions, exception chaining, and custom exception hierarchies
Production code fails — network requests time out, files are missing, APIs return unexpected shapes. The difference between hobby code and production code is how failures are handled. Catching the wrong exception silently swallows bugs; catching too broadly hides problems; not chaining exceptions loses the root cause.
Custom exception hierarchies are how you build APIs where callers can handle errors at the right level of specificity — catching a module's base exception type without needing to know all its internal subtypes.
try/except separates the happy path from the failure path. The else block runs only when no exception occurred — it's for code that should run on success but shouldn't be inside try where its exceptions might be caught by the wrong handler. The finally block always runs — use it for guaranteed cleanup.
Catch the most specific exception type you can handle. Use raise ... from e to chain exceptions so the original traceback isn't lost. The else block prevents a handler from masking errors from unrelated code.
Define a base exception per module. This lets you change internal exception types without breaking callers who catch the base. Always include enough context in the message to debug without a full stacktrace.
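These ideas can be sketched together. `ConfigError`, `ConfigParseError`, and `parse_port` are hypothetical names for a made-up configuration module.

```python
class ConfigError(Exception):
    """Base exception for this hypothetical module."""

class ConfigParseError(ConfigError):
    """Raised when a raw value cannot be parsed."""

events = []

def parse_port(raw):
    try:
        port = int(raw)                      # only the risky call lives in try
    except ValueError as e:
        # Chain so the original ValueError stays attached as __cause__.
        raise ConfigParseError(f"invalid port value {raw!r}") from e
    else:
        # Runs only on success; errors raised here are not caught above.
        if not 0 < port < 65536:
            raise ConfigError(f"port {port} out of range")
        return port
    finally:
        events.append(f"parsed {raw!r}")     # always runs

print(parse_port("8080"))    # 8080

try:
    parse_port("http")
except ConfigError as err:   # callers catch the base type
    print(type(err).__name__, "caused by", type(err.__cause__).__name__)
    # ConfigParseError caused by ValueError
```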
What is the purpose of the else clause in a try/except block? What's the difference from putting that code at the end of the try block?