Role overview
Design and operate systems that move, transform, validate, and serve data for analytics and ML.
Skills to learn
Python, SQL, DBMS, data modeling, orchestration, Spark, cloud storage, warehouses, and data quality.
Recommended learning order
Python → SQL → DBMS → Data modeling → Pipelines (orchestration, Spark) → Cloud (storage, warehouses) → Reliability (data quality).
Interview focus areas
Batch versus streaming tradeoffs, partitioning, schema changes, idempotency, backfills, and observability.
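Idempotency and backfills are easier to discuss with a concrete case in hand. The sketch below is one possible illustration, not a prescribed pattern: it replaces a whole daily partition inside a single transaction, so rerunning a backfill does not duplicate rows. The `events` table, the `fake_source` function, and the date range are all hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3
from datetime import date, timedelta


def load_partition(conn, day, rows):
    """Replace one day's partition atomically so reruns do not duplicate rows."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM events WHERE event_date = ?", (day,))
        conn.executemany(
            "INSERT INTO events (event_date, user_id, amount) VALUES (?, ?, ?)",
            [(day, user_id, amount) for user_id, amount in rows],
        )


def backfill(conn, start, end, fetch_rows):
    """Reload every daily partition from start to end, inclusive."""
    day = start
    while day <= end:
        load_partition(conn, day.isoformat(), fetch_rows(day))
        day += timedelta(days=1)


def fake_source(day):
    # Hypothetical stand-in for an upstream extract; returns (user_id, amount) pairs.
    return [(1, 10.0), (2, 5.5)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, user_id INTEGER, amount REAL)")

# Run the same backfill twice to simulate a retry after a failure.
backfill(conn, date(2024, 1, 1), date(2024, 1, 3), fake_source)
backfill(conn, date(2024, 1, 1), date(2024, 1, 3), fake_source)

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(row_count)
```

Because each run deletes and rewrites the partition it owns, the second backfill leaves the table with the same 6 rows (3 days, 2 rows each) as the first. An append-only insert would have produced 12.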
Resources
Use the curated resource library for data engineering books, videos, and repos.