ETL stands for Extract, Transform, Load—a data processing pattern in which data is extracted from source systems, transformed into a clean or structured format, and then loaded into a target system such as a database or data warehouse. ETL pipelines often run as batch processing jobs and may involve complex transformations.
Why it matters
ETL provides consistent, validated, and analytics-ready data to downstream systems. It helps organizations combine data from multiple sources and enforce business rules. ETL is a core component of traditional data engineering.
Examples
Pulling sales data from multiple APIs, cleaning it with Python scripts, and loading it into a warehouse is a typical ETL workflow. Lessons such as ETL and ELT Pipelines compare the two approaches in depth.
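The workflow above can be sketched in Python as three small functions. This is a minimal illustration, not a production pipeline: the extract step returns hypothetical hard-coded records standing in for API responses, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

def extract():
    # Extract: a real pipeline would call the sales APIs here;
    # these records are hypothetical stand-ins for API responses.
    return [
        {"order_id": "A1", "amount": "19.99", "region": " us "},
        {"order_id": "A2", "amount": "5.00", "region": "EU"},
        {"order_id": "A2", "amount": "5.00", "region": "EU"},  # duplicate row
    ]

def transform(rows):
    # Transform: cast amounts to floats, normalize region codes,
    # and drop duplicate order IDs (simple business rules).
    seen, clean = set(), []
    for row in rows:
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        clean.append({
            "order_id": row["order_id"],
            "amount": float(row["amount"]),
            "region": row["region"].strip().upper(),
        })
    return clean

def load(rows, conn):
    # Load: write the cleaned rows into a warehouse table
    # (SQLite is the stand-in target system here).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id TEXT PRIMARY KEY, amount REAL, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (:order_id, :amount, :region)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT order_id, amount, region FROM sales").fetchall())
```

Each stage stays independent, which is the point of the pattern: the extract step can be swapped for real API calls, and the load target for an actual warehouse, without touching the transformation rules.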