BEGINNER • SQL Fundamentals
Data Pipeline for real-time dashboards #16
This lesson focuses on handling schema evolution in a real-time dashboards environment. You will use: python -m venv venv, python etl_script.py, and CREATE TABLE events (id SERIAL PRIMARY KEY). The content is designed for hands-on data engineering practice.
Code Example
-- dbt model: fact_realtime_dashboards (hyphens are not valid in model names)
{{ config(materialized='incremental') }}
SELECT
    user_id,
    event_date,
    COUNT(*) AS event_count
FROM {{ ref('staging_events') }}
{% if is_incremental() %}
WHERE event_date > (SELECT MAX(event_date) FROM {{ this }})
{% endif %}
GROUP BY 1, 2
-- Run: python etl_script.py
Commands & References
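The incremental filter in the dbt model above can be sketched in plain Python. This is a minimal illustration, not the contents of etl_script.py: it uses SQLite in place of the warehouse, and the fact table name fact_events and the sample rows are assumptions; only the staging_events name and the "load rows newer than MAX(event_date)" watermark logic come from the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_events (user_id INTEGER, event_date TEXT)")
conn.execute("CREATE TABLE fact_events (user_id INTEGER, event_date TEXT, event_count INTEGER)")

# Pretend an earlier run already loaded 2024-01-01 into the fact table.
conn.execute("INSERT INTO fact_events VALUES (1, '2024-01-01', 3)")
conn.executemany(
    "INSERT INTO staging_events VALUES (?, ?)",
    [(1, "2024-01-01"), (1, "2024-01-02"), (2, "2024-01-02")],
)

# Incremental step: aggregate only rows past the watermark, mirroring the
# model's WHERE event_date > (SELECT MAX(event_date) FROM {{ this }}) filter.
conn.execute(
    """
    INSERT INTO fact_events
    SELECT user_id, event_date, COUNT(*)
    FROM staging_events
    WHERE event_date > (SELECT MAX(event_date) FROM fact_events)
    GROUP BY user_id, event_date
    """
)
rows = conn.execute(
    "SELECT user_id, event_date, event_count FROM fact_events ORDER BY event_date, user_id"
).fetchall()
print(rows)  # the pre-existing row is kept; only the newer date is appended
```

Note that already-loaded dates are skipped entirely: rerunning the step is cheap, but late-arriving events for an old date would be missed, which is the classic trade-off of a simple watermark.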
- python -m venv venv
- python etl_script.py
- CREATE TABLE events (id SERIAL PRIMARY KEY)
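The CREATE TABLE command above is Postgres DDL; SERIAL auto-assigns integer ids. A rough sketch of the same idea using Python's built-in sqlite3 (where INTEGER PRIMARY KEY plays the role of SERIAL) — the user_id and event_date columns are illustrative additions, not part of the lesson's one-column schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite's INTEGER PRIMARY KEY auto-assigns ids, similar to Postgres SERIAL.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, event_date TEXT)")
conn.execute("INSERT INTO events (user_id, event_date) VALUES (1, '2024-01-01')")
conn.execute("INSERT INTO events (user_id, event_date) VALUES (2, '2024-01-02')")
ids = [r[0] for r in conn.execute("SELECT id FROM events ORDER BY id")]
print(ids)  # ids assigned automatically, as SERIAL would do in Postgres
```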
Lab Steps
- Prepare environment with: python -m venv venv
- Design or modify the data pipeline for the scenario.
- Validate data quality and document lineage.
- Propose one optimization for production.
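For the "validate data quality" step, one hedged sketch of what a check could look like: assert there are no NULL keys and no duplicate (user_id, event_date) rows in staging before loading. The function name run_quality_checks and the specific checks are assumptions; only the staging_events table comes from the lesson.

```python
import sqlite3

def run_quality_checks(conn):
    """Return a list of failed check names (empty means all checks passed)."""
    failures = []
    # Check 1: no NULLs in key columns.
    null_keys = conn.execute(
        "SELECT COUNT(*) FROM staging_events WHERE user_id IS NULL OR event_date IS NULL"
    ).fetchone()[0]
    if null_keys:
        failures.append("null_keys")
    # Check 2: no duplicate (user_id, event_date) pairs.
    dupes = conn.execute(
        """SELECT COUNT(*) FROM (
               SELECT user_id, event_date FROM staging_events
               GROUP BY user_id, event_date HAVING COUNT(*) > 1)"""
    ).fetchone()[0]
    if dupes:
        failures.append("duplicate_rows")
    return failures

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_events (user_id INTEGER, event_date TEXT)")
conn.executemany(
    "INSERT INTO staging_events VALUES (?, ?)",
    [(1, "2024-01-01"), (None, "2024-01-02")],  # second row has a NULL key
)
print(run_quality_checks(conn))  # ['null_keys']
```

A pipeline would typically run these checks after staging and abort (or quarantine bad rows) before the incremental load when the returned list is non-empty.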
Exercises
- Add one data quality check.
- Implement one incremental loading pattern.
- Write a rollback procedure for this pipeline.
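One possible direction for the rollback exercise (a sketch, not the expected answer): wrap each batch load in a database transaction so a validation failure mid-batch leaves the fact table exactly as it was. The fact_events schema and the negative-count validation rule are illustrative assumptions.

```python
import sqlite3

def load_batch(conn, rows):
    """Insert a batch atomically; roll back everything if validation fails."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.executemany("INSERT INTO fact_events VALUES (?, ?, ?)", rows)
            # Simulated validation failure partway through the load.
            if any(count < 0 for _, _, count in rows):
                raise ValueError("negative event_count -- rolling back batch")
    except ValueError:
        pass  # table is unchanged; caller can retry or alert

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_events (user_id INTEGER, event_date TEXT, event_count INTEGER)")
load_batch(conn, [(1, "2024-01-01", 5), (2, "2024-01-01", -1)])  # bad batch: rolled back
load_batch(conn, [(1, "2024-01-01", 5)])                          # good batch: committed
n = conn.execute("SELECT COUNT(*) FROM fact_events").fetchone()[0]
print(n)  # only the good batch's single row survives
```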