BEGINNER • Python Data Foundation
Modeling Sprint: improve model quality #17
This lesson focuses on improving model quality in a practical retail demand forecasting scenario. You will use three commands: python -m venv .venv to create an isolated environment, jupyter lab to open a notebook, and df.head() to preview the first rows of a DataFrame. The code example removes unusually high values from a demand series using the median and the median absolute deviation (MAD).
Code Example
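import numpy as np


def clean_series(values: list[float]) -> dict:
    """Drop high-side outliers using a median + 3 * MAD threshold."""
    arr = np.array(values, dtype=float)
    median = float(np.median(arr))
    # MAD: median absolute deviation from the median, robust to outliers.
    mad = float(np.median(np.abs(arr - median)))
    # Floor MAD at 1e-6 so the threshold stays strictly above the median when MAD is zero.
    threshold = median + 3 * max(mad, 1e-6)
    # One-sided filter: only unusually high values are removed.
    filtered = arr[arr <= threshold]
    return {
        "median": median,
        "mad": mad,
        "count_before": len(arr),
        "count_after": len(filtered),
    }


series = [12, 13, 11, 14, 500, 12, 13, 11]
# Reminder of the notebook command for previewing a DataFrame.
print("inspect:", "df.head()")
print(clean_series(series))
Commands & References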
- python -m venv .venv
- jupyter lab
- df.head()
Lab Steps
- Prepare the environment: python -m venv .venv
- Load a small sample dataset and validate schema.
- Run the core code workflow and collect metrics.
- Compare results and write one improvement note.
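The "load a small sample dataset and validate schema" and "collect metrics" steps above could look roughly like the sketch below. It uses only the standard library so it runs without extra installs. The column names (date, store_id, units_sold), the inline sample data, and the helper names load_sample and validate_schema are all hypothetical stand-ins, not part of the lesson's dataset.

```python
import csv
import io
import statistics

# Hypothetical sample data; replace with the lesson's real file.
SAMPLE_CSV = """date,store_id,units_sold
2024-01-01,1,12
2024-01-02,1,13
2024-01-03,1,500
2024-01-04,1,11
"""

# Assumed schema: column name -> converter that must accept every value.
EXPECTED_SCHEMA = {"date": str, "store_id": int, "units_sold": float}


def load_sample(text):
    """Parse CSV text into a list of row dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def validate_schema(rows, schema):
    """Return a list of problems; an empty list means the check passed."""
    if not rows:
        return ["no rows loaded"]
    missing = set(schema) - set(rows[0])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for i, row in enumerate(rows):
        for col, convert in schema.items():
            try:
                convert(row[col])
            except (TypeError, ValueError):
                problems.append(f"row {i}: bad value {row[col]!r} in {col}")
    return problems


rows = load_sample(SAMPLE_CSV)
problems = validate_schema(rows, EXPECTED_SCHEMA)
# Only compute metrics when the schema check passed.
units = [float(r["units_sold"]) for r in rows] if not problems else []
metrics = {
    "rows": len(rows),
    "problems": len(problems),
    "median_units": statistics.median(units) if units else None,
}
print(metrics)
```

Returning a list of problems instead of raising on the first one makes it easier to write the improvement note, since every bad row is reported in a single pass.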
Exercises
- Change one hyperparameter (for example, the multiplier 3 in the threshold) and compare the effect on count_after.
- Add one validation rule to reduce bad inputs.
- Document one failure mode and mitigation.
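One possible starting point for the first two exercises is sketched below. It assumes the tunable hyperparameter is the threshold multiplier (called k here) and that the validation rule rejects empty input and non-finite values. The function name clean_series_k is hypothetical, and the sketch uses the standard library's statistics module in place of NumPy.

```python
import math
import statistics


def clean_series_k(values, k=3.0):
    """Variant of the lesson's filter with a tunable multiplier k (hypothetical)."""
    # Validation rule (an assumption): reject empty input and NaN/inf values.
    if not values:
        raise ValueError("values must not be empty")
    if any(not math.isfinite(v) for v in values):
        raise ValueError("values must be finite numbers")
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    threshold = median + k * max(mad, 1e-6)
    # Same one-sided rule as the lesson code: keep values at or below the threshold.
    kept = [v for v in values if v <= threshold]
    return {"k": k, "threshold": threshold, "count_after": len(kept)}


series = [12, 13, 11, 14, 500, 12, 13, 11]
# Compare how many points survive under different multipliers.
for k in (1.0, 2.0, 3.0):
    print(clean_series_k(series, k))
```

A smaller k filters more aggressively and may drop legitimate demand peaks, which is one candidate failure mode to document for the third exercise.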