Capstone Project & Professional Assessment
Take a real dataset through the full CRISP-DM lifecycle — frame, wrangle, model, evaluate, deploy and document — and pass the final assessment to earn your Professional Certificate.
Your project
This is where the whole course converges. You will run a complete, end-to-end data-science project through the CRISP-DM lifecycle and present it like a professional — the flagship piece of your portfolio. Choose a domain you genuinely care about: finance, healthcare, retail, sports, climate or social good.
- Frame the problem. Pick a domain, write one clear question, and state how success will be measured: the metric and a baseline to beat (Modules 1 & 5).
- Acquire real data. Source a genuine dataset — a public CSV, an API, a Kaggle dataset, or a database — and document where it came from (Module 2).
- Wrangle & engineer. Clean types, handle missing values, merge tables, and engineer meaningful features into a tidy, leak-free table (Modules 2 & 6).
- Explore (EDA). Profile distributions, outliers and relationships, and surface at least five evidence-backed insights with clear visualisations (Module 3).
- Reason statistically. Where it strengthens the work, add a hypothesis test, confidence interval or A/B analysis — and interpret it correctly (Module 4).
- Model. Build at least two models in scikit-learn Pipelines, validate with cross-validation, tune hyper-parameters, and beat your baseline (Modules 5 & 6).
- Go deep where it fits. Apply deep learning, NLP, or time-series forecasting (Modules 7–9) only if it genuinely helps your problem.
- Deploy. Save the pipeline, wrap it in a FastAPI endpoint, and containerise it with Docker so it runs anywhere (Module 10).
- Be responsible. Audit fairness, explain predictions (SHAP values or feature importance), handle any PII, and write a short model card (Module 11).
- Communicate. Write a report (Question → Finding → Evidence → Recommendation) and record a 10-minute walkthrough for a non-technical audience.
- Publish. Push the notebook, data dictionary, model card, API code and report to a well-documented public GitHub repository.
- Pass the final assessment below to complete the course and earn your Professional Certificate.
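The framing and modelling steps above (a metric, a baseline to beat, at least two scikit-learn Pipelines, cross-validation and tuning) can be sketched roughly as follows. This is a minimal illustration, not a required template: the synthetic data from `make_classification` stands in for your own dataset, and the choice of accuracy as the metric, the two model types, and the `C` grid are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (assumption: binary classification).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

# One fixed cross-validation scheme so every model is scored on the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def mean_cv_accuracy(model):
    """Return the mean cross-validated accuracy of a model on (X, y)."""
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()


# Baseline to beat: always predict the most frequent class.
baseline_score = mean_cv_accuracy(DummyClassifier(strategy="most_frequent"))

# Model 1: scaling + logistic regression, wrapped in a Pipeline so the scaler
# is fitted on the training folds only (no leakage into the validation fold).
logreg = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Model 2: a random forest, also wrapped in a Pipeline for a uniform interface.
forest = Pipeline([
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Tune the regularisation strength of model 1 with a small grid search.
search = GridSearchCV(
    logreg, {"clf__C": [0.1, 1.0, 10.0]}, cv=cv, scoring="accuracy"
)
search.fit(X, y)

scores = {
    "baseline": baseline_score,
    "logreg_tuned": search.best_score_,
    "forest": mean_cv_accuracy(forest),
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Note that `search.best_score_` is selected on the same folds it is scored on, so it is slightly optimistic. For the reported result in a real project, a held-out test set or nested cross-validation gives a fairer estimate.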
Final assessment
The final assessment covers the whole course. Score 70% or higher and complete every module to earn your Professional Certificate in Data Science using Python.
Take the final assessment →
Log in first so your result counts toward the certificate.