Skip to content

✅ MLOps Requirements Gathering

1. Process Data (Collect → Preprocess → Feature Engineering)

  • 💼 Business:

  • What decision will this model support?

  • What is the cost of false positives vs false negatives?
  • What are the business success metrics (revenue, churn, risk)?
  • How often are predictions needed (real-time vs batch)?
  • Are there regulatory or compliance constraints?

  • 🧠 Data Science:

  • What data sources are available? Are they reliable?

  • How is the label defined? Is it noisy or delayed?
  • How is data collected (batch, streaming, manual)?
  • What are known data quality issues?
  • What preprocessing steps are required?
  • Can features be reproduced consistently in production?

2. Feature Store (Online / Offline)

  • 💼 Business:

  • Is real-time prediction required?

  • What is the acceptable latency SLA?

  • 🧠 Data Science:

  • Which features are needed online vs offline?

  • Are features point-in-time correct (no leakage)?
  • What is the required feature freshness?
  • Can features be reused across models?

3. Develop Model (Train, Tune, Evaluate)

  • 💼 Business:

  • What is the minimum acceptable performance?

  • What is the current baseline (rules or human)?
  • How do model metrics map to business KPIs?

  • 🧠 Data Science:

  • What evaluation metrics best reflect business impact?

  • How do we handle class imbalance?
  • Do we need explainability?
  • What validation strategy will be used?
  • Are experiments reproducible?

4. Deploy (Batch / Real-Time Inference)

  • 💼 Business:

  • What latency is required (ms, seconds, hours)?

  • What is expected traffic volume?
  • What happens if the model fails?

  • 🧠 Data Science:

  • Are training and inference features consistent?

  • Batch or real-time inference?
  • What are compute and model size constraints?
  • Do we need A/B testing or shadow deployment?

5. Monitor (Model + Data + System)

  • 💼 Business:

  • What signals indicate business impact degradation?

  • How quickly must issues be detected and resolved?

  • 🧠 Data Science:

  • How do we detect data drift?

  • How do we detect concept drift?
  • How do we monitor prediction distributions?
  • What alert thresholds are defined?
  • Do we monitor inputs, outputs, and performance?
  • Do we have ground truth feedback loops?

6. Feedback Loop (Retraining / Continuous Learning)

  • 💼 Business:

  • How often should the model be updated?

  • What is the cost vs benefit of retraining?
  • Can users provide feedback?

  • 🧠 Data Science:

  • How do we collect new labeled data?

  • Is retraining scheduled or triggered?
  • How do we prevent data leakage in retraining?
  • Are data, features, and models versioned?

7. Governance (Registry, Lineage, Compliance)

  • 💼 Business:

  • Are there auditability requirements?

  • Who owns and is accountable for the model?

  • 🧠 Data Science:

  • Can we trace model → data → code → features?

  • Are models versioned and reproducible?
  • Are training artifacts and metadata stored?