MLOps
Training, evaluation, deployment, and monitoring as a studyflow
Machine-learning pipelines span data preparation, model training, validation, and deployment. Documentation usually diverges from implementation; model versions are created without clear tracking; deployment decisions lack transparent criteria.
Studyflow provides a formal specification language for ML operations. Researchers can document decision points, environmental dependencies, and gates that govern lifecycle management – all in one diagram.
This example illustrates an MLOps workflow for prediction models: raw behavioral-data ingestion → feature engineering → model training and cross-validation → performance evaluation → conditional deployment → monitoring → retraining.
The reference diagram for this example should live at docs/assets/img/examples/mlops-pipeline.svg and the source at docs/assets/img/examples/mlops-pipeline.studyflow. Author it in the modeler using the structure below.
Stages
- Ingest – a Script activity reading raw data from a Dataset representing the data lake. The data operation marker is Transform.
- Feature engineering – a sequence of Map and Reduce operations on the raw data, producing a feature table with a schema. See Attach a schema.
- Train/test split – a Filter operation produces two tables.
- Model training – a Script activity that consumes the training table and produces a model artifact (a Snapshot of the model Dataset).
- Cross-validation gateway – an Exclusive Gateway checking that CV scores meet a threshold. Failing branches loop back to feature engineering with logged failure reasons; passing branches continue.
- Fairness audit – a Manual activity (or a Script with reviewer sign-off) checking group-level performance. Output: a fairness report.
- Fairness gateway – another exclusive gateway. Pass: continue to deployment. Fail: route to a retraining loop with the fairness criteria as part of the loss.
- Deployment – a Script activity that writes the model snapshot to the production store.
- Monitoring – a sub-process triggered by a timer event (daily/weekly) that checks live performance against the held-out baseline.
- Retraining trigger – a boundary error event on the monitoring sub-process. When live performance drops below threshold, it routes back to feature engineering.
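The two gateways and the retraining trigger in the list above reduce to small routing predicates. The following is a minimal Python sketch of those conditions, not Studyflow syntax. The threshold values, function names, and stage labels are all hypothetical, and measuring fairness as the gap between the best and worst group score is an assumption made only for illustration.

```python
from statistics import mean

# Hypothetical thresholds; real values belong to the study's own policy.
CV_THRESHOLD = 0.80        # minimum mean cross-validation score
FAIRNESS_MAX_GAP = 0.05    # assumed metric: max gap between group scores
MONITOR_THRESHOLD = 0.75   # minimum acceptable live performance


def cv_gateway(cv_scores):
    """Exclusive gateway: return (next_stage, failure_reason)."""
    score = mean(cv_scores)
    if score >= CV_THRESHOLD:
        return "fairness_audit", None
    # Failing branch loops back with a logged reason.
    reason = f"mean CV score {score:.3f} below {CV_THRESHOLD}"
    return "feature_engineering", reason


def fairness_gateway(group_scores):
    """Exclusive gateway on the fairness report: return (next_stage, reason)."""
    gap = max(group_scores.values()) - min(group_scores.values())
    if gap <= FAIRNESS_MAX_GAP:
        return "deployment", None
    reason = f"group performance gap {gap:.3f} exceeds {FAIRNESS_MAX_GAP}"
    return "retraining", reason


def monitoring_check(live_score):
    """Boundary event: route back to feature engineering on degradation."""
    if live_score < MONITOR_THRESHOLD:
        return "feature_engineering"
    return "monitoring"
```

Each function returns the name of the next stage, so the failure reason travels with the routing decision instead of being buried in a log line elsewhere.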
Why diagram it
ML pipelines accumulate decision and rollback paths that are hard to describe in prose:
- Gates are visible. The CV and fairness gateways show, in the diagram, what conditions a model must meet before it’s deployed. Reviewers can verify the policy without reading code.
- The retraining loop is the topology. Monitoring → boundary event → back to feature engineering is one path on the diagram, not a description split across runbooks.
- Model snapshots are first-class. A Snapshot element documents what version of the model was deployed when.
- The fairness audit has a name and a place. Treating it as an explicit activity in the diagram makes it harder to skip and easier to demand.
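Because the gates and the retraining loop are part of the topology, they can also be checked mechanically. The sketch below models the stages as a plain directed graph in Python and tests two properties: that deployment cannot be reached without passing a given gateway, and that monitoring leads back into the pipeline. The node names and the edge from the fairness-failure branch back to training are assumptions for illustration; this is not a Studyflow API.

```python
from collections import deque

# Hypothetical stage names mirroring the stages described on this page.
EDGES = {
    "ingest": ["feature_engineering"],
    "feature_engineering": ["split"],
    "split": ["training"],
    "training": ["cv_gateway"],
    "cv_gateway": ["fairness_audit", "feature_engineering"],  # pass, fail
    "fairness_audit": ["fairness_gateway"],
    "fairness_gateway": ["deployment", "retraining"],         # pass, fail
    "retraining": ["training"],              # assumed target of the loop
    "deployment": ["monitoring"],
    "monitoring": ["feature_engineering"],   # retraining trigger
}


def reachable(edges, start, goal, blocked=frozenset()):
    """Breadth-first search from start to goal, never entering blocked nodes."""
    if start in blocked:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return False


def gate_is_mandatory(edges, gate):
    """True if deployment is reachable, but only by passing through the gate."""
    with_gate = reachable(edges, "ingest", "deployment")
    without_gate = reachable(edges, "ingest", "deployment", blocked={gate})
    return with_gate and not without_gate


def has_retraining_loop(edges):
    """True if some path leaving monitoring eventually returns to it."""
    return any(
        reachable(edges, nxt, "monitoring")
        for nxt in edges.get("monitoring", [])
    )
```

For example, `gate_is_mandatory(EDGES, "cv_gateway")` holds for the graph above, and stops holding if an edge from `training` straight to `deployment` is added.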
See also
- Analysis pipelines – preprocessing patterns shared with MLOps.
- Preprocessing pipelines – Map/Filter/Reduce/Group reference.
- Data – Dataset, Snapshot, Catalog.