Feeding Models and Generating Insights

Once your AI/ML-ready datasets are packaged as data products, the next step is to put them to work – training models, generating predictions, and producing insights that help people and systems make better decisions.

A well-prepared data product removes much of the friction in model development because it already contains clean, consistent, and well-documented features. With this foundation, teams can focus on:

  • Selecting the right modeling approach
  • Experimenting with algorithms
  • Measuring performance without worrying about data reliability

From Data Product to Model Input

Whether you are building a forecasting model, a classifier, or a recommendation system, the pipeline begins with loading the data product into the training process. Because the dataset is versioned and reproducible, you can:

  • Run experiments repeatedly under identical conditions
  • Compare models trained at different points in time
  • Ensure performance differences stem from the model, not the data
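A minimal sketch of that first step, assuming the data product is published as versioned Parquet snapshots. The path, version tag, and file layout are illustrative assumptions; substitute whatever your data product catalog exposes.

```python
# Minimal sketch: loading a pinned version of a data product for training.
# The directory layout, version tag, and file name are hypothetical.
import pandas as pd

DATA_PRODUCT_PATH = "data_products/borrowing_history"  # hypothetical location
VERSION = "v2024.06.01"  # pin an exact snapshot so every experiment sees identical data

def load_training_frame(path: str = DATA_PRODUCT_PATH, version: str = VERSION) -> pd.DataFrame:
    """Load one immutable snapshot of the data product as a training frame."""
    df = pd.read_parquet(f"{path}/{version}/features.parquet")
    # Keep the version with the frame so experiment results stay traceable.
    df.attrs["data_product_version"] = version
    return df

features = load_training_frame()
print(features.attrs["data_product_version"], features.shape)
```

Pinning the version in code, rather than reading "latest", is what makes repeated experiments comparable: if two runs disagree, the data is ruled out as the cause.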

Example: Predicting Late Returns

In our library domain example, a borrowing history dataset might include:

  • A member's punctuality rate
  • Seasonal demand for certain genres
  • Average loan duration

From this, you could train:

  • A classification model to predict whether a loan will be returned late
  • A forecasting model to estimate overdue volume in upcoming weeks

Because the dataset is consistent and traceable, you can explain exactly how each feature was derived and how it influences the model's output.
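Continuing from the loading sketch above, here is one way the late-return classifier could look with scikit-learn. The column names (punctuality_rate, genre_seasonal_demand, avg_loan_duration, returned_late) are assumptions matching the features listed, not a fixed schema.

```python
# Minimal sketch: training a late-return classifier on the borrowing-history
# data product. Column names are hypothetical stand-ins for the listed features.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

FEATURE_COLUMNS = ["punctuality_rate", "genre_seasonal_demand", "avg_loan_duration"]
TARGET_COLUMN = "returned_late"  # 1 if the loan came back late, 0 otherwise

X = features[FEATURE_COLUMNS]
y = features[TARGET_COLUMN]

# Hold out a test set so the reported score reflects unseen loans.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)

# ROC AUC on the held-out loans, using the predicted probability of "late".
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Late-return ROC AUC: {auc:.3f}")
```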

Generating Insights

Not every AI-ready dataset has to culminate in a deployed model. It can also support:

  • Statistical analysis
  • Correlation studies
  • Backtesting to evaluate how a model would have performed on historical data

These activities often reveal patterns or anomalies that inspire new features, refine business rules, or lead to more targeted models.
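For instance, a quick correlation study can be run directly on the same training frame, with no deployed model involved. The column names remain the hypothetical ones used above.

```python
# Minimal sketch: a rank-based correlation study on the borrowing-history frame.
corr = features[["punctuality_rate", "avg_loan_duration", "genre_seasonal_demand"]].corr(
    method="spearman"  # rank-based, robust to skewed loan durations
)
print(corr.round(2))

# Flag strongly correlated feature pairs; these often point to redundant
# features or business rules worth a closer look.
strong = corr.abs().where(lambda c: (c > 0.7) & (c < 1.0)).stack()
print(strong)
```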

Closing the AI Feedback Loop

Predictions themselves can become new events in the event store – for example, LateReturnPredicted or HighDemandForecasted. Recording these predictions:

  • Enables ongoing monitoring of model accuracy
  • Supports retraining with actual outcomes
  • Builds a transparent history of AI-driven decisions

This feedback loop ensures models improve over time and remain aligned with real-world behavior.
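One way to close that loop is to write each prediction back as an event the moment it is made. The sketch below appends a LateReturnPredicted event to an append-only JSON-lines file; the event shape and the file-based "event store" are illustrative assumptions, not a prescribed format.

```python
# Minimal sketch: recording a prediction as an event so it can later be
# compared with the actual return outcome.
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path

EVENT_LOG = Path("event_store/late_return_predictions.jsonl")  # hypothetical store

def record_late_return_prediction(loan_id: str, probability: float, model_version: str) -> dict:
    """Append a LateReturnPredicted event to the event store."""
    event = {
        "event_id": str(uuid.uuid4()),
        "event_type": "LateReturnPredicted",
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "loan_id": loan_id,
        "predicted_probability": probability,
        "model_version": model_version,
    }
    EVENT_LOG.parent.mkdir(parents=True, exist_ok=True)
    with EVENT_LOG.open("a") as fh:
        fh.write(json.dumps(event) + "\n")
    return event

# Example: record one prediction from the classifier sketched earlier.
record_late_return_prediction(loan_id="loan-42", probability=0.87, model_version="gbm-2024-06")
```

Because each event carries the model version and a timestamp, actual return events can later be joined against these predictions to measure accuracy and trigger retraining.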

Next up: Closing the Loop – act on predictions, measure their impact, and continuously improve your models.