Foundation's ML Approach

Foundation provides an integrated machine learning library that works within the platform's transformation framework. ML models are implemented as transformations and follow the same logic and patterns as any other data transformation in Foundation. This keeps ML workflows consistent with the rest of the platform and reuses Foundation's existing infrastructure for data processing, storage, and lineage tracking.

Supported Models

Foundation currently supports the following ML models:

LightGBM: Gradient boosting framework for regression and time-series forecasting
LSTM: Long Short-Term Memory networks for multi-variate time-series prediction
K-Means: Clustering algorithm with outlier detection capabilities

Our ML library is continuously expanding, and we expect to add more models based on user needs and use cases.

ML Ops Approaches

Foundation supports two distinct approaches for machine learning workflows:

Training/Inference Approach (Model Persistence)

This approach separates model training and inference into distinct steps:

Training Phase:

Train a model on historical data
Store the trained model in Foundation's storage under the /models directory
Generate a metadata data product containing model performance metrics and configuration

Inference Phase:

Load a previously trained model from storage
Apply the model to new data for predictions
Generate a predictions data product with results
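
The two phases above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Foundation's API: the function names `train` and `infer`, the one-feature least-squares model standing in for a real model such as LightGBM, the JSON model file, and the temporary `models` directory are all hypothetical choices made for the example.

```python
import json
import tempfile
from pathlib import Path


def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept (stand-in for a real model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)  # assumes xs are not all identical
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    return [model["slope"] * x + model["intercept"] for x in xs]


def train(xs, ys, models_dir, model_name):
    """Training phase: fit, persist the model, and return a metadata record."""
    model = fit_linear(xs, ys)
    models_dir.mkdir(parents=True, exist_ok=True)
    model_path = models_dir / f"{model_name}.json"
    model_path.write_text(json.dumps(model))
    residuals = [y - p for y, p in zip(ys, predict(model, xs))]
    rmse = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    # This record plays the role of a row in the metadata data product.
    return {
        "model_name": model_name,
        "model_path": str(model_path),
        "n_rows": len(xs),
        "rmse": rmse,
    }


def infer(xs, model_path):
    """Inference phase: load the persisted model and return prediction rows."""
    model = json.loads(Path(model_path).read_text())
    # These rows play the role of the predictions data product.
    return [{"x": x, "prediction": p} for x, p in zip(xs, predict(model, xs))]


# Hypothetical run: train on historical data, then score new data separately.
with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp) / "models"
    metadata = train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], models_dir, "demo_model")
    predictions = infer([5.0, 6.0], metadata["model_path"])
```

The point of the split is that `infer` depends only on the stored model file, so it can run later and repeatedly on new data without retraining.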

This approach requires creating two data products:

Metadata Data Product: Stores model training information, metrics, and configuration
Predictions Data Product: Stores the inference results

Each data product requires:

A defined schema matching the expected output structure
A builder configuration (training builder for metadata, inference builder for predictions)
Linkage to the source data product containing the features
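
The three requirements listed above can be pictured as a small specification per data product. The sketch below is only a mental model: `DataProductSpec`, its field names, the `conforms` helper, and the example product names are hypothetical and do not reflect Foundation's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductSpec:
    """Hypothetical description of one ML data product."""
    name: str
    schema: dict   # column name -> expected Python type
    builder: str   # "training" for metadata, "inference" for predictions
    source: str    # source data product that contains the features


def conforms(row, schema):
    """Return True if the row has exactly the schema's columns with matching types."""
    return set(row) == set(schema) and all(
        isinstance(row[column], expected) for column, expected in schema.items()
    )


# One spec per data product; both link to the same (hypothetical) feature source.
metadata_spec = DataProductSpec(
    name="demand_model_metadata",
    schema={"model_name": str, "rmse": float, "n_rows": int},
    builder="training",
    source="demand_features",
)
predictions_spec = DataProductSpec(
    name="demand_predictions",
    schema={"x": float, "prediction": float},
    builder="inference",
    source="demand_features",
)
```

A check such as `conforms(row, predictions_spec.schema)` illustrates what "a defined schema matching the expected output structure" means in practice: the builder's output rows must carry the declared columns and types.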

Transient Inference Approach (Single Execution)

This approach combines training and inference in a single transformation:

Train a model on the input data
Immediately apply it to generate predictions
The model is discarded after execution and is not persisted for future use

This approach requires one data product:

A single data product with schema and builder configuration
Direct transformation from input features to predictions
No model storage or versioning
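
For contrast with the persisted approach, a transient run can be sketched as a single function. This is an illustrative sketch only: `transient_predict` is a hypothetical name, the one-feature least-squares fit is a stand-in for a real model, and treating the rows to score as a separate argument is an assumption made for the example.

```python
def transient_predict(train_xs, train_ys, score_xs):
    """Fit a stand-in linear model and apply it at once; nothing is saved."""
    n = len(train_xs)
    mean_x = sum(train_xs) / n
    mean_y = sum(train_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in train_xs)  # assumes xs are not all identical
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_xs, train_ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The fitted parameters live only inside this call: there is no model file,
    # no metadata record, and no version to refer back to later.
    return [{"x": x, "prediction": slope * x + intercept} for x in score_xs]


# Hypothetical run: one execution produces the prediction rows directly.
rows = transient_predict([0.0, 1.0, 2.0], [1.0, 3.0, 5.0], [3.0, 4.0])
```

Each execution retrains from scratch, which keeps the setup to one data product at the cost of being unable to reuse or audit an earlier model.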


Last updated 4 months ago
