🤖 AI Feature Store Workshop
A comprehensive hands-on lab for building production-grade ML feature stores with Datorth streaming. Learn to serve features in real time with sub-millisecond latency while maintaining training-serving consistency.
Workshop Overview
This 4-hour hands-on workshop guides you through building a complete feature store implementation using Datorth. You'll learn to create, serve, and monitor ML features at production scale while avoiding common pitfalls like training-serving skew.
What You'll Learn
- Feature engineering patterns — Batch and streaming feature pipelines
- Real-time feature serving — Sub-millisecond feature lookups for online inference
- Point-in-time joins — Correct historical feature retrieval for training
- Feature monitoring — Drift detection and quality tracking
- Feature discovery — Catalog and governance for feature reuse
Workshop Agenda
Module 1: Feature Store Fundamentals (45 min)
Understand the core concepts and architecture of modern feature stores.
- Why feature stores matter for ML operations
- Online vs. offline feature serving
- Feature store architecture with Datorth
- Training-serving skew and how to prevent it
Module 2: Building Feature Pipelines (60 min)
Hands-on lab creating batch and streaming feature pipelines.
- Defining feature schemas and metadata
- Building batch features with Spark SQL
- Creating streaming features with Flink
- Implementing windowed aggregations
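The windowed-aggregation pattern from this module can be sketched in plain Python. The lab itself uses Flink and Datorth APIs; the function name and event shape below are illustrative only:

```python
from collections import defaultdict

def tumbling_window_sum(events, window_seconds):
    """Sum amounts per (window, user) over fixed, non-overlapping time windows.
    events: iterable of (epoch_ts, user_id, amount)."""
    windows = defaultdict(float)
    for ts, user_id, amount in events:
        window_start = ts - (ts % window_seconds)  # align to window boundary
        windows[(window_start, user_id)] += amount
    return dict(windows)

events = [
    (5, "u1", 20.0),
    (30, "u1", 5.0),
    (75, "u1", 10.0),   # lands in the 60-120s window
    (10, "u2", 7.5),
]
features = tumbling_window_sum(events, window_seconds=60)
# features[(0, "u1")] == 25.0, features[(60, "u1")] == 10.0
```

A streaming engine adds the hard parts this sketch omits: late and out-of-order events, watermarks, and incremental state.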
Module 3: Feature Serving (60 min)
Deploy features for real-time inference with low latency.
- Configuring online feature stores
- Serving features via REST and gRPC APIs
- Caching strategies for ultra-low latency
- Handling missing features gracefully
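The serving patterns above (caching plus graceful degradation) can be sketched as a read-through cache with per-feature defaults. A plain dict stands in for the Redis-backed store here; class and parameter names are illustrative, not Datorth APIs:

```python
class OnlineFeatureClient:
    """Sketch of an online feature lookup: read-through cache in front of a
    key-value store, with per-feature defaults so missing values degrade
    gracefully instead of failing the inference request."""

    def __init__(self, store, defaults):
        self.store = store          # stand-in for a Redis-backed online store
        self.defaults = defaults    # fallback value per feature name
        self.cache = {}

    def get_features(self, entity_id, feature_names):
        row = self.cache.get(entity_id)
        if row is None:
            row = self.store.get(entity_id, {})  # one round-trip on cache miss
            self.cache[entity_id] = row
        return {name: row.get(name, self.defaults.get(name))
                for name in feature_names}

store = {"user_42": {"avg_spend_30d": 183.5}}
client = OnlineFeatureClient(store, defaults={"txn_count_5m": 0})
feats = client.get_features("user_42", ["avg_spend_30d", "txn_count_5m"])
# {"avg_spend_30d": 183.5, "txn_count_5m": 0}
```

A production cache would also need TTL-based invalidation so stale features age out.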
Module 4: Training Data Generation (45 min)
Generate correct training datasets with point-in-time feature retrieval.
- Understanding point-in-time correctness
- Building training datasets with time-travel
- Avoiding data leakage in features
- Integration with ML training frameworks
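Point-in-time correctness boils down to an as-of join: for each training label, take the latest feature value observed at or before the label's timestamp, never after. A minimal pure-Python sketch (the lab uses Datorth's time-travel APIs instead):

```python
import bisect

def point_in_time_join(label_rows, feature_history):
    """For each (entity_id, label_ts), attach the latest feature value with
    timestamp <= label_ts -- never a future value, which would leak data.
    feature_history: entity_id -> list of (ts, value), sorted by ts."""
    joined = []
    for entity_id, label_ts in label_rows:
        history = feature_history.get(entity_id, [])
        timestamps = [ts for ts, _ in history]
        i = bisect.bisect_right(timestamps, label_ts)  # first ts > label_ts
        value = history[i - 1][1] if i > 0 else None
        joined.append((entity_id, label_ts, value))
    return joined

history = {"u1": [(100, 0.2), (200, 0.5), (300, 0.9)]}
rows = point_in_time_join([("u1", 250), ("u1", 50)], history)
# [("u1", 250, 0.5), ("u1", 50, None)]
```

Note the second row: with no feature value recorded before t=50, the join yields None rather than peeking at the future value.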
Module 5: Monitoring & Governance (30 min)
Ensure feature quality and enable discovery across teams.
- Feature drift detection and alerting
- Data quality monitoring for features
- Feature catalog and documentation
- Access control and lineage tracking
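One common drift-detection metric covered by tools in this space is the population stability index (PSI), which compares a feature's current binned distribution against a training-time baseline. A minimal sketch (thresholds are a rule of thumb, not a Datorth default):

```python
import math

def population_stability_index(expected, actual):
    """PSI between two binned distributions (lists of bin proportions).
    Rule of thumb: PSI > 0.2 suggests significant drift worth alerting on."""
    eps = 1e-6  # avoid log(0) for empty bins
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))

baseline = [0.25, 0.25, 0.25, 0.25]
drifted  = [0.10, 0.20, 0.30, 0.40]
psi = population_stability_index(baseline, drifted)
# psi is about 0.23 here, above the common 0.2 alert threshold
```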
Lab Environment
Each participant receives access to a fully configured Datorth environment with:
- Pre-configured Kafka clusters for streaming
- Flink and Spark environments for processing
- Redis-backed online feature store
- Sample datasets and starter notebooks
- Jupyter environment for hands-on exercises
Use Cases Covered
Real-time Fraud Detection
Build features for transaction fraud scoring:
- User spending patterns (30-day rolling averages)
- Transaction velocity (last 5 minutes)
- Device and location features
- Merchant risk scores
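The transaction-velocity feature above ("last 5 minutes") can be sketched as a sliding-window counter; this assumes timestamps arrive in order, which the streaming engine handles for real in the lab:

```python
from collections import deque

class VelocityCounter:
    """Transaction velocity: count of a user's events in a sliding window
    (here 5 minutes), a classic real-time fraud feature."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.events = deque()  # timestamps, oldest first

    def record(self, ts):
        """Record one transaction at epoch time ts; return the current count."""
        self.events.append(ts)
        while self.events and self.events[0] <= ts - self.window:
            self.events.popleft()  # evict events older than the window
        return len(self.events)

counter = VelocityCounter(window_seconds=300)
counts = [counter.record(t) for t in (0, 60, 120, 400)]
# counts == [1, 2, 3, 2]  -- the t=0 and t=60 events expired by t=400
```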
Personalization Engine
Create features for real-time recommendations:
- User engagement signals (clicks, views, purchases)
- Item popularity and trending scores
- Collaborative filtering embeddings
- Session context features
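The trending-score idea from this list can be sketched with exponential decay, so recent interactions dominate; the function name and half-life default below are illustrative assumptions, not Datorth APIs:

```python
def trending_score(event_times, now, half_life_seconds=3600.0):
    """Exponentially decayed popularity: each interaction contributes
    0.5 ** (age / half_life), so an hour-old event counts half as much
    as one happening now (with a one-hour half-life)."""
    return sum(0.5 ** ((now - t) / half_life_seconds) for t in event_times)

now = 7200.0
score = trending_score([7200.0, 3600.0, 0.0], now)
# 1.0 + 0.5 + 0.25 == 1.75
```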
Prerequisites
- Basic understanding of machine learning concepts
- Familiarity with Python and SQL
- Experience with data engineering (helpful but not required)
- Laptop with modern web browser
Workshop Formats
Live Virtual Workshop
Instructor-led sessions with live Q&A and hands-on support.
- Duration: 4 hours
- Class size: Up to 30 participants
- Schedule: Monthly sessions (see calendar)
Private Workshop
Customized for your team with your use cases and data.
- Duration: 4-8 hours (customizable)
- Delivered on-site or virtually
- Custom labs using your datasets
Self-Paced Lab
Work through the materials at your own pace.
- Access to recorded sessions
- Lab environment for 7 days
- Community support via Slack
Upcoming Sessions
- December 15, 2025 — 10:00 AM EST (Virtual)
- January 12, 2026 — 10:00 AM EST (Virtual)
- January 26, 2026 — 2:00 PM PST (Virtual)
Build your ML feature store
Reserve your seat in an upcoming workshop or request a private session for your team.
Reserve a seat