AGAI-101 · 10 Weeks · 60 Hours

Applied GenAI: LLMs, RAG & Agents

Go from zero GenAI knowledge to building and deploying production-grade AI applications. 10 weeks of live sessions, 8 mini-projects, 10 real-world workshops, and 1 capstone — all on our Jupyter-powered AI platform.

Format
Live Cohort
Schedule
Sat + Sun, 3hrs
Projects
18 + Capstone
Batch Size
15–25 Students

Every Week, Three Tracks

SATURDAY — 3 Hours

Concepts & Live Demos

Instructor-led session. Theory explained with intuition, followed by live coding demos. You watch, ask questions, and understand the “why” before the “how”.

SUNDAY — 3 Hours

Hands-On Lab & Workshop

You build on the bsigma platform. Guided labs of increasing difficulty, ending in a mini-project you push to GitHub. An AI instructor assists you in real time.

SUPPLEMENTAL — 2–3 Hours

Real-World Workshop

Apply the week's techniques to live data from real APIs. Build portfolio projects using Hacker News, USGS Earthquakes, GitHub, and more. Optional Challenge D for production-tier implementations.

4 Phases, 10 Weeks

Phase 1

LLM Foundations

Weeks 1–2

Phase 2

RAG — Retrieval-Augmented Generation

Weeks 3–5

Phase 3

AI Agents

Weeks 6–8

Phase 4

Fine-Tuning, Security & Capstone

Weeks 9–10

Week-by-Week Curriculum

Click any week to see the full session breakdown, lab exercises, and project details.

1
LLM Foundations & Your First AI Interaction
Phase 1 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • What are Large Language Models — the 10,000ft view
  • How LLMs are built: pre-training, SFT, RLHF (intuition, not math)
  • Self-attention as soft dictionary lookup — how transformers actually work
  • Model capability ladder — when to use which model (cost vs quality)
  • Reasoning models (o1, o3, DeepSeek-R1) — the next frontier beyond prompting
  • The LLM landscape: OpenAI, Anthropic, Google, Meta (open-source)
  • Live Demo: Same prompt across GPT-4, Claude, Llama — comparing outputs
  • Tokens, context windows, and why they matter for cost & quality
  • API anatomy: system/user/assistant messages, temperature, top-p
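To give you a feel for the API anatomy covered above: the request every provider expects boils down to a small payload of roles and sampling parameters. A minimal sketch (the model name and defaults here are illustrative, not the lab's exact values):

```python
# Sketch of an OpenAI-style chat completion request body.
# The same three roles (system/user/assistant) appear across providers.
def build_chat_request(user_text, system_text="You are a helpful assistant.",
                       model="gpt-4o-mini", temperature=0.7, top_p=1.0):
    return {
        "model": model,              # which model to call (cost vs quality)
        "temperature": temperature,  # randomness: lower = more deterministic
        "top_p": top_p,              # nucleus sampling cutoff
        "messages": [
            {"role": "system", "content": system_text},  # behavior instructions
            {"role": "user", "content": user_text},      # the actual prompt
        ],
    }

req = build_chat_request("Explain tokens in one sentence.", temperature=0.2)
```

In the Sunday lab you send payloads like this to multiple providers and watch how temperature and top-p change the output.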
Sunday — Hands-On Lab
  • Set up bsigma workspace — note cells, AI prompt cells, code cells
  • Call multiple LLM providers via API (OpenAI, Anthropic, Ollama)
  • Experiment with temperature, top-p, system prompts — observe changes
  • Model selection experiment — same task, different models, quality comparison
Mini-Project

Build a Model Comparison Tool — same prompt, 3 models, side-by-side output with quality assessment

Real-World Workshop

AI News Intelligence Dashboard

Live APIs: Hacker News

Fetch live tech news, AI-powered categorization, audience-adapted briefings, token economics at scale

2
Prompt Engineering & Solving Real NLP Tasks
Phase 1 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • Prompt engineering as a discipline — why it matters
  • Zero-shot vs few-shot prompting with live examples
  • Chain-of-thought (CoT) and step-by-step reasoning
  • Why techniques work — CoT as scratchpad, in-context learning, system message mechanics
  • Role prompting and persona-based approaches
  • Structured output: getting JSON, tables, specific formats
  • Constrained decoding — structured output across providers (OpenAI, Anthropic, Outlines)
  • Prompt templates and reusable patterns
  • Prompt evaluation — consistency metrics and accuracy measurement
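The few-shot and structured-output ideas above combine naturally: in-context examples teach the format, and a defensive parser handles the reply. A sketch of the pattern (the labels and reviews are toy data):

```python
import json

# Few-shot sentiment prompt that requests strict JSON output.
FEW_SHOT = [
    ("The battery died in two hours.", {"sentiment": "negative", "aspect": "battery"}),
    ("Gorgeous screen, very sharp.",   {"sentiment": "positive", "aspect": "screen"}),
]

def build_messages(review):
    msgs = [{"role": "system",
             "content": 'Reply with JSON only: {"sentiment": ..., "aspect": ...}'}]
    for text, label in FEW_SHOT:                      # in-context examples
        msgs.append({"role": "user", "content": text})
        msgs.append({"role": "assistant", "content": json.dumps(label)})
    msgs.append({"role": "user", "content": review})  # the new review to classify
    return msgs

def parse_reply(raw):
    # Defensive parse: models sometimes wrap JSON in markdown fences.
    raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
    return json.loads(raw)

msgs = build_messages("Shipping was slow but support was great.")
```

Constrained decoding (covered Saturday) makes the parsing step unnecessary; this manual version shows what the providers are doing for you.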
Sunday — Hands-On Lab
  • Zero-shot classification — categorize support tickets by topic
  • Few-shot sentiment analysis — extract sentiment + aspects from reviews
  • Chain-of-thought — solve multi-step reasoning problems
  • Cross-provider structured output — OpenAI vs Anthropic comparison
  • Prompt consistency experiment — same prompt, 5 runs, measure agreement
Mini-Project

Build a Customer Review Analyzer — raw reviews to structured JSON with multi-provider comparison and evaluation dashboard

Real-World Workshop

AI Recipe Transformer

Live APIs: TheMealDB

Dietary adaptation, cuisine fusion, chain-of-thought meal planning, structured nutrition extraction

3
Embeddings, Vector Search & RAG Fundamentals
Phase 2 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • Why LLMs hallucinate and why RAG exists
  • What are embeddings — turning text into numbers
  • How embeddings are trained — contrastive learning, InfoNCE loss, training data determines similarity
  • Choosing embedding models — MTEB benchmark, task-specific vs general-purpose
  • Semantic search vs keyword search (live comparison)
  • Vector databases: ChromaDB — and how they search (ANN, HNSW algorithm)
  • The RAG pipeline: Load → Split → Embed → Store → Retrieve → Generate
  • Chunking strategies: fixed-size, recursive, semantic
  • Principled retrieval — K as precision/recall trade-off, distance thresholding
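The retrieval math behind all of the above is just vector similarity plus two knobs: K and a distance threshold. A toy sketch with hand-made 3-dimensional "embeddings" (a real pipeline gets vectors from an embedding model and stores them in ChromaDB):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy document "embeddings" — illustrative only.
store = {
    "refund policy":   [0.9, 0.1, 0.0],
    "shipping times":  [0.1, 0.9, 0.1],
    "office location": [0.0, 0.1, 0.9],
}

def retrieve(query_vec, k=2, min_sim=0.5):
    # K and min_sim are the precision/recall knobs from the session:
    # larger K recalls more but adds noise; the threshold rejects weak matches.
    ranked = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc for doc, vec in ranked[:k] if cosine(query_vec, vec) >= min_sim]

hits = retrieve([0.8, 0.2, 0.0])
```

Note the second document never reaches the answer even though K=2 — the threshold filters it, which is exactly the hallucination-vs-rejection trade-off you tune in Sunday's lab.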
Sunday — Hands-On Lab
  • Generate embeddings, visualize similarity
  • Set up ChromaDB, load and query a vector store
  • Experiment with different chunk sizes — observe retrieval quality
  • K selection experiment — vary K, measure noise vs recall
  • Distance threshold tuning — find the sweet spot between rejection and hallucination
Mini-Project

Build a Company Knowledge Base Bot — RAG with confidence-aware retrieval and distance thresholding

Real-World Workshop

AI Book Recommendation Engine

Live APIs: Open Library

Semantic search over real books, metadata filtering, RAG-powered reading recommendations

4
Building a Production RAG Application
Phase 2 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • The pipeline pattern — RAG in pure Python first, then LangChain
  • Document loaders: text, web pages, CSVs (LangChain + framework-free)
  • RecursiveCharacterTextSplitter — chunking with boundary awareness
  • Persistent ChromaDB with metadata enrichment
  • RAG with source citations — context labeling for traceable answers
  • Retrieval failure taxonomy — the 4 types and how to diagnose each
  • Distance thresholds and metadata filtering — when to say 'I don't know'
  • Hybrid search — combining BM25 keyword + semantic search from first principles
  • Conversational RAG — query rewriting for follow-up questions
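Hybrid search, from first principles, is score fusion: normalize the keyword (BM25) and semantic scores so they are comparable, then blend with a tunable alpha. A sketch with illustrative raw scores:

```python
def normalize(scores):
    # Min-max normalize so keyword and vector scores share a 0–1 scale.
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_rank(bm25_scores, vector_scores, alpha=0.5):
    # alpha=1.0 → pure semantic search, alpha=0.0 → pure keyword (BM25).
    b, v = normalize(bm25_scores), normalize(vector_scores)
    fused = {doc: alpha * v.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
             for doc in set(b) | set(v)}
    return sorted(fused, key=fused.get, reverse=True)

bm25 = {"doc_a": 12.0, "doc_b": 3.0, "doc_c": 0.5}   # illustrative raw scores
vecs = {"doc_a": 0.55, "doc_b": 0.80, "doc_c": 0.20}
```

Sliding alpha between 0 and 1 flips the top result here — the same experiment you run in Sunday's lab against real BM25 and ChromaDB scores.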
Sunday — Hands-On Lab
  • Build a multi-source document loader (text + web + CSV)
  • Retrieval failure diagnosis lab — trigger and classify all 4 failure types
  • Build hybrid search from scratch — BM25 + ChromaDB with tunable alpha
  • Distance threshold calibration — find the sweet spot empirically
Mini-Project

Build a Document Q&A System — multi-document RAG with citations, hybrid search, threshold calibration, and failure diagnosis

Real-World Workshop

AI Country Intelligence Briefing

Live APIs: REST Countries + Open-Meteo

Multi-source RAG with live weather data, intelligent chunking, conversational retrieval

5
RAG Evaluation & Optimization
Phase 2 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • How do you know if your RAG is good? Evaluation frameworks
  • RAGAS metrics: faithfulness, answer relevancy, context precision
  • LLM-as-Judge: using one LLM to evaluate another
  • Validating your judge — Cohen's kappa, calibration, and reliability
  • Statistical rigor — confidence intervals, minimum sample sizes, paired comparison
  • Cost/latency trade-offs — the missing dimension in RAG optimization
  • Query expansion and HyDE (Hypothetical Document Embeddings)
  • Common RAG failures and how to debug them
  • Live Demo: Evaluating & improving a RAG pipeline end-to-end
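Judge validation is concrete statistics: Cohen's kappa measures how much an LLM judge agrees with human labels beyond chance. A self-contained sketch (the labels are toy data):

```python
from collections import Counter

def cohens_kappa(judge_a, judge_b):
    # Agreement between two raters (e.g. an LLM judge vs human labels),
    # corrected for the agreement you'd expect by chance alone.
    n = len(judge_a)
    observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    ca, cb = Counter(judge_a), Counter(judge_b)
    expected = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return (observed - expected) / (1 - expected)

human = ["good", "good", "bad", "bad"]
llm   = ["good", "bad",  "bad", "bad"]
```

A kappa near 0 means your judge is no better than guessing, regardless of how high its raw agreement looks — the scoring-bias check you run in Sunday's lab.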
Sunday — Hands-On Lab
  • Run RAGAS evaluation on your RAG chatbot — baseline score
  • Implement query expansion — compare retrieval before/after
  • A/B test: chunk size, retrieval method, re-ranking configs
  • Judge validation — compute Cohen's kappa, check for scoring bias
  • Statistical rigor lab — confidence intervals, sample size analysis, paired comparison
Mini-Project

Build a RAG Evaluation Pipeline — baseline scores, optimization experiments, judge validation, statistical rigor, cost/latency tracking

Real-World Workshop

AI Earthquake Analysis

Live APIs: USGS Earthquake

Live seismic data evaluation, retrieval metrics, LLM-as-judge, statistical rigor

6
AI Agents & Function Calling
Phase 3 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • What is an AI agent vs a chatbot — the key difference
  • Function calling / tool use — LLMs interacting with the world
  • Multi-provider tool calling — the common 5-step abstraction (OpenAI, Anthropic)
  • The agent loop pattern: observe → decide → act → repeat
  • Tool design principles — description quality, granularity trade-offs
  • Agent safety — max iterations, cost budgets, loop detection
  • Error handling and guardrails for production agents
  • Live Demo: Multi-tool personal assistant agent
  • Agent design patterns: ReAct, plan-and-execute, router
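The agent loop pattern above fits in a few lines once the LLM's tool-selection step is abstracted away. A sketch where `scripted_decide` stands in for a function-calling API response (in the lab, that step is a real model):

```python
# Minimal agent loop with safety guards. TOOLS and the decide function
# are illustrative stand-ins for real tools and a real LLM call.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def run_agent(decide, max_iterations=5):
    history = []
    for _ in range(max_iterations):           # safety guard: bounded iterations
        action = decide(history)              # observe → decide
        if action["type"] == "final":
            return action["answer"]
        tool = TOOLS.get(action["tool"])
        if tool is None:                      # guardrail: unknown tool name
            history.append({"error": f"no such tool: {action['tool']}"})
            continue
        history.append({"tool": action["tool"],
                        "result": tool(*action["args"])})   # act
    return "stopped: max iterations reached"  # runaway-loop fallback

def scripted_decide(history):
    if not history:
        return {"type": "tool", "tool": "add", "args": (2, 3)}
    return {"type": "final", "answer": f"sum is {history[-1]['result']}"}
```

Swapping `scripted_decide` for an OpenAI or Anthropic tool-calling response is exactly the multi-provider exercise in Sunday's lab.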
Sunday — Hands-On Lab
  • Build single-tool and multi-tool agents from scratch
  • Implement the agent loop with error handling and safety guards
  • Tool description quality experiment — measure selection accuracy
  • Multi-provider comparison — same agent on OpenAI and Anthropic
Mini-Project

Build a Personal Assistant Agent — 6 tools, cost tracking, safety guards, multi-provider comparison

Real-World Workshop

AI Space Tracker Agent

Live APIs: Open Notify (ISS) + Sunrise-Sunset + Open-Meteo

Real-time ISS tracking, multi-tool agent, tool design experiments

7
LangGraph & Agentic Workflows
Phase 3 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • The graph abstraction — why workflows need nodes, edges, and state
  • Pure Python graph executor — understand what frameworks do under the hood
  • LangGraph: nodes, edges, conditional routing, state management
  • Human-in-the-loop: agents that ask for approval
  • Agent memory: short-term (conversation) vs long-term (persistent)
  • Self-reflection pattern — generate, evaluate, revise cycles
  • Graph anti-patterns — god nodes, infinite cycles, state explosion
  • Live Demo: Multi-step research agent with branching logic
  • Comparison: LangChain agents vs LangGraph vs CrewAI
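The pure Python graph executor from the session fits in one screen: nodes are functions over a shared state dict, and edges may be plain strings or conditionals. A sketch (node names and the scoring logic are illustrative):

```python
# A pure-Python graph executor — the abstraction LangGraph builds on.
END = "__end__"

def run_graph(nodes, edges, state, start, max_steps=20):
    current = start
    for _ in range(max_steps):                # guard against infinite cycles
        state = nodes[current](state)         # node returns updated state
        nxt = edges[current]
        current = nxt(state) if callable(nxt) else nxt   # conditional routing
        if current == END:
            return state
    raise RuntimeError("max steps exceeded — possible cycle")

nodes = {
    "draft":  lambda s: {**s, "text": s["topic"] + " draft", "score": 0},
    "review": lambda s: {**s, "score": s["score"] + 1},
}
edges = {
    "draft": "review",
    # Conditional edge: keep revising until the score passes a threshold —
    # a tiny generate/evaluate/revise (self-reflection) cycle.
    "review": lambda s: END if s["score"] >= 2 else "review",
}

result = run_graph(nodes, edges, {"topic": "RAG"}, start="draft")
```

The `max_steps` guard is the defense against the infinite-cycle anti-pattern covered Saturday; without it, a bad conditional edge loops forever.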
Sunday — Hands-On Lab
  • Build your first LangGraph workflow — 3-step research pipeline
  • Add conditional routing — agent decides which path to take
  • Implement human-in-the-loop — agent pauses for approval
  • Build a pure Python graph executor — understand the abstraction
  • Anti-pattern lab — identify and fix god nodes, infinite cycles, state explosion
  • Implement self-reflection — generate, evaluate, revise with scoring
Mini-Project

Build a Content Pipeline Agent — research, draft, review cycles with self-reflection, anti-pattern awareness, and graph design patterns

Real-World Workshop

AI Content Publishing Pipeline

Live APIs: Wikipedia + Hacker News + Quotable

LangGraph editorial workflow, self-reflection review cycles, checkpointing

8
Multi-Agent Systems & MCP
Phase 3 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • Why multi-agent systems — single agent vs specialized team
  • Multi-agent architectures: supervisor, sequential, hierarchical, swarm
  • Coordination strategies — supervisor vs debate vs consensus vs round-robin
  • Supervisor pattern with LangGraph — coordinator + specialized workers
  • Workers with tools — ReAct agents as sub-graphs
  • Multi-agent failure modes — cascading failures, delegation loops, conflicting outputs
  • Model Context Protocol (MCP) — the universal tool standard, JSON-RPC under the hood
  • Raw MCP messages — understanding initialize, tools/list, tools/call
  • Building MCP servers with FastMCP and connecting to agents
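Under the hood, the three MCP messages above are plain JSON-RPC 2.0. A sketch of what you write by hand in Sunday's exercise (the protocol version, client info, and the `search_news` tool name are illustrative):

```python
import itertools
import json

# Hand-built JSON-RPC 2.0 messages in the shape MCP uses.
_ids = itertools.count(1)

def rpc(method, params=None):
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg

initialize = rpc("initialize", {
    "protocolVersion": "2025-03-26",                     # illustrative value
    "clientInfo": {"name": "course-lab", "version": "0.1"},
})
list_tools = rpc("tools/list")
call_tool  = rpc("tools/call", {"name": "search_news",   # hypothetical tool
                                "arguments": {"query": "LLM agents"}})

wire = json.dumps(call_tool)   # what actually goes over the transport
```

FastMCP generates and parses these messages for you; writing them once by hand is what makes the framework legible.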
Sunday — Hands-On Lab
  • Build supervisor + workers multi-agent system from scratch
  • Create a custom MCP server with 5 tools and connect it to agents
  • Raw MCP messages exercise — write JSON-RPC by hand
  • Multi-agent failure mode detection and fixing
  • Coordination strategy comparison — supervisor vs round-robin
Mini-Project

Build a Multi-Agent NovaTech System — supervisor + specialized workers + MCP tool server, with failure recovery and coordination strategy comparison

Real-World Workshop

AI OSINT Intelligence Team

Live APIs: GitHub + Hacker News + Wikipedia

Multi-agent competitive intelligence, MCP server, supervisor pattern

9
Fine-Tuning & LLM Security
Phase 4 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • When to fine-tune: prompting vs RAG vs fine-tuning decision framework
  • Fine-tuning pitfalls — catastrophic forgetting, overfitting, cost-benefit analysis
  • LoRA & QLoRA: fine-tuning with limited compute
  • Why LoRA works — low-rank weight updates, rank selection, forgetting prevention
  • Dataset preparation for instruction tuning
  • Live Demo: Fine-tune a small model with Unsloth on Colab
  • LLM security: prompt injection, jailbreaks, data leakage
  • Alignment as the first defense — RLHF, Constitutional AI, guardrails frameworks
  • Guardrails: input/output validation, topic filtering, OWASP Top 10
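A first-pass input guardrail can be as simple as pattern matching on known injection phrasings. A toy sketch (production guardrails layer checks like this with an LLM classifier — this alone is easy to evade, as the red team lab demonstrates):

```python
import re

# Toy input guardrail: flag common prompt-injection phrasings before
# the text ever reaches the LLM. Patterns are illustrative, not exhaustive.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"you are now",
    r"reveal .*system prompt",
    r"disregard .*rules",
]

def is_suspicious(user_input):
    text = user_input.lower()
    return any(re.search(p, text) for p in INJECTION_PATTERNS)

def guarded_query(user_input, answer_fn):
    if is_suspicious(user_input):               # input validation layer
        return "Request blocked by input guardrail."
    return answer_fn(user_input)                # only clean input reaches the LLM
```

In the red team lab, one pair attacks this layer while the other hardens it — which is how you discover why defense-in-depth beats any single filter.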
Sunday — Hands-On Lab
  • Prepare a fine-tuning dataset in chat format
  • Fine-tune a small LLM (Llama 3.2 1B) with QLoRA on Colab
  • Compare base model vs fine-tuned model — with catastrophic forgetting check
  • Add input guardrails to your RAG app — detect prompt injection
  • Red team lab — attack and defend prompt injection in pairs
Mini-Project

Fine-tune a model + add guardrails to RAG app, with forgetting checks, held-out validation, and red team exercise. Capstone kickoff.

Real-World Workshop

AI Security Red Team Lab

Live APIs: NVD (NIST CVE database)

Real vulnerability data, layered defenses, automated red team attacks

10
Capstone Project & Graduation
Phase 4 · 6 hours (Sat + Sun)
Saturday — Concepts & Demos
  • The journey so far — 9 weeks in review, the full AI toolkit
  • Capstone architecture walkthrough — layered system design
  • Framework independence — architecture-first thinking, the portability test
  • Advanced orchestration — self-reflection, parallel execution, human-in-the-loop
  • Production deployment — FastAPI, monitoring, observability, graceful degradation
  • Instructor-guided capstone work session
  • 1-on-1 mentoring and code reviews
Sunday — Hands-On Lab
  • Student Presentations — live demo + architecture walkthrough (10 min each)
  • Framework-independent architecture description exercise
  • Production readiness review — latency tracking, deployment config
  • Peer feedback and Q&A
  • Certificate ceremony
  • Next steps: how to keep learning, community access
Mini-Project

Capstone — NovaTech AI Assistant with self-reflection workflows, framework-independent architecture, and production deployment sketch

Real-World Workshop

Build Your Own AI Startup MVP

Live APIs: Student's choice (all prior APIs)

Choose from 5 startup ideas, combine all course techniques, investor pitch

Every Week

Real-World Workshops

Every week includes a supplemental workshop using live APIs — no mock data, no tutorials, real production data that changes with every run. Each includes an optional Challenge D for production-tier implementations.

1
AI News Intelligence Dashboard
Hacker News

Fetch live tech news, AI-powered categorization, audience-adapted briefings, token economics at scale

2
AI Recipe Transformer
TheMealDB

Dietary adaptation, cuisine fusion, chain-of-thought meal planning, structured nutrition extraction

3
AI Book Recommendation Engine
Open Library

Semantic search over real books, metadata filtering, RAG-powered reading recommendations

4
AI Country Intelligence Briefing
REST Countries + Open-Meteo

Multi-source RAG with live weather data, intelligent chunking, conversational retrieval

5
AI Earthquake Analysis
USGS Earthquake

Live seismic data evaluation, retrieval metrics, LLM-as-judge, statistical rigor

6
AI Space Tracker Agent
Open Notify (ISS) + Sunrise-Sunset + Open-Meteo

Real-time ISS tracking, multi-tool agent, tool design experiments

7
AI Content Publishing Pipeline
Wikipedia + Hacker News + Quotable

LangGraph editorial workflow, self-reflection review cycles, checkpointing

8
AI OSINT Intelligence Team
GitHub + Hacker News + Wikipedia

Multi-agent competitive intelligence, MCP server, supervisor pattern

9
AI Security Red Team Lab
NVD (NIST CVE database)

Real vulnerability data, layered defenses, automated red team attacks

10
Build Your Own AI Startup MVP
Student's choice (all prior APIs)

Choose from 5 startup ideas, combine all course techniques, investor pitch

Week 10

Capstone Project

Choose one of these projects — or propose your own. You'll build it, deploy it, and present it live on Demo Day.

AI Research Assistant

RAG + Agents + Web Search + Summarization

Customer Support Bot

RAG + Guardrails + Multi-turn Conversation

Code Review Agent

Agents + Tool Use + GitHub Integration

Document Intelligence System

Advanced RAG + Multi-modal + Evaluation

Multi-Agent Content Studio

Multi-Agent + MCP + External APIs

+ Propose your own project

What You Walk Away With

📂

18 Projects on GitHub

8 guided mini-projects + 10 real-world workshops with live APIs, each portfolio-ready with README

🚀

1 Deployed Capstone

A full-stack GenAI app — deployed, demo-ready, and shareable

💻

bsigma.ai Platform Access

2 workspaces, 10 notebooks each. Top up credits for more power anytime.

🎓

Certificate of Completion

Official bsigma.ai certificate to showcase on LinkedIn

👥

Private Community Access

Alumni community for ongoing support, networking, and job leads

🎬

Lifetime Session Recordings

All 20 sessions recorded — revisit any topic anytime

Prerequisites

  • Comfortable with Python (variables, functions, loops, dictionaries)
  • Basic understanding of APIs (what a REST API is, HTTP requests)
  • A laptop with internet access
  • No ML/AI background required — we start from zero on GenAI concepts
  • No GPU or expensive hardware needed — everything runs on our platform

Tools & Platforms Used

bsigma.ai
Primary lab environment
LangChain
Agent framework
LangGraph
Stateful workflows
CrewAI
Multi-agent systems
ChromaDB
Vector database
Ollama
Local LLMs
HuggingFace
Models & deployment
FastAPI
API deployment
Hacker News API
Live tech news
GitHub API
Repository analytics
USGS / Open-Meteo
Earthquake & weather
10+ Live APIs
Real-world data sources

Ready to Build with GenAI?

Batch 1 starts March 2026. Limited to 25 seats.