LLM Engineering & AI Systems

Building intelligence that performs under pressure.

Custom RAG pipelines, fine-tuned models, and production ML systems on AWS Bedrock and SageMaker — battle-tested on enterprise data with measurable results.

From Michelin-starred kitchens to Fortune 500 AI systems, I bring the same obsessive precision to language models that I brought to the pass. The details are where a system holds up or breaks.

Latest Evaluation RUN #4412

service: rag-pipeline

model: claude-ft · v3

status: running

recall@10: 0.00

hallucination: 0.0%

p95 latency: 0 ms

Bedrock Lambda RAG

Every model ships measured — or it doesn't ship.

6× Retrieval Precision Improvement

80%+ LLM Projects Fail in Production

F500 Enterprise Experience

About

The same discipline, a different kitchen.

RAG systems that hallucinate under real queries. Fine-tuned models that ace benchmarks but drift in production. Bedrock deployments where nobody measured anything. The same pattern keeps showing up.

I started in technology, then spent years training in Michelin-starred kitchens where precision and flawless execution under pressure weren't optional — they were survival.

That taught me complex systems either work exactly as designed, or they fail at the worst possible moment.

When I returned to tech and went deep into LLM engineering, I found the same gap: teams shipping models that impressed in demos and collapsed on real data. The tooling had changed — Bedrock, SageMaker, vector databases, fine-tuning pipelines — but the discipline around evaluation, monitoring, and production readiness hadn't kept up.

I build RAG pipelines, fine-tuned domain models, and production ML systems on AWS that teams can actually maintain. The goal is always the same: make moving fast and building reliably the same thing — not competing priorities.

Capabilities

What I build.

Scalable AI infrastructure. Measurable performance. Production-ready from day one.

Retrieval-Augmented Generation

Pipelines that surface the right information at the right time, at scale — vector-database architecture, embedding optimization, hybrid semantic + keyword retrieval, and disciplined context-window and chunking strategies.
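
One common way to combine semantic and keyword results is reciprocal rank fusion (RRF). The sketch below is illustrative only: this page does not say which fusion method is used, and the function name, the document IDs, and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a keyword search.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF works on ranks instead of raw scores, it avoids having to normalize vector similarities against keyword scores, which is one reason it is a popular default for hybrid retrieval.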

Fine-Tuning & Model Adaptation

Domain-adapted models using LoRA and QLoRA, evaluated against real-world benchmarks and edge cases, with curated training data and A/B testing frameworks that prove performance instead of asserting it.

AWS Bedrock

Production LLM deployment and infrastructure — prompt engineering, guardrails for safety and compliance, and cost and usage monitoring that keeps spend predictable under real traffic.

SageMaker Pipelines

End-to-end ML workflows with full observability — model monitoring, drift detection, automated retraining triggers, and feature stores with data versioning teams can maintain.

Selected Work

Production systems, end to end.

Alex — Multi-Agent Equities Research Planner

Ask for an analysis of a portfolio and a coordinated team of agents goes to work. A planner decomposes the request and fans it out over an SQS queue to specialized agents: a researcher that runs semantic search over financial research, plus tagging, report-writing, charting, and retirement-projection agents. It then assembles their output into a single structured brief (a written research report, generated charts, and a long-horizon retirement projection), end to end in roughly 36 seconds. The retrieval layer deliberately pairs two modes so the agents reason over both prose and numbers: vector RAG (SageMaker-served embeddings indexed in S3 Vectors) over narrative research, and structured SQL over an Aurora PostgreSQL store of instrument data. Every agent runs as a container-image Lambda that calls Amazon Bedrock (Nova Pro), the entire stack is defined in Terraform, and it's fronted by an authenticated Next.js app behind CloudFront. It is built scale-to-zero by design (Aurora Serverless auto-pauses, the SageMaker endpoint is serverless, Lambdas idle at zero), so a production-grade agent orchestra costs next to nothing at rest. A capstone demonstration of agentic architecture and AWS deployment discipline, not licensed financial advice.
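
The planner's fan-out and assembly steps can be sketched in a few lines. This is a simplified, in-process illustration and not the project's code: a standard-library `queue.Queue` stands in for SQS, plain functions stand in for the Bedrock-backed Lambda agents, and every name here (`AGENTS`, `plan`, `run`) is hypothetical.

```python
import json
import queue

# Hypothetical stand-ins for the specialized agents. In the real system
# each one would be a separate Lambda that calls a model.
AGENTS = {
    "researcher": lambda req: f"research notes for {req}",
    "tagger": lambda req: f"tags for {req}",
    "reporter": lambda req: f"written report for {req}",
    "charter": lambda req: f"chart specs for {req}",
    "retirement": lambda req: f"retirement projection for {req}",
}


def plan(request):
    """Decompose one request into one job per specialized agent."""
    return [{"agent": name, "request": request} for name in AGENTS]


def run(request):
    """Fan jobs out over a queue, then assemble results into one brief."""
    jobs = queue.Queue()  # stand-in for an SQS queue
    for job in plan(request):
        jobs.put(json.dumps(job))  # messages travel as JSON strings

    brief = {}
    while not jobs.empty():
        job = json.loads(jobs.get())
        brief[job["agent"]] = AGENTS[job["agent"]](job["request"])
    return brief


brief = run("portfolio-123")
print(sorted(brief))
```

Serializing each job as JSON mirrors how a message would cross a real queue boundary, so the agents stay decoupled from the planner's in-memory objects.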

Multi-Agent · AWS Bedrock · Terraform

di4health — Public Health Intelligence Assistant

Ask a plain-English question, such as "Compare obesity rates in Cook County and Harris County" or "What has the CDC reported on recent measles outbreaks?", and get an evidence-backed answer in seconds. Under the hood, a multi-tool agent decides how to answer: querying CDC PLACES statistics across 3,000+ U.S. counties, pulling CDC WONDER mortality data, or running semantic search over CDC MMWR outbreak reports. It deliberately pairs two retrieval styles (structured SQL over tabular health data and vector RAG over narrative reports), then returns a structured intelligence brief with the numbers, sources, and caveats spelled out, rendered beside a live chat thread. The app is authenticated and deployed end to end (Next.js + FastAPI agent), and every response is grounded in real data and names the tools it used. Built to demonstrate a production-grade agentic RAG system with typed output contracts, fail-fast startup, and graceful degradation, not a demo that dazzles once and breaks on the second question.
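
A typed output contract with graceful degradation might look roughly like the sketch below. It is an assumption-laden illustration, not the di4health implementation: the `Brief` fields, `build_brief`, and the two tool functions are hypothetical names, and the tool bodies are placeholders that return no real data.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Brief:
    """Typed output contract: every answer carries its evidence trail."""
    answer: str
    tools_used: list[str]
    caveats: list[str]

    def __post_init__(self):
        # Reject briefs that are not grounded in at least one tool result.
        if not self.answer.strip():
            raise ValueError("answer must not be empty")
        if not self.tools_used:
            raise ValueError("a brief must name the tools it used")


def build_brief(tools: dict[str, Callable[[], str]]) -> Brief:
    """Run each tool; a failing tool becomes a caveat instead of a crash."""
    findings, used, caveats = [], [], []
    for name, tool in tools.items():
        try:
            findings.append(tool())
            used.append(name)
        except Exception as exc:
            caveats.append(f"{name} unavailable: {exc}")
    return Brief(answer=" ".join(findings), tools_used=used, caveats=caveats)


# Hypothetical stand-ins for a SQL tool and a vector-search tool.
def places_sql() -> str:
    return "placeholder county statistic"


def mmwr_vector_search() -> str:
    raise TimeoutError("index timed out")


brief = build_brief({"places_sql": places_sql,
                     "mmwr_vector_search": mmwr_vector_search})
print(brief.tools_used, brief.caveats)
```

In this sketch a single failed tool degrades the answer and surfaces as a caveat, while a brief with no successful tool at all is rejected by the contract, so an ungrounded answer never reaches the user.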

Agentic RAG · Vector Search · FastAPI

AI Digital Assistant

An interactive digital twin powered by Amazon Bedrock and AWS Lambda. Ask questions and learn about my background, experience, and approach to LLM engineering.

Bedrock · Lambda · RAG

Molecular Toxicity Screening Platform

Paste in a SMILES string and screen a compound against 12 Tox21 toxicity endpoints in seconds — nuclear receptor and stress-response pathways like NR-AR, NR-ER, and SR-p53. A fine-tuned ChemBERTa transformer drives the predictions, paired with an RDKit-powered ADMET profile: Lipinski and Veber drug-likeness rules, PAINS alerts, and key molecular descriptors. Results roll up into a weighted composite risk score with Low/Moderate/High tiers and a rendered 2D structure. Built to demonstrate a production-minded ML workflow — domain model fine-tuning, explainable outputs, and confidence-aware scoring — not to replace the wet lab.
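
To make the scoring idea concrete, here is a minimal sketch of a Lipinski rule-of-five check and a weighted composite risk score. The rule-of-five thresholds are the standard ones; everything else is assumed for illustration, including the endpoint weights, the tier cut-offs (0.3 and 0.6), the input values, and the function names. Descriptors are passed in directly so the example runs without RDKit.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float   # daltons
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of Lipinski's rule of five."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def composite_risk(probs, weights, moderate=0.3, high=0.6):
    """Weighted mean of endpoint probabilities, mapped to a tier."""
    total_weight = sum(weights[name] for name in probs)
    score = sum(p * weights[name] for name, p in probs.items()) / total_weight
    if score >= high:
        tier = "High"
    elif score >= moderate:
        tier = "Moderate"
    else:
        tier = "Low"
    return score, tier


# Hypothetical inputs: three of the twelve endpoints, made-up probabilities.
probs = {"NR-AR": 0.1, "NR-ER": 0.2, "SR-p53": 0.6}
weights = {"NR-AR": 1.0, "NR-ER": 1.0, "SR-p53": 2.0}
score, tier = composite_risk(probs, weights)

# Hypothetical descriptor values; only logP exceeds its threshold.
compound = Descriptors(mol_weight=320.0, logp=5.6, h_donors=2, h_acceptors=4)
violations = lipinski_violations(compound)
print(round(score, 3), tier, violations)
```

A weighted mean keeps the score on the same 0 to 1 scale as the individual endpoint probabilities, which makes the tier thresholds easy to explain.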

ChemBERTa · RDKit · ADMET

Let's build something that holds up in production.
