AI

From designing production-grade RAG pipelines to evaluating frontier models for enterprise use cases, my work is grounded in real engineering and practical outcomes, not hype.

Focus Areas

Retrieval-Augmented Generation — RAG pipelines, chunking strategies, embedding selection, reranking, and evaluation frameworks.

Agentic Systems — Multi-step agents with tool use, state management, memory, and orchestration.

Model Evaluation — Rigorous eval frameworks for benchmarking LLMs against specific use cases and detecting regressions.

Enterprise AI Adoption — Governance, change management, and building internal AI capability from scratch.

Fine-tuning & Alignment — Instruction tuning and RLHF-style techniques; knowing when fine-tuning beats prompting.

Responsible AI — Hallucination mitigation, bias detection, prompt injection guardrails, and production monitoring.

Models & Tools

Category	Tools
Models	Claude, GPT-4 / o-series, Gemini, Llama, Mistral
Frameworks	LangChain, LangGraph, LlamaIndex, Hugging Face
Infrastructure	Pinecone, Qdrant, pgvector, Amazon Bedrock, Azure OpenAI
Evals & Ops	Braintrust, LangSmith, PromptLayer

Rapid prototyping with Claude
October 2025
Claude now does the coding around here
September 2025
The resistance to LLM coding is part of a software cultural tradition
September 2025
The role of software engineers in a generative world
June 2025