About me

I work with teams whose agents perform well in controlled testing but struggle in production. I focus on what happens after the demo works, when real traffic, edge cases, and adversarial inputs hit the system: eval design, reliability architecture, failure-mode analysis, and post-incident remediation. I work especially well with teams in financial services, insurance, and other domains where “mostly works” isn’t acceptable.