Aishwarya Naresh Reganti + Kiriti Badam
But if you make a bunch of practitioners sit together and ask them, "Is it important to build an actionable feedback loop for AI products?" I think all of them will agree, covering AI product work, product design, and team leadership.
Episode
Aishwarya Naresh Reganti + Kiriti Badam
Summary
Two AI practitioners (one from Anthropic, one from OpenAI) who are also married discuss the fundamental differences between building AI products versus traditional software, specifically around non-determinism, the agency-control tradeoff, and why most teams fail by chasing hype over problems. They walk through a framework for shipping AI products successfully — starting small, building feedback flywheels, and developing robust evals.
Key Takeaways
AI products have two layers of non-determinism: you don't know how the user will behave, and you don't know how the LLM will respond. Design your system to handle both.
Every agentic system involves an agency-control tradeoff — the more decision-making you hand to AI, the less control you have. Start with narrow, well-defined tasks before expanding.
Build your feedback flywheel before scaling: companies winning with AI built the right infrastructure to improve over time (evals, data loops, human oversight).
'Pain is the new moat' — the institutional knowledge from iterating through what works and doesn't with AI is harder to replicate than any feature.
80% of successful AI engineers spend their time understanding customer workflows and looking at data, not building the fanciest architecture.
Notable Quotes
“I think that will be a huge problem once systems go mainstream. We're still so busy building AI products that we're not worried about security, but it will be such a huge problem to kind of, especially with this non-deterministic API again. So you're kind of stuck because there are tons of instructions that you could inject within your prompt and then it's going really bad.”
“You apply fixes for issues that you see, but you also design newer evaluation metrics to figure out that they are emerging patterns. And that doesn't mean you should always design evaluation metrics. There are some errors that you can just fix and not really come back to because they're very spot errors. For instance, there's a tool calling error just because your tool wasn't defined well and stuff like that. You can just fix it and move on. And this is pretty much how an AI product lifecycle would look like. But what we specifically also mention is while you're going through these iterations, try to think of lower agency iterations in the beginning and higher control iterations. What that means is constrain the number of decisions your AI systems can make and make sure that they're humans in the loop and then increase that over time because you're kind of building a flywheel of behavior and you're understanding what kind of use cases are coming in or how your users are using the system.”
“So you can kind of determine which of these use cases should go through that human and the loop layer versus which of the use cases AI can conveniently handle. And then all through this process, you're also logging what the human is doing because you want to build a flywheel that you could use in order to improve your system. So you're essentially not showing the user experience, not eroding trust, at the same time logging what humans would otherwise do so that you can continuously improve your system.”