Worked Example: Disagreements with ML Engineers
A walkthrough of a complete answer to 'How do you handle disagreements with ML engineers?' that demonstrates technical empathy and influence without authority.
The Question
The prompt: 'Tell me about a time you had a significant disagreement with an engineer (or ML engineer) about how to approach a problem. How did you handle it?' This question tests your ability to collaborate with technical teams, one of the most critical skills for an AI PM. The AI PM role demands deeper technical collaboration than traditional PM roles because AI products involve more ambiguity, longer iteration cycles, and decisions where both product judgment and technical expertise matter.
The trap is telling a story where you 'won' the disagreement by convincing the engineer you were right. Interviewers are not looking for evidence that you can override technical decisions. They are looking for evidence that you can engage productively with technical perspectives, update your position when the evidence warrants it, and reach a decision that accounts for both product and technical considerations.
Worked Answer: A Strong Response
"Situation: I was working on a content moderation product. Our ML engineer recommended building a custom fine-tuned model for detecting policy-violating content. This would take 3 months and require a dedicated annotation team to label training data. I wanted to use a pre-trained content safety API (from a major cloud provider) with custom rules on top, which we could ship in 3 weeks."
"Task: I needed to decide the approach with my engineering counterpart. The stakes were high: we had a regulatory deadline in 6 weeks, and our current content moderation was mostly manual review, which was not scaling."
"Action: Instead of debating in a meeting, I suggested we spend one week doing a structured comparison. We agreed on three evaluation criteria: accuracy on our specific content types (measured against 500 labeled examples from our recent manual reviews), latency, and cost per moderation decision. I created the evaluation framework, and the ML engineer ran both approaches against it."
"The results surprised us both. The pre-trained API performed well on common content types (95% accuracy) but poorly on our niche policy violations that were specific to our platform (62% accuracy). The custom model, even in its early training state, performed better on niche violations (78%) but was worse on common types (88%) and would take the full 3 months to reach production quality. Neither approach alone met our needs."
[Interviewer note: The candidate set up a data-driven comparison rather than debating opinions. This is the right approach for PM-engineer disagreements. The evaluation framework with specific criteria shows structured thinking.]
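The structured comparison the candidate describes can be sketched in a few lines. This is a hypothetical minimal harness, not the team's actual tooling: the function names, toy predictors, and labeled examples are all illustrative assumptions. The point is the shape of the evaluation, which is scoring each approach against the same labeled set, broken out by content category, so the disagreement is settled by data rather than opinion.

```python
from collections import defaultdict

def evaluate(predict, labeled_examples):
    """Score a moderation approach against labeled examples,
    reporting accuracy per content category."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in labeled_examples:
        total[ex["category"]] += 1
        if predict(ex["text"]) == ex["is_violation"]:
            correct[ex["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}

# Illustrative labeled set: a mix of common and niche policy categories.
examples = [
    {"text": "spam link", "category": "common", "is_violation": True},
    {"text": "subtle niche violation", "category": "niche", "is_violation": True},
    {"text": "benign post", "category": "common", "is_violation": False},
]

# Toy stand-ins for the two approaches under comparison.
pretrained_api = lambda text: "spam" in text       # strong on common patterns
custom_model = lambda text: "violation" in text    # strong on niche patterns

print(evaluate(pretrained_api, examples))  # accuracy by category
print(evaluate(custom_model, examples))
```

Breaking accuracy out by category is what surfaced the "neither approach alone" result in the story: a single aggregate number would have hidden the 62% niche accuracy behind the 95% common accuracy.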
Worked Answer: Resolution and AI Insight
"We ended up with a hybrid approach that neither of us had originally proposed. We used the pre-trained API for the common content types (where it excelled) and built a simpler, targeted classifier for our niche policy violations using a fine-tuned version of a smaller model. This shipped in 5 weeks, just under our deadline. The ML engineer owned the niche classifier, and I owned the product integration and rules layer."
"Result: The hybrid system caught 91% of policy violations (up from 60% with manual review alone) and reduced manual review volume by 70%. The niche classifier continued to improve post-launch as we fed it more labeled data from the cases the API missed. Six months later, we had enough data to revisit whether a fully custom model would outperform the hybrid, and the answer was: only marginally, and not enough to justify the migration cost."
"AI Insight: This taught me that PM-ML engineer disagreements are often both-right situations. The engineer was right that a custom model would be better for our specific use case. I was right that we needed to ship faster than a custom model allowed. The solution was to decompose the problem: use off-the-shelf for what it is good at and custom for what requires our domain expertise. In AI product development, the 'build vs. buy' question is rarely binary. It is almost always a hybrid, and the PM's job is to find the seam where the hybrid makes sense."
[Interviewer note: The resolution is collaborative, not adversarial. The candidate did not 'win.' They found a better answer by combining perspectives. The AI Insight about hybrid approaches to build vs. buy is a genuine lesson that interviewers at companies like Anthropic and Google hear from experienced AI PMs. Score: 4.5/5. To reach 5/5, the candidate could have described how this experience changed their default approach to technical disagreements going forward.]
What Interviewers Watch For
There are four signals interviewers evaluate in disagreement stories. First, respect for technical expertise. Did you treat the engineer's perspective as legitimate, or did you override it with product authority? The best stories show genuine intellectual curiosity about the engineer's viewpoint. Second, data-driven resolution. Did you resolve the disagreement with data, not with hierarchy? Setting up a structured evaluation is the gold standard.
Third, influence without authority. AI PMs do not have authority over ML engineers. You cannot mandate a technical approach; you influence through evidence, empathy, and shared goals. Fourth, an outcome that improved because of the disagreement. The best stories end with a solution better than either person's original proposal, showing that productive disagreement leads to better products.
Red flags in disagreement stories: 'I convinced the engineer to do it my way.' 'I escalated to the engineering manager.' 'I was right and they came around eventually.' These signal that the candidate does not collaborate well with technical teams. In AI PM interviews specifically, collaboration with ML engineers is a non-negotiable requirement.
Key Takeaways
- Disagreement stories should show collaboration, not winning. The best resolution is one that neither person originally proposed
- Resolve PM-engineer disagreements with data: set up an evaluation framework and let the results guide the decision
- AI PM disagreements are often 'both-right' situations. Decompose the problem to use the best approach for each component
- Show respect for technical expertise. AI PMs who override ML engineers lose credibility and build worse products
- Red flags: 'I convinced them,' 'I escalated,' 'I was right.' Green flags: 'We evaluated together,' 'The result was better than either proposal,' 'I learned something from the engineer's perspective'