Last month I asked an AI to help me write a piece on a labor economics paper. It produced a clean summary with a Stanford economist’s quote and a real-looking journal citation. I dropped it into the draft.
A reader emailed me four days later. The economist existed. The paper did not. The quote was a polished sentence that no human had ever written.
I had read it the way you read a colleague’s email assuming the basic facts were real. They weren’t. I had fact-checked the parts I was suspicious of and ignored the parts that sounded most authoritative.
That was the cost of not knowing where the tool gets quiet and where it gets confident.
Are you running your business on incomplete numbers?
Most small business owners have financials, but few have financial clarity. There's a real difference between books that are technically up to date and books that actually tell you what's going on in your business right now. When accounting is reactive — updated when there's time, reviewed at tax season — you lose visibility exactly when you need it most. You can't tell which clients are truly profitable. You can't spot a cash flow gap before it becomes a crisis. BELAY's outsourced accounting team changes that.
What the BCG study found
In 2023, Harvard and BCG ran one of the cleanest experiments on AI in knowledge work to date. They gave 758 BCG consultants-the people paid to be analytically sharp-eighteen realistic tasks. Some used GPT-4. Some didn’t.
On tasks the AI was good at, consultants using GPT-4 were 12% more likely to finish, 25% faster, and produced output rated 40% higher in quality.
Then a different kind of task-designed to fall outside the AI’s competence. A retail strategy problem with subtle inconsistencies between interview notes and financial data.
The control group, working without AI, got it right 84% of the time. With AI, accuracy fell to 60-70%. The same tool, with the same people, on a similar-looking task-made them measurably worse.
The researchers called it the jagged frontier. Inside the line, the AI is a strong collaborator. A few inches outside, it’s confidently wrong. The line is invisible-you can’t tell which side you’re on from the output.
Aisha, an analyst I worked with this spring, hit this. She used Claude to draft a market sizing for a pitch. The numbers looked clean, the logic looked tight. Two of the underlying assumptions were off by an order of magnitude. She caught it because a senior partner asked one question she hadn’t thought to ask.
She didn’t stop using AI. She changed what she trusted it for: synthesis and first drafts, yes; anything numerical in a deck, never without independent verification.
What happens when you push back
A follow-up paper from the same team, published in March 2026, looked at what happens when consultants push back on AI output.
They analyzed GPT-4 activity logs from over 70 consultants who tried to fact-check the AI on the outside-frontier task.
When professionals pointed out errors and pressed the AI to reconsider, it didn’t acknowledge its limits-it escalated its persuasion.
It apologized, corrected, then restated its position with more supporting data, dressed in structured reasoning that made the flawed recommendation look grounded.
The implication is uncomfortable: arguing with the AI doesn’t validate it. The AI will win the argument. The only safe move is checking against something it didn’t generate.
Want to get the most out of ChatGPT?
ChatGPT is a superpower if you know how to use it correctly.
Discover how HubSpot's guide to AI can elevate both your productivity and creativity to get more things done.
Learn to automate tasks, enhance decision-making, and foster innovation with the power of AI.
What to do this week
Pick one piece of AI-assisted work you’ll send this week-a memo, a model, a draft. Identify the three claims most embarrassing to get wrong. Check them against a source the AI didn’t touch. Not by re-asking the AI-by opening a primary source, a database, or a person.
That’s it. You don’t need to distrust the tool-just know where its confidence is hollow.
The original Dell’Acqua paper is here.
—Prompt N Productive—



