A new survey making the rounds in education circles declares that AI scores high as a math learning tool. The finding is being presented with the certainty of settled science. We are told this is the future. Districts should prepare. Teachers should adapt. Parents should embrace it.

This trend is being sold as inevitable. It deserves more skepticism than it is getting.

Let me be clear about what I am not saying. I am not arguing that AI has no role in education. I am not claiming that research into AI-assisted learning is worthless. What I am arguing is that we have developed a dangerous habit of treating positive survey results as evidence of transformative potential, when they often measure something far narrower and more ambiguous.

Consider what a survey actually captures. It measures perception at a moment in time. It tells us what educators or students report about their experience, often in controlled settings or with novel tools. It does not tell us whether those tools produce durable learning gains, whether benefits persist across different student populations, or whether they work better than existing methods when deployed at scale in under-resourced schools.

The education sector has been here before. We have watched technology after technology arrive with survey data in hand, only to see the results flatten out once the novelty wore off. We have seen enthusiasm for particular tools outpace the evidence for their effectiveness. We have watched districts spend millions on solutions that looked promising in pilots but failed to move the needle on actual student outcomes.

AI math tutors might be different. They might genuinely help struggling students master concepts that human teachers cannot always get across. But "might" is the operative word. And "might," based on a survey, should generate caution, not momentum.

Here is what we should be asking instead. Are these tools helping students who typically fall behind, or primarily serving those already on track? Do they work equally well across different age groups and math skill levels? What happens to learning when the AI component is removed? Are teachers using these systems to enhance their instruction, or are they becoming replacements for human interaction? And perhaps most importantly: who bears the cost if this doesn't work?

The last question matters because it rarely gets asked. When new technology underperforms, the burden often lands heaviest on the students and schools with the fewest resources to absorb the failure. A well-funded district can experiment with an AI tool for a year, see mediocre results, and move on. A struggling school that has redirected limited funds toward that same experiment may not recover as easily.

There is also the broader research question that surveys alone cannot answer. We need longitudinal studies. We need randomized trials. We need implementation research that examines what actually happens when these tools reach diverse classrooms, not just test cases. We need honest assessments of cost-benefit tradeoffs. And we need researchers asking hard questions, not just measuring whether teachers feel the tool is helpful.

This is not an argument for paralysis. It is an argument for proportional skepticism. The fact that a survey reports positive feedback on AI math tools is worth noting. It is not worth treating as a mandate for widespread adoption.

The education industry moves slowly on many fronts and catastrophically fast on others. Technology adoption tends toward the latter. We should insist on more evidence before we decide that this particular wave of AI enthusiasm represents genuine progress rather than expensive experimentation at scale.

Skepticism is not the enemy of innovation. It is the guardian of it.