
Six Months Of ‘AI CSAM Crisis’ Headlines Were Based On Misleading Data

from the lies,-damned-lies,-and… dept

Remember last summer when everyone was freaking out about the explosion of AI-generated child sexual abuse material? The New York Times ran a piece in July with the headline “A.I.-Generated Images of Child Sexual Abuse Are Flooding the Internet.” NCMEC put out a blog post calling the numbers an “alarming increase” and a “wake-up call.” The numbers were genuinely shocking: NCMEC reported receiving 485,000 AI-related CSAM reports in the first half of 2025, compared to just 67,000 for all of 2024.

That’s a big increase! And it would obviously be super concerning if any AI company were detecting that much AI-generated CSAM, especially as we keep hearing that the big AI models (with the possible exception of Grok…) have safeguards in place against CSAM generation.

The source of most of those reports? Amazon, which had submitted a staggering 380,000 of them, even though most people don’t think of Amazon as much of an AI company. But, still, it became a six-alarm fire about how much AI-generated CSAM Amazon had discovered. There were news stories, politicians demanded action, and the general sentiment was that this proved how big the problem was.

Except… it turns out that wasn’t actually what was happening. At all.

Bloomberg just published a deep dive into what was actually going on with Amazon’s reports, and the truth is very, very different from what everyone assumed. According to Bloomberg:

Amazon.com Inc. reported hundreds of thousands of pieces of content last year that it believed included child sexual abuse, which it found in data gathered to improve its artificial intelligence models. Though Amazon removed the content before training its models, child safety officials said the company has not provided information about its source, potentially hindering law enforcement from finding perpetrators and protecting victims.

Here’s the kicker—and I cannot stress this enough—none of Amazon’s reports involved AI-generated CSAM.

None of its reports submitted to NCMEC were of AI-generated material, the spokesperson added. Instead, the content was flagged by an automatic detection tool that compared it against a database of known child abuse material involving real victims, a process called “hashing.” Approximately 99.97% of the reports resulted from scanning “non-proprietary training data,” the spokesperson said.

What Amazon was actually reporting was known CSAM—images of real victims that already existed in databases—that their scanning tools detected in datasets being considered for AI training. They found it using traditional hash-matching detection tools, flagged it, and removed it before using the data. Which is… actually what you’d want a company to do?
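To make the distinction concrete, here’s a minimal sketch of what hash-based screening of a training dataset looks like. This is not Amazon’s actual pipeline (Bloomberg doesn’t describe its tooling), and production systems typically use perceptual hashes like PhotoDNA rather than a plain cryptographic hash, but the basic flow is the same: hash each candidate file, look it up against a database of hashes of known material, and flag and drop anything that matches before training ever happens.

```python
# A simplified illustration, not Amazon's actual tooling: screen candidate
# training files against a database of hashes of known material, and pull
# matches out before the data is used for training.
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Hash a file's bytes. Real systems use perceptual hashes (e.g. PhotoDNA),
    which also catch re-encoded or slightly altered copies; SHA-256 here just
    keeps the example self-contained."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def screen_training_data(
    candidates: list[Path], known_hashes: set[str]
) -> tuple[list[Path], list[Path]]:
    """Return (clean, flagged): flagged files matched a known hash and would be
    removed from the training set and reported; clean files proceed."""
    clean, flagged = [], []
    for path in candidates:
        (flagged if file_hash(path) in known_hashes else clean).append(path)
    return clean, flagged
```

The point is that a hash hit means a file matched previously identified material involving real victims. Nothing in that process says anything about a model generating new imagery.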

But because it was found in the context of AI development, and because NCMEC’s reporting form has exactly one checkbox that says “Generative AI” with no way to distinguish between “we found known CSAM in our training data pipeline” and “our AI model generated new CSAM,” Amazon checked the box.

And thus, a massive misunderstanding was born.

Again, let’s be clear and separate out a few things here: the fact that Amazon found CSAM (known or not) in its training data is bad. It is a troubling sign of how much CSAM is found in the various troves of data AI companies use for training. And maybe the focus should be on that. Also, the fact that they then reported it to NCMEC and removed it from their training data after discovering it with hash matching is… good. That’s how things are supposed to work.

But the media (with NCMEC’s help) turned this into “OMG, AI-generated CSAM is growing at a massive rate,” and that framing was extremely misleading.

Riana Pfefferkorn at Stanford, who co-authored an important research report last year about the challenges of NCMEC’s reporting system (which we wrote two separate posts about), wrote a letter to NCMEC that absolutely nails what went wrong here:

For half a year, “Massive Spike In AI-Generated CSAM” is the framing I’ve seen whenever news reports mention those H1 2025 numbers. Even the press release for a Senate bill about safeguarding AI models from being tainted with CSAM stated, “According to the National Center for Missing & Exploited Children, AI-generated material has proliferated at an alarming rate in the past year,” citing the NYT article.

Now we find out from Bloomberg that zero of Amazon’s reports involved AI-generated material; all 380,000 were hash hits to known CSAM. And we have Fallon [McNulty, executive director of the CyberTipline] confirming to Bloomberg that “with the exception of Amazon, the AI-related reports [NCMEC] received last year came in ‘really, really small volumes.’”

That is an absolutely mindboggling misunderstanding for everyone — the general public, lawmakers, researchers like me, etc. — to labor under for so long. If Bloomberg hadn’t dug into Amazon’s numbers, it’s not clear to me when, if ever, that misimpression would have been corrected. 

She’s not wrong. Nearly 80% of all “Generative AI” CyberTipline reports to NCMEC in the first half of 2025 (Amazon’s 380,000 out of roughly 485,000) involved no AI-generated CSAM at all. The actual volume of AI-generated CSAM being reported? Apparently “really, really small.”

Now, to be (slightly?) fair to the NYT, they did run a minor correction a day after their original story noting that the 485,000 reports “comprised both A.I.-generated material and A.I. attempts to create material, not A.I.-generated material alone.” But that correction still doesn’t capture what actually happened. It wasn’t “AI-generated material and attempts”—it was overwhelmingly “known CSAM detected during AI training data vetting.” Those are very different things.

And it gets worse. Bloomberg reports that Amazon’s scanning threshold was set so low that many of those reports may not even have been actual CSAM:

Amazon believes it over-reported these cases to NCMEC to avoid accidentally missing something. “We intentionally use an over-inclusive threshold for scanning, which yields a high percentage of false positives,” the spokesperson added.

So we’ve got reports that aren’t AI-generated CSAM, many of which may not even be CSAM at all. Very helpful.
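For a sense of how an “over-inclusive threshold” produces false positives, here’s a hypothetical sketch (again, not Amazon’s actual system): perceptual-hash matching usually works by measuring how far apart two hashes are and calling anything under some distance a match. Widen that distance and you catch more altered copies of known material, but you also start flagging unrelated images whose hashes happen to land nearby.

```python
# Hypothetical illustration of threshold-based matching with 64-bit
# perceptual-style hashes: the looser the Hamming-distance cutoff, the more
# near-matches (and false positives) get flagged.


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def matches_known(candidate: int, known_hashes: set[int], max_distance: int) -> bool:
    """Flag the candidate if it is within max_distance bits of any known hash.
    A small max_distance misses altered copies; a large one over-reports."""
    return any(hamming_distance(candidate, h) <= max_distance for h in known_hashes)
```

Err on the loose side, as Amazon says it deliberately did, and a meaningful share of what gets reported won’t actually be the material you were looking for.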

The frustrating thing is that this kind of confusion wasn’t just predictable; it was predicted! When Pfefferkorn and her colleagues at Stanford published their report about NCMEC’s CSAM reporting system, they explicitly called out the potential for confusion in the form’s checkbox options and warned that platforms would likely over-report out of an abundance of caution, because the penalty (both criminal and reputational) for missing anything is so dire.

Indeed, the form for submitting to the CyberTipline has one checkbox for “Generative AI” that, as Pfefferkorn notes in her letter, can mean wildly different things depending on who’s checking it:

When the meaning of checking a single checkbox is so ambiguous that absent additional information, reports of known CSAM found in AI training data are facially indistinguishable from reports of new AI-generated material (or of text-only prompts seeking CSAM, or of attempts to upload known CSAM as part of a prompt, etc.), and that ambiguity leads to a months-long massive public misunderstanding about the scale of the AI-CSAM problem, then it is clear that the CyberTipline reporting form itself needs to change — not just how one particular ESP fills it out. 

To their credit, NCMEC did respond quickly to Pfefferkorn, and their response is… illuminating. They confirmed they’re working on updating the reporting system, but also noted that Amazon’s reports contained almost no useful information:

all those Amazon reports included minimal data, not even the file in question or the hash value, much less other contextual information about where or how Amazon detected the matching file

As Pfefferkorn put it, Amazon was basically giving NCMEC reports that said “we found something” with nothing else attached. NCMEC says they only learned about the false positives issue last week and are “very frustrated” by it.

Indeed, McNulty, the CyberTipline’s executive director, told Bloomberg:

“There’s nothing then that can be done with those reports,” she said. “Our team has been really clear with [Amazon] that those reports are inactionable.”

There’s plenty of blame to go around here. Amazon clearly should have been more transparent about what they were reporting and why. NCMEC’s reporting form is outdated and creates ambiguity that led to a massive public misunderstanding. And the media (NYT included) ran with alarming numbers without asking obvious questions like “why is Amazon suddenly reporting 25x more than last year and no other AI company is even close?”

But, even worse, policymakers spent six months operating under the assumption that AI-generated CSAM was exploding at an unprecedented rate. Legislation was proposed. Resources were allocated. Public statements were made. All based on numbers that fundamentally misrepresented what was actually happening.

As Pfefferkorn notes:

Nobody benefits from being so egregiously misinformed. It isn’t a basis for sound policymaking (or an accurate assessment of NCMEC’s resource needs) if the true volume of AI-generated CSAM being reported is a mere fraction of what Congress and other regulators believe it is. It isn’t good for Amazon if people mistakenly think the company’s AI products are uniquely prone to generating CSAM compared with other options on the market (such as OpenAI, with its distant-second 75,000 reports during the same time period, per NYT). That impression also disserves users trying to pick safe, responsible AI tools to use; in actuality, per today’s revelations about training data vetting, Amazon is indeed trying to safeguard its models against CSAM. I can certainly think of at least one other AI company that’s been in the news a lot lately that seems to be acting far more carelessly.

None of this means that AI-generated CSAM isn’t a real and serious problem. It absolutely is, and it needs to be addressed. But you can’t effectively address a problem if your data about the scope of that problem is fundamentally wrong. And you especially can’t do it when the “alarming spike” that everyone has been pointing to turns out to be something else entirely.

The silver lining here, as Pfefferkorn points out, is that the actual news is… kind of good? Amazon’s AI models aren’t CSAM-generating machines. The company was actually doing the responsible thing by vetting its training data. And the real volume of AI-generated CSAM reports is apparently much lower than we’ve been led to believe.

But that good news was buried for six months under a misleading narrative that nobody bothered to dig into until Bloomberg did. And that’s a failure of transparency, of reporting systems, and of the kind of basic journalistic skepticism that should have kicked in when one company was suddenly responsible for 78% of all reports in a category.

We’ll see if NCMEC’s promised updates to the reporting form actually address these issues. In the meantime, maybe we can all agree that the next time reports of anything jump roughly sevenfold in a year, it’s worth asking a few questions before writing the “everything is on fire” headline.

Filed Under: ai, csam, cybertipline, data, moral panic

Companies: amazon, ncmec, ny times

