Twitter – now X – is a massive, public-facing platform with hundreds of millions of users and integrations everywhere. That kind of scale and openness? It’s a magnet for attackers and data scrapers. And let’s be honest, keeping access controls airtight in that environment is no small feat. Recent Twitter data breach incidents prove that even public-facing metadata can become a serious risk when someone scrapes it at scale.

Between 2021 and 2025, a string of data breaches and exposure incidents revealed how seemingly small flaws can ripple across a user base this big. The headline numbers changed from breach to breach, but the pattern stayed the same: contact data tied to accounts, massive troves of scraped records, and the kind of long-tail risks that security teams lose sleep over.

This guide walks you through the major incidents, what data was exposed, the fallout for users and the company, and the practical security lessons your team can apply. You’ll learn how to strengthen API security, tighten access management, and sharpen your incident response.

Twitter Data Breach List: History and Timeline

Twitter’s breach history isn’t one dramatic event. It’s a multi-year story that started with API abuse and just kept escalating. When you look across the years, you’ll see how control gaps compound when they’re not fixed quickly. You’ll also see how the same flaw can come back in new forms.

Twitter Data Breach 2025

In March and April 2025, underground forums started circulating what was billed as the largest social media breach ever. We’re talking about a 400 GB dump containing roughly 2.8 to 2.9 billion Twitter/X user records. Early analysis showed the bulk of that data looked like public profile metadata – the kind of information that’s technically available to anyone but becomes dangerous when assembled at this scale.

Around the same time, a separate 34 GB file with about 200 million records also surfaced. Reporting suggested it blended earlier scraped data with newer collections.

Even if the contents were mostly public, the scrutiny in 2025 was intense. Large, structured dumps like this still fuel the tactics that matter most: impersonation that looks real, phishing that lands, and doxxing that works. The company faced hard questions about insider risk and monitoring. And regulators made it clear they’re still watching API and access-governance practices closely.

Twitter Data Breach 2023

In January 2023, a massive dataset tied to somewhere between 200 and 235 million Twitter accounts showed up on a hacking forum. It included email addresses linked to those profiles. No passwords were exposed, but that didn’t make it harmless.

The real damage? De-anonymization. Users who relied on pseudonymous accounts for privacy or safety suddenly had their email addresses out in the open. That opened the door to phishing attacks, password-reset scams, and targeted harassment.

Investigators traced the breach back to an API flaw that was active between 2021 and 2022. The vulnerability let attackers match an email or phone number to a specific Twitter account. By the time the dataset surfaced in 2023, the damage was already done.

The fallout wasn’t just technical. Regulators around the world launched formal inquiries into how Twitter (now X) handled the vulnerability and whether it properly notified affected users. Spoiler: the answers weren’t great.

Twitter Data Breach 2022

In August 2022, Twitter confirmed what many had suspected: a bug introduced in June 2021 and patched in January 2022 had been exploited to scrape data on at least 5.4 million accounts. The exposed data included email addresses, phone numbers, and public profile information. Another dataset covering roughly 1.4 million suspended accounts also made the rounds.

Again, no passwords were leaked. But the risks were still serious. Every exposed field became fuel for the next wave of attacks – from phishing that lands in the right inbox to social engineering that knows exactly what to say. If you’ve ever wondered why attackers care about email addresses and phone numbers, this is why. They’re the keys to identity theft and social engineering.

This breach highlighted a failure that security teams see everywhere but rarely fix in time: an enumeration bug that stayed live for months, from June 2021 to January 2022. And once the data's out there, it's out there.

Twitter Data Breach 2021

The 2021 incident was where it all started. In June 2021, Twitter pushed an update that introduced a new API behavior. It let attackers submit an email or phone number and get back the Twitter account it belonged to. Simple. Effective. Devastating.

Threat actors took advantage. They scraped accounts at scale before Twitter patched the flaw in January 2022. That's why so many of the datasets that surfaced in 2022 and 2023 trace back to scraping activity from the window when the bug was live.

Later, whistleblower allegations added fuel to the fire. They painted a picture of deeper problems that went beyond a single bug. Those concerns set the stage for the regulatory and public pressure that would build over the next two years.

Twitter Data Breach Compensation

Let’s talk about what actually happened after these breaches – because regulators and users didn’t just shrug and move on.

The biggest U.S. action came in May 2022: a $150 million settlement with the FTC and DOJ. But it wasn’t about the API scraping. This one targeted Twitter’s earlier practice of collecting phone numbers and emails “for security,” then quietly using them for targeted ads. The settlement laid out exactly what had to change and who needed to be told.

Then came the class actions. Users filed lawsuits tied to the API exposure that enabled scraping from June 2021 to January 2022. Courts allowed some claims to move forward while dismissing others, so potential payouts remain on the table but unresolved. In 2024, the FTC clarified that Twitter Files access didn’t violate the existing order, and compliance obligations stayed in force. Meanwhile, Ireland’s Data Protection Commission opened an inquiry into the datasets disclosed in late 2022 and continued pushing the company on GDPR compliance.

Lessons Learned From Twitter Data Breaches

If you strip away the noise, the core lessons come down to API security basics, access controls, and your ability to detect and respond. The specifics change, but the themes? They keep showing up.

Shore up API design against enumeration and scraping. The weakness behind the 2021-2023 datasets was embarrassingly simple: an API would tell you whether an email or phone number matched an account. No passwords needed. At scale, that mapping becomes a goldmine. Design your endpoints to avoid binary yes-or-no responses. Use opaque workflows that return the same response whether or not a match exists. Harden rate limits and anomaly detection around lookup patterns.
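To make that concrete, here is a minimal Python sketch of an enumeration-resistant lookup. It is an illustration under stated assumptions, not how Twitter's API worked or was fixed: the names `REGISTERED_CONTACTS`, `SlidingWindowLimiter`, and `handle_lookup` are hypothetical, and the in-memory store stands in for a real user database.

```python
import time
from collections import defaultdict, deque

# Hypothetical in-memory directory; a real service would query its user store.
REGISTERED_CONTACTS = {"alice@example.com": "user_1001"}

# One response body for every outcome, so a caller cannot infer a match.
GENERIC_REPLY = {
    "status": "accepted",
    "message": "If that contact is registered, instructions were sent.",
}


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def handle_lookup(limiter, client_id, contact, outbox, now=None):
    """Process a contact lookup without revealing whether it matched."""
    if not limiter.allow(client_id, now):
        return {"status": "rate_limited"}
    account_id = REGISTERED_CONTACTS.get(contact.strip().lower())
    if account_id is not None:
        # Notify the account owner out of band instead of telling the caller.
        outbox.append(account_id)
    return dict(GENERIC_REPLY)
```

The caller gets the same body for a hit and a miss, and the limiter caps how many guesses one client can make per window. A production version would also need to keep response timing consistent, since a measurable delay on matches leaks the same yes-or-no signal.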

Constrain data at the property level, not just the object. Even well-meaning endpoints can overshare. You should limit the fields your API returns, enforce authorization at both the object and property level, and keep response schemas tight. This reduces the damage if traffic suddenly spikes or an internal check fails under load.
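One way to picture property-level authorization is a field allowlist keyed by who is asking. The sketch below is a simplified assumption, not a real platform schema: the field names, the two scopes, and `serialize_profile` are all hypothetical.

```python
# Hypothetical field allowlists; anything not listed is never serialized.
PUBLIC_FIELDS = frozenset({"handle", "display_name", "bio"})
FIELDS_BY_SCOPE = {
    "public": PUBLIC_FIELDS,
    "self": PUBLIC_FIELDS | {"email", "phone"},
}


def serialize_profile(record, viewer_id):
    """Return only the fields the viewer is authorized to see."""
    # Object-level check: is the viewer the owner of this record?
    scope = "self" if viewer_id == record["id"] else "public"
    # Property-level check: filter every field against the allowlist.
    allowed = FIELDS_BY_SCOPE[scope]
    return {key: value for key, value in record.items() if key in allowed}


profile = {
    "id": "user_1001",
    "handle": "@alice",
    "display_name": "Alice",
    "bio": "Security engineer",
    "email": "alice@example.com",
    "phone": "+15550001000",
    "internal_risk_score": 0.12,
}

print(serialize_profile(profile, viewer_id="user_2002"))
```

Because the allowlist is explicit, a field added to the record later (like the hypothetical `internal_risk_score`) stays out of every response until someone deliberately exposes it.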

Instrument for scraping patterns before they explode. High-volume, low-entropy requests – think sequential emails, predictable phone ranges, repetitive payloads – are detection gold. You can baseline legitimate traffic, score anomalies, and trigger graduated defenses like CAPTCHAs, proofs of work, temporary blocks, or forced friction at specific endpoints.
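As a rough illustration of scoring low-entropy lookups, the sketch below measures how often consecutive phone-number queries from one client step upward by a small amount, then maps that score to a graduated defense. The function names and every threshold are assumptions chosen for the example; real values would come from baselining your own traffic.

```python
def sequential_ratio(phone_numbers, max_step=10):
    """Fraction of consecutive lookups that step up by at most max_step."""
    values = []
    for number in phone_numbers:
        digits = "".join(ch for ch in number if ch.isdigit())
        if digits:
            values.append(int(digits))
    if len(values) < 2:
        return 0.0
    steps = sum(1 for a, b in zip(values, values[1:]) if 0 < b - a <= max_step)
    return steps / (len(values) - 1)


def choose_defense(request_count, seq_ratio):
    """Map volume and sequentiality to a graduated response (assumed cutoffs)."""
    if request_count >= 100 and seq_ratio >= 0.8:
        return "block"
    if request_count >= 20 and seq_ratio >= 0.5:
        return "captcha"
    return "allow"


scraper_like = ["+15550001000", "+15550001001", "+15550001002"]
print(sequential_ratio(scraper_like))
```

A single heuristic like this is easy to evade, so in practice it would be one signal among several (payload repetition, client fingerprint, time-of-day patterns) feeding the same graduated response.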

Treat insider risk as a first-class threat. Massive, structured dumps (especially those dominated by public metadata) raise red flags about export controls and logging. Consider data diodes for analytics exports, approval workflows for large pulls, and close monitoring of data warehouse egress.
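An approval workflow for large pulls can start very small. The sketch below assumes a hypothetical row threshold and a two-person rule, and it logs every decision so egress can be reviewed later; `ExportRequest`, `authorize_export`, and `ROW_THRESHOLD` are illustrative names, not part of any real warehouse product.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoff; pulls above this size need a second person to sign off.
ROW_THRESHOLD = 100_000


@dataclass
class ExportRequest:
    requester: str
    table: str
    row_count: int
    approver: Optional[str] = None


def authorize_export(request, audit_log):
    """Allow small exports; require an independent approver for large ones."""
    needs_approval = request.row_count > ROW_THRESHOLD
    has_independent_approver = (
        request.approver is not None and request.approver != request.requester
    )
    allowed = (not needs_approval) or has_independent_approver
    # Log every decision, allowed or not, so egress can be audited.
    audit_log.append(
        {
            "requester": request.requester,
            "table": request.table,
            "row_count": request.row_count,
            "approver": request.approver,
            "allowed": allowed,
        }
    )
    return allowed
```

The point of the two-person rule is that nobody can approve their own bulk export, and the audit log records denied attempts as well as successful ones.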

Tighten third-party access and governance. Inventory your API keys. Rotate credentials. Require scoped, expiring tokens. Your vendors should face the same monitoring and rate limits as your internal services. And when a partner becomes a conduit for abuse, clear runbooks make all the difference.
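Here is a minimal sketch of what "scoped, expiring tokens" can look like, using only Python's standard library. It is a teaching example under assumptions: the token format, the `issue_token` and `check_token` helpers, and the hard-coded demo secret are hypothetical, and a real deployment would more likely rely on an established standard such as OAuth 2.0 with a managed secret.

```python
import base64
import hashlib
import hmac
import json
import time

# Demo-only secret (assumption); load real keys from a secrets manager.
SECRET = b"demo-only-secret"


def issue_token(partner, scopes, ttl_seconds, now=None):
    """Create a signed token carrying the partner's scopes and an expiry."""
    now = time.time() if now is None else now
    claims = {"sub": partner, "scopes": sorted(scopes), "exp": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def check_token(token, required_scope, now=None):
    """Accept only unexpired, untampered tokens that carry required_scope."""
    now = time.time() if now is None else now
    try:
        body, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(body.encode())
    except ValueError:
        return False  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    claims = json.loads(payload)
    return now < claims["exp"] and required_scope in claims["scopes"]
```

A vendor holding a `profiles:read` token with a five-minute lifetime cannot use it to write data, and a leaked copy stops working on its own once the expiry passes.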

Practice notification, then practice it again. In scraping cases, perfect attribution is rare. But you can still pre-draft user-friendly notices, coordinate with regulators, and explain concrete steps users should take. Good communication reduces secondary harm and rebuilds trust.

In practice, this means security teams need to think like attackers trying to map your entire user base through your own API. For users, the playbook is simpler: treat your password like it’s already compromised, add that extra authentication layer, and assume every unexpected email about your account is probably a trap.

Panorays helps companies reduce supply chain cyber risk so they can do business together with confidence. As a leading provider of third-party cyber risk management solutions, Panorays equips your team with AI-powered assessments, continuous oversight of vendor posture, and actionable remediation guidance tailored to each third-party relationship.

Looking to strengthen third-party governance without slowing the business? Panorays helps you stay ahead of emerging vendor threats and streamline risk workflows across your supply chain. Book a personalized demo to see how Panorays can support your program at scale.

Twitter Data Breach FAQs