
Why Over-Deletion Is Better Than Under-Deletion


The over-deletion principle: when handling data deletion requests against anonymous analytics, deleting more data than strictly necessary is always safer than risking leaving the requesting person's data behind. The anonymous records caught in the crossfire carry no personal information, so removing them harms no one; a missed deletion, by contrast, has legal and ethical consequences.

A user emails your support team: "Please delete all my data." You open your dashboard and realise your analytics platform stores no user IDs, no emails, and no device fingerprints. Just anonymous session events. How do you find their data? And what happens when you can't isolate it perfectly?

This is the over-deletion trade-off — and once you understand it, you'll see why it's not a problem to solve, but a feature of privacy-first architecture.

🎯 The deletion precision problem

Traditional analytics platforms store a user ID with every event. When someone requests deletion, you query WHERE user_id = 'abc123' and remove exactly their rows. Clean, precise, complete.

Privacy-first analytics platforms don't have that luxury — by design. When your analytics stores only anonymous session data (event name, rotating session ID, timestamp, platform, country), there is no column that says "this event belongs to Alice."

So how do you delete Alice's data? You use filter-based deletion: delete all events matching the date range, platform, and country that correspond to Alice's activity window — information you derive from your own application records, not from the analytics data.
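
To make the contrast concrete, here is a minimal, self-contained sketch using SQLite. The table schema, column names, and filter values are hypothetical, chosen only to show the shape of filter-based deletion; they are not the Respectlytics data model.

# Minimal sketch of filter-based deletion against an anonymous events table.
# Schema, column names, and values are hypothetical, for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        event_name  TEXT,
        session_id  TEXT,   -- rotating, not linkable to a person
        occurred_at TEXT,   -- ISO date
        platform    TEXT,
        country     TEXT
    )
""")

# Traditional analytics would carry a user_id column and delete precisely:
#   DELETE FROM events WHERE user_id = ?
# Anonymous analytics has no such column, so deletion is scoped by the
# attributes you can derive from your own application records instead.
deleted = conn.execute(
    """
    DELETE FROM events
    WHERE occurred_at BETWEEN ? AND ?
      AND platform = ?
      AND country = ?
    """,
    ("2026-01-01", "2026-01-31", "ios", "DE"),
).rowcount

print(f"Removed {deleted} events matching the derived filters")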

The two possible outcomes

  • Over-deletion: Your filters capture Alice's events plus events from other users who share the same time window, platform, and country. Extra anonymous rows are removed.
  • Under-deletion: Your filters are too narrow and miss some of Alice's events. You tell Alice her data was deleted, but some remains in your system.

With anonymous analytics, there is no way to achieve perfect per-user precision. You will always land on one side or the other. The question is: which side do you want to err on?

🔍 What over-deletion looks like in practice

Let's walk through a concrete scenario.

Scenario: Alice's deletion request

Alice used your iOS app from Germany during January 2026 and has asked you to delete her data. From your own user database, you know:

  • Account created: January 3, 2026
  • Last active: January 28, 2026
  • Platform: iOS
  • Country: Germany (DE)

You set your deletion filters accordingly:

date_from: 2026-01-03
date_to:   2026-01-28
platform: ios
country:  DE

The preview returns 4,218 matching events. Alice probably generated a few hundred events during her usage. The rest belong to other German iOS users active during the same period.

You execute the deletion. All 4,218 events are removed. Alice's data is gone — along with anonymous records from other users who happened to share the same characteristics.

This is over-deletion. And it's exactly the right outcome.

Why over-deletion is acceptable

Over-deletion feels wasteful until you examine what's actually being lost — and what you're gaining.

Four reasons over-deletion is the right trade-off

1. The deleted data is anonymous

No personal information is lost. The extra rows contain event names, timestamps, and rotating session IDs that cannot be linked to any individual. Removing them has zero personal impact on anyone.

2. Aggregate analytics remain meaningful

Even after a broad deletion, your remaining data still powers charts, funnels, and trend analysis. A few thousand fewer events in one region during one month rarely changes overall product insights — especially if you have data from other time periods and regions.

3. The audit trail proves you acted

Every deletion creates a timestamped audit log with the filters used, the number of events removed, and a unique deletion_id (a sketch of what such a record might contain appears after reason 4 below). This proves you took action — which is what regulators and users actually care about.

4. It honours the spirit of the request

When a user says "delete my data," they want assurance their information is gone. Over-deletion guarantees this. It goes further than required — which is always defensible.
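
For illustration, here is roughly what such an audit record could contain, using the numbers from Alice's scenario; the field names are assumptions for this sketch, not the exact format Respectlytics produces.

# Hypothetical shape of a deletion audit record (field names are assumptions).
audit_record = {
    "deletion_id": "del_7f3a9c",              # unique ID returned for the deletion
    "executed_at": "2026-02-02T09:14:00Z",    # timestamp of the deletion
    "filters": {                              # the exact scope that was used
        "date_from": "2026-01-03",
        "date_to": "2026-01-28",
        "platform": "ios",
        "country": "DE",
    },
    "events_deleted": 4218,                   # count removed by this deletion
}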

"The best response to a deletion request is not surgical precision — it's confident completeness. Over-deletion lets you say 'your data is absolutely, unequivocally gone' instead of 'we think we got most of it.'"

⚠️ The risks of under-deletion

Under-deletion is the opposite scenario: your filters are too narrow, and some of the user's data survives. Here's why that's far worse.

Three dangers of under-deletion

  • Regulatory risk: Many privacy frameworks give individuals the right to erasure. If a regulator audits your response and finds you missed data, you've failed to comply — even if the missed data is anonymous in your system.
  • Trust damage: If a user discovers their data wasn't fully deleted (e.g., through a subject access request), you've broken a promise. Trust, once lost, is difficult to recover.
  • Operational complexity: Trying to be too precise leads to complex processes — multiple deletion passes, manual review, uncertain coverage. Over-deletion is operationally simpler: set broad filters, preview, delete, document.

Dimension             Over-deletion                    Under-deletion
Data impact           Extra anonymous rows removed     User's data remains in system
Personal impact       None — data is anonymous         User's privacy not fully honoured
Analytics impact      Minor dip in one region/period   No impact (data kept)
Regulatory risk       Low — deletion exceeded scope    High — deletion incomplete
User trust            Strengthened                     Damaged if discovered
Audit defensibility   Easy — clear audit trail         Hard — gaps to explain

The asymmetry is clear: over-deletion costs you some anonymous data points. Under-deletion costs you trust, credibility, and potentially regulatory standing.

🎛️ How to minimise over-deletion impact

Over-deletion is the right default, but you can still be smart about it. The goal is to capture all of the user's data while minimising bystander records.

Five tactics to narrow your filters

  1. Use tight date ranges. Instead of "all of January," use the user's actual sign-up date through their last known activity date. A 25-day window catches less bystander data than a full month.
  2. Always include platform. Filtering by iOS or Android immediately cuts the matching population roughly in half (or more, depending on your user distribution).
  3. Add country when known. If the user's location is in your records, adding country as a filter narrows scope dramatically — especially in smaller countries.
  4. Filter by event name for targeted deletion. If you only need to remove specific types of activity (e.g., events related to a particular feature the user used), add the event_name filter.
  5. Always preview first. Use the preview endpoint or dashboard modal to check how many events match before committing. If the count seems disproportionately large, tighten your filters.

💡 The preview-then-delete workflow

Respectlytics provides a dedicated preview endpoint that returns the count of matching events without deleting anything. This lets you:

  • Test different filter combinations to find the tightest match
  • Compare the count against expected user activity volume
  • Only execute when you're satisfied with the scope

See our Data Deletion Guide for the full API reference with curl examples.
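
As a rough sketch of that workflow, the snippet below previews, sanity-checks the count, and only then deletes. The base URL, endpoint paths, and response field names are assumptions for illustration; the Data Deletion Guide has the actual API reference.

# Preview-then-delete sketch. Base URL, endpoint paths, and response fields
# are illustrative assumptions -- consult the Data Deletion Guide for the
# real API reference.
import requests

API_BASE = "https://api.example.com/v1"   # hypothetical base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

# Filters derived from your own user records, as in Alice's scenario above.
filters = {
    "date_from": "2026-01-03",
    "date_to": "2026-01-28",
    "platform": "ios",
    "country": "DE",
}

# 1. Preview: count matching events without deleting anything.
preview = requests.post(f"{API_BASE}/deletions/preview", json=filters, headers=HEADERS)
preview.raise_for_status()
count = preview.json()["matching_events"]   # assumed response field
print(f"{count} events match these filters")

# 2. Sanity check: a count wildly above the user's plausible activity volume
#    means the filters are too broad -- tighten them and preview again.
if count > 50_000:
    raise SystemExit("Scope looks too broad; narrow the filters before deleting")

# 3. Execute, and keep the returned deletion_id for your audit trail.
deletion = requests.post(f"{API_BASE}/deletions", json=filters, headers=HEADERS)
deletion.raise_for_status()
print("Deletion recorded:", deletion.json().get("deletion_id"))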

🤔 Why not just add user IDs?

The obvious objection: "If precise deletion is so important, why not store user IDs so you can delete exactly the right rows?"

Because user IDs create the very problem they appear to solve.

The user ID paradox

  • More data, more liability. Storing user IDs in analytics means your analytics dataset now contains personal data. This expands your regulatory surface area — you need consent mechanisms, data processing agreements, and more complex access controls.
  • Deletion becomes mandatory, not optional. With anonymous analytics, your legal team may determine that deletion isn't even required (since the data can't identify anyone). With user IDs, deletion is almost certainly mandatory under most privacy frameworks.
  • Re-identification risk. User IDs can be combined with other datasets to build behavioural profiles. Even "internal-only" user IDs can leak through breaches or be subpoenaed. What you don't store can't be exposed.
  • The real fix is not collecting the data. Our Return of Avoidance (ROA) principle: the best way to protect data is to never collect it. Over-deletion is a small trade-off for not having personal data in your analytics system at all.

"The reason precise per-user deletion is impossible is the same reason your analytics data is privacy-safe in the first place. You cannot have both perfectly targeted deletion and anonymous analytics — and we chose anonymity."

🔑 Key takeaways

  • Over-deletion is always safer than under-deletion. Extra anonymous data lost is a minor analytics blip; missed personal data is a liability.
  • The over-deleted data is anonymous. No person is harmed by removing extra rows of event counts.
  • Use tight filters + preview. Narrow your date range, add platform and country, and always preview before deleting.
  • Document everything. The audit trail — not surgical precision — is what proves you honoured the request.
  • Adding user IDs is worse. They create the personal data problem you're trying to avoid.

Frequently asked questions

Q: What if I over-delete data for a high-traffic region?

The impact is proportional to the filters you use. A deletion scoped to "iOS, Germany, January 2026" removes a subset of one region's data for one month. Your global analytics, other regions, and other time periods remain completely intact. For high-traffic apps, this is typically a negligible fraction of total data.

Q: Can I undo an over-deletion?

No. Deletions are permanent and irreversible — this is a feature, not a bug. If deleted data could be recovered, you couldn't honestly tell a user their data was removed. The permanence is what makes the deletion credible and defensible.

Q: How many deletion requests should I expect?

For most apps, very few. Deletion requests are typically 0.01–0.1% of your user base per year. When your analytics stores no personal data, some legal teams may determine that your analytics system doesn't even fall within the scope of deletion requests. Consult your legal team.

Q: Should I automate deletion for every user who deletes their account?

That depends on your risk tolerance and legal advice. Some teams automate it: when a user account is deleted, trigger an API call to delete analytics matching the user's activity window. Others only act on explicit erasure requests. Both approaches are valid — the Data Deletion Guide covers the API for either approach.
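
If you do automate it, the hook can stay small. The sketch below assumes your account record carries the dates, platform, and country needed to build the filters; the endpoint path and field names are illustrative, not the documented API.

# Hypothetical account-deletion hook; endpoint and field names are assumptions.
import requests

API_BASE = "https://api.example.com/v1"   # hypothetical
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def on_account_deleted(account: dict) -> None:
    """Trigger a filter-based analytics deletion when a user account is removed."""
    filters = {
        "date_from": account["created_at"][:10],      # e.g. "2026-01-03"
        "date_to": account["last_active_at"][:10],    # e.g. "2026-01-28"
        "platform": account["platform"],
        "country": account["country"],
    }
    resp = requests.post(f"{API_BASE}/deletions", json=filters,
                         headers=HEADERS, timeout=30)
    resp.raise_for_status()
    # Store the returned deletion_id with your account-deletion record for audit.
    print("Analytics deletion recorded:", resp.json().get("deletion_id"))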

Legal Disclaimer: This information is provided for educational purposes and does not constitute legal advice. Regulations vary by jurisdiction and change over time. Consult your legal team to determine the requirements that apply to your specific situation.


Analytics that makes deletion simple

No user IDs. No device fingerprints. Just anonymous events with a built-in deletion API.