The power of data filtering for criminal background checks

The following post is provided by DataDiver Technologies, a TazWorks™ integration partner

Background screening demands precision. The data is vast, often disorganized, and tied to very real stakes. Nearly one in three American adults, roughly 77 million people, have a criminal record. Compliance professionals have to pull accurate, meaningful insights from that enormous pool of information, and how well they do it comes down to two things: the quality of their data sources and the filters behind them.  

Why the data differs: The patchwork of state laws and reporting standards 

No two states play by the same rules. Each jurisdiction sets its own laws on what can be reported, how long records are retained, and what has to be sealed or omitted. Consumer Reporting Agencies (CRAs) have to navigate that patchwork carefully, as a record that’s legally reportable in one state may be off-limits in another. 

California and New York are good examples. California caps reporting on most criminal records at seven years and broadly restricts arrests and non-convictions from appearing in reports at all. New York takes a similar approach, prohibiting the reporting of arrests that didn’t result in conviction, and its “Ban the Box” rules push criminal history inquiries later in the hiring process, so applicants get a fair shot before their background enters the picture. 

Utah works differently. It centers its framework on state-level repository searches and has been a leader on “Clean Slate” legislation: laws that automatically seal qualifying records over time. That creates an interesting practical reality: A record that was in a database last year may not be reportable today, even if the underlying data hasn’t changed. 

The upshot is that raw data from national aggregators or multi-state databases will routinely include records that can’t legally be reported. Filtering them out requires jurisdiction-aware logic. Getting it wrong cuts both ways; you either flag records that shouldn’t appear, or you miss ones that should. 
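To make “jurisdiction-aware logic” concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Record` and `StateRule` shapes, the `RULES` table, and the two rule entries, which simply restate the California and New York examples above in simplified form. Real reporting rules carry many exceptions, and none of this is legal guidance.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    state: str
    is_conviction: bool
    disposition_date: date
    sealed: bool = False


@dataclass
class StateRule:
    lookback_years: Optional[int]  # None = no time cap modeled in this sketch
    report_non_convictions: bool


# Illustrative values only, loosely based on the examples above. NOT legal guidance.
RULES = {
    "CA": StateRule(lookback_years=7, report_non_convictions=False),
    "NY": StateRule(lookback_years=None, report_non_convictions=False),
}


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def is_reportable(record: Record, as_of: date) -> bool:
    rule = RULES.get(record.state)
    if rule is None:
        # Assumption: with no rule on file, stop and escalate rather than guess.
        raise LookupError(f"no reporting rule configured for {record.state}")
    if record.sealed:
        return False
    if not record.is_conviction and not rule.report_non_convictions:
        return False
    if rule.lookback_years is not None:
        return record.disposition_date >= years_before(as_of, rule.lookback_years)
    return True
```

Passing `as_of` explicitly reflects the point above: the same record can be reportable one year and not the next, so the answer depends on when the question is asked.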

The 18,000-outcome problem: Decoding dispositions 

Even when a record is legally reportable, reading it correctly is a separate challenge. Across U.S. jurisdictions, there are more than 18,000 distinct ways a charge disposition can be recorded. “Not guilty,” “dismissed,” and “nolle prosequi” are just a few of the well-known ones. Courts also use proprietary shorthand and abbreviations that aren’t standardized anywhere, and they vary county by county. 

The risk is real: A case resolved in a subject’s favor can look ambiguous, or even adverse, when it’s pulled from a raw database. Without thorough disposition mapping and normalization logic, a compliance professional might flag a clean record as a concern or miss one that’s genuinely reportable. Good data providers maintain extensive disposition libraries to translate those 18,000-plus variations into something clear and actionable. 
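Disposition mapping can be pictured as a lookup from cleaned-up court text to a small set of outcomes. The sketch below is a toy version under stated assumptions: the `Outcome` categories, the `DISPOSITION_LIBRARY` entries, and the abbreviations in it are illustrative stand-ins, not an actual provider's library.

```python
import re
from enum import Enum


class Outcome(Enum):
    CONVICTION = "conviction"
    NON_CONVICTION = "non_conviction"
    UNKNOWN = "unknown"  # route to a human reviewer


# Tiny illustrative library; a production one would hold thousands of entries.
DISPOSITION_LIBRARY = {
    "guilty": Outcome.CONVICTION,
    "not guilty": Outcome.NON_CONVICTION,
    "dismissed": Outcome.NON_CONVICTION,
    "dism": Outcome.NON_CONVICTION,
    "nolle prosequi": Outcome.NON_CONVICTION,
    "nol pros": Outcome.NON_CONVICTION,
}


def normalize_disposition(raw: str) -> Outcome:
    """Lowercase, drop punctuation and digits, collapse spaces, then look up."""
    key = re.sub(r"[^a-z ]", " ", raw.lower())
    key = " ".join(key.split())
    return DISPOSITION_LIBRARY.get(key, Outcome.UNKNOWN)
```

For example, `normalize_disposition("Nol. Pros.")` and `normalize_disposition("NOLLE PROSEQUI")` both map to `Outcome.NON_CONVICTION`. The lookup is an exact match on the cleaned string rather than a substring search, so “not guilty” can never be mistaken for “guilty”, and anything unrecognized falls through to `UNKNOWN` instead of being guessed.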

Why filtering matters 

Effective filtering isn’t a back-office detail; it’s what separates useful screening from noise. Several techniques are especially critical: 

  1. Alias Matching – Catches name variations to prevent false negatives. 
  2. Date of Birth Verification – Distinguishes between people with similar names. 
  3. Surname Adjustments – Accounts for inconsistencies in how names are formatted. 
  4. Sex Offender-Specific Filters – Keeps reporting in line with strict regulatory requirements. 

Without these, data providers flood users with unstructured records and leave them to sort it out manually. This drives up costs, slows processing, and introduces real compliance risk. Poor filtering doesn’t just create extra work; it sets up the conditions for ill-informed hiring decisions and legal exposure. 
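Two of the techniques above, surname adjustments and date of birth verification, are simple enough to sketch in a few lines of Python. The record layout (dicts with `surname` and `dob` keys) and both function names are assumptions made for this example.

```python
import unicodedata
from datetime import date


def normalize_surname(name: str) -> str:
    """Strip accents, case, spaces, and punctuation: "O'Brien" -> "obrien"."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalpha()).lower()


def candidate_records(subject_surname, subject_dob, records):
    """Keep records whose surname matches and whose DOB does not contradict."""
    target = normalize_surname(subject_surname)
    kept = []
    for rec in records:
        if normalize_surname(rec["surname"]) != target:
            continue
        # A missing DOB cannot confirm or rule out a match, so the record is kept
        # for review. That is an assumption of this sketch, not a requirement.
        if rec.get("dob") is not None and rec["dob"] != subject_dob:
            continue
        kept.append(rec)
    return kept
```

Given records for “O'Brien” (matching DOB), “OBRIEN” (different DOB), “O Brien” (no DOB), and “Bryan” (matching DOB), a search for “O'Brien” keeps only the first and third. Whether a record with no DOB should be kept or dropped is a policy decision; the sketch keeps it so that a person makes that call.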

The middle initial field: A small detail with outsized impact 

One of the most powerful filtering tools in background screening is also one of the most overlooked: The middle initial field. In a database with millions of records, names like “John Smith” or “Maria Garcia” can return dozens, and sometimes hundreds, of results. Without a middle initial, someone must sift through all those results manually. That’s slow, and it’s prone to errors. 

Add a middle initial and the candidate pool drops sharply. The effect is even stronger when you pair it with smart alias logic that accounts for nicknames, hyphenated surnames, maiden names, and cultural differences in name ordering. Together, these tools do two things well: They catch the matches that matter and cut the ones that don’t. 
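Here is a rough Python sketch of how a middle initial and a nickname table might work together. The `NICKNAMES` table, the dict-based record layout, and the function names are all hypothetical; a real alias library would be far larger and would also cover hyphenated surnames, maiden names, and name ordering.

```python
# Hypothetical, deliberately tiny alias table keyed by the subject's first name.
NICKNAMES = {
    "john": {"john", "jon", "johnny", "jack"},
}


def first_name_matches(subject_first: str, record_first: str) -> bool:
    s = subject_first.strip().lower()
    r = record_first.strip().lower()
    # Names without an alias entry must match exactly.
    return r in NICKNAMES.get(s, {s})


def middle_initial_compatible(subject_middle: str, record_middle: str) -> bool:
    s = subject_middle.strip()[:1].upper()
    r = record_middle.strip()[:1].upper()
    # A blank on either side cannot rule a record out; only a conflict can.
    return not s or not r or s == r


def filter_by_name(subject: dict, records: list) -> list:
    return [
        rec for rec in records
        if first_name_matches(subject["first"], rec["first"])
        and middle_initial_compatible(subject["middle"], rec["middle"])
    ]
```

For a subject named John with middle initial Q, this keeps “John Q”, “Jon Quincy”, and “Johnny” (no middle name on file), and drops “John R” and “James Q”. The alias table widens the net to catch variations, while the middle initial narrows it again.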

Working with the right partners  

Background screening evolves constantly, from new legislation and expanding data sources to shifting compliance expectations. With 95% of employers running background checks, there’s little room for error and a real cost to getting it wrong. 

The best providers don’t just hand over raw data and walk away. They bring jurisdictional expertise and the necessary context to interpret what the data actually means, as the same record can tell a very different story depending on where it was filed and when it was pulled. 

How CRAs improve accuracy by aggregating across sources 

No single database has the full picture for a candidate. Court records, state repositories, federal databases, sex offender registries, and commercial aggregators all have their own coverage gaps, update schedules, and formatting quirks. The most accurate background checks come from CRAs that draw on multiple sources and reconcile the overlap. 

Cross-referencing independent sources lets CRAs confirm records with greater confidence, catch discrepancies that point to data errors or outdated entries, apply jurisdiction-specific rules to determine what’s actually reportable, and cut duplicate entries that would otherwise inflate a subject’s apparent history. The result isn’t just more data; it’s a cleaner, more defensible report that reflects what the record means, not just what it says. 
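As a simplified illustration of that reconciliation step, the Python sketch below groups records from different sources by case, collapses duplicates, and flags disagreements. The field names (`county`, `case_number`, `disposition`, `source`) and the choice of county plus case number as the matching key are assumptions for the example; real matching is usually fuzzier.

```python
from collections import defaultdict


def reconcile(records: list) -> list:
    """Merge records that describe the same case and flag conflicting outcomes."""
    groups = defaultdict(list)
    for rec in records:
        # Assumed matching key: normalized county plus normalized case number.
        key = (rec["county"].strip().lower(), rec["case_number"].strip().upper())
        groups[key].append(rec)

    merged = []
    for (county, case_number), recs in groups.items():
        dispositions = {r["disposition"] for r in recs}
        agreed = len(dispositions) == 1
        merged.append({
            "county": county,
            "case_number": case_number,
            "sources": sorted({r["source"] for r in recs}),
            # Report a disposition only when every source agrees on it.
            "disposition": recs[0]["disposition"] if agreed else None,
            "needs_review": not agreed,
        })
    return merged
```

Two sources reporting the same case with the same disposition collapse into one entry listing both sources, so the subject's history is not inflated. If the sources disagree, the entry is marked `needs_review` rather than silently picking a winner.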

From raw data to real decisions 

Background screening is complicated, but it doesn’t have to be chaotic. Good filtering turns a flood of raw records into something compliance professionals can actually use—accurate, defensible, and ready to inform real hiring decisions. 

In a field where the data is never perfect and the rules keep changing, the quality of your filters determines the quality of your outcomes. That’s not a technical footnote—it’s the whole point.