The Saturday Fraud Strategist

False Positives Masterclass, part 2: How to identify where FPs originate from

9 min

One of the most common mistakes I see fraud teams make is attacking false positives head on.

A customer complains. The CEO says the model is blocking too much. Someone opens a dashboard, adjusts a fraud model threshold, maybe tweaks a few fraud rules, and suddenly everyone feels like progress is happening.

Honestly, not a good look.

Not because false positive reduction is the wrong goal. It is absolutely the right goal. The problem is that most teams go tactical immediately. And if you are tactical about the way you reduce false positives, you should probably only expect tactical gains.

In this episode, we get into the second part of the false positives masterclass: how to break down your false positives into buckets you can actually prioritize and fix. We look at where false positives come from and which ones are driven by fraud detection models, fraud rules, manual review, fraud analysts, upstream partners, payment fraud detection workflows, or data quality issues and corrupted fraud signals.

Quantifying false positives is useful.

But it is not a plan.

What you’ll hear in this episode:

  • Why reducing false positives requires root cause analysis, not just model tuning
  • How to identify who actually declined the event: a rule, fraud model threshold, AI agent, fraud analyst, manual review team, issuer, acquirer, or fraud vendor
  • Why upstream payment partners can create false positives your own fraud prevention systems cannot directly fix
  • How fraud decisioning breaks down across payment fraud detection, fraud risk scoring, and operational workflows
  • Why fraud system optimization starts with identifying the worst offenders
  • How data quality issues, corrupted fraud signals, and model drift create false positives that look like fraud risk
  • How fraud operations teams can prioritize the buckets that are large enough, fixable enough, and valuable enough to address first

Who should listen:

  • Fraud operations leaders trying to improve fraud detection accuracy
  • Fraud analysts working through manual review queues
  • Risk teams managing fraud rules, fraud model thresholds, and fraud risk scoring
  • Data science teams responsible for fraud detection models and model drift
  • Payment fraud detection teams dealing with issuer declines and upstream partner decisions
  • Fraud prevention teams trying to reduce false positives without increasing losses
  • Anyone who has ever stared at a false positive dashboard and thought, “Okay, now what?”

Episode notes:

A frustrating but familiar pattern

Fraud teams often respond to false positives only after a customer complaint, executive escalation, or sudden concern that “the model is blocking too much.”

Okay, fair.

But now you’ve got to ask yourself: are you solving the root cause, or just adjusting the part of the system that is easiest to see?

That distinction matters.

Modern fraud prevention systems give teams a lot of tools

Fraud detection models, fraud rules, fraud risk scoring, manual review, AI agents, payment fraud detection systems, and fraud analysts all play a role in stopping suspicious activity.

The same system that protects you can also block good users if you do not understand where each decision is actually coming from.

The big gap is visibility

A false positive might come from your own fraud rules. It might come from a conservative fraud model threshold. It might come from manual review. It might come from an issuer, acquirer, processor, fraud vendor, or another upstream partner.

If you do not know who said “no,” you cannot know what to fix.
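
As a rough illustration of this mapping, here is a minimal Python sketch that tallies confirmed false positives by the actor that declined them and by whether that actor is internal or upstream. The record layout, the `actor` and `owner` field names, and the sample values are assumptions made for the example, not something taken from the episode.

```python
from collections import Counter

# Hypothetical sample of confirmed false positives.
# Field names and values are illustrative assumptions.
false_positives = [
    {"id": 1, "actor": "rule:geo_mismatch", "owner": "internal"},
    {"id": 2, "actor": "model:score_threshold", "owner": "internal"},
    {"id": 3, "actor": "issuer", "owner": "upstream"},
    {"id": 4, "actor": "issuer", "owner": "upstream"},
    {"id": 5, "actor": "manual_review", "owner": "internal"},
]

def share_by(records, key):
    """Return each distinct value of `key` with its share of all records."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    # most_common() orders the buckets from largest to smallest.
    return {value: count / total for value, count in counts.most_common()}

by_actor = share_by(false_positives, "actor")
by_owner = share_by(false_positives, "owner")

print(by_actor)  # which actor said "no" most often
print(by_owner)  # how much of the problem sits inside your own stack
```

The `internal` share in `by_owner` is the ceiling on what your own tuning can change; the rest has to be handled through partner conversations rather than rule changes.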

False positives are not just model errors

Good customers get blocked. Fraud analysts waste time reviewing preventable cases. Manual review queues grow. Payment fraud detection teams chase the wrong problem. Fraud operations leaders struggle to explain why customer friction is rising.

And somewhere in the middle of all that, a perfectly legitimate user is wondering why your system decided they looked suspicious.

Not great.

Data quality issues

Sometimes the fraud logic is fine. The problem is that the system is operating on corrupted fraud signals. Maybe the IP field is wrong. Maybe device IDs are missing. Maybe payment metadata never passed through. Maybe a mobile SDK bug is making iOS traffic look strange.

Suddenly, your fraud detection models drift. Your fraud rules misfire. Your fraud risk scoring looks worse than it is.

And now the team is “optimizing the model” when the real problem is a broken input.
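
One way to spot a broken input, sketched below under stated assumptions, is to group a field by its value within a traffic slice and look for values with a suspiciously high count. The event layout, the `suspicious_values` helper, and the 20% cutoff are hypothetical; a sensible cutoff depends on the field and on your traffic.

```python
from collections import Counter

def suspicious_values(events, field, min_share=0.2):
    """Return (value, count) pairs for values of `field` that account for
    at least `min_share` of the events. Missing fields are counted as None."""
    counts = Counter(e.get(field) for e in events)
    total = len(events)
    return [
        (value, count)
        for value, count in counts.most_common()
        if total and count / total >= min_share
    ]

# Hypothetical events: iOS signups all carry the same default IP address.
events = [
    {"platform": "ios", "ip": "0.0.0.0"},
    {"platform": "ios", "ip": "0.0.0.0"},
    {"platform": "ios", "ip": "0.0.0.0"},
    {"platform": "web", "ip": "203.0.113.7"},
    {"platform": "web", "ip": "198.51.100.24"},
]

ios_events = [e for e in events if e["platform"] == "ios"]
flagged = suspicious_values(ios_events, "ip")
print(flagged)
```

A single value dominating a field that should be close to unique per user (IP, email, device ID) is a hint that you are looking at a default or placeholder rather than real behavior.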

Classic.

The path forward

The better path is to bucket false positives by actor, partner, user journey, product flow, platform, payment method, geography, and data issue.
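
A minimal sketch of that bucketing, assuming each decline record carries the deciding actor, the platform, and a confirmed false positive label (all hypothetical field names). The rate computed here is the share of a bucket's declines that turned out to be false positives.

```python
from collections import defaultdict

# Hypothetical decline records; `is_false_positive` is a confirmed label.
declines = [
    {"actor": "rule:geo_mismatch", "platform": "ios", "is_false_positive": True},
    {"actor": "rule:geo_mismatch", "platform": "ios", "is_false_positive": True},
    {"actor": "rule:geo_mismatch", "platform": "ios", "is_false_positive": False},
    {"actor": "rule:geo_mismatch", "platform": "web", "is_false_positive": False},
    {"actor": "rule:geo_mismatch", "platform": "web", "is_false_positive": False},
]

def fp_rate_by(records, *keys):
    """Share of declines that are false positives, per combination of `keys`."""
    totals = defaultdict(int)
    fps = defaultdict(int)
    for r in records:
        bucket = tuple(r[k] for k in keys)
        totals[bucket] += 1
        fps[bucket] += int(r["is_false_positive"])
    return {bucket: fps[bucket] / totals[bucket] for bucket in totals}

rates = fp_rate_by(declines, "actor", "platform")
for bucket, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(bucket, round(rate, 2))
```

In this toy data the same rule looks fine on web and poor on iOS, which points at the flow rather than at the rule logic. The same function works for any other dimension, for example `fp_rate_by(declines, "actor", "payment_method")`, provided the records carry that field.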

Prioritize the problems that are:

  • Large enough to matter
  • Actually under your control
  • Not caused by data issues you cannot fix
  • Connected to high-impact fraud rules, model thresholds, or manual review policies

That is how fraud system optimization becomes strategic instead of reactive.
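
The first three criteria above can be sketched as a simple filter and sort. The `Bucket` fields, the dollar figures, and the `min_value` cutoff are all assumptions chosen for illustration; the fourth criterion (a link to high-impact rules, thresholds, or policies) is left to human judgment here.

```python
from dataclasses import dataclass

@dataclass
class Bucket:
    name: str
    fp_value: float              # estimated false positive cost (assumed: dollars)
    controllable: bool           # False when an upstream partner owns the decision
    unfixable_data_issue: bool   # True when driven by data you cannot repair

def prioritize(buckets, min_value):
    """Keep buckets that are large enough, under your control, and not
    blocked by an unfixable data issue, ordered by value, largest first."""
    eligible = [
        b for b in buckets
        if b.fp_value >= min_value
        and b.controllable
        and not b.unfixable_data_issue
    ]
    return sorted(eligible, key=lambda b: b.fp_value, reverse=True)

# Hypothetical buckets with made-up values.
buckets = [
    Bucket("issuer declines", 60_000, controllable=False, unfixable_data_issue=False),
    Bucket("model threshold", 10_000, controllable=True, unfixable_data_issue=False),
    Bucket("rule:geo_mismatch", 25_000, controllable=True, unfixable_data_issue=False),
    Bucket("legacy rule", 500, controllable=True, unfixable_data_issue=False),
    Bucket("partner feed gaps", 15_000, controllable=True, unfixable_data_issue=True),
]

shortlist = [b.name for b in prioritize(buckets, min_value=1_000)]
print(shortlist)
```

Buckets that fall out of the shortlist are not ignored: upstream ones are still worth quantifying and making visible, and small ones may deserve a second look if your instinct says they were never validated.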

Key takeaways:

Reducing false positives is not just about fraud model thresholds, fraud rules, or manual review. It is about understanding the full fraud decisioning system, including partners, platforms, workflows, analysts, corrupted fraud signals, and data quality issues.

Once you know where the problem comes from, you can finally decide what to fix first.

And honestly, that is a much better place to be.

Episode transcript
Chen Zamir
00:08
One of the common mistakes I see fraud teams make is attacking false positives head on. They'll get a customer complaint, and they'll now trace back why this false positive happened and how to make sure it doesn't happen again, or the CEO complains that the model is blocking too much, and now they try to optimize it. The problem, of course, isn't the fact that they're trying to minimize their system's false positives. It's the fact that they're being tactical about it. And if you're tactical about how you do things, you can only expect tactical gains at best.
In part one of the false positives masterclass, and if you missed that one, the link is down below, we accepted the reality that your system isn't built to report its own mistakes. We discussed several methods that can help you work around that and generate a usable picture of your false positives. Then comes the next problem. Quantifying false positives is merely an observation. It is not a plan. To reduce false positives in a meaningful way, you have to go one level deeper and ask a different set of questions. Where do they come from? Which are driven by my own system, and which by my partners? Which are actually data quality problems masquerading as fraud risk? And most importantly, where should I start? That's what the second part of the series is about: breaking down your false positives into buckets that you can prioritize and act on.
How do you do that? Here's my seven-step process. The first and most important step is to attach every false positive to the actor that made the decision. When a transaction is declined or an onboarding attempt is blocked, someone or something said no. That someone might be a rule in your system, a machine learning model threshold, an AI agent, a human analyst, a manual reviewer, or even a third-party partner: an issuer, acquirer, or fraud vendor. Without this categorization, you will default to optimizing the things you can see, usually your rules and your models.
That's how teams spend months fine-tuning rules, only to later learn that most of their declines were coming from an issuer they never spoke to. So, your first task is to map every decline to the actor that made that decision. Sometimes it will be a single rule. Sometimes it will be a decision based on a certain score threshold. Sometimes it will be a human decision. Sometimes it will be a response coming back from a payment partner. The point is that you don't have to do this perfectly on day one. Even a rough breakdown makes an enormous difference, because it will help you get a sense for where the most value lies. Once you have that basic map, you can ask the following question: How much of this is even under my control? This is particularly important in card payments, where a single card transaction can pass through half a dozen hands before the issuer finally says yes or no. Between the cardholder and the issuing bank, you often have the merchant or platform itself, a payment service provider, an acquirer, an acquiring processor, a card network, an issuer processor, and finally the issuing bank. And that's even without counting the third-party fraud vendors that some of these actors plug into their own stack. Each of these actors can decline a transaction. Each has its own risk logic, and each contributes its own false positives. Now, if 60% of your false positives are driven by upstream partners, then your maximum sphere of influence is capped at 40%. That doesn't mean you should give up and walk away, but it does mean you should rethink your goals, how you set expectations internally, and where you spend your energy. And I think that this is an important point that I see a lot of fraud leaders miss. You cannot tune rules you don't own. You can, however, quantify the impact, make it visible, make sure everyone in the organization understands where the limits of your influence actually are. And trust me on that. This saves a lot of frustration later. 
And also, strategically, if you're unhappy with your partner's performance and believe it's substandard, you can always work to replace them. Once you have your decisions sorted into high-level buckets, it's time to go one level deeper.
Chen Zamir
03:50
For example, you take a rule engine bucket that is responsible, say, for 25% of all false positives, and you break those 25% down to individual rules. And you do that as best you can for each one of your buckets. Now, when I say as best you can, what I mean is that realistically this can easily become time-consuming and low-value work. But let's try to break it down into the top five offenders in each category, and remember that even if a rule is relatively accurate, high volume usually means it is a primary source of false positives. High decision volume is a good rule of thumb for where to start. The point is that you want to identify the specific actors that are responsible for the most false positives in absolute terms. These are your quick wins. At the same time, your gut might tell you that some offenders are flying below the radar: legacy rules, outdated policies, or solutions that were never properly validated. Don't ignore that gut instinct, even if these actors have low volumes, or at least don't ignore it before you collect data on it.
So, we now know what is blocking your users, but we still need to know where those users are coming from, even within the part of the stack you control. False positives are rarely evenly distributed. They tend to cluster around specific flows, such as mobile versus web, iOS versus Android, different products or payment methods, or even new versus established users. If you only look at the actor that declined the event, you will see a rule or a model misbehaving. But if you look at which flow the user went through, you might find a deeper issue: a specific flow that produces corrupted or missing data, causing your entire fraud stack to misfire. Here's an example. Imagine you have an integration bug in your mobile SDK that sends incorrect IP data for iOS signups. You don't see that bug at first.
What you see is that several geo-based rules suddenly have higher false positive rates on that platform, or a model that uses IP-based features also seems to drift, or maybe analysts complain that events coming from mobile look weird. If you only look at the rule level, you waste weeks recalibrating rule sets. But breaking it down by flow, you quickly realize the logic is fine on web, and the issue is isolated to iOS. Now, keep in mind that this isn't always a data bug. It might also be that a specific flow concentrates many good users who behave differently than your general population, which on its own can drive false positives up. But the point is, once you've bucketed your false positives by actor, do the same by user journey. Which product, which platform, which payment method, which geography, which specific funnel? I guarantee you'll see patterns emerge very quickly.
The moment you start seeing patterns by flow, you will almost always run into the same culprit: data quality. Sometimes the underlying fraud logic is actually fine, and yet on a particular slice of traffic, your performance tanks. That's often because the system is operating on corrupted or missing inputs: default IP addresses where the real IP failed to capture, placeholder emails or garbage values, device IDs that reset to null on some OS versions, payment metadata that is never passed through for a certain method, and timeouts or integration errors with third-party intelligence sources. You want to locate the exact data fields that are affected in those population segments you've identified. A reliable method I use all the time is to simply group data points by their value and look for values that have a suspiciously high count. And if you think about it, how many times have you done exactly that to uncover fraud patterns just to stumble upon a bug? Once you identify data quality issues, you have some detective work to do.
First, we need to remember that corrupted data points often have cascading effects. A corrupted email field would, of course, corrupt the email velocity checks that are based on it. So, your first task is to identify all the impacted data points and link those to your misbehaving rules or models. For most organizations, this exercise can prove incredibly hard, but you have a shortcut: your list of top offender solutions from step three.
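
The cross-referencing shortcut can be sketched in a few lines of Python. Everything named here is hypothetical: the rule names, their input fields, the set of corrupted fields, and the dollar figures are made up for the example.

```python
# Hypothetical inventory: which input fields each top-offender rule reads.
rule_inputs = {
    "rule:email_velocity": {"email", "account_age"},
    "rule:geo_mismatch": {"ip", "billing_country"},
    "rule:high_amount": {"amount"},
}

# Fields found to be corrupted in the affected traffic segments (assumed).
corrupted_fields = {"email", "ip"}

# Estimated false positive value per rule (assumed: dollars).
fp_value = {
    "rule:email_velocity": 12_000,
    "rule:geo_mismatch": 30_000,
    "rule:high_amount": 8_000,
}

def impacted_rules(rule_inputs, corrupted_fields):
    """Map each rule to the corrupted fields it depends on, if any."""
    return {
        rule: sorted(inputs & corrupted_fields)
        for rule, inputs in rule_inputs.items()
        if inputs & corrupted_fields
    }

def value_by_field(impacted, fp_value):
    """Sum the false positive value of every rule touching each corrupted field."""
    totals = {}
    for rule, fields in impacted.items():
        for field in fields:
            totals[field] = totals.get(field, 0) + fp_value.get(rule, 0)
    return totals

impacted = impacted_rules(rule_inputs, corrupted_fields)
exposure = value_by_field(impacted, fp_value)
print(impacted)
print(exposure)
```

Treat the per-field totals as an upper bound: not every false positive from an impacted rule is caused by the bad field, and a rule that reads two corrupted fields is counted under both.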
Chen Zamir
07:41
All you need to do is cross-reference your corrupted data points with the inputs used by your worst-performing rules. This will save you days of analyzing data skews. By associating data issues with solutions, you will be able to complete the last link in the chain: attributing a value to each of these issues, just as you've done with offenders. You have a dollar amount that you can put on that email bug, at least in terms of false positives.
And now it's time to bring it all together. Now you have a complete map, not only of how many false positives you have, but also where and why they are generated. With that, prioritization becomes much more straightforward. You start with buckets that are large enough to matter, under your control, and not caused by data quality issues that you cannot fix. In my experience, in many organizations, these are likely to be a small number of high-volume, high-impact rules, one or two model thresholds that were set conservatively, maybe a specific case management policy that encourages overblocking, and a couple of flows where your system produces data issues. These are exactly the areas we'll focus on in part three, where we'll get into the mechanics of fixing your decision logic to be less trigger-happy. That work will only be effective because you've done the root cause analysis first and know how to invest smartly. For now, if you've gone from "we have false positives" to "we know where most of them come from, and which parts we can actually fix," you are already well ahead of most teams, and that's a good place to be.
Host
Chen Zamir
Head of Fraud Strategy