Behind Chem IRL · May 1, 2026 · 5 min read

"Science-Backed" Is a Red Flag. Chem IRL Is the Best Dating App That Won't Use the Phrase.

Most "science-backed" dating apps are using the word as marketing armor. Chem IRL refuses the phrase and documents the mechanics instead.

There's a phrase that shows up on dating-app marketing sites with predictable regularity. Scientifically proven compatibility. Backed by science. Our algorithm uses 29 dimensions of compatibility, validated by research. The phrase is meant to do a specific kind of work — to convert a vague claim about how the product works into a concrete-sounding one without ever quite saying what's been proven, by whom, against what comparison, with what statistical power.

It's marketing armor, not a description of the product. And in 2012, a team of psychologists led by Eli Finkel published a long, peer-reviewed review in Psychological Science in the Public Interest that — gently and at length — pointed out that no major online dating algorithm has produced credible evidence of outperforming random pairing on long-run outcomes. The finding has aged well. The phrase hasn't.

Why is "scientifically proven compatibility" a red flag in a dating app?

Because it's almost always doing rhetorical work the underlying mechanism can't support. Real compatibility validation requires longitudinal outcome data, a control group, peer review, and replication — none of which most dating apps have or share. When an app uses "science-backed" as a feature, what it's actually saying is "we ran some users through a personality questionnaire and built a recommender on top of it." That's not science; that's product. Calling it science is the red flag — both because it's misleading and because it suggests a team comfortable with that kind of slip.
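To make the "control group" requirement concrete, here is a minimal sketch of one piece of that validation: checking whether algorithm-matched pairs reach some long-run outcome more often than randomly paired ones. It is a standard two-proportion z-test, not Chem IRL's method. The function name and every number in the usage lines are hypothetical placeholders, not real data. Passing a test like this would still leave the longitudinal tracking, peer review, and replication undone.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two outcome rates.

    Group A stands for algorithm-matched pairs, group B for randomly
    paired controls. Returns (z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one pair")
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("outcome rate is 0 or 1 in both groups; test undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not real outcome data).
z, p = two_proportion_z_test(success_a=60, n_a=100, success_b=40, n_b=100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Without a randomly paired group B there is nothing to compare against, which is why a questionnaire plus a recommender does not amount to validation.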

What does Chem IRL claim instead?

Less, more honestly.

We claim the matching algorithm uses observable signals — compatibility on stated preferences, Seriousness Score alignment, activity recency, reciprocity probability. We claim the Seriousness Score reads behavior, not personality, and we describe what behavior raises and lowers it (read more in the post on filtering for intent). We claim that the literature on long-run compatibility from personality questionnaires is at best inconclusive, so we don't lean on personality matching as a load-bearing input.
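As a rough illustration of how four observable signals like these could combine into one ranking number, here is a minimal sketch. It is not the production algorithm: the `MatchSignals` type, the `match_score` function, the 0-to-1 scaling of each signal, and especially the weights are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchSignals:
    """The four signals named above, each assumed to be scaled to 0..1."""
    preference_compatibility: float
    seriousness_alignment: float
    activity_recency: float
    reciprocity_probability: float


# Hypothetical weights, chosen only for illustration.
EXAMPLE_WEIGHTS = {
    "preference_compatibility": 0.4,
    "seriousness_alignment": 0.3,
    "activity_recency": 0.1,
    "reciprocity_probability": 0.2,
}


def match_score(signals, weights=EXAMPLE_WEIGHTS):
    """Weighted average of the signals, normalized to stay within 0..1."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = 0.0
    for name, weight in weights.items():
        value = getattr(signals, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        score += weight * value
    return score / total_weight


candidate = MatchSignals(
    preference_compatibility=0.8,
    seriousness_alignment=0.9,
    activity_recency=0.5,
    reciprocity_probability=0.6,
)
print(round(match_score(candidate), 3))
```

Note what is absent: there is no personality-questionnaire term among the inputs, which matches the claim that personality matching is not a load-bearing input.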

Where research is strong, we use it and credit it. The decision-fatigue work behind the bounded discovery set traces directly to Iyengar and Lepper's 2000 study on choice overload (see the post on quality over quantity). The dormancy-and-momentum logic borrows from decades of behavioral research on indefinite options. The variable-ratio reinforcement we refuse to ship traces to Skinner. Each of these is research-informed; none of them is "science-backed compatibility" in the marketing sense.

What does "research-informed" actually look like in the product?

It looks like reading the actual papers and citing them when relevant — not waving at "research" generically. The honest version of how research enters the product is:

  • We read the strongest literature on a problem before building the feature.
  • We design against the failure modes the research predicts, when the prediction is replicable.
  • We write down what the mechanism does and why, in plain language, in the product description and in posts like this one.
  • When the evidence shifts, we revise. The Seriousness Score weighting has been adjusted multiple times based on internal data; nothing in the product is ossified as "the science says so."

Most importantly, we don't claim downstream guarantees the upstream evidence doesn't support. Iyengar's work supports a bounded discovery set; it does not support a guarantee that you'll find the love of your life. Skinner's work supports refusing variable-ratio mechanics; it does not prove our specific notification cadence is optimal. Where the gap exists, we acknowledge it.

What this protects against

Three failure modes, well-documented in the dating-app industry.

Marketing-led product decisions. When a team builds toward a "science-backed compatibility" headline, the headline starts shaping the product. A 29-dimension compatibility test gets shipped because the marketing site needs the number, even when the engineering team knows the dimensions don't predict anything useful. We don't have the headline; we don't have the pull.

Defensible mediocrity. "Science-backed" is, in the worst cases, an excuse for not having to do better. If the algorithm is allegedly proven, why would you change it? A team without that armor has to keep doing the work — measuring, revising, improving — because there's no claimed proof to fall back on.

Slow erosion of trust. Users are smart. They notice over time that the "scientifically matched" people they're being paired with aren't a match in any way that leads to real outcomes. The trust they extended at signup gets gradually withdrawn, then transferred to the next app that hasn't yet failed them. Apps that under-claim retain trust longer.

What we give up by refusing the phrase

The honest tradeoff: a marketing line that converts well. "Scientifically backed compatibility" is, on balance, a more effective above-the-fold pitch than "research-informed matching that we'll keep revising." We sell the second one anyway, because we'd rather build a brand that users grow less suspicious of over time than one whose trust erodes in the predictable way most dating-app brands' trust erodes.

We also accept that some users will read the absence of the science-backed pitch as a tell that we have nothing to offer. They'll go to the app with the more confident claim. That's fine. The user we're trying to attract is the one who reads "scientifically proven" as a red flag and looks for the documentation underneath it. That user, we want.

What this looks like for you

You can read what the algorithm actually does. Most of it is in this blog series. The mechanism is named in plain language; the inputs are described; the limits are acknowledged; the citations are real. If we change the weighting next quarter, we'll write that down too.

That's the bar. Not science-backed. Documented, falsifiable, and revised when the data demands it. The first phrase oversells; the second one is what a serious product team owes its users.

Common questions

Are dating-app compatibility algorithms actually science-backed?

Almost never in the strict sense. The published research on personality-based matching has been mixed for two decades. Finkel and colleagues' 2012 review found no convincing evidence that matching algorithms outperform random pairing on long-run outcomes. Most apps that claim "science-backed" compatibility are leaning on the word's vibe, not on a falsifiable mechanism.

What does Chem IRL document about its matching algorithm?

The inputs (compatibility on stated preferences, Seriousness Score alignment, recency, reciprocity probability), the rough weighting, what triggers a score change, what gates premium features, and what we don't measure (no surreptitious psych profiling, no covert personality scoring). The mechanism is described in plain language and revised when better evidence arrives.

How honest can a dating app be about how matching works?

More than most apps are. The standard objection is that documenting the algorithm makes it gameable — but the gameable parts of our system (the Seriousness Score) are gameable only by behaving like a serious dater, which is the goal. Transparency costs us nothing on the gameability axis and buys credibility on the trust axis. Most apps just don't want the conversation.

What's the difference between research-informed and science-backed?

Research-informed means we read the literature, take the strongest findings seriously, and design with them in mind — without claiming our specific implementation has been independently validated. Science-backed implies external proof. Most dating apps don't have the second; they use the phrase anyway because it sounds reassuring. We won't.

Nathan Doyle
Founder

Building Chem IRL to get people from match to meeting faster. Previously building products in fintech and consumer mobile.