The New Battle for Data Integrity in Market Research: What Synthetic Data Actually Is And When It’s Useful

Synthetic data isn’t fraud—it’s a tool. When built on verified human feedback, it can enhance research and CX. But if based on fake data, it distorts insights. Here’s the truth about using it right.

Sean McDade, PhD

Founder & CEO, PeopleMetrics

Over the past few weeks, I’ve written a lot about AI survey fraud, human verification, and the $10 million fraud case involving Op4G and Slice.

That story raised a lot of justified fear, but also some confusion.

Specifically, confusion about one term that keeps getting thrown around:

Synthetic data.

Let’s clear this up now:

  • Synthetic data is not inherently bad
  • It is not the same thing as survey fraud
  • And yes, it can be useful in market research and CX

But here’s the catch: Only if it’s built on a foundation of verified, real human feedback.

So What Is Synthetic Data?

Synthetic data is data that’s artificially generated by algorithms to reflect the statistical structure of real-world data, without obtaining that data from actual individuals.

It’s often used for:

  • AI model training
  • Simulation of rare events
  • Privacy protection in healthcare
  • Controlled testing environments

It’s not “fake data pretending to be real.”

It’s simulated data, clearly labeled as such, used in specific contexts.
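To make the definition concrete, here is a minimal sketch of the idea. It assumes a small set of verified 1-to-10 satisfaction scores (the numbers are hypothetical and purely illustrative) and a simple normal model; real synthetic-data tools are far more sophisticated. The function name `fit_and_sample` is made up for this example. Note that every generated record is labeled as synthetic.

```python
import random
import statistics

def fit_and_sample(real_scores, n, seed=0):
    """Fit a normal distribution to real scores and draw n labeled synthetic records."""
    mu = statistics.mean(real_scores)
    sigma = statistics.stdev(real_scores)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    records = []
    for _ in range(n):
        # Clamp to the 1-10 survey scale so simulated values stay plausible.
        value = min(10.0, max(1.0, rng.gauss(mu, sigma)))
        # Every record carries an explicit label: it is never passed off as real.
        records.append({"score": round(value, 1), "synthetic": True})
    return records

# Hypothetical verified responses, for illustration only.
verified_scores = [7, 8, 9, 6, 8, 7, 9, 8, 5, 8]
synthetic_records = fit_and_sample(verified_scores, n=1000)
```

The synthetic records mirror the average and spread of the verified scores, but no individual record came from an actual person, and the `synthetic` flag says so.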

Let’s Be Clear: This Is Not the Slice Case

The Op4G/Slice scandal involved:

  • Fraudulent survey responses
  • Real people pretending to be other people
  • Manual VPN masking and coaching scripts
  • And $10M+ in deception

That wasn’t synthetic data.

That was fraud.

Synthetic data, when used correctly, is generated transparently, and it’s never passed off as actual human response data.

When Synthetic Data Can Be Useful — With One Major Condition

There are valid use cases for synthetic data in market research and CX, such as:

  • Training AI models to detect sentiment or classify feedback
  • Stress testing different experience scenarios
  • Simulating low-incidence populations or segments
  • Privacy-conscious modeling in healthcare environments

But here’s the key:

Synthetic data in market research or CX is only useful if it’s based on high-integrity, verified human feedback.

If your base data is unreliable, whether it’s fraud, filler, or bot-generated, then your synthetic data isn’t simulation. It’s distortion.

And that distortion compounds with every layer of analysis.
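A rough sketch of how that plays out, under stated assumptions: the scores are hypothetical, the fraudulent responses are imagined as straight-lined 10s, and the "model" is just a normal distribution fitted to whatever base data it is given. The helper `synthetic_mean` is invented for this illustration.

```python
import random
import statistics

def synthetic_mean(base_scores, n=5000, seed=1):
    """Fit a normal model to base_scores, simulate n values, return their average."""
    mu = statistics.mean(base_scores)
    sigma = statistics.stdev(base_scores)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return statistics.mean([rng.gauss(mu, sigma) for _ in range(n)])

# Hypothetical numbers, for illustration only.
verified = [7, 8, 9, 6, 8, 7, 9, 8, 5, 8]  # real, vetted respondents
fraudulent = [10] * 5                       # fake responses, all straight-lined 10s
contaminated = verified + fraudulent

clean_mean = synthetic_mean(verified)          # simulation built on verified data
distorted_mean = synthetic_mean(contaminated)  # simulation built on a polluted base

print(f"Synthetic average from verified base:     {clean_mean:.2f}")
print(f"Synthetic average from contaminated base: {distorted_mean:.2f}")
```

In this toy setup, five fake responses mixed into ten real ones push the simulated average noticeably higher, and nothing in the synthetic output reveals that the shift came from fraud rather than from customers.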

This is why PeopleMetrics is so focused on human-verified data:

  • Our custom research panels are built from the ground up with recruited, vetted, real people
  • Our CX data comes directly from real customer transactions
  • When we use online sample providers, we rigorously vet each partner before working with them and run multiple checkpoints once we receive the data

If we can’t prove it’s a real human, it doesn’t get in our datasets!

Synthetic modeling becomes trustworthy only when it’s built on a foundation of truth.

Synthetic ≠ Strategic (Unless It’s Transparent)

Synthetic data can help you simulate, prototype, or scale ideas.

But it can’t tell you:

  • What real customers feel
  • Why they are frustrated or delighted
  • What’s truly working (or broken) in the experience

Only real humans can do that.

So use synthetic data if:

  • You disclose it
  • You understand its limits
  • You base it on verified ground truth

And never use it as a shortcut for talking to real people.

Bottom Line

Synthetic data isn’t the problem.

In market research and CX, synthetic data can complement human insight. But it can’t replace it, and it can’t be trusted unless it’s modeled on verified, fraud-free data.

If you start with fiction, you’ll scale fiction.

If you start with truth, you can scale trust.

The difference is everything.

Up next:

Post 10: The Future Belongs to the Real
