What is synthetic data and how can it boost user testing?

By
Neya Abdi
February 23, 2024

The most successful businesses are decisive.

Scaleups often live and die based on knowing how and when to pivot. 

When they’re uncertain, they either pivot too often or get stuck in analysis paralysis over decisions like which audience to focus on, how to develop and execute a marketing strategy, or whether to change the way they talk about their product altogether.

Making these kinds of decisions is stressful, especially when you’re facing resource limitations. 

Gathering the data needed to make informed decisions is even harder. 

Synthetic datasets may become every decision maker’s deputy.

What if you could get insight into what customers are thinking, without extensive interviews and research, and with over 90 percent accuracy? 

This is what synthetic data offers, and what researchers have already started confirming.

As Mark Ritson writes in Marketing Week, “There are huge swathes of the market that simply cannot access primary data. If I have heard one SME B2B marketer complain about being unable to build a funnel, I’ve heard it a hundred times. A lot of good marketers don’t even have a shitty empirical benchmark to beat with synthetic data, they have no benchmark at all. For that giant army of hard working, underserved marketers the advent of synthetic data is truly revolutionary.”

You can think of “synthetic data” as something like silicon personas. 

Computers generate synthetic data to mimic what real-world data would typically say. Synthetic datasets can be thought of as AI-generated personas, each trained on different data, such as socioeconomic and psychographic data, that makes up its “individuality.”

In other words, it’s a profile of an artificial person trained to have all of the complexities of a real person.

We first came across this concept thanks to Synthetic Users and speaking with Hugo Alves, the company’s founder. They have been pioneers of this thinking in the product design and customer research field and started experimenting with it in product design work for companies in mid-2023.

Organizations use synthetic data to:

  • Test whether there would be interest in a new product
  • Communicate the benefits and features of a new product most effectively
  • Design existing products so that they’re easy to use

These AI-generated personas help organizations understand what it takes to compel individuals to take specific action, such as signing up for a demo or clicking “buy.” 

Synthetic data offers a cost-effective way to supplement or replace expensive, time-intensive methods such as customer interviews and research.

How can synthetic data support customer research and go-to-market strategies? 

When it comes to marketing, one approach to gathering the necessary data and training models is scraping review sites where there’s ample sentiment around a specific topic (e.g., Quora, Reddit, G2). 

The technology scans the market, gathers the data, and then produces something like a transcript or a report of how a specific customer (e.g., Marcus, aged 17) would feel. Teams can then use that sentiment to create more accurate personas around how a human would think and act. 

Companies have already started doing this by training large language models (LLMs) and then asking those LLMs specific questions.

Of course, the most obvious question is: how accurate is the data? 

While general tests show that synthetic data produced by a well-trained model closely matches responses from human users (90% according to the researchers cited in Marketing Week), time will tell how well synthetic data answers more granular, nuanced questions.

What can scaleups do in the meantime? They can combine the best of both worlds: real human interaction and the scalability that AI offers. 

This is what we offer at Heatseeker: a way to go beyond research and start testing from the very beginning.

Our platform focuses on testing and learning in iterative cycles. Here’s how it works: 

  • A user types their company URL into Heatseeker
  • Heatseeker crawls the web to find competitors and presents them to the user to confirm they are accurate
  • The user starts “heatseeking” around a specific question, such as, “I want to test a new value proposition with a new group of customers” or “I want to conduct an A/B test around key messaging before investing money into developing this product or service.” 
  • Heatseeker then creates multiple variations of digital campaigns to see what people online actually engage with and uses data science to analyze the results and present recommendations 
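
To make the last step concrete, here is a minimal sketch of how engagement on two campaign variations could be compared. It is not Heatseeker's actual analysis. The function name, the click and view counts, and the choice of a two-proportion z-test are all assumptions made for illustration.

```python
from math import erf, sqrt


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Compare engagement rates of two variations (hypothetical helper).

    Returns (z, p_value), where p_value is two-sided.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b

    # Pooled engagement rate under the assumption that both variations perform the same
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if std_err == 0:
        raise ValueError("Cannot compare variations with no clicks or all clicks.")

    z = (rate_b - rate_a) / std_err

    # Standard normal CDF expressed with the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Made-up numbers: message A got 50 clicks from 1,000 views, message B got 80 from 1,000
z, p_value = two_proportion_z_test(50, 1000, 80, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the gap in engagement between the two messages is unlikely to be chance alone. A real test would also need enough traffic per variation and a threshold chosen in advance.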

In short, companies can test their riskiest assumptions in a closed environment to gain extra confidence before committing resources.

Join our waitlist so you’re the first to know when Heatseeker is available to the general public.
