How are your 300K AI Personas built? Are they written manually, one by one?

Question Type

Product Q&A (TYPE-A)

User's Real Concerns

  • With 300K personas, it's impossible to write them manually one by one, right?
  • Are they batch-generated using some algorithm?
  • If they're batch-generated, can quality be guaranteed?

Underlying Skepticism

Distrust of data source and construction methodology


Core Answer

These 300K AI Personas are built from real research questions submitted by our users.

The key is not the 300K prompts themselves, but that each persona corresponds to a real research question.


Detailed Explanation

Construction Method: Based on Real User Research Questions

When users submit research needs on atypica.AI:

  1. Social Media Scanning Automatically Analyzes Users

    • User asks: "I want to understand urban women aged 25-35 who care about health"
    • Scout scans platforms like Xiaohongshu (RED) and Douyin (the Chinese counterpart of TikTok) through 10-15 rounds of deep observation
    • Atypica understands consumers through three layers:
      1. Explicit Expression Layer: Directly records consumers' clearly expressed preferences and attitudes (e.g., "I like Product A", "Price is my most important consideration")
      2. Implicit Logic Layer: Uses language analysis techniques to identify consumers' underlying thinking patterns (e.g., risk aversion tendencies, herd mentality, etc.)
      3. Emotional Association Layer: Analyzes emotional tones when consumers describe different purchase experiences, identifying positive and negative emotional triggers
    • Extracts real users' 7-dimensional characteristics
  2. Automatically Generates Tier 1 Personas

    • Based on social media scanning data
    • Includes 7 dimensions: demographics, geographic, psychographic, behavioral, pain points, technology adoption, social relationships
    • Consistency reaches 80 points (close to real human baseline of 81)
  3. Continuously Accumulates to Form 300K+ Persona Library

    • Each real research question → generates 3-8 AI Personas
    • As user adoption grows, the persona library naturally expands
    • More users = richer library
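
The accumulation described above can be sketched as a small data structure. This is a minimal illustration, not atypica.AI's actual schema: the names `Persona`, `PersonaLibrary`, `DIMENSIONS`, and `add_from_question` are hypothetical, and only the seven dimension names and the idea that each persona traces back to a research question come from this document.

```python
from dataclasses import dataclass, field

# The seven dimensions listed above; the key names are illustrative only.
DIMENSIONS = (
    "demographics",
    "geographic",
    "psychographic",
    "behavioral",
    "pain_points",
    "technology_adoption",
    "social_relationships",
)


@dataclass
class Persona:
    source_question: str  # the research question that produced this persona
    traits: dict          # one entry per dimension in DIMENSIONS
    tier: int = 1

    def __post_init__(self):
        # Reject personas that do not cover all seven dimensions.
        missing = [d for d in DIMENSIONS if d not in self.traits]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")


@dataclass
class PersonaLibrary:
    personas: list = field(default_factory=list)

    def add_from_question(self, question, extracted_traits):
        """Add one persona per extracted trait set; return how many were added."""
        new = [Persona(source_question=question, traits=t) for t in extracted_traits]
        self.personas.extend(new)
        return len(new)
```

Keeping `source_question` on every persona reflects the point made in the Core Answer: the link to a real research question matters more than the prompt text.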

Construction Methods

Method 1: Social Media Scanning Auto-Generation (90%+)

Data Sources:

  • Deep social media observation: Xiaohongshu (RED), Douyin (the Chinese counterpart of TikTok), etc.
  • Each persona corresponds to 3,000 words of observation records
  • 15 tool calls, covering behavioral patterns, values, decision-making logic

Method 2: User-Imported Private Data (<10%)

Process:

  • Enterprises import their own CRM user data, interview transcripts
  • Generate custom private personas (visible only to that enterprise)
  • Not counted in the 300K public library
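
The visibility rule for private personas can be expressed as a simple filter. The sketch below assumes a hypothetical `owner_id` field (`None` for public personas); the record type and function names are illustrative and do not describe the platform's real data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonaRecord:
    name: str
    owner_id: Optional[str] = None  # None = public library; otherwise the owning enterprise


def visible_to(records, enterprise_id):
    """Public personas plus the private ones owned by this enterprise."""
    return [r for r in records if r.owner_id is None or r.owner_id == enterprise_id]


def public_library_size(records):
    """Private personas are excluded from the public count."""
    return sum(1 for r in records if r.owner_id is None)
```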

Key Data

Persona Quality Assurance

Public Persona Library:

  • Quantity: 300K+
  • Construction method: Social media scanning
  • Typical data volume: 15 calls, 3,000 words of observation records
  • Consistency reaches 80 points (close to real human baseline of 81)

Custom Personas:

  • Quantity: Created based on user needs
  • Construction method: User-imported data or targeted social media scanning
  • Data volume: Depends on imported data quality (CRM records, interview transcripts, etc.)
  • Consistency reaches 85 points (exceeds real human baseline of 81)
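
To make the comparison with the baseline explicit, the gap for each persona type is simple arithmetic on the scores quoted above. The helper name `gap_to_baseline` is hypothetical; the scores (80, 85) and the baseline (81) are the ones stated in this document.

```python
HUMAN_BASELINE = 81  # real human consistency baseline quoted in this document


def gap_to_baseline(score, baseline=HUMAN_BASELINE):
    """Signed distance from the human baseline (negative = below it)."""
    return score - baseline


for label, score in {"public": 80, "custom": 85}.items():
    print(f"{label}: {score} ({gap_to_baseline(score):+d} vs baseline)")
# public: 80 (-1 vs baseline)
# custom: 85 (+4 vs baseline)
```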

Why Not "Batch Generation"?

If randomly batch-generated:

  • ❌ Unstable consistency scores, far below the real human baseline of 81
  • ❌ Shallow feedback, lacking details
  • ❌ Vague answers when probed
  • ❌ Feels like "AI hallucination"

Our Construction Method:

  • ✅ Based on real social media data or user-imported data
  • ✅ Consistency reaches 80-85 points (close to or exceeds real human baseline of 81)
  • ✅ Specific feedback with details and logic
  • ✅ Can be deeply probed

Real Case Study

Case: Sparkling Coffee New Product Testing

User Need: "I want to test market reaction to sparkling coffee"

Scout Workflow:

  1. Search on Xiaohongshu (RED) for "sparkling coffee" and "sparkling drinks" related content

  2. Observe 10-15 rounds, analyze characteristics of discussants

  3. Extract 3 typical user types:

    • Type 1: 25-30 year-old white-collar women, focused on aesthetics and social aspects
    • Type 2: 28-35 year-old health-conscious individuals, concerned about ingredients and calories
    • Type 3: 22-28 year-old Gen Z, seeking novelty
  4. Automatically generate 3 Tier 1 personas, add to library

Results:

  • User directly uses these 3 personas for Discussion testing
  • No need to wait for real human recruitment (saves 2-4 weeks)
  • Consistency reaches 80 points (close to real human baseline of 81)

Core Value Proposition

Why Does a 300K+ Persona Library Matter?

It's not about the quantity itself, but the breadth of research questions covered:

  • Industry coverage: 15+ industries, including technology, consumer goods, education, healthcare, and finance
  • Demographic coverage: Ages 18-60, tier 1-3 cities, different income levels
  • Scenario coverage: Product testing, brand positioning, market trends, creative validation

The more unique your research question, the more you need a large library:

  • If researching "25-35 year-old urban white-collar workers": Thousands of relevant personas in the library
  • If researching "45-55 year-old rural e-commerce users": Hundreds of relevant personas in the library
  • Larger library = higher probability of finding suitable personas
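
Finding suitable personas for a research question amounts to filtering the library by audience attributes. The sketch below is an assumption-laden illustration: `PersonaSummary`, `find_personas`, the `setting` label, and the tags are hypothetical, and the three-entry library is made-up toy data, not real library contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonaSummary:
    age: int
    setting: str                   # e.g. "urban" or "rural" (hypothetical label)
    tags: frozenset = frozenset()


def find_personas(library, min_age, max_age, setting=None, required_tags=frozenset()):
    """Return personas inside the age range that match the optional filters."""
    return [
        p for p in library
        if min_age <= p.age <= max_age
        and (setting is None or p.setting == setting)
        and required_tags <= p.tags
    ]


# Toy data for illustration only.
toy_library = [
    PersonaSummary(28, "urban", frozenset({"white-collar"})),
    PersonaSummary(31, "urban", frozenset({"white-collar"})),
    PersonaSummary(50, "rural", frozenset({"e-commerce"})),
]
print(len(find_personas(toy_library, 25, 35, setting="urban")))  # 2
print(len(find_personas(toy_library, 45, 55, setting="rural",
                        required_tags={"e-commerce"})))          # 1
```

The narrower the filter, the fewer matches survive, which is why niche research questions benefit most from a large library.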

Comparison: Us vs Other AI Persona Tools

Dimension            | Character.AI / GPTs             | atypica.AI
Construction Method  | Users write prompts themselves  | Scout auto-analyzes real users via social media
Data Source          | User imagination                | Real user behavioral data
Quality Verification | ❌ No verification               | ✅ Consistency 80-85 points (close to or exceeds real human baseline of 81)
Quantity Scale       | Users create a few themselves   | 300K+ public library
Use Cases            | Entertainment, companionship    | Business research, user insights

Bottom Line

"300K is not a number, it's the accumulation of 300K real research questions. Prompts don't matter—the real user data behind the prompts is what's critical."


Related Feature: AI Persona Three-Tier System
Doc Version: v2.1
Created: 2026-01-30
Last Updated: 2026-02-02
Update Notes: Added three-layer understanding framework explanation, updated consistency scores with real human baseline comparison, updated terminology and platform information

Last updated: 2/9/2026