GEA: Building Enterprise AI Systems — Lessons from atypica

The architectural evolution of a consumer research platform

Background

We're building atypica, an AI-driven consumer research platform.

The goal is simple: let AI independently conduct user research — from observing social media, to simulating interviews, to generating insight reports.

Along the way we encountered specific problems and tried various approaches. This article documents that journey and the architectural framework we distilled from it — GEA (Generative Enterprise Architecture).

The Problems We Encountered

Problem 1: Vague User Requirements

Users often say: "I want to understand young people's coffee preferences."

But that's not specific enough:

Which age group of young people?
What dimensions matter? Price? Brand? Usage scenarios?
What methods to use? Observation? Interviews? Surveys?
What deliverables? User personas? Strategic recommendations?

Traditional approach: Multi-turn dialogue to clarify requirements → Problem: You have to re-ask every time; nothing is reusable.

Our approach:

Instead of treating it as "requirement clarification," we treat it as "intent construction" — assembling an executable research intent directly from user input, team history, and existing Personas.

Problem 2: Context is Hard to Manage

A single research session produces massive amounts of context:

Social media observation results (hundreds of posts)
User personas
Historical research templates
Intermediate reasoning processes

This context has various characteristics:

Some is long-lived (Persona library)
Some is ephemeral (current conversation)
It needs filtering (lots of noise)
It needs linking (similar research should be discoverable)

Traditional approach: RAG retrieval → Problem: Retrieval is just the first step; continuous curation is still needed.

Our approach:

Treat context as a system to manage — similar to the mindset of DAM (Digital Asset Management): make the right assets available at the right time.

Problem 3: Agents Need Continuous Course Correction

Research is not a linear process:

Observe → discover contradictory signals → need in-depth interviews
A data source turns out to be useless → need to pivot
Insights are already clear → should stop exploring

This requires an agent that continuously makes judgments, rather than following a preset workflow.

Traditional approach: Multi-Agent with each agent having a fixed role → Problem: Who plays the role of "continuous judgment"?

Our approach:

Split into two agents:

Reasoning Agent: Makes judgments and decisions
Execute Agent: Carries out specific tasks

The Reasoning Agent is responsible for preparing context, deciding next steps, and adjusting direction.

Problem 4: Experience is Hard to Reuse

Every research session accumulates experience:

User personas for a specific domain
Observation methods for a specific platform
Interview frameworks for certain types of questions

If this experience isn't captured, you start from scratch next time.

Traditional approach: Documentation or tool calls → Problem: Documentation isn't structured enough; tool calls aren't flexible enough.

Our approach:

Codify experience as Skills — capability modules that can be dynamically loaded.

atypica's Workflow

Let's walk through a typical research task: "I want to understand young people's coffee preferences."

Complete Workflow Overview

Step 1: Intent Construction

Assemble the intent from Memory (the team's visual focus) and Assets (tea beverage research).

Step 2: Reasoning Planning

Path: Observe → Interview → Report
Preparation: Load scoutTask Skill, prepare social media MCP

Step 3: Execute Observation

Observe Xiaohongshu/Douyin using scoutTask methodology, collect 120+ posts. Discovery: "They say they value cost-effectiveness, but pay premium for aesthetics."

Step 4: Reasoning Adjustment

Contradiction detected → Load interview Skill → Verify through in-depth interviews

Step 5: Execute Interviews

Q: "What do you value most?" → "Cost-effectiveness"
Q: "But the 38 yuan coffee you posted..." → "The cup was just too good-looking"
Q: "So good-looking is also cost-effective?" → "I can post it on social media"

Insight: Gen Z "cost-effectiveness" = function + aesthetics + social value

Step 6: Generate Report

Load reportGen Skill, output segmentation, insights, and recommendations.

Step 7: Knowledge Capture

Memory learns, Assets are enriched, and Skills are optimized, so the next study runs more efficiently.

Detailed Step Breakdown

1. Intent Construction

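The system doesn't make you fill out forms or go through multi-turn clarification dialogues. It constructs an executable research intent directly from your question, team history, and the existing Persona library:

Target: 18-28 year olds in tier-1 cities
Scenario: Daily coffee consumption decisions
Dimensions: Brand preference, price sensitivity, social factors
Methods: Social media observation + simulated interviews
Deliverables: User segmentation + preference map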

2. Reasoning Agent Begins Inference

The Reasoning Agent plans the execution path and prepares context:

Load scoutTaskChat skill (social media observation)
Prepare system prompt and relevant tools
Retrieve related user personas from the Persona library
Set reasoning trigger conditions (deep analysis after 5 observations)
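
The trigger condition in the last item can be pictured as a simple counter. This is a minimal sketch, not atypica's actual mechanism: the `ReasoningTrigger` class and its method names are hypothetical, and the threshold of 5 is taken from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReasoningTrigger:
    """Hypothetical helper: hands back a batch once `threshold` observations pile up."""

    threshold: int = 5
    pending: list[str] = field(default_factory=list)

    def record(self, observation: str) -> Optional[list[str]]:
        """Store one observation; return the full batch when the threshold is reached."""
        self.pending.append(observation)
        if len(self.pending) >= self.threshold:
            batch, self.pending = self.pending, []  # reset for the next round
            return batch
        return None  # not enough material for a deep-analysis pass yet
```

The Execute Agent would call `record` after each observation and start a deep-analysis pass whenever it gets a batch back.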

3. Execute Agent Carries Out Tasks

The Execute Agent works based on the prepared context:

Scout observes user discussions on Xiaohongshu/Douyin
Collects 120+ posts, identifies key patterns
Triggers reasoningThinking deep analysis
Discovers insight: "Price-sensitive + visually-driven, but pays premium for aesthetics"

4. Continuous Context Curation

The Reasoning Agent adjusts strategy based on execution results:

Contradictory signals? Load interview skill for in-depth verification
Insufficient information? Adjust scout observation dimensions
Clear insights? Load reportGen skill, generate report

Throughout the process, context is continuously filtered, refined, and reorganized.
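
The three adjustment rules above amount to a small decision table. The sketch below is illustrative only: the `Signal` enum, `NextStep` type, and `decide_next_step` function are hypothetical names, and in practice classifying the signal is a judgment the Reasoning Agent makes, not an input handed to it.

```python
from dataclasses import dataclass
from enum import Enum


class Signal(Enum):
    CONTRADICTORY = "contradictory"  # findings conflict with each other
    INSUFFICIENT = "insufficient"    # not enough information yet
    CLEAR = "clear"                  # insights are already clear


@dataclass(frozen=True)
class NextStep:
    skill: str   # Skill to load next
    reason: str  # why it was chosen


def decide_next_step(signal: Signal) -> NextStep:
    """Map an assessed signal to the next Skill, following the three rules above."""
    if signal is Signal.CONTRADICTORY:
        return NextStep("interviewChat", "verify the contradiction through in-depth interviews")
    if signal is Signal.INSUFFICIENT:
        return NextStep("scoutTaskChat", "adjust observation dimensions and observe again")
    return NextStep("reportGen", "insights are clear, generate the report")
```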

5. Asset Capture

After research is complete, new assets enter the DAM system:

New Personas are automatically cataloged (Gen Z coffee consumer profiles)
Research intents become templates (reusable for tea, bubble tea research)
Knowledge Gaps are recorded (limited discussion of this demographic on Weibo)

Next time a similar study is needed, the system is smarter.
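
As a rough illustration of the capture step, the sketch below files the three asset types listed above into a catalog and skips anything already recorded. `AssetCatalog` and its fields are assumptions made for this example; the article does not describe how the DAM system stores assets.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class AssetCatalog:
    """Hypothetical catalog with one shelf per asset type named in the text."""

    personas: list[str] = field(default_factory=list)
    intent_templates: list[str] = field(default_factory=list)
    knowledge_gaps: list[str] = field(default_factory=list)

    def capture(
        self,
        personas: Iterable[str] = (),
        intent_templates: Iterable[str] = (),
        knowledge_gaps: Iterable[str] = (),
    ) -> int:
        """Catalog new assets, skipping duplicates; return how many were added."""
        added = 0
        for shelf, items in (
            (self.personas, personas),
            (self.intent_templates, intent_templates),
            (self.knowledge_gaps, knowledge_gaps),
        ):
            for item in items:
                if item not in shelf:
                    shelf.append(item)
                    added += 1
        return added
```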

The Resulting Architecture: GEA

To address these problems, our architecture gradually formed around four core components:

Architecture Diagram

┌──────────────────────────────────────────────────────────┐
│                   GEA Architecture                       │
├───────────┬──────────────────────┬───────────────────────┤
│           │                      │                       │
│ External  │   Core Process       │  Context System (DAM) │
│ Infra     │                      │                       │
│           │   Intent Layer       │  • Memory             │
│ • LLM     │   (Needs + Context)  │    (Team memory)      │
│           │        ↓             │                       │
│ • MCP     │   Reasoning ←──────→ │  • Assets             │
│   Social  │        ↓             │    (Enterprise data)  │
│   Reports │   Execute   ←──────→ │    Financials/Product │
│   CRM     │        ↓             │                       │
│           │   Outcome            │  • Skills             │
│ • APIs    │                      │    (Methodologies)    │
│           │                      │    Research/Interview │
│           │                      │                       │
└───────────┴──────────────────────┴───────────────────────┘

Left: External Infrastructure

LLMs (GPT-4, Claude, etc.)
MCP Servers (social media data, market reports, CRM data)
APIs

Center: Core Process

Intent Layer: Needs + Context → Executable intent
Reasoning Agent: Continuous inference and decision-making
Execute Agent: Task execution
Outcome: Delivered results

Right: Context System (DAM)

Memory: Team memory (work styles, judgment criteria)
Assets: Enterprise data (financials, product info, content, historical research)
Skills: Methodologies (research frameworks, interview techniques)

Reasoning and Execute continuously interact with the Context System: retrieving memories, accessing data, loading methods.

1. Intent Layer

Purpose: Transform vague input into executable intent

How it works:

Parse user input
Match team history (similar research, related Personas)
Generate structured research intent

Output: A clear intent containing research target, methods, and deliverables
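
The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: `ResearchIntent`, `build_intent`, and the fallback values are hypothetical, and plain word overlap stands in for whatever matching the real Intent Layer uses.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ResearchIntent:
    topic: str
    target: str
    dimensions: tuple[str, ...]
    methods: tuple[str, ...]
    deliverables: tuple[str, ...]


def _words(text: str) -> set[str]:
    """Step 1 (parse): reduce free text to a set of lowercase words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def build_intent(user_input: str, history: list[ResearchIntent]) -> ResearchIntent:
    query = _words(user_input)
    # Step 2 (match): pick the past intent whose topic shares the most words.
    best = max(history, key=lambda past: len(query & _words(past.topic)), default=None)
    if best is None or not (query & _words(best.topic)):
        # No usable history: return placeholder values (assumed, not from atypica).
        return ResearchIntent(
            topic=user_input,
            target="unspecified",
            dimensions=(),
            methods=("social media observation",),
            deliverables=("insight report",),
        )
    # Step 3 (generate): reuse the matched intent's structure for the new topic.
    return replace(best, topic=user_input)
```

With a past tea beverage study in `history`, the coffee question inherits that study's target, methods, and deliverables. An unrelated question falls back to placeholders that a person would still need to confirm.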

2. Context System

Purpose: Manage various context assets

Two dimensions:

Build Time: Long-lived assets (Persona library, research templates, Skills)
Runtime: Session context (conversation history, observation results, reasoning records)

Core capabilities:

Semantic indexing (not just keywords)
Dynamic filtering (noise reduction)
Association recommendations (finding related assets)
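
To make the two dimensions and the filtering idea concrete, here is a minimal sketch. It deliberately swaps semantic indexing for plain tag overlap to stay self-contained, and the `ContextItem` and `ContextStore` names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextItem:
    name: str
    tags: frozenset[str]
    long_lived: bool  # True: build-time asset, False: runtime session context


@dataclass
class ContextStore:
    items: list[ContextItem] = field(default_factory=list)

    def add(self, item: ContextItem) -> None:
        self.items.append(item)

    def related(self, query_tags: set[str], min_overlap: int = 1) -> list[ContextItem]:
        """Return items sharing at least `min_overlap` tags, most related first."""
        scored = [(len(item.tags & query_tags), item) for item in self.items]
        kept = [pair for pair in scored if pair[0] >= min_overlap]  # noise filter
        kept.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep insertion order
        return [item for _, item in kept]

    def end_session(self) -> None:
        """Drop runtime context and keep only long-lived assets."""
        self.items = [item for item in self.items if item.long_lived]
```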

3. Reasoning Agent (Inference Engine)

Purpose: Continuous reasoning and decision-making

Specific responsibilities:

Plan execution paths
Prepare context (prompts, tools, skills for the Execute Agent)
Determine when to adjust direction
Decide when to stop

What it doesn't do: It never executes tasks directly (that's the Execute Agent's job).
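
The division of labor can be sketched as a loop in which the Reasoning Agent only decides and prepares context, while the Execute Agent only runs what it is handed. Everything here is a simplified assumption: the class and function names are hypothetical, Skills are modeled as plain functions, and `toy_policy` stands in for LLM judgment.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Skill = Callable[[str], str]      # assumption: a Skill maps a prompt to a result
History = list[tuple[str, str]]   # (skill name, result) pairs


@dataclass(frozen=True)
class PreparedContext:
    skill: str   # which Skill the Execute Agent should load
    prompt: str  # instructions assembled by the Reasoning Agent


class ReasoningAgent:
    """Judges and prepares context; it never runs a Skill itself."""

    def __init__(self, decide: Callable[[History], Optional[str]]):
        self._decide = decide

    def prepare(self, topic: str, history: History) -> Optional[PreparedContext]:
        skill = self._decide(history)
        if skill is None:  # decision: stop exploring
            return None
        findings = "; ".join(result for _, result in history) or "none"
        return PreparedContext(skill, f"Topic: {topic}. Findings so far: {findings}")


class ExecuteAgent:
    """General-purpose executor that depends entirely on the prepared context."""

    def __init__(self, skills: dict[str, Skill]):
        self._skills = skills

    def run(self, ctx: PreparedContext) -> str:
        return self._skills[ctx.skill](ctx.prompt)


def run_research(topic: str, reasoning: ReasoningAgent,
                 executor: ExecuteAgent, max_steps: int = 10) -> History:
    history: History = []
    for _ in range(max_steps):  # hard cap so a bad policy cannot loop forever
        ctx = reasoning.prepare(topic, history)
        if ctx is None:
            break
        history.append((ctx.skill, executor.run(ctx)))
    return history


def toy_policy(history: History) -> Optional[str]:
    """Toy stand-in for judgment: observe, interview on contradiction, report, stop."""
    if not history:
        return "scoutTaskChat"
    last_skill, last_result = history[-1]
    if last_skill == "reportGen":
        return None
    if "contradiction" in last_result:
        return "interviewChat"
    return "reportGen"
```

Because the policy is passed in as a function, the loop stays the same whether the decision comes from a rule table or from a model call.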

Dual-Agent Architecture Comparison

Multi-Agent Architecture        Dual-Agent Architecture
(Traditional)                   (GEA / atypica)
┌───────────┐                   ┌───────────────────┐
│Scout Agent│                   │ Reasoning Agent   │
└─────┬─────┘                   │ (Inference/Plan)  │
      │                         └────────┬──────────┘
┌─────▼──────┐                           │ Prepare Context
│Interview   │                           │ Instruct Exec
│Agent       │                  ┌────────▼──────────┐
└─────┬──────┘                  │ Execute Agent     │
      │                         │   (Universal)     │
┌─────▼──────┐                  └────────┬──────────┘
│Report Agent│                           │
└─────┬──────┘                  ┌────────▼──────────┐
      │                         │   Skills Lib      │
┌─────▼──────┐                  │  • scoutTask      │
│Strategy    │                  │  • interview      │
│Agent       │                  │  • reportGen      │
└────────────┘                  │  • strategy       │
                                └───────────────────┘
Issues:                         Benefits:
• Fragmented context            • Unified context mgmt
• High coordination cost        • Clear reasoning path
• Hard to reuse                 • Composable skills

4. Execute Agent + Skills

Execute Agent:

A sufficiently general-purpose executor
Entirely dependent on context prepared by the Reasoning Agent
Dynamically loads Skills

Skills (atypica's specific Skills):

scoutTaskChat: Social media observation
interviewChat: User interviews
buildPersona: User persona generation
reportGen: Report generation

Skills Progressive Disclosure

Load time (metadata only):
┌─────────────────────────────────────┐
│ Skills Library (1000+ skills)       │
│                                     │
│ scout.md          - Social observe  │
│ interview.md      - Structured Q&A  │
│ reportGen.md      - Report builder  │
│ ...                                 │
└─────────────────────────────────────┘
         ↓ Minimal token usage
Runtime (load on demand):
┌─────────────────────────────────────┐
│ Reasoning: "Need social observation"│
└──────────────┬──────────────────────┘
               ↓ Load scout.md
┌──────────────▼──────────────────────┐
│ ## Scout Skill                      │
│                                     │
│ Observe social media behavior...    │
│ - Collect 5 samples                 │
│ - Identify patterns                 │
│ - Trigger reasoningThinking         │
│                                     │
│ [scripts/scout.py]                  │
└─────────────────────────────────────┘
After completion:
Context reorganized, skill unloaded

Full content is loaded only when needed, avoiding context bloat.
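
A minimal sketch of this load-on-demand behavior, assuming a hypothetical `SkillRegistry` whose `loader` callback fetches a skill's full text by name (for example, by reading a file). Only the one-line descriptions are held up front.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SkillRegistry:
    """Keeps short descriptions in memory; fetches full skill text only when asked."""

    loader: Callable[[str], str]  # hypothetical: returns the full content for a skill name
    metadata: dict[str, str] = field(default_factory=dict)
    _loaded: dict[str, str] = field(default_factory=dict, init=False)

    def register(self, name: str, description: str) -> None:
        self.metadata[name] = description

    def catalog(self) -> str:
        """Cheap listing (names and descriptions only) to show the Reasoning Agent."""
        return "\n".join(f"{name}: {desc}" for name, desc in self.metadata.items())

    def load(self, name: str) -> str:
        """Fetch full content on first use; later calls reuse the loaded copy."""
        if name not in self.metadata:
            raise KeyError(f"unknown skill: {name}")
        if name not in self._loaded:
            self._loaded[name] = self.loader(name)
        return self._loaded[name]

    def unload(self, name: str) -> None:
        """Release the full content once the task is done."""
        self._loaded.pop(name, None)
```

The token saving comes from `catalog` staying small however many skills are registered, while `load` brings in one skill's full instructions at a time.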

On the Origins of the Skills Concept

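The idea of "Universal Agent + Skills Library" comes from Anthropic's thinking in 2025: rather than building multiple specialized agents, use a single universal agent paired with composable Skills. We align with this direction.

In atypica's practice, we combine this with the dual-agent architecture and apply Skills specifically to consumer research scenarios.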

Relationship with Other Architectures

GEA doesn't replace RAG or Multi-Agent — it's a practical approach for specific scenarios.

Relationship with RAG

The Context System leverages RAG's retrieval capabilities.

But it adds continuous curation and asset management — not just retrieval, but also filtering noise, establishing associations, and timely updates.

Relationship with Multi-Agent

Like Multi-Agent systems, GEA has multiple capability units (Skills).

But the dual-agent approach separates reasoning from execution, with Skills dynamically loaded as context — not fixed multiple agents, but composable capability modules.

Open Questions

There are still questions we're exploring:

1. How Far Can Intent Construction Be Automated?

Some human confirmation is still needed. Can it become fully automatic in the future?

2. Context Curation Strategies?

When to keep? When to discard? How to balance quality and quantity?

3. Skills Granularity?

Too fine-grained means high management cost; too coarse-grained means not flexible enough.

4. Can This Architecture Transfer to Other Domains?

We've only validated it in consumer research. Other judgment-heavy work may require adjustments.

GEA's Applicability Boundaries

GEA is not a general-purpose architecture — it's a domain-native architecture for specific scenarios.

Suited for Exploratory Knowledge Work

Market research and user insights
Product definition and strategic planning
Content strategy and creative exploration
Technical solution evaluation and decision-making

Characteristics: Vague starting points, uncertain processes, judgment at the core

Not Suited for Deterministic Tasks

Repetitive process automation
Heavily constrained approval workflows
Real-time operational requirements
Tasks with well-defined SOPs

Characteristics: Fixed processes, deterministic requirements, execution-focused

GEA is an architecture designed for "work that can't be written as an SOP." If your work can be described as a clear process, a traditional workflow engine is probably a better fit.