Build an Agentic AEO Content Pipeline with Mastra, Sanity CMS, and Superlines
A complete guide to building an AI-powered content intelligence and optimization agent that monitors your AI search visibility, audits content, fact-checks claims, and creates optimized articles automatically.
Status: Work in Progress — This guide covers local execution. Automated deployment (GitHub Actions) is coming soon.
Clone the repo to get started →
git clone https://github.com/Superlines/aeo-agent.git
cd aeo-agent
npm install
cp .env.example .env # Add your API keys
npm start
Or follow the step-by-step guide below to build it from scratch and understand every piece.
What you will build
This guide walks you through building a fully autonomous content intelligence agent that runs a 7-phase daily pipeline:
- Intelligence Gathering — Pull AI search visibility metrics, citation rates, and competitive gaps from Superlines
- Competitive Deep Dive — Identify top competitor URLs winning AI citations, then scrape and analyze their content
- Content Health Audit — Inventory your published articles and flag content that needs freshness updates
- Fact-Check — Extract pricing claims, statistics, and feature counts from articles, then verify them against live sources
- Industry Insights — Research trending topics in AI search, GEO, and AEO across the web and Reddit
- Data Storytelling — Mine your own analytics data for compelling insights to embed in content
- Content Actions — Create new articles, update outdated content, and fix incorrect facts in your CMS
The agent uses three specialized sub-agents (Analyst, Researcher, Content Manager) that each carry only the tools they need, avoiding context window overflow.
How it works
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   Analyst    │  │  Researcher  │  │ Content Mgr  │
│ (Superlines  │  │ (Bright Data │  │ (Sanity CMS  │
│  MCP tools)  │  │  MCP tools)  │  │  + webpage   │
└──────┬───────┘  └──────┬───────┘  │ audit tools) │
       │                 │          └──────┬───────┘
       └────────┬────────┘                 │
                │                          │
        ┌───────▼───────┐                  │
        │    7-Phase    │◄─────────────────┘
        │   Pipeline    │
        └───────┬───────┘
                │
        ┌───────▼───────┐
        │   Report +    │
        │  CMS Updates  │
        └───────────────┘
Prerequisites
Before you start, you will need accounts and API keys for the following services:
| Service | Purpose | Sign up |
|---|---|---|
| Superlines (Starter plan or above) | AI search analytics via MCP | superlines.io |
| Anthropic | Claude model for agent reasoning | console.anthropic.com |
| OpenAI | Embeddings model (text-embedding-3-small) | platform.openai.com |
| Bright Data | Web scraping and search via MCP | brightdata.com |
| Sanity CMS | Content management (read + write) | sanity.io |
You also need Node.js 22.13+ installed. The agent uses ES modules and modern JavaScript features.
Step 1: Project setup
Create a new directory for your agent and initialize the project:
mkdir aeo-agent && cd aeo-agent
npm init -y
Update your package.json:
{
"name": "aeo-agent",
"version": "1.0.0",
"description": "Mastra-powered AEO content pipeline using Superlines MCP, Bright Data MCP, and Sanity CMS",
"type": "module",
"scripts": {
"start": "tsx src/index.ts",
"dev": "tsx watch src/index.ts",
"pipeline": "tsx src/index.ts --run",
"build": "tsc --noEmit",
"typecheck": "tsc --noEmit"
},
"engines": {
"node": ">=22.13.0"
}
}
Install dependencies:
npm install @mastra/core @mastra/mcp @mastra/memory @mastra/libsql @sanity/client zod dotenv
npm install -D @types/node tsx typescript
Here is what each package does:
- @mastra/core — Agent framework with tool calling, memory, and multi-step reasoning
- @mastra/mcp — MCP client for connecting to Superlines and Bright Data
- @mastra/memory — Persistent memory with message history and working memory
- @mastra/libsql — LibSQL storage backend for memory (local SQLite or Turso)
- @sanity/client — Sanity CMS API client for reading and writing content
- zod — Schema validation for tool inputs
- dotenv — Load environment variables from a .env file
Create a tsconfig.json:
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"lib": ["ES2022"],
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"declaration": true,
"declarationMap": true,
"sourceMap": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist", "data"]
}
Create the directory structure:
mkdir -p src/mastra/tools src/utils data/reports
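By the end of the guide, the project will be laid out like this:
aeo-agent/
├── src/
│   ├── index.ts              # Entry point (Step 9)
│   ├── pipeline.ts           # 7-phase orchestrator (Step 7)
│   ├── report.ts             # Report generator (Step 8)
│   ├── mastra/
│   │   ├── index.ts          # Mastra initialization (Step 9)
│   │   ├── mcp.ts            # MCP client configuration (Step 3)
│   │   ├── memory.ts         # Memory configuration (Step 4)
│   │   ├── agents.ts         # Agent definitions (Step 6)
│   │   └── tools/
│   │       └── sanity.ts     # Sanity tool wrappers (Step 5)
│   └── utils/
│       └── sanity-client.ts  # Sanity API client (Step 5)
└── data/
    └── reports/              # Daily markdown reports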
Step 2: Environment variables
Create a .env file in your project root. This file holds all API keys and configuration:
# ── LLM Providers ────────────────────────────────────────
ANTHROPIC_API_KEY=sk-ant-... # Claude for agent reasoning
OPENAI_API_KEY=sk-... # Embeddings (text-embedding-3-small)
# ── MCP Servers ──────────────────────────────────────────
SUPERLINES_API_KEY=sl_live_... # Superlines API key (Starter plan+)
BRIGHTDATA_API_TOKEN=... # Bright Data API token
# ── Sanity CMS ───────────────────────────────────────────
SANITY_PROJECT_ID=your-project-id
SANITY_DATASET=production
SANITY_WRITE_TOKEN=sk... # API token with write permissions
SANITY_DEFAULT_AUTHOR_ID= # Default author Sanity _id
SANITY_DEFAULT_CATEGORY_ID= # Default category Sanity _id
# ── Pipeline Configuration ───────────────────────────────
SUPERLINES_BRAND_NAME=YourBrand # Your brand name in Superlines
SUPERLINES_DOMAIN_ID= # Your domain ID from Superlines
SITE_DOMAIN=yourdomain.com # Your website domain
AGENT_MODEL=anthropic/claude-sonnet-4-20250514 # Or anthropic/claude-opus-4-20250514 for higher quality
AGENT_MAX_STEPS=25 # Max tool calls per agent invocation
# ── Memory ───────────────────────────────────────────────
MEMORY_DB_URL=file:./data/memory.db # Local SQLite (or libsql:// for Turso)
# ── Optional: Deployment ─────────────────────────────────
# VERCEL_DEPLOY_HOOK=https://api.vercel.com/v1/integrations/deploy/...
# SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...
Where to find each key
Superlines API key: Go to Organization Settings in Superlines, then API Keys. Generate a key with read permissions. The key starts with sl_live_.
Superlines Domain ID: After setting up your brand in Superlines, call list_brands through the MCP server. The domainId field is what you need.
Bright Data API token: Sign up at brightdata.com, navigate to your account settings, and generate an API token. The free tier includes 5,000 requests/month, which is plenty for daily pipeline runs.
Sanity credentials: In your Sanity project, go to Manage > API > Tokens and create a token with Editor permissions. The project ID is visible in the project URL.
Step 3: MCP client configuration
The agent connects to two MCP (Model Context Protocol) servers:
- Superlines MCP — 32 analytics tools for AI search visibility data, competitive analysis, and webpage auditing
- Bright Data MCP — Web scraping and search tools for competitor research and fact-checking
Each MCP server exposes tools that AI agents can call. By connecting through MCP, the agent gets structured access to real-time data without custom API integrations.
Create src/mastra/mcp.ts:
import { MCPClient } from "@mastra/mcp";
if (!process.env.SUPERLINES_API_KEY) {
throw new Error("SUPERLINES_API_KEY is required");
}
if (!process.env.BRIGHTDATA_API_TOKEN) {
throw new Error("BRIGHTDATA_API_TOKEN is required");
}
const MCP_CONNECT_TIMEOUT = 60_000;
const MCP_REQUEST_TIMEOUT = 120_000;
const MAX_RETRIES = 3;
const RETRY_BASE_DELAY = 5_000;
// Superlines MCP — connects via SSE
function createSuperlinesMcp() {
return new MCPClient({
servers: {
superlines: {
url: new URL(
`https://mcpsse.superlines.io?token=${process.env.SUPERLINES_API_KEY}`
),
timeout: MCP_REQUEST_TIMEOUT,
connectTimeout: MCP_CONNECT_TIMEOUT,
},
},
timeout: MCP_REQUEST_TIMEOUT,
});
}
// Bright Data MCP — connects via SSE
function createBrightdataMcp() {
return new MCPClient({
servers: {
brightdata: {
url: new URL(
`https://mcp.brightdata.com/sse?token=${process.env.BRIGHTDATA_API_TOKEN}&groups=advanced_scraping,social,browser,finance,research`
),
timeout: MCP_REQUEST_TIMEOUT,
connectTimeout: MCP_CONNECT_TIMEOUT,
},
},
timeout: MCP_REQUEST_TIMEOUT,
});
}
let superlinesMcp: MCPClient;
let brightdataMcp: MCPClient;
// Retry wrapper with exponential backoff.
// Creates a fresh MCP client on each retry to avoid stale transport state.
async function withRetry<T>(
label: string,
fn: () => Promise<T>,
recreate?: () => void
): Promise<T> {
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
try {
return await fn();
} catch (error) {
const isLastAttempt = attempt === MAX_RETRIES;
const delay = RETRY_BASE_DELAY * Math.pow(2, attempt - 1);
const errMsg = error instanceof Error ? error.message : String(error);
if (isLastAttempt) {
console.error(` [${label}] Failed after ${MAX_RETRIES} attempts: ${errMsg}`);
throw error;
}
console.warn(` [${label}] Attempt ${attempt}/${MAX_RETRIES} failed: ${errMsg}`);
console.warn(` [${label}] Retrying in ${delay / 1000}s...`);
if (recreate) recreate();
await new Promise((resolve) => setTimeout(resolve, delay));
}
}
throw new Error("Unreachable");
}
// Webpage-only tool prefixes for the Content Manager agent.
// Prevents loading 25+ analytics tool schemas that would bloat the prompt.
const WEBPAGE_TOOL_PREFIXES = [
"webpage_audit",
"webpage_analyze",
"webpage_crawl",
"schema_optimizer",
];
// Fetch all Superlines MCP tools, then split into two sets:
// - allTools: full analytics suite (for the Analyst agent)
// - webpageTools: webpage audit/analysis only (for the Content Manager)
export async function getSuperlinesMCPTools() {
superlinesMcp = createSuperlinesMcp();
const allTools = await withRetry(
"Superlines",
() => superlinesMcp.listTools(),
() => {
superlinesMcp.disconnect().catch(() => {});
superlinesMcp = createSuperlinesMcp();
}
);
const webpageTools = Object.fromEntries(
Object.entries(allTools).filter(([name]) =>
WEBPAGE_TOOL_PREFIXES.some((prefix) => name.startsWith(prefix))
)
);
console.log(
` Loaded ${Object.keys(allTools).length} Superlines tools (${Object.keys(webpageTools).length} webpage-only)`
);
return { allTools, webpageTools };
}
// Get tools from the Bright Data MCP server
export async function getBrightDataMCPTools() {
brightdataMcp = createBrightdataMcp();
const tools = await withRetry(
"BrightData",
() => brightdataMcp.listTools(),
() => {
brightdataMcp.disconnect().catch(() => {});
brightdataMcp = createBrightdataMcp();
}
);
console.log(` Loaded ${Object.keys(tools).length} Bright Data tools`);
return tools;
}
// Disconnect all MCP servers
export async function disconnectMCP() {
await Promise.allSettled([
superlinesMcp?.disconnect(),
brightdataMcp?.disconnect(),
]);
}
Why split tools across agents?
When you load all MCP tools from both servers into a single agent, the combined tool schemas can push the prompt past Claude’s 200k token context limit. By splitting tools across specialized agents, each agent only carries what it needs:
- Analyst gets the full Superlines analytics suite (all 32 tools)
- Researcher gets the Bright Data scraping tools (8 tools)
- Content Manager gets the 6 Sanity CMS tools plus only the 4 webpage audit tools from Superlines (10 tools)
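If you want to verify the budget yourself, a rough heuristic like the one below can approximate how many prompt tokens a tool set costs before you wire it into an agent. This is a sketch: it assumes roughly 4 characters per token and uses the serialized schemas as a size proxy, so treat the numbers as estimates, not exact prompt accounting.
import { getSuperlinesMCPTools } from "./src/mastra/mcp.js";

// Heuristic only: serialize everything except functions and assume
// ~4 characters per token.
function estimateToolPromptTokens(tools: Record<string, unknown>): number {
  const serialized = JSON.stringify(tools, (_key, value) =>
    typeof value === "function" ? undefined : value
  );
  return Math.ceil((serialized?.length ?? 0) / 4);
}

const { allTools, webpageTools } = await getSuperlinesMCPTools();
console.log(`Full Superlines set: ~${estimateToolPromptTokens(allTools)} tokens`);
console.log(`Webpage-only subset: ~${estimateToolPromptTokens(webpageTools)} tokens`);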
Step 4: Memory configuration
The agent uses persistent memory to maintain context across pipeline phases and across daily runs. This is what lets the agent remember yesterday’s performance snapshot and track trends over time.
Create src/mastra/memory.ts:
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";
const dbUrl = process.env.MEMORY_DB_URL || "file:./data/memory.db";
export const memory = new Memory({
storage: new LibSQLStore({
id: "aeo-pipeline-storage",
url: dbUrl,
}),
options: {
// Keep recent messages within the current phase thread.
// Each phase uses its own thread, so 5 is enough for multi-step tool use.
lastMessages: 5,
// Semantic recall is disabled. Previous runs contain huge raw API
// payloads (100k+ tokens) that blow past the context limit when injected.
// Working memory provides a compact cross-run context instead.
semanticRecall: false,
// Working memory: persistent structured context that carries over between runs
workingMemory: {
enabled: true,
template: `
# AEO Pipeline Context
## Brand Info
- Brand Name: ${process.env.SUPERLINES_BRAND_NAME || ""}
- Domain: ${process.env.SITE_DOMAIN || ""}
- Domain ID: ${process.env.SUPERLINES_DOMAIN_ID || ""}
## Current Performance Snapshot
- Brand Visibility:
- Citation Rate:
- Share of Voice:
- Top Performing Prompts:
## Competitive Landscape
- Top Competitors:
- Competitor Strengths:
- Our Weaknesses:
## Content Inventory
- Total Published Articles:
- Articles Needing Updates:
- Recent Articles Created:
## Action Items Queue
- High Priority:
- Medium Priority:
- Low Priority:
`,
},
},
});
Memory architecture decisions
Why LibSQL instead of PostgreSQL? For a pipeline that runs once per day, SQLite (via LibSQL) is simpler to set up and carries no infrastructure overhead. If you need remote persistence, swap in a Turso database URL.
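The swap is confined to the storage config. A sketch, assuming LibSQLStore forwards an authToken to the underlying libsql client (MEMORY_DB_TOKEN is a variable name introduced here, not part of Step 2):
import { Memory } from "@mastra/memory";
import { LibSQLStore } from "@mastra/libsql";

// Same memory setup as above, but persisted in Turso instead of a local file.
// authToken is needed for libsql:// URLs and unused for file: URLs.
export const memory = new Memory({
  storage: new LibSQLStore({
    id: "aeo-pipeline-storage",
    url: process.env.MEMORY_DB_URL!,        // e.g. libsql://aeo-memory-yourorg.turso.io
    authToken: process.env.MEMORY_DB_TOKEN, // introduced for this sketch
  }),
  options: { /* same options as above */ },
});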
Why disable semantic recall? In practice, MCP tool responses contain large JSON payloads. When semantic recall re-injects these into future conversations, they can push past Claude’s context limit. Working memory provides a structured alternative that is always compact.
Why thread-per-phase? Each pipeline phase writes to its own memory thread (e.g., intelligence-daily-2026-02-17, competitive-daily-2026-02-17). This prevents tool results from Phase 1 from crowding out Phase 3’s context window.
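The pipeline in Step 7 implements this by passing a per-phase thread in the generate options; the shape, taken from that code, looks like this:
const runId = `daily-${new Date().toISOString().split("T")[0]}`;

// Each phase pins its own thread; the shared resource id groups all
// threads under the same pipeline identity for working memory.
const phaseOptions = {
  memory: { thread: `intelligence-${runId}`, resource: "aeo-pipeline" },
  maxSteps: 25,
};
// ...later: await analyst.generate(prompt, phaseOptions);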
Step 5: Sanity CMS tools
While Superlines and Bright Data tools come from MCP servers, the Sanity CMS tools are custom Mastra tools that you define locally. This gives you full control over how content is read and written.
Create src/utils/sanity-client.ts first:
import { createClient } from "@sanity/client";
export const sanityClient = createClient({
projectId: process.env.SANITY_PROJECT_ID!,
dataset: process.env.SANITY_DATASET || "production",
apiVersion: "2024-01-01",
token: process.env.SANITY_WRITE_TOKEN,
useCdn: false,
});
// Fetch all published articles (metadata only)
export async function getAllArticles() {
return sanityClient.fetch(`
*[_type == "article" && !(_id in path("drafts.**"))] | order(publishDate desc) {
_id, title, "slug": slug.current, summary,
publishDate, updatedDate, tags, readTime,
"category": category->name,
"author": author->name
}
`);
}
// Fetch a single article by slug (full content)
export async function getArticleBySlug(slug: string) {
return sanityClient.fetch(
`*[_type == "article" && slug.current == $slug][0] {
_id, title, "slug": slug.current, summary, content,
publishDate, updatedDate, tags, readTime, quoteLine,
keyTakeaways, summaryList,
"category": category->{ _id, name },
"author": author->{ _id, name },
"faqs": faqs[]->{ _id, question, answer }
}`,
{ slug }
);
}
// List categories and authors (for creating articles)
export async function getCategories() {
return sanityClient.fetch(`*[_type == "category"] { _id, name, "slug": slug.current }`);
}
export async function getAuthors() {
return sanityClient.fetch(`*[_type == "author"] { _id, name, "slug": slug.current }`);
}
// Create a new article
export async function createArticle(fields: Record<string, any>) {
const doc = {
_type: "article",
title: fields.title,
slug: { _type: "slug", current: fields.slug },
thumbnailTitle: fields.thumbnailTitle,
summary: fields.summary,
content: fields.content,
category: { _type: "reference", _ref: fields.categoryId },
author: { _type: "reference", _ref: fields.authorId },
tags: fields.tags || [],
readTime: fields.readTime,
keyTakeaways: fields.keyTakeaways,
summaryList: fields.summaryList,
quoteLine: fields.quoteLine,
publishDate: new Date().toISOString(),
updatedDate: new Date().toISOString(),
faqs: fields.faqRefs || [],
};
// Create as draft for human review
const id = fields.draft ? `drafts.article-${fields.slug}` : undefined;
return sanityClient.create(id ? { ...doc, _id: id } : doc);
}
// Update an existing article
export async function updateArticle(articleId: string, fields: Record<string, any>) {
return sanityClient
.patch(articleId)
.set({ ...fields, updatedDate: new Date().toISOString() })
.commit();
}
// Create a FAQ document and return its reference
export async function createFAQ(question: string, answer: string) {
return sanityClient.create({
_type: "faq",
question,
answer,
});
}
// Create multiple FAQs and return reference array
export async function createFAQsAndGetRefs(
faqs: { question: string; answer: string }[]
) {
const refs = [];
for (const faq of faqs) {
const doc = await createFAQ(faq.question, faq.answer);
refs.push({
_type: "reference" as const,
_ref: doc._id,
_key: doc._id.replace(/[^a-zA-Z0-9]/g, "").slice(0, 12),
});
}
return refs;
}
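Before wrapping these functions in Mastra tools, you can smoke-test the client with a one-off script (the scripts/ path and filename are illustrative):
// scripts/sanity-check.ts (run with: npx tsx scripts/sanity-check.ts)
import "dotenv/config";
import { getAllArticles, getCategories, getAuthors } from "../src/utils/sanity-client.js";

const articles = await getAllArticles();
const categories = await getCategories();
const authors = await getAuthors();
console.log(`Articles: ${articles.length} | Categories: ${categories.length} | Authors: ${authors.length}`);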
Now create the Mastra tool wrappers in src/mastra/tools/sanity.ts:
import { createTool } from "@mastra/core/tools";
import { z } from "zod";
import {
getAllArticles,
getArticleBySlug,
getCategories,
getAuthors,
createArticle,
updateArticle,
createFAQsAndGetRefs,
} from "../../utils/sanity-client.js";
// ── Read Tools ──────────────────────────────────────────
export const listArticlesTool = createTool({
id: "sanity_list_articles",
description:
"List all published articles from Sanity CMS. Returns title, slug, summary, publishDate, tags, and author.",
inputSchema: z.object({}),
execute: async () => {
const articles = await getAllArticles();
return { articles, count: articles.length };
},
});
export const getArticleTool = createTool({
id: "sanity_get_article",
description:
"Get a specific article by slug. Returns full content, FAQs, key takeaways, and summary.",
inputSchema: z.object({
slug: z.string().describe("URL slug of the article"),
}),
execute: async (input) => {
const article = await getArticleBySlug(input.slug);
if (!article) return { error: `No article found: ${input.slug}` };
return { article };
},
});
export const listCategoriesTool = createTool({
id: "sanity_list_categories",
description: "List all article categories. Returns _id, name, slug.",
inputSchema: z.object({}),
execute: async () => ({ categories: await getCategories() }),
});
export const listAuthorsTool = createTool({
id: "sanity_list_authors",
description: "List all authors. Returns _id, name, slug.",
inputSchema: z.object({}),
execute: async () => ({ authors: await getAuthors() }),
});
// ── Write Tools ─────────────────────────────────────────
export const createArticleTool = createTool({
id: "sanity_create_article",
description: `Create a new article in Sanity CMS as a DRAFT.
Use markdown for markdownContent (converted to Portable Text).
Use plain text for title, summary, quoteLine.
Include faqs as array of {question, answer} objects.`,
inputSchema: z.object({
title: z.string().max(110),
slug: z.string(),
thumbnailTitle: z.string(),
summary: z.string().max(150),
markdownContent: z.string(),
tags: z.array(z.string()).optional(),
readTime: z.string().optional(),
keyTakeawaysMarkdown: z.string().optional(),
summaryListMarkdown: z.string().optional(),
quoteLine: z.string().optional(),
faqs: z.array(z.object({
question: z.string(),
answer: z.string(),
})).optional(),
categoryId: z.string().optional(),
authorId: z.string().optional(),
}),
execute: async (input) => {
// You would add markdown-to-portable-text conversion here
// See the full implementation in the repository
const categoryId = input.categoryId || process.env.SANITY_DEFAULT_CATEGORY_ID;
const authorId = input.authorId || process.env.SANITY_DEFAULT_AUTHOR_ID;
if (!categoryId || !authorId) {
return { error: "Missing categoryId or authorId" };
}
let faqRefs;
if (input.faqs && input.faqs.length > 0) {
faqRefs = await createFAQsAndGetRefs(input.faqs);
}
const result = await createArticle({
...input,
categoryId,
authorId,
draft: true,
faqRefs,
});
return {
success: true,
articleId: result._id,
slug: input.slug,
faqCount: faqRefs?.length || 0,
};
},
});
export const updateArticleTool = createTool({
id: "sanity_update_article",
description: "Update an existing article. Only specified fields are changed.",
inputSchema: z.object({
articleId: z.string(),
title: z.string().optional(),
summary: z.string().max(150).optional(),
markdownContent: z.string().optional(),
tags: z.array(z.string()).optional(),
}),
execute: async (input) => {
const fields: Record<string, any> = {};
if (input.title) fields.title = input.title;
if (input.summary) fields.summary = input.summary;
if (input.tags) fields.tags = input.tags;
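    // Note: markdownContent is accepted by the schema but needs the
    // markdown-to-Portable-Text conversion (see the full implementation
    // in the repository) before it can be written to `content`.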
if (Object.keys(fields).length === 0) {
return { error: "No fields to update" };
}
const result = await updateArticle(input.articleId, fields);
return { success: true, articleId: result._id, updatedFields: Object.keys(fields) };
},
});
// ── Export all tools ────────────────────────────────────
export const sanityTools = {
sanity_list_articles: listArticlesTool,
sanity_get_article: getArticleTool,
sanity_list_categories: listCategoriesTool,
sanity_list_authors: listAuthorsTool,
sanity_create_article: createArticleTool,
sanity_update_article: updateArticleTool,
};
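As a quick check, you can also call a wrapper directly from a one-off script, using the same execute signature the tools above are written with (illustrative; not part of the pipeline):
import "dotenv/config";
import { listArticlesTool } from "./src/mastra/tools/sanity.js";

// Invoke the tool the way an agent would, but without an LLM in the loop.
const result = await listArticlesTool.execute({});
console.log(result);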
Step 6: Agent definitions
Now we define the three specialized agents. Each agent gets a focused system prompt and only the tools it needs.
Create src/mastra/agents.ts:
import { Agent } from "@mastra/core/agent";
import { memory } from "./memory.js";
import { getSuperlinesMCPTools, getBrightDataMCPTools } from "./mcp.js";
import { sanityTools } from "./tools/sanity.js";
const model = process.env.AGENT_MODEL || "anthropic/claude-sonnet-4-20250514";
const brandName = process.env.SUPERLINES_BRAND_NAME || "YourBrand";
const domainId = process.env.SUPERLINES_DOMAIN_ID || "";
const siteDomain = process.env.SITE_DOMAIN || "yourdomain.com";
export type Agents = {
analyst: Agent;
researcher: Agent;
contentManager: Agent;
};
export async function createAgents(): Promise<Agents> {
// Connect MCP servers sequentially to avoid timeout issues
const brightdataTools = await getBrightDataMCPTools();
const { allTools: superlineTools, webpageTools } = await getSuperlinesMCPTools();
const today = new Date().toISOString().split("T")[0];
// Analyst Agent — Superlines analytics tools only
const analyst = new Agent({
id: "aeo-analyst",
name: "AEO Analyst",
instructions: `You are an AI Search Analytics analyst for ${brandName} (${siteDomain}).
CONTEXT:
- Today: ${today}
- Brand: "${brandName}" | Domain ID: ${domainId}
- ALWAYS pass brands=["${brandName}"] and domainId="${domainId}" in tool calls
YOUR ROLE: Analyze AI search visibility data to provide actionable insights
for improving brand visibility, citation rate, and share of voice across
AI platforms (ChatGPT, Gemini, Perplexity, Claude, Copilot, Grok).
OUTPUT: Structured findings with specific metrics, competitor names,
and recommended actions with priority levels.`,
model,
tools: superlineTools,
memory,
});
// Researcher Agent — Bright Data scraping tools only
const researcher = new Agent({
id: "aeo-researcher",
name: "AEO Researcher",
instructions: `You are a web researcher for ${brandName} (${siteDomain}).
YOUR ROLE: Research competitor content, industry sources, and verify facts.
CAPABILITIES:
- Search the web for topics
- Scrape specific URLs to analyze content
- Batch scrape multiple URLs simultaneously
OUTPUT: Always cite sources with full URLs. Keep output concise.
Today: ${today}`,
model,
tools: brightdataTools,
memory,
});
// Content Manager Agent — Sanity tools + Superlines webpage tools
const contentManager = new Agent({
id: "aeo-content-manager",
name: "AEO Content Manager",
instructions: `You are a content manager for ${brandName} (${siteDomain}).
YOUR ROLE: Create, edit, and manage articles in Sanity CMS.
WRITING GUIDE:
- H1 = the search query itself, followed by a 2-3 sentence direct answer
- TL;DR section (3-5 bullet points) after intro
- H2/H3 as search queries users would ask
- Minimum 3 external stats per article with source URLs
- 5 key takeaways, 5 summary points, 5 FAQs (the 5-5-5 rule)
- Brand mention only in conclusion, neutral tone throughout
- No em dashes. No fluff. No AI hype words.
WORKFLOW: Create articles as DRAFTS for human review.
Today: ${today}`,
model,
tools: {
...sanityTools,
...webpageTools,
},
memory,
});
return { analyst, researcher, contentManager };
}
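Before wiring up the full pipeline, a one-off script like this (illustrative) can confirm MCP connectivity and model access with a single cheap agent call:
import "dotenv/config";
import { createAgents } from "./src/mastra/agents.js";
import { disconnectMCP } from "./src/mastra/mcp.js";

const agents = await createAgents();
try {
  // Keep maxSteps low so the smoke test stays fast and cheap.
  const res = await agents.analyst.generate(
    "List the Superlines tools you have access to, grouped by purpose.",
    { maxSteps: 3 }
  );
  console.log(res.text);
} finally {
  await disconnectMCP();
}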
Step 7: The pipeline
The pipeline orchestrates all 7 phases, passing context between them. Each phase writes to its own memory thread to keep context manageable.
Create src/pipeline.ts:
import type { Agents } from "./mastra/agents.js";
const brandName = process.env.SUPERLINES_BRAND_NAME || "YourBrand";
const domainId = process.env.SUPERLINES_DOMAIN_ID || "";
const siteDomain = process.env.SITE_DOMAIN || "yourdomain.com";
// Cap cross-phase context at ~12k tokens to leave room for
// system prompt, tool schemas, and new tool outputs
const MAX_CONTEXT_CHARS = 48_000;
function trimContext(text: string): string {
if (text.length <= MAX_CONTEXT_CHARS) return text;
return text.substring(0, MAX_CONTEXT_CHARS) +
"\n\n[... TRUNCATED — use the key findings above ...]";
}
type PhaseResult = {
phase: string;
status: "success" | "error";
summary: string;
duration: number;
};
function memoryOpts(threadId: string) {
return {
memory: { thread: threadId, resource: "aeo-pipeline" },
maxSteps: parseInt(process.env.AGENT_MAX_STEPS || "25", 10),
};
}
export async function runDailyPipeline(agents: Agents): Promise<PhaseResult[]> {
const today = new Date().toISOString().split("T")[0];
const runId = `daily-${today}`;
const results: PhaseResult[] = [];
console.log(`\n${"=".repeat(60)}`);
console.log(` AEO DAILY PIPELINE — ${today}`);
console.log(` Brand: ${brandName} | Domain: ${siteDomain}`);
console.log(`${"=".repeat(60)}\n`);
// ── Phase 1: Intelligence Gathering ──────────────────
results.push(
await runPhase("Phase 1: Intelligence Gathering", async () => {
const response = await agents.analyst.generate(
`Perform an AI search intelligence analysis for ${brandName}. Today is ${today}.
USE: brands=["${brandName}"], domainId="${domainId}".
DO:
1. analyze_metrics — Brand Visibility, Citation Rate, Share of Voice
2. get_weekly_performance — Last 4 weeks trends
3. get_competitive_gap — Where competitors are winning
4. find_content_opportunities — High-volume topics with low visibility
5. get_best_performing_prompt — Our strongest queries
6. get_period_comparison — Vs last period
Output: Structured report with metrics, trends, gaps, and opportunities.`,
memoryOpts(`intelligence-${runId}`)
);
return response.text;
})
);
// ── Phase 2: Competitive Deep Dive ───────────────────
results.push(
await runPhase("Phase 2: Competitive Deep Dive", async () => {
const citationAnalysis = await agents.analyst.generate(
`Analyze citation data for ${brandName}. Today is ${today}.
USE: brands=["${brandName}"], domainId="${domainId}".
DO:
1. get_citation_data (aggregateBy="domain") — Top competitor domains
2. get_citation_data (aggregateBy="url") — Specific winning URLs
3. get_top_cited_url_per_prompt — #1 URL per tracked query
4. get_competitor_insights — Full landscape
5. get_fanout_query_insights — What LLMs search for
Output: Top 5 competitor URLs, our competing URLs, and gap prompts.`,
memoryOpts(`competitive-${runId}`)
);
const competitorResearch = await agents.researcher.generate(
`Research these competitors:
${trimContext(citationAnalysis.text)}
DO:
1. Scrape top 3-5 competitor URLs
2. Analyze: structure, topics, stats cited, what makes them strong
3. Search for recent industry developments on these topics
Output: Actionable competitive intelligence.`,
memoryOpts(`research-${runId}`)
);
return `CITATIONS:\n${citationAnalysis.text}\n\nRESEARCH:\n${competitorResearch.text}`;
})
);
// ── Phase 3: Content Health Audit ────────────────────
results.push(
await runPhase("Phase 3: Content Health Audit", async () => {
const inventory = await agents.contentManager.generate(
`List and triage articles for ${brandName}. Today is ${today}.
DO:
1. sanity_list_articles — Get all published articles
2. Flag articles older than 6 months, outdated year references
Output: Total count, top 5 needing updates, single most critical slug.`,
memoryOpts(`health-inventory-${runId}`)
);
const audit = await agents.contentManager.generate(
`Deep-audit the most critical article:
${trimContext(inventory.text)}
DO:
1. Pick the most critical article slug
2. webpage_analyze_content on https://${siteDomain}/articles/[slug]
Output: Content health report with AEO score and top 3 improvements.`,
memoryOpts(`health-audit-${runId}`)
);
return `INVENTORY:\n${inventory.text}\n\nAUDIT:\n${audit.text}`;
})
);
// ── Phase 4: Fact Check ──────────────────────────────
results.push(
await runPhase("Phase 4: Fact Check", async () => {
const healthPhase = results.find(r => r.phase.includes("Content Health"));
const healthContext = healthPhase?.summary
? trimContext(healthPhase.summary)
: "No health audit data.";
const claims = await agents.contentManager.generate(
`Extract verifiable claims from articles. Today is ${today}.
HEALTH AUDIT: ${healthContext}
DO: Identify top 2 articles with potential outdated facts.
For each, extract pricing claims, statistics, feature counts.
Output: Structured claims with verification URLs.`,
memoryOpts(`factcheck-extract-${runId}`)
);
const verification = await agents.researcher.generate(
`Verify these claims by scraping sources. Today is ${today}.
CLAIMS: ${trimContext(claims.text)}
DO: Scrape pricing pages, verify stats, check feature counts.
Output: VERIFIED / OUTDATED (with new value) / UNVERIFIABLE per claim.`,
memoryOpts(`factcheck-verify-${runId}`)
);
return `CLAIMS:\n${claims.text}\n\nVERIFICATION:\n${verification.text}`;
})
);
// ── Phase 5: Industry Insights ───────────────────────
results.push(
await runPhase("Phase 5: Industry Insights", async () => {
const response = await agents.researcher.generate(
`Research AI Search, GEO, and AEO developments. Today is ${today}.
DO:
1. Search: "generative engine optimization trends ${new Date().getFullYear()}"
2. Search: "AI search analytics tools new"
3. Search: "AI citations optimization best practices"
4. Search: "reddit AI search optimization discussion"
Output: Trending topics, content ideas, data points, user questions.`,
memoryOpts(`industry-${runId}`)
);
return response.text;
})
);
// ── Phase 6: Data Storytelling ───────────────────────
results.push(
await runPhase("Phase 6: Data Storytelling", async () => {
const response = await agents.analyst.generate(
`Find data stories from ${brandName}'s analytics. Today is ${today}.
USE: brands=["${brandName}"], domainId="${domainId}".
DO:
1. analyze_metrics grouped by llm_service and topic
2. analyze_sentiment (overall and by platform)
3. analyze_brand_mentions grouped by brand
Look for: surprising patterns, platform differences, topics where we
outperform. Output data stories with exact numbers and article suggestions.`,
memoryOpts(`data-stories-${runId}`)
);
return response.text;
})
);
// ── Phase 7: Content Actions ─────────────────────────
results.push(
await runPhase("Phase 7: Content Actions", async () => {
const previousFindings = trimContext(
results.map(r => `### ${r.phase}\n${r.summary}`).join("\n\n---\n\n")
);
const response = await agents.contentManager.generate(
`Based on all findings, take content actions. Today is ${today}.
FINDINGS: ${previousFindings}
PRIORITIES:
1. Fix fact-check corrections (credibility)
2. Create new article for the biggest content gap
3. Update the most outdated article
All new articles as draft=true with 5 FAQs (5-5-5 rule).`,
{
memory: { thread: `actions-${runId}`, resource: "aeo-pipeline" },
maxSteps: 30,
}
);
return response.text;
})
);
// Print summary
console.log(`\n${"=".repeat(60)}`);
console.log(` PIPELINE COMPLETE — ${today}`);
console.log(`${"=".repeat(60)}\n`);
for (const result of results) {
const icon = result.status === "success" ? "[OK]" : "[ERR]";
console.log(` ${icon} ${result.phase} (${(result.duration / 1000).toFixed(1)}s)`);
}
return results;
}
// Phase runner with error handling and timing
async function runPhase(name: string, fn: () => Promise<string>): Promise<PhaseResult> {
console.log(`\n--- ${name} ---`);
const start = Date.now();
try {
const summary = await fn();
const duration = Date.now() - start;
console.log(` Completed in ${(duration / 1000).toFixed(1)}s`);
return { phase: name, status: "success", summary, duration };
} catch (error) {
const duration = Date.now() - start;
const message = error instanceof Error ? error.message : String(error);
console.error(` ERROR: ${message}`);
return { phase: name, status: "error", summary: `Error: ${message}`, duration };
}
}
Step 8: Report generator
After the pipeline runs, you want a structured report of what happened. This creates a markdown file, writes a GitHub Actions summary (when running in CI), and optionally sends a Slack notification.
Create src/report.ts:
import { writeFile, mkdir } from "node:fs/promises";
import { join } from "node:path";
type PhaseResult = {
phase: string;
status: "success" | "error";
summary: string;
duration: number;
};
export async function generateReport(
results: PhaseResult[],
totalDuration: number
): Promise<void> {
const today = new Date().toISOString().split("T")[0];
const markdown = buildReport(results, today, totalDuration);
// Save locally
const dir = join(process.cwd(), "data", "reports");
await mkdir(dir, { recursive: true });
await writeFile(join(dir, `${today}.md`), markdown, "utf-8");
console.log(` Report saved: data/reports/${today}.md`);
// GitHub Actions summary
const summaryFile = process.env.GITHUB_STEP_SUMMARY;
if (summaryFile) {
await writeFile(summaryFile, markdown, { flag: "a" });
}
// Slack notification (optional)
const webhookUrl = process.env.SLACK_WEBHOOK_URL;
if (webhookUrl) {
await sendSlack(webhookUrl, results, today, totalDuration);
}
}
function buildReport(results: PhaseResult[], date: string, totalDuration: number): string {
const succeeded = results.filter(r => r.status === "success").length;
const totalMin = (totalDuration / 1000 / 60).toFixed(1);
let md = `# AEO Pipeline Report — ${date}\n\n`;
md += `**${succeeded}/${results.length} phases succeeded** | Duration: ${totalMin} min\n\n---\n\n`;
for (const result of results) {
const icon = result.status === "success" ? "✅" : "❌";
const dur = (result.duration / 1000).toFixed(0);
md += `## ${icon} ${result.phase} (${dur}s)\n\n`;
const summary = result.summary.length > 2000
? result.summary.substring(0, 2000) + "\n\n*...truncated...*"
: result.summary;
md += `${summary}\n\n---\n\n`;
}
return md;
}
async function sendSlack(
webhookUrl: string,
results: PhaseResult[],
date: string,
totalDuration: number
) {
const succeeded = results.filter(r => r.status === "success").length;
const totalMin = (totalDuration / 1000 / 60).toFixed(1);
try {
await fetch(webhookUrl, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
text: `AEO Pipeline ${date}: ${succeeded}/${results.length} phases OK (${totalMin} min)`,
}),
});
console.log(" Slack notification sent");
} catch (error) {
console.error(" Slack failed:", error);
}
}
Step 9: Mastra initialization and entry point
Create src/mastra/index.ts:
import { createAgents, type Agents } from "./agents.js";
let _agents: Agents | null = null;
export async function initializeMastra(): Promise<Agents> {
if (_agents) return _agents;
console.log("Initializing Mastra AEO Pipeline...");
console.log(" Connecting to MCP servers (Superlines + Bright Data)...");
_agents = await createAgents();
console.log(" Agents created: analyst, researcher, contentManager");
console.log(" Mastra initialized.\n");
return _agents;
}
Create the entry point src/index.ts:
import "dotenv/config";
import { initializeMastra } from "./mastra/index.js";
import { runDailyPipeline } from "./pipeline.js";
import { disconnectMCP } from "./mastra/mcp.js";
import { generateReport } from "./report.js";
async function main() {
const startTime = Date.now();
try {
// Validate required environment variables
const required = [
"ANTHROPIC_API_KEY",
"OPENAI_API_KEY",
"SUPERLINES_API_KEY",
"BRIGHTDATA_API_TOKEN",
"SANITY_PROJECT_ID",
"SANITY_WRITE_TOKEN",
];
const missing = required.filter(key => !process.env[key]);
if (missing.length > 0) {
console.error(`Missing env vars:\n ${missing.join("\n ")}`);
console.error("\nCopy .env.example to .env and fill in the values.");
process.exit(1);
}
const agents = await initializeMastra();
const results = await runDailyPipeline(agents);
const totalDuration = Date.now() - startTime;
console.log("\n--- Generating Report ---");
await generateReport(results, totalDuration);
const errors = results.filter(r => r.status === "error");
if (errors.length > 0) {
console.log(`\nCompleted with ${errors.length} error(s):`);
errors.forEach(e => console.log(` - ${e.phase}: ${e.summary}`));
} else {
console.log("\nAll phases completed successfully.");
}
console.log(`Total runtime: ${((Date.now() - startTime) / 1000 / 60).toFixed(1)} min\n`);
} catch (error) {
console.error("Pipeline fatal error:", error);
process.exit(1);
} finally {
try { await disconnectMCP(); } catch {}
}
}
main();
Step 10: Run the pipeline
With everything in place, run your first pipeline:
npm start
You should see output like:
Initializing Mastra AEO Pipeline...
Connecting to MCP servers (Superlines + Bright Data)...
Loaded 8 Bright Data tools
Loaded 32 Superlines tools (4 webpage-only)
Agents created: analyst, researcher, contentManager
Mastra initialized.
============================================================
AEO DAILY PIPELINE — 2026-02-17
Brand: YourBrand | Domain: yourdomain.com
============================================================
--- Phase 1: Intelligence Gathering ---
Completed in 45.2s
--- Phase 2: Competitive Deep Dive ---
Completed in 78.3s
...
============================================================
PIPELINE COMPLETE — 2026-02-17
============================================================
[OK] Phase 1: Intelligence Gathering (45.2s)
[OK] Phase 2: Competitive Deep Dive (78.3s)
[OK] Phase 3: Content Health Audit (34.1s)
[OK] Phase 4: Fact Check (56.7s)
[OK] Phase 5: Industry Insights (23.4s)
[OK] Phase 6: Data Storytelling (31.2s)
[OK] Phase 7: Content Actions (89.5s)
Total runtime: 6.0 min
Check data/reports/ for the full markdown report.
Key design patterns
Context truncation
Cross-phase context is capped at 48,000 characters (~12k tokens). This prevents context overflow when Phase 7 tries to reference all previous findings:
const MAX_CONTEXT_CHARS = 48_000;
function trimContext(text: string): string {
if (text.length <= MAX_CONTEXT_CHARS) return text;
return text.substring(0, MAX_CONTEXT_CHARS) +
"\n\n[... TRUNCATED ...]";
}
Thread isolation
Each phase uses its own memory thread. This keeps MCP tool responses from earlier phases out of later conversations:
const runId = `daily-${today}`;
// Phase 1 writes to: intelligence-daily-2026-02-17
// Phase 2 writes to: competitive-daily-2026-02-17
// etc.
Tool splitting
Rather than giving every agent every tool, split by role. This prevents the combined tool schemas from exceeding Claude’s context limit:
const analyst = new Agent({ tools: superlineTools }); // 32 tools
const researcher = new Agent({ tools: brightdataTools }); // 8 tools
const contentManager = new Agent({ tools: { ...sanityTools, ...webpageTools } }); // 10 tools
Draft safety
All new articles are created with draft=true. The agent never publishes directly. A human reviews every article in Sanity Studio before publishing.
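To keep that review queue visible, a small helper script (a sketch using the same GROQ conventions as Step 5) can list every agent-created draft waiting in Sanity:
import "dotenv/config";
import { sanityClient } from "./src/utils/sanity-client.js";

// Draft documents live under the "drafts." ID prefix in Sanity.
const drafts = await sanityClient.fetch(
  `*[_type == "article" && _id in path("drafts.**")] {
    _id, title, "slug": slug.current
  }`
);
console.log(`Drafts pending review: ${drafts.length}`);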
Cost estimates
Running the pipeline daily with Claude as the reasoning model:
| Component | Cost |
|---|---|
| Claude Sonnet 4 | ~$0.50-2 per run |
| Claude Opus 4 | ~$1-4 per run |
| OpenAI Embeddings | ~$0.01 per run (if semantic recall enabled) |
| Superlines MCP | Included with Starter plan ($49/mo) |
| Bright Data MCP | Free tier (5,000 requests/mo) |
| Sanity CMS | Free tier (sufficient for most use cases) |
| Total (Sonnet) | ~$15-60/month for daily runs |
| Total (Opus) | ~$30-120/month for daily runs |
Coming soon: Deploy your agent
In the next guide, we will cover deploying this pipeline as an automated daily job using GitHub Actions with a cron schedule, including secrets management, error alerting, and cost monitoring.
Get the full source code
The complete, working implementation is available as an open-source repository:
github.com/Superlines/aeo-agent — Clone, configure your .env, and run npm start.
The repo includes additional utilities not covered in this guide, such as the full markdown-to-Portable-Text converter and the fact-check claim extraction engine. Contributions welcome.
What to read next
- Superlines MCP Server Setup Guide — Connect your MCP client in under 5 minutes
- Available MCP Tools Reference — Full documentation for all 32 Superlines tools
- Mastra Documentation — Agent framework reference
- Sanity CMS GROQ Query Language — Learn the query language used by the Sanity tools
- Bright Data MCP — Web scraping MCP server documentation