BEACON / V1.1 / A SIGHTBOX PRODUCT

Make your company
legible to AI agents.

Beacon turns your website into one canonical source of truth that both humans and AI agents can read clearly: beautiful pages for visitors, and clean machine-readable surfaces for Cursor, Claude, ChatGPT, and every agent that comes next.
BEACON / PRESENCE CARD
LIVE
Acme Robotics
industrial-automation · series-b · public
canonical
acme.com/llm
surfaces
human · markdown · mcp
products
Loop · Beacon · Pilot
last_sync
2 min ago
legibility 97 / 100
SAMPLE · ILLUSTRATIVE · presence-card.v1
READ BY AI AGENTS ACROSS THE STACK
ChatGPT
+ 14 more agents and crawlers tracked.
01 / SURFACES
Human + Machine
One layer, two outputs.
02 / STANDARDS
llms.txt · .md
Modern agent-native formats.
03 / DOMAIN
yourcompany.com
Native, no weird subdomains.
04 / ADVANTAGE
First mover
While the rest stay invisible.
01 · THE PROBLEM

Agents are reading your site before humans are.

AI coding agents and assistants (Cursor, Claude, ChatGPT, Perplexity) now discover, evaluate, and recommend companies long before a person ever opens a tab. They read your homepage, your docs, your changelog. They form an opinion in milliseconds.

Most websites fail at this layer. They were built for a Google-results world: hero animations, lazy-loaded JavaScript, blocked endpoints, fragmented documentation. When an agent visits, it hits dead ends.

And the cost is invisible. There's no analytics row for the conversation an agent never started, the recommendation it made to someone else, or the shortlist you were quietly left off. A misread today gets cached and repeated in tomorrow's answers.

Google itself has confirmed that generative AI search and agentic discovery are becoming a primary layer between users and the open web. The companies that adapt now will own that layer.

DIAGNOSTIC / TYPICAL SITE
discoverability: 75 / 100
content: 0 / 100
bot access control: 50 / 100
Our live scan reads the real signals AI agents rely on (discoverability, crawler access, and structured content) and makes the gap measurable. Most company sites score poorly.
82%
SITES BLOCK AGENTS
3.4s
TIME TO GIVE UP
0
SECOND CHANCES
02 · THE SOLUTION

One structured layer.
Two audiences.

Beacon is the structured presence layer underneath your site. You write content once, in one canonical place. Beacon generates two synchronized outputs: clean human pages and clean machine surfaces.

Native integration on your own domain: no weird subdomains, no third-party redirects. Modern agent-native standards: .md companion routes for every page, an llms.txt manifest at root, and structured JSON-LD baked in.

Because both surfaces are generated from the same source, they can't drift apart. Edit once; the human page and the machine surface update together. Your designed site keeps its design — Beacon is the structured ground floor it reads from, not a replacement for it.

The result: when an agent visits your site, it doesn't guess. It reads.

llms.txt .md routes structured-data native-domain canonical-source
CANONICAL SOURCE
beacon.source
company: Acme Robotics
category: industrial automation
products: [ Loop, Beacon, Pilot ]
pricing: { tiers: 3, public: true }
docs: /docs  // mirror at /docs.md
HUMAN SURFACE
Acme Robotics
Try Beacon Docs
AGENT SURFACE
# Acme Robotics
> industrial automation
## Products
- Loop · Beacon · Pilot
## Pricing
tiers: 3 · public
## Endpoints
/llms.txt · /docs.md
ONE EDIT · BOTH SURFACES STAY IN SYNC
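The one-source, two-surface idea can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not Beacon's implementation: the `SOURCE` dictionary simply mirrors the sample beacon.source fields above, and the function names `render_human` and `render_agent` are hypothetical.

```python
from html import escape

# Hypothetical canonical record, mirroring the sample beacon.source above.
SOURCE = {
    "company": "Acme Robotics",
    "category": "industrial automation",
    "products": ["Loop", "Beacon", "Pilot"],
    "pricing": {"tiers": 3, "public": True},
    "docs": "/docs",
}


def render_human(src: dict) -> str:
    """Render a bare-bones HTML fragment for human visitors."""
    items = "".join(f"<li>{escape(p)}</li>" for p in src["products"])
    return (
        f"<h1>{escape(src['company'])}</h1>"
        f"<p>{escape(src['category'])}</p>"
        f"<ul>{items}</ul>"
    )


def render_agent(src: dict) -> str:
    """Render the same record as a markdown surface for agents."""
    visibility = "public" if src["pricing"]["public"] else "private"
    lines = [
        f"# {src['company']}",
        f"> {src['category']}",
        "## Products",
        "- " + " · ".join(src["products"]),
        "## Pricing",
        f"tiers: {src['pricing']['tiers']} · {visibility}",
        "## Endpoints",
        f"/llms.txt · {src['docs']}.md",
    ]
    return "\n".join(lines)
```

Because both functions read the same record, changing a field in `SOURCE` changes both outputs on the next render, which is the property the sync claim relies on.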
03 · LIVE PROOF

See exactly what agents see.

This is sightbox.co/llm, the live, agent-readable surface of our own company, running on Beacon. Every page on Sightbox has a machine companion exactly like this one.

Open the real thing
sightbox.co/llm
/ LLM VIEW · MARKDOWN RENDER

Sightbox builds the structured presence layer for AI agents.

## What we do

We design and operate Beacon, a system that turns a company's website into one canonical source of truth, readable by humans and by AI agents (Cursor, Claude, ChatGPT, Perplexity) with equal fidelity.

## Status

Active · Limited release · Working with a small number of design partners.

## Products

- Beacon: structured presence layer
- Loop: change-aware sync engine
- Pilot: agent-readiness diagnostics

## Contact

beacon@sightbox.co

✓ valid llms.txt  ·  ✓ .md mirror  ·  ✓ JSON-LD  ·  ✓ canonical
view source →

Same content as the human site. Different surface, optimized for the reader. No subdomain, no redirect, no scraping required.

04 · HOW IT WORKS

Four moves. One coherent presence.

01 / 04

Curated content foundation

We work with you to build the canonical source: company facts, products, pricing, docs, all in one structured, versioned place.

02 / 04

Automatic markdown + llms.txt

Beacon generates a .md companion for every page and an llms.txt manifest at root, automatically, on every change.
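As a rough sketch of what a .md companion route could look like in practice, the Python below maps a page path to its markdown companion and checks whether a request's Accept header asks for text/markdown. The helper names `companion_path` and `wants_markdown`, and the choice of `/index.md` for the homepage, are assumptions made for illustration and are not taken from Beacon.

```python
def companion_path(page_path: str) -> str:
    """Map a page path to its (hypothetical) .md companion route."""
    trimmed = page_path.rstrip("/")
    if not trimmed:
        # Assumption: the homepage companion lives at /index.md.
        return "/index.md"
    if trimmed.endswith(".md"):
        return trimmed
    return trimmed + ".md"


def wants_markdown(accept_header: str) -> bool:
    """Return True if the Accept header lists text/markdown with a non-zero q-value."""
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "text/markdown":
            continue
        q = 1.0  # a media type without an explicit q parameter defaults to 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        return q > 0
    return False
```

A server could call `wants_markdown` on each request and, when it returns True, serve the file at `companion_path(request_path)` instead of the HTML page. Wildcards such as `*/*` are deliberately ignored here so that ordinary browsers keep receiving HTML.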

03 / 04

Native domain integration

Everything serves from yourcompany.com. No subdomains, no proxies, no third-party redirects that agents distrust.

04 / 04

Agents read clearly

Cursor, Claude, ChatGPT and every new agent that follows get the same clean, structured answer, every time they ask.

TOTAL TIME TO LIVE / ~2 WEEKS
05 · THE PROOF, MEASURED

Run the scan.
See the score.

Beacon scores any site live across discoverability, content negotiation, and bot access control, measuring how well it speaks to agents. Your score is the share of checks you pass. Sightbox.co runs Beacon and scores near the top. Most sites don't.

discoverability content bot access control
HOW BEACON SCORES
% OF CHECKS PASSED
Discoverability
3 checks
robots.txt, Sitemap, and HTTP Link headers — validated against soft-404s, not just a 200 response.
Content
1 check
Markdown for Agents — whether Accept: text/markdown returns a markdown version of the page.
Bot Access Control
2 checks
AI bot rules in robots.txt (GPTBot, ClaudeBot, PerplexityBot…) and Content Signals declaring ai-train / search / ai-input preferences.
Your overall score is simply the share of enabled checks you pass — the same model as Cloudflare's agent-readiness scan. Levels run 0–4 (Basic Web Presence → Agent-Native).
EVERY POINT TRACES TO A REAL CHECK · LIVE · NO FABRICATION
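The "share of enabled checks passed" model is simple enough to sketch in a few lines of Python. The check names below follow the categories listed above, but the pass/fail values are made up for illustration, and the `Check` type and function names are hypothetical rather than Beacon's own.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    category: str
    name: str
    passed: bool
    enabled: bool = True


def share_passed(checks) -> int:
    """Return the percentage of enabled checks that passed, rounded to an integer."""
    enabled = [c for c in checks if c.enabled]
    if not enabled:
        return 0
    return round(100 * sum(c.passed for c in enabled) / len(enabled))


def scores_by_category(checks) -> dict:
    """Apply the same share-of-checks rule within each category."""
    grouped = defaultdict(list)
    for c in checks:
        grouped[c.category].append(c)
    return {category: share_passed(group) for category, group in grouped.items()}


# Made-up results for a hypothetical site: six checks across three categories.
EXAMPLE_RESULTS = [
    Check("discoverability", "robots.txt", True),
    Check("discoverability", "sitemap", True),
    Check("discoverability", "link-headers", False),
    Check("content", "markdown-for-agents", False),
    Check("bot-access-control", "ai-bot-rules", True),
    Check("bot-access-control", "content-signals", True),
]
```

With these made-up results, four of six checks pass, so `share_passed(EXAMPLE_RESULTS)` comes out at 67. Disabling a check removes it from both the numerator and the denominator.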
FREE TOOL · 30 SECONDS

How legible is your site?

Run our diagnostic on any domain. We read the real agent-native signals that matter (llms.txt, robots.txt AI-crawler rules, sitemap, and JSON-LD structured data) and score them transparently.

3
CATEGORIES
6
REAL CHECKS
live
NO FABRICATION

No signup to scan. We ask for an email only before showing the full report, so we can send you the PDF and follow up if you want help fixing what we find.

BEACON DIAGNOSTIC
scan.v1
WE CHECK FOR
robots.txt + crawler rules
llms.txt manifest
markdown companion routes
JSON-LD coverage
sitemap.xml structure
agent crawl response
SIGHTBOX · THE PRACTICE BEHIND BEACON
VISION
for agents®
A NEW MILESTONE IN THE SIGHTBOX PRACTICE

Sightbox has spent fifteen years helping founders turn a vague idea into a venture the market can see. We call it narrative architecture: the practice of building the story, the structure, and the surfaces that make a company legible to the humans it wants to reach.

The audience has changed. The agents are here. They read the site, the docs, the changelog, the pricing page. They form an opinion. They make recommendations. They route customers.

Beacon is narrative architecture for the agent layer. The same practice (discipline, structure, voice) applied to a new reader. Vision for venture, extended to vision for agents.

06 · WHO THIS IS FOR

Built for teams that already care about clarity.

Beacon doesn't fix bad content. It amplifies good content for a new audience. If your team already invests in documentation and developer experience, you're our person.

/ 01
Devtool companies

Where the buyer is an engineer, often arriving through an agent recommendation.

/ 02
Documentation-heavy products

If your docs are your front door, they should be readable by every agent that knocks.

/ 03
AI-forward brands

Companies betting their distribution on the agent layer, and acting like it.

/ 04
Teams that ship DX

If you sweat the details for developers, you'll sweat them for agents too.

07 · PRICING

Three ways to get started.

We're still shaping pricing with our first partners. The ranges below are relative; exact figures are scoped on a call.

TIER / 01
Foundation
4–6 WK

A solid, production-ready Beacon layer, quickly, with minimal ongoing commitment.

$ $ $
ENTRY · ONE-TIME
  • Full structured knowledge base
  • llms.txt + agent-optimized markdown
  • Basic read-only MCP endpoint
  • 1–2 primary surface integrations
  • 90-day content sync support
  • Team training
Start with Foundation
RECOMMENDED
TIER / 02
Beacon + MCP
6–8 WK

For teams serious about staying ahead of agent-driven discovery: production MCP, continuous sync, ongoing optimization.

$ $ $
MOST POPULAR · ONE-TIME + MONTHLY
  • EVERYTHING IN FOUNDATION, PLUS
  • Production MCP server (auth + rate limits)
  • Custom agent surfaces & query handling
  • Automated sync pipeline
  • Monthly performance reports
  • Priority support + quarterly reviews
  • Ongoing optimization as standards evolve
Beacon's value compounds. The monthly fee covers continuous freshness, MCP reliability, monitoring, and updates as the agent layer evolves.
Start a pilot
TIER / 03
Enterprise
CUSTOM

Maximum agent leverage: heavy customization, internal KB exposure, analytics, SLAs, or white-label.

$ $ $
CUSTOM · SCOPED TO YOU
  • EVERYTHING IN BEACON + MCP, PLUS
  • Internal knowledge-base exposure
  • Advanced analytics & usage telemetry
  • SSO, SLAs, security review
  • White-label options
  • Dedicated partner
Talk to us
08 · FREQUENTLY ASKED

Honest answers to the questions we actually get.

Will this hurt SEO or confuse Google?

No. Beacon adds clean machine surfaces alongside your human pages, with proper canonical tags and structured data. Google's own guidance points in this direction: generative search and agent discovery are now first-class layers.

Do we have to rebuild our site?

No. We integrate with your existing stack: Next.js, Astro, Webflow, Ghost, Sanity, Contentful, or a custom CMS. Beacon sits as a structured layer underneath; the public site keeps its current design.

What is llms.txt, exactly?

A simple, agent-native manifest served at the root of your domain that tells AI agents what your company is, where the canonical content lives, and how to read it. Think of it as a sitemap.xml for the agent era.
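To make that concrete, here is a minimal Python sketch that assembles an llms.txt file in the general shape described by the public llms.txt proposal: an H1 title, a blockquote summary, then H2 sections of markdown links. The function name `build_llms_txt` and the Acme URLs are hypothetical, and this is not a description of how Beacon generates its manifest.

```python
def build_llms_txt(name: str, summary: str, sections: dict) -> str:
    """Assemble a minimal llms.txt manifest.

    `sections` maps a heading to a list of (title, url, note) tuples;
    an empty note is omitted from the entry.
    """
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical example content; the URLs are placeholders.
EXAMPLE_MANIFEST = build_llms_txt(
    "Acme Robotics",
    "industrial automation",
    {
        "Docs": [
            ("Docs", "https://acme.com/docs.md", "product documentation"),
            ("Pricing", "https://acme.com/pricing.md", ""),
        ],
    },
)
```

Serving the resulting text at `/llms.txt` on the company's own domain gives agents a single, predictable entry point to the canonical content.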

How long until we see results?

Beacon scores move within days of going live. Behavioral impact (agents pulling your content into answers) typically shows up within 4–8 weeks as crawlers re-index.

Who owns the content?

You do, entirely. The canonical source is exportable as plain markdown and structured JSON whenever you want it. No lock-in.

Can we block specific agents?

Yes. Per-agent allow/deny policies are first-class. Beacon respects robots.txt, supports per-bot rules, and gives you an audit trail of who read what.
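Per-bot rules in robots.txt can be audited with Python's standard-library `urllib.robotparser`. The sketch below is an illustration only: the robots.txt content and the `example.com` URLs are made up, the helper `agent_access` is hypothetical, and it does not represent Beacon's policy engine or audit trail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot, allow ClaudeBot, and keep
# /private/ off-limits for every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /private/
"""


def agent_access(robots_txt: str, agents: list, url: str) -> dict:
    """Return {agent: allowed} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]
DOCS_ACCESS = agent_access(ROBOTS_TXT, AGENTS, "https://example.com/docs")
PRIVATE_ACCESS = agent_access(ROBOTS_TXT, AGENTS, "https://example.com/private/plan")
```

Under this sample file, GPTBot is denied everywhere, ClaudeBot is allowed everywhere, and PerplexityBot falls through to the wildcard group, so it can read `/docs` but not `/private/plan`.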

09 · START

Be the answer.
Not the dead end.

We're working with a small number of companies right now. If your team cares about being legible to the agent layer, and ahead of the people who aren't yet thinking about it, let's talk.

PILOT COHORT · Q3 2026
6 / 10
slots filled
We only work with a small number of companies at a time. Selection by fit, not first-come.