GIG Gulf · Content + GEO Audit · Pipeline & Tool Gaps

Every audit, in order

Four stages — Download → Index → Save to DB → Analyse. Status measured against work done this session. Benchmarked against the Visible proposal (the standard GIG is being pitched).

Benchmark note: Visible (Gilles Praet) is pitching GIG a $3.5k–$9.5k/month GEO+SEO retainer. Their moat is the one thing we don't yet have at full scale: a 100-prompt × 4-LLM AI-visibility measurement (our baseline run in #9 covers 20 prompts × 3 LLMs). Everything else in their proposal we can match from the first-party data we now hold. The pipeline below closes the gap.

Stage 1 · Download — acquire raw data

#	Audit / pull	Source	Status
1	Full site crawl + URL inventory	Firecrawl sitemap	DONE 1,765 URLs
2	Blog corpus scrape — 131 posts → records	Firecrawl	PARTIAL index ✓, 1/131
3	Document-library inventory — 527 PDFs	Firecrawl + PDF tools	TODO
4	GSC export — pages · queries · dates (per market)	Search Console	PARTIAL UAE 16mo ✓
5	GSC decay export — 3mo vs prior 3mo (fatigue input)	Search Console compare	TODO
6	GA4 export — page engagement + quote/conversion events	GA4	PARTIAL live access + multi-market verified; per-page export pending
7	Google Ads export — paid keywords + landing pages	Google Ads	TODO
8	Competitor crawl — Sukoon · Tawuniya · Salama · Orient + Lemonade/Hippo	Firecrawl (no access needed)	TODO
9	AI-visibility capture — prompts × LLMs (ChatGPT·Claude·Google; Perplexity/Gemini when keyed)	LLM APIs (built in-house)	DONE tool built + baseline run (20 prompts ×3)
10	SERP + AI-Overview presence capture	Serper (UAE geo)	TODO
11	Backlink / authority profile	Ahrefs	BLOCKED plan tier
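The decay export in #5 compares the latest 3 months against the prior 3 months. A minimal sketch of that comparison, assuming each Search Console export has already been reduced to a URL → clicks mapping. The function name `click_decay` and the sample URLs and numbers are illustrative, not GIG data.

```python
from typing import Dict


def click_decay(current: Dict[str, int], prior: Dict[str, int]) -> Dict[str, float]:
    """Relative change in clicks per URL: (current - prior) / prior.

    URLs with zero prior clicks are skipped because the ratio is undefined.
    URLs that appear only in `current` (new pages) are also left out.
    """
    decay = {}
    for url, prior_clicks in prior.items():
        if prior_clicks <= 0:
            continue
        decay[url] = (current.get(url, 0) - prior_clicks) / prior_clicks
    return decay


# Illustrative numbers only.
prior = {"/blog/a": 200, "/blog/b": 50, "/blog/c": 0}
current = {"/blog/a": 100, "/blog/b": 60}
result = click_decay(current, prior)

# Most-decayed pages first: these would feed the fatigue index in #19.
for url, change in sorted(result.items(), key=lambda kv: kv[1]):
    print(f"{url}: {change:+.0%}")
```

Negative values mean a page lost clicks period over period; a page that halves its clicks scores -0.5.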

Stage 2 · Index — structure & classify the raw

#	Audit / step	Status
12	URL taxonomy — classify all 1,765 by type · LOB · language · funnel-stage · format	TODO
13	Blog record population — 22-field schema per post	schema proven
14	Date-stamping pass — publish/updated dates	TODO
15	Query classification — brand vs non-brand · intent · LOB · market	DONE 80/20
16	Prompt → citation index — which LLM cited which page, per prompt (GEO)	TODO (needs #9)
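The brand vs non-brand split in #15 can be sketched as a simple pattern match over query strings. This is an assumption about the approach, not a record of how the 80/20 figure was produced. The brand pattern (the single token "gig") and the sample rows are illustrative; a real run would need the full list of brand spellings, including Arabic variants.

```python
import re
from typing import Iterable, Tuple

# Hypothetical brand pattern: matches "gig" as a whole word, any case.
BRAND_PATTERN = re.compile(r"\bgig\b", re.IGNORECASE)


def classify_query(query: str) -> str:
    """Label a search query as 'brand' or 'non-brand'."""
    return "brand" if BRAND_PATTERN.search(query) else "non-brand"


def brand_click_share(rows: Iterable[Tuple[str, int]]) -> float:
    """Share of total clicks that come from brand queries.

    `rows` is an iterable of (query, clicks) pairs.
    """
    rows = list(rows)
    total = sum(clicks for _, clicks in rows)
    if total == 0:
        return 0.0
    brand = sum(clicks for query, clicks in rows if classify_query(query) == "brand")
    return brand / total


# Illustrative rows only.
rows = [("gig car insurance", 80), ("cheap car insurance uae", 20)]
print(f"brand click share: {brand_click_share(rows):.0%}")
```

The word-boundary match keeps queries such as "gigantic savings" out of the brand bucket.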

Stage 3 · Save to database — persist a queryable grid

#	Audit / step	Status
17	Airtable master audit base — pages × metrics × scores × disposition	TODO (Airtable live)
18	Per-URL metrics join — GSC + GA4 + Ads onto the inventory	GSC in hand
19	Fatigue index compute — age + decay + cannibalisation + CTR-gap + repetition	inputs partial
20	Richness scoring — the 10-element /20 per asset	TODO (rubric ready)
21	GEO phase scoring — Category Formation → Attribute Recall → Competitive Selection → Trust, per LOB/market	TODO (needs #9)
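The fatigue index in #19 combines five inputs: age, decay, cannibalisation, CTR-gap and repetition. The source does not give a formula, so the sketch below assumes each input has already been normalised to a 0–1 range (1 = most fatigued) and uses equal weights. Both the normalisation and the weights are placeholders to be replaced once the real rubric is fixed.

```python
from dataclasses import dataclass


@dataclass
class FatigueInputs:
    """Per-URL inputs, each assumed to be pre-normalised to 0..1."""
    age: float
    decay: float
    cannibalisation: float
    ctr_gap: float
    repetition: float


# Assumption: equal weights. Tune once the scoring rubric is agreed.
WEIGHTS = {
    "age": 0.2,
    "decay": 0.2,
    "cannibalisation": 0.2,
    "ctr_gap": 0.2,
    "repetition": 0.2,
}


def clamp(value: float) -> float:
    """Keep a component inside 0..1 so one bad input cannot dominate."""
    return max(0.0, min(1.0, value))


def fatigue_index(inputs: FatigueInputs) -> float:
    """Weighted sum of the five components; result is in 0..1."""
    return sum(weight * clamp(getattr(inputs, name)) for name, weight in WEIGHTS.items())


# Illustrative values only.
example = FatigueInputs(age=1.0, decay=0.5, cannibalisation=0.0, ctr_gap=0.5, repetition=0.0)
score = fatigue_index(example)
print(f"fatigue index: {score:.2f}")
```

Keeping the weights in one dictionary makes it easy to re-score the whole inventory when the weighting changes.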

Stage 4 · Analyse — turn data into the plan

#	Audit / analysis	Status
22	Traffic concentration + dead-inventory	DONE top5=68%, 37%<10
23	Category ROI — clicks per post by lane	DONE Motor 897 vs Travel 86
24	Brand vs non-brand demand gap	DONE 80/20
25	CTR quick-wins — high-impression, low-CTR giants	STARTED
26	Cannibalisation + technical-junk (duplicate URLs, language-switcher index bloat)	STARTED
27	Schema / AI-answer readiness audit (JSON-LD, FAQPage, llms.txt)	TODO
28	Competitor gap — volume · format · topics · share-of-voice	TODO
29	AI-citation share — the GEO headline metric	DONE 83% overall / 92% category (baseline)
30	Multi-market + Arabic parity	STARTED Arabic dark
31	Gap → 6-month roadmap + needle-movers	DRAFTED
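For the AI-citation share in #29, one plausible definition is the fraction of captured (prompt, LLM) responses that mention the brand. The sketch below assumes that definition and a plain substring match; the source does not say how the 83% / 92% baseline was computed. The `Capture` type, the needle list and the sample responses are all illustrative.

```python
from typing import Dict, Iterable, List, NamedTuple


class Capture(NamedTuple):
    """One stored LLM answer for one prompt."""
    prompt: str
    llm: str
    response: str


def is_cited(response: str, needles: Iterable[str]) -> bool:
    """True if any needle (brand name or domain) appears in the response."""
    text = response.lower()
    return any(needle.lower() in text for needle in needles)


def citation_share(captures: Iterable[Capture], needles: Iterable[str]) -> float:
    """Fraction of captures that mention the brand."""
    captures = list(captures)
    needles = list(needles)
    if not captures:
        return 0.0
    hits = sum(is_cited(c.response, needles) for c in captures)
    return hits / len(captures)


def share_by_llm(captures: Iterable[Capture], needles: Iterable[str]) -> Dict[str, float]:
    """Citation share broken out per LLM."""
    needles = list(needles)
    grouped: Dict[str, List[Capture]] = {}
    for capture in captures:
        grouped.setdefault(capture.llm, []).append(capture)
    return {llm: citation_share(rows, needles) for llm, rows in grouped.items()}


# Illustrative captures only.
needles = ["GIG Gulf"]
captures = [
    Capture("best car insurance uae", "chatgpt", "Options include GIG Gulf and others."),
    Capture("best car insurance uae", "claude", "Several insurers operate in the UAE."),
    Capture("travel insurance dubai", "chatgpt", "GIG Gulf offers travel cover."),
    Capture("travel insurance dubai", "claude", "Consider GIG Gulf among others."),
]
overall = citation_share(captures, needles)
per_llm = share_by_llm(captures, needles)
print(overall, per_llm)
```

The same captures can feed the prompt → citation index in #16 by grouping on `prompt` instead of `llm`.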

Missing tools — what to buy or build

Capability	Status / gap	Fix
AI-visibility measurement (100 prompts × 4 LLMs, citation tracking)	Baseline tool built (20 prompts × 3 LLMs, see #9); not yet at 100 × 4 with citation tracking. This is Visible's entire moat.	BUILD: we have the LLM access (Claude, GPT, Gemini, Perplexity via API/OpenRouter) + Serper. ~1–2 days. Or BUY: Profound / Peec AI / otterly.ai.
GA4 Data API (scripted pulls)	Service-account JSON not captured	Generate in GCP (~15 min). UI export works for now.
Backlinks / keyword / competitor traffic	Ahrefs key valid but plan tier blocks API	Upgrade Ahrefs to an API tier, or Semrush.
Technical-SEO crawler at scale	Firecrawl scrapes; it's not a schema/indexation/redirect crawler for 1,765 URLs	Screaming Frog or Sitebulb (~$200/yr).
Multi-market GSC + GA4	Only UAE property reachable	Ask GIG to grant access to the .bh / .om / .qa / .sa properties.
llms.txt + schema validators	None	Scriptable, no purchase.
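The schema validator listed as "scriptable" above can start as a small standard-library check: extract every JSON-LD block from a page and report which schema.org types it declares (for example FAQPage). This is a minimal sketch, not a full validator; it does not check required properties, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Set


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []       # successfully parsed JSON-LD payloads
        self.invalid = 0       # count of blocks that failed to parse

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                self.invalid += 1


def schema_types(html: str) -> Set[str]:
    """Return every @type string found anywhere in the page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    found: Set[str] = set()

    def walk(node):
        if isinstance(node, dict):
            declared = node.get("@type")
            if isinstance(declared, str):
                found.add(declared)
            elif isinstance(declared, list):
                found.update(t for t in declared if isinstance(t, str))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for block in parser.blocks:
        walk(block)
    return found


# Illustrative page only.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage",
 "mainEntity": [{"@type": "Question", "name": "Q?",
                 "acceptedAnswer": {"@type": "Answer", "text": "A."}}]}
</script>
</head><body></body></html>
"""
types = schema_types(sample_html)
print("FAQPage present:", "FAQPage" in types)
```

Run over the crawled inventory, this gives a per-URL list of declared types that can be joined onto the audit base for the readiness audit in #27.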

Everything else the audit needs is keyed and working: Firecrawl, Serper, Airtable, the LLM stack, document/visual production, Cloudflare/Vercel hosting.

Articulate AI for GIG Gulf · Audit pipeline v1 · benchmarked vs Visible proposal (_competitor/visible-proposal-2026-05-14.pdf). Raw GSC saved to _gsc/.