ARTICULATE · MARKETING INTELLIGENCE

GIG Gulf — Review Intelligence

1,000 customer reviews · 09 Mar 2026 – 31 May 2026 · labelled on local qwen3:14b ($0) · analysis: Python (deterministic)

4.48/5

Mean rating

78.2%

Positive sentiment

16.5%

Negative (±2.3%)

194

Actionable complaints

PILOT · 1,000 of ~37,000 reviews · proof of method

Executive summary

GIG's ~37,000 reviews are an extraordinary, hidden and unused asset — insight, content and action sitting idle behind a star average. This pilot proves it: 1,000 reviews, labelled in 18 minutes on a local model at zero cost, already surface things the 4.48/5 headline hides. The real moat is human service and speed; the fixable money is in renewal and price-value perception, not claims; and at ~362 new reviews a month this is a live feed of customer truth nobody is mining. Scaled to all 37k — and run live — it becomes a standing engine for content, SEO/GEO, service recovery and positioning.

Three findings — actionable & surprising

1 · The star average can't see the unhappy "satisfied" customer. The headline is 4.48/5. But the text contradicts the score on 55 reviews that gave 4–5★ while writing clearly negative comments (40 at 4★). Watch only the average and those complaints are invisible. Open question for GIG: is anyone already reading the 4★ text? And whether these customers churn needs a renewal-outcome join we don't yet have — this is latent risk, not measured churn.

2 · Claims isn't the problem — renewal and price are. Counter to the insurer reflex, claims is only 32/1000 of the conversation and mostly fine. The fixable complaints concentrate in renewal (45 actionable, the largest queue) and pricing/value — where 70.0% of every mention is an actionable complaint, the most concentrated grievance in the book.

3 · The moat is speed + human service — and it's evidence-backed. 45% of all reviews are about reps and turnaround, overwhelmingly positive. The positioning isn't aspirational — 1,000 real customers already say it. (Bonus: detractors write 40.3 words to a promoter's 9.9 — complaints are long and specific, so they're easy to mine.)

What this export can't tell us — yet

Said plainly, so nothing here is mistaken for more than it is. Each gap has a cheap fix.

Question	Why not	Fix
How many reviews got a reply / were resolved?	No reply/resolution field in this export (6 fields only)	Re-export from eKomi with merchant-reply + reply-timestamp
How fast does GIG see & action them?	Internal cadence isn't in review data	Ask the team — and the live alert (§10) makes it measurable
Did the unhappy customers actually churn?	No retention/renewal outcome here	Join `order_id` to the renewal/policy system
Which product line is each review?	`order_id` encodes it, but we lack the codebook	GIG's product-code mapping (§15)

1 · The shape of the book

Rating distribution

J-curve: a wall of 5★ (726) with a hard 1★ pocket (55) larger than 2★+3★ combined — lovers and a committed-detractor minority, not a soft middle.

Sentiment (from text)

78.2% positive (±2.6%), 16.5% negative (±2.3%) at 95% confidence, n=1000.

2 · What customers actually talk about

Theme frequency

"Representative service" and "speed/efficiency" dominate the conversation. "Other" (194) is inflated by the 29.2% of reviews that are ≤3 words — tighten the taxonomy before the 37k run.

Theme × sentiment

Where the red concentrates is where to act. Service and speed are praise engines; renewal, pricing and digital carry the visible negativity.

3 · Where the fixable money is

Actionable-complaint density by theme

% of each theme's reviews that name a concrete, fixable problem. By density, pricing, coverage and claims top the list; by volume the most actionable complaints sit in renewal (45) and pricing (35). Read both: density tells you where a theme is mostly grievance, volume tells you where the queue is longest.

4 · The metric is lying to you

Rating vs text sentiment

The model reads the words, not the stars. 1–2★ are 100% negative (label sanity check passes), but 40 reviews scored 4★ while writing a clearly negative review and 15 did so at 5★ (a further 47 read neutral). This "satisfied-but-complaining" cohort is churn risk a CSAT score can't see.

5 · Movement over the window

Weekly volume & mean rating

Weekly % negative

Cadence: ~11.9 reviews/day (median 11.9, peak 23), ~83/week, ~362/month, across 82 of 84 days. At ~83/week, this 1,000-review pilot ≈ the most recent 12 weeks — i.e. effectively the latest reviews, not a sample. ~12 weeks reads direction only; too short for seasonality or YoY.

6 · Statistical footing — what's valid

Method	Valid here?	Note
Descriptive distributions + proportion CIs	Yes	n=1000 → ±~3% on a proportion at 95%
Chi-square: theme × sentiment / rating × theme	Yes	Small themes (claims n=32) = low power — caveat
Rating↔text-sentiment concordance	Yes	Doubles as label-quality validation (clean monotonic)
Length × sentiment test (H5)	Yes	Short reviews 90.4% positive vs long 73.2%
Weekly control charts	Limited	12 weeks; direction only, no seasonality
Regression implying causation	No	Observational, self-selected reviewers
True NPS / product / branch cuts	No	No 0–10 scale, no product/branch field (needs GIG join on order_id)

7 · Hypotheses (testable)

H1 — Service & speed are the moat, not price. Praise clusters on reps/turnaround; price is the top complaint. → Test against the market head-to-head.

H2 — The renewal journey is the primary fixable detractor driver. Highest actionable count sits in renewal. → Trace the renewal cohort end-to-end.

H3 — A "satisfied-but-complaining" 4★ cohort may mask churn (unproven). 40 four-star reviews carry clearly negative text — latent risk, not measured churn. → Join order_id to renewal outcome to test it.

H4 — Price complaints cluster around renewal. Weak in this sample: of 108 reviews mentioning "renew", only 7 are themed pricing — so this is a hypothesis to test on the 37k, not a finding. Don't assume it yet.

H5 — Happy customers write less (effort asymmetry). [testable now] Short reviews are 90.4% positive vs 73.2% for longer ones; mean length 9.9 words (positive) vs 40.3 (negative). Negativity is longer and more specific — which is exactly why it's actionable.

H6 — Arabic-language customers weight themes differently. n=22 here — flag, don't conclude; revisit at 37k.

8 · Ten things to do with this — a Marketing Director's list

SEO — review schema. Publish AggregateRating (4.48/5, 1,000+ reviews) + Review structured data → star rich-snippets in Google, higher organic CTR on the money pages.
GEO / AEO. Feed theme FAQs + verbatims into answer-engine content so GIG is the cited source when AI assistants answer "best motor insurance UAE" — ties directly to the BetterTribe answer-engine work.
Content engine for Slipstream. The actionable themes (renewal clarity, price explanation) are the exact FAQ/blog pages to build — written in the customers' own words, which are the keywords.
Testimonial pipeline. ~692 clean 5★-positive verbatims → GIGGulfReels scripts + landing-page social proof, pre-scored.
Service recovery. The 77 one/two-star detractors + the 40 hidden-negative 4★ → a prioritised win-back queue (join order_id to contact).
Ops feedback loop. Hand claims/renewals a ranked fix-list (renewal 45, pricing 35 actionable) with frequency + trend — not anecdotes.
Retention scripting. Arm renewal calls with GIG's proven strengths (speed, rep quality) and pre-empt the price objection with the value framing customers respond to.
Paid & landing copy. Voice-of-customer language as ad headlines; negative themes become objection-handling and audience exclusions.
Competitive positioning. Run the market head-to-head (next) → a "where GIG wins / where it's exposed" battlecard grounded in real review evidence.
Exec early-warning dashboard. Weekly sentiment + theme monitor on the five velocity metrics; alert when any actionable theme's share rises. Re-runs at $0 on local model.

Bonus: the labelled corpus is in-tenant grounding data for a GIG support/FAQ assistant — PDPL-safe because nothing leaves GIG.

9 · Next: the market head-to-head

This report is GIG's own voice. The head-to-head needs a matched sample of competitor reviews — same pipeline, same taxonomy, same local model — then GIG-vs-market crosstabs on rating, sentiment, theme mix and actionable density. That's a data-acquisition step (public Trustpilot for named UAE motor insurers), not an analysis step. Decision needed: the competitor set and the go to source it.

10 · Live negative-review alerting

The pilot is a snapshot. The same pipeline runs live — catching every at-risk review within hours of it posting, at $0.

How it works on the kit you already own:

Pull — a scheduled job collects new eKomi reviews since the last run.
Classify — local qwen3:14b labels each (theme · sentiment · actionable · reason). No data leaves GIG; $0 per run.
Filter — flag anything where sentiment = negative, rating ≤ 2, or actionable = true.
Alert — push to a channel (ntfy / email / Slack) with the verbatim, star, theme, order_id and a one-line suggested action + draft reply.

Immediate push: any 1★, or a claims/renewal-themed negative — the ones where response time changes the outcome.

Daily digest: everything else, batched — plus a flag when any actionable theme's share rises week-on-week (early warning).

To go live: automate the eKomi pull (API or scheduled export). Classify → filter → alert is already proven (it's what produced this report). Can be prototyped against the existing export this week.

11 · GEO citation plan — get GIG cited by the AI engines

When someone asks ChatGPT, Gemini, Perplexity or Google's AI Overview "best car insurance in the UAE" or "is GIG any good", GIG should be the named, cited source. The review corpus is the raw material — customers have already written the questions and the proof.

Mine the questions. The themes are the search queries — renewal, price/value, claims speed, the app. Convert each cluster into the real question a customer types.
Publish answer-shaped pages. Question as the H1; a tight 40–60-word direct answer first; detail below. Answer engines extract the top block — write for that.
Mark it up. FAQPage + AggregateRating (4.48/5 from 1,000+ verified reviews) + Review schema → rich-result eligibility and clean machine extraction.
Anchor the entity & authority. Consistent Google Business / Wikidata entity; cite the real evidence ("based on 1,000+ verified GIG customers"); earn mentions in UAE finance media so engines see corroboration.
Distribute multi-surface. The same Q&A feeds the explainer video (next), the help centre, and LinkedIn — engines weight repeated, corroborated answers.
Measure. A monthly probe set of the target questions across the four engines; track whether GIG is named/cited and share-of-voice vs competitors. (Ties to the BetterTribe answer-engine work.)

12 · Explainer video — GIG's "Autocomplete" (worked example)

Model it on WIRED's Autocomplete Interview: a host peels question cards off a board and answers them. GIG's version — "GIG Answers the Web's Most-Searched Car-Insurance Questions" — sources its cards from the actual review complaints. The grievance becomes the FAQ; the FAQ doubles as the GEO content above. One asset, three jobs: trust-building video, social cut-downs, and AI-citation fuel.

Worked example — the pricing finding, traced end-to-end. Note: price isn't the most frequent theme (service is) — but when customers complain, it's the most concentrated grievance (70.0% of pricing mentions are actionable). So we answer it head-on instead of hiding from it.

Card	VO answer (15–25s each)	On-screen / visual
"Why is GIG car insurance so expensive?"	"It usually isn't — it's priced to your car, your history and your cover. Here's the three things that move your premium, and two you can change today to bring it down."	Premium broken into 3 bars; the 2 controllable ones highlight green.
"How do I renew my GIG car insurance?"	"Sixty seconds. We send your renewal before it lapses — confirm, pay, done. Here's exactly where to tap, and what to check before you do."	Phone screen, the renewal tap-path; a "check your no-claims" callout.
"Is GIG car insurance actually any good?"	"Judge for yourself — 4.48 out of 5 across 1,000+ verified customers, and what they mention most is fast, human service. Not our words; theirs."	Rating badge; 3 real (anonymised) 5★ verbatims animate in.

Pattern: each card = one real customer question, answered in <25s, structured so an AI engine can lift the answer verbatim. Produce with the existing Higgsfield pipeline (per GIGGulfReels). The pricing card turns the book's sharpest complaint into a transparency moment — the fastest trust win in the data.

Generated examples — produced from this pilot

1 · Testimonial reel. Script — a verbatim 5★ review: "I didn't need to do anything. Everything went smoothly, and my policies renewed automatically."

2 · Pricing card (Autocomplete). Script: "Why's GIG so expensive? Honestly, it usually isn't. Your premium comes down to three things: your car, your history, and the cover you choose. And two of those, you can change today to pay less."

Disclaimers. Concept demonstrations — not published GIG assets. The on-screen spokesperson is an AI-generated character (the "Arjun" Soul model) — not a real GIG customer or employee; no real person's likeness is used. The testimonial words are a genuine verbatim 5★ review, but the face and voice are synthetic and are not the actual reviewer. The pricing answer is an illustrative script, not GIG's real rating logic — it must be fact-checked and compliance-approved (CBUAE/marketing) before any external use. AI-generated video may contain visual artefacts; audio and lip-sync are not yet QA'd.

13 · Data extraction & processing

End-to-end pipeline. Every step is reproducible; the labelling and analysis cost $0 and no customer data left GIG-controlled hardware.

Source — eKomi. GIG's review platform. Export delivered as reviews-2026-06-02.csv (1,000 records) on 2 Jun 2026. ⚠ Selection rule not documented — window is 09 Mar 2026–31 May 2026; treat as a recent slice, not a verified "latest 1000".
Transfer. CSV shipped from the Cowork sandbox to the Mac Studio over an ephemeral Tailscale tunnel (one-shot SSH). No third-party processor touches the data.
Labelling — local LLM. qwen3:14b via on-device Ollama (localhost:11434), thinking-mode off, temperature 0, JSON-constrained to the fixed taxonomy. 1,000 reviews in 18.5 min, 0 errors, $0 (≈$1.35 if run on a frontier API).
Validation. Rating↔text-sentiment concordance checked: 1–2★ map 100% to negative; the divergence at 4–5★ is a genuine signal, not noise. 30-row human spot-check provided for a formal accuracy score.
Analysis — Python (deterministic). All distributions, crosstabs, CIs and the weekly series computed in stdlib Python — no LLM in the maths. Figures are computed at build time, never transcribed.
Report. This self-contained HTML, regenerated from the labelled CSV on every run.

PII / PDPL: review_screen_name holds customer names and order_id ties to a policy — personal data. Kept on GIG-side hardware throughout; must be access-controlled and redacted before any external mirror.

14 · Data dictionary — every field

Field	Origin	Description	Type	Populated
`review_date`	source	Unix epoch (UTC), review submitted	integer ts	100.0%
`order_date`	source	Unix epoch (UTC) of the underlying policy order	integer ts	100.0%
`review_rating`	source	Star rating 1–5	int 1–5	100.0%
`review`	source	Free-text review body (EN + ~2% AR)	text	100.0%
`order_id`	source	Structured policy ref — encodes account, line code, product letter, policy no.	structured str	100.0%
`review_screen_name`	source	Customer display name — PII (PDPL)	text	100.0%
`ai_theme`	derived	One of 9 taxonomy themes	enum	100.0%
`ai_sentiment`	derived	positive / neutral / negative (from text, not stars)	enum	100.0%
`ai_actionable`	derived	True = names a concrete, fixable problem	bool	100.0%
`ai_reason`	derived	≤12-word rationale for the label	text	100.0%

Six source fields from eKomi; four derived by the local model. Note order_date ≠ review_date — the gap between buying and reviewing is itself analysable (e.g. time-to-review).

15 · Decoding `order_id` — the product key

Structure: {account}_{line}/{product}/{policy_no}/… — e.g. 112661_13/VA/911401827/0/0. Parsed cleanly on 98.9% of rows. This is the join key to product line — GIG's codebook turns it from inference into fact.

Product letter — 12 distinct

Code	n	%
`VA`	483	48.3%
`VP`	226	22.6%
`ZT`	148	14.8%
`FR`	82	8.2%
`VN`	19	1.9%
`BS`	15	1.5%
`VF`	5	0.5%
`VX`	4	0.4%
`VM`	4	0.4%
`VI`	1	0.1%
`EC`	1	0.1%
`PL`	1	0.1%

Line code (_NN) — 8 distinct

Code	n	%
`13`	672	67.2%
`60`	104	10.4%
`65`	99	9.9%
`75`	68	6.8%
`12`	27	2.7%
`14`	10	1.0%
`66`	8	0.8%
`67`	1	0.1%

Account / prefix — 13 distinct (top 8)

Code	n	%
`112661`	385	38.5%
`98996`	167	16.7%
`112641`	165	16.5%
`119673`	145	14.5%
`119732`	49	4.9%
`119667`	24	2.4%
`150309`	18	1.8%
`112666`	12	1.2%

Are they all motor? No. On the heuristic that a V-prefixed product letter = vehicle/motor: 742 (74.2%) likely motor, 247 (24.7%) other lines (ZT, FR, BS, EC, PL…), 11 unparsed. So ~three-quarters motor, a real non-motor quarter. The V=vehicle read is mine, not confirmed — GIG's code mapping makes it definitive, after which the whole analysis can be cut per product line.

16 · Classification taxonomy

Themes (9, single-label, model picks one): representative service · speed/efficiency · renewal process · digital app/website · pricing/value · communication clarity · coverage/product · claims handling · other.

Sentiment (3): positive · neutral · negative — read from the review text, independent of the star rating.

Actionable (bool): true only when the review names a concrete problem GIG could fix — this is what separates a gripe from a steer.

Guard-railed: any off-taxonomy label is forced to other/neutral rather than allowed to drift, so every row is comparable and auditable. Known weakness: the 29.2% of ≤3-word reviews inflate other — tighten the theme list (or add a "too-short-to-classify" class) before the 37k run.

Generated from reviews-labelled.csv · all figures computed in Python · charts: Chart.js · Articulate for GIG Gulf