# Enrich venue data with Perplexity search
Use Perplexity's search-augmented API to automatically update venue descriptions, reviews, and operating hours in AGNT's knowledge graph.
AGNT's 149 venues need fresh data — opening hours change, new reviews appear, menus rotate. This guide builds a data enrichment pipeline that uses Perplexity's search-augmented generation to fetch the latest information for each venue and write it back to AGNT's knowledge graph automatically.
## Prerequisites

- AGNT developer API key.
- Perplexity API key.
- Python 3.10+ with the `requests` library.
## What you're building
A data enrichment pipeline with three stages: (1) pull a venue from AGNT's database, (2) query Perplexity for the latest reviews, operating hours, and description using search-augmented generation, (3) write the structured result back to AGNT's knowledge graph via the REST API. Wrap the whole thing in a scheduler so it runs weekly without manual intervention.
The key insight is that Perplexity's Sonar model doesn't just generate text — it searches the live web first, then generates a grounded answer with citations. This means your venue data stays current without anyone manually Googling each restaurant.
## Step 1 — Get a Perplexity API key

Sign up at [perplexity.ai](https://www.perplexity.ai/) and navigate to the API section. Generate an API key. Perplexity's API is OpenAI-compatible — you use the same `Authorization: Bearer` pattern and the same chat completions endpoint shape.

The model you want is `sonar` — the lightweight search-augmented model that grounds its answers in live web results. Heavier variants like `sonar-pro` and `sonar-reasoning` are also search-grounded but cost more and are geared toward complex analysis; for a straightforward factual lookup, `sonar` is sufficient. Whatever you pick, avoid any model without web search, since ungrounded generation defeats the purpose of this enrichment pipeline.
## Step 2 — Build the enrichment query

For each venue, craft a Perplexity prompt that asks for structured, factual information:

```python
import requests

PERPLEXITY_API_KEY = "pplx-..."

def enrich_venue(venue_name: str, area: str, city: str = "Bali") -> str:
    """Ask Perplexity for current hours, description, and reviews as JSON."""
    resp = requests.post(
        "https://api.perplexity.ai/chat/completions",
        headers={"Authorization": f"Bearer {PERPLEXITY_API_KEY}"},
        json={
            "model": "sonar",
            "messages": [
                {
                    "role": "system",
                    "content": (
                        "Return JSON only. No markdown. Fields: opening_hours "
                        "(string), description (2 sentences max), recent_reviews "
                        "(array of {source, snippet, rating}), last_updated "
                        "(ISO date)."
                    ),
                },
                {
                    "role": "user",
                    "content": (
                        f"What are the current opening hours, a brief description, "
                        f"and the latest Google/TripAdvisor reviews for "
                        f"{venue_name} in {area}, {city}?"
                    ),
                },
            ],
            "temperature": 0.1,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

The low temperature (0.1) keeps the output factual and consistent. The system prompt enforces JSON-only output so you can parse it reliably. Perplexity's search grounding means the response includes real data from Google Maps, TripAdvisor, and other sources — not hallucinated hours.
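Even with a JSON-only system prompt, models occasionally wrap the payload in a markdown fence or prepend a stray sentence. A small defensive parser keeps the pipeline from crashing on those cases (a sketch; the `extract_json` helper below is not part of AGNT's codebase):

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Parse a model response that should be JSON but may carry extra noise.

    Tries a direct parse first, then falls back to grabbing the first
    {...} span in the text (covers ```json fences and leading prose).
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if not match:
            raise ValueError(f"No JSON object found in response: {raw[:80]!r}")
        return json.loads(match.group(0))

# Clean output parses directly; fenced output falls back to the regex.
print(extract_json('{"opening_hours": "9am-5pm"}'))
print(extract_json('```json\n{"opening_hours": "9am-5pm"}\n```'))
```

Feeding the Perplexity response through a helper like this before the write-back step means one malformed reply degrades to a logged error instead of a crashed run.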
## Step 3 — Update AGNT's knowledge graph
Take the parsed Perplexity response and write it back to AGNT's venue metadata via the PATCH endpoint:
```python
import json

import requests

AGNT_API_KEY = "ag_..."

def update_venue_metadata(venue_id: str, enrichment: dict | str) -> None:
    """Write Perplexity's structured result back to the venue record."""
    parsed = json.loads(enrichment) if isinstance(enrichment, str) else enrichment
    resp = requests.patch(
        f"https://api.agntdot.com/api/venues/{venue_id}",
        headers={
            "Authorization": f"Bearer {AGNT_API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "opening_hours": parsed.get("opening_hours"),
            "description": parsed.get("description"),
            "recent_reviews": parsed.get("recent_reviews", []),
            "enrichment_source": "perplexity-sonar",
            "enrichment_date": parsed.get("last_updated"),
        },
        timeout=30,
    )
    resp.raise_for_status()
```

The `enrichment_source` field is important — it tells downstream consumers that this data came from an automated pipeline, not a human edit. AGNT's knowledge graph tracks provenance so you can audit which data came from which source.
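Before issuing the PATCH, it is worth validating the parsed result so one malformed response doesn't overwrite good venue data with nulls. A minimal gate might look like this (a hypothetical helper, not an AGNT API; the required-field choices are assumptions):

```python
def is_valid_enrichment(parsed: dict) -> bool:
    """Reject enrichment payloads that would blank out existing venue data."""
    hours = parsed.get("opening_hours")
    description = parsed.get("description")
    # Require non-empty strings for the two fields we always overwrite.
    if not isinstance(hours, str) or not hours.strip():
        return False
    if not isinstance(description, str) or not description.strip():
        return False
    # Reviews are optional, but if present they must be a list.
    return isinstance(parsed.get("recent_reviews", []), list)

# Skip the PATCH when validation fails, e.g.:
# if is_valid_enrichment(parsed):
#     update_venue_metadata(venue_id, parsed)
```

Skipped venues keep their last known-good data, which is usually the right trade-off for a weekly refresh.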
## Step 4 — Schedule automated enrichment
AGNT's backend uses APScheduler for recurring jobs (see `agnt-backend/app/core/scheduler.py`). Add a weekly enrichment job that iterates over all venues:
```python
import asyncio
import logging

from apscheduler.schedulers.asyncio import AsyncIOScheduler

logger = logging.getLogger(__name__)

async def run_venue_enrichment():
    venues = await fetch_all_venues()  # GET /api/venues?limit=200
    for venue in venues:
        try:
            enrichment = enrich_venue(venue["name"], venue["area"])
            update_venue_metadata(venue["id"], enrichment)
        except Exception as e:
            logger.error(f"Enrichment failed for {venue['name']}: {e}")
        await asyncio.sleep(2)  # respect Perplexity rate limits

scheduler = AsyncIOScheduler()
scheduler.add_job(run_venue_enrichment, "cron", day_of_week="mon", hour=3)
scheduler.start()
```

The 2-second sleep between venues respects Perplexity's rate limits. At 149 venues, the full enrichment run takes about 5 minutes plus API latency. Schedule it for Monday 3 AM when traffic is lowest. The scheduler logs every enrichment result so you can review what changed in the weekly delta log.
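If a run still hits Perplexity's rate limit despite the pause, a retry with exponential backoff is the usual remedy. A generic sketch, where the `call` parameter stands in for `enrich_venue` and the 429 check assumes the request helper raised via `raise_for_status()`:

```python
import time

import requests

def with_backoff(call, *args, retries: int = 3, base_delay: float = 2.0):
    """Retry a callable on HTTP 429, doubling the delay each attempt."""
    for attempt in range(retries):
        try:
            return call(*args)
        except requests.HTTPError as e:
            rate_limited = e.response is not None and e.response.status_code == 429
            if rate_limited and attempt < retries - 1:
                time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
            else:
                raise  # non-429 errors and the final attempt propagate

# Usage inside the enrichment loop:
# enrichment = with_backoff(enrich_venue, venue["name"], venue["area"])
```

Non-429 errors propagate immediately, so genuine failures still land in the error log rather than being retried pointlessly.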
## Why this matters
Venue data goes stale fast. A restaurant changes its hours, a cafe gets a bad review, a bar closes for renovation. In the old world, someone on the team had to manually research each venue and update the database. With Perplexity's search grounding, the AI does the Googling — and it does it with citations, so you can verify the source.
AGNT stores the structured result in the knowledge graph, and every downstream consumer benefits: the venue discovery engine returns accurate hours, the booking agent knows when a venue is open, and the calorie scanner has up-to-date menu information. One enrichment pipeline feeds the entire stack.
The cost is negligible. Perplexity's Sonar model costs a fraction of a cent per query. Enriching 149 venues once a week runs about $0.50 total. Compare that to a human researcher spending hours on Google Maps — the ROI is immediate and permanent.