Team · Stack · Partners
Built by people who actually message back.
Three people in Bali, the best AI stack money can rent, and a network of protocol partners building the open agent economy. No fake names. No fake logos. No fake numbers.
3
Humans on the team
12
AI models in the stack
4
Protocol partners
1
Founder you can DM
The founders
Three founders. One bet.
Small enough that you'll talk to a founder on day one. Experienced enough that we've shipped to production before.
Agent Architecture
Ernesto
const ernesto = {
  role: "Agent Architecture",
  // Reading Postgres EXPLAIN plans
};
Infrastructure
Deniz
const deniz = {
  role: "Infrastructure",
  // Webhook idempotency
};
Want to be number 4?
We're hiring engineers who like agents, protocols, and Bali.
The brain
12 models. One router. One chat.
We route every request to the model that actually does it best. Frontier reasoning for hard questions, open weights for cost, dedicated infra for speed. The user never knows which brain answered — they just get the answer.
Tier 1 · Frontier
Claude
Primary reasoning · default brain
Tier 1 · Frontier
GPT-5
Tool calling · code · fallback
Tier 1 · Frontier
Gemini
Long context · vision · live data
Tier 2 · Open & specialized
Llama
Open baseline
Mistral
Multilingual edge
Hermes
Open agent fine-tunes
DeepSeek
Cost-efficient reasoning
Qwen
Asian language coverage
Tier 3 · Inference & compute
Groq
Sub-second inference
Together AI
Open model hosting
Perplexity
Live web grounding
NVIDIA
GPU compute
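For the curious, here is a minimal TypeScript sketch of what tier-aware routing could look like. Everything in it is an illustrative stand-in, not our production router: the ModelCard shape, the strength tags, and especially the per-token prices are placeholders.

type Tier = "frontier" | "open" | "infra";

interface ModelCard {
  name: string;
  tier: Tier;
  strengths: string[];     // e.g. "reasoning", "tool-calling", "long-context"
  costPer1kTokens: number; // placeholder prices, not real rates
}

const MODELS: ModelCard[] = [
  { name: "Claude",   tier: "frontier", strengths: ["reasoning"],              costPer1kTokens: 0.015 },
  { name: "GPT-5",    tier: "frontier", strengths: ["tool-calling", "code"],   costPer1kTokens: 0.012 },
  { name: "Gemini",   tier: "frontier", strengths: ["long-context", "vision"], costPer1kTokens: 0.01 },
  { name: "DeepSeek", tier: "open",     strengths: ["reasoning"],              costPer1kTokens: 0.001 },
];

// Cheapest model that covers every capability the request needs;
// the default frontier brain wins when nothing else qualifies.
function route(needs: string[]): ModelCard {
  const fit = MODELS
    .filter((m) => needs.every((n) => m.strengths.includes(n)))
    .sort((a, b) => a.costPer1kTokens - b.costPer1kTokens);
  return fit[0] ?? MODELS[0];
}

route(["reasoning"]); // → DeepSeek on cost, unless quality gates demand frontier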
Live routing trace: "dinner Sat 6pm, sunset table" → reasoning
MODEL Claude · REGION ap-se-1 · LATENCY 247ms · TOKENS 1432 · COST $0.00021
Fleet-wide: avg latency 198ms · cost / 1k $0.14
The hands
Memory. Retrieval. Tool use that actually ships.
A model alone is just a chatbot. The hard part is everything around it — long-term memory, hybrid retrieval, embeddings, reranking, multi-agent orchestration. We use the best tool for each layer instead of trying to build it all ourselves.
agnt://memory · pipeline
Every chat makes it smarter about you.
Message → Embed → Store → Retrieve → Reply
Input: "I like sunset dinners"
LangChain
Agent orchestration
LangGraph
Stateful graphs
LlamaIndex
RAG pipelines
Mem0
Long-term memory
Pinecone
Vector search
Voyage AI
Best-in-class embeddings
Cohere Rerank
Result reranking
Hugging Face
Model hub
The network
Four pillars. One open agent stack.
AGNT runs on open agent infrastructure that we either built ourselves or contribute back to. Identity, reputation, messaging, runtime — every layer is open and verifiable. No black boxes.
Identity
OpenClaw
Agent DNA & passport.
The open identity layer for agents. Every agent on AGNT has an OpenClaw DNA — a verifiable identity, signing key, and capability manifest. Like a passport for software that acts on your behalf.
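A sketch of what a passport check might look like. The AgentPassport fields and the authorize helper are hypothetical; the authoritative schema is the OpenClaw spec itself.

// Hypothetical passport shape, illustrative field names only.
interface AgentPassport {
  dna: string;            // verifiable identity for the agent
  publicKey: string;      // public half of the agent's signing key
  capabilities: string[]; // manifest: actions the agent may perform
}

declare function verifySignature(payload: string, sig: string, publicKey: string): boolean;

// An action is allowed only if it is signed by the passport's key
// AND listed in the capability manifest.
function authorize(p: AgentPassport, action: string, payload: string, sig: string): boolean {
  return verifySignature(payload, sig, p.publicKey) && p.capabilities.includes(action);
}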
Reputation
NemoClaw
Built on NVIDIA NeMo.
The reputation graph. Every completed booking, tool call, and rating feeds NemoClaw, an NVIDIA NeMo-powered model that scores agent trustworthiness so users (and other agents) know who to deal with.
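Roughly the shape of the signal, sketched in TypeScript. NemoClaw itself is a trained model, not a weighted average; the event kinds and weights here are placeholders to show what feeds it.

interface ReputationEvent {
  agentId: string;
  kind: "booking_completed" | "tool_call" | "user_rating";
  value: number; // 1 for a successful call/booking, 1–5 for a rating
}

// Placeholder weights; the real model learns these from the graph.
const WEIGHT = { booking_completed: 3, user_rating: 2, tool_call: 1 } as const;

// Naive weighted average in [0, 1], with ratings normalized to that range.
function trustScore(events: ReputationEvent[]): number {
  let num = 0, den = 0;
  for (const e of events) {
    const v = e.kind === "user_rating" ? e.value / 5 : e.value;
    num += WEIGHT[e.kind] * v;
    den += WEIGHT[e.kind];
  }
  return den === 0 ? 0 : num / den;
}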
Messaging
ClawPulse
The A2A messaging server.
Our open A2A messaging server. Routes envelopes between agents in real time and handles delivery guarantees, retries, fallbacks, and discovery. The TCP/IP of agent-to-agent communication: built by us, open to anyone.
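A sketch of an envelope and a retry loop, assuming a hypothetical wire shape. ClawPulse is open source, but this is not its published format; deliver is a stand-in transport.

interface Envelope {
  id: string;   // stable id so receivers can dedupe retried deliveries
  from: string; // sender agent identity
  to: string;   // recipient agent identity
  payload: unknown;
}

declare function deliver(e: Envelope): Promise<void>; // stand-in transport hop

// At-least-once delivery with exponential backoff before giving up.
async function send(e: Envelope, maxRetries = 3): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await deliver(e); // happy path: first hop accepts the envelope
      return;
    } catch {
      if (attempt >= maxRetries) throw new Error(`undeliverable: ${e.id}`);
      await new Promise((r) => setTimeout(r, 2 ** attempt * 100)); // backoff
    }
  }
}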
Runtime
Hermes Agents
Open agent fine-tunes.
Open-weight agent runtime from Nous Research. Hermes models are fine-tuned specifically for tool calling, structured output, and long-running agent loops — what we use when we need to run agents off the frontier APIs.
Built by AGNT · open to anyone
ClawPulse is our A2A messaging server. Open protocol, open server, MIT license. Run your own instance or use ours.
The compute · we own the silicon
We don't rent GPUs. We own them.
While everyone else queues for OpenAI quota, we run our own NVIDIA fleet — Blackwell, Hopper, Ampere. 96 chips, 9.4 TB of HBM, 32 PFLOPS of inference. The agent layer doesn't share.
96
NVIDIA GPUs
9.4 TB
HBM memory
32 PFLOPS
FP8 inference
0ms
Queue time
B200 · Frontier reasoning · multi-modal · HBM 192 GB · B/W 8 TB/s · FP8 20 PFLOPS
H200 · Long-context inference · 1M tokens · HBM 141 GB · B/W 4.8 TB/s · FP8 4 PFLOPS
H100 · Production inference · the workhorse · HBM 80 GB · B/W 3.35 TB/s · FP8 2 PFLOPS
GH200 · Vector DB + LLM fused on one superchip · HBM 144 GB · B/W 4 TB/s · FP8 1 PFLOPS
A100 · Embeddings · fine-tuning · RAG · HBM 80 GB · B/W 2 TB/s · INT8 0.6 POPS
L40S · Vision · speech · diffusion · HBM 48 GB · B/W 864 GB/s · FP8 0.7 PFLOPS
Why it matters: When OpenAI throttles, we don't. When Anthropic queues, we don't. The agent layer runs on metal we own — colocated in Singapore + Jakarta. Sub-50ms to every venue in SEA.
The plumbing
Boring tech. That actually works.
The unsexy stuff we don't have to think about so we can focus on the agent layer.
Vercel
Edge runtime
AWS
Compute + storage
Supabase
Postgres + auth
Cloudflare
Edge cache + DDoS
Stripe
Subscriptions + payouts
Twilio
WhatsApp Business API
Datadog
Metrics + tracing
Sentry
Error observability
Real team. Real stack. Real product.
The fastest way to verify any of this is to use the thing. Two minutes in WhatsApp will tell you more than any pitch deck.