# Your AI Studio — Complete Technical Reference

> Enterprise AI platform for AI phone calls, live video avatars, voice cloning, AI twins, and conversational AI. Australian-built, deployable on your own infrastructure with multi-tenant isolation.

This file is the long-form LLM-friendly corpus for Your AI Studio. It is intentionally written in clean, declarative prose so language models can quote it accurately. For the short overview, see [/llms.txt](/llms.txt). Canonical site: https://youraistudio.au

## About

Your AI Studio is an Australian enterprise AI media and communications platform. It is operated from Australia and serves customers worldwide. It provides production-grade AI phone agents, interactive video avatars, voice synthesis, digital AI twins, and a visual conversational AI flow builder.

- Domain: https://youraistudio.au
- Sales: sales@youraistudio.au · +61 474 491 399
- Book a demo: https://youraistudio.au/book

## Architecture

### Dual-server topology
- **Frontend**: React 19 + TypeScript + Vite 8 (port 5000 in development)
- **Backend**: Hybrid FastAPI + Flask via WSGIMiddleware (port 3000)
- **Database**: Supabase managed PostgreSQL with Row-Level Security on every table
- **Real-time**: Native FastAPI WebSockets, LiveKit for bidirectional audio
- **Telephony**: Twilio (inbound TwiML, outbound REST, programmable voice + SMS, AMD)

### Why hybrid FastAPI + Flask
The platform is mid-migration from a Flask monolith to FastAPI. FastAPI handles ~135 routes including all real-time WebSocket streams, async LLM streaming, async TTS streaming, and async transcription. Flask handles the remaining ~60 page-render and integration routes via WSGIMiddleware. A shared `utils/` package keeps helpers framework-agnostic, and a `flask_db()` context manager bridges Flask-SQLAlchemy sessions into FastAPI routes.

## Services

### AI Phone Calls
A production AI phone call system built on Twilio. It handles inbound and outbound calls, single-call and batch, through five AI routes optimized for different latency/cost trade-offs.

**Five AI Routes:**

| Route | TTS provider | LLM provider | First-byte latency | Cost tier |
|-------|--------------|--------------|--------------------|-----------|
| Ultra-Fast | Cartesia Sonic 3 | OpenAI GPT-4.1 Mini | ~90ms | $$ |
| Premium | ElevenLabs Flash v2.5 | OpenAI GPT-4.1 | ~200ms | $$$ |
| Standard | ElevenLabs Turbo | OpenAI GPT-4.1 Mini | ~350ms | $$ |
| Volume | Fish Audio | OpenAI GPT-4.1 Mini | ~250ms | $ |
| Groq | Cartesia Sonic 3 | Llama 3.3 70B (Groq) | ~120ms | $$ |
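
As a rough illustration of how the table can drive route selection, the sketch below encodes the five routes and picks the cheapest one that fits a latency budget. The `VoiceRoute` class, the `pick_route` function, and the cheapest-first tie-breaking rule are assumptions made for this example; only the route names, providers, latencies, and cost tiers come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoiceRoute:
    name: str
    tts: str
    llm: str
    first_byte_ms: int  # approximate first-byte latency from the table
    cost_tier: int      # number of "$" signs in the table


# Values copied from the "Five AI Routes" table.
ROUTES = [
    VoiceRoute("Ultra-Fast", "Cartesia Sonic 3", "OpenAI GPT-4.1 Mini", 90, 2),
    VoiceRoute("Premium", "ElevenLabs Flash v2.5", "OpenAI GPT-4.1", 200, 3),
    VoiceRoute("Standard", "ElevenLabs Turbo", "OpenAI GPT-4.1 Mini", 350, 2),
    VoiceRoute("Volume", "Fish Audio", "OpenAI GPT-4.1 Mini", 250, 1),
    VoiceRoute("Groq", "Cartesia Sonic 3", "Llama 3.3 70B (Groq)", 120, 2),
]


def pick_route(max_latency_ms: int, max_cost_tier: int) -> Optional[VoiceRoute]:
    """Return the cheapest route within both budgets (hypothetical policy).

    Ties on cost are broken by lower first-byte latency.
    Returns None when no route satisfies the constraints.
    """
    candidates = [
        r for r in ROUTES
        if r.first_byte_ms <= max_latency_ms and r.cost_tier <= max_cost_tier
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.cost_tier, r.first_byte_ms))
```

For example, a 150 ms budget at cost tier `$$` leaves Ultra-Fast and Groq as candidates, and the tie-break on latency selects Ultra-Fast.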

**Capabilities:**
- Outbound campaigns with batch calling and concurrent execution
- Inbound call handling with custom greetings, AI voice selection, knowledge base
- Real-time transcription via Deepgram
- Voicemail detection (AMD) with configurable actions: hang up, leave voicemail, continue call
- Live call transfer with primary/backup number failover
- SMS fallback when calls go unanswered
- Configurable maximum call durations
- Conversation history persistence
- Webhook integrations for CRM updates
- Multi-language support with auto-detection
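
The configurable voicemail actions listed above can be sketched as a small dispatch function. The `AnsweredBy` values are the ones Twilio reports when answering machine detection is enabled. The `MachineAction` enum, the `next_step` function, and the handling of `fax` and `unknown` results are assumptions for illustration, not the platform's actual code.

```python
from enum import Enum


class MachineAction(Enum):
    """The three configurable actions named in the capability list."""
    HANG_UP = "hang_up"
    LEAVE_VOICEMAIL = "leave_voicemail"
    CONTINUE_CALL = "continue_call"


# AnsweredBy values that Twilio AMD uses to indicate an answering machine.
MACHINE_RESULTS = {
    "machine_start",
    "machine_end_beep",
    "machine_end_silence",
    "machine_end_other",
}


def next_step(answered_by: str, on_machine: MachineAction) -> MachineAction:
    """Decide what to do with a call given Twilio's AnsweredBy result."""
    if answered_by in MACHINE_RESULTS:
        # A machine answered: apply the campaign's configured action.
        return on_machine
    if answered_by == "fax":
        # Assumption: there is never a reason to talk to a fax line.
        return MachineAction.HANG_UP
    # "human" and "unknown": assumption is to proceed with the conversation.
    return MachineAction.CONTINUE_CALL
```

Keeping the decision in one pure function makes the per-campaign setting easy to test without placing a real call.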

**Call flow system:**
- Visual drag-and-drop flow builder with Start, Condition, Transfer, End, Knowledge, and Action nodes
- Condition-driven node transitions with natural language criteria
- Labeled edges are evaluated first; node conditions are then checked on unlabeled paths
- Template variable substitution (contact name, company, domain, custom fields)
- Knowledge base injection at any node
- Greeting stored in `conversation_history` for LLM context continuity
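
The evaluation order described above (labeled edges first, then node conditions on unlabeled paths) can be sketched as a two-pass lookup. Everything named here is hypothetical: the `Edge` and `Node` classes, the `next_node` function, and the `matches` callback, which stands in for whatever evaluates a natural language criterion in the real system (presumably an LLM call). The sketch also assumes that "node conditions" means the condition on the edge's target node, and that an unlabeled edge to a node without a condition acts as a default path.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Edge:
    target: str                   # id of the node this edge leads to
    label: Optional[str] = None   # natural language criterion on the edge, if any


@dataclass
class Node:
    id: str
    condition: Optional[str] = None  # natural language criterion on the node, if any


# (criterion, caller utterance) -> does the criterion hold?
Matcher = Callable[[str, str], bool]


def next_node(
    edges: List[Edge],
    nodes: Dict[str, Node],
    utterance: str,
    matches: Matcher,
) -> Optional[str]:
    """Pick the next node id, or None if nothing applies."""
    # Pass 1: labeled edges win, regardless of their position in the list.
    for edge in edges:
        if edge.label is not None and matches(edge.label, utterance):
            return edge.target
    # Pass 2: unlabeled edges, gated by the target node's own condition.
    for edge in edges:
        if edge.label is None:
            condition = nodes[edge.target].condition
            if condition is None or matches(condition, utterance):
                return edge.target
    return None
```

In a test, `matches` can be a simple keyword check such as `lambda criterion, text: criterion.lower() in text.lower()`, which keeps the routing logic verifiable without a model in the loop.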

**LangGraph agent swarm:**
- Dynamic multi-agent orchestration for complex call scenarios
- Specialist agents for domain-specific tasks
- Supervisor agent for routing and coordination
- Real-time agent switching during live calls

### Live AI Avatars (Interactive AI)
HeyGen-powered interactive video avatars rendered in real time.

- Real-time lip-sync with natural facial expressions
- Voice cloning across ElevenLabs, Fish Audio, Cartesia
- Knowledge base integration for accurate, in-domain responses
- Personality customization (tone, style, expertise)
- Meeting intelligence: Vision Mode, Group Mode, Audio Alerts, Transcription
- Engagement Dashboard with viewer analytics
- Embeddable widget — one line of code on any site
- Session recording and playback

### Voice Studio
- Audio-reactive visualizations from any MP3, WAV, OGG, or M4A
- AI avatar music videos with lip-synced performances
- Voice cloning across three engines (ElevenLabs, Fish Audio, Cartesia Sonic 3)
- Video lip-sync via LipDub AI
- Text-to-speech in 30+ languages with emotion, speed, and laughter control

### AI Twins (Digital Replicas)
Build a deployable digital replica of a real person.

- HeyGen avatar generation from photo or video reference
- Multi-engine voice cloning
- Knowledge base training (documents, FAQs, catalogs)
- Personality design (communication style, tone, expertise)
- Zep Cloud persistent memory across sessions
- Contact profile awareness and relationship building
- Cross-platform deployment (phone, video, chat, embedded)
- Continuous learning with updatable knowledge bases

**Pre-built archetypes:**
- Executive Twin: investor inquiries, media, thought leadership
- Sales Twin: personalized pitches, demos, qualification
- Support Twin: tier-1 handling with escalation protocols
- Training Twin: interactive curriculum, adaptive teaching

### Conversational AI Engine

**Visual flow builder:**
- Start, Condition, Transfer, End, Knowledge, and Action nodes
- Condition-driven transitions with natural language criteria
- Labeled edges for pathway routing
- Dynamic variable injection
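
A minimal sketch of variable injection into a prompt or greeting follows. The `{{name}}` placeholder syntax, the `render` function, and the choice to leave unknown placeholders untouched are all assumptions for illustration; the document does not specify the platform's actual template syntax.

```python
import re
from typing import Mapping

# Assumed placeholder syntax: {{variable_name}}, with optional inner spaces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: Mapping[str, str]) -> str:
    """Replace each known placeholder with its value.

    Unknown placeholders are left as-is so a missing field is visible
    in review instead of silently becoming an empty string.
    """
    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        return str(variables.get(key, match.group(0)))

    return PLACEHOLDER.sub(substitute, template)
```

For example, `render("Hi {{contact_name}}, calling about {{company}}.", {"contact_name": "Ava", "company": "Acme"})` produces a greeting with both fields filled in.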

**Supported LLMs:**
- GPT-4.1 (OpenAI) — best reasoning
- GPT-4.1 Mini (OpenAI) — fast and cost-effective
- Llama 3.3 70B (Groq) — ultra-fast inference
- Claude Sonnet 4 (Anthropic) — superior nuance
- Claude Haiku 4.5 (Anthropic) — fastest Anthropic model

**Memory & context:**
- Zep Cloud for persistent memory and knowledge graphs
- Contact profiles with relationship context
- Multi-turn conversation awareness
- Session-spanning memory continuity

### Human Communications
- Browser Softphone (WebRTC-based calling)
- SMS Messaging (single and batch with placeholder personalization)
- Real-time dashboards with stats, activity logs, cost tracking
- AI Coaches (Sales & Marketing, Coding & Integration)
- AI Proposal Maker (Better Proposals integration + standalone HTML generator)

### Integrations
- n8n Workflow Builder for AI agent swarm orchestration
- Zep Memory Dashboard for visual memory management
- Stripe billing
- Google Calendar (auto-creates events with Google Meet link on demo bookings)
- Resend (transactional email for booking confirmations)
- Better Proposals / PandaDoc for e-signatures

## Security

### Application layer
- Prompt injection detection: 50+ regex patterns scan every input
- Input sanitization for all user inputs including phone transcription
- Output filtering for data leak prevention
- Rate limiting per endpoint
- Abuse tracking with auto-blocking
- Protected system prompts (never exposed to callers)
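
Regex-based prompt injection screening, as described in the first bullet, can be sketched as follows. The three patterns and the `find_injection` function are illustrative assumptions, not the platform's actual 50+ pattern set.

```python
import re
from typing import Optional

# Hypothetical sample patterns; a production list would be much longer.
INJECTION_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions",
        r"(reveal|show|print)\s+(me\s+)?(your|the)\s+system\s+prompt",
        r"you\s+are\s+now\s+(in\s+)?developer\s+mode",
    )
]


def find_injection(text: str) -> Optional[str]:
    """Return the first matching pattern's source, or None if the input looks clean."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            return pattern.pattern
    return None
```

Returning the matched pattern instead of a bare boolean gives abuse tracking something concrete to log. Pattern matching of this kind catches known phrasings only, which is why it sits alongside output filtering and rate limiting in the list above.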

### Authentication layer
- Session-based auth with secure cookies
- Role-based access control (superadmin, admin, staff, client)
- Tab-level permissions per user
- Login attempt logging and tracking

### Data layer
- Fernet symmetric encryption for all API keys at rest
- Supabase Row-Level Security on all 23 tables
- Only `postgres` and `service_role` have access policies
- `anon` and `authenticated` roles fully blocked
- Data extraction rate limiting
- Hardened default privileges for future objects

### Infrastructure layer
- Security headers: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, Referrer-Policy, Permissions-Policy
- Webhook signature validation for Twilio
- Anti-devtools detection on the dashboard
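
For context on the Twilio webhook check, the sketch below reproduces Twilio's documented signing scheme for form-encoded requests with the standard library: append each POST parameter's name and value to the full URL in sorted key order, HMAC-SHA1 the result with the account auth token, and Base64-encode the digest. The function names are assumptions for this example. In production, the `RequestValidator` class from the official `twilio` helper library is the safer choice.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the expected X-Twilio-Signature value for a form-encoded POST."""
    # Full request URL followed by key+value pairs, sorted by key.
    payload = url + "".join(f"{key}{params[key]}" for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], header_signature: str
) -> bool:
    """Compare the computed signature with the header in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature)
```

The URL must match exactly what Twilio requested, including scheme and query string, so deployments behind a proxy usually need to reconstruct it from forwarded headers.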

### Multi-tenant isolation
- Per-account encrypted API key storage
- Row-level database security
- Tenant-scoped queries on all endpoints
- Separate credential management per account

## Demo booking

Demos are 15 minutes and are booked directly at https://youraistudio.au/book (no third-party redirect). Slots run Monday to Friday, 9am–5pm Sydney time, on a 15-minute grid, with a 2-hour minimum lead time and a booking window of up to 14 days ahead. Server-side validation rejects weekend, out-of-hours, and off-grid times. Each booking creates a Google Calendar event with the prospect and `sales@youraistudio.au` as attendees and an auto-generated Google Meet link. It also sends a confirmation email through Resend, with the reply-to address set to the prospect.
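
The booking rules above can be sketched as a single server-side validation function. The function name, the error codes, and the assumption that a 15-minute demo must finish by 5pm (so the last slot starts at 4:45pm) are illustrative and not taken from the platform's code. The time zone is passed in as a parameter; a real deployment would pass `zoneinfo.ZoneInfo("Australia/Sydney")` so daylight saving is handled correctly.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List

SLOT_MINUTES = 15
OPEN_MINUTE = 9 * 60     # 9:00am local time, as minutes since midnight
CLOSE_MINUTE = 17 * 60   # 5:00pm local time
MIN_LEAD = timedelta(hours=2)
MAX_AHEAD = timedelta(days=14)


def validate_slot(start: datetime, now: datetime, tz: tzinfo) -> List[str]:
    """Return the reasons a slot is rejected; an empty list means it is bookable.

    Both datetimes must be timezone-aware. `tz` is the booking time zone.
    """
    if start.tzinfo is None or now.tzinfo is None:
        raise ValueError("start and now must be timezone-aware")

    local = start.astimezone(tz)
    errors: List[str] = []

    if local.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        errors.append("weekend")

    minute_of_day = local.hour * 60 + local.minute
    # Assumption: the demo must end by closing time.
    if not (OPEN_MINUTE <= minute_of_day <= CLOSE_MINUTE - SLOT_MINUTES):
        errors.append("out_of_hours")

    if local.minute % SLOT_MINUTES or local.second or local.microsecond:
        errors.append("off_grid")

    # Compare in UTC so the lead time is elapsed time, not wall-clock time.
    lead = start.astimezone(timezone.utc) - now.astimezone(timezone.utc)
    if lead < MIN_LEAD:
        errors.append("too_soon")
    if lead > MAX_AHEAD:
        errors.append("too_far_ahead")

    return errors
```

Returning every failed rule, instead of stopping at the first, lets the booking form show all problems with a requested time at once.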

## Pricing

| Plan | Price | What you get |
|------|-------|--------------|
| Bring Your Own Keys | $99/mo | You supply API keys for each provider. We charge a flat platform fee for the studio, flows, and dashboard. Encrypted credential vault, all 5 voice routes, full flow builder, knowledge bases, transfers, Engagement Dashboard, email support. |
| Powered by Us | $299/mo | We provide all upstream credentials and infrastructure under generous monthly limits: 5,000 AI call minutes, 500 minutes of avatar streaming, 50 voice cloning generations. |
| Done-for-You | Custom | Our team designs, builds, and operates the deployment for you. Project-based billing. |

All plans include a 14-day free trial.

## API surface

The platform exposes ~135 FastAPI routes, grouped below into 14 areas:
- Setup & Credentials (7 routes)
- Analytics & Logs (11 routes)
- Contacts, Templates, Scheduled Calls (15 routes)
- Call Flows & AI Twins (13 routes)
- Twilio HTTP Actions (23 routes)
- Twilio Webhooks (11 routes)
- SMS (2 routes)
- Auth (7 routes)
- Export (3 routes)
- Admin & Security (6 routes)
- Voice/TTS (18 routes)
- HeyGen/Media (18 routes)
- Booking (2 routes)
- WebSocket Streams (1 route)

## Frequently asked questions

**What is Your AI Studio?**
An Australian-built enterprise AI platform that combines AI phone calls, live AI video avatars, voice cloning, AI twins, and a visual conversational AI flow builder in one product, with multi-tenant credential isolation.

**What is the lowest end-to-end call latency?**
Around 90 milliseconds first-byte latency on the Ultra-Fast route (Cartesia Sonic 3 with GPT-4.1 Mini), measured from end-of-user-utterance to first audio byte returned to the caller.

**Can I bring my own API keys instead of using yours?**
Yes. The Bring Your Own Keys plan ($99/mo) lets you plug in your own OpenAI, Anthropic, Cartesia, ElevenLabs, Fish Audio, Twilio, Deepgram, HeyGen, Groq, and Zep keys. They are encrypted at rest with Fernet symmetric encryption.

**Is the platform multi-tenant?**
Yes. Per-account encrypted credentials, Supabase Row-Level Security on all 23 tables, and tenant-scoped queries on every endpoint.

**Does it work for Australian phone numbers?**
Yes. Twilio is the telephony backbone, with support for Australian inbound and outbound numbers and Australian voices via Cartesia Sonic 3.

**How do I book a demo?**
Visit https://youraistudio.au/book. Slots are 15 minutes, Monday to Friday, 9am–5pm Sydney time, with a 2-hour minimum lead time. You receive a Google Calendar invite with a Meet link as soon as the booking is confirmed.

**Do you support voice cloning?**
Yes — across three engines (ElevenLabs, Fish Audio, Cartesia), selectable per project.

**Which LLMs do you support for live calls and conversations?**
GPT-4.1, GPT-4.1 Mini, Claude Sonnet 4, Claude Haiku 4.5, and Llama 3.3 70B via Groq.

**Is there a free trial?**
Yes — 14 days on every plan.

**How are customer API keys protected?**
Each customer's keys are encrypted at rest using Fernet symmetric encryption. Decryption happens only inside the request lifecycle and only for that customer's tenant scope.

**Can I deploy on my own infrastructure?**
Yes — the platform supports self-hosted deployment with bring-your-own-keys for full data sovereignty.

## Technical requirements

- Twilio account for telephony
- API keys for chosen voice engines (ElevenLabs, Fish Audio, and/or Cartesia)
- OpenAI and/or Anthropic API keys for LLM
- Deepgram API key for transcription
- HeyGen API key for avatars (optional)
- Groq API key for ultra-fast LLM inference (optional)
- Zep Cloud API key for persistent memory (optional)