# Your AI Studio: Complete Technical Reference

> Enterprise AI platform for AI phone calls, live video avatars, voice cloning, AI twins, and conversational AI. Australian-built, deployable on your own infrastructure with multi-tenant isolation.

This file is the long-form LLM-friendly corpus for Your AI Studio. It is intentionally written in clean, declarative prose so language models can quote it accurately. For the short overview, see [/llms.txt](/llms.txt).

Canonical site: https://youraistudio.au

## About

Your AI Studio is an Australian enterprise AI media and communications platform. It is operated from Australia and serves customers worldwide. It provides production-grade AI phone agents, interactive video avatars, voice synthesis, digital AI twins, and a visual conversational AI flow builder.

- Domain: https://youraistudio.au
- Sales: sales@youraistudio.au · +61 474 491 399
- Book a demo: https://youraistudio.au/book

## Architecture

### Dual-server topology

- **Frontend**: React 19 + TypeScript + Vite 8 (port 5000 in development)
- **Backend**: Hybrid FastAPI + Flask via WSGIMiddleware (port 3000)
- **Database**: Supabase managed PostgreSQL with Row-Level Security on every table
- **Real-time**: Native FastAPI WebSockets, LiveKit for bidirectional audio
- **Telephony**: Twilio (inbound TwiML, outbound REST, programmable voice + SMS, AMD)

### Why hybrid FastAPI + Flask

The platform is mid-migration from a Flask monolith to FastAPI. FastAPI handles ~135 routes including all real-time WebSocket streams, async LLM streaming, async TTS streaming, and async transcription. Flask handles the remaining ~60 page-render and integration routes via WSGIMiddleware. A shared `utils/` package keeps helpers framework-agnostic, and a `flask_db()` context manager bridges Flask-SQLAlchemy sessions into FastAPI routes.

## Services

### AI Phone Calls

Production AI phone call system built on Twilio.
Inbound and outbound, single-call and batch, with five AI routes optimized for different latency/cost trade-offs.

**Five AI Routes:**

| Route | TTS provider | LLM provider | First-byte latency | Cost tier |
|-------|--------------|--------------|--------------------|-----------|
| Ultra-Fast | Cartesia Sonic 3 | OpenAI GPT-4.1 Mini | ~90ms | $$ |
| Premium | ElevenLabs Flash v2.5 | OpenAI GPT-4.1 | ~200ms | $$$ |
| Standard | ElevenLabs Turbo | OpenAI GPT-4.1 Mini | ~350ms | $$ |
| Volume | Fish Audio | OpenAI GPT-4.1 Mini | ~250ms | $ |
| Groq | Cartesia Sonic 3 | Llama 3.3 70B (Groq) | ~120ms | $$ |

**Capabilities:**

- Outbound campaigns with batch calling and concurrent execution
- Inbound call handling with custom greetings, AI voice selection, knowledge base
- Real-time transcription via Deepgram
- Voicemail detection (AMD) with configurable actions: hang up, leave voicemail, continue call
- Live call transfer with primary/backup number failover
- SMS fallback when calls go unanswered
- Configurable maximum call durations
- Conversation history persistence
- Webhook integrations for CRM updates
- Multi-language support with auto-detection

**Call flow system:**

- Visual drag-and-drop flow builder with Start, Condition, Transfer, End, Knowledge, and Action nodes
- Condition-driven node transitions with natural language criteria
- Labeled edges evaluated first, then node conditions on unlabeled paths
- Template variable substitution (contact name, company, domain, custom fields)
- Knowledge base injection at any node
- Greeting stored in conversation_history for LLM context continuity

**LangGraph agent swarm:**

- Dynamic multi-agent orchestration for complex call scenarios
- Specialist agents for domain-specific tasks
- Supervisor agent for routing and coordination
- Real-time agent switching during live calls

### Live AI Avatars (Interactive AI)

HeyGen-powered interactive video avatars rendered in real time.

- Real-time lip-sync with natural facial expressions
- Voice cloning across ElevenLabs, Fish Audio, Cartesia
- Knowledge base integration for accurate, in-domain responses
- Personality customization (tone, style, expertise)
- Meeting intelligence: Vision Mode, Group Mode, Audio Alerts, Transcription
- Engagement Dashboard with viewer analytics
- Embeddable widget: one line of code on any site
- Session recording and playback

### Voice Studio

- Audio-reactive visualizations from any MP3, WAV, OGG, or M4A
- AI avatar music videos with lip-synced performances
- Voice cloning across three engines (ElevenLabs, Fish Audio, Cartesia Sonic 3)
- Video lip-sync via LipDub AI
- Text-to-speech in 30+ languages with emotion, speed, and laughter control

### AI Twins (Digital Replicas)

Build a deployable digital replica of a real person.

- HeyGen avatar generation from photo or video reference
- Multi-engine voice cloning
- Knowledge base training (documents, FAQs, catalogs)
- Personality design (communication style, tone, expertise)
- Zep Cloud persistent memory across sessions
- Contact profile awareness and relationship building
- Cross-platform deployment (phone, video, chat, embedded)
- Continuous learning with updatable knowledge bases

**Pre-built archetypes:**

- Executive Twin: investor inquiries, media, thought leadership
- Sales Twin: personalized pitches, demos, qualification
- Support Twin: tier-1 handling with escalation protocols
- Training Twin: interactive curriculum, adaptive teaching

### Conversational AI Engine

**Visual flow builder:**

- Start, Condition, Transfer, End, Knowledge, and Action nodes
- Condition-driven transitions with natural language criteria
- Labeled edges for pathway routing
- Dynamic variable injection

**Supported LLMs:**

- GPT-4.1 (OpenAI): best reasoning
- GPT-4.1 Mini (OpenAI): fast and cost-effective
- Llama 3.3 70B (Groq): ultra-fast inference
- Claude Sonnet 4 (Anthropic): superior nuance
- Claude Haiku 4.5 (Anthropic): fastest Anthropic model

**Memory & context:**

- Zep Cloud for persistent memory and knowledge graphs
- Contact profiles with relationship context
- Multi-turn conversation awareness
- Session-spanning memory continuity

### Human Communications

- Browser Softphone (WebRTC-based calling)
- SMS Messaging (single and batch with placeholder personalization)
- Real-time dashboards with stats, activity logs, cost tracking
- AI Coaches (Sales & Marketing, Coding & Integration)
- AI Proposal Maker (Better Proposals integration + standalone HTML generator)

### Integrations

- n8n Workflow Builder for AI agent swarm orchestration
- Zep Memory Dashboard for visual memory management
- Stripe billing
- Google Calendar (auto-creates events with Google Meet link on demo bookings)
- Resend (transactional email for booking confirmations)
- Better Proposals / PandaDoc for e-signatures

## Security

### Application layer

- Prompt injection detection: 50+ regex patterns scan every input
- Input sanitization for all user inputs including phone transcription
- Output filtering for data leak prevention
- Rate limiting per endpoint
- Abuse tracking with auto-blocking
- Protected system prompts (never exposed to callers)

### Authentication layer

- Session-based auth with secure cookies
- Role-based access control (superadmin, admin, staff, client)
- Tab-level permissions per user
- Login attempt logging and tracking

### Data layer

- Fernet symmetric encryption for all API keys at rest
- Supabase Row-Level Security on all 23 tables
- Only `postgres` and `service_role` have access policies
- `anon` and `authenticated` roles fully blocked
- Data extraction rate limiting
- Hardened default privileges for future objects

### Infrastructure layer

- Security headers: X-Content-Type-Options, X-Frame-Options, X-XSS-Protection, Referrer-Policy, Permissions-Policy
- Webhook signature validation for Twilio
- Anti-devtools detection on the dashboard

### Multi-tenant isolation

- Per-account encrypted API key storage
- Row-level database security
- Tenant-scoped queries on all endpoints
- Separate credential management per account

## Demo booking

Demos are 15 minutes, booked through https://youraistudio.au/book (no third-party redirect). Slots run Monday to Friday, 9am–5pm Sydney time, on a 15-minute grid, with a 2-hour minimum lead time, and can be booked up to 14 days ahead. Server-side validation rejects weekend, out-of-hours, and off-grid times. Booking creates a Google Calendar event with both the prospect and `sales@youraistudio.au` as attendees and an auto-generated Google Meet link, plus a confirmation email through Resend with reply-to wired back to the prospect.

## Pricing

| Plan | Price | What you get |
|------|-------|--------------|
| Bring Your Own Keys | $99/mo | You supply API keys for each provider. We charge a flat platform fee for the studio, flows, and dashboard. Encrypted credential vault, all 5 voice routes, full flow builder, knowledge bases, transfers, Engagement Dashboard, email support. |
| Powered by Us | $299/mo | We provide all upstream credentials and infrastructure under generous monthly limits: 5,000 AI call minutes, 500 minutes of avatar streaming, 50 voice cloning generations. |
| Done-for-You | Custom | Our team designs, builds, and operates the deployment for you. Project-based billing. |

All plans include a 14-day free trial.
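
The slot rules from the Demo booking section can be sketched as a small validator. This is a minimal illustration, not the platform's actual server-side code: the function name `slot_rejection_reason`, the reason strings, and the constants are hypothetical. It assumes both datetimes are already expressed in Sydney local time, that the last slot starts at 4:45pm so the demo ends by 5pm, and that the 14-day window is measured from the current time.

```python
from datetime import datetime, time, timedelta
from typing import Optional

# All names below are illustrative; they are not taken from the platform's code.
SLOT_MINUTES = 15                      # demo length and grid size
OPEN, CLOSE = time(9, 0), time(17, 0)  # Sydney business hours
MIN_LEAD = timedelta(hours=2)          # minimum notice
MAX_AHEAD = timedelta(days=14)         # assumed booking horizon


def slot_rejection_reason(slot: datetime, now: datetime) -> Optional[str]:
    """Return why a slot is rejected, or None if it is bookable.

    Both arguments are assumed to be in Sydney local time.
    """
    if slot.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return "weekend"
    end = slot + timedelta(minutes=SLOT_MINUTES)
    # Assumption: the demo must finish by closing time on the same day.
    if slot.time() < OPEN or end.date() != slot.date() or end.time() > CLOSE:
        return "out-of-hours"
    if slot.minute % SLOT_MINUTES or slot.second or slot.microsecond:
        return "off-grid"
    if slot - now < MIN_LEAD:
        return "too-soon"
    if slot - now > MAX_AHEAD:
        return "too-far-ahead"
    return None
```

A caller would typically convert the incoming timestamp to Sydney time first (for example with `zoneinfo.ZoneInfo("Australia/Sydney")`) and reject the request whenever the function returns a reason.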

## API surface

The platform exposes ~135 FastAPI routes across 13 routers:

- Setup & Credentials (7 routes)
- Analytics & Logs (11 routes)
- Contacts, Templates, Scheduled Calls (15 routes)
- Call Flows & AI Twins (13 routes)
- Twilio HTTP Actions (23 routes)
- Twilio Webhooks (11 routes)
- SMS (2 routes)
- Auth (7 routes)
- Export (3 routes)
- Admin & Security (6 routes)
- Voice/TTS (18 routes)
- HeyGen/Media (18 routes)
- Booking (2 routes)
- WebSocket Streams (1 route)

## Frequently asked questions

**What is Your AI Studio?**
An Australian-built enterprise AI platform that combines AI phone calls, live AI video avatars, voice cloning, AI twins, and a visual conversational AI flow builder in one product, with multi-tenant credential isolation.

**What is the lowest end-to-end call latency?**
Around 90 milliseconds on the Cartesia Ultra-Fast route, measured from end-of-user-utterance to first audio byte returned to the caller.

**Can I bring my own API keys instead of using yours?**
Yes. The Bring Your Own Keys plan ($99/mo) lets you plug in your own OpenAI, Anthropic, Cartesia, ElevenLabs, Fish Audio, Twilio, Deepgram, HeyGen, Groq, and Zep keys. They are encrypted at rest with Fernet symmetric encryption.

**Is the platform multi-tenant?**
Yes. Per-account encrypted credentials, Supabase Row-Level Security on all 23 tables, and tenant-scoped queries on every endpoint.

**Does it work for Australian phone numbers?**
Yes. Twilio is the telephony backbone, with support for Australian inbound and outbound numbers and Australian voices via Cartesia Sonic 3.

**How do I book a demo?**
Visit https://youraistudio.au/book for 15-minute slots Monday to Friday, 9am–5pm Sydney time, with 2 hours minimum notice. You receive a Google Calendar invite with a Meet link instantly on confirmation.

**Do you support voice cloning?**
Yes, across three engines (ElevenLabs, Fish Audio, Cartesia), selectable per project.

**Which LLMs do you support for live calls and conversations?**
GPT-4.1, GPT-4.1 Mini, Claude Sonnet 4, Claude Haiku 4.5, and Llama 3.3 70B via Groq.

**Is there a free trial?**
Yes, 14 days on every plan.

**How are customer API keys protected?**
Each customer's keys are encrypted at rest using Fernet symmetric encryption. Decryption happens only inside the request lifecycle and only for that customer's tenant scope.

**Can I deploy on my own infrastructure?**
Yes. The platform supports self-hosted deployment with bring-your-own-keys for full data sovereignty.

## Technical requirements

- Twilio account for telephony
- API keys for chosen voice engines (ElevenLabs, Fish Audio, and/or Cartesia)
- OpenAI and/or Anthropic API keys for LLM
- Deepgram API key for transcription
- HeyGen API key for avatars (optional)
- Groq API key for ultra-fast LLM inference (optional)
- Zep Cloud API key for persistent memory (optional)
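
As a rough sketch of how a self-hosted deployment might verify the requirements above before starting, the check below reads credentials from any mapping, such as `os.environ`. The environment variable names and the functions `missing_requirements` and `disabled_features` are illustrative assumptions; this document does not specify how credentials are named or loaded.

```python
from typing import Dict, List, Mapping

# Hypothetical variable names; the platform's real configuration keys
# are not documented here.
REQUIRED: List[str] = [
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "DEEPGRAM_API_KEY",
]

# At least one key from each group must be present.
ANY_OF: Dict[str, List[str]] = {
    "voice engine": ["ELEVENLABS_API_KEY", "FISH_AUDIO_API_KEY", "CARTESIA_API_KEY"],
    "LLM": ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"],
}

# Optional integrations: a missing key only disables the related feature.
OPTIONAL: List[str] = ["HEYGEN_API_KEY", "GROQ_API_KEY", "ZEP_API_KEY"]


def missing_requirements(env: Mapping[str, str]) -> List[str]:
    """List every unmet mandatory requirement; an empty list means ready."""
    problems = [f"missing {name}" for name in REQUIRED if not env.get(name)]
    for label, names in ANY_OF.items():
        if not any(env.get(name) for name in names):
            problems.append(f"need at least one {label} key: {', '.join(names)}")
    return problems


def disabled_features(env: Mapping[str, str]) -> List[str]:
    """List optional keys that are absent, for informational logging."""
    return [name for name in OPTIONAL if not env.get(name)]
```

Calling `missing_requirements(os.environ)` at startup and refusing to boot on a non-empty result keeps misconfiguration from surfacing mid-call; the optional list is only reported, matching the optional items in the requirements.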