
Text to Speech with Cloudflare Workers AI

Build a fast text-to-speech app with Cloudflare Workers AI that turns any text into natural-sounding audio in under 3 seconds.

What if you could turn any text into speech instantly? Not just robotic-sounding audio, but natural, human-like voices that bring your words to life.

I built exactly that using Cloudflare Workers AI — a text-to-speech app that converts written words into audio in under 3 seconds, running entirely at the edge of the internet.

How It Works

The app uses a simple but powerful architecture:

[Figure: TTS Design]
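The request flow behind an architecture like this can be sketched roughly as follows. This is a minimal sketch, not the app's actual source. It assumes the Worker has an AI binding named `AI`, that the model ID is `@cf/myshell-ai/melotts`, and that the model returns base64-encoded MP3 audio in an `audio` field, which is my reading of the Workers AI model page. The `TtsBinding` interface, the `{ "text": ... }` request body, and the `X-TTS-Latency-Ms` header are hypothetical names chosen for illustration.

```typescript
// Simplified shape of the Workers AI binding (an assumption for this sketch).
interface TtsBinding {
  run(model: string, input: { prompt: string; lang?: string }): Promise<{ audio: string }>;
}

interface Env {
  AI: TtsBinding;
}

// Assumed model ID for MeloTTS on Workers AI.
const MODEL = "@cf/myshell-ai/melotts";

// Decode a base64 string into raw bytes.
function base64ToBytes(b64: string) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Use POST with a JSON body", { status: 405 });
    }

    // Hypothetical request shape: { "text": "..." }
    let text = "";
    try {
      const body = (await request.json()) as { text?: unknown };
      text = typeof body.text === "string" ? body.text.trim() : "";
    } catch {
      return new Response("Body must be valid JSON", { status: 400 });
    }
    if (text === "") {
      return new Response("Missing text", { status: 400 });
    }

    // Time the model call so the client can store a latency value in milliseconds.
    const started = Date.now();
    const { audio } = await env.AI.run(MODEL, { prompt: text, lang: "en" });
    const latency = Date.now() - started;

    return new Response(base64ToBytes(audio), {
      headers: {
        "Content-Type": "audio/mpeg",
        "X-TTS-Latency-Ms": String(latency),
      },
    });
  },
};

// In a real Worker module this object would be the default export:
// export default worker;
```

Keeping the binding behind a small interface makes the handler easy to exercise with a fake `AI` object before deploying.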

Key Features

Smart Input Design: a Siri-inspired, morphing input UI

Local Storage: all recordings are saved locally with metadata for easy management

{
  "id": "b19bace8-5833-435e-8aca-a06760a217fa",
  "text": "Your text here",
  "latency": 2803,
  "createdAt": "2025-10-19T13:54:07.029Z"
}
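A record like the one above could be written and read back with a small helper. This is a sketch under assumptions: the post does not say whether "locally" means `localStorage`, IndexedDB, or something else, so the code targets a minimal `KeyValueStore` interface that the browser's `localStorage` happens to satisfy. The storage key `tts-recordings`, the helper names, and the reading of `latency` as milliseconds are hypothetical. Only the metadata is handled here, not the audio itself.

```typescript
// Metadata for one recording, mirroring the JSON shape shown above.
interface RecordingMeta {
  id: string;
  text: string;
  latency: number; // assumed to be milliseconds
  createdAt: string; // ISO 8601 timestamp
}

// The subset of the Web Storage API this sketch needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key.
const STORAGE_KEY = "tts-recordings";

// Read all saved records; fall back to an empty list on missing or corrupt data.
function loadRecordings(store: KeyValueStore): RecordingMeta[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as RecordingMeta[]) : [];
  } catch {
    return [];
  }
}

// Prepend a record so the newest recording comes first.
function saveRecording(store: KeyValueStore, record: RecordingMeta): void {
  const next = [record, ...loadRecordings(store)];
  store.setItem(STORAGE_KEY, JSON.stringify(next));
}
```

In a browser, this might be called as `saveRecording(localStorage, { id: crypto.randomUUID(), text, latency, createdAt: new Date().toISOString() })`.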

[Figure: Local Storage]

Model Details

MeloTTS — A high-quality multi-lingual text-to-speech library by MyShell.ai

| Model Info | Details |
| --- | --- |
| Unit Pricing | $0.0002 per audio minute |
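To put the per-minute rate in perspective, one hour of generated audio works out to 60 × $0.0002 = $0.012. The helper below is a rough estimator only: it assumes cost scales linearly with audio duration, and the function and constant names are hypothetical. How partial minutes are actually billed is not stated in the post.

```typescript
// Rate quoted in the pricing table above.
const MELOTTS_USD_PER_AUDIO_MINUTE = 0.0002;

// Estimate cost for a given audio duration, assuming linear proration.
function estimateMeloTtsCostUsd(audioSeconds: number): number {
  if (!Number.isFinite(audioSeconds) || audioSeconds < 0) {
    throw new RangeError("audioSeconds must be a non-negative finite number");
  }
  return (audioSeconds / 60) * MELOTTS_USD_PER_AUDIO_MINUTE;
}
```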

Try It Yourself

Demo Text: "S R B is active on X dot com — follow him there"

Source Code: GitHub Repository

Live: text-to-speech.srb.codes

Coming Soon: Real-time TTS with Aura-1 — Deepgram's context-aware text-to-speech model that applies natural pacing, expressiveness, and fillers based on text context. Perfect for live conversations and real-time applications.

| Model Info | Details |
| --- | --- |
| Real-time | Yes |
| Unit Pricing | $0.015 per 1k characters |
| Speakers | 12 voice options available |
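Because Aura-1 is priced by input length rather than audio duration, cost can be estimated before any audio is generated. The sketch below assumes a "character" is one UTF-16 code unit (JavaScript's `string.length`) and that cost scales linearly; how characters are actually counted for billing is not stated here, and the function and constant names are hypothetical.

```typescript
// Rate quoted in the table above.
const AURA_USD_PER_1K_CHARACTERS = 0.015;

// Estimate cost from input length, assuming string.length matches the billed character count.
function estimateAuraCostUsd(text: string): number {
  return (text.length / 1000) * AURA_USD_PER_1K_CHARACTERS;
}
```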

Ready to build your own? The code is open-source and ready to deploy on Cloudflare Workers.