MOSS-TTS-Local-Transformer-v1.5
A high-fidelity, multilingual text-to-speech model supporting native 48 kHz stereo output.
huggingface.co
Built with
Unknown
Build evidence
Strong
This is an official model repository hosted on Hugging Face by the OpenMOSS-Team, providing code, installation guides, and usage examples.
Creator
OpenMOSS-Team @OpenMOSS-Team
Shipped
2h ago

MOSS-TTS-Local-Transformer-v1.5 is a transformer-based text-to-speech model supporting zero-shot voice cloning, multilingual synthesis across 31 languages, and fine-grained control over duration and prosody. It features improved stereo audio quality through the MOSS-Audio-Tokenizer-v2 and enhanced stability for consistent voice cloning and punctuation-aligned pauses.



