A plugin for speech manipulation?

  • There is a company called Phonetic Arts that created a cool speech synthesis technology that generates continuous commentary from a voice recording (although it's more complicated than that). There were videos of it on the web, but since Google bought the company, they've all disappeared, and Phonetic Arts has pretty much disappeared with them in terms of visibility. However, I have a hunch that a similar, simpler type of technology could be built with some kind of tool that manipulates speech, maybe one able to manipulate each word in a typed sentence as well as the entire sentence.

    So, I was using the text-to-speech object in Construct the other day and was wondering whether a plugin could be made that manipulates the tone, inflection, depth and speed of the speech voice. Right now, I think the object is nice, but being able to shape the voice would make it incredibly useful for RPG/drama games and would be huge for sports games that require play-by-play commentary. This way, you could make all kinds of voices to give your games a more professional edge, and you wouldn't have to load your game up with a bunch of external audio files.
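
    One way to picture the per-word and per-sentence control described above is SSML, the W3C markup whose `<prosody>` element carries `pitch`, `rate` and `volume` attributes. The sketch below only builds the markup. Whether Construct's text-to-speech object accepts SSML is not known from this thread, and the function names `prosody` and `sentence_to_ssml` are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(markup, pitch=None, rate=None, volume=None):
    """Wrap already-escaped markup in <prosody> if any setting is given."""
    settings = {"pitch": pitch, "rate": rate, "volume": volume}
    attr_text = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in settings.items()
        if value is not None
    )
    return f"<prosody{attr_text}>{markup}</prosody>" if attr_text else markup


def sentence_to_ssml(words, **sentence_settings):
    """Build an SSML fragment from (word, settings) pairs.

    Per-word settings shape single words; sentence_settings shape the
    whole sentence. A complete SSML document would also need version and
    namespace attributes on <speak>, omitted here for brevity.
    """
    body = " ".join(prosody(escape(word), **settings) for word, settings in words)
    return f"<speak>{prosody(body, **sentence_settings)}</speak>"


commentary = sentence_to_ssml(
    [
        ("He", {}),
        ("shoots", {"pitch": "+20%", "rate": "fast"}),
        ("and", {}),
        ("scores!", {"volume": "loud"}),
    ],
    rate="medium",
)
print(commentary)
```

    Nesting a per-word `<prosody>` inside a sentence-level one is what gives the "each word plus the entire sentence" control. How faithfully a given engine renders those attributes varies.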

  • Mr miller, I actually have a few ideas for how speech synthesis could be improved drastically with procedural inflection and emotion, using the recorded-phonemes type of synthesis. Unfortunately, when I asked a similar question a long time ago, the responses I got suggested the object uses a built-in Windows speech synthesis feature that doesn't have much flexibility. The source code is not available on the SVN, either, so I can't check.

  • lucid,

    Darn. It would've been cool.

    As an alternative to learning how Phonetic Arts might've worked with a main sound file as their foundation source, I've been trying to find a way to extract some speech tracks from some old PS1 and PS2 sports games. If I can ever find a way to extract them, I plan to turn the files into a library that I can edit by importing mass amounts of files into something like Audacity and then exporting them to predefined file folders that could be used in any program that accepts external files. It seems like the easy part is everything besides finding a way to extract the files, but I'm going to keep searching until I find a tool or something.
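
    The "export to predefined file folders" step could be scripted once the clips exist. This is a minimal sketch, assuming a made-up naming convention in which each clip is called `<category>_<anything>.wav`. The function name `sort_clips` and the `unsorted` fallback folder are hypothetical, and extracting the audio from the games is not covered.

```python
from pathlib import Path
import shutil


def sort_clips(source_dir, library_dir, extensions=(".wav",)):
    """Copy clips into library_dir/<category>/ based on the filename prefix.

    Assumed convention: 'goal_01.wav' belongs to the 'goal' category.
    Clips without an underscore go into 'unsorted'.
    """
    copied = []
    for clip in sorted(Path(source_dir).iterdir()):
        # Skip folders and anything that is not an audio clip.
        if not clip.is_file() or clip.suffix.lower() not in extensions:
            continue
        category = clip.stem.split("_", 1)[0] if "_" in clip.stem else "unsorted"
        target_dir = Path(library_dir) / category
        target_dir.mkdir(parents=True, exist_ok=True)
        destination = target_dir / clip.name
        shutil.copy2(clip, destination)  # copy, so the originals stay untouched
        copied.append(destination)
    return copied
```

    Calling `sort_clips("exported", "library")` would leave the source folder as it was and build one subfolder per category under `library`.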

  • Well, there are quite a few TTS engines.

    http://sourceforge.net/directory/os:windows/?q=speech%20synthesis

    eSpeak

    FreeTTS (Java based)

    Hephaestus

    What they do have are complex phoneme creation algorithms, sometimes including intonation and rhythm. What they don't have is a professional voice talent recorded and split into phonemes, so the voice is computer generated (say, built from basic waveforms) instead.
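
    As a rough illustration of the control such an engine exposes, the eSpeak command-line program takes a pitch (`-p`, 0 to 99), a speed in words per minute (`-s`), a voice (`-v`) and an optional WAV output file (`-w`). The sketch below only assembles the command. The helper names `espeak_command` and `speak` are made up, and it assumes an `espeak` executable is on the PATH if you actually call `speak`.

```python
import shutil
import subprocess


def espeak_command(text, pitch=50, speed=175, voice="en", wav_path=None):
    """Build the argument list for an eSpeak call without running it."""
    if not 0 <= pitch <= 99:
        raise ValueError("eSpeak pitch must be between 0 and 99")
    command = ["espeak", "-v", voice, "-p", str(pitch), "-s", str(speed)]
    if wav_path is not None:
        command += ["-w", str(wav_path)]  # write a WAV file instead of playing
    command.append(text)
    return command


def speak(text, **options):
    """Run eSpeak if it is installed; return False when it is missing."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(espeak_command(text, **options), check=True)
    return True


# A deeper, slower voice versus a higher, faster one for the same line.
narrator = espeak_command("Welcome to the match.", pitch=20, speed=130)
announcer = espeak_command("Welcome to the match.", pitch=80, speed=210)
```

    These settings apply to a whole utterance, so shaping individual words this way would mean one call per word or phrase.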

  • Ah damn, I typed a detailed reply filled with ideas, etc. Too lazy to go into it again, so thanks, Tulamide, for the links!
