Apologies for the delay - life caught up with me after my Christmas break. I had drawn up the project's overarching design a few weeks ago, but I didn't get a chance to transfer it to my computer until now.
So here it is:
It all starts with the user entering an alphanumeric string, which the Text Reader converts into phonetic metadata. The Sound Maker then transforms that metadata into an audio file for the user to listen to.
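As a rough illustration of that flow, here is a minimal Python sketch of the two stages. Everything in it is an assumption made for illustration, not the project's actual design: the function names `text_reader`, `sound_maker`, and `speak`, the idea of representing phonetic metadata as a list of single-character tokens, and the use of plain sine tones in place of real speech synthesis.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)


def text_reader(text):
    """Convert an alphanumeric string into toy 'phonetic metadata'.

    Placeholder only: each letter or digit becomes one token. Real
    grapheme-to-phoneme rules would replace this.
    """
    return [ch.lower() for ch in text if ch.isalnum()]


def sound_maker(tokens, duration=0.1):
    """Render each token as a short sine tone and return WAV file bytes."""
    frames = bytearray()
    samples_per_token = int(SAMPLE_RATE * duration)
    for token in tokens:
        # Arbitrary pitch per token; this stands in for real synthesis.
        freq = 200 + 20 * (ord(token) % 32)
        for n in range(samples_per_token):
            sample = 0.5 * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            frames += struct.pack("<h", int(32767 * sample))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def speak(text):
    """Run the full pipeline: text -> phonetic metadata -> audio bytes."""
    return sound_maker(text_reader(text))
```

Writing the result of `speak("Hello 123")` to a file opened in binary mode produces a playable WAV file. The point of the sketch is the separation between the two stages: either one can be swapped out as long as they agree on the format of the metadata passed between them.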
Additional software requirements include:
- A highly customizable database that can house a variety of voices;
- A user-friendly setup process and interface; and
- (Optional) Adaptability to different languages, specifically in the Sound Maker.
The diagram may look deceptively simple, but putting it into practice requires knowledge of current industry standards for text-to-speech software. The challenge for me is moving beyond what I learned in two fourth-year university courses on computational linguistics, which I took over five years ago.
Another major challenge is making the system user-friendly: easy to set up and easy to use. The target user will be proficient at using a computer in a workplace, but not at developing software. I will need to pay particular attention to voice training, which may be too complex for the average user to complete.
I believe this is a good project for practicing how I convey my ideas to others and for venturing into unfamiliar territory. Expect my next post - a design breakdown of the Text Reader and maybe the Sound Maker - around the end of February. Work is catching up with me and requires a lot more attention than usual.
In the meantime, keep your stick on the ice!
