
Google Text-To-Speech (TTS)

Update: Andufo shared the happy news that more languages are now available in the Google TTS service! I have added a new language selection drop-down for English, Spanish, French, German, Italian, and Haitian Creole.

Google Translate announced the ability to hear translations into English spoken via text-to-speech (TTS). Looking at the Firebug Net panel to see where this TTS data was coming from, I saw that the speech audio is in MP3 format and is fetched via a simple HTTP GET (REST) request: http://translate.google.com/translate_tts?tl=en&q=text. Google Translate originally noted that the speech was only available for short translations to English. Multiple languages are now supported, and it turns out that the TTS web service restricts the text to 100 characters. Another restriction is that the service returns 404 Not Found if the request includes a Referer header (presumably one that is not for translate.google.com).
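To make the request format concrete, here is a minimal Python sketch that builds such URLs without sending anything over the network. The endpoint, the `tl` and `q` parameters, and the 100-character limit come from the observations above; the helper names `split_text` and `tts_url` are made up for illustration, and splitting on spaces is just one plausible way to stay under the limit, not something the service prescribes.

```python
from urllib.parse import urlencode

# Endpoint and limit as observed in this post; the service may behave differently today.
TTS_ENDPOINT = "http://translate.google.com/translate_tts"
MAX_CHARS = 100


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking on whitespace."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into fixed-size pieces.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def tts_url(text: str, lang: str = "en") -> str:
    """Build the GET URL for one chunk of text (hypothetical helper)."""
    if len(text) > MAX_CHARS:
        raise ValueError(f"text exceeds {MAX_CHARS} characters; split it first")
    return f"{TTS_ENDPOINT}?{urlencode({'tl': lang, 'q': text})}"


if __name__ == "__main__":
    for chunk in split_text("The quick brown fox jumps over the lazy dog. " * 5):
        print(tts_url(chunk))
```

Any client fetching these URLs would also have to avoid sending a Referer header, given the 404 behavior noted above.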

In spite of the limitations of the web service, which certainly reflect the intention that it only be used by Google Translate, HTML5's new Audio element and rel="noreferrer" allow client-side web applications to use the service, like the following (Google Chrome 4 recommended):

Google Text-To-Speech (TTS)

I am really excited at the prospect of text-to-speech being made available on the Web! It's just too bad that fetching MP3s from a remote web service is currently the only standard way of doing so; modern operating systems all have TTS capabilities, so it's a shame that web apps can't utilize them via client-side scripting. I posted to the WHATWG mailing list about such a Text-To-Speech (TTS) Web API for JavaScript, and I was directed to a recent thread about a Web API for speech recognition and synthesis.

Perhaps there is some momentum building here? Having TTS available in the browser would boost accessibility for the visually impaired and improve usability for people on the go. TTS is just another technology that has traditionally been relegated to desktop applications, but as the Open Web advances as the preferred platform for application development, it is an essential service to make available (as with the Geolocation API, Device API, etc.). And besides, I want to build TTS applications and my motto is: If it can't be done on the Open Web, it's not worth doing at all!

Accepted into UW Computational Linguistics Master’s Program

A couple years ago I learned of the University of Washington’s Computational Linguistics Master’s program and I was really interested. But since I had moved to Portland, it wasn’t feasible for me to attend classes—this is especially true now since I am employed here and got married a year ago.

For my birthday this year, my dad gave me one of the best presents ever: John McWhorter’s audio course “The Story of Human Language”; LaVonne and I couldn’t get enough of his lectures. I’m sad to say I’ve finished them, but listening to them re-piqued my interest in the academic study of Linguistics; so I meandered over to UW’s Computational Linguistics website and, to my shock, I found that the entire program can now be completed via online correspondence! I was so excited! I was already a couple months past the admission application deadline, but I contacted the department and got approval to apply. Now, a few weeks later, I have just received news that I have been accepted into the program! This is one of my dreams come true!

For some more background and the reasons why I’m excited about this program, I’ve included below the statement of purpose I wrote for the application. If everything works out, I will begin studying part-time this Fall while still being employed full-time at Shepherd Interactive. (Note: when talking about Open Scriptures below, I don’t mention the others who are working so hard alongside me—it’s true I started it, but now it’s “my project” only in the sense that I am but a part of it.)

Statement of Purpose for Application to UW Computational Linguistics Program

Using the computer to solve linguistic problems has been a core interest of mine for the past decade. While studying and then entering the workforce as a web application developer, I have continued to study linguistics and languages on the side. When I started my undergraduate studies at Seattle Pacific University, I was intending to create a self-designed major in computational linguistics, but I was disappointed to find that the faculty weren’t experienced enough in this area to advise me. So I made do by majoring in computer science and minoring in both linguistics and Spanish. I thoroughly enjoyed taking CS and linguistics courses in parallel, taking concepts from one and applying them in the other; for example, I studied the Chomsky hierarchy in my linguistics syntax course and then applied its concepts in my CS compiler design course. Furthermore, I used my web application development skills to create relevant applications along the way, like a syntax tree drawer and a popular IPA chart keyboard tool, and I also completed various linguistics assignments by publishing them on the Web.

Although I learned much from taking computer science and linguistics in parallel, I have missed out on the focused intersection of the two in the sub-field of natural language processing. It is my desire to satisfy my initial undergraduate computational linguistic aspirations in the Master’s program at the University of Washington.

With regard to applying what I would learn in the program, I am the founder of the Open Scriptures project, an initiative which seeks to interlink the various scriptural corpora and derivative datasets to create a Linked Data infrastructure for scripture, and on this foundation to provide a platform that allows developers to build innovative applications with the available data. One of the key problem areas in this endeavor is the alignment of translated texts with their source manuscripts. I had been planning to use collective intelligence to power the semantic interlinking of the texts, but I have come to realize that NLP will be necessary to achieve the desired results. The concepts and techniques I learn in the Master’s program would be directly applicable to my project.

I am an advanced Spanish speaker, and I have also taken one term each of French and Koiné Greek, and two terms of Biblical Hebrew. I am especially interested in Semitic languages and corpus linguistics of the Hebrew Tanakh and the Arabic Qur’an. I am studying Arabic on my own, and have attained a novice familiarity with the language. I desire that this Master’s program in computational linguistics would be a stepping stone to further graduate studies in the field.

Newly Designed Open Scriptures Website with Blog

I haven’t formally announced this here yet, but there is a newly designed Open Scriptures website with blog. Future posts regarding Open Scriptures will be posted there. There are two new posts, the first regarding the recently-unveiled exciting Tagged Tanakh project, and the second about redeeming the ill-fated Re:Greek Open Source Initiative with a call for participation. Check them out and subscribe to the feed.

Please also follow Open Scriptures on Twitter.

Join in on the discussion on the Google Group.