Welcome to the Troposphere

For the last few weeks, I’ve been playing with a fabulous new toy called Tropo. Tropo pretends to be a cloudy voicy serious businessy thing, but really it’s just a big fun toy written for me and all my fellow geeks who have always wanted to hack a phone system but never got around to it.

The best way to show you what Tropo can do is to just let you interact with it. So, fire up Skype and call +990009369991430024. You should hear a menu of options, each of which I discuss below. I won’t delve into the code too much: it’s pretty easy to read, and the details are best left to Tropo’s own documentation, which is quite good. My examples are in Ruby, but Tropo also has scripting support for Python, PHP, JavaScript and Groovy (and you could use any language you wanted via the Web API).

1. Leave a Voicemail

def voicemail
  timestamp_filename = Time.now.strftime("%Y-%m-%d--%H-%M-%S")

  event = record('Please leave a message.',
                 { :repeat              => 0,
                   :bargein             => true,
                   :beep                => true,
                   :silenceTimeout      => 2,
                   :maxTime             => 60,
                   :timeout             => 5,
                   :recordUser          => RECORD_USER,
                   :recordPassword      => RECORD_PASSWORD,
                   :recordURI           => "ftp://ananelson.com/voicemails/#{timestamp_filename}.wav",
                   :transcriptionOutURI => 'mailto:ana@ananelson.com',
                   :transcriptionID     => timestamp_filename})

  say "Thank you, we'll make sure Ana gets the message."
end

I loathe checking my voicemail. I much prefer receiving a text message, or just seeing a missed call which I can return. However, I recognize that if someone is calling me and I don’t answer, it’s probably easier for them to leave a voicemail. A tradeoff of inconveniences. Tropo has two features which make voicemail much more fun for me: saving voice messages as audio files and automated transcription. Having voice messages saved as audio files means it’s easy to quickly replay the one you want, save what you need for future reference and delete the rest without having to use your voicemail system’s badly-designed interactive prompting. It’s potentially ironic that one of the benefits for me of using Tropo is that I won’t have to phone an IVR voicemail system to listen to my messages, but the beauty of Tropo is that you can use interactive voice where necessary, but also give people the option of more appropriate and efficient means of interacting with the same data. Also, I’ve never had the pleasure of using a really well-designed voicemail system, one designed to minimize the time I have to spend navigating menus and listening to options, so with the accessibility of Tropo, I look forward to a user experience revolution coming to the world of voice.

Tropo’s automated transcription, while not always word-perfect, is pretty darn good and certainly good enough for getting the gist of a phone message. So, now my caller can have the convenience of voice, and I can receive a text message (or, in this script, an email) from Tropo a few minutes later with the transcription.

2. Do Currency Conversion

def currency_conversion
  currencies = {'dollars' => 'USD', 'pounds' => 'GBP'}

  choice = ask("Say the currency you want to convert from now. Available currencies are #{currencies.keys.sort.join(",")}",
    :choices => "dollars(1, dollar, dollars, bucks, greenbacks), pounds(2, pound, pounds, quid)")

  if choice.name === "choice"
    currency = choice.value
    say "you chose #{currency}"
  else
    raise "not a valid choice"
  end

  choice = ask "Now key in the amount in #{currency}", :choices => '[DIGITS]'
  amount = choice.value
  say "you entered #{amount}"
  
  ccy = currencies[currency]
  
  uri = URI.parse("http://download.finance.yahoo.com/d/quotes.csv?s=#{ccy}EUR=X&f=sl1d1t1c1ohgv&e=.csv")
  currency_data = Net::HTTP.get(uri) 

  euro_conversion_rate = currency_data.split(",")[1].to_f
  euro_amount = euro_conversion_rate * amount.to_i
  
  info = "The value of #{amount} #{currency} is #{euro_amount} euro, using a rate of #{euro_conversion_rate}."

  repeat = true
  while repeat do
    say info
    choice = ask "To repeat this, press 1 now. To return to main menu, press 2 now.", :choices => "1,2"
    if choice.name === "choice"
      repeat = (choice.value.to_i == 1)
    else
      repeat = false
    end
  end
end

The ask() function prompts the user for input, and you tell Tropo the input you’d like to accept using the :choices parameter. Input can come either from the telephone keypad or from voice. I hope the trend moves towards giving users the choice, i.e. “press 1 or say sales, press 2 or say support”. Sometimes it’s more convenient to just speak the option you want without taking the handset away from your ear to look at the keypad, whereas at other times a variety of factors may make voice input inconvenient or impossible. In this case, to indicate dollars you can press the number 1 or say “dollar” or one of several variants on this term. The confirmation will say “dollars” or “pounds”, not “1” or “quid”.
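As a rough sketch of the “press 1 or say sales” idea, the :choices string can be generated from a hash instead of typed by hand. The helper name build_choices and the CURRENCY_INPUTS constant are made up for illustration and are not part of Tropo; the output format simply mirrors the string used in the script above.

```ruby
# Build a :choices string in the "value(input, input, ...)" form used above
# from a hash of canonical value => accepted inputs (keypad digits and words).
# build_choices is a hypothetical helper, not a Tropo function.
def build_choices(options)
  options.map { |value, inputs| "#{value}(#{inputs.join(', ')})" }.join(', ')
end

CURRENCY_INPUTS = {
  'dollars' => %w[1 dollar dollars bucks greenbacks],
  'pounds'  => %w[2 pound pounds quid]
}

puts build_choices(CURRENCY_INPUTS)
```

Keeping the accepted inputs in one hash means a new currency, or a new nickname for an existing one, only has to be added in one place.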

I don’t do any fancy handling if the caller makes an invalid choice. In a user-facing app you’d want a loop which repeats the prompt or provides assistance if the user didn’t make a valid choice. Here I do a basic test and raise an error if no valid choice is made. Because of error handling code elsewhere, this causes the error message to be read out and then ends the call. Assuming a valid choice is entered, the script repeats the choice to the user and then goes on to ask for the numerical amount to be converted, which should be entered using the keypad. Then the exchange rate is fetched from Yahoo! Finance and the result is read out, with a nice little loop to allow the caller to have the result repeated if desired.
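A re-prompting loop of the kind a user-facing app would want might look roughly like this. To make the sketch runnable outside Tropo, say and ask are replaced by stand-ins that replay a scripted list of events; FakeEvent, SCRIPTED_EVENTS, the 'timeout' event name used by the stub and the ask_with_retries helper are all assumptions for illustration.

```ruby
# Stand-ins for Tropo's say/ask so the sketch runs outside Tropo.
# FakeEvent mimics only the .name and .value used in this article.
FakeEvent = Struct.new(:name, :value)
SCRIPTED_EVENTS = [FakeEvent.new('badChoice', nil), FakeEvent.new('choice', 'pounds')]
SPOKEN = []

def say(text)
  SPOKEN << text
end

def ask(prompt, options = {})
  SPOKEN << prompt
  SCRIPTED_EVENTS.shift || FakeEvent.new('timeout', nil)
end

# Hypothetical helper: re-prompt up to max_attempts times and return the
# chosen value, or nil if the caller never makes a valid choice.
def ask_with_retries(prompt, options, max_attempts = 3)
  max_attempts.times do
    event = ask(prompt, options)
    return event.value if event.name == 'choice'
    say "Sorry, I didn't catch that."
  end
  nil
end

currency = ask_with_retries("Say the currency you want to convert from now.",
                            :choices => "dollars(1, dollar), pounds(2, pound)")
say(currency ? "you chose #{currency}" : "Sorry, let's go back to the main menu.")
```

Returning nil instead of raising lets the caller of the helper decide what to do after the last failed attempt, for example return to the main menu.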

I use Net/HTTP and URI libraries, which I require at the beginning of my script like so:

begin
  require "net/http"
  require "uri"
rescue Exception => e
  say e.message
  log e.message
end

This makes it easy to tell if you have required a library which the Tropo scripting environment doesn’t give you access to.
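The same idea can be extended to probe several libraries at once and report which ones are missing. This is only a sketch: missing_libraries is a made-up helper name, the third library name is deliberately fake, and the code assumes an unavailable library raises LoadError (a sandboxed environment might raise something else, which is one reason the snippet above rescues Exception).

```ruby
# Try each library in turn and collect the names that fail to load.
# missing_libraries is a hypothetical helper, not part of Tropo.
def missing_libraries(names)
  names.reject do |name|
    begin
      require name
      true            # loaded (or was already loaded), so drop it from the list
    rescue LoadError
      false           # could not be loaded, so keep it in the list
    end
  end
end

missing = missing_libraries(["net/http", "uri", "no_such_library_for_this_demo"])
puts "unavailable: #{missing.join(', ')}"
```

Inside a Tropo script the final line could be passed to say and log instead of puts.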

3. Listen to “Radio”

def listen_to_radio
  index_page_url = "http://www.archive.org/details/voyageofthescarletqueenotr"
  series_name = "The Voyage of the Scarlet Queen"
  episode_url_base = "http://www.archive.org/download/"

  uri = URI.parse(index_page_url)
  page_text = Net::HTTP.get(uri)

  page_text.match(/IAD\.names = \[(.*)\];/)
  episode_names = $1.split(",").collect {|x| x.gsub('"', '')}

  page_text.match(/IAD\.mp3s = \[(.*)\];/)
  episode_urls = $1.split(",").collect {|x| x.gsub('"', '')}
  
  prompt = series_name
  choices = []
  episode_names.each_with_index do |s, i|
    next if i > 8  # offer at most 9 episodes so every option is a single keypad digit
    prompt << " Press #{i+1} for episode #{s}."
    choices << (i+1).to_s
  end
  
  episode = ask(prompt, :choices => choices.join(",")).value.to_i
  
  episode_url = episode_url_base + episode_urls[episode - 1]  # menu numbers start at 1, array indexes at 0
  say "Press 9 to return to main menu."
  ask(episode_url, :choices => "[DIGITS]")
end

This is a little example which parses a page from one of Archive.org’s collections and reads out the available audio files in that collection, then plays the requested file. In this case, the files are episodes of a vintage radio programme called The Voyage of the Scarlet Queen.
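To see what the two regular expressions are doing, here is a small offline sketch that runs the same kind of match against an invented page fragment. SAMPLE_PAGE and its episode names are made up (the real page’s markup may differ), and extract_js_array is a hypothetical helper.

```ruby
# Invented sample of the inline JavaScript the regexes look for.
SAMPLE_PAGE = <<~HTML
  <script>
  IAD.names = ["Episode A","Episode B","Episode C"];
  IAD.mp3s = ["show/a.mp3","show/b.mp3","show/c.mp3"];
  </script>
HTML

# Pull a quoted, comma-separated JavaScript array out of the page text.
# Returns an empty array if the variable is not found.
def extract_js_array(page_text, variable)
  match = page_text.match(/#{Regexp.escape(variable)} = \[(.*)\];/)
  return [] unless match
  match[1].split(",").map { |item| item.delete('"').strip }
end

names = extract_js_array(SAMPLE_PAGE, "IAD.names")
urls  = extract_js_array(SAMPLE_PAGE, "IAD.mp3s")

# The menu is 1-based ("Press 1 for..."), while the arrays are 0-based.
pressed = 2
puts "#{names[pressed - 1]} -> #{urls[pressed - 1]}"
```

Splitting on commas is fragile: an episode title that itself contains a comma would be cut in two, so a real app would want a sturdier parser.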

I hope that Tropo + archive.org can put an end to evil musical hold. Most of the time when I’m forced to be on hold, I want to listen to… silence. This is because I am going to put the call on speakerphone and do something else until someone answers. Maybe give me a tiny little beep every now and then to reassure me that I haven’t been cut off. Musical hold, of course, makes it impossible to put the call on speakerphone if you’re in an open plan office, and annoying if you’re on your own. Silence might not be to everyone’s taste, in which case there’s a wealth of interesting public-domain listens available on archive.org. Preferably, give callers the choice of silence or a style of music or other audio, or maybe a trivia or math game to play. (Well, okay, preferably don’t make people wait on hold.)

Tropo makes it easy to play music, because the “say” command which normally synthesizes a string into speech will also play any mp3 file it’s pointed at. Typically you would use this to record a real voice to say a prompt which doesn’t need dynamic text in it, but it can also be used for playing hold music or playing “name that tune” games. :-) In a pinch, if you forget your headphones and need to listen to some audio content while at work or in some other context where you can’t cause noise pollution, you could rig up a Tropo app and use your phone to listen to a file anywhere on the internet.

4. Join Conference Call

def join_conference_call(identifier)
  conference(identifier)
end

That’s how easy it is to create a conference call in Tropo. The identifier parameter is a string which identifies the particular conference call to join, so you could have conversations on different topics. The identifiers are account-wide, so if you have 2 different apps they can have a common conference room by using the same identifier. For example, you could have one app which provides the host interface for a teleconference, and another which provides the interface for conference participants. Or, you could have a common help chat room available from all your apps.

You’re welcome to press Option 4, but it’ll be rather dull unless someone else happens to be reading this at the same time. I haven’t provided an exit key, so just hang up when you’re finished listening to the ghostly silence.

Main Menu

begin
  answer
  wait 1000


  finished = false
  until finished do
    choice = make_valid_choice(
      1 => 'leave a message', 
      2 => 'do currency conversion', 
      3 => 'listen to radio', 
      4 => 'join conference call', 
      9 => 'disconnect'
    )
    case choice
    when '1'
      voicemail()
    when '2'
      currency_conversion()
    when '3'
      listen_to_radio()
    when '4'
      join_conference_call("demo1")
    when nil, '9'
      hangup
    else
      raise "should not be allowed to have choice " + choice
    end
    say "going back to main menu, if you are finished, just hang up"
    finished = ($currentCall.state === "DISCONNECTED")
  end
  log "call finished"
  hangup
  
rescue Exception => e
  log e.message
  say e.message
  hangup
end

This is the main loop which answers the call, then presents the main menu options over and over again. Note that Tropo code doesn’t automatically stop running when the caller hangs up, so if you implement a loop, make sure to give it an exit in the event the call is disconnected. Otherwise it’ll iterate rapidly about 100 times and then you’ll get a nasty message in your Tropo log saying that the thread had to be shut down.
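One way to make that exit hard to forget is to put the disconnect check, plus a belt-and-braces cap on the number of rounds, into a small loop helper. This is a sketch only: FakeCall stands in for $currentCall so the code runs locally, the "ANSWERED" state name is an assumption (only "DISCONNECTED" appears in the script above), and MAX_MENU_ROUNDS is an arbitrary number, not a Tropo setting.

```ruby
# FakeCall stands in for Tropo's $currentCall. It reports DISCONNECTED
# after a set number of state checks so the sketch can run locally.
class FakeCall
  def initialize(checks_before_hangup)
    @remaining = checks_before_hangup
  end

  def state
    @remaining -= 1
    @remaining < 0 ? "DISCONNECTED" : "ANSWERED"
  end
end

MAX_MENU_ROUNDS = 20 # arbitrary safety cap

# Run the block once per menu round until the call drops or the cap is hit.
# Returns the number of rounds that were run.
def menu_loop(call)
  rounds = 0
  while call.state != "DISCONNECTED" && rounds < MAX_MENU_ROUNDS
    yield
    rounds += 1
  end
  rounds
end

rounds = menu_loop(FakeCall.new(3)) { } # the real block would present the menu
puts "menu presented #{rounds} times"
```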

I have found that without a “wait 1000” after “answer”, the first word or two of the initial prompt gets cut off. Also note that the entire call is wrapped in a begin/rescue block which, in the event of an exception, reads out the text of the exception then hangs up. For a production app, I’d probably just say something like “Sorry, we have to disconnect this call, please call back later.” or “we’ll have someone call you” and log the error to a file.
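The production-style handling described above could be sketched as follows. The say and hangup methods are local stand-ins for the Tropo functions, with_friendly_errors is a made-up wrapper name, and Ruby’s standard Logger writing to an in-memory buffer stands in for a log file; whether Logger or file access is available inside Tropo’s hosted scripting environment is not something this sketch assumes.

```ruby
require "logger"
require "stringio"

# Stand-ins for Tropo's say/hangup so the sketch runs outside Tropo.
HEARD = []
def say(text); HEARD << text; end
def hangup; HEARD << :hangup; end

LOG_OUTPUT = StringIO.new          # a real app would log to a file instead
ERROR_LOG = Logger.new(LOG_OUTPUT)

# Hypothetical wrapper: run the call flow, keep the details in the log,
# and give the caller a generic apology instead of the exception text.
def with_friendly_errors
  yield
rescue StandardError => e
  ERROR_LOG.error("#{e.class}: #{e.message}")
  say "Sorry, we have to disconnect this call, please call back later."
  hangup
end

with_friendly_errors { raise "not a valid choice" }
puts HEARD.inspect
```

The sketch rescues StandardError, which covers ordinary runtime errors; the script above rescues Exception, which also catches things like a failed require.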

The make_valid_choice function is a helper I wrote which allows me to pass a hash of options and automatically generate a menu. I haven’t followed my own rule here of allowing “press 1 or say voicemail” but that would be easy to implement.

def make_valid_choice(choices_hash)
  prompt = ""
  choices_hash.sort.each do |k, v|
    prompt << "To #{v}, press #{k} now. "
  end
  choices = choices_hash.keys.collect {|k| k.to_s}.sort.join(",")

  valid = false
  choice = nil

  until valid do
    event = ask(prompt, {:choices => choices, :mode => 'dtmf'})

    if event.name === 'choice'
      choice = event.value
      valid = true
    elsif event.name === 'badChoice'
      say "Sorry, that is not an available option."
    else
      say "unexpected event #{event.name}"
    end

    break if $currentCall.state === "DISCONNECTED"
  end

  choice
end

Say Each

I didn’t use it in any of these examples, but here’s a method which says each individual character in a string. This is useful if you’re having someone input a number and want to confirm it. The pause_after parameter lets you split the string up into blocks of, say, 4 characters with a longer pause between blocks to help with auditory parsing.

def say_each(s, pause_after = 0)
  length = s.length
  0.upto(length-1) do |i|
    say s[i, 1]
    if (pause_after > 0) && ((i+1) % pause_after == 0)
      wait 1000
    end
  end
end
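An alternative way to get the same grouping, shown here only as a sketch, is to split the string into blocks first and then read each block out in turn. The readback_groups name and the sample number are made up for illustration.

```ruby
# Split a string into groups of group_size characters.
# The last group may be shorter if the length is not an exact multiple.
def readback_groups(s, group_size)
  s.scan(/.{1,#{group_size}}/)
end

groups = readback_groups("4155551234", 4)
puts groups.inspect  # => ["4155", "5512", "34"]
```

Each group could then be passed to say_each, with a wait between groups.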

Not Just Voice

While it takes a little tweaking, your Tropo app can be used not only for voice interaction, but also for SMS, Instant Messaging or Twitter. And, you can phone a Tropo app via Skype, a “normal” phone number which can be rented anywhere in the world, or other new-fangled telephony interfaces.

Also, Tropo can initiate calls (or send SMS), not just receive them.

Tropo

You can write Tropo apps either as standalone scripts which Tropo can host for you, or by using the Web API. The Web API is far more powerful and flexible since you have access to any libraries you have installed in your environment. The scripting environment gives you access to a limited subset of your language’s functionality, but is far simpler and the best way to get started.

If Tropo encounters an error, the call will disconnect. You’ll probably be able to see the error in Tropo’s logs, but I suggest you wrap your entire script in a try/catch or rescue block, then in your error handling code write “say(err.message)” followed by the “hangup” command. This will mean the error gets read out to you while you’re on the call. This is especially useful when you’re first starting out and aren’t sure what libraries or methods are available to you in the scripting environment. If you can’t phone your app at all (Skype may say the number is busy or unavailable), you probably have a syntax error in your script. I run my script locally after I make changes to it; if I get an error saying “undefined method say” or similar, then I know the script parsed and ran as far as the first Tropo call, so I don’t have any syntax errors.
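A local syntax check can also be done without running the script at all. The sketch below assumes the standard Ruby interpreter (MRI), where RubyVM::InstructionSequence.compile parses source code without executing it; syntax_error_in is a made-up helper name. From a shell, ruby -c yourscript.rb does a similar job.

```ruby
# Returns nil when the source parses, or the parser's message when it does not.
# Assumes MRI, where RubyVM is available. The code is compiled, never executed,
# so calls to Tropo functions such as say do not need to exist locally.
def syntax_error_in(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue SyntaxError => e
  e.message
end

good = "def greet\n  say 'hello'\nend\n"
bad  = "def greet\n  say 'hello'\n" # missing the closing end

puts syntax_error_in(good) ? "syntax error found" : "ok"
puts syntax_error_in(bad) ? "syntax error found" : "ok"
```

This only catches syntax errors. A script that parses locally may still fail on Tropo if it uses a library or method the scripting environment doesn’t provide.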

Cost

Tropo is free for development, but not free for production use. There are per-minute charges for inbound/outbound voice and transcription, and rental charges for local phone numbers. These charges are very low, especially by old-school telephony standards, and you would need to be handling a very large call volume for it to make sense to invest in your own hardware, software and setup time.

Review

So, I’m definitely a fan of Tropo and I’ll be adding voice integration to some of my web apps at the first opportunity, as well as building myself a custom voicemail system.

Tropo is not the only company in this space. Another is Twilio, which I haven’t tried (mainly because they don’t have a big “hello world” on their home page). Twilio has an XML-based API with helper libraries in various languages. It has support for SMS and limited conferencing (up to 20 participants in a room) but not instant messaging or Twitter integration. Because Twilio doesn’t provide local phone numbers outside of the US or Canada (for now), it’s not something I’m going to spend much time looking at, though I will check back periodically. The fact that there’s competition in this space is reassuring.

I don’t have much to say about Tropo on the negative side. Occasionally some errant sound or perhaps errant electron causes Tropo to think it has received a blank response when I haven’t pressed a button, but this doesn’t happen that often and it can be handled with a good healing loop. (Update: this is fixed by specifying :mode => 'dtmf', as per Adam’s comment.) There’s a little room for improvement in the synthesized voice quality, but it’s more than adequate for most purposes and better than most systems I interact with, and you can always record your own prompts. The call quality is like a good Skype connection. There are American and British variants of English available, Castilian and Mexican variants of Spanish, along with French, German, Italian and Dutch; presumably more languages and voice options will be forthcoming in the quest for global domination. Being in Ireland, I’d love to see Polish and Chinese added. :-)

Speaking of Ireland, there are Dublin phone numbers available from Tropo (mine is 019010129) but as yet these numbers cannot receive SMS messages, although I understand that Tropo is working to make this available.

Tropo is FUN. Really, really fun. And the fun starts almost immediately. The scripting interface is almost pure fun; web API development is a little more like work but much more powerful. Tropo is a tremendously useful tool, both for automating voice tasks and for integrating voice with web applications or data sources. It can be easy to forget that the vast majority of people don’t have iPhones or Android phones, and plain vanilla voice is a reliable, familiar and flexible tool.

It’s also a tool that has been relatively inaccessible. Tropo, with its cloud infrastructure allowing basically zero setup and a very programmer-friendly environment, is going to open up an industry that has been very protected against the need to innovate at internet-speed.




Adam Kalsey 15 Apr 2010

Ana, this is a fantastic article. Love the rich samples you've created here. Very well done.

We're glad you're having fun with Tropo.

If you're doing keypad input only and don't need any recognition, you can tell Tropo to turn off voice recognition for that particular prompt. Just set the mode to "dtmf" (which is telco speak for Dual Tone Multi Frequency).

For example:

choice = ask "Now key in the amount in #{currency}", {:choices => '[DIGITS]', :mode => 'dtmf' }

Jason Goecke 15 Apr 2010

Great write-up! Thank you for taking the time to do this.

On the occasional audio issue, if you are using Skype you will experience this once in a while as it is simply an artifact of Skype. We provide the Skype access for easy developer access, but do not recommend for full production apps at this time for the reason you cite.

On languages, we have rolled out 8 of the 20 languages we have available, so stay tuned as we roll out more of those!

