Parse HTML and pass to Cognitive Services Text-to-Speech

Summary: Having some fun with Abbott and Costello’s “Who’s on first?” comedy routine, and multiple voices with Bing Speech.

-------------------------------

Hello everyone!

In the last few posts, I showed you the Cognitive Services Text-to-Speech API, including the process to authenticate to it with Windows PowerShell.

It was also a great showcase for Invoke-RestMethod, because it demonstrated that IT professionals can reach REST API services with very little code.

Today, as an IT pro, I’m just going to have some fun. Sometimes that’s the best way to learn how to code.

Initially, all of this came about as a challenge from other members of “Hey, Scripting Guy!” I demonstrated a silly little script I wrote to play Abbott and Costello’s most famous comedy sketch, “Who’s on first?” with the internal voices in Windows. It’s a neat trick that many PowerShell people love to play with, and it starts like this.

# Create the SAPI voice COM object

$voiceAPI=New-Object -comobject SAPI.SPVoice

# Speed up the rate of the Speaker's voice

$voiceAPI.Rate=3

I then retrieved the list of installed voices and, depending on who’s name (yes, that’s his name) I found on a line, picked a voice in Windows.

# Obtain the list of voices in Windows 10

$voiceFont=$voiceAPI.GetVoices()

# Establish a table to match the Microsoft voices with the names of the comedians

$nameMatch=@{'Abbott:' = 'ZIRA'; 'Costello:' = 'DAVID' }
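To make the table lookup concrete, here is a minimal sketch of how a line's leading name could be turned into a voice keyword. The function name Get-VoiceKeyword and the sample lines are hypothetical and are not part of the original script. Selecting the matching voice on the SAPI object is left out, because that step needs the Windows-only COM object.

```powershell
# Table from the post: speaker prefix -> keyword found in the Windows voice name
$nameMatch = @{ 'Abbott:' = 'ZIRA'; 'Costello:' = 'DAVID' }

# Hypothetical helper: return the voice keyword for the speaker that starts the line,
# or $null when the line does not start with a known speaker name.
function Get-VoiceKeyword {
    param(
        [string]$Line,
        [hashtable]$Map
    )
    foreach ($name in $Map.Keys) {
        if ($Line.StartsWith($name, [System.StringComparison]::OrdinalIgnoreCase)) {
            return $Map[$name]
        }
    }
    return $null
}

# Sample lines (made up for illustration)
$abbottVoice   = Get-VoiceKeyword -Line 'Abbott: Who is on first.' -Map $nameMatch
$costelloVoice = Get-VoiceKeyword -Line 'Costello: That is what I want to find out.' -Map $nameMatch
```

The returned keyword could then be compared against the description of each voice in $voiceFont to decide which one speaks the line.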

So it was neat. I had the text file on the hard drive, and it was all fun and games.

Some people said, “Cool, but you should try the same approach with Cognitive Services!”

It was at this point I read and learned everything I showed you in the last several posts. Today we’re going to have some fun: “Who’s on first?” portrayed by the “Azure Cognitive Services Players.”

Challenge #1 – Learn how to use Text-to-Speech in Azure. Accomplished, and built a function to leverage it. I’ve prepopulated all of the available sound file options, so I could just select from an array in this function.

The next challenge was that the returned content for the sketch was one massive string. I needed it broken up into lines for an array.

I’m sure I could have contacted some friends like Tome Tanasovski or Thomas Rayner for some help with regular expressions, but I like trying alternative approaches sometimes.

Many of the lines were prefaced by a CRLF (carriage return / line feed) followed by a tab. I needed that cleaned up.

$CR=[char][byte]13

$LF=[char][byte]10

$Tab=[char][byte]9

$RawSketchContent=$RawSketch.Content

$RawSketchContent=$RawSketchContent.Replace($cr+$lf+$tab,' ')

Once I completed this, I just had a nice list of content terminating in carriage returns. I could split this up into an array now, in the following fashion:

$SketchArray=$rawsketchcontent.split("`r")
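Here is a small, self-contained sketch of the same cleanup and split on a made-up string, so the effect is easy to see. The sample text stands in for $RawSketch.Content and is an assumption. The final Trim and empty-line filter are an extra tidy-up step that the original script does not show.

```powershell
# Made-up sample standing in for $RawSketch.Content:
# the first spoken line wraps onto a second, tab-indented line.
$Sample = "BUD: Who's on first,`r`n`tWhat's on second.`r`nLOU: That's what I want to find out.`r`n"

# Join wrapped lines: CR + LF + Tab becomes a single space
$Joined = $Sample.Replace("`r`n`t", ' ')

# Split on carriage returns, as in the post
$Pieces = $Joined.Split("`r")

# Extra step (not in the original): each piece after the first still starts
# with the leftover line feed, so trim it and drop empty entries.
$Lines = @($Pieces | ForEach-Object { $_.Trim() } | Where-Object { $_ })
```

After this runs, $Lines holds one complete spoken line per element.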

I took a look at the raw HTML and found a “Before” and an “After” marker around the sketch content. I passed the array into Select-String and captured the line numbers of those two markers. This gave me a “Begin” parsing point and an “End.”
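As a rough illustration of capturing those two positions, the sketch below assumes Select-String does the lookup, since its matches expose a 1-based LineNumber property for piped strings. The marker strings and the sample array are hypothetical. The real page's “Before” and “After” text is not shown in this post.

```powershell
# Hypothetical stand-in for $SketchArray, with made-up marker lines
$SketchArray = @(
    '<html>',
    '<!-- sketch begins -->',
    'BUD: Who is on first.',
    'LOU: That is what I want to find out.',
    '<!-- sketch ends -->',
    '</html>'
)

# LineNumber is 1-based: it counts the strings piped into Select-String
$StartofSketch = ($SketchArray |
    Select-String -SimpleMatch '<!-- sketch begins -->' |
    Select-Object -First 1).LineNumber

$EndofSketch = ($SketchArray |
    Select-String -SimpleMatch '<!-- sketch ends -->' |
    Select-Object -First 1).LineNumber

# Collect the lines strictly between the markers.
# Array indexes are 0-based, so line number $a lives at index $a - 1.
$SketchLines = @(
    for ($a = $StartofSketch + 1; $a -lt $EndofSketch; $a++) {
        $SketchArray[$a - 1]
    }
)
```

Because the captured numbers are 1-based and array indexes are 0-based, any loop over them has to subtract one when reading from the array.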

With this achieved, I needed to select two voices in Cognitive Services Text-to-Speech. If you remember Part 4 in the series, we showed the list to choose from. I decided on an Australian female voice for Bud Abbott, and an Irish male voice for Lou Costello.

We need to initialize certain variables to figure out Who is talking (well yes, of course he is, that’s his job), and to store away the audio content.

$CurrentSpeaker='Nobody'

$TempVoiceFilename='whoisonfirst.wav'

Now for the work to begin. We start our loop from the beginning of the content array to the end, and make sure any temporary WAV file is erased from a previous run.

For ($a=$StartofSketch+1; $a -lt $EndofSketch; $a++)

{

Remove-Item $TempVoiceFilename -Force -ErrorAction SilentlyContinue

We then identify a line of content to parse:

$LinetoSpeak=$sketcharray[$a-1]

Each line on the site that has a speaker begins with either BUD: or LOU:, so I used a little RegEx to trap where the identified speaker name ended. Anything after that would be their speaking content.
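One possible way to do that split is sketched below. The function name Split-SpeakerLine and the pattern are assumptions made for illustration and are not necessarily the RegEx used in the original script.

```powershell
# Hypothetical helper: separate the speaker prefix from the spoken text.
# Lines without a BUD: or LOU: prefix come back with Speaker = $null.
function Split-SpeakerLine {
    param([string]$Line)

    # ^\s* skips any leftover line feed or spaces in front of the name;
    # group 1 is the speaker, group 2 is everything after the colon.
    if ($Line -match '^\s*(BUD|LOU):\s*(.*)$') {
        [pscustomobject]@{
            Speaker = $Matches[1].ToUpper()
            Text    = $Matches[2].Trim()
        }
    }
    else {
        [pscustomobject]@{
            Speaker = $null
            Text    = $Line.Trim()
        }
    }
}

# Made-up sample lines; the first keeps the leading line feed left by the split
$first  = Split-SpeakerLine -Line "`nBUD: Who is on first."
$second = Split-SpeakerLine -Line 'and What is on second.'
```

A line with no prefix can be treated as a continuation, so the loop would keep the value already held in $CurrentSpeaker for it.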