Want to See Your Course as an Example?

Dr. Joel Harband has made an offer to create some brief examples using PowerPoint-based courses that a corporation had developed using a real voice. He would need the original PowerPoint presentation of the course together with the narration script that was used. He could then create part or all of the course using TTS voices instead and we could demo the result. Of course, I would want to show the original so we can show a comparison. If you would like to do this, please contact me via email: akarrer@techempower.com

Questions?

And, if you have any questions at this point in the series on the use of Text-to-Speech, please ask away.

Monday, October 25, 2010

I received an inquiry about resources that would help instructors who are about to move into teaching online courses. It made me immediately think back to my first experience with an online session.

It was the first ever public session for Placeware - a virtual meeting software company that was much later acquired by Microsoft and became Microsoft Live Meeting. Because it was their first ever public session and my first ever online session, neither of us knew what we were doing. The topic was roughly (surprise) New Technology for eLearning. They had 100 people participating. And because it was public, they made sure that everyone was muted, including the moderator.

So we start the session and I’m sitting alone in front of my computer at Loyola Marymount (this must be before 2000). I was holding my handset to my ear (no headset in my office). And I had prepared the way I always did at that point for live, in-person audiences. Remember I taught class several times a week to live audiences, and this was a topic that I presented all the time at professional conferences. No problem, right?

I was not at all prepared for what I experienced. About five minutes into the presentation, I was alone in my office with everyone muted (literally zero sound coming back through) and no prepared stopping points for interaction. Any good presenter who faces an audience that is completely quiet and sitting still knows they are dying. And I felt like I was completely dying. There was zero feedback. I felt my energy level get used up completely. I was doing everything I could to make it more interesting, but no matter how much passion I put into the phone – no reaction. Panic set in roughly 7 minutes into the presentation!

After that experience, I vowed to try to stink up virtual presentations less in the future. And every once in a while, I realize that I’m not doing a good job. It definitely takes additional thinking/preparation to be good online. And that’s only a single session. If you are going to teach a course online or run an online learning event or an online conference, then there’s even more to being successful at that.

So, what I thought I would do is go back and see what good resources I could find that would help me and could be used by instructors to be better prepared to teach online. What a difference a decade makes – now there’s almost TOO much information.

As always I do this by looking through eLearning Learning and related sites like Communities and Networks Connection. I looked at Virtual Classroom, Distance Learning, ILT, Teaching Distance Learning. I also did some quick searches for various kinds of things and added them into eLearning Learning (via delicious). So together, I’ve collected a bunch of resources pretty quickly. That said, there’s so much already out there on this – I’m at this point not quite sure what the real question was/is. Certainly a lot of this is already findable. I hope this is useful. But I think the problem at this point might be something else. Still here are 60 great resources.

Books

By going to one of these on Amazon – you can easily find a TON of additional books.

Side note – one of the cool things is that one of these book recommendations came because my delicious activity (related to eLearning) auto-tweets, and someone saw my tweets and then put in a recommendation for their book on the subject. I would guess that’s an automated search or something – but it’s still a very smart way to market/PR.

One of the concerns raised by various comments during the series has been around the quality of the results of Text-to-Speech (TTS) Voices and if that was suitable for eLearning. This issue was partly addressed in the previous post. In this post we’ll take a different cut at it by looking at how authors can use punctuation and mark-up language with TTS voices to bring out the meaning of the text more accurately and to make them more interesting. Using these techniques a voice can be made similar enough to human narration to hold a learner’s interest during an entire eLearning course - with a retention rate equivalent to that of a human voice.

Value and Concern Around Voice-Over

Before we jump into this specific topic, let’s look back at some of the specifics from last month’s Big Question - Voice Over in eLearning. Here’s a very quick summary of some of the responses regarding the added learning value of a voice-over as opposed to plain screen text:

Audio provides an additional channel of information which the brain can process in parallel with the visual information [Kapp].

A voice should not just read screen text [Kapp] but can optionally be supported by running subtitles at the bottom of the slide as in Captivate and Speech-Over [Joel].

A great deal more information per slide can be transferred with voice than with plain text. One minute of speech is equivalent to 125 words – which would crowd the slide considerably [Joel].

A lively and interesting voice can motivate learning and increase retention. [Mike Harrison]

A voice can often express the intended meaning more accurately than plain text by changing speed, volume and pitch, emphasizing words, and pausing for emphasis [Mike Harrison] (This is the prosody that we discussed in the first post). For example: HE reads well. He READS well. He reads WELL.

It’s these last two points that relate closely to this topic. Ultimately, we would like the voice (human or TTS) to be lively and interesting, help increase motivation and learning, and convey the meaning more accurately.

Some of the concern around the use of Text-to-Speech Voices in eLearning is whether you can achieve that level of voice use.

Making the Author into a Voice Talent

Today’s post aims to show that with state-of-the-art tools that simplify the use of markup language, like Speech-Over Professional, TTS voices can easily be made interesting as well as prosody-accurate (points 4 and 5 above).

The concept presented here is a bit of a change in thinking:

An author together with a TTS voice is equivalent to a voice talent!

While handling the grammar quite well, the TTS voice by itself cannot know the nuances and emphases (prosody) needed to bring out the intended meaning of the sentence and will produce a compromise prosody. Authors need to fill the gap. Some people in the world of TTS call them “Text Authors.” Throughout this post, we will refer to them simply as “authors” as they likely are also the course author. Authors know what the voice should sound like, and they use punctuation and mark-up language to make the TTS voice achieve the intended meaning and clarity, as well as to enliven it.

In some ways this is not that new for people who have worked with voice talent before. If you’ve ever worked a recording session, you know that you sit there and listen to what’s being said and often correct the phrasing, pronunciation, pacing, and other aspects of how the voice talent is handling the script that you have written. What we are saying is that there’s an equivalent operation when dealing with TTS Voices. You need to listen to the results and make corrections. Of course, as we’ve pointed out in Using Text-to-Speech in an eLearning Course, the effort to make changes is likely substantially less.

The Basics

Let’s see an example of what we are talking about. Here is a clip of the TTS voice Heather reading Elizabeth Barrett Browning’s poem “How I love thee?” produced by Speech-Over Professional.

How I love thee?

How do I love thee? Let me count the ways.

I love thee to the depth and breadth and height

My soul can reach, when feeling out of sight

For the ends of Being and ideal Grace.

I love thee to the level of every day's

Most quiet need, by sun and candlelight.

I love thee freely, as men strive for Right;

I love thee purely, as they turn from Praise.

I love with a passion put to use

In my old griefs, and with my childhood's faith.

I love thee with a love I seemed to lose

With my lost saints, I love thee with the breath,

Smiles, tears, of all my life! and, if God choose,

I shall but love thee better after death.

As you listen, notice a few simple uses of punctuation and markup language, applied with Speech-Over Professional’s SAPI editor, that improve on how the voice would have read this by default.

The Speech-Over SAPI editor shown above lets authors apply markup language quickly and accurately with simple text symbols, which are as easy to use as ordinary punctuation. The symbols used in this example are the em-dash (—) which inserts a 0.5 sec silent delay and the right and left arrows (⊳,⊲) which decrease and increase the voice speed by one unit.
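To make the symbol idea concrete, here is a minimal Python sketch of how editor symbols like these might be expanded into SAPI speech tags. The symbols and their meanings come from the description above. The mapping table, the function name `symbols_to_sapi`, and the choice of empty relative Rate tags are assumptions for illustration only; this is not how Speech-Over Professional is actually implemented.

```python
from xml.sax.saxutils import escape

# Assumed mapping from editor symbols to SAPI XML tags. The symbol meanings
# come from the post; the exact tags the real editor emits are not documented
# here, so treat this table as a guess.
SYMBOL_TO_TAG = {
    "\u2014": '<silence msec="500"/>',  # em-dash: 0.5 second silent delay
    "\u22b3": '<rate speed="-1"/>',     # right arrow: one unit slower
    "\u22b2": '<rate speed="1"/>',      # left arrow: one unit faster
}


def symbols_to_sapi(text: str) -> str:
    """Replace editor symbols with tags and XML-escape everything else."""
    return "".join(SYMBOL_TO_TAG.get(ch, escape(ch)) for ch in text)


line = "How do I love thee? \u2014 Let me count the ways: \u2014 \u22b3I love thee\u22b2 to the depth"
print(symbols_to_sapi(line))
```

The sketch assumes that the relative speed adjustments accumulate, so a slow-down followed by a speed-up brings the voice back to its normal rate. Escaping the ordinary text keeps characters like & from breaking the XML.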

Listen to the effect of ordinary punctuation on the voice in the example:

The question mark is obvious - Heather expresses it very nicely.

The colon after "Let me count the ways:" gives a feeling of expectation for what’s to come. Putting a comma or period there would not give the same flow. Colons are generally used to introduce sequences to good effect.

Commas are used to give phrasing and resolve ambiguous sentences. They are a powerful tool and are used more often than proper punctuation would require.

Listen also to the effect of the markup language:

A delay (—) was placed between “How do I love thee” and “Let me count the ways” to express a slight hesitation for thought and then again after “Let me count the ways” to further hesitate for thought before stating the reasons.

Delays are also inserted throughout to introduce the hesitations that make the voice more realistic.

The decrease and increase in speed for groups of words give them a slight accent and emphasis. For example, the words “I love thee”, “most quiet need”, etc., have a speed decrease before them and a return to normal speed afterwards to give them a slight accent, depth, and emotional content. The amount of accent is controlled by the amount of speed reduction: two units (⊳⊳) or one (⊳). A similar effect can be achieved by the emphasis tag (!!).

Also, Heather’s slight Southern accent is natural: she is made from a real Southerner’s voice!

Now let’s see these concepts in more detail.

Using Punctuation

The judicious use of punctuation goes a long way towards making the voices more expressive and precise, especially the comma and the colon.

Let’s see how the prosody of the following sentence becomes clearer as we add punctuation:

A color is described in three ways by its name how pure it is and its value. (no punctuation) Paul

A color is described in three ways: by its name how pure it is and its value. (adding a colon for expectation) Paul

A color is described in three ways: by its name, how pure it is, and its value. (adding commas for phrasing) Paul

In our experience, the really good voices like Paul and Heather do quite well on their own most of the time with well-placed commas, colons, and silent delays only.

Mark-Up Language

As we mentioned in the first post, many “small” innovations are needed to make text to speech useful and practical. The most important of these is the programming standard Microsoft Speech Application Programming Interface (SAPI) for Windows. SAPI standardizes the way authors control TTS voices: starting and stopping the voice, controlling its speed, volume and pitch, and its flow with silent delays. Manufacturers of SAPI-standard voices implement the SAPI controls in the voice software and developers of speech applications program SAPI controls into their applications to let the user control any SAPI-standard voice.

To control the properties and flow of the voice, SAPI provides an XML markup language, also called speech tags, which is added to the input text to tell the voice processor what actions to take when converting the text to speech.

Some examples:

1. Volume - The Volume tag controls the volume of a voice on a scale of 0 to 100. The voice will change volume at the point it encounters the tag.

This text should be spoken at volume level 100.

<volume level="50">

This text should be spoken at volume level fifty.

</volume>

2. Rate - The Rate tag controls the rate (speed) of a voice on a scale of -10 to 10. The voice will change speed at the point it encounters the tag.

This text should be spoken at rate 0.

<rate absspeed="3"> This text should be spoken at rate 3.

<rate absspeed="-3"> This text should be spoken at rate -3.

</rate> </rate> Heather

The Pitch tag works the same as the Rate tag.

3. Emphasis - The Emph tag instructs the voice to emphasize a word or section of text.

<emph> boo </emph>!

Use the Emph tag to determine the prosody of an ambiguous sentence, for example the one referred to in the first post.

“He reads well” Paul

“He reads well” Paul

“He reads well” Paul

4. Silence - The Silence tag inserts a specified number of milliseconds of silence into the output audio stream.

Five hundred milliseconds of silence <silence msec="500"/> just occurred.

5. Pronunciation - The Pron tag lets you instruct the voice how to say highly technical words and company slogans. See the first post for an example.

6. The PartOfSp tag lets you resolve the part of speech of a word.
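If you do end up assembling these tags by hand or from a script, a few small helpers can keep them balanced. The following is a rough Python sketch, not part of SAPI or Speech-Over: the helper names and the `<speech>` wrapper element are hypothetical, and the wrapper is used only so a standard XML parser can check that the fragment is well-formed. The tag and attribute names are the ones shown in the examples above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def text(s: str) -> str:
    """Escape plain narration text so it is safe inside XML."""
    return escape(s)


def volume(level: int, inner: str) -> str:
    """Wrap marked-up text in a Volume tag (scale 0 to 100)."""
    if not 0 <= level <= 100:
        raise ValueError("volume level must be between 0 and 100")
    return f'<volume level="{level}">{inner}</volume>'


def rate(absspeed: int, inner: str) -> str:
    """Wrap marked-up text in a Rate tag (scale -10 to 10)."""
    if not -10 <= absspeed <= 10:
        raise ValueError("rate must be between -10 and 10")
    return f'<rate absspeed="{absspeed}">{inner}</rate>'


def emph(inner: str) -> str:
    """Wrap marked-up text in an Emph tag."""
    return f"<emph>{inner}</emph>"


def silence(msec: int) -> str:
    """Insert a silent delay of the given length in milliseconds."""
    if msec < 0:
        raise ValueError("silence length cannot be negative")
    return f'<silence msec="{msec}"/>'


def is_well_formed(fragment: str) -> bool:
    """Check tag balance by parsing inside a throwaway wrapper element."""
    try:
        ET.fromstring(f"<speech>{fragment}</speech>")
        return True
    except ET.ParseError:
        return False


script = (
    text("He ") + emph(text("reads")) + text(" well.")
    + silence(500)
    + rate(-3, text("This text should be spoken at rate -3."))
)
print(script)
print(is_well_formed(script))
```

A check like this only catches unbalanced or malformed tags. Whether a given voice actually honors a tag still depends on the voice, as the notes below point out.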

Notes:

· Not all voices have all the tags implemented; for example, Heather does not have an emph tag.

· The NeoSpeech voices in Captivate do not use the SAPI tags but rather a proprietary markup language, VTML. Speech-Over works with SAPI-standard voices only.

· For more info about SAPI and its markup language, download sapi.chm from here.

Automating the markup language – SAPI editor

Clearly, having to type in or even paste these XML tags into the input text is time-consuming and error-prone. This is another case where a small innovation is called for: as discussed above, Speech-Over Professional has a SAPI editor that represents XML tags with simple text symbols, which makes it much easier and less error-prone to insert and manipulate speech tags in the input text. Speech-Over Professional also automates the Pron tag with its Pronunciation lexicon, which you can use to add highly technical terms and company slogans.

Bottom Line

You may be thinking that some of the cost savings that you get from using TTS as compared to human voice talent is lost in this effort and that’s true. However, the rework aspect is still substantially less. Again, the best comparison is that of going through a recording session with a script. That process is very similar to what you end up with doing punctuation and markup with text to get the TTS voice to be much improved for eLearning.

For me personally, this is still not the same quality as a good voice talent, but it definitely costs less up front and costs MUCH less when the content changes. It’s a good balance in many situations.

Wednesday, October 13, 2010

I was just asked about trends in open source for eLearning and particularly open source eLearning tools. Probably one of the better sources on this is Jane Hart’s Instructional Tools Directory. You can find a long list of tools broken into authoring tools, games/simulations, quiz/test tools, social media, delivery platforms, tracking and whether they support mobile. In addition, she indicates if they are free or cost money – which is not quite the same thing as open source.

Monday, October 11, 2010

This month’s big question is Examples of Big Impact from Technology and I’ve taken it as an opportunity to go back and look at the elements of different projects that I’ve worked on over the years that have had a big impact. In this post, I’m going to focus on a common model that has been part of several of the highest impact projects.

At its core, the model is pretty simple:

Guide through setting meaningful personal goals

Teach how you can hold yourself accountable to those goals

Help the user set up social support

Teach the social supporters how they can help hold the person accountable

Send lots of reminders to the individual and the supporters

This approach has been used for loan officers, automotive sales, management development, retail store management, and in lots of other industries and jobs. In fact, we’ve also used it as the basis for some fairly generic goal setting processes.

As a side note, I believe that there’s a REALLY great business to be created around this.

Goal Setting and Making Plans

There’s a lot of content that already exists around the basics of goal setting, i.e., SMART goals:

Specific

Measurable

Attainable

Realistic

Timely

However, it’s far easier to teach someone about these than it is to help them create the goals themselves.

And if you are asked to create eLearning around goal setting, PLEASE DON’T GIVE THEM A BLANK TEXT ENTRY FIELD. I’ve seen that in courses and in design specs many times, and it’s a HORRIBLE IDEA. Yes, I’m yelling – it’s really that bad.

Remember what it’s like when you set your own SMART goals. It’s one thing when you provide a blank space to write your goals when you are in a classroom and there’s a teacher who can help you. It’s quite something else when you are on a computer and you are likely not very good at this. Actually, it’s rare to find people who really are good at setting SMART goals. Setting SMART goals sounds so easy but takes real work.

Give them criteria to evaluate the goal and plan – have them rework needed items

In my post Data Driven, we describe a use of this approach that helped retail store managers improve customer satisfaction. Of course, improving customer satisfaction is a goal, but the system would drill down to specific issues such as knowledge of store layout. We provide suggestions for particular interventions that have specific steps and particular associated goals. In this case, the plan was as important as the goal.

Accountability, Follow-up and Social Support

Of course, what’s often much more important than setting goals and plans is having a game plan around accountability and follow-up. Anyone who tries to lose weight can tell you that it’s SO MUCH easier to talk about your goals and come up with specific plans than it is to follow through on those plans.

There is a ton of material on how you can be better at holding yourself accountable to goals. I roughly boil this down to:

Establish importance

Take responsibility

Track progress

Overcome obstacles

Have reward / punishment system in place

Reminders

When you provide support for setting goals/plans, you need to be really careful to make sure that the person setting the goal actually believes in the importance of the goal and is taking responsibility for the goal/plan. They can easily copy, paste and edit a goal/plan from one of the examples and have no real intent on implementing. We always present why this is important and ask some questions around it. One trick that we’ve used is to ask users to evaluate how likely they are to implement the plan. If they don’t rate it really high, then go back to challenge the goal.

Tracking progress can be implemented in a very complex way or in a very simple way. I’ve worked on systems with both. In some cases, having specific days on a schedule, checking off completed items, providing ratings to evaluate progress, etc. all make sense. In other cases, having a very simple daily/weekly check-in with a standard question or two can be effective.

When you do a check-in, there’s a great opportunity to provide support for overcoming obstacles. I’ve seen courses on goal setting that have lots of up-front content on overcoming obstacles (they also have blank boxes for inputting your goals). The obvious place to put content around overcoming obstacles is when people run into obstacles. For example, one model is that at each check-in, the user rates how well they’ve done on completing each goal. If they rate themselves poorly, then the system can jump in and find out what the obstacles are that are preventing them from accomplishing the goal. It can provide them some strategies. It’s the learning opportunity that you look for. And again, doing it as performance support makes a lot of sense.
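Here is a rough Python sketch of that check-in model, just to show how little logic it takes. Everything specific in it is hypothetical: the 1 to 5 rating scale, the threshold, the obstacle categories, and the strategy text are made up for illustration and are not taken from any of the systems described in this post.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values: ratings run from 1 (poor) to 5 (great), and anything
# below the threshold triggers the obstacle conversation.
LOW_RATING_THRESHOLD = 3

# Hypothetical obstacle categories with one canned strategy each.
STRATEGIES = {
    "time": "Block 15 minutes on your calendar for this goal tomorrow.",
    "knowledge": "Review the suggested steps attached to your plan.",
    "motivation": "Re-read why you said this goal matters to you.",
}


@dataclass
class CheckIn:
    goal: str
    rating: int
    obstacle: Optional[str] = None  # filled in after the user is asked


def respond(check_in: CheckIn) -> str:
    """Pick the system's next message for a single goal check-in."""
    if check_in.rating >= LOW_RATING_THRESHOLD:
        return f"Nice progress on '{check_in.goal}'. Keep going!"
    if check_in.obstacle is None:
        options = ", ".join(sorted(STRATEGIES))
        return f"What is getting in the way of '{check_in.goal}'? Options: {options}"
    # Unknown obstacles fall back to the social supporter.
    return STRATEGIES.get(
        check_in.obstacle, "Talk this one over with your social supporter."
    )


print(respond(CheckIn("Learn the store layout", 2)))
print(respond(CheckIn("Learn the store layout", 2, "time")))
```

The point of the design is that obstacle content shows up only after a low rating, which is when the person actually needs it.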

I’ve not done this as much, but building in a way for people to set up rewards and punishments for accomplishing their plans and goals is a great idea. “I’ll treat myself to a massage if I do X” is a great thing to have as part of the system. Or an account that goes up and down. For most of the systems that I’ve worked on, the assumption is that both intrinsic and extrinsic rewards are tied to accomplishing the goals. For example, the games that teach associates about product location in stores are fun, it’s rewarding to see the employee growth, and certainly as it improves customer satisfaction, the store manager gets greater compensation and opportunities. For automotive sales associates, the rewards were prizes, trips, etc.

And last, but certainly not least, definitely keep in mind the necessity of having lots of reminders. Daily and weekly reminders are often really good and should have enough content to provide something of value. Otherwise, it quickly becomes ignored.

Actually, all of this can become ignored unless we step it up one more notch …

Social Support

I don’t know quite what to call it when you enlist other people to help hold someone accountable for goals and plans. In some contexts, you might call this a support network. I’ve seen accountability partners. I’m going to call it “Social Support” and the people doing it “Social Supporters.” But if you know what I should call this, please let me know.

The idea here is pretty simple and has been used in lots of tough behavior change situations: drugs, alcohol, weight loss, etc. Enlist other people to help hold yourself accountable.

In corporate situations, the social supporters can be peers, colleagues or even your boss. In the retail store manager example, district managers were a critical part of the system. The retail store manager’s plan would be reviewed and approved by the district manager. The district manager was responsible for checking in periodically and reviewing progress. In other situations people have enlisted friends, family, etc. Most often they have some mutual interest in the outcome and willingness to accept responsibility to provide support.

Of course, just like most people are not very good at setting SMART goals and coming up with associated plans, most people (including district managers) are not very good at helping to hold people accountable to their goals / plans.

There’s a bit of training required to cover things like roles, alignment, how things work. But the majority of the assistance for social supporters is best provided through performance support. Send them periodic reminders that include specific performance suggestions based on the particular situation. For example, “The person you are supporting just missed their check-in. This might be a good time to jump in and remind them about the importance they’ve attached to the goal. As an example – …” In other words, here’s a template for a conversation (or email).

Big Impact

Certainly, it’s way easier to build some online training around all of this than to build a performance support solution that helps people set goals and plans, and sets up personal and social accountability. So the question is whether it’s worth the effort.

Well, in looking at the situations where I’ve been personally involved in big impact, really moving the needle on factors like sales, customer satisfaction, loyalty – this kind of approach was commonly used. In several of these projects, we measured participants vs. non-participants and the impact was staggering. Of course, there are always lots of questions about the specifics – did non-participants care less? – but my strong belief, and I believe it’s backed up by my experience, is that this kind of approach has a BIG IMPACT.

About Me

Dr. Tony Karrer works as a part-time CTO for startups and midsize software companies - helping them get product out the door and turn around technology issues. He is considered one of the top technologists in eLearning and is known for working with numerous startups including being the original CTO for eHarmony for its first four years. Dr. Karrer taught Computer Science for eleven years. He has also worked on projects for many Fortune 500 companies including Credit Suisse, Royal Bank of Canada, Citibank, Lexus, Microsoft, Nissan, Universal, IBM, Hewlett-Packard, Sun Microsystems, Fidelity Investments, Symbol Technologies and SHL Systemhouse. Dr. Karrer was valedictorian at Loyola Marymount University, attended the University of Southern California as a Tau Beta Pi fellow, one of the top 30 engineers in the nation, and received an M.S. and Ph.D. in Computer Science. He is a frequent speaker at industry and academic events.