I have been revisiting the Apple "Siri" dictation thing... it saves a little time. More importantly, it dramatically cuts down on the pain of typing.

It is not automated, but it does cut down on the time.

Without Siri: 5 min of TSP = 1 hr of transcribing

Using Siri: 5 min of TSP = 40 min (10 min of Siri, 30 min of corrections)
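Spelled out as a quick sanity check (assuming the 5-minute sample scales linearly):

```python
# Time cost of transcribing a 5-minute TSP sample, from the numbers above.
audio_min = 5
manual_min = 60          # 1 hr of typing by hand
siri_min = 10 + 30       # 10 min of dictation + 30 min of corrections

print(manual_min / audio_min)   # 12x real time without Siri
print(siri_min / audio_min)     # 8x real time with Siri
```

So Siri knocks a 12x job down to an 8x job, a third of the time saved.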

Other than Siri, I am just using the Chrome web browser and the Transcribe Chrome app. Doing this by hand for the TSP group would be crazy, and Apple makes it hard to automate the dictation feature (most likely so people like me don't abuse it). I have tried Apple's simple "Automator" tool, but can't find a way to invoke dictation...

I am going to try using Selenium (free), but that is going to take time while I refresh my Java skills. This is not for Java beginners, so it might take me six months (given this is a spare-time project).

If you've got Java skills or want to help out, let me know. If I could get someone to figure out the Java code that needs to be written, that would help fast-track this. (I would be more than willing to run it on my computer to benefit the group.)

I honestly have never used Java. I'm proficient in C++, Python, and C#. I use C# at my day job. Supposedly it's very similar to Java.

What exactly are you looking for? What needs to happen in order to automate this?

I've talked with a co-worker of mine a million times about GUI automation, and while I haven't done it myself, I get the gist of it. Basically you need to profile the application, or do some sort of coordinate matching and whatnot to get the button clicks right.
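To illustrate that coordinate-matching idea, here is a toy sketch (my own made-up example, not from any real automation library): treat a screenshot as a 2D grid of pixels and scan it for a small template image; the match position is where you would send the click.

```python
def find_template(screen, template):
    """Return (row, col) of the first place `template` matches
    inside `screen`, or None if it never matches."""
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            if all(screen[r + i][c + j] == template[i][j]
                   for i in range(th) for j in range(tw)):
                return (r, c)
    return None

# Tiny fake "screenshot": 0 = background, 1 = button pixels.
screen = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0]]
button = [[1, 1],
          [1, 1]]
print(find_template(screen, button))  # -> (1, 1)
```

Real tools (PyAutoGUI's locateOnScreen, for instance) do the same thing with fuzzy image matching instead of exact pixel equality.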

Regardless I'm willing to help you out. I'm actually sitting down today (right now) to work on getting those transcript pages to you.

I'm looking at it right now, and it has both a Python and a C# version of the application. What are you trying to do with actual "web automation"? (As I ask you what are probably self-explanatory questions, I'm researching both this app and the software.)

Edit: So at first I'm looking at this application wondering why anyone would use it. Then I gave it a shot and, wow, not too shabby. Slow the playback down a notch or two, then hit F3 a few times as you go along (rewind it a moment), and it's not too bad.

I can see somebody spending quite a bit of time doing this if they are not capable of typing pretty fast. Sitting at around 90 wpm on a low average, up to 110 wpm, I'm not toooooo bad at it.

So, Hooty, is the plan the same as what we were talking about before? Running 20 seconds of the podcast through Siri at a time?

So here is the flow, using my Mac and Chrome (with the Transcribe plugin):
1: press Esc to start the MP3
2: press Fn twice (this invokes dictation)
3: let it run for 20 sec (watching the MP3 time in Chrome's Transcribe)
4: after 20 sec, press Fn once to stop dictation
5: wait till dictation is done processing
6: at this point, validate that dictation actually entered new text; sometimes it gives up and doesn't do anything
7: if no new text, use F3 to rewind and try again (step 2)
8: if new text, insert a time stamp for the end of the 20-sec transcription
9: (remember the MP3 has been playing since step 4) use F3 to rewind to the new time stamp
10: repeat from step 2
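The flow above can be sketched as a control loop. The key presses and checks are stubbed out as injected functions here, since actually sending the Fn key on a Mac is exactly the part Apple makes hard; this is just the logic, not working automation:

```python
def transcribe_loop(press, wait, text_grew, insert_timestamp, rewind_to,
                    chunk_sec=20, total_sec=300):
    """Drive the Esc/Fn/F3 dictation flow described above.

    press(key)          -- send a key press ("esc", "fn", "f3")
    wait(sec)           -- sleep while audio plays / dictation processes
    text_grew()         -- True if dictation actually entered new text
    insert_timestamp(t) -- mark the end of the chunk just transcribed
    rewind_to(t)        -- use F3 to get the MP3 back to time t
    """
    press("esc")                  # step 1: start the MP3
    t = 0
    while t < total_sec:
        press("fn"); press("fn")  # step 2: Fn twice starts dictation
        wait(chunk_sec)           # step 3: let it run for 20 sec
        press("fn")               # step 4: Fn once stops dictation
        wait(2)                   # step 5: wait for processing to finish
        if not text_grew():       # steps 6-7: no new text, rewind, retry
            rewind_to(t)
            continue
        t += chunk_sec            # step 8: time stamp the chunk's end
        insert_timestamp(t)
        rewind_to(t)              # step 9: audio ran ahead, rewind to stamp
```

On the Mac the stubs would have to be backed by something like AppleScript System Events keystrokes, which is where the "can't invoke dictation" problem comes back in.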

Gotcha. Yep, the step-by-step flow did it.

OK, I can see how/why you're using Transcribe. So you're using Siri to do the actual dictation, and it then writes the text into the transcription box in Chrome Transcribe?

I'm wondering if it might be faster just to create a stupid-simple Qt GUI with an MP3 player in it and all of that logic built in. I'm not sure what would take longer: profiling the browser/application and writing something that knows how to read the times and whatnot, or just doing it through Qt. I already have some experience playing podcasts programmatically, so I wouldn't have to reinvent the wheel.

Siri is on your phone, right?
Edit: I guess it can't possibly be your phone...

If you want, I can whip together an extremely rough GUI to do this kind of stuff in Qt. I'll need you to compile it on Mac, though (I'm using Linux). All you would need to do is get the Qt libraries/SDK at http://qt-project.org/downloads. I'd give you my project file, you'd hit "build", and you'd have your application.

I've been meaning to get a Mac just for testing/deploying purposes....

After doing a bunch of research, I am going to try the Windows Speech API and see if I get results. I tried the out-of-the-box speech recognizer yesterday, but it was god-awful. It was the worst-looking Mad Libs I've ever seen, and it was worth a good laugh.

However, it would appear that with this Windows API I might be able to send MP3s directly into the engine for processing, so I'll try that out and see if I get better results. I started the GUI, but I began asking around about how I could interact with the global hotkeys (the Fn key for Siri), and I'm not very confident about it... so we'll see. It is possible that I might be able to interact with the global hotkeys in Linux, but I'm not sure whether that solution would port to the Mac. No idea at this point, so hopefully the Windows API works.....
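Whatever engine ends up doing the recognition, the per-chunk bookkeeping stays the same. Here's a sketch with the engine call stubbed out as a recognize(start, end) callback; the real callback would wrap whatever the Windows Speech API (or a Python wrapper like the SpeechRecognition package) actually exposes:

```python
def transcribe_chunks(duration_sec, recognize, chunk_sec=20):
    """Split a recording into chunks, run `recognize(start, end)` on each,
    and return [(mm:ss stamp at chunk end, recognized text), ...]."""
    results = []
    for start in range(0, duration_sec, chunk_sec):
        end = min(start + chunk_sec, duration_sec)
        text = recognize(start, end)  # stub for the real engine call
        stamp = f"{end // 60:02d}:{end % 60:02d}"
        results.append((stamp, text))
    return results

# Demo with a fake engine that just echoes the chunk boundaries:
print(transcribe_chunks(50, lambda s, e: f"<text {s}-{e}>"))
# -> [('00:20', '<text 0-20>'), ('00:40', '<text 20-40>'), ('00:50', '<text 40-50>')]
```

The nice part is that this version has no 20-second wall-clock waits or rewinds at all, which is the whole appeal of feeding the file straight into an engine.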

Right, it isn't a phone: the MP3 plays from my computer speakers (Chrome Transcribe) into the mic on the same computer for Siri dictation.

Technically, when you are not using a phone it is called "Dictation"; it's only called Siri when you're using an iPhone. But let's keep calling it Siri, because everyone knows what Siri is... easier to explain to new people.

If you've got code you want me to try out, let me know. Work is kind of busy, so no time currently to learn a new syntax...

I hear you on that. I'm balancing about a million projects. Essentially, my #1 priority is getting my Arduino-based automated garden monitoring system working, and then I'll be back on the podcast stuff. Yeah, I spent pretty much the entire day (see my last post) trying to get something going with Dictation, but it was worse than awful.

What you reaaaaaaally need is to hook the audio output up to the input (not physically, but through a computer interface) rather than actually sending sound waves out through the air. I'm sure you'd get a big accuracy boost out of it. On top of the million things I do, I'm also a big music guy, so something like that I could hook up in a jiffy. If you don't know how to do that, no worries; it sounds like the system is working out at least half decently for you now!

Going to need some time to unwind and give my fingers a break. But then I am going to start "Steven Harris on Alternative Energy Technologies" Parts 1, 2, and 3 (Ep 840/873/897)... yep, that should only be a short 4 hr 5 min 56 sec to transcribe.....

It seems like this board hasn't been active in a while. However, I think I may be interested in giving this a try. I have a few episodes in mind to start with, it looks like none of them have been transcribed yet.

I downloaded the transcription template doc and am looking through some previous ones to get an idea of how to do this, but I may need some help along the way just due to lack of experience. Is anyone still working on anything? It doesn't look like it from the "tell us what you're working on" thread, but I just wanted to make sure, since that one's also been inactive for a while.

Feel free to transcribe any ep. This effort fizzled out a year ago, but any new transcription really helps with searching for data in the episodes.

If you run into any issues, let me know; chances are we have solved them before.