I am a sucker for the idea that one day I'll be able to pull my Pocket PC type device out of my pocket, hit a button, rattle off my task list, dictate an email, list out my ideas, and have it automatically transcribe my words into text. I've tried a lot of the desktop solutions for this, including Dragon Dictate and IBM's ViaVoice. The results of my testing were pretty disappointing. Transcription of a simple sentence such as "Hello, my name is Brad" resulted in "Haloo I same is ??ad". Now, I know my first name is pretty tricky, coming as it does from the ancient Egyptian word Ôµ‘ˆ, but what about the words "Hello" and "my"?
Well, there is hope. Engadget has a blurb about the devices coming from Samsung that have VoiceSignal speech recognition software installed on them. "The speaker independent voice recognition MobileBurn describes sounds identical to the incredible software on my Samsung i700 Pocket PC Phone (pictured at right) called VoiceSignal. However, VoiceSignal doesn’t list LG in its list of manufacturers using the software. VoiceSignal works so well it's creepy. It has never needed a stitch of training and responds just as well to names like “Zolnowski” as it does “Smith.” What I would like to know is this: how the heck are they able to achieve this on a handset while ViaVoice and the others have been so bad for so many years, and on desktop processors, no less?"
Most speech recognition engines on the desktop rely heavily on how words are used in context to accurately interpret them. Ponder this sentence:
The TWO of us will go TO the airport to pick up the dogs and the cats, TOO.
Context is the only way to tell which member of the "two", "to", and "too" triplet is meant at each point in that sentence. And don't even think about sound-alikes like "through" and "threw".
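To make the context idea concrete, here is a minimal sketch of picking a homophone by looking at its neighbouring words. It is only an illustration, not how ViaVoice or Dragon actually work: the `<tu>` placeholder, the `BIGRAMS` table, and every count in it are invented for this example.

```python
# Toy illustration: choose among "two", "to", "too" using neighbouring words.
# All counts below are made up for the example; a real engine would learn
# such statistics from a large body of text.
CANDIDATES = ["to", "too", "two"]

BIGRAMS = {
    ("the", "two"): 5, ("two", "of"): 6,
    ("go", "to"): 9, ("to", "the"): 9,
    ("airport", "too"): 2, ("too", "</s>"): 7,
    ("to", "</s>"): 1, ("two", "</s>"): 1,
}


def pick(prev_word, next_word):
    """Return the candidate that fits best between its two neighbours."""
    def score(cand):
        return (BIGRAMS.get((prev_word, cand), 0)
                + BIGRAMS.get((cand, next_word), 0))
    return max(CANDIDATES, key=score)


def resolve(tokens):
    """Replace every "<tu>" placeholder (the ambiguous sound) with a word."""
    out = []
    for i, tok in enumerate(tokens):
        if tok == "<tu>":
            prev_word = out[-1] if out else "<s>"
            next_word = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
            tok = pick(prev_word, next_word)
        out.append(tok)
    return out


heard = "the <tu> of us will go <tu> the airport <tu>".split()
print(" ".join(resolve(heard)))
```

A phone-style command recognizer with a small fixed vocabulary never has to make this kind of choice, which is part of why it can get away with working on sound alone.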
Speech recognition engines that only work in a limited subset of everyday use, like recognition on a phone, can be very accurate as they can work entirely phonetically. This is an order of magnitude easier than the context example above.
MS Voice Command (which I reviewed for BostonPocket) is extremely accurate for that very reason. No training is required, although a little bit of training helps the engine get used to the speaker's accent. The Toshiba Voice Command distributed with the e800 is just as accurate and allows control of the PPC via voice.
Desktop recognition programs can interpret with 97%+ accuracy, which is pretty phenomenal given the complexity of the problem.
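The thread doesn't say how these accuracy figures are measured. One common convention, assumed here, is word accuracy: one minus the number of word substitutions, insertions, and deletions divided by the number of words actually spoken. The function names below are just for this sketch.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance: substitutions + insertions + deletions."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # word dropped
                           cur[j - 1] + 1,       # extra word inserted
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Fraction of the spoken words that came through correctly."""
    spoken = len(reference.split())
    return 1 - word_errors(reference, hypothesis) / spoken


# The example from the first post: only "is" survives out of five words.
print(word_accuracy("hello my name is brad", "haloo i same is ??ad"))
```

By this measure the "Hello, my name is Brad" attempt scores about 20%, while 97% means roughly three wrong words in every hundred dictated.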
__________________
James Kendrick
Microsoft MVP - Tablet PC
www.jkontherun.com
Lockergnome contributor - Mobile Lifestyle
...using mobile devices since they weighed 30 lbs.
Right, it's like when I call Sears to inquire about a lawn mower repair or something. Their voice-activated system is so good it's alarming. However, I imagine it's not good for transcribing. :P
So do you have a desktop solution that gives you 97%? Even though I've trained some in the past, I've never gotten above 50% reliability.
For the desktop I've used both IBM's ViaVoice and Dragon NaturallySpeaking. Both have about the same accuracy for dictation, with different approaches to the command and control stuff.
Dragon has an interesting capability of letting you dictate into your PPC and then transcribing it on the desktop later. Either way, the essential part of obtaining good accuracy is a high quality noise cancelling headset. Anything else just won't give you good results. It's the key to obtaining 97% accuracy.
You also have to remember that if you're dictating documents that deal with technical areas, like PPC related stuff, then the special terms and acronyms will eventually defeat your accuracy. These systems are really for everyday speech type of stuff. That's why these two companies both offer industry specific versions of their programs, primarily medical and legal.
Forgive me for being long winded but this is a subject that has interested me for over a decade. I'm actually an IBM-certified speech recognition specialist (whatever that's worth) as I took several technical courses they offered covering the ViaVoice software. I find this topic so interesting because when the accuracy of recognition can get just a little bit better with mobile platforms it will be the next big paradigm shift in computing. It will alter EVERYTHING we do with computers today.
Hah, finally got the word "paradigm" into a post.
Well, I do appreciate you taking the time to respond. Voice to text has interested me for many years too. Remember the old Mac "Power Secretary" days? You are ViaVoice certified? I remember seeing a demo by one of you guys at Disney World once. It was amazing to see him go. He had it rolling so that it was darn near 100% accurate.
When I tried it though, I didn't get very far. Perhaps it was the lack of a noise cancelling headset? I never tried it with one.
I never did do the "full training" method where you have to read the unabridged A Tale of Two Cities or something to get through it...whew..
So, do you ever fire it up to do forum posts or anything these days?
Oh, BTW, I read an article about 5 years ago where a local company was using Dragon Dictate and getting the 97% results. They replaced 5-6 secretaries with software. That was motivating for me to really give it a go.
I use ViaVoice for serious writing like reports and longer correspondence. Real email (work stuff) is good for dictation too. I don't do much forum posting with it, as that involves too many acronyms, smilies, and stuff like that.
With a good headset and full training (after all you only have to do it once) I can easily get 98%+ on normal dictation.
Am I right in thinking that voice to text software can have issues with small variations in the user's voice, e.g. if you've got a slight cold or a hangover?
Can anyone offer a words per minute (WPM) count for the various voice to text software?
__________________ Afterism (n) - A concise, clever statement you don't think of until too late.
To err is human, but to really foul things up you need a computer.
An ailment that grossly changes your voice can certainly impact the accuracy of the engines, but in my experience it has to be pretty bad to be a major influence.
The average speaking rate is 150+ words per minute and both major recognition packages can easily keep up with this.
So the message I'm getting from this thread is that if you want workable voice to text production, you better be prepared to bust out your copy (reader format of course ) of 'A Tale of Two Cities' and get a decent microphone/headset.
It's a shame that a headset thingiemajig is required, but I guess the technology is always advancing.
The full training for ViaVoice is 15 minutes. That's not too bad to get you good accuracy out of the box. Both programs' engines learn as you use them, so they will eventually get to 97%+ accuracy without the full training, but I say why not do it and get that accuracy right out of the chute.
As far as the headset goes, you don't have to use a good noise cancelling one, but it adds at least 3-4% of accuracy. Background noise wreaks havoc on these engines, which are trying to discern syllables.
So James,
I think you are talking me into it... I'm thinking I should give it another go
Do you recommend one product over another? The Dragon Dictate Pocket PC component is intriguing.
Are there decent demos of these products? I think I might even have a NC mic around here somewhere.
What I've found is that both products are comparable, with Dragon probably being a little easier to set up and go. I prefer ViaVoice, which is probably because I've used it for 10 years.
What's interesting is that both products are now sold and supported by ScanSoft, and instead of merging them into one product line they see merit in keeping both separate.
One thing that bothers me about Dragon: if you look at their product comparison matrix, it shows that the Professional version is the only one that supports dictating directly into Outlook. Their web site doesn't even offer the Professional version for sale. If you web search for it, the price difference between their Preferred version and the Professional version is $199 vs. $569. Ouch.
The ViaVoice USB Pro version is the top of the line and retails for $189. It comes with a USB microphone by Plantronics.
So is there any justification for having a side by side comparison organised by someone who's impartial, yet knowledgeable about both products?