KnowBrainer Speech Recognition
Topic Title: accuracy dwindles
Topic Summary: again and again
Created On: 11/07/2009 07:41 PM
 11/07/2009 07:41 PM


oncdoc
Member

Posts: 52
Joined: 10/17/2008

I have noticed that creating a new user invariably improves accuracy for me. Of course, I lose the words that I added to the vocabulary (and that is a pain to retrain).

But why is this? So annoying.

 11/07/2009 11:29 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

This topic has been discussed many times, in many forms, on the KnowBrainer Forum. Accuracy degradation is something that many users experience, and the reason (or reasons) it happens is not entirely clear, although there are a number of interesting theories circulating!

In my opinion, the most likely cause is shutting down, putting the computer to sleep, or hibernating the computer while NaturallySpeaking is running, or before NaturallySpeaking has completely exited. When this happens, the user files may become corrupted.

This may be only part of the answer, and there may be other reasons: correcting misrecognized words and phrases improperly, running the language and acoustic models optimizer, and using certain NaturallySpeaking settings. I have met people who don't understand how to fix misrecognitions and end up with lousy accuracy. I run the optimizers, and have never had problems. Nor have I been able to discover any program settings that lead to accuracy degradation.

Whatever the reason for dwindling accuracy, creating a new user does not have to be an ordeal. I have started from scratch so often that it only takes me 5 to 10 minutes. I would rather spend time creating an accurate user than fighting to make an existing user work.

The trick to creating a user quickly is to have a backup of your unique words and phrases, and import the list into the new user. I also keep a 20 or 30 page writing sample, and feed it to NaturallySpeaking to analyze.

For me, initial accuracy with a new user is around 95%. I usually opt for the no training option, but more often than not, I end up doing a five-minute general training. I find that it makes a difference. At any rate, achieving 95% accuracy in under 10 minutes is a worthwhile time investment.

For the next several days, I try to correct words/phrases in context. Correcting in context is more work than correcting individual words, but I do it as much as possible, at least until NaturallySpeaking has collected enough data to run the acoustic and language model optimizers.

Soon after creating a new user, my accuracy levels off at 95% to 98%. Some people may get greater accuracy, but to my way of thinking, anything north of 95% is an achievement!

At that point, I become less assiduous about correcting in context and saving my user. I only save my user files after

1. I add important words to the vocabulary;

2. I delete problematic words from the vocabulary; and

3. I train troublesome words or commands.

The oldest user files that I currently have are over two months old, and are working fine. Before I adopted the above regime, I would notice accuracy degradation within a month.

 11/08/2009 12:41 AM
phils
Top-Tier Member

Posts: 2624
Joined: 10/02/2006

I consistently get about 97 to 98% accuracy and my current user profile is more than a year old. I dictate several hours per day, run the optimizer regularly, try to always correct in context, add and delete words regularly, save my user files daily and have a fully custom vocabulary of about 40K words for technical writing which does not use "add word automatically". I export my user every two weeks and use the latest export on four other machines at least every other week with similar accuracy.

I have NO idea why I don't have problems.

Phil Schaadt

 

 

 

 11/08/2009 01:59 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Before creating a new user, why not export your custom words and then import them into your new user?

I periodically export my custom words as a precaution against losing them and likewise with my custom commands.

I also create a copy of my user folder from time to time and that has saved me on a number of occasions.
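A copy of the user folder can also be scripted. What follows is only a minimal sketch in Python, not anything that ships with DNS: the function name `backup_user_folder` and both folder arguments are hypothetical, and it assumes the copy is made after NaturallySpeaking has fully exited so that no user files are open.

```python
import shutil
import time
from pathlib import Path


def backup_user_folder(user_dir, backup_root):
    """Copy a user profile folder into a new timestamped folder.

    user_dir and backup_root are placeholders supplied by the caller;
    nothing here knows where DNS actually stores its profiles.
    """
    user_dir = Path(user_dir)
    if not user_dir.is_dir():
        raise FileNotFoundError(f"No such profile folder: {user_dir}")
    # A timestamp in the name keeps earlier backups from being overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{user_dir.name}-{stamp}"
    # copytree creates the target (and any missing parents) and copies everything.
    shutil.copytree(user_dir, target)
    return target
```

Calling `backup_user_folder(r"<path to your user folder>", r"D:\DragonBackups")` would leave the original untouched and return the path of the new copy; both paths here are illustrative only.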

I also save my user every time that I close DNS down.

I had problems running the optimisers in the past, as they corrupted my user, but I ran them recently and so far that has not created a problem.

I suppose it is a matter of taking precautions, like regularly backing up your hard drive.

So often, people who should know better find out the hard way when their system crashes or suffers a lightning strike and they lose everything!

I use "correct that" or "Correct" using KB after dictating phrases to make corrections. Most of the time it picks up the correct phrase, and when it doesn't I select the phrase and dictate over it again.



-------------------------

Jomark


 


DPI 15.61, KB2017, SpeechStart+, MS Office 2019 Professional, Windows 10 Pro

 11/08/2009 08:11 AM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Quote:
Soon after creating a new user, my accuracy levels off at 95% to 98%. Some people may get greater accuracy, but to my way of thinking, anything north of 95% is an achievement!

At that point, I become less assiduous about correcting in context and saving my user. I only save my user files after

1. I add important words to the vocabulary;

2. I delete problematic words from the vocabulary; and

3. I train troublesome words or commands.

The oldest user files that I currently have are over two months old, and are working fine. Before I adopted the above regime, I would notice accuracy degradation within a month.

Alan,

First, given the current state-of-the-art in speech recognition 95% accuracy is horrible. 95% accuracy reduces your overall productivity by almost 25%. If I had to waste that much time over the course of dictating for eight hours, it would cost me about an hour and a half. There are times when my overall accuracy will drop to about 98%. However, for me this happens for two basic reasons:

1. I start getting sloppy. That is, the length of my utterances starts decreasing from 10 to 12 words down to short, choppy three-word phrases. This also happens because I tend to maintain my current dictation speed (between 140 and 160 words per minute), but I start mumbling, slurring words, or running words together. At that point I know that it is time to take a break.

2. In order to maintain a high level of accuracy it is necessary to rerun the Audio Setup Wizard and periodically close Dragon NaturallySpeaking and relaunch it. On occasion, I do notice that DNS can have a tendency to insert some bizarre substitutions for things that it normally gets absolutely correct. This is the first indicator to which every user should pay attention and perform the suggestions in #1 above. This will usually correct this problem. It is important to remember that, depending upon your hardware (CPU, memory, & microphone/soundcard), DNS will tend to overload the system resources when dictating over long periods of time without performing these basic reinitialization steps.

Second, I'm not surprised that you can't maintain high accuracy with your user profiles over time given the methods that you describe above. Everything that you do fails to take advantage of the advanced features of DNS. For example:

1. Failing to save your user profile on a regular basis results in your losing all of the modifications made by SilentAdapt over the course of any given dictation session. What you gain through SilentAdapt is lost permanently by not saving your user profile. Running the Acoustic and Language Model Optimizer will not recapture any of that. It's gone, permanently. So, what you gain by virtue of SilentAdapt, you lose by virtue of not saving your user profile. Basically, it simply means that you don't continuously adapt your Acoustic Model on-the-fly.

2. While you can recapture corrections that you make via the Acoustic and Language Model Optimizer, by becoming "...less assiduous about correcting in context..." you lose the advantage of another process that tends to maintain your accuracy and improve it over time.

3. One of the most significant advantages in improving accuracy is enabling the "Always preserve wave data" option in the DNS Options> Data tab> Advanced button. This feature stores all of your dictation in dra files, which are then used by the Acoustic and Language Model Optimizer in the same manner as documents are analyzed and adapted to your writing style when you run either the Voctool (recommended) or "Add words from your documents to the vocabulary" from the Accuracy Center. In addition, using this feature also has the advantage of adapting both your Acoustic Model and your Language Model when running the Acoustic and Language Model Optimizer. In short, this is an additional "SilentAdapt" feature.

Granted, some users can't take advantage of this because of limited RAM. Also, this tends to make user profiles very large. Therefore, users with limited hard drive space or running off USB thumb drives can experience certain negative aspects of using this feature. For example, when running off a USB thumb drive, using "Always preserve wave data" can tend to slightly reduce overall performance because this results in more frequent access to the storage media. The slower the read/write access to your storage media, the greater the performance hit on DNS. That said, this doesn't seem to overly impact latency even when it is noticeable.

Third, I tend to agree with Phil. My current user profile was created in August of last year. I set it up with the initial training set to "None" and have never trained it. That user profile is still in use, and accuracy overall is about 99%, with the accuracy being 99.9% much of the time. I have never suffered from profile corruption or accuracy degradation since its original creation. I also export that user profile on a regular basis just in case something happens. I've only had to replace my user profile a couple of times since DNS 6, and that was because of a complete system crash, which totally corrupted files in the user profile to the point where they were unreadable. However, my accuracy today is better than my accuracy ever was when I first created the user profile, and it has never degraded.

Lastly, there is a complex interaction between computer hardware, microphone/soundcard, and user dictation style. Any one of these or combination thereof can produce problems with accuracy. In addition, they can also produce problems with overall performance (speed/latency). I won't get into this in any detail at this point. Suffice it to say that less-than-optimal computer and microphone/soundcard configurations (and by that I don't mean that you have to have the latest technology) can reduce the overall performance and accuracy of DNS. For example, unless your hardware has changed, I notice that you're using a Pentium 4 with 4 GB of RAM under Windows Vista. Now it's not clear whether you're using Vista 32-bit or Vista 64-bit, but your hardware configuration, if it is still what you're using, doesn't take full advantage of the capabilities of DNS 10 because at best it would be a dual core Intel 950, which is limited to 2 MB of L2 cache and overall fairly slow, particularly with Windows Vista. Granted, 4 GB of RAM certainly helps improve performance on such a system running DNS 10, but it is significantly less than ideal.

On the other hand, is it necessary to take advantage of the new Core™ i7 (Nehalem) technology to get the best performance out of DNS 10? No. A good Core2™ Duo or Core2™ Quad with at least 4 MB of L2 cache (note the more L2 cache the better) is all that's necessary to take maximum advantage of the capabilities of DNS 10 in terms of overall performance, reduction in latency, and accuracy. However, the only reason that I generally recommend going to a Core™ i7 system is that they are not any more expensive, and in fact in many cases they're cheaper, than a higher end Core2™ Duo or Core2™ Quad system. Here there are two factors: (a) cost/benefit, and (b) newer technology that will perform significantly better when software developers begin to incorporate multithreading, which the Nehalem chip is designed specifically to take advantage of. In short, you can get a Core™ i7, or Core™ i5 for that matter, system for the same or less than some current Core2™ Duo and Core2™ Quad systems. In addition, the Core™ i7 920 will outperform the Core2™ Extreme Quad Core QX9650, which is many hundreds of dollars more expensive. My systems vary from a basic Core2™ Duo 2 GHz with 4 MB of L2 cache and 2 GB of RAM (laptop) to a Core2™ Quad Q6600, Core2™ Extreme Quad Core QX9650, Core2™ Duo E8400, and Core™ i7's (a Core™ i7 Extreme 975 and a Core™ i7 920). Of all of these, I find no advantage to the Core™ i7 Extreme 975 over the Core™ i7 920, and both of these exhibit better overall performance than any of the Core2™ Duo or Core2™ Quad systems that I have. Nevertheless, the better the computer hardware the better DNS 10 will perform, and the better that DNS 10 performs, the more accurate it will be.

Further, using computer hardware that is capable of taking advantage of DNS 10 by setting the Speed vs. Accuracy slider all the way to 100% without any increase in latency will gain you better accuracy overall. The reason for this is that the default setting at 50% for the Speed vs. Accuracy slider eliminates the use of the quadgram model. Only if the Speed vs. Accuracy slider is set to 75% or better does DNS engage the quadgram model.

Also, the less optimal your microphone/soundcard is, the more difficult it is for DNS to transcribe your dictation. Less than effective noise canceling results in background noise making it more difficult for DNS to understand what you say clearly because the volume and type of background noise interference makes it difficult for DNS to get a clear voice pattern that it can easily interpret. This causes delay (latency) issues and makes it more difficult for DNS to distinguish between your speech and background noise interference. Under these conditions, the higher the volume of background noise, the more difficult it becomes for DNS to separate out speech from background and the less clear your dictation. High-volume and low-volume noise can both interfere with clarity and/or mask out your speech. The end result is that it takes longer for DNS to correctly and clearly interpret what you say.

One final point. Even the most optimal microphone/soundcard and/or computer hardware are of little value in producing accurate transcription if your dictation style is poor. That is, if you mumble, slur your words, run your words together, dictate faster than you can clearly enunciate and separate words, or generally dictate in short, choppy phrases, you will defeat even the best hardware configuration.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

If there's more than one way to do a job, and one of those ways will result in disaster, then somebody will do it that way.  (variant of Murphy's law - Edward A. Murphy, Jr.)



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/08/2009 08:25 AM
Dana
Top-Tier Member

Posts: 1411
Joined: 10/01/2006

Quote:
I consistently get about 97 to 98% accuracy and my current user profile is more than a year old. I dictate several hours per day, run the optimizer regularly, try to always correct in context, add and delete words regularly, save my user files daily and have a fully custom vocabulary of about 40K words for technical writing which does not use "add word automatically". I export my user every two weeks and use the latest export on four other machines at least every other week with similar accuracy. I have NO idea why I don't have problems.

I ditto Phil exactly: I also routinely get 98% - 99% accuracy, and my current profile is from 10.0 (1-2 years old?)  I also save my files daily and run the Optimizer about every 2 weeks.  I regularly back-up my profile and store it on another hard drive.  I have had to restore my User file a couple of times - but never have any problems with accuracy after going to a restored User.

I'm not sure what causes accuracy degradation for some users.  I do not get accuracy degradation - and other than the above, I don't use any special techniques to keep my User in "good health!"

                 Dana



-------------------------

Dana Joan - Vero Beach, FL  -  DMPE, Version 2.2; Oncology Large Vocabulary; Windows 7.1 (on local computer); Sennheiser MD 431 II mic with the Andrea USB pod; work on a Remote Desktop.

 11/08/2009 09:41 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Quote:
"Always preserve wave data" option

Chucker

I seem to recall from previous discussions on this forum that, after running the acoustic optimiser, it would be advantageous to delete the dra files before dictating again. Or is my memory failing?




 11/08/2009 10:42 AM
MDH
Top-Tier Member

Posts: 2257
Joined: 04/02/2008

Jomark,

Yes, that is correct. After running the Acoustic and Language Model Optimizer, there is no gain in keeping the dra files that were used to do this optimization. They have already served their purpose. Additionally, the dra files can accumulate rather quickly, hogging a lot of memory. So it is best to delete the dra files after running the ACO.

MDH



-------------------------
 11/08/2009 11:18 AM
phils
Top-Tier Member

Posts: 2624
Joined: 10/02/2006

Quote:
That is, if you mumble, slur your words...

I can demo DNS to my colleagues at 99%+ writing highly technical SOA integration design text full of buzzwords, jargon, abbreviations and strange product names. If I didn't regularly slip into my bad habits and mumble my small words and endings, especially in front of new custom words, I would consistently have 99%+ recognition, but I don't because I get sloppy.

Phil Schaadt

 11/08/2009 11:36 AM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Quote:
I can demo DNS to my colleagues at 99%+ writing highly technical SOA integration design text full of buzzwords, jargon, abbreviations and strange product names. If I didn't regularly slip into my bad habits and mumble my small words and endings, especially in front of new custom words, I would consistently have 99%+ recognition, but I don't because I get sloppy.

It's all in the way you dictate! (1st Commandment)

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/08/2009 01:47 PM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Quote:
It's all in the way you dictate

Rüdiger

I totally agree.

It is a matter of mastering the concept of thinking to speech as opposed to thinking to writing. Good diction is also required.

As someone used to writing for most of my professional life, dictation has been something quite hard to master and become proficient at.




 11/08/2009 01:59 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Jomark,

You're leaving nothing to add. Learning to dictate using speech recognition means adopting a completely different cultural technique. It's equivalent to learning a different way to move.

Rüdiger

 




 11/08/2009 10:53 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

Quote:
That user profile is still in use, and accuracy overall is about 99%, with the accuracy being 99.9% much of the time.

Hi Chucker,

Interesting post. I sometimes hit 99% or better accuracy, but it never lasts. Misrecognitions begin to happen, even when I am speaking clearly. I work in a quiet environment. My CPU is actually a Core 2 Duo 1.5 GHz, not a Pentium.

I achieve the best accuracy when I am testing the system. But when I am writing and revising real texts, accuracy is less. I don't see how it could be otherwise. The errors are (usually) not far fetched: I say "wouldn't talk" and DNS outputs "would not talk" or "wooden talk." DNS mixes up phrases like "printed out" and "print it out," despite there being more than sufficient context, e.g., "I printed out the document."

Perhaps I use more homophonic (or near homophonic) words and phrases than most people. So to me, 98% seems reasonable for free-form dictating and editing. It does not bother me when accuracy falls slightly from its highest levels. It's still easier, faster, and more fun than typing. When the system starts to act bizarre, my protocol is similar to yours: I run the Audio Setup Wizard; exit as many applications as practical; close and relaunch DNS; and if the problem persists, reboot my PC... or take a break.

When you say you get 99.9% much of the time, are you estimating, or have you measured this? It seems remarkable that anybody could dictate 1000 words and commands with only a single misrecognition.
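For anyone who wants to measure rather than estimate, here is a rough sketch in Python. It is not a feature of DNS and not a formal word error rate: the functions `word_accuracy` and `expected_errors` are hypothetical names, and the approach assumes you have the text you intended to dictate and the text DNS actually produced, both as plain strings. It counts how many of the intended words survive, in order, in the transcript, so words inserted by the recognizer are not penalized.

```python
import difflib


def word_accuracy(reference, transcript):
    """Fraction of the intended words that appear, in order, in the output."""
    ref_words = reference.lower().split()
    out_words = transcript.lower().split()
    if not ref_words:
        raise ValueError("reference text is empty")
    # autojunk=False stops difflib from ignoring very frequent words.
    matcher = difflib.SequenceMatcher(None, ref_words, out_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


def expected_errors(word_count, accuracy):
    """Number of misrecognized words implied by an accuracy figure."""
    return word_count * (1.0 - accuracy)
```

With these definitions, `expected_errors(1000, 0.999)` comes to about one word and `expected_errors(1000, 0.95)` to about fifty, which is the gap being debated in this thread. `word_accuracy("I printed out the document", "I print it out the document")` scores the "printed out" example above as four of five intended words recognized.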

 11/08/2009 11:35 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

Quote:
3. One of the most significant advantages in improving accuracy is enabling the "Always preserve wave data" option in the DNS Options> Data tab> Advanced button. This feature stores all of your dictation in dra files, which are then used by the Acoustic and Language Model Optimizer in the same manner as documents are analyzed and adapted to your writing style when you run either the Voctool (recommended) or "Add words from your documents to the vocabulary" from the Accuracy Center. In addition, using this feature also has the advantage of adapting both your Acoustic Model and your Language Model when running the Acoustic and Language Model Optimizer. In short, this is an additional "SilentAdapt" feature.

 

Chucker, 

When "Always preserve wave data" is unchecked, what does running the "Acoustic and Language Model Optimizer" do? What data does the optimizer draw upon to tweak the acoustic and language models?

 11/09/2009 01:21 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

MDH

Not all files have a .dra extension. Is it safe to delete all files in the Dra folder?




 11/09/2009 03:37 AM


Ron Len
Member

Posts: 116
Joined: 10/02/2006

Chuck,

Just out of curiosity, I am intrigued with your statement that "one of the most significant advantages to improving accuracy is the use of (enabling) Always Preserve Wave Data...."

With that being said, what is the reasoning for Nuance essentially burying such a "significant" feature deep within the settings of the program?

I am not being argumentative, but I am more surprised that this has not been brought up in the past as one of the main ways to increase accuracy. One would think that this would be of primary importance as basic information that should be imparted to end-users.

If it has been discussed in depth on this forum, I missed it. So thanks for pointing out the necessity of checking the box, which I have now done, and I am looking forward to seeing the net effect.

Thanks -- Len

 11/09/2009 08:28 AM
MDH
Top-Tier Member

Posts: 2257
Joined: 04/02/2008

Jomark,

The dra files that are safe to delete, because they are no longer necessary, are located at:

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking(version #)\Users\(your user-name)\current\voice_container\drafiles

You can use the following command to do this:

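' Open the drafiles folder in a Windows Explorer window
AppBringUp "C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking(version # here)\Users\(your user-name here)\current\voice_container\drafiles"
Wait 1
' Select every file in the folder
SendDragonKeys "{Ctrl+a}"
Wait .3
' Delete permanently, bypassing the Recycle Bin
SendDragonKeys "{Shift+Del}"
Wait 1
' Answer "Yes" to the confirmation prompt
SendDragonKeys "{Alt+y}"
Wait 1
' Close the Explorer window
SendSystemKeys "{Alt+F4}"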
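The same cleanup can be done without driving Explorer through keystrokes. The following is only a sketch in Python under a few assumptions: the function name `clear_dra_folder` is hypothetical, the caller supplies the full path to their own drafiles folder (the version and user-name parts of the path above are deliberately not filled in here), and DNS is closed while it runs. It defaults to a dry run so you can see what would be removed first.

```python
from pathlib import Path


def clear_dra_folder(dra_dir, dry_run=True):
    """Delete the files directly inside dra_dir and return their names.

    With dry_run=True (the default) nothing is deleted; the function only
    reports which files it would remove. Subfolders are left alone.
    """
    dra_dir = Path(dra_dir)
    if not dra_dir.is_dir():
        raise NotADirectoryError(f"Not a folder: {dra_dir}")
    removed = []
    for entry in sorted(dra_dir.iterdir()):
        if entry.is_file():
            if not dry_run:
                entry.unlink()  # permanent delete, like Shift+Del
            removed.append(entry.name)
    return removed
```

Run it once with the default `dry_run=True` to check the list, then again with `dry_run=False` to actually delete. The folder itself is kept.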

 

MDH



-------------------------
 11/09/2009 02:56 PM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

MDH

Thanks, I have deleted all files in the Dra folder.

I presume the file extension changes from .dra after the optimiser has run.




 11/09/2009 04:23 PM


photoman
Top-Tier Member

Posts: 311
Joined: 07/08/2009

I feel like a dwarf entering the land of the Giants, but I will proceed anyway. I was following this post and tried a few of the suggestions. Regarding the Drafiles, I found a number of files that began with "DRA" which had a file extension of "dft." What is the significance of these files, and can we also delete them?

-------------------------

Photoman
Digital Cameras Don't Take Good Pictures. People Do!

DNS Preferred V11, Quad 4 Intel i7-920; 2.66GHz; 8GB RAM; Windows 7 - 64 bit

 11/09/2009 07:38 PM
Lunis Orcutt
Top-Tier Member

Posts: 39769
Joined: 10/01/2006

You can safely delete the entire dra folder if you choose. NaturallySpeaking will simply re-create it when you re-launch DNS.

-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1
