KnowBrainer Speech Recognition
Topic Title: accuracy dwindles
Topic Summary: again and again
Created On: 11/07/2009 07:41 PM
 11/07/2009 07:41 PM


oncdoc
Member

Posts: 52
Joined: 10/17/2008

I have noticed that creating a new user invariably improves accuracy for me. Of course, I lose the words that I added to the vocabulary (and retraining them is a pain).

But why is this? So annoying.

 11/07/2009 11:29 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

This topic has been discussed many times, in many forms, on the KnowBrainer Forum. Accuracy degradation is something that many users experience, and the reason (or reasons) it happens is not entirely clear, although there are a number of interesting theories circulating!

In my opinion, the most likely cause is shutting down, putting the computer to sleep, or hibernating the computer while NaturallySpeaking is running, or before NaturallySpeaking has completely exited. When this happens, the user files may become corrupted.

This may be only part of the answer, and there may be other reasons: correcting misrecognized words and phrases improperly, running the language and acoustic models optimizer, and using certain NaturallySpeaking settings. I have met people who don't understand how to fix misrecognitions, and end up with lousy accuracy. I run the optimizers, and have never had problems. Nor have I been able to discover any program settings that lead to accuracy degradation.

Whatever the reason for dwindling accuracy, creating a new user does not have to be an ordeal. I have started from scratch so often that it only takes me 5 to 10 minutes. I would rather spend time creating an accurate user than fighting to make an existing user work.

The trick to creating a user quickly is to have a backup of your unique words and phrases, and import the list into the new user. I also keep a 20 or 30 page writing sample, and feed it to NaturallySpeaking to analyze.

For me, initial accuracy with a new user is around 95%. I usually opt for the no training option, but more often than not, I end up doing a five-minute general training. I find that it makes a difference. At any rate, achieving 95% accuracy in under 10 minutes is a worthwhile time investment.

For the next several days, I try to correct words/phrases in context. Correcting in context is more work than correcting individual words, but I do it as much as possible, at least until NaturallySpeaking has collected enough data to run the acoustic and language model optimizers.

Soon after creating a new user, my accuracy levels off at 95% to 98%. Some people may get greater accuracy, but to my way of thinking, anything north of 95% is an achievement!

At that point, I become less assiduous about correcting in context and saving my user. I only save my user files after

1. I add important words to the vocabulary;

2. I delete problematic words from the vocabulary; and

3. I train troublesome words or commands.

The oldest user files that I currently have are over two months old, and are working fine. Before I adopted the above regime, I would notice accuracy degradation within a month.

 11/08/2009 12:41 AM
phils
Top-Tier Member

Posts: 2624
Joined: 10/02/2006

I consistently get about 97 to 98% accuracy and my current user profile is more than a year old. I dictate several hours per day, run the optimizer regularly, try to always correct in context, add and delete words regularly, save my user files daily and have a fully custom vocabulary of about 40K words for technical writing which does not use "add word automatically". I export my user every two weeks and use the latest export on four other machines at least every other week with similar accuracy.

I have NO idea why I don't have problems.

Phil Schaadt

 

 

 

 11/08/2009 08:25 AM
Dana
Top-Tier Member

Posts: 1411
Joined: 10/01/2006

Quote:
I consistently get about 97 to 98% accuracy and my current user profile is more than a year old. I dictate several hours per day, run the optimizer regularly, try to always correct in context, add and delete words regularly, save my user files daily and have a fully custom vocabulary of about 40K words for technical writing which does not use "add word automatically". I export my user every two weeks and use the latest export on four other machines at least every other week with similar accuracy. I have NO idea why I don't have problems.

I ditto Phil exactly: I also routinely get 98% - 99% accuracy, and my current profile is from 10.0 (1-2 years old?)  I also save my files daily and run the Optimizer about every 2 weeks.  I regularly back-up my profile and store it on another hard drive.  I have had to restore my User file a couple of times - but never have any problems with accuracy after going to a restored User.

I'm not sure what causes accuracy degradation for some users.  I do not get accuracy degradation - and other than the above, I don't use any special techniques to keep my User in "good health!"

                 Dana



-------------------------

Dana Joan - Vero Beach, FL  -  DMPE, Version 2.2; Oncology Large Vocabulary; Windows 7.1 (on local computer); Sennheiser MD 431 II mic with the Andrea USB pod; work on a Remote Desktop.

 11/08/2009 08:11 AM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Quote:
Soon after creating a new user, my accuracy levels off at 95% to 98%. Some people may get greater accuracy, but to my way of thinking, anything north of 95% is an achievement!

At that point, I become less assiduous about correcting in context and saving my user. I only save my user files after

1. I add important words to the vocabulary;

2. I delete problematic words from the vocabulary; and

3. I train troublesome words or commands.

The oldest user files that I currently have are over two months old, and are working fine. Before I adopted the above regime, I would notice accuracy degradation within a month.

Alan,

First, given the current state of the art in speech recognition, 95% accuracy is horrible. 95% accuracy reduces your overall productivity by almost 25%. If I had to waste that much time over the course of dictating for eight hours, it would cost me about an hour and a half. There are times when my overall accuracy will drop to about 98%. However, for me this happens for two basic reasons:

1. I start getting sloppy. That is, the length of my utterances starts decreasing from 10 to 12 words down to dictating in short, choppy three-word phrases. This also happens because I tend to maintain my current dictation speed (between 140 and 160 words per minute), but I start mumbling, slurring words, or running words together. At that point I know that it is time to take a break.

2. In order to maintain a high level of accuracy it is necessary to rerun the Audio Setup Wizard and periodically close Dragon NaturallySpeaking and relaunch it. On occasion, I do notice that DNS can have a tendency to insert some bizarre substitutions for things that it normally gets absolutely correct. This is the first indicator to which every user should pay attention and perform the suggestions in #1 above. This will usually correct this problem. It is important to remember that, depending upon your hardware (CPU, memory, & microphone/soundcard), DNS will tend to overload the system resources when dictating over long periods of time without performing these basic reinitialization steps.

Second, I'm not surprised that you can't maintain high accuracy with your user profiles over time given the methods that you describe above. Everything that you do fails to take advantage of the advanced features of DNS. For example:

1. Failing to save your user profile on a regular basis results in your losing all of the modifications made by SilentAdapt over the course of any given dictation session. What you gain through SilentAdapt is lost permanently by not saving your user profile. Running the Acoustic and Language Model Optimizer will not recapture any of that. It's gone, permanently. So, what you gain by virtue of SilentAdapt, you lose by virtue of not saving your user profile. Basically, it simply means that you don't continuously adapt your Acoustic Model on-the-fly.

2. While you can recapture corrections that you make via the Acoustic and Language Model Optimizer, by becoming "...less assiduous about correcting in context..." you lose the advantage of another process that tends to maintain your accuracy and improve it over time.

3. One of the most significant aids to improving accuracy is enabling the "Always preserve wave data" option under DNS Options > Data tab > Advanced button. This feature stores all of your dictation in dra files, which are then used by the Acoustic and Language Model Optimizer in much the same way that documents are analyzed and adapted to your writing style by running either the Voctool (recommended) or "Add words from your documents to the vocabulary" from the Accuracy Center. In addition, using this feature also has the advantage of adapting both your Acoustic Model and your Language Model when running the Acoustic and Language Model Optimizer. In short, this is an additional "SilentAdapt" feature.

Granted, some users can't take advantage of this because of limited RAM. Also, this tends to make user profiles very large. Therefore, users with limited hard drive space or running off USB thumb drives can experience certain negative aspects of using this feature. For example, when running off a USB thumb drive, using "Always preserve wave data" can tend to slightly reduce overall performance because this results in more frequent access to the storage media. The slower the read/write access to your storage media, the greater the performance hit on DNS. However, this doesn't seem to have much impact on latency, even when the slowdown is noticeable.

Third, I tend to agree with Phil. My current user profile was created in August of last year. I set it up with the initial training set to "None" and have never trained it. That user profile is still in use, and accuracy overall is about 99%, with the accuracy being 99.9% much of the time. I have never suffered from profile corruption or accuracy degradation since its original creation. I also export that user profile on a regular basis just in case something happens. I've only had to replace my user profile a couple of times since DNS 6, and that was because of a complete system crash, which totally corrupted files in the user profile to the point where they were unreadable. However, my accuracy today is better than my accuracy ever was when I first created the user profile, and it has never degraded.

Lastly, there is a complex interaction between computer hardware, microphone/soundcard, and user dictation style. Any one of these or combination thereof can produce problems with accuracy. In addition, they can also produce problems with overall performance (speed/latency). I won't get into this in any detail at this point. Suffice it to say that less-than-optimal computer and microphone/soundcard configurations (and by that I don't mean that you have to have the latest technology) can reduce the overall performance and accuracy of DNS. For example, unless your hardware has changed, I notice that you're using a Pentium 4 with 4 GB of RAM under Windows Vista. Now it's not clear whether you're using Vista 32-bit or Vista 64-bit, but your hardware configuration, if it is still what you're using, doesn't take full advantage of the capabilities of DNS 10 because at best it would be a dual core Intel 950, which is limited to 2 MB of L2 cache and overall fairly slow, particularly with Windows Vista. Granted, 4 GB of RAM certainly helps improve performance on such a system running DNS 10, but it is significantly less than ideal.

On the other hand, is it necessary to take advantage of the new Core™ i7 (Nehalem) technology to get the best performance out of DNS 10? No. A good Core2™ Duo or Core2™ Quad with at least 4 MB of L2 cache (note the more L2 cache the better) is all that's necessary to take maximum advantage of the capabilities of DNS 10 in terms of overall performance, reduction in latency, and accuracy. However, the only reason that I generally recommend going to a Core™ i7 system is because they are not any more expensive, and in fact in many cases they're cheaper, than a higher end Core2™ Duo or Core2™ Quad system. Here there are two factors: (a) cost/benefit, and (b) newer technology that will perform significantly better when software developers begin to incorporate multithreading, which the Nehalem chip is designed specifically to take advantage of. In short, you can get a Core™ i7, or Core™ i5 for that matter, system for the same or less than some current Core2™ Duo and Core2™ Quad systems. In addition, the Core™ i7 920 will outperform the Core2™ Extreme Quad Core QX9650, which is many hundreds of dollars more expensive. My systems vary from a basic Core2™ Duo 2 GHz, 4 MB of L2 cache and 2 GB of RAM (laptop) to a Core2™ Quad Q6600, Core2™ Extreme Quad Core QX9650, Core2™ Duo E8400, and to Core™ i7s (Core™ i7 Extreme 975 and a Core™ i7 920). Of all of these, I find no advantage to the Core™ i7 Extreme 975 over the Core™ i7 920, and both of these exhibit better overall performance than any of the Core2™ Duo or Core2™ Quad systems that I have. Nevertheless, the better the computer hardware the better DNS 10 will perform, and the better that DNS 10 performs, the more accurate it will be.

Further, using computer hardware that is capable of taking advantage of DNS 10 by setting the Speed vs. Accuracy slider all the way to 100% without any increase in latency will gain you better accuracy overall. The reason for this is that the default setting at 50% for the Speed vs. Accuracy slider eliminates the use of the quadgram model. Only if the Speed vs. Accuracy slider is set to 75% or better does DNS engage the quadgram model.

Also, the less optimal your microphone/soundcard is, the more difficult it is for DNS to transcribe your dictation. Less than effective noise canceling results in background noise making it more difficult for DNS to understand what you say clearly because the volume and type of background noise interference makes it difficult for DNS to get a clear voice pattern that it can easily interpret. This causes delay (latency) issues and makes it more difficult for DNS to distinguish between your speech and background noise interference. Under these conditions, the higher the volume of background noise, the more difficult it becomes for DNS to separate out speech from background and the less clear your dictation. High-volume and low-volume interference can reduce clarity and/or mask out your speech. The end result is that it takes longer for DNS to correctly and clearly interpret what you say.

One final point. Even the most optimal microphone/soundcard and/or computer hardware are of little value in producing accurate transcription if your dictation style is poor. That is, mumbling, slurring your words, running your words together, dictating faster than you can clearly enunciate and separate words, or generally dictating in short, choppy phrases will defeat even the best hardware configuration.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

If there's more than one way to do a job, and one of those ways will result in disaster, then somebody will do it that way.  (variant of Murphy's law - Edward A. Murphy, Jr.)



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/08/2009 09:41 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Quote:
"Always preserve wave data" option

Chucker

I seem to recall from previous discussions on this forum that after running the acoustic optimiser it would be advantageous to delete the dra files before dictating again. Or is my memory failing?



-------------------------

Jomark


 


DPI 15.61, KB2017, SpeechStart+, MS Office 2019 Professional, Windows 10 Pro

 11/08/2009 10:42 AM
MDH
Top-Tier Member

Posts: 2257
Joined: 04/02/2008

Jomark,

Yes, that is correct. After running the Acoustic and Language Model Optimizer, there is no gain in keeping the dra files that were used to do this optimization. They have already served their purpose. Additionally, the dra files can accumulate rather quickly, hogging a lot of disk space. So it is best to delete the dra files after running the ACO.

MDH



-------------------------
 11/08/2009 11:18 AM
phils
Top-Tier Member

Posts: 2624
Joined: 10/02/2006

Quote:
That is, if you mumbled, slurred your words...

I can demo DNS to my colleagues at 99%+ writing highly technical SOA integration design text full of buzzwords, jargon, abbreviations and strange product names.  If I didn't regularly slip into my bad habits and mumble my small words and endings, especially in front of new custom words, I would consistently have 99%+ recognition, but I don't because I get sloppy.

Phil Schaadt

 11/08/2009 11:36 AM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Quote:
I can demo DNS to my colleagues at 99%+ writing highly technical SOA integration design text full of buzzwords, jargon, abbreviations and strange product names. If I didn't regularly slip into my bad habits and mumble my small words and endings, especially in front of new custom words, I would consistently have 99%+ recognition, but I don't because I get sloppy.

It's all in the way you dictate! (1st Commandment)

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/08/2009 01:47 PM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Quote:
It's all in the way you dictate

Rüdiger

I totally agree.

The key is to master the concept of thinking to speech as opposed to thinking to writing. Good diction is also required.

As someone used to writing for most of my professional life, dictation has been something quite hard to master and become proficient at.




 11/08/2009 01:59 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Jomark,

You're leaving nothing to add. Learning to dictate using speech recognition means adopting a completely different cultural technique. It's equivalent to learning a different way to move.

Rüdiger

 




 11/09/2009 01:21 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

MDH

Not all files have a .dra extension. Is it safe to delete all files in the Dra folder?




 11/09/2009 08:28 AM
MDH
Top-Tier Member

Posts: 2257
Joined: 04/02/2008

Jomark,

The dra files that you want to delete, which are safe to delete and no longer necessary, are located at:

C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking(version #)\Users\(your user-name)\current\voice_container\drafiles

You can use the following command to do this:

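' Open the drafiles folder in a Windows Explorer window
AppBringUp "C:\Documents and Settings\All Users\Application Data\Nuance\NaturallySpeaking(version # here)\Users\(your user-name here)\current\voice_container\drafiles"
Wait 1
' Select every file in the folder
SendDragonKeys "{Ctrl+a}"
Wait .3
' Delete permanently, bypassing the Recycle Bin
SendDragonKeys "{Shift+Del}"
Wait 1
' Confirm the deletion prompt
SendDragonKeys "{Alt+y}"
Wait 1
' Close the Explorer window
SendSystemKeys "{Alt+F4}"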
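
The keystroke macro above can also be approximated outside Dragon. What follows is only a minimal Python sketch under stated assumptions: `clear_dra_folder` and its `dry_run` flag are hypothetical names, not part of Dragon or KnowBrainer, and you supply the path to your own drafiles folder. It removes only the files directly inside the folder you pass in, so anything in the parent voice_container folder is left alone.

```python
from pathlib import Path


def clear_dra_folder(folder, dry_run=True):
    """Delete the regular files directly inside `folder` and return their names.

    Hypothetical helper, not part of Dragon or KnowBrainer.
    With dry_run=True nothing is deleted; the names are only reported.
    """
    folder = Path(folder)
    removed = []
    for entry in sorted(folder.iterdir()):
        if entry.is_file():  # leave any subfolders untouched
            if not dry_run:
                entry.unlink()
            removed.append(entry.name)
    return removed
```

Running it once with the default `dry_run=True` shows what would be removed; calling it again with `dry_run=False` actually deletes the files. As with the macro, it is probably wise to exit NaturallySpeaking first.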

 

MDH



-------------------------
 11/09/2009 02:56 PM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

MDH

Thanks, I have deleted all files in the Dra folder.

I presume the file extension changes from .dra after the optimiser has run.




 11/11/2009 04:20 PM


mkweiss
Senior Member

Posts: 118
Joined: 12/21/2006

Just this morning I changed my preferences to "save" .wav data.  Even without doing optimization, I am seeing a remarkable improvement in accuracy.  I have no idea what this has done to my system, but I am very pleased.  I really do not think it is simply a "placebo effect": I am not dictating any differently than I usually do, nor am I under voice stress.  Even if no one has an idea why this has helped, I am certainly leaving this setting on!
 11/08/2009 10:53 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

Quote:
That user profile is still in use, and accuracy overall is about 99%, with the accuracy being 99.9% much of the time.

Hi Chucker,

Interesting post. I sometimes hit 99% or better accuracy, but it never lasts. Misrecognitions begin to happen, even when I am speaking clearly. I work in a quiet environment. My CPU is actually a Core 2 Duo 1.5 GHz, not a Pentium.

I achieve the best accuracy when I am testing the system. But when I am writing and revising real texts, accuracy is less. I don't see how it could be otherwise. The errors are (usually) not far-fetched: I say "wouldn't talk" and DNS outputs "would not talk" or "wooden talk." DNS mixes up phrases like "printed out" and "print it out," despite there being more than sufficient context, e.g., "I printed out the document."

Perhaps I use more homophonic (or near homophonic) words and phrases than most people. So to me, 98% seems reasonable for free-form dictating and editing. It does not bother me when accuracy falls slightly from its highest levels. It's still easier, faster, and more fun than typing. When the system starts to act bizarre, my protocol is similar to yours: I run the Audio Setup Wizard; exit as many applications as practical; close and relaunch DNS; and if the problem persists, reboot my PC... or take a break.

When you say you get 99.9% much of the time, are you estimating, or have you measured this? It seems remarkable that anybody could dictate 1000 words and commands with only a single misrecognition.
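
To put the percentages in this thread side by side, here is a small Python sketch of the arithmetic. It assumes, purely for illustration, that "accuracy" means the share of dictated words recognized correctly; `expected_errors` is a hypothetical helper name, and nothing here measures how Dragon itself counts errors.

```python
def expected_errors(words, accuracy_percent):
    """Misrecognitions expected in `words` dictated words at a given accuracy."""
    return words * (100 - accuracy_percent) / 100


# Accuracy levels mentioned by various posters in this thread
for accuracy in (95, 98, 99, 99.9):
    errors = expected_errors(1000, accuracy)
    print(f"{accuracy}% -> about {errors:.0f} errors per 1000 words")
```

Under that assumption, 95% works out to about 50 corrections per 1000 words, 98% to about 20, 99% to about 10, and 99.9% to about 1.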

 11/08/2009 11:35 PM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

Quote:
3. One of the most significant aids to improving accuracy is enabling the "Always preserve wave data" option under DNS Options > Data tab > Advanced button. This feature stores all of your dictation in dra files, which are then used by the Acoustic and Language Model Optimizer in much the same way that documents are analyzed and adapted to your writing style by running either the Voctool (recommended) or "Add words from your documents to the vocabulary" from the Accuracy Center. In addition, using this feature also has the advantage of adapting both your Acoustic Model and your Language Model when running the Acoustic and Language Model Optimizer. In short, this is an additional "SilentAdapt" feature.

 

Chucker, 

When "Always preserve wave data" is unchecked, what does running the "Acoustic and Language Model Optimizer" do? What data does the optimizer draw upon to tweak the acoustic and language models?

 11/09/2009 03:37 AM


Ron Len
Member

Posts: 116
Joined: 10/02/2006

Chuck,

Just out of curiosity, I am intrigued with your statement that "one of the most significant advantages to improving accuracy is the use of (enabling) Always Preserve Wave Data...."

With that being said, what is the reasoning for Nuance essentially burying such a "significant" feature deep within the settings of the program?

I am not being argumentative, but I am more surprised that this has not been brought up in the past as one of the main ways to increase accuracy. One would think that this would be of primary importance as basic information that should be imparted to end-users.

If it has been discussed in depth on this forum, I missed it. So thanks for pointing out the necessity of checking the box, which I have now done; I am looking forward to seeing the net effect.

Thanks -- Len

 11/09/2009 04:23 PM


photoman
Top-Tier Member

Posts: 311
Joined: 07/08/2009

I feel like a dwarf entering the land of the Giants but I will proceed anyway. I was following this post and tried a few of the suggestions. Regarding the drafiles, I found a number of files that began with "DRA" which had a file extension of "dft." What is the significance of these files, and can we also delete them?

-------------------------

Photoman
Digital Cameras Don't Take Good Pictures. People Do!

DNS Preferred V11, Quad 4 Intel i7-920; 2.66GHz; 8GB RAM; Windows 7 - 64 bit

 11/09/2009 07:38 PM
Lunis Orcutt
Top-Tier Member

Posts: 39769
Joined: 10/01/2006

You can safely delete the entire dra folder if you choose. NaturallySpeaking will simply re-create it when you re-launch DNS.

-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1

 11/10/2009 02:11 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Photoman,

After running the Acoustic and Language Model Optimizer you will find that many of the original dra files have been converted to *.dft and *.nwv.  The former are text files (dft standing for draft); the latter are archive files converted for use by the Acoustic and Language Model Optimizer (nwv).

However, as Lunis points out, you can simply delete the entire folder and/or the files in it with impunity.  DNS will simply re-create the folder and proceed to add new dra files to the list.

You will also find a file in the voice_container folder labeled drafiles.ini.  This contains a list of the dra files that have been stored in the voice_container\drafiles folder.  This list is what the Acoustic and Language Model Optimizer uses when processing these files.  This is why you occasionally see dra files left untouched (i.e., unanalyzed by the Acoustic and Language Model Optimizer).  Also, by leaving this file intact you can end up with warning messages in the Dragon log that basically say that Dragon couldn't find a particular dra file.  This is simply because the file is listed in the drafiles.ini but not in the list of dra files actually contained in the voice_container\drafiles folder.  These entries are harmless.  My recommendation is to simply leave this file intact.  Doing so does no harm.  However, deleting it vs. editing it can result in the Acoustic and Language Model Optimizer ignoring any new dra files that may have been stored in this file list.  I only bring this to everyone's attention so that you know that it's there.  My recommendation is to leave it alone unless you know exactly what you're doing.  There are fewer problems associated with leaving it alone than there are with deleting it.  Therefore, this is just FYI.  Leave it alone unless you want to spend the time tediously checking the list and deleting individual entries that are no longer contained in the drafiles folder.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

"Life's Rule #1: Once you pull the pin, Mr. Grenade is no longer your friend."  (Variant of Murphy's Law  -  Edward A. Murphy, Jr)




 11/11/2009 10:48 AM


CtRptr
Advanced Member

Posts: 170
Joined: 10/02/2006

Quote:
Further, using computer hardware that is capable of taking advantage of DNS 10 by setting the Speed vs. Accuracy slider all the way to 100% without any increase in latency will gain you better accuracy overall. The reason for this is that the default setting at 50% for the Speed vs. Accuracy slider eliminates the use of the quadgram model. Only if the Speed vs. Accuracy slider is set to 75% or better does DNS engage the quadgram model.

Chucker - You're saying to set the Speed vs. Accuracy slider to 75% to the right, on the side of Accuracy rather than Speed, correct?

Don



-------------------------

M-Tech M8600-7gen, Microprocessors: P-I7700-HK - Intel Core i7-7700HQ Processor (6M Cache, up to 3.80 GHz)e, up to 4.00 GHz, 64-bit, 32GB RAM, Windows 10 Professional, Dragon Professional Individual 15, SpeechMatic USB MultiAdapter, SmartMic (closed microphone court reporting)

 11/11/2009 11:07 AM


Alan Cantor
Top-Tier Member

Posts: 4280
Joined: 12/08/2007

Quote:
Further, using computer hardware that is capable of taking advantage of DNS 10 by setting the Speed vs. Accuracy slider all the way to 100% without any increase in latency will gain you better accuracy overall. The reason for this is that the default setting at 50% for the Speed vs. Accuracy slider eliminates the use of the quadgram model. Only if the Speed vs. Accuracy slider is set to 75% or better does DNS engage the quadgram model.

This is so interesting! Are there other settings along the slider that engage/disengage features? 

 

 11/11/2009 12:06 PM
Lunis Orcutt
Top-Tier Member

Posts: 39769
Joined: 10/01/2006

Quote:
Chucker - You're saying to set the Speed vs. Accuracy slider to 75% to the right, on the side of Accuracy rather than Speed, correct?


Yes, we believe that's what Chuck meant. We didn't know that the quadgram algorithms were not engaged below the 75% minimum, so we would like to thank Chuck for that information too.


-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1

 11/11/2009 12:27 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Keeping in mind, however, that, in addition to where you set the slider, it takes dictating nine words in a single utterance at least to take full advantage of the quadgram models for at least one word. Anyway, it just goes to show that the error rate usually drops significantly if you dictate in very long utterances, whether it be trigram or quadgram model processed.

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/11/2009 01:03 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Quote:
Keeping in mind, however, that, in addition to where you set the slider, it takes dictating nine words in a single utterance at least to take full advantage of the quadgram models for at least one word. Anyway, it just goes to show that the error rate usually drops significantly if you dictate in very long utterances, whether it be trigram or quadgram model processed.

Rüdiger & Lunis,

The quadgram model only requires five words minimum. The reason for this is that if you dictate an utterance that is five words in length, the quadgram model will engage analyzing the context of all the words to the left of the fifth word. For the remainder of the utterance the appropriate bigram or trigram model is engaged for the other words. So, there is kind of a double and triple check in terms of context. However, this only applies during normal dictation.

When you run the Acoustic and Language Model Optimizer and it is analyzing dra files from the voice_container\drafiles folder, the quadgram model is employed because the audio isn't parsed by utterance, it's parsed by context. So, having the Speed vs. Accuracy slider set as far to the right as possible given the specific hardware configuration gives you better context analysis for these dra files. When dra files are recorded in this manner, they are not recorded by utterance, the audio is continuous.

So, the bottom line is that engaging the quadgram model depends upon utterance length during normal dictation, whereas the optimizer takes the audio in dra files as a whole. Nevertheless, as long as the Speed vs. Accuracy slider is set so as to engage the quadgram model, it also engages all the other models and runs the entire utterance or audio through all of them, limiting it to the bigram and trigram models only in the case of a single utterance being less than five words.

Nevertheless, engaging the various n-gram models is determined by the specific circumstance. What matters when you move the Speed vs. Accuracy slider further to the right, and particularly when you move it to the 100% mark, is that the quadgram HMM is very large. So, the lower the performance of the system in terms of overall speed, the greater the latency, particularly if DNS engages the quadgram model. This is why the default is set to 50%. That is, most systems should be able to handle 50% (bigram/trigram analysis) without any increase in latency.

On the other hand, don't ask me to go into detail because what actually occurs is based on the unique aspects of the specific circumstance. By that I mean that you can't draw any general conclusions about what model is engaged and when without the specific condition under which it is applied.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

"At times we shall simply have to admit that, one way or another, what we can neither explain nor understand certainly doesn't cease to exist because we cannot see how it does or why it should." - Dr. Mark Hyman



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/11/2009 03:49 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Quote:
The quadgram model only requires five words minimum.

Chuck,

Please, take a look at this:

Quote:
There are major changes to the speech models, particularly the language models, which now include not only the vocabulary, bigram, and trigram models, but include a new quadgram model. However, those of you who dictate in short choppy phrases consisting of three or four words or less and pause frequently during the course of your dictation will never see the quadgram model working. The reason for this is that the accuracy performance provided by the quadgram speech model (hidden Markov models) is only invoked when users dictate nine or more words without pausing (and pausing means what Dragon detects as pauses not what the user detects pauses). This is simply because the quadgram model requires that each target word have at least four words either side of it.

From: http://www.speechcomputing.com/node/917

I'm not quoting this to blame you, just to show where I got it from, among other sources. I'm not sure what's right here.

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/11/2009 04:26 PM
monkey8
Top-Tier Member

Posts: 3878
Joined: 01/14/2008

Rüdiger

Are you aware that you are quoting Chuck to contradict Chuck? Personally, I believe Chuck.

Anyway, I am confused. I thought that the quadgram came into play with just four words dictated (okay, I know it sounds like I am stating the obvious). If it doesn't, why does the bigram, for example, work with only two words? This is easily demonstrated by, for example, dictating the word

"write"

you usually get "write" or "right" from the unigram, however if you dictate

"Mr Right"

you always get "Mr Wright" which is pretty conclusive that the bigram is kicking in. So why is the quadgram different?
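To make the "Mr Wright" observation concrete, here is a toy sketch in Python of how a bigram count can settle a homophone that a unigram count alone gets wrong. The counts are invented purely for illustration and are not taken from any Dragon model; the function and table names are hypothetical too.

```python
# Toy counts invented for illustration; they are NOT taken from any Dragon model.
UNIGRAM = {"right": 900, "write": 600, "wright": 20}
BIGRAM = {("mr", "wright"): 50, ("mr", "right"): 2}


def pick(candidates, previous=None):
    """Choose among homophones, using the previous word when one is available."""
    if previous is None:
        # No context: fall back to overall word frequency.
        return max(candidates, key=lambda w: UNIGRAM.get(w, 0))
    # With context: rank by bigram count, breaking ties by word frequency.
    return max(
        candidates,
        key=lambda w: (BIGRAM.get((previous, w), 0), UNIGRAM.get(w, 0)),
    )


homophones = ["right", "write", "wright"]
print(pick(homophones))        # right
print(pick(homophones, "mr"))  # wright
```

A quadgram model would work the same way in principle, only with three context words in the key instead of one, which is why it needs more dictated words before it has anything to look up.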

I am not questioning the facts just confused about the reasons.

Thanks

Lindsay



-------------------------

 11/11/2009 04:49 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Lindsay,

well, I think as far as quadgrams, and five or nine, vs. four (or eight) words are concerned, Chuck, Chuck, and Rüdiger were wrong, and Lindsay was correct, at least as far as counting is concerned.

Basically, a quadgram, by definition, is a string consisting of four elements: characters, phonemes, syllables, words, and so on. The sentence "the red fox jumped over the lazy brown dog" will give you "* the red fox, the red fox jumped, red fox jumped over, ...". When building a word-context language model from source data (texts), such strings are counted for their statistical occurrence, which provides a way to predict their probability within a given utterance: the recognizer computes confidence scores for competing results that the acoustic model couldn't resolve (homophones) and decides by Maximum Likelihood Estimation (MLE).
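As a small illustration of the definition, this Python sketch lists the word quadgrams of that sentence. It leaves out the "*" start-of-utterance padding shown in the example, so it only returns windows made of four real words.

```python
def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "the red fox jumped over the lazy brown dog".split()
quadgrams = ngrams(sentence, 4)
print(len(quadgrams))  # 6
print(quadgrams[0])    # ('the', 'red', 'fox', 'jumped')
```

An utterance shorter than four words yields an empty list here, which is the counting point made above: without padding or earlier text, there is no complete quadgram to look up.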

The question, however, is whether the search only goes backwards on a given utterance, or also back and forth, during initial recognition. If you type "n-gram backwards forwards" into Google, you will find lots of articles, including technical (scientific) papers, that state both.

Besides, there are just so many different approaches being used in various builds of speech recognition engines, and no-one so far has ever made public which are the ones being used by DNS - at least I haven't heard of it.

Basically, I would presume that there is a certain amount of speculation going on. For what it's worth.

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/11/2009 05:37 PM


CtRptr
Advanced Member

Posts: 170
Joined: 10/02/2006

Uh, so what do I learn from all this?
Don

-------------------------

M-Tech M8600-7gen, Microprocessors: P-I7700-HK - Intel Core i7-7700HQ Processor (6M Cache, up to 3.80 GHz)e, up to 4.00 GHz, 64-bit, 32GB RAM, Windows 10 Professional, Dragon Professional Individual 15, SpeechMatic USB MultiAdapter, SmartMic (closed microphone court reporting)

 11/11/2009 05:41 PM
R. Wilke
Top-Tier Member

Posts: 7809
Joined: 03/04/2007

Quote:
Uh, so what do I learn from all this?

Don,

don't worry about it.

Rüdiger

 



-------------------------



No need to buy if all you want to do is try ...

DragonCapture KB Download (Latest)
DragonCapture Homepage

 11/11/2009 06:05 PM


CtRptr
Advanced Member

Posts: 170
Joined: 10/02/2006

Okay, then.  Thanks!

Don



-------------------------

M-Tech M8600-7gen, Microprocessors: P-I7700-HK - Intel Core i7-7700HQ Processor (6M Cache, up to 3.80 GHz)e, up to 4.00 GHz, 64-bit, 32GB RAM, Windows 10 Professional, Dragon Professional Individual 15, SpeechMatic USB MultiAdapter, SmartMic (closed microphone court reporting)

 11/12/2009 10:10 AM
David.P
Top-Tier Member

Posts: 638
Joined: 10/05/2006

It seems that there is some confusion about the amount of context that a) exists around any given word under different conditions, and that b) can be used by NaturallySpeaking for improving accuracy during the recognition process.

 

NaturallySpeaking improves its recognition accuracy by guessing what you said not only based on acoustic information (using the Acoustic Model of each word) but also based on the context of your utterance or sentence. The latter is the job of the Language Model.

 

Since a picture is worth more than a thousand words, I'll try and save words and instead post a picture.

 

The image below shows a graphical representation of a (small) paragraph's worth of text, consisting of several dictated utterances. In the image, the words are depicted by a row of small squares. The color of a word-square shows how much context around that word (see color legend) can be used by NaturallySpeaking's language model to improve that word's recognition accuracy.

 

The larger colored fields around the word-squares show the extent (length) of the context for any given word having the same color.

 


[Image: n-gram context diagram; click to enlarge]

 

So what does that mean, in practice?

 

First, it means that the longer the utterance, the "greener" (that is, the better) accuracy will be for that utterance. Starting with an utterance of at least four words, NaturallySpeaking can employ the quadgram model on both sides of the first word(s) of the utterance (see the green words turning up in every utterance longer than three words), giving those words the highest possible accuracy.

 

Second, the longer the utterance, the more green words it will contain, each having maximum accuracy. Thus, there is basically no threshold number of words beyond which accuracy becomes "optimal". As long as the recognizer can handle the length of the utterance, your accuracy will increase asymptotically with the length of your utterance.

 

Third, accuracy will always drop along the last three words of an utterance of any length, since the right hand side context of the last three words is always less than a quadgram. The last word of an utterance of any length will always get the lowest accuracy, statistically, since it has only left side context, but no right side context at all.

 

Fourth, when dictating in short, choppy utterances or even by only saying one word at a time, NaturallySpeaking can and will take into account the respective left side context, thus still improving your accuracy by employing the Language Model as well as it can. This is because NaturallySpeaking (though only in Select-And-Say applications) will always know about and take into account the text on the left side of the text cursor, even if this text belongs to a previous utterance, or even if that text was only typed.

 

Fifth, numbers will always be more of a problem, accuracy-wise, than normal words.

 

Why is that? The simple reason is that there's no such thing as context around a digit or number, which could help NaturallySpeaking to figure out that you meant "nine" when you mumbled something like "mine" or "fine" or "five" etc. (other than, possibly, in cases like "Ali Baba and the 40 Thieves" ). Numbers have no useful context, therefore number recognition will be always (considerably) inferior, even in comparison to single word dictation.

 

Anyway and summing up, it actually IS best to always try and dictate in utterances as long as possible. However, due to the advanced context recognition abilities of NaturallySpeaking (in Select-And-Say applications), you will get fairly acceptable results even when only dictating in short(er) utterances, or in single words.

 

Hope that clears up things a bit,

 

David.P



-------------------------

Sennheiser MKH Mic
Visual & Acoustic Feedback + Automatic Mic Control



 11/12/2009 10:44 AM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

David,

Kudos.  Very nicely done and right on the money.  Thank you.

The only problem is that, for some reason or other, trying to enlarge the image produces a 404 error.  I'm not sure exactly why this occurs, but I've run into it before when trying to post screens.  It may have something to do with the fact that the KnowBrainer forum simply doesn't like PNG files, or it could be simply because it's attempting to access it from your website.  Regardless, would you be so kind as to save it as an attachment?  Also, it might be more accessible if you converted it to JPEG format.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

"Kindness is the language which the deaf can hear and the blind can see." -- Mark Twain



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/12/2009 11:03 AM
David.P
Top-Tier Member

Posts: 638
Joined: 10/05/2006

Thanks Chuck. Below's the image as an attachment in two formats that should display in any browser or image viewer. I'd rather not supply a JPEG version since that (photographic) format tends to degrade/blur on line drawings.

Btw., I have been using *.png images on the KB forum for years with no problem -- so I'm not sure about that 404 error. Maybe just re-loading the page could help.

Regards David.P



n-Grams.png  (15 KB)
n-Grams.gif  (15 KB)



-------------------------

Sennheiser MKH Mic
Visual & Acoustic Feedback + Automatic Mic Control

 11/12/2009 12:39 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

David,

Thanks for attaching the files.  The problem that I was having was that something had corrupted Internet Explorer.  When I finally realized that that was the problem, reinstalling IE 8 solved it.

Nevertheless, thank you for the files.  Also, my problem with attempting to insert screen captures that were saved in PNG format was related to the same IE problem noted above.  Can't figure out what corrupted IE, but everything is back to normal.  Looks like this happened a day or so ago.  Go figure!!!

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong, it usually turns out to be impossible to get at or repair.- Douglas Adams (1952 - 2001)



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/12/2009 04:13 PM


Keith
Top-Tier Member

Posts: 240
Joined: 10/03/2006

Hi Chuck - just a comment on your excellent post a few days ago about moving the Accuracy Slider all the way to the right.  I have an I7 chip, 8GB RAM and DNS Med 10.1 but it makes the words appear in my dictation extremely slowly after I dictate them.  Hard to get used to!

Regards

Keith



-------------------------

Keith

 11/12/2009 07:08 PM
Lunis Orcutt
Top-Tier Member

Posts: 39769
Joined: 10/01/2006

Try moving the slider left to the 50% mark (faster) and see if the performance goes up noticeably. If it does, there's probably something wrong in your system because, from a speed perspective, you shouldn't see any difference on an i7 whether the slider is set all the way to the left (fastest response) or all the way to the right (most accurate).

-------------------------

Change "No" to "Know" w/KnowBrainer 2020
Trial Downloads
Dragon/Sales@KnowBrainer.com 
(615) 884-4558 ex 1

 11/12/2009 11:55 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Quote:
Hi Chuck - just a comment on your excellent post a few days ago about moving the Accuracy Slider all the way to the right. I have an I7 chip, 8GB RAM and DNS Med 10.1 but it makes the words appear in my dictation extremely slowly after I dictate them. Hard to get used to!

Keith,

Lunis's suggestion is a good way to test your problem. However, if you look at my configuration at the bottom of each of my posts I'm running a Core™ i7 920 with 12 GB of RAM and Windows 7 Ultimate 64-bit. I have never experienced any slowdown with the system and the display of text after I pause is instantaneous.

Therefore, I would suggest that you have a problem somewhere along the line. Your profile is unclear because it hasn't been updated. Therefore, my question(s) would be:

1. What version of Windows are you currently using (i.e., Vista or Windows 7, 32-bit or 64-bit)?

2. Are you currently running this on your iMac? If not, please specify what your overall hardware configuration is.

The bottom line is that something is interfering with the performance of DNS.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

If computers get too powerful, we can organize them into a committee - that will do them in. - Bradley's Bromide



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/13/2009 10:06 AM


Keith
Top-Tier Member

Posts: 240
Joined: 10/03/2006

Hi Chuck - sorry about not including info on my profile. 

 

I have Windows 7 64-bit on a new HP PC with an i7 processor, 8 GB RAM and a 1 TB hard drive.

 

thanks again

Keith



-------------------------

Keith

 11/13/2009 10:32 AM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Keith,

What is the model number of your Core™ i7 (i.e., 975, 920, 860, etc.)?

Off the top of my head I have two possible causes:

1.  Turn off the Windows pagefile.  That is, go into your virtual memory settings and set the pagefile to none.  The more RAM you have, the less you need it.  In addition, having a pagefile is worthless because all it does is add processing time, albeit relatively insignificant in the long run.  The only purpose of the pagefile is either to provide a repository for the blue screen of death dump or to provide virtual memory (disk space) in case an application has to be unloaded (low resources), which almost never occurs at or above 2 GB of RAM and never occurs at or above 4 GB of RAM.  At the same time, this is not the source of your problem.  It's just one of the things that you can do to improve Windows performance.

2.  You have some application or utility running in the background that is interfering with DNS's access to the RPC server.  That is, DNS makes extensive use of the RPC server for the execution of its various ActiveX controls (COM programs).  If there is something else in the background that also makes extensive use of the RPC server at the same time, DNS can be pushed lower in the queue, and hence increased latency.  Not many applications and/or utilities actually interfere with DNS in this manner, but I suspect this is the major cause of your problem.

Keep in mind that DNS is resource intensive and anything that has either a higher priority or current focus that also makes intensive use of the system resources will impede DNS's performance.  You may have to disable these one at a time to figure out which one(s) is/are interfering.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong, it usually turns out to be impossible to get at or repair.- Douglas Adams (1952 - 2001)



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/12/2009 06:29 PM
Dana
Top-Tier Member

Posts: 1411
Joined: 10/01/2006

Chuck:

If you right-click, select Save, then paste it into a blank Word document set to Landscape with .25 margins all around - it is vividly displayed - large and clear!

             Dana



-------------------------

Dana Joan - Vero Beach, FL  -  DMPE, Version 2.2; Oncology Large Vocabulary; Windows 7.1 (on local computer); Sennheiser MD 431 II mic with the Andrea USB pod; work on a Remote Desktop.

 11/12/2009 11:44 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Dana,

Thank you for the suggestion, but the problem was with Internet Explorer, which I have subsequently fixed.  Of course, I'm assuming that you're referring to my problem with displaying David P.'s graphic link.  Problem resolved.

Nevertheless, thank you for your suggestion.  I appreciate your trying to help.

Old dogs need to be whipped into shape sometimes.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

"Kindness is the language which the deaf can hear and the blind can see." -- Mark Twain



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/12/2009 11:11 AM
monkey8
Top-Tier Member

Posts: 3878
Joined: 01/14/2008

Okay David thanks, so if your theory is correct (and we start with a new session of DNS and a new document)

then you only need to dictate 4 words for the quadgram to come into effect and then it is only effective with right side context on the fourth word.

In order to obtain maximum accuracy (and again starting with a clean sheet so no select and say context is taken into account) you need to dictate seven words and only then will the fourth word have maximum context ie the fourth word will have left side and right side quadgram context.

Please correct me if I misunderstand.

Thanks again, Lindsay



-------------------------

 11/12/2009 11:21 AM
David.P
Top-Tier Member

Posts: 638
Joined: 10/05/2006

Yep Lindsay, that's exactly it -- provided that you change the seventh-to-last word of your second paragraph to "left"

Regards David.P



-------------------------

Sennheiser MKH Mic
Visual & Acoustic Feedback + Automatic Mic Control

 11/12/2009 12:41 PM
monkey8
Top-Tier Member

Posts: 3878
Joined: 01/14/2008

Actually what I meant to say was, in the second paragraph, the "first word" of the utterance,  would have the right quadgram  context and of course the fourth word would have the left quadgram context.
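The 4-and-7 counting can be checked with a short Python sketch. It assumes the picture David describes (context is looked at on both sides of a word, capped at three neighbours per side for a quadgram) and a clean document, so no Select-and-Say text to the left is counted. Whether Dragon really works this way is not confirmed anywhere in this thread; the function names are made up for illustration.

```python
def context_sizes(utterance_length, order=4):
    """For each word position, return (left, right) counts of context words.

    An n-gram of the given order spans the target word plus order-1
    neighbours, so each side is capped at order-1 words.
    """
    cap = order - 1
    return [
        (min(i, cap), min(utterance_length - 1 - i, cap))
        for i in range(utterance_length)
    ]


def fully_covered(utterance_length, order=4):
    """Return the 1-based positions that have full context on both sides."""
    cap = order - 1
    return [
        i + 1
        for i, (left, right) in enumerate(context_sizes(utterance_length, order))
        if left == cap and right == cap
    ]


print(context_sizes(4))   # [(0, 3), (1, 2), (2, 1), (3, 0)]
print(fully_covered(4))   # []
print(fully_covered(7))   # [4]
```

With four words, the first word has a full right-hand quadgram and the fourth a full left-hand one; it takes seven words before any word (the fourth) has both.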

It's also really interesting what you say about select n' say  and I just had to test it:-), so again with a clean document (select n' say  enabled) I dictate the following:

" I would like to introduce Mr"  then I pause, then I say  "right"  and sure enough it gets it bang on and puts in  "Wright", however if I do exactly the same in a non-select and say application like notepad it gets it wrong every time and puts in "right" (okay it's finding the context from the two words in this case) which of course supports what you're saying.

Thanks again, Lindsay

-------------------------

 11/12/2009 12:56 PM
Chucker
Top-Tier Member

Posts: 14123
Joined: 10/10/2006

Lindsay,

In nonstandard windows (i.e., where Select-and-Say is not enabled), I would expect this kind of behavior under most conditions.  In Select-and-Say supported applications, DNS (via SAPI) always has access to all the text visible on the current page.  This is not the case with application windows that don't support Select-and-Say.  Therefore, anything that interferes with DNS being able to link previous word(s) with the most recently dictated one(s) results in the inability to analyze context.  Since nonstandard windows vary in terms of their underlying rich text editors, you may find that it works properly in some, but not in others.  Since I don't use nonstandard Windows applications on a regular basis, I can't verify in which applications context checking might or might not work, but DNS has to be able to access the entire collection of text in order to apply any of the n-gram models.  When it can't, for whatever reason, at least as in your testing and my limited experience with non-Select-and-Say supported application windows, it appears to treat any and all words that it can access as individual words.  Hence, no context checking is applied.

David might be able to better clarify that.  I can't seem to get my thoughts organized today.  I think my Alzheimer's is kicking in.

Chuck Runquist
Technical Project Manager
VoiceTeach LLC

I know that you believe you understand what you think I said, but, I am not sure you realize that what you heard is not what I meant.



-------------------------

VoiceComputer: the only global speech interface.

The views, thoughts and opinions expressed in this post are my own and do not reflect those of VoiceTeach LLC.

Chuck Runquist
VoiceComputer technical support

 11/12/2009 02:00 PM
monkey8
Top-Tier Member

Posts: 3878
Joined: 01/14/2008

Okay thanks Chuck, I guess the behaviour is consistent with the fact that the select and say text is included in the context.

So in your previous posts which were checked by the 'Quote Police' :-), the 5 and 9 that you were referring to should have possibly been 4 & 7 (single and double sided quadgram checking), presuming that you were possibly overlooking including the context checked word itself in the quadgram? Or am I missing something with the 5 (or 9)?

Also while I have your attention, I was reading another post regarding the natural language commands and decided to have another look at them. A couple of questions, presuming you turn off the "Only Available" (thanks Graham didn't know about that either) and it then shows you other applications like America online etc, do you then have a list of all the natural language commands (give or take any that have been overlooked) available with DNS (for all the applications they have written natural language commands for)?

Also, if I use the "Go to inbox" natural language command with Microsoft Office Outlook 2007, it closes Microsoft Outlook. I am using Windows 7 64-bit (which I think you have somewhere on one of your systems). Do you get the same behaviour?

Thanks again, Lindsay



-------------------------

 11/08/2009 01:59 AM


Jomark
Top-Tier Member

Posts: 1505
Joined: 10/19/2006

Before creating a new user, why not export your custom words and then import them into your new user?

I periodically export my custom words as a precaution against losing them and likewise with my custom commands.

I also create a copy of my user folder from time to time and that has saved me on a number of occasions.

I also save my user every time that I close DNS down.

I had problems running the optimisers in the past as it corrupted my user but I ran it recently and so far it has not created a problem.

I suppose it is a matter of taking precautions like regularly backing up your harddrive.
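For the periodic copy of the user folder, a minimal Python sketch could look like the one below. The folder locations are parameters because where your user profile lives depends on your installation; the function name is made up for illustration, and it is probably safest to run it while Dragon is closed.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_user_folder(user_folder, backup_root):
    """Copy user_folder into backup_root under a timestamped name.

    Returns the path of the new copy. Nothing is ever overwritten,
    because each run gets its own date-and-time suffix.
    """
    source = Path(user_folder)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # creates backup_root if it is missing
    return target
```

Keeping several dated copies rather than one means that if a profile turns out to have been corrupted some time ago, there is still an older copy to go back to.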

So often people who should know better find out the hard way when their system crashes or suffers a lightning strike and lose everything! 

I use "correct that" or "Correct" using KB after dictating phrases to make corrections. Most of the time it picks up the correct phrase, and when it doesn't I select and dictate over it again.



-------------------------

Jomark


 


DPI 15.61, KB2017, SpeechStart+, MS Office 2019 Professional, Windows 10 Pro
