KnowBrainer Speech Recognition |
Topic Title: Merits of Vocola vs Talon vs Caster vs Serenade etc
Created On: 07/26/2020 01:21 AM |
|
- nreith | - 07/26/2020 01:21 AM |
- Lunis Orcutt | - 07/26/2020 01:12 PM |
- nreith | - 07/26/2020 04:19 PM |
- quintijn | - 07/27/2020 02:44 PM |
- quintijn | - 07/27/2020 02:45 PM |
- haughki | - 07/27/2020 03:39 PM |
- kkkwj | - 07/28/2020 03:29 PM |
- alexander | - 07/29/2020 01:12 AM |
- quintijn | - 07/30/2020 02:22 PM |
- alexander | - 08/10/2020 04:06 PM |
- quintijn | - 08/15/2020 08:03 AM |
- alexander | - 08/18/2020 06:14 PM |
|
As a newbie looking to code by voice, it seems there are now quite a few options for voice programming. I'm using Windows 10 out of necessity (work requirements) and Dragon 15 Pro because I happen to own it. Given this setup, what are the pros and cons of each of the main options you use? Also, bonus points if there is a setup that works great for Linux, though I'm primarily stuck with Windows. I do work in an Ubuntu VM. |
|
|
|
|
• Vocola is very robust and free. However, there is also a learning curve.
• KnowBrainer VerbalBasic is significantly easier, includes lots of support and the patented ability to verbally create Visual Basic commands. KnowBrainer 2017 (w/2020 AI Commands) includes a few thousand ready-to-go macros, which you can also view and edit. If you think of the built-in commands as examples, you will find a few thousand.
• AutoHotkey is a popular free keyboard macro utility that can be triggered by Dragon or KnowBrainer simple hotkey commands.
• Macro Express is similar to, but less popular than, AutoHotkey.
Note that you will find a lot of support for Vocola and KnowBrainer here and on the dedicated Vocola forum. |
|
|
|
|
Thanks! Does anyone have any experience with the others?
|
|
|
|
|
Also have a look at Unimacro, with Natlink. But I think for Linux users there is a lot to find with Dragonfly and Aenea. Not sure, as I do not use them.
|
|
|
|
|
Caster, built on top of Dragonfly, is very popular and fine-tuned for lots of applications, if I am not mistaken.
|
|
|
|
|
Hi, I have been using Dragonfly + Dragon on Windows for many years for coding and general automation. Caster is built on Dragonfly. I have not used Caster, but I hear great things about it, and it is under very active development, which is of course a good sign. If I were going to start from scratch, that's probably where I would start. Caster was built _for_ voice coding. It comes with a fully defined command set, so you don't have to build/borrow everything from scratch. I mean: if you start with something like Dragonfly or Vocola, some commands will already be predefined for you, but there will be many more you will either have to write yourself or find somewhere else online. For Dragonfly, for example, I know you can find many grammar modules in various GitHub repositories, but those are not necessarily easy to find, and you will probably have to tweak them to your own needs.
If you have any Python experience, Dragonfly plus Caster would make sense for that reason alone. Also note Aenea (https://github.com/dictation-toolbox/aenea), in case you think you might one day move to Mac/Linux. I've never used it, but I am glad to know it's there. I assume a native solution would be faster, but so far, I think Dragon/Dragonfly/Aenea is the most "viable" cross-platform solution (although it looks like Talon and Serenade are trying).
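To make the idea of a "command set" concrete, here is a minimal plain-Python sketch of what such a set boils down to: spoken patterns mapped to the keys or text they should produce. This is not the Dragonfly or Caster API; the command phrases, the `<n>`/`<text>` placeholders and the names `COMMANDS`, `compile_spec` and `interpret` are all made up for illustration, and a real setup would get the recognized words from the speech engine rather than from a string.

```python
import re

# Hypothetical command set: spoken pattern -> function returning what to emit.
# "<n>" stands for a spoken number (as digits), "<text>" for free dictation.
COMMANDS = {
    "save file": lambda: "ctrl-s",
    "go to line <n>": lambda n: f"ctrl-g {n} enter",
    "define function <text>": lambda text: "def " + "_".join(text.split()) + "():",
}


def compile_spec(spec):
    """Turn a spoken pattern into a regular expression with named groups."""
    parts = []
    for word in spec.split():
        if word == "<n>":
            parts.append(r"(?P<n>\d+)")
        elif word == "<text>":
            parts.append(r"(?P<text>.+)")
        else:
            parts.append(re.escape(word))
    return re.compile(r"\s+".join(parts))


RULES = [(compile_spec(spec), action) for spec, action in COMMANDS.items()]


def interpret(utterance):
    """Return the output of the first matching command, or None if none match."""
    cleaned = utterance.strip().lower()
    for pattern, action in RULES:
        match = pattern.fullmatch(cleaned)
        if match:
            return action(**match.groupdict())
    return None


print(interpret("go to line 42"))
print(interpret("define function parse user input"))
```

A ready-made command set saves you from writing and tuning hundreds of entries like these yourself; with a bare framework you start from a much shorter table.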
Vocola: There are many other people here who have a lot of experience with Vocola; I've only used it a little. It is a very powerful, well-supported voice automation language/system. I _think_ the language is "custom" to Vocola -- someone correct me if I'm wrong.
Talon: I've never used it, and I don't know much about the status of the effort. Originally, I think it relied on Dragon for OS X, which was discontinued. Looks like they now have a beta for Windows? I'd be curious to know what the underlying recognition engine is: does it use Dragon or something else?
Serenade: I hadn't heard of it. I just downloaded it and tried to get through the Python tutorial -- I only made it a few steps into the tutorial before it "didn't work" -- I think it was recognizing the commands correctly, just inserting some strange characters -- not the correct code. Maybe a bug, maybe something with my environment, who knows. There was also a fair bit of delay between the command recognition and inserting the actual code into the editor. A 10 minute evaluation, grain of salt, but I'm guessing this isn't quite ready for "prime time". That said, it looks really promising, and I hope they are able to create a great product out of it. But, if I were starting out, I would not invest in it yet: it's very young, and it looks to me like it's closed source (?), so I think it could just disappear at any moment at this point. Also, it looks like the free version is using a "cloud" voice recognition engine -- that says latency to me.
One thing to try and figure out with any of these products is: what is the underlying speech recognition engine? As far as I know, Dragon NaturallySpeaking is still the best voice recognition engine in terms of speed and quality. I DEFINITELY may be wrong here: things are changing very quickly in this field, and I'm not up to speed with the latest. Aside from Dragon and WSR, I have no idea what's available, or whether it's any good. If you're going to be depending a lot on voice recognition, you will very quickly want things to be more accurate and faster, so understanding the capabilities/quality of the underlying recognition engine is critical.
Good luck! |
|
|
|
|
Talon has a full-time guy working on it (and he's very smart and is making good progress). It ships with the free Kaldi speech engine, which is not as good as Dragon.
------------------------- Win10/x64, AMD Ryzen 7 3700X, 64GB RAM, Dragon 15.3, SP 6 PRO, SpeechStart, Office 365, KB 2017, Dragon Capture, Samson Meteor USB Desk Mic, Klim and JUKSTG earbuds with microphones |
|
|
|
|
Vocola is a very elegant, simple language to learn and is specifically written for voice commands and programming. It also has an extension mechanism that lets you write, for the most part, any Python code you wish. That said, as of this moment it only works on Windows. What I usually do is SSH, xterm, or VNC into a Linux box (which of course could be anywhere, even on the same machine as a VM; if you do use a VM, VirtualBox seems to handle direct keystrokes a bit better than VMware, but I digress). This is my daily driver and has been for a number of years. It's very simple to continuously change and update, with that ability to go into Python for extensions.
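For anyone wondering what a Python extension might look like, here is a rough sketch. It assumes the Vocola 2 convention, as I understand it, of a module named `vocola_ext_<name>.py` whose functions are announced with a `# Vocola function:` comment and which take and return strings; please check the Vocola documentation before relying on that. The module name and the `Case.Snake` / `Case.Camel` names are hypothetical.

```python
# Hypothetical file name: vocola_ext_case.py
# Assumption: Vocola 2 discovers extension functions through the special
# "# Vocola function: ..." comments below; verify against the Vocola docs.
import re


# Vocola function: Case.Snake
def to_snake(text):
    """Turn dictated words into a snake_case identifier."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "_".join(word.lower() for word in words)


# Vocola function: Case.Camel
def to_camel(text):
    """Turn dictated words into a camelCase identifier."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not words:
        return ""
    return words[0].lower() + "".join(word.capitalize() for word in words[1:])


print(to_snake("parse user input"))
print(to_camel("parse user input"))
```

Under the same assumption, a Vocola command along the lines of `snake <_anything> = Case.Snake($1);` would then type the dictated words as a single identifier. The functions are plain Python, so they can be tested outside Dragon first.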
With regards to Talon, I believe their speech engine is a specially trained/fine-tuned version of wav2letter. Some elements of that system are open source and some are closed source, and to get the latest and greatest you have to be part of the Patreon group/beta. My understanding is that the developer's working on adding Windows and Linux support, but currently the version available is for Mac, and there are some issues there due to Apple's recent change to the dictation engine and the deprecation of Dragon Dictate. That said, it has a strong following and active Slack channels.
FYI, this article is a bit dated but has some good information about the different open-source options: https://explosionduck.com/wp/tag/voice-programming/ (see the second or third article in the series).
Dragonfly, as far as I can tell, seems to be the core and possibly the most flexible system. Caster is built on top of it, as is Aenea (a client/server that lets you send commands to Mac or Linux). Dragonfly can use Dragon as a speech recognition engine, but has recently integrated Kaldi, which is open source and works directly on Linux and Mac (although that's still in development). |
|
|
|
|
When you look at Dragon only (and Windows), Natlink offers Dragonfly, which has many possibilities and extensions (Caster, Mathfly), as Alexander and others point out. Unimacro and "raw" user-defined grammars, written in the way Joel Gould originally presented Natlink to us, are even more flexible, as they exploit all the possibilities of natlink.pyd, the glue between Dragon and the Python grammars.
Dragonfly has other "backends" too, which give many possibilities to non-Dragon, non-Windows users. Although... many Linux users run Dragon in a Windows virtual machine together with the Aenea package (based upon Dragonfly, if I am correct), so Dragon still seems to be the favourite speech recognition program, also for Linux users. The Natlink developers are working on a scheme that clarifies the differences and connections between the different packages. |
|
|
|
|
Quintijn, just curious: what are the things that Natlink supports that Dragonfly does not?
|
|
|
|
|
Hi Alexander, here is my (not completely checked) answer...
As far as I can see, DictObj, DictGramBase and SelectGramBase are not included in Dragonfly. These are needed when you want to implement full dictation into applications or, for example, catch global dictation and put the result into a specific target window. They are not often used in practice, but they are very powerful features of Natlink. I think Dragonfly also exposes all information of the results object (resObj), but I am not sure. Things related to word properties (of words in your vocabulary) have not been tested in Natlink for a long time, but are potentially present. I think they are not included in Dragonfly. I think that is about it...
For most users Dragonfly uses enough of Natlink's facilities; for completeness, "raw" Natlink grammars remain very important. IMHO, for grammars that are ready to use, Unimacro offers quite a bit too, and for making your custom commands, Vocola is still a very valuable tool. So, as long as you work on Windows with Dragon, you should benefit from Dragonfly (including its sub-packages like Caster), Unimacro and Vocola. I think delving into Dragonfly is a bit more for programmers; using Vocola and Unimacro is a better fit for non-programmers.
In the meantime, several people are working on a release of Natlink with Python 3, and on making the documentation more accessible, including more overview and comparisons between the different packages.
Greetings, Quintijn |
|
|
|
|
Thanks for the detailed answer, Quintijn. As a programmer I will say that Vocola is also for programmers :-) especially since the individual who maintains Vocola is also a hard-core programmer. The one thing that has me looking toward Dragonfly at the moment is macOS support, which Dragonfly/Caster provide with the help of the Kaldi engine. That said, the state of the voice programming world today is rather exciting, with many options and different platforms. If I remember correctly, Mark created a tool called Vortex that plugged into Natlink's full dictation capabilities. I never got to try that out, but it seemed rather impressive.
|
|
|