Voice recognition in healthcare

By Dale Vile

I first discovered dictation software back in 2001 when I ended up with my arm in a sling following an accident. With some heavy writing commitments, I was desperate for something to help, and a friend recommended Dragon NaturallySpeaking.

My initial experience with the software at that time was quite tedious. Back then, with version 5 of Dragon, you had to spend hours training it to achieve an acceptable level of accuracy.

Over the years, however, the software has changed hands a number of times and along the way has progressed significantly. Now it’s owned by Nuance and I’m currently using version 12, which provides reasonable results straight out of the box, and a good level of accuracy after just a few minutes of verbal training. The next level of fidelity can then be achieved by pointing Dragon at documents you have authored previously and emails in your ‘Sent Items’ folder, so it can ‘learn’ your vocabulary and the way you write.

So if it’s that good, an obvious question is why dictation software isn’t used more widely in the mainstream, and why those who have tried it have so often given up.

There are a couple of main reasons for this.

As I discussed in an article back in 2006, what most people don’t realise is that dictation is a skill that does not come naturally and has to be learned. When they don’t get the quick results they expected, they put it down to a failing of the software. The real problem, however, is usually that they haven’t given themselves enough time to get the hang of forming properly constructed phrases in their head before speaking them into the document. There is a big difference between composing a sentence iteratively using a keyboard and dictating it.

Beyond skill set considerations, there is then the question of environment. It’s all very well speaking into a machine when no one else is around, but it can be an issue when others are within earshot. It’s bad enough having to listen to other people’s telephone conversations in open-plan offices and on public transport without adding further intrusion. The potential for privacy to be compromised and for confidential business information to be leaked is then a concern from a risk perspective. Such factors limit the number of use cases in which the dictation approach is viable.

Having said this, dictation software has been thriving in niches where such challenges are less of an issue, e.g. in the legal profession and healthcare. Common to both of these is the historical use of more traditional forms of dictation, e.g. solicitors dictating letters for an assistant to type up and send to a client, and consultants dictating patient notes to be later transcribed by a medical secretary.

Nuance has supported the natural affinity for dictation in these sectors by providing versions of Dragon NaturallySpeaking that come out of the box pre-programmed with a comprehensive knowledge of the relevant vocabulary. Specialist partners with appropriate vertical knowledge then take care of user training and any integration work necessary to embed dictation capability into the client’s application infrastructure and processes.

Picking up on the use of dictation software in healthcare in particular, I didn’t realise how transformational this could be until I attended a recent event on the topic hosted by Nuance. Some interesting case studies were presented, including a GP who was using the solution with his colleagues in a primary care setting.

One of the obvious benefits that comes through strongly in this context is the boost to efficiency. With the dictation solution in place, doctors are able to update online patient records and produce letters and referrals directly and rapidly, even if their typing skills are limited. This frees up valuable back office resource that can be redeployed to work proactively with patients on preventative programmes, reminders and so on, which boosts the quality of care at an overall practice level.

In addition to the efficiency gains, a valuable spinoff benefit of GPs dictating straight into a computer system is that the notes generated are generally more comprehensive, with the ‘patient narrative’ captured much more faithfully. This gives rise to another tangible difference to the ongoing quality of care as details such as the absence of symptoms and negative responses to questions are more likely to be recorded in the patient history. When someone in the future then wants to determine if something was checked or asked in a previous examination, the record will be there explicitly.

This process is further enhanced if the notes from a consultation are dictated into the system while the patient is still sitting there. Experience shows that when the patient is listening and potentially even checking the words being dictated as they appear on the screen, the completeness and accuracy of records are doubly assured. Thinking back to the problem of potential intrusion, this is an interesting example of when dictating with other people around is a positive rather than a negative.

Other case studies presented included the use of dictation in a histology lab, demonstrating how voice recognition can be applied in a more process-oriented environment. This emphasised that the technology can be used to populate coded forms in a workflow context, as well as for free text entry. You could easily see how this might be useful in sectors other than healthcare.
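To make the idea of populating coded forms a little more concrete, here is a minimal Python sketch of how recognised phrases might be mapped onto coded fields, with anything unrecognised kept as free text. It is purely illustrative: the phrase table, the field names and the codes (such as SPEC-01) are invented for this example, and it says nothing about how Dragon or any real lab system does the job.

```python
import re

# Hypothetical lookup: spoken phrase -> (form field, coded value).
# These codes are invented for illustration, not a real coding scheme.
PHRASE_CODES = {
    "specimen type skin": ("specimen_type", "SPEC-01"),
    "specimen type liver": ("specimen_type", "SPEC-02"),
    "stain h and e": ("stain", "STN-HE"),
    "margins clear": ("margins", "MRG-CLEAR"),
}


def fill_form(transcript: str) -> dict:
    """Map recognised phrases in a transcript to coded form fields.

    Anything that does not match a known phrase is kept as free text
    in the "notes" field.
    """
    form = {}
    free_text = []
    # Treat each comma- or full-stop-separated chunk as one spoken item.
    for chunk in re.split(r"[,.]", transcript.lower()):
        phrase = " ".join(chunk.split())  # normalise whitespace
        if not phrase:
            continue
        if phrase in PHRASE_CODES:
            field, code = PHRASE_CODES[phrase]
            form[field] = code
        else:
            free_text.append(phrase)
    form["notes"] = "; ".join(free_text)
    return form


if __name__ == "__main__":
    print(fill_form("Specimen type skin, stain H and E. Slightly irregular border"))
```

The point of the sketch is simply that the same stream of recognised words can feed both structured fields and narrative notes, which is what makes the approach plausible outside healthcare too.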

Nuance also demonstrated a proof of concept system called “Florence” (the name of a virtual assistant in the form of a speaking avatar), as a way of showing what could be achieved if you bring together voice recognition with search capability and rules-based access to reference data. The difference between this and the form-filling workflow example is the high level of interactivity. If a doctor attempts to prescribe a dose of a drug outside of the expected limits (based on the patient’s age, sex, condition and so on), Florence warns the doctor verbally in case it was a mistake. It’s a bit like Siri on steroids, if you’ll forgive the pun.
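For readers who like to see the logic spelled out, the kind of rules-based check described above could be sketched in Python along the following lines. This is an assumption-laden toy, not a description of how Florence works: the drug name "examplamol", the age bands and the dose limits are all made up, and a real system would draw its limits from a proper clinical reference source and take sex, condition and other factors into account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DoseRule:
    """One hypothetical reference-data entry: expected dose range for an age band."""
    drug: str
    min_age: int
    max_age: int
    min_mg: float
    max_mg: float


# Invented reference data for a fictional drug; not clinical guidance.
RULES = [
    DoseRule("examplamol", 0, 11, 50.0, 250.0),
    DoseRule("examplamol", 12, 120, 250.0, 1000.0),
]


def check_dose(drug: str, age: int, dose_mg: float, rules=RULES) -> Optional[str]:
    """Return a warning message if the dose looks wrong, or None if it is in range."""
    for rule in rules:
        if rule.drug == drug.lower() and rule.min_age <= age <= rule.max_age:
            if rule.min_mg <= dose_mg <= rule.max_mg:
                return None  # within expected limits, nothing to say
            return (
                f"Warning: {dose_mg:g} mg of {drug} is outside the expected "
                f"range of {rule.min_mg:g} to {rule.max_mg:g} mg for age {age}."
            )
    # No matching rule: flag it rather than silently accept the dose.
    return f"Warning: no dosing rule found for {drug} at age {age}."


if __name__ == "__main__":
    print(check_dose("examplamol", 8, 500))   # out of range for a child
    print(check_dose("examplamol", 30, 500))  # in range, prints None
```

In an interactive assistant, the returned message would be handed to speech synthesis rather than printed, so the doctor hears the warning at the moment of dictation.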

In technology terms, Florence is based on a natural language integration platform with the relevant APIs for tapping into pretty much any reference data source in a highly interactive manner. When you see this type of system in action, it really brings home the potential of voice recognition and synthesis in an everyday context. As a long-time user of dictation software, I find it all very interesting and inspiring.

In the meantime, if writing is a big part of your own life, I would highly recommend learning how to use dictation software. Acquire the dictation skill and the habit quickly follows, as does a significant boost to productivity. You just need a little patience at the outset.



Dale is a co-founder of Freeform Dynamics, and today runs the company. As part of this, he oversees the organisation’s industry coverage and research agenda, which tracks technology trends and developments, along with IT-related buying behaviour among mainstream enterprises, SMBs and public sector organisations.