NLM's Value Set Authority Center

Monday, 28 December 2015

The Latest in a Long Line of Terminology Initiatives

Creation, management, and deployment of controlled vocabulary value sets have become an important part of what we do here at Apelon. In this post, I'll review the history behind value sets' turn in the health information technology spotlight and describe the NLM's VSAC, a great and growing resource for caregivers who use controlled vocabularies in applications and for the developers and terminologists working behind the scenes.

Background and Ancient History

The US National Library of Medicine (NLM) has played a central role in the electronic distribution of controlled medical vocabulary since, well, since there was such a thing. What started as an internal project, indexing medical literature with the Medical Subject Headings (MeSH), morphed in the 1980s into what eventually became the Unified Medical Language System (UMLS). For years, the UMLS was helped along by people and technologies from Apelon and our predecessor company, Lexical Technology. More recently, the NLM has brought more and more of the UMLS production in-house and built its own software. Other US Federal Agencies, most notably the National Cancer Institute, emulated and expanded the ideas behind the UMLS.

For the most part, NLM's vocabulary efforts have stayed away from the creation of original content... we commonly have to correct our customers who think that UMLS's most visible artifact, the Metathesaurus, is the vocabulary they should use for whatever purpose. Instead, the Metathesaurus is a collection of vocabularies that NLM has organized and cross-referenced. It's an important difference: you can think of the Metathesaurus like a Google search: it helps you find what you're looking for, but you still need to follow those Google links and suggestions to retrieve the resources that show up in your search results. If you think about it, NLM's strategy is clearly in keeping with the mission of a library to collect, organize, and distribute, but not so much to create new works.

In the early days of the UMLS, its primary audience consisted mostly of full-time vocabulary geeks (like me!). Whether their role was to create a vocabulary from scratch or to customize an existing one, those early UMLS users tended to be down in the weeds, arguing about the best practices for vocabulary creation and maintenance. But things are changing. Today's audience for controlled vocabulary topics consists mostly of implementers: people who expect the controlled vocabulary to already exist and just want to put it to use. Why? Spurred on by Meaningful Use incentives and technological change, most health providers use an electronic medical record (EMR) these days. When you're trying to enter data quickly, a controlled vocabulary goes from being an arcane informatics exercise to a daily necessity. It was true in the paper era as well: think of the superbill forms that many clinicians used as their primary -- or only -- method of documenting routine encounters. But in the world of the EMR, there's so much more you can do. Instead of forcing every encounter into one of the 50 or so codes that fit on a page, you can now use thousands of codes, with type-ahead completion. Your software can now implement logic to guide your documentation choices... a different set of "most common diagnoses" should show up in the dropdown for male versus female patients. And that's just the trivial beginning... EMRs that harness structured data can revolutionize the way we document the care we provide. And it's so much easier to share electronic data for care coordination and, as we move into the era of big data analytics, care quality improvement. Again, think of Google: if you've ever been surprised at how relevant a targeted ad seemed to be, just consider how much value there will be in targeted, personalized medicine. Of course there are security and economic questions still unanswered... but the potential benefits are enormous.

Anyway... all these implementers want to use the controlled vocabularies in their daily work. Great! But here's the problem: nobody ever wants the entire thing. A vocabulary like SNOMED CT has more entries than the Oxford English Dictionary (really!), but the average clinician might only use 20-50 billing codes per year. And remember, SNOMED CT is just one of more than 100 vocabularies available in the UMLS. Just like you don't need every word in the dictionary to write a thank-you letter, you don't need every code in SNOMED to document the earaches and appendectomies and pregnant mom checkups you manage every day. It's great that all those other codes are out there... but you just don't want to wade through them all every single time you open your EMR.

Rise of the Value Sets

That's where value sets come in. A value set is just a defined collection of codes from a controlled vocabulary. Value sets range from very small collections (gender codes, pressure ulcer stages) to very large ones (surgical procedures, drugs on formulary). Almost any time you want to do something "real" with a controlled medical vocabulary, you are most likely interested in a value set. One particular area where value sets play an important role is in the calculation of quality measures. The measure will make a statement like "All diabetic patients are to be included in this measure except those with a co-occurring mental health diagnosis." Well, how do you define "diabetic patients" and "mental health diagnosis"? At a high level, of course we all know what those phrases mean, but at the level of judging whether the correct care was provided in a particular case, how do we unambiguously decide which patients are in that set? The answer, of course, is with a value set. You can look at a vocabulary like SNOMED CT or ICD-10-CM and list out all the codes associated with the idea of being a diabetic patient, for instance. And that process of determining which codes are included and excluded can result in important adjustments to the measure language. For instance, we'd all agree that a diagnosis of "Diabetes, Type 1" puts a patient in the group. But what about "Gestational diabetes" or "Diabetes mellitus in remission"? And while it's pretty straightforward to arrive at a clinical consensus on what a measure is trying to achieve for a condition like diabetes, it can be very challenging to agree on the boundaries between, for instance "Mild depression" and "Dysthymia", and whether or not those codes represent an appropriate exclusion criterion. If every implementer had to make those decisions on their own, nobody would ever measure quality in a useful or comparable way. 
But if all that discussion could happen just once, as part of the measure development process, then every implementer in the country... in the world... could implement the measure the same way.
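
To make the idea concrete, here's a minimal sketch of how an implementer might apply the "diabetic patients except those with a mental health diagnosis" rule once the value sets exist. Everything in it is an assumption for illustration: the code system name, the codes, the patient records, and the function names are made-up placeholders, not real SNOMED CT or ICD-10-CM content and not anything published by a measure developer.

```python
# Hypothetical value sets: each member is a (code system, code) pair.
# The codes are made-up placeholders, not real SNOMED CT or ICD-10-CM codes.
DIABETES = frozenset({
    ("EXAMPLE-VOCAB", "D-001"),  # "Diabetes, Type 1"
    ("EXAMPLE-VOCAB", "D-002"),  # "Diabetes, Type 2"
})
MENTAL_HEALTH = frozenset({
    ("EXAMPLE-VOCAB", "M-001"),  # "Mild depression"
})


def has_code_in(patient_codes, value_set):
    """Return True if any of the patient's coded diagnoses is in the value set."""
    return any(code in value_set for code in patient_codes)


def in_measure_population(patient_codes):
    """Include diabetic patients, except those with a mental health diagnosis."""
    return (has_code_in(patient_codes, DIABETES)
            and not has_code_in(patient_codes, MENTAL_HEALTH))


# Hypothetical patients, each with a list of coded diagnoses.
patients = {
    "patient-a": [("EXAMPLE-VOCAB", "D-001")],
    "patient-b": [("EXAMPLE-VOCAB", "D-002"), ("EXAMPLE-VOCAB", "M-001")],
    "patient-c": [("EXAMPLE-VOCAB", "X-999")],
}
included = sorted(pid for pid, codes in patients.items()
                  if in_measure_population(codes))
print(included)
```

The point of the sketch is that the measure logic itself is trivial. All the hard clinical decisions live in which codes belong to DIABETES and MENTAL_HEALTH, which is exactly the part that should be decided once and shared.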

Standards organizations like Health Level 7 (HL7) recognized the importance of value sets a long time ago, and most HL7 message and document standards describe how a value set should be used rather than a whole vocabulary. The Meaningful Use requirements increased the importance of standardized reporting of quality measures based on value sets, and required the use of new vocabularies, often much more complex than the ICD-9-CM codes most institutions were familiar with. In order to ensure easy and consistent access to the value sets used in these measures, NLM created the Value Set Authority Center (VSAC) in late 2012. 


You need a free NLM UTS license to enter the VSAC Web site. Once there, most users will see two main tabs, "Search Value Sets" and "Download." The Search tab allows a user to search for a value set by name, ID, included codes, and keywords. You can also select a quality measure or a vocabulary to see all the associated value sets. There are more than 3,000 separate value sets available, drawn from more than a dozen stand-alone vocabularies and two versions of the HL7 standard. The Download tab lets you download the value sets organized in various ways, and provides the materials either as an Excel workbook or in text and XML formats compatible with the Sharing Value Sets (SVS) standard. Value set authors and stewards also have a third tab available for authoring and uploading their value sets to the collection.
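
Once you have a value set in XML form, loading it into an application is a small job. The sketch below parses a simplified, made-up XML fragment with Python's standard library. The element and attribute names are loosely modeled on the SVS style, but they are assumptions for illustration: a real VSAC download uses XML namespaces and carries more metadata, so check the actual file before relying on any of these names. The value set ID, codes, and the load_value_set helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical XML. Real SVS-compatible files use namespaces
# and additional attributes; the names here are illustrative assumptions.
SAMPLE_XML = """\
<ValueSet id="0.0.0.example" displayName="Example Diabetes">
  <ConceptList>
    <Concept code="D-001" displayName="Diabetes, Type 1" codeSystemName="EXAMPLE-VOCAB"/>
    <Concept code="D-002" displayName="Diabetes, Type 2" codeSystemName="EXAMPLE-VOCAB"/>
  </ConceptList>
</ValueSet>
"""


def load_value_set(xml_text):
    """Return (value set name, {(code system, code): display name})."""
    root = ET.fromstring(xml_text)
    concepts = {
        (c.get("codeSystemName"), c.get("code")): c.get("displayName")
        for c in root.iter("Concept")
    }
    return root.get("displayName"), concepts


name, concepts = load_value_set(SAMPLE_XML)
print(name, len(concepts))
for (system, code), display in sorted(concepts.items()):
    print(system, code, display)
```

Keying each concept on the pair of code system and code, rather than the code alone, matters because value sets can draw from more than one vocabulary and the same code string may appear in several of them.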

Over the three years of its existence, VSAC has added an impressive number of features and greatly expanded its content. The most recent announcement from NLM regarding VSAC was just last week. The new VSAC Collaboration tool "supports communication, knowledge management and document management by value set authors and stewards." With threaded discussions and other features, this tool should encourage the various quality measure developers and other stakeholders to harmonize their efforts and foster a community of best practices around value set creation and maintenance. 

VSAC is only the latest in a long line of initiatives at the NLM and elsewhere to empower controlled medical vocabulary usage in electronic health records. And, with VSAC Collaboration, it's back to the future: the value set authors are down in the weeds working on all the little details that will make tomorrow's EMRs better and easier to use, just like the vocabulary builders who first built the UMLS community more than 25 years ago. 

About the Author

John Carter