Harnessing the hidden power of EHRs to transform thyroid cancer care

May 03, 2025

The rising incidence of thyroid cancer over recent decades illustrates a paradox of modern medicine. Increasingly sophisticated diagnostic tools have inadvertently fueled an epidemic of overdiagnosis — particularly of small, indolent papillary thyroid cancers (PTCs) that might never threaten a patient's life.

Despite this, mortality rates remain unchanged, and treatment strategies vary widely, influenced by physician attitudes and local practice environments. Although the American Thyroid Association's 2015 guidelines and risk stratification system have been crucial steps toward de-escalation and personalized care, their broad risk categories, along with their reliance on small or single-center studies and on claims-based datasets with limited granularity, highlight a challenge: the inability to harness the rich data hidden within electronic health records (EHRs). But what if physicians could tap into this resource to rewrite the narrative of thyroid cancer care?

The answer may lie in artificial intelligence, specifically natural language processing (NLP). Every day, clinicians document detailed observations in EHRs, from nuanced pathology reports and descriptive operative notes to vigilant surveillance updates. Yet as much as 80% of this valuable information remains locked in unstructured text, invisible to traditional analytics.

David Toro Tobon, M.D., Endocrinology at Mayo Clinic in Rochester, Minnesota, explains, "We are pioneering the use of NLP to liberate this 'dark data' and reshape thyroid cancer management." As described in a 2024 study published in Endocrine Practice, an algorithm has been developed that can review surgical pathology reports, extract key features and assign a recurrence risk classification at scale with near-perfect accuracy. In a 2024 study published in Mayo Clinic Proceedings: Digital Health, NLP was applied to assess the appropriateness of thyroid ultrasound orders against clinical guidelines, a tool now being used to investigate patterns of overuse that may contribute to overdiagnosis. These achievements are not merely technical milestones; they demonstrate that we can decode the hidden layers of clinical data to illuminate better care pathways for patients.
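To make the pathology-report step concrete, here is a minimal sketch of rule-based feature extraction followed by a tiered classification. It is not the algorithm from the Endocrine Practice study, whose design is not described in this article. The regular expressions, the feature names, the sample report and the simplified three-tier rules are all hypothetical, chosen only for illustration, and are not a substitute for the full ATA criteria or for clinical judgment.

```python
import re

# Hypothetical patterns; real pathology reports vary far more than this.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE)
NEGATION_RE = re.compile(r"\b(no|not|without|absent|negative for)\b", re.IGNORECASE)


def mentions(sentences, phrase):
    """Return True if any sentence contains `phrase` with no negation cue."""
    for sentence in sentences:
        if phrase in sentence.lower() and not NEGATION_RE.search(sentence):
            return True
    return False


def extract_features(report):
    """Pull a few illustrative features out of free-text report content."""
    # Split after sentence-ending punctuation followed by whitespace, so
    # decimals such as "1.2 cm" stay intact.
    sentences = [s.strip() for s in re.split(r"(?<=[.;])\s+|\n+", report) if s.strip()]
    sizes = [float(value) for value in SIZE_RE.findall(report)]
    return {
        # Naive assumption: the largest "cm" measurement is the tumor size.
        "largest_dimension_cm": max(sizes) if sizes else None,
        "gross_extension": mentions(sentences, "gross extrathyroidal extension"),
        "vascular_invasion": mentions(sentences, "vascular invasion"),
    }


def classify(features):
    """Toy three-tier rule set (an assumption, not the ATA system itself)."""
    if features["gross_extension"]:
        return "high"
    if features["vascular_invasion"]:
        return "intermediate"
    return "low"


sample_report = (
    "Papillary thyroid carcinoma, 1.2 cm. "
    "Vascular invasion is present. "
    "No gross extrathyroidal extension identified."
)
sample_features = extract_features(sample_report)
sample_tier = classify(sample_features)
```

Even this toy version shows why sentence-level negation handling matters: "No gross extrathyroidal extension identified" must not be read as a positive finding. Production systems typically need far richer handling of negation, uncertainty and report structure than a single keyword list.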

NLP brings within reach a future in which every patient's journey, from the initial discovery of a suspicious nodule through decades of follow-up, is captured in a dynamic, learning registry. The goal is to build large, highly detailed multicenter virtual registries of structured datasets, overcoming the limitations of time-consuming, costly human annotation and single-institution studies. With these robust datasets, precise AI-driven predictive models can be developed that empower patients with personalized risk profiles and help deepen their understanding of the complex patterns of thyroid cancer.

Dr. Toro Tobon says, "Yet for all its promise, this revolution demands caution. EHR data is inherently messy, a product of human variability in documentation, institutional culture and regional language. Biases embedded in historical practice patterns can perpetuate disparities if left unchecked. And as we entrust machines with data extraction, we must never abdicate ethical responsibility. Patient privacy, algorithmic transparency and equity must underpin every innovation." The current applications and challenges of this approach are highlighted in a 2023 narrative review published in Thyroid. Dr. Toro Tobon continues, "We rigorously validate our models against clinician-curated datasets to ensure that technology serves the needs of our patients, not the other way around."
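One simple way to picture validation against a clinician-curated dataset is to compare model labels with reference labels case by case. The sketch below computes overall accuracy and per-class recall. The function name, the label values and the example data are hypothetical, and the code does not represent the validation procedure used in the cited studies.

```python
from collections import Counter


def agreement_report(model_labels, reference_labels):
    """Compare model output with clinician-curated reference labels.

    Returns overall accuracy and, for each reference class, the share of
    its cases that the model labeled correctly (recall).
    """
    if len(model_labels) != len(reference_labels):
        raise ValueError("label lists must be the same length")
    if not reference_labels:
        raise ValueError("no cases to compare")

    correct = Counter()
    total = Counter()
    for predicted, truth in zip(model_labels, reference_labels):
        total[truth] += 1
        if predicted == truth:
            correct[truth] += 1

    accuracy = sum(correct.values()) / len(reference_labels)
    recall = {label: correct[label] / total[label] for label in total}
    return accuracy, recall


# Made-up labels for four cases, purely for illustration.
model = ["low", "low", "intermediate", "high"]
reference = ["low", "intermediate", "intermediate", "high"]
accuracy, recall = agreement_report(model, reference)
```

Reporting recall per class alongside overall accuracy helps reveal whether errors cluster in one risk tier, which a single headline accuracy figure can hide.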

The vision for the future of thyroid cancer care is one where every treatment decision is informed by the collective wisdom encoded in millions of patient stories, artfully deciphered by NLP. The key to unraveling thyroid cancer's complexities lies not in a distant frontier but in the data that has already been collected and is waiting to be unlocked.

For more information

Loor-Torres R, et al. Use of natural language processing to extract and classify papillary thyroid cancer features from surgical pathology reports. Endocrine Practice. 2024;30:1051.

Soto Jacome C, et al. Thyroid ultrasound appropriateness identification through natural language processing of electronic health records. Mayo Clinic Proceedings: Digital Health. 2024;2:67.

Toro-Tobon D, et al. Artificial intelligence in thyroidology: A narrative review of the current applications, associated challenges, and future directions. Thyroid. 2023;33:903.
