
Why We Should Boost Genome Sequencing

The potential within 3.5 million de-identified health records is substantial. Jud Schneider, the CTO of Nashville Biosciences, was approached to discuss his company's collaboration with Amgen and Illumina to sequence 35,000 African-American genomes. A recent podcast looked at how such a feat is achievable.

Nash Bio, a Vanderbilt University Medical Center subsidiary, possesses approximately 3.5 million de-identified health records. Of these, around 10% are linked with consented DNA. This reservoir of data presents a rich opportunity for pharmaceutical and artificial intelligence (AI) enterprises to uncover patterns and develop new therapies.

Image Credit: Yurchanka Siarhei/Shutterstock.com

The sheer quantity of genomes involved in the project is staggering. But, within genome datasets, African Americans are notably underrepresented. This initiative represents a chance to make significant discoveries benefiting this demographic and the broader population.

However, merely having genome data is insufficient. The true value lies in correlating genetic data with health records to discern and elucidate patterns connecting genetic variation with disease or biological traits.

The conversation with Jud covered the many possibilities that such an extensive dataset, which includes imaging data, offers for training AI, along with strategies to maximize its utility. The Nash Bio team can use its clinical expertise to construct highly specific cohorts for training AI models.

Jud suggests the team can conceive of approximately a hundred ways to use the data, though customers may envision thousands of potential applications. Below is an excerpt from the conversation in which Jud explains the power of the model:

“Let's just say we're looking at chest CTs. And you're looking for your, I'm just making something up, but we're developing a product that looks for lung nodules, right? Well, we've got lots of chest CTs that have diagnosed certain types of lung cancers, and you can actually get very specific on the type of lung cancer.

And we've also got lots of, you know, just kind of blank chest CTs, you know, and ones with artifacts that are important, like there's a pacemaker in there. There's other types of medical devices that may be implanted or, you know, there's some different anatomies that you need to take into account.

We just got quite a diversity of information that you can really end up with an extremely powerful and well-trained model you know is really starting from a place of grounded in the actual diagnosis and not necessarily just CPT codes.”

Having a large volume of data is not enough. The protocols governing its generation and a range of latent factors must also be understood. It is easy to assume that a dataset based on an ICD code is automatically suitable for customer use. However, ICD codes may be “squishy”, with the initial diagnosis occasionally being uncertain or inaccurate. Careful analysis can nevertheless significantly enhance the data’s utility, and the clinical team, which regularly engages with the data, undertakes this process. Jud underscores the relevance of this by stating:

“We find ourselves in a situation all the time where we're really able to disambiguate the, uh, the nuances of the ICD coding system and the billing data to really find the patients that actually have the diagnosis and actually have the data, you know, within routine clinical care that's necessary.”

While vast amounts of data are generated and captured each day across industries such as healthcare, it remains important to understand the limitations of how that data is gathered, as the ICD codes illustrate, and to be clear about what is being sought in order to derive the most value from it.

Listen to the Podcast Here

Acknowledgments

Produced from materials originally authored by Chris Conner from cc: Life Science.

About Life Science Marketing Radio

Life Science Marketing Radio is connecting life science professionals with the brightest minds and best thinking in the industry to grow their network and advance their careers.

Stay on top of new technologies around life science and embark on a journey to learn as much as possible on artificial intelligence, machine learning, and more.


Sponsored Content Policy: News-Medical.net publishes articles and related content that may be derived from sources where we have existing commercial relationships, provided such content adds value to the core editorial ethos of News-Medical.Net which is to educate and inform site visitors interested in medical research, science, medical devices and treatments.

