Bioinformatics in Research

Bioinformatics is the application of computational techniques to the storage, retrieval, and analysis of large quantities of biological data.

Through the development of algorithms and statistical testing, research can be carried out faster and more accurately. Bioinformatics is used in different fields of research; however, it is especially important in genomics, such as in genome analysis, gene identification, genome-wide association studies, and evolutionary studies.

Bioinformatics and genomic analysis

Both traditional and next-generation sequencing methods aim to determine the DNA sequence of a genome so that it can be analyzed. However, these methods produce many short fragments of DNA, like pieces of a jigsaw puzzle, which need to be aligned and assembled to create a final, complete sequence. Bioinformatics tools can align these fragments quickly and cheaply, aiding genome sequencing.
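The jigsaw analogy can be made concrete with a toy example. The sketch below is a minimal greedy overlap merge, assuming error-free fragments from a single strand; the function names and the example reads are invented for illustration, and real assemblers use far more sophisticated methods to cope with sequencing errors, repeats, and millions of reads.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more; leave the contigs separate
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


# Hypothetical reads, invented for this example.
reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
contigs = greedy_assemble(reads)
print(contigs)
```

With these three reads, each neighbouring pair shares four bases, so the fragments collapse into a single contig.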

The human genome was initially sequenced between 1990 and 2003 and has since been uploaded online and extensively annotated. Annotation is the process whereby genes and their protein products are labeled directly onto the genome.

The volume and complexity of the produced data would have taken many years to compile manually. However, with the advent of bioinformatics, scientists have the capacity to carry out the compilation and annotation processes quickly and with better precision.

Bioinformatics and identification of mutations

Bioinformatics is vital in the research of de novo mutations. One method used to identify these mutations is whole-exome sequencing. Whole-exome sequencing covers only the protein-coding regions of DNA (the exome), which make up only about 1% of the genome, making it much faster than whole-genome sequencing.

However, large quantities of data are still produced, so bioinformatics remains vital for data curation, sequence alignment, and analysis.

An example of the application of bioinformatics in exome sequencing is the diagnosis of Cantu syndrome. This syndrome is characterized by cardiac defects, distinctive facial features, and excessive hair. One study compared the exome of a child with the condition to the exomes of the unaffected parents, which resulted in the identification of five candidate genes that were significantly different.

These genes were then sequenced, allowing for the identification of a causative dominant missense mutation in the ABCC9 gene. The ABCC9 protein is part of the ATP-sensitive potassium channel responsible for relaying chemical messages across cells.

Mutations in this gene have also been identified in many other patients with this condition, and therefore it has been suggested that altered function of this potassium channel results in Cantu syndrome.
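The child-versus-parents comparison described above comes down to a set difference: keep the variants seen in the child that neither parent carries. The following is a minimal sketch, assuming each exome has already been reduced to a set of (chromosome, position, reference base, alternate base) tuples; the function name and all variant records are invented for illustration and are not taken from the study.

```python
# A variant is represented as (chromosome, position, ref_base, alt_base).
Variant = tuple[str, int, str, str]


def de_novo_candidates(
    child: set[Variant], mother: set[Variant], father: set[Variant]
) -> set[Variant]:
    """Return variants present in the child but absent from both parents."""
    return child - mother - father


# Hypothetical variant calls, invented for this example.
child = {
    ("chr1", 1000, "A", "G"),  # also carried by the mother
    ("chr2", 2500, "C", "T"),  # found in neither parent
    ("chr3", 300, "G", "A"),   # also carried by the father
}
mother = {("chr1", 1000, "A", "G")}
father = {("chr3", 300, "G", "A")}

candidates = de_novo_candidates(child, mother, father)
print(candidates)
```

In practice this filter is only a first pass. Candidates still need quality filtering and confirmation by independent sequencing, as in the follow-up sequencing of the candidate genes.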

Using whole-exome sequencing and bioinformatics, 50% of rare disease genes have so far been identified, with the rest expected to be identified by 2020.

Another use of bioinformatics is in the identification of cancerous mutations. Through the development of automated systems, large volumes of sequence data can be produced and used to identify previously unknown point mutations.

Bioinformatics also works to create new algorithms that can compare different sequences, thereby aiding in the identification of mutations.
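At its simplest, comparing sequences to find point mutations means walking along two aligned sequences and reporting every position where the bases differ. The sketch below assumes the sequences are already aligned and of equal length; the function name and both sequences are invented for illustration. Handling insertions and deletions would require a proper alignment algorithm first.

```python
def point_mutations(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List substitutions as (1-based position, reference base, sample base)."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical sequences, invented for this example.
reference = "ATGGCGTACG"
sample = "ATGACGTTCG"

mutations = point_mutations(reference, sample)
print(mutations)  # [(4, 'G', 'A'), (8, 'A', 'T')]
```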

Bioinformatics and genome-wide association studies

Genome-wide association studies (GWAS) scan the genome in an attempt to identify specific markers that can indicate an individual’s susceptibility to a genetic disease. A genetic association between a specific marker and the disease can improve detection and treatment. If used on a large scale, this can also aid in the development of prophylactic treatments.

To carry out GWAS, the genomes of individuals with a disease and those without a disease are compared. The development of highly automated systems has led to the high-throughput identification of single nucleotide polymorphisms (SNPs).

By comparing SNPs, those which are more common in individuals with the disease can be identified and used as disease markers. This information is then stored online and made available to scientists across the globe.
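For a single marker, "more common in individuals with the disease" is usually judged with a statistical test on allele counts. The sketch below shows one common choice, a chi-square test on a 2x2 table of allele counts with one degree of freedom, together with the odds ratio. The function name and the counts are invented for illustration, and the code assumes no row or column of the table sums to zero.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi-square statistic, p-value, odds ratio).
    Assumes every row and column total is non-zero.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts, invented for this example.
chi2, p_value, odds_ratio = allelic_association(
    case_risk=60, case_other=40, control_risk=40, control_other=60
)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, OR={odds_ratio:.2f}")
```

Because a GWAS repeats this kind of test for a very large number of SNPs, the p-value threshold has to be corrected for multiple testing; a value that looks convincing for one marker is not sufficient on a genome-wide scale.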

The first published GWAS investigated age-related macular degeneration (AMD). Out of 116,204 SNPs that were genotyped, the study observed a link between the complement factor H (CFH) gene and AMD. Therefore, individuals can be screened for risk variants in the CFH gene to assess their susceptibility to AMD.

Several other disease genes have been characterized since then, with the intention of helping doctors and other health care professionals identify the possible risk of a genetic disease and allowing for appropriate disease management.

Bioinformatics and evolutionary studies

By studying the changes in the DNA of organisms and comparing them to other species, the genetic changes associated with evolution can be classified. Evolution is the process that involves small, cumulative changes in DNA that eventually lead to the formation of novel species.

Bioinformatics has aided research in the evolutionary process by allowing comparison of DNA sequences, sharing of data, prediction of future evolution, and classification of complex evolutionary processes.

When put together, the data can be used to create a phylogenetic tree that can trace several species to their original ancestry.
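One simple way to turn sequence comparisons into a tree is to count the differences between every pair of sequences and repeatedly join the two closest groups. The sketch below uses average-linkage clustering (the idea behind the UPGMA method) on Hamming distances and returns only the branching order as nested tuples, without branch lengths. It assumes pre-aligned sequences of equal length; the function names, species labels, and sequences are invented for illustration, and published phylogenies generally rely on more rigorous models of sequence evolution.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def upgma_topology(seqs: dict[str, str]):
    """Build a tree (nested tuples) by average-linkage clustering."""
    # Each cluster is (subtree, list of the sequence names it contains).
    clusters = [(name, [name]) for name in seqs]

    def dist(c1, c2) -> float:
        # Mean pairwise distance between the members of two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(hamming(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


# Hypothetical aligned sequences, invented for this example.
sequences = {
    "A": "ATGGCGTACG",
    "B": "ATGGCGTACA",
    "C": "ATGACGTTCA",
    "D": "TTGACCTTGA",
}
tree = upgma_topology(sequences)
print(tree)  # ('D', ('C', ('A', 'B')))
```

Here A and B differ at a single position and are joined first, C joins them next, and D, the most divergent sequence, branches off at the root.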

These are only a few of the myriad applications of bioinformatics within genetics. Overall, bioinformatics has opened up enormous opportunities in the field of genomics and targeted gene therapy.

Last Updated: Aug 13, 2022

Hannah Simmons

Hannah is a medical and life sciences writer with a Master of Science (M.Sc.) degree from Lancaster University, UK. Before becoming a writer, Hannah's research focussed on the discovery of biomarkers for Alzheimer's and Parkinson's disease. She also worked to further elucidate the biological pathways involved in these diseases. Outside of her work, Hannah enjoys swimming, taking her dog for a walk and travelling the world.


The opinions expressed here are the views of the writer and do not necessarily reflect the views and opinions of AZoLifeSciences.