Accepted Articles of Congress

  • Applications of LLMs in BioInformatics

  • Sina Alemohammad,1,*
    1. Ahvaz Jundi Shapoor University of Medical Sciences


  • Introduction: In recent years, artificial intelligence has expanded rapidly into diverse sectors and changed how work in those domains is done. The accelerated development of NLP-based AI and the proliferation of LLMs such as Gyma, LLaMA, and ChatGPT have placed AI at the forefront of technological advancement, offering new opportunities to extend knowledge across multiple disciplines. Bioinformatics is no exception: LLMs have considerable potential to transform the field. Integrating LLMs into bioinformatics is expected to accelerate existing processes and yield novel insights.
  • Methods: Google Scholar was searched with the following query: ‘large language models in bioinformatics genomic OR proteomic OR metabolomic OR Multiomic source:"Bioinformatics" OR source:"PLOS Computational Biology" OR source:"Nature" OR source:"Briefings in Bioinformatics" OR source:"Wiley" -ethics -social -future’. The time range was restricted to 2022 through 2024, because ChatGPT from OpenAI, one of the most powerful LLMs, was first released in 2022. The search returned about 926 articles. Articles on unrelated subjects were removed, and the remaining articles were categorized by subject and title and then analyzed.
  • Results: The number of LLMs, and the range of tasks and applications they cover, is growing rapidly. These models are changing the field of bioinformatics and the way biological data are analyzed. The articles can be grouped into the following topics:
    1) Text Mining and Knowledge Extraction: LLMs excel at processing vast amounts of scientific literature and extracting insights, relationships, and knowledge. Tools like PEDL+ leverage LLMs for relation extraction, while projects like PlantConnectome use LLMs to uncover nearly 400,000 functional relationships from over 100,000 plant biology abstracts.
    2) Multi-Omics Integration: LLMs can integrate data from multiple omics layers, including genomics, transcriptomics, proteomics, and metabolomics. This supports a more comprehensive view of the data, helps elucidate disease mechanisms, and helps identify the pathways involved in biological processes.
    3) Genomics Applications: In genomics, LLMs can assist in tasks such as identifying potential coding regions, extracting named entities for genes and proteins, and detecting antimicrobial and anti-cancer peptides. GeneGPT, a method that teaches LLMs to use NCBI Web APIs, achieves state-of-the-art performance on genomics tasks by leveraging database utilities.
    4) Protein Structure Prediction and Drug Discovery: LLMs have shown promise in predicting protein structures and designing novel drugs. By learning from vast amounts of protein sequence and structure data, LLMs can make accurate predictions and generate novel molecules with desired properties.
    5) Bioinformatics Education and Problem-Solving: LLMs can help solve educational bioinformatics problems and assist researchers in tackling complex tasks. By drawing on context and broad knowledge, LLMs can provide guidance and insights to bioinformatics students and professionals.
    6) Engineering Biomolecules and Synthetic Biology Integration: Molecule programming models such as MPM4 have recently been released and can generate entirely new proteins. These are text-to-molecule models: the prompt describes the function and structure of the desired molecule.
  • Conclusion: With the growth of LLMs, bioinformatics appears to be entering a new phase in which it can have more impact than ever. Bioinformatics researchers have a great opportunity to use different LLMs to improve their research.
  • Keywords: AI, LLMs, bioinformatics
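
The screening step described under Methods (restrict to 2022 through 2024, drop unrelated articles, group the rest by topic) could be approximated as below. This is only a minimal sketch: the abstract does not say how articles were assigned to topics, so the keyword lists, the function names `categorize` and `screen`, and the example records are all hypothetical, and simple substring matching stands in for whatever manual review was actually done.

```python
from collections import defaultdict

# Hypothetical keyword lists for the six topics named in the Results.
# These are illustrative assumptions, not the author's actual criteria.
TOPIC_KEYWORDS = {
    "Text mining and knowledge extraction": ["text mining", "relation extraction"],
    "Multi-omics integration": ["multi-omic", "multiomic"],
    "Genomics applications": ["genom"],
    "Protein structure prediction and drug discovery": ["protein structure", "drug"],
    "Bioinformatics education and problem-solving": ["education", "teaching"],
    "Biomolecule engineering and synthetic biology": ["protein design", "synthetic biology"],
}


def categorize(title):
    """Return the first topic whose keywords occur in the title, else None."""
    lowered = title.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return None


def screen(records, first_year=2022, last_year=2024):
    """Group (title, year) records by topic.

    Records outside the year range, or matching no topic, are dropped.
    """
    grouped = defaultdict(list)
    for title, year in records:
        if not first_year <= year <= last_year:
            continue
        topic = categorize(title)
        if topic is not None:
            grouped[topic].append(title)
    return dict(grouped)


# Made-up records, used only to exercise the sketch.
records = [
    ("Relation extraction from plant biology abstracts with LLMs", 2024),
    ("Teaching LLMs to use genomics databases", 2023),
    ("A survey of social media trends", 2023),
    ("Protein structure prediction with language models", 2021),
]
result = screen(records)
for topic, titles in result.items():
    print(topic, "->", titles)
```

Because a title is assigned to the first matching topic, the order of `TOPIC_KEYWORDS` matters; a real screening pass would need manual checks for titles that fit more than one topic.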
