Lucas Busta, CPPG Visiting Scholar Workshop Friday, Oct. 25th

About the Workshop:

Using language models to embed proteins in high-dimensional space

Please join us for a hands-on, one-hour workshop designed for graduate students and postdocs to explore the rapidly evolving field of protein language models. 

This session will introduce the basics of protein language models and guide participants through embedding protein sequences from FASTA files using Python and R. 

  • We will start by covering the foundational concepts behind protein language models—how they function similarly to natural language models but are specialized for protein sequences. 
  • We will then proceed with practical work in Jupyter notebooks, using Python and R libraries to generate embeddings from protein sequences with the help of a provided script. 
  • After creating the embeddings, we will use R to perform Principal Component Analysis (PCA) and visualize the relationships between the sequences. 
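For a rough sense of the embedding step described above, here is a minimal Python sketch. It is not the workshop's helper script. It assumes a simple FASTA layout and uses a placeholder `embed` function (amino-acid composition, i.e. the mean of one-hot residue vectors) where a real protein language model would be called. The names `read_fasta`, `embed`, and the toy sequences are hypothetical and for illustration only.

```python
from collections import Counter
from io import StringIO

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def embed(sequence):
    """Placeholder embedding (ASSUMPTION, not a language model).

    Returns a fixed-length, 20-dimensional vector of amino-acid
    frequencies. A protein language model would instead return a
    learned vector, typically with hundreds of dimensions.
    """
    counts = Counter(sequence)
    n = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / n for a in AMINO_ACIDS]


# Hypothetical toy input standing in for a FASTA file on disk.
toy_fasta = StringIO(">seq1 example\nMKT\nAAL\n>seq2 example\nMKTW\n")

# Map each sequence ID (first word of the header) to its vector.
embeddings = {
    header.split()[0]: embed(seq) for header, seq in read_fasta(toy_fasta)
}

for name, vector in embeddings.items():
    print(name, len(vector))
```

The point of the sketch is the shape of the data: each sequence, whatever its length, becomes one fixed-length vector, and those vectors form the rows of the table that PCA then takes as input.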

This workshop is hands-on, and participants will gain experience in both generating and analyzing protein embeddings using their personal computers. By the end of the session, we aim to equip everyone with the skills needed to create and work with protein embeddings, enabling downstream applications in areas such as structure prediction and functional annotation. 

Participants should have: 

  • a basic knowledge of Python and/or R
  • intermediate experience in bioinformatics or computational biology


Prior to the workshop, attendees will receive installation instructions and a helper script to prepare for the session. Seating at this event is limited to 50 people. 

Registration Deadline: Monday, Oct. 21st

About Lucas Busta:

Dr. Busta is an Assistant Professor at the University of Minnesota Duluth (UMD) in the Swenson College of Science and Engineering, Department of Chemistry and Biochemistry. He completed his undergraduate studies at UMD and earned a Ph.D. in chemistry from the University of British Columbia, working with Reinhard Jetter. He was then a postdoctoral fellow at the University of Nebraska, mentored by Dr. Edgar B. Cahoon and sponsored by the National Science Foundation’s Plant Genome Research Program. 

Busta is fascinated by the unique chemistries that biological systems use to survive harsh environments. His research uses informatics to unite classical analytical chemistry with emerging high-throughput DNA sequencing technologies to understand the molecular structures and biosynthesis of plant chemicals, polymers, and composites. His goal is to use this approach to develop and apply new knowledge about chemical biology to sustaining and improving human life while protecting the planet.