Intro to Big Data Bioinformatics

On this page, you will find information on the program's training topics, syllabus, and ways to register. Introduction to Bioinformatics is a training program designed for everyone, including students who don't have a background in bioinformatics as well as life science researchers. The objective is to introduce topics and examples that help participants understand Omics data and the use of bioinformatics in life science research. In this training, you will learn about Next Generation Sequencing (NGS) data analysis, including processing and preparing data for analysis in Genomics, Transcriptomics, and Metagenomics. You will also get an overview of downstream analysis and interpretation of various types of omics data using bioinformatics, including commonly used annotation databases, statistical analysis, and machine learning techniques.

Why Next Generation Sequencing? With the decreasing cost of Next Generation Sequencing (NGS) and its increasingly broad range of applications, this technology has transformed biomedical research and the biotechnology industry, and is now becoming increasingly popular in clinical use. Analysis of NGS data can help identify pathogenic, germline, and somatic DNA variants; measure gene expression; detect methylation patterns; and even study microbial communities on human skin and in the gut, lungs, and other organs. That is why this program can help anyone who is getting started with life science research and bioinformatics to understand these techniques and their applications, and to get a broad overview of the methods worth knowing at the start.

To learn more, we welcome you to view the video below and register for a free orientation session:

Key Topics Covered:

Big Data in Biology

Introduction to Omics Data in Life Science Research, including various types: Genomics, Transcriptomics and Metagenomics. Learn how Next Generation Sequencing can be used to study biological variation and understand the genes, mutations and microorganisms responsible for experimental conditions or clinical factors in disease.

Statistical Analysis of Omics Data

Learn about statistical analysis for Big Data, including how to use appropriate analysis techniques to measure differences between groups of samples. See examples of advanced data analysis methods and ways to perform visualization, annotation and interpretation of analysis results.

Machine Learning for Biomedical Data Science

Understand analytical methods for processing, visualization and analysis of complex biomedical data, and learn the terminology of machine learning and artificial intelligence in biomedical discovery.

Bioinformatics Project Examples: Machine Learning for Biomedical Data

Case study: Modelling Cancer Precision Medicine. Learn to analyze various omics data types, integrate them, and associate them with a phenotype (response to treatment) using sophisticated machine learning algorithms.

Program Syllabus: Introduction to Big Data Bioinformatics

Introduction to Bioinformatics

Introduction & Orientation 

  • Introduction and commencement of the program
  • Introduction to the Faculty/mentors & Trainers
  • Demonstration of the slack channel and program page
  • Omics Logic Learn courses, projects, and profile demonstration
  • T-BioInfo Analytics Platform: signing in with email and password, and accessing the demo pipeline
  • Introduction of the participants & Q&A discussion
  • Expectations and schedule review for the training program, description of the structure of the course, and important deadlines

Associated online resources:

Introduction to Genomics
Introduction to NGS Genomics
  • Genome variations: A detailed understanding
  • Targeted Sequencing, Whole Exome Sequencing, and Whole Genome Sequencing
  • Logical steps for Genomic Data Analysis and associated Algorithms
  • Analysis with Integrative Genomics Viewer

Associated online resources:

Introduction to Transcriptomics
Introduction to Gene Expression NGS Data Analysis
  • Analysis logic: from raw reads to a table of expression (RNA-seq example)
  • Common sources of unwanted technical variation 
  • Pre-processing steps, filtering and cleaning the table of expression
  • Loading processed data for analysis
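
As a rough illustration of the pre-processing step in this session, the sketch below normalizes a table of raw counts to counts per million (CPM) and drops genes with very low expression. The gene names, the counts, and the `min_cpm` and `min_samples` thresholds are made-up assumptions for illustration; they are not taken from the course material, and real projects usually rely on dedicated packages.

```python
# Hypothetical toy table of expression: gene -> raw read counts per sample.
raw_counts = {
    "geneA": [500, 450, 30],
    "geneB": [0, 1, 0],
    "geneC": [1500, 1549, 70],
}


def counts_per_million(counts):
    """Scale each sample's counts by its library size (total reads in that sample)."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {
        gene: [1e6 * c / size for c, size in zip(row, library_sizes)]
        for gene, row in counts.items()
    }


def filter_low_expression(cpm, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples."""
    return {
        gene: row
        for gene, row in cpm.items()
        if sum(value >= min_cpm for value in row) >= min_samples
    }


cpm = counts_per_million(raw_counts)
filtered = filter_low_expression(cpm)
print(sorted(filtered))
```

Here `geneB` is removed because it is detected in only one sample, while the other two genes pass the filter.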

Associated online resources:

Transcriptomics in Research
Transcriptomics in Research: DEGs & Pathway annotation
  • Introduction to differential gene expression
  • Volcano plots, MA plots, heatmaps
  • Regression and Factor Regression Analysis
  • Application of transcriptomics in research
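
To make the idea behind an MA plot more concrete, here is a minimal sketch that computes the two coordinates for each gene: M (the log2 fold change between groups) and A (the average log2 expression). The expression values, the `pseudocount`, and the cutoff of |M| >= 1 are assumptions chosen for illustration. A fold change alone is not a statistical test, so this only flags candidates.

```python
import math
from statistics import mean

# Hypothetical normalized expression values for two groups of samples.
expression = {
    "geneA": {"control": [14.0, 15.0, 16.0], "treated": [62.0, 63.0, 64.0]},
    "geneB": {"control": [30.0, 31.0, 32.0], "treated": [31.0, 30.0, 32.0]},
}


def ma_coordinates(control, treated, pseudocount=1.0):
    """Return (M, A): log2 fold change and mean log2 expression of the group means."""
    log_control = math.log2(mean(control) + pseudocount)
    log_treated = math.log2(mean(treated) + pseudocount)
    m = log_treated - log_control          # difference between groups
    a = 0.5 * (log_treated + log_control)  # overall expression level
    return m, a


ma_table = {
    gene: ma_coordinates(groups["control"], groups["treated"])
    for gene, groups in expression.items()
}

# Flag genes with at least a two-fold change in either direction (|M| >= 1).
candidates = [gene for gene, (m, _) in ma_table.items() if abs(m) >= 1.0]
print(ma_table, candidates)
```

Plotting A on the x-axis against M on the y-axis for every gene gives the MA plot; a volcano plot additionally needs a p-value per gene from a statistical test.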

Associated online resources:

Statistical Analysis and Machine Learning
Biomedical Data Science: Introduction to Machine Learning for NGS Data
  • Introduction to Machine Learning
  • Types of Machine Learning methods 
  • Overview of unsupervised machine learning methods
  • Finding patterns and similarities in data
  • Principal Component Analysis (PCA), hierarchical clustering, and K-means clustering
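
Of the unsupervised methods listed above, K-means is the easiest to sketch in a few lines. The example below is a plain version of Lloyd's algorithm, assuming six made-up samples described by two features and starting centroids picked by hand; it is meant to show the assign-and-update loop, not to replace a library implementation. PCA and hierarchical clustering are not shown.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical samples, e.g. two expression features measured per sample.
samples = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels, centroids = kmeans(samples, centroids=[samples[0], samples[3]])
print(labels, centroids)
```

The result depends on the starting centroids, which is why library implementations typically run several random initializations.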

Associated online resources:

Bioinformatics NGS Machine Learning Projects

  • Overview of supervised machine learning methods
  • Preparing Training and test datasets
  • Classification: Decision Trees, Random Forest (RF), Support Vector Machine (SVM)
  • Bioinformatics Project examples & Omicslogic Research Fellowship
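
The workflow named in this session (prepare training and test datasets, then fit a classifier) can be sketched with a decision stump, which is a decision tree with a single split. Everything in the example is an assumption for illustration: the marker values, the "responder" labels, the split fraction, and the helper names `train_test_split`, `fit_stump`, and `predict`. Random Forest and SVM are not shown.

```python
import random

# Hypothetical toy data: (expression of one marker gene, response label).
samples = [
    (1.0, "non-responder"), (1.5, "non-responder"),
    (2.0, "non-responder"), (2.5, "non-responder"),
    (8.0, "responder"), (8.5, "responder"),
    (9.0, "responder"), (9.5, "responder"),
]
LABELS = ("non-responder", "responder")


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, round(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def predict(stump, x):
    threshold, low_label, high_label = stump
    return high_label if x > threshold else low_label


def fit_stump(train):
    """Pick the threshold and label order with the best training accuracy.

    Assumes the training set contains at least two distinct feature values.
    """
    values = sorted({x for x, _ in train})
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_correct, best_stump = -1, None
    for threshold in thresholds:
        for low_label, high_label in (LABELS, LABELS[::-1]):
            stump = (threshold, low_label, high_label)
            correct = sum(predict(stump, x) == y for x, y in train)
            if correct > best_correct:
                best_correct, best_stump = correct, stump
    return best_stump


train, test = train_test_split(samples)
stump = fit_stump(train)
accuracy = sum(predict(stump, x) == y for x, y in test) / len(test)
print(stump, accuracy)
```

Accuracy is measured only on the held-out test samples, which the model never saw during fitting; that separation is the point of preparing two datasets.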

Associated online resources:

Project Examples:


Register for the Upcoming Webinar Session:

"This lesson gave a good explanation and example for how normal citizens can participate in complex biomedical work without the extensive background many scientists have".
- Lane Yutzy, PhD Fellow
"It literally helped me to clear out my basic concepts. Also the study material provided in reference was extremely helpful. It simply explained the terms involved".
- Jeevanjot Kaur, Graduate Student
"It helped me understand the overview of the research fellowship and how it guides newcomers to become experts in the field of bioinformatics, through the use of courses, literature, examinations, applications etc".
- Kakshil Patil, Graduate Student