1. Primary data analysis
Preprocessing of raw data from each multi-omics sequencing run, including FASTQ files: sequence alignment, variant identification, and per-gene expression quantification.
- Genome (WES, WGS)
  - Raw data quality control
  - Alignment of raw data to genome or proteome
  - Variant calling
- Transcriptome (RNA-seq, scRNA-seq)
  - Raw data quality control
  - Alignment of raw data to genome or proteome
  - Expression matrix generation
- Proteome (LC-MS/MS)
  - Raw data quality control
  - Alignment of raw data to genome or proteome
  - Expression matrix generation
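The raw data quality control step can be illustrated with a minimal sketch. It assumes four-line FASTQ records with Phred+33 quality encoding; the function names and the mean-quality threshold of 20 are hypothetical choices for illustration, not part of the service described here.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            break  # end of input
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of one read, assuming Phred+33 encoding."""
    return mean(ord(ch) - offset for ch in quality)


def filter_reads(handle, min_mean_quality=20.0):
    """Keep reads whose mean Phred score reaches the (hypothetical) threshold."""
    return [
        (read_id, seq)
        for read_id, seq, qual in read_fastq(handle)
        if mean_phred(qual) >= min_mean_quality
    ]


# Toy input: read1 is high quality ('I' = Q40), read2 is low quality ('#' = Q2).
toy_fastq = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nTTGA\n+\n####\n"
)
kept = filter_reads(toy_fastq)
print(kept)
```

In practice this step is usually handled by dedicated QC tools; the sketch only shows what a per-read quality filter does.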
2. Secondary data analysis
Analysis of preprocessed data using statistical methods suited to each experiment, such as per-variant annotation, differentially expressed gene analysis, and single-cell clustering.
- Genome (WES, WGS)
  - Basic variant annotation
  - Disease-phenotype annotation
  - Clinical annotation
  - Functional score annotation (SIFT, etc.)
  - Gene identifier annotation
- Transcriptome (RNA-seq, scRNA-seq)
  - RNA-seq, LC-MS/MS: differentially expressed gene analysis
    - Quality control
    - Normalization
    - Statistical test
    - Multiple testing correction
  - Single-cell RNA-seq (scRNA-seq)
    - QC
    - Normalization
    - Dimensionality reduction (PCA, UMAP, t-SNE)
    - DEG
    - Cell type annotation
- Proteome (LC-MS/MS)
  - RNA-seq, LC-MS/MS: differentially expressed gene analysis
    - Quality control
    - Normalization
    - Statistical test
    - Multiple testing correction
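Two of the differential expression steps listed above, normalization and multiple testing correction, can be sketched as follows. This is an illustration under stated assumptions: counts-per-million is only one of several possible normalizations, Benjamini-Hochberg is one common correction, and the gene names and p-values are made up.

```python
def counts_per_million(counts):
    """Scale raw read counts of one sample so they sum to one million."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values from some upstream statistical test.
p_by_gene = {"GENE_A": 0.001, "GENE_B": 0.02, "GENE_C": 0.03, "GENE_D": 0.5}
genes = list(p_by_gene)
q_values = dict(zip(genes, benjamini_hochberg([p_by_gene[g] for g in genes])))
significant = [g for g in genes if q_values[g] < 0.05]
print(significant)
```

The 0.05 cutoff on adjusted p-values is a conventional choice, not one stated in this document.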
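3. Advanced data analysis
Interpretation of large-scale gene expression changes against knowledge bases such as pathways, GO (Gene Ontology), and protein/transcript interaction (PPI, TRI) networks, to extract the biological, medical, and pharmacological meaning relevant to each experimental goal.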
- Genome (WES, WGS)
  - Gene set enrichment analysis (GSEA)
  - Multiple gene identifier annotation
  - Network analysis (PPI, TRN)
  - Clinical and pharmacological knowledge annotation (disease/drug-gene)
- Transcriptome (RNA-seq, scRNA-seq)
  - Gene set enrichment analysis (GSEA)
  - Multiple gene identifier annotation
  - Network analysis (PPI, TRN)
  - Clinical and pharmacological knowledge annotation (disease/drug-gene)
- Proteome (LC-MS/MS)
  - Gene set enrichment analysis (GSEA)
  - Multiple gene identifier annotation
  - Network analysis (PPI, TRN)
  - Clinical and pharmacological knowledge annotation (disease/drug-gene)
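As a rough illustration of gene set enrichment, the sketch below uses an over-representation test based on the hypergeometric distribution. Note that this is a simpler stand-in, not the rank-based GSEA procedure itself; the gene universe, pathway, and gene list are invented for the example.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe_size: number of genes tested overall
    set_size: number of genes in the pathway / GO term
    hits_size: number of genes in the list of interest (e.g. DEGs)
    overlap: genes shared by the pathway and the list of interest
    """
    max_overlap = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy data: 20 genes measured, a 5-gene pathway, 4 DEGs.
universe = {f"G{i}" for i in range(20)}
pathway = {"G0", "G1", "G2", "G3", "G4"}
deg = {"G0", "G1", "G2", "G10"}

overlap = len(pathway & deg)
p_value = hypergeom_enrichment_p(len(universe), len(pathway), len(deg), overlap)
print(overlap, round(p_value, 4))
```

When many pathways are tested this way, the resulting p-values would normally be corrected for multiple testing.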
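4. Collaborative research
Customized data analysis built on a range of high-level analysis techniques, including advanced statistical analysis and analyses that apply the latest biomedical, data science, and AI methods.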
- Genome (WES, WGS)
  - Variant (gene) prioritization (custom variant filtering and prioritization)
  - Exploratory analysis to identify clusters and patterns in the data sets (PCA, regression, SVM, etc.)
  - Advanced quantitative research methods (survival analysis, time series data)
  - Prediction of clinical outcomes based on RNA-seq data (disease risk prediction)
  - Integrative analysis and causal inference across multiple omics data sets to gain mechanistic insight into diseases and biological processes
  - Integrative network and pathway analysis for omics data
  - Advanced analysis of single-cell genomics data, including scRNA-seq data, using state-of-the-art methods
  - Computational pharmacogenomics analysis (drug repositioning, target discovery)
- Transcriptome (RNA-seq, scRNA-seq)
  - Variant (gene) prioritization (custom variant filtering and prioritization)
  - Exploratory analysis to identify clusters and patterns in the data sets (PCA, regression, SVM, etc.)
  - Advanced quantitative research methods (survival analysis, time series data)
  - Prediction of clinical outcomes based on RNA-seq data (disease risk prediction)
  - Integrative analysis and causal inference across multiple omics data sets to gain mechanistic insight into diseases and biological processes
  - Integrative network and pathway analysis for omics data
  - Advanced analysis of single-cell genomics data, including scRNA-seq data, using state-of-the-art methods
  - Computational pharmacogenomics analysis (drug repositioning, target discovery)
- Proteome (LC-MS/MS)
  - Variant (gene) prioritization (custom variant filtering and prioritization)
  - Exploratory analysis to identify clusters and patterns in the data sets (PCA, regression, SVM, etc.)
  - Advanced quantitative research methods (survival analysis, time series data)
  - Prediction of clinical outcomes based on RNA-seq data (disease risk prediction)
  - Integrative analysis and causal inference across multiple omics data sets to gain mechanistic insight into diseases and biological processes
  - Integrative network and pathway analysis for omics data
  - Advanced analysis of single-cell genomics data, including scRNA-seq data, using state-of-the-art methods
  - Computational pharmacogenomics analysis (drug repositioning, target discovery)
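Survival analysis, one of the quantitative methods listed above, can be illustrated with a minimal Kaplan-Meier estimator. This is a sketch on invented follow-up data, assuming right-censoring only; the function name and inputs are hypothetical.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject
    events: 1 if the event was observed, 0 if the subject was censored
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events exactly at time t (censored subjects excluded).
        observed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - observed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort: five subjects, two of them censored (event = 0).
durations = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
for time, prob in curve:
    print(time, round(prob, 3))
```

Comparing such curves between patient groups (for example, high versus low expression of a gene) would require an additional test such as the log-rank test, which is not shown here.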
5. Comprehensive integrated data analysis
End-to-end, comprehensive data analysis tailored to each experimental design, covering everything from primary data analysis through advanced data analysis.
- Primary data analysis
- Secondary data analysis
- Advanced data analysis