
Protein Complex-based Analysis is Resistant to the Obfuscating Consequences of Batch Effects --- a Case Study in Clinical Proteomics

Overview
Journal BMC Genomics
Publisher BioMed Central
Specialty Genetics
Date 2017 Apr 1
PMID 28361693
Citations 9
Abstract

Background: In proteomics, batch effects are technical sources of variation that confound proper analysis, preventing effective deployment in clinical and translational research.

Results: Using simulated and real data, we demonstrate that existing batch effect-correction methods do not always eradicate all batch effects. Worse still, they may alter data integrity and introduce false positives. Moreover, although principal component analysis (PCA) is commonly used for detecting batch effects, the principal components (PCs) themselves may be used as differential features, from which relevant differential proteins may be effectively traced. Batch effects are removable by identifying PCs that are highly correlated with batch but not with class effect. However, neither PC-based nor existing batch effect-correction methods adequately address subtle batch effects, which are difficult to eradicate, and both approaches involve data transformation and/or projection, which is error-prone. To address this, we introduce the concept of batch effect-resistant methods and demonstrate how such methods, when they incorporate protein complexes, are particularly resistant to batch effects without compromising data integrity.
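
The PC-based idea described above (flag PCs that correlate with batch but not with class, then drop them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `simulate` and `remove_batch_pcs`, the 0/1 label encoding, the toy effect sizes, and the correlation thresholds (0.7 for batch, 0.3 for class) are all assumptions chosen for illustration only.

```python
import numpy as np


def simulate(seed=0, n_per_cell=10, n_proteins=50):
    """Toy data: 2 classes x 2 batches, fully crossed (hypothetical design)."""
    rng = np.random.default_rng(seed)
    cls = np.repeat([0, 0, 1, 1], n_per_cell)
    batch = np.repeat([0, 1, 0, 1], n_per_cell)
    X = rng.normal(size=(cls.size, n_proteins))
    X[cls == 1, :10] += 1.5    # class effect on the first 10 proteins
    X[batch == 1, 10:] += 3.0  # batch effect on the remaining proteins
    return X, batch, cls


def remove_batch_pcs(X, batch, cls, batch_thresh=0.7, class_thresh=0.3):
    """Drop PCs correlated with batch but not class; thresholds are assumed.

    X is samples x proteins; batch and cls are 0/1 label vectors.
    Returns the reconstructed matrix and the indices of the removed PCs.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U * S  # PC scores, samples x components

    flagged = []
    for k in range(S.size):
        if S[k] <= 1e-10 * S[0]:
            continue  # skip numerically empty components
        r_batch = abs(np.corrcoef(scores[:, k], batch)[0, 1])
        r_class = abs(np.corrcoef(scores[:, k], cls)[0, 1])
        if r_batch >= batch_thresh and r_class <= class_thresh:
            flagged.append(k)

    keep = np.ones(S.size, dtype=bool)
    keep[flagged] = False
    # Reconstruct the data from the retained components only.
    X_clean = scores[:, keep] @ Vt[keep] + mean
    return X_clean, flagged


X, batch, cls = simulate()
X_clean, flagged = remove_batch_pcs(X, batch, cls)
print("PCs removed:", flagged)
```

Note that the reconstruction step is itself a projection of the data, which is the kind of transformation the abstract describes as error-prone; the sketch only shows the mechanics on a deliberately strong, clean batch effect.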

Conclusions: Protein complex-based analyses are powerful, offering unparalleled differential protein-selection reproducibility and high prediction accuracy. We demonstrate for the first time their innate resistance against batch effects, even subtle ones. As complex-based analyses require no prior data transformation (e.g. batch-effect correction), data integrity is protected. Individual checks on top-ranked protein complexes confirm strong association with phenotype classes and not batch. Therefore, the constituent proteins of these complexes are more likely to be clinically relevant.

Citing Articles

Thinking points for effective batch correction on biomedical data.

Hui H, Kong W, Goh W Brief Bioinform. 2024; 25(6).

PMID: 39397427 PMC: 11471903. DOI: 10.1093/bib/bbae515.


Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method.

Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z Genome Biol. 2023; 24(1):201.

PMID: 37674217 PMC: 10483871. DOI: 10.1186/s13059-023-03047-z.


Normalization of Large-Scale Transcriptome Data Using Heuristic Methods.

Yosef A, Shnaider E, Schneider M, Gurevich M Bioinform Biol Insights. 2023; 17:11779322231160397.

PMID: 37020503 PMC: 10068970. DOI: 10.1177/11779322231160397.


PCLassoLog: A protein complex-based, group Lasso-logistic model for cancer classification and risk protein complex discovery.

Wang W, Yuan H, Han J, Liu W Comput Struct Biotechnol J. 2022; 21:365-377.

PMID: 36582441 PMC: 9791601. DOI: 10.1016/j.csbj.2022.12.005.


Perspectives for better batch effect correction in mass-spectrometry-based proteomics.

Phua S, Lim K, Goh W Comput Struct Biotechnol J. 2022; 20:4369-4375.

PMID: 36051874 PMC: 9411064. DOI: 10.1016/j.csbj.2022.08.022.

