
A Two-step Method for Variable Selection in the Analysis of a Case-cohort Study

Overview
Journal Int J Epidemiol
Specialty Public Health
Date 2017 Nov 15
PMID 29136145
Citations 8
Abstract

Background: Accurate detection and estimation of true exposure-outcome associations are important in aetiological analysis. When there are multiple potential exposure variables of interest, methods are required for detecting the subset of variables most likely to have true associations with the outcome of interest. Case-cohort studies often collect data on a large number of variables that have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies.

Methods: We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which first applies a Bayesian variable selection algorithm, using multivariable logistic regression, to obtain posterior probabilities of selection for each variable, and then estimates effects using Prentice-weighted Cox regression.
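All three methods rely on Prentice weighting, so a small illustration of what that weighting does to the risk sets may help. The sketch below is not taken from the article or its R package. It assumes the standard description of the Prentice scheme: subcohort members count in every risk set while they are under follow-up, and a case outside the subcohort enters only the risk set at its own event time. The `Subject` class, the `prentice_risk_sets` function and the four example subjects are hypothetical, and follow-up is assumed to start at time zero for everyone.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    ident: str
    time: float         # follow-up time (event or censoring), from time zero
    event: bool         # True if the subject became a case
    in_subcohort: bool  # True if sampled into the random subcohort


def prentice_risk_sets(subjects):
    """Map each case's id to the ids in its risk set under Prentice weighting.

    Assumed rule: subcohort members still under follow-up are always included;
    a case outside the subcohort is included only at its own event time.
    """
    risk_sets = {}
    for case in subjects:
        if not case.event:
            continue
        members = [
            s.ident
            for s in subjects
            if s.in_subcohort and s.time >= case.time
        ]
        if not case.in_subcohort:
            members.append(case.ident)
        risk_sets[case.ident] = members
    return risk_sets


# Hypothetical toy data: three subcohort members and one case outside it.
subjects = [
    Subject("A", time=5.0, event=False, in_subcohort=True),
    Subject("B", time=3.0, event=True, in_subcohort=True),
    Subject("C", time=4.0, event=True, in_subcohort=False),
    Subject("D", time=2.0, event=False, in_subcohort=True),
]

risk_sets = prentice_risk_sets(subjects)
for case_id, members in risk_sets.items():
    print(case_id, members)
```

In this toy example, subject C was still under follow-up when B had an event at time 3, yet C does not appear in B's risk set, because C is a case from outside the subcohort. That asymmetry is what distinguishes the weighted risk sets from those of a full-cohort Cox model.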

Results: Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods.
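The two performance measures named above can be made concrete with a short calculation. The sketch below assumes the usual definitions, which the abstract does not spell out: sensitivity is the share of truly associated variables that were selected, and the false discovery rate is the share of selected variables that are not truly associated. The function name `selection_metrics` and the variable names `x1` to `x5` are hypothetical and not drawn from the study.

```python
def selection_metrics(selected, truly_associated):
    """Return (sensitivity, false discovery rate) for one selection result."""
    selected = set(selected)
    truth = set(truly_associated)
    true_positives = len(selected & truth)
    false_positives = len(selected - truth)
    # Sensitivity is undefined when no variable is truly associated.
    sensitivity = true_positives / len(truth) if truth else float("nan")
    # Convention assumed here: an empty selection has a false discovery rate of 0.
    fdr = false_positives / len(selected) if selected else 0.0
    return sensitivity, fdr


# Hypothetical simulation replicate: four true signals, three variables selected.
truth = ["x1", "x2", "x3", "x4"]
selected = ["x1", "x2", "x5"]

sensitivity, fdr = selection_metrics(selected, truth)
print(f"sensitivity = {sensitivity:.2f}, FDR = {fdr:.2f}")
```

In a simulation study these quantities would typically be averaged over many replicates per scenario; the single replicate here only illustrates the arithmetic.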

Conclusions: The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.

Citing Articles

The association between osteoprotegerin and arterial stiffness in a 10-year longitudinal study of patients with type 2 diabetes.

Low S, Pek S, Moh A, Liu J, Pandian B, Ang K. Diab Vasc Dis Res. 2024; 21(6):14791641241304435.

PMID: 39626773 PMC: 11615981. DOI: 10.1177/14791641241304435.


The Association between Serum Lipid Profile Levels and Hypertension Grades: A Cross-Sectional Study at a Health Examination Center.

Huang L, Liu Z, Zhang H, Li D, Li Z, Huang J. High Blood Press Cardiovasc Prev. 2024; 32(1):87-98.

PMID: 39602007 DOI: 10.1007/s40292-024-00683-9.


Plasma sphingolipids mediate the association between gut microbiome composition and type 2 diabetes risk in the HELIUS cohort: a case-cohort study.

Overbeek M, Rutters F, Nieuwdorp M, Davids M, van Valkengoed I, Galenkamp H. BMJ Open Diabetes Res Care. 2024; 12(4).

PMID: 39025794 PMC: 11261679. DOI: 10.1136/bmjdrc-2024-004180.


High-dimensional mediation analysis for continuous outcome with confounders using overlap weighting method in observational epigenetic study.

Hu W, Chen S, Cai J, Yang Y, Yan H, Chen F. BMC Med Res Methodol. 2024; 24(1):125.

PMID: 38831262 PMC: 11145821. DOI: 10.1186/s12874-024-02254-x.


Predicting Adaptations to Resistance Training Plus Overfeeding Using Bayesian Regression: A Preliminary Investigation.

Smith R, Harty P, Stratton M, Rafi Z, Rodriguez C, Dellinger J. J Funct Morphol Kinesiol. 2021; 6(2).

PMID: 33919267 PMC: 8167794. DOI: 10.3390/jfmk6020036.

