PMID: 30305743

The UK Biobank Resource with Deep Phenotyping and Genomic Data

Abstract

The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

Citing Articles

Mapping variants in thyroid hormone transporter MCT8 to disease severity by genomic, phenotypic, functional, structural and deep learning integration.

Groeneweg S, van Geest F, Martin M, Dias M, Frazer J, Medina-Gomez C Nat Commun. 2025; 16(1):2479.

PMID: 40075072 PMC: 11904026. DOI: 10.1038/s41467-025-56628-w.


Genetically supported targets and drug repurposing for brain aging: A systematic study in the UK Biobank.

Yi F, Yuan J, Somekh J, Peleg M, Zhu Y, Jia Z Sci Adv. 2025; 11(11):eadr3757.

PMID: 40073132 PMC: 11900869. DOI: 10.1126/sciadv.adr3757.


Identification of effect modifiers using a stratified Mendelian randomization algorithmic framework.

Man A, Knusel L, Graf J, Lali R, Le A, Di Scipio M Eur J Epidemiol. 2025.

PMID: 40072671 DOI: 10.1007/s10654-025-01213-0.


Harnessing the power of genomics in hypertension: tip of the iceberg?

Naderi H, Warren H, Munroe P Camb Prism Precis Med. 2025; 3:e2.

PMID: 40071139 PMC: 11894416. DOI: 10.1017/pcm.2025.1.


Sample observed effects: enumeration, randomization and generalization.

Ribeiro A Sci Rep. 2025; 15(1):8423.

PMID: 40069178 PMC: 11897334. DOI: 10.1038/s41598-024-80839-8.

