PMID: 37601966

Genotyping and Population Characteristics of the China Kadoorie Biobank

Abstract

The China Kadoorie Biobank (CKB) is a population-based prospective cohort of >512,000 adults recruited from 2004 to 2008 from 10 geographically diverse regions across China. Detailed data from questionnaires and physical measurements were collected at baseline, with additional measurements at three resurveys involving ∼5% of surviving participants. Analyses of genome-wide genotyping, for >100,000 participants using custom-designed Axiom arrays, reveal extensive relatedness, recent consanguinity, and signatures reflecting large-scale population movements from recent Chinese history. Systematic genome-wide association studies of incident disease, captured through electronic linkage to death and disease registries and to the national health insurance system, replicate established disease loci and identify 14 novel disease associations. Together with studies of candidate drug targets and disease risk factors and contributions to international genetics consortia, these demonstrate the breadth, depth, and quality of the CKB data. Ongoing high-throughput omics assays of collected biosamples and planned whole-genome sequencing will further enhance the scientific value of this biobank.

Citing Articles

Evolution, genetic diversity, and health.

Palma-Martinez M, Posadas-Garcia Y, Shaukat A, Lopez-Angeles B, Sohail M Nat Med. 2025.

PMID: 40055519 DOI: 10.1038/s41591-025-03558-1.


Polygenic risk scores for pan-cancer risk prediction in the Chinese population: A population-based cohort study based on the China Kadoorie Biobank.

Zhu M, Zhu X, Han Y, Ma Z, Ji C, Wang T PLoS Med. 2025; 22(2):e1004534.

PMID: 40019942 PMC: 11870365. DOI: 10.1371/journal.pmed.1004534.


Comparative studies of 2168 plasma proteins measured by two affinity-based platforms in 4000 Chinese adults.

Wang B, Pozarickij A, Mazidi M, Wright N, Yao P, Said S Nat Commun. 2025; 16(1):1869.

PMID: 39984443 PMC: 11845630. DOI: 10.1038/s41467-025-56935-2.


Efficient storage and regression computation for population-scale genome sequencing studies.

Rivas M, Chang C Bioinformatics. 2025; 41(3).

PMID: 39932865 PMC: 11893150. DOI: 10.1093/bioinformatics/btaf067.


Introduction to Mendelian randomization.

Au Yeung S, Luo S, Iwagami M, Goto A Ann Clin Epidemiol. 2025; 7(1):27-37.

PMID: 39926273 PMC: 11799858. DOI: 10.37737/ace.25004.

