Large Scale Health Data Initiatives
- goodgreenlife

- 3 days ago
- 3 min read
UK Biobank is a national and international health resource. Its aim is to improve the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses, including cancer, heart disease, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. It follows the health and wellbeing of 500,000+ volunteer participants and provides health information, which does not identify them, to approved researchers in academia and industry, in the UK and overseas.
Biobanks (Fig 1) are large-scale initiatives that follow a population's health over time. By collecting blood, saliva and urine and combining the resulting omics data with health records, they aim to identify the causes of chronic disease and use those findings to improve treatment. UK Biobank has cost the public over £300 million, with comparable costs for other countries' large-scale health data initiatives, such as $180 million for BioBank Japan or $700 million for the Million Veteran Program in the US.
Data Initiative | Country | Initiative Goal | Samples / Data Collected | Participants |
UK Biobank | UK | Improve prevention, diagnosis and treatment of disease | Blood, urine, saliva, DNA, imaging, metabolomics, proteomics, EHR | 500,000+ residents; ~20 million samples, 2 samples / patient / year |
OpenSAFELY | UK | Secure, large-scale analysis of NHS patient records | EHRs: diagnoses, prescriptions, clinical lab results, vitals | 58 million+ NHS patients |
Our Future Health | UK | Prevent, detect, and treat diseases earlier | Blood (plasma, serum, buffy coat), DNA, lifestyle data, EHRs | Target: 5 million UK adults (1M+ enrolled) |
China Kadoorie Biobank | China | Study genetic and environmental causes of chronic diseases | Blood (plasma, buffy coat), DNA, biochemistry, metabolomics, proteomics | 512,000+ residents |
CARTaGENE | Canada | Population biobank for precision health research | Blood, urine, DNA, genotyping, bone density, lifestyle and health data | 43,000+ residents in Québec |
CanPath | Canada | National platform for chronic disease and population health research | Blood, saliva, urine, physical measurements, EHRs | 330,000+ participants |
FinnGen | Finland | Integrate genetic data with national health records | Blood (DNA), genotyping, WGS, EHRs, planned proteomics and metabolomics | 500,000+ residents |
All of Us | United States | Enable precision medicine and health equity | Blood, urine, saliva, EHRs, genomics (WGS, genotyping), surveys, wearables | Target: 1 million; 414,000+ with WGS as of 2025 |
Million Veteran Program (MVP) | United States | Study genetics and health in veterans | Blood, genotyping, WGS, EHRs, surveys | 900,000+ veterans enrolled |
BioBank Japan | Japan | Genomic and biomarker research for common diseases | Blood, genotyping, proteomics, metabolomics (46k serum metabolome), EHRs | 200,000+ disease-specific patients |
Taiwan Biobank | Taiwan | Population health and genetic diversity research | Blood, urine, saliva, DNA, genotyping, surveys, imaging | 200,000+ enrolled; goal 300,000+ |
Korean Genome and Epidemiology Study (KoGES) | South Korea | Genomic epidemiology of chronic diseases | Blood, urine, DNA, genotyping, metabolomics, surveys | ~200,000 participants across cohorts |
Estonian Biobank | Estonia | Genomic and health data integration for national precision medicine | Blood, genotyping, WGS, clinical and registry data | ~210,000 participants (~20% of Estonia’s population) |
Lifelines Biobank | Netherlands | Three-generation cohort for health and disease research | Blood, urine, feces, saliva, DNA, metabolomics, microbiome, EHRs | ~167,000 participants |
Generation Scotland | Scotland (UK) | Family-based genetic and health research | Blood, DNA, physical and mental health records | 24,000 (GenScot 1); expanding to >20,000 more |
Fig 1. List of current international biobanks. EHR: electronic health records.
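To put the cost figures quoted earlier in context, they can be divided by the participant counts in Fig 1. The short Python sketch below does only that arithmetic. The inputs are the rounded lower-bound figures from this post ("over £300 million", "500,000+"), currencies are not converted, and the variable and function names are illustrative, so treat the results as order-of-magnitude estimates rather than audited numbers.

```python
# Rough cost per participant, using only figures quoted in this post.
# Costs and participant counts are rounded lower bounds, and pounds and
# dollars are NOT converted, so compare across rows with caution.

initiatives = {
    # name: (quoted cost, currency symbol, participants from Fig 1)
    "UK Biobank": (300_000_000, "£", 500_000),
    "BioBank Japan": (180_000_000, "$", 200_000),
    "Million Veteran Program": (700_000_000, "$", 900_000),
}


def cost_per_participant(total_cost, participants):
    """Return the quoted total cost divided by the participant count."""
    return total_cost / participants


# Keep the rounded results in a dict so they can be reused or checked.
results = {
    name: round(cost_per_participant(cost, n), 2)
    for name, (cost, _symbol, n) in initiatives.items()
}

for name, (_cost, symbol, _n) in initiatives.items():
    print(f"{name}: about {symbol}{results[name]:,.0f} per participant")
```

On these figures the cost works out to roughly £600 per participant for UK Biobank, $900 for BioBank Japan and just under $780 for the Million Veteran Program.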
This research infrastructure has supported over 18,000 peer-reviewed papers and hundreds of research projects. Biobanks generate discovery science, make drug target selection more evidence-based, enable faster public health analytics and create long-lived national infrastructure that can be reused for decades.



