
100,000 genomes

Categories: New techniques


In December 2012 the Prime Minister announced an ambitious plan to sequence the genomes of 100,000 NHS patients over the next five years. To meet this target, the Department of Health has established Genomics England. By Professor Dame Sally Davies, Chief Medical Officer, and Dr Mark Bale, Deputy Head of Health Science and Bioethics Division, Department of Health.

"It is crucial that we continue to push the boundaries and this new plan will mean we are the first country in the world to use DNA codes in the mainstream of the health service."

The Prime Minister, 10 December 2012

In December 2012, the Prime Minister announced a plan to sequence the whole genomes of 100,000 NHS patients and use this to push forward understanding of diseases and to inform the treatment of patients with rare diseases, cancer and infections.

Without wishing to exaggerate, there are parallels with US President John F Kennedy's public commitment in 1961 that his government would land a man on the moon by the end of the decade. The President made this commitment before NASA had successfully sent an astronaut into orbit. Huge technical challenges had to be overcome, and significant advances had to be made to achieve a moon landing.

The Prime Minister has laid down a similar bold challenge in the field of life sciences. There has not been a whole genome sequence for any NHS patient in the UK to date. The machines of the current market leader that sequence whole genomes to clinical quality cost around £500,000 each. If all 50 of these machines in the UK were put to work sequencing the 100,000 genomes, it would take eight years. Meanwhile, the genome sequences will need to be linked to patient data, diagnosis, treatment and response.
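The eight-year estimate implies a per-machine throughput that can be backed out with simple arithmetic. The following is a minimal sketch using only the figures quoted above; the function names are illustrative, and it assumes every machine runs continuously at a constant rate, which real laboratories would not achieve.

```python
def genomes_per_machine_year(total_genomes: int, machines: int, years: float) -> float:
    """Implied throughput if the work is spread evenly across machines and years."""
    return total_genomes / (machines * years)


def years_to_finish(total_genomes: int, machines: int, rate: float) -> float:
    """Time needed for a fleet of machines at a given genomes/machine/year rate."""
    return total_genomes / (machines * rate)


# Figures quoted in the article.
TOTAL_GENOMES = 100_000
UK_MACHINES = 50
YEARS_QUOTED = 8

rate = genomes_per_machine_year(TOTAL_GENOMES, UK_MACHINES, YEARS_QUOTED)
print(f"Implied rate: {rate:.0f} genomes per machine per year")

# Hypothetical question: how many machines, at the same rate,
# would be needed to finish within the five-year target?
machines_needed = TOTAL_GENOMES / (rate * 5)
print(f"Machines needed for a five-year finish: {machines_needed:.0f}")
```

On these assumptions the existing UK machines fall well short of the five-year target, which is why new capacity and faster technology matter.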

The Department of Health (DH) has established a wholly owned company, Genomics England, to overcome these challenges and deliver on the Prime Minister’s commitment.

What is a genome?

Most of us know something about genetics, the study of the way particular features are inherited by children through genes contributed by their mother and father.

In healthcare terms, genetics is a medical speciality that deals with inherited disorders. These may be common and relate to changes in single genes, as in Cystic Fibrosis or Muscular Dystrophy. Or they may be more complex, arising from spontaneous changes in more than one gene, or sometimes in whole chromosomes, as in Down's Syndrome.

At present, the NHS diagnoses genetic conditions by testing for single changes, known as mutations, in a gene. NHS laboratories often sequence (decode) the four letters of DNA in a whole gene or several genes. Genetic testing is also proving vital in understanding the genetic changes in tumours and the origins of, and risks from, infectious organisms such as E. coli O104:H4, which caused a food poisoning outbreak in Germany in 2011.

By contrast, focus group research conducted for DH in 2012 showed the public's understanding of genomics is very limited. Some just said "I've never seen that word before". Our favourite question asked was: "Is it the economics of genetics?"

Actually genomics is the study of genomes; and a genome is all of the genes in a cell, all of the DNA that codes for a human (or mouse or elephant). Genomics relates to the way that all of the different combinations of genes interact to determine the characteristics of an individual.

Whole genome sequencing

There are a large number of research initiatives around the world aimed at sequencing partial or whole genomes. Many of these projects, such as the Wellcome Trust-funded UK10K Genomes project, have looked at sequencing different healthy populations worldwide, or focused on people who are over 100 years old. Others have focused on particular diseases, such as the International Cancer Genome Consortium. Most focus on the small part of the genome that codes for proteins, known as the exome. UK investment in genomic research this century, led by the Wellcome Trust, has put us at the forefront of genomics research internationally.

Sequencing a genome for research purposes, however, is a totally different prospect to sequencing for a clinical diagnosis. Genome sequencing is inherently prone to errors. It requires the DNA to be broken into thousands of pieces, probed through complex chemical reactions and then re-assembled like a giant jigsaw puzzle so it can be compared with a reference genome map to highlight the differences. During the sequencing process each DNA letter is assigned a value depending on how confident the software is that it is correct. For research purposes it is often sufficient for each letter to be tested (read) five times. However, for clinical use each letter has to be read many times, even up to 100 times to help diagnose tumours. This dramatically increases the cost and the data challenges.
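The article does not name the confidence scale assigned to each letter. One widely used convention is the Phred quality score, defined as Q = -10 log10(p), where p is the estimated probability that a base call is wrong. The sketch below assumes that convention; the function names are illustrative. It also shows how much more sequencing the quoted tumour read depth implies compared with the research depth.

```python
import math


def phred_from_error_prob(p: float) -> float:
    """Phred quality score for an estimated per-base error probability p."""
    return -10 * math.log10(p)


def error_prob_from_phred(q: float) -> float:
    """Inverse mapping: error probability for a Phred score q."""
    return 10 ** (-q / 10)


# Each step of 10 on the Phred scale is a tenfold drop in error probability.
for q in (10, 20, 30):
    print(f"Q{q}: about 1 error in {round(1 / error_prob_from_phred(q))} calls")

# Read depths quoted in the article: 5 reads per letter for research,
# up to 100 reads per letter for tumours.
research_depth = 5
tumour_depth = 100
print(f"Sequencing volume ratio: {tumour_depth / research_depth:.0f}x")
```

Reading each letter 100 times rather than five means roughly twenty times as much raw sequencing, which is where the extra cost and data volume come from.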


Sequencing whole genomes to clinical quality at this scale is not routine anywhere in the world. Although there are around four or five promising technologies, the current market leader is Illumina (a US company that uses technology developed in Cambridge, UK). There are approximately 50 of these machines in various UK laboratories, with a further 100 in the USA and 100 in China. The current quoted cost of a clinical whole genome is around $3,000 to $5,000, and although price reductions and new entrants to the UK market are already being catalysed by early pilots, Genomics England does not yet have the budget to complete the 100,000 Genomes project. We need sequencing to become more affordable, and further external investment, to develop firm delivery plans.
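To give a sense of scale, the quoted per-genome price can be multiplied out across the whole programme. This is a rough sketch covering sequencing only, not data storage or interpretation. The budget figure is purely hypothetical, since the article states no budget, and the function names are illustrative.

```python
def programme_cost(genomes: int, cost_per_genome: float) -> float:
    """Total sequencing cost at a flat price per genome."""
    return genomes * cost_per_genome


def max_price_for_budget(budget: float, genomes: int) -> float:
    """Highest per-genome price that a given budget could cover."""
    return budget / genomes


GENOMES = 100_000
LOW_PRICE, HIGH_PRICE = 3_000, 5_000  # quoted range, US dollars

low_total = programme_cost(GENOMES, LOW_PRICE)
high_total = programme_cost(GENOMES, HIGH_PRICE)
print(f"Sequencing cost: ${low_total / 1e6:.0f}m to ${high_total / 1e6:.0f}m")

# Hypothetical budget, chosen only to illustrate the calculation.
HYPOTHETICAL_BUDGET = 100_000_000
price = max_price_for_budget(HYPOTHETICAL_BUDGET, GENOMES)
print(f"Price needed to fit that budget: ${price:,.0f} per genome")
```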

The data challenge

The sequencing of a whole genome is relatively easy compared to the handling of the resulting data. Each 'raw' sequence is over two terabytes of information, more than would fit on 500 DVDs. With processing this can be reduced to less than 300 gigabytes, or around 64 DVDs. Once the variation from the reference genome is calculated this shrinks to around one gigabyte of information, mapping millions of differences between the patient's genome and the reference genome.
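The DVD comparisons can be checked with a short calculation. This sketch assumes a single-layer DVD holding 4.7 GB and decimal units (1 TB = 1,000 GB); the stage names and function name are illustrative.

```python
import math

DVD_GB = 4.7  # assumed capacity of a single-layer DVD, in decimal gigabytes


def dvds_needed(size_gb: float, dvd_gb: float = DVD_GB) -> int:
    """Whole number of discs needed to hold size_gb of data."""
    return math.ceil(size_gb / dvd_gb)


# Data sizes quoted in the article, in gigabytes.
stages = {
    "raw sequence": 2_000,
    "processed": 300,
    "variants only": 1,
}

for name, size_gb in stages.items():
    print(f"{name}: {size_gb} GB needs {dvds_needed(size_gb)} DVD(s)")
```

On these assumptions 300 GB needs 64 discs, matching the figure above. Exactly 2 TB would need 426 discs; 500 discs hold about 2.35 TB, which is consistent with a raw sequence of "over two terabytes".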

Once we have the information, the challenge is then to interpret the significance of the differences and arrive at a diagnosis of the patient's condition. The vast majority of these differences are harmless natural variations between individuals. Some of the differences will be obviously harmful – for example those recognised from other patients or from experiments in genetically modified laboratory animals such as mice, fish or fruit flies. But the vast majority are not fully understood and may only give clues about areas for further investigation or research.

Our lack of understanding of the majority of the expected differences is the main reason why the 100,000 Genomes initiative is so important. By combining patients' whole genomes with their anonymised medical records we can create a vital resource, open to carefully controlled access by researchers. These include academic researchers looking for basic understanding of genetic changes as well as companies researching new therapies or diagnostic tests. But perhaps most importantly, the data could provide an opportunity for novel data mining to develop new ways of visualising the patterns between different genes, genomes and clinical syndromes. The Prime Minister's vision is one where the UK is the leader in a new industry, developing personalised, or precision, medicine from genomic and health data to help patients.

Genomics England

DH civil servants have been active every step of the way to deliver this challenging initiative, leading a number of key work areas on the science, data and ethics, together with an analysis of the most appropriate delivery vehicle and assurance framework. DH established Genomics England (GeL) in June and appointed Sir John Chisholm as Executive Chair. He is Chair of NESTA and was formerly Chair of the Medical Research Council. GeL is responsible for procuring the sequencing capacity, the data architecture, and the necessary tools to securely store and interpret the 100,000 sequences and allow access for clinicians and researchers.

Sir John’s initial aim is to develop and launch the pilot phases of the programme that will deliver a large number of whole genome sequences. This will help to drive competition in the sequencing market, reducing prices, encouraging new facilities to locate to the UK, and helping identify solutions to the data challenges. Genomics England has already developed a partnership with the University of Cambridge, and started on the plan to sequence 10,000 rare disease patients. A collaboration with Cancer Research UK was announced in September 2013 to sequence 3,000 cancer patients. This will involve the sequencing of both the tumour genome and the patient's normal genome to identify the mutations that caused the cancer. In total Genomics England is committed to delivering 8,000 whole genome sequences by spring 2015. The scale and speed of these pilot phases exceed those of any other clinical whole genome initiative in the world.
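The idea behind sequencing both the tumour and the normal genome can be illustrated with a toy comparison: differences found in the tumour but not in the patient's normal genome are candidates for mutations acquired by the cancer. This is a deliberately simplified sketch, not the programme's method. The `Variant` type, the function name and every data value are invented for illustration, and real analyses rely on statistical methods that account for sequencing errors and read depth.

```python
from __future__ import annotations

from typing import NamedTuple


class Variant(NamedTuple):
    """One difference from the reference genome (illustrative structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumour_only_variants(tumour: set[Variant], normal: set[Variant]) -> set[Variant]:
    """Variants seen in the tumour genome but absent from the normal genome."""
    return tumour - normal


# Invented toy data, not real findings.
normal_genome = {
    Variant("chr1", 1_000, "A", "G"),
    Variant("chr2", 5_000, "C", "T"),
}
tumour_genome = normal_genome | {
    Variant("chr7", 42_000, "G", "A"),
}

for v in sorted(tumour_only_variants(tumour_genome, normal_genome)):
    print(f"{v.chrom}:{v.pos} {v.ref}>{v.alt}")
```

Variants shared by both genomes are inherited differences and drop out of the comparison, leaving only those specific to the tumour.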

Ethical challenge and public confidence

The development of genomics since the start of the Human Genome Project has involved a detailed debate around the ethical concerns, particularly around privacy, and fears about the misuse of data. Many of these have been addressed through policy initiatives such as making it a criminal offence to test DNA without consent, a moratorium by the insurance industry on accessing genetic test results and a recent updating by Dame Fiona Caldicott of the principles around patient data confidentiality.

Since the 100,000 Genomes project promises the integration of genomics research into the mainstream of the NHS there is a vital role for Genomics England and DH to play in ensuring public trust and confidence.

The Prime Minister's announcement said that I, Professor Dame Sally Davies, as Chief Medical Officer would be responsible for overseeing the interests of patients in matters of science, data security and ethics. Immediately after the announcement I established three rapid working parties which reported in March 2013. The Ethics Working Party, chaired by Professor Michael Parker from Oxford, concluded that an appropriate and rational approach to ethical issues would be vital to maintain public trust and confidence. This should build on, rather than duplicate, existing good practice drawn from other projects.

The report recommended five core principles to guide the programme:

  • The programme should bring benefit to current patients, future patients and to the NHS.
  • The findings should be available to patients in the NHS, and drive improved diagnosis or care within the NHS.
  • Data access should be subject to a transparent and accountable governance process and made in the public interest.
  • Consent by participants should be based on an understanding of the implications of participation for themselves and of this programme more broadly.
  • There should be a well-designed and comprehensive programme of public engagement.

These principles have been accepted. Genomics England's first phase relies on patients who are already recruited for clinical research, but the main sequencing programme will involve patients being referred by NHS clinicians. The Prime Minister's announcement emphasised that this project would require explicit patient consent and that all information would be handled in line with other NHS safeguards. Genomics England has established an Ethics Advisory Group, chaired by Professor Parker, which is developing the core policies that will be crucial to the successful delivery of the programme.

As well as the individual patients, the Government is keen to ensure that the wider aspirations of the public are included. Genomics England has already started holding public events and plans to work closely with established sources of expertise such as Sciencewise, the Wellcome Trust, and the Medical Research Council. One of the key challenges is to try to demystify genetic testing, genome sequencing and other diagnostic testing or screening. But perhaps the main challenge is to reassure patients about how their data will be protected, and to build trust that those accessing the data are vital to understanding, and deriving benefit from, the complex information in human genome sequences. This will need to address the evident concerns of some patients about access for the development of commercial products such as medicines or diagnostics.

A history of advancements

The UK has played a central role in the development and application of the life sciences, from the identification of the structure of DNA in 1953, through the discovery of methods for sequencing DNA in 1977 (by Fred Sanger, who died recently), to the mapping and sequencing of the human genome in 2001.

The Prime Minister's 100,000 Genomes initiative is part of a much wider programme to build on this history of achievements and our current strengths in academia, the infrastructure of the NHS, and our partnerships in the life sciences industry. As the 100,000 Genomes project progresses, we intend the UK to stay in the lead in this new field of the life sciences.


Comments


  1. Comment by Ringo posted on

    I for one will not be signing up to this scheme & I will be recommending to others not to.

    It is a highly dangerous scheme that will allow our medical data to be shared by companies, never a good thing, that are non-UK resident, giving the private sector access to medical records (they say anonymised but I think we're all aware of past data security breaches in the private sector (loss of bank details, etc)).

    • Replies to Ringo

      Comment by chrisbarrett posted on

      A reply from Mark Bale, one of the authors:
      "It is helpful to hear concerns such as these so that the consent and wider communications can address them. Genomics England will operate under strict consent arrangements and will only involve patients with cancer or rare diseases. Genomics England is also going to adopt the strict safeguards for all NHS medical data. Any users of the data service will be barred from extracting patient identifiable data."

  2. Comment by Aj posted on

    300GB Cannot fit on to five DVD's a standard DVD is 4.7GB.

    Please change text in the paragraph "The Data Challenge"

  3. Comment by Jeannette posted on

    I was glad to take part in a couple of clinical trials when I was asked through the NHS, using my DNA to research into hereditary cancer for the benefit of future generations. I can see no problem with this and am waiting for someone to explain to me what on earth they think is going to happen to this information that will have any effect on my security?

  4. Comment by JJ posted on

    Aj - 21/03/2014 is right - the information in the data challenge section is wrong & should be corrected. This is disappointing as it lowers the reader's confidence level and one is left thinking - are there other inaccuracies in the article?

    • Replies to JJ

      Comment by chrisbarrett posted on

      JJ and Aj, thanks for pointing out the error — it was the editorial team's fault, not the authors' fault. Now corrected.

      Mark Bale, one of the authors, adds the following: "The analogy is a rough one anyway because using DVDs is neither realistic or cost effective. The actual figure is between 21-64 DVDs and in practice the pilot phase is using commercially available 1Tb USB hard drives with strong encryption."