Growing a government data science community

The first small group to bring data science skills into government was formed with a clear imperative: to build a community of practical knowledge within government so that it had sufficient skills to begin a process of transformational change.

There was also a recognition that, with this bedrock of skills, government could engage with private data science providers from a position of strength.

From 2011, there had been a revolution in digital and technology in UK central government. This saw more agile digital expertise come back in house, bringing an end to a period where skills and strategy were outsourced to a small number of systems integrators. During this period, government’s role was often relegated to little more than contract management, mopping up after too many expensive and protracted failures.

To capture the best of the new data agenda and fit it within digital services – without losing the ability to know what and when to buy from the market – government had to develop its own community of data science expertise. But there were two big challenges: the first was that the number of known data scientists in government could be counted on the fingers of one hand, and recruitment of these sought-after individuals would be difficult and expensive; the second was a lack of awareness – and even some defensiveness – among existing analytical professions.

This is the story of how a joint project team overcame these hurdles, developed a community in government of more than 350 individuals with a data science capability, and started to put this capability to use to drive value for citizens.

[Image: Data scientists at GDS]

Kicking things off

The UK was an early world leader in open data. It released non-personal data sets collected by government in machine-readable formats, for no cost, and with licence terms that permitted anyone to use (or even sell) the data as they saw fit. This activity has driven real-world applications – from how we find information about the next train or bus, to how businesses manage due diligence.

Those who witnessed this rapid change at first hand were struck by the different tools and approaches external data companies and civic society activists brought to government data, compared to those used by the government’s own sizeable (and excellent) community of statisticians, economists, social and operational researchers. They knew that in our personal lives, data-driven digital companies had already utterly transformed how we shop and socialise.

The Cabinet Office and the Government Office for Science (GO Science) horizon-scanning team had also picked up on this technological shift and government’s lack of capability in the area. With the Cabinet Secretary’s support, they set up a small project team to explore the potential of this new approach to data within government. To get things rolling, the Cabinet Office Innovation Group, the Government Digital Service (GDS) and GO Science combined under the leadership of the Economic and Domestic Secretariat (EDS) Director-General, with guidance from the operational research and government statistical professions.

Initial strategy

The initial strategy had four parts, which have stood the test of time over the subsequent three years:

  • to ‘show not tell’, by doing some practical demonstration projects, as opposed to writing strategy papers to explain in the abstract what the project might mean;
  • to find and grow data science capability, and to broaden the understanding of what value these new tools can provide;
  • to overcome practical and technical barriers; for example, common data science tools such as the programming languages ‘R’ and ‘Python’ not being accessible from many government computer systems;
  • to ground this work in an ethical approach that, from the start, aimed to consider what we should do with these potentially powerful tools, not just what we could do with them.

Data science work soon became a critical component of the cross-government data programme led from GDS. That programme grew out of an understanding that to maximise the reform potential of this agenda it would not be enough merely to increase data science capacity. Instead, this capability had to be grounded in the new digital services being built across government.

In addition, government’s often woeful data infrastructure had to be fixed. This meant providing trustworthy sources of core reference data, such as lists of countries, local authorities and schools, through open registers. It also meant building a scalable system for appropriate personal data exchange through APIs (application programming interfaces), which allow datasets to be queried across departmental boundaries rather than shared in bulk. This, in turn, would require an updated policy and legislative framework, not only to remove unnecessary friction (through data access provisions in the Digital Economy Bill, for example), but also to put in place new rules and procedures, for instance, on the ethical application of these new tools.

Building a community

On the capability side, the original strategy combined hiring a small number of relatively junior data scientists from outside government with developing digital skills and culture within the existing government analytical community. The aim was always to build local hubs of expertise across government, rather than trying to form a single central team in GDS or elsewhere.

This community would be built around the nucleus of the small handful of existing data scientists within GDS and the Office for National Statistics, and others with the necessary skills elsewhere in government. In some cases they were easy to find; for example, through a few calls to colleagues in the intelligence communities, and by connecting with the world-leading expertise in GCHQ and the Defence Science and Technology Laboratory. However, pockets of expertise also existed in unexpected areas. There were some truly impressive individuals and work going on in places such as the Health and Safety Executive Labs in Buxton, Derbyshire, and Bootle, on Merseyside.

These early enquiries revealed a degree of untapped interest. However, given the aim of bringing analysis out of the shadows and putting it centre-stage in government, there was also a surprisingly cautious response in pockets of the established analytical community. While many clearly relished the opportunity to update their skills, there were some who dismissed the new data science agenda as ‘trying to pretend it invented maths’ and claimed data science had been practised in government since the time of the experimental physicist Patrick Blackett and the amazing innovations in operational research during and since World War II.

Spreading the word

The first group of around 30 data scientists was assembled in February 2014 to discuss the emerging data science programme and to provide ideas for demonstration projects that would show the potential of data science to a wider network of senior leaders in government.

A plethora of ideas and experiments circulated in this early phase of our work, with some noble and necessary failures (Cabinet Office ran some lunchtime coding clubs for both analysts and non-analysts that proved quite popular but not especially useful). However, in building the now thriving community of data scientists, three types of activity proved particularly telling.

First, spreading the word about our data science ambitions by presenting to as many public service organisations and professional boards as possible, and blogging about early prototypes and intentions. This included a number of senior seminars in the main departments, with Permanent Secretaries and their analytical, digital/technology and policy leaders, chaired by Sir Mark Walport, the Government Chief Scientific Adviser. A surprise in some of these sessions was that the senior analysts were unaware of the lack of connections between their digital and technology counterparts and their work.

Wider communications channels and presentations were also effective in spreading the word about data. These included sessions at Civil Service Live, conferences of the different professions (policy as well as the formal analytical professions), and numerous external events.

With the Data in government blog, the aim was to foster connections between existing talent and to focus the attention of senior leadership on the data agenda. Many departments produced data science action plans in these early days. These plans were a useful lever to concentrate minds on the projects that might best add value, bring together various communities within departments, and provide space and permission for experimentation, though this would rarely produce significant value at the start.

The second area that helped create the data science community was the Data Science Accelerator, a programme for ambitious government analysts wanting to grow their budding data science skills. The accelerator linked them up with data science mentors across government. In the early days, it also provided them with a working MacBook Pro, free of the sometimes overly enthusiastic government IT restrictions that put everyday data science tools out of their reach. In return, the government analysts brought a practical project and sufficient time to devote to their learning, including spending a day together as a community every week in GDS’s office in Holborn, London, and, as the programme developed, in three hubs around the country, in London, Sheffield and Newport. The Data Science Accelerator was designed as a minimum viable training programme, which was spun up quickly and iterated rapidly, rather than as more strategic, longer-term training for existing analysts. That more routine training lies squarely in the remit of the existing professional communities, and it is now also really beginning to show results.

The third activity that made a difference in developing the government data science community was the efforts made to connect this community in practical ways, from the bottom up. That meant regular ‘meet-ups’, gatherings where data scientists across the public sector could show early progress in their projects and exchange ideas and practice. They were also opportunities for guest speakers from outside government to inspire and show how these same tools were being used in the private sector.

The future...

From these early efforts to build a community, a wide range of opportunities to share and connect has grown. These include a dedicated messaging app, where code and frustrations can be shared, and an assortment of data drinks, lunches and dedicated community groups within departments.

Along the way, data scientists kept popping up. They were not always formally described as such, and their unique talents were not always being put to the best use – asking data scientists to grind through regular statistical releases from a department is an all too common waste of their time and talent. However, the number of data scientists in government continues to grow as a result of hiring, training and uncovering previously unknown talent. There are now more than 350 identified individuals with these skills in the public sector.

The development of this community, and the work it is producing, is a great Civil Service success story. It shows how quickly government can adapt, even without the relentless market pressures of the private sector.

There are now tens of data science case studies – a mixture of prototypes (some successful, some less so) and increasingly serious value-adding propositions. The UK Government is one of the best in this field, and significantly ahead of many larger organisations in the corporate sector, though it still lags behind some of the groundbreaking digital-first companies.

The government data science community has emerged from its early development phase. It stands ready to build its insights into serious transformational digital services, and to use these new techniques to complement the traditional analytical advice to boards, programmes and ministers, in increasingly sophisticated and visual ways. It can be increasingly confident of its ability to partner with the UK’s amazing private sector data science companies where this is useful.
