https://quarterly.blog.gov.uk/2017/01/30/top-10-data-science-terms-and-what-they-mean/

Top 10 data science terms – and what they mean

Categories: Analysis and factual trends

Algorithm

At its simplest, an algorithm is a mathematical operation (or collection of operations) that is applied to some data and gives an output – for example, the sum of two numbers. A complex algorithm might combine several machine learning models (see 'Machine learning', below), but it still fundamentally takes input data and returns some form of output.
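
As a rough illustration of the "input in, output out" idea, here is a minimal Python sketch. The function names `add` and `mean` are made up for this example; the second shows how a slightly larger algorithm can be built from simpler operations.

```python
def add(a, b):
    """A very simple algorithm: take two numbers as input, return their sum."""
    return a + b


def mean(values):
    """A slightly larger algorithm built from simpler steps: sum, count, divide."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)


print(add(2, 3))        # 5
print(mean([2, 4, 6]))  # 4.0
```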

Big data

Notoriously difficult to define precisely, ‘big data’ is usually characterised as combining three elements: variety, volume and velocity. A good working definition is: data that is so complex, large or rapid that it cannot be processed or analysed on a single computer alone.

Data mining

This describes the process of deriving insight from data by using computers. The term has now been largely superseded by ‘data science’, which refers to a set of skills and tools for exploring data to – for example – identify trends in behaviour or predict the likelihood of various outcomes.

Deep learning

A type of neural network (see below) built from many layers, which makes it capable of identifying and learning complex patterns from very large amounts of data. For this reason, it is widely used in image-recognition systems.

Machine learning

Describes algorithms that are able to learn patterns from data, and use these to make predictions when presented with new data. Traditionally, machine learning has been divided into two branches: ‘supervised’ and ‘unsupervised’ learning (see below).
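
One small way to picture "learn a pattern, then predict on new data" is fitting a straight line to a handful of points. The sketch below uses ordinary least squares as a stand-in for machine learning in general; the data (hours studied against test score) and the function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from example (x, y) pairs by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned pattern to a value the model has not seen before."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: hours studied and the resulting test score.
hours = [1, 2, 3, 4]
scores = [52, 54, 56, 58]

model = fit_line(hours, scores)
print(predict(model, 5))  # 60.0
```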

(Artificial) Neural network

A machine-learning algorithm loosely modelled on the structure of the brain, and most commonly trained by supervised learning (see below). Although the first neural networks were developed in the mid-20th century, they have only gained prominence in recent years, as computing resources have become cheaper and more widely available.
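
To give a feel for the building block, the sketch below trains a single artificial neuron (a perceptron) to reproduce the logical OR of two inputs. A real neural network connects many such units in layers and is trained differently; the function names, learning rate and number of passes here are assumptions chosen to keep the example small.

```python
def step(z):
    """Activation function: the neuron 'fires' (1) only if its input is positive."""
    return 1 if z > 0 else 0


def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(examples, epochs=10, learning_rate=1):
    """Perceptron rule: nudge the weights whenever the neuron gets an example wrong."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - neuron_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Labelled examples for logical OR: (inputs, expected output).
or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_neuron(or_examples)
print([neuron_output(weights, bias, inputs) for inputs, _ in or_examples])  # [0, 1, 1, 1]
```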

Statistics

A branch of mathematics concerned with describing or predicting patterns in data in the presence of uncertainty. The boundaries between statistics and machine learning are somewhat blurred. Certain algorithms are claimed by both fields, while others (for instance, neural networks) fall more squarely under machine learning.
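
As a small illustration of describing data while acknowledging uncertainty, the sketch below uses Python's built-in `statistics` module to summarise a made-up sample and estimate how precisely its mean is known. The standard error shown is one common measure of that uncertainty, not the only one.

```python
import math
import statistics

# Made-up sample: five measurements of the same quantity.
sample = [4, 8, 6, 5, 7]

sample_mean = statistics.mean(sample)    # describes the centre of the data
sample_stdev = statistics.stdev(sample)  # describes the spread (sample standard deviation)

# Standard error of the mean: how uncertain the mean is, given the sample size.
standard_error = sample_stdev / math.sqrt(len(sample))

print(sample_mean)               # 6
print(round(standard_error, 3))  # 0.707
```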

Supervised learning

Machine-learning algorithms that must be told in advance what patterns to look for, usually by training them on examples that have already been labelled with the correct answer. These algorithms can be extremely powerful for a wide range of tasks, including handwriting recognition, speech recognition and image classification.
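
A minimal sketch of the idea, assuming a nearest-neighbour classifier (one of the simplest supervised methods): each training example comes with a label, and a new point is given the label of the closest example. The data points and labels below are invented for illustration.

```python
import math


def nearest_neighbour(labelled_examples, new_point):
    """Return the label of the training example closest to new_point."""
    _, closest_label = min(
        labelled_examples,
        key=lambda example: math.dist(example[0], new_point),
    )
    return closest_label


# Made-up labelled training data: (features, label).
training_data = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((5.0, 5.2), "large"),
    ((4.8, 5.1), "large"),
]

print(nearest_neighbour(training_data, (1.1, 0.9)))  # small
print(nearest_neighbour(training_data, (5.1, 5.0)))  # large
```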

Unsupervised learning

Machine-learning algorithms that can look for patterns in data without advance knowledge of what patterns may exist in that data. ‘Clustering algorithms’ are one example of unsupervised learning.
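
Clustering can be sketched in a few lines. The example below is a stripped-down version of the k-means idea, fixed at two clusters and one-dimensional data to keep it short. No labels are supplied; the grouping comes from the values alone. The function name, the starting centres and the data are assumptions made for this illustration.

```python
def two_means(values, iterations=10):
    """Split numbers into two clusters by repeatedly assigning each value
    to its nearest centre and then moving each centre to its cluster's mean."""
    centres = [min(values), max(values)]  # simple deterministic starting guess
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return clusters


print(two_means([1, 2, 3, 10, 11, 12]))  # [[1, 2, 3], [10, 11, 12]]
```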

Web scraping

Automatically extracting data from websites by developing software that makes multiple queries of a website or web service, much more quickly than a human would be able to.
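
The extraction step can be sketched with Python's built-in `html.parser` module. To keep the example self-contained it parses a made-up page held in a string rather than fetching a live site; a real scraper would first download each page (for example with `urllib.request`) and should respect the site's terms of use. The class name, function name and sample page are hypothetical.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Made-up page content standing in for a downloaded web page.
SAMPLE_PAGE = """
<html><body>
<a href="/reports/2016">2016 report</a>
<a href="/reports/2017">2017 report</a>
<p>No link here</p>
</body></html>
"""

print(extract_links(SAMPLE_PAGE))  # ['/reports/2016', '/reports/2017']
```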

