07 April 2025
New tool analyses huge amounts of data at record speeds
The algorithm can spot trends in hundreds of millions of data points in less than 20 minutes.

A new tool that can analyse hundreds of millions of data points to discover trends far faster and more efficiently than current algorithms has been developed by computer scientists at King’s College London. The researchers believe it could be used across multiple applications, including health data, where it accurately identified different sets of patients based on their heart readings.
Developed to find trends in time-series data at a rate of millions of data points per second, the open-source tool could be used in medicine, finance, transport and energy, among other domains, the researchers say.
It processes the data far faster and more efficiently than current algorithms. When applied to more than 300 million data points of underwater microphone signals, the algorithm took around 18 minutes to identify patterns, each modelling a distinct trend shared by the signals of whales, dolphins and porpoises. The existing algorithm did not even manage to get through the data in 24 hours.
When tested on 20 million points, the new algorithm used less than 100 gigabytes of memory, 80% less than another algorithm currently in use.
How it works
The completely novel design comprises three components: a data structure – a kind of index in which data is stored and easily retrieved – and two algorithms that use the index to discover two different types of trend, known as maximal and closed order-preserving frequent patterns.
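As a rough illustration of the idea (not the team’s implementation), an order-preserving pattern captures the relative order of values in a subsequence rather than the values themselves, so two windows with the same up-and-down shape match even if their magnitudes differ. The Python sketch below, with illustrative names, shows one simple way to encode that shape as a rank signature.

```python
# Minimal sketch (illustrative, not the published implementation): encode the
# "shape" of a window of time-series values as the relative order of its
# entries, so windows with the same trend match regardless of magnitude.

def rank_signature(window):
    """Return the relative-order pattern of a window, e.g. [3.1, 5.0, 2.2] -> (2, 3, 1)."""
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return tuple(ranks)

# Two windows with different values but the same up-and-down trend share a signature.
assert rank_signature([3.1, 5.0, 2.2]) == rank_signature([10.0, 42.0, 7.5])  # both (2, 3, 1)
```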
Dr Grigorios Loukides, who headed up the project, said: “Current algorithms only read input data rather than using a data structure, making them impractical on large data sets. Because our tool is built with a data structure, our algorithms can operate faster and they need much less memory to run.”
Ms Ling Li, who worked on the project as part of her PhD study, commented: “Our method also has the benefit of being able to reuse the data structure to mine many different patterns within a data set, using different values to discover different trends.”
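The sketch below illustrates the kind of reuse Ms Li describes, under the assumption of a simple occurrence index rather than the team’s actual data structure: the index is built once and then queried with different minimum-support values to surface different trends.

```python
# Minimal sketch (an assumption about the general approach, not the team's
# data structure): build an index of shape occurrences once, then reuse it to
# find frequent order-preserving shapes under different support thresholds.
from collections import defaultdict

def rank_signature(window):
    """Relative-order pattern of a window, as in the earlier sketch."""
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return tuple(ranks)

def build_shape_index(series, window_len):
    """Map each rank signature to the start positions where it occurs in the series."""
    index = defaultdict(list)
    for start in range(len(series) - window_len + 1):
        index[rank_signature(series[start:start + window_len])].append(start)
    return index

def frequent_shapes(index, min_support):
    """Query the same index with a chosen minimum-support value."""
    return {shape: positions for shape, positions in index.items()
            if len(positions) >= min_support}

# Build the index once, then query it with different values to surface different trends.
series = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0, 2.5, 1.0, 0.5, 1.5]
index = build_shape_index(series, window_len=3)
common_trends = frequent_shapes(index, min_support=3)
rarer_trends = frequent_shapes(index, min_support=2)
```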
How the tool can be used
Data in many areas – including medicine, finance and sensor networks, whether GPS receivers, cameras or any other device that creates data – contain measurements over time. Spotting patterns in such data is vital to understanding it and making accurate inferences and projections, such as identifying different sets of patients from their heartbeats, co-movements of stock prices in finance data, or energy consumption profiles for households.
So far, the tool has been tested on medical data, including ECG heart signals, as part of a study aiming to explore exercise-induced pain in healthy individuals. It found two coherent and well-separated groups in the data, identifying that the participants who did more exercise were those who experienced more pain.
Having now tested it as part of a medical research study, the researchers plan to roll it out across other areas of medical data, including data on sleep, to further test its capabilities. They also aim to enhance the tool so that it takes into account contextual information related to the data, which could greatly improve pattern explainability, an important consideration in bioinformatics and natural language processing applications.