Big data has been around for a while. But the discussions around big data miss the fact that small data can play a big role. Years ago, when data scientists showed us how to understand our world through population change or literacy, they were doing it with small data. Experts say that while specialised business analysts have been able to exploit big data at a macro level, it has failed to provide individual workers with the insights they need to act daily.
In the last few years, there’s been a movement of mass democratisation of the means of access, storage and processing of data. It’s about more people than ever being able to collaborate effectively around a distributed ecosystem of information — an ecosystem of small data.
In the book Uplifting Leadership: How Organizations, Teams, and Communities Raise Performance, authors Andy Hargreaves, Alan Boyle, and Alma Harris explore the exponential success story of Seven-Eleven Japan. When Seven-Eleven Japan was born in 1973, the first CEO decided that the key to profitability for the company’s stores would be rapid inventory turnover. So he placed responsibility for ordering in the hands of the stores’ 200,000 mostly part-time sales clerks. He believed they could make the best decisions about what would sell quickly.
To support the sales clerks’ decision making, he sent each store daily sales reports and supplemental information such as weather forecasts. The reports detailed what had sold the previous day, the weather, and what was selling in other stores. The CEO arranged for deliveries three times a day so that the clerks could base their orders on immediate needs. He also connected the clerks with suppliers to encourage the development of items that would suit local customers’ tastes. As a result, Seven-Eleven has been one of the most profitable retailers in Japan.
This is a story about a lot of small data, about the ability to use data to make effective decisions, and, of course, empowering employees with the data they need to help them make better operating decisions on a daily basis.
Size in itself doesn’t matter. What matters is having the data, of whatever size, that helps to solve a problem or address a question. Small data is a term that describes data sets with fewer than 1,000 rows or columns. It was coined in 2011 by researchers at IBM to describe datasets that are too small for traditional statistical methods. Working with small data requires less data but still offers useful insights.
Big data is characterised by the volume of data, the variety of types of data and the velocity at which it is processed, all of which combine to make it difficult to manage. Small data, in contrast, consists of usable chunks: for example, marketing surveys of new customer segments, meeting minutes and spreadsheets.
Social channels are rich with small data that is ready to be collected to inform marketing and buyer decisions. We create this small data constantly, each time we check in, search, browse or post, leaving a unique signature that provides a glimpse into our digital health.
Although it has taken a long time for the message that small data is useful to trickle down, Gartner has now identified it as one of the top 10 trends in the data and analytics space for this year. It predicts that by 2025, 70 per cent of organisations will shift their focus from big to small and wide data, providing more context for analytics and making artificial intelligence (AI) less data hungry.
According to Gartner, data and analytics leaders need to turn to new analytics techniques known as small data and wide data. Wide data enables the analysis and synergy of a variety of small and large, unstructured and structured data sources. “Taken together they are capable of using available data more effectively, either by reducing the required volume or by extracting more value from unstructured, diverse data sources,” said Jim Hare, research vice president at Gartner.
Both approaches facilitate robust analytics and AI, enabling a richer, more complete situational awareness. Applying both techniques can help organisations address challenges such as low availability of training data, or develop more robust models by using a wider variety of data.
For many problems and questions, small data is enough. The data on household energy use, the times of local trains, and customer spending are all small data.
At its core, the idea of small data is that businesses can get actionable results without big data analytics. Slowly but surely, businesses are using small data to draw back from an obsession with the latest technologies that support ever more sophisticated business processes. Those promoting small data contend that it’s important for businesses to avoid overspending on certain types of technologies.
And when businesses want to scale up, they can do so by creating and integrating small data packages, not through creating enormous centralised silos.
According to Hare, organisations should explore small and wide data approaches to lower business barriers to entry for advanced analytics and AI caused by a real or perceived lack of data, rather than overly relying on “data-hungry” deep learning approaches.
Many of the most valuable data sets in organisations are quite small, yet those same organisations have “data-hungry” AI initiatives underway. A 12-week experiment with medical coders, described in Harvard Business Review, demonstrated that emerging AI tools and techniques, when applied with attention to human factors, are now opening possibilities to train AI with small data and transform processes.
One way of making use of small data is to extend the toolbox of organisations’ data and analytics teams with techniques that provide a richer context for more accurate business decision-making, leveraging the growing availability of external data sources through data sharing and marketplaces.
Why is small data important? There are a number of reasons.
Oftentimes, small data answers core strategic questions about your business that should drive the more advanced analytics. Also, most organisations don’t yet have what can be categorised as big data, but all organisations have some data from which they can gather insight. Mastering small data is also a critical step in the journey toward overall data management excellence within an organisation.
What is more, collecting large volumes of historical or labelled data for analytics and AI is a challenge for many organisations. With small data, analytics and AI will be able to work with more recent and less voluminous data.
Data sourcing, data quality, bias and privacy protection are common challenges. But even if big data is available, the costs, time and energy to implement conventional ML can be prohibitive. In addition, decision-making by humans and AI has become more complex and demanding, requiring a greater variety of data for better situational awareness.
This means that there’s a growing need for analytical techniques that can leverage available data more effectively, either by reducing the required volume or by extracting more value from unstructured, diverse data sources. Here as well, small data works.
It’s actually a cheap and powerful way of taking advantage of all the little data an organisation accumulates.
Humans can read and understand small data without the need to use machines, although machine learning can be applied to better understand small data and identify patterns that are difficult to identify and quantify manually. Oftentimes, those insights are more informative than analysis of big data, whose results are sometimes more difficult to translate into actions. And this is particularly useful for companies that struggle with using ML and AI.
We have seen the potential of small data to streamline our shopping, power our fitness routines, or deliver recommendations about the best price for our flights. With more smart, wearable, data-driven devices on the way, small data promises even more potential. Small data is about the end-user, what they need, and how they can take action.
Potential areas where small and wide data can be used are demand forecasting in retail, real-time behavioural and emotional intelligence in customer service applied to hyper-personalisation, and customer experience improvement.
Other areas include physical security or fraud detection and adaptive autonomous systems, such as robots, which constantly learn by the analysis of correlations in time and space of events in different sensory channels.
There’s no denying that this decade belongs to distributed models, not centralised ones; to collaboration, not control; and to small data, not big data.