When a grocer runs out of an item, it raises a number of questions. Why was there a shortfall in the first place? How many items would the retailer have sold had it been fully stocked? And how much has this cost the retailer in lost sales? The best minds in the industry have been scratching their heads about this problem for years. Now computer brains and big data are being harnessed to crack the conundrum - and the charge to harness that power is being led by people like Duncan Apthorp.

Having established a career in automotive sport, Apthorp made the unlikely move to Tesco to head up its supply chain development programme in 2009. His mission was to crunch through four years of sales data and combine it with promotional and external data, including historic weather records, to create more accurate forecasts of demand and better understand the flow of products from supplier to depot to store.

The resulting IT and data science programme relied on the data modelling tool Matlab from software provider MathWorks and created around £100m in savings annually across the business. Simulating performance of its distribution depots saved about £50m through reduced stock levels. The group also used the Matlab tool to ‘regression test’ historical information to understand correlations between weather data and sales, saving £6m each year through more accurate stock ordering.

Similar techniques were applied to special offers, cutting out-of-stocks by 30% on these items. The team also worked to understand the effects of discounting of end-of-life stock. Computer algorithms applied to this area have saved £30m in wasted stock.

Since 2009, the project team has grown from five to 50. It is now supported by a 100-terabyte database and is processing new sources of data, including social media, in an evolving mission to improve forecast accuracy. So how do these algorithms work?

Jos Martin, principal architect for parallel computing tools at MathWorks, says its regression and simulation software is more commonly used in engineering - it was used to design the mammoth Airbus A380 - but it is now attracting more attention from other sectors.

“The general idea of much of this sort of analysis is to take a bunch of data and look for correlations: temperature and sales of barbecue food, for example.”

Using software to measure the degree of correlation can tell a retailer the effects certain inputs - weather forecasts, for example - might have on sales of particular items. However, many different factors can influence sales of a particular item, each with a different degree of correlation. Weather influences the purchasing of barbecue food, and so does a bank holiday or an England World Cup match.
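As a rough illustration of what "measuring the degree of correlation" can mean, the sketch below computes the Pearson correlation coefficient between sales and two candidate inputs. It is a minimal example, not the Matlab tooling Tesco used; the function name and all the figures are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative figures, not Tesco data.
temperature_c = [14, 16, 19, 21, 24, 26, 18]
burger_packs = [110, 125, 160, 190, 260, 300, 150]
bank_holiday = [0, 0, 0, 1, 0, 1, 0]  # 1 = bank holiday weekend

print(round(pearson(temperature_c, burger_packs), 2))
print(round(pearson(bank_holiday, burger_packs), 2))
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests little or none. A single coefficient says nothing about how several inputs interact, which is why forecasters go on to build models.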

Optimise models

To untangle the effects of these various correlations to produce forecasts of sales of a particular item on a particular day, data scientists build computer models.

“What is difficult is, you don’t know what the right set of inputs are to generate a model that will give you the right outputs. So we build many models (based on as many inputs as possible) and compare them to see what they will do. You take the features of ones that do well, and optimise a set of models around them until you end up with one that works best,” says Martin.

These prototype models are tested on large sets of historic data, which are divided into a training set and a validation set. Models are fitted to the training set and then produce forecasts for the validation set, so data scientists can see whether their predictions came true. The best models can then be used to forecast the stock needed in the following days or weeks. More accurate forecasts mean less working capital tied up in unnecessary inventory and fewer out-of-stocks, which ultimately lead to disappointed customers and lost sales.
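The build-many-models-and-compare approach can be sketched in a few lines. The example below fits one simple straight-line model per candidate input on the earlier part of the history, scores each on the later part it has not seen, and keeps the one with the lowest error. It is a deliberately tiny illustration: the function names, the single-input models and the data are all assumptions, and real systems compare far richer models.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return mean_y - b * mean_x, b

def mean_abs_error(model, xs, ys):
    """Average absolute gap between the model's forecasts and actual sales."""
    a, b = model
    return sum(abs(a + b * x - y) for x, y in zip(xs, ys)) / len(ys)

def pick_best_input(history, target, split):
    """Fit one model per candidate input on rows before `split`,
    score each on rows from `split` onwards, return (name, error)."""
    scores = {}
    for name, values in history.items():
        model = fit_line(values[:split], target[:split])            # training set
        scores[name] = mean_abs_error(model, values[split:], target[split:])  # validation set
    best = min(scores, key=scores.get)
    return best, scores[best]

# Made-up daily history (oldest first); not real retail data.
history = {
    "temperature_c": [14, 16, 19, 21, 24, 26, 18, 22, 25, 17],
    "day_of_month": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}
sales = [110, 125, 160, 190, 260, 300, 150, 210, 280, 140]

print(pick_best_input(history, sales, split=7))
```

The split here is chronological, so each model is judged only on days that come after the ones it learned from, which mirrors how it would be used in practice.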

In Tesco’s case, the algorithms are plugged into its supply chain mainframe, an IBM System z, which places orders for thousands of suppliers, worth around £100m every day. Tiny improvements in forecast accuracy can therefore have a big financial impact.

However, Martin says users of these techniques need to be wary. While correlations with the physical world remain true over time, in business, changing culture, competition or economics can mean predictive models start to fail. “My advice is don’t just test on historic data, always test on current data to ensure the model is correct,” he says.
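One simple way to act on that advice is to keep scoring the model on the latest actual sales and flag it when its error drifts well above what it achieved on historic data. The sketch below is an assumption about how such a check might look; the function names, the 1.5x tolerance and the figures are illustrative only.

```python
def mean_abs_pct_error(forecasts, actuals):
    """Mean absolute percentage error, skipping days with zero sales."""
    pairs = [(f, a) for f, a in zip(forecasts, actuals) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actuals to score against")
    return sum(abs(f - a) / abs(a) for f, a in pairs) / len(pairs)

def has_drifted(historic_error, recent_forecasts, recent_actuals, tolerance=1.5):
    """True if error on recent data exceeds the historic error by `tolerance` times."""
    recent_error = mean_abs_pct_error(recent_forecasts, recent_actuals)
    return recent_error > tolerance * historic_error

# Hypothetical case: the model scored 8% error on historic data,
# but has over-forecast badly over the last three days.
print(has_drifted(0.08, [200, 210, 190], [150, 160, 140]))
```

A flag like this does not say why the model has stopped working, only that it is time to look at it again.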

Park Cakes sees benefit of visual analytics

There’s a lot of noise in the IT industry around big data, but the reality is that much of the data available in the grocery supply chain goes unused, claims consultancy and software provider Atheon Analytics.

Half of Sainsbury’s suppliers do not even know the retailer publishes data to help with product performance management, the company contends, citing a survey of more than 100 suppliers.

The problem is that the supplier data on Sainsbury’s Horizon system is in a very raw form and difficult for suppliers to work with, it says. To address this, it has created a service called SKUview, which allows suppliers to see changing patterns in product performance via online analytics tools.

Park Cakes, a £160m turnover supplier of own-label cakes to Sainsbury’s and other outlets, uses these cloud-based tools from Atheon to analyse the Sainsbury’s data, seeing patterns in historic demand and live sales data that might only be 12 hours old.

“One of our big problems was site waste,” says supply chain manager Julie Kenyon. “Controlling working capital and keeping waste to a minimum is difficult. Now we can see, if you’re doing a promotion, what trend it is following. Retailers’ waste has come down and we have been able to plan with less change and give customers maximum date codes.”

Although this is not data modelling on the scale achieved by Tesco, Atheon Analytics managing director Guy Cuthbert says by opening these tools up to the supply base on a pay-as-you-go basis, retailers will achieve greater accuracy in their orders by having more eyes on the problem.

Supplier help

To make sure that their models are as robust as possible, more companies outside Matlab’s traditional customer base are recruiting data scientists to bring some degree of analytics to their forecasting. “This is becoming more mainstream,” according to Martin.

Among them are suppliers to the multiples. Although they may look at a much smaller range of products than their retail clients, their forecasts need to be just as accurate.

Take the example of Natsu. It supplies sushi and other fresh snacks to 2,500 supermarkets in Germany and elsewhere in mainland Europe. Its ‘full service’ offering means retailers only pay for what they sell and the supplier collects the waste. As a result, accurate demand forecasting has a significant impact on profits and also helps the company to avoid some of the unwanted environmental impacts.

To help get on top of this equation, the supplier opted for analytics and modelling technology from Blue Yonder, which the company’s founder developed while working with large data sets at CERN, the European subatomic particle research facility.

Working in conjunction with Blue Yonder, Natsu collected retail PoS data from the last three years. It then linked this to historic calendars, showing school and bank holidays, trade fairs, pay weeks and special events in the respective city. It also included data for the exact location of placement within a store and weather data.

The system records when a product goes out of stock, and gathers comparative data from hundreds of sites selling similar items to create a picture of what could have been sold had there been enough stock, says Blue Yonder chief technology officer Jan Karstens.

Blue Yonder’s modelling software identifies the strength of correlations between input data and this “true demand” more rapidly and acutely than humans could, according to the company. It also continually tests the accuracy of its own models against the live data it receives.
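The article does not describe how Blue Yonder calculates "true demand", so the following is only a simplified sketch of the general idea: when one store runs out, look at how comparable stores that stayed in stock performed against their own usual sales, and apply that ratio to the out-of-stock store. The function names and numbers are hypothetical.

```python
def estimate_true_demand(store_baseline, peers):
    """Scale a store's usual sales by how its peers did on the same day.

    store_baseline: the store's typical in-stock sales for this kind of day.
    peers: list of (peer_baseline, peer_actual) pairs from comparable
           stores that stayed in stock.
    """
    ratios = [actual / baseline for baseline, actual in peers if baseline > 0]
    if not ratios:
        raise ValueError("no usable comparable stores")
    uplift = sum(ratios) / len(ratios)
    return store_baseline * uplift

def lost_sales(estimated_demand, units_sold):
    """Units that might have sold had the shelf not been empty."""
    return max(0.0, estimated_demand - units_sold)

# Hypothetical day: peers all sold 20% above their usual level,
# while this store sold out after 31 units.
demand = estimate_true_demand(40, [(50, 60), (30, 36), (20, 24)])
print(round(demand, 1), round(lost_sales(demand, 31), 1))
```

Feeding the estimated demand, rather than the capped sales figure, back into the history stops a stock-out from teaching the model that demand was low.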

Jan Meier, CIO at Natsu, says the data becomes more accurate as it approaches the delivery day. “The forecast data is important for our inventory and production planning,” he says. “In this way we can plan our product requirements and production layers more accurately. Generally we add a buffer to the forecast demand level for a product. This should ensure we do not run out of stock.”
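The buffer Meier describes amounts to simple arithmetic. Natsu's actual buffer size is not given, so the 15% used below, along with the function name and the stock-in-hand adjustment, is purely an assumption for illustration.

```python
from math import ceil

def order_quantity(forecast_units, buffer_pct, on_hand=0):
    """Units to produce: forecast plus a percentage buffer, less stock in hand."""
    if forecast_units < 0 or buffer_pct < 0 or on_hand < 0:
        raise ValueError("inputs must be non-negative")
    # Multiply before dividing so whole-number inputs stay exact.
    target = ceil(forecast_units * (100 + buffer_pct) / 100)
    return max(0, target - on_hand)

print(order_quantity(48, 15))              # forecast of 48 units, 15% buffer
print(order_quantity(48, 15, on_hand=20))  # same, with 20 units already in stock
```

The trade-off is the one the article describes: a bigger buffer means fewer empty shelves but more waste to collect, so the more accurate the forecast, the smaller the buffer can be.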

Tesco explores Twitter buzz to boost barbecue order accuracy

Having achieved savings of £100m annually in the first five years, Tesco continues to refine and expand its supply chain analytics programme – and is even experimenting with data from social media.

Duncan Apthorp, Tesco programme manager, supply chain development, says it has recently carried out a major rebuilding of its sales forecasting system to further improve product availability.

“We collaborated closely with colleagues across the business, particularly in IT and stores, to help us really understand some of the challenges associated with waste and product availability,” he says. “We found that while our sales forecasts were very accurate, there were further improvements we could make for our promotional products, particularly with the slower-selling promotions.

“Here we were tending to over-predict sales, and sending too much stock into some stores. Improving our systems has stopped surplus stock being ordered and delivered, and has therefore helped us to reduce our waste.”

The programme is also analysing data from social media, although this is in its early stages. “For example, at the beginning of each summer we have a weekend when the weather finally turns hot and sunny enough for people to barbecue,” says Apthorp. “What we are finding is that the number of tweets about barbecuing rises significantly the week before, and we are looking to build this into our sales forecasting systems.”

Store managers and the business at large began to support the programme once they realised the upside for customers, he says. “Any new innovation such as this programme always starts with us asking the question, ‘How does this benefit our customers or colleagues?’ This programme clearly benefits millions of our customers around the UK and that’s a really exciting prospect.”

Tesco’s supply chain analytics team showed how weather affects sales by number-crunching sales records and historic weather data. For example, a weekend rise in temperature from 20C to 24C can lead to a 42% increase in burger sales. During the summer, a 10C temperature rise can mean customers want 300% more BBQ meat than they would on a colder weekend. They also want 25% fewer green vegetables.

“Reasonable” investment

Natsu expects the system to cut its rate of returns by 20% at the same level of sales. Although it put 10 project members and seven assistants on the team, Meier argues that “based on our goals, the investment was reasonable”.

That’s largely thanks to the fact Natsu did not have to buy expensive hardware and software to store and analyse the data - Blue Yonder provided this technology as a service via the cloud.

Chris Allan, MD of products analytics at Accenture, says many of the larger retailers are applying big data and analytics to the supply chain in the same way as Natsu, but adds that the approach taken needs to be continually improved if they want to retain a competitive advantage.

“If you are being clever you should be looking for those changes, so be ahead of the curve,” he advises. “Some have not changed for some time and are not, in some cases, reflecting business operations and consumer behaviour. Others are taking more data on board, giving them the edge. It is not a case of fire once then you’re done, it needs to continue to evolve.”

With sources of data mushrooming and tools to analyse them through the cloud available at little upfront cost, grocery retailers need to make sure that they exploit them to the best of their abilities.

Effective demand forecasting has become a need-to-have rather than a nice-to-have - and those that aren’t using big data to improve their forecasting accuracy could well find they’re making a big mistake.