The Edge of understanding AI
Before we can begin to understand AI, we need to learn about the edge. Many people have never heard of the edge, but AI and the edge are a bit like the chicken and the egg: you can't have one without the other. For clarity, the 'edge' we are talking about includes the device on which you're reading this article.
For example, if server farms are at the centre, then individual connected devices, such as the smartphone in your hand, the IoT devices in your kitchen, the self-checkout at your supermarket or perhaps even the car in your driveway, are out there on the 'edge'. Any AI is only as clever or useful as what it can do out there in the real world.
Data is the fuel
In the same way that knowledge feeds intelligence, AI's lifeblood is data, harvested at the edge, out there in the real world. Data informs the learning and accuracy of an AI, and that learning comes down to the quantity and quality of the data gathered. But here's the problem: the quantities are vast, far too big to be transferred in full from the edge to the central points (the server farms that form the cloud infrastructure) where they can be distilled into amazing, accurate, ever-adapting learning. Tesla is a case in point: since it takes ten hours to upload the data from just an hour's driving in a single car, we can safely assume that the company is using only a fraction of all the data gathered by its fleet of four-wheeled edge devices.
The bottleneck is just too tight. Andy Jassy, CEO of Amazon Web Services (AWS), estimates that using conventional means of transferring data, it would take 26 years to move an exabyte of data (that's a billion GB) to the cloud. Companies as big as Walmart produce something like two exabytes a month. No wonder AWS has resorted to transporting data for its biggest customers via truck-sized memory sticks, or 'Snowmobiles'. Yes, road and diesel are the fastest means of transporting the volumes of data currently fuelling the AI revolution. So if data is AI's lifeblood, it's being drip-fed and starved.
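The 26-year figure is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch, assuming a sustained 10 Gbps connection (the link speed is an assumption for illustration, not a figure from this article):

```python
# Rough check of the 26-year claim, assuming a sustained 10 Gbps
# link running flat out (an illustrative assumption).
EXABYTE_BITS = 1e18 * 8            # one exabyte (a billion GB) in bits
LINK_BPS = 10e9                    # assumed: 10 gigabits per second
SECONDS_PER_YEAR = 365 * 24 * 3600

years = EXABYTE_BITS / LINK_BPS / SECONDS_PER_YEAR
print(f"{years:.1f} years")        # on the order of 25 years
```

Real-world transfers would be slower still, since no link sustains its full line rate continuously, which is consistent with an estimate in the mid-twenties of years.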
Labelling
There's another major kink in the AI supply chain: labelling. Much of the usefulness of AI lies in a machine's ability to recognise the objects and situations in its environment. Environments change, and most objects, even within the same category, are not identical. Automated supermarket checkouts, for example, need to be able to tell one type of apple from another. A Braeburn apple may retail at 35 pence, whilst a Pink Lady, from the same store, costs 50 pence. If the two are confused by the self-checkout's camera system, the supermarket could lose a lot of money over time. The problem is that the algorithms that underpin the technology are as naive as newborn babies and need to be fed millions of labelled examples to teach them to 'see'. So how do these quantities of raw images get labelled? The answer is humans.
Intelligence doesn't just come out of thin air; it has to start with the human brain, and hundreds of thousands of people are employed in call-centre-like offices for just this task. For a self-driving car algorithm to be taught the meaning of road signs, or to tell the difference between a child and a dog, hours of footage have to be watched and objects labelled, frame by frame.
Labelling at the edge
Food shoppers can inadvertently do the labelling job themselves by confirming, on a touch screen, the identity of the fruit or vegetable they just placed in front of a self-checkout’s camera. The machine’s algorithms suck up the shoppers’ collective recognition until they can do it for themselves.
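That feedback loop, predict, let the shopper confirm, keep the confirmation as a label, can be sketched in a few lines. All names below are hypothetical illustrations, not any real checkout system's API, and a trivial stand-in replaces the actual image classifier:

```python
# Human-in-the-loop labelling sketch. Hypothetical names throughout;
# the "classifier" is a deliberate stand-in, not a real vision model.
from collections import Counter

class CheckoutLabeller:
    def __init__(self):
        # Labelled examples accumulated from shopper confirmations.
        self.training_data = []  # list of (image_features, label) pairs

    def predict(self, image_features):
        # Stand-in for the real classifier: return the most common
        # label seen so far, or None before any labels exist.
        labels = Counter(label for _, label in self.training_data)
        return labels.most_common(1)[0][0] if labels else None

    def confirm(self, image_features, shopper_choice):
        # The shopper's touch-screen confirmation becomes a label.
        self.training_data.append((image_features, shopper_choice))

labeller = CheckoutLabeller()
labeller.confirm([0.1, 0.9], "Braeburn")   # shopper confirms the item
labeller.confirm([0.2, 0.8], "Braeburn")
print(labeller.predict([0.15, 0.85]))      # prints "Braeburn"
```

The point of the sketch is the data flow, not the model: every confirmation is a free labelled example that a real system would feed back into retraining.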
It's possible to accelerate this process away from the shop floor. We recently shipped in some self-checkout machines and hired a team of people to scan a range of 30 fruits and vegetables, labelling each item as they went. In just three weeks the checkouts' accuracy went from 60% to 99.8%. Accuracy, for supermarkets and for AI in general, is critical. You might remember this the first time you sit in a self-driving car. For supermarkets it means loss prevention running into billions of dollars.
However, the labour-intensive business of labelling tightens the bottleneck even further. A McKinsey report from 2018 listed it as the biggest obstacle to AI adoption within industry. The future therefore has to be a world where machines can teach themselves, doing away with the need for clouds and labellers altogether.
AI at the edge is quickly becoming the only way to run and train AI models. We need to ensure that AI processors are being taught correctly from the beginning.