This article has been written and contributed to engineering.com by Greta Cutulenco, CEO and Cofounder of Acerta Analytics Solutions.
We keep hearing about the massive amounts of data autonomous cars will generate. One estimate from Intel a few years back pegged the number at 4,000 GB per vehicle, per day. More recently, Gartner predicted that by next year, the average connected vehicle will generate the same amount of data, to the tune of 280 petabytes annually.
Automotive industry data is Acerta’s lifeblood; our machine learning models depend on quality data to help automotive engineers reduce scrap and rework rates and accelerate root cause analysis (RCA). So, we decided to crunch the numbers for ourselves. What we found might seem counterintuitive to the uninitiated, but anyone who’s seen automotive industry data shouldn’t be the least bit surprised.
Automotive Manufacturing & On-Road Data
Estimates for the total number of vehicles manufactured in 2019 vary, but, conservatively, it’s roughly 80 million worldwide. Each of those 80 million vehicles has approximately 30,000 parts.
The total number of vehicles on the road in 2019 has been estimated at 1.25 billion, but obviously those aren’t all connected vehicles, so we have to make some assumptions. For simplicity’s sake, we made three sets of estimates regarding the proportion of on-road vehicles with built-in connectivity and those with connectivity via OBD2:
- In-Built Connectivity: Avg: 12% / Low: 5% / High: 16%
- OBD2 Connectivity: Avg: 40% / Low: 20% / High: 50%
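To make these proportions concrete, the short Python sketch below applies them to the 1.25 billion on-road vehicles cited above. The names (`ON_ROAD_VEHICLES`, `CONNECTIVITY_SHARE`, `connected_vehicles`) are illustrative and not taken from the original analysis. The sketch keeps the two connectivity types separate because the text does not say whether they overlap.

```python
# Rough sketch: convert the connectivity percentages into vehicle counts.
# The figures come from the article; the names are illustrative only.

ON_ROAD_VEHICLES = 1.25e9  # estimated vehicles on the road in 2019

# Share of on-road vehicles by connectivity type, for each scenario.
CONNECTIVITY_SHARE = {
    "built_in": {"low": 0.05, "avg": 0.12, "high": 0.16},
    "obd2": {"low": 0.20, "avg": 0.40, "high": 0.50},
}


def connected_vehicles(scenario):
    """Return the estimated vehicle count per connectivity type."""
    return {
        kind: ON_ROAD_VEHICLES * shares[scenario]
        for kind, shares in CONNECTIVITY_SHARE.items()
    }


for scenario in ("low", "avg", "high"):
    counts = connected_vehicles(scenario)
    print(
        f"{scenario:>4}: built-in {counts['built_in']:,.0f}, "
        f"OBD2 {counts['obd2']:,.0f}"
    )
```

In the average scenario this works out to roughly 150 million vehicles with built-in connectivity and 500 million with OBD2 connectivity.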
On the manufacturing side, we distinguished between simple parts and complex systems in terms of the amount of data they generate. Based on our combined experience with automotive industry data, we once again made three sets of estimates, varying the number of complex systems per vehicle in addition to the amount of data they generate:
- Data Per Simple Part: Avg: 0.000004 GB / Low: 0.000004 GB / High: 0.001 GB
- Data Per Complex System: Avg: 0.25 GB / Low: 0.25 GB / High: 1 GB
- Number of Complex Systems: Avg: 40 / Low: 20 / High: 100
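One way to combine these manufacturing figures is sketched below in Python. It rests on an assumption the text does not confirm: that complex systems are counted among the roughly 30,000 parts, so the simple-part count is 30,000 minus the number of complex systems. The function names are illustrative, and the actual breakdown in the authors' spreadsheet may differ.

```python
# Rough sketch of manufacturing data volume per vehicle and per year.
# Figures are the article's estimates; the way they are combined here
# (simple parts = total parts - complex systems) is an assumption.

VEHICLES_PER_YEAR = 80e6   # vehicles manufactured worldwide in 2019
PARTS_PER_VEHICLE = 30_000

SCENARIOS = {
    #        GB per simple part, GB per complex system, complex systems
    "low":  {"simple_gb": 0.000004, "complex_gb": 0.25, "n_complex": 20},
    "avg":  {"simple_gb": 0.000004, "complex_gb": 0.25, "n_complex": 40},
    "high": {"simple_gb": 0.001,    "complex_gb": 1.0,  "n_complex": 100},
}


def gb_per_vehicle(scenario):
    """Manufacturing data for one vehicle, in GB."""
    s = SCENARIOS[scenario]
    n_simple = PARTS_PER_VEHICLE - s["n_complex"]  # assumption, see above
    return n_simple * s["simple_gb"] + s["n_complex"] * s["complex_gb"]


def gb_per_year(scenario):
    """Manufacturing data for all vehicles built in a year, in GB."""
    return gb_per_vehicle(scenario) * VEHICLES_PER_YEAR


for name in SCENARIOS:
    print(f"{name:>4}: {gb_per_vehicle(name):.2f} GB per vehicle, "
          f"{gb_per_year(name) / 1e6:,.1f} PB per year")
```

Under these assumptions, the complex systems dominate: in the average scenario they contribute 10 GB per vehicle, while all the simple parts together add only about 0.12 GB.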
Based on these figures, along with the number of seconds in a day (86,400) and the number of days in a year (365), we’re ready to make our comparison (almost). The last thing we need is an estimate of how much data each connected vehicle generates. That involves estimates of the number of signals being monitored, their information content, and the proportion of the day the vehicle is driving. You can see a breakdown of these figures in this Google Sheet.
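The per-vehicle estimate described above can be sketched as a simple product of its inputs. The Python function below is only an illustration of that structure: the parameter names, the use of a fixed sampling rate to stand in for "information content," and the example values are all assumptions made for this sketch, not figures from the analysis.

```python
# Rough sketch of an on-road data estimate for one connected vehicle.
# All parameter names and the example inputs are hypothetical.

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365


def gb_per_vehicle_per_year(num_signals, bytes_per_sample,
                            samples_per_second, driving_fraction):
    """Estimate yearly data volume (GB) for one connected vehicle.

    num_signals        -- signals monitored on the vehicle
    bytes_per_sample   -- size of one reading of one signal
    samples_per_second -- how often each signal is read (assumed uniform)
    driving_fraction   -- share of the day the vehicle is driving (0 to 1)
    """
    if not 0.0 <= driving_fraction <= 1.0:
        raise ValueError("driving_fraction must be between 0 and 1")
    bytes_per_day = (num_signals * bytes_per_sample * samples_per_second
                     * SECONDS_PER_DAY * driving_fraction)
    return bytes_per_day * DAYS_PER_YEAR / 1e9  # 1 GB taken as 10^9 bytes


# Placeholder inputs for illustration only (not the article's values).
example = gb_per_vehicle_per_year(
    num_signals=100, bytes_per_sample=8,
    samples_per_second=1.0, driving_fraction=0.05,
)
print(f"{example:.2f} GB per vehicle per year")
```

Multiplying a result like this by the number of connected vehicles would give a fleet-wide total to set against the manufacturing side.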
As for the sources of all this data, on the manufacturing side they include product specifications, sensors on the production line, on-machine measurements, testing data and outputs from the electronic control units (ECUs). For on-road data, we’re including everything generated by the vehicle’s sensors, ECUs, etc., that is accessible on the CAN network, with one notable exception: image data.
Now, before you cry “Foul!” hear us out.
Obviously, images will account for the majority of data autonomous vehicles generate while driving, but that’s not the case for connected vehicles, and our analysis focuses on the latter. More to the point, we’re excluding image data for manufacturing as well as driving. While it might seem obvious that driving a connected vehicle would generate far more image data than manufacturing one, it’s worth remembering that the machine vision market for manufacturing grew ten percent in North America last year alone.