Are data practices holding back the automotive industry?

Synthetic data is surmounting the challenges presented by real-world datasets, writes Steve Harris

In the fast-paced automotive industry, data has become the fuel for innovation, efficiency, and ultimately growth, resulting in more vehicles being sold. However, a recent IT Supply Chain survey has unveiled a different reality: one in three leaders in the industry believe that difficulties in finding suitable data solutions have resulted in suboptimal data practices.

As the automotive industry continues to push the boundaries of technology, with new developments like self-driving cars expected to hit the roads by 2035, it raises the question of why finding suitable solutions is still a challenge and, more importantly, why the industry is settling for suboptimal data practices.

The starting point is assessing the impact of data on decision-making, industry growth, and customer satisfaction. This includes considering potential strategies that can help automotive companies overcome these challenges and unlock the full potential of their data.

The consequences of suboptimal data practices

If automotive companies have incomplete or inaccurate datasets, it can hinder decision-making and strategy. This, in turn, can impede innovation: less data means slower product development cycles, which leads to decreased customer satisfaction and trust in automotive products and services.

Trying to find enough real-world datasets that are usable, accurate, and compliant with licensing and data privacy regulations can be like picking several needles out of several haystacks. From in-person interactions and in-cabin distractions to random weather events and rare edge-case scenarios, the need for data can massively outweigh what the real world can supply. Data born from reality is much scarcer than imagined, and scarcer still when it is needed on a scale broad enough to affect innovation and machine learning training in a significant way. So, how can this demand be met?

Two words: synthetic data

Synthetic data can recreate real-world scenarios, environments, and patterns without the need to use real-world data itself. For example, automotive engineers are using the characteristics of synthetic data to create virtual environments that simulate real-world driving conditions. Within this world, they can try out a plethora of diverse scenarios without using physical prototypes or expensive real-world tests.

It is worth noting that real-world data is still a key component for initial training and continual development. But synthetic data offers the ability to test vastly more situations and acts as a solution to both data scarcity and data privacy concerns. And the more synthetic data that is created, and the more its generation is informed by real-world scenarios, the more realistic and representative these datasets become. They can then be used independently or in conjunction with real-world data.

In fact, recent research revealed that one in five automotive leaders believe synthetic data will help them be more resilient against privacy breaches. On top of this, 29% of automotive industry leaders use synthetic data for its quick problem-solving ability. The benefits are there, but the adoption isn’t yet where it needs to be.

The ability to carry out tests and simulate diverse scenarios can transform product development cycles and bolster decision-making. However, with suboptimal data, attempting to navigate each step of the product development process (from design concept to manufacturing to release) can be time-consuming at the best of times. In some cases, it is completely impossible. Yet generating diverse synthetic data can massively reduce the time and cost incurred between stages, empowering engineers to test a whole range of use cases and fault conditions. It’s a no-brainer.

Automotive data with synthetic data

The use of synthetic data is still in its early days, and there are several important considerations in unlocking its full potential. For machine learning algorithms to perform accurately, they need diverse, comprehensive training datasets to learn from. Within this, automotive engineers must create high-quality, large-scale datasets that recreate real-world conditions while also ensuring realism, data diversity, and data privacy. This requires careful cost-benefit analysis and clear data usage policies.

However, there’s no point in having data if half the people can’t access it. To truly realise the potential of synthetic data, it’s essential to facilitate collaborative data sharing among employees, industry decision-makers, and stakeholders. Just as open-source AI improves accessibility and collaboration in AI development, the sharing of synthetic data can drive innovation and policy-making in the industry.

With more data at hand, algorithms can be better trained, and models can more accurately predict events. This almost limitless hub of data can contribute to the continuous improvement of in-cabin security systems, protecting vehicles and their occupants from potential vulnerabilities and foreseeing any dangers with current designs. Put simply, synthetic data can make the driving experience safer for all.

Breaking the data barrier

Synthetic data is surmounting the challenges presented by real-world datasets. With its ability to overcome data scarcity and privacy barriers, it presents one of the most logical options for automotive engineers to amass large volumes of data that can simulate a whole range of scenarios.

This surge of innovation is underscored by the growing significance of synthetic data within the automotive domain. An 82% surge in synthetic data investments, with 90% of automotive industry leaders using synthetic data and 29% employing it for swift problem-solving, illustrates its escalating importance and adoption within the sector.

The automotive industry stands at the crossroads of data-driven innovation. Overcoming suboptimal data practices through the integration of synthetic data can pave the road to a more resilient, efficient, and innovative automotive future.

About the author: Steve Harris is Chief Executive of Mindtech
