REMINDER

Onboard telematics can now measure driver behavior, and car insurers were therefore early to embark on the connected car adventure, with mixed success. The simplest applications deployed so far are:
. locating stolen vehicles
. measuring vehicle usage, in particular the number of kilometers traveled, in order to offer adaptive pricing (a vehicle that always stays in a garage will never have an accident!)

But the insurer's core business is risk, and so many telematics firms have proposed automatic detection of risky behaviors. The most common is the so-called detection of « severe braking », based on the assumption that severe braking reveals a lack of anticipation and therefore dangerous driving. We now know that this assumption is totally false [1], but it persists in the minds of some insurers who « want to believe » that there is a simple way to classify human behavior.
However, the lack of results from these deployments has led some German and US insurers to abandon telematics [2]. NEXYAD has demonstrated that it is possible to « measure » driving risk in real time, and this is now stimulating keen interest in telematics among insurers worldwide.
This interest grew further when NEXYAD won the BMW Techdate Challenge with its onboard risk assessment app SafetyNex [3]. SafetyNex works where all other systems fail, simply because the problem was treated in a completely new way, without any regard for « science fashion », especially deep learning (or machine learning). Indeed, the difficulties in developing an effective onboard risk estimation application are:
. Science and facts: an accident is a rare and inexplicable event (by definition it « happens by chance »; a driver has one accident every 70 000 km on average, and most of them are harmless). Observing a driver for 5 years (to make sure an accident occurred) is long and ineffective (a single accident does little for individual statistics), and the variability factors of « road life situations » are extremely numerous, so it would take millions of drivers observed over decades to obtain relevant statistics.
. Ethics: driving behavior in itself has absolutely no direct link with risk [1] (it is easy to see that a drifting demonstration on an abandoned airport and the same maneuver just in front of a school at noon correspond to very different risks, although the driving behavior is identical: behavior obviously needs to be « contextualized »). Contextualization (which is absent from the « severe braking » experiments mentioned above) therefore requires knowing, among other things, the speed of the vehicle and where that speed is practiced. But since digital maps record the maximum authorized speed everywhere, recording speed and geolocation in a cloud potentially means recording violations of speed limits. In many countries, including France, organizations that are not accredited (such as insurance companies) are prohibited from recording infringements of the law. This totally disqualifies telematics boxes that record raw data in the cloud! However, some European insurers continue to test this kind of solution in the (vain) hope that « deep learning » and « data scientists » will deliver their risk scores. In any case, in France (a market of 40 million vehicles), such a violation of the Penal Code is punishable and is generally pursued by the CNIL [3 bis].
And insurance companies will not be able to defend themselves by saying « there is no choice », because SafetyNex estimates driving risk without recording ANY confidential data! It has also been shown that SafetyNex delivers all the data that insurance companies need (without any violation of the driver's privacy).
These two constraints show that the solution of « big data statistics in the cloud using machine learning » cannot be applied:
. statistics (or deep learning, etc.): accidents are rare, so it will not work at the individual level
. in the cloud: this is contrary to the laws that protect people's privacy.
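To put the first constraint in numbers, here is a minimal sketch that uses only the average quoted in this article (one accident every 70 000 km). The function names and the sample sizes are illustrative assumptions, not NEXYAD figures.

```python
# Average quoted in the article: one accident every 70 000 km.
KM_PER_ACCIDENT = 70_000


def expected_accidents(km_driven: float) -> float:
    """Average number of accidents expected over a given distance."""
    return km_driven / KM_PER_ACCIDENT


def km_needed(target_accidents: int) -> int:
    """Distance to observe, on average, to see that many accidents."""
    return target_accidents * KM_PER_ACCIDENT


if __name__ == "__main__":
    print(expected_accidents(70_000))  # 1.0
    print(km_needed(1_000_000))        # 70000000000
```

Under this average, a sample of one million accidents corresponds to about 70 billion observed kilometers, before even splitting the data by location or driving context.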


An accident is a rare event (1 accident every 70 000 km on average, and mostly minor). This means that one must observe tens of thousands of km to observe ONE accident... so to observe a million accidents (statistics need millions of data points), one must observe a huge number of kilometers traveled... and in every place (one area may be dangerous because of ravines, another because many roads intersect, etc.: it is never the same). And since observing an accident is not enough, we must also record all the measurable « variables » or « factors » (speed, acceleration, etc.) that describe the behavior of the vehicle at the time of the accident (in order to « explain » the accident, as statisticians say). Of course, nobody has millions of accident observations at each location of the infrastructure, so statisticians will simply tell you « not enough data ».

What is the impact of an insufficient volume of data on deep learning [4]? Let's take an example. Let's record, for one person over 5 years (the time it took to have an accident), the day of the week, the time slot, and the driving signals (speed, acceleration, braking, ...). These 5 years of observation (that's long, isn't it?) will yield, on average, 99 999 km without an accident and 1 km where an accident occurred. Let's say it was a Thursday, at 15:00, the vehicle was traveling at 100 km/h, etc. As the vehicle frequently drove both slower and faster than 100 km/h, the influence of speed in the deep learning will be close to zero. However, the driver never had an accident on a Monday, Tuesday, Wednesday, Friday, Saturday or Sunday. Certainly, he drove on many Thursdays without an accident, but the only day he had an accident was a Thursday: the probability of having an accident on a Thursday is therefore greater than that of having an accident on the other days. This is what data analysis, statistics, or deep learning will conclude.
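The single-accident example above can be sketched in a few lines. This is only an illustration: a plain frequency count stands in for any purely data-driven estimator (it is not a deep learning model), and the even split of the 100 000 km across the seven days is an assumption made for the sketch.

```python
from collections import Counter

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def naive_accident_rate_per_day(km_by_day, accidents_by_day):
    """Accidents per km for each day, estimated from raw counts only."""
    return {day: accidents_by_day.get(day, 0) / km_by_day[day] for day in DAYS}


# Hypothetical 5-year log: 100 000 km spread evenly over the week
# (assumption), with a single accident that happened on a Thursday.
km_by_day = {day: 100_000 / 7 for day in DAYS}
accidents_by_day = Counter({"Thu": 1})

rates = naive_accident_rate_per_day(km_by_day, accidents_by_day)
riskiest = max(rates, key=rates.get)

if __name__ == "__main__":
    for day in DAYS:
        print(day, rates[day])
    print("Riskiest day according to the data:", riskiest)
```

With one accident in the log, every day except Thursday gets an estimated rate of exactly zero, so the data alone singles out Thursday as the dangerous day.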
Everyone can understand that this conclusion is completely wrong, and that if you observe the same driver for thirty years (the time to have several accidents), you will see that the day is not a key factor (it can have an influence if traffic varies with the day, but obviously it is possible to have an accident on any day of the week!).

As a conclusion, let's keep in mind that it is easy to make global accident statistics on a large population (France, Europe, USA, ... hundreds of millions of people). But at the individual level, it is not so easy. And the goal of onboard telematics is precisely to estimate a risk at the local and individual level! We know it seems obvious to say that rare events cannot be studied without prior knowledge using data-oriented mathematical methods (because there are few data and because those methods rely on the « law of large numbers »). But it is better to say it, because it is apparently not obvious to everybody (and it always sounds « cool » to tell your boss and your friends that you work on a deep learning application! ^^). SafetyNex circumvented this problem by working in a much more rational and ultimately "classical" way, using knowledge and risk evaluation methods already validated by accident experts.

Note: To develop the theory of relativity, Albert Einstein did not record hundreds of billions of data points to feed a deep learning system that automatically found the law E = mc². He used the knowledge of physicists and the inference methods of mathematics, which led to this formula. Then, in order to validate this formula, experimental physicists performed hundreds of experiments. It is exactly this approach that was applied to develop SafetyNex: there are dozens of experts working on road infrastructure "risk diagnoses."
NEXYAD worked with these experts for 15 years (through the PREDIT collaborative research programs [5]) and developed SafetyNex, a "knowledge-based system" [6] validated a posteriori on about 50 million km. These experts work the same way as industrial risk experts in factories, with methods such as FMEA [7].
