Complex models of real systems are not used only in the natural and social sciences. They can also be used to build comprehensive representations of everyday situations. In professional sports, for instance, coaching staffs use models to estimate their athletes' risk of injury based on their training load. In this way, performance can be optimized: the likelihood of injury is reduced as much as possible while the athletes are kept at their highest level.
In a given situation, an injury can usually be roughly described by simple causal links. Since the medical and physical processes leading to injuries are fairly well known, we can speak of certainty (known knowns). Over the course of an athlete's career, however, we cannot tell precisely when injuries will happen (unknowns). Fortunately, by analyzing injury data, we can describe their likelihood statistically, which allows the prediction of future injuries; this is called the risk of injury (known unknowns). Finally, we use models to simulate real-life situations and estimate injury risks. But since the results are only simplifications of a complex real world, uncertainty (unknown unknowns) has to be taken into consideration.
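To make the "known unknowns" concrete, here is a minimal sketch of how past injury records could be turned into a statistical likelihood. The training-load categories and the records themselves are made-up placeholders, not real athlete data:

```python
# A minimal sketch: estimate an empirical injury probability per
# training-load category from (made-up) historical records.

from collections import defaultdict

# (weekly_training_load_category, injured_that_week) -- placeholder data
history = [
    ("low", False), ("low", False), ("low", True),
    ("medium", False), ("medium", False), ("medium", False), ("medium", True),
    ("high", False), ("high", True), ("high", True),
]

counts = defaultdict(lambda: [0, 0])  # category -> [injuries, weeks]
for load, injured in history:
    counts[load][0] += int(injured)
    counts[load][1] += 1

for load, (injuries, weeks) in counts.items():
    print(f"{load:>6}: estimated injury risk = {injuries / weeks:.2f}")
```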
In order to deliver information, models need adequate inputs. For this purpose, athletes' training and performances are turned into variables and predictors, which are fed into the models. In return, the outputs provide information that allows predictions, inferences and, most importantly, interventions regarding injury risk. Because the results are uncertain, it is important to estimate the prediction accuracy of the models at hand by training them on past data, which then allows their predictions to be compared to new real-life data. Prediction accuracy is estimated through the predictive performance of a model, that is, its performance on new data that were not used to build the model itself (i.e. unseen data). The results then need to be checked with statistical tests, in combination with resampling techniques.
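As a rough illustration of "predictive performance on unseen data", here is a minimal sketch assuming scikit-learn. The three features (e.g. load, load change, readiness) and the synthetic data are my own illustrative assumptions, not part of the original article; cross-validation stands in for the resampling techniques mentioned above:

```python
# A minimal sketch of estimating predictive performance on unseen data
# with cross-validation, using synthetic (illustrative) injury data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                           # e.g. load, load change, readiness
risk = 1 / (1 + np.exp(-(1.5 * X[:, 0] - X[:, 2])))   # hidden "true" injury process
y = rng.random(n) < risk                              # injury occurred (True/False)

model = LogisticRegression()
# 5-fold cross-validation: each fold is scored on data the model never saw,
# a simple resampling estimate of prediction accuracy.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"estimated predictive accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```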
Whenever a model learns from new data, we might expect it to gain accuracy. At some point, however, its accuracy will decrease because it starts over-interpreting the data, fitting noise instead of the underlying process and producing incorrect predictions. This is called overfitting. More generally, the more predictors a model is given, the more complex it becomes. As a consequence, the uncertainty associated with its results increases, leaving less room for an educated interpretation of those results.
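Overfitting is easy to see in a small experiment. In this sketch, decision-tree depth stands in for model complexity (my choice of model, not the article's), again with synthetic data: training accuracy keeps climbing while accuracy on held-out data stalls or drops.

```python
# A minimal sketch of overfitting: as the tree is allowed to grow deeper,
# training accuracy rises but held-out (test) accuracy does not follow.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=300)) > 0        # noisy underlying rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (1, 3, 10, None):                        # None = grow until pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(f"depth={depth}: train={tree.score(X_tr, y_tr):.2f}, "
          f"test={tree.score(X_te, y_te):.2f}")
```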
The complexity of a model has to be put into perspective with its aim. In the present case, the goal is to gain an intuition about the processes leading to injury. But the "black box" architecture of a model does not allow a definitive understanding of how injury happens. This is why the uncertainty around the results has to be as low as possible, to allow simple interpretation (finding causal links) and facilitate intervention. An easy interpretation of results allows coaching staffs to articulate rules of thumb and adaptive shortcuts, leading to straightforward decisions.
Figure 1: A heuristic matrix that can be used to make decisions in the training process, based on training-load and readiness metrics.
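Purely as a hypothetical illustration of what a heuristic matrix like Figure 1 might look like in code (the categories and the advice below are my assumptions, not the actual contents of the figure):

```python
# A hypothetical rule-of-thumb lookup from (training-load status, readiness)
# to a simple coaching decision. Entries are illustrative placeholders.

DECISIONS = {
    ("low", "high"):  "increase load",
    ("low", "low"):   "maintain load, monitor recovery",
    ("high", "high"): "maintain load",
    ("high", "low"):  "reduce load, prioritise recovery",
}

def decide(load: str, readiness: str) -> str:
    """Return a rule-of-thumb decision for the given load/readiness state."""
    return DECISIONS[(load, readiness)]

print(decide("high", "low"))  # -> "reduce load, prioritise recovery"
```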
To conclude, simpler models with fewer inputs and comprehensible processes are needed to build heuristic plans that are articulable in real-life situations (see Figure 1). This shows how differently uncertainty is managed depending on the purpose of the results. One may then ask whether using simpler models for policy intervention on climate change could help politicians develop rules of thumb and an intuition about how to solve the climate problem, setting aside the complexity inherent to natural systems.
REFERENCES:
Jovanovic, M. (2017). Uncertainty, Heuristics and Injury Prediction.
Hi Théo, that’s a very interesting model from quite a different field. I’m a bit unsure about how to transfer this idea of simple, easy-to-understand models to climate science. Do you think it’s possible to develop similarly simple models here? Maybe something like the model we covered with Prof. Held would qualify. (For anyone not in the class: this was a model of 3 differential equations that links carbon emissions with global mean temperature change.) I guess it doesn’t get much simpler than that. But do you think policymakers (who, unlike Angela Merkel, don’t have a physics background :)) would gain anything from being introduced to this model? I feel like the most important results of climate science can be summarised in a few sentences anyway: 2° (1.5°) target -> emissions budget -> the budget can be distributed over time pretty much at will. Maybe the complexity of the solution is not the problem, but rather psychological factors preventing us from acting. What do you think?