Statistical Hypothesis Skepticism: Implications for Credit Risk
Financial risk managers continue to rely heavily on statistical hypothesis testing in modeling and statistical analysis, even though a group of scientists is now arguing that these tests have lost their relevance.
Recently, in fact, some scientists have gone so far as to declare the probability value (P-value), an important statistical measurement tool, obsolete and to advocate its elimination. For instance, the editors of the Journal of Basic and Applied Social Psychology (JBASP) have described the null hypothesis significance testing procedure (NHSTP) as “invalid,” and have required their authors to remove “all vestiges of the NHSTP” – including P-values, F-values and T-tests.
But are these arguments logical, and just how much weight do they hold in the financial risk management community?
By Marco Folpmers
P-values, after all, are still used in the Jeffreys test to determine whether the probability of default (PD) is underestimated. They’re also useful in deciding whether a coefficient should be added to an early warning system, and they can help assess whether the significant increase in credit risk (SICR) criteria for IFRS 9 are effectively implemented. All these examples demonstrate that hypothesis testing is as important in credit risk analysis as it is in biomedical research.
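To make the first of these uses concrete, here is a minimal sketch of a one-sided Jeffreys test for PD underestimation. It assumes the common formulation in which the observed defaults update a Jeffreys prior, Beta(0.5, 0.5), and the P-value is the posterior probability that the true PD is at or below the model PD. The function name, the portfolio figures and the Monte Carlo approximation (chosen to stay within the Python standard library) are illustrative assumptions, not taken from any specific validation framework.

```python
import random


def jeffreys_p_value(n_obligors, n_defaults, pd_estimate,
                     n_draws=100_000, seed=42):
    """Approximate the one-sided Jeffreys test P-value by simulation.

    Posterior under the Jeffreys prior: Beta(d + 0.5, n - d + 0.5).
    The P-value is the posterior probability that the true PD is
    less than or equal to the model's PD estimate.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    a = n_defaults + 0.5
    b = n_obligors - n_defaults + 0.5
    hits = sum(rng.betavariate(a, b) <= pd_estimate for _ in range(n_draws))
    return hits / n_draws


# Hypothetical portfolio: 1,000 obligors, 20 observed defaults, model PD of 1.5%.
p = jeffreys_p_value(n_obligors=1000, n_defaults=20, pd_estimate=0.015)
print(f"Jeffreys test P-value: {p:.3f}")
```

A small P-value suggests the model PD sits in the lower tail of the posterior, that is, the PD is likely underestimated. In practice the exact Beta cumulative distribution function from a statistics library would replace the simulation.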
Despite all this, NHSTP continues to face scientific criticism, and there remains an ongoing debate about the role of hypothesis testing and P-values in credit risk modeling.
So, who’s right and who’s wrong? Are the critiques of NHSTP by scientists valid, or do these types of tests remain central to credit risk modeling? Perhaps the answer lies somewhere in between.
Let’s now explore answers to these questions, concentrating on one specific criticism: the bias associated with repeated hypothesis testing.
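One common way to illustrate why repeated testing is a concern is the familywise error rate: if each of k tests is run at significance level α, and the tests are assumed to be independent, the chance of at least one false rejection is 1 − (1 − α)^k. The short sketch below computes this quantity; the independence assumption and the function name are simplifications for illustration only.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How quickly the overall false-positive risk grows at alpha = 5%.
for k in (1, 5, 20, 100):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:>3} tests -> P(at least one false positive) = {rate:.3f}")
```

With 20 independent tests at the 5% level, the chance of at least one spurious rejection is already close to two in three, even when every null hypothesis is true.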