In a peer review I got a while back, someone wrote: “Sometimes understanding why a simple model fails is far more informative than showing that a complex model succeeds.” (You know who you are; if you’d prefer to be named, drop me an email.) I initially wrote a paragraph in response, but the current paper wasn’t really the right place for it. I’m putting it here for now in case I (or someone else) finds it useful later:

This statement raises the interesting but difficult question of the conditions under which it is possible to understand why a model fails. It fails because some aspect of it is wrong, but that does not seem to answer the “why” question. Presumably, for that, one needs to be able to say which parts of it are wrong. The assumption that the relationship between associative strength and attention is always monotonic? The assumption that learning proceeds by a separable error term? Some combination of these factors? Or something broader – like the assumption that learning is associative in the sense of being capturable by a nonrecurrent network model (Wills et al., 2019, PB&R)? Showing that a model fails is not the same as knowing why it fails. Indeed, the idea that one can know why a model fails seems to assume that there is exactly one fully adequate model of human cognition, and that we know what it is. Under those conditions, we could combinatorially test the differences between Model X and the One True Model, and hence find which components are responsible for the failure to account for the results under discussion. We would then know why it failed. If we don’t know the One True Model, or if we know there is more than one True Model (i.e., two or more entirely adequate models exist), it’s hard to see how the question of why Model X fails can be answered. True Model 1 and True Model 2 will be different, and hence Model X may differ from them in different ways. Thus, the question of which parts of Model X are wrong depends on which True Model you pick.
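
To make the “combinatorial” point a bit more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that a model can be decomposed into named, swappable components, and that we have a reference “True Model” plus some fits(model, data) check. The component names (attention_rule, error_term, architecture), the diagnose function, and the pretend fit test are hypothetical placeholders I’ve made up for this sketch; they are not taken from any of the models or papers mentioned above.

```python
# Hypothetical sketch: diagnose why Model X fails by swapping in components
# from a reference "True Model" and seeing which substitutions rescue the fit.
# All component names and the fits() check are made up for illustration.

from itertools import combinations

# Each model is just a dict mapping a component slot to a theoretical commitment.
MODEL_X = {
    "attention_rule": "monotonic_in_strength",
    "error_term": "separable",
    "architecture": "nonrecurrent",
}

TRUE_MODEL = {
    "attention_rule": "nonmonotonic",
    "error_term": "summed",
    "architecture": "nonrecurrent",
}


def fits(model, data):
    """Placeholder: would simulate `model` on `data` and test its account.

    Here we simply pretend that any model sharing the True Model's error
    term accounts for the data, so the sketch runs end to end.
    """
    return model["error_term"] == TRUE_MODEL["error_term"]


def diagnose(model_x, true_model, data):
    """Swap every subset of differing components from true_model into model_x
    and record which substitutions rescue the fit. A component is implicated
    only if every rescuing substitution includes it; note the diagnosis is
    relative to the particular true_model chosen."""
    slots = [s for s in model_x if model_x[s] != true_model[s]]
    rescuing = []
    for r in range(1, len(slots) + 1):
        for subset in combinations(slots, r):
            hybrid = dict(model_x, **{s: true_model[s] for s in subset})
            if fits(hybrid, data):
                rescuing.append(set(subset))
    if not rescuing:
        return slots  # nothing rescues the fit; every difference may matter
    return [s for s in slots if all(s in sub for sub in rescuing)]


if __name__ == "__main__":
    print(diagnose(MODEL_X, TRUE_MODEL, data=None))
```

Under these made-up assumptions the diagnosis comes back as ['error_term']; pick a different TRUE_MODEL and the diagnosis changes, which is exactly the worry in the paragraph above: what counts as the “wrong part” of Model X depends on which True Model you compare it against.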