For most biophysical domains, differences in model structures are seldom quantified. Here, we used a taxonomy-based approach to characterise thirteen rice models. Classification keys and binary attributes for each key were identified, and models were categorised into five clusters using a binary similarity measure and the unweighted pair-group method with arithmetic mean. Principal component analysis was performed on model outputs at four sites. Results indicated that (i) differences in structure often resulted in similar predictions and (ii) similar structures can lead to large differences in model outputs. User subjectivity during calibration may have hidden expected relationships between model structure and behaviour. This explanation, if confirmed, highlights the need for shared protocols to reduce the degrees of freedom during calibration, and to limit, in turn, the risk that user subjectivity influences model performance.