We investigated 32 net primary productivity (NPP) models by assessing their skill in reproducing integrated NPP in the Arctic Ocean. The models were provided with two sources each of surface chlorophyll-a concentration (chlorophyll), photosynthetically available radiation (PAR), sea surface temperature (SST), and mixed-layer depth (MLD). The models were most sensitive to uncertainties in surface chlorophyll, generally performing better with in situ chlorophyll than with satellite-derived values. They were much less sensitive to uncertainties in PAR, SST, and MLD, possibly due to relatively narrow ranges of input data and/or relatively little difference between input data sources. Regardless of type or complexity, most of the models were not able to fully reproduce the variability of in situ NPP, whereas some of them exhibited almost no bias (i.e., reproduced the mean of in situ NPP). The models performed relatively well in low-productivity seasons as well as in sea ice-covered/deep-water regions. Depth-resolved models correlated more strongly with in situ NPP than other model types but had a greater tendency to overestimate mean NPP, whereas absorption-based models exhibited the lowest bias, albeit with weaker correlation. The models performed better when a subsurface chlorophyll-a maximum (SCM) was absent. As a group, the models overestimated mean NPP; however, this was partly offset by some models underestimating NPP when an SCM was present. Our study suggests that NPP models need to be carefully tuned for the Arctic Ocean because most of the models performing relatively well were those that used Arctic-relevant parameters.
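
The three aspects of skill discussed above (bias, correlation, and reproduction of variability) can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function name `skill_metrics`, the choice to compare values in log10 space, and the sample numbers are all assumptions made for illustration only.

```python
import math


def skill_metrics(model_npp, insitu_npp):
    """Compare modelled and in situ NPP (hypothetical helper, illustration only).

    Assumption: values are strictly positive and compared in log10 space,
    a common choice for data spanning orders of magnitude. The abstract
    does not state which transformation or metrics the study used.
    """
    if len(model_npp) != len(insitu_npp) or len(model_npp) < 2:
        raise ValueError("need two equally long series with at least 2 values")

    m = [math.log10(x) for x in model_npp]
    o = [math.log10(x) for x in insitu_npp]
    n = len(m)

    mean_m = sum(m) / n
    mean_o = sum(o) / n

    # Deviations from the respective means
    dm = [x - mean_m for x in m]
    do = [x - mean_o for x in o]

    var_m = sum(d * d for d in dm) / n
    var_o = sum(d * d for d in do) / n
    if var_m == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")

    cov = sum(a * b for a, b in zip(dm, do)) / n

    return {
        # Positive bias: the model overestimates the mean
        "bias": mean_m - mean_o,
        # Pearson correlation between model and observations
        "correlation": cov / math.sqrt(var_m * var_o),
        # Ratio below 1: the model underestimates observed variability
        "std_ratio": math.sqrt(var_m / var_o),
    }


# Made-up values, not data from the study
insitu = [10.0, 100.0, 1000.0]
model = [20.0, 200.0, 2000.0]
metrics = skill_metrics(model, insitu)
```

In this made-up case the model is exactly twice the in situ value at every station, so it correlates perfectly and matches the observed variability while still carrying a positive bias. This is why bias and correlation are reported separately above.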