Confidence in any bioassessment method is related to its ability to detect ecological improvement or impairment. We evaluated Australian River Assessment (AUSRIVAS)-style predictive models built using reference-site data sets from the Australian Capital Territory (ACT), the Yukon Territory (YT; Canada), and the Laurentian Great Lakes (GL; North America) area. We measured model performance as the ability to correctly assign independent reference sites to reference condition. Evaluating a model's ability to detect human disturbance is generally more problematic because the actual condition of test sites is usually unknown. We therefore subjected independent reference-site data to simulated impairment, varying the proportions of sensitive, intermediate, and tolerant taxa to simulate degrees of eutrophication. Model performance was related to differences among data sets, such as the number and distribution of invertebrate taxa. Sensitive taxa tended to have lower expected probabilities of occurrence than more-tolerant taxa, but the distribution of taxa among tolerance categories also differed by data set. Thus, the models differed in their ability to detect the simulated impairment. The ACT model performed best with respect to Type 1 error rates (0%) and the GL model performed worst (38%). The YT model performed best (10% error) at detecting moderate impairment, and the ACT model detected all severely impaired sites. AUSRIVAS did not assign most mildly impaired sites to below-reference condition, although observed/expected values were reduced for some of these sites. Because the models used presence–absence data, they did not detect mild impairment that changed only taxon abundances. However, compared with the other models described in this special issue (which did use abundance data), AUSRIVAS model performance was comparable or better for detecting the simulated moderate and severe impairments.
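
The observed/expected comparison and the simulated impairment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly used AUSRIVAS convention of counting only taxa whose predicted probability of occurrence is at least 0.5, and it simplifies the impairment simulation to removing every taxon in the chosen tolerance categories, whereas the study varied the proportions of sensitive, intermediate, and tolerant taxa. The function names, the `threshold` parameter, and the data structures are hypothetical.

```python
import math


def observed_expected(probabilities, observed_taxa, threshold=0.5):
    """Return O/E for one site from presence-absence data.

    probabilities: dict mapping taxon name -> predicted probability of
        occurrence at the site (from a reference-site model).
    observed_taxa: set of taxon names actually collected at the site.
    threshold: assumed cutoff; only taxa predicted at or above it count.
    """
    # Taxa the model expects at this site.
    expected_taxa = {t: p for t, p in probabilities.items() if p >= threshold}
    # E is the sum of the predicted probabilities of the expected taxa.
    expected = sum(expected_taxa.values())
    # O is the number of expected taxa that were actually observed.
    observed = sum(1 for t in expected_taxa if t in observed_taxa)
    return observed / expected if expected > 0 else math.nan


def simulate_impairment(observed_taxa, tolerance, drop=("sensitive",)):
    """Return a copy of the sample without taxa in the dropped categories.

    tolerance: dict mapping taxon name -> "sensitive", "intermediate",
        or "tolerant". Removing whole categories is a simplification
        made for this sketch only.
    """
    return {t for t in observed_taxa if tolerance.get(t) not in drop}
```

Because `observed_expected` receives only a set of taxon names, a simulated impairment that changes abundances without removing any taxon leaves O/E unchanged, which matches the limitation of presence–absence models noted in the abstract.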