Receiver Operating Characteristic (ROC) analysis has increasingly been advocated as a mechanism for evaluating classifiers, particularly when the precise conditions and costs of deployment are not known. Area Under the Curve (AUC) is then used as a single figure for comparing how good two methods or algorithms are. Additional support for ROC AUC is cited in its equivalence to the non-parametric Wilcoxon rank-sum (Mann-Whitney U) statistic, but we show that this is in general misleading and that the use of AUC implicitly makes theoretical assumptions that are not well met in practice. This paper advocates two ROC-related measures that separate out two specific types of goodness that are wrapped up in ROC AUC, which we call Consistency (Con) and Certainty (Cert). We treat primarily the dichotomous two-class case, but also discuss the generalization to multiple classes.