Objectives: To determine which of two methods of case note review, holistic (implicit) or criterion-based (explicit), provides the more useful and reliable information for quality and safety of care, and to assess the level of agreement within and between groups of health-care professionals when they use the two methods to review the same record. To explore the process-outcome relationship between holistic and criterion-based quality-of-care measures and hospital-level outcome indicators.
Data sources: Case notes of patients at randomly selected hospitals in England.
Review methods: In the first part of the study, retrospective multiple reviews of 684 case notes were undertaken at nine acute hospitals using both holistic and criterion-based review methods. Quality-of-care measures included evidence-based review criteria and a quality-of-care rating scale. Textual commentary on the quality of care was provided as a component of holistic review. Review teams comprised combinations of doctors (n = 16), specialist nurses (n = 10), clinically trained audit staff (n = 3) and non-clinical audit staff (n = 9). In the second part of the study, process (quality and safety) of care data were collected from the case notes of 1565 people with either chronic obstructive pulmonary disease (COPD) or heart failure in 20 hospitals. Doctors collected criterion-based data from case notes and used implicit review methods to derive textual comments on the quality of care provided and to score the care overall. Data were analysed for intra-rater consistency, for inter-rater reliability between pairs of staff using intraclass correlation coefficients (ICCs), and for completeness of criterion data capture. Comparisons were made within and between staff groups and between review methods. To explore the process-outcome relationship, a range of publicly available health-care indicator data were used as proxy outcomes in a multilevel analysis.
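As an illustration of the reliability measure named in the review methods, the following is a minimal sketch of an intraclass correlation coefficient for a set of case notes each scored by the same number of reviewers. The abstract does not state which form of ICC was used, so the one-way random-effects form, ICC(1,1), is an assumption here, as are the function name `icc_oneway` and the example scores, which are hypothetical and not study data.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) (assumed form, not confirmed by the source).

    ratings: list of rows, one row per case note, each row holding the
    scores given to that record by k reviewers.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 records, each with the same k >= 2 scores")

    grand_mean = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Between-record mean square: variation of record means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-record mean square: disagreement among reviewers of the same record.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical scores: three case notes, each rated by a pair of reviewers.
example_scores = [[1, 2], [3, 4], [5, 6]]
example_icc = icc_oneway(example_scores)
```

Values near 1 indicate that most of the variation in scores lies between records rather than between reviewers of the same record. Other ICC forms (for example two-way models) can give different values on the same data.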
Results: Overall, 1473 holistic and 1389 criterion-based reviews were undertaken in the first part of the study. When same staff-type reviewer pairs/groups reviewed the same record, holistic scale score inter-rater reliability was moderate within each of the three staff groups (ICC 0.46-0.52), and inter-rater reliability for criterion-based scores was moderate to good (ICC 0.61-0.88). When different staff-type pairs/groups reviewed the same record, agreement between the reviewer pairs/groups was weak to moderate for overall care (ICC 0.24-0.43). Comparison of holistic review score and criterion-based score of case notes reviewed by doctors and by non-clinical audit staff showed a reasonable level of agreement (p-values for difference 0.406 and 0.223, respectively), although results from all three staff types showed no overall level of agreement (p-value for difference 0.057). Detailed qualitative analysis of the textual data indicated that the three staff types tended to provide different forms of commentary on quality of care, although there was some overlap between some groups. In the process-outcome study, criterion-based scores were generally high for all hospitals, whereas the holistic review overall scale scores showed more interhospital variation. Textual commentary on the quality of care verified the holistic scale scores. Differences among hospitals with regard to the relationship between mortality and quality of care were not statistically significant.
Conclusions: Using the holistic approach, the three groups of staff appeared to interpret the recorded care differently when they each reviewed the same record. When the same clinical record was reviewed by doctors and non-clinical audit staff, there was no significant difference between the assessments of quality of care generated by the two groups. All three staff groups performed reasonably well when using criterion-based review, although the quality and type of information provided by doctors was of greater value. Therefore, when measuring quality of care from case notes, consideration needs to be given to the method of review, the type of staff undertaking the review, and the methods of analysis available to the review team. Review can be enhanced using a combination of both criterion-based and structured holistic methods with textual commentary, and variation in quality of care can best be identified from a combination of holistic scale scores and textual data review.