In the present study, two raters, a psychologist and a nurse, each made five independent ratings of 30 video-recorded patient examinations. Because the recordings held each patient's presentation constant, patient fluctuation was excluded, allowing individual-rater consistency and between-rater agreement to be examined over the 6 weeks of the study. Although between-rater agreement appeared to be maintained, mean AIMS scores rose steadily. In the hands of these raters, AIMS items 2 and 4 proved highly reliable, whereas items 1, 6, and 7 showed high variability. Some patients appeared to be difficult to rate. Differences between the study raters and the author JB highlight the central question: how reproducible is an AIMS rating?