Context: Single-best-answer questions (SBAQs) have been widely used to test knowledge because they are easy to mark and demonstrate high reliability. However, SBAQs have been criticised for being subject to cueing.
Objectives: We used a novel assessment tool that facilitates efficient marking of open-ended very-short-answer questions (VSAQs). We compared VSAQs with SBAQs with regard to reliability, discrimination and student performance, and evaluated the acceptability of VSAQs.
Methods: Medical students were randomised to sit a 60-question assessment delivered first in VSAQ and then in SBAQ format (Group 1, n = 155), or in the reverse order (Group 2, n = 144). The VSAQs were delivered on a tablet; responses were computer-marked and subsequently reviewed by two examiners. The standard error of measurement (SEM) across the ability spectrum was estimated using item response theory.
Results: Examiner review of the machine-marked responses took an average of 1 minute 36 seconds per question across all students. The VSAQs had high reliability (alpha: 0.91), a significantly lower SEM than the SBAQs (p < 0.001) and higher mean item–total point biserial correlations (p < 0.001). The VSAQ scores were significantly lower than the SBAQ scores (p < 0.001). The difference in scores between VSAQs and SBAQs was attenuated in Group 2. Although 80.4% of students found the VSAQs more difficult, 69.2% found them more authentic.
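Two of the statistics reported above, Cronbach's alpha and item–total point-biserial correlation, are straightforward to compute from a matrix of dichotomously marked responses. The sketch below is illustrative only and uses simulated data, not the study's; the sample sizes and the logistic response model are assumptions for the example, not part of the reported methods.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_students, n_items) 0/1 score matrix."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def item_total_point_biserial(scores):
    """Point-biserial correlation of each item with the rest-of-test
    total (the item itself is excluded to avoid inflating the value)."""
    n_items = scores.shape[1]
    totals = scores.sum(axis=1)
    r = np.empty(n_items)
    for j in range(n_items):
        rest = totals - scores[:, j]
        r[j] = np.corrcoef(scores[:, j], rest)[0, 1]
    return r

# Simulated responses: 300 students, 60 dichotomous items, with
# correct-answer probability driven by ability minus item difficulty
# (a hypothetical logistic model, chosen only for this illustration).
rng = np.random.default_rng(0)
ability = rng.normal(size=(300, 1))
difficulty = rng.normal(size=(1, 60))
p_correct = 1 / (1 + np.exp(-(ability - difficulty)))
scores = (rng.random((300, 60)) < p_correct).astype(float)

alpha = cronbach_alpha(scores)
r_pb = item_total_point_biserial(scores)
print(f"alpha = {alpha:.2f}, mean item-total r_pb = {r_pb.mean():.2f}")
```

Because a binary item's point-biserial correlation with a continuous total equals the Pearson correlation between them, `np.corrcoef` suffices; no specialised routine is needed.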
Conclusions: The VSAQ format demonstrated high reliability and discrimination and items were perceived as more authentic. The SBAQ format was associated with significant cueing. The present results suggest the VSAQ format has a higher degree of validity.
- short-answer questions
- open-ended very-short-answer questions
- medical students
- medical education