Context: Clinical-vignette multiple-choice question (MCQ) examinations are used widely in medical education. Standardised MCQ examinations are used by licensure and certification bodies to award credentials that are meant to assure stakeholders as to the quality of physicians. Such uses are based on the interpretation of MCQ examination performance as giving meaningful information about the quality of clinical reasoning. Several assumptions are foundational to these interpretations and uses of standardised MCQ examinations. This study explores the implicit assumption that the cognitive processes elicited by clinical-vignette MCQ items resemble the processes thought to occur in ‘real-world’ clinical reasoning, as theorised by dual-process theory.

Methods: Fourteen participants (three medical students, five residents and six staff physicians) completed three sets of five timed MCQ items (15 in total) from the Medical Knowledge Self-Assessment Program (MKSAP). Upon answering a set of MCQs, each participant completed a retrospective think-aloud (TA) protocol. Using constant comparative analysis (CCA) methods sensitised by dual-process theory, we performed a qualitative thematic analysis.

Results: Examinee behaviours fell into three categories: clinical reasoning behaviours, test-taking behaviours and reactions to the MCQ. Consistent with dual-process theory, statements about clinical reasoning behaviours were divided into two sub-categories: analytical reasoning and non-analytical reasoning. Each of these categories included several themes.

Conclusions: Our study provides some validity evidence that test-takers’ descriptions of their cognitive processes during completion of high-quality clinical-vignette MCQs align with processes expected in real-world clinical reasoning. This supports one of the assumptions important for interpretations of MCQ examination scores as meaningful measures of clinical reasoning.
Our observations also suggest that MCQs elicit other cognitive processes, including certain test-taking behaviours, that seem ‘inauthentic’ to real-world clinical reasoning. Further research is needed to explore whether similar themes arise in other contexts (e.g. simulated patient encounters) and how the observed behaviours relate to performance on MCQ-based assessments.