Background: Physical therapists use low back pain (LBP) classification systems to classify patients into subgroups. These systems rely on clinician observation, so poor reliability among raters exposes them to rater bias and erroneous classification decisions. Evidence of rater reliability for individual systems within subgroups of LBP is therefore important to justify their continued use. Objectives: To investigate the reliability of LBP classification systems when applied exclusively to chronic low back pain (CLBP) populations. Methods: A systematic electronic search of Medline, CINAHL, PEDro, The Cochrane Library, Informit, and Scopus was conducted. Studies were included if they reported detailed reliability statistics for one or more LBP classification systems exclusively in CLBP populations. Two independent reviewers used the Quality Appraisal of Reliability Studies (QAREL) tool to evaluate the quality and risk of bias of each study. Results: Four eligible studies were identified. The O'Sullivan Classification System (OCS) and the Movement System Impairment Classification (MSI) were the only systems assessed for inter-rater reliability in CLBP populations. Inter-rater reliability was substantial for the MSI and ranged from fair to almost perfect for the OCS. However, risk of bias was high in the included studies. Reported inter-rater reliability appeared to be inversely related to study quality: studies at higher risk of bias tended to report higher reliability. Conclusions: This review found insufficient evidence to draw conclusions about inter-rater reliability when LBP classification systems are applied to CLBP populations. Recommendations that rely on these systems to classify patients reliably across therapists should therefore be considered with caution.
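
The descriptors used in the Results ("fair", "substantial", "almost perfect") match the agreement bands commonly attributed to Landis and Koch for the kappa statistic. The abstract does not state which reliability statistic or benchmark the included studies used, so the following is only an illustrative sketch under that assumption. The function names and the patient labels are hypothetical and are not taken from any included study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of patients")
    n = len(rater_a)
    # Observed agreement: proportion of patients given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch descriptors (assumed benchmark)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical subgroup labels for 12 patients; the raters disagree on 2 of them.
rater_a = ["flexion"] * 6 + ["extension"] * 6
rater_b = ["flexion"] * 5 + ["extension"] + ["extension"] * 5 + ["flexion"]

kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2), landis_koch_label(kappa))
```

In this made-up example the raters agree on 10 of 12 patients and chance agreement is 0.5, giving a kappa of about 0.67, which falls in the "substantial" band. Kappa corrects raw agreement for chance, which is why it is generally preferred over percentage agreement when comparing classification systems.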