Comparison of 8 methods for univariate statistical exclusion of pathological subpopulations for indirect reference intervals and biological variation studies

Rui Zhen Tan, Corey Markus, Samuel Vasikaran, Tze Ping Loh, for the APFCB Harmonization of Reference Intervals Working Group

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Indirect reference intervals and biological variation studies rely heavily on statistical methods to separate pathological and non-pathological subpopulations within the same dataset. In recognition of this, we compare the performance of eight univariate statistical methods for identification and exclusion of values originating from pathological subpopulations. Methods: The eight approaches examined were: Tukey's rule with and without Box-Cox transformation; median absolute deviation; double median absolute deviation; Gaussian mixture models; van der Loo (Vdl) methods 1 and 2; and the Kosmic approach. Four scenarios, including lognormal distributions, were simulated, with conditions varied through the number of pathological populations and their central location, spread and proportion, for a total of 256 simulated mixed populations. A performance criterion of ± 0.05 fractional error from the lower and upper limits of the true underlying reference interval was chosen. Results: Overall, the Kosmic method stood out, with the highest number of scenarios lying within the acceptable error, followed by Vdl method 1 and Tukey's rule. Kosmic and Vdl method 1 appear to better discriminate the non-pathological reference population in the case of lognormally distributed data. When the proportion and spread of pathological subpopulations were high, the performance of statistical exclusion deteriorated considerably. Discussion: It is important that laboratories use a priori defined clinical criteria to minimise the proportion of pathological subpopulations in a dataset prior to analysis. The curated dataset should then be carefully examined so that the appropriate statistical method can be applied.
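
To make the evaluation idea in the abstract concrete, the following is a minimal Python sketch of two of the simpler exclusion rules (Tukey's rule and the median absolute deviation) applied to one simulated mixed population, scored against the ± 0.05 fractional error criterion. It is not the authors' code or simulation design: the function names, the Gaussian parameters (mean 100, SD 10 for the non-pathological population; mean 160, SD 10 for a 10% pathological subpopulation), the fence multipliers (1.5 and 3) and the use of nonparametric 2.5th and 97.5th percentiles are all assumptions chosen for illustration.

```python
import random
import statistics


def tukey_limits(values, k=1.5):
    """Tukey fences: Q1 - k*IQR and Q3 + k*IQR (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def mad_limits(values, k=3.0):
    """Median +/- k * scaled MAD; 1.4826 scales the MAD to an SD for Gaussian data."""
    med = statistics.median(values)
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    return med - k * mad, med + k * mad


def reference_interval(values):
    """Nonparametric central 95% interval (2.5th and 97.5th percentiles)."""
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def fractional_error(estimate, truth):
    """Signed error of an estimated limit relative to the true limit."""
    return (estimate - truth) / truth


def within_criterion(errors, tolerance=0.05):
    """True if both the lower and upper limit errors are within +/- tolerance."""
    return all(abs(e) <= tolerance for e in errors)


def evaluate(seed=0):
    """Simulate one mixed population (illustrative parameters) and score each rule."""
    rng = random.Random(seed)
    healthy = [rng.gauss(100, 10) for _ in range(9000)]      # assumed reference population
    pathological = [rng.gauss(160, 10) for _ in range(1000)]  # assumed 10% contamination
    mixed = healthy + pathological

    # True 95% reference interval of the assumed Gaussian reference population.
    true_lower, true_upper = 100 - 1.96 * 10, 100 + 1.96 * 10

    rules = {
        "none": (float("-inf"), float("inf")),  # no exclusion, for comparison
        "tukey": tukey_limits(mixed),
        "mad": mad_limits(mixed),
    }
    results = {}
    for name, (lo, hi) in rules.items():
        retained = [v for v in mixed if lo <= v <= hi]
        est_lower, est_upper = reference_interval(retained)
        results[name] = (
            fractional_error(est_lower, true_lower),
            fractional_error(est_upper, true_upper),
        )
    return results


if __name__ == "__main__":
    for name, errors in evaluate().items():
        print(name, [round(e, 3) for e in errors], within_criterion(errors))
```

In this deliberately easy case the pathological subpopulation is well separated from the reference population. As the abstract notes, performance would be expected to worsen as the pathological proportion and spread increase, which can be explored by changing the assumed parameters in `evaluate`.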

Original language: English
Pages (from-to): 16-24
Number of pages: 9
Journal: Clinical Biochemistry
Volume: 103
Early online date: 15 Feb 2022
Publication status: Published - May 2022

Keywords

  • Biological variation
  • Data mining
  • Indirect approach
  • Outlier
  • Outlier exclusion
  • Reference intervals
