Randomised trials involving infants from both single and multiple births present unique statistical challenges. A range of methods has been used to analyse such data, including standard methods which treat all infants as independent, and more complex methods which account for the dependence between outcomes of infants from the same pregnancy. Conflicting recommendations have been made on whether and when this dependence, or clustering, should be taken into account in the analysis. Using real and simulated datasets, we compared the performance of ordinary logistic regression, which ignores the clustering, with that of logistic generalised estimating equations (GEEs) and mixed effects models (MEMs), which account for it. Ordinary logistic regression produced appropriate type I error and coverage rates, provided the dependence between outcomes of infants from the same pregnancy was small and the multiple birth rate was low, but performed poorly otherwise: its type I error rate increased and its coverage rate decreased as either the strength of the dependence or the multiple birth rate increased. In contrast, logistic GEEs maintained appropriate type I error and coverage rates across a wide range of settings. The performance of logistic MEMs varied depending on the setting and the estimation procedure used, but was often similar to or better than that of ordinary logistic regression. We recommend using a method which takes the clustering into account when analysing datasets including infants from multiple births.
- generalised estimating equations
- logistic regression
- mixed effects models
- multiple births
- statistical methodology