Neonatal infection remains a primary cause of infant morbidity and mortality worldwide, yet our understanding of how human neonates respond to infection remains incomplete. Changes in host gene expression in response to infection may occur in any part of the body, and the continuous interaction between blood and tissues allows blood cells to act as biosensors for these changes. In this study we used whole blood transcriptome profiling to systematically identify signatures and the pathway biology underlying the pathogenesis of neonatal infection. Blood samples were collected from neonates at the first clinical signs of suspected sepsis and from age-matched healthy control subjects. Here we report a detailed description of the study design, including the clinical data collected, the experimental methods used and the data analysis workflows, all of which correspond to the data deposited in the Gene Expression Omnibus (GEO) under accession GSE25504. Our data set has allowed identification of a patient-invariant 52-gene classifier that predicts bacterial infection with high accuracy and lays the foundation for advancing diagnostic, prognostic and therapeutic strategies for neonatal sepsis.