The UMCU uses big data to help neonatologists make decisions at the NICU

Executive Summary

Big Data 4 Small Babies, a team of machine learning engineers from Xomnia and developers at the University Medical Center in Utrecht (UMCU), has developed a model to help doctors determine with more certainty whether or not a premature baby has late-onset neonatal sepsis (LOS).

Given the risk associated with LOS, a doctor may choose to administer antibiotics to babies in the NICU at the slightest suspicion of infection. This strong treatment, however, is not always necessary, as test results sometimes show that the doctor’s suspicion was not accurate.

While it is still in the minimum viable product (MVP) phase, the predictive model aims to provide doctors with the probability that a baby showing symptoms of LOS really has it. Preliminary results show that the model has the potential to lead to 20% fewer administered antibiotic treatments (the model has yet to be tested to confirm or refute this estimate).


The Neonatal Intensive Care Unit (NICU) at the UMC Utrecht is one of the top 10 NICUs in the Netherlands. At the NICU, prematurely born babies, as well as babies born with defects or illnesses, are carefully monitored and treated. A common disease among babies at the NICU is late-onset neonatal sepsis (LOS), which can be challenging to diagnose accurately due to its varying, non-specific symptoms.

Since LOS can sometimes be fatal, doctors often prescribe antibiotic treatments to babies who show some of its symptoms, even before medical tests can conclusively diagnose the infection. For instance, in 60% of the cases, antibiotics were given to patients as a precautionary treatment, but the blood tests later turned out to be negative. The unnecessary administration of antibiotics has a negative effect on the babies’ quality of life, and is associated with a higher chance of bacterial resistance to antibiotics in their bodies.

Big Data 4 Small Babies’ goal was to use big data to assist neonatologists in the decision-making process, with the ultimate aim of lowering the amount of antibiotics administered. The desired outcome is greater health benefits for the patients and lower costs for the hospital.


In collaboration with other partners, such as Finaps in the Applied Data Analytics in Medicine (ADAM) program, the team created a classification model that serves as a decision support tool for neonatologists. To create the model, the team used data collected from various sources at the NICU, such as blood test results and heart rates, among others. This data was combined with the assistance of a medical expert and later cleaned. It was then used to engineer the model’s features and to develop a decision tree, which is a type of machine learning model.
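As an illustration of the combining, cleaning and feature engineering steps described above, here is a minimal sketch in Python. It is not the team's actual pipeline: the record types, the field names (`bpm`, `lab_value`) and the derived features (`mean_bpm`, `max_bpm`, `latest_lab_value`) are hypothetical and chosen only to show the general idea.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class HeartRateReading:
    patient_id: str
    bpm: Optional[float]  # None marks a missing reading (hypothetical convention)


@dataclass
class BloodTest:
    patient_id: str
    lab_value: Optional[float]  # hypothetical numeric lab result


def build_features(heart_rates, blood_tests):
    """Combine both sources per patient, drop missing values, derive features."""
    collected = {}
    for reading in heart_rates:
        if reading.bpm is None:
            continue  # cleaning step: skip missing heart rate readings
        entry = collected.setdefault(reading.patient_id, {"bpm": [], "lab": []})
        entry["bpm"].append(reading.bpm)
    for test in blood_tests:
        if test.lab_value is None:
            continue  # cleaning step: skip missing lab results
        entry = collected.setdefault(test.patient_id, {"bpm": [], "lab": []})
        entry["lab"].append(test.lab_value)

    rows = {}
    for patient_id, values in collected.items():
        if not values["bpm"] or not values["lab"]:
            continue  # keep only patients with usable data from both sources
        rows[patient_id] = {
            "mean_bpm": mean(values["bpm"]),
            "max_bpm": max(values["bpm"]),
            "latest_lab_value": values["lab"][-1],  # assumes input is in time order
        }
    return rows
```

Each resulting row could then serve as input to a classifier such as a decision tree. The real feature set used by the team is not described in this article.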

The outcome is a model with a technical backend, where the technical team can import data and receive a prediction, and a private website that doctors will be able to use to determine the likelihood of a specific patient at the NICU having LOS. The website is currently hosted on an internal server run by the department, and there are plans to integrate it into the systems that the doctors use.

Upon the solution’s completion, doctors will be able to enter a patient’s name and number and run a test. The model will then perform a calculation based on the data about the patient’s vitals extracted from the databases, and give the doctor an answer in the form of a high or low probability that the baby is sick. The ultimate treatment decision, however, remains in the hands of the doctor.
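The step from a model probability to a "high" or "low" answer can be sketched as follows. This is only an illustration: the function name `risk_label` and the default cut-off of 0.5 are assumptions, and the article does not state which threshold the actual tool uses.

```python
def risk_label(probability: float, threshold: float = 0.5) -> str:
    """Map a predicted LOS probability to a coarse 'high' or 'low' label.

    The threshold is a hypothetical value, not one taken from the real model.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return "high" if probability >= threshold else "low"
```

In practice, such a cut-off would need to be chosen and validated clinically, since it determines how cautious the tool is, and the doctor still makes the final decision.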


Due to the European Medical Device Regulation (MDR) and the high impact of the model’s predictions, the model and its integrated application need to be thoroughly evaluated before clinical implementation.

The first testing steps were taken by running a pilot, in which doctors recorded cases of suspected sepsis, and the reasoning behind their suspicion, for 20 babies at the NICU. These records were subsequently compared to the model output. In this limited test setup, the model showed that it has the potential to lead to 20% fewer administered treatments. However, since the model is still an MVP (minimum viable product), further validation steps are required. Therefore, the full scope of this model is yet to be explored after it enters the implementation phase.