Neural Network Training with Highly Incomplete Datasets published in Machine Learning: Science and Technology

Working principles for training neural networks with highly incomplete datasets: vanilla (upper panel) vs GapNet (lower panel). (Image by Yu-Wei Chang.)
Neural Network Training with Highly Incomplete Datasets
Yu-Wei Chang, Laura Natali, Oveis Jamialahmadi, Stefano Romeo, Joana B. Pereira, Giovanni Volpe
Machine Learning: Science and Technology 3, 035001 (2022)
arXiv: 2107.00429
doi: 10.1088/2632-2153/ac7b69

Neural network training and validation rely on the availability of large high-quality datasets. However, in many cases only incomplete datasets are available, particularly in health care applications, where each patient typically undergoes different clinical procedures or can drop out of a study. Since the data to train the neural networks need to be complete, most studies discard the incomplete datapoints, which reduces the size of the training data, or impute the missing features, which can lead to artefacts. Alas, both approaches are inadequate when a large portion of the data is missing. Here, we introduce GapNet, an alternative deep-learning training approach that can use highly incomplete datasets. First, the dataset is split into subsets of samples containing all values for a certain cluster of features. Then, these subsets are used to train individual neural networks. Finally, this ensemble of neural networks is combined into a single neural network whose training is fine-tuned using all complete datapoints. Using two highly incomplete real-world medical datasets, we show that GapNet improves the identification of patients with underlying Alzheimer’s disease pathology and of patients at risk of hospitalization due to Covid-19. By distilling the information available in incomplete datasets without having to reduce their size or to impute missing values, GapNet will make it possible to extract valuable information from a wide range of datasets, benefiting diverse fields from medicine to engineering.
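
The first step described in the abstract, splitting the dataset into subsets that are complete for a given cluster of features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cluster names, the toy data, and the use of None to mark a missing value are all assumptions made for illustration. The training of the individual networks and the final fine-tuning are not shown.

```python
def split_by_feature_cluster(samples, clusters):
    """For each feature cluster, keep the samples that have every feature of that cluster.

    samples  : list of rows; a missing value is encoded as None (an assumption).
    clusters : dict mapping a hypothetical cluster name to a list of column indices.
    Returns a dict: cluster name -> (row indices, rows restricted to the cluster columns).
    """
    subsets = {}
    for name, cols in clusters.items():
        # A sample is usable for this cluster if none of the cluster's features is missing.
        rows = [i for i, s in enumerate(samples) if all(s[c] is not None for c in cols)]
        values = [[samples[i][c] for c in cols] for i in rows]
        subsets[name] = (rows, values)
    return subsets


def complete_rows(samples):
    """Indices of samples with every feature present (the ones usable for fine-tuning)."""
    return [i for i, s in enumerate(samples) if all(v is not None for v in s)]


# Toy data: 5 samples, 4 features; columns 0-1 and 2-3 form two made-up clusters.
samples = [
    [1.0, 2.0, None, None],
    [0.5, 1.5, 3.0, 4.0],
    [None, None, 2.0, 1.0],
    [0.1, 0.2, None, 0.3],
    [0.9, 0.8, 0.7, 0.6],
]
clusters = {"cluster_a": [0, 1], "cluster_b": [2, 3]}

subsets = split_by_feature_cluster(samples, clusters)
for name, (rows, values) in subsets.items():
    print(name, "uses samples", rows)
print("complete samples:", complete_rows(samples))
```

In this toy example each cluster keeps more samples than the two fully complete rows, which is the reason for training on per-cluster subsets before fine-tuning on the complete datapoints.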

Calle Andersson, Jesper Bergquist, Karim Hasseli, Wilhelm Henriksson, Max Jisonsund and Amandus Reimer defended their Bachelor Thesis at Chalmers University of Technology on 25 May 2022. Congrats!

Calle Andersson, Jesper Bergquist, Karim Hasseli, Wilhelm Henriksson, Max Jisonsund and Amandus Reimer presenting their bachelor thesis. (Photo by L. Natali.)

Title: Simuleringsprogram för epidemihantering med hjälp av artificiell intelligens (Simulation program for epidemic management using artificial intelligence)

Summary (translated from Swedish): Throughout human history, epidemics have occurred with some regularity and have caused great loss of human life. A disease that spreads widely among people can be called an epidemic. What mainly determines the threat an epidemic poses to a society is how easily the disease spreads, how lethal it is, and how long an infected individual remains ill. This work was started with the aim of giving a user without programming knowledge the opportunity to investigate these phenomena in a simulated environment. The end result is a program that simulates the spread of infection in a society and that a user can run in a web browser. In the program, a user can observe and control a simulated society and, among other things, set several parameters related to the disease and the structure of the society. The user can also choose to let a disease spread freely or to activate a containment strategy, whose goal is to minimize the spread of the disease through various methods. One of these strategies uses artificial intelligence to try to reduce the spread of infection. There are two versions of the program. The first lets the user see visually how a disease spreads in a society, and the second enables a deeper analysis of simulations with chosen parameters. A number of case studies have also been carried out to observe differences between simulations.

Abstract: Throughout history, epidemics have appeared with some regularity and they have led to the loss of many human lives. When a disease spreads among many humans in a given population it can be defined as an epidemic. The threat of a disease mainly depends on its transmission rate, lethality and duration of infection in a host. In this project, we develop a simulation program for the evolution of an epidemic in a given society. The simulation program is web-based and designed for use by individuals without prior knowledge of programming. In the simulation program, the user has the freedom to modify some parameters related to an epidemic, such as population size, infection rate, recovery rate and death rate, to name a few. The user can also choose between allowing the disease to spread freely or activating a containment strategy. One of these strategies uses artificial intelligence. There are two different versions of the simulation program: the first version creates a visualization of how an epidemic evolves over time, and the second version enables a deeper analysis of a simulation with the chosen parameters. A number of case studies have been made to observe the contrasts between different simulations.

Supervisors: Laura Natali and Giovanni Volpe, Department of Physics, University of Gothenburg

Examiner: Lena Falk, Department of Physics, Chalmers University of Technology

Place: FL41
Date: 25 May, 2022
Time: 10:40

Laura Natali presented her half-time seminar on 1 April 2022

Opponent Bernhard Mehlig (left), Laura Natali (center), and PhD supervisor Giovanni Volpe (right). (Photo by L. Perez.)
Laura Natali completed the first half of her doctoral studies and defended her half-time seminar on 1 April 2022.

The presentation was held in hybrid format, with part of the audience in the Von Bahr room and the rest connected through Zoom. The half-time seminar consisted of a presentation about her past and planned projects, followed by a discussion and questions posed by her opponent Bernhard Mehlig.

The presentation started with a description of her concluded project on employing neural networks in an epidemic agent-based model, published as Improving epidemic testing and containment strategies using machine learning in Machine Learning: Science and Technology. It continued with her second project, on handling incomplete medical datasets with neural networks, available online as the preprint Neural Network Training with Highly Incomplete Datasets on arXiv. In the last section, she outlined the proposed continuation of her PhD, with an ongoing project combining artificial active matter with neural networks.

Visit by Claus Roll, OPTICA director in Europe, 19 November 2021

Claus Roll is visiting the Soft Matter Lab on 19 November 2021.

Claus is the director in Europe of OPTICA (formerly OSA) and he will be in Gothenburg for a hybrid event organised together with the local OPTICA student chapter and the FFF (Föreningen för Forskarstuderande i Fysik) group.

The visit starts with a tour of different labs, including the Soft Matter and Biophysics lab. The tour is followed by a hybrid career seminar by Claus Roll, held both in person and online, starting at 10:30. The presentation is followed by a social lunch and a networking session.

Presentation by L. Natali at Spatial Data Science 2020, 11 June 2021

Comparison of different evolution regimes of disease spreading: free evolution (bottom left half) vs network strategy (top right half). (Image by Laura Natali.)
Improving epidemic testing and containment strategies using machine learning. 
Laura Natali, Saga Helgadottir, Onofrio M. Maragò, Giovanni Volpe.
Submitted to SDS2020
Date: 11 June
Time: 16:15 (CEST)

Containment of epidemic outbreaks entails great societal and economic costs. Cost-effective containment strategies rely on efficiently identifying infected individuals, making the best possible use of the available testing resources. Therefore, quickly identifying the optimal testing strategy is of critical importance. Here, we demonstrate that machine learning can be used to identify which individuals are most beneficial to test, automatically and dynamically adapting the testing strategy to the characteristics of the disease outbreak. Specifically, we simulate an outbreak using the archetypal susceptible-infectious-recovered (SIR) model and we use data about the first confirmed cases to train a neural network that learns to make predictions about the rest of the population. Using these predictions, we manage to contain the outbreak more effectively and more quickly than with standard approaches. Furthermore, we demonstrate how this method can also be used when there is a possibility of reinfection (SIRS model) to efficiently eradicate an endemic disease.
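
For readers unfamiliar with the SIR model mentioned in the abstract, the following is a minimal sketch of its free evolution, written as a deterministic discrete-time compartmental model. It is an illustrative simplification and not the simulation used in the work; the function name and all parameter values (population, beta, gamma, number of steps) are assumptions chosen for the example. Testing, containment, and the neural network are not modelled.

```python
def simulate_sir(population, initial_infected, beta, gamma, steps):
    """Discrete-time SIR model; returns a list of (S, I, R) tuples, one per time step.

    beta  : infection rate per step (assumed value in the example below).
    gamma : recovery rate per step, expected to lie between 0 and 1.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        # New infections are proportional to contacts between susceptible
        # and infectious individuals, capped by the susceptibles available.
        new_infections = min(s, beta * s * i / population)
        # A fixed fraction of the infectious individuals recovers each step.
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


history = simulate_sir(population=1000, initial_infected=10, beta=0.3, gamma=0.1, steps=100)
peak_step, peak = max(enumerate(h[1] for h in history), key=lambda t: t[1])
print(f"peak of {peak:.0f} infectious individuals at step {peak_step}")
print(f"final recovered fraction: {history[-1][2] / 1000:.2f}")
```

A containment strategy would act on this free evolution by removing infectious individuals from circulation earlier, which in this sketch would correspond to lowering the effective beta or raising the effective gamma.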

Press release on Machine learning can help slow down future pandemics

Comparison of different evolution regimes of disease spreading: free evolution (bottom left half) vs network strategy (top right half). (Image by Laura Natali.)

The article Improving epidemic testing and containment strategies using machine learning has been featured in the News of the Faculty of Science of the University of Gothenburg.

Here are the links to the press releases:
Swedish: Maskininlärning kan bidra till att bromsa framtida pandemier
English: Machine learning can help slow down future pandemics

The article was also featured in:
AI ska bromsa framtidens pandemier Metal Supply (23/04/2021)
El papel de la inteligencia artificial para frenar futuras pandemias El (16/04/2021)
AI could be critical to preventing future pandemics – study Health Tech World. (16/04/2021)
Machine Learning Slows Down Future Pandemics MedIndia. (15/04/2021)
Machine Learning May Be Key to Avoiding the Next Possible Pandemic (15/04/2021)
Så kan AI bromsa nästa pandemi – svensk forskning förfinar testningen Computer Sweden (15/04/2021)
AI could prevent future pandemics Electronics360 (14/04/2021)
L’IA peut contribuer à limiter la propagation des infections lors des futures épidémies (étude) Ecofin Telecom. (14/04/2021)
Machine Learning can help slow down future pandemics: Study (14/04/2021)
Machine learning can help slow down future pandemics —ScienceDaily Sortiwa Trending Viral News Portal (14/04/2021)
AI mot smittspridning Sveriges Radio Vetenskapsradion. (14/04/2021)

Career Seminar by OSA Ambassador Aura Higuera Rodriguez, 6 October 2020

Aura Higuera, Technical Account Manager PIC technology and Optical Society Ambassador

The OSA Chapter of Gothenburg together with the Association of Graduate Students in Physics (FFF) will organise an online career talk by Aura Higuera Rodriguez, OSA ambassador and Technical Account Manager at Synopsys Photonic Solutions. The seminar will take place online via Zoom on Tuesday 6th of October at 5 pm.

Aura completed her PhD at Eindhoven University of Technology in the photonic integration group. Afterwards, she worked at the Photonic Integration Technology Center as JePPIX Coordinator and Application Support, and in January 2019 she joined Synopsys Photonic Solutions as Technical Account Manager. More about Aura on the OSA website.

Aura is among the OSA Ambassadors 2020 and her talk will focus on Career Development and Emotional Intelligence. The seminar is open to all members of the University of Gothenburg and Chalmers University of Technology via registration using the institutional email.

The seminar will be held on Zoom and the link will be available one hour before the start. To attend the talk, please register on   For information or questions you can contact OSA-GU at or FFF at

Place: Zoom (online)
Date: 6 October 2020
Time: 17:00 CET