Research database

SIMBAD - Statistical Inference from Multiscale Biological Data: theory, algorithms, applications

48 months (2027)
Principal investigator(s):
Project type:
EU-funded research - HE - Excellent Science - MSCA
Funding body:
Project identification number:
PoliTo role:


The last two decades have witnessed giant experimental breakthroughs in different areas of the life sciences, from genomics to epidemiology. Thanks to modern high-throughput techniques, biological systems across multiple scales, from single molecules up to entire populations, can now be probed quantitatively at high spatial and temporal resolution. Besides enhancing our basic knowledge of a system’s constituents, these data potentially encode a wealth of information about the functional constraints that govern its evolution and the physical constraints that limit its performance, as well as about levels of organization, dynamical constraints or design principles that would be hard to identify from low-throughput data. Extracting this information is also crucial for applications ranging from the design of proteins with a desired functionality to the reconstruction of contacts during an epidemic. Inverse statistical mechanics attempts to do so by inferring generative models (Boltzmann distributions) from data using methods from the physics of disordered and random systems. However, specific characteristics of biological data, such as strong undersampling and heterogeneity, limit the effectiveness of these tools. SIMBAD aims to develop a class of statistical inference techniques capable of overcoming these issues. In SIMBAD, theoretical work will supply concepts and methods to address four pressing problems (learning protein sequence landscapes, inverse modeling of metabolic networks, inferring contact networks from epidemiological data, and improving survival analysis models), which in turn will guide the theory towards integration with the existing standards of each field. This effort promises to open new pathways for basic research to impact economic, technological and societal issues; the high-profile cross-disciplinary expertise represented in SIMBAD ensures measurable and achievable objectives, placing SIMBAD in an ideal position to achieve its goals.



ERC sectors

PE3_15 - Statistical physics: phase transitions, noise and fluctuations, models of complex systems, etc.
LS2_14 - Biological systems analysis, modelling and simulation
PE6_11 - Machine learning, statistical data processing and applications using signal processing (e.g. speech, image, video)
PE6_13 - Bioinformatics, biocomputing, and DNA and molecular computation systems, cyber-physical systems


Total cost: € 740,600.00
Total contribution: € 740,600.00
PoliTo total cost: € 119,600.00
PoliTo contribution: € 119,600.00