Research proposals 40th cycle

01 Spatio-Temporal Data Analytics on Heterogeneous Data (Prof. Paolo Garza)

02 Ultra-low latency multimedia streaming over HTTP/3 (Prof. Antonio Servetti)

03 Media Quality Optimization using Machine Learning on Large Scale Datasets  (Prof. Enrico Masala)

04 Security Analysis and Automation in Smart Systems (Prof. Fulvio Valenza)

05 Local energy markets in citizen-centered energy communities (Prof. Edoardo Patti)

06 Simulation and Modelling of V2X connectivity with traffic simulation (Prof. Edoardo Patti)

07 Machine Learning techniques for real-time State-of-Health estimation of Electric Vehicle's batteries (Prof. Edoardo Patti)

08 Multi-Device Programming for Artistic Expression (Prof. Luigi De Russis)

09 Digital Wellbeing By Design (Prof. Alberto Monge Roffarello)

10 Management Solutions for Autonomous Networks (Prof. Guido Marchetto)

11 Preserving privacy and fairness with generative AI-based synthetic data production (Prof. Antonio Vetrò)

12 Digital Twin development for the enhancement of manufacturing systems (Prof. Sara Vinco)

13 State-of-Health diagnostic framework towards battery digital twins (Prof. Sara Vinco)

14 Modeling, simulation and validation of modern electronic systems (Prof. Sara Vinco)

15 Robust AI systems for data-limited applications (Prof. Santa Di Cataldo)

16 Artificial Intelligence applications for advanced manufacturing systems (Prof. Santa Di Cataldo)

17 AI for Secured Networks: Language Models for Automated Security Log Analysis (Prof. Marco Mellia)

18 Leveraging Machine Learning Analytics for Intelligent Transport Systems Optimization in Smart Cities (Prof. Marco Mellia)

19 Natural Language Processing and Large Language Models for source code generation (Prof. Edoardo Patti)

20 Cloud continuum machine learning (Prof. Daniele Apiletti)

21 Graph network models for Data Science (Prof. Daniele Apiletti)

22 Automatic composability of Large Co-simulation Scenarios for smart energy communities (Prof. Edoardo Patti)

23 Multivariate time series representation learning for vehicle telematics data analysis (Prof. Luca Cagliero)

24 Designing a cloud-based heterogeneous prototyping platform for the development of fog computing apps (Prof. Gianvito Urgese)

25 Designing a Development Framework for Engineering Edge-Based AIoT Sensor Solutions (Prof. Gianvito Urgese)

26 Computational Intelligence for Computer-Aided Design (Prof. Giovanni Squillero)

28 Security of Software Networks (Prof. Cataldo Basile)

29 Emerging Topics in Evolutionary Computation: Diversity Promotion and Graph-GP (Prof. Giovanni Squillero)

30 Advanced ICT solutions and AI-driven methodologies for Cultural Heritage resilience (Prof. Edoardo Patti)

31 Monitoring systems and techniques for precision agriculture (Prof. Renato Ferrero)

32 Designing heterogeneous digital/neuromorphic fog computing systems and development framework  (Prof. Gianvito Urgese)

33 Cloud at the edge: creating a seamless computing platform with opportunistic datacenters (Prof. Fulvio Giovanni Ottavio Risso)

34 AI-driven cybersecurity assessment for automotive (Prof. Luca Cagliero)

35 Applications of Large Language Models in time-evolving scenarios (Prof. Luca Cagliero)

36 Building Adaptive Embodied Agents in XR to Enhance Educational Activities (Prof. Andrea Bottino)

37 Real-Time Generative AI for Enhanced Extended Reality (Prof. Andrea Bottino)

38 Transferable and efficient robot learning across tasks, environments, and embodiments (Prof. Raffaello Camoriano)

39 Neural Network reliability assessment and hardening for safety-critical embedded systems (Prof. Matteo Sonza Reorda)

40 Design of an integrated system for testing headlamp optical functionalities (Prof. Bartolomeo Montrucchio)

41 Machine unlearning (Prof. Elena Maria Baralis)

42 Generative AI models for enhanced text-to-image synthesis (Prof. Lia Morra)

43 Test, reliability, and safety of intelligent and dependable devices supporting sustainable mobility (Prof. Riccardo Cantoro)

44 Cybersecurity for a quantum world (Prof. Antonio Lioy)

45 Bridging Human Expertise and Generative AI in Software Engineering (Prof. Luca Ardito)

46 Explaining AI (XAI) models for spatio-temporal data (Prof. Elena Maria Baralis)

47 Advanced data modeling and innovative data analytics solutions for complex application domains (Prof. Silvia Anna Chiusano)

48 Functional Safety Techniques for Automotive oriented Systems-on-Chip (Prof. Paolo Bernardi)

49 Human-aware robot behaviour learning for HRI (Prof. Giuseppe Bruno Averta)

01

Spatio-Temporal Data Analytics on Heterogeneous Data

Proposer

Paolo Garza

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it/, https://linksfoundation.com/

Summary of the proposal

Spatio-Temporal data are continuously increasing (e.g., remote sensing images, LiDAR acquisitions, and time series collected from IoT sensors). Although Spatio-Temporal data have been extensively studied, most current data analytics approaches do not effectively manage their heterogeneous nature, especially in the aforementioned domains, with most state-of-the-art approaches focusing on one modality at a time. The primary goal of this proposal is to design innovative AI approaches that solve practical tasks by leveraging the multimodality and heterogeneity of the information conveyed by multiscale and multitemporal geospatial data sources.

Research objectives and methods

The main objective of this research activity is to design machine learning algorithms aimed at big data-driven applications to analyze heterogeneous Spatio-Temporal data (e.g., images from satellites, aerial vehicles or UAVs, 3D acquisitions from LiDAR sensors, or time series collected from IoT sensors), considering both descriptive and predictive problems.

The main research questions that will be addressed are as follows.

Heterogeneity. Several data sources are available, characterized by different data types and modalities. Each data source represents a facet of the analyzed phenomena and provides additional insights, especially when adequately integrated with other sources. Innovative integration techniques based, for instance, on latent spaces will be studied to leverage the opportunities provided by such diverse data sources. An effective integration of heterogeneous modalities could enable better performances of AI-based tasks.

Scalability. Spatio-Temporal data are frequently big (e.g., vast collections of remote sensing data). Hence, big data solutions are needed to process and analyze them, mainly when historical data are analyzed.

Timeliness. Timeliness is crucial in several domains (e.g., emergency management). Efficient machine learning algorithms shall be designed and implemented to deal with rapid and near real-time scenarios, with an eye towards practical and deployable solutions.
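As a concrete illustration of the heterogeneity question above, a per-modality encoding into a shared latent space could be sketched as follows. This is a minimal sketch: the feature sizes, random linear encoders, and mean fusion rule are illustrative assumptions, not the architecture the research will design.

```python
import numpy as np

# Hedged sketch: fusing two heterogeneous modalities (e.g., satellite image
# features and IoT time-series features) into a shared latent space via
# per-modality linear encoders. All dimensions are illustrative placeholders.
rng = np.random.default_rng(0)

D_IMG, D_TS, D_LATENT = 512, 64, 128  # assumed per-modality feature sizes

# Random linear encoders stand in for learned projection networks.
W_img = rng.standard_normal((D_IMG, D_LATENT)) / np.sqrt(D_IMG)
W_ts = rng.standard_normal((D_TS, D_LATENT)) / np.sqrt(D_TS)

def encode(img_feat: np.ndarray, ts_feat: np.ndarray) -> np.ndarray:
    """Project each modality into the common latent space and fuse by mean."""
    z_img = img_feat @ W_img   # (batch, D_LATENT)
    z_ts = ts_feat @ W_ts      # (batch, D_LATENT)
    return (z_img + z_ts) / 2.0  # simple late fusion in the shared space

batch = 4
z = encode(rng.standard_normal((batch, D_IMG)),
           rng.standard_normal((batch, D_TS)))
print(z.shape)  # (4, 128)
```

A downstream predictive or descriptive model would then operate on the fused latent representation `z` regardless of which modalities produced it.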

The work plan for the three years is organized as follows.

1st year. Analysis of the state-of-the-art algorithms and ML frameworks for heterogeneous and multimodal Spatio-Temporal data. Based on the pros and cons of the current solutions, a preliminary common data representation based on latent spaces will be studied and designed to integrate heterogeneous data effectively. Based on the proposed data representation, novel algorithms will be designed, developed, and validated on historical data related to specific domains (e.g., emergency management).

2nd year. State-of-the-art representations of multimodal Spatio-Temporal data will be further analyzed and proposed, focusing on scalable algorithms.

3rd year. The timeliness aspect will be considered, especially during the last year. Specifically, the focus will be on near real-time Spatio-Temporal data analysis based on efficient ML-based algorithms.

The activity is part of a well-established collaboration with the LINKS Foundation.

The outcomes of the research activity are expected to be published at IEEE/ACM/CVF international conferences and in journals such as: ACM Transactions on Spatial Algorithms and Systems; IEEE Transactions on Knowledge and Data Engineering; IEEE Transactions on Geoscience and Remote Sensing; IEEE Transactions on Big Data; IEEE Transactions on Emerging Topics in Computing; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Information Sciences (Elsevier); Expert Systems with Applications (Elsevier); Machine Learning with Applications (Elsevier).

Required skills

Strong background in data science fundamentals and machine and deep learning algorithms. Strong programming skills. Knowledge of frameworks such as PyTorch or Spark is advisable but not mandatory.

02

Ultra-low latency multimedia streaming over HTTP/3

Proposer

Antonio Servetti

Topics

Computer graphics and Multimedia, Parallel and distributed systems, Quantum computing

Group website

https://media.polito.it/

Summary of the proposal

The growing demand for interactive web services has also led to the need for interactive video applications, capable of accommodating a much larger audience than videoconferencing tools, but with almost the same, i.e., strict, requirements in end-to-end latency. This proposal aims to define and study new media coding and transmission techniques that will exploit new HTTP/3 features, such as QUIC and WebTransport, to improve the scalability and reduce the latency of current streaming solutions.

Research objectives and methods

Research objectives
The term "Interactive Live Video Streaming" (IVS) has been coined for one-/few-to-many streaming services that allow end-to-end latency below one second, at scale and at low cost, thus enabling some form of interaction between one or more "presenters" and the audience. IVS is particularly useful in scenarios such as i) interactive video chat rooms, ii) instant feedback from video viewers (such as polling or voting), iii) promotional elements synchronized with a live stream.
The market demand for IVS aligns with the ongoing deployment of HTTP/3, which is expected to supersede the long-standing TCP transport protocol by means of the new QUIC protocol (which is UDP-based and implements congestion control algorithms in user space).
Although QUIC has been employed for data transfer in HTTP since 2012 and is now experimentally supported by a good number of servers and browsers, it is still in the early stages of adoption for media delivery.
Ensuring an optimal balance between network efficiency and user satisfaction poses several challenges for the deployment of multimedia services using the new protocol. For instance, one challenge is how to exploit both reliable streams and unreliable datagrams in the transmission protocol according to the characteristics of different media elements. Additionally, managing quality adaptation without overloading the server, and ensuring effective caching by relay servers, even with stringent delay requirements, are also critical issues that need to be addressed.
To this aim, we plan to start from the implementation of an experimental HTTP/3 client-server application for ultra-low latency media delivery that will allow us to test and simulate different proposals and alternative solutions, and then compare their relative benefits and costs. The research will address the challenges of customizing media coding, packetization, forward error control, resource prioritization, and adaptivity in the new scenario. 
Such objectives will be pursued by using both theoretical and practical approaches. The resulting insight will then be validated in practical cases by analyzing the system's performance with simulations and actual experiments. 
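The stream-vs-datagram challenge named above can be made concrete with a small scheduling rule: QUIC offers reliable streams and unreliable datagrams, and each media element should be assigned to one or the other. The element types and the deadline threshold below are illustrative assumptions for a sketch, not part of any standard or of the proposed system.

```python
from dataclasses import dataclass

@dataclass
class MediaElement:
    kind: str         # e.g. "init_segment", "video_key", "video_delta"
    deadline_ms: int  # playout deadline relative to now (assumed known)

def pick_transport(elem: MediaElement) -> str:
    """Return 'stream' (reliable) or 'datagram' (unreliable, lower latency)."""
    # Metadata and key frames must arrive intact: use a reliable stream.
    if elem.kind in ("init_segment", "video_key"):
        return "stream"
    # Delta frames near their deadline are useless if retransmitted late:
    # send them as datagrams and accept occasional loss.
    if elem.deadline_ms < 100:  # illustrative threshold
        return "datagram"
    return "stream"

print(pick_transport(MediaElement("video_key", 50)))    # stream
print(pick_transport(MediaElement("video_delta", 40)))  # datagram
```

In the actual research, such a policy would be one of the alternatives implemented and compared inside the experimental HTTP/3 client-server framework.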
 
Outline of the research work plan
During the project's first year, the doctoral candidate will explore and gain practical experience with server and browser-related software (or libraries) for QUIC and WebTransport. Specifically, the candidate will test and investigate open-source software implementations geared toward low-delay multimedia streaming. This activity will address the creation of an experimental framework for client-server streaming of multimedia content over HTTP/3. This implementation will act as the foundation for testing, analyzing, and comparing different cutting-edge protocols under various practical scenarios, such as network conditions and media bitrate, among others. The expected outcome of this initial investigation is to produce a conference publication to present the research framework to the community and to facilitate subsequent engagement with the international research groups working on the topic.
In the second year, building on the knowledge already present in the research group and on the candidate's background, new experiments for i) bitrate adaptability to the time-varying network condition, ii) quality/delay trade-offs, iii) scalability support by means of relay nodes or CDNs, will be implemented, simulated, and tested in laboratory to analyze their performance and ability to provide a significant reduction in end-to-end latency with respect to non HTTP/3 based solutions. These results are expected to yield other conference publications and potentially a journal publication with one or more theoretical performance models of the tested systems.
In the third year, the activity will be expanded with the contribution of a media company in order to unfold new possibilities in supporting the scalability of the ultra-low delay streaming protocol via relay nodes or CDNs. The candidate will provide assistance to the company in the experimental deployment of the new solution in an industrially relevant environment. The proposed techniques will aim to produce advancements that will be targeted towards a journal publication reflecting the results that can be achieved in industrially relevant contexts.
 
List of possible venues for publications 
Possible targets for research publications (well known to the proposer) include IEEE Transactions on Multimedia, IEEE Internet Computing, ACM Transactions on Multimedia Computing Communications and Applications, Elsevier Multimedia Tools and Applications, various international conferences (IEEE ICME, IEEE INFOCOM, IEEE ICC, ACM WWW, ACM Multimedia, ACM MMSys, Packet Video).

Required skills

The candidate is expected to have a good background in multimedia coding/streaming, computer networking, and web development. A reasonable knowledge of network programming and software development in the Unix/Linux environment is appreciated.

03

Media Quality Optimization using Machine Learning on Large Scale Datasets

Proposer

Enrico Masala

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

http://media.polito.it

Summary of the proposal

Machine learning (ML) has significantly changed the way many optimization tasks are addressed. Here the focus is on optimizing the media compression and communication scenario, trying to predict users' quality of experience. Key objectives of the proposal are the creation of tools to analyze and exploit large-scale datasets using ML, so as to identify the media characteristics and features that most influence perceptual quality. Such new knowledge will be fundamental to improving existing measures and algorithms.

Research objectives and methods

In recent years, ML has been successfully employed to develop video quality estimation algorithms (see, e.g., the Netflix VMAF proposal) to be integrated into media quality optimization frameworks. However, despite these improvements, no technique can currently be considered reliable, partly because the inner workings of ML models cannot be easily and fully understood, especially when they are based on "black box" neural network models.

We aim to improve the situation by developing more reliable and explainable quality prediction models. Starting from Internet Media Group's ongoing work on modeling the behavior of single human subjects in media quality experiments, the candidate will derive a systematic approach by employing several subjectively annotated datasets (i.e., with quality scores given by human subjects). With such an approach we expect to be able to identify meaningful media quality features useful to develop new reliable and explainable quality prediction models.
However, to identify and improve such features by using machine learning models, it is also important to include large-scale datasets that are not subjectively annotated. To deal with this large amount of data efficiently, it is necessary to develop a framework comprising a set of tools that makes it easier to process both the subjective scores (given by human subjects) and the objective scores in an efficient and integrated manner, since currently every dataset has its own characteristics, quality scale, way of describing distortions, etc., which make integration difficult. Such a framework, which we will make publicly available for research purposes, will constitute the basis for reproducible research, which is increasingly important for ML techniques.
The framework will make it possible to systematically investigate existing quality prediction algorithms, finding their strengths and weaknesses, as well as to identify the most challenging content on which newer developments can be based.
Such objectives will be achieved by using both theoretical and practical approaches. The resulting insight will then be validated in practical cases by analyzing the performance of the system with simulations and experiments with industry-grade signals, leveraging cooperation with companies to facilitate the migration of the developed algorithms and technologies into prototypes that can then be effectively tested in real industrial media processing pipelines.
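One basic integration step such a framework must handle is mapping subjective scores collected on different rating scales (e.g., a 5-point ACR scale vs. a continuous 0-100 scale) onto a common range so that datasets become comparable. The sketch below is a deliberately simple linear rescaling; the scale bounds are illustrative assumptions, and the actual framework may use more sophisticated alignment.

```python
# Hedged sketch: normalize mean opinion scores (MOS) from datasets that
# use different quality scales onto a common [0, 1] range.

def normalize_mos(score: float, scale_min: float, scale_max: float) -> float:
    """Linearly map a mean opinion score onto [0, 1]."""
    return (score - scale_min) / (scale_max - scale_min)

# A score of 4 on a 1-5 ACR scale and 75 on a 0-100 continuous scale
# map to the same normalized value, making the two directly comparable.
print(normalize_mos(4.0, 1.0, 5.0))     # 0.75
print(normalize_mos(75.0, 0.0, 100.0))  # 0.75
```

Beyond the score scale, per-dataset differences in distortion descriptions and metadata would need analogous harmonization layers in the toolset.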

The workplan of the activities is detailed in the following. In the first year, the PhD candidate will first become familiar with the recently proposed ML- and AI-based techniques for media quality optimization, as well as with the characteristics of the datasets publicly available for research purposes.
In parallel, a framework will be created to efficiently process the large sets of data (especially for the video case) with potentially complex measures that might need retraining, fine-tuning or other computationally complex optimizations. It is expected to make this framework publicly available also to address the research reproducibility issues that are of growing interest in the ML community. This initial investigation and activities are expected to lead to conference publications.
In the second year, building on the framework and the theoretical knowledge already present in the research group, new media quality indicators for specific quality features will be developed, simulated, and tested to demonstrate their performance and in particular their ability to identify the root causes of the quality scores for several existing quality prediction algorithms, thus partly explaining their inner working methods in a more understandable form. In this context, potential shortcomings of such algorithms will be systematically identified. These results are expected to yield one or more journal publications.
In the third year, the activity will be expanded to propose improvements that can mitigate the identified shortcomings, as well as to create proposals for quality prediction algorithms based on the previously identified robust features. Such proposals will target journal publications.

Possible targets for research publications, well known to the proposer, include IEEE Transactions on Multimedia, Elsevier Signal Processing: Image Communication, ACM Transactions on Multimedia Computing Communications and Applications, Elsevier Multimedia Tools and Applications, various IEEE/ACM international conferences (IEEE ICME, IEEE MMSP, QoMEX, ACM MM, ACM MMSys).

The proposer is actively collaborating with the Video Quality Experts Group (VQEG), an international group of experts from academia and industry that aims to develop new standards in the context of video quality. In particular the tutor is co-chair of the JEG-Hybrid project which is very interested in the activity previously described.

Required skills

The PhD candidate is expected to have: strong analytical skills; some background on ML systems; good English writing and communication skills; reasonably good ability/willingness to learn how to work with large quantities of data on remote server systems, in particular by automating the procedures with scripts, pipelines, etc.

04

Security Analysis and Automation in Smart Systems

Proposer

Fulvio Valenza

Topics

Cybersecurity

Group website

http://netgroup.polito.it

Summary of the proposal

Cyber-physical systems and their smart components are pervasive in our daily activities. Unfortunately, identifying the potential threats and issues in these systems and selecting and configuring adequate protection is challenging, given that such environments combine human, physical, and cyber aspects in the system design and implementation. This research aims to fill this gap by defining a novel, highly automated system to analyze and enforce security formally and optimally.

Research objectives and methods

The main objective of the proposed research is to improve the state of the art of security analysis and automation in novel Cyber-Physical Systems (i.e., Smart Systems), mainly focusing on the automated implementation of threat analysis and access control and defense policies. 
 
Although some methodologies and tools are available today, they support these activities only partially and with severe limitations. In particular, they leave most of the work and responsibility to the human user, who must both identify potential threats and configure adequate protection mechanisms.
 
The candidate will pursue highly automated approaches that limit human intervention as much as possible, thus reducing the risk of introducing human errors and speeding up security analysis and reconfigurations. This last aspect is essential because smart systems are highly dynamic. Moreover, if security attacks or policy violations are detected at runtime, the system should recover rapidly by reconfiguring its security promptly. Another feature that the candidate will pursue in the proposed solution is a formal approach, capable of providing formal correctness by construction. This way, high correctness confidence is achieved without needing a posteriori formal verification of the solution. Finally, the proposed approach will pursue optimization by selecting the best solution among the many possible ones.
 
In this work, the candidate will exploit the results and the expertise recently achieved by the proposer's research group in the related field of network security automation. Although there are significant differences between the two application fields, there are also some similarities, and the underlying expertise on formal methods held by the group will be fundamental in the candidate's research work. If successful, this research work can have a high impact because improving automation can simplify and improve the quality of the verification and reconfigurations in cyber-physical systems, which are crucial for our society nowadays.
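To make the automation goal above concrete, one of the checks such a system would perform can be sketched as a conflict detector over access-control rules: contradictory rules are flagged before deployment instead of being left to a human administrator. The rule model (subject, object, action, decision) and the wildcard semantics are deliberately minimal illustrative assumptions, not the formal model the research will develop.

```python
from itertools import combinations

def find_conflicts(rules):
    """Return pairs of rules that can match the same request but disagree.
    Each rule is (subject, object, action, decision); '*' is a wildcard."""
    def overlaps(a, b):
        # Two rules overlap if every field matches literally or via wildcard.
        return all(x == y or "*" in (x, y) for x, y in zip(a[:3], b[:3]))
    return [(r1, r2) for r1, r2 in combinations(rules, 2)
            if overlaps(r1, r2) and r1[3] != r2[3]]

# Illustrative smart-system policy: rule 2 contradicts rule 1.
rules = [
    ("operator", "plc-1", "write", "allow"),
    ("*",        "plc-1", "write", "deny"),
    ("operator", "hmi-2", "read",  "allow"),
]
print(len(find_conflicts(rules)))  # 1
```

A correctness-by-construction approach would go further, generating only rule sets for which such conflicts are impossible by design.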
 
 The research activity will be organized in three phases:
Phase 1 (1st year): The candidate will analyze and identify the main issues and limitations of the recent methodologies for threat modeling and analysis in cyber-physical systems.  Also, the candidate will study the state-of-the-art literature on security automation and optimization of cyber-physical systems environment, with particular attention to formal approaches for modeling and configuring security properties and devices. 
Subsequently, with the tutor's guidance, the candidate will start identifying and defining new approaches for defining novel threat models and analysis processes and automating and enforcing access control and isolation mechanisms in smart systems. Some preliminary results are expected to be published at this phase's end (e.g., a conference paper). During the first year, the candidate will also acquire the background necessary for the research. This will be done by attending courses and by personal study.
 
Phase 2 (2nd year): The candidate will consolidate the proposed approaches, fully implement them, and conduct experiments with them, e.g., to study their correctness, generality, and performance. In this year, particular emphasis will be given to the identified use cases, properly tuning the developed solutions to real scenarios. The results of this consolidated work will also be submitted for publication, aiming at least at a journal publication.
 
Phase 3 (3rd year): based on the results achieved in the previous phase, the proposed approach will be further refined to improve its scalability, performance, and applicability (e.g., different security properties and strategies will be considered), and the related dissemination activity will be completed.
 
 The contributions produced by the proposed research can be published in conferences and journals belonging to the areas of cybersecurity (e.g. IEEE S&P, ACM CCS, NDSS, ESORICS, IFIP SEC, DSN, ACM Transactions on Information and System Security, or IEEE Transactions on Secure and Dependable Computing), and applications (e.g. IEEE Transactions on Industrial Informatics or IEEE Transactions on Vehicular Technology).

Required skills

In order to successfully develop the proposed activity, the candidate should have a good background in cybersecurity (especially in network security), and good programming skills. Some knowledge of formal methods can be useful, but it is not required: the candidate can acquire this knowledge and related skills as part of the PhD Program, by exploiting specialized courses.

05

Local energy markets in citizen-centered energy communities

Proposer

Edoardo Patti

Topics

Software engineering and Mobile computing, Parallel and distributed systems, Quantum computing, Computer architectures and Computer aided design

Group website

www.eda.polito.it

Summary of the proposal

A smart citizen-centric energy system is at the centre of the energy transition. Energy communities will enable citizens to participate actively in local energy markets by exploiting new digital tools. Citizens will need to understand how to interact with smart energy systems, novel digital tools and local energy markets. Thus, new complex socio-techno-economic interactions will take place in such smart systems which need to be analyzed and simulated to evaluate possible future impacts. For this purpose, a novel co-simulation framework is needed, which combines agent-based modelling techniques with external simulators of the grid and energy sources.

Research objectives and methods

The diffusion of distributed (renewable) energy sources poses new challenges in the underlying energy infrastructure, e.g., distribution and transmission networks and/or micro (private) electric grids. The optimal, efficient and safe management and dispatch of electricity flows among different actors (i.e., prosumers) is key to supporting the diffusion of the distributed energy sources paradigm. The goal of the project is to explore different corporate structures and billing and sharing mechanisms inside energy communities. For instance, the use of smart energy contracts based on Distributed Ledger Technology (blockchain) for energy management in local energy communities will be studied. A testbed comprising physical hardware (e.g., smart meters) connected in the loop with a simulated energy community environment (e.g., a building or a cluster of buildings) exploiting different Renewable Energy Sources (RES) and energy storage technologies will be developed and tested during the three-year program. Hence, the research will focus on the development of agents capable of describing:
- the final customer/prosumer beliefs, desires, intentions and opinions;
- the local energy market, where prosumers can trade their energy and/or flexibility;
- the local system operator, which has to guarantee grid reliability.
All the software entities will be coupled with external simulators of the grid and energy sources in a plug-and-play fashion. Hence, the overall framework has to be able to work in a co-simulation environment, with the possibility of performing hardware in the loop. The final outcome of this research will be an agent-based modelling tool that can be exploited for:
- planning the evolution of future smart multi-energy systems by taking into account the operational phase;
- evaluating the effect of different policies and related customer satisfaction;
- evaluating the diffusion of technologies and/or energy policies under different regulatory scenarios;
- evaluating new business models for energy communities and aggregators.
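A core building block of such a local energy market is the clearing mechanism that matches prosumer bids and offers. The sketch below implements a simple uniform-price double auction; all prices, quantities, and the midpoint pricing rule are illustrative assumptions, since the actual market designs to be studied are an open research question of this proposal.

```python
# Hedged sketch: uniform-price double auction clearing prosumer bids/offers.

def clear_market(bids, offers):
    """bids/offers: lists of (price_eur_per_kwh, quantity_kwh) tuples.
    Returns (clearing_price, traded_kwh); the price is the midpoint of the
    marginal matched bid/offer pair, or None if nothing trades."""
    bids = sorted(bids, key=lambda b: -b[0])     # buyers, highest price first
    offers = sorted(offers, key=lambda o: o[0])  # sellers, lowest price first
    traded, price = 0.0, None
    i = j = 0
    b_left = bids[0][1] if bids else 0.0
    o_left = offers[0][1] if offers else 0.0
    # Match while the best remaining bid still meets the best remaining offer.
    while i < len(bids) and j < len(offers) and bids[i][0] >= offers[j][0]:
        q = min(b_left, o_left)
        traded += q
        price = (bids[i][0] + offers[j][0]) / 2
        b_left -= q
        o_left -= q
        if b_left == 0:
            i += 1
            b_left = bids[i][1] if i < len(bids) else 0.0
        if o_left == 0:
            j += 1
            o_left = offers[j][1] if j < len(offers) else 0.0
    return price, traded

# Two buyers and two sellers: 2.0 kWh trade; the marginal pair sets the price.
price, qty = clear_market(bids=[(0.30, 2.0), (0.20, 1.0)],
                          offers=[(0.10, 1.5), (0.25, 2.0)])
print(round(price, 3), qty)  # 0.275 2.0
```

In the envisioned framework, each prosumer agent would generate its bids and offers from its beliefs, desires and intentions, and the market agent would run a clearing step like this one at every simulated interval.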
 
During the 1st year, the candidate will study state-of-the-art agent-based modelling tools in order to identify the best available solution for large-scale smart energy system simulation in distributed environments. Furthermore, the candidate will review the state of the art in prosumer/aggregator/market modelling in order to identify the challenges and possible innovations. Moreover, the candidate will review possible corporate structures and billing and sharing mechanisms of energy communities. Finally, he/she will start the design of the overall platform, beginning with the identification and definition of requirements.
During the 2nd year, the candidate will complete the design phase and start the implementation of the agent intelligence. Furthermore, he/she will start to integrate agents and simulators in order to create the first beta version of the tool.
During the 3rd year, the candidate will finalize the overall platform and test it in different case studies and scenarios in order to show the effects of the different corporate structures and billing and sharing mechanisms in energy communities.
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Evolutionary Computation
- IEEE Transactions on Control of Network Systems
- Environmental Modelling & Software
- JASSS
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferable in Python). Frameworks for Multi Agent Systems Development (preferable). Development in web environment (e.g. REST web services). Computer Networks

06

Simulation and Modelling of V2X connectivity with traffic simulation

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

www.eda.polito.it

Summary of the proposal

The development of novel ICT solutions in smart grids has opened new opportunities to foster novel services for energy management and savings in all end-use sectors, with particular emphasis on Electric Vehicle connectivity services such as demand flexibility. Thus, there will be a strong interaction among transportation, traffic trends and energy distribution systems. New simulation tools are needed to evaluate the impact of Electric Vehicles on the grid by considering citizens' behaviours.

Research objectives and methods

This research aims at developing novel simulation tools for smart city/smart grid scenarios that exploit the Agent-Based Modelling (ABM) approach to evaluate novel strategies to manage V2X connectivity with traffic simulation. The candidate will develop an ABM simulator that will provide a realistic virtual city where different scenarios will be executed. The ABM should be based on real data, demand profiles and traffic patterns. Furthermore, the simulation framework should be flexible and extendable so that i) it can be improved with new data from the field; ii) it can be interfaced with other simulation layers (i.e. physical grid simulators, communication simulators); iii) it can interact with external tools executing real policies (such as energy aggregation). This simulator will be a useful tool to analyse how V2X connectivity and the associated services impact both social behaviours and traffic. It will also help in understanding the impact of new actors and companies (e.g., sharing companies) on both the marketplace and society, again by analysing social behaviours and traffic conditions. In a nutshell, the ABM simulator will simulate both traffic variation and the possible advantages of V2X connectivity strategies in a smart grid context. It will be designed and developed to span different spatial-temporal resolutions. All the software entities will be coupled with external simulators of the grid and energy sources in a plug-and-play fashion, ready to be integrated with external simulators and platforms. This will enhance the resulting ABM framework, also unlocking hardware-in-the-loop features.
The outcomes of this research will be an agent-based modelling tool that can be exploited for:
- Simulating V2X connectivity considering traffic conditions
- Evaluating the effect of different policies and related customer satisfaction
- Evaluating the diffusion and acceptance of demand flexibility strategies
- Evaluating new business models for future companies and services
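The plug-and-play ABM idea described above can be sketched in a few lines: each agent applies a simple charging rule to a price signal, and the simulator aggregates the resulting grid load per hour. This is only an illustrative sketch; the behaviour rule, tariff, and all numbers below are invented, and a real study would use a dedicated multi-agent framework with calibrated demand profiles.

```python
import random

class EVAgent:
    """A single EV owner with a hypothetical charging-decision rule."""
    def __init__(self, rng, battery_kwh=40.0):
        self.soc = rng.uniform(0.2, 0.9)   # state of charge, 0..1
        self.battery_kwh = battery_kwh

    def step(self, price):
        """Charge when SoC is low, or when the price signal is attractive."""
        if self.soc < 0.3 or (price < 0.15 and self.soc < 0.8):
            charged = min(0.1, 1.0 - self.soc)   # at most 10% of capacity per step
            self.soc += charged
            return charged * self.battery_kwh     # kWh drawn from the grid
        return 0.0

def simulate(n_agents=100, hours=24, seed=42):
    """Return the aggregate charging load profile (kWh per hour)."""
    rng = random.Random(seed)
    agents = [EVAgent(rng) for _ in range(n_agents)]
    # invented day/night tariff acting as the external policy signal
    prices = [0.25 if 8 <= h <= 20 else 0.10 for h in range(hours)]
    return [sum(a.step(prices[h]) for a in agents) for h in range(hours)]

load = simulate()
print(load[:4])
```

In a full framework, the `step` call would instead exchange state with external grid and traffic simulators in a plug-and-play fashion.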
 
During the 1st year, the candidate will study the state of the art of existing agent-based modelling tools to identify the best available solution for large-scale traffic simulation in distributed environments. Furthermore, the candidate will review the state of the art of V2X connectivity to identify challenges and possible innovations. Moreover, the candidate will review Artificial Intelligence algorithms for simulating traffic conditions and variations, and for estimating EV flexibility and users' preferences. Finally, he/she will start the design of the overall ABM framework and algorithms, beginning with the identification and definition of requirements.
 
During the 2nd year, the candidate will complete the design phase, start the implementation of the agents' intelligence, and test the first version of the proposed solution.
 
During the 3rd year, the candidate will finalize the overall ABM framework and AI algorithms and test them in different case studies and scenarios to assess the impact of V2X connection strategies and novel business models.
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Evolutionary Computation
- IEEE Transactions on Control of Network Systems
- Environmental Modelling and Software
- JASSS
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferably in Python),
Frameworks for Multi-Agent Systems Development (preferable),
Development in web environments (e.g. REST web services),
Computer Networks

07

Machine Learning techniques for real-time State-of-Health estimation of Electric Vehicles' batteries

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing, Computer architectures and Computer aided design

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research proposal aims at studying novel software solutions based on Machine Learning (ML) techniques to estimate the State-of-Health (SoH) of batteries in Electric Vehicles (EVs) in near-real-time. This research area has gained strong interest in recent years as the number of EVs is constantly rising. Knowing the SoH can unlock different strategies i) to reuse EVs' batteries in other contexts, e.g. stationary energy storage systems in Smart Grids, or ii) to recycle them.

Research objectives and methods

In recent years, the number of Electric Vehicles (EVs) has increased significantly, and it is expected to grow further in the upcoming years. Due to the use of high-value materials, there is a strong economic, environmental and political interest in implementing solutions to recycle EV batteries, for example by reusing them in stationary applications as energy storage systems in Smart Grids. To this end, novel tools are needed to estimate the battery State-of-Health (SoH), i.e. a measure of the remaining battery capacity, in near-real-time. Currently, SoH is determined by bench discharging tests that take several hours, making the process time-consuming and expensive.
 
The objective of this Ph.D. proposal consists of the design and development of models based on Machine Learning (ML) techniques that will exploit both synthetic and real-world datasets. The synthetic dataset is needed to train and test a generic ML model suitable for any EV, independently of a specific brand and/or model. The real-world dataset, obtained by monitoring real EVs, is instead needed to fine-tune the ML models, for example by applying transfer learning techniques, customizing them progressively to the specific brand and model of the EV being monitored.
 
During the three years of the Ph.D., the research activity will be divided into four phases:
- Study and analysis of both state-of-the-art solutions and datasets of real-world EV monitoring.
- Design and development of a realistic simulator of an EV fleet to generate a synthetic yet realistic dataset. Starting from both datasheet information of different EVs (in terms of brand and model) and information provided by the Italian National Institute of Statistics (ISTAT), the simulator will reproduce routes that differ in length, altitude and travel speed, each impacting battery wear differently, thus making the resulting dataset realistic and heterogeneous.
- Design and development of ML-based models, trained and tested on the synthetic dataset, to estimate the SoH of EV batteries.
- Application of transfer learning techniques to the ML-based models (from phase 3) to fine-tune them on the datasets of real-world EV monitoring (from phase 1).
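The train-on-synthetic, fine-tune-on-real workflow above can be sketched as follows, under the strong and purely illustrative assumption of a linear SoH model over three hand-picked features; the feature set, coefficients, noise levels, and "real-world" drift below are all invented. A generic model is fitted on a large synthetic fleet, then warm-started gradient descent adapts it to a small dataset whose ageing behaviour differs.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, w_true):
    """Hypothetical features, normalised to [0,1]: cycles, depth-of-discharge, temperature."""
    X = rng.uniform(0.0, 1.0, size=(n, 3))
    y = 1.0 - X @ w_true + rng.normal(0.0, 0.005, n)   # SoH as a capacity fraction
    return X, y

def mse(X, y, w, b):
    return float(np.mean((X @ w + b - y) ** 2))

def fine_tune(X, y, w, b, lr=0.1, steps=500):
    """Plain gradient descent on MSE, warm-started from the pre-trained weights."""
    for _ in range(steps):
        err = X @ w + b - y
        w = w - lr * (X.T @ err) / len(y)
        b = b - lr * err.mean()
    return w, b

# 1) pre-train a generic model on a large synthetic fleet (closed-form least squares)
Xs, ys = make_dataset(5000, np.array([0.20, 0.05, 0.02]))
A = np.hstack([Xs, np.ones((len(ys), 1))])
coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
w0, b0 = coef[:3], coef[3]

# 2) fine-tune on a small "real-world" dataset whose ageing behaviour differs
Xr, yr = make_dataset(100, np.array([0.30, 0.08, 0.01]))
w1, b1 = fine_tune(Xr, yr, w0.copy(), b0)
print(mse(Xr, yr, w0, b0), mse(Xr, yr, w1, b1))
```

The same warm-start pattern carries over to neural SoH models, where only the last layers would typically be re-trained on the real fleet.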
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Vehicular Technology
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Industry Applications
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferable in Python). Knowledge of Machine Learning and Neural Networks. Knowledge of frameworks to develop models based on Machine Learning and Neural Networks. Knowledge of development of Internet of Things Applications

08

Multi-Device Programming for Artistic Expression

Proposer

Luigi De Russis

Topics

Computer graphics and Multimedia, Software engineering and Mobile computing

Group website

https://elite.polito.it

Summary of the proposal

Media artists use smartphones and IoT devices as material for creative exploration. However, some of them do not code and have little interest in learning to. In addition, programming artworks has characteristics that differ from traditional coding.

This Ph.D. proposal aims to extend our comprehension of artists' needs for creative coding through the design, implementation, and evaluation of toolkits that help them effectively realize code-based artworks across multiple devices and media.

Research objectives and methods

The recent availability of smartphones, AR/VR headsets, IoT-enabled devices, and microcontroller kits creates new opportunities for creative explorations for media artists and designers. The field of creative coding emphasizes the goal of expression rather than function, and creative coders combine computational skills with creative insight. In some cases, artists and designers are interested in creative coding but lack the knowledge or programming skills to benefit from the offered possibilities.

The main research objective of this Ph.D. proposal is to extend our comprehension of the needs of media artists and designers for creative coding across multiple devices and media. To reach this objective, the Ph.D. student will study, design, develop, and evaluate proper models and novel technical solutions (e.g., toolkits and tools) for supporting creative coders. The proposal envisions focusing on both creative coders and end-user programmers. The work will start from the needs of the stakeholders (i.e., artists and designers), complemented by the existing literature. Using a participatory approach, the Ph.D. student will keep the stakeholders involved in the various phases of the work.

In particular, the Ph.D. research activity will focus on the following: - Study of the creative coding field, stakeholders' needs and current tools, and HCI techniques able to support the identification of suitable requirements and the creation of technical solutions to effectively support creative exploration and coding. - Creation of a theoretical framework to satisfy the identified needs and requirements, able to adapt to different media, devices, and skills. For instance, it can include end-user personalization as a way to allow end-users to create code-based artifacts and AI techniques to support the creation of programs. - Development of a toolkit and related tools to experiment with the theoretical framework's facets. The creation and evaluation of the tools will serve as the validation for the framework.

The work plan will be organized according to the following four phases, partially overlapped:
- Phase 1 (months 0-6): literature review about creative coders and coding; focus groups and interviews with designers and media artists of various skills; definition and development of a set of use cases and promising strategies to be adopted.
- Phase 2 (months 6-18): research, definition, and experimentation of an initial version of the theoretical framework and a first toolkit for creative coders, starting from the outcome of the previous phase. In this phase, the focus will be on the most common target devices, i.e., the smartphone and the PC, with the design, implementation, and evaluation of suitable tools.
- Phase 3 (months 12-24): research, definition, and experimentation of a second toolkit (or an evolution of the previous one) for novice creative coders and end users. Such a toolkit will use artificial intelligence and machine learning to help during the coding process, with the subsequent design, implementation, and evaluation of suitable tools, and an extension of the framework.
- Phase 4 (months 24-36): extension and generalization of the previous phases to include additional target devices and consolidate the theoretical framework; evaluation of the toolkit and the tools in real settings with a large number of artists.

It is expected that the results of this research will be published in some of the top conferences in the Human-Computer Interaction field (e.g., ACM CHI, ACM CSCW, ACM C&C, and ACM IUI). One or more journal publications are expected on a subset of the following international journals: ACM Transactions on Computer-Human Interaction, ACM Transactions on Interactive Intelligent Systems, and International Journal of Human Computer Studies.

Required skills

A candidate interested in the proposal should (i) be able to critically analyze and evaluate existing research, as well as gather and interpret data from various sources; (ii) be able to communicate research findings through writing and presenting; (iii) have a solid foundation in computer science/engineering and possess the relevant technical skills; (iv) have a good understanding of HCI research methods, especially around needfinding.

09

Digital Wellbeing By Design

Proposer

Alberto Monge Roffarello

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

https://elite.polito.it/

Summary of the proposal

Tools for digital wellbeing allow users to self-control their habits with distractive apps and websites. Yet, they are ineffective in the long term, as tech companies still adopt attention-capture designs, e.g., infinite scroll, that compromise users' self-control. This PhD proposal investigates innovative strategies for designers and end users to consider digital wellbeing in user interface design, recognizing the need to foster healthy digital experiences without depending on external support.

Research objectives and methods

In today's attention economy, tech companies compete to capture users' attention, e.g., by introducing visual features and functionalities - from guilty-pleasures recommendations to content autoplay - that are purposely designed to maximize metrics such as daily visits and time spent. These Attention-Capture Damaging Patterns (ACDPs) [1] compromise users' sense of agency and self-control, ultimately undermining their digital wellbeing.
To date, the HCI research community has traditionally considered digital wellbeing an end-user responsibility, enabling users to self-monitor their usage of apps and websites through tools for digital self-control. Nevertheless, studies have shown that these external interventions - especially those that are overly dependent on users' self-monitoring capabilities - are often ineffective in the long term.
Taking a complementary perspective, the main research objective of this PhD proposal is to explore how to make digital wellbeing a top design goal in user interface design, establishing a fruitful collaboration between designers and end users and recognizing the critical necessity to foster healthy online experiences and address the potential negative impacts of ACDPs on users' mental health without depending on external support. The PhD student will study, design, develop, and evaluate proper models and novel technical solutions (e.g., tools and frameworks) to support designers and end users in creating user interfaces that preserve and respect user attention by design, starting from the relevant scientific literature and performing studies involving designers and end users. In particular, possible areas of investigation are:
- Innovating frameworks that define and educate designers on novel, theoretically grounded processes that prioritize digital wellbeing. These processes will build upon existing design guidelines and best practices, providing clear guidance on their application and giving tech companies and designers actionable insights to transition away from the contemporary attention economy.
- Creating a validated taxonomy of positive design patterns that respect and preserve the user's attention. These patterns will promote users' agency by design and support reflection by offering the same functionality as ACDPs.
- Developing design tools to support designers in prioritizing users' digital wellbeing in real time. Using artificial intelligence and machine learning models, these tools may detect when a designed interface contains ACDPs and/or fails to address digital wellbeing guidelines, suggesting positive design alternatives.
- Developing strategies that empower end users to actively participate in designing technology that prioritizes digital wellbeing. This may include the development of platforms for co-designing user interfaces, as well as mechanisms for evaluating existing user interfaces against ACDPs and giving feedback.
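As a toy illustration of the ACDP-detection idea mentioned above, the sketch below applies hand-written rules to a hypothetical declarative UI specification. Every rule, property name, and suggestion here is invented for illustration; the proposal envisions learned models rather than fixed rules.

```python
# Hypothetical rules: each maps an ACDP name to a predicate over a UI component.
ACDP_RULES = {
    "infinite_scroll": lambda c: c.get("type") == "feed" and not c.get("paginated", False),
    "content_autoplay": lambda c: c.get("autoplay", False),
    "disguised_ad": lambda c: c.get("type") == "ad" and not c.get("labelled", False),
}

# Positive design alternatives suggested for each detected pattern.
SUGGESTIONS = {
    "infinite_scroll": "replace with explicit pagination or a 'load more' button",
    "content_autoplay": "require an explicit play action from the user",
    "disguised_ad": "label sponsored content clearly",
}

def audit(ui_spec):
    """Return a list of (component_id, pattern, suggestion) findings."""
    findings = []
    for component in ui_spec:
        for pattern, rule in ACDP_RULES.items():
            if rule(component):
                findings.append((component["id"], pattern, SUGGESTIONS[pattern]))
    return findings

# Invented example interface with two problematic components.
demo_ui = [
    {"id": "home_feed", "type": "feed", "paginated": False},
    {"id": "story_video", "type": "video", "autoplay": True},
    {"id": "settings", "type": "menu"},
]
for cid, pattern, hint in audit(demo_ui):
    print(f"{cid}: {pattern} -> {hint}")
```

A learned detector would replace the predicates with a model scoring interface mockups, but the audit-and-suggest loop would look the same.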
The proposal will adopt a human-centered approach and build upon the existing scientific literature from different interdisciplinary domains, mainly Human-Computer Interaction. The work plan will be organized according to the following four phases, partially overlapped:
- Phase 1 (months 0-6): literature review at the intersection of digital wellbeing, design, and ACDPs; focus groups and interviews with designers, practitioners, and end users; definition of a set of use cases and promising strategies to be adopted.
- Phase 2 (months 3-24): research, definition, and evaluation of design frameworks and models of positive design patterns. Here, the focus will be on the design of user interfaces for the most commonly used devices, i.e., the smartphone and the PC.
- Phase 3 (months 12-36): research, definition, and experimentation of design tools to support designers in prioritizing users' digital wellbeing in real time, integrating the frameworks, design guidelines, and positive design patterns explored and defined in the previous phases.
- Phase 4 (months 24-36): extension and possible generalization of the previous phases to include additional devices; evaluation of the proposed solutions in real settings over long periods of time; development and preliminary evaluation of strategies for end-user collaboration.
It is expected that the results of this research will be published in some of the top conferences in the Human-Computer Interaction field (e.g., ACM CHI, ACM CSCW, and ACM IUI). Journal publications are expected on a subset of the following international journals: ACM Transactions on Computer-Human Interaction, ACM Transactions on the Web, ACM Transactions on Interactive Intelligent Systems, and International Journal of Human Computer Studies. 
[1] A. Monge Roffarello, K. Lukoff, L. De Russis, Defining and Identifying Attention Capture Deceptive Designs in Digital Interfaces, CHI 2023, https://dl.acm.org/doi/abs/10.1145/3544548.3580729

Required skills

A candidate interested in the proposal should ideally: be able to critically analyze and evaluate existing research, as well as gather and interpret data from various sources; be able to communicate research findings through writing and presenting; have a solid foundation in computer science/engineering and possess relevant technical skills; have a good understanding of HCI research methods, especially around needfinding.

10

Management Solutions for Autonomous Networks

Proposer

Guido Marchetto

Topics

Parallel and distributed systems, Quantum computing

Group website

http://www.netgroup.polito.it

Summary of the proposal

Next-Generation (NextG) networks are expected to support advanced and critical services, incorporating computation, communication, and intelligent decision making. 
This activity aims to design and implement novel mechanisms using supervised and unsupervised (distributed) learning, within software-defined networks to serve the needs of data-driven infrastructure management decisions. Moreover, we aim to design novel in-band network telemetry mechanisms to increase the accuracy of these decisions.

Research objectives and methods

Two research questions (RQ) guide the proposed work: 

RQ1: How can we design, and implement on local and larger-scale testbeds, effective transport and routing network protocols that integrate the network stack at different scopes using recent advances in supervised and unsupervised learning? 

RQ2: To scale the use of machine learning-based solutions in network management, what are the most efficient distributed machine learning architectures that can be implemented at the network edge layer? 

The final target of the research work is to answer these questions, also by evaluating the proposed solutions on small-scale network emulators or large-scale virtual network testbeds, using a few applications including virtual and augmented reality, precision agriculture, and haptic wearables. In essence, the main goals are to provide innovation in network monitoring, network adaptation, and network resilience, using centralized and distributed learning integrated with edge computing infrastructures.

Both vertical and horizontal integration will be considered. By vertical integration, we mean considering learning problems that integrate states across network hardware and software, as well as across the network stack at different scopes. For example, the candidate will design data-driven algorithms for congestion control problems to address the tussle between in-network and end-to-end congestion notifications. By horizontal integration, we mean using states from local (e.g., physical layer) and wide-area (e.g., transport layer) scopes as input for the learning-based algorithms. The data needed by these algorithms are carried to the learning actor by means of newly defined in-band network telemetry mechanisms. Aside from supporting resiliency through vertical integration, solutions must offer resiliency across a wide (horizontal) range of network operations: from the close edge, i.e., near the device, to the far edge, with the design of secure, data-centric (federated) resource allocation algorithms.
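The data-driven path/congestion management idea above can be sketched with a toy bandit-style learner: latency samples, standing in for in-band telemetry reports, feed a running estimate per candidate path, and an epsilon-greedy policy converges on the lowest-latency path. The path set, latency figures, and learning rates are invented; this is an illustrative sketch, not the proposal's actual design.

```python
import random

def choose_path(q, eps, rng):
    """Epsilon-greedy selection over candidate paths (lower estimate = better)."""
    if rng.random() < eps:
        return rng.randrange(len(q))
    return min(range(len(q)), key=lambda i: q[i])

def run(rounds=5000, eps=0.1, lr=0.1, seed=1):
    rng = random.Random(seed)
    true_latency = [30.0, 12.0, 20.0]   # hypothetical per-path mean latencies (ms)
    q = [0.0, 0.0, 0.0]                 # running latency estimates from telemetry
    picks = [0, 0, 0]
    for _ in range(rounds):
        i = choose_path(q, eps, rng)
        observed = true_latency[i] + rng.gauss(0.0, 2.0)  # noisy telemetry sample
        q[i] += lr * (observed - q[i])                    # exponential moving average
        picks[i] += 1
    return q, picks

q, picks = run()
print(picks, [round(x, 1) for x in q])
```

Initializing the estimates at zero (optimistically below every true latency) forces each path to be sampled before the policy settles, a common trick in bandit-style exploration.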

The research activity will be organized in three phases:

Phase 1 (1st year): the candidate will analyze state-of-the-art solutions for network management, with particular emphasis on knowledge-based network automation techniques. The candidate will then define detailed guidelines for the development of architectures and protocols suitable for the automatic operation and configuration of NextG networks, with particular reference to edge infrastructures. Specific use cases will also be defined during this phase (e.g., in virtual reality). Such use cases will help identify ad-hoc requirements and will capture the peculiarities of specific environments. With these use cases in mind, the candidate will also design and implement novel solutions to deal with the partial availability of data within distributed edge infrastructures. This work is expected to result in conference publications. 

Phase 2 (2nd year): the candidate will consolidate the approaches proposed in the previous year, focusing on the design and implementation of mechanisms for vertical and horizontal integration of supervised and unsupervised learning with network virtualization. Network, and computational resources will be considered for the definition of proper allocation algorithms. All solutions will be implemented and tested. Results will be published, targeting at least one journal publication.

Phase 3 (3rd year): the consolidation and experimentation of the proposed approach will be completed. Particular emphasis will be given to the identified use cases, properly tuning the developed solutions to real scenarios. Major importance will be given to the quality of service offered, with specific emphasis on minimizing latencies to enable real-time network automation for critical environments (e.g., telehealth systems, precision agriculture, or haptic wearables). Further conference and journal publications are expected.

The research activity is in collaboration with Saint Louis University, MO, USA, also in the context of the NSF grant #2201536 "Integration-Small: A Software-Defined Edge Infrastructure Testbed for Full-stack Data-Driven Wireless Network Applications". Furthermore, it is related to active collaborations with Futurewei Inc. and Tiesse SpA, both interested in the covered topics.

The contributions produced by the proposed research can be published in conferences and journals belonging to the areas of networking and machine learning (e.g. IEEE INFOCOM, ICML, ACM/IEEE Transactions on Networking, or IEEE Transactions on Network and Service Management) and cloud/fog computing (e.g. IEEE/ACM SEC, IEEE ICFEC, IEEE Transactions on Cloud Computing), as well as in publications related to the specific areas that could benefit from the proposed solutions (e.g., IEEE Transactions on Industrial Informatics, IEEE Transactions on Vehicular Technology).

Required skills

The ideal candidate has good knowledge and experience in networking and machine learning, or at least in one of the two topics. Availability for spending periods abroad (mainly but not only at Saint Louis University) is also important for a profitable development of the research topic.

11

Preserving privacy and fairness with generative AI-based synthetic data production

Proposer

Antonio Vetro'

Topics

Software engineering and Mobile computing, Data science, Computer vision and AI

Group website

https://nexa.polito.it

Summary of the proposal

Synthetic data generation is fundamental in contexts of data scarcity or limited economic resources for data collection. However, several challenges are still open in this research field, the most important being the trade-off between privacy, fairness and accuracy. The goal of this PhD proposal is to design, develop and test new generative models for synthetic data production able to preserve privacy and guarantee fairness while maintaining good levels of accuracy.

Research objectives and methods

Synthetic data generation enables the reproduction, diversification, and augmentation of real data in contexts where data is scarce and where preserving privacy is paramount. However, synthetic data may come at the cost of unrealistic synthetic populations or limited accuracy in downstream predictions and classifications. In addition, reliable techniques for reaching satisfactory trade-offs between contrasting requirements (e.g., privacy, fairness, and accuracy) are still an object of research and experimentation, as is the production of suitable dataset and model documentation.
 
The goal of this PhD proposal is to design, develop and test new generative models for synthetic data production able to preserve privacy and guarantee fairness, while maintaining acceptable levels of accuracy. The focus will be mostly on tabular data because this type of data is used in most of the applications where fairness and privacy are paramount (for example, allocating social benefits or economic resources).
 
The newly developed techniques will be compared with state-of-the-art generative models (e.g., language models, variational autoencoders, generative adversarial networks, diffusion models, self-supervised learning, etc.) and with traditional probabilistic methods for dataset generation (e.g., Bayesian networks, univariate kernel density estimation, etc.) on a variety of evaluation measures, such as: distance from the original population; differential privacy; imbalance; fairness and accuracy of models trained and tested on the generated data.
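Two of the evaluation measures mentioned above can be illustrated with minimal reference implementations: the total variation distance between a real and a synthetic marginal, and the demographic parity gap of a binary outcome across two groups. The tiny tabular datasets below are invented purely for illustration.

```python
from collections import Counter

def tv_distance(real, synth):
    """Total variation distance between two discrete (categorical) marginals."""
    p, q = Counter(real), Counter(synth)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / len(real) - q[k] / len(synth)) for k in keys)

def demographic_parity_gap(rows, group_key, outcome_key):
    """|P(outcome=1 | group=a) - P(outcome=1 | group=b)| for two groups."""
    rates = {}
    for g in set(r[group_key] for r in rows):
        members = [r for r in rows if r[group_key] == g]
        rates[g] = sum(r[outcome_key] for r in members) / len(members)
    a, b = rates.values()
    return abs(a - b)

# Invented toy data: the synthetic marginal of "sex" matches the real one
# exactly, but the fairness profile of the outcome must be checked separately.
real = [{"sex": "F", "approved": 1}, {"sex": "F", "approved": 0},
        {"sex": "M", "approved": 1}, {"sex": "M", "approved": 1}]
synth = [{"sex": "F", "approved": 1}, {"sex": "M", "approved": 1},
         {"sex": "F", "approved": 1}, {"sex": "M", "approved": 0}]

print(tv_distance([r["sex"] for r in real], [r["sex"] for r in synth]))
print(demographic_parity_gap(real, "sex", "approved"))
print(demographic_parity_gap(synth, "sex", "approved"))
```

Real evaluations would extend these to joint distributions, differential-privacy accounting, and accuracy of downstream models, but the interface stays the same: a scalar measure per (real, synthetic) pair.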
 
In addition, given the rising importance of auditable algorithms in the European legislative context, two further aspects will be investigated: i) how to properly document synthetic datasets and the models that generated them; ii) how to generate suitable synthetic data for auditing black box systems against discrimination.
 
Considering the different (and mostly contrasting) evaluation dimensions mentioned above, the high-level research questions are:
 
RQ 1. How can state-of-the-art generative models for synthetic data generation be improved so that they preserve privacy and fairness while maintaining acceptable levels of accuracy?
 
RQ 2. What is the trade-off (between privacy, fairness, and accuracy) reached by the newly developed techniques in comparison with:
-      2.1 state-of-the-art generative AI models?
-      2.2 traditional probabilistic methods?
 
RQ 3. Is it possible to facilitate the auditing of black box systems
-      3.1 by documenting synthetic datasets and their generative models?
-      3.2 by producing exhaustive and realistic synthetic data for testing black box systems against discrimination?
 
Datasets used in the fair machine learning community will be firstly used as starting point, with possible integrations from incoming projects and/or industrial collaborations.
 
A possible workplan is the following.
 
Year 1:
RQ 1. - Task 1.1) Analysis of state of the art about: fairness datasets, measures for evaluation (privacy, balance, fairness, distance from real population, etc), synthetic data generation (with traditional probabilistic methods and generative AI).
 
RQ 1. - Task 1.2) Design and develop new generative models, based on existing ones.
 
RQ 2. - Task 2.1) Design and prepare experimentation according to the measures and datasets selected in task 1.1.
 
RQ 2. - Task 2.2) Run experiments, collect data and analyze results.
 
Year 2:
RQ 2. - Task 2.3) New experiments based on results of Task 2.2.

RQ 3. - Task 3.1) State of the art of existing documentation suites for datasets (e.g., datasheets) and models (e.g., model cards); collect and organize online software systems that are suitable candidates for black box testing against discrimination (e.g., insurance, online advertising, etc.).
 
RQ 3. - Task 3.2) Case study identification: select system for black box auditing, and produce suitable synthetic data sets with the model(s) developed in previous RQ.
 
Year 3:
RQ 3. - Task 3.3) Development of a new documentation suite for documenting synthetic datasets and their generative models, and application to the case study, including evaluation with users (to be identified).

RQ 3. - Task 3.4) Quantitative analysis of discrimination in selected existing software systems using the synthetic data generated.
 
Dissemination of results is a cross-cutting activity. Possible venues for publication are:
 
Journals:
-      IEEE Transactions on Software Engineering
-      ACM Transactions on Software Engineering and Methodology
-      Journal of Machine Learning Research
-      Empirical Software Engineering
-      ACM Transactions on Information Systems
-      European Journal of Information Systems
-      Journal of Systems and Software
-      Software X
-      ACM Journal of Responsible Computing
 
Conferences:
-      ACM Conference on Fairness, Accountability, and Transparency
-      AAAI/ACM Conference on AI, Ethics, and Society
-      International Conference on Internet Technologies & Society
-      EPIA Conference on Artificial Intelligence
-      ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
-      International Conference on Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Application
-      International Conference on Software Engineering
-      Conference on Neural Information Processing Systems (NeurIPS)
 
Workshops:
-      International Workshop on Data science for equality, inclusion and well-being challenges
-      International Workshop on Equitable Data & Technology
-      Workshop on Bias and Fairness in AI

Required skills

The candidate should have:
- Basic knowledge of software testing concepts, techniques, and methodologies.
- Basic knowledge of AI techniques.
- Good knowledge of statistical methods for analyzing experimental data.
- Proficiency in data analysis techniques and tools.
- Strong programming skills.
- Basic knowledge of the problem of algorithmic bias.
- Research aptitude and curiosity to cross disciplinary boundaries.
The candidate should also possess good communication and presentation skills.

12

Digital Twin development for the enhancement of manufacturing systems

Proposer

Sara Vinco

Topics

Data science, Computer vision and AI, Controls and system engineering

Group website

https://eda.polito.it/

Summary of the proposal

Industry 4.0 has deeply changed manufacturing: enormous quantities of data allow building data-based decision-support strategies and reducing downtime and defects. Many challenges are posed by the heterogeneity and variety of data and by the construction of effective data-based analytics. This program tackles such challenges to build a virtual replica of a manufacturing system (a digital twin), e.g., targeting production lines, tire production, semiconductor manufacturing, and battery management.

Research objectives and methods

The main goal of this PhD program is the construction of a digital twin of a manufacturing system, to improve production effectiveness. A digital twin is a virtual replica of the system that exploits available technologies (Artificial Intelligence, data management and mining, Internet of Things, etc.) to enhance production automatically or through decision-support systems. While these technologies per se are well established, their application in real-life scenarios is still preliminary. Manufacturing systems indeed entail challenges such as extreme data variety and variability, protocol heterogeneity, lack of data collection infrastructures, and reduced data availability for the training of algorithms. This PhD program seeks solutions to these challenges, to enable, e.g., anomaly detection, maintenance support, and automatic optimization of the production flow. Examples of application scenarios are new-generation manufacturing systems, such as tire production, production lines, and semiconductor manufacturing. All scenarios will be investigated with the support of, and with case studies provided by, industrial and research partners such as Michelin, STMicroelectronics, and Technoprobe. 
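As a minimal illustration of one analytics task mentioned above (anomaly detection on a sensor stream), the sketch below flags readings that deviate strongly from a trailing window. The window size, threshold, and temperature data are invented for illustration; production twins would combine many such detectors with learned models.

```python
import math

def zscore_anomalies(stream, window=20, threshold=3.0):
    """Flag indices deviating more than `threshold` sigmas from the trailing window."""
    anomalies = []
    for i in range(window, len(stream)):
        hist = stream[i - window:i]
        mean = sum(hist) / window
        var = sum((x - mean) ** 2 for x in hist) / window
        std = math.sqrt(var) or 1e-9   # guard against a perfectly flat window
        if abs(stream[i] - mean) / std > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical machine-temperature readings with a small periodic ripple
# and one injected fault at index 50.
readings = [20.0 + 0.01 * (i % 5) for i in range(100)]
readings[50] = 35.0
print(zscore_anomalies(readings))
```

The same sliding-window structure generalizes to multivariate sensors by replacing the z-score with, e.g., a Mahalanobis distance or a learned reconstruction error.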
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets and case studies, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use cases, with real data from the industrial partners, and will seek solutions to the technological challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed industry-oriented conference (e.g., ETFA), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Industrial Informatics, IEEE Systems Journal, etc.).
- In the third year, the candidate will consolidate the models and approaches investigated in the second year, and possibly integrate them into a standalone digital twin framework. The candidate will also finalize this work into at least one more major journal publication, as well as into a PhD thesis to be defended at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- solid basics of linear algebra, probability, and statistics
- good communication and problem-solving skills
- some prior experience in the design and development of machine learning and deep learning architectures
- some prior knowledge/experience of manufacturing processes is a plus, but not a requirement.

13

State-of-Health diagnostic framework towards battery digital twins

Proposer

Sara Vinco

Topics

Controls and system engineering, Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

The adoption of EVs is limited by their reliance on batteries that have low energy and power densities compared to liquid fuels and are subject to aging and performance deterioration. For this reason, monitoring the battery state of charge (SoC) and state of health (SoH) is a very relevant problem. This PhD program focuses on the development of models for battery SoC and SoH, with the goal of enabling continuous monitoring of batteries and of improving their design and management throughout their lifetime.

Research objectives and methods

The main goal of this PhD program is the construction of a framework to simulate battery behavior over time, to create a virtual replica and allow the analysis of different management strategies and configurations. This will require:
- the identification, analysis and adoption of datasets (both public and private) of batteries;
- the construction of models with different levels of accuracy and different flows, e.g., based on Artificial Intelligence techniques (e.g., Physics-Informed Neural Networks, Machine Learning) and on top-down modeling techniques (e.g., circuit models);
- the definition of the monitoring architecture to be installed at the level of the Battery Management System (BMS) or in an IT infrastructure, to enable decision-support solutions, customer-facing digital twins, or other services.
All scenarios will be investigated with the support of case studies either considered as reference for the state of the art (e.g., NASA datasets) or provided by industrial partners (e.g., automotive companies).
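To illustrate the top-down, circuit-model side of such flows, the sketch below simulates a first-order Thevenin equivalent circuit (R0 plus an R1||C1 branch) with coulomb-counting SoC estimation. All parameter values and the linear open-circuit-voltage curve are illustrative assumptions, not values from any specific dataset:

```python
def simulate_battery(current_profile, dt=1.0, capacity_ah=2.0,
                     r0=0.05, r1=0.02, c1=2000.0, soc0=1.0):
    """First-order Thevenin (R0 + R1||C1) battery model with
    coulomb-counting SoC. current_profile: discharge currents [A]."""
    soc, v_rc = soc0, 0.0          # state of charge, RC-branch voltage
    socs, volts = [], []
    for i in current_profile:
        # Coulomb counting: SoC decreases with the discharged charge
        soc -= i * dt / (capacity_ah * 3600.0)
        # RC-branch dynamics, explicit Euler integration
        v_rc += dt * (i / c1 - v_rc / (r1 * c1))
        # Illustrative linear open-circuit-voltage curve
        ocv = 3.0 + 1.2 * soc
        volts.append(ocv - i * r0 - v_rc)
        socs.append(soc)
    return socs, volts

socs, volts = simulate_battery([1.0] * 600)  # 10 minutes at 1 A
```

A real framework would replace the linear OCV curve with a fitted lookup table and calibrate R0, R1, C1 against measured data; the same state-update loop is also the natural place to couple an SoH degradation model.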
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets and case studies, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with real data from the industrial partners, and will seek solutions to the technological challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed power-oriented conference (e.g., ISLPED, PATMOS), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Sustainable Computing, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone digital twin framework. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- solid basics of linear algebra, probability, and statistics
- good communication and problem-solving skills
- some prior experience in the design and development of machine learning and deep learning architectures
- some prior knowledge of energy systems/batteries is a plus, but not a requirement.

14

Modeling, simulation and validation of modern electronic systems

Proposer

Sara Vinco

Topics

Computer architectures and Computer aided design, Controls and system engineering

Group website

https://eda.polito.it/

Summary of the proposal

The current international semiconductor scenario is extremely competitive and is pushing for strong innovation advancement. This PhD program focuses on the development of modeling, simulation and validation flows of innovative systems, including not only digital functionality but also thermal and power flows, mechanical components (e.g., accelerometers) and analog subsystems (e.g., gate drivers). Research is supported by international projects and partners.

Research objectives and methods

Modern electronic systems are tightly coupled to mechanical, thermal, and power aspects that must be taken into account at design time to ensure the correct operation of the final system. Ignoring behaviors or potential faults of connected analog subsystems or mechanical actuators may indeed lead to unsafe or incorrect behaviors that prevent the operation of the design after deployment. This requires extending the traditional design, simulation and validation flows with a sensitivity to extra-functional and non-digital aspects. The main goal of this PhD program is the definition of such flows, through the adoption of open, standard technologies such as SystemC(-AMS), RISC-V, and IP-XACT, and of other technologies that fall under the Chips Act umbrella of EU research. Examples of application scenarios are smart systems such as drones, and automotive and robotics applications. All scenarios will be investigated with the support of, and with case studies provided by, industrial and research partners, such as Infineon and STMicroelectronics.
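To illustrate the kind of extra-functional coupling involved, the following is a minimal Python sketch (a stand-in for what a SystemC(-AMS) flow would express) of a lockstep co-simulation: per-step digital activity drives a power model, which feeds a first-order thermal RC network. All parameter values are illustrative assumptions:

```python
def cosimulate(activity_trace, dt=0.01, p_idle=0.1, p_act=1.5,
               r_th=20.0, c_th=0.5, t_amb=25.0):
    """Lockstep functional/extra-functional co-simulation sketch:
    activity (0/1 per step) -> power model -> thermal RC network."""
    temp = t_amb
    temps = []
    for active in activity_trace:
        power = p_act if active else p_idle          # power model [W]
        # thermal RC: C * dT/dt = P - (T - T_amb) / R
        temp += dt * (power - (temp - t_amb) / r_th) / c_th
        temps.append(temp)
    return temps

# 5 s of full activity followed by 5 s of idle: heat-up, then cool-down
temps = cosimulate([1] * 500 + [0] * 500)
```

In an actual flow the power and thermal models would be separate SystemC-AMS modules synchronized with the digital simulator, but the lockstep exchange of activity, power, and temperature per time step is the same.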
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start studying the extension of simulation flows to extra-functional aspects such as power or mechanics. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with support from the industrial partners, and will seek solutions to the challenge of validating systems that include heterogeneous aspects. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed EDA-oriented conference (e.g., DATE, DAC), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Computers, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly apply them to an industrial case study. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- good communication and problem-solving skills
- some prior experience in digital design flows
- some prior knowledge/experience of analog and extra-functional domains is a plus, but not a requirement.

15

Robust AI systems for data-limited applications

Proposer

Santa Di Cataldo

Topics

Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

Artificial Intelligence is driving a revolution in many important sectors of society. Deep learning networks, and especially supervised ones such as Convolutional Neural Networks, remain the go-to approach for many important tasks. Nonetheless, training these models typically requires massive amounts of good-quality annotated data, which makes them impractical in many real-world applications. This PhD program seeks answers to such problems, targeting important use-cases in today's society (among others: Industry 4.0 and biomedical applications).

Research objectives and methods

The main goal of this PhD program is the investigation of robust AI-based decision making in data-limited situations. This includes three possible scenarios, which are typical of many important real-world applications:
- the training data is difficult to obtain, or it is available in limited quantity;
- obtaining the training data is not difficult, but it is either difficult or economically impractical to have human experts labelling the data;
- the training data/annotations are available, but the quality of such data is very poor.
Possible solutions involve different approaches, ranging from classic transfer learning and domain adaptation techniques to data augmentation with generative modelling and semi- or self-supervised learning, where the access to real data of the target application is either minimized or avoided altogether. In addition, probabilistic approaches (e.g., Bayesian inference) can help to properly quantify the uncertainty level both at training and inference time, making the decision process more robust to noisy data and/or inconsistent annotations. This research proposal aims to investigate and advance the state of the art in such areas.
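As a toy illustration of uncertainty-aware decision making, the sketch below trains a bootstrap ensemble of one-dimensional threshold classifiers on noisy synthetic data and reads ensemble disagreement as a proxy for predictive uncertainty (a simple frequentist stand-in for full Bayesian inference; the data, classifier, and thresholds are all invented for the example):

```python
import random
random.seed(0)

# Toy 1-D dataset: true label is (x > 0.5), with 15% label noise
data = []
for _ in range(200):
    x = random.random()
    noisy = random.random() < 0.15
    data.append((x, (x > 0.5) != noisy))

def fit_threshold(sample):
    """Pick the decision threshold minimising training error."""
    best_t, best_err = 0.5, len(sample)
    for t in [i / 50 for i in range(51)]:
        err = sum((x > t) != y for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Bootstrap ensemble: member disagreement ~ predictive uncertainty
ensemble = [fit_threshold(random.choices(data, k=len(data)))
            for _ in range(25)]

def predict_with_uncertainty(x):
    votes = [x > t for t in ensemble]
    p = sum(votes) / len(votes)
    return p > 0.5, min(p, 1 - p) * 2   # label, disagreement in [0, 1]
```

Points far from the noisy decision boundary get unanimous votes (low uncertainty), while points near it split the ensemble; a downstream system can then defer the ambiguous cases to a human expert.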
The outline can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with the available state-of-the-art techniques. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, representative of the three data-limited scenarios mentioned above. Stemming from the supervisors' collaborations and current research activity, these use-cases may involve Industry 4.0 applications (for example, smart manufacturing and industrial 3D printing) as well as biomedicine and digital pathology. There is some scope to shape the specific focus of such use-cases around the interests and background of the prospective student, as well as those of the various collaborators that could be involved in the project activity: research centers such as the Inter-departmental Center for Additive Manufacturing in PoliTO and the National Institute for Research in Digital Science and Technology (INRIA, France), as well as industries such as Prima Industrie, Stellantis, Avio Aero, etc. At the end of the second year, the candidate is expected to target at least a paper in a well-reputed conference in the field of applied AI, and possibly another publication in a Q1 journal of the Computer Science sector (e.g., Pattern Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Expert Systems with Applications, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone architecture. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
-    positive attitude to research activity and working in team
-    solid programming skills 
-    solid basics of linear algebra, probability, and statistics
-    good communication and problem-solving skills
-    some prior experience in the design and development of machine learning and deep learning architectures.

16

Artificial Intelligence applications for advanced manufacturing systems

Proposer

Santa Di Cataldo

Topics

Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

Industry 4.0 refers to digital technologies designed to sense, predict, and interact with production systems, to make decisions that support productivity, energy-efficiency, and sustainability. While Artificial Intelligence plays a crucial role in this paradigm, many challenges are still posed by the nature and dimensionality of the data, and by the immaturity and intrinsic complexity of some of the processes involved. The aim of this PhD program is to successfully tackle these challenges.

Research objectives and methods

The main goal of this PhD program is the investigation, design and deployment of state-of-the-art Artificial Intelligence approaches in the context of the smart factory, with special regard to new-generation manufacturing systems. These tasks include:
- quality assurance and inspection of manufactured products via heterogeneous sensor data (e.g., images from visible-range or IR cameras, time-series, etc.)
- process monitoring and forecasting
- anomaly detection
- failure prediction and maintenance planning support
While the Artificial Intelligence technologies able to address such tasks may already exist and be successfully consolidated in other real-world applications, the specific domain of manufacturing systems poses severe challenges to their effective deployment. Among others:
- the immaturity of the involved technologies
- the complexity of the underlying physical/chemical processes
- the lack of effective infrastructures for data collection, integration, and annotation
- the necessity to handle heterogeneous and noisy data from different types of sensors/machines
- the lack of annotated datasets for training supervised models
- the lack of standardized quality measures and benchmarks
This PhD program seeks solutions to these challenges, with a specific focus on new-generation manufacturing systems involving complex processes, for example Additive Manufacturing (AM) and Semiconductor Manufacturing (SM).
- AM includes many innovative 3D printing processes, which are rapidly revolutionizing manufacturing towards higher digitalization of the process and higher flexibility of production. AM involves a fully digitalized process from design to product finishing, and hence it is a perfect candidate for the deployment of Artificial Intelligence. Nonetheless, it is a very complex and still immature technology, with tremendous room for improvement in terms of production time and product defectiveness. Specific use-cases in this regard will stem from the supervisors' collaborations with the Inter-departmental Center for Additive Manufacturing at Politecnico di Torino, as well as with several major industrial partners such as Prima Additive, Stellantis, Avio Aero, etc.
- SM is another highly complex process, entailing a wide array of subprocesses and diverse equipment. Driven by the Industry 4.0 revolution and the European Chips Act, the semiconductor industry is investing heavily in the digitalization of its production chain. As a result of these investments, the chip production process has been equipped with multiple sensors that constantly monitor the evolution of each manufacturing phase, from oxidation to testing and packaging, thus collecting a tremendous amount of heterogeneous data. Artificial Intelligence is widely acknowledged to have a fundamental role in fully unveiling the potential and hidden knowledge of such data. Use-cases in this regard will stem from the supervisors' collaborations with important industrial players in this sector, such as STMicroelectronics.
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with real data from the industrial partners, and will seek solutions to the technological and computational challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed industry-oriented conference (e.g., ETFA), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Industrial Informatics, Expert Systems with Applications, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone framework. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
-    positive attitude to research activity and working in team
-    solid programming skills 
-    solid basics of linear algebra, probability, and statistics
-    good communication and problem-solving skills
-    some prior experience in the design and development of machine learning and deep learning architectures. 
-  some prior knowledge/experience of manufacturing processes is a plus, but not a requirement.

17

AI for Secured Networks: Language Models for Automated Security Log Analysis

Proposer

Marco Mellia

Topics

Cybersecurity, Data science, Computer vision and AI

Group website

https://smartdata.polito.it/
https://dbdmg.polito.it/dbdmg_web/

Summary of the proposal

Network security analysts are a key component of the defence infrastructure of an organization. They continuously and manually analyze security alarms and logs to make decisions against undesired intrusions.   
 
Language Models (LMs) have demonstrated huge potential in processing texts. The research will evaluate the capabilities of LM agents (lightweight, large, and multi-modal ones) in automating the investigation of security logs and performing zero-shot classification through generalization.

Research objectives and methods

Research objectives:
 
Investigate and evaluate the capabilities of LLM agents in automating the manual investigations of security analysts, assisting them in analysis and incident reporting.

The candidate will perform research to determine whether, and to what extent, the recent advances in language models could be used to automate and assist security analysts in the process (i) of learning the security-device rules by example and (ii) autonomously investigating the challenging cases currently analyzed by humans.
 
In the second phase, the candidate will investigate how and if lightweight and generalizable language models can extract insights from raw data, as large language models can do today. The goal is to investigate whether models with limited supervision and a minimal number of trusted labels can attain performance comparable to generic large language models (LLMs) when applied to specific tasks such as code understanding, classification, anomaly detection, bug detection, or identifying security breaches.
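As a minimal illustration of zero-shot classification through generalization, the sketch below matches a raw log line against natural-language label descriptions using character n-gram cosine similarity, so no task-specific training labels are needed (the labels, descriptions, and log lines are invented for the example; real work would use learned embeddings rather than n-gram counts):

```python
from collections import Counter
from math import sqrt

def ngrams(text, n=3):
    """Bag of character trigrams as a crude text embedding."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot(log_line, label_descriptions):
    """Assign the label whose natural-language description is most
    similar to the log line -- no task-specific training required."""
    emb = ngrams(log_line.lower())
    return max(label_descriptions,
               key=lambda lbl: cosine(emb, ngrams(label_descriptions[lbl].lower())))

labels = {
    "bruteforce": "repeated failed password login attempts ssh",
    "scan": "port scan probe connection attempts many ports",
}
pred = zero_shot("Failed password for root from 10.0.0.1 port 22 ssh2", labels)
```

New threat categories can be added by writing a new description, with no retraining; this is the behavior that learned multi-modal embeddings would provide with far better generalization.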
 
The research will consider multi-modal embeddings that are conceptually constrained towards the right task. By forcing the model to create such embeddings, it will gain the ability to generalize and autonomously reason about novel, previously unencountered tasks. For instance, the research will test joint learning of (i) a natural-language label explanation of the security threat and (ii) the packet payload, using, e.g., contrastive learning techniques.
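The joint-learning idea above can be sketched with an InfoNCE-style contrastive objective, which pulls each (label-explanation, payload) embedding pair together while pushing mismatched pairs apart (the batch size, dimensionality, and random embeddings are invented for illustration; in practice the embeddings come from trainable encoders):

```python
import numpy as np

def info_nce(text_emb, payload_emb, temperature=0.1):
    """InfoNCE loss over a batch of paired embeddings: row i of each
    matrix is one (label-explanation, payload) pair. Matching pairs
    (the diagonal) are treated as positives, all others as negatives."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    p = payload_emb / np.linalg.norm(payload_emb, axis=1, keepdims=True)
    logits = t @ p.T / temperature          # cosine similarity matrix
    idx = np.arange(len(t))                 # positives on the diagonal
    log_softmax = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    return -log_softmax[idx, idx].mean()

rng = np.random.default_rng(0)
aligned = rng.normal(size=(8, 16))
loss_aligned = info_nce(aligned, aligned)                  # perfect pairs
loss_random = info_nce(aligned, rng.normal(size=(8, 16)))  # mismatched pairs
```

Minimizing this loss during training is what conceptually constrains the two modalities into a shared embedding space.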
 
The project will involve a collaboration with Huawei Technologies France and Politecnico di Torino. 

Outline of the research work plan:

1st year
- Study of the state of the art in security log analysis and in language models for ML.
- Data collection and analysis of raw and structured data on security devices such as Firewalls/Intrusion Prevention Systems (IPS), Endpoint Detection and Response (EDR), and Cloud security services.

2nd year
- Adapt and extend solutions to learn the security-device rules by example and to autonomously investigate complex cases.
- Propose and develop innovative solutions to the problem of cyber-threat analysis with language models.
- Propose multi-modal embeddings for network raw data and security logs.

3rd year
- Tune the developed techniques and highlight possible strategies to counteract the various threats.
- Apply the strategies to new data for validation and testing.

References:
- Boffa, M., Valentim, R. V., Vassio, L., Giordano, D., Drago, I., Mellia, M., & Houidi, Z. B. (2023). LogPrécis: Unleashing Language Models for Automated Shell Log Analysis. arXiv preprint arXiv:2307.08309.
- Boffa, M., Milan, G., Vassio, L., Drago, I., Mellia, M., & Houidi, Z. B. (2022, June). Towards NLP-based processing of honeypot logs. In 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) (pp. 314-321). IEEE.
- Boffa, M., Vassio, L., Mellia, M., Drago, I., Milan, G., Houidi, Z. B., & Rossi, D. (2022, December). On using pretext tasks to learn representations from network logs. In Proceedings of the 1st International Workshop on Native Network Intelligence (pp. 21-26).

List of possible venues for publications:
- Security venues: IEEE Symposium on Security and Privacy, IEEE Transactions on Information Forensics and Security, ACM Symposium on Computer and Communications Security (CCS), USENIX Security Symposium, IEEE Security & Privacy;
- AI venues: Neural Information Processing Systems (NeurIPS), International Conference on Learning Representations (ICLR), International Conference on Machine Learning (ICML), AAAI Conference on Artificial Intelligence, ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD);
- Computer networks venues: Distributed System Security Symposium (NDSS), Privacy Enhancing Technologies Symposium, The Web Conference (formerly International World Wide Web Conference WWW), ACM International Conference on Emerging Networking EXperiments and Technologies (CoNEXT), USENIX Symposium on Networked Systems Design and Implementation (NSDI).

Required skills

- Good programming skills (such as Python, Torch, Spark)

- Excellent Machine Learning knowledge

- Knowledge of NLP and LMs

- Basics of Networking and security

18

Leveraging Machine Learning Analytics for Intelligent Transport Systems Optimization in Smart Cities

Proposer

Marco Mellia

Topics

Data science, Computer vision and AI

Group website

https://smartdata.polito.it/
https://dbdmg.polito.it/dbdmg_web/

Summary of the proposal

Electrification and big data are changing the design of transport systems. The availability of large amounts of data collected by black boxes for insurance/safety opens innovative challenges and opportunities to improve transport systems and reduce carbon footprint.
The research will focus on effective machine learning pipelines for multiple purposes, including proposing new policies, optimizing fleets, and designing electrified systems, with a focus on comparing the impact of ICE and electric cars.

Research objectives and methods

Research objectives:
 
This PhD research seeks to harness the power of machine learning and big data analytics in understanding and optimizing mobility through the analysis of data collected from black boxes in fleets of cars. 

This proposal outlines a comprehensive plan to leverage big data analytics for intelligent transport systems in smart cities. The impact of mobility based on electric vehicles and its comparison with previous habits will be a core part of the study. 
The research objectives aim to contribute valuable insights to mobility planning and optimization, and the work plan ensures a systematic approach to achieving these objectives. 
The PhD student will be involved in research activities with companies and funded research projects. Data will be provided by a company.

Outline of the research work plan:

1st year
- Study of state-of-the-art data analysis techniques for transportation and mobility.
- Data collection, exploration and pre-processing: extract and pre-process raw data from black boxes, ensuring data quality and compatibility for further analysis. Develop techniques to handle missing or incomplete data.
- Investigate and implement privacy-preserving methods to ensure ethical use of mobility data while still deriving valuable insights.

2nd year
- Apply machine learning algorithms to identify patterns in mobility data, extracting insights into traffic flows, congestion, and usage patterns.
- Implement anomaly detection mechanisms to identify unusual events and improve system resilience.
- Develop predictive models to forecast traffic conditions, enabling proactive measures to alleviate congestion and enhance overall traffic management.
- Explore adaptive algorithms for real-time adjustments based on dynamic traffic patterns.
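As a toy illustration of the anomaly-detection step, the sketch below flags points in a traffic-count series that deviate more than a few standard deviations from a trailing-window mean (the window size, threshold, and hourly counts are invented for the example; real pipelines would use far richer models):

```python
from statistics import mean, stdev

def flag_anomalies(series, window=24, z_thresh=3.0):
    """Flag points deviating > z_thresh standard deviations from the
    trailing-window mean (toy traffic-count anomaly detector)."""
    flags = []
    for i, x in enumerate(series):
        past = series[max(0, i - window):i]
        if len(past) < 3:            # not enough history yet
            flags.append(False)
            continue
        m, s = mean(past), stdev(past)
        flags.append(s > 0 and abs(x - m) > z_thresh * s)
    return flags

# Hourly vehicle counts with one unusual spike
counts = [100, 102, 98, 101, 99, 100, 103, 97, 100, 250, 101, 99]
flags = flag_anomalies(counts, window=8)   # only the 250 is flagged
```

The same rolling-statistics skeleton extends naturally to per-road-segment series and to more robust statistics (median/MAD) when the data is noisy.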

3rd year
- Integrate the developed algorithms into a cohesive system for intelligent transport systems.
- Validate the system using real-world scenarios and fine-tune the algorithms for optimal performance.

References:
- Ciociola, A., Cocca, M., Giordano, D., Mellia, M., Morichetta, A., Putina, A., & Salutari, F. (2017, August). UMAP: Urban mobility analysis platform to harvest car-sharing data. In SmartWorld (pp. 1-8). IEEE.
- Cocca, M., Giordano, D., Mellia, M., & Vassio, L. (2019). Free-floating electric car sharing: A data-driven approach for system design. IEEE Transactions on Intelligent Transportation Systems, 20(12), 4691-4703.
- Cocca, M., Giordano, D., Mellia, M., & Vassio, L. (2019). Free-floating electric car sharing design: Data-driven optimisation. Pervasive and Mobile Computing, 55, 59-75.


List of possible venues for publications
- IEEE Transactions on Intelligent Transportation Systems
- Elsevier Transportation Research
- IEEE International Conference on Data Science and Advanced Analytics
- IEEE International Conference on Big Data
- IEEE International Smart Cities Conference
- ACM Transactions on Spatial Algorithms and Systems
- IEEE Transactions on Vehicular Technology
- Elsevier Cities

Required skills

- Good programming and data analysis skills (such as Python, Pandas, Torch, Spark)
- Excellent Machine learning knowledge
- Fundamentals of Operational research

19

Natural Language Processing and Large Language Models for source code generation

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research is focused on revolutionizing source code generation by harnessing the capabilities of Natural Language Processing, exploring novel methodologies that facilitate the creation of high-quality code through enhanced human-machine collaboration. By leveraging advanced language models, like Generative Pretrained Transformer models, the research seeks to optimize the process, leading to more efficient, expressive, and context-aware source code generation in software development.

Research objectives and methods

The integration of Artificial Intelligence, especially Machine/Deep Learning, in industrial processes promises swift changes. Companies stand to benefit in the short term from improved production quality, efficiency, and the automation of routine tasks, fostering positive impacts on work environments. In addition to Natural Language Processing, Large Language Models (LLMs) have already demonstrated significant progress in healthcare, education, software development, finance, journalism, scientific research, and customer support. The future entails optimizing LLMs for widespread use, enhancing the competitiveness of the industrial system and streamlining collaborative supply chain management.

The objective of this Ph.D. proposal is the design and development of AI-assisted models based on Natural Language Processing (NLP) and Large Language Models (LLMs) to optimize AI-assisted source code generation in software development, making the process more efficient, expressive, and context-aware.

During the three years of the Ph.D., the research activity will be divided into five phases:
- Survey the existing literature on NLP applications in software engineering and analyze methodologies and challenges in source code generation using language models.
- Design and develop Large Language Models for improved programming-language understanding by investigating techniques for domain-specific customization of language models.
- Develop algorithms and strategies for context-aware source code generation by implementing prototype systems for evaluation and refinement.
- Design and implement a collaborative framework that seamlessly integrates developer input with language model suggestions.
- Evaluate the effectiveness of the collaboration framework through user studies and real-world projects.

Possible international scientific journals and conferences:
- IEEE Transactions on Audio, Speech, and Language Processing
- IEEE Transactions on Software Engineering
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Industry Applications
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- IEEE NLP-KE international conference
- IEEE ICNLP international conference
- IEEE COMPSAC international conference

Required skills

- Programming and Object-Oriented Programming (preferably in Python)
- Knowledge of Natural Language Processing and Large Language Models
- Knowledge of frameworks to develop models based on Natural Language Processing and Large Language Models

20

Cloud continuum machine learning

Proposer

Daniele Apiletti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

 

Summary of the proposal

As the demand for novel distributed machine learning models operating at the edge continues to grow, so does the call for cloud continuum frameworks to support machine learning.
In this broad context, the candidate will explore innovative solutions achieved by combining the benefits of edge-based machine learning models with the cloud continuum scenario, in a wide range of application contexts.

Research objectives and methods

Research objectives
 
This research aims to define new methods for improving machine learning applications in cloud computing contexts. While traditional machine learning models are trained in the cloud and can leverage the virtually unlimited storage and computational resources offered by scalable data centers, the goal of the research is to investigate the limitations of, experimentally evaluate, and improve the state of the art in machine learning models based on distributed and federated learning techniques.
Applications that are delay sensitive or generate large amounts of distributed time series data can benefit from the proposed paradigm: the computational power provided by devices at the edge and by intermediate nodes between the edge and the central cloud (fog computing) can be used to provide cloud continuum machine learning models.
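The federated side of this paradigm can be sketched with one FedAvg-style aggregation round, where edge clients train locally and only model parameters (never raw data) are averaged in the cloud, weighted by local dataset size (the client models and sizes are invented for illustration):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """One FedAvg aggregation round: average the clients' parameter
    tensors, weighted by local dataset size. No raw data leaves the edge."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n / total for w, n in zip(client_weights, client_sizes))
        for k in range(n_params)
    ]

# Three edge clients, each holding a tiny linear model [slope, bias]
clients = [[np.array([1.0, 0.0])],
           [np.array([3.0, 1.0])],
           [np.array([2.0, 0.5])]]
sizes = [100, 100, 200]
global_model = fed_avg(clients, sizes)   # weighted parameter average
```

In a full cloud-continuum system the same aggregation could also run hierarchically, with fog nodes averaging their local clients before a final averaging step in the cloud.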
 
Innovative cloud continuum machine learning solutions will be applied using existing cloud-to-edge frameworks, while also following current EU research directions that aim to create alternatives to established hyperscalers by building an EU-based sovereign edge platform (e.g., SovereignEdge.eu, EUCloudEdgeIoT.eu, FluidOS, etc.).
 
The proposed research can be useful in many scenarios: Time Series Data Modeling and Energy Management at different scales, from watersheds (e.g., PNRR project NODES) to smart cities, from large buildings to complex vehicles (e.g., airplanes and cruise ships), from smart manufacturing to distributed sensors in healthcare, in smart power grids, and IoT networks where devices have limited resources and are very sensitive to environmental conditions, data speed, network connectivity, and power consumption.
To this end, several research topics will be addressed, such as:
-       Edge AI and machine learning for next-generation computing systems.
-       Benefits and challenges of cloud and edge computing, through comparative experimental analysis of state-of-the-art applications and real-world scenarios.
-       Lightweight AI models with better efficiency for devices with limited computational and energy resources.
-       Distributed and decentralized learning techniques in network monitoring and orchestration.
-       Mitigation and prevention of security breaches in edge ML, using AI monitoring tools.
 
Outline of the research work plan
 
1st year. The candidate will explore the state of the art of distributed machine learning techniques, such as federated learning, split learning, and gossip learning, in the context of an edge computing environment. He/she will look for gaps and emerging trends in AI models in the cloud continuum and test existing paradigms on a real-world application.
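As a concrete starting point, the core of the federated learning paradigm (federated averaging) can be sketched in a few lines. This is a toy linear-regression setup on synthetic data; all names and the one-local-step scheme are illustrative assumptions, not a specific framework's API:

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One gradient step of linear regression on a client's local data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fedavg_round(w_global, clients, lr=0.1):
    """One federated round: broadcast, local update, sample-weighted average."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_step(w_global.copy(), X, y, lr))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):                       # three edge nodes, data never leaves them
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(200):
    w = fedavg_round(w, clients)
print(np.round(w, 2))                    # converges toward [ 2. -1.]
```

Only model parameters travel between edge and cloud, which is what makes the scheme attractive for the delay- and privacy-sensitive applications mentioned above.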
 
2nd year. The candidate will design and develop novel solutions to overcome limitations and constraints by testing proposed methods on highlighted real-world challenges. Public, artificial, and possibly real data sets will be used for the development and testing phases. New limitations and constraints are expected to be discovered during this phase.
 
In the 3rd year, the candidate will advance the research by extending the experimental evaluation to more complex scenarios that can better benefit from the proposed cloud continuum machine learning solutions. To identify shortcomings and possible further advances in new application areas, the candidate will optimize the proposed models.
 
 
List of possible venues for publications.
 
 
Journal of Grid Computing (Springer)
Future Generation Computer Systems (Elsevier)
IEEE TKDE (Trans. on Knowledge and Data Engineering)
IEEE TCC (Trans. on Cloud Computing)
ACM TKDD (Trans. on Knowledge Discovery in Data)
ACM TOIS (Trans. on Information Systems)
ACM TOIT (Trans. on Internet Technology)
ACM TIST (Trans. on Intelligent Systems and Technology)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Internet of Things (Elsevier)
Journal of Big Data (Springer)
IEEE TBD (Trans. on Big Data)
Big Data Research
IEEE TETC (Trans. on Emerging Topics in Computing)
IEEE Internet of Things Journal
Journal of Network and Computer Applications (Academic Press)

Required skills

Knowledge of basic computer science concepts.
Knowledge of the main cloud computing topics.
Programming skills in C-family and Python languages.
Undergraduate experience with data mining and machine learning techniques.
Knowledge of English, both written and spoken.
Capability of presenting the results of the work, both written (scientific writing and slide presentations) and oral.
Capability of guiding undergraduate students for thesis projects.

21

Graph network models for Data Science

Proposer

Daniele Apiletti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

 

Summary of the proposal

Machine learning approaches extract information from data with generalized optimization methods. However, besides the knowledge brought by the data, extra a-priori knowledge of the modeled phenomena is often available. Hence an inductive bias can be introduced from domain knowledge and physical constraints, as proposed by the emerging field of Theory-Guided Data Science.
Within this broad field, the candidate will explore solutions exploiting the relational structure among data.

Research objectives and methods

  Research Objectives
 
The research aims at defining new methodologies for semantics embedding, proposing novel algorithms and data structures, exploring applications, investigating limitations, and advancing the solutions based on different emerging Theory-Guided Data Science approaches.
The final goal is to contribute to improving machine learning model performance by reducing the learning space through the exploitation of existing domain knowledge, in addition to the (often limited) available training data, pushing towards more unsupervised and semantically richer models.
To this aim, the main research objective is to exploit the Graph Network frameworks in deep-learning architectures by addressing the following issues:
- Improving state-of-the-art strategies of organizing and extracting information from structured data.
- Overcoming the limitation of Graph Network models in training very deep architectures, which currently entails a loss in the expressive power of the solutions.
- Advancing the state-of-the-art solutions to dynamic graphs, whose nodes and mutual connections can change over time; dynamic networks can successfully learn the behavior of evolving systems.
- Experimentally evaluating the novel techniques in large-scale systems, such as supply chains, social networks, collaborative smart-working platforms, etc. Currently, for most graph-embedding algorithms, scalability is difficult to handle since each node has a peculiar neighborhood organization.
- Applying the proposed algorithms to natively graph-unstructured data, such as texts, images, audio, etc.
- Developing techniques to design ensemble graph architectures to capture domain-knowledge relationships and physical constraints.
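The message-passing operation at the heart of Graph Network frameworks can be sketched as follows. This is a minimal NumPy illustration, not tied to any specific library; the graph, features, and weights are synthetic:

```python
import numpy as np

def gn_layer(A, H, W_self, W_neigh):
    """One message-passing step: A is the (n,n) adjacency matrix,
    H the (n,d) node features; returns updated node features."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid division by zero
    msg = (A @ H) / deg                 # mean aggregation over neighbours
    return np.tanh(H @ W_self + msg @ W_neigh)

rng = np.random.default_rng(0)
n, d = 4, 3
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = rng.normal(size=(n, d))
W_self, W_neigh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

H1 = gn_layer(A, H, W_self, W_neigh)    # one propagation step
H2 = gn_layer(A, H1, W_self, W_neigh)   # stacking layers widens the receptive field
print(H2.shape)                         # (4, 3)
```

Stacking many such layers is exactly where the depth limitation mentioned above appears (over-smoothing of node representations), which motivates the second research objective.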
 
Outline
 
1st year. The candidate will explore the state-of-the-art techniques for dealing with both structured and unstructured data, to integrate domain-knowledge strategies in network model architectures. Applications to physics phenomena, images, and text, taken from real-world networks such as social platforms and supply chains, will be considered.
2nd year. The candidate will define innovative solutions to overcome the limitations described in the research objectives, by experimenting with the proposed techniques on the identified real-world problems. The development and experimental phases will be conducted on public, synthetic, and possibly real-world datasets. New challenges and limitations are expected to be identified in this phase.
During the 3rd year, the candidate will extend the research by widening the experimental evaluation to more complex phenomena able to better leverage the domain knowledge captured by the Graph Networks. The candidate will perform optimizations on the designed algorithms, establishing the limitations of the developed solutions and possible improvements in new application fields.
 
Target publications
 
IEEE TKDE (Trans. on Knowledge and Data Engineering)
ACM TKDD (Trans. on Knowledge Discovery in Data)
ACM TOIS (Trans. on Information Systems)
ACM TOIT (Trans. on Internet Technology)
ACM TIST (Trans. on Intelligent Systems and Technology)
IEEE TPAMI (Trans. on Pattern Analysis and Machine Intelligence)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Engineering Applications of Artificial Intelligence (Elsevier)
Journal of Big Data (Springer)
ACM Transactions on Spatial Algorithms and Systems (TSAS)
IEEE Transactions on Big Data (TBD)
Big Data Research
IEEE Transactions on Emerging Topics in Computing (TETC)

Required skills

- Knowledge of the basic computer science concepts.
- Programming skills in Python
- Undergraduate experience with data mining and machine learning techniques
- Knowledge of English, both written and spoken.
- Capability of presenting the results of the work, both written (scientific writing and slide presentations) and oral.
- Capability of guiding undergraduate students for thesis projects.

22

Automatic composability of Large Co-simulation Scenarios for smart energy communities

Proposer

Edoardo Patti

Topics

Parallel and distributed systems, Quantum computing, Data science, Computer vision and AI, Computer architectures and Computer aided design

Group website

www.eda.polito.it

Summary of the proposal

The emerging concept of multi-energy systems is linked to heterogeneous competencies spanning from energy systems to cyber-physical systems and active prosumers. Studying such complex systems requires the usage of co-simulation techniques. However, the setup of co-simulation scenarios requires a deep knowledge of the framework and a time-consuming setup of the distributed infrastructure. The research program aims to develop automatic composability of multi-energy system co-simulations to ease their usage.

Research objectives and methods

A complex system such as a multi-energy system requires the accurate modelling of the heterogeneous aspects that constitute the overall phenomenon under study. To achieve this goal, researchers in different fields have started using co-simulation and model coupling to build new models capable of describing the interactions and the overall complexity. Such approaches make it possible to couple different models, running on different simulators and/or simulation engines, by exchanging data via standard protocols over the internet. Indeed, such models have been developed and validated following a methodology comparable to a service-oriented architecture, thus reducing the time and complexity of building new models from scratch. Moreover, such an approach eases the interconnection of the vertical knowledge coming from each discipline/domain involved in the complex system, e.g., ICT or energy experts. Examples of models are software entities that replicate the realistic behaviour of a photovoltaic (PV) system, energy storage, heating distribution networks or, even, human beings.
Nowadays, researchers have invested in the usage of co-simulation orchestrators to achieve the interconnection and synchronization of different models and simulators, including real-time simulators. However, the setup of a co-simulation is not an easy and trivial task, as it is time-consuming and requires the involvement of domain and co-simulation experts. This research topic aims to develop a framework, exploiting existing co-simulation orchestrators, for the automatic composability of co-simulation scenarios in a distributed infrastructure, to assess different aspects of multi-energy systems. The framework will integrate models in a plug-and-play fashion, reducing as much as possible the coding phase and the presence of a co-simulation expert, and easing the work of multi-energy systems engineers.
Moreover, the framework will ease the setup in terms of computational resources for the modelling of complex and large scenarios. The final purpose consists of simulating the impact and management of future energy systems to foster the energy transition. Thus, the resulting infrastructure will integrate, with a semantic approach in a distributed environment, heterogeneous i) data sources, ii) cyber-physical systems (i.e., Internet-of-Things devices), iii) models of energy systems, and iv) real-time simulators. The starting point of this activity will be the already existing EC-L co-simulation platform, which will be enhanced by embedding all the aforementioned features.

Hence, the research will focus on developing:
- a methodology based on semantic web technologies for linking and interconnecting simulators automatically in a co-simulation approach;
- a domain-specific ontology for describing the components and interconnections of multi-energy system models;
- a methodology for the automatic composability and setup of the distributed infrastructure of the energy scenario to assess (e.g., the impact of PV systems and EVs in a city).
In a nutshell, the final result will be a tool that exploits visual programming, semantic representation, and cloud technologies to offer co-simulation as a service for describing multi-energy system simulation scenarios in a plug-and-play fashion, opening the usage of co-simulation to a wider audience.
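To illustrate the intended plug-and-play composition, the following toy sketch couples two hypothetical models, a PV generator and a battery, through a minimal orchestrator loop. All component names, parameter values, and the hour-by-hour synchronisation scheme are illustrative assumptions, not the EC-L platform's API:

```python
class PVModel:
    """Toy PV simulator: a triangular irradiance profile peaking at noon (kW)."""
    def step(self, t):
        return max(0.0, 5.0 - abs(t - 12))

class BatteryModel:
    """Toy storage simulator: accumulates energy up to a fixed capacity (kWh)."""
    def __init__(self, capacity=10.0):
        self.soc, self.capacity = 0.0, capacity
    def step(self, power_in):
        self.soc = min(self.capacity, self.soc + power_in)
        return self.soc

def orchestrate(models, hours=24):
    """Synchronise the models hour by hour, wiring PV output into the battery."""
    pv, battery = models["pv"], models["battery"]
    trace = []
    for t in range(hours):
        generated = pv.step(t)          # each simulator advances one shared step
        trace.append(battery.step(generated))
    return trace

trace = orchestrate({"pv": PVModel(), "battery": BatteryModel()})
print(round(trace[-1], 1))              # 10.0: the battery saturates at capacity
```

In the envisioned framework, the wiring done by hand in `orchestrate` would instead be derived automatically from the semantic description of each model's inputs and outputs.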

The outcomes of this research will be a distributed co-simulation platform for:
- planning the evolution of the future smart multi-energy system by taking into account the operational phase;
- evaluating the effect of different policies and the related customer satisfaction;
- evaluating the performance of hardware components in a realistic test bench.

During the first year, the candidate will study the literature on existing co-simulation platforms to identify the best available solutions for i) large-scale smart energy system simulation in distributed environments and ii) semantic web approaches to describe complex systems, with a focus on the multi-energy system domain. Finally, the student will design the overall framework, starting from requirements identification and definition.

During the second year, the candidate will carry out the implementation of the visual and semantic framework for model coupling and scenario creation. Furthermore, the candidate will start developing software solutions for the automatic composability and setup of the co-simulation environments, in terms of simulator deployment in a cloud system.

During the third year, the candidate will complete the overall framework development and test it in different case study scenarios to assess the capabilities of the platform in terms of automatic scenario composition and setup.     

Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Sustainable Computing
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferably in Python)
Frameworks for orchestration and setup of containerized applications
Knowledge of semantic technologies
Computer Networks

23

Multivariate time series representation learning for vehicle telematics data analysis

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI

Group website

https://smartdata.polito.it/ 
https://www.tierratelematics.com/

Summary of the proposal

This PhD proposal aims to study new techniques for embedding multivariate time series, apply them to solve established downstream tasks, and leverage these solutions in Data Science pipelines to analyze vehicles' telematics data such as CAN Bus signals. Embeddings will not only capture the series' temporal properties but also their multi-dimensional relations. These models will be used to classify, segment, and cluster signals and to detect anomalies and communities for industrial vehicle usage.

Research objectives and methods

Description: 
Multivariate time series data have peculiar properties related to their sequential and multi-faceted nature. Although state-of-the-art embedding techniques tailored to time series data are effective in handling sequential data relations, thanks to the use of auto-regressive or attention-based models, they often struggle to handle multiple dimensions at the same time. For example, CAN bus data acquired from vehicles cover a variety of different aspects (e.g., fuel level, coolant temperature, engine speed, ...) that are worth jointly analyzing to address predictive maintenance, anomaly detection, fleet detection and management, and telematics service shaping.
 
The PhD research will advance existing approaches to process and encode multivariate time series data, which encompass (but are not limited to) transformer models [1,2], contrastive and adversarial networks [3,4], matrix profile-based models [5,6], and Large Language Models [7]. The proposed representations will then be used to address various downstream tasks on time series data, including classification, forecasting, segmentation, clustering, and anomaly detection. For example, clustering and classifying CAN bus signals can be useful to automatically identify the working status of a vehicle according to both its performed activities and the environmental conditions [8]. Inter-series relations can also be analyzed to detect vehicle fleets and optimize resource allocation.
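As an example of one of the cited families of techniques, the matrix-profile idea [5,6] can be sketched with a naive O(n^2) implementation on a synthetic signal: for each window, the profile stores the distance to its nearest non-trivial match, so low values flag recurring motifs and high values flag anomalies.

```python
import numpy as np

def matrix_profile(ts, m):
    """Naive matrix profile with z-normalised Euclidean distance."""
    n = len(ts) - m + 1
    windows = np.array([ts[i:i + m] for i in range(n)])
    # z-normalise each window so matches are shape-based, not level-based
    windows = (windows - windows.mean(1, keepdims=True)) / windows.std(1, keepdims=True)
    profile = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) >= m:  # exclusion zone: skip trivial self-matches
                profile[i] = min(profile[i], np.linalg.norm(windows[i] - windows[j]))
    return profile

t = np.linspace(0, 8 * np.pi, 400)
ts = np.sin(t)
ts[200:210] += 3.0                 # inject a synthetic anomaly
mp = matrix_profile(ts, m=20)
print(int(np.argmax(mp)))          # the peak falls in the anomalous region (~181-209)
```

Production implementations such as DAMP [6] replace the double loop with streaming FFT-based distance computations, which is what makes the approach viable on the scale of CAN bus data.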
 
Research objectives:
- study state-of-the-art machine learning techniques for time series and compare their performance on the case study;
- collect and analyze raw and structured data regarding vehicle telematics;
- design, develop, and test new approaches to time series representation;
- benchmark unimodal and multimodal time series models for time series classification, clustering, forecasting, and segmentation;
- design new algorithms and methodologies to process time series data for supervised and unsupervised tasks.
 
 
Industrial collaborations: 
The PhD activities will be supported by the ongoing research collaboration between Politecnico di Torino and Tierra Spa, a multinational telematics service provider that will provide in-domain data, expert supervision, and related case studies.  
 
In parallel, the research methods and algorithms can be also tested on benchmark data such as the UCR Time Series Classification Archive (https://www.cs.ucr.edu/~eamonn/time_series_data/) and mTAD (https://github.com/OpsPAI/MTAD). 
 
List of possible publication venues: - ECML PKDD, ACM CIKM, KDD, IEEE ICDE, IEEE ICDM, NEURIPS conferences - ACM TIST, ACM TKDD, IEEE TKDE, Elsevier Computers In Industry, Elsevier Information Sciences 
 
References: 
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention is All you Need. NIPS 2017: 5998-6008 
[2] Chao Yang, Xianzhi Wang, Lina Yao, Guodong Long, Guandong Xu: Dyformer: A dynamic transformer-based architecture for multivariate time series classification. Inf. Sci. 656: 119881 (2024) 
[3] Sana Tonekaboni, Danny Eytan, Anna Goldenberg: Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. ICLR 2021 
[4] Chengyu Wang, Kui Wu, Tongqing Zhou, Zhiping Cai: Time2State: An Unsupervised Framework for Inferring the Latent States in Time Series Data. Proc. ACM Manag. Data 1(1): 17:1-17:18 (2023)
[5] Eamonn J. Keogh: Time Series Data Mining: A Unifying View. Proc. VLDB Endow. 16(12): 3861-3863 (2023) 
[6] Yue Lu, Renjie Wu, Abdullah Mueen, Maria A. Zuluaga, Eamonn J. Keogh: DAMP: accurate time series anomaly detection on trillions of datapoints and ultra-fast arriving data streams. Data Min. Knowl. Discov. 37(2): 627-669 (2023) 
[7] Azul Garza, Max Mergenthaler Canseco: TimeGPT-1. CoRR abs/2310.03589 (2023) 
[8] Silvia Buccafusco, Luca Cagliero, Andrea Megaro, Francesco Vaccarino, Riccardo Loti, Lucia Salvatori: Learning industrial vehicles' duty patterns: A real case. Comput. Ind. 145: 103826 (2023) 

Required skills

The PhD candidate is expected to:
- be proficient in Python and SQL programming;
- have a deep knowledge of statistics and probability fundamentals;
- have a solid background in data preprocessing and mining techniques;
- know the most established machine learning and deep learning techniques;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.
We seek motivated students who are willing to work at the intersection between academia and industry.

24

Designing a cloud-based heterogeneous prototyping platform for the development of fog computing apps

Proposer

Gianvito Urgese

Topics

Parallel and distributed systems, Quantum computing, Computer architectures and Computer aided design, Software engineering and Mobile computing

Group website

https://eda.polito.it/

Summary of the proposal

The PhD project enables SW developers to prototype complex solutions on heterogeneous systems (CPU, GPU, FPGA, Neuromorphic HW) effectively. 
We tackle technology transfer challenges in adopting neuromorphic HW for IoT and industry. 
The project objectives are the development of a Heterogeneous Prototyping Platform (HPP) that offers cost-effective testing of neuromorphic and traditional HW, a user-friendly interface, a HW-sharing system, and an energy monitoring system.

Research objectives and methods

Research objectives
The technology transfer process often incurs high costs, as new HW needs to be purchased and there is a risk that the results may not meet expectations, rendering the investment futile. Currently, there is growing interest among academic and industrial developer teams in utilizing neuromorphic HW for complex IoT and industrial use cases. Neuromorphic platforms represent a new type of brain-inspired HW designed to efficiently simulate Spiking Neural Networks (SNNs), which are considered suitable for low-power and low-latency computation. However, the existing neuromorphic boards can be prohibitively expensive for companies in the early stages of experimentation.
To address this issue, there is a need for an environment that allows companies to test and evaluate this new HW cost-effectively. This entails combining the neuromorphic chips with traditional components such as microcontrollers and GPUs in a heterogeneous system. The proposed solution, named Heterogeneous Prototyping Platform (HPP), enables companies to experiment with neuromorphic solutions before committing to the acquisition of the HW. This approach mitigates the risk of substantial costs and facilitates informed decision-making regarding further exploration of neuromorphic technology.
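For background on the computation such boards accelerate, a toy leaky integrate-and-fire (LIF) neuron, the basic unit of an SNN, can be sketched as follows. The parameter values and the simple Euler integration scheme are illustrative, not those of any specific neuromorphic platform:

```python
def lif_run(input_current, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one LIF neuron; returns the time steps at which it spikes."""
    v, spikes = 0.0, []
    for t, i_in in enumerate(input_current):
        v += dt * (-v / tau + i_in)   # leaky integration of the membrane potential
        if v >= v_thresh:             # threshold crossing emits a spike
            spikes.append(t)
            v = v_reset               # hard reset after each spike
    return spikes

spikes = lif_run([0.15] * 100)        # constant drive yields a regular spike train
print(spikes[:3])
```

Neuromorphic HW evaluates large populations of such neurons event by event, which is why SNNs are attractive for the low-power, low-latency use cases mentioned above.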
The objectives of the PhD plan will focus on designing and developing a user-friendly interface for remote prototyping of digital/neuromorphic solutions. The user front-end will leverage Kubernetes and microservice technologies, which have been adopted in cloud-based services such as the remote screen system implemented in the CrownLabs project. On the back-end, the candidate will develop a HW-sharing system to provide different users with access to the HW resources based on their specific requirements. Additionally, an energy monitoring component will be designed using e-meters to track the power consumption of the various HW/SW components prototyped within the cloud platform.
The envisioned platform architecture will be organized into multiple levels, including a login level, a computation level (encompassing programming, compilation, etc.), and an end-node level comprising heterogeneous HW interconnected with the system.
In the project, the candidate will target several emerging HW technologies such as FPGA, GPU, Neuromorphic platforms, and parallel architectures. 
The activities of research will be evaluated on three primary areas of application:
-       Medical and bioinformatics data stream analysis;
-       Video surveillance and activity recognition;
-       Smart water management system.
 
Outline of the research work plan
1st year. The candidate will explore cutting-edge frameworks for designing AIoT solutions on fog-based systems. He/She will analyze technologies such as SCADA systems, IoT platforms, data lakes, CrownLabs remote environment, and microservice SOA frameworks. Additionally, he/she will gain expertise in Neuromorphic HW and embedded systems, contributing to the development of tools for leveraging HW technologies in the Heterogeneous Prototyping Platform. 
2nd year. The candidate will develop an integrated heterogeneous prototyping framework supporting HW-sharing system, container/virtualization, front-end interface, and energy monitoring system dashboard.
Additionally, the candidate will be involved in the design and development of the HPP that facilitates the implementation and execution of neuromorphic simulations and AI applications on heterogeneous digital/neuromorphic computing systems.
3rd year. The candidate will validate the framework and platform on selected use cases with scientific and industrial partners. He/She will define and measure relevant KPIs to showcase the benefits of HPP for prototyping HW-heterogeneous computing solutions in fog systems. The candidate will also assist in integrating HPP into the EBRAINS service ecosystem.
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead fPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the bioinformatics and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
The possible conference and journal targets are the following:
-       IEEE/ACM International Conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
-       IEEE/ACM Journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP), MDPI Journals (e.g., Electronics).

Required skills

MS degree in computer engineering or electronics engineering.
Excellent skills in computer programming, computer architecture, cloud computing, SOA paradigm, embedded systems, and IoT applications. 
Technical background in network, cloud services, modelling, simulation, and optimization.

25

Designing a Development Framework for Engineering Edge-Based AIoT Sensor Solutions

Proposer

Gianvito Urgese

Topics

Data science, Computer vision and AI, Life sciences, Parallel and distributed systems, Quantum computing

Group website

https://eda.polito.it/

Summary of the proposal

The transition to digitalization, driven by the Industry 4.0 paradigm, requires advanced frameworks and tools to effectively integrate System of Systems (SoS) within industrial use case scenarios. 
The PhD project aims to develop a framework for the digital design and testing of data fusion algorithms. 
The objective is to integrate distributed sensing technologies deployed at the edge of industrial production lines through the fog computing paradigm, enhancing efficient and interoperable collaboration.

Research objectives and methods

Research objectives
In automation use cases, information extracted by different System of Systems (SoS) modules is combined along the path from raw sensors to actuators. The pipeline typically follows an intuitive order: first, the information from the sensors is processed; then, AI-based models or expert systems elaborate on this information; finally, the extracted knowledge is used to develop a control strategy for the actuator. 
The objective of the PhD activity is the development of a framework to support the automation of each step in the engineering process, enabling the creation of optimized Artificial Intelligence of Things (AIoT) sensor solutions. The framework will:
-       Manage onboard sensor data collection and labelling, allowing developers to build datasets from each available sensing system for any given use case.
-       Select and customize AI-based models to accomplish classification, data fusion, and continual learning tasks compatible with the execution on edge devices.
-       Define optimization strategies and tools for identifying model parameters by leveraging the labelled data acquired by the onboard sensing systems. 
-       Support the implementation of solvers for tasks mapped onto constraint optimization problems.
-       Validate model accuracy on the target edge device.
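As one example of making models compatible with edge execution, post-training int8 quantisation, a staple TinyML technique, can be sketched as follows. The affine per-tensor scheme below is illustrative and not tied to any specific framework:

```python
import numpy as np

def quantize_int8(w):
    """Map float weights onto int8 with a per-tensor scale and zero point."""
    scale = (w.max() - w.min()) / 255.0
    zero_point = np.round(-w.min() / scale) - 128
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)   # toy weight matrix
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)

print(q.nbytes / w.nbytes)    # 0.25: a four-fold memory reduction
# reconstruction error stays within roughly one quantisation step `s`
```

The 4x smaller integer tensor also enables the integer-only arithmetic that microcontroller-class edge devices execute efficiently.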
The framework will undergo evaluation using selected sensing technologies and analytics tasks associated with relevant industrial use cases in the automation field, such as the digitalization of smart water grids. 
In this project, the candidate will focus on emerging hardware technologies, including FPGA, GPU, Neuromorphic platforms, and parallel architectures, to implement new computational paradigms that optimize computation on the edge. Strong integration of Neuromorphic technology will be emphasized in the supported use cases.
 
Outline of the research work plan
1st year. The candidate will explore cutting-edge frameworks for designing AIoT solutions on the edge. He/She will gain experience in Neuromorphic HW and embedded systems for industrial applications. Additionally, he/she will contribute to defining framework requirements, technologies, and solutions for developing AI applications on edge devices.
2nd year. The candidate will develop an integrated methodological approach to a fog computing platform for modelling applications and systems. He/She will develop a user-friendly framework for AI applications on edge devices, considering multi-scenario analysis and benchmarking. The candidate will select SW libraries for developing and deploying models on edge computing devices using techniques from the TinyML field, including imitation, continual, federated, and deep learning, as well as neuromorphic models, for relevant industrial use cases.
3rd year. The candidate will apply the proposed approach to different complex use cases such as the digitalization of smart water grids, enabling greater generalisation of the methodology to different domains. The candidate will define and measure relevant KPIs to demonstrate the advantages of using the developed framework, compared to the use case's baseline. 
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead FPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the AIoT and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
The possible conference and journal targets are the following:
-       IEEE/ACM International Conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
-       IEEE/ACM Journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP), MDPI Journals (e.g., Electronics).

Required skills

MS degree in computer engineering, electronics engineering or physics of complex systems. 
Excellent skills in computer programming, computer architecture, embedded systems, and IoT applications. 
Technical background in deep learning, AI, edge computing, electronic design, modelling, simulation and optimization.

26

Computational Intelligence for Computer-Aided Design

Proposer

Giovanni Squillero

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://cad.polito.it

Summary of the proposal

The proposal focuses on the use and development of "intelligent" algorithms specifically tailored to the needs and peculiarities of CAD industries. Generic techniques ascribable to Computational Intelligence have long been used in the CAD field: probabilistic methods for the analysis of failures or the classification of processes; bio-inspired algorithms for the generation of tests, the optimization of parameters, or the definition of surrogate models.

Research objectives and methods

The recent fortune of the term "Machine Learning" has renewed interest in many automatic processes; moreover, the publicized successes of (deep) neural networks have softened the bias against other non-explicable black-box approaches, such as Evolutionary Algorithms or the use of complex kernels in linear models. The goal of the research is twofold: from an academic point of view, tweaking existing methodologies, as well as developing new ones, specifically able to tackle CAD problems; from an industrial point of view, creating a highly qualified expert able to bring scientific know-how into a company while also understanding practical needs, such as how data are selected and possibly collected. The need to team experts from industry with more mathematically minded researchers is apparent: frequently, a great knowledge of the practicalities is not accompanied by an adequate understanding of the statistical models used for analysis and predictions.

In the first year, the research will consider techniques that are less able to process large amounts of information but perhaps better able to exploit all the problem-specific knowledge available. It will almost certainly include bio-inspired techniques for generating, optimizing, and minimizing test programs, and statistical methods for analyzing and predicting the outcome of industrial processes (e.g., predicting the maximum operating frequency of a programmable unit from the frequencies measured by some ring oscillators; detecting dangerous elements in a circuit; predicting catastrophic events). The activity is also likely to exploit (deep) neural networks; however, developing novel, creative results in this area is not a priority. On the contrary, the research shall face problems related to dimensionality reduction, feature extraction, and prototype identification/creation.
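As a minimal illustration of such a bio-inspired search loop, a (1+1) evolutionary algorithm on a toy OneMax fitness can be sketched as follows; the bit string and its fitness stand in for, e.g., a candidate test program and its fault coverage (all parameters are illustrative):

```python
import random

def one_plus_one_ea(n=64, generations=2000, seed=0):
    """Keep a single parent; flip each bit with prob 1/n; accept if not worse."""
    rng = random.Random(seed)
    p_flip = 1.0 / n
    parent = [rng.randint(0, 1) for _ in range(n)]
    fitness = sum(parent)                        # OneMax: count of 1-bits
    for _ in range(generations):
        child = [b ^ (rng.random() < p_flip) for b in parent]
        if sum(child) >= fitness:                # elitist acceptance
            parent, fitness = child, sum(child)
    return parent, fitness

best, fit = one_plus_one_ea()
print(fit)   # close to the optimum of 64
```

In the CAD setting, the cheap `sum(child)` evaluation would be replaced by an expensive simulation or measurement, which is precisely what motivates the surrogate measures discussed next.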

Then the research shall also focus on the study of surrogate measures, that is, the use of measures that can be easily and inexpensively gathered as a proxy for others that are more industrially relevant but expensive. In this regard, the tutors are working with a semiconductor manufacturer on using in-situ sensor values as a proxy for the prediction of operating frequency. The work could then proceed by tackling problems related to "dimensionality reduction", useful to limit the number of input data of the model, and "feature selection", essential when each single feature is the result of a costly measurement. At the same time, the research is likely to help the introduction of more advanced optimization techniques into everyday tasks.
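The surrogate/feature-selection idea above can be sketched in a few lines of NumPy. The data, sensor counts, and selection threshold below are invented for illustration and do not come from the industrial collaborations mentioned in the proposal; a cheap linear surrogate is fitted to an "expensive" target, and the features with negligible weight are discarded.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the industrial setting: a few cheap in-situ
# sensor readings (proxies) and an expensive target measurement
# (e.g., a maximum operating frequency).  All values are illustrative.
n_devices, n_sensors = 200, 5
X = rng.normal(size=(n_devices, n_sensors))
true_w = np.array([2.0, 0.0, -1.5, 0.0, 0.5])      # only 3 sensors matter
y = X @ true_w + 0.1 * rng.normal(size=n_devices)  # "expensive" measurement

# Surrogate model: ordinary least squares on the cheap readings.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Naive feature selection: keep only sensors whose fitted weight is
# non-negligible, so fewer costly measurements are needed in production.
selected = np.flatnonzero(np.abs(w) > 0.2)
print(selected)
```

In a real setting the threshold would be chosen by cross-validation, and the linear model replaced by whatever surrogate the data supports.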

From a practical standpoint, starting in the second year, the activity would continue by analyzing a current practical need, namely "predictive maintenance". A significant amount of data is currently collected by many industries, although in a rather disorganized way. The student would start by analyzing the practical problems of data collection, storage, and transmission, while, at the same time, practicing the principles of data profiling, classification, and regression (all topics currently considered part of "machine learning"). The analysis of sequences to predict the final event, or rather to identify a trigger, is an open research topic, with implications far beyond CAD. Unfortunately, unlike popular ML scenarios, the availability of data is a significant limitation: a situation where the amount of data available for training is insufficient is sometimes labeled "small data".

Expected target publications:

Top journals with impact factors

* ASOC -- Applied Soft Computing
* TEC -- IEEE Transactions on Evolutionary Computation
* TC -- IEEE Transactions on Computers

Top conferences

* ITC -- International Test Conference
* DATE -- Design, Automation and Test in Europe Conference
* GECCO -- Genetic and Evolutionary Computation Conference
* CEC/WCCI -- World Congress on Computational Intelligence
* PPSN - Parallel Problem Solving From Nature

Notes:

* The CAD Group has a long record of successful applications of intelligent systems in several different domains. For the specific activities, the list of possibly involved companies includes: SPEA, Infineon (through the Ph.D. student Niccolò Bellarmino), ST Microelectronics, and Comau (through the Ph.D. student Eliana Giovannitti)
* The tutors are collaborating with Infineon on the subjects listed in the proposal: two contracts have been signed, and a third extension is currently under discussion; a joint paper has been published at ITC, another has been submitted, and others are in preparation.
* The tutors are collaborating with SPEA under the umbrella contract "Colibri". Such a contract is likely to be renewed on precisely the topics listed in the proposal.

Required skills

Proficiency in Python (including a deep understanding of object-oriented principles and design patterns); Proficiency in using libraries such as NumPy and SciPy for data analysis and manipulation // Preferred: Knowledge of Electronic CAD

28

Security of Software Networks

Proposer

Cataldo Basile

Topics

Cybersecurity, Parallel and distributed systems, Quantum computing

Group website

https://www.dauin.polito.it/research/research_groups/torsec_security_group

Summary of the proposal

The massive progress in software network complexity, flexibility, and manageability has only marginally been used to increase the security of these networks: attacks may remain undiscovered for months, and they are mainly caused by human errors.
The PhD proposal has a high-level research objective: investigating and exploiting software networks' full potential to mitigate cybersecurity risks automatically and provide defensive tools with more intelligence and a higher level of automation.

Research objectives and methods

Nowadays, security defenders are always one or more steps behind the attackers. When vulnerabilities are found, patches follow only days later, and anti-virus signature updates come only after new malware has been discovered. Intrusion Prevention Systems provide simple reactions triggered by simplistic conditions, often considered ineffective by large companies. Moreover, companies face risks of misconfiguration whenever security policies or network layouts need an update. Statistics are clear: attacks are discovered with unacceptable delays, and in most cases they are caused by human errors. The solution is also clear: providing defensive tools with more intelligence and a higher level of automation.
 
This PhD proposal aims to use these features for security purposes, i.e., to develop AI-based systems able to perform policy refinement, configure the network and security controls starting from high-level security requirements, and carry out policy reaction to respond to incidents and mitigate risks. Coupling this understanding of the features of security controls and software networks will allow building more resilient information systems that discover and react to attacks faster and more effectively.
 
The initial phases of the PhD will be devoted to formalizing the framework models needed to reach the most ambitious research objectives. 
During the first year, the candidate will improve the model of security controls' capabilities and define the formal model of the software networks' reconfiguration abilities. The most relevant families of security controls will be analyzed, starting from filtering (up to layer seven) and channel protection. The candidate will contribute to a journal publication that extends an existing conference publication. 
The work on software network modelling will start with analysing the features of Kubernetes technology. It will also identify strategies to use pods and clusters to define policy enforcement units that merge security controls with complementary features for protecting network parts, which will be used for refinement purposes. The results of this task will be first submitted to a conference and then extended to a journal publication.
 
More attention will be devoted to the refinement and reaction models from the second year. The candidate will study the possibility of building refinement models that use advanced logic (forward and abductive reasoning) to represent decision-making processes. AI (Artificial Intelligence) and machine learning techniques will be investigated to learn from decisions overridden and manual corrections made by humans for fine-tuning security decisions. 
The candidate will also perform research towards a framework for abstractly representing reaction strategies to security events. Every strategy requires adaptations to be enforced in each context; the research will investigate how to characterize and implement this adaptation and what the proper level of abstraction for strategies is. The effectiveness of these models will be evaluated on relevant scenarios like corporate networks, ISPs, automotive, and Industrial Control Systems, also coming from two EC-funded European Projects. The candidate will be guided in evaluating and deciding on the best venues to publish the results of their research.
 
Moreover, to increase the impact of the research and cover existing gaps, the candidate will investigate how to standardize the information used to model the scenarios requiring reactions and the reaction and threat intelligence data with the proper level of detail.
 
One or two internship periods of 3-6 months at an external institution are expected, with the objective of acquiring competencies as specific needs emerge. Research collaborations are ongoing with EU academia and with leading companies in the EU.
We expect at least two publications at top-level cybersecurity conferences and symposia (e.g., ACM CCS, IEEE S&P) or top conferences about software networks (e.g., IEEE NetSoft).
 
The models of the security controls' and software networks' capabilities will be submitted to top-tier journals in the cybersecurity, networking, and modelling scope (e.g., IEEE/ACM Transactions on Networking, IEEE Transactions on Network and Service Management, IEEE Journal on Selected Areas in Communications, IEEE Transactions on Dependable and Secure Computing).
 
We also expect results for at least one journal article about the automatic enforcement and empirical assessment of software protections. In addition to the journals listed above, IEEE Transactions on Emerging Topics in Computing will be considered if the innovation of the results warrants it.

Required skills

The candidate needs a solid background in cybersecurity (risk management), defensive controls (e.g., firewall technologies and VPNs), monitoring controls (e.g., IDS/IPS and threat intelligence), and incident response. Moreover, they should also possess a background in software network technologies (SDN, NFV, Kubernetes) and cloud computing. Skills in formal modelling and logical systems are a plus.

29

Emerging Topics in Evolutionary Computation: Diversity Promotion and Graph-GP

Proposer

Giovanni Squillero

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://www.cad.polito.it/

Summary of the proposal

Soft Computing, including evolutionary computation (EC), is currently experiencing a unique moment. While fewer scientific papers focus solely on EC, traditional EC techniques are frequently utilized in practical activities under different labels. The objective of this research is to examine both the new representations that scholars are currently exploring and the old, yet still pressing, problems that practitioners are facing.

Research objectives and methods

Although the classical approach to representing solutions in EC involves bit strings and expression trees, far more complex encodings have recently been proposed. More specifically, graph-based representations have led to novel applications of EC in circuit design, cryptography, image analysis, and other fields.

At the same time, divergence of character, or, more precisely, the lack of it, is widely recognized as the single most impairing problem in the field of EC. While divergence of character is a cornerstone of natural evolution, in EC all candidate solutions eventually crowd the very same areas of the search space. Such a "lack of speciation" was already pointed out in the seminal work of Holland back in 1975. It is usually labeled with the oxymoron "premature convergence" to stress the tendency of an algorithm to converge toward a point where it was not supposed to converge in the first place. The research activity would tackle "diversity promotion", that is, either "increasing" or "preserving" diversity in an EC population, both from a practical and a theoretical point of view. It will also include the related problems of defining and measuring diversity.

The research project shall include an extensive experimental study of existing diversity preservation methods across various global optimization problems. Open-source, general-purpose EA toolkits, namely inspyred and DEAP, will also be used to study the influence of various methodologies and modifications on the population dynamics. Solutions that do not require the analysis of the internal structure of the individual (e.g., Cellular EAs, Deterministic Crowding, Hierarchical Fair Competition, Island Models, or Segregation) shall be considered. This study should allow the development of an effective, possibly new, methodology able to generalize and coalesce most of the cited techniques.
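As a concrete example of the kind of diversity preservation technique under study, classic fitness sharing can be sketched in a few lines. The genomes, radius (sigma), and distance metric below are illustrative and not tied to inspyred or DEAP: each individual's raw fitness is divided by a "niche count" so that crowded regions of the search space are penalized.

```python
import numpy as np

def shared_fitness(fitness, genomes, sigma=2.0, alpha=1.0):
    """Classic fitness sharing: divide each individual's raw fitness by
    its niche count, penalizing crowded regions of the search space.
    Euclidean distance over real-valued genomes is used for illustration."""
    d = np.linalg.norm(genomes[:, None, :] - genomes[None, :, :], axis=-1)
    sh = np.where(d < sigma, 1.0 - (d / sigma) ** alpha, 0.0)
    niche_count = sh.sum(axis=1)  # self-distance is 0, so each row includes 1 for self
    return fitness / niche_count

# Two individuals crowded near the origin and one isolated individual:
# the isolated one keeps its raw fitness, the crowded pair is penalized.
genomes = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0]])
fitness = np.array([1.0, 1.0, 1.0])
shared = shared_fitness(fitness, genomes)
print(shared)
```

Note that this method requires a meaningful distance between individuals, which is exactly what becomes problematic for the genotype-level methodologies planned for the second year.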

During the first year, the candidate will take a course in Artificial Intelligence and all the Ph.D. courses of the educational path on Data Science. Additionally, the candidate is required to improve their knowledge of Python.

Starting from the second year, the research activity shall include Turing-complete program generation. The candidate will work on an open-source Python project, currently under active development. The candidate will try to replicate the work of the first year on much more difficult genotype-level methodologies, such as Clearing, Diversifiers, Fitness Sharing, Restricted Tournament Selection, Sequential Niching, Standard Crowding, Tarpeian Method, and Two-level Diversity Selection.

At some point, probably toward the end of the second year, the new methodologies will be integrated into the Grammatical Evolution framework developed at the Machine Learning Lab of the University of Trieste. GE allows a sharp distinction between phenotype, genotype, and fitness, creating an unprecedented test bench (the research group is already collaborating with a group at UniTS on these topics; see "Multi-level diversity promotion strategies for Grammar-guided Genetic Programming", Applied Soft Computing, 2019).

A remarkable goal of this research would be to link phenotype-level methodologies to genotype measures.

Target Publications

Journals with impact factors
- ASOC - Applied Soft Computing
- ECJ - Evolutionary Computation Journal
- GPem - Genetic Programming and Evolvable Machines
- Informatics and Computer Science Intelligent Systems Applications
- IS - Information Sciences
- NC - Natural Computing
- TCIAIG - IEEE Transactions on Computational Intelligence and AI in Games
- TEC - IEEE Transactions on Evolutionary Computation

Top conferences

- ACM GECCO - Genetic and Evolutionary Computation Conference
- IEEE CEC/WCCI - World Congress on Computational Intelligence
- PPSN - Parallel Problem Solving From Nature

Notes:

The tutors regularly present tutorials on Diversity Preservation at top conferences in the field, such as GECCO, PPSN, and CEC. Additionally, they are involved in the organization of a workshop focused on graph-based representations for EAs. Moreover, the research group is in contact with industries that actively consider exploiting evolutionary machine learning for enhancing their biological models, for instance, KRD (Czech Republic), Teregroup (Italy), and BioVal Process (France).

The research group also has a long record of successful applications of evolutionary algorithms in several different domains. For instance, the ongoing collaboration with STMicroelectronics on the test and validation of programmable devices exploits evolutionary algorithms and would benefit from the research.

Required skills

Proficiency in Python (including a deep understanding of object-oriented principles and design patterns, and handling of parallelism); Preferred: Experience with metaheuristics, Experience with optimization algorithms

30

Advanced ICT solutions and AI-driven methodologies for Cultural Heritage resilience

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing, Parallel and distributed systems, Quantum computing

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research leverages cutting-edge technologies to preserve Cultural Heritage (e.g., monuments, historical sites, etc.) against natural disasters, climate change, and human-related threats. The interdisciplinary approach integrates ICT tools, Machine Learning, and Data Analytics to develop proactive strategies for the risk assessment, monitoring, and preservation of cultural assets, addressing challenges through innovative solutions for sustainable conservation and resilience.

Research objectives and methods

Recent crises and disasters have affected European citizens' lives, livelihoods, and environment in unforeseen and unprecedented ways. They have transformed our very understanding of them by reshaping hitherto unchallenged notions of the "local" and the "global" and putting into question well-rehearsed conceptual distinctions between "natural" and "man-made" disasters. Modern, high-performance ICT solutions need to be deployed in order to prevent and mitigate the effects of disasters and climate change events by enabling critical thinking and framing a holistic approach for a better understanding of catastrophic events.

The objective of this Ph.D. proposal consists of the design and development of ICT-driven solutions to develop proactive strategies for risk assessment, monitoring, and preservation of Cultural Heritage. The candidate will adopt a comprehensive interdisciplinary approach, seamlessly integrating modern techniques rooted in IoT, Machine/Deep Learning, and Big Data paradigms within the realm of cultural heritage resilience. This approach transcends purely technical facets, encompassing social and cultural dimensions to provide a holistic understanding and effective solutions.

During the three years of the Ph.D., the research activity will be divided into five phases:
- Survey existing literature on modern AI-driven ICT solutions and applications in software engineering, and analyze methodologies and challenges of Cultural Heritage resilience.
- Design and develop a data-driven digital ecosystem, i.e., a distributed IoT platform, for the collection and harmonization of heterogeneous data from the real world to enable on-top advanced visualization and analysis services (e.g., Digital Twins). A multidisciplinary approach ranging from IoT paradigms to the application of Machine/Deep Learning methodologies for Big Data analysis is required in order to allow the development of proactive strategies for risk assessment, monitoring, and preservation of Cultural Heritage.
- Develop algorithms and strategies for context-aware Cultural Heritage resilience by implementing prototype systems for evaluation and refinement.
- Design and implement continuous improvement and fine-tuning strategies for the development of increasingly effective and high-performing prevention strategies.
- Evaluate the effectiveness of the data-driven digital ecosystem and developed strategies through user studies and real-world projects.

Possible international scientific journals and conferences:
- IEEE Transactions on Computational Social Systems
- IEEE Transactions on Industrial Informatics
- Journal on Computing and Cultural Heritage
- Journal of Cultural Heritage
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- IEEE CoSt internat. Conf.
- IEEE SKIMA internat. Conf.

Required skills

Programming and Object-Oriented Programming (preferable in Python).
Knowledge of web application programming.
Knowledge of IoT paradigms.
Knowledge of Machine Learning and Deep Learning.
Knowledge of frameworks to develop models based on Machine Learning and Deep Learning Models

31

Monitoring systems and techniques for precision agriculture

Proposer

Renato Ferrero

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

 

Summary of the proposal

The most challenging current demand on the agricultural sector is the production of sufficient and safe food for a growing population without over-exploiting natural resources. This challenge is set in a difficult context of unstable climate conditions, competition for land, water, and energy, and an increasingly urbanized world. The research activity aims to increase the competitiveness of the agri-food system in terms of safety, quality, sustainability, and added value of food products.

Research objectives and methods

The research activity of the PhD candidate will investigate devices and techniques for monitoring agricultural produce in a holistic vision, with the aim of limiting environmental pollution, preventing the misuse of pesticides and fertilizers, reducing water and energy demand, and increasing net profit.
 
A first activity concerns the development of a low-cost proximity monitoring system. Off-the-shelf sensors will be selected to measure the most meaningful parameters, such as the temperature and humidity of both air and soil, light conditions, the pH of the soil, and the concentration of NPK (nitrogen, phosphorus, and potassium) in the soil. The adoption of low-cost sensors will make a pervasive distribution in the monitored environments possible. All the gathered data will be associated with GPS coordinates and the date and time of the measurement. The measurements will be repeated several times at different points in the crop; at the end of each sampling, the measurements will be synchronized to a server to keep track of them over time. The integration of sensing, computing, and communication functionalities within small-size devices will be a key element for increasing the pervasiveness and robustness of the network. Possibly, the integration of the sensor network with drone-based systems will be investigated.
A closely related subsequent activity regards the analysis of the data collected by the sensor network, with several goals, as detailed in the following. Different calibration strategies will be evaluated: reference values provided by other sensors will be used to determine the most effective calibration strategies and when calibration needs to be repeated in order to ensure precise measurements. The correlation of the collected data with operating and environmental conditions (e.g., measurement range, microclimatic characteristics) will be analyzed in order to assess the variability of the measurements, both in time and space. In particular, understanding spatial variability may lead to the development of models for data spatialization. Finally, the benefits of sensor redundancy, in terms of data availability, reliability, network performance, and maintainability, will be investigated.
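The calibration step above can be illustrated with a minimal sketch: a low-cost sensor's raw readings are mapped to reference units by fitting gain and offset with least squares against a co-located reference instrument. All numbers below are invented for illustration.

```python
import numpy as np

# Illustrative linear calibration of a low-cost sensor against a
# reference instrument: raw readings are mapped to reference units by
# fitting a gain and an offset with least squares (values are made up).
raw = np.array([10.0, 20.0, 30.0, 40.0])   # low-cost sensor output
ref = np.array([12.1, 22.0, 31.9, 42.0])   # co-located reference values

gain, offset = np.polyfit(raw, ref, 1)     # degree-1 fit: ref ~ gain*raw + offset
calibrated = gain * raw + offset
residual = np.max(np.abs(calibrated - ref))
print(gain, offset, residual)
```

Repeating such fits over time, and tracking the drift of gain, offset, and residual, is one simple way to decide when recalibration is needed.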
 
A complementary research activity will focus on optical remote sensing. Non-destructive analysis techniques based on UV-Vis-NIR spectroscopy will be adapted in order to allow continuous monitoring of many critical aspects of the production. In particular, new procedures will be developed to correlate the absorption of light radiation, measured with spectroscopic techniques, with the chemical and physical properties of soil, crops, and horticultural produce. Computer graphics techniques will be studied to develop new protocols for the calculation of vegetation and soil indices (e.g., NDVI, GNDVI, SAVI, RE). Images will be taken by cameras at different wavelengths, ranging from 1 to 14 microns. Algorithms for pattern analysis and recognition will be developed for the automatic identification of specific parts of the plant, such as leaves or the stem, and the detailed analysis of its state of health, with the goal of correlating the images of leaves with growth and the onset of specific diseases.
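Of the indices listed above, NDVI is the simplest to compute: a per-pixel normalized difference between the near-infrared and red bands. The sketch below uses tiny invented reflectance patches purely to show the computation.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index from NIR and red bands.
    Values near +1 indicate dense, healthy vegetation; values near 0 or
    below indicate bare soil, water, or stressed plants."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

# Toy 2x2 reflectance patches (illustrative values on a 0-255 scale):
# the top row mimics vegetated pixels, the bottom row bare soil.
nir = np.array([[200, 180], [60, 50]])
red = np.array([[40, 60], [55, 48]])
print(np.round(ndvi(nir, red), 2))
```

GNDVI and SAVI follow the same pattern with a different band or an added soil-adjustment factor.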
 
The PhD research activities can be grouped into three consecutive phases, each one roughly corresponding to one year in the PhD career. Initially, the PhD candidate will improve his/her background by attending PhD courses and surveying relevant literature. After this initial training, the student is expected to select and evaluate the most promising solutions for monitoring agricultural produce. The second phase regards experimental activities on the field aimed at the development of monitoring systems and techniques, such as the integration and deployment of the sensor network, the evaluation of effective calibration strategies, the acquisition of multispectral images, the computation of vegetation and soil indices, and the integration with drone-based systems. Finally, the data collected will be analyzed during the third phase with different goals: assessment of the measurement variability according to the operating conditions (e.g., measurement range, microclimatic characteristics, etc.), influence of sensor redundancy on the network performance, modeling the spatial distribution of data, relationship between sensor measurements and vegetation indices, etc.
 
The research will be carried out as part of the activities of the National Research Centre for Agricultural Technologies (Agritech).
 
Some expected target publications are:
-       IEEE Transactions on AgriFood Electronics
-       ACM Transactions on Sensor Networks
-       IEEE Transactions on Image Processing
-       Information Processing in Agriculture (Elsevier)
-       Computers and Electronics in Agriculture (Elsevier)
 

Required skills

As the research activity regards the design, development, and evaluation of digital technologies for next-generation agriculture in a holistic vision, the PhD candidate is required to possess multidisciplinary skills: e.g., distributed computing, embedded systems, computer networks, security, computer graphics, programming, database management.

32

Designing heterogeneous digital/neuromorphic fog computing systems and development framework 

Proposer

Gianvito Urgese

Topics

Parallel and distributed systems, Quantum computing, Life sciences, Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

The candidate will be involved in the development of:
- A Heterogeneous Prototyping Platform (HPP) for Spiking Neural Network (SNN) simulations and AI applications on digital/neuromorphic systems.
- A framework for end-to-end engineering of SNN simulations on neuromorphic devices.
- A SW library optimizing SNNs on RISC-V-based edge devices.
The PhD aims to enhance tools for developing neuromorphic solutions on fog computing systems, advancing their adoption in IoT, bioinformatics, and neuroscience domains.

Research objectives and methods

Research objectives
Neuromorphic HW architectures, originally designed for brain simulations, have garnered interest in various fields, including IoT edge devices, high-performance computing, bioinformatics, industry, and robotics. These platforms offer superior scalability compared to traditional multi-core architectures and excel at problems requiring massive parallelism, for which they are inherently optimized. Additionally, the scientific community recognizes their suitability for low-power and adaptive applications that demand real-time data analysis.
The objectives of the PhD plan encompass several key aspects:
- Develop the necessary knowledge to analyze available data from product documentation, extracting experimental features from complex components and systems.
- Evaluate the potential of Spiking Neural Networks (SNNs) efficiently simulated on neuromorphic platforms when customized at the abstraction level of a flow graph, enabling the implementation of general-purpose algorithms.
- Contribute to the design and development of a Heterogeneous Prototyping Platform (HPP) and a framework for the development of neuromorphic solutions, covering all engineering phases from specification definition to HW procurement and the installation of server nodes, neuromorphic HW, and market-available sensors.
- Propose a general approach for generating simplified neuromorphic models that implement basic kernels, enabling users to directly apply them in their algorithms. The level of abstraction of these models will depend on the availability of SW libraries supporting the target neuromorphic HW.
- Utilize the HPP to design proof-of-concept applications by combining a set of neuromorphic models, aiming to provide outputs with acceptable error rates compared to versions running on standard systems. These applications should also reduce execution time and power consumption.
- Contribute to the design of a SW library that optimizes the execution of SNNs on RISC-V CPUs used in edge computing devices.
The research activities will primarily focus on implementing algorithms in three main application areas:
- Simulations of models developed by the EBRAINS-Italy neuroscience community.
- Real-time data analysis from IoT and industrial applications.
- Analysis and pattern matching of neuroscience and bioinformatics data streams.
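The SNNs discussed throughout are built from spiking neuron models; a minimal leaky integrate-and-fire (LIF) neuron, the standard textbook unit, can be simulated in a few lines. This generic sketch is not tied to any of the neuromorphic HW or frameworks named above, and all parameter values are illustrative.

```python
import numpy as np

def lif_spikes(input_current, v_th=1.0, tau=10.0, dt=1.0, v_reset=0.0):
    """Minimal leaky integrate-and-fire neuron simulated with a fixed
    time step: the membrane potential leaks toward zero, integrates the
    input current, and emits a spike (then resets) on crossing v_th."""
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt / tau * (-v + i_t)   # leaky integration step
        if v >= v_th:
            spikes.append(1)
            v = v_reset              # reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant suprathreshold current makes the neuron spike periodically.
out = lif_spikes(np.full(50, 2.0))
print(out.sum())
```

On neuromorphic HW this update is performed natively and in parallel for very large neuron populations, which is the efficiency argument behind the platforms above.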
 
Outline of the research work plan
1st year. The candidate will extensively study cutting-edge neuromorphic frameworks and their application in deploying simulations on various neuromorphic HW technologies. He/She will contribute to the development of a framework that enables the semi-automatic generation and connection of neuromorphic models, streamlining the modeling process and promoting the exploration of new computational paradigms. Additionally, in the first year, the candidate will participate in designing the Neuromorphic Computing component of the Heterogeneous Prototyping Platform (HPP-NC). He/She will also contribute to the design of a software library that optimizes SNN execution on standard CPUs, specifically RISC-V-based edge computing devices.
2nd  year. The candidate will create an integrated methodological approach for modeling applications and systems. He/She will utilize experiences from the first year of research to conduct a multi-scenario analysis. The candidate will establish the foundational structure of a user-friendly neuromorphic computing framework, providing access and validation for the HPP-NC prototype. Additionally, he/she will define two Modelling, Simulation, and Analysis (MSA) use cases tailored to the needs of Neuroscientists, Bioinformaticians, and Data scientists/engineers.
3rd  year. The candidate will implement the proposed approach in diverse industrial and IoT use cases, enabling its application across different domains. He/She will analyze investments in neuromorphic compilers for upcoming neuromorphic HW, alongside general-purpose CPUs. Moreover, the candidate will assist in integrating the HPP-NC into the EBRAINS service ecosystem.
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead fPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the AIoT and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
In the following, the possible conference and journal targets:
- IEEE/ACM international conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
- IEEE/ACM journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP) and MDPI journals (e.g., Electronics).

Required skills

MS degree in computer engineering, electronics engineering or physics of complex systems. 
Excellent skills in computer programming, computer architecture, embedded systems, and IoT applications. 
Technical background in deep learning, AI, edge computing, electronic design, modelling, simulation and optimization.

33

Cloud at the edge: creating a seamless computing platform with opportunistic datacenters

Proposer

Fulvio Giovanni Ottavio Risso

Topics

Computer architectures and Computer aided design, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

https://netgroup.polito.it 

Project website: https://liqo.io

Summary of the proposal

The idea is to aggregate the huge number of traditional computing/storage devices available in modern environments (such as desktop/laptop computers, embedded devices, etc.) into an opportunistic datacenter, hence transforming all the current devices into datacenter nodes.
This proposal aims at tackling the most relevant problems towards the above scenario, such as defining a set of orchestration algorithms, as well as a proof-of-concept showing the above system in action.

Research objectives and methods

Cloud-native technologies are increasingly deployed at the edge of the network, usually through tiny datacenters made by a few servers that maintain the main characteristics (powerful CPUs, high-speed network) of the well-known cloud datacenters. However, most of current domestic environments and enterprises host a huge number of traditional computing/storage devices, such as desktop/laptop computers, embedded devices, and more, which run mostly underutilized.
This project proposes to aggregate the above available hardware into an "opportunistic" datacenter, replacing the current micro-datacenters at the edge of the network, with consequent potential savings in energy and CAPEX. This would transform all the current computing hosts, including their operating system software, into datacenter nodes.
The current Ph.D. proposal aims at investigating the problems that may arise in the above scenario, such as defining a set of algorithms for orchestrating jobs on an "opportunistic" datacenter, as well as building a proof of concept showing the above system in action.
 
The objectives of the present research are the following:
- Evaluate the potential economic impact (in terms of hardware expenditure, i.e., Capital Expenditures - CAPEX, and energy savings, i.e., Operating Expenses - OPEX) of such a scenario, in order to validate its economic sustainability and its impact in terms of energy consumption.
- Extend existing operating systems (e.g., Linux) with lightweight distributed processing/storage capabilities, in order to allow current devices to host "foreign" applications (when resources are available), or to borrow resources from other machines and delegate the execution of some of their tasks to the remote device.
- Define the algorithms for job orchestration on the "opportunistic" datacenter, which may differ considerably from traditional orchestration algorithms (limited network bandwidth between nodes; highly heterogeneous node capabilities in terms of CPU/RAM/etc.; reliability considerations; the necessity to leave free resources to the desktop owner; etc.).
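The orchestration objective above can be illustrated with a minimal sketch: a hypothetical greedy scheduler that prefers reliable, well-connected nodes while leaving the owner's reserved share untouched. All node/job attributes, names, and scoring weights below are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cpu: float      # cores currently unused
    free_ram: float      # GiB currently unused
    bandwidth: float     # Mbit/s towards the cluster
    reliability: float   # empirical availability in [0, 1]
    owner_reserve: float # fraction of CPU kept free for the desktop owner

@dataclass
class Job:
    name: str
    cpu: float
    ram: float

def score(node: Node, job: Job) -> float:
    """Higher is better: prefer reliable, well-connected nodes
    that still leave the owner's reserved share untouched."""
    usable_cpu = node.free_cpu * (1.0 - node.owner_reserve)
    if job.cpu > usable_cpu or job.ram > node.free_ram:
        return float("-inf")  # infeasible placement
    # weight reliability and bandwidth; reward loose fits (slack)
    slack = (usable_cpu - job.cpu) + (node.free_ram - job.ram)
    return node.reliability * 2.0 + node.bandwidth / 1000.0 + slack * 0.1

def place(jobs, nodes):
    """Greedily assign each job to the highest-scoring feasible node,
    updating the node's free resources after each assignment."""
    placement = {}
    for job in jobs:
        best = max(nodes, key=lambda n: score(n, job))
        if score(best, job) == float("-inf"):
            placement[job.name] = None  # no feasible node
            continue
        placement[job.name] = best.name
        best.free_cpu -= job.cpu
        best.free_ram -= job.ram
    return placement
```

A real orchestrator would additionally react to node churn and owner activity at runtime; the sketch only shows the static placement decision.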
 
The research activity is part of the Horizon Europe FLUIDOS project (https://www.fluidos.eu/) and it is related to current active collaborations with Aruba S.p.A. (https://www.aruba.it/) and Tiesse (http://www.tiesse.com/).
 
The research activity will be organized in three phases:
- Phase 1 (Y1): Economic and energy impact of opportunistic datacenters. This would include real-world measurements in different environment conditions (e.g., University lab; domestic environment; factory) of computing characteristics and energy consumption, and the creation of a model to assess potential savings (economic/energy).
- Phase 2 (Y2): Job orchestration on opportunistic datacenters. This would include real-world measurements of the features required by distributed orchestration algorithms (CPU/memory/storage consumption; device availability; network characteristics), and the definition of a scheduling model that achieves the foreseen objectives, evaluated with simulations.
- Phase 3 (Y3): Experimenting with opportunistic datacenters. This would include the creation of a proof of concept of the defined orchestration algorithm, executed on real platforms, with real-world measurements of the behavior of the above algorithm in a specific use case (e.g., University computing lab, factory with many data acquisition devices, etc.).
 
Expected target conferences are the following:
Top conferences:
- USENIX Symposium on Operating Systems Design and Implementation (OSDI)
- USENIX Symposium on Networked Systems Design and Implementation (NSDI)
- International Conference on Computer Communications (INFOCOM)
- ACM European Conference on Computer Systems (EuroSys)
- ACM Symposium on Principles of Distributed Computing (PODC)
- ACM Symposium on Operating Systems Principles (SOSP)
 
Journals:
- IEEE/ACM Transactions on Networking
- IEEE Transactions on Computers
- ACM Transactions on Computer Systems (TOCS)
- IEEE Transactions on Cloud Computing
 
Magazines:
- IEEE Computer

Required skills

The ideal candidate has good knowledge and experience in computing architectures, cloud computing and networking. Availability for spending periods abroad would be preferred for a more profitable investigation of the research topic.

34

AI-driven cybersecurity assessment for automotive

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI, Cybersecurity

Group website

https://www.dauin.polito.it/en/research/research_groups/dbdm_database_and_data_mining_group https://www.dauin.polito.it/research/research_groups/torsec_security_group https://www.drivesec.com/

Summary of the proposal

This PhD proposal aims to investigate how to leverage Generative AI techniques for assessing the cybersecurity posture of vehicles and automotive infrastructures and for evaluating compliance with existing standards (e.g., ISO 21434). It will propose innovative LLM-based approaches to retrieve, recommend, and generate penetration tests and vulnerability-related information, and will study innovative methodologies based on Multimodal Learning and Retrieval-Augmented Generation.

Research objectives and methods

Research objectives 
Assessing the resilience of vehicles and their components has become crucial; it relies on tests that assess the security of a System Under Test. Vulnerability assessment (VA) and penetration testing (PT) are two primary complementary techniques that serve this purpose. VA is managed with automatic tools, but the existing ones rarely work in the automotive field. PT relies on human teams, which are costly and difficult to hire. Hence, the aim of this research is to investigate how advancements in AI techniques can help automate threat assessment and risk evaluation in the automotive field.
The student will investigate innovative methods for the automatic processing and interpretation of the data produced by the analysis tools in their context, understand the implications from the security point of view, and use them to build a risk analysis model. Moreover, the student will explore the potential of Generative AI techniques, in combination with Search Engines, Question Answering models, and Multimodal Learning architectures to automate the process of retrieval, recommendation, and generation of penetration tests. 
 
Outline
The student will become familiar with the field of cybersecurity for automotive, its peculiarities, and its normative framework. Their main research goal is to leverage Generative AI to model VA/PT operations, generate new tests for assessing the verification objectives, and adapt families of tests to work outside their original context. To this end, the algorithms, models, and techniques considered in the research activities will include (but are not limited to):
- Large Language Models (e.g., GPT [1], Llama 2 [2], Llava [3]), to leverage the capabilities of transformer-based generative models to interpret end-users' questions posed in natural language, generate text and code that meet in-context requirements, and perform multi-hop reasoning based on Chain-of-Thought (CoT) Prompting;
- Multimodal Architectures (e.g., CLIP [4]), to effectively handle input data in different modalities (e.g., images, tables, speech);
- Search engines (e.g., ElasticSearch [5]), to efficiently store, index, and retrieve data about vulnerabilities and penetration tests;
- Retrieval-Augmented Generation (e.g., Llama Index [6]), to efficiently address question answering tasks on proprietary data by leveraging LLM capabilities. 
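The retrieve-then-generate pattern underlying the last two points can be sketched without any of the named tools: the toy below uses a bag-of-words cosine similarity as a stand-in for a real search engine such as ElasticSearch or a dense retriever. The `retrieve` and `build_prompt` helpers are hypothetical names introduced here for illustration only.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real RAG system would use
    a dense encoder plus a search engine (e.g., ElasticSearch)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: dict, k: int = 2) -> list:
    """Return the ids of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(corpus[d])), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: dict) -> str:
    """Assemble an LLM prompt grounded in the retrieved documents,
    so the model answers from vulnerability data rather than memory."""
    context = "\n".join(corpus[d] for d in retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In the actual research, the corpus would hold vulnerability records and penetration-test descriptions, and the prompt would be passed to an LLM.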
 
Industrial collaborations
This research will be made in collaboration with Drivesec s.r.l., which will provide the necessary automotive background, the equipment needed, and the data set for the testing and validation of the developed methods.
 
Open resources
Beyond proprietary data and industrial case studies, the PhD activities will also consider open-source data repositories, models, and projects, e.g.,
- MetaSploit (https://www.metasploit.com/)
- MITRE (https://cve.mitre.org/)
- PentestGPT (https://github.com/GreyDGL/PentestGPT)
- HuggingFace (https://huggingface.co/models)
 
List of possible publication venues
- Conferences: IEEE CSR, ECML PKDD, ACM CIKM, KDD, IEEE ICDE, IEEE ICDM
- Journals: IEEE TKDE, IEEE TAI, ACM TIST, IEEE TIIS, IEEE/ACM ToN, Elsevier Information Sciences, Elsevier Computers in Industry
 
References
[1] OpenAI: GPT-4 Technical Report. CoRR abs/2303.08774 (2023)
[2] https://ai.meta.com/llama/
[3] Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang: LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models. CoRR abs/2312.02949 (2023)
[4] https://openai.com/research/clip
[5] https://www.elastic.co/
[6] https://www.llamaindex.ai/

Required skills

The PhD candidate is expected to:
- have the ability to critically analyze complex systems, model them, and identify weaknesses;
- be proficient in Python programming;
- know cybersecurity fundamentals;
- have a solid background in machine learning and deep learning;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.
We seek motivated students who are willing to work at the intersection between academia and industry.

35

Applications of Large Language Models in time-evolving scenarios

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it/ https://smartdata.polito.it

Summary of the proposal

Large Language Models are Generative AI models pretrained on a huge mass of data. Since training examples are collected up to a fixed point in time, LLMs require specific interventions to deal with time-evolving scenarios. Furthermore, they are not designed to process timestamped data such as time series and temporal sequences. The PhD proposal aims to propose new LLM-based approaches to analyze textual and multimedia sources in time-evolving scenarios and to leverage LLMs in timestamped data mining.

Research objectives and methods

Context
Large Language Models (LLMs) have emerged as disruptive Artificial Intelligence technologies supporting a variety of Natural Language Generation tasks, including question answering, text summarization, and text paraphrasing [1,2]. Recently proposed LLMs such as LLaVA [3] support visual content as part of the LLM prompts beyond the raw text. LLMs are known to potentially suffer from biases due to the inherent properties of the training examples. To overcome these "harms", various strategies such as in-context learning, probing, and fine-tuning have been proposed.

Research objectives
The PhD proposal has the twofold aim to address the limitations of LLMs in coping with time-evolving scenarios and timestamped data:
1)    Apply LLMs in time-evolving scenarios: Several textual and visual data sources are, by design, time-evolving. Capturing their temporal evolution is relevant to address several tasks such as intent recognition [5] and summarization [4]. The research activities will investigate the design and development of innovative LLM-based approaches to solve time-evolving tasks.
2)    Dealing with timestamped data: Classical LLMs are designed to handle textual data. Recent Multimodal LLMs handle visual content as well. Conversely, only a limited body of work has focused on coping with timestamped data such as time series [6]. The research activities will study new LLM-based solutions to handle timestamped data.

Tentative work plan
1) Application of LLMs in time-evolving scenarios:
- Analysis of the state-of-the-art of LLMs and Multimodal LLMs;
- Identification of a selection of time-evolving NLP and Multimodal Learning tasks and related benchmarks. Exploration of state-of-the-art models' performance;
- Proposal of new LLM-based approaches to solve the selected tasks.
2) LLMs and timestamped data:
- Review of existing LLM-based approaches to time series and temporal sequences. Classification of their strengths and weaknesses;
- Identification of a selection of tasks related to time series data (e.g., forecasting, segmentation, classification, anomaly detection);
- Design and development of innovative LLM-based approaches to solve the selected tasks.
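As a toy illustration of the timestamped-data direction, a numeric series can be mapped onto a small discrete vocabulary before being placed in an LLM prompt. The uniform binning and token names below are illustrative assumptions introduced here, not a method taken from the cited literature.

```python
def serialize_series(values, n_bins=10):
    """Map a numeric series onto a small discrete vocabulary so it can
    be fed to a text-only LLM as a sequence of tokens. Bin boundaries
    are uniform between the series min and max (a simplification)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for flat series
    tokens = []
    for v in values:
        b = min(int((v - lo) / width), n_bins - 1)  # clamp max into last bin
        tokens.append(f"t{b}")
    return " ".join(tokens)

def forecast_prompt(values, horizon=3):
    """Wrap the serialized history into a forecasting prompt."""
    return (f"Series: {serialize_series(values)}\n"
            f"Continue the series for {horizon} more tokens:")
```

A production approach would instead learn the tokenization (or use patch embeddings, as in TimeGPT-style models); the sketch only shows why serialization is needed at all.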
 
Industrial collaborations
These research activities will be partly carried out in collaboration with the Amazon Research Center in Turin.
 
List of possible publication venues
- Conferences: ACM Multimedia, KDD, ACL, COLING, IEEE ICDM, ECML PKDD, ACM CIKM, INTERSPEECH, IEEE ICASSP
- Journals: IEEE TKDE, ACM TKDD, IEEE TAI, ACM TIST, IEEE/ACM TASLP
 
References
[1] OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023.
[2] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023.
[3] Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang: LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models. CoRR abs/2312.02949 (2023)
[4] Patcharapruek Watanangura, Sukit Vanichrudee, On Minteer, Theeranat Sringamdee, Nattapong Thanngam, Thitirat Siriborvornratanakul: A Comparative Survey of Text Summarization Techniques. SN Comput. Sci. 5(1): 47 (2024)
[5] Henry Weld, Xiaoqi Huang, Siqu Long, Josiah Poon, Soyeon Caren Han: A Survey of Joint Intent Detection and Slot Filling Models in Natural Language Understanding. ACM Comput. Surv. 55(8): 156:1-156:38 (2023)
[6] Azul Garza, Max Mergenthaler Canseco: TimeGPT-1. CoRR abs/2310.03589 (2023)

Required skills

The PhD candidate is expected to:
- have the ability to critically analyze complex systems, model them, and identify weaknesses;
- be proficient in Python programming;
- know data science fundamentals;
- have a solid background in machine learning and deep learning;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.

36

Building Adaptive Embodied Agents in XR to Enhance Educational Activities

Proposer

Andrea Bottino

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.polito.it/cgvg

Summary of the proposal

This research explores the integration of Memory-Augmented Neural Networks (MANNs) in Embodied Conversational Agents (ECAs) to create interactive, personalized, and engaging learning experiences in XR. Such ECAs can adapt to the characteristics and progression of individual learners or learner groups, personalizing education for more effective learning outcomes in both individual and collaborative settings. The challenge is to develop complex yet accessible ECAs for different educational environments.

Research objectives and methods

In the evolving landscape of AI and educational technology, the integration of MANNs and gamification in ECAs offers new opportunities to push the field of AI agents and create highly interactive, adaptive, and engaging learning experiences in XR. The use of MANNs allows ECAs to store and recall previous interactions, addressing several limitations of current conversational agents and enabling unprecedented levels of personalized content and engagement. Ultimately, these ECAs can provide an enhanced learning experience that is both dynamic and responsive to learners' individual needs. In collaborative learning scenarios, these ECAs should be designed to act not just as facilitators but as active participants, encouraging group interaction and supporting the overall learning process. The integration of these technologies offers the potential to explore new educational methodologies that align with the evolving digital competencies of today's learners.
 
RESEARCH OBJECTIVES:

1. Enhancing ECAs with MANNs:
- Develop ECAs that integrate with MANNs to provide a personalized learning experience by remembering and leveraging the learner's individual interactions and history.
- Explore how these memory functions can be optimized to adapt to different learning styles and preferences.
- Explore the possibilities of integrating the ECAs' digital memory with emotional models to deliver lifelike interactions between agents and between the user and the agents.
- In the specific context of XR-based learning, address the main challenges of MANNs, such as the limited storage capacity of certain types of networks, the complexity associated with managing external memory structures and their computational overhead, the problems associated with memory recall processes, and the efficient use of memory to store and retrieve information over extended periods of time.

2. Facilitate collaborative learning through ECAs:
- Develop ECAs that can dynamically participate in collaborative learning environments and contribute to and facilitate group-based educational activities.
- Investigate the effectiveness of ECAs in promoting group dynamics and enhancing the collaborative learning experience.
 
3. Challenges in development and implementation:
- Overcome the technological challenges associated with developing complex ECAs that integrate advanced MANNs and gamification features.
- Ensure the accessibility and effectiveness of these ECAs in a wide range of educational environments, including those with limited technological resources.
- Propose novel methodologies and techniques for MANN creation, for example exploiting XR environments, generative AI, and gamification concepts to support data collection and/or model training and instruction.

4. Assessment:
- Evaluate the impact of MANN-enhanced ECAs on the overall learning experience in XR environments.
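The external-memory idea behind objective 1 can be sketched as a simple key-value store addressed by cosine similarity: interactions are written with an embedding key and recalled by similarity to the current context. This class and its fixed addressing rule are illustrative assumptions; real MANNs learn differentiable read/write heads end-to-end.

```python
import numpy as np

class EpisodicMemory:
    """Minimal external key-value memory for an ECA: store
    (embedding, text) pairs, retrieve past interactions by
    cosine similarity to the current context embedding."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim))
        self.texts = []

    def write(self, key: np.ndarray, text: str) -> None:
        key = key / np.linalg.norm(key)          # store unit-norm keys
        self.keys = np.vstack([self.keys, key])
        self.texts.append(text)

    def read(self, query: np.ndarray, k: int = 1) -> list:
        query = query / np.linalg.norm(query)
        sims = self.keys @ query                  # cosine similarities
        top = np.argsort(sims)[::-1][:k]          # k most similar entries
        return [self.texts[i] for i in top]
```

In the project, the recalled entries would condition the ECA's next response, giving it persistence across sessions; capacity limits and forgetting policies (a stated challenge above) are deliberately omitted here.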

 
The multidisciplinary nature of this project includes expertise from the fields of AI, neuroscience, psychology, education and game design. A key advantage lies in the existing partnership of the proposer's research group with the Department of Neuroscience of the Faculty of Psychology of the University of Turin, which can provide invaluable insights and contributions, especially in the field of cognitive processes and neural mechanisms, enriching the depth and applicability of the project in the field of educational technologies.
 
WORKPLAN
Year 1: Foundation and State-of-the-Art Review
- Q1-Q2: Conduct a comprehensive literature review on the current state of AI, Memory-Augmented Neural Networks (MANNs), Embodied Conversational Agents (ECAs), neuroscience, learning and education. Identify gaps in the current research and develop a detailed research proposal addressing these gaps.
- Q3-Q4: Begin preliminary development of the ECA framework, focusing on basic integration of neural memory.
Year 2: Development and Initial Testing
- Q1-Q2: Develop advanced features for the ECA, incorporating MANNs.
- Q3-Q4: Test the framework with a user panel and refine it according to user feedback.
Year 3: Implementation, Evaluation, and Thesis Writing
- Implement the ECA in real-world individual and collaborative educational settings. Collect user data on its effectiveness, user engagement, and learning outcomes.
 
PUBLICATION VENUES
Journals: IEEE Trans. on Neural Networks and Learning Systems, IEEE Trans. on Learning Technologies, IEEE Trans. on Visualization and Computer Graphics, Neurocomputing, International Journal of Neural Systems
Conferences: IJCAI, NeurIPS, ICONIP, AAAI, ICML, ICRA, and other conferences about the project topics

COLLABORATIONS
The proposer's research group is collaborating with the Neuroscience Department of the Faculty of Psychology of the University of Turin, which will be involved in the project as domain expert and will help in providing insights about neural memory models, develop use cases, design the approach and support the assessment phase.

Required skills

The ideal candidate for this PhD project should possess the following skills and characteristics:
- Expertise in Artificial Intelligence and Machine Learning
- Proficiency in programming languages such as Python and experience in handling and analyzing large data sets
- Familiarity with XR technologies
- Good research and analytical skills
- Excellent communication and collaboration skills
- Publication and scientific writing skills
- Adaptability and problem-solving skills

37

Real-Time Generative AI for Enhanced Extended Reality

Proposer

Andrea Bottino

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.polito.it/cgvg

Summary of the proposal

The integration of generative AI (GenAI) in extended reality (XR) offers transformative potential for the creation of dynamic and immersive experiences in many fields. The project aims to develop optimized GenAI models for XR, with a focus on algorithms that efficiently generate realistic content within the computational limitations of XR hardware.

Research objectives and methods

In the rapidly evolving field of extended reality (XR), the integration of GenAI offers transformative opportunities. GenAI is at the forefront of creating realistic, dynamic and immersive XR experiences. Its ability to automatically generate complex data such as geometries, textures, animations and even emotional voice modulations has a significant impact on various sectors, including education, entertainment and professional training. However, the practical implementation of these advanced technologies in XR faces critical challenges.
The main challenge is to balance the generation of high-quality content with the real-time processing requirements of XR environments. XR devices are known for their limited processing power and require algorithms that are efficient enough to operate within these constraints. This requirement becomes even more critical considering that high-resolution data and sophisticated animations are required to ensure a truly immersive experience.
In addition, the real-time generation of detailed and diverse content, from lifelike avatar animations to context-sensitive geometries or textures, poses significant problems in terms of computational complexity. To overcome these hurdles, the development of lightweight yet powerful GenAI models is crucial. Such models must strike an appropriate balance between execution speed and output quality to ensure that the immersive experience is not compromised.
 
RESEARCH OBJECTIVES
1. Develop efficient GenAI algorithms that can operate in real time in XR environments. These models should efficiently generate high-quality data tailored to the computational constraints of XR devices. By finding the right balance between computational efficiency and content quality, they will enable more complex and realistic XR applications, increasing user engagement and expanding the range of possible XR experiences, from games to professional training simulations.
2. Innovate in the creation of lifelike avatars and environment simulations, with a focus on realistic body and facial animations and the generation of contextual data. The focus is on generating these elements in real time, adapting to user interactions and changes in the XR space. Realistic avatars and environments are key to immersive XR experiences; by improving these aspects, the project aims to increase the sense of presence and immersion for users.
3. Develop systems that can modulate voice and emotional responses in real time based on user interactions. This includes developing AI models capable of understanding and responding to users' emotions to enhance the communicative and interactive aspects of XR. Emotional responsiveness in AI will lead to more natural and intuitive user experiences, which is particularly important for mental health, education, and cultural heritage applications where user engagement and emotional connection are critical.
4. Ensure that the GenAI models and techniques developed are compatible with different XR platforms and scalable to different hardware capacities. This also includes ensuring that the solutions can be adapted to future advancements in XR technology.
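As one concrete example of fitting generative models into XR hardware budgets, the sketch below applies symmetric post-training int8 quantization to a weight tensor. It is just one of several possible compression techniques (alongside pruning and distillation) and is not a method prescribed by the proposal.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric post-training quantization of a weight tensor to int8.
    A single scale maps the float range [-max|w|, +max|w|] onto [-127, 127],
    shrinking storage 4x and enabling faster integer arithmetic on device."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale
```

A full deployment pipeline would quantize per-channel, calibrate activations, and measure the resulting drop in generation quality against the real-time frame budget.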
 
WORKPLAN
- Phase 1: Analysis of the state of the art in GenAI and existing XR systems, identification of gaps and potentials.
- Phase 2: Development of generative algorithms for the creation of XR content (geometries, textures, animations).
- Phase 3: Exploring vocal and emotional modulation and integrating these capabilities into XR avatars.
- Phase 4: Optimization of the models to ensure real-time performance on XR devices.
- Phase 5: Evaluation of the developed models in terms of visual quality and performance.
 
PUBLICATION VENUES
Journals: IEEE Trans. on Visualization and Computer Graphics, Virtual Reality, Pattern recognition, IEEE Trans. On Affective Computing, Computers & Graphics, International Journal of Human-Computer Studies.
Conferences: CVPR, ICPR, ECCV, ICCV, IROS, IJCAI, NeurIPS, ICRA, and other conferences about the project topics              
 

Required skills

The ideal candidate should have a strong background in computer science and AI, with specific skills in generative algorithms and XR. Problem-solving abilities, creativity, and knowledge of model optimization for low-power devices are essential. Experience in GPU programming and immersive user interface development is also required, together with good communication and collaboration skills and publication and scientific writing skills.

38

Transferable and efficient robot learning across tasks, environments, and embodiments

Proposer

Raffaello Camoriano

Topics

Data science, Computer vision and AI

Group website

http://vandal.polito.it/

Summary of the proposal

The project's goal is the design of efficient methods for training, transfer, and inference of high-capacity models for embodied systems. Promising approaches include knowledge distillation, recent fine-tuning and approximation methods reducing the policy execution cost while retaining performance levels. Moreover, constraining model output space to low-dimensional manifold structures arising from the physics of the target problem also holds promise to improve policy efficiency and safety.

Research objectives and methods

Classical learning methods for robotic perception and control tend to target specific skills and embodiments, due to the difficulties in extracting transferable and actionable representations which are invariant to physical properties of the environment and of the robot. However, the performance of such specialized agents can be limited by low model capacity and training on relatively few examples. This can be particularly problematic when tackling complex and long-horizon tasks for which the cost of large-scale data collection on a single robot can be prohibitively high and the complexity of the policy to be learned might benefit from a more expressive function class (i.e., with a larger number of parameters).

Conversely, recent high-capacity, highly flexible machine learning models, such as vision transformers and large multimodal models, proved their worth in less constrained domains such as computer vision and NLP. In such domains, pre-training on large and diverse datasets is possible due to web-scale data availability. This results in rich "generalist" pre-trained models enabling fine-tuning and adaptation to specific target tasks, with large savings in terms of target data collection and positive transfer to new tasks and visual appearances.

A growing research line investigates the extension of high-capacity models to robotic tasks to enable complex skill learning across embodiments and modalities, thanks to the high flexibility of high-capacity architectures (e.g., GATO [1]). RoboCat [2] demonstrates how such models can be applied to solve complex robotic manipulation tasks with visually defined goals, while Open X-Embodiment [3] demonstrates positive transfer for task goals specified in natural language. Octo further extends this concept by supporting multimodal goal definitions [4], while AutoRT [5] also supports multi-robot coordination. Large language models can also be employed to guide exploration and automate reward design for reinforcement learning [6].

However, these methods rely on very large numbers of parameters (i.e., in the order of billions), rendering model storage and real-time inference a challenge. This is a relevant roadblock when local execution on limited robotic hardware is required, as is often the case in open-world unstructured environments. Some of the most advanced multi-embodiment models (e.g., RT-2-X [3]) are so extensive that they cannot be stored locally and require communication with cloud environments to perform inference. This is even more true when model fine-tuning or open-ended learning is required for tackling new tasks. Impractical computational and communication costs and catastrophic forgetting of previous tasks indeed represent major challenges.

The objective of this project is the development of efficient methods for training, transfer, and inference of generalist high-capacity models for embodied and robotic tasks. Several approaches will be investigated, including the use of knowledge distillation, recent fine-tuning methods which proved to reduce the cost of execution of robotic policies (i.e., RT-2-X) from quadratic to linear while retaining performance levels [7], and approximation methods to reduce the number of parameters while retaining approximation power [8]. Moreover, constraining model output space to low-dimensional manifolds arising from the physical constraints of the target problem also holds promise to improve policy efficiency and safety [9] [10].
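Among the approaches listed above, knowledge distillation can be illustrated with the classic temperature-softened KL objective: a small student model is trained to match a large teacher's softened output distribution. This is the generic Hinton-style formulation, not the specific method of any work cited here.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the core term of knowledge distillation. Scaled by T^2 so gradient
    magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

In the robotics setting, the "logits" would come from the large generalist policy (teacher) and the compact on-board policy (student), with the distillation term typically mixed with the ordinary task loss.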

Potential publication venues include major AI, ML, robotics, and computer vision venues (e.g., TRO, RAL, TPAMI, JMLR, ICRA, IROS, CoRL, NeurIPS, ICML, ICLR, etc.).

Preliminary Main Activities Plan
- M1-M4: Literature review on foundation models for robot learning
- M3-M7: Empirical analysis of state-of-the-art methods for improving foundation model efficiency
- M8-M15: Design and development of novel efficient methods focusing on robotic requirements and resource constraints
- M16-M22: Experimental evaluation of the proposed methods
- M23-M28: Development of novel methods incorporating output space constraints to enforce safety requirements while retaining efficiency and predictive capabilities
- M28-M32: Experimental validation and dissemination of the results
- M32-M36: Thesis writing

References
[1] Reed, Scott, et al. "A generalist agent." Transactions on Machine Learning Research (2022).
[2] Bousmalis, Konstantinos, et al. "RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation." Transactions on Machine Learning Research (2023).
[3] Padalkar, Abhishek, et al. "Open x-embodiment: Robotic learning datasets and rt-x models." arXiv preprint arXiv:2310.08864 (2023).
[4] Octo Model Team, et al. "Octo: An open-source generalist robot policy." (2023).
[5] AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents https://auto-rt.github.io/static/pdf/AutoRT.pdf
[6] M. Kwon, S. M. Xie, K. Bullard, and D. Sadigh, "Reward design with language models," in Proc. Int. Conf. Learn. Representations, 2023, pp. 1-18.
[7] Leal, Isabel, et al. "SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention." arXiv preprint arXiv:2312.01990 (2023).
[8] Xiong, Yunyang, et al. "Nyströmformer: A Nyström-based algorithm for approximating self-attention." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. No. 16. 2021.
[9] Liu, Puze, et al. "Robot reinforcement learning on the constraint manifold." Conference on Robot Learning. PMLR, 2022.
[10] Duan, Anqing, et al. "A structured prediction approach for robot imitation learning." The International Journal of Robotics Research (2023).

Required skills

We seek candidates highly motivated to conduct methodological research in ML and robotics.
An excellent background in ML is required, covering theory and software. Proficiency with Python and ML, robotics, or CV frameworks is a must.
Strong communication skills, self-motivation, proven teamwork experience, and independence are necessary.
A proven track record and certifications of fluent speaking and technical writing in English are required.
Prior research experience is highly appreciated.

39

Neural Network reliability assessment and hardening for safety-critical embedded systems

Proposer

Matteo Sonza Reorda

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://cad.polito.it/

Summary of the proposal

Neural Networks are increasingly used within embedded systems in many application domains, including cases where safety is crucial (e.g., automotive, space, robotics). Possible hardware faults affecting the underlying hardware (CPU, GPU, TCU) can severely impact the produced results. The goal of the proposed research activity is first to estimate the probability that critical failures are produced, and then to devise effective solutions for system hardening, playing mainly at the software level.

Research objectives and methods

NNs are increasingly adopted in embedded systems, even for safety-critical applications (e.g., in the automotive, aerospace, and robotics domains), where the probability of failures must remain below well-defined (and extremely low) thresholds. This goal is particularly challenging, since the hardware used to run the NN often corresponds to extremely advanced devices (e.g., GPUs, or dedicated AI accelerators) built with highly sophisticated (and hence less mature) semiconductor technologies. On the other hand, NNs are known to have some intrinsic robustness, and can tolerate a given number of faults inside the hardware. Unfortunately, given the complexity of NN algorithms and of the underlying architectures, an extensive analysis to understand which (and how many) faults are particularly critical is difficult to perform, at least with usual computational resources.

The planned research activities aim first at exploring the effects of faults affecting the hardware of a GPU/AI accelerator supporting the NN execution. Experiments will study the effects of the considered faults on the results produced by the NN. This study will mainly be performed by resorting to fault injection experiments. In order to keep the computational effort reasonable, different solutions will be considered, combining simulation-based, emulation-based, and multi-level fault injection. The trade-off between the accuracy of the results and the required computational effort will also be evaluated. Based on the gathered results, hardening solutions acting on the hardware and/or the software will be devised, aimed at improving the resilience of the whole application with respect to faults, thus matching the safety requirements of the target applications.
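As a rough illustration of simulation-based fault injection at the application level (real campaigns would target the GPU/accelerator microarchitecture, e.g. through tools such as NVbitFI), the sketch below flips a single bit in the IEEE-754 representation of one NN weight and compares the faulty output against the golden one; all values are hypothetical:

```python
import random
import struct

import numpy as np

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE-754 float32 representation of `value`,
    mimicking (very abstractly) a single-event upset in a memory cell
    holding a NN weight."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return faulty

def inject_fault(weights: np.ndarray, rng: random.Random) -> np.ndarray:
    """Return a copy of `weights` with one random bit flipped in one element."""
    faulty = weights.astype(np.float32).copy()
    flat = faulty.ravel()
    idx = rng.randrange(flat.size)
    bit = rng.randrange(32)
    flat[idx] = flip_bit(float(flat[idx]), bit)
    return faulty

# Toy single-layer "network": compare golden vs. faulty outputs.
rng = random.Random(0)
w = np.ones((4, 4), dtype=np.float32)
x = np.ones(4, dtype=np.float32)
golden = w @ x
faulty_out = inject_fault(w, rng) @ x
print("max output deviation:", np.abs(golden - faulty_out).max())
```

A fault injection campaign then repeats this over many fault sites and inputs, classifying each outcome (masked, tolerable deviation, or critical misprediction).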

The proposed plan of activities is organized in the following phases (for each phase, the indicative time span in months from the beginning of the PhD period is reported):
 - phase 1 (M1 to M6): the student will first study the state of the art and the literature in the area of NNs, their implementation on different platforms (including CPUs, GPUs, and hardware accelerators) and their applications. At the same time, the student will become familiar with existing fault injection environments (e.g., NVbitFI). Suitable cases of study will also be identified, whose reliability and safety could be analyzed with respect to faults affecting the underlying hardware.
 - phase 2 (M7-M18): suitable solutions to analyze the impact of faults on the considered accelerator will be devised and prototypical environments implementing them will be put in place.
 - phase 3 (M19-M24): based on the results of a set of fault injection campaigns performed to assess the reliability and safety of the selected cases of study, a detailed analysis leading to the identification of the most critical faults/components will be carried out.
 - phase 4 (M25 to M36): suitable hardening solutions will be proposed and evaluated.

Phases 2 to 4 will include dissemination activities, based on writing papers and presenting them at conferences (e.g., ETS, VTS, IOLTS, DATE). The most relevant proposed methods and results will be submitted for publication on the journals in the field, such as the IEEE Transactions on Computers, CAD, and VLSI, as well as Elsevier Microelectronics & Reliability.

We also plan for a strong cooperation with the researchers of other universities and research centers working in the area, such as the University of Trento, the University of California at Irvine (US), the Federal University of Rio Grande do Sul (Brazil), NVIDIA.

Required skills

The candidate should have basic skills in
- digital design
- computing architectures
- neural networks

40

Design of an integrated system for testing headlamp optical functionalities

Proposer

Bartolomeo Montrucchio

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.dauin.polito.it/it/la_ricerca/gruppi_di_ricerca/grains_graphics_and_intelligent_systems

https://www.italdesign.it/services-electric-and-electronics/harness-and-lighting/

Summary of the proposal

Recent automobile developments rely on several sensors, such as cameras and radars. These sensors are also used to improve road illumination, both for human and autonomous drivers.
The purpose of the work will be to design new automatic systems for managing illumination; computer vision algorithms and image processing methods will be used together with optical design, in collaboration with Italdesign S.p.A.

Research objectives and methods

Automobile evolution requires increasingly automatic systems for driving and for detecting traffic, for example other cars, bicycles, or further vehicles. Lighting systems are therefore also evolving rapidly. In particular, future vehicles' headlamps will move towards many independently driven light sources, up to several thousand different sources, each of them driven by means of a technology similar to the one used in digital micromirror projectors. The final purpose is to develop a headlamp able to automatically move the light onto obstacles, such as pedestrians or bicycles that suddenly appear on the road. In order to decide where to move the light, all the sensors available in the car can be used, mainly cameras and radars.
This PhD proposal aims at extending the already existing system to a higher complexity level that allows measurements of matrix high-beam functionalities as a function of different simulated road and car configurations.
The proposal brings together the competences of Dipartimento di Automatica e Informatica and the strong industrial knowledge of Italdesign S.p.A. Therefore, experimental activities will also be performed at the company's foreign sites, mainly in Germany.
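As a purely geometric toy illustration of the adaptive behaviour described above (the actual system involves optical design and calibrated sensor models), the sketch below maps a camera-space bounding box of a detected obstacle to the segments of a matrix headlamp that should be actuated; all names and numbers are hypothetical:

```python
def beam_segments(bbox, image_width, n_segments):
    """Map a detected obstacle's horizontal bounding box (x_min, x_max,
    in pixels) to the matrix-headlamp segments covering it, assuming the
    segments evenly tile the camera's horizontal field of view.
    Returns a boolean mask: True = segment aimed at the obstacle."""
    seg_width = image_width / n_segments
    x_min, x_max = bbox
    first = int(x_min // seg_width)
    last = int(min(x_max, image_width - 1) // seg_width)
    return [first <= i <= last for i in range(n_segments)]

# A pedestrian detected between pixels 300 and 500 in a 1280-px-wide frame,
# with a 16-segment matrix beam (hypothetical numbers).
mask = beam_segments((300, 500), 1280, 16)
assert mask[4] and not mask[0]
```

The selected segments could then be boosted to spotlight the obstacle (or dimmed, in a glare-free high-beam function for oncoming traffic).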
This work will be developed over the three years, following the usual Ph.D. program:
-       first year: improvement of the basic knowledge about lighting systems, attendance of most of the required courses (including applied optics), and submission of at least one conference paper
-       second year: design and implementation of new algorithms for testing headlamp optical functionalities, and submission of conference papers and at least one journal paper
-       third year: finalization of the work, with at least one selected journal publication.
Possible venues for publication will be, if possible, journals and conferences related to computer vision and optics, from IEEE, ACM and SPIE. An example could be the IEEE Transactions on Image Processing.
The scholarship is sponsored by Italdesign S.p.A. A period of six months abroad will be spent during the PhD, and a period of at least six months at Italdesign will be mandatory as well.
The work will therefore be done in close collaboration with Italdesign Giugiaro S.p.A., with whom a collaboration is already in place.
 

Required skills

The ideal candidate should have an interest in optics, computer vision, and image processing.
The candidate should also have a good programming background, mainly in Python. Good teamwork skills will be very important, since the work will need to be integrated with the company's activities.

41

Machine unlearning

Proposer

Elena Maria Baralis

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it
https://smartdata.polito.it

Summary of the proposal

Machine Unlearning is the task of selectively erasing or modifying previously acquired knowledge from machine learning models. This is particularly relevant nowadays due to the increasing concerns surrounding privacy (e.g. the Right To Be Forgotten required by GDPR) and copyright infringements, as highlighted by recent cases involving Large Language Models. The key goal of this proposal is to propose novel architectures, algorithms and evaluation metrics for Machine Unlearning.

Research objectives and methods

In recent years, the rapid advancement of machine learning models, particularly Large Language Models (LLMs), has raised significant concerns regarding privacy and intellectual property rights. The need for responsible AI practices has become increasingly evident, driven by legal frameworks such as the General Data Protection Regulation (GDPR) that mandates the Right To Be Forgotten. Additionally, high-profile cases involving LLMs have highlighted the need to address issues related to the unintentional retention of sensitive information and potential copyright infringements.

The proposed research activity on Machine Unlearning (MU) aims to tackle these challenges by developing novel techniques to selectively erase or modify previously acquired knowledge from machine learning models. The primary objectives of this research are twofold: first, to explore the current state of the art in MU, and second, to propose innovative architectures, algorithms, and evaluation metrics to enhance the efficacy of the unlearning process. Through these goals, the aim is to contribute to the establishment of ethical and responsible AI practices, ensuring compliance with legal requirements and mitigating the risks associated with unintentional information retention by machine learning models.
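One exact-unlearning strategy from the literature that could serve as a baseline is SISA-style sharded training: the training set is partitioned into shards with one model per shard, so forgetting a sample only requires retraining its shard. The sketch below is a hypothetical toy version using nearest-centroid classifiers (prediction picks the closest centroid across shards, a simplification of SISA's vote aggregation):

```python
import numpy as np

class ShardedModel:
    """Toy SISA-style ensemble: the training set is split into shards,
    with one nearest-centroid classifier per shard. Unlearning a sample
    only retrains the single shard that contained it."""

    def __init__(self, X, y, n_shards=4, seed=0):
        rng = np.random.default_rng(seed)
        self.X, self.y, self.n_shards = X.copy(), y.copy(), n_shards
        self.assign = rng.integers(0, n_shards, size=len(X))  # sample -> shard
        self.centroids = [self._fit_shard(s) for s in range(n_shards)]

    def _fit_shard(self, s):
        mask = self.assign == s
        return {c: self.X[mask & (self.y == c)].mean(axis=0)
                for c in np.unique(self.y[mask])}

    def unlearn(self, i):
        """Forget sample i: drop it and retrain only its shard."""
        s = self.assign[i]
        self.assign[i] = -1                      # mark as removed
        self.centroids[s] = self._fit_shard(s)   # cheap partial retraining

    def predict(self, x):
        """Predict with the closest centroid across all shards."""
        flat = [(np.linalg.norm(x - mu), c)
                for cents in self.centroids for c, mu in cents.items()]
        return min(flat)[1]

# Toy separable data (hypothetical): class 0 near the origin, class 1 near (5, 5).
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
y = np.array([0] * 10 + [1] * 10)
model = ShardedModel(X, y)
assert model.predict(np.array([0.1, 0.1])) == 0
model.unlearn(0)                                  # forget one training sample
assert model.predict(np.array([4.9, 5.0])) == 1   # utility preserved
```

Approximate unlearning methods for deep models avoid even this partial retraining, which is precisely where better algorithms and evaluation metrics are needed.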

The workplan for this PhD is structured to comprehensively address the multifaceted challenges of MU. The research will focus on proposing novel architectures and algorithms that facilitate effective unlearning while preserving the model's overall performance. Given the current lack of definitive metrics for MU, part of the research effort will be devoted to identifying more suitable and comprehensive metrics.

The research activity progresses from foundational research to the practical implementation, validation and application of MU techniques. An outline of the possible research plan is as follows.

- First year
The first year will be dedicated to literature review and conceptualization, leading to the formulation of the main research objectives for the rest of the doctorate. This initial phase involves an extensive study of the literature, identifying gaps and shortcomings, leading to the definition of initial proposals for improvements over state-of-the-art techniques.

- Second year
Based on the areas of opportunity identified and the preliminary proposals made, the candidate will work on the ideation and implementation of novel architectures and algorithms for MU, with ongoing validation and refinement based on the feedback obtained from experiments and evaluations.

- Third year
The final year will focus on consolidating the findings and defining the applications of main interest for the output produced.
During the second/third year, the candidate will have the opportunity to spend a period of time abroad in a leading research center.
Publication venues for this research include leading conferences and journals in the fields of machine learning and artificial intelligence. Key conferences include the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), and the International Conference on Learning Representations (ICLR). Additionally, reputable journals such as the Journal of Machine Learning Research (JMLR) and the IEEE Transactions on Neural Networks and Learning Systems will be sought for in-depth dissemination of research contributions.

Required skills

The candidate should have a strong computer and data science background, in particular for what concerns:
- Strong programming skills, preferably in Python
- Thorough understanding of theoretical and applied aspects of machine and deep learning
- Fundamentals of Natural Language Processing

42

Generative AI models for enhanced text-to-image synthesis

Proposer

Lia Morra

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

http://grains.polito.it - http://dbmg.polito.it

Summary of the proposal

This research proposal aims to overcome limitations in current generative text-to-image models. Despite advancements in visual fidelity, existing models struggle with precise control over generated images in response to detailed prompts. The candidate will research innovative strategies to improve spatial composition and alignment with user-defined specifications, including the application of neuro-symbolic AI to embed logical constraints and leveraging background ontological knowledge.

Research objectives and methods

While current generative text-to-image latent diffusion models have reached unprecedented results in terms of visual fidelity, there are still open issues to be addressed in exerting precise control over the generated images. On the one hand, generative models have difficulty creating correct images when the textual prompt contains many details, and often struggle with object placement and spatial awareness. Recent text-to-image latent diffusion models have shown substantial improvements in prompt following, yet still struggle with the use of words such as "left" or "behind". Increasing the size of the model has so far led to small improvements on these aspects, in the face of a significant increase in hardware requirements. Alternatively, other recent works have looked into improving captions at training time. Neither approach, so far, has successfully addressed spatial composition. One possible reason lies in the inherent limitations of the text embedding employed to condition the generation process, which fails to learn sufficiently detailed and disentangled representations; this issue would not necessarily be solved by increasing the amount or complexity of training data.

On the other hand, there is also an ongoing struggle in aligning the generated output with human values. Generative models may generate offensive images, perpetuate societal biases and stereotypes embedded in the training data, or "regurgitate" training samples, potentially exposing the user to inadvertent copyright infringements. While vendors have generally responded by establishing safeguards for specific inputs or outputs, a more general, robust and reliable solution is called for. For instance, recent preliminary results have shown that neuro-symbolic AI techniques could be used, on toy datasets, to sample from an unconditioned model under user-defined logical constraints.

Research objectives:
The present proposal aims at investigating novel ways to condition the generation process to ensure that the generated images comply with user specifications, both in terms of specific content (e.g., "photo of a man, sitting at the right of the woman, who is looking towards a window at their left") and/or in terms of general properties and rules (e.g., a photo of a nude person may be considered offensive and should be avoided). To this aim, several strategies will be investigated and compared, such as: 
-       defining and integrating richer, more structured representations, such as scene graphs, as an intermediate step to disambiguate textual prompts, incorporate greater spatial awareness and increase control in image composition; 
-       exploiting emerging techniques, such as neuro-symbolic AI, to incorporate logical constraints in the training objective or in the sampling process;
-       exploiting background ontological knowledge to further constrain and guide the generation; for instance, better differentiating between encyclopedic facts (e.g., Superman is a superhero) and general concepts (e.g., superhero) could prevent the model from excessively relying on memorization of frequently observed patterns. 
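The first of these strategies can be illustrated with a minimal, hypothetical intermediate representation: a scene graph listing objects and binary spatial relations, against which a candidate image layout can be checked before or during generation (all names and coordinates below are illustrative only):

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    """Minimal intermediate representation of a prompt: objects plus
    binary spatial relations, e.g. ('man', 'right_of', 'woman')."""
    objects: list
    relations: list = field(default_factory=list)

def satisfies(graph: SceneGraph, layout: dict) -> bool:
    """Check a candidate layout (object -> (x, y) box center, normalized
    image coordinates) against the graph's horizontal relations."""
    for subj, rel, obj in graph.relations:
        dx = layout[subj][0] - layout[obj][0]
        if rel == "right_of" and dx <= 0:
            return False
        if rel == "left_of" and dx >= 0:
            return False
    return True

# Prompt: "photo of a man sitting at the right of the woman"
g = SceneGraph(objects=["man", "woman"],
               relations=[("man", "right_of", "woman")])

assert satisfies(g, {"man": (0.7, 0.5), "woman": (0.3, 0.5)})
assert not satisfies(g, {"man": (0.2, 0.5), "woman": (0.6, 0.5)})
```

In a full pipeline, such a structured check could guide layout-conditioned generation or act as a logical constraint on the sampling process.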

Outline of the research plan:
In Year 1, the candidate will review the current state of the art on controllable image synthesis, text-to-image generative models, and their inherent limitations and biases. The candidate will also strengthen the competences and skills required to tackle the research program. A suitable dataset of challenging prompts, biased outputs, and failures will be created by extensively reviewing open and closed source systems, as well as the relevant literature. This dataset will provide the basis for the experimental validation.
In Year 2, the candidate will investigate novel methods to increase control in object position, spatial composition and fine-grained detail in text-to-image synthesis, incorporating structured representations and/or logical constraints as detailed above. The proposed techniques will be compared against other strategies based, e.g., on prompt engineering and chain-of-thought prompting, in terms of quality, computational cost, resources and biases. 
In Year 3, the candidate will move into investigating how to promote fair and robust behaviors across all prompts. The proposed techniques will be extended to ensure that all generated outputs are consistent with basic rules, such as avoiding the generation of offensive content and ensuring sufficient diversity in the generated images.
 
Possible publication venues include international peer-reviewed journals in the fields related to the current proposal, such as: IEEE Transactions Image Processing, IEEE transactions Pattern Analysis and Machine Intelligence, Pattern Recognition, Computer Vision and Image Understanding, International Journal of Computer Vision, and top-tier international conferences, such as CVPR, ICCV, ECCV, NeurIPS, ICPR, ACM Multimedia.

Required skills

- Good knowledge of machine learning, deep learning, and generative models.
- Previous experience with diffusion models, large language models, or multi-modal models is preferred
- Strong analytical skills

43

Test, reliability, and safety of intelligent and dependable devices supporting sustainable mobility

Proposer

Riccardo Cantoro

Topics

Computer architectures and Computer aided design, Cybersecurity

Group website

https://cad.polito.it

Summary of the proposal

The research addresses the pressing need for dependable electronic systems in safety-critical domains, specifically focusing on sustainable mobility. The objective is to develop innovative hardware and software methodologies to qualify electronic systems against stringent reliability and safety requirements. The work will involve developing suitable hardening techniques on the hardware, software safety mechanisms, and a comprehensive assessment methodology supported by EDA partners.

Research objectives and methods

Research objectives
The novelty of this research lies in its focus on sustainable mobility, which is an emerging area of research with great potential for real-world impact. The work is expected to significantly improve the reliability and safety of electronic systems, thereby enhancing the performance of safety-critical applications. The research team's expertise in electronic design automation (EDA) will be leveraged to develop robust methodologies that are both practical and effective. Furthermore, this research is aligned with the goals of the National Centers on Sustainable Mobility and HPC, as well as the Extended Partnership on Artificial Intelligence, which further emphasizes its significance in advancing the state-of-the-art in this field.

The objectives of this research are summarized as follows:
- Identify a suitable hardware platform for sustainable mobility applications, with particular emphasis on RISC-V based systems.
- Identify suitable software for mobility applications to be used as a representative benchmark for the qualification activities.
- Assess dependability figures on the identified hardware/software infrastructure to identify critical parts of the design that require hardening.
- Develop innovative hardening solutions to improve the reliability of critical areas in the design.
- Focus on sustainable mobility as an emerging area of research with great potential for real-world impact.
- Establish a comprehensive assessment methodology in collaboration with EDA partners.

Outline of possible research plan

First year: 
The candidate will start by conducting a thorough literature review on dependable electronic systems and sustainable mobility to identify the most recent and relevant research works. They will then select a suitable hardware platform for sustainable mobility applications, considering the variety of publicly available RISC-V based systems and using IP cores from industrial partners (e.g., Synopsys). Furthermore, they will identify suitable software for mobility applications, including AI applications, and leverage publicly available benchmarks developed for other domains such as automotive and space. The candidate will develop a preliminary assessment methodology for the identified hardware and software infrastructure, which will be refined and improved in the following years.

Second year: 
The candidate will focus on identifying the critical parts of the design that require hardening and implementing initial solutions to enhance the overall system reliability. They will perform dependability analysis on the identified hardware/software infrastructure, aiming to improve the quality of the developed assessment framework. The candidate will explore various hardening techniques, including redundancy, error-correcting codes, and fault-tolerant architectures, and select the most suitable ones to enhance the system reliability and safety.
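Among the hardening techniques mentioned, triple modular redundancy (TMR) is the classic example: a computation is replicated three times and a majority voter masks any single faulty replica. The following is a software-level toy sketch (the computation and fault model are hypothetical):

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority voter over three redundant integer results:
    each output bit takes the value appearing in at least two replicas,
    masking any single faulty replica."""
    return (a & b) | (a & c) | (b & c)

def run_redundant(computation, value, fault=None):
    """Execute `computation` three times; `fault` optionally corrupts
    one replica's result, modeling a transient hardware fault."""
    results = [computation(value) for _ in range(3)]
    if fault is not None:
        replica, mask = fault
        results[replica] ^= mask          # inject a bit-flip fault
    return tmr_vote(*results)

def square(x):
    return x * x

assert run_redundant(square, 12) == 144                      # fault-free run
assert run_redundant(square, 12, fault=(1, 0b1000)) == 144   # single fault masked
```

Hardware TMR applies the same idea to flip-flops or whole modules; in software, time redundancy (re-executing and voting) trades throughput for fault masking.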

Third year: 
The candidate will develop innovative hardening solutions to improve the reliability of critical areas in the design, while ensuring the availability of safety mechanisms in the event of a fault. They will develop a comprehensive assessment methodology in collaboration with EDA partners, leveraging their expertise in electronic design automation to refine and optimize the assessment process. The proposed methodologies will be extensively evaluated through simulations and testing, and the candidate will collaborate with industry partners to validate their effectiveness on real-world applications.

List of possible venues for publications
The candidate will prepare and submit papers to top-tier conferences and journals in the field of electronic systems, embedded systems, and fault tolerance.

Possible venues for publications could include:
- IEEE Transactions on Computers
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- International Conference on Computer-Aided Design (ICCAD)
- International Test Conference (ITC)
- IEEE European Test Symposium (ETS)
- Design, Automation and Test in Europe Conference (DATE)
- RISC-V Summit

Projects
The research is consistent with the themes of the National Centers on Sustainable Mobility and HPC, as well as with those of the Extended Partnership on Artificial Intelligence, in which members of the CAD group participate.
The research will be supported by industrial partners involved in active collaborations. Synopsys is involved in research activities on functional safety and reliability, and provides licensed tools, IP cores, and support. Infineon is also involved in the frame of research contracts on electronic system dependability.

Required skills

Background in digital design and verification.
Solid foundations on microelectronic systems and embedded system programming.
Experience with fault modeling and testing techniques for digital circuits, such as stuck-at, transition, and path-delay faults.
Knowledge of EDA tools, particularly for fault simulation.

44

Cybersecurity for a quantum world

Proposer

Antonio Lioy

Topics

Cybersecurity, Parallel and distributed systems, Quantum computing

Group website

https://security.polito.it/
https://qubip.eu/

Summary of the proposal

Cybersecurity is typically based on cryptographic algorithms (e.g. RSA, ECDSA, ECDH) that are threatened by the advent of quantum computing.
Purpose of this research is to create quantum-resistant versions of various security components, such as secure channels (e.g. TLS, IPsec), digital signatures, secure boot, Trusted Execution Environment (TEE).
The final objective is the design and test of quantum-resistant versions of security solutions in an open-source environment (e.g. Linux, Keystone).

Research objectives and methods

Hard security is typically based on mathematical cryptographic algorithms that support computation of symmetric and asymmetric encryption, key exchange, digital signature, and hash values.
 
Several of these algorithms (e.g. RSA, ECDSA, ECDH) are threatened by the advent of quantum computing. NIST and other bodies have thus selected new quantum-resistant algorithms and advocated their fast adoption in current security solutions. However, this is not a simple change, as there are several intertwined aspects to be considered, such as hardware support, key lengths, and X.509 certificates.
 
Purpose of this work is to evolve various security components of modern ICT infrastructures to quantum-resistant versions. This may include secure network channels (e.g. TLS, IPsec), digital signatures, secure boot, Trusted Execution Environment (TEE), and remote attestation.
 
The overall objective is the design and test of quantum-resistant versions of several security solutions in an open-source environment (e.g. Linux, Keystone). 
 
The specific objectives of this research activity are:
1. Identify security components threatened by quantum computing and review proposed standards to make them quantum-resistant.
2. Extend existing open-source systems and components (e.g. Linux, Keystone, OpenSSL, mbedTLS could be suitable targets) to support the proposed quantum-resistant solutions.
3. Implement a system with the hardware and software components needed to demonstrate the feasibility and performance of the improved quantum-resistant elements.
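A widely discussed pattern for the transition of secure channels is hybrid key establishment: a classical shared secret (e.g., from ECDH) and a post-quantum one (e.g., from an ML-KEM encapsulation) are combined through a KDF, so the derived key remains safe as long as either algorithm holds. The sketch below is a conceptual illustration only, with random stand-ins for both secrets and a minimal HKDF; it is not the exact construction standardized for TLS:

```python
import hashlib
import hmac
import os

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """Minimal HKDF (RFC 5869) with SHA-256, extract-then-expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),  # expand
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def hybrid_key(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive a session key from the concatenation of a classical and a
    post-quantum shared secret: an attacker must break BOTH to recover it."""
    return hkdf_sha256(classical_secret + pq_secret,
                       salt=b"", info=b"hybrid-demo", length=32)

ecdh_ss = os.urandom(32)   # stand-in for an ECDH shared secret
mlkem_ss = os.urandom(32)  # stand-in for an ML-KEM shared secret
k = hybrid_key(ecdh_ss, mlkem_ss)
assert k != hybrid_key(ecdh_ss, os.urandom(32))  # either input changes the key
```

Real deployments must additionally handle negotiation, larger post-quantum key and signature sizes, and certificate-chain changes, which is where the engineering effort of this proposal lies.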
 
The first year will be spent studying the existing security paradigms and how they are affected by quantum computing. The PhD student will also analyse the proposed post-quantum algorithms and evaluate their performance and hardware requirements. During this year, the student should also follow most of the mandatory courses for the PhD and submit at least one conference paper.
During the second year, the PhD student will design a custom approach for quantum-resistant secure channels and trusted execution environment, possibly enriched with specialized hardware elements. At the end of the second year, the student should have started preparing a journal publication on the topic and submit at least another conference paper.
Finally, the third year will be devoted to the implementation and evaluation of the proposed solution, compared with the existing ones. At the end of this final year, a publication in a high-impact journal shall be achieved.
 
Possible target publications: IEEE Security and Privacy, Springer International Journal of Information Security, Elsevier Computers and Security, Future Generation Computer Systems.

This research is part of the Horizon Europe QUBIP project (Quantum-oriented Update to Browsers and Infrastructures for the PQ Transition) https://qubip.eu/

Required skills

Cybersecurity (mandatory)
Network security (mandatory)
Trusted computing (preferred)
Basics of quantum computing (useful)

45

Bridging Human Expertise and Generative AI in Software Engineering

Proposer

Luca Ardito

Topics

Software engineering and Mobile computing

Group website

https://softeng.polito.it

Summary of the proposal

In collaboration with Vodafone Digital and the ZTO team in Network Operations, the PhD project aims to define a framework to generate code from functional requirements, fostering synergy between human developers and AI-based actors. The project will involve a systematization of metrics and methodologies to evaluate the correctness of the code generated, and of the requirement analyses performed, by the generative AI components, in order to increase their effectiveness and build trust in the outputs of generative AI.

Research objectives and methods

Research objectives

The main objectives of the PhD programme are the following:
- The identification of generative AI mechanisms that can aid in code generation from software requirements.
- The development and assessment of methods for the evaluation of the correctness and dependability of the application of generative AI to code development.
- The conduction of formal experiments to evaluate how code generated by AI compares to human-written code in both functional and non-functional terms.

Outline of the research work plan

Task 1: Preliminary evaluation of state-of-the-art solutions (M1-M3)

The task involves a comprehensive assessment of current solutions in the domain. The objective is to evaluate existing methodologies, technologies, and frameworks relevant to the research context. This preliminary analysis will be conducted systematically by applying Kitchenham's guidelines for conducting Systematic Literature Reviews in the Software Engineering research field. The systematic literature review will also consider grey literature sources (i.e., non-peer-reviewed material available from various internet sources) to cope with the high novelty of the generative AI research field. The systematic evaluation of the state of the art will be complemented with open and structured interviews with practitioners and developers to understand their main needs and most common practices.

Task 2: Selection and Integration of Generative AIs for Code Generation (M4-M18)

This task focuses on selecting, customizing, integrating, and training a Generative AI, specifically a Large Language or Foundation Model, to generate code from formal requirements. It includes understanding various use case and formal requirement languages and creating modules for translating natural language requirements into structured notations like Use Case Diagrams. The process involves preprocessing data (collecting, cleaning, and structuring use case language datasets) and training the AI to understand these scenarios. Ongoing evaluation and refinement of the AI are crucial for accuracy. The main goal is to develop a solution that translates use case specifications into high-quality code, with evaluations based on the development effort, error rates, and requirement alignment. The task will use Software Repository Mining (MSR) techniques for diverse dataset collection. The implementation phase of this research will follow the Agile Software Development practices, streamlining the entire software development lifecycle to assess the efficacy of AI-generated code against existing tools and manually written code for both front-end and back-end applications. Furthermore, the research is dedicated to pioneering methods for automatically generating synchronized documentation and unit testing. It will also investigate strategies for conducting code quality reviews, monitoring resource usage efficiently, and evaluating the software's business impact, thereby tailoring the development process to meet the demands of network operations.

Task 3: Definition of assessment methods for Generative AI-based code development (M13-M24)

The task focuses on defining robust methods to assess Generative AI-based code development. This entails defining structured procedures to assess the generated code's accuracy, reliability, and compliance with the requirements. The goal is to establish a rigorous framework for ensuring the quality of code produced by Generative AI, thus advancing the state of the art in Generative AI code development.
Task 3 will apply systematic techniques for building taxonomies in Software Engineering (ref. Paul Ralph) through the Straussian Grounded Theory approach (ref. Strauss).

Task 4: Analysis of the non-functional implications of Generative AI-based code development (M22-M36)

The task focuses on a comprehensive analysis of the non-functional implications inherent to Generative AI-based code development. This includes scrutinizing factors such as scalability, performance, readability and maintainability of the generated code. The objective is to discern and mitigate any adverse effects of integrating Generative AI into the code development process.
The task will involve conducting empirical experiments over statistically significant samples to compare non-functional properties of software generated by human developers, software obtained through Generative AI, and software obtained through a synergistic interaction between human developers and generative AI tools.
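Comparing non-functional properties across human-written, AI-generated, and co-developed code typically calls for non-parametric statistics, since such metrics rarely meet normality assumptions. As a hedged sketch (the choice of test is an illustrative assumption, not the proposal's protocol), a Mann-Whitney U statistic can be computed in pure Python:

```python
# Illustrative sketch: a pure-Python Mann-Whitney U statistic for
# comparing a non-functional metric (e.g., a maintainability index)
# between two groups of code samples. The sample values in the usage
# note are made up; a real study would add significance and effect size.

def mann_whitney_u(xs, ys):
    """Count pairwise wins of xs over ys (ties count 0.5); return (U_x, U_y)."""
    u_x = sum(0.5 if x == y else (1 if x > y else 0) for x in xs for y in ys)
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y
```

For hypothetical scores `human = [72, 68, 75, 70, 74]` and `genai = [65, 69, 66, 71, 64]`, `mann_whitney_u(human, genai)` returns `(22, 3)`, i.e., the first sample ranks higher on this made-up metric.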

List of possible venues for publications

The target for the PhD research includes a set of conferences in the general area of software engineering (ICSE, ESEM, EASE, ASE, ICSME) as well as in the specific area of testing (ICST, ISSTA).
More mature results will be published in software engineering journals, mainly IEEE Transactions on Software Engineering, ACM TOSEM, Empirical Software Engineering, Journal of Systems and Software, and Information and Software Technology.
 

Required skills

The main skills required by the candidate are the following:
General knowledge about Large Language Models and application of AI-based algorithms to software development;
Experience in software development with object-oriented languages (e.g., Java or C#) and knowledge of the web and/or mobile application domain;
Knowledge of software verification and validation techniques (e.g., scripted unit and integration testing, end-2-end testing, Graphical User Interface testing).

46

Explaining AI (XAI) models for spatio-temporal data

Proposer

Elena Maria Baralis

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it
https://smartdata.polito.it

Summary of the proposal

Spatio-temporal data allow an effective representation of many interesting phenomena in application domains ranging from transportation to finance. Current state-of-the-art deep learning techniques (e.g., LM, CNN, RNN) provide black-box models, i.e., models that do not expose the motivations for their predictions. The main goal of this research activity is the study of methods to allow human-in-the-loop inspection of reasons behind classifier predictions for spatio-temporal data.

Research objectives and methods

Machine learning models are increasingly adopted to assist human experts in decision-making. Especially in critical tasks, understanding the reasons behind machine learning model predictions is essential for trusting the model itself. For example, experts can detect model wrong behaviors and actively work on model debugging and improvement. Unfortunately, most high-performance ML models lack interpretability.
 
The research activity will consider application domains (e.g., transportation, industry, medical care, climate) in which the availability of understandable explanations is particularly relevant for explaining anomalous behaviors. The explanation algorithms will target different types of spatio-temporal data (e.g., multivariate time series, spatiotemporal graphs, trajectories, spatio-temporal matrices). The following different facets of XAI (Explainable AI) will be addressed.
 
Model understanding. The research work will address local analysis of individual predictions. These techniques will allow the inspection of the local behavior of different classifiers and the analysis of the knowledge different classifiers are exploiting for their prediction. The final aim is to support human-in-the-loop inspection of the reasons behind model predictions.
 
Model trust. Insights into how machine learning models arrive at their decision allow evaluating if the model may be trusted. Methods to evaluate the reliability of different models will be proposed. In case of negative outcomes, techniques to suggest enhancements of the model to cope with wrong behaviors and improve the trustworthiness of the model will be studied.
 
Model debugging and improvement. The evaluation of classification models generally focuses on their overall performance, which is estimated over all the available test data. An interesting research line is the exploration of differences in the model behavior, which may characterize different data subsets, thus allowing the identification of potential sources of bias in the data.
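Occlusion-style attribution is one concrete way to realize the local, human-in-the-loop inspection described above. The sketch below is illustrative only: the "classifier" is a stand-in that scores a series by the mean of its central segment, and any predict-probability function of a real model could replace it.

```python
# Minimal occlusion-based local explanation for a black-box time-series
# classifier, sketching the "model understanding" facet. The classifier
# below is a toy stand-in, assumed for illustration only.

def predict(series):
    """Toy black-box score: mean of the middle third of the series."""
    n = len(series)
    mid = series[n // 3: 2 * n // 3]
    return sum(mid) / len(mid)

def occlusion_attribution(series, window=2, baseline=0.0):
    """Importance of each window = drop in score when it is masked."""
    base_score = predict(series)
    attributions = []
    for start in range(0, len(series) - window + 1, window):
        masked = list(series)
        for i in range(start, start + window):
            masked[i] = baseline  # occlude this window
        attributions.append((start, base_score - predict(masked)))
    return attributions
```

For `[0, 0, 1, 1, 0, 0]`, only the window covering the central segment receives a non-zero attribution, exposing which time steps the (toy) model relies on.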

PhD years organization
YEAR 1: state-of-the-art survey of XAI algorithms for both time series and spatio-temporal data, considering, e.g., feature attribution-based, attention-based, and counterfactual explanations; performance analysis and preliminary proposals of improvements over state-of-the-art algorithms; exploratory analysis of novel, creative solutions for XAI; assessment of the main explanation issues in 1-2 specific industrial case studies. 
 
YEAR 2: new algorithm design and development; experimental evaluation on a subset of application domains considering public domain datasets (e.g. in the transportation field, TaxiNYC, METR-LA, PEMS-BAY, in the healthcare field, PTB-XL, NYUTron, and MIMIC-III, in the financial field, StockNet and NASDAQ-100); deployment of algorithms in selected industrial contexts. 
 
YEAR 3: algorithms improvements, both in design and development, experimental evaluation in new application domains.
 
During the second and third year, the candidate will have the opportunity to spend a period abroad in a leading research center.
 
Publication venues for this research include leading conferences and journals in the fields of spatio-temporal data management, machine learning, and artificial intelligence: 
 
IEEE TKDE (Trans. on Knowledge and Data Engineering)
ACM TKDD (Trans. on Knowledge Discovery from Data)
ACM TIST (Trans. on Intelligent Systems and Technology)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Machine Learning with Applications (Elsevier)
Engineering Applications of Artificial Intelligence (Elsevier)
 
IEEE/ACM International Conferences (e.g., ACM KDD, ACM SIGSPATIAL, IEEE ICDM, NeurIPS)

Required skills

The candidate should have a strong computer and data science background, in particular for what concerns:
- Strong programming skills, preferably in Python
- Thorough understanding of theoretical and applied aspects of machine and deep learning
- Fundamentals of spatio-temporal data management
- Fundamentals of Natural Language Processing

47

Advanced data modeling and innovative data analytics solutions for complex application domains

Proposer

Silvia Anna Chiusano

Topics

Data science, Computer vision and AI

Group website

dbdmg.polito.it

Summary of the proposal

Data science projects entail the acquisition, modelling, integration, and analysis of big and heterogeneous data collections generated by a diversity of sources, to profile the different facets and issues of the considered application context. However, data analytics in many application domains is still a daunting task, because data collections are generally too big and heterogeneous to be processed with currently available machine learning techniques. Thus, advanced data modeling and machine learning/artificial intelligence techniques need to be devised to unearth meaningful insights and efficiently manage large volumes of data.

Research objectives and methods

The PhD student will work on the study, design and development of proper data models and novel solutions for the integration, storage, management and analysis of big volumes of heterogeneous data collections in complex application domains. The research activity involves multidisciplinary knowledge and skills including database, machine learning and artificial intelligence algorithms, and advanced programming. 
 
Different application contexts will be considered to highlight a wide range of data modeling and analysis problems, and thus lead to the study of innovative solutions. The objectives of the research activity consist in identifying the peculiar characteristics and challenges of each considered application domain and devising novel solutions for the modelling, management, and analysis of data in each domain. Example scenarios are the urban context (in particular, urban mobility) and the medical domain. 

In more detail, the following challenges will be addressed during the PhD:
 
1. Modeling Heterogeneous Data: Design innovative approaches for modeling heterogeneous data, including structured and unstructured data from different sources, integrating them into a single coherent framework. The experience gained on data modeling in different application contexts can lead to the realization of a Computer-Aided Software Engineering (CASE) tool that guides the user through the design process, reducing design time and improving the quality of the modeling result.
 
2. Innovative algorithms for data analytics. Study, design, and implementation of innovative machine learning algorithms, with a primary emphasis on clustering and classification tasks. The objective is to overcome limitations of current approaches, enhancing their accuracy, scalability, and ability to deal with heterogeneous data collections. 
 
3. Scalable Learning: Investigate scalable learning techniques to address the increasing complexity and volume of data for achieving optimal performance in big data environments. This research is indeed driven by the growing demand to develop machine learning systems capable of dynamically adapting to the increasing complexity of data and models. For recent machine learning/AI applications, it is crucial to propose innovative models capable of handling large volumes of data with parallel and scalable solutions.
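Challenge 3 can be made concrete with a streaming variant of a classic algorithm. The sketch below is an assumption-laden toy (1-D data, fixed initial centroids, a single pass): an online k-means step that updates centroids one point at a time, so the dataset never needs to fit in memory.

```python
# Hedged sketch of the "scalable learning" challenge: streaming (online)
# k-means. Each arriving point updates its nearest centroid with a
# running-mean step, so memory stays constant regardless of data volume.
# The 1-D data and initial centroids in the usage note are assumptions.

def assign(x, centroids):
    """Index of the centroid nearest to x."""
    return min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))

def online_kmeans(stream, centroids):
    """One pass over a data stream, updating centroids incrementally."""
    counts = [0] * len(centroids)
    for x in stream:
        k = assign(x, centroids)
        counts[k] += 1
        # Running mean: centroid k equals the mean of points seen so far.
        centroids[k] += (x - centroids[k]) / counts[k]
    return centroids
```

For instance, `online_kmeans([1, 2, 3, 9, 10, 11], [0.0, 10.0])` yields `[2.0, 10.0]` after a single pass; mini-batch variants apply the same running-mean update to batches rather than single points.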
 
The research activity will be organized as follows.
1st Year. The PhD student will start by considering a first reference application domain (for example, the urban scenario) and a first reference use case within it (for example, urban mobility). The PhD student will review the recent literature on the selected use case to (i) identify the most relevant open research issues, (ii) identify the most relevant data analysis perspectives for gaining useful insights, and (iii) assess the main data analysis issues. The PhD student will perform an exploratory evaluation of state-of-the-art technologies and methods in the considered domain, and she/he will present a preliminary proposal of techniques to optimize these approaches.
 
2nd and 3rd Year. Based on the results of the 1st year activity, the PhD student will design and develop a suitable framework including innovative data analytics solutions to efficiently model data in the considered use case and extract useful knowledge, aimed at overcoming weaknesses of state-of-the-art methods.
 
Moreover, during the 2nd and 3rd year, the student will progressively consider a larger spectrum of application domains. The student will evaluate whether and how his/her proposed solutions can be applied to the newly considered domains, and he/she will propose novel analytics solutions.
 
During the PhD, the student will have the opportunity to cooperate in the development of solutions applied to research projects on smart cities (e.g., a PRIN project on the development of an atlas of historic buildings in an urban context). The student will also complete his/her background by attending relevant courses, and will participate in conferences presenting the results of his/her research activity.
 
Possible publication venues include international journals such as IEEE Transactions on Intelligent Transportation Systems, Information Systems Frontiers (Springer), and Information Sciences (Elsevier), and international conferences such as IEEE Big Data, the ACM International Conference on Information & Knowledge Management (CIKM), and the IEEE International Conference on Data Mining (ICDM).

Required skills

The candidate should have good programming skills, and competencies in data modelling and techniques for data analysis.

48

Functional Safety Techniques for Automotive oriented Systems-on-Chip

Proposer

Paolo Bernardi

Topics

Computer architectures and Computer aided design

Group website

 

Summary of the proposal

The activities planned for this proposal include efforts toward Functional Safety Techniques for Automotive Systems-on-Chip (SoC):
-         Techniques for developing Software-Based Self-Test (SBST) libraries, which are demanded by standards such as ISO 26262 and researched by companies in the automotive market.
-         Techniques for grading and developing System-Level Test (SLT) libraries, which are considered an indispensable final test step and significantly contribute to chip quality.

Research objectives and methods

The PhD student will pursue objectives in the broader research field of Automotive Reliability and Testing.
 
A key enabling factor for this work is the availability of a setup that includes both netlists to be simulated and real silicon chips with development boards. 
 
Techniques for developing Software-Based Self-Test libraries 
In this research field, the PhD student will look in the following directions:
o  Creation of benchmark designs to be used throughout the studies.
    1.     RISC-V-oriented SoC
    2.     Industrial benchmarks provided by industrial supporters, including netlists and silicon implementations
o  Identification of the current industrial solutions for the development of SBST libraries and improvement of the state of the art
    1.     Usage of currently available tools from EDA vendors and classification of the strengths and weaknesses of commercial tools
    2.     Creation of "wrappers" to evolve EDA ecosystems with custom data collection and improved generation flows
    3.     Investigation of extensions of existing tools by creating ad-hoc tools that add analytical value and accelerate SBST development.
o  Data collection tools to enrich the logging capabilities of EDA tools
o  High-level techniques based on silicon

Techniques for grading and developing System-level Test (SLT) libraries
Research in the field of System-Level Test will address the open points of this modern testing technique. SLT is in strong demand yet remains a poorly understood subject, mainly concerning the integration of heterogeneous systems. The development of methods for achieving high SLT coverage will include:
o  Coverage computation techniques including simulation and silicon-based methods 
o  Generation of SLT methods that maximize the coverage levels.
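The coverage metric underlying both SBST and SLT grading (detected faults over total faults) can be sketched without a commercial fault simulator. The example below is a toy assumption: a 4-bit AND "circuit" with single stuck-at faults on its inputs, far simpler than a gate-level netlist, but the computation has the same shape as industrial fault grading.

```python
# Illustrative fault-coverage computation, assuming a toy 4-bit AND
# "circuit" and single stuck-at faults on its input bits. Real flows
# use gate-level fault simulators, but the metric is the same:
# coverage = detected faults / total faults.

WIDTH = 4

def circuit(a, b):
    return a & b  # golden (fault-free) behaviour

def faulty(a, b, fault):
    """Apply one stuck-at fault, e.g. ("a", 2, 0) = input a, bit 2, stuck-at-0."""
    port, bit, value = fault
    mask = 1 << bit
    if port == "a":
        a = (a | mask) if value else (a & ~mask)
    else:
        b = (b | mask) if value else (b & ~mask)
    return circuit(a, b)

def fault_coverage(patterns):
    """Fraction of single stuck-at faults detected by the test patterns."""
    faults = [(p, i, v) for p in "ab" for i in range(WIDTH) for v in (0, 1)]
    detected = sum(
        any(faulty(a, b, f) != circuit(a, b) for a, b in patterns)
        for f in faults
    )
    return detected / len(faults)
```

On this toy, the pattern set `[(0b1111, 0b1111), (0b0000, 0b1111), (0b1111, 0b0000)]` detects all 16 input stuck-at faults (coverage 1.0), while `[(0b1111, 0b1111)]` alone detects only the stuck-at-0 half (0.5), which is the kind of gap that test generation aims to close.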

The working plan for the PhD student recalls the objectives outlined in the previous sections. The order is not fixed and may vary according to progress during the PhD program.
 
1st year
1.     Benchmark design, including simulation and test synthesis 
2.     Preliminary usage of EDA tools to measure SBST and SLT coverage
3.     Identification of weaknesses and planning of counteractions
2nd year
1.     Ad-hoc tools preliminary development upon results collected during the 1st year
2.     Codification of methods to increase the coverage and speed up the generation process
3rd year
1.    Exploration of alternative coverage measurements for SBST and SLT 
2.    Completion of a flexible and extensible environment for the quick creation of SBST and SLT libraries.

Required skills

C/C++, ASM, Simulation and Fault Simulation, VHDL, Firmware

49

Human-aware robot behaviour learning for HRI

Proposer

Giuseppe Bruno Averta

Topics

Data science, Computer vision and AI, Controls and system engineering

Group website

vandal.polito.it

Summary of the proposal

Humans are naturally multi-task agents, with an innate capability to interact with objects and tools and to plan complex sequences of actions to accomplish a specific activity. Advanced robots, on the other hand, are still far from such capabilities. The goal of this work is to investigate how to learn from humans the capability to quickly plan and execute complex procedures in unstructured scenarios, and to transfer such skills to intelligent robots for effective human-robot cooperation.

Research objectives and methods

This PhD thesis aims to explore the domain of learning human skills from egocentric videos and transferring them to robotic systems. The increasing integration of robots into many aspects of human life highlights the need to develop a more intuitive and adaptive approach to skill acquisition, one able to learn complex skills and adapt to various scenarios where traditional learning paradigms fail. Egocentric videos, captured from a first-person perspective, provide a rich and unique source of contextual information that can enhance the learning process. This research seeks to leverage this sensing approach to develop a framework for transferring acquired human skills to robots, enabling them to perform complex tasks in diverse environments. 

Major Objectives:
-      Human Behavior Understanding from ego- and exo-vision: Investigate methods for extracting relevant information from egocentric videos, possibly in combination with third-person perspectives, focusing on understanding human actions, interactions, and environmental cues.
-       Skill Representation: Develop a robust representation of human skills extracted from egocentric videos, considering both spatial and temporal aspects to capture the dynamics of actions.
-       Transfer Learning to Robots: Design a transfer learning framework that enables the adaptation of learned human skills to robotic systems, accounting for differences in morphology, sensors, and actuators.
-       Adaptive/Continual Learning: Explore techniques for adaptive and continual learning, allowing robots to continuously refine their skills through interaction with the environment and human feedback.
-       Real-world Applications: Evaluate the proposed framework in real-world scenarios, such as assistive robotics, industrial automation, and healthcare, to assess its practicality and generalizability.

Methodology:
The research methodology involves a combination of computer vision, machine learning, and robotics techniques. Deep learning models will be employed for ego-vision learning and skill representation, while transfer learning techniques will be investigated to adapt these representations to robotic platforms. Real-world experiments will validate the effectiveness and efficiency of the proposed framework.

Significance:
This research addresses a critical gap in the field of robotics by focusing on intuitive skill transfer from humans to robots, enhancing their adaptability and autonomy. The outcomes of this thesis will contribute to the development of more versatile and capable robotic systems, fostering their integration into various domains.

Keywords: Egocentric Videos, Skill Transfer, Robotics, Transfer Learning, Human-Robot Interaction.

Required skills

Outstanding passion and motivation for research.
Excellent programming skills (Python and PyTorch) are required.
Experience in deep learning for videos (egocentric or third-person) is required.
Experience with robot learning is not required, although preferred.