Research proposals 40th cycle

01 Spatio-Temporal Data Analytics on Heterogeneous Data (Prof. Paolo Garza)

02 Ultra-low latency multimedia streaming over HTTP/3 (Prof. Antonio Servetti)

03 Media Quality Optimization using Machine Learning on Large Scale Datasets  (Prof. Enrico Masala)

04 Security Analysis and Automation in Smart Systems (Prof. Fulvio Valenza)

05 Local energy markets in citizen-centered energy communities (Prof. Edoardo Patti)

06 Simulation and Modelling of V2X connectivity with traffic simulation (Prof. Edoardo Patti)

07 Machine Learning techniques for real-time State-of-Health estimation of Electric Vehicle's batteries (Prof. Edoardo Patti)

08 Multi-Device Programming for Artistic Expression (Prof. Luigi De Russis)

09 Digital Wellbeing By Design (Prof. Alberto Monge Roffarello)

10 Management Solutions for Autonomous Networks (Prof. Guido Marchetto)

11 Preserving privacy and fairness with generative AI-based synthetic data production (Prof. Antonio Vetrò)

12 Digital Twin development for the enhancement of manufacturing systems (Prof. Sara Vinco)

13 State-of-Health diagnostic framework towards battery digital twins (Prof. Sara Vinco)

14 Modeling, simulation and validation of modern electronic systems (Prof. Sara Vinco)

15 Robust AI systems for data-limited applications (Prof. Santa Di Cataldo)

16 Artificial Intelligence applications for advanced manufacturing systems (Prof. Santa Di Cataldo)

17 AI for Secured Networks: Language Models for Automated Security Log Analysis (Prof. Marco Mellia)

18 Leveraging Machine Learning Analytics for Intelligent Transport Systems Optimization in Smart Cities (Prof. Marco Mellia)

19 Natural Language Processing and Large Language Models for source code generation (Prof. Edoardo Patti)

20 Cloud continuum machine learning (Prof. Daniele Apiletti)

21 Graph network models for Data Science (Prof. Daniele Apiletti)

22 Automatic composability of Large Co-simulation Scenarios for smart energy communities (Prof. Edoardo Patti)

23 Multivariate time series representation learning for vehicle telematics data analysis (Prof. Luca Cagliero)

24 Designing a cloud-based heterogeneous prototyping platform for the development of fog computing apps (Prof. Gianvito Urgese)

25 Designing a Development Framework for Engineering Edge-Based AIoT Sensor Solutions (Prof. Gianvito Urgese)

26 Computational Intelligence for Computer-Aided Design (Prof. Giovanni Squillero)

28 Security of Software Networks (Prof. Cataldo Basile)

29 Emerging Topics in Evolutionary Computation: Diversity Promotion and Graph-GP (Prof. Giovanni Squillero)

30 Advanced ICT solutions and AI-driven methodologies for Cultural Heritage resilience (Prof. Edoardo Patti)

31 Monitoring systems and techniques for precision agriculture (Prof. Renato Ferrero)

32 Designing heterogeneous digital/neuromorphic fog computing systems and development framework  (Prof. Gianvito Urgese)

33 Cloud at the edge: creating a seamless computing platform with opportunistic datacenters (Prof. Fulvio Giovanni Ottavio Risso)

34 AI-driven cybersecurity assessment for automotive (Prof. Luca Cagliero)

35 Applications of Large Language Models in time-evolving scenarios (Prof. Luca Cagliero)

36 Building Adaptive Embodied Agents in XR to Enhance Educational Activities (Prof. Andrea Bottino)

37 Real-Time Generative AI for Enhanced Extended Reality (Prof. Andrea Bottino)

38 Transferable and efficient robot learning across tasks, environments, and embodiments (Prof. Raffaello Camoriano)

39 Neural Network reliability assessment and hardening for safety-critical embedded systems (Prof. Matteo Sonza Reorda)

40 Design of an integrated system for testing headlamp optical functionalities (Prof. Bartolomeo Montrucchio)

41 Machine unlearning (Prof. Elena Maria Baralis)

42 Generative AI models for enhanced text-to-image synthesis (Prof. Lia Morra)

43 Test, reliability, and safety of intelligent and dependable devices supporting sustainable mobility (Prof. Riccardo Cantoro)

44 Cybersecurity for a quantum world (Prof. Antonio Lioy)

45 Bridging Human Expertise and Generative AI in Software Engineering (Prof. Luca Ardito)

46 Explaining AI (XAI) models for spatio-temporal data (Prof. Elena Maria Baralis)

47 Advanced data modeling and innovative data analytics solutions for complex application domains (Prof. Silvia Anna Chiusano)

48 Functional Safety Techniques for Automotive oriented Systems-on-Chip (Prof. Paolo Bernardi)

49 Human-aware robot behaviour learning for HRI (Prof. Giuseppe Bruno Averta)

01

Spatio-Temporal Data Analytics on Heterogeneous Data

Proposer

Paolo Garza

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it/, https://linksfoundation.com/

Summary of the proposal

Spatio-Temporal data are continuously increasing (e.g., remote sensing images, LiDAR acquisitions, and time series collected from IoT sensors). Although Spatio-Temporal data have been extensively studied, most current data analytics approaches do not effectively manage their heterogeneous nature, especially in the aforementioned domains, with most state-of-the-art approaches focusing on one modality at a time. The primary goal of this proposal is to design innovative AI approaches that solve practical tasks by leveraging the multimodality and heterogeneity of the information conveyed by multiscale and multitemporal geospatial data sources.

Research objectives and methods

The main objective of this research activity is to design machine learning algorithms aimed at big data-driven applications to analyze heterogeneous Spatio-Temporal data (e.g., images from satellites, aerial vehicles or UAVs, 3D acquisitions from LiDAR sensors, or time series collected from IoT sensors), considering both descriptive and predictive problems.

The main research questions that will be addressed are as follows.

Heterogeneity. Several data sources are available, characterized by different data types and modalities. Each data source represents a facet of the analyzed phenomena and provides additional insights, especially when adequately integrated with other sources. Innovative integration techniques based, for instance, on latent spaces will be studied to leverage the opportunities provided by such diverse data sources. An effective integration of heterogeneous modalities could enable better performances of AI-based tasks.

Scalability. Spatio-Temporal data are frequently big (e.g., vast collections of remote sensing data). Hence, big data solutions are needed to process and analyze them, mainly when historical data are analyzed.

Timeliness. Timeliness is crucial in several domains (e.g., emergency management). Efficient machine learning algorithms shall be designed and implemented to deal with rapid and near real-time scenarios, with an eye towards practical and deployable solutions.
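As a concrete illustration of the heterogeneity question above, a per-modality encoding into a shared latent space could be sketched as follows. This is a minimal sketch: the feature sizes, random linear encoders, and mean fusion rule are illustrative assumptions, not the architecture the research will design.

```python
import numpy as np

# Hedged sketch: fusing two heterogeneous modalities (e.g., satellite image
# features and IoT time-series features) into a shared latent space via
# per-modality linear encoders. All dimensions are illustrative placeholders.
rng = np.random.default_rng(0)

D_IMG, D_TS, D_LATENT = 512, 64, 128  # assumed per-modality feature sizes

# Random linear encoders stand in for learned projection networks.
W_img = rng.standard_normal((D_IMG, D_LATENT)) / np.sqrt(D_IMG)
W_ts = rng.standard_normal((D_TS, D_LATENT)) / np.sqrt(D_TS)

def encode(img_feat: np.ndarray, ts_feat: np.ndarray) -> np.ndarray:
    """Project each modality into the common latent space and fuse by mean."""
    z_img = img_feat @ W_img   # (batch, D_LATENT)
    z_ts = ts_feat @ W_ts      # (batch, D_LATENT)
    return (z_img + z_ts) / 2.0  # simple late fusion in the shared space

batch = 4
z = encode(rng.standard_normal((batch, D_IMG)),
           rng.standard_normal((batch, D_TS)))
print(z.shape)  # (4, 128)
```

A downstream predictive or descriptive model would then operate on the fused latent representation `z` regardless of which modalities produced it.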

The work plan for the three years is organized as follows.

1st year. Analysis of the state-of-the-art algorithms and ML frameworks for heterogeneous and multimodal Spatio-Temporal data. Based on the pros and cons of the current solutions, a preliminary common data representation based on latent spaces will be studied and designed to integrate heterogeneous data effectively. Based on the proposed data representation, novel algorithms will be designed, developed, and validated on historical data related to specific domains (e.g., emergency management).

2nd year. State-of-the-art representations of multimodal Spatio-Temporal data will be further analyzed and proposed, focusing on scalable algorithms.

3rd year. The timeliness aspect will be considered, especially during the last year. Specifically, the focus will be on near real-time Spatio-Temporal data analysis based on efficient ML-based algorithms.

The activity is part of a well-established collaboration with the LINKS Foundation.

The outcomes of the research activity are expected to be published at IEEE/ACM/CVF international conferences and in journals such as: ACM Transactions on Spatial Algorithms and Systems; IEEE Transactions on Knowledge and Data Engineering; IEEE Transactions on Geoscience and Remote Sensing; IEEE Transactions on Big Data; IEEE Transactions on Emerging Topics in Computing; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; Information Sciences (Elsevier); Expert Systems with Applications (Elsevier); Machine Learning with Applications (Elsevier).

Required skills

Strong background in data science fundamentals and machine and deep learning algorithms. Strong programming skills. Knowledge of frameworks such as PyTorch or Spark is advisable but not mandatory.

02

Ultra-low latency multimedia streaming over HTTP/3

Proposer

Antonio Servetti

Topics

Computer graphics and Multimedia, Parallel and distributed systems, Quantum computing

Group website

https://media.polito.it/

Summary of the proposal

The growing demand for interactive web services has also led to the need for interactive video applications, capable of accommodating a much larger audience than videoconferencing tools, but with almost the same, i.e., strict, requirements in end-to-end latency. This proposal aims to define and study new media coding and transmission techniques that will exploit new HTTP/3 features, such as QUIC and WebTransport, to improve the scalability and reduce the latency of current streaming solutions.

Research objectives and methods

Research objectives
The term "Interactive Live Video Streaming" (IVS) has been coined for one-/few-to-many streaming services that allow end-to-end latency below one second, at scale and at low cost, thus enabling some form of interaction between one or more "presenters" and the audience. IVS is particularly useful in scenarios such as i) interactive video chat rooms, ii) instant feedback from video viewers (such as polling or voting), iii) promotional elements synchronized with a live stream.
The market demand for IVS aligns with the ongoing deployment of HTTP/3, which is expected to supersede the long-standing TCP transport protocol by means of the new QUIC protocol (which is UDP-based and implements congestion control algorithms in user space).
Although QUIC has been employed for data transfer in HTTP since 2012 and is now experimentally supported by a good number of servers and browsers, it is still in the early stages of adoption for media delivery.
Ensuring an optimal balance between network efficiency and user satisfaction poses several challenges for the deployment of multimedia services using the new protocol. For instance, one challenge is how to exploit both reliable streams and unreliable datagrams in the transmission protocol according to the characteristics of different media elements. Additionally, managing quality adaptation without overloading the server, and ensuring effective caching by relay servers, even with stringent delay requirements, are also critical issues that need to be addressed.
To this aim, we plan to start from the implementation of an experimental HTTP/3 client-server application for ultra-low latency media delivery that will allow us to test and simulate different proposals and alternative solutions, and then compare their relative benefits and costs. The research will address the challenges of customizing media coding, packetization, forward error control, resource prioritization, and adaptivity in the new scenario. 
Such objectives will be pursued by using both theoretical and practical approaches. The resulting insight will then be validated in practical cases by analyzing the system's performance with simulations and actual experiments. 
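The stream-vs-datagram challenge named above can be made concrete with a small scheduling rule: QUIC offers reliable streams and unreliable datagrams, and each media element should be assigned to one or the other. The element types and the deadline threshold below are illustrative assumptions for a sketch, not part of any standard or of the proposed system.

```python
from dataclasses import dataclass

@dataclass
class MediaElement:
    kind: str         # e.g. "init_segment", "video_key", "video_delta"
    deadline_ms: int  # playout deadline relative to now (assumed known)

def pick_transport(elem: MediaElement) -> str:
    """Return 'stream' (reliable) or 'datagram' (unreliable, lower latency)."""
    # Metadata and key frames must arrive intact: use a reliable stream.
    if elem.kind in ("init_segment", "video_key"):
        return "stream"
    # Delta frames near their deadline are useless if retransmitted late:
    # send them as datagrams and accept occasional loss.
    if elem.deadline_ms < 100:  # illustrative threshold
        return "datagram"
    return "stream"

print(pick_transport(MediaElement("video_key", 50)))    # stream
print(pick_transport(MediaElement("video_delta", 40)))  # datagram
```

In the actual research, such a policy would be one of the alternatives implemented and compared inside the experimental HTTP/3 client-server framework.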
 
Outline of the research work plan
During the project's first year, the doctoral candidate will explore and gain practical experience with server and browser-related software (or libraries) for QUIC and WebTransport. Specifically, the candidate will test and investigate open-source software implementations geared toward low-delay multimedia streaming. This activity will address the creation of an experimental framework for client-server streaming of multimedia content over HTTP/3. This implementation will act as the foundation for testing, analyzing, and comparing different cutting-edge protocols under various practical scenarios, such as network conditions and media bitrate, among others. The expected outcome of this initial investigation is to produce a conference publication to present the research framework to the community and to facilitate subsequent engagement with the international research groups working on the topic.
In the second year, building on the knowledge already present in the research group and on the candidate's background, new experiments for i) bitrate adaptability to the time-varying network condition, ii) quality/delay trade-offs, iii) scalability support by means of relay nodes or CDNs, will be implemented, simulated, and tested in laboratory to analyze their performance and ability to provide a significant reduction in end-to-end latency with respect to non HTTP/3 based solutions. These results are expected to yield other conference publications and potentially a journal publication with one or more theoretical performance models of the tested systems.
In the third year, the activity will be expanded with the contribution of a media company in order to unfold new possibilities in supporting the scalability of the ultra-low delay streaming protocol via relay nodes or CDNs. The candidate will provide assistance to the company in the experimental deployment of the new solution in an industrially relevant environment. The proposed techniques will aim to produce advancements that will be targeted towards a journal publication reflecting the results that can be achieved in industrially relevant contexts.
 
List of possible venues for publications 
Possible targets for research publications (well known to the proposer) include IEEE Transactions on Multimedia, IEEE Internet Computing, ACM Transactions on Multimedia Computing Communications and Applications, Elsevier Multimedia Tools and Applications, various international conferences (IEEE ICME, IEEE INFOCOM, IEEE ICC, ACM WWW, ACM Multimedia, ACM MMSys, Packet Video).

Required skills

The candidate is expected to have a good background in multimedia coding/streaming, computer networking, and web development. A reasonable knowledge of network programming and software development in the Unix/Linux environment is appreciated.

03

Media Quality Optimization using Machine Learning on Large Scale Datasets

Proposer

Enrico Masala

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

http://media.polito.it

Summary of the proposal

Machine learning (ML) has significantly changed the way many optimization tasks are addressed. Here the focus is on optimizing the media compression and communication scenario, trying to predict users' quality of experience. Key objectives of the proposal are the creation of tools to analyze and exploit large-scale datasets using ML, so as to identify the media characteristics and features that most influence perceptual quality. Such new knowledge will be fundamental to improving existing measures and algorithms.

Research objectives and methods

In recent years, ML has been successfully employed to develop video quality estimation algorithms (see, e.g., the Netflix VMAF proposal) to be integrated into media quality optimization frameworks. However, despite these improvements, no technique can currently be considered reliable, partly because the inner workings of ML models cannot be easily and fully understood, especially when they are based on "black box" neural network models.

We aim to improve the situation by developing more reliable and explainable quality prediction models. Starting from Internet Media Group's ongoing work on modeling the behavior of single human subjects in media quality experiments, the candidate will derive a systematic approach by employing several subjectively annotated datasets (i.e., with quality scores given by human subjects). With such an approach we expect to be able to identify meaningful media quality features useful to develop new reliable and explainable quality prediction models.
However, to identify and improve such features by using machine learning models, it is also important to include large-scale datasets that are not subjectively annotated. To deal with this large amount of data efficiently, it is necessary to develop a framework comprising a set of tools that makes it easier to process both the subjective scores (given by human subjects) and the objective scores in an efficient and integrated manner, since currently every dataset has its own characteristics, quality scale, way of describing distortions, etc., which make integration difficult. Such a framework, which we will make publicly available for research purposes, will constitute the basis for reproducible research, which is increasingly important for ML techniques.
The framework will make it possible to systematically investigate existing quality prediction algorithms, finding their strengths and weaknesses, as well as to identify the most challenging content on which newer developments can be based.
Such objectives will be achieved by using both theoretical and practical approaches. The resulting insight will then be validated in practical cases by analyzing the performance of the system with simulations and experiments with industry-grade signals, leveraging cooperation with companies to facilitate the migration of the developed algorithms and technologies into prototypes that can then be effectively tested in real industrial media processing pipelines.
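One basic integration step such a framework must handle is mapping subjective scores collected on different rating scales (e.g., a 5-point ACR scale vs. a continuous 0-100 scale) onto a common range so that datasets become comparable. The sketch below is a deliberately simple linear rescaling; the scale bounds are illustrative assumptions, and the actual framework may use more sophisticated alignment.

```python
# Hedged sketch: normalize mean opinion scores (MOS) from datasets that
# use different quality scales onto a common [0, 1] range.

def normalize_mos(score: float, scale_min: float, scale_max: float) -> float:
    """Linearly map a mean opinion score onto [0, 1]."""
    return (score - scale_min) / (scale_max - scale_min)

# A score of 4 on a 1-5 ACR scale and 75 on a 0-100 continuous scale
# map to the same normalized value, making the two directly comparable.
print(normalize_mos(4.0, 1.0, 5.0))     # 0.75
print(normalize_mos(75.0, 0.0, 100.0))  # 0.75
```

Beyond the score scale, per-dataset differences in distortion descriptions and metadata would need analogous harmonization layers in the toolset.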

The workplan of the activities is detailed in the following. In the first year, the PhD candidate will first become familiar with the recently proposed ML- and AI-based techniques for media quality optimization, as well as with the characteristics of the datasets publicly available for research purposes.
In parallel, a framework will be created to efficiently process the large sets of data (especially for the video case) with potentially complex measures that might need retraining, fine-tuning or other computationally complex optimizations. It is expected to make this framework publicly available also to address the research reproducibility issues that are of growing interest in the ML community. This initial investigation and activities are expected to lead to conference publications.
In the second year, building on the framework and the theoretical knowledge already present in the research group, new media quality indicators for specific quality features will be developed, simulated, and tested to demonstrate their performance and in particular their ability to identify the root causes of the quality scores for several existing quality prediction algorithms, thus partly explaining their inner working methods in a more understandable form. In this context, potential shortcomings of such algorithms will be systematically identified. These results are expected to yield one or more journal publications.
In the third year, the activity will be expanded to propose improvements that can mitigate the identified shortcomings, as well as to create proposals for quality prediction algorithms based on the previously identified robust features. Such proposals will target journal publications.

Possible targets for research publications, well known to the proposer, include IEEE Transactions on Multimedia, Elsevier Signal Processing: Image Communication, ACM Transactions on Multimedia Computing Communications and Applications, Elsevier Multimedia Tools and Applications, various IEEE/ACM international conferences (IEEE ICME, IEEE MMSP, QoMEX, ACM MM, ACM MMSys).

The proposer is actively collaborating with the Video Quality Experts Group (VQEG), an international group of experts from academia and industry that aims to develop new standards in the context of video quality. In particular the tutor is co-chair of the JEG-Hybrid project which is very interested in the activity previously described.

Required skills

The PhD candidate is expected to have: strong analytical skills; some background on ML systems; good English writing and communication skills; reasonably good ability/willingness to learn how to work with large quantities of data on remote server systems, in particular by automating the procedures with scripts, pipelines, etc.

04

Security Analysis and Automation in Smart Systems

Proposer

Fulvio Valenza

Topics

Cybersecurity

Group website

http://netgroup.polito.it

Summary of the proposal

Cyber-physical systems and their smart components are pervasive in our daily activities. Unfortunately, identifying the potential threats and issues in these systems and selecting and configuring adequate protection is challenging, given that such environments combine human, physical, and cyber aspects in the system design and implementation. This research aims to fill this gap by defining a novel, highly automated system to analyze and enforce security formally and optimally.

Research objectives and methods

The main objective of the proposed research is to improve the state of the art of security analysis and automation in novel Cyber-Physical Systems (i.e., Smart Systems), mainly focusing on the automated implementation of threat analysis and access control and defense policies. 
 
Although some methodologies and tools are available today, they support these activities only partially and with severe limitations. In particular, they leave most of the work and responsibility to the human user, who must both identify potential threats and configure adequate protection mechanisms.
 
The candidate will pursue highly automated approaches that limit human intervention as much as possible, thus reducing the risk of introducing human errors and speeding up security analysis and reconfigurations. This last aspect is essential because smart systems are highly dynamic. Moreover, if security attacks or policy violations are detected at runtime, the system should recover rapidly by reconfiguring its security promptly. Another feature that the candidate will pursue in the proposed solution is a formal approach, capable of providing formal correctness by construction. This way, high correctness confidence is achieved without needing a posteriori formal verification of the solution. Finally, the proposed approach will pursue optimization by selecting the best solution among the many possible ones.
 
In this work, the candidate will exploit the results and the expertise recently achieved by the proposer's research group in the related field of network security automation. Although there are significant differences between the two application fields, there are also some similarities, and the underlying expertise on formal methods held by the group will be fundamental in the candidate's research work. If successful, this research work can have a high impact because improving automation can simplify and improve the quality of the verification and reconfigurations in cyber-physical systems, which are crucial for our society nowadays.
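To make the automation goal above concrete, one of the checks such a system would perform can be sketched as a conflict detector over access-control rules: contradictory rules are flagged before deployment instead of being left to a human administrator. The rule model (subject, object, action, decision) and the wildcard semantics are deliberately minimal illustrative assumptions, not the formal model the research will develop.

```python
from itertools import combinations

def find_conflicts(rules):
    """Return pairs of rules that can match the same request but disagree.
    Each rule is (subject, object, action, decision); '*' is a wildcard."""
    def overlaps(a, b):
        # Two rules overlap if every field matches literally or via wildcard.
        return all(x == y or "*" in (x, y) for x, y in zip(a[:3], b[:3]))
    return [(r1, r2) for r1, r2 in combinations(rules, 2)
            if overlaps(r1, r2) and r1[3] != r2[3]]

# Illustrative smart-system policy: rule 2 contradicts rule 1.
rules = [
    ("operator", "plc-1", "write", "allow"),
    ("*",        "plc-1", "write", "deny"),
    ("operator", "hmi-2", "read",  "allow"),
]
print(len(find_conflicts(rules)))  # 1
```

A correctness-by-construction approach would go further, generating only rule sets for which such conflicts are impossible by design.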
 
 The research activity will be organized in three phases:
Phase 1 (1st year): The candidate will analyze and identify the main issues and limitations of the recent methodologies for threat modeling and analysis in cyber-physical systems.  Also, the candidate will study the state-of-the-art literature on security automation and optimization of cyber-physical systems environment, with particular attention to formal approaches for modeling and configuring security properties and devices. 
Subsequently, with the tutor's guidance, the candidate will start identifying and defining new approaches for defining novel threat models and analysis processes and automating and enforcing access control and isolation mechanisms in smart systems. Some preliminary results are expected to be published at this phase's end (e.g., a conference paper). During the first year, the candidate will also acquire the background necessary for the research. This will be done by attending courses and by personal study.
 
Phase 2 (2nd year): The candidate will consolidate the proposed approaches, fully implement them, and conduct experiments with them, e.g., to study their correctness, generality, and performance. In this year, particular emphasis will be given to the identified use cases, properly tuning the developed solutions to real scenarios. The results of this consolidated work will also be submitted for publication, aiming at least at a journal publication.
 
Phase 3 (3rd year): based on the results achieved in the previous phase, the proposed approach will be further refined to improve its scalability, performance, and applicability (e.g., different security properties and strategies will be considered), and the related dissemination activity will be completed.
 
 The contributions produced by the proposed research can be published in conferences and journals belonging to the areas of cybersecurity (e.g. IEEE S&P, ACM CCS, NDSS, ESORICS, IFIP SEC, DSN, ACM Transactions on Information and System Security, or IEEE Transactions on Secure and Dependable Computing), and applications (e.g. IEEE Transactions on Industrial Informatics or IEEE Transactions on Vehicular Technology).

Required skills

In order to successfully develop the proposed activity, the candidate should have a good background in cybersecurity (especially in network security), and good programming skills. Some knowledge of formal methods can be useful, but it is not required: the candidate can acquire this knowledge and related skills as part of the PhD Program, by exploiting specialized courses.

05

Local energy markets in citizen-centered energy communities

Proposer

Edoardo Patti

Topics

Software engineering and Mobile computing, Parallel and distributed systems, Quantum computing, Computer architectures and Computer aided design

Group website

www.eda.polito.it

Summary of the proposal

A smart citizen-centric energy system is at the centre of the energy transition. Energy communities will enable citizens to participate actively in local energy markets by exploiting new digital tools. Citizens will need to understand how to interact with smart energy systems, novel digital tools and local energy markets. Thus, new complex socio-techno-economic interactions will take place in such smart systems which need to be analyzed and simulated to evaluate possible future impacts. For this purpose, a novel co-simulation framework is needed, which combines agent-based modelling techniques with external simulators of the grid and energy sources.

Research objectives and methods

The diffusion of distributed (renewable) energy sources poses new challenges in the underlying energy infrastructure, e.g., distribution and transmission networks and/or micro (private) electric grids. The optimal, efficient and safe management and dispatch of electricity flows among different actors (i.e., prosumers) is key to supporting the diffusion of the distributed energy sources paradigm. The goal of the project is to explore different corporate structures and billing and sharing mechanisms inside energy communities. For instance, the use of smart energy contracts based on Distributed Ledger Technology (blockchain) for energy management in local energy communities will be studied. A testbed comprising physical hardware (e.g., smart meters) connected in the loop with a simulated energy community environment (e.g., a building or a cluster of buildings) exploiting different Renewable Energy Sources (RES) and energy storage technologies will be developed and tested during the three-year program. Hence, the research will focus on the development of agents capable of describing:
- the final customer/prosumer beliefs, desires, intentions and opinions;
- the local energy market, where prosumers can trade their energy and/or flexibility;
- the local system operator, which has to guarantee grid reliability.
All the software entities will be coupled with external simulators of the grid and energy sources in a plug-and-play fashion. Hence, the overall framework has to be able to work in a co-simulation environment, with the possibility of performing hardware in the loop. The final outcome of this research will be an agent-based modelling tool that can be exploited for:
- planning the evolution of future smart multi-energy systems by taking into account the operational phase;
- evaluating the effect of different policies and related customer satisfaction;
- evaluating the diffusion of technologies and/or energy policies under different regulatory scenarios;
- evaluating new business models for energy communities and aggregators.
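A core building block of such a local energy market is the clearing mechanism that matches prosumer bids and offers. The sketch below implements a simple uniform-price double auction; all prices, quantities, and the midpoint pricing rule are illustrative assumptions, since the actual market designs to be studied are an open research question of this proposal.

```python
# Hedged sketch: uniform-price double auction clearing prosumer bids/offers.

def clear_market(bids, offers):
    """bids/offers: lists of (price_eur_per_kwh, quantity_kwh) tuples.
    Returns (clearing_price, traded_kwh); the price is the midpoint of the
    marginal matched bid/offer pair, or None if nothing trades."""
    bids = sorted(bids, key=lambda b: -b[0])     # buyers, highest price first
    offers = sorted(offers, key=lambda o: o[0])  # sellers, lowest price first
    traded, price = 0.0, None
    i = j = 0
    b_left = bids[0][1] if bids else 0.0
    o_left = offers[0][1] if offers else 0.0
    # Match while the best remaining bid still meets the best remaining offer.
    while i < len(bids) and j < len(offers) and bids[i][0] >= offers[j][0]:
        q = min(b_left, o_left)
        traded += q
        price = (bids[i][0] + offers[j][0]) / 2
        b_left -= q
        o_left -= q
        if b_left == 0:
            i += 1
            b_left = bids[i][1] if i < len(bids) else 0.0
        if o_left == 0:
            j += 1
            o_left = offers[j][1] if j < len(offers) else 0.0
    return price, traded

# Two buyers and two sellers: 2.0 kWh trade; the marginal pair sets the price.
price, qty = clear_market(bids=[(0.30, 2.0), (0.20, 1.0)],
                          offers=[(0.10, 1.5), (0.25, 2.0)])
print(round(price, 3), qty)  # 0.275 2.0
```

In the envisioned framework, each prosumer agent would generate its bids and offers from its beliefs, desires and intentions, and the market agent would run a clearing step like this one at every simulated interval.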
 
During the 1st year, the candidate will study state-of-the-art agent-based modelling tools in order to identify the best available solution for large-scale smart energy system simulation in distributed environments. Furthermore, the candidate will review the state of the art in prosumer/aggregator/market modelling in order to identify the challenges and possible innovations. Moreover, the candidate will review possible corporate structures and billing and sharing mechanisms of energy communities. Finally, he/she will start the design of the overall platform, beginning with the identification and definition of requirements.
During the 2nd year, the candidate will complete the design phase and start the implementation of the agent intelligence. Furthermore, he/she will start to integrate agents and simulators in order to create the first beta version of the tool.
During the 3rd year, the candidate will finalize the overall platform and test it in different case studies and scenarios in order to show the effects of the different corporate structures and billing and sharing mechanisms in energy communities.
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Evolutionary Computation
- IEEE Transactions on Control of Network Systems
- Environmental Modelling & Software
- JASSS
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferable in Python). Frameworks for Multi Agent Systems Development (preferable). Development in web environment (e.g. REST web services). Computer Networks

06

Simulation and Modelling of V2X connectivity with traffic simulation

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

www.eda.polito.it

Summary of the proposal

The development of novel ICT solutions in smart grids has opened new opportunities to foster novel services for energy management and savings in all end-use sectors, with particular emphasis on Electric Vehicle connectivity services such as demand flexibility. Thus, there will be a strong interaction among transportation, traffic trends and energy distribution systems. New simulation tools are needed to evaluate the impact of Electric Vehicles on the grid by considering citizens' behaviours.

Research objectives and methods

This research aims at developing novel simulation tools for smart city/smart grid scenarios that exploit the Agent-Based Modelling (ABM) approach to evaluate novel strategies to manage V2X connectivity with traffic simulation. The candidate will develop an ABM simulator that will provide a realistic virtual city where different scenarios will be executed. The ABM should be based on real data, demand profiles and traffic patterns. Furthermore, the simulation framework should be flexible and extendable so that i) it can be improved with new data from the field; ii) it can be interfaced with other simulation layers (i.e. physical grid simulators, communication simulators); iii) it can interact with external tools executing real policies (such as energy aggregation). This simulator will be a useful tool to analyse how V2X connectivity and the associated services impact both social behaviours and traffic. It will also help in understanding the impact of new actors and companies (e.g., sharing companies) on both the marketplace and society, again by analysing social behaviours and traffic conditions. In a nutshell, the ABM simulator will simulate both traffic variation and the possible advantages of V2X connectivity strategies in a smart grid context. It will be designed and developed to span different spatial-temporal resolutions. All the software entities will be coupled with external simulators of the grid and energy sources in a plug-and-play fashion, ready to be integrated with external simulators and platforms. This will enhance the resulting ABM framework, also unlocking hardware-in-the-loop features.
The outcomes of this research will be an agent-based modelling tool that can be exploited for:
- Simulating V2X connectivity considering traffic conditions
- Evaluating the effect of different policies and related customer satisfaction
- Evaluating the diffusion and acceptance of demand flexibility strategies
- Evaluating new business models for future companies and services
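The plug-and-play ABM idea described above can be sketched in a few lines: each agent applies a simple charging rule to a price signal, and the simulator aggregates the resulting grid load per hour. This is only an illustrative sketch; the behaviour rule, tariff, and all numbers below are invented, and a real study would use a dedicated multi-agent framework with calibrated demand profiles.

```python
import random

class EVAgent:
    """A single EV owner with a hypothetical charging-decision rule."""
    def __init__(self, rng, battery_kwh=40.0):
        self.soc = rng.uniform(0.2, 0.9)   # state of charge, 0..1
        self.battery_kwh = battery_kwh

    def step(self, price):
        """Charge when SoC is low, or when the price signal is attractive."""
        if self.soc < 0.3 or (price < 0.15 and self.soc < 0.8):
            charged = min(0.1, 1.0 - self.soc)   # at most 10% of capacity per step
            self.soc += charged
            return charged * self.battery_kwh     # kWh drawn from the grid
        return 0.0

def simulate(n_agents=100, hours=24, seed=42):
    """Return the aggregate charging load profile (kWh per hour)."""
    rng = random.Random(seed)
    agents = [EVAgent(rng) for _ in range(n_agents)]
    # invented day/night tariff acting as the external policy signal
    prices = [0.25 if 8 <= h <= 20 else 0.10 for h in range(hours)]
    return [sum(a.step(prices[h]) for a in agents) for h in range(hours)]

load = simulate()
print(load[:4])
```

In a full framework, the `step` call would instead exchange state with external grid and traffic simulators in a plug-and-play fashion.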
 
During the 1st year, the candidate will study the state of the art of existing agent-based modelling tools to identify the best available solution for large-scale traffic simulation in distributed environments. Furthermore, the candidate will review the state of the art of V2X connectivity to identify challenges and possible innovations. Moreover, the candidate will review Artificial Intelligence algorithms for simulating traffic conditions and variations, and for estimating EV flexibility and users' preferences. Finally, he/she will start the design of the overall ABM framework and algorithms, beginning with the identification and definition of requirements.
 
During the 2nd year, the candidate will complete the design phase, start the implementation of the agents' intelligence, and test the first version of the proposed solution.
 
During the 3rd year, the candidate will finalize the overall ABM framework and AI algorithms and test them in different case studies and scenarios to assess the impact of V2X connection strategies and novel business models.
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Evolutionary Computation
- IEEE Transactions on Control of Network Systems
- Environmental Modelling and Software
- JASSS
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferably in Python),
Frameworks for Multi-Agent Systems Development (preferable),
Development in web environments (e.g. REST web services),
Computer Networks

07

Machine Learning techniques for real-time State-of-Health estimation of Electric Vehicles' batteries

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing, Computer architectures and Computer aided design

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research proposal aims at studying novel software solutions based on Machine Learning (ML) techniques to estimate the State-of-Health (SoH) of batteries in Electric Vehicles (EVs) in near-real-time. This research area has gained strong interest in recent years as the number of EVs is constantly rising. Knowing the SoH can unlock different strategies i) to reuse EVs' batteries in other contexts, e.g. stationary energy storage systems in Smart Grids, or ii) to recycle them.

Research objectives and methods

In recent years, the number of Electric Vehicles (EVs) has increased significantly, and it is expected to grow further in the upcoming years. Due to the use of high-value materials, there is a strong economic, environmental and political interest in implementing solutions to recycle EV batteries, for example by reusing them in stationary applications as energy storage systems in Smart Grids. To this end, novel tools are needed to estimate the battery State-of-Health (SoH), i.e. a measure of the remaining battery capacity, in near-real-time. Currently, SoH is determined by bench discharging tests that take several hours, making the process time-consuming and expensive.
 
The objective of this Ph.D. proposal consists of the design and development of models based on Machine Learning (ML) techniques that will exploit both synthetic and real-world datasets. The synthetic dataset is needed to train and test a generic ML model suitable for any EV, independently of a specific brand and/or model. The real-world dataset, obtained by monitoring real EVs, is instead needed to fine-tune the ML models, for example by applying transfer learning techniques, customizing them progressively to the specific brand and model of the EV being monitored.
 
During the three years of the Ph.D., the research activity will be divided into four phases:
- Study and analysis of both state-of-the-art solutions and datasets of real-world EV monitoring.
- Design and development of a realistic simulator of an EV fleet to generate a synthetic yet realistic dataset. Starting from both datasheet information of different EVs (in terms of brand and model) and information provided by the Italian National Institute of Statistics (ISTAT), the simulator will reproduce routes that differ in length, altitude and travel speed, each impacting battery wear differently, thus making the resulting dataset realistic and heterogeneous.
- Design and development of ML-based models, trained and tested on the synthetic dataset, to estimate the SoH of EV batteries.
- Application of transfer learning techniques to the ML-based models (from phase 3) to fine-tune them on the datasets of real-world EV monitoring (from phase 1).
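The train-on-synthetic, fine-tune-on-real workflow above can be sketched as follows, under the strong and purely illustrative assumption of a linear SoH model over three hand-picked features; the feature set, coefficients, noise levels, and "real-world" drift below are all invented. A generic model is fitted on a large synthetic fleet, then warm-started gradient descent adapts it to a small dataset whose ageing behaviour differs.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, w_true):
    """Hypothetical features, normalised to [0,1]: cycles, depth-of-discharge, temperature."""
    X = rng.uniform(0.0, 1.0, size=(n, 3))
    y = 1.0 - X @ w_true + rng.normal(0.0, 0.005, n)   # SoH as a capacity fraction
    return X, y

def mse(X, y, w, b):
    return float(np.mean((X @ w + b - y) ** 2))

def fine_tune(X, y, w, b, lr=0.1, steps=500):
    """Plain gradient descent on MSE, warm-started from the pre-trained weights."""
    for _ in range(steps):
        err = X @ w + b - y
        w = w - lr * (X.T @ err) / len(y)
        b = b - lr * err.mean()
    return w, b

# 1) pre-train a generic model on a large synthetic fleet (closed-form least squares)
Xs, ys = make_dataset(5000, np.array([0.20, 0.05, 0.02]))
A = np.hstack([Xs, np.ones((len(ys), 1))])
coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
w0, b0 = coef[:3], coef[3]

# 2) fine-tune on a small "real-world" dataset whose ageing behaviour differs
Xr, yr = make_dataset(100, np.array([0.30, 0.08, 0.01]))
w1, b1 = fine_tune(Xr, yr, w0.copy(), b0)
print(mse(Xr, yr, w0, b0), mse(Xr, yr, w1, b1))
```

The same warm-start pattern carries over to neural SoH models, where only the last layers would typically be re-trained on the real fleet.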
 
Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Vehicular Technology
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Industry Applications
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- ACM e-Energy
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferable in Python). Knowledge of Machine Learning and Neural Networks. Knowledge of frameworks to develop models based on Machine Learning and Neural Networks. Knowledge of development of Internet of Things Applications

08

Multi-Device Programming for Artistic Expression

Proposer

Luigi De Russis

Topics

Computer graphics and Multimedia, Software engineering and Mobile computing

Group website

https://elite.polito.it

Summary of the proposal

Media artists use smartphones and IoT devices as material for creative exploration. However, some of them do not code and have little interest in learning to. In addition, programming artworks has characteristics that differ from traditional coding.

This Ph.D. proposal aims to extend our comprehension of artists' needs for creative coding through the design, implementation, and evaluation of toolkits that help them effectively realize code-based artworks across multiple devices and media.

Research objectives and methods

The recent availability of smartphones, AR/VR headsets, IoT-enabled devices, and microcontroller kits creates new opportunities for creative explorations for media artists and designers. The field of creative coding emphasizes the goal of expression rather than function, and creative coders combine computational skills with creative insight. In some cases, artists and designers are interested in creative coding but lack the knowledge or programming skills to benefit from the offered possibilities.

The main research objective of this Ph.D. proposal is to extend our comprehension of the needs of media artists and designers for creative coding across multiple devices and media. To reach this objective, the Ph.D. student will study, design, develop, and evaluate proper models and novel technical solutions (e.g., toolkits and tools) for supporting creative coders. The proposal envisions focusing on both creative coders and end-user programmers. The work will start from the needs of the stakeholders (i.e., artists and designers), complemented by the existing literature. Using a participatory approach, the Ph.D. student will keep the stakeholders involved in the various phases of the work.

In particular, the Ph.D. research activity will focus on the following: - Study of the creative coding field, stakeholders' needs and current tools, and HCI techniques able to support the identification of suitable requirements and the creation of technical solutions to effectively support creative exploration and coding. - Creation of a theoretical framework to satisfy the identified needs and requirements, able to adapt to different media, devices, and skills. For instance, it can include end-user personalization as a way to allow end-users to create code-based artifacts and AI techniques to support the creation of programs. - Development of a toolkit and related tools to experiment with the theoretical framework's facets. The creation and evaluation of the tools will serve as the validation for the framework.

The work plan will be organized according to the following four phases, partially overlapped:
- Phase 1 (months 0-6): literature review about creative coders and coding; focus groups and interviews with designers and media artists of various skills; definition and development of a set of use cases and promising strategies to be adopted.
- Phase 2 (months 6-18): research, definition, and experimentation of an initial version of the theoretical framework and a first toolkit for creative coders, starting from the outcome of the previous phase. In this phase, the focus will be on the most common target devices, i.e., the smartphone and the PC, with the design, implementation, and evaluation of suitable tools.
- Phase 3 (months 12-24): research, definition, and experimentation of a second toolkit (or an evolution of the previous one) for novice creative coders and end users. Such a toolkit will use artificial intelligence and machine learning to help during the coding process, with the subsequent design, implementation, and evaluation of suitable tools, and an extension of the framework.
- Phase 4 (months 24-36): extension and generalization of the previous phases to include additional target devices and consolidate the theoretical framework; evaluation of the toolkit and the tools in real settings with a large number of artists.

It is expected that the results of this research will be published in some of the top conferences in the Human-Computer Interaction field (e.g., ACM CHI, ACM CSCW, ACM C&C, and ACM IUI). One or more journal publications are expected on a subset of the following international journals: ACM Transactions on Computer-Human Interaction, ACM Transactions on Interactive Intelligent Systems, and International Journal of Human Computer Studies.

Required skills

A candidate interested in the proposal should (i) be able to critically analyze and evaluate existing research, as well as gather and interpret data from various sources; (ii) be able to communicate research findings through writing and presenting; (iii) have a solid foundation in computer science/engineering and possess the relevant technical skills; (iv) have a good understanding of HCI research methods, especially around needfinding.

09

Digital Wellbeing By Design

Proposer

Alberto Monge Roffarello

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

https://elite.polito.it/

Summary of the proposal

Tools for digital wellbeing allow users to self-control their habits with distractive apps and websites. Yet, they are ineffective in the long term, as tech companies still adopt attention-capture designs, e.g., infinite scroll, that compromise users' self-control. This PhD proposal investigates innovative strategies for designers and end users to consider digital wellbeing in user interface design, recognizing the need to foster healthy digital experiences without depending on external support.

Research objectives and methods

In today's attention economy, tech companies compete to capture users' attention, e.g., by introducing visual features and functionalities - from guilty-pleasures recommendations to content autoplay - that are purposely designed to maximize metrics such as daily visits and time spent. These Attention-Capture Damaging Patterns (ACDPs) [1] compromise users' sense of agency and self-control, ultimately undermining their digital wellbeing.
To date, the HCI research community has traditionally considered digital wellbeing an end-user responsibility, enabling users to self-monitor their usage of apps and websites through tools for digital self-control. Nevertheless, studies have shown that these external interventions - especially those that are overly dependent on users' self-monitoring capabilities - are often ineffective in the long term.
Taking a complementary perspective, the main research objective of this PhD proposal is to explore how to make digital wellbeing a top design goal in user interface design, establishing a fruitful collaboration between designers and end users and recognizing the critical necessity to foster healthy online experiences and address the potential negative impacts of ACDPs on users' mental health without depending on external support. The PhD student will study, design, develop, and evaluate proper models and novel technical solutions (e.g., tools and frameworks) to support designers and end users in creating user interfaces that preserve and respect user attention by design, starting from the relevant scientific literature and performing studies involving designers and end users. In particular, possible areas of investigation are:
- Innovating frameworks that define and educate designers on novel, theoretically grounded processes that prioritize digital wellbeing. These processes will build upon existing design guidelines and best practices, providing clear guidance on their application and giving tech companies and designers actionable insights to transition away from the contemporary attention economy.
- Creating a validated taxonomy of positive design patterns that respect and preserve the user's attention. These patterns will promote users' agency by design and support reflection by offering the same functionality as ACDPs.
- Developing design tools to support designers in prioritizing users' digital wellbeing in real time. Using artificial intelligence and machine learning models, these tools may detect when a designed interface contains ACDPs and/or fails to address digital wellbeing guidelines, suggesting positive design alternatives.
- Developing strategies that empower end users to actively participate in designing technology that prioritizes digital wellbeing. This may include the development of platforms for co-designing user interfaces, as well as mechanisms for evaluating existing user interfaces against ACDPs and giving feedback.
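As a toy illustration of the ACDP-detection idea mentioned above, the sketch below applies hand-written rules to a hypothetical declarative UI specification. Every rule, property name, and suggestion here is invented for illustration; the proposal envisions learned models rather than fixed rules.

```python
# Hypothetical rules: each maps an ACDP name to a predicate over a UI component.
ACDP_RULES = {
    "infinite_scroll": lambda c: c.get("type") == "feed" and not c.get("paginated", False),
    "content_autoplay": lambda c: c.get("autoplay", False),
    "disguised_ad": lambda c: c.get("type") == "ad" and not c.get("labelled", False),
}

# Positive design alternatives suggested for each detected pattern.
SUGGESTIONS = {
    "infinite_scroll": "replace with explicit pagination or a 'load more' button",
    "content_autoplay": "require an explicit play action from the user",
    "disguised_ad": "label sponsored content clearly",
}

def audit(ui_spec):
    """Return a list of (component_id, pattern, suggestion) findings."""
    findings = []
    for component in ui_spec:
        for pattern, rule in ACDP_RULES.items():
            if rule(component):
                findings.append((component["id"], pattern, SUGGESTIONS[pattern]))
    return findings

# Invented example interface with two problematic components.
demo_ui = [
    {"id": "home_feed", "type": "feed", "paginated": False},
    {"id": "story_video", "type": "video", "autoplay": True},
    {"id": "settings", "type": "menu"},
]
for cid, pattern, hint in audit(demo_ui):
    print(f"{cid}: {pattern} -> {hint}")
```

A learned detector would replace the predicates with a model scoring interface mockups, but the audit-and-suggest loop would look the same.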
The proposal will adopt a human-centered approach and build upon the existing scientific literature from different interdisciplinary domains, mainly Human-Computer Interaction. The work plan will be organized according to the following four phases, partially overlapped:
- Phase 1 (months 0-6): literature review at the intersection of digital wellbeing, design, and ACDPs; focus groups and interviews with designers, practitioners, and end users; definition of a set of use cases and promising strategies to be adopted.
- Phase 2 (months 3-24): research, definition, and evaluation of design frameworks and models of positive design patterns. Here, the focus will be on the design of user interfaces for the most commonly used devices, i.e., the smartphone and the PC.
- Phase 3 (months 12-36): research, definition, and experimentation of design tools to support designers in prioritizing users' digital wellbeing in real time, integrating the frameworks, design guidelines, and positive design patterns explored and defined in the previous phases.
- Phase 4 (months 24-36): extension and possible generalization of the previous phases to include additional devices; evaluation of the proposed solutions in real settings over long periods of time; development and preliminary evaluation of strategies for end-user collaboration.
It is expected that the results of this research will be published in some of the top conferences in the Human-Computer Interaction field (e.g., ACM CHI, ACM CSCW, and ACM IUI). Journal publications are expected on a subset of the following international journals: ACM Transactions on Computer-Human Interaction, ACM Transactions on the Web, ACM Transactions on Interactive Intelligent Systems, and International Journal of Human Computer Studies. 
[1] A. Monge Roffarello, K. Lukoff, L. De Russis, Defining and Identifying Attention Capture Deceptive Designs in Digital Interfaces, CHI 2023, https://dl.acm.org/doi/abs/10.1145/3544548.3580729

Required skills

A candidate interested in the proposal should ideally: be able to critically analyze and evaluate existing research, as well as gather and interpret data from various sources; be able to communicate research findings through writing and presenting; have a solid foundation in computer science/engineering and possess relevant technical skills; have a good understanding of HCI research methods, especially around needfinding.

10

Management Solutions for Autonomous Networks

Proposer

Guido Marchetto

Topics

Parallel and distributed systems, Quantum computing

Group website

http://www.netgroup.polito.it

Summary of the proposal

Next-Generation (NextG) networks are expected to support advanced and critical services, incorporating computation, communication, and intelligent decision making. 
This activity aims to design and implement novel mechanisms using supervised and unsupervised (distributed) learning, within software-defined networks to serve the needs of data-driven infrastructure management decisions. Moreover, we aim to design novel in-band network telemetry mechanisms to increase the accuracy of these decisions.

Research objectives and methods

Two research questions (RQ) guide the proposed work: 

RQ1: How can we design, and implement on local and larger-scale testbeds, effective transport and routing network protocols that integrate the network stack at different scopes using recent advances in supervised and unsupervised learning? 

RQ2: To scale the use of machine learning-based solutions in network management, what are the most efficient distributed machine learning architectures that can be implemented at the network edge layer? 

The final target of the research work is to answer these questions, also by evaluating the proposed solutions on small-scale network emulators or large-scale virtual network testbeds, using a few applications including virtual and augmented reality, precision agriculture, and haptic wearables. In essence, the main goals are to provide innovation in network monitoring, network adaptation, and network resilience, using centralized and distributed learning integrated with edge computing infrastructures.

Both vertical and horizontal integration will be considered. By vertical integration, we mean considering learning problems that integrate states across network hardware and software, as well as across the network stack at different scopes. For example, the candidate will design data-driven algorithms for congestion control problems to address the tussle between in-network and end-to-end congestion notifications. By horizontal integration, we mean using states from local (e.g., physical layer) and wide-area (e.g., transport layer) scopes as input for the learning-based algorithms. The data needed by these algorithms are carried to the learning actor by means of newly defined in-band network telemetry mechanisms. Aside from supporting resiliency through vertical integration, solutions must offer resiliency across a wide (horizontal) range of network operations: from the close edge, i.e., near the device, to the far edge, with the design of secure, data-centric (federated) resource allocation algorithms.
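The data-driven path/congestion management idea above can be sketched with a toy bandit-style learner: latency samples, standing in for in-band telemetry reports, feed a running estimate per candidate path, and an epsilon-greedy policy converges on the lowest-latency path. The path set, latency figures, and learning rates are invented; this is an illustrative sketch, not the proposal's actual design.

```python
import random

def choose_path(q, eps, rng):
    """Epsilon-greedy selection over candidate paths (lower estimate = better)."""
    if rng.random() < eps:
        return rng.randrange(len(q))
    return min(range(len(q)), key=lambda i: q[i])

def run(rounds=5000, eps=0.1, lr=0.1, seed=1):
    rng = random.Random(seed)
    true_latency = [30.0, 12.0, 20.0]   # hypothetical per-path mean latencies (ms)
    q = [0.0, 0.0, 0.0]                 # running latency estimates from telemetry
    picks = [0, 0, 0]
    for _ in range(rounds):
        i = choose_path(q, eps, rng)
        observed = true_latency[i] + rng.gauss(0.0, 2.0)  # noisy telemetry sample
        q[i] += lr * (observed - q[i])                    # exponential moving average
        picks[i] += 1
    return q, picks

q, picks = run()
print(picks, [round(x, 1) for x in q])
```

Initializing the estimates at zero (optimistically below every true latency) forces each path to be sampled before the policy settles, a common trick in bandit-style exploration.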

The research activity will be organized in three phases:

Phase 1 (1st year): the candidate will analyze state-of-the-art solutions for network management, with particular emphasis on knowledge-based network automation techniques. The candidate will then define detailed guidelines for the development of architectures and protocols suitable for the automatic operation and configuration of NextG networks, with particular reference to edge infrastructures. Specific use cases will also be defined during this phase (e.g., in virtual reality). Such use cases will help identify ad-hoc requirements and will capture the peculiarities of specific environments. With these use cases in mind, the candidate will also design and implement novel solutions to deal with the partial availability of data within distributed edge infrastructures. This work is expected to result in conference publications. 

Phase 2 (2nd year): the candidate will consolidate the approaches proposed in the previous year, focusing on the design and implementation of mechanisms for vertical and horizontal integration of supervised and unsupervised learning with network virtualization. Network, and computational resources will be considered for the definition of proper allocation algorithms. All solutions will be implemented and tested. Results will be published, targeting at least one journal publication.

Phase 3 (3rd year): the consolidation and experimentation of the proposed approach will be completed. Particular emphasis will be given to the identified use cases, properly tuning the developed solutions to real scenarios. Major importance will be given to the quality of service offered, with specific emphasis on minimizing latencies to enable real-time network automation for critical environments (e.g., telehealth systems, precision agriculture, or haptic wearables). Further conference and journal publications are expected.

The research activity is in collaboration with Saint Louis University, MO, USA, also in the context of the NSF grant #2201536 "Integration-Small: A Software-Defined Edge Infrastructure Testbed for Full-stack Data-Driven Wireless Network Applications". Furthermore, it is related to active collaborations with Futurewei Inc. and Tiesse SpA, both interested in the covered topics.

The contributions produced by the proposed research can be published in conferences and journals belonging to the areas of networking and machine learning (e.g. IEEE INFOCOM, ICML, ACM/IEEE Transactions on Networking, or IEEE Transactions on Network and Service Management) and cloud/fog computing (e.g. IEEE/ACM SEC, IEEE ICFEC, IEEE Transactions on Cloud Computing), as well as in publications related to the specific areas that could benefit from the proposed solutions (e.g., IEEE Transactions on Industrial Informatics, IEEE Transactions on Vehicular Technology).

Required skills

The ideal candidate has good knowledge and experience in networking and machine learning, or at least in one of the two topics. Availability for spending periods abroad (mainly but not only at Saint Louis University) is also important for a profitable development of the research topic.

11

Preserving privacy and fairness with generative AI-based synthetic data production

Proposer

Antonio Vetro'

Topics

Software engineering and Mobile computing, Data science, Computer vision and AI

Group website

https://nexa.polito.it

Summary of the proposal

Synthetic data generation is fundamental in contexts of data scarcity or limited economic resources for data collection. However, several challenges are still open in this research field, the most important being the trade-off between privacy, fairness and accuracy. The goal of this PhD proposal is to design, develop and test new generative models for synthetic data production able to preserve privacy and guarantee fairness while maintaining good levels of accuracy.

Research objectives and methods

Synthetic data generation enables the reproduction, diversification, and augmentation of real data in contexts where data is scarce and where preserving privacy is paramount. However, synthetic data may come at the cost of unrealistic synthetic populations or limited accuracy in downstream predictions and classifications. In addition, reliable techniques for reaching satisfactory trade-offs between contrasting requirements (e.g., privacy, fairness, and accuracy) are still an object of research and experimentation, as is the production of suitable dataset and model documentation.
 
The goal of this PhD proposal is to design, develop and test new generative models for synthetic data production able to preserve privacy and guarantee fairness, while maintaining acceptable levels of accuracy. The focus will be mostly on tabular data because this type of data is used in most of the applications where fairness and privacy are paramount (for example, allocating social benefits or economic resources).
 
The newly developed techniques will be compared with state-of-the-art generative models (e.g., language models, variational autoencoders, generative adversarial networks, diffusion models, self-supervised learning, etc.) and with traditional probabilistic methods for dataset generation (e.g., Bayesian networks, univariate kernel density estimation, etc.) on a variety of evaluation measures, such as: distance from the original population; differential privacy; imbalance; fairness and accuracy of models trained and tested on the generated data.
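Two of the evaluation measures mentioned above can be illustrated with minimal reference implementations: the total variation distance between a real and a synthetic marginal, and the demographic parity gap of a binary outcome across two groups. The tiny tabular datasets below are invented purely for illustration.

```python
from collections import Counter

def tv_distance(real, synth):
    """Total variation distance between two discrete (categorical) marginals."""
    p, q = Counter(real), Counter(synth)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / len(real) - q[k] / len(synth)) for k in keys)

def demographic_parity_gap(rows, group_key, outcome_key):
    """|P(outcome=1 | group=a) - P(outcome=1 | group=b)| for two groups."""
    rates = {}
    for g in set(r[group_key] for r in rows):
        members = [r for r in rows if r[group_key] == g]
        rates[g] = sum(r[outcome_key] for r in members) / len(members)
    a, b = rates.values()
    return abs(a - b)

# Invented toy data: the synthetic marginal of "sex" matches the real one
# exactly, but the fairness profile of the outcome must be checked separately.
real = [{"sex": "F", "approved": 1}, {"sex": "F", "approved": 0},
        {"sex": "M", "approved": 1}, {"sex": "M", "approved": 1}]
synth = [{"sex": "F", "approved": 1}, {"sex": "M", "approved": 1},
         {"sex": "F", "approved": 1}, {"sex": "M", "approved": 0}]

print(tv_distance([r["sex"] for r in real], [r["sex"] for r in synth]))
print(demographic_parity_gap(real, "sex", "approved"))
print(demographic_parity_gap(synth, "sex", "approved"))
```

Real evaluations would extend these to joint distributions, differential-privacy accounting, and accuracy of downstream models, but the interface stays the same: a scalar measure per (real, synthetic) pair.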
 
In addition, given the rising importance of auditable algorithms in the European legislative context, two further aspects will be investigated: i) how to properly document synthetic datasets and the models that generated them; ii) how to generate suitable synthetic data for auditing black box systems against discrimination.
 
Considering the different (and mostly contrasting) evaluation dimensions mentioned above, the high-level research questions are:
 
RQ 1. How can state-of-the-art generative models for synthetic data generation be improved so that they preserve privacy and fairness while maintaining acceptable levels of accuracy?
 
RQ 2. What is the trade-off (between privacy, fairness, and accuracy) reached by the newly developed techniques in comparison with:
-      2.1 state-of-the-art generative AI models?
-      2.2 traditional probabilistic methods?
 
RQ 3. Is it possible to facilitate the auditing of black box systems
-      3.1 by documenting synthetic datasets and their generative models?
-      3.2 by producing exhaustive and realistic synthetic data for testing black box systems against discrimination?
 
Datasets used in the fair machine learning community will be firstly used as starting point, with possible integrations from incoming projects and/or industrial collaborations.
 
A possible workplan is the following.
 
Year 1:
RQ 1. - Task 1.1) Analysis of state of the art about: fairness datasets, measures for evaluation (privacy, balance, fairness, distance from real population, etc), synthetic data generation (with traditional probabilistic methods and generative AI).
 
RQ 1. - Task 1.2) Design and develop new generative models, based on existing ones.
 
RQ 2. - Task 2.1) Design and prepare experimentation according to the measures and datasets selected in task 1.1.
 
RQ 2. - Task 2.2) Run experiments, collect data and analyze results.
 
Year 2:
RQ 2. - Task 2.3) New experiments based on results of Task 2.2.

RQ 3. - Task 3.1) State of the art of existing documentation suites for datasets (e.g., datasheets) and models (e.g., model cards); collect and organize online software systems that are suitable candidates for black box testing against discrimination (e.g., insurance, online advertising, etc.).
 
RQ 3. - Task 3.2) Case study identification: select system for black box auditing, and produce suitable synthetic data sets with the model(s) developed in previous RQ.
 
Year 3:
RQ 3. - Task 3.3) Development of a new documentation suite for documenting synthetic datasets and their generative models, and application to the case study, including evaluation with users (to be identified).

RQ 3. - Task 3.4) Quantitative analysis of discrimination in selected existing software systems using the synthetic data generated.
 
Dissemination of results is a cross-cutting activity. Possible venues for publication are:
 
Journals:
-      IEEE Transactions on Software Engineering
-      ACM Transactions on Software Engineering and Methodology
-      Journal of Machine Learning Research
-      Empirical Software Engineering
-      ACM Transactions on Information Systems
-      European Journal of Information Systems
-      Journal of Systems and Software
-      Software X
-      ACM Journal of Responsible Computing
 
Conferences:
-      ACM Conference on Fairness, Accountability, and Transparency
-      AAAI/ACM Conference on AI, Ethics, and Society
-      International Conference on Internet Technologies & Society
-      EPIA Conference on Artificial Intelligence
-      ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
-      International Conference on Frontiers of Artificial Intelligence, Ethics, and Multidisciplinary Application
-      International Conference on Software Engineering
-      Conference on Neural Information Processing Systems (NeurIPS)
 
Workshops:
-      International Workshop on Data science for equality, inclusion and well-being challenges
-      International Workshop on Equitable Data & Technology
-      Workshop on Bias and Fairness in AI

Required skills

The candidate should have:
- Basic knowledge of software testing concepts, techniques, and methodologies.
- Basic knowledge of AI techniques.
- Good knowledge of statistical methods for analyzing experimental data.
- Proficiency in data analysis techniques and tools.
- Strong programming skills.
- Basic knowledge of the problem of algorithmic bias.
- Research aptitude and curiosity to cross disciplinary boundaries.
The candidate should also possess good communication and presentation skills.

12

Digital Twin development for the enhancement of manufacturing systems

Proposer

Sara Vinco

Topics

Data science, Computer vision and AI, Controls and system engineering

Group website

https://eda.polito.it/

Summary of the proposal

Industry 4.0 has deeply changed manufacturing: enormous quantities of data allow building data-based decision-support strategies and reducing downtime and defects. Many challenges are posed by the heterogeneity and variety of data and by the construction of effective data-based analytics. This program tackles such challenges to build a virtual replica of a manufacturing system (a digital twin), e.g., targeting production lines, tire production, semiconductor manufacturing, and battery management.

Research objectives and methods

The main goal of this PhD program is the construction of a digital twin of a manufacturing system, to improve production effectiveness. A digital twin is a virtual replica of the system that exploits available technologies (Artificial Intelligence, data management and mining, Internet of Things, etc.) to enhance production automatically or through decision-support systems. While these technologies per se are well established, their application in real-life scenarios is still preliminary. Manufacturing systems indeed entail challenges such as extreme data variety and variability, protocol heterogeneity, lack of data collection infrastructures, and reduced data availability for the training of algorithms. This PhD program seeks solutions to these challenges, to enable, e.g., anomaly detection, maintenance support, and automatic optimization of the production flow. Examples of application scenarios are new-generation manufacturing systems, such as tire production, production lines, and semiconductor manufacturing. All scenarios will be investigated with the support of, and with case studies provided by, industrial and research partners such as Michelin, STMicroelectronics, and Technoprobe. 
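As a minimal illustration of one analytics task mentioned above (anomaly detection on a sensor stream), the sketch below flags readings that deviate strongly from a trailing window. The window size, threshold, and temperature data are invented for illustration; production twins would combine many such detectors with learned models.

```python
import math

def zscore_anomalies(stream, window=20, threshold=3.0):
    """Flag indices deviating more than `threshold` sigmas from the trailing window."""
    anomalies = []
    for i in range(window, len(stream)):
        hist = stream[i - window:i]
        mean = sum(hist) / window
        var = sum((x - mean) ** 2 for x in hist) / window
        std = math.sqrt(var) or 1e-9   # guard against a perfectly flat window
        if abs(stream[i] - mean) / std > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical machine-temperature readings with a small periodic ripple
# and one injected fault at index 50.
readings = [20.0 + 0.01 * (i % 5) for i in range(100)]
readings[50] = 35.0
print(zscore_anomalies(readings))
```

The same sliding-window structure generalizes to multivariate sensors by replacing the z-score with, e.g., a Mahalanobis distance or a learned reconstruction error.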
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets and case studies, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use cases, with real data from the industrial partners, and will seek solutions to the technological challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed industry-oriented conference (e.g., ETFA), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Industrial Informatics, IEEE Systems Journal, etc.).
- In the third year, the candidate will consolidate the models and approaches investigated in the second year, and possibly integrate them into a standalone digital twin framework. The candidate will also finalize this work into at least one more major journal publication, as well as into a PhD thesis to be defended at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- solid basics of linear algebra, probability, and statistics
- good communication and problem-solving skills
- some prior experience in the design and development of machine learning and deep learning architectures
- some prior knowledge/experience of manufacturing processes is a plus, but not a requirement.

13

State-of-Health diagnostic framework towards battery digital twins

Proposer

Sara Vinco

Topics

Controls and system engineering, Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

The adoption of EVs is limited by their reliance on batteries that have low energy and power densities compared to liquid fuels and are subject to aging and performance deterioration. For this reason, monitoring the battery state of charge (SoC) and state of health (SoH) is a very relevant problem. This PhD program focuses on the development of models for battery SoC and SoH, with the goal of enabling continuous monitoring of batteries and of improving their design and management throughout their lifetime.

Research objectives and methods

The main goal of this PhD program is the construction of a framework to simulate battery behavior over time, to create a virtual replica and allow the analysis of different management strategies and configurations. This will require:
- the identification, analysis and adoption of datasets (both public and private) of batteries;
- the construction of models with different levels of accuracy and different flows, e.g., based on Artificial Intelligence techniques (e.g., Physics-Informed Neural Networks, Machine Learning) and on top-down modeling techniques (e.g., circuit models);
- the definition of the monitoring architecture to be installed at the level of the Battery Management System (BMS) or in an IT infrastructure, to enable decision-support solutions, customer-facing digital twins, or other services.
All scenarios will be investigated with the support of case studies either considered as reference for the state of the art (e.g., NASA datasets) or provided by industrial partners (e.g., automotive companies).
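To illustrate the top-down, circuit-model side of such flows, the sketch below simulates a first-order Thevenin equivalent circuit (R0 plus an R1||C1 branch) with coulomb-counting SoC estimation. All parameter values and the linear open-circuit-voltage curve are illustrative assumptions, not values from any specific dataset:

```python
def simulate_battery(current_profile, dt=1.0, capacity_ah=2.0,
                     r0=0.05, r1=0.02, c1=2000.0, soc0=1.0):
    """First-order Thevenin (R0 + R1||C1) battery model with
    coulomb-counting SoC. current_profile: discharge currents [A]."""
    soc, v_rc = soc0, 0.0          # state of charge, RC-branch voltage
    socs, volts = [], []
    for i in current_profile:
        # Coulomb counting: SoC decreases with the discharged charge
        soc -= i * dt / (capacity_ah * 3600.0)
        # RC-branch dynamics, explicit Euler integration
        v_rc += dt * (i / c1 - v_rc / (r1 * c1))
        # Illustrative linear open-circuit-voltage curve
        ocv = 3.0 + 1.2 * soc
        volts.append(ocv - i * r0 - v_rc)
        socs.append(soc)
    return socs, volts

socs, volts = simulate_battery([1.0] * 600)  # 10 minutes at 1 A
```

A real framework would replace the linear OCV curve with a fitted lookup table and calibrate R0, R1, C1 against measured data; the same state-update loop is also the natural place to couple an SoH degradation model.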
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets and case studies, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with real data from the industrial partners, and will seek solutions to the technological challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed power-oriented conference (e.g., ISLPED, PATMOS), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Sustainable Computing, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone digital twin framework. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- solid basics of linear algebra, probability, and statistics
- good communication and problem-solving skills
- some prior experience in the design and development of machine learning and deep learning architectures
- some prior knowledge of energy systems/batteries is a plus, but not a requirement.

14

Modeling, simulation and validation of modern electronic systems

Proposer

Sara Vinco

Topics

Computer architectures and Computer aided design, Controls and system engineering

Group website

https://eda.polito.it/

Summary of the proposal

The current international semiconductor scenario is extremely competitive and is pushing for strong innovation advancement. This PhD program focuses on the development of modeling, simulation and validation flows of innovative systems, including not only digital functionality but also thermal and power flows, mechanical components (e.g., accelerometers) and analog subsystems (e.g., gate drivers). Research is supported by international projects and partners.

Research objectives and methods

Modern electronic systems are tightly coupled to mechanical, thermal, and power aspects that must be taken into account at design time to ensure the correct operation of the final system. Ignoring behaviors or potential faults of connected analog subsystems or mechanical actuators may indeed lead to unsafe or incorrect behaviors that prevent the operation of the design after deployment. This requires extending the traditional design, simulation and validation flows with a sensitivity to extra-functional and non-digital aspects. The main goal of this PhD program is the definition of such flows, through the adoption of open, standard technologies such as SystemC(-AMS), RISC-V, and IP-XACT, and of other technologies that fall under the Chips Act umbrella of EU research. Examples of application scenarios are smart systems such as drones, and automotive and robotics applications. All scenarios will be investigated with the support of, and with case studies provided by, industrial and research partners, such as Infineon and STMicroelectronics.
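To illustrate the kind of extra-functional coupling involved, the following is a minimal Python sketch (a stand-in for what a SystemC(-AMS) flow would express) of a lockstep co-simulation: per-step digital activity drives a power model, which feeds a first-order thermal RC network. All parameter values are illustrative assumptions:

```python
def cosimulate(activity_trace, dt=0.01, p_idle=0.1, p_act=1.5,
               r_th=20.0, c_th=0.5, t_amb=25.0):
    """Lockstep functional/extra-functional co-simulation sketch:
    activity (0/1 per step) -> power model -> thermal RC network."""
    temp = t_amb
    temps = []
    for active in activity_trace:
        power = p_act if active else p_idle          # power model [W]
        # thermal RC: C * dT/dt = P - (T - T_amb) / R
        temp += dt * (power - (temp - t_amb) / r_th) / c_th
        temps.append(temp)
    return temps

# 5 s of full activity followed by 5 s of idle: heat-up, then cool-down
temps = cosimulate([1] * 500 + [0] * 500)
```

In an actual flow the power and thermal models would be separate SystemC-AMS modules synchronized with the digital simulator, but the lockstep exchange of activity, power, and temperature per time step is the same.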
 
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start studying the extension of simulation flows to extra-functional aspects such as power or mechanics. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with support from the industrial partners, and will seek solutions to the challenge of validating systems that include heterogeneous aspects. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed EDA-oriented conference (e.g., DATE, DAC), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Computers, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly apply them to an industrial case study. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
- positive attitude to research activity and working in team
- solid programming skills 
- good communication and problem-solving skills
- some prior experience in digital design flows
- some prior knowledge/experience of analog and extra-functional domains is a plus, but not a requirement.

15

Robust AI systems for data-limited applications

Proposer

Santa Di Cataldo

Topics

Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

Artificial Intelligence is driving a revolution in many important sectors of society. Deep learning networks, and especially supervised ones such as Convolutional Neural Networks, remain the go-to approach for many important tasks. Nonetheless, training these models typically requires massive amounts of good-quality annotated data, which makes them impractical in many real-world applications. This PhD program seeks answers to such problems, targeting important use-cases in today's society (among others: Industry 4.0 and biomedical applications).

Research objectives and methods

The main goal of this PhD program is the investigation of robust AI-based decision making in data-limited situations. This includes three possible scenarios, which are typical of many important real-world applications:
- the training data is difficult to obtain, or it is available in limited quantity;
- obtaining the training data is not difficult, but it is either difficult or economically impractical to have human experts labelling the data;
- the training data/annotations are available, but the quality of such data is very poor.
Possible solutions involve different approaches, ranging from classic transfer learning and domain adaptation techniques to data augmentation with generative modelling and semi- or self-supervised learning, where the access to real data of the target application is either minimized or avoided altogether. In addition, probabilistic approaches (e.g., Bayesian inference) can help to properly quantify the uncertainty level both at training and inference time, making the decision process more robust to noisy data and/or inconsistent annotations. This research proposal aims to investigate and advance the state of the art in such areas.
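As a toy illustration of uncertainty-aware decision making, the sketch below trains a bootstrap ensemble of one-dimensional threshold classifiers on noisy synthetic data and reads ensemble disagreement as a proxy for predictive uncertainty (a simple frequentist stand-in for full Bayesian inference; the data, classifier, and thresholds are all invented for the example):

```python
import random
random.seed(0)

# Toy 1-D dataset: true label is (x > 0.5), with 15% label noise
data = []
for _ in range(200):
    x = random.random()
    noisy = random.random() < 0.15
    data.append((x, (x > 0.5) != noisy))

def fit_threshold(sample):
    """Pick the decision threshold minimising training error."""
    best_t, best_err = 0.5, len(sample)
    for t in [i / 50 for i in range(51)]:
        err = sum((x > t) != y for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Bootstrap ensemble: member disagreement ~ predictive uncertainty
ensemble = [fit_threshold(random.choices(data, k=len(data)))
            for _ in range(25)]

def predict_with_uncertainty(x):
    votes = [x > t for t in ensemble]
    p = sum(votes) / len(votes)
    return p > 0.5, min(p, 1 - p) * 2   # label, disagreement in [0, 1]
```

Points far from the noisy decision boundary get unanimous votes (low uncertainty), while points near it split the ensemble; a downstream system can then defer the ambiguous cases to a human expert.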
The outline can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with the available state-of-the-art techniques. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, representative of the three data-limited scenarios mentioned above. Stemming from the supervisors' collaborations and current research activity, these use-cases may involve Industry 4.0 applications (for example, smart manufacturing and industrial 3D printing) as well as biomedicine and digital pathology. There is some scope to shape the specific focus of such use-cases around the interests and background of the prospective student, as well as those of the various collaborators that could be involved in the project activity: research centers such as the Inter-departmental Center for Additive Manufacturing in PoliTO and the National Institute for Research in Digital Science and Technology (INRIA, France), as well as industries such as Prima Industrie, Stellantis, Avio Aero, etc. At the end of the second year, the candidate is expected to target at least a paper in a well-reputed conference in the field of applied AI, and possibly another publication in a Q1 journal of the Computer Science sector (e.g., Pattern Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Expert Systems with Applications, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone architecture. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
-    positive attitude to research activity and working in team
-    solid programming skills 
-    solid basics of linear algebra, probability, and statistics
-    good communication and problem-solving skills
-    some prior experience in the design and development of machine learning and deep learning architectures.

16

Artificial Intelligence applications for advanced manufacturing systems

Proposer

Santa Di Cataldo

Topics

Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

Industry 4.0 refers to digital technologies designed to sense, predict, and interact with production systems, to make decisions that support productivity, energy-efficiency, and sustainability. While Artificial Intelligence plays a crucial role in this paradigm, many challenges are still posed by the nature and dimensionality of the data, and by the immaturity and intrinsic complexity of some of the processes involved. The aim of this PhD program is to successfully tackle these challenges.

Research objectives and methods

The main goal of this PhD program is the investigation, design and deployment of state-of-the-art Artificial Intelligence approaches in the context of the smart factory, with special regard to new-generation manufacturing systems. These tasks include:
- quality assurance and inspection of manufactured products via heterogeneous sensor data (e.g., images from visible-range or IR cameras, time-series, etc.)
- process monitoring and forecasting
- anomaly detection
- failure prediction and maintenance planning support
While the Artificial Intelligence technologies able to address such tasks may already exist and be successfully consolidated in other real-world applications, the specific domain of manufacturing systems poses severe challenges to their effective deployment. Among others:
- the immaturity of the involved technologies
- the complexity of the underlying physical/chemical processes
- the lack of effective infrastructures for data collection, integration, and annotation
- the necessity to handle heterogeneous and noisy data from different types of sensors/machines
- the lack of annotated datasets for training supervised models
- the lack of standardized quality measures and benchmarks
This PhD program seeks solutions to these challenges, with a specific focus on new-generation manufacturing systems involving complex processes, for example Additive Manufacturing (AM) and Semiconductor Manufacturing (SM).
- AM includes many innovative 3D printing processes, which are rapidly revolutionizing manufacturing towards higher digitalization of the process and higher flexibility of production. AM involves a fully digitalized process from design to product finishing, and hence it is a perfect candidate for the deployment of Artificial Intelligence. Nonetheless, it is a very complex and still immature technology, with tremendous room for improvement in terms of production time and product defectiveness. Specific use-cases in this regard will stem from the supervisors' collaborations with the Inter-departmental Center for Additive Manufacturing at Politecnico di Torino, as well as with several major industrial partners such as Prima Additive, Stellantis, Avio Aero, etc.
- SM is another highly complex process, entailing a wide array of subprocesses and diverse equipment. Driven by the Industry 4.0 revolution and the European Chips Act, the semiconductor industry is investing heavily in the digitalization of its production chain. As a result of these investments, the chip production process has been equipped with multiple sensors that constantly monitor the evolution of each manufacturing phase, from oxidation to testing and packaging, thus collecting a tremendous amount of heterogeneous data. Artificial Intelligence is widely acknowledged to have a fundamental role in fully unveiling the potential and hidden knowledge of such data. Use-cases in this regard will stem from the supervisors' collaborations with important industrial players in this sector, such as STMicroelectronics.
The outline of the PhD program can be divided into 3 consecutive phases, one per year of the program.
- In the first year, the candidate will acquire the necessary background by attending PhD courses and surveying the relevant literature, and will start experimenting with state-of-the-art techniques on the available datasets, either from public sources or from past projects of the supervisors. A seminal conference publication is expected at the end of the year.
- In the second year, the candidate will select and address some relevant use-cases, with real data from the industrial partners, and will seek solutions to the technological and computational challenges posed by the specific industrial application. At the end of the second year, the candidate is expected to target at least a second conference paper in a well-reputed industry-oriented conference (e.g., ETFA), and possibly another publication in a Q1 journal of the Computer Science sector (e.g., IEEE Transactions on Industrial Informatics, Expert Systems with Applications, etc.).
- In the third year, the candidate will consolidate the models and approaches that were investigated in the second year, and possibly integrate them into a standalone framework. The candidate will also finalize this work into at least another major journal publication, as well as into a PhD thesis to defend at the end of the program.

Required skills

The ideal candidate to this PhD program has:
-    positive attitude to research activity and working in team
-    solid programming skills 
-    solid basics of linear algebra, probability, and statistics
-    good communication and problem-solving skills
-    some prior experience in the design and development of machine learning and deep learning architectures. 
-  some prior knowledge/experience of manufacturing processes is a plus, but not a requirement.

17

AI for Secured Networks: Language Models for Automated Security Log Analysis

Proposer

Marco Mellia

Topics

Cybersecurity, Data science, Computer vision and AI

Group website

https://smartdata.polito.it/
https://dbdmg.polito.it/dbdmg_web/

Summary of the proposal

Network security analysts are a key component of the defence infrastructure of an organization. They continuously and manually analyze security alarms and logs to make decisions against undesired intrusions.   
 
Language Models (LMs) have demonstrated huge potential in processing texts. The research will evaluate the capabilities of LM agents (lightweight, large, and multi-modal ones) in automating the investigation of security logs and performing zero-shot classification through generalization.

Research objectives and methods

Research objectives:
 
Investigate and evaluate the capabilities of LLM agents in automating the manual investigations of security analysts, assisting them in analysis and incident reporting.

The candidate will perform research to determine whether, and to what extent, the recent advances in language models could be used to automate and assist security analysts in the process (i) of learning the security-device rules by example and (ii) autonomously investigating the challenging cases currently analyzed by humans.
 
In the second phase, the candidate will investigate how and if lightweight and generalizable language models can extract insights from raw data, as large language models can do today. The goal is to investigate whether models with limited supervision and a minimal number of trusted labels can attain performance comparable to generic large language models (LLMs) when applied to specific tasks such as code understanding, classification, anomaly detection, bug detection, or identifying security breaches.
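As a minimal illustration of zero-shot classification through generalization, the sketch below matches a raw log line against natural-language label descriptions using character n-gram cosine similarity, so no task-specific training labels are needed (the labels, descriptions, and log lines are invented for the example; real work would use learned embeddings rather than n-gram counts):

```python
from collections import Counter
from math import sqrt

def ngrams(text, n=3):
    """Bag of character trigrams as a crude text embedding."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def zero_shot(log_line, label_descriptions):
    """Assign the label whose natural-language description is most
    similar to the log line -- no task-specific training required."""
    emb = ngrams(log_line.lower())
    return max(label_descriptions,
               key=lambda lbl: cosine(emb, ngrams(label_descriptions[lbl].lower())))

labels = {
    "bruteforce": "repeated failed password login attempts ssh",
    "scan": "port scan probe connection attempts many ports",
}
pred = zero_shot("Failed password for root from 10.0.0.1 port 22 ssh2", labels)
```

New threat categories can be added by writing a new description, with no retraining; this is the behavior that learned multi-modal embeddings would provide with far better generalization.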
 
The research will consider multi-modal embeddings that are conceptually constrained towards the right task. By forcing the model to create such embeddings, it will gain the ability to generalize and autonomously reason about novel, previously unencountered tasks. For instance, the research will test joint learning of (i) a natural-language label explanation of the security threat and (ii) the packet payload, using, e.g., contrastive learning techniques.
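The joint-learning idea above can be sketched with an InfoNCE-style contrastive objective, which pulls each (label-explanation, payload) embedding pair together while pushing mismatched pairs apart (the batch size, dimensionality, and random embeddings are invented for illustration; in practice the embeddings come from trainable encoders):

```python
import numpy as np

def info_nce(text_emb, payload_emb, temperature=0.1):
    """InfoNCE loss over a batch of paired embeddings: row i of each
    matrix is one (label-explanation, payload) pair. Matching pairs
    (the diagonal) are treated as positives, all others as negatives."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    p = payload_emb / np.linalg.norm(payload_emb, axis=1, keepdims=True)
    logits = t @ p.T / temperature          # cosine similarity matrix
    idx = np.arange(len(t))                 # positives on the diagonal
    log_softmax = logits - np.log(np.exp(logits).sum(1, keepdims=True))
    return -log_softmax[idx, idx].mean()

rng = np.random.default_rng(0)
aligned = rng.normal(size=(8, 16))
loss_aligned = info_nce(aligned, aligned)                  # perfect pairs
loss_random = info_nce(aligned, rng.normal(size=(8, 16)))  # mismatched pairs
```

Minimizing this loss during training is what conceptually constrains the two modalities into a shared embedding space.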
 
The project will involve a collaboration with Huawei Technologies France and Politecnico di Torino. 

Outline of the research work plan:

1st year
- Study of the state of the art in security log analysis and in language models for ML.
- Data collection and analysis of raw and structured data on security devices such as Firewalls/Intrusion Prevention Systems (IPS), Endpoint Detection and Response (EDR), and Cloud security services.

2nd year
- Adapt and extend solutions to learn the security-device rules by example and to autonomously investigate complex cases.
- Propose and develop innovative solutions to the problem of cyber-threat analysis with language models.
- Propose multi-modal embeddings for network raw data and security logs.

3rd year
- Tune the developed techniques and highlight possible strategies to counteract the various threats.
- Apply the strategies to new data for validation and testing.

References:
- Boffa, M., Valentim, R. V., Vassio, L., Giordano, D., Drago, I., Mellia, M., & Houidi, Z. B. (2023). LogPrécis: Unleashing Language Models for Automated Shell Log Analysis. arXiv preprint arXiv:2307.08309.
- Boffa, M., Milan, G., Vassio, L., Drago, I., Mellia, M., & Houidi, Z. B. (2022, June). Towards NLP-based processing of honeypot logs. In 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) (pp. 314-321). IEEE.
- Boffa, M., Vassio, L., Mellia, M., Drago, I., Milan, G., Houidi, Z. B., & Rossi, D. (2022, December). On using pretext tasks to learn representations from network logs. In Proceedings of the 1st International Workshop on Native Network Intelligence (pp. 21-26).

List of possible venues for publications:
- Security venues: IEEE Symposium on Security and Privacy, IEEE Transactions on Information Forensics and Security, ACM Symposium on Computer and Communications Security (CCS), USENIX Security Symposium, IEEE Security & Privacy;
- AI venues: Neural Information Processing Systems (NeurIPS), International Conference on Learning Representations (ICLR), International Conference on Machine Learning (ICML), AAAI Conference on Artificial Intelligence, ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD);
- Computer networks venues: Distributed System Security Symposium (NDSS), Privacy Enhancing Technologies Symposium, The Web Conference (formerly International World Wide Web Conference WWW), ACM International Conference on Emerging Networking EXperiments and Technologies (CoNEXT), USENIX Symposium on Networked Systems Design and Implementation (NSDI).

Required skills

- Good programming skills (such as Python, Torch, Spark)

- Excellent Machine Learning knowledge

- Knowledge of NLP and LMs

- Basics of Networking and security

18

Leveraging Machine Learning Analytics for Intelligent Transport Systems Optimization in Smart Cities

Proposer

Marco Mellia

Topics

Data science, Computer vision and AI

Group website

https://smartdata.polito.it/
https://dbdmg.polito.it/dbdmg_web/

Summary of the proposal

Electrification and big data are changing the design of transport systems. The availability of large amounts of data collected by black boxes for insurance/safety opens innovative challenges and opportunities to improve transport systems and reduce carbon footprint.
The research will focus on effective machine learning pipelines for multiple purposes, including proposing new policies, optimizing fleets, and designing electrified systems, with a focus on comparing the impact of ICE and electric cars.

Research objectives and methods

Research objectives:
 
This PhD research seeks to harness the power of machine learning and big data analytics in understanding and optimizing mobility through the analysis of data collected from black boxes in fleets of cars. 

This proposal outlines a comprehensive plan to leverage big data analytics for intelligent transport systems in smart cities. The impact of mobility based on electric vehicles and its comparison with previous habits will be a core part of the study. 
The research objectives aim to contribute valuable insights to mobility planning and optimization, and the work plan ensures a systematic approach to achieving these objectives. 
The PhD student will be involved in research activities with companies and funded research projects. Data will be provided by a company.

Outline of the research work plan:

1st year
- Study of state-of-the-art data analysis techniques for transportation and mobility.
- Data collection, exploration and pre-processing: extract and pre-process raw data from black boxes, ensuring data quality and compatibility for further analysis. Develop techniques to handle missing or incomplete data.
- Investigate and implement privacy-preserving methods to ensure ethical use of mobility data while still deriving valuable insights.

2nd year
- Apply machine learning algorithms to identify patterns in mobility data, extracting insights into traffic flows, congestion, and usage patterns.
- Implement anomaly detection mechanisms to identify unusual events and improve system resilience.
- Develop predictive models to forecast traffic conditions, enabling proactive measures to alleviate congestion and enhance overall traffic management.
- Explore adaptive algorithms for real-time adjustments based on dynamic traffic patterns.
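As a toy illustration of the anomaly-detection step, the sketch below flags points in a traffic-count series that deviate more than a few standard deviations from a trailing-window mean (the window size, threshold, and hourly counts are invented for the example; real pipelines would use far richer models):

```python
from statistics import mean, stdev

def flag_anomalies(series, window=24, z_thresh=3.0):
    """Flag points deviating > z_thresh standard deviations from the
    trailing-window mean (toy traffic-count anomaly detector)."""
    flags = []
    for i, x in enumerate(series):
        past = series[max(0, i - window):i]
        if len(past) < 3:            # not enough history yet
            flags.append(False)
            continue
        m, s = mean(past), stdev(past)
        flags.append(s > 0 and abs(x - m) > z_thresh * s)
    return flags

# Hourly vehicle counts with one unusual spike
counts = [100, 102, 98, 101, 99, 100, 103, 97, 100, 250, 101, 99]
flags = flag_anomalies(counts, window=8)   # only the 250 is flagged
```

The same rolling-statistics skeleton extends naturally to per-road-segment series and to more robust statistics (median/MAD) when the data is noisy.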

3rd year
- Integrate the developed algorithms into a cohesive system for intelligent transport systems.
- Validate the system using real-world scenarios and fine-tune the algorithms for optimal performance.

References:
- Ciociola, A., Cocca, M., Giordano, D., Mellia, M., Morichetta, A., Putina, A., & Salutari, F. (2017, August). UMAP: Urban mobility analysis platform to harvest car-sharing data. In SmartWorld (pp. 1-8). IEEE.
- Cocca, M., Giordano, D., Mellia, M., & Vassio, L. (2019). Free-floating electric car sharing: A data-driven approach for system design. IEEE Transactions on Intelligent Transportation Systems, 20(12), 4691-4703.
- Cocca, M., Giordano, D., Mellia, M., & Vassio, L. (2019). Free-floating electric car sharing design: Data-driven optimisation. Pervasive and Mobile Computing, 55, 59-75.


List of possible venues for publications
- IEEE Transactions on Intelligent Transportation Systems
- Elsevier Transportation Research
- IEEE International Conference on Data Science and Advanced Analytics
- IEEE International Conference on Big Data
- IEEE International Smart Cities Conference
- ACM Transactions on Spatial Algorithms and Systems
- IEEE Transactions on Vehicular Technology
- Elsevier Cities

Required skills

- Good programming and data analysis skills (such as Python, Pandas, Torch, Spark)
- Excellent Machine learning knowledge
- Fundamentals of Operational research

19

Natural Language Processing and Large Language Models for source code generation

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research is focused on revolutionizing source code generation by harnessing the capabilities of Natural Language Processing, exploring novel methodologies that facilitate the creation of high-quality code through enhanced human-machine collaboration. By leveraging advanced language models, like Generative Pretrained Transformer models, the research seeks to optimize the process, leading to more efficient, expressive, and context-aware source code generation in software development.

Research objectives and methods

The integration of Artificial Intelligence, especially Machine/Deep Learning, in industrial processes promises swift changes. Companies stand to benefit in the short term from improved production quality, efficiency, and the automation of routine tasks, fostering positive impacts on work environments. In addition to Natural Language Processing, Large Language Models (LLMs) have already demonstrated significant progress in healthcare, education, software development, finance, journalism, scientific research, and customer support. The future entails optimizing LLMs for widespread use, enhancing the competitiveness of the industrial system and streamlining collaborative supply chain management.

The objective of this Ph.D. proposal is the design and development of AI-assisted models based on Natural Language Processing (NLP) and Large Language Models (LLMs) to optimize AI-assisted source code generation in software development, making the process more efficient, expressive, and context-aware.

During the three years of the Ph.D., the research activity will be divided into five phases:
- Survey the existing literature on NLP applications in software engineering and analyze methodologies and challenges in source code generation using language models.
- Design and develop Large Language Models for improved programming-language understanding by investigating techniques for domain-specific customization of language models.
- Develop algorithms and strategies for context-aware source code generation by implementing prototype systems for evaluation and refinement.
- Design and implement a collaborative framework that seamlessly integrates developer input with language model suggestions.
- Evaluate the effectiveness of the collaboration framework through user studies and real-world projects.

Possible international scientific journals and conferences:
- IEEE Transactions on Audio, Speech, and Language Processing
- IEEE Transactions on Software Engineering
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Industry Applications
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- IEEE NLP-KE international conference
- IEEE ICNLP international conference
- IEEE COMPSAC international conference

Required skills

- Programming and Object-Oriented Programming (preferably in Python)
- Knowledge of Natural Language Processing and Large Language Models
- Knowledge of frameworks to develop models based on Natural Language Processing and Large Language Models

20

Cloud continuum machine learning

Proposer

Daniele Apiletti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

 

Summary of the proposal

As the demand for novel distributed machine learning models operating at the edge continues to grow, so does the call for cloud continuum frameworks to support machine learning.
In this broad context, the candidate will explore innovative solutions achieved by combining the benefits of edge-based machine learning models with the cloud continuum scenario, in a wide range of application contexts.

Research objectives and methods

Research objectives
 
This research aims to define new methods for improving machine learning applications in cloud computing contexts. While traditional machine learning models are trained in the cloud and can leverage the virtually unlimited storage and computational resources offered by scalable data centers, the goal of the research is to investigate the limitations of, experimentally evaluate, and improve the state of the art in machine learning models based on distributed and federated learning techniques.
Applications that are delay sensitive or generate large amounts of distributed time series data can benefit from the proposed paradigm: the computational power provided by devices at the edge and by intermediate nodes between the edge and the central cloud (fog computing) can be used to provide cloud continuum machine learning models.
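The federated side of this paradigm can be sketched with one FedAvg-style aggregation round, where edge clients train locally and only model parameters (never raw data) are averaged in the cloud, weighted by local dataset size (the client models and sizes are invented for illustration):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """One FedAvg aggregation round: average the clients' parameter
    tensors, weighted by local dataset size. No raw data leaves the edge."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n / total for w, n in zip(client_weights, client_sizes))
        for k in range(n_params)
    ]

# Three edge clients, each holding a tiny linear model [slope, bias]
clients = [[np.array([1.0, 0.0])],
           [np.array([3.0, 1.0])],
           [np.array([2.0, 0.5])]]
sizes = [100, 100, 200]
global_model = fed_avg(clients, sizes)   # weighted parameter average
```

In a full cloud-continuum system the same aggregation could also run hierarchically, with fog nodes averaging their local clients before a final averaging step in the cloud.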
 
Innovative cloud continuum machine learning solutions will be applied using existing cloud-to-edge frameworks, while also following current EU research directions that aim to create alternatives to established hyperscalers by building an EU-based sovereign edge platform (e.g., SovereignEdge.eu, EUCloudEdgeIoT.eu, FluidOS, etc.).
 
The proposed research can be useful in many scenarios: Time Series Data Modeling and Energy Management at different scales, from watersheds (e.g., PNRR project NODES) to smart cities, from large buildings to complex vehicles (e.g., airplanes and cruise ships), from smart manufacturing to distributed sensors in healthcare, in smart power grids, and IoT networks where devices have limited resources and are very sensitive to environmental conditions, data speed, network connectivity, and power consumption.
To this end, several research topics will be addressed, such as:
-       Edge AI and machine learning for next-generation computing systems.
-       Benefits and challenges of cloud and edge computing, through comparative experimental analysis of state-of-the-art applications and real-world scenarios.
-       Lightweight AI models with better efficiency for devices with limited computational and energy resources.
-       Distributed and decentralized learning techniques in network monitoring and orchestration.
-       Mitigation and prevention of security breaches in edge ML, using AI monitoring tools.
 
Outline of the research work plan
 
1st year. The candidate will explore the state of the art of distributed machine learning techniques, such as federated learning, split learning, and gossip learning, in the context of an edge computing environment. He/she will look for gaps and emerging trends in AI models in the cloud continuum and test existing paradigms on a real-world application.
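As a concrete starting point, the core of the federated learning paradigm (federated averaging) can be sketched in a few lines. This is a toy linear-regression setup on synthetic data; all names and the one-local-step scheme are illustrative assumptions, not a specific framework's API:

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One gradient step of linear regression on a client's local data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fedavg_round(w_global, clients, lr=0.1):
    """One federated round: broadcast, local update, sample-weighted average."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_step(w_global.copy(), X, y, lr))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for _ in range(3):                       # three edge nodes, data never leaves them
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(200):
    w = fedavg_round(w, clients)
print(np.round(w, 2))                    # converges toward [ 2. -1.]
```

Only model parameters travel between edge and cloud, which is what makes the scheme attractive for the delay- and privacy-sensitive applications mentioned above.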
 
2nd year. The candidate will design and develop novel solutions to overcome limitations and constraints by testing proposed methods on highlighted real-world challenges. Public, artificial, and possibly real data sets will be used for the development and testing phases. New limitations and constraints are expected to be discovered during this phase.
 
In the 3rd year, the candidate will advance the research by extending the experimental evaluation to more complex scenarios that can better benefit from the proposed cloud continuum machine learning solutions. To identify shortcomings and possible further advances in new application areas, the candidate will optimize the proposed models.
 
 
List of possible venues for publications.
 
 
Journal of Grid Computing (Springer)
Future Generation Computer Systems (Elsevier)
IEEE TKDE (Trans. on Knowledge and Data Engineering)
IEEE TCC (Trans. on Cloud Computing)
ACM TKDD (Trans. on Knowledge Discovery in Data)
ACM TOIS (Trans. on Information Systems)
ACM TOIT (Trans. on Internet Technology)
ACM TIST (Trans. on Intelligent Systems and Technology)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Internet of Things (Elsevier)
Journal of Big Data (Springer)
IEEE TBD (Trans. on Big Data)
Big Data Research
IEEE TETC (Trans. on Emerging Topics in Computing)
IEEE Internet of Things Journal
Journal of Network and Computer Applications (Academic Press)

Required skills

Knowledge of basic computer science concepts.
Knowledge of the main cloud computing topics.
Programming skills in C-family and Python languages.
Undergraduate experience with data mining and machine learning techniques.
Knowledge of English, both written and spoken.
Capability of presenting the results of the work, both written (scientific writing and slide presentations) and oral.
Capability of guiding undergraduate students for thesis projects.

21

Graph network models for Data Science

Proposer

Daniele Apiletti

Topics

Data science, Computer vision and AI, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

 

Summary of the proposal

Machine learning approaches extract information from data with generalized optimization methods. However, besides the knowledge brought by the data, extra a-priori knowledge of the modeled phenomena is often available. Hence an inductive bias can be introduced from domain knowledge and physical constraints, as proposed by the emerging field of Theory-Guided Data Science.
Within this broad field, the candidate will explore solutions exploiting the relational structure among data.

Research objectives and methods

  Research Objectives
 
The research aims at defining new methodologies for semantics embedding, proposing novel algorithms and data structures, exploring applications, investigating limitations, and advancing the solutions based on different emerging Theory-Guided Data Science approaches.
The final goal is to contribute to improving machine learning model performance by reducing the learning space through the exploitation of existing domain knowledge, in addition to the (often limited) available training data, pushing towards more unsupervised and semantically richer models.
To this aim, the main research objective is to exploit the Graph Network frameworks in deep-learning architectures by addressing the following issues:
- Improving state-of-the-art strategies of organizing and extracting information from structured data.
- Overcoming the limitation of Graph Network models in training very deep architectures, which currently entails a loss in the expressive power of the solutions.
- Advancing the state-of-the-art solutions to dynamic graphs, whose nodes and mutual connections can change over time; dynamic networks can successfully learn the behavior of evolving systems.
- Experimentally evaluating the novel techniques in large-scale systems, such as supply chains, social networks, collaborative smart-working platforms, etc. Currently, for most graph-embedding algorithms, scalability is difficult to handle since each node has a peculiar neighborhood organization.
- Applying the proposed algorithms to natively graph-unstructured data, such as texts, images, audio, etc.
- Developing techniques to design ensemble graph architectures to capture domain-knowledge relationships and physical constraints.
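The message-passing operation at the heart of Graph Network frameworks can be sketched as follows. This is a minimal NumPy illustration, not tied to any specific library; the graph, features, and weights are synthetic:

```python
import numpy as np

def gn_layer(A, H, W_self, W_neigh):
    """One message-passing step: A is the (n,n) adjacency matrix,
    H the (n,d) node features; returns updated node features."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid division by zero
    msg = (A @ H) / deg                 # mean aggregation over neighbours
    return np.tanh(H @ W_self + msg @ W_neigh)

rng = np.random.default_rng(0)
n, d = 4, 3
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = rng.normal(size=(n, d))
W_self, W_neigh = rng.normal(size=(d, d)), rng.normal(size=(d, d))

H1 = gn_layer(A, H, W_self, W_neigh)    # one propagation step
H2 = gn_layer(A, H1, W_self, W_neigh)   # stacking layers widens the receptive field
print(H2.shape)                         # (4, 3)
```

Stacking many such layers is exactly where the depth limitation mentioned above appears (over-smoothing of node representations), which motivates the second research objective.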
 
Outline
 
1st year. The candidate will explore the state-of-the-art techniques for dealing with both structured and unstructured data, to integrate domain-knowledge strategies in network model architectures. Applications to physics phenomena, images, and text, taken from real-world networks such as social platforms and supply chains, will be considered.
2nd year. The candidate will define innovative solutions to overcome the limitations described in the research objectives, by experimenting with the proposed techniques on the identified real-world problems. The development and experimental phases will be conducted on public, synthetic, and possibly real-world datasets. New challenges and limitations are expected to be identified in this phase.
During the 3rd year, the candidate will extend the research by widening the experimental evaluation to more complex phenomena able to better leverage the domain knowledge captured by the Graph Networks. The candidate will perform optimizations on the designed algorithms, establishing the limitations of the developed solutions and possible improvements in new application fields.
 
Target publications
 
IEEE TKDE (Trans. on Knowledge and Data Engineering)
ACM TKDD (Trans. on Knowledge Discovery in Data)
ACM TOIS (Trans. on Information Systems)
ACM TOIT (Trans. on Internet Technology)
ACM TIST (Trans. on Intelligent Systems and Technology)
IEEE TPAMI (Trans. on Pattern Analysis and Machine Intelligence)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Engineering Applications of Artificial Intelligence (Elsevier)
Journal of Big Data (Springer)
ACM Transactions on Spatial Algorithms and Systems (TSAS)
IEEE Transactions on Big Data (TBD)
Big Data Research
IEEE Transactions on Emerging Topics in Computing (TETC)

Required skills

- Knowledge of the basic computer science concepts.
- Programming skills in Python
- Undergraduate experience with data mining and machine learning techniques
- Knowledge of English, both written and spoken.
- Capability of presenting the results of the work, both written (scientific writing and slide presentations) and oral.
- Capability of guiding undergraduate students for thesis projects.

22

Automatic composability of Large Co-simulation Scenarios for smart energy communities

Proposer

Edoardo Patti

Topics

Parallel and distributed systems, Quantum computing, Data science, Computer vision and AI, Computer architectures and Computer aided design

Group website

www.eda.polito.it

Summary of the proposal

The emerging concept of multi-energy systems is linked to heterogeneous competencies spanning from energy systems to cyber-physical systems and active prosumers. Studying such complex systems requires the usage of co-simulation techniques. However, the setup of co-simulation scenarios requires a deep knowledge of the framework and a time-consuming setup of the distributed infrastructure. The research program aims to develop automatic composability of multi-energy system co-simulations to ease their usage.

Research objectives and methods

A complex system such as a multi-energy system requires the accurate modelling of the heterogeneous aspects that constitute the overall phenomenon under study. To achieve this goal, researchers in different fields have started using co-simulation and model coupling to build new models capable of describing the interactions and the overall complexity. Such approaches make it possible to couple different models, running on different simulators and/or simulation engines, by exchanging data via standard protocols over the internet. Indeed, such models have been developed and validated following a methodology comparable to a service-oriented architecture, thus reducing the time and complexity of building new models from scratch. Moreover, such an approach eases the interconnection of the vertical knowledge coming from each discipline/domain involved in the complex system, e.g., ICT or energy experts. Examples of models are software entities that replicate the realistic behaviour of a photovoltaic (PV) system, energy storage, heating distribution networks or, even, human beings.
Nowadays, researchers have invested in the usage of co-simulation orchestrators to achieve the interconnection and synchronization of different models and simulators, including real-time simulators. However, the setup of a co-simulation is not an easy and trivial task, as it is time-consuming and requires the involvement of domain and co-simulation experts. This research topic aims to develop a framework, exploiting existing co-simulation orchestrators, for the automatic composability of co-simulation scenarios in a distributed infrastructure, to assess different aspects of multi-energy systems. The framework will integrate models in a plug-and-play fashion, reducing as much as possible the coding phase and the presence of a co-simulation expert, and easing the work of multi-energy systems engineers.
Moreover, the framework will ease the setup in terms of computational resources for the modelling of complex and large scenarios. The final purpose consists of simulating the impact and management of future energy systems to foster the energy transition. Thus, the resulting infrastructure will integrate, with a semantic approach in a distributed environment, heterogeneous i) data sources, ii) cyber-physical systems (i.e., Internet-of-Things devices), iii) models of energy systems, and iv) real-time simulators. The starting point of this activity will be the already existing EC-L co-simulation platform, which will be enhanced by embedding all the aforementioned features.

Hence, the research will focus on developing:
- a methodology based on semantic web technologies for linking and interconnecting simulators automatically in a co-simulation approach;
- a domain-specific ontology for describing the components and interconnections of multi-energy system models;
- a methodology for the automatic composability and setup of the distributed infrastructure of the energy scenario to assess (e.g., the impact of PV systems and EVs in a city).
In a nutshell, the final result will be a tool that exploits visual programming, semantic representation, and cloud technologies to offer co-simulation as a service for describing multi-energy system simulation scenarios in a plug-and-play fashion, opening the usage of co-simulation to a wider audience.
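To illustrate the intended plug-and-play composition, the following toy sketch couples two hypothetical models, a PV generator and a battery, through a minimal orchestrator loop. All component names, parameter values, and the hour-by-hour synchronisation scheme are illustrative assumptions, not the EC-L platform's API:

```python
class PVModel:
    """Toy PV simulator: a triangular irradiance profile peaking at noon (kW)."""
    def step(self, t):
        return max(0.0, 5.0 - abs(t - 12))

class BatteryModel:
    """Toy storage simulator: accumulates energy up to a fixed capacity (kWh)."""
    def __init__(self, capacity=10.0):
        self.soc, self.capacity = 0.0, capacity
    def step(self, power_in):
        self.soc = min(self.capacity, self.soc + power_in)
        return self.soc

def orchestrate(models, hours=24):
    """Synchronise the models hour by hour, wiring PV output into the battery."""
    pv, battery = models["pv"], models["battery"]
    trace = []
    for t in range(hours):
        generated = pv.step(t)          # each simulator advances one shared step
        trace.append(battery.step(generated))
    return trace

trace = orchestrate({"pv": PVModel(), "battery": BatteryModel()})
print(round(trace[-1], 1))              # 10.0: the battery saturates at capacity
```

In the envisioned framework, the wiring done by hand in `orchestrate` would instead be derived automatically from the semantic description of each model's inputs and outputs.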

The outcomes of this research will be a distributed co-simulation platform for:
- planning the evolution of the future smart multi-energy system by taking into account the operational phase;
- evaluating the effect of different policies and the related customer satisfaction;
- evaluating the performance of hardware components in a realistic test bench.

During the first year, the candidate will study the literature on existing co-simulation platforms to identify the best available solutions for i) large-scale smart energy system simulation in distributed environments and ii) semantic web approaches to describe complex systems, with a focus on the multi-energy system domain. Finally, the student will design the overall framework, starting from requirements identification and definition.

During the second year, the candidate will carry out the implementation of the visual and semantic framework for model coupling and scenario creation. Furthermore, the candidate will start developing software solutions for the automatic composability and setup of the co-simulation environments, in terms of simulator deployment in a cloud system.

During the third year, the candidate will complete the overall framework development and test it in different case study scenarios to assess the capabilities of the platform in terms of automatic scenario composition and setup.     

Possible international scientific journals and conferences:
- IEEE Transactions on Smart Grid
- IEEE Transactions on Industrial Informatics
- IEEE Transactions on Sustainable Computing
- IEEE EEEIC international conference
- IEEE SEST international conference
- IEEE COMPSAC international conference

Required skills

Programming and Object-Oriented Programming (preferably in Python)
Frameworks for orchestration and setup of containerized applications
Knowledge of semantic technologies
Computer Networks

23

Multivariate time series representation learning for vehicle telematics data analysis

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI

Group website

https://smartdata.polito.it/ 
https://www.tierratelematics.com/

Summary of the proposal

This PhD proposal aims to study new techniques for embedding multivariate time series, apply them to solve established downstream tasks, and leverage these solutions in Data Science pipelines to analyze vehicles' telematics data such as CAN Bus signals. Embeddings will not only capture the series' temporal properties but also their multi-dimensional relations. These models will be used to classify, segment, and cluster signals and to detect anomalies and communities for industrial vehicle usage.

Research objectives and methods

Description: 
Multivariate time series data have peculiar properties related to their sequential and multi-faceted nature. Although state-of-the-art embedding techniques tailored to time series data are effective in handling sequential data relations, thanks to the use of auto-regressive or attention-based models, they often struggle to handle multiple dimensions at the same time. For example, CAN bus data acquired from vehicles cover a variety of different aspects (e.g., fuel level, coolant temperature, engine speed, ...) that are worth jointly analyzing to address predictive maintenance, anomaly detection, fleet detection and management, and telematics service shaping.
 
The PhD research will advance existing approaches to process and encode multivariate time series data, which encompass (but are not limited to) transformer models [1,2], contrastive and adversarial networks [3,4], matrix profile-based models [5,6], and Large Language Models [7]. The proposed representations will then be used to address various downstream tasks on time series data, including classification, forecasting, segmentation, clustering, and anomaly detection. For example, clustering and classifying CAN bus signals can be useful to automatically identify the working status of a vehicle according to both its performed activities and the environmental conditions [8]. Inter-series relations can also be analyzed to detect vehicle fleets and optimize resource allocation.
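As an example of one of the cited families of techniques, the matrix-profile idea [5,6] can be sketched with a naive O(n^2) implementation on a synthetic signal: for each window, the profile stores the distance to its nearest non-trivial match, so low values flag recurring motifs and high values flag anomalies.

```python
import numpy as np

def matrix_profile(ts, m):
    """Naive matrix profile with z-normalised Euclidean distance."""
    n = len(ts) - m + 1
    windows = np.array([ts[i:i + m] for i in range(n)])
    # z-normalise each window so matches are shape-based, not level-based
    windows = (windows - windows.mean(1, keepdims=True)) / windows.std(1, keepdims=True)
    profile = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) >= m:  # exclusion zone: skip trivial self-matches
                profile[i] = min(profile[i], np.linalg.norm(windows[i] - windows[j]))
    return profile

t = np.linspace(0, 8 * np.pi, 400)
ts = np.sin(t)
ts[200:210] += 3.0                 # inject a synthetic anomaly
mp = matrix_profile(ts, m=20)
print(int(np.argmax(mp)))          # the peak falls in the anomalous region (~181-209)
```

Production implementations such as DAMP [6] replace the double loop with streaming FFT-based distance computations, which is what makes the approach viable on the scale of CAN bus data.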
 
Research objectives:
- study state-of-the-art machine learning techniques for time series and compare their performance on the case study;
- collect and analyze raw and structured data regarding vehicle telematics;
- design, develop, and test new approaches to time series representation;
- benchmark unimodal and multimodal time series models for time series classification, clustering, forecasting, and segmentation;
- design new algorithms and methodologies to process time series data for supervised and unsupervised tasks.
 
 
Industrial collaborations: 
The PhD activities will be supported by the ongoing research collaboration between Politecnico di Torino and Tierra Spa, a multinational telematics service provider that will provide in-domain data, expert supervision, and related case studies.  
 
In parallel, the research methods and algorithms can be also tested on benchmark data such as the UCR Time Series Classification Archive (https://www.cs.ucr.edu/~eamonn/time_series_data/) and mTAD (https://github.com/OpsPAI/MTAD). 
 
List of possible publication venues: - ECML PKDD, ACM CIKM, KDD, IEEE ICDE, IEEE ICDM, NEURIPS conferences - ACM TIST, ACM TKDD, IEEE TKDE, Elsevier Computers In Industry, Elsevier Information Sciences 
 
References: 
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention is All you Need. NIPS 2017: 5998-6008 
[2] Chao Yang, Xianzhi Wang, Lina Yao, Guodong Long, Guandong Xu: Dyformer: A dynamic transformer-based architecture for multivariate time series classification. Inf. Sci. 656: 119881 (2024) 
[3] Sana Tonekaboni, Danny Eytan, Anna Goldenberg: Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. ICLR 2021 
[4] Chengyu Wang, Kui Wu, Tongqing Zhou, Zhiping Cai: Time2State: An Unsupervised Framework for Inferring the Latent States in Time Series Data. Proc. ACM Manag. Data 1(1): 17:1-17:18 (2023)
[5] Eamonn J. Keogh: Time Series Data Mining: A Unifying View. Proc. VLDB Endow. 16(12): 3861-3863 (2023) 
[6] Yue Lu, Renjie Wu, Abdullah Mueen, Maria A. Zuluaga, Eamonn J. Keogh: DAMP: accurate time series anomaly detection on trillions of datapoints and ultra-fast arriving data streams. Data Min. Knowl. Discov. 37(2): 627-669 (2023) 
[7] Azul Garza, Max Mergenthaler Canseco: TimeGPT-1. CoRR abs/2310.03589 (2023) 
[8] Silvia Buccafusco, Luca Cagliero, Andrea Megaro, Francesco Vaccarino, Riccardo Loti, Lucia Salvatori: Learning industrial vehicles' duty patterns: A real case. Comput. Ind. 145: 103826 (2023) 

Required skills

The PhD candidate is expected to:
- be proficient in Python and SQL programming;
- have a deep knowledge of statistics and probability fundamentals;
- have a solid background in data preprocessing and mining techniques;
- know the most established machine learning and deep learning techniques;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.
We seek motivated students who are willing to work at the intersection between academia and industry.

24

Designing a cloud-based heterogeneous prototyping platform for the development of fog computing apps

Proposer

Gianvito Urgese

Topics

Parallel and distributed systems, Quantum computing, Computer architectures and Computer aided design, Software engineering and Mobile computing

Group website

https://eda.polito.it/

Summary of the proposal

The PhD project enables SW developers to prototype complex solutions on heterogeneous systems (CPU, GPU, FPGA, Neuromorphic HW) effectively. 
We tackle technology transfer challenges in adopting neuromorphic HW for IoT and industry. 
The project objectives are the development of a Heterogeneous Prototyping Platform (HPP) that offers cost-effective testing of neuromorphic and traditional HW, a user-friendly interface, a HW-sharing system, and an energy monitoring system.

Research objectives and methods

Research objectives
The technology transfer process often incurs high costs, as new HW needs to be purchased and there is a risk that the results may not meet expectations, rendering the investment futile. Currently, there is growing interest among academic and industrial developer teams in utilizing neuromorphic HW for complex IoT and industrial use cases. Neuromorphic platforms represent a new type of brain-inspired HW designed to efficiently simulate Spiking Neural Networks (SNNs), which are considered suitable for low-power and low-latency computation. However, the existing neuromorphic boards can be prohibitively expensive for companies in the early stages of experimentation.
To address this issue, there is a need for an environment that allows companies to test and evaluate this new HW cost-effectively. This entails combining the neuromorphic chips with traditional components such as microcontrollers and GPUs in a heterogeneous system. The proposed solution, named Heterogeneous Prototyping Platform (HPP), enables companies to experiment with neuromorphic solutions before committing to the acquisition of the HW. This approach mitigates the risk of substantial costs and facilitates informed decision-making regarding further exploration of neuromorphic technology.
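For background on the computation such boards accelerate, a toy leaky integrate-and-fire (LIF) neuron, the basic unit of an SNN, can be sketched as follows. The parameter values and the simple Euler integration scheme are illustrative, not those of any specific neuromorphic platform:

```python
def lif_run(input_current, tau=10.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Simulate one LIF neuron; returns the time steps at which it spikes."""
    v, spikes = 0.0, []
    for t, i_in in enumerate(input_current):
        v += dt * (-v / tau + i_in)   # leaky integration of the membrane potential
        if v >= v_thresh:             # threshold crossing emits a spike
            spikes.append(t)
            v = v_reset               # hard reset after each spike
    return spikes

spikes = lif_run([0.15] * 100)        # constant drive yields a regular spike train
print(spikes[:3])
```

Neuromorphic HW evaluates large populations of such neurons event by event, which is why SNNs are attractive for the low-power, low-latency use cases mentioned above.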
The objectives of the PhD plan will focus on designing and developing a user-friendly interface for remote prototyping of digital/neuromorphic solutions. The user front-end will leverage Kubernetes and microservice technologies, which have been adopted in cloud-based services such as the remote screen system implemented in the CrownLabs project. On the back-end, the candidate will develop a HW-sharing system to provide different users with access to the HW resources based on their specific requirements. Additionally, an energy monitoring component will be designed using e-meters to track the power consumption of the various HW/SW components prototyped within the cloud platform.
The envisioned platform architecture will be organized into multiple levels, including a login level, a computation level (encompassing programming, compilation, etc.), and an end-node level comprising heterogeneous HW interconnected with the system.
In the project, the candidate will target several emerging HW technologies such as FPGA, GPU, Neuromorphic platforms, and parallel architectures. 
The activities of research will be evaluated on three primary areas of application:
-       Medical and bioinformatics data stream analysis;
-       Video surveillance and activity recognition;
-       Smart water management system.
 
Outline of the research work plan
1st year. The candidate will explore cutting-edge frameworks for designing AIoT solutions on fog-based systems. He/She will analyze technologies such as SCADA systems, IoT platforms, data lakes, CrownLabs remote environment, and microservice SOA frameworks. Additionally, he/she will gain expertise in Neuromorphic HW and embedded systems, contributing to the development of tools for leveraging HW technologies in the Heterogeneous Prototyping Platform. 
2nd year. The candidate will develop an integrated heterogeneous prototyping framework supporting HW-sharing system, container/virtualization, front-end interface, and energy monitoring system dashboard.
Additionally, the candidate will be involved in the design and development of the HPP that facilitates the implementation and execution of neuromorphic simulations and AI applications on heterogeneous digital/neuromorphic computing systems.
3rd year. The candidate will validate the framework and platform on selected use cases with scientific and industrial partners. He/She will define and measure relevant KPIs to showcase the benefits of HPP for prototyping HW-heterogeneous computing solutions in fog systems. The candidate will also assist in integrating HPP into the EBRAINS service ecosystem.
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead fPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the bioinformatics and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
The possible conference and journal targets are the following:
-       IEEE/ACM International Conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
-       IEEE/ACM Journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP), MDPI Journals (e.g., Electronics).

Required skills

MS degree in computer engineering or electronics engineering.
Excellent skills in computer programming, computer architecture, cloud computing, SOA paradigm, embedded systems, and IoT applications. 
Technical background in network, cloud services, modelling, simulation, and optimization.

25

Designing a Development Framework for Engineering Edge-Based AIoT Sensor Solutions

Proposer

Gianvito Urgese

Topics

Data science, Computer vision and AI, Life sciences, Parallel and distributed systems, Quantum computing

Group website

https://eda.polito.it/

Summary of the proposal

The transition to digitalization, driven by the Industry 4.0 paradigm, requires advanced frameworks and tools to effectively integrate System of Systems (SoS) within industrial use case scenarios. 
The PhD project aims to develop a framework for the digital design and testing of data fusion algorithms. 
The objective is to integrate distributed sensing technologies deployed at the edge of industrial production lines through the fog computing paradigm, enhancing efficient and interoperable collaboration.

Research objectives and methods

Research objectives
In automation use cases, information extracted by different System of Systems (SoS) modules is combined along the path from raw sensors to actuators. The pipeline typically follows an intuitive order: first, the information from the sensors is processed; then, AI-based models or expert systems elaborate on this information; finally, the extracted knowledge is used to develop a control strategy for the actuator. 
The objective of the PhD activity is the development of a framework to support the automation of each step in the engineering process, enabling the creation of optimized Artificial Intelligence of Things (AIoT) sensor solutions. The framework will:
-       Manage onboard sensor data collection and labelling, allowing developers to build datasets from each available sensing system for any given use case.
-       Select and customize AI-based models to accomplish classification, data fusion, and continual learning tasks compatible with the execution on edge devices.
-       Define optimization strategies and tools for identifying model parameters by leveraging the labelled data acquired by the onboard sensing systems. 
-       Support the implementation of solvers for tasks mapped onto constraint optimization problems.
-       Validate model accuracy on the target edge device.
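As one example of making models compatible with edge execution, post-training int8 quantisation, a staple TinyML technique, can be sketched as follows. The affine per-tensor scheme below is illustrative and not tied to any specific framework:

```python
import numpy as np

def quantize_int8(w):
    """Map float weights onto int8 with a per-tensor scale and zero point."""
    scale = (w.max() - w.min()) / 255.0
    zero_point = np.round(-w.min() / scale) - 128
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)   # toy weight matrix
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)

print(q.nbytes / w.nbytes)    # 0.25: a four-fold memory reduction
# reconstruction error stays within roughly one quantisation step `s`
```

The 4x smaller integer tensor also enables the integer-only arithmetic that microcontroller-class edge devices execute efficiently.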
The framework will undergo evaluation using selected sensing technologies and analytics tasks associated with relevant industrial use cases in the automation field, such as the digitalization of smart water grids. 
In this project, the candidate will focus on emerging hardware technologies, including FPGA, GPU, Neuromorphic platforms, and parallel architectures, to implement new computational paradigms that optimize computation on the edge. Strong integration of Neuromorphic technology will be emphasized in the supported use cases.
 
Outline of the research work plan
1st year. The candidate will explore cutting-edge frameworks for designing AIoT solutions on the edge. He/She will gain experience in Neuromorphic HW and embedded systems for industrial applications. Additionally, he/she will contribute to defining framework requirements, technologies, and solutions for developing AI applications on edge devices.
2nd year. The candidate will develop an integrated methodological approach to a fog computing platform for modelling applications and systems. He/She will develop a user-friendly framework for AI applications on edge devices, considering multi-scenario analysis and benchmarking. The candidate will select SW libraries for developing and deploying models on edge computing devices using techniques from the TinyML field, including imitation, continual, federated, and deep learning, as well as neuromorphic models, for relevant industrial use cases.
3rd year. The candidate will apply the proposed approach to different complex use cases such as the digitalization of smart water grids, enabling greater generalisation of the methodology to different domains. The candidate will define and measure relevant KPIs to demonstrate the advantages of using the developed framework, compared to the use case's baseline. 
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead FPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the AIoT and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
The possible conference and journal targets are the following:
-       IEEE/ACM International Conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
-       IEEE/ACM Journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP), MDPI Journals (e.g., Electronics).

Required skills

MS degree in computer engineering, electronics engineering or physics of complex systems. 
Excellent skills in computer programming, computer architecture, embedded systems, and IoT applications. 
Technical background in deep learning, AI, edge computing, electronic design, modelling, simulation and optimization.

26

Computational Intelligence for Computer-Aided Design

Proposer

Giovanni Squillero

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://cad.polito.it

Summary of the proposal

The proposal focuses on the use and development of "intelligent" algorithms specifically tailored to the needs and peculiarities of CAD industries. Generic techniques ascribable to Computational Intelligence have long been used in the CAD field: probabilistic methods for the analysis of failures or the classification of processes; bio-inspired algorithms for the generation of tests, the optimization of parameters, or the definition of surrogate models.

Research objectives and methods

The recent fortune of the term "Machine Learning" has renewed interest in many automatic processes; moreover, the publicized successes of (deep) neural networks have softened the bias against other non-explicable black-box approaches, such as Evolutionary Algorithms or the use of complex kernels in linear models. The goal of the research is twofold: from an academic point of view, tweaking existing methodologies, as well as developing new ones, specifically able to tackle CAD problems; from an industrial point of view, creating a highly qualified expert able to bring scientific know-how into a company while also understanding practical needs, such as how data are selected and possibly collected. The need to team experts from industry with more mathematically minded researchers is apparent: frequently, a great knowledge of the practicalities is not accompanied by an adequate understanding of the statistical models used for analysis and predictions.

In the first year, the research will consider techniques that are less able to process large amounts of information but perhaps better able to exploit all the problem-specific knowledge available. It will almost certainly include bio-inspired techniques for generating, optimizing, and minimizing test programs, and statistical methods for analyzing and predicting the outcome of industrial processes (e.g., predicting the maximum operating frequency of a programmable unit from the frequencies measured by some ring oscillators; detecting dangerous elements in a circuit; predicting catastrophic events). The activity is also likely to exploit (deep) neural networks; however, developing novel, creative results in this area is not a priority. On the contrary, the research shall face problems related to dimensionality reduction, feature extraction, and prototype identification/creation.
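As a minimal illustration of such a bio-inspired search loop, a (1+1) evolutionary algorithm on a toy OneMax fitness can be sketched as follows; the bit string and its fitness stand in for, e.g., a candidate test program and its fault coverage (all parameters are illustrative):

```python
import random

def one_plus_one_ea(n=64, generations=2000, seed=0):
    """Keep a single parent; flip each bit with prob 1/n; accept if not worse."""
    rng = random.Random(seed)
    p_flip = 1.0 / n
    parent = [rng.randint(0, 1) for _ in range(n)]
    fitness = sum(parent)                        # OneMax: count of 1-bits
    for _ in range(generations):
        child = [b ^ (rng.random() < p_flip) for b in parent]
        if sum(child) >= fitness:                # elitist acceptance
            parent, fitness = child, sum(child)
    return parent, fitness

best, fit = one_plus_one_ea()
print(fit)   # close to the optimum of 64
```

In the CAD setting, the cheap `sum(child)` evaluation would be replaced by an expensive simulation or measurement, which is precisely what motivates the surrogate measures discussed next.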

Then the research shall also focus on the study of surrogate measures, that is, the use of measures that can be easily and inexpensively gathered as a proxy for others that are more industrially relevant but expensive. In this regard, the tutors are working with a semiconductor manufacturer on using in-situ sensor values as a proxy for the prediction of operating frequency. The work could then proceed by tackling problems related to "dimensionality reduction", useful to limit the number of input data of the model, and "feature selection", essential when each single feature is the result of a costly measurement. At the same time, the research is likely to help the introduction of more advanced optimization techniques into everyday tasks.
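The surrogate/feature-selection idea above can be sketched in a few lines of NumPy. The data, sensor counts, and selection threshold below are invented for illustration and do not come from the industrial collaborations mentioned in the proposal; a cheap linear surrogate is fitted to an "expensive" target, and the features with negligible weight are discarded.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the industrial setting: a few cheap in-situ
# sensor readings (proxies) and an expensive target measurement
# (e.g., a maximum operating frequency).  All values are illustrative.
n_devices, n_sensors = 200, 5
X = rng.normal(size=(n_devices, n_sensors))
true_w = np.array([2.0, 0.0, -1.5, 0.0, 0.5])      # only 3 sensors matter
y = X @ true_w + 0.1 * rng.normal(size=n_devices)  # "expensive" measurement

# Surrogate model: ordinary least squares on the cheap readings.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Naive feature selection: keep only sensors whose fitted weight is
# non-negligible, so fewer costly measurements are needed in production.
selected = np.flatnonzero(np.abs(w) > 0.2)
print(selected)
```

In a real setting the threshold would be chosen by cross-validation, and the linear model replaced by whatever surrogate the data supports.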

From a practical standpoint, starting in the second year, the activity would continue by analyzing a current practical need, namely "predictive maintenance". A significant amount of data is currently collected by many industries, although in a rather disorganized way. The student would start by analyzing the practical problems of data collection, storage, and transmission, while, at the same time, practicing the principles of data profiling, classification, and regression (all topics currently considered part of "machine learning"). The analysis of sequences to predict the final event, or rather to identify a trigger, is an open research topic, with implications far beyond CAD. Unfortunately, unlike popular ML scenarios, the availability of data is a significant limitation: a situation where the amount of data available for training is insufficient is sometimes labeled "small data".

Expected target publications:

Top journals with impact factors

* ASOC -- Applied Soft Computing
* TEC -- IEEE Transactions on Evolutionary Computation
* TC -- IEEE Transactions on Computers

Top conferences

* ITC -- International Test Conference
* DATE -- Design, Automation and Test in Europe Conference
* GECCO -- Genetic and Evolutionary Computation Conference
* CEC/WCCI -- World Congress on Computational Intelligence
* PPSN - Parallel Problem Solving From Nature

Notes:

* The CAD Group has a long record of successful applications of intelligent systems in several different domains. For the specific activities, the list of possibly involved companies includes: SPEA, Infineon (through the Ph.D. student Niccolò Bellarmino), ST Microelectronics, and Comau (through the Ph.D. student Eliana Giovannitti)
* The tutors are collaborating with Infineon on the subjects listed in the proposal: two contracts have been signed, and a third extension is currently under discussion; a joint paper has been published at ITC, another has been submitted, and others are in preparation.
* The tutors are collaborating with SPEA under the umbrella contract "Colibri". Such a contract is likely to be renewed on precisely the topics listed in the proposal.

Required skills

Proficiency in Python (including a deep understanding of object-oriented principles and design patterns); Proficiency in using libraries such as NumPy and SciPy for data analysis and manipulation // Preferred: Knowledge of Electronic CAD

28

Security of Software Networks

Proposer

Cataldo Basile

Topics

Cybersecurity, Parallel and distributed systems, Quantum computing

Group website

https://www.dauin.polito.it/research/research_groups/torsec_security_group

Summary of the proposal

The massive progress in software network complexity, flexibility, and manageability has only marginally been used to increase the security of these networks: attacks may remain undiscovered for months, and they are mainly caused by human errors.
The PhD proposal has a high-level research objective: investigating and exploiting software networks' full potential to mitigate cybersecurity risks automatically and provide defensive tools with more intelligence and a higher level of automation.

Research objectives and methods

Nowadays, security defenders are always one or more steps behind the attackers. When vulnerabilities are found, patches follow only days later, and anti-virus signature updates come only after new malware has been discovered. Intrusion Prevention Systems provide simple reactions triggered by simplistic conditions, often considered ineffective by large companies. Moreover, companies face risks of misconfiguration whenever security policies or network layouts need an update. Statistics are clear: attacks are discovered with unacceptable delays, and in most cases they are caused by human errors. The solution is also clear: providing defensive tools with more intelligence and a higher level of automation.
 
This PhD proposal aims to use these features for security purposes, i.e., to develop AI-based systems able to perform policy refinement, configure the network and security controls starting from high-level security requirements, and carry out policy reaction to respond to incidents and mitigate risks. Coupling this understanding of the features of security controls and software networks will allow building more resilient information systems that discover and react to attacks faster and more effectively.
 
The initial phases of the PhD will be devoted to formalizing the framework models needed to reach the most ambitious research objectives. 
During the first year, the candidate will improve the model of security controls' capabilities and define the formal model of the software networks' reconfiguration abilities. The most relevant families of security controls will be analyzed, starting from filtering (up to layer seven) and channel protection. The candidate will contribute to a journal publication that extends an existing conference publication. 
The work on software network modelling will start with analysing the features of Kubernetes technology. It will also identify strategies to use pods and clusters to define policy enforcement units that merge security controls with complementary features for protecting network parts, which will be used for refinement purposes. The results of this task will be first submitted to a conference and then extended to a journal publication.
 
More attention will be devoted to the refinement and reaction models from the second year. The candidate will study the possibility of building refinement models that use advanced logic (forward and abductive reasoning) to represent decision-making processes. AI (Artificial Intelligence) and machine learning techniques will be investigated to learn from decisions overridden and manual corrections made by humans for fine-tuning security decisions. 
The candidate will also perform research towards a framework for abstractly representing reaction strategies to security events. Every strategy requires adaptations to be enforced in each context; the research will investigate how to characterize and implement this adaptation and what the proper level of abstraction for strategies is. The effectiveness of these models will be evaluated on relevant scenarios like corporate networks, ISPs, automotive, and Industrial Control Systems, also coming from two EC-funded European Projects. The candidate will be guided in evaluating and deciding on the best venues to publish the results of their research.
 
Moreover, to increase the impact of the research and cover existing gaps, the candidate will investigate how to standardize the information used to model the scenarios requiring reactions and the reaction and threat intelligence data with the proper level of detail.
 
One or two internship periods of 3-6 months at an external institution are expected, with the objective of acquiring competencies as specific needs emerge. Research collaborations are ongoing with EU academia and with leading companies in the EU.
We expect at least two publications at top-level cybersecurity conferences and symposia (e.g., ACM CCS, IEEE S&P) or top conferences about software networks (e.g., IEEE NetSoft).
 
The models of the security controls' and software networks' capabilities will be submitted to top-tier journals in the cybersecurity, networking, and modelling scope (e.g., IEEE/ACM Transactions on Networking, IEEE Transactions on Network and Service Management, IEEE Journal on Selected Areas in Communications, IEEE Transactions on Dependable and Secure Computing).
 
We also expect results for at least one journal article about the automatic enforcement and empirical assessment of software protections. In addition to the journals listed above, IEEE Transactions on Emerging Topics in Computing will be considered if the innovation of the results warrants it.

Required skills

The candidate needs a solid background in cybersecurity (risk management), defensive controls (e.g., firewall technologies and VPNs), monitoring controls (e.g., IDS/IPS and threat intelligence), and incident response. Moreover, they should also possess a background in software network technologies (SDN, NFV, Kubernetes) and cloud computing. Skills in formal modelling and logical systems are a plus.

29

Emerging Topics in Evolutionary Computation: Diversity Promotion and Graph-GP

Proposer

Giovanni Squillero

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://www.cad.polito.it/

Summary of the proposal

Soft Computing, including evolutionary computation (EC), is currently experiencing a unique moment. While fewer scientific papers focus solely on EC, traditional EC techniques are frequently utilized in practical activities under different labels. The objective of this research is to examine both the new representations that scholars are currently exploring and the old, yet still pressing, problems that practitioners are facing.

Research objectives and methods

Although the classical approach to representing solutions in EC involves bit strings and expression trees, far more complex encodings have recently been proposed. More specifically, graph-based representations have led to novel applications of EC in circuit design, cryptography, image analysis, and other fields.

At the same time, divergence of character, or, more precisely, the lack of it, is widely recognized as the single most impairing problem in the field of EC. While divergence of character is a cornerstone of natural evolution, in EC all candidate solutions eventually crowd the very same areas of the search space. Such a "lack of speciation" was already pointed out in the seminal work of Holland back in 1975. It is usually labeled with the oxymoron "premature convergence" to stress the tendency of an algorithm to converge toward a point where it was not supposed to converge in the first place. The research activity would tackle "diversity promotion", that is, either "increasing" or "preserving" diversity in an EC population, both from a practical and a theoretical point of view. It will also include the related problems of defining and measuring diversity.

The research project shall include an extensive experimental study of existing diversity preservation methods across various global optimization problems. Open-source, general-purpose EA toolkits, namely inspyred and DEAP, will also be used to study the influence of various methodologies and modifications on the population dynamics. Solutions that do not require the analysis of the internal structure of the individual (e.g., Cellular EAs, Deterministic Crowding, Hierarchical Fair Competition, Island Models, or Segregation) shall be considered. This study should allow the development of an effective, possibly new, methodology able to generalize and coalesce most of the cited techniques.
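As a concrete example of the kind of diversity preservation technique under study, classic fitness sharing can be sketched in a few lines. The genomes, radius (sigma), and distance metric below are illustrative and not tied to inspyred or DEAP: each individual's raw fitness is divided by a "niche count" so that crowded regions of the search space are penalized.

```python
import numpy as np

def shared_fitness(fitness, genomes, sigma=2.0, alpha=1.0):
    """Classic fitness sharing: divide each individual's raw fitness by
    its niche count, penalizing crowded regions of the search space.
    Euclidean distance over real-valued genomes is used for illustration."""
    d = np.linalg.norm(genomes[:, None, :] - genomes[None, :, :], axis=-1)
    sh = np.where(d < sigma, 1.0 - (d / sigma) ** alpha, 0.0)
    niche_count = sh.sum(axis=1)  # self-distance is 0, so each row includes 1 for self
    return fitness / niche_count

# Two individuals crowded near the origin and one isolated individual:
# the isolated one keeps its raw fitness, the crowded pair is penalized.
genomes = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0]])
fitness = np.array([1.0, 1.0, 1.0])
shared = shared_fitness(fitness, genomes)
print(shared)
```

Note that this method requires a meaningful distance between individuals, which is exactly what becomes problematic for the genotype-level methodologies planned for the second year.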

During the first year, the candidate will take a course in Artificial Intelligence and all the Ph.D. courses of the educational path on Data Science. Additionally, the candidate is required to improve their knowledge of Python.

Starting from the second year, the research activity shall include Turing-complete program generation. The candidate will work on an open-source Python project, currently under active development. The candidate will try to replicate the work of the first year on much more difficult genotype-level methodologies, such as Clearing, Diversifiers, Fitness Sharing, Restricted Tournament Selection, Sequential Niching, Standard Crowding, Tarpeian Method, and Two-level Diversity Selection.

At some point, probably toward the end of the second year, the new methodologies will be integrated into the Grammatical Evolution framework developed at the Machine Learning Lab of the University of Trieste. GE allows a sharp distinction between phenotype, genotype, and fitness, creating an unprecedented test bench (the research group is already collaborating with a group at UniTS on these topics; see "Multi-level diversity promotion strategies for Grammar-guided Genetic Programming", Applied Soft Computing, 2019).

A remarkable goal of this research would be to link phenotype-level methodologies to genotype measures.

Target Publications

Journals with impact factors
- ASOC - Applied Soft Computing
- ECJ - Evolutionary Computation Journal
- GPem - Genetic Programming and Evolvable Machines
- Informatics and Computer Science Intelligent Systems Applications
- IS - Information Sciences
- NC - Natural Computing
- TCIAIG - IEEE Transactions on Computational Intelligence and AI in Games
- TEC - IEEE Transactions on Evolutionary Computation

Top conferences

- ACM GECCO - Genetic and Evolutionary Computation Conference
- IEEE CEC/WCCI - World Congress on Computational Intelligence
- PPSN - Parallel Problem Solving From Nature

Notes:

The tutors regularly present tutorials on Diversity Preservation at top conferences in the field, such as GECCO, PPSN, and CEC. Additionally, they are involved in the organization of a workshop focused on graph-based representations for EAs. Moreover, the research group is in contact with industries that actively consider exploiting evolutionary machine learning for enhancing their biological models, for instance, KRD (Czech Republic), Teregroup (Italy), and BioVal Process (France).

The research group also has a long record of successful applications of evolutionary algorithms in several different domains. For instance, the ongoing collaboration with STMicroelectronics on the test and validation of programmable devices exploits evolutionary algorithms and would benefit from the research.

Required skills

Proficiency in Python (including a deep understanding of object-oriented principles and design patterns, and handling of parallelism); Preferred: Experience with metaheuristics, Experience with optimization algorithms

30

Advanced ICT solutions and AI-driven methodologies for Cultural Heritage resilience

Proposer

Edoardo Patti

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing, Parallel and distributed systems, Quantum computing

Group website

https://eda.polito.it/

Summary of the proposal

This Ph.D. research leverages cutting-edge technologies to preserve Cultural Heritage (e.g., monuments, historical sites, etc.) against natural disasters, climate change, and human-related threats. The interdisciplinary approach integrates ICT tools, Machine Learning, and Data Analytics to develop proactive strategies for the risk assessment, monitoring, and preservation of cultural assets, addressing challenges through innovative solutions for sustainable conservation and resilience.

Research objectives and methods

Recent crises and disasters have affected European citizens' lives, livelihoods, and environment in unforeseen and unprecedented ways. They have transformed our very understanding of them by reshaping hitherto unchallenged notions of the "local" and the "global" and putting into question well-rehearsed conceptual distinctions between "natural" and "man-made" disasters. Modern, high-performance ICT solutions need to be deployed in order to prevent and mitigate the effects of disasters and climate change events by enabling critical thinking and framing a holistic approach for a better understanding of catastrophic events.

The objective of this Ph.D. proposal consists of the design and development of ICT-driven solutions to develop proactive strategies for risk assessment, monitoring, and preservation of Cultural Heritage. The candidate will adopt a comprehensive interdisciplinary approach, seamlessly integrating modern techniques rooted in IoT, Machine/Deep Learning, and Big Data paradigms within the realm of cultural heritage resilience. This approach transcends purely technical facets, encompassing social and cultural dimensions to provide a holistic understanding and effective solutions.

During the three years of the Ph.D., the research activity will be divided into five phases:
- Survey existing literature on modern AI-driven ICT solutions and applications in software engineering, and analyze methodologies and challenges of Cultural Heritage resilience.
- Design and develop a data-driven digital ecosystem, i.e., a distributed IoT platform, for the collection and harmonization of heterogeneous data from the real world to enable on-top advanced visualization and analysis services (e.g., Digital Twins). A multidisciplinary approach ranging from IoT paradigms to the application of Machine/Deep Learning methodologies for Big Data analysis is required in order to allow the development of proactive strategies for risk assessment, monitoring, and preservation of Cultural Heritage.
- Develop algorithms and strategies for context-aware Cultural Heritage resilience by implementing prototype systems for evaluation and refinement.
- Design and implement continuous improvement and fine-tuning strategies for the development of increasingly effective and high-performing prevention strategies.
- Evaluate the effectiveness of the data-driven digital ecosystem and developed strategies through user studies and real-world projects.

Possible international scientific journals and conferences:
- IEEE Transactions on Computational Social Systems
- IEEE Transactions on Industrial Informatics
- Journal on Computing and Cultural Heritage
- Journal of Cultural Heritage
- Engineering Applications of Artificial Intelligence
- Expert Systems with Applications
- IEEE CoSt internat. Conf.
- IEEE SKIMA internat. Conf.

Required skills

Programming and Object-Oriented Programming (preferable in Python).
Knowledge of web application programming.
Knowledge of IoT paradigms.
Knowledge of Machine Learning and Deep Learning.
Knowledge of frameworks to develop models based on Machine Learning and Deep Learning Models

31

Monitoring systems and techniques for precision agriculture

Proposer

Renato Ferrero

Topics

Data science, Computer vision and AI, Software engineering and Mobile computing

Group website

 

Summary of the proposal

The most challenging current demand on the agricultural sector is the production of sufficient and safe food for a growing population without over-exploiting natural resources. This challenge is set in a difficult context of unstable climate conditions, competition for land, water, and energy, and an increasingly urbanized world. The research activity aims to increase the competitiveness of the agri-food system in terms of safety, quality, sustainability, and added value of food products.

Research objectives and methods

The research activity of the PhD candidate will investigate devices and techniques for monitoring agricultural produce in a holistic vision, with the aim of limiting environmental pollution, preventing the misuse of pesticides and fertilizers, reducing water and energy demand, and increasing net profit.
 
A first activity concerns the development of a low-cost proximity monitoring system. Off-the-shelf sensors will be selected to measure the most meaningful parameters, such as the temperature and humidity of both air and soil, light conditions, the pH of the soil, and the concentration of NPK (nitrogen, phosphorus, and potassium) in the soil. The adoption of low-cost sensors will make a pervasive distribution in the monitored environments possible. All the gathered data will be associated with GPS coordinates and the date and time of the measurement. The measurements will be repeated several times at different points in the crop; at the end of each sampling, the measurements will be synchronized to a server to keep track of them over time. The integration of sensing, computing, and communication functionalities within small-size devices will be a key element for increasing the pervasiveness and robustness of the network. Possibly, the integration of the sensor network with drone-based systems will be investigated.
A closely related subsequent activity regards the analysis of the data collected by the sensor network, with several goals, as detailed in the following. Different calibration strategies will be evaluated: reference values provided by other sensors will be used to determine the most effective calibration strategies and when calibration needs to be repeated in order to ensure precise measurements. The correlation of the collected data with operating and environmental conditions (e.g., measurement range, microclimatic characteristics) will be analyzed in order to assess the variability of the measurements, both in time and space. In particular, understanding spatial variability may lead to the development of models for data spatialization. Finally, the benefits of sensor redundancy, in terms of data availability, reliability, network performance, and maintainability, will be investigated.
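The calibration step above can be illustrated with a minimal sketch: a low-cost sensor's raw readings are mapped to reference units by fitting gain and offset with least squares against a co-located reference instrument. All numbers below are invented for illustration.

```python
import numpy as np

# Illustrative linear calibration of a low-cost sensor against a
# reference instrument: raw readings are mapped to reference units by
# fitting a gain and an offset with least squares (values are made up).
raw = np.array([10.0, 20.0, 30.0, 40.0])   # low-cost sensor output
ref = np.array([12.1, 22.0, 31.9, 42.0])   # co-located reference values

gain, offset = np.polyfit(raw, ref, 1)     # degree-1 fit: ref ~ gain*raw + offset
calibrated = gain * raw + offset
residual = np.max(np.abs(calibrated - ref))
print(gain, offset, residual)
```

Repeating such fits over time, and tracking the drift of gain, offset, and residual, is one simple way to decide when recalibration is needed.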
 
A complementary research activity will focus on optical remote sensing. Non-destructive analysis techniques based on UV-Vis-NIR spectroscopy will be adapted in order to allow continuous monitoring of many critical aspects of the production. In particular, new procedures will be developed to correlate the absorption of light radiation, measured with spectroscopic techniques, with the chemical and physical properties of soil, crops, and horticultural produce. Computer graphics techniques will be studied to develop new protocols for the calculation of vegetation and soil indices (e.g., NDVI, GNDVI, SAVI, RE). Images will be taken by cameras at different wavelengths, ranging from 1 to 14 microns. Algorithms for pattern analysis and recognition will be developed for the automatic identification of specific parts of the plant, such as leaves or the stem, and the detailed analysis of its state of health, with the goal of correlating the images of leaves with growth and the onset of specific diseases.
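Of the indices listed above, NDVI is the simplest to compute: a per-pixel normalized difference between the near-infrared and red bands. The sketch below uses tiny invented reflectance patches purely to show the computation.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index from NIR and red bands.
    Values near +1 indicate dense, healthy vegetation; values near 0 or
    below indicate bare soil, water, or stressed plants."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)  # eps avoids division by zero

# Toy 2x2 reflectance patches (illustrative values on a 0-255 scale):
# the top row mimics vegetated pixels, the bottom row bare soil.
nir = np.array([[200, 180], [60, 50]])
red = np.array([[40, 60], [55, 48]])
print(np.round(ndvi(nir, red), 2))
```

GNDVI and SAVI follow the same pattern with a different band or an added soil-adjustment factor.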
 
The PhD research activities can be grouped into three consecutive phases, each one roughly corresponding to one year in the PhD career. Initially, the PhD candidate will improve his/her background by attending PhD courses and surveying relevant literature. After this initial training, the student is expected to select and evaluate the most promising solutions for monitoring agricultural produce. The second phase regards experimental activities on the field aimed at the development of monitoring systems and techniques, such as the integration and deployment of the sensor network, the evaluation of effective calibration strategies, the acquisition of multispectral images, the computation of vegetation and soil indices, and the integration with drone-based systems. Finally, the data collected will be analyzed during the third phase with different goals: assessment of the measurement variability according to the operating conditions (e.g., measurement range, microclimatic characteristics, etc.), influence of sensor redundancy on the network performance, modeling the spatial distribution of data, relationship between sensor measurements and vegetation indices, etc.
 
The research will be carried out as part of the activities of the National Research Centre for Agricultural Technologies (Agritech).
 
Some expected target publications are:
-       IEEE Transactions on AgriFood Electronics
-       ACM Transactions on Sensor Networks
-       IEEE Transactions on Image Processing
-       Information Processing in Agriculture (Elsevier)
-       Computers and Electronics in Agriculture (Elsevier)
 

Required skills

As the research activity regards the design, development, and evaluation of digital technologies for next-generation agriculture in a holistic vision, the PhD candidate is required to possess multidisciplinary skills: e.g., distributed computing, embedded systems, computer networks, security, computer graphics, programming, database management.

32

Designing heterogeneous digital/neuromorphic fog computing systems and development framework 

Proposer

Gianvito Urgese

Topics

Parallel and distributed systems, Quantum computing, Life sciences, Data science, Computer vision and AI

Group website

https://eda.polito.it/

Summary of the proposal

The candidate will be involved in the development of:
- A Heterogeneous Prototyping Platform (HPP) for Spiking Neural Network (SNN) simulations and AI applications on digital/neuromorphic systems.
- A framework for end-to-end engineering of SNN simulations on neuromorphic devices.
- A SW library optimizing SNNs on RISC-V-based edge devices.
The PhD aims to enhance tools for developing neuromorphic solutions on fog computing systems, advancing their adoption in IoT, bioinformatics, and neuroscience domains.

Research objectives and methods

Research objectives
Neuromorphic HW architectures, originally designed for brain simulations, have garnered interest in various fields, including IoT edge devices, high-performance computing, bioinformatics, industry, and robotics. These platforms offer superior scalability compared to traditional multi-core architectures and excel at problems requiring massive parallelism, for which they are inherently optimized. Additionally, the scientific community recognizes their suitability for low-power and adaptive applications that demand real-time data analysis.
The objectives of the PhD plan encompass several key aspects:
- Develop the necessary knowledge to analyze available data from product documentation, extracting experimental features from complex components and systems.
- Evaluate the potential of Spiking Neural Networks (SNNs) efficiently simulated on neuromorphic platforms when customized at the abstraction level of a flow graph, enabling the implementation of general-purpose algorithms.
- Contribute to the design and development of a Heterogeneous Prototyping Platform (HPP) and a framework for the development of neuromorphic solutions, covering all engineering phases from specification definition to HW procurement and the installation of server nodes, neuromorphic HW, and market-available sensors.
- Propose a general approach for generating simplified neuromorphic models that implement basic kernels, enabling users to directly apply them in their algorithms. The level of abstraction of these models will depend on the availability of SW libraries supporting the target neuromorphic HW.
- Utilize the HPP to design proof-of-concept applications by combining a set of neuromorphic models, aiming to provide outputs with acceptable error rates compared to versions running on standard systems. These applications should also reduce execution time and power consumption.
- Contribute to the design of a SW library that optimizes the execution of SNNs on RISC-V CPUs used in edge computing devices.
The research activities will primarily focus on implementing algorithms in three main application areas:
- Simulations of models developed by the EBRAINS-Italy neuroscience community.
- Real-time data analysis from IoT and industrial applications.
- Analysis and pattern matching of neuroscience and bioinformatics data streams.
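The SNNs discussed throughout are built from spiking neuron models; a minimal leaky integrate-and-fire (LIF) neuron, the standard textbook unit, can be simulated in a few lines. This generic sketch is not tied to any of the neuromorphic HW or frameworks named above, and all parameter values are illustrative.

```python
import numpy as np

def lif_spikes(input_current, v_th=1.0, tau=10.0, dt=1.0, v_reset=0.0):
    """Minimal leaky integrate-and-fire neuron simulated with a fixed
    time step: the membrane potential leaks toward zero, integrates the
    input current, and emits a spike (then resets) on crossing v_th."""
    v, spikes = 0.0, []
    for i_t in input_current:
        v += dt / tau * (-v + i_t)   # leaky integration step
        if v >= v_th:
            spikes.append(1)
            v = v_reset              # reset after the spike
        else:
            spikes.append(0)
    return np.array(spikes)

# A constant suprathreshold current makes the neuron spike periodically.
out = lif_spikes(np.full(50, 2.0))
print(out.sum())
```

On neuromorphic HW this update is performed natively and in parallel for very large neuron populations, which is the efficiency argument behind the platforms above.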
 
Outline of the research work plan
1st year. The candidate will extensively study cutting-edge neuromorphic frameworks and their application in deploying simulations on various neuromorphic HW technologies. He/She will contribute to the development of a framework that enables the semi-automatic generation and connection of neuromorphic models, streamlining the modeling process and promoting the exploration of new computational paradigms. Additionally, in the first year, the candidate will participate in designing the Neuromorphic Computing component of the Heterogeneous Prototyping Platform (HPP-NC). He/She will also contribute to the design of a software library that optimizes SNN execution on standard CPUs, specifically RISC-V-based edge computing devices.
2nd  year. The candidate will create an integrated methodological approach for modeling applications and systems. He/She will utilize experiences from the first year of research to conduct a multi-scenario analysis. The candidate will establish the foundational structure of a user-friendly neuromorphic computing framework, providing access and validation for the HPP-NC prototype. Additionally, he/she will define two Modelling, Simulation, and Analysis (MSA) use cases tailored to the needs of Neuroscientists, Bioinformaticians, and Data scientists/engineers.
3rd  year. The candidate will implement the proposed approach in diverse industrial and IoT use cases, enabling its application across different domains. He/She will analyze investments in neuromorphic compilers for upcoming neuromorphic HW, alongside general-purpose CPUs. Moreover, the candidate will assist in integrating the HPP-NC into the EBRAINS service ecosystem.
 
The research activities will be carried out in collaboration with the partners of three funded projects: the Fluently project, the Arrowhead fPVN project, and the EBRAINS-Italy project.
 
List of possible venues for publications
The main outcome of the project will be disseminated in three international conference papers and at least one publication in a journal of the AIoT and neuromorphic fields. Moreover, the candidate will disseminate the major results in the EBRAINS-Italy meetings and events.
In the following, the possible conference and journal targets:
- IEEE/ACM international conferences (e.g., DAC, DATE, AICAS, NICE, ISLPED, GLSVLSI, PATMOS, ISCAS, VLSI-SoC);
- IEEE/ACM journals (e.g., TCAD, TETC, TVLSI, TCAS-I, TCAS-II, TCOMP) and MDPI journals (e.g., Electronics).

Required skills

MS degree in computer engineering, electronics engineering or physics of complex systems. 
Excellent skills in computer programming, computer architecture, embedded systems, and IoT applications. 
Technical background in deep learning, AI, edge computing, electronic design, modelling, simulation and optimization.

33

Cloud at the edge: creating a seamless computing platform with opportunistic datacenters

Proposer

Fulvio Giovanni Ottavio Risso

Topics

Computer architectures and Computer aided design, Parallel and distributed systems, Quantum computing, Software engineering and Mobile computing

Group website

https://netgroup.polito.it 

Project website: https://liqo.io

Summary of the proposal

The idea is to aggregate the huge number of traditional computing/storage devices available in modern environments (such as desktop/laptop computers, embedded devices, etc.) into an opportunistic datacenter, hence transforming all the current devices into datacenter nodes.
This proposal aims at tackling the most relevant problems towards the above scenario, such as defining a set of orchestration algorithms, as well as a proof-of-concept showing the above system in action.

Research objectives and methods

Cloud-native technologies are increasingly deployed at the edge of the network, usually through tiny datacenters made by a few servers that maintain the main characteristics (powerful CPUs, high-speed network) of the well-known cloud datacenters. However, most of current domestic environments and enterprises host a huge number of traditional computing/storage devices, such as desktop/laptop computers, embedded devices, and more, which run mostly underutilized.
This project proposes to aggregate the above available hardware into an "opportunistic" datacenter, replacing the current micro-datacenters at the edge of the network, with consequent potential savings in energy and CAPEX. This would transform all the current computing hosts, including their operating system software, into datacenter nodes.
The current Ph.D. proposal aims at investigating the problems that may arise in the above scenario, such as defining a set of algorithms for orchestrating jobs on an "opportunistic" datacenter, as well as building a proof of concept showing the above system in action.
 
The objectives of the present research are the following:
- Evaluate the potential economic impact (in terms of hardware expenditure, i.e., Capital Expenditures - CAPEX, and energy savings, i.e., Operating Expenses - OPEX) of such a scenario, in order to validate its economic sustainability and its impact in terms of energy consumption.
- Extend existing operating systems (e.g., Linux) with lightweight distributed processing/storage capabilities, in order to allow current devices to host "foreign" applications (when resources are available), or to borrow resources from other machines and delegate the execution of some of their tasks to the remote device.
- Define the algorithms for job orchestration on the "opportunistic" datacenter, which may differ considerably from traditional orchestration algorithms (limited network bandwidth between nodes; highly heterogeneous node capabilities in terms of CPU/RAM/etc.; reliability considerations; the necessity to leave free resources to the desktop owner; etc.).
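The orchestration objective above can be illustrated with a minimal sketch: a hypothetical greedy scheduler that prefers reliable, well-connected nodes while leaving the owner's reserved share untouched. All node/job attributes, names, and scoring weights below are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    free_cpu: float      # cores currently unused
    free_ram: float      # GiB currently unused
    bandwidth: float     # Mbit/s towards the cluster
    reliability: float   # empirical availability in [0, 1]
    owner_reserve: float # fraction of CPU kept free for the desktop owner

@dataclass
class Job:
    name: str
    cpu: float
    ram: float

def score(node: Node, job: Job) -> float:
    """Higher is better: prefer reliable, well-connected nodes
    that still leave the owner's reserved share untouched."""
    usable_cpu = node.free_cpu * (1.0 - node.owner_reserve)
    if job.cpu > usable_cpu or job.ram > node.free_ram:
        return float("-inf")  # infeasible placement
    # weight reliability and bandwidth; reward loose fits (slack)
    slack = (usable_cpu - job.cpu) + (node.free_ram - job.ram)
    return node.reliability * 2.0 + node.bandwidth / 1000.0 + slack * 0.1

def place(jobs, nodes):
    """Greedily assign each job to the highest-scoring feasible node,
    updating the node's free resources after each assignment."""
    placement = {}
    for job in jobs:
        best = max(nodes, key=lambda n: score(n, job))
        if score(best, job) == float("-inf"):
            placement[job.name] = None  # no feasible node
            continue
        placement[job.name] = best.name
        best.free_cpu -= job.cpu
        best.free_ram -= job.ram
    return placement
```

A real orchestrator would additionally react to node churn and owner activity at runtime; the sketch only shows the static placement decision.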
 
The research activity is part of the Horizon Europe FLUIDOS project (https://www.fluidos.eu/) and it is related to current active collaborations with Aruba S.p.A. (https://www.aruba.it/) and Tiesse (http://www.tiesse.com/).
 
The research activity will be organized in three phases:
- Phase 1 (Y1): Economic and energy impact of opportunistic datacenters. This would include real-world measurements in different environment conditions (e.g., University lab; domestic environment; factory) of computing characteristics and energy consumption, and the creation of a model to assess potential savings (economic/energy).
- Phase 2 (Y2): Job orchestration on opportunistic datacenters. This would include real-world measurements of the features required by distributed orchestration algorithms (CPU/memory/storage consumption; device availability; network characteristics), and the definition of a scheduling model that achieves the foreseen objectives, evaluated with simulations.
- Phase 3 (Y3): Experimenting with opportunistic datacenters. This would include the creation of a proof of concept of the defined orchestration algorithm, executed on real platforms, with real-world measurements of the behavior of the above algorithm in a specific use case (e.g., University computing lab, factory with many data acquisition devices, etc.).
 
Expected target conferences are the following:
Top conferences:
- USENIX Symposium on Operating Systems Design and Implementation (OSDI)
- USENIX Symposium on Networked Systems Design and Implementation (NSDI)
- International Conference on Computer Communications (INFOCOM)
- ACM European Conference on Computer Systems (EuroSys)
- ACM Symposium on Principles of Distributed Computing (PODC)
- ACM Symposium on Operating Systems Principles (SOSP)
 
Journals:
- IEEE/ACM Transactions on Networking
- IEEE Transactions on Computers
- ACM Transactions on Computer Systems (TOCS)
- IEEE Transactions on Cloud Computing
 
Magazines:
- IEEE Computer

Required skills

The ideal candidate has good knowledge and experience in computing architectures, cloud computing and networking. Availability for spending periods abroad would be preferred for a more profitable investigation of the research topic.

34

AI-driven cybersecurity assessment for automotive

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI, Cybersecurity

Group website

https://www.dauin.polito.it/en/research/research_groups/dbdm_database_and_data_mining_group https://www.dauin.polito.it/research/research_groups/torsec_security_group https://www.drivesec.com/

Summary of the proposal

This PhD proposal aims to investigate how to leverage Generative AI techniques for assessing the cybersecurity posture of vehicles and automotive infrastructures and for evaluating compliance with existing standards (e.g., ISO 21434). It will propose innovative LLM-based approaches to retrieve, recommend, and generate penetration tests and vulnerability-related information, and will study innovative methodologies based on Multimodal Learning and Retrieval-Augmented Generation.

Research objectives and methods

Research objectives 
Assessing the resilience of vehicles and their components has become crucial; it relies on tests that assess the security of a System Under Test. Vulnerability assessment (VA) and penetration testing (PT) are two primary complementary techniques that serve this purpose. VA is managed with automatic tools, but the existing ones rarely work in the automotive field. PT relies on human teams, which are costly and difficult to hire. Hence, the aim of this research is to investigate how advancements in AI techniques can help automate threat assessment and risk evaluation in the automotive field.
The student will investigate innovative methods for the automatic processing and interpretation of the data produced by the analysis tools in their context, understand the implications from the security point of view, and use them to build a risk analysis model. Moreover, the student will explore the potential of Generative AI techniques, in combination with Search Engines, Question Answering models, and Multimodal Learning architectures to automate the process of retrieval, recommendation, and generation of penetration tests. 
 
Outline
The student will become familiar with the field of cybersecurity for automotive, its peculiarities, and its normative framework. Their main research goal is to leverage Generative AI to model VA/PT operations, generate new tests for assessing the verification objectives, and adapt families of tests to work outside their original context. To this end, the algorithms, models, and techniques considered in the research activities will include (but are not limited to):
- Large Language Models (e.g., GPT [1], Llama 2 [2], Llava [3]), to leverage the capabilities of transformer-based generative models to interpret end-users' questions posed in natural language, generate text and code that meet in-context requirements, and perform multi-hop reasoning based on Chain-of-Thought (CoT) Prompting;
- Multimodal Architectures (e.g., CLIP [4]), to effectively handle input data in different modalities (e.g., images, tables, speech);
- Search engines (e.g., ElasticSearch [5]), to efficiently store, index, and retrieve data about vulnerabilities and penetration tests;
- Retrieval-Augmented Generation (e.g., Llama Index [6]), to efficiently address question answering tasks on proprietary data by leveraging LLM capabilities. 
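The retrieve-then-generate pattern underlying the last two points can be sketched without any of the named tools: the toy below uses a bag-of-words cosine similarity as a stand-in for a real search engine such as ElasticSearch or a dense retriever. The `retrieve` and `build_prompt` helpers are hypothetical names introduced here for illustration only.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real RAG system would use
    a dense encoder plus a search engine (e.g., ElasticSearch)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: dict, k: int = 2) -> list:
    """Return the ids of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(corpus[d])), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: dict) -> str:
    """Assemble an LLM prompt grounded in the retrieved documents,
    so the model answers from vulnerability data rather than memory."""
    context = "\n".join(corpus[d] for d in retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In the actual research, the corpus would hold vulnerability records and penetration-test descriptions, and the prompt would be passed to an LLM.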
 
Industrial collaborations
This research will be made in collaboration with Drivesec s.r.l., which will provide the necessary automotive background, the equipment needed, and the data set for the testing and validation of the developed methods.
 
Open resources
Beyond proprietary data and industrial case studies, the PhD activities will also consider open-source data repositories, models, and projects, e.g.,
- MetaSploit (https://www.metasploit.com/)
- MITRE (https://cve.mitre.org/)
- PentestGPT (https://github.com/GreyDGL/PentestGPT)
- HuggingFace (https://huggingface.co/models)
 
List of possible publication venues
- Conferences: IEEE CSR, ECML PKDD, ACM CIKM, KDD, IEEE ICDE, IEEE ICDM
- Journals: IEEE TKDE, IEEE TAI, ACM TIST, IEEE TIIS, IEEE/ACM ToN, Elsevier Information Sciences, Elsevier Computers in Industry
 
References
[1] OpenAI: GPT-4 Technical Report. CoRR abs/2303.08774 (2023)
[2] https://ai.meta.com/llama/
[3] Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang: LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models. CoRR abs/2312.02949 (2023)
[4] https://openai.com/research/clip
[5] https://www.elastic.co/
[6] https://www.llamaindex.ai/

Required skills

The PhD candidate is expected to:
- have the ability to critically analyze complex systems, model them, and identify weaknesses;
- be proficient in Python programming;
- know cybersecurity fundamentals;
- have a solid background in machine learning and deep learning;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.
We seek motivated students who are willing to work at the intersection between academia and industry.

35

Applications of Large Language Models in time-evolving scenarios

Proposer

Luca Cagliero

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it/ https://smartdata.polito.it

Summary of the proposal

Large Language Models are Generative AI models pretrained on a huge mass of data. Since training examples are collected up to a fixed point in time, LLMs require specific interventions to deal with time-evolving scenarios. Furthermore, they are not designed to process timestamped data such as time series and temporal sequences. The PhD proposal aims to propose new LLM-based approaches to analyze textual and multimedia sources in time-evolving scenarios and to leverage LLMs in timestamped data mining.

Research objectives and methods

Context
Large Language Models (LLMs) have emerged as disruptive Artificial Intelligence technologies supporting a variety of Natural Language Generation tasks, including question answering, text summarization, and text paraphrasing [1,2]. Recently proposed LLMs such as LLaVA [3] support visual content as part of the LLM prompts beyond the raw text. LLMs are known to potentially suffer from biases due to the inherent properties of the training examples. To overcome these "harms", various strategies such as in-context learning, probing, and fine-tuning have been proposed.

Research objectives
The PhD proposal has the twofold aim to address the limitations of LLMs in coping with time-evolving scenarios and timestamped data:
1)    Apply LLMs in time-evolving scenarios: Several textual and visual data sources are, by design, time-evolving. Capturing their temporal evolution is relevant to address several tasks such as intent recognition [5] and summarization [4]. The research activities will investigate the design and development of innovative LLM-based approaches to solve time-evolving tasks.
2)    Dealing with timestamped data: Classical LLMs are designed to handle textual data. Recent Multimodal LLMs handle visual content as well. Conversely, only a limited body of work has focused on coping with timestamped data such as time series [6]. The research activities will study new LLM-based solutions to handle timestamped data.

Tentative work plan
1) Application of LLMs in time-evolving scenarios:
- Analysis of the state-of-the-art of LLMs and Multimodal LLMs;
- Identification of a selection of time-evolving NLP and Multimodal Learning tasks and related benchmarks. Exploration of state-of-the-art models' performance;
- Proposal of new LLM-based approaches to solve the selected tasks.
2) LLMs and timestamped data:
- Review of existing LLM-based approaches to time series and temporal sequences. Classification of their strengths and weaknesses;
- Identification of a selection of tasks related to time series data (e.g., forecasting, segmentation, classification, anomaly detection);
- Design and development of innovative LLM-based approaches to solve the selected tasks.
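As a toy illustration of the timestamped-data direction, a numeric series can be mapped onto a small discrete vocabulary before being placed in an LLM prompt. The uniform binning and token names below are illustrative assumptions introduced here, not a method taken from the cited literature.

```python
def serialize_series(values, n_bins=10):
    """Map a numeric series onto a small discrete vocabulary so it can
    be fed to a text-only LLM as a sequence of tokens. Bin boundaries
    are uniform between the series min and max (a simplification)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for flat series
    tokens = []
    for v in values:
        b = min(int((v - lo) / width), n_bins - 1)  # clamp max into last bin
        tokens.append(f"t{b}")
    return " ".join(tokens)

def forecast_prompt(values, horizon=3):
    """Wrap the serialized history into a forecasting prompt."""
    return (f"Series: {serialize_series(values)}\n"
            f"Continue the series for {horizon} more tokens:")
```

A production approach would instead learn the tokenization (or use patch embeddings, as in TimeGPT-style models); the sketch only shows why serialization is needed at all.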
 
Industrial collaborations
These research activities will be partly carried out in collaboration with the Amazon Research Center in Turin.
 
List of possible publication venues
- Conferences: ACM Multimedia, KDD, ACL, COLING, IEEE ICDM, ECML PKDD, ACM CIKM, INTERSPEECH, IEEE ICASSP
- Journals: IEEE TKDE, ACM TKDD, IEEE TAI, ACM TIST, IEEE/ACM TASLP
 
References
[1] OpenAI. GPT-4 technical report. CoRR, abs/2303.08774, 2023.
[2] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023.
[3] Hao Zhang, Hongyang Li, Feng Li, Tianhe Ren, Xueyan Zou, Shilong Liu, Shijia Huang, Jianfeng Gao, Lei Zhang, Chunyuan Li, Jianwei Yang: LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models. CoRR abs/2312.02949 (2023)
[4] Patcharapruek Watanangura, Sukit Vanichrudee, On Minteer, Theeranat Sringamdee, Nattapong Thanngam, Thitirat Siriborvornratanakul: A Comparative Survey of Text Summarization Techniques. SN Comput. Sci. 5(1): 47 (2024)
[5] Henry Weld, Xiaoqi Huang, Siqu Long, Josiah Poon, Soyeon Caren Han: A Survey of Joint Intent Detection and Slot Filling Models in Natural Language Understanding. ACM Comput. Surv. 55(8): 156:1-156:38 (2023)
[6] Azul Garza, Max Mergenthaler Canseco: TimeGPT-1. CoRR abs/2310.03589 (2023)

Required skills

The PhD candidate is expected to:
- have the ability to critically analyze complex systems, model them, and identify weaknesses;
- be proficient in Python programming;
- know data science fundamentals;
- have a solid background in machine learning and deep learning;
- have a natural inclination for teamwork;
- be proficient in English speaking, reading, and writing.

36

Building Adaptive Embodied Agents in XR to Enhance Educational Activities

Proposer

Andrea Bottino

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.polito.it/cgvg

Summary of the proposal

This research explores the integration of Memory-Augmented Neural Networks (MANNs) in Embodied Conversational Agents (ECAs) to create interactive, personalized, and engaging learning experiences in XR. Such ECAs can adapt to the characteristics and progression of individual learners or learner groups, personalizing education for more effective learning outcomes in both individual and collaborative settings. The challenge is to develop complex yet accessible ECAs for different educational environments.

Research objectives and methods

In the evolving landscape of AI and educational technology, the integration of MANNs and gamification in ECAs offers new opportunities to push the field of AI agents and create highly interactive, adaptive, and engaging learning experiences in XR. The use of MANNs allows ECAs to store and recall previous interactions, addressing several limitations of current conversational agents and enabling unprecedented levels of personalized content and engagement. Ultimately, these ECAs can provide an enhanced learning experience that is both dynamic and responsive to learners' individual needs. In collaborative learning scenarios, these ECAs should be designed to act not just as facilitators but as active participants, encouraging group interaction and supporting the overall learning process. The integration of these technologies offers the potential to explore new educational methodologies that align with the evolving digital competencies of today's learners.
 
RESEARCH OBJECTIVES:

1. Enhancing ECAs with MANNs:
- Develop ECAs that integrate with MANNs to provide a personalized learning experience by remembering and leveraging the learner's individual interactions and history.
- Explore how these memory functions can be optimized to adapt to different learning styles and preferences.
- Explore the possibilities of integrating the ECAs' digital memory with emotional models to deliver lifelike interactions between agents and between the user and the agents.
- In the specific context of XR-based learning, address the main challenges of MANNs, such as the limited storage capacity of certain types of networks, the complexity associated with managing external memory structures and their computational overhead, the problems associated with memory recall processes, and the efficient use of memory to store and retrieve information over extended periods of time.

2. Facilitate collaborative learning through ECAs:
- Develop ECAs that can dynamically participate in collaborative learning environments and contribute to and facilitate group-based educational activities.
- Investigate the effectiveness of ECAs in promoting group dynamics and enhancing the collaborative learning experience.
 
3. Challenges in development and implementation:
- Overcome the technological challenges associated with developing complex ECAs that integrate advanced MANNs and gamification features.
- Ensure the accessibility and effectiveness of these ECAs in a wide range of educational environments, including those with limited technological resources.
- Propose novel methodologies and techniques for MANN creation, for example exploiting XR environments, generative AI, and gamification concepts to support data collection and/or model training and instruction.

4. Assessment:
- Evaluate the impact of MANN-enhanced ECAs on the overall learning experience in XR environments.
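The external-memory idea behind objective 1 can be sketched as a simple key-value store addressed by cosine similarity: interactions are written with an embedding key and recalled by similarity to the current context. This class and its fixed addressing rule are illustrative assumptions; real MANNs learn differentiable read/write heads end-to-end.

```python
import numpy as np

class EpisodicMemory:
    """Minimal external key-value memory for an ECA: store
    (embedding, text) pairs, retrieve past interactions by
    cosine similarity to the current context embedding."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim))
        self.texts = []

    def write(self, key: np.ndarray, text: str) -> None:
        key = key / np.linalg.norm(key)          # store unit-norm keys
        self.keys = np.vstack([self.keys, key])
        self.texts.append(text)

    def read(self, query: np.ndarray, k: int = 1) -> list:
        query = query / np.linalg.norm(query)
        sims = self.keys @ query                  # cosine similarities
        top = np.argsort(sims)[::-1][:k]          # k most similar entries
        return [self.texts[i] for i in top]
```

In the project, the recalled entries would condition the ECA's next response, giving it persistence across sessions; capacity limits and forgetting policies (a stated challenge above) are deliberately omitted here.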

 
The multidisciplinary nature of this project includes expertise from the fields of AI, neuroscience, psychology, education and game design. A key advantage lies in the existing partnership of the proposer's research group with the Department of Neuroscience of the Faculty of Psychology of the University of Turin, which can provide invaluable insights and contributions, especially in the field of cognitive processes and neural mechanisms, enriching the depth and applicability of the project in the field of educational technologies.
 
WORKPLAN
Year 1: Foundation and State-of-the-Art Review
- Q1-Q2: Conduct a comprehensive literature review on the current state of AI, Memory-Augmented Neural Networks (MANNs), Embodied Conversational Agents (ECAs), neuroscience, learning and education. Identify gaps in the current research and develop a detailed research proposal addressing these gaps.
- Q3-Q4: Begin preliminary development of the ECA framework, focusing on basic integration of neural memory.
Year 2: Development and Initial Testing
- Q1-Q2: Develop advanced features for the ECA, incorporating MANNs.
- Q3-Q4: Test the framework with a user panel and refine it according to user feedback.
Year 3: Implementation, Evaluation, and Thesis Writing
- Implement the ECA in real-world individual and collaborative educational settings. Collect user data on its effectiveness, user engagement, and learning outcomes.
 
PUBLICATION VENUES
Journals: IEEE Trans. on Neural Networks and Learning Systems, IEEE Trans. on Learning Technologies, IEEE Trans. on Visualization and Computer Graphics, Neurocomputing, International Journal of Neural Systems
Conferences: IJCAI, NeurIPS, ICONIP, AAAI, ICML, ICRA, and other conferences about the project topics

COLLABORATIONS
The proposer's research group is collaborating with the Neuroscience Department of the Faculty of Psychology of the University of Turin, which will be involved in the project as domain expert and will help in providing insights about neural memory models, develop use cases, design the approach and support the assessment phase.

Required skills

The ideal candidate for this PhD project should possess the following skills and characteristics:
- Expertise in Artificial Intelligence and Machine Learning
- Proficiency in programming languages such as Python and experience in handling and analyzing large data sets
- Familiarity with XR technologies
- Good research and analytical skills
- Excellent communication and collaboration skills
- Publication and scientific writing skills
- Adaptability and problem-solving skills

37

Real-Time Generative AI for Enhanced Extended Reality

Proposer

Andrea Bottino

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.polito.it/cgvg

Summary of the proposal

The integration of generative AI (GenAI) in extended reality (XR) offers transformative potential for the creation of dynamic and immersive experiences in many fields. The project aims to develop optimized GenAI models for XR, with a focus on algorithms that efficiently generate realistic content within the computational limitations of XR hardware.

Research objectives and methods

In the rapidly evolving field of extended reality (XR), the integration of GenAI offers transformative opportunities. GenAI is at the forefront of creating realistic, dynamic and immersive XR experiences. Its ability to automatically generate complex data such as geometries, textures, animations and even emotional voice modulations has a significant impact on various sectors, including education, entertainment and professional training. However, the practical implementation of these advanced technologies in XR faces critical challenges.
The main challenge is to balance the generation of high-quality content with the real-time processing requirements of XR environments. XR devices are known for their limited processing power and require algorithms that are efficient enough to operate within these constraints. This requirement becomes even more critical considering that high-resolution data and sophisticated animations are required to ensure a truly immersive experience.
In addition, the real-time generation of detailed and diverse content, from lifelike avatar animations to context-sensitive geometries or textures, poses significant problems in terms of computational complexity. To overcome these hurdles, the development of lightweight yet powerful GenAI models is crucial. Such models must strike an appropriate balance between execution speed and output quality to ensure that the immersive experience is not compromised.
 
RESEARCH OBJECTIVES
1. Develop efficient GenAI algorithms that can operate in real time in XR environments. These models should efficiently generate high-quality data tailored to the computational constraints of XR devices. By finding the right balance between computational efficiency and content quality, they will enable more complex and realistic XR applications, increasing user engagement and expanding the range of possible XR experiences, from games to professional training simulations.
2. Innovate in the creation of lifelike avatars and environment simulations, with a focus on realistic body and facial animations and the generation of contextual data. The focus is on generating these elements in real time, adapting to user interactions and changes in the XR space. Realistic avatars and environments are key to immersive XR experiences; by improving these aspects, the project aims to increase the sense of presence and immersion for users.
3. Develop systems that can modulate voice and emotional responses in real time based on user interactions. This includes developing AI models capable of understanding and responding to users' emotions to enhance the communicative and interactive aspects of XR. Emotional responsiveness in AI will lead to more natural and intuitive user experiences, which is particularly important for mental health, education, and cultural heritage applications where user engagement and emotional connection are critical.
4. Ensure that the GenAI models and techniques developed are compatible with different XR platforms and scalable to different hardware capacities. This also includes ensuring that the solutions can be adapted to future advancements in XR technology.
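As one concrete example of fitting generative models into XR hardware budgets, the sketch below applies symmetric post-training int8 quantization to a weight tensor. It is just one of several possible compression techniques (alongside pruning and distillation) and is not a method prescribed by the proposal.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric post-training quantization of a weight tensor to int8.
    A single scale maps the float range [-max|w|, +max|w|] onto [-127, 127],
    shrinking storage 4x and enabling faster integer arithmetic on device."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale
```

A full deployment pipeline would quantize per-channel, calibrate activations, and measure the resulting drop in generation quality against the real-time frame budget.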
 
WORKPLAN
- Phase 1: Analysis of the state of the art in GenAI and existing XR systems, identification of gaps and potentials.
- Phase 2: Development of generative algorithms for the creation of XR content (geometries, textures, animations).
- Phase 3: Exploring vocal and emotional modulation and integrating these capabilities into XR avatars.
- Phase 4: Optimization of the models to ensure real-time performance on XR devices.
- Phase 5: Evaluation of the developed models in terms of visual quality and performance.
 
PUBLICATION VENUES
Journals: IEEE Trans. on Visualization and Computer Graphics, Virtual Reality, Pattern recognition, IEEE Trans. On Affective Computing, Computers & Graphics, International Journal of Human-Computer Studies.
Conferences: CVPR, ICPR, ECCV, ICCV, IROS, IJCAI, NeurIPS, ICRA, and other conferences about the project topics              
 

Required skills

The ideal candidate should have a strong background in computer science and AI, with specific skills in generative algorithms and XR. Problem-solving abilities, creativity, and knowledge of model optimization for low-power devices are essential. Experience in GPU programming and immersive user interface development is also required, together with good communication and collaboration skills and publication and scientific writing skills.

38

Transferable and efficient robot learning across tasks, environments, and embodiments

Proposer

Raffaello Camoriano

Topics

Data science, Computer vision and AI

Group website

http://vandal.polito.it/

Summary of the proposal

The project's goal is the design of efficient methods for training, transfer, and inference of high-capacity models for embodied systems. Promising approaches include knowledge distillation, recent fine-tuning and approximation methods reducing the policy execution cost while retaining performance levels. Moreover, constraining model output space to low-dimensional manifold structures arising from the physics of the target problem also holds promise to improve policy efficiency and safety.

Research objectives and methods

Classical learning methods for robotic perception and control tend to target specific skills and embodiments, due to the difficulties in extracting transferable and actionable representations which are invariant to physical properties of the environment and of the robot. However, the performance of such specialized agents can be limited by low model capacity and training on relatively few examples. This can be particularly problematic when tackling complex and long-horizon tasks for which the cost of large-scale data collection on a single robot can be prohibitively high and the complexity of the policy to be learned might benefit from a more expressive function class (i.e., with a larger number of parameters).

Conversely, recent high-capacity, highly flexible machine learning models, such as vision transformers and large multimodal models, proved their worth in less constrained domains such as computer vision and NLP. In such domains, pre-training on large and diverse datasets is possible due to web-scale data availability. This results in rich "generalist" pre-trained models enabling fine-tuning and adaptation to specific target tasks, with large savings in terms of target data collection and positive transfer to new tasks and visual appearances.

A growing research line investigates the extension of high-capacity models to robotic tasks to enable complex skill learning across embodiments and modalities, thanks to the high flexibility of high-capacity architectures (e.g., GATO [1]). RoboCat [2] demonstrates how such models can be applied to solve complex robotic manipulation tasks with visually defined goals, while Open X-Embodiment [3] demonstrates positive transfer for task goals specified in natural language. Octo further extends this concept by supporting multimodal goal definitions [4], while AutoRT [5] also supports multi-robot coordination. Large language models can also be employed to guide exploration and automate reward design for reinforcement learning [6].

However, these methods rely on very large numbers of parameters (i.e., in the order of billions), rendering model storage and real-time inference a challenge. This is a relevant roadblock when local execution on limited robotic hardware is required, as is often the case in open-world unstructured environments. Some of the most advanced multi-embodiment models (e.g., RT-2-X [3]) are so extensive that they cannot be stored locally and require communication with cloud environments to perform inference. This is even more true when model fine-tuning or open-ended learning is required for tackling new tasks. Impractical computational and communication costs and catastrophic forgetting of previous tasks indeed represent major challenges.

The objective of this project is the development of efficient methods for training, transfer, and inference of generalist high-capacity models for embodied and robotic tasks. Several approaches will be investigated, including the use of knowledge distillation, recent fine-tuning methods which proved to reduce the cost of execution of robotic policies (i.e., RT-2-X) from quadratic to linear while retaining performance levels [7], and approximation methods to reduce the number of parameters while retaining approximation power [8]. Moreover, constraining model output space to low-dimensional manifolds arising from the physical constraints of the target problem also holds promise to improve policy efficiency and safety [9] [10].
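Among the approaches listed above, knowledge distillation can be illustrated with the classic temperature-softened KL objective: a small student model is trained to match a large teacher's softened output distribution. This is the generic Hinton-style formulation, not the specific method of any work cited here.

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the core term of knowledge distillation. Scaled by T^2 so gradient
    magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

In the robotics setting, the "logits" would come from the large generalist policy (teacher) and the compact on-board policy (student), with the distillation term typically mixed with the ordinary task loss.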

Potential publication venues include major AI, ML, robotics, and computer vision venues (e.g., TRO, RAL, TPAMI, JMLR, ICRA, IROS, CoRL, NeurIPS, ICML, ICLR, etc.).

Preliminary Main Activities Plan
- M1-M4: Literature review on foundation models for robot learning
- M3-M7: Empirical analysis of state-of-the-art methods for improving foundation model efficiency
- M8-M15: Design and development of novel efficient methods focusing on robotic requirements and resource constraints
- M16-M22: Experimental evaluation of the proposed methods
- M23-M28: Development of novel methods incorporating output space constraints to enforce safety requirements while retaining efficiency and predictive capabilities
- M28-M32: Experimental validation and dissemination of the results
- M32-M36: Thesis writing

References
[1] Reed, Scott, et al. "A generalist agent." Transactions on Machine Learning Research (2022).
[2] Bousmalis, Konstantinos, et al. "RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation." Transactions on Machine Learning Research (2023).
[3] Padalkar, Abhishek, et al. "Open x-embodiment: Robotic learning datasets and rt-x models." arXiv preprint arXiv:2310.08864 (2023).
[4] Octo Model Team, et al. "Octo: An open-source generalist robot policy." (2023).
[5] AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents https://auto-rt.github.io/static/pdf/AutoRT.pdf
[6] M. Kwon, S. M. Xie, K. Bullard, and D. Sadigh, "Reward design with language models," in Proc. Int. Conf. Learn. Representations, 2023, pp. 1-18.
[7] Leal, Isabel, et al. "SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention." arXiv preprint arXiv:2312.01990 (2023).
[8] Xiong, Yunyang, et al. "Nyströmformer: A Nyström-based algorithm for approximating self-attention." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 35. No. 16. 2021.
[9] Liu, Puze, et al. "Robot reinforcement learning on the constraint manifold." Conference on Robot Learning. PMLR, 2022.
[10] Duan, Anqing, et al. "A structured prediction approach for robot imitation learning." The International Journal of Robotics Research (2023).

Required skills

We seek candidates highly motivated to conduct methodological research in ML and robotics.
An excellent background in ML is required, covering theory and software. Proficiency with Python and ML, robotics, or CV frameworks is a must.
Strong communication skills, self-motivation, proven teamwork experience, and independence are necessary.
A proven track record and certifications of fluent speaking and technical writing in English are required.
Prior research experience is highly appreciated.

39

Neural Network reliability assessment and hardening for safety-critical embedded systems

Proposer

Matteo Sonza Reorda

Topics

Computer architectures and Computer aided design, Data science, Computer vision and AI

Group website

https://cad.polito.it/

Summary of the proposal

Neural Networks are increasingly used within embedded systems in many application domains, including cases where safety is crucial (e.g., automotive, space, robotics). Possible hardware faults affecting the underlying hardware (CPU, GPU, TCU) can severely impact the produced results. The goal of the proposed research activity is first to estimate the probability that critical failures are produced, and then to devise effective solutions for system hardening, playing mainly at the software level.

Research objectives and methods

NNs are increasingly adopted in embedded systems, even for safety-critical applications (e.g., in the automotive, aerospace, and robotics domains), where the probability of failures must remain below well-defined (and extremely low) thresholds. This goal is particularly challenging, since the hardware used to run the NN often corresponds to extremely advanced devices (e.g., GPUs, or dedicated AI accelerators) built with highly sophisticated (and hence less mature) semiconductor technologies. On the other hand, NNs are known to have some intrinsic robustness, and can tolerate a given number of faults inside the hardware. Unfortunately, given the complexity of NN algorithms and of the underlying architectures, an extensive analysis to understand which (and how many) faults are particularly critical is difficult to perform, at least with usual computational resources.

The planned research activities aim first at exploring the effects of faults affecting the hardware of a GPU/AI accelerator supporting the NN execution. Experiments will study the effects of the considered faults on the results produced by the NN. This study will mainly be performed by resorting to fault injection experiments. In order to keep the computational effort reasonable, different solutions will be considered, combining simulation-based, emulation-based, and multi-level fault injection. The trade-off between the accuracy of the results and the required computational effort will also be evaluated. Based on the gathered results, hardening solutions acting on the hardware and/or the software will be devised, aimed at improving the resilience of the whole application with respect to faults, thus matching the safety requirements of the target applications.
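As a rough illustration of simulation-based fault injection at the application level (real campaigns would target the GPU/accelerator microarchitecture, e.g. through tools such as NVbitFI), the sketch below flips a single bit in the IEEE-754 representation of one NN weight and compares the faulty output against the golden one; all values are hypothetical:

```python
import random
import struct

import numpy as np

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE-754 float32 representation of `value`,
    mimicking (very abstractly) a single-event upset in a memory cell
    holding a NN weight."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (faulty,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return faulty

def inject_fault(weights: np.ndarray, rng: random.Random) -> np.ndarray:
    """Return a copy of `weights` with one random bit flipped in one element."""
    faulty = weights.astype(np.float32).copy()
    flat = faulty.ravel()
    idx = rng.randrange(flat.size)
    bit = rng.randrange(32)
    flat[idx] = flip_bit(float(flat[idx]), bit)
    return faulty

# Toy single-layer "network": compare golden vs. faulty outputs.
rng = random.Random(0)
w = np.ones((4, 4), dtype=np.float32)
x = np.ones(4, dtype=np.float32)
golden = w @ x
faulty_out = inject_fault(w, rng) @ x
print("max output deviation:", np.abs(golden - faulty_out).max())
```

A fault injection campaign then repeats this over many fault sites and inputs, classifying each outcome (masked, tolerable deviation, or critical misprediction).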

The proposed plan of activities is organized in the following phases (for each phase, the indicative time span in months from the beginning of the PhD period is reported):
 - phase 1 (M1 to M6): the student will first study the state of the art and the literature in the area of NNs, their implementation on different platforms (including CPUs, GPUs, and hardware accelerators) and their applications. At the same time, the student will become familiar with existing fault injection environments (e.g., NVbitFI). Suitable cases of study will also be identified, whose reliability and safety could be analyzed with respect to faults affecting the underlying hardware.
 - phase 2 (M7-M18): suitable solutions to analyze the impact of faults on the considered accelerator will be devised and prototypical environments implementing them will be put in place.
 - phase 3 (M19-M24): based on the results of a set of fault injection campaigns performed to assess the reliability and safety of the selected cases of study, a detailed analysis leading to the identification of the most critical faults/components will be carried out.
 - phase 4 (M25 to M36): suitable hardening solutions will be proposed and evaluated.

Phases 2 to 4 will include dissemination activities, based on writing papers and presenting them at conferences (e.g., ETS, VTS, IOLTS, DATE). The most relevant proposed methods and results will be submitted for publication on the journals in the field, such as the IEEE Transactions on Computers, CAD, and VLSI, as well as Elsevier Microelectronics & Reliability.

We also plan for a strong cooperation with the researchers of other universities and research centers working in the area, such as the University of Trento, the University of California at Irvine (US), the Federal University of Rio Grande do Sul (Brazil), NVIDIA.

Required skills

The candidate should have basic skills in
- digital design
- computing architectures
- neural networks

40

Design of an integrated system for testing headlamp optical functionalities

Proposer

Bartolomeo Montrucchio

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

https://www.dauin.polito.it/it/la_ricerca/gruppi_di_ricerca/grains_graphics_and_intelligent_systems

https://www.italdesign.it/services-electric-and-electronics/harness-and-lighting/

Summary of the proposal

Recent automobile developments rely on several sensors, such as cameras and radars. These sensors are also used to improve road illumination, both for human and autonomous drivers.
The purpose of the work will be to design new automatic systems for managing illumination; computer vision algorithms and image processing methods will be used together with optical design, in collaboration with Italdesign S.p.A.

Research objectives and methods

Automobile evolution requires increasingly automatic systems for driving and for detecting traffic, for example other cars, bicycles, or further vehicles. Lighting systems are therefore also evolving rapidly. In particular, future vehicles' headlamps will move towards many independently driven light sources, up to several thousand different sources, each of them driven by means of a technology similar to the one used in digital micromirror projectors. The final purpose is to develop a headlamp able to automatically move the light onto obstacles, such as pedestrians or bicycles that suddenly appear on the road. In order to decide where to move the light, all the sensors available in the car can be used, mainly cameras and radars.
This PhD proposal aims at extending the already existing system to a higher complexity level that allows measurements of matrix high-beam functionalities as a function of different simulated road and car configurations.
The proposal brings together the competences of Dipartimento di Automatica e Informatica and the strong industrial knowledge of Italdesign S.p.A. Therefore, experimental activities will also be performed at the company's foreign sites, mainly in Germany.
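As a purely geometric toy illustration of the adaptive behaviour described above (the actual system involves optical design and calibrated sensor models), the sketch below maps a camera-space bounding box of a detected obstacle to the segments of a matrix headlamp that should be actuated; all names and numbers are hypothetical:

```python
def beam_segments(bbox, image_width, n_segments):
    """Map a detected obstacle's horizontal bounding box (x_min, x_max,
    in pixels) to the matrix-headlamp segments covering it, assuming the
    segments evenly tile the camera's horizontal field of view.
    Returns a boolean mask: True = segment aimed at the obstacle."""
    seg_width = image_width / n_segments
    x_min, x_max = bbox
    first = int(x_min // seg_width)
    last = int(min(x_max, image_width - 1) // seg_width)
    return [first <= i <= last for i in range(n_segments)]

# A pedestrian detected between pixels 300 and 500 in a 1280-px-wide frame,
# with a 16-segment matrix beam (hypothetical numbers).
mask = beam_segments((300, 500), 1280, 16)
assert mask[4] and not mask[0]
```

The selected segments could then be boosted to spotlight the obstacle (or dimmed, in a glare-free high-beam function for oncoming traffic).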
This work will be developed over the three years, following the usual Ph.D. program:
-       first year: improvement of the basic knowledge about lighting systems, attendance of most of the required courses (including applied optics), and submission of at least one conference paper
-       second year: design and implementation of new algorithms for testing headlamp optical functionalities, and submission of conference papers and at least one journal paper
-       third year: finalization of the work, with at least one selected journal publication.
Possible venues for publication will be, if possible, journals and conferences related to computer vision and optics, from IEEE, ACM and SPIE. An example could be the IEEE Transactions on Image Processing.
The scholarship is sponsored by Italdesign S.p.A. A period of six months abroad will be spent during the PhD, and a period of at least six months at Italdesign will be mandatory as well.
The work will therefore be done in close collaboration with Italdesign Giugiaro S.p.A., with whom a collaboration is already in place.
 

Required skills

The ideal candidate should have an interest in optics, computer vision, and image processing.
The candidate should also have a good programming background, mainly in Python. Good teamwork skills will be very important, since the work will need to be integrated with the company's activities.

41

Machine unlearning

Proposer

Elena Maria Baralis

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it
https://smartdata.polito.it

Summary of the proposal

Machine Unlearning is the task of selectively erasing or modifying previously acquired knowledge from machine learning models. This is particularly relevant nowadays due to the increasing concerns surrounding privacy (e.g. the Right To Be Forgotten required by GDPR) and copyright infringements, as highlighted by recent cases involving Large Language Models. The key goal of this proposal is to propose novel architectures, algorithms and evaluation metrics for Machine Unlearning.

Research objectives and methods

In recent years, the rapid advancement of machine learning models, particularly Large Language Models (LLMs), has raised significant concerns regarding privacy and intellectual property rights. The need for responsible AI practices has become increasingly evident, driven by legal frameworks such as the General Data Protection Regulation (GDPR) that mandates the Right To Be Forgotten. Additionally, high-profile cases involving LLMs have highlighted the need to address issues related to the unintentional retention of sensitive information and potential copyright infringements.

The proposed research activity on Machine Unlearning (MU) aims to tackle these challenges by developing novel techniques to selectively erase or modify previously acquired knowledge from machine learning models. The primary objectives of this research are twofold: first, to explore the current state of the art in MU, and second, to propose innovative architectures, algorithms, and evaluation metrics to enhance the efficacy of the unlearning process. Through these goals, the aim is to contribute to the establishment of ethical and responsible AI practices, ensuring compliance with legal requirements and mitigating the risks associated with unintentional information retention by machine learning models.
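One exact-unlearning strategy from the literature that could serve as a baseline is SISA-style sharded training: the training set is partitioned into shards with one model per shard, so forgetting a sample only requires retraining its shard. The sketch below is a hypothetical toy version using nearest-centroid classifiers (prediction picks the closest centroid across shards, a simplification of SISA's vote aggregation):

```python
import numpy as np

class ShardedModel:
    """Toy SISA-style ensemble: the training set is split into shards,
    with one nearest-centroid classifier per shard. Unlearning a sample
    only retrains the single shard that contained it."""

    def __init__(self, X, y, n_shards=4, seed=0):
        rng = np.random.default_rng(seed)
        self.X, self.y, self.n_shards = X.copy(), y.copy(), n_shards
        self.assign = rng.integers(0, n_shards, size=len(X))  # sample -> shard
        self.centroids = [self._fit_shard(s) for s in range(n_shards)]

    def _fit_shard(self, s):
        mask = self.assign == s
        return {c: self.X[mask & (self.y == c)].mean(axis=0)
                for c in np.unique(self.y[mask])}

    def unlearn(self, i):
        """Forget sample i: drop it and retrain only its shard."""
        s = self.assign[i]
        self.assign[i] = -1                      # mark as removed
        self.centroids[s] = self._fit_shard(s)   # cheap partial retraining

    def predict(self, x):
        """Predict with the closest centroid across all shards."""
        flat = [(np.linalg.norm(x - mu), c)
                for cents in self.centroids for c, mu in cents.items()]
        return min(flat)[1]

# Toy separable data (hypothetical): class 0 near the origin, class 1 near (5, 5).
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
y = np.array([0] * 10 + [1] * 10)
model = ShardedModel(X, y)
assert model.predict(np.array([0.1, 0.1])) == 0
model.unlearn(0)                                  # forget one training sample
assert model.predict(np.array([4.9, 5.0])) == 1   # utility preserved
```

Approximate unlearning methods for deep models avoid even this partial retraining, which is precisely where better algorithms and evaluation metrics are needed.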

The workplan for this PhD is structured to comprehensively address the multifaceted challenges of MU. The research will focus on proposing novel architectures and algorithms that facilitate effective unlearning while preserving the model's overall performance. Given the current lack of definitive metrics for MU, part of the research effort will be devoted to identifying more suitable and comprehensive metrics.

The research activity progresses from foundational research to the practical implementation, validation and application of MU techniques. An outline of the possible research plan is as follows.

- First year
The first year will be dedicated to literature review and conceptualization, leading to the formulation of the main research objectives for the rest of the doctorate. This initial phase involves an extensive study of the literature, identifying gaps and shortcomings, leading to the definition of initial proposals for improvements over state-of-the-art techniques.

- Second year
Based on the areas of opportunity identified and the preliminary proposals made, the candidate will work on the ideation and implementation of novel architectures and algorithms for MU, with ongoing validation and refinement based on the feedback obtained from experiments and evaluations.

- Third year
The final year will focus on consolidating the findings and defining the applications of main interest for the output produced.
During the second/third year, the candidate will have the opportunity to spend a period of time abroad in a leading research center.
Publication venues for this research include leading conferences and journals in the fields of machine learning and artificial intelligence. Key conferences include the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), and the International Conference on Learning Representations (ICLR). Additionally, reputable journals such as the Journal of Machine Learning Research (JMLR) and the IEEE Transactions on Neural Networks and Learning Systems will be sought for in-depth dissemination of research contributions.

Required skills

The candidate should have a strong computer and data science background, in particular for what concerns:
- Strong programming skills, preferably in Python
- Thorough understanding of theoretical and applied aspects of machine and deep learning
- Fundamentals of Natural Language Processing

42

Generative AI models for enhanced text-to-image synthesis

Proposer

Lia Morra

Topics

Computer graphics and Multimedia, Data science, Computer vision and AI

Group website

http://grains.polito.it - http://dbmg.polito.it

Summary of the proposal

This research proposal aims to overcome limitations in current generative text-to-image models. Despite advancements in visual fidelity, existing models struggle with precise control over generated images in response to detailed prompts. The candidate will research innovative strategies to improve spatial composition and alignment with user-defined specifications, including the application of neuro-symbolic AI to embed logical constraints and leveraging background ontological knowledge.

Research objectives and methods

While current generative text-to-image latent diffusion models have reached unprecedented results in terms of visual fidelity, there are still open issues to be addressed in exerting precise control over the generated images. On the one hand, generative models have difficulty creating correct images when the textual prompt contains many details, and often struggle with object placement and spatial awareness. Recent text-to-image latent diffusion models have shown substantial improvements in prompt following, yet still struggle with the use of words such as "left" or "behind". Increasing the size of the model has so far led to small improvements on these aspects, in the face of a significant increase in hardware requirements. Alternatively, other recent works have looked into improving captions at training time. Neither approach, so far, has successfully addressed spatial composition. One possible reason lies in the inherent limitations of the text embedding employed to condition the generation process, which fails to learn sufficiently detailed and disentangled representations; this issue would not necessarily be solved by increasing the amount or complexity of training data.

On the other hand, there is also an ongoing struggle in aligning the generated output with human values. Generative models may generate offensive images, perpetuate societal biases and stereotypes embedded in the training data, or "regurgitate" training samples, potentially exposing the user to inadvertent copyright infringements. While vendors have generally responded by establishing safeguards for specific inputs or outputs, a more general, robust and reliable solution is called for. For instance, recent preliminary results have shown that neuro-symbolic AI techniques could be used, on toy datasets, to sample from an unconditioned model under user-defined logical constraints.

Research objectives:
The present proposal aims at investigating novel ways to condition the generation process to ensure that the generated images comply with user specifications, both in terms of specific content (e.g., "photo of a man, sitting at the right of the woman, who is looking towards a window at their left") and/or in terms of general properties and rules (e.g., a photo of a nude person may be considered offensive and should be avoided). To this aim, several strategies will be investigated and compared, such as: 
-       defining and integrating richer, more structured representations, such as scene graphs, as an intermediate step to disambiguate textual prompts, incorporate greater spatial awareness and increase control in image composition; 
-       exploiting emerging techniques, such as neuro-symbolic AI, to incorporate logical constraints in the training objective or in the sampling process;
-       exploiting background ontological knowledge to further constrain and guide the generation; for instance, better differentiating between encyclopedic facts (e.g., Superman is a superhero) and general concepts (e.g., superhero) could prevent the model from excessively relying on memorization of frequently observed patterns. 
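The first of these strategies can be illustrated with a minimal, hypothetical intermediate representation: a scene graph listing objects and binary spatial relations, against which a candidate image layout can be checked before or during generation (all names and coordinates below are illustrative only):

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    """Minimal intermediate representation of a prompt: objects plus
    binary spatial relations, e.g. ('man', 'right_of', 'woman')."""
    objects: list
    relations: list = field(default_factory=list)

def satisfies(graph: SceneGraph, layout: dict) -> bool:
    """Check a candidate layout (object -> (x, y) box center, normalized
    image coordinates) against the graph's horizontal relations."""
    for subj, rel, obj in graph.relations:
        dx = layout[subj][0] - layout[obj][0]
        if rel == "right_of" and dx <= 0:
            return False
        if rel == "left_of" and dx >= 0:
            return False
    return True

# Prompt: "photo of a man sitting at the right of the woman"
g = SceneGraph(objects=["man", "woman"],
               relations=[("man", "right_of", "woman")])

assert satisfies(g, {"man": (0.7, 0.5), "woman": (0.3, 0.5)})
assert not satisfies(g, {"man": (0.2, 0.5), "woman": (0.6, 0.5)})
```

In a full pipeline, such a structured check could guide layout-conditioned generation or act as a logical constraint on the sampling process.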

Outline of the research plan:
In Year 1, the candidate will review the current state of the art on controllable image synthesis, text-to-image generative models, and their inherent limitations and biases. The candidate will also strengthen the competences and skills required to tackle the research program. A suitable dataset of challenging prompts, biased outputs, and failures will be created by extensively reviewing open and closed source systems, as well as the relevant literature. This dataset will provide the basis for the experimental validation.
In Year 2, the candidate will investigate novel methods to increase control in object position, spatial composition and fine-grained detail in text-to-image synthesis, incorporating structured representations and/or logical constraints as detailed above. The proposed techniques will be compared against other strategies based, e.g., on prompt engineering and chain-of-thought prompting, in terms of quality, computational cost, resources and biases. 
In Year 3, the candidate will move into investigating how to promote fair and robust behaviors across all prompts. The proposed techniques will be extended to ensure that all generated outputs are consistent with basic rules, such as avoiding the generation of offensive content and ensuring sufficient diversity in the generated images.
 
Possible publication venues include international peer-reviewed journals in the fields related to the current proposal, such as: IEEE Transactions Image Processing, IEEE transactions Pattern Analysis and Machine Intelligence, Pattern Recognition, Computer Vision and Image Understanding, International Journal of Computer Vision, and top-tier international conferences, such as CVPR, ICCV, ECCV, NeurIPS, ICPR, ACM Multimedia.

Required skills

- Good knowledge of machine learning, deep learning, and generative models.
- Previous experience with diffusion models, large language models, or multi-modal models is preferred
- Strong analytical skills

43

Test, reliability, and safety of intelligent and dependable devices supporting sustainable mobility

Proposer

Riccardo Cantoro

Topics

Computer architectures and Computer aided design, Cybersecurity

Group website

https://cad.polito.it

Summary of the proposal

The research addresses the pressing need for dependable electronic systems in safety-critical domains, specifically focusing on sustainable mobility. The objective is to develop innovative hardware and software methodologies to qualify electronic systems against stringent reliability and safety requirements. The work will involve developing suitable hardening techniques on the hardware, software safety mechanisms, and a comprehensive assessment methodology supported by EDA partners.

Research objectives and methods

Research objectives
The novelty of this research lies in its focus on sustainable mobility, which is an emerging area of research with great potential for real-world impact. The work is expected to significantly improve the reliability and safety of electronic systems, thereby enhancing the performance of safety-critical applications. The research team's expertise in electronic design automation (EDA) will be leveraged to develop robust methodologies that are both practical and effective. Furthermore, this research is aligned with the goals of the National Centers on Sustainable Mobility and HPC, as well as the Extended Partnership on Artificial Intelligence, which further emphasizes its significance in advancing the state-of-the-art in this field.

The objectives of this research are summarized as follows:
- Identify a suitable hardware platform for sustainable mobility applications, with particular emphasis on RISC-V based systems.
- Identify suitable software for mobility applications to be used as a representative benchmark for the qualification activities.
- Assess dependability figures on the identified hardware/software infrastructure to identify critical parts of the design that require hardening.
- Develop innovative hardening solutions to improve the reliability of critical areas in the design.
- Focus on sustainable mobility as an emerging area of research with great potential for real-world impact.
- Establish a comprehensive assessment methodology in collaboration with EDA partners.

Outline of possible research plan

First year: 
The candidate will start by conducting a thorough literature review on dependable electronic systems and sustainable mobility to identify the most recent and relevant research works. They will then select a suitable hardware platform for sustainable mobility applications, considering the variety of publicly available RISC-V based systems and using IP cores from industrial partners (e.g., Synopsys). Furthermore, they will identify suitable software for mobility applications, including AI applications, and leverage publicly available benchmarks developed for other domains such as automotive and space. The candidate will develop a preliminary assessment methodology for the identified hardware and software infrastructure, which will be refined and improved in the following years.

Second year: 
The candidate will focus on identifying the critical parts of the design that require hardening and implementing initial solutions to enhance the overall system reliability. They will perform dependability analysis on the identified hardware/software infrastructure, aiming to improve the quality of the developed assessment framework. The candidate will explore various hardening techniques, including redundancy, error-correcting codes, and fault-tolerant architectures, and select the most suitable ones to enhance the system reliability and safety.
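Among the hardening techniques mentioned, triple modular redundancy (TMR) is the classic example: a computation is replicated three times and a majority voter masks any single faulty replica. The following is a software-level toy sketch (the computation and fault model are hypothetical):

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority voter over three redundant integer results:
    each output bit takes the value appearing in at least two replicas,
    masking any single faulty replica."""
    return (a & b) | (a & c) | (b & c)

def run_redundant(computation, value, fault=None):
    """Execute `computation` three times; `fault` optionally corrupts
    one replica's result, modeling a transient hardware fault."""
    results = [computation(value) for _ in range(3)]
    if fault is not None:
        replica, mask = fault
        results[replica] ^= mask          # inject a bit-flip fault
    return tmr_vote(*results)

def square(x):
    return x * x

assert run_redundant(square, 12) == 144                      # fault-free run
assert run_redundant(square, 12, fault=(1, 0b1000)) == 144   # single fault masked
```

Hardware TMR applies the same idea to flip-flops or whole modules; in software, time redundancy (re-executing and voting) trades throughput for fault masking.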

Third year: 
The candidate will develop innovative hardening solutions to improve the reliability of critical areas in the design, while ensuring the availability of safety mechanisms in the event of a fault. They will develop a comprehensive assessment methodology in collaboration with EDA partners, leveraging their expertise in electronic design automation to refine and optimize the assessment process. The proposed methodologies will be extensively evaluated through simulations and testing, and the candidate will collaborate with industry partners to validate their effectiveness on real-world applications.

List of possible venues for publications
The candidate will prepare and submit papers to top-tier conferences and journals in the field of electronic systems, embedded systems, and fault tolerance.

Possible venues for publications could include:
- IEEE Transactions on Computers
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- IEEE Transactions on Very Large Scale Integration (VLSI) Systems
- International Conference on Computer-Aided Design (ICCAD)
- International Test Conference (ITC)
- IEEE European Test Symposium (ETS)
- Design, Automation and Test in Europe Conference (DATE)
- RISC-V Summit

Projects
The research is consistent with the themes of the National Centers on Sustainable Mobility and HPC, as well as with those of the Extended Partnership on Artificial Intelligence, in which members of the CAD group participate.
The research will be supported by industrial partners involved in active collaborations. Synopsys is involved in research activities on functional safety and reliability, and provides licensed tools, IP cores, and support. Infineon is also involved in the frame of research contracts on electronic system dependability.

Required skills

Background in digital design and verification.
Solid foundations on microelectronic systems and embedded system programming.
Experience with fault modeling and testing techniques for digital circuits, such as stuck-at, transition, and path-delay faults.
Knowledge of EDA tools, particularly for fault simulation.

44

Cybersecurity for a quantum world

Proposer

Antonio Lioy

Topics

Cybersecurity, Parallel and distributed systems, Quantum computing

Group website

https://security.polito.it/
https://qubip.eu/

Summary of the proposal

Cybersecurity is typically based on cryptographic algorithms (e.g. RSA, ECDSA, ECDH) that are threatened by the advent of quantum computing.
Purpose of this research is to create quantum-resistant versions of various security components, such as secure channels (e.g. TLS, IPsec), digital signatures, secure boot, Trusted Execution Environment (TEE).
The final objective is the design and test of quantum-resistant versions of security solutions in an open-source environment (e.g. Linux, Keystone).

Research objectives and methods

Hard security is typically based on mathematical cryptographic algorithms that support computation of symmetric and asymmetric encryption, key exchange, digital signature, and hash values.
 
Several of these algorithms (e.g. RSA, ECDSA, ECDH) are threatened by the advent of quantum computing. NIST and other bodies have thus selected new quantum-resistant algorithms and advocated their fast adoption in current security solutions. However, this is not a simple change, as there are several intertwined aspects to be considered, such as hardware support, key lengths, and X.509 certificates.
 
Purpose of this work is to evolve various security components of modern ICT infrastructures to quantum-resistant versions. This may include secure network channels (e.g. TLS, IPsec), digital signatures, secure boot, Trusted Execution Environment (TEE), and remote attestation.
 
The overall objective is the design and test of quantum-resistant versions of several security solutions in an open-source environment (e.g. Linux, Keystone). 
 
The specific objectives of this research activity are:
1. Identify security components threatened by quantum computing and review proposed standards to make them quantum-resistant.
2. Extend existing open-source systems and components (e.g. Linux, Keystone, OpenSSL, mbedTLS could be suitable targets) to support the proposed quantum-resistant solutions.
3. Implement a system with the hardware and software components needed to demonstrate the feasibility and performance of the improved quantum-resistant elements.
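A widely discussed pattern for the transition of secure channels is hybrid key establishment: a classical shared secret (e.g., from ECDH) and a post-quantum one (e.g., from an ML-KEM encapsulation) are combined through a KDF, so the derived key remains safe as long as either algorithm holds. The sketch below is a conceptual illustration only, with random stand-ins for both secrets and a minimal HKDF; it is not the exact construction standardized for TLS:

```python
import hashlib
import hmac
import os

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """Minimal HKDF (RFC 5869) with SHA-256, extract-then-expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),  # expand
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def hybrid_key(classical_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive a session key from the concatenation of a classical and a
    post-quantum shared secret: an attacker must break BOTH to recover it."""
    return hkdf_sha256(classical_secret + pq_secret,
                       salt=b"", info=b"hybrid-demo", length=32)

ecdh_ss = os.urandom(32)   # stand-in for an ECDH shared secret
mlkem_ss = os.urandom(32)  # stand-in for an ML-KEM shared secret
k = hybrid_key(ecdh_ss, mlkem_ss)
assert k != hybrid_key(ecdh_ss, os.urandom(32))  # either input changes the key
```

Real deployments must additionally handle negotiation, larger post-quantum key and signature sizes, and certificate-chain changes, which is where the engineering effort of this proposal lies.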
 
The first year will be spent studying the existing security paradigms and how they are affected by quantum computing. The PhD student will also analyse the proposed post-quantum algorithms and evaluate their performance and hardware requirements. During this year, the student should also follow most of the mandatory courses for the PhD and submit at least one conference paper.
During the second year, the PhD student will design a custom approach for quantum-resistant secure channels and trusted execution environment, possibly enriched with specialized hardware elements. At the end of the second year, the student should have started preparing a journal publication on the topic and submit at least another conference paper.
Finally, the third year will be devoted to the implementation and evaluation of the proposed solution, compared with the existing ones. At the end of this final year, a publication in a high-impact journal shall be achieved.
 
Possible target publications: IEEE Security and Privacy, Springer International Journal of Information Security, Elsevier Computers and Security, Future Generation Computer Systems.

This research is part of the Horizon Europe QUBIP project (Quantum-oriented Update to Browsers and Infrastructures for the PQ Transition) https://qubip.eu/

Required skills

Cybersecurity (mandatory)
Network security (mandatory)
Trusted computing (preferred)
Basics of quantum computing (useful)

45

Bridging Human Expertise and Generative AI in Software Engineering

Proposer

Luca Ardito

Topics

Software engineering and Mobile computing

Group website

https://softeng.polito.it

Summary of the proposal

In collaboration with Vodafone Digital and the ZTO team in Network Operations, the PhD project aims to define a framework to generate code from functional requirements, fostering synergy between human developers and AI-based actors. The project will involve a systematization of metrics and methodologies to evaluate the correctness of the code generated, and of the requirement analyses performed, by the generative AI components, in order to increase their effectiveness and build trust in the outputs of generative AI.

Research objectives and methods

Research objectives

The main objectives of the PhD programme are the following:
- The identification of generative AI mechanisms that can aid in code generation from software requirements.
- The development and assessment of methods for the evaluation of the correctness and dependability of the application of generative AI to code development.
- The conduction of formal experiments to evaluate how code generated by AI compares to human-written code in both functional and non-functional terms.

Outline of the research work plan

Task 1: Preliminary evaluation of state-of-the-art solutions (M1-M3)

The task involves a comprehensive assessment of current solutions in the domain. The objective is to evaluate existing methodologies, technologies, and frameworks relevant to the research context. This preliminary analysis will be conducted systematically by applying Kitchenham's guidelines for conducting Systematic Literature Reviews in the Software Engineering research field. The systematic literature review will also consider grey literature sources (i.e., non-peer-reviewed material available from various internet sources) to cope with the high novelty of the generative AI research field. The systematic evaluation of the state of the art will be complemented with open and structured interviews with practitioners and developers to understand their main needs and most common practices.

Task 2: Selection and Integration of Generative AIs for Code Generation (M4-M18)

This task focuses on selecting, customizing, integrating, and training a Generative AI, specifically a Large Language or Foundation Model, to generate code from formal requirements. It includes understanding various use case and formal requirement languages and creating modules for translating natural language requirements into structured notations like Use Case Diagrams. The process involves preprocessing data (collecting, cleaning, and structuring use case language datasets) and training the AI to understand these scenarios. Ongoing evaluation and refinement of the AI are crucial for accuracy. The main goal is to develop a solution that translates use case specifications into high-quality code, with evaluations based on the development effort, error rates, and requirement alignment. The task will use Software Repository Mining (MSR) techniques for diverse dataset collection. The implementation phase of this research will follow the Agile Software Development practices, streamlining the entire software development lifecycle to assess the efficacy of AI-generated code against existing tools and manually written code for both front-end and back-end applications. Furthermore, the research is dedicated to pioneering methods for automatically generating synchronized documentation and unit testing. It will also investigate strategies for conducting code quality reviews, monitoring resource usage efficiently, and evaluating the software's business impact, thereby tailoring the development process to meet the demands of network operations.

Task 3: Definition of assessment methods for Generative AI-based code development (M13-M24)

The task focuses on defining robust methods to assess Generative AI-based code development. This entails defining structured procedures to assess the generated code's accuracy, reliability, and compliance with the requirements. The goal is to establish a rigorous framework for ensuring the quality of code produced by Generative AI, thus advancing the state of the art in Generative AI code development.
Task 3 will apply systematic techniques for building taxonomies in Software Engineering (ref. Paul Ralph) through the Straussian Grounded Theory approach (ref. Strauss).

Task 4: Analysis of the non-functional implications of Generative AI-based code development (M22-M36)

The task focuses on a comprehensive analysis of the non-functional implications inherent to Generative AI-based code development. This includes scrutinizing factors such as scalability, performance, readability and maintainability of the generated code. The objective is to discern and mitigate any adverse effects of integrating Generative AI into the code development process.
The task will involve conducting empirical experiments over statistically significant samples to compare non-functional properties of software generated by human developers, software obtained through Generative AI, and software obtained through a synergistic interaction between human developers and generative AI tools.
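Comparing non-functional properties across human-written, AI-generated, and co-developed code typically calls for non-parametric statistics, since such metrics rarely meet normality assumptions. As a hedged sketch (the choice of test is an illustrative assumption, not the proposal's protocol), a Mann-Whitney U statistic can be computed in pure Python:

```python
# Illustrative sketch: a pure-Python Mann-Whitney U statistic for
# comparing a non-functional metric (e.g., a maintainability index)
# between two groups of code samples. The sample values in the usage
# note are made up; a real study would add significance and effect size.

def mann_whitney_u(xs, ys):
    """Count pairwise wins of xs over ys (ties count 0.5); return (U_x, U_y)."""
    u_x = sum(0.5 if x == y else (1 if x > y else 0) for x in xs for y in ys)
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y
```

For hypothetical scores `human = [72, 68, 75, 70, 74]` and `genai = [65, 69, 66, 71, 64]`, `mann_whitney_u(human, genai)` returns `(22, 3)`, i.e., the first sample ranks higher on this made-up metric.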

List of possible venues for publications

The target for the PhD research includes a set of conferences in the general area of software engineering (ICSE, ESEM, EASE, ASE, ICSME) as well as in the specific area of testing (ICST, ISSTA).
More mature results will be published in software engineering journals, mainly IEEE Transactions on Software Engineering, ACM TOSEM, Empirical Software Engineering, Journal of Systems and Software, and Information and Software Technology.
 

Required skills

The main skills required by the candidate are the following:
General knowledge about Large Language Models and application of AI-based algorithms to software development;
Experience in software development with object-oriented languages (e.g., Java or C#) and knowledge of the web and/or mobile application domain;
Knowledge of software verification and validation techniques (e.g., scripted unit and integration testing, end-2-end testing, Graphical User Interface testing).

46

Explaining AI (XAI) models for spatio-temporal data

Proposer

Elena Maria Baralis

Topics

Data science, Computer vision and AI

Group website

https://dbdmg.polito.it
https://smartdata.polito.it

Summary of the proposal

Spatio-temporal data allow an effective representation of many interesting phenomena in application domains ranging from transportation to finance. Current state-of-the-art deep learning techniques (e.g., LM, CNN, RNN) provide black-box models, i.e., models that do not expose the motivations for their predictions. The main goal of this research activity is the study of methods to allow human-in-the-loop inspection of reasons behind classifier predictions for spatio-temporal data.

Research objectives and methods

Machine learning models are increasingly adopted to assist human experts in decision-making. Especially in critical tasks, understanding the reasons behind machine learning model predictions is essential for trusting the model itself. For example, experts can detect model wrong behaviors and actively work on model debugging and improvement. Unfortunately, most high-performance ML models lack interpretability.
 
The research activity will consider application domains (e.g., transportation, industry, medical care, climate) in which the availability of understandable explanations is particularly relevant for explaining anomalous behaviors. The explanation algorithms will target different types of spatio-temporal data (e.g., multivariate time series, spatiotemporal graphs, trajectories, spatio-temporal matrices). The following different facets of XAI (Explainable AI) will be addressed.
 
Model understanding. The research work will address local analysis of individual predictions. These techniques will allow the inspection of the local behavior of different classifiers and the analysis of the knowledge different classifiers are exploiting for their prediction. The final aim is to support human-in-the-loop inspection of the reasons behind model predictions.
 
Model trust. Insights into how machine learning models arrive at their decision allow evaluating if the model may be trusted. Methods to evaluate the reliability of different models will be proposed. In case of negative outcomes, techniques to suggest enhancements of the model to cope with wrong behaviors and improve the trustworthiness of the model will be studied.
 
Model debugging and improvement. The evaluation of classification models generally focuses on their overall performance, which is estimated over all the available test data. An interesting research line is the exploration of differences in the model behavior, which may characterize different data subsets, thus allowing the identification of potential sources of bias in the data.
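Occlusion-style attribution is one concrete way to realize the local, human-in-the-loop inspection described above. The sketch below is illustrative only: the "classifier" is a stand-in that scores a series by the mean of its central segment, and any predict-probability function of a real model could replace it.

```python
# Minimal occlusion-based local explanation for a black-box time-series
# classifier, sketching the "model understanding" facet. The classifier
# below is a toy stand-in, assumed for illustration only.

def predict(series):
    """Toy black-box score: mean of the middle third of the series."""
    n = len(series)
    mid = series[n // 3: 2 * n // 3]
    return sum(mid) / len(mid)

def occlusion_attribution(series, window=2, baseline=0.0):
    """Importance of each window = drop in score when it is masked."""
    base_score = predict(series)
    attributions = []
    for start in range(0, len(series) - window + 1, window):
        masked = list(series)
        for i in range(start, start + window):
            masked[i] = baseline  # occlude this window
        attributions.append((start, base_score - predict(masked)))
    return attributions
```

For `[0, 0, 1, 1, 0, 0]`, only the window covering the central segment receives a non-zero attribution, exposing which time steps the (toy) model relies on.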

PhD years organization
YEAR 1: state-of-the-art survey of XAI algorithms for both time series and spatio-temporal data, considering, e.g., feature attribution-based, attention-based, and counterfactual explanations; performance analysis and preliminary proposals of improvements over state-of-the-art algorithms; exploratory analysis of novel, creative solutions for XAI; assessment of the main explanation issues in 1-2 specific industrial case studies. 
 
YEAR 2: new algorithm design and development; experimental evaluation on a subset of application domains considering public domain datasets (e.g. in the transportation field, TaxiNYC, METR-LA, PEMS-BAY, in the healthcare field, PTB-XL, NYUTron, and MIMIC-III, in the financial field, StockNet and NASDAQ-100); deployment of algorithms in selected industrial contexts. 
 
YEAR 3: algorithms improvements, both in design and development, experimental evaluation in new application domains.
 
During the second and third year, the candidate will have the opportunity to spend a period abroad in a leading research center.
 
Publication venues for this research include leading conferences and journals in the fields of spatio-temporal data management, machine learning, and artificial intelligence: 
 
IEEE TKDE (Trans. on Knowledge and Data Engineering)
ACM TKDD (Trans. on Knowledge Discovery from Data)
ACM TIST (Trans. on Intelligent Systems and Technology)
Information Sciences (Elsevier)
Expert Systems with Applications (Elsevier)
Machine Learning with Applications (Elsevier)
Engineering Applications of Artificial Intelligence (Elsevier)
 
IEEE/ACM International Conferences (e.g., ACM KDD, ACM SIGSPATIAL, IEEE ICDM, NeurIPS)

Required skills

The candidate should have a strong computer and data science background, in particular for what concerns:
- Strong programming skills, preferably in Python
- Thorough understanding of theoretical and applied aspects of machine and deep learning
- Fundamentals of spatio-temporal data management
- Fundamentals of Natural Language Processing

47

Advanced data modeling and innovative data analytics solutions for complex application domains

Proposer

Silvia Anna Chiusano

Topics

Data science, Computer vision and AI

Group website

dbdmg.polito.it

Summary of the proposal

Data science projects entail the acquisition, modelling, integration, and analysis of big and heterogeneous data collections generated by a diversity of sources, to profile the different facets and issues of the considered application context. However, data analytics in many application domains is still a daunting task, because data collections are generally too big and heterogeneous to be processed with currently available machine learning techniques. Thus, advanced data modeling and machine learning/artificial intelligence techniques need to be devised to unearth meaningful insights and efficiently manage large volumes of data.

Research objectives and methods

The PhD student will work on the study, design and development of proper data models and novel solutions for the integration, storage, management and analysis of big volumes of heterogeneous data collections in complex application domains. The research activity involves multidisciplinary knowledge and skills including database, machine learning and artificial intelligence algorithms, and advanced programming. 
 
Different application contexts will be considered to highlight a wide range of data modeling and analysis problems, and thus lead to the study of innovative solutions. The objectives of the research activity consist in identifying the peculiar characteristics and challenges of each considered application domain and devising novel solutions for the modelling, management, and analysis of data in each domain. Example scenarios are the urban context (in particular, urban mobility) and the medical domain. 

In more detail, the following challenges will be addressed during the PhD:
 
1. Modeling Heterogeneous Data: Design innovative approaches for modeling heterogeneous data, including structured and unstructured data from different sources, integrating them into a single coherent framework. The experience gained on data modeling in different application contexts can lead to the realization of a Computer-Aided Software Engineering (CASE) tool that guides the user through the design process, reducing design time and improving the quality of the modeling result.
 
2. Innovative algorithms for data analytics. Study, design, and implementation of innovative machine learning algorithms, with a primary emphasis on clustering and classification tasks. The objective is to overcome limitations of current approaches, enhancing their accuracy, scalability, and ability to deal with heterogeneous data collections. 
 
3. Scalable Learning: Investigate scalable learning techniques to address the increasing complexity and volume of data for achieving optimal performance in big data environments. This research is indeed driven by the growing demand to develop machine learning systems capable of dynamically adapting to the increasing complexity of data and models. For recent machine learning/AI applications, it is crucial to propose innovative models capable of handling large volumes of data with parallel and scalable solutions.
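Challenge 3 can be made concrete with a streaming variant of a classic algorithm. The sketch below is an assumption-laden toy (1-D data, fixed initial centroids, a single pass): an online k-means step that updates centroids one point at a time, so the dataset never needs to fit in memory.

```python
# Hedged sketch of the "scalable learning" challenge: streaming (online)
# k-means. Each arriving point updates its nearest centroid with a
# running-mean step, so memory stays constant regardless of data volume.
# The 1-D data and initial centroids in the usage note are assumptions.

def assign(x, centroids):
    """Index of the centroid nearest to x."""
    return min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))

def online_kmeans(stream, centroids):
    """One pass over a data stream, updating centroids incrementally."""
    counts = [0] * len(centroids)
    for x in stream:
        k = assign(x, centroids)
        counts[k] += 1
        # Running mean: centroid k equals the mean of points seen so far.
        centroids[k] += (x - centroids[k]) / counts[k]
    return centroids
```

For instance, `online_kmeans([1, 2, 3, 9, 10, 11], [0.0, 10.0])` yields `[2.0, 10.0]` after a single pass; mini-batch variants apply the same running-mean update to batches rather than single points.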
 
The research activity will be organized as follows.
1st Year. The PhD student will start by considering a first reference application domain (for example, the urban scenario) and a first reference use case within it (for example, urban mobility). The PhD student will review the recent literature on the selected use case to (i) identify the most relevant open research issues, (ii) identify the most relevant data analysis perspectives for gaining useful insights, and (iii) assess the main data analysis issues. The PhD student will perform an exploratory evaluation of state-of-the-art technologies and methods in the considered domain, and she/he will present a preliminary proposal of techniques to optimize these approaches.
 
2nd and 3rd Year. Based on the results of the 1st year activity, the PhD student will design and develop a suitable framework including innovative data analytics solutions to efficiently model data in the considered use case and extract useful knowledge, aimed at overcoming weaknesses of state-of-the-art methods.
 
Moreover, during the 2nd and 3rd year, the student will progressively consider a larger spectrum of application domains. The student will evaluate whether and how his/her proposed solutions can be applied to the newly considered domains, and he/she will propose novel analytics solutions.
 
During the PhD, the student will have the opportunity to cooperate in the development of solutions applied to research projects on smart cities (e.g., a PRIN project on the development of an atlas of historic buildings in an urban context). The student will also complete his/her background by attending relevant courses, and will participate in conferences presenting the results of his/her research activity.
 
Possible publication venues include international journals such as IEEE Transactions on Intelligent Transportation Systems, Information Systems Frontiers (Springer), and Information Sciences (Elsevier), and international conferences such as IEEE Big Data, the ACM International Conference on Information & Knowledge Management (CIKM), and the IEEE International Conference on Data Mining (ICDM).

Required skills

The candidate should have good programming skills, and competencies in data modelling and techniques for data analysis.

48

Functional Safety Techniques for Automotive oriented Systems-on-Chip

Proposer

Paolo Bernardi

Topics

Computer architectures and Computer aided design

Group website

 

Summary of the proposal

The activities planned for this proposal include efforts toward Functional Safety Techniques for Automotive Systems-on-Chip (SoC):
-         Techniques for developing Software-Based Self-Test (SBST) libraries, which are demanded by standards such as ISO 26262 and researched by companies in the automotive market.
-         Techniques for grading and developing System-Level Test (SLT) libraries, which are considered an indispensable final test step and significantly contribute to chip quality.

Research objectives and methods

The PhD student will pursue objectives in the broader research field of Automotive Reliability and Testing.
 
A key enabling factor for this work is the availability of a setup that includes both netlists to be simulated and real silicon chips with development boards. 
 
Techniques for developing Software-Based Self-Test libraries 
In this research field, the PhD student will look in the following directions:
o  Creation of benchmark designs to be used throughout the studies.
    1.     RISC-V-oriented SoC
    2.     Industrial benchmarks provided by industrial supporters, including netlists and silicon implementations
o  Identification of the current industrial solutions for the development of SBST libraries and improvement of the state of the art
    1.     Usage of currently available tools from EDA vendors and classification of the strengths and weaknesses of commercial tools
    2.     Creation of "wrappers" to evolve EDA ecosystems with custom data collection and improved generation flows
    3.     Investigation of extensions of existing tools by creating ad-hoc tools that add analytical value and accelerate SBST development.
o  Data collection tools to enrich the logging capabilities of EDA tools
o  High-level techniques based on silicon

Techniques for grading and developing System-level Test (SLT) libraries
Research in the field of System-Level Test will address the open points of this modern testing technique. SLT is in strong demand yet remains a poorly understood subject, mainly concerning the integration of heterogeneous systems. The development of methods for achieving high SLT coverage will include:
o  Coverage computation techniques including simulation and silicon-based methods 
o  Generation of SLT methods that maximize the coverage levels.
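The coverage metric underlying both SBST and SLT grading (detected faults over total faults) can be sketched without a commercial fault simulator. The example below is a toy assumption: a 4-bit AND "circuit" with single stuck-at faults on its inputs, far simpler than a gate-level netlist, but the computation has the same shape as industrial fault grading.

```python
# Illustrative fault-coverage computation, assuming a toy 4-bit AND
# "circuit" and single stuck-at faults on its input bits. Real flows
# use gate-level fault simulators, but the metric is the same:
# coverage = detected faults / total faults.

WIDTH = 4

def circuit(a, b):
    return a & b  # golden (fault-free) behaviour

def faulty(a, b, fault):
    """Apply one stuck-at fault, e.g. ("a", 2, 0) = input a, bit 2, stuck-at-0."""
    port, bit, value = fault
    mask = 1 << bit
    if port == "a":
        a = (a | mask) if value else (a & ~mask)
    else:
        b = (b | mask) if value else (b & ~mask)
    return circuit(a, b)

def fault_coverage(patterns):
    """Fraction of single stuck-at faults detected by the test patterns."""
    faults = [(p, i, v) for p in "ab" for i in range(WIDTH) for v in (0, 1)]
    detected = sum(
        any(faulty(a, b, f) != circuit(a, b) for a, b in patterns)
        for f in faults
    )
    return detected / len(faults)
```

On this toy, the pattern set `[(0b1111, 0b1111), (0b0000, 0b1111), (0b1111, 0b0000)]` detects all 16 input stuck-at faults (coverage 1.0), while `[(0b1111, 0b1111)]` alone detects only the stuck-at-0 half (0.5), which is the kind of gap that test generation aims to close.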

The working plan for the PhD student recalls the objectives outlined in the previous sections. The order is not fixed and may vary according to progress during the PhD program.
 
1st year
1.     Benchmark design, including simulation and test synthesis 
2.     Preliminary usage of EDA tools to measure SBST and SLT coverage
3.     Identification of weaknesses and planning of counteractions
2nd year
1.     Ad-hoc tools preliminary development upon results collected during the 1st year
2.     Codification of methods to increase the coverage and speed up the generation process
3rd year
1.    Exploration of alternative coverage measurements for SBST and SLT 
2.    Completion of a flexible and extensible environment for the quick creation of SBST and SLT libraries.

Required skills

C/C++, ASM, Simulation and Fault Simulation, VHDL, Firmware

49

Human-aware robot behaviour learning for HRI

Proposer

Giuseppe Bruno Averta

Topics

Data science, Computer vision and AI, Controls and system engineering

Group website

vandal.polito.it

Summary of the proposal

Humans are naturally multi-task agents, with an innate capability to interact with objects and tools and to plan complex sequences of actions to accomplish a specific activity. Advanced robots, on the other hand, are still far from such capabilities. The goal of this work is to investigate how to learn from humans the capability to quickly plan and execute complex procedures in unstructured scenarios, and to transfer such skills to intelligent robots for effective human-robot cooperation.

Research objectives and methods

This PhD thesis aims to explore the domain of learning human skills from egocentric videos and transferring them to robotic systems. The increasing integration of robots into many aspects of human life highlights the need to develop a more intuitive and adaptive approach to skill acquisition, one able to learn complex skills and adapt to various scenarios where traditional learning paradigms fail. Egocentric videos, captured from a first-person perspective, provide a rich and unique source of contextual information that can enhance the learning process. This research seeks to leverage this sensing approach to develop a framework for transferring acquired human skills to robots, enabling them to perform complex tasks in diverse environments. 

Major Objectives:
-      Human Behavior Understanding from ego- and exo-vision: Investigate methods for extracting relevant information from egocentric videos, possibly in combination with third-person perspectives, focusing on understanding human actions, interactions, and environmental cues.
-       Skill Representation: Develop a robust representation of human skills extracted from egocentric videos, considering both spatial and temporal aspects to capture the dynamics of actions.
-       Transfer Learning to Robots: Design a transfer learning framework that enables the adaptation of learned human skills to robotic systems, accounting for differences in morphology, sensors, and actuators.
-       Adaptive/Continual Learning: Explore techniques for adaptive and continual learning, allowing robots to continuously refine their skills through interaction with the environment and human feedback.
-       Real-world Applications: Evaluate the proposed framework in real-world scenarios, such as assistive robotics, industrial automation, and healthcare, to assess its practicality and generalizability.

Methodology:
The research methodology involves a combination of computer vision, machine learning, and robotics techniques. Deep learning models will be employed for ego-vision learning and skill representation, while transfer learning techniques will be investigated to adapt these representations to robotic platforms. Real-world experiments will validate the effectiveness and efficiency of the proposed framework.

Significance:
This research addresses a critical gap in the field of robotics by focusing on intuitive skill transfer from humans to robots, enhancing their adaptability and autonomy. The outcomes of this thesis will contribute to the development of more versatile and capable robotic systems, fostering their integration into various domains.

Keywords: Egocentric Videos, Skill Transfer, Robotics, Transfer Learning, Human-Robot Interaction.

Required skills

Outstanding passion and motivation for research.
Excellent programming skills (Python and PyTorch) are required.
Experience in deep learning for videos (egocentric or third-person) is required.
Experience with robot learning is not required, although preferred.