Ph.D. in Computer and Control Engineering, 34th cycle (2018-2021)

Ph.D. completed in 2022

Profile

Research topic

Dissecting Deep Language Models: The Explainability and Bias Perspective

Research interests

Life sciences

Biography

I’m a third-year Ph.D. student at the Department of Control and Computer Engineering of the Polytechnic of Turin. Presently, I work on understanding and regularizing large language models in the context of bias and fairness applications. In the past, I worked on modeling and forecasting financial time series.
I currently live in Turin. I love reading (Sci-Fi, please) and playing basketball, and circumstances led me to discover that I’m not so bad at cooking. I also like DIY and automating boring stuff. Besides that, I am a passionate learner: I spend countless hours on lectures and tutorials about languages, frameworks, and technologies that I find interesting.

Personal website: https://gattanasio.cc

Awards and honors

Course: The Fourth Industrial Revolution: Promises and Pitfalls in Blending New and Traditional Approaches in Manufacturing and Service Sectors – ASP School | Code: 01TGLJZ | Description: I was one of the five tutors for this Alta Scuola Politecnica spring school. Before the school, I contributed to designing the task for the final project. During the school (5 working days), I taught a lesson on data analytics and business intelligence using real-world textual reviews from the IKEA website. Additionally, I supervised students working on their group projects. | Official role: tutor to the school | Official hours: A.Y. 18/19: 30h | File attached: program of the School (2021)
Award: During the HuggingFace Flax/JAX Community Week, we had access to a Google TPU that allowed us to train CLIP-Italian, the first Italian version of OpenAI’s CLIP model. The project won a special nomination from the jury and was included in the top 15 projects of the challenge, out of 100. The published model is state of the art for connecting images and text in tasks like zero-shot classification and image retrieval. The trained architecture is publicly available on the HuggingFace Hub. | Event: HuggingFace Flax/JAX Community Week in collaboration with Google. | File attached: a screenshot of two never-seen images retrieved by our trained CLIP from the Unsplash 25K dataset | arxiv: https://arxiv.org/abs/2108.08688 (2021)
Course: Basi di dati | Code: 14AFQOA | Description: I have been a teaching assistant to the course laboratories. | Official role: teaching assistant to laboratories | Official hours: A.Y. 18/19: 21h | File attached: placeholder file (2021)
Course: Business Intelligence per big data | Code: 01RLBNG, 01RLBPG | Description: I have been a teaching assistant to the course laboratories. I have also designed the task for the final exam project, allowing students to study and process multilingual real-world tweets concerning the spread of COVID-19. | Official role: teaching assistant to laboratories | Official hours: A.Y. 19/20: 21h, A.Y. 20/21: 21h | File attached: a portion of the dataset students faced in the final project (the file name ends with .pdf, but it is actually a .csv, since the web app wouldn't allow that extension :) ) (2021)
Course: Data: Theory | Code: 01VGFTJ | Description: I have designed and conducted a short introductory tutorial on Data Visualization with Python. The proposed exercises make use of NumPy and Pandas, and introduce the basics of Matplotlib, Seaborn, and Plotly for quick data exploration and visualization. An unsolved version (exercises only) and a solved version are available on GitHub as Jupyter Notebooks. | Official role: lecturer | Official hours: 3h | File attached: the laboratory handout with solutions (2021)
Dissemination: I presented my research activity on associative classification applied to quantitative stock trading to the members and steering board of the SmartData interdepartmental center. | Event: 4th SmartData@PoliTO Workshop | File: slides of the talk (2021)
Dissemination: I will be presenting my research activity on CLIP-Italian with a talk entitled "Connecting Images and Italian Text" at the workshop "Dati, AI e Robotica @polito", held on September 29, 2021. | Event: "Dati, AI e Robotica @polito" Workshop | File: program of the event (2021)
Course: Data Science Lab: process and methods | Code: 01TWZSM | Description: I have been a major contributor to shaping the course Data Science Lab: process and methods, launched in September 2019. The course is the first introduction to the Python programming language and to the basics of Data Science and Machine Learning libraries for master's degree students. Flavio Giobergia and I worked to provide students with comprehensive exercises and solutions (10 laboratories, for a total of 60+ pages of lab exercises and 250+ pages of solutions). All the material is freely available on the course website. | Official role: teaching assistant to laboratories | Official hours: A.Y. 19/20: 39h, A.Y. 20/21: 39h, A.Y. 21/22: 60h (expected) | File attached: one of our lab solutions on clustering techniques. (2021)
Dissemination: I presented my research activity on machine learning techniques to model, forecast, and trade cryptocurrencies to the members and steering board of the SmartData interdepartmental center. | Event: 5th SmartData@PoliTO Workshop | File: slides of the talk (2021)
Dissemination: In this talk, I present a chronological overview of modeling solutions for time series and natural language. I go through seminal papers on the applications of recurrent models to tasks like language modeling, neural machine translation, image captioning, or multi-modal text generation. Next, I describe the advantages and shortcomings of RNNs, explaining how the latter led to the advent of the attention mechanism. I conclude with a brief introduction to the Transformer architecture. The talk is part of Research Bites, the series of research-oriented seminars we organized for students of the course Data Science Lab: process and methods. | File attached: slides of the talk | Video: https://youtu.be/v4dqhP6HVns (2021)

Teaching

Courses

Master's degree programme

Business intelligence per big data. A.Y. 2019/20, MANAGEMENT ENGINEERING. Course collaborator
Business intelligence per big data. A.Y. 2020/21, MANAGEMENT ENGINEERING. Course collaborator
Data science lab: process and methods. A.Y. 2019/20, DATA SCIENCE AND ENGINEERING. Course collaborator
Data science lab: process and methods. A.Y. 2020/21, DATA SCIENCE AND ENGINEERING. Course collaborator
Data science lab: process and methods. A.Y. 2021/22, DATA SCIENCE AND ENGINEERING. Course collaborator

Bachelor's degree programme

Basi di dati. A.Y. 2018/19, COMPUTER ENGINEERING. Course collaborator

Publications

Publications during the Ph.D. (all publications are listed on Porto@Iris)

Raus, Rachele; Tonti, Michela; Cerquitelli, Tania; Cagliero, Luca; Attanasio, Giuseppe; ... (2022)
L’analyse du discours et l’intelligence artificielle pour réaliser une écriture inclusive : le projet EMIMIC. In: 8e Congrès Mondial de Linguistique Française. ISSN 2261-2424
Contribution in conference proceedings
Attanasio, Giuseppe; Nozza, Debora; Hovy, Dirk; Baralis, Elena (2022)
Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists. In: Association for Computational Linguistics, pp. 1105-1119
Contribution in conference proceedings
Attanasio, Giuseppe; Nozza, Debora; Bianchi, Federico (2022)
MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities. In: 16th International Workshops on Semantic Evaluation (SemEval), Seattle, WA (USA), July 14-15, 2022, pp. 654-662. ISBN: 978-1-955917-80-3
Contribution in conference proceedings
Attanasio, Giuseppe (2022)
Dissecting Deep Language Models: The Explainability and Bias Perspective. Supervisor: Baralis, Elena Maria. XXXIV cycle, 132 pp.
Doctoral Thesis
Bellocca, Gian Pietro; Attanasio, Giuseppe; Cagliero, Luca; Fior, Jacopo (2022)
Leveraging the momentum effect in machine learning-based cryptocurrency trading. In: MACHINE LEARNING WITH APPLICATIONS, vol. 8. ISSN 2666-8270
Journal article
Attanasio, Giuseppe; Nozza, Debora; Pastor, Eliana; Hovy, Dirk (2022)
Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection. In: Association for Computational Linguistics, pp. 100-112
Contribution in conference proceedings
Attanasio, Giuseppe; Greco, Salvatore; La Quatra, Moreno; Cagliero, Luca; Tonti, ... (2021)
E-MIMIC: Empowering Multilingual Inclusive Communication. In: First International Workshop on Data science for equality, inclusion and well-being challenges, Virtual, Online, 15-18 December 2021, pp. 4227-4234
Contribution in conference proceedings
Attanasio, Giuseppe; Giobergia, Flavio; Pasini, Andrea; Ventura, Francesco; Baralis, ... (2020)
DSLE: A Smart Platform for Designing Data Science Competitions. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid (Spain), July 13-17, pp. 133-142. ISBN: 978-1-7281-7303-0
Contribution in conference proceedings
Attanasio, Giuseppe; Cagliero, Luca; Baralis, Elena (2020)
Leveraging the explainability of associative classifiers to support quantitative stock trading. In: Sixth International Workshop on Data Science for Macro-Modeling, Portland, OR, USA, June 14, 2020, pp. 1-6. ISBN: 9781450380300
Contribution in conference proceedings
Cagliero, Luca; Garza, Paolo; Attanasio, Giuseppe; Baralis, Elena (2020)
Training ensembles of faceted classification models for quantitative stock trading. In: COMPUTING, vol. 102, pp. 1213-1225. ISSN 0010-485X
Journal article
Attanasio, Giuseppe; Pastor, Eliana (2020)
PoliTeam@AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets. In: Evaluation Campaign of Natural Language Processing and Speech tools for Italian, Online, 17 December 2020
Contribution in conference proceedings
Attanasio, Giuseppe; Cagliero, Luca; Garza, Paolo; Baralis, Elena (2019)
Quantitative cryptocurrency trading: exploring the use of machine learning techniques. In: 5th workshop on data science for macro-modeling with financial and economic datasets, Amsterdam, Netherlands, June 30 - July 5, 2019, pp. 1-6. ISBN: 9781450368230
Contribution in conference proceedings
Cagliero, Luca; Attanasio, Giuseppe; Garza, Paolo; Baralis, Elena Maria (2019)
Combining news sentiment and technical analysis to predict stock trend reversal. In: 9th ICDM Workshop on Sentiment Elicitation from Natural Text for Information Retrieval and Extraction, Beijing, China, November 8, 2019, pp. 514-521
Contribution in conference proceedings

Giuseppe Attanasio
