CAMOUFLAGE - Controllable AnonyMizatiOn throUgh diFfusion-based image coLlection GEneration
Duration:
Principal investigator:
Project type:
Funding body:
Project identification code:
PoliTo role:
Abstract
Social media currently generate a tremendous amount of visual material that can be exploited by researchers working in social media research, digital humanities, and marketing. However, privacy regulations impose significant restrictions on both data collection and sharing. The CAMOUFLAGE project aims to exploit recent advances in controlled image synthesis to generate a synthetic version of an image corpus with characteristics similar to those of a target collection, while removing all personally identifiable information to ensure the anonymity of the users who published the images. Achieving this ambitious goal will require tackling three distinct yet related research objectives: to design and implement controllable image synthesis that retains the visual and semantic content of a target image; to determine whether the resulting synthetic images can be considered successfully anonymized; and to determine whether the synthetic collection is semantically equivalent to the original collection. The CAMOUFLAGE synthesizer will be based on diffusion models, using non-sensitive data extracted from the original image to force the model to preserve the composition of the image, under a predetermined measure of “equivalence”, while removing personal identifiers. Of course, the notion of “equivalence” depends on the objectives and needs of the users: ideally, conclusions drawn from the synthetic dataset would also be valid for the original collection. As a motivating example and case study, CAMOUFLAGE will focus on the semiotic analysis of visual big data, specifically a collection of profile pictures, tagged with socio-demographic data, acquired from Facebook and Instagram. Different analysis scenarios will be considered, from the large-scale automatic extraction of quantitative information with pre-trained neural networks to visual analysis by expert semioticians.
If successful, CAMOUFLAGE will not only deliver a useful tool and anonymized assets to the community, but may also bring novel insights into the existing limitations and biases of generative models.
Departments involved
Keywords
ERC sectors
Sustainable Development Goals
Budget
| Item | Amount |
|---|---|
| Total project cost | € 49,750.00 |
| Total project contribution | € 49,750.00 |
| Total PoliTo cost | € 49,750.00 |
| PoliTo contribution | € 49,750.00 |