AI4Media - Controllable AnonyMizatiOn throUgh diFfusion-based image coLlection GEneration

Duration:
12 months (2024)
Principal investigator(s):
Project type:
EU-funded research
Funding body:
European Commission (https://www.ai4media.eu/open-call-2/)
Project identification number:
PoliTo role:
Sole Contractor

Abstract

Social media platforms generate a tremendous amount of visual material that can be exploited by researchers working in social media research, digital humanities, and marketing. However, privacy regulations impose significant restrictions on both data collection and sharing. The CAMOUFLAGE project aims to exploit recent advances in controlled image synthesis to generate a synthetic version of an image corpus with characteristics similar to those of a target collection, while removing all personally identifiable information to ensure the anonymity of the users who published the images. Achieving this ambitious goal will require tackling three distinct, yet related, research objectives: to design and implement controllable image synthesis that retains the visual and semantic content of a target image; to determine whether the resulting synthetic images can be considered successfully anonymized; and to determine whether the synthetic collection is semantically equivalent to the original collection. The CAMOUFLAGE synthesizer will be based on diffusion models that extract some non-sensitive data from the original image and use it to force the model to preserve the composition of the image, under a predetermined measure of “equivalence”, while removing personal identifiers. The notion of “equivalence” depends on the objectives and needs of the users: ideally, conclusions drawn on the synthetic dataset would also be valid for the original collection. As a motivating example and case study, CAMOUFLAGE will focus on the semiotic analysis of visual big data, specifically a collection of profile pictures, tagged with socio-demographic data, acquired from Facebook and Instagram. Different analysis scenarios will be considered, from the large-scale automatic extraction of quantitative information with pre-trained neural networks to visual analysis by expert semioticians.
If successful, CAMOUFLAGE will not only deliver a useful tool and anonymized assets to the community, but may also bring novel insights into the existing limitations and biases of generative models.

Structures

Keywords

ERC sectors

PE6_8 - Computer graphics, computer vision, multi media, computer games

Sustainable Development Goals

Goal 9. Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation

Budget

Total cost: € 49,750.00
Total contribution: € 49,750.00
PoliTo total cost: € 49,750.00
PoliTo contribution: € 49,750.00