University of Texas develops AI framework to avoid copyright infringement


Artificial intelligence (AI) models, while powerful, have been known to falter by either generating false information or replicating others' work. To address the latter issue, a team of researchers at The University of Texas at Austin has developed a framework that trains AI models on images so distorted they are unrecognizable.

The AI models DALL-E, Midjourney, and Stable Diffusion, which convert user text into realistic images, are currently facing lawsuits from artists who claim the generated samples mimic their work. These models have been trained on billions of image-text pairs that are not publicly accessible. They can generate high-quality imagery from textual prompts but may inadvertently replicate copyrighted images.

The newly proposed framework, named Ambient Diffusion, sidesteps this problem by training diffusion models using only corrupted image data. Preliminary results indicate that the framework can still generate high-quality samples without ever seeing anything recognizable as the original source images.

Ambient Diffusion was first presented at NeurIPS 2023, a leading machine-learning conference, and has since been adapted and extended. A follow-up paper, "Consistent Diffusion Meets Tweedie," was accepted for presentation at the 2024 International Conference on Machine Learning. In collaboration with Constantinos Daskalakis of the Massachusetts Institute of Technology, the team broadened the framework to train diffusion models on larger datasets and on images corrupted by other types of noise.

"The framework could prove useful for scientific and medical applications," said Adam Klivans, a professor of computer science involved in the research. He added that it would be beneficial for any research where obtaining an uncorrupted dataset is costly or impossible - such as black hole imaging or certain types of MRI scans.

Klivans; Alex Dimakis, a professor of electrical and computer engineering; and other collaborators from the multi-institution Institute for Foundations of Machine Learning, which the two UT faculty members direct, first experimented by training a diffusion model on a set of 3,000 celebrity images and then using it to generate new samples. The model trained on clean data blatantly copied the training examples. But when the researchers corrupted the training data by randomly masking up to 90% of the pixels in each image and retrained the model with their new approach, the generated samples remained high quality while looking markedly different from the originals.
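
As a rough illustration of the corruption step described above, here is a minimal sketch in Python (PyTorch); the function name, tensor layout, and 10% survival probability, corresponding to masking roughly 90% of pixels, are illustrative assumptions rather than the team's actual code.

```python
import torch

def randomly_mask_pixels(images: torch.Tensor, survival_prob: float = 0.1):
    """Corrupt a batch of images by hiding individual pixels at random.

    images: (N, C, H, W) batch of training images.
    survival_prob=0.1 keeps roughly 10% of pixels, i.e., masks up to
    ~90% of them, the harshest setting described in the experiment.
    Returns the corrupted images and the binary mask (1 = pixel kept).
    """
    n, _, h, w = images.shape
    # One mask shared across channels, so a pixel is fully hidden or fully kept.
    mask = (torch.rand(n, 1, h, w, device=images.device) < survival_prob).float()
    return images * mask, mask
```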

"Our framework allows for controlling the trade-off between memorization and performance," said Giannis Daras, a computer science graduate student who led the work. He added that as the level of corruption encountered during training increases, the memorization of the training set decreases.

The researchers believe this points to a solution that may affect performance but will never output noise. The framework also demonstrates how academic researchers are advancing artificial intelligence to meet societal needs, a key theme at The University of Texas at Austin, which has declared 2024 the "Year of AI."

The research team included members from the University of California, Berkeley, and MIT. Funding came from sources including the National Science Foundation, Western Digital, Amazon, Cisco, and fellowship and endowment support for the authors.