From mosaics to high-definition images, AI has become far better at generating images. But how can it strike a balance between beauty and distortion?
In suspense and science fiction, we often see this scene: a blurry photo appears on a computer screen, the investigator asks to enhance the image, and it magically becomes clear, revealing important clues.
It looks great, but for decades it has been pure fiction. It remained difficult even as AI's generative capabilities began to grow: "If you just zoomed in on the image, it would become blurry. There would be a lot of detail, but it would be all wrong," said Bryan Catanzaro, vice president of applied deep learning research at Nvidia.
Recently, however, researchers have begun incorporating AI algorithms into image enhancement tools, making the process easier and more powerful. There are still limits on how much data can be retrieved from any image, but as researchers push enhancement algorithms further, they are finding new ways to work within those limits and, in some cases, around them.
Over the past decade, researchers began enhancing images with generative adversarial network (GAN) models, which can produce detailed and impressive pictures.
"The images suddenly looked much better," says Tomer Michaeli, an electrical engineer at the Technion (Israel Institute of Technology). But he was also surprised to find that GAN-generated images showed high levels of distortion, a measure of how far an enhanced image is from the underlying reality it depicts. Images generated by GANs look beautiful and natural, but they are in fact "hallucinating" inaccurate details, which leads to a high degree of distortion.
Michaeli observed that the photo restoration field had split into two broad camps: one showcased beautiful images, many of them generated by GANs; the other showed data but few pictures, because its images did not look good.
In 2017, Michaeli and his graduate student Yochai Blau examined more formally how various image enhancement algorithms performed on distortion and perceptual quality, using known measures of perceptual quality that correlate with human subjective judgment. As Michaeli expected, some algorithms achieved very high visual quality, while others were very accurate, with very low distortion. But none offered the best of both worlds; you have to choose one over the other. This is called the perception-distortion trade-off.
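The tension described above can be illustrated with a toy experiment. The sketch below is not Michaeli and Blau's method; it assumes a hypothetical "true" texture of random black and white pixels, uses mean squared error as a stand-in distortion measure, and uses pixel contrast (standard deviation) as a crude proxy for how natural an estimate looks.

```python
import random
import statistics


def mse(a, b):
    """Mean squared error, a common distortion measure (lower is closer to truth)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def demo(n=10_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical "true" fine texture: every pixel is 0.0 or 1.0 at random.
    truth = [float(rng.getrandbits(1)) for _ in range(n)]
    # Estimate A plays it safe and outputs the average gray everywhere (blurry).
    blurry = [0.5] * n
    # Estimate B invents a fresh random texture with the same statistics (sharp).
    sharp = [float(rng.getrandbits(1)) for _ in range(n)]
    return {
        "mse_blurry": mse(truth, blurry),
        "mse_sharp": mse(truth, sharp),
        "contrast_truth": statistics.pstdev(truth),
        "contrast_blurry": statistics.pstdev(blurry),
        "contrast_sharp": statistics.pstdev(sharp),
    }


result = demo()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the blurry estimate has the lower distortion (an MSE of 0.25, against roughly 0.5 for the sharp guess) but no contrast at all, while the sharp guess matches the texture's statistics and is wrong about half the pixels. Neither estimate wins on both axes.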
Michaeli also challenged other researchers to come up with algorithms that produce the best image quality at a given level of distortion, allowing a fair comparison between algorithms built for pretty pictures and algorithms built for good statistics. Since then, hundreds of AI researchers have reported the distortion and perceptual quality of their algorithms, citing the Michaeli and Blau paper that described the trade-off.
Sometimes the consequences of the perception-distortion trade-off are not that serious. For example, Nvidia found that high-definition screens did not render some low-definition visual content well, so in February 2023 it launched a tool that uses deep learning to upscale streaming video. In this case, Nvidia's engineers chose perceptual quality over accuracy, accepting that when the algorithm raises a video's resolution, it generates some visual detail not present in the original.
"The model is hallucinating. It's pure speculation," Catanzaro said. "Most of the time, it doesn't matter if a super-resolution model guesses wrong, as long as it's consistent."
Applications in research and medicine, in particular, demand greater accuracy. AI technology has made significant progress in imaging, but "it sometimes has undesirable side effects, such as overfitting or adding false features, so it needs to be treated with extreme caution," said Junjie Yao, a biomedical engineer at Duke University.
In a paper last year, he described how AI tools could improve existing measurements of brain blood flow and metabolism while staying safely on the accurate side of the perception-distortion trade-off.
One way around the limits on how much data can be extracted from an image is simply to combine data from more images. Researchers who study the environment through satellite imagery have already made progress in integrating visual data from different sources. In 2021, researchers in China and the United Kingdom fused data from two different types of satellites to better observe deforestation in the Congo Basin, which holds the second-largest tropical rainforest in the world and is one of its most biologically diverse regions. The researchers took data from two Landsat satellites, which have been measuring deforestation for decades, and used deep learning techniques to improve the resolution of the images from 30 meters to 10 meters. They then fused this set of images with data from two Sentinel-2 satellites, which have slightly different detector arrays. Their experiments show that the combined imagery "enables the detection of 11% to 21% more disturbed areas than when using Sentinel-2 or Landsat-7/8 images alone."
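The resample-then-fuse pipeline described above can be sketched in a few lines. This is a minimal illustration, not the researchers' method: plain pixel replication stands in for their deep learning super-resolution step, a pixel-wise mean stands in for their fusion step, and the tile values and function names are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Repeat every pixel `factor` times along both axes (pixel replication)."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_mean(a, b):
    """Pixel-wise average of two grids that share the same shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical tiles covering the same ground area:
# 2x2 pixels at 30 m per pixel, and 6x6 pixels at 10 m per pixel.
coarse_30m = [[0.2, 0.8],
              [0.4, 0.6]]
fine_10m = [[0.5] * 6 for _ in range(6)]

# 30 m / 10 m = 3, so each coarse pixel becomes a 3x3 block on the 10 m grid.
coarse_on_10m_grid = upsample_nearest(coarse_30m, 3)
fused = fuse_mean(coarse_on_10m_grid, fine_10m)
```

Pixel replication adds no new information, which is why the study relied on a learned model for this step; the sketch only shows how two sources end up on a common 10 m grid before they are combined.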
Michaeli proposes another way around, if not through, the hard limits on the availability of information. Rather than seeking one definitive answer on how to enhance a low-quality image, the model can show multiple different interpretations of the original image. In the paper "Explorable Super Resolution," he shows how an image enhancement tool can offer the user multiple suggestions. A blurry, low-resolution image of a person wearing what appears to be a gray shirt can be reconstructed into a higher-resolution image in which the shirt has black and white vertical stripes, horizontal stripes, or plaid, all with equal plausibility.
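The shirt example above can be reproduced with a toy calculation. The sketch below assumes that downsampling means averaging each 2x2 block of pixels (a simplification of how real cameras and codecs lose detail) and uses a checkerboard as a stand-in for plaid; it is an illustration of the ambiguity, not the method from the paper.

```python
def downsample_mean(grid, factor):
    """Average each factor x factor block into one low-resolution pixel."""
    size = len(grid) // factor
    return [
        [
            sum(
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


n = 4  # side length of the hypothetical high-resolution patch
candidates = {
    "vertical stripes": [[float(c % 2) for c in range(n)] for r in range(n)],
    "horizontal stripes": [[float(r % 2) for c in range(n)] for r in range(n)],
    "checkerboard": [[float((r + c) % 2) for c in range(n)] for r in range(n)],
    "flat gray": [[0.5] * n for _ in range(n)],
}

# Every candidate collapses to the same low-resolution patch.
low_res = {name: downsample_mean(img, 2) for name, img in candidates.items()}
for name, patch in low_res.items():
    print(f"{name}: {patch}")
```

All four patterns average to the same uniform gray, so the low-resolution image alone cannot say which one was photographed. An enhancement tool can only pick one plausible answer or, as Michaeli suggests, present several.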
We can mitigate these hallucinations, but that powerful, crime-solving "enhance" button remains a dream.
Each discipline addresses the perception-distortion trade-off in its own way. How much information can be extracted from AI-enhanced images, and how far those images can be trusted, remain core questions.
“We should keep in mind that the algorithm is just making up the details in order to output these beautiful images,” Michaeli said.