Nvidia's DiffUHaul AI Moves Objects in Images Seamlessly
Nvidia researchers have developed DiffUHaul, an AI tool that can relocate objects within images without affecting the background.
Nvidia researchers have developed a new AI tool called DiffUHaul that can relocate objects within images without altering the background or the object's size. This innovative tool addresses the limitations of current text-to-image models by incorporating "spatial reasoning."

How DiffUHaul Works

Traditional text-to-image models struggle with complex image editing due to a lack of spatial understanding. DiffUHaul overcomes this by:

Masking the object: During the denoising process, the object is masked, allowing the AI to understand its position and separate it from the background.

Interpolating the difference: The difference between the original and generated image is interpolated to place the object in its new location without modifying the background.

Preserving details: Finer details from the original image are transferred to the new image for consistency.

DiffUHaul builds upon BlobGEN, a model that uses spatial understanding for image composition from complex prompts. The re…