ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion
A counterfactual dataset and bootstrap supervision that significantly improve object removal and insertion, particularly where existing methods violate physical laws such as occlusions and shadows.
Diffusion models used for image editing often violate physical laws, particularly those governing occlusions and shadows. After analyzing the limitations of self-supervised approaches, this project proposes a practical solution centered on a "counterfactual" dataset: each scene is captured before and after a single object is removed, with all other changes kept to a minimum. Fine-tuning a diffusion model on this dataset teaches it to remove not only an object but also its effects on the scene, which significantly improves object removal.
The object removal model removes objects together with their associated scene effects. Although it is trained on a relatively small counterfactual dataset captured in controlled environments, it generalizes well to diverse scenarios and can remove even large objects.
To address the challenge of photorealistic object insertion, the researchers propose bootstrap supervision. They use the object removal model, trained on the small counterfactual dataset, to synthetically expand that dataset to a much larger size. Training on the expanded data yields a high-quality object insertion model that outperforms previous methods, particularly in modeling the effects of objects on the scene.
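The bootstrapping step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `InsertionExample` record, the `bootstrap_insertion_dataset` function, and the `dummy_remove` stand-in are all hypothetical names, images are modeled as plain nested lists, and the way each pair is ultimately fed to the insertion model is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical types: an image is a 2D grid of pixel values and a mask is a
# 2D grid of 0/1 flags marking the object to remove.
Image = List[List[float]]
Mask = List[List[int]]


@dataclass
class InsertionExample:
    background: Image  # synthetic scene without the object
    mask: Mask         # where the object sits
    target: Image      # original image containing the object and its effects


def bootstrap_insertion_dataset(
    images_with_masks: Iterable[Tuple[Image, Mask]],
    remove_object: Callable[[Image, Mask], Image],
) -> List[InsertionExample]:
    """Build synthetic (without object, with object) pairs from ordinary images.

    `remove_object` stands in for the trained object removal model.
    """
    examples = []
    for image, mask in images_with_masks:
        background = remove_object(image, mask)
        examples.append(InsertionExample(background, mask, image))
    return examples


def dummy_remove(image: Image, mask: Mask) -> Image:
    """Toy stand-in for the removal model: zero out the masked pixels."""
    return [
        [0.0 if m else p for p, m in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(image, mask)
    ]


image = [[1.0, 2.0], [3.0, 4.0]]
mask = [[0, 1], [0, 0]]
examples = bootstrap_insertion_dataset([(image, mask)], dummy_remove)
```

Because the removal model only needs an image and a mask, any collection of ordinary images with object masks can be turned into training pairs this way, which is what lets the synthetic set grow far beyond the captured counterfactual data.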
Combining the object removal and insertion models also makes it possible to move objects within an image: the object is removed from its original position and re-inserted elsewhere, producing a realistic result. The insertion model is trained first on the large synthetic dataset created with the object removal model and then on a high-quality dataset, which lets it capture how objects affect their environment and achieve photorealistic results.
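Object movement as a composition of the two models might look like the sketch below. It assumes hypothetical `remove_object` and `insert_object` callables standing in for the trained models; the toy `dummy_remove` and `dummy_insert` functions only zero out and paste pixels, so they do not reproduce shadows or other scene effects.

```python
from typing import Callable, List

Image = List[List[float]]
Mask = List[List[int]]


def shift(grid, dy, dx, fill):
    """Translate a 2D grid by (dy, dx), filling uncovered cells with `fill`."""
    height, width = len(grid), len(grid[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = grid[y][x]
    return out


def move_object(
    image: Image,
    mask: Mask,
    dy: int,
    dx: int,
    remove_object: Callable[[Image, Mask], Image],
    insert_object: Callable[[Image, Image, Mask], Image],
) -> Image:
    """Remove the masked object, then re-insert it at an offset position."""
    background = remove_object(image, mask)
    # Keep only the object's pixels, then translate them and the mask together.
    object_pixels = [
        [p if m else 0.0 for p, m in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(image, mask)
    ]
    new_pixels = shift(object_pixels, dy, dx, 0.0)
    new_mask = shift(mask, dy, dx, 0)
    return insert_object(background, new_pixels, new_mask)


def dummy_remove(image: Image, mask: Mask) -> Image:
    """Toy stand-in for the removal model: zero out the masked pixels."""
    return [
        [0.0 if m else p for p, m in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(image, mask)
    ]


def dummy_insert(background: Image, pixels: Image, mask: Mask) -> Image:
    """Toy stand-in for the insertion model: paste pixels where the mask is set."""
    return [
        [p if m else b for b, p, m in zip(bg_row, pixel_row, mask_row)]
        for bg_row, pixel_row, mask_row in zip(background, pixels, mask)
    ]


image = [[5.0, 0.0], [0.0, 0.0]]
mask = [[1, 0], [0, 0]]
moved = move_object(image, mask, 1, 1, dummy_remove, dummy_insert)
```

Passing the two models in as callables keeps the composition independent of how either model is implemented.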