Stable Diffusion 2.0 now features Depth2Image, a depth-guided model for image editing. The model is conditioned on monocular depth estimates inferred with MiDaS and can be used for structure-preserving image transformations as well as shape-conditional synthesis. Rather than relying only on a noised copy of the original image, Depth2Image first infers the depth of the input and builds a “depth-mask”; this mask is then used to keep new results structurally coherent, so they can look very different from the source while still preserving its depth and layout. These capabilities open up new possibilities for creative applications.
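
As a rough illustration of the workflow described above, here is a minimal sketch that assumes the Hugging Face `diffusers` library and its `StableDiffusionDepth2ImgPipeline`, neither of which is mentioned in the announcement itself. The `depth2img` helper, its defaults, and the file names are hypothetical, and the checkpoint name `stabilityai/stable-diffusion-2-depth` is assumed to be the published depth model. When no depth map is passed in, the pipeline should estimate one from the input image using its bundled depth estimator.

```python
# Assumed checkpoint name on the Hugging Face Hub (not stated in the announcement).
MODEL_ID = "stabilityai/stable-diffusion-2-depth"


def depth2img(image_path, prompt, negative_prompt=None, strength=0.7, device="cuda"):
    """Hypothetical helper: re-render an image from a prompt while keeping its depth structure."""
    # strength controls how much noise is added to the source image:
    # 0.0 keeps it nearly unchanged, 1.0 discards almost all of its appearance.
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")

    # Heavy imports are deferred so the module can be imported without a GPU stack.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionDepth2ImgPipeline

    # Half precision is a common choice on GPU; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(MODEL_ID, torch_dtype=dtype)
    pipe = pipe.to(device)

    init_image = Image.open(image_path).convert("RGB")

    # No depth map is supplied here, so the pipeline is expected to infer one itself.
    result = pipe(
        prompt=prompt,
        image=init_image,
        negative_prompt=negative_prompt,
        strength=strength,
    )
    return result.images[0]


# Example call (downloads the model weights on first use):
# edited = depth2img("room.png", "a cozy cabin interior, warm lighting")
# edited.save("room_cabin.png")
```

In this sketch, a higher `strength` lets the result drift further from the source image in color and texture, while the depth conditioning is what should keep the overall layout in place.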

