I recently experimented with using Generative AI to cover the nadir with a realistic representation of the ground underneath the camera in my 360 images. The process of editing myself out is quite time consuming, so this is mostly worth it when capturing a small area. This was a fun experiment to see how well generative AI handles imaginative infill on images from a top-down perspective. It turns out that at least Firefly from Adobe works pretty well. I also tested Stable Diffusion XL through Clipdrop and ran several Stable Diffusion models locally on my Apple Silicon Mac Studio, which did not turn out so well.
I am still thinking whether I will give this a like.
Impressive results!
I have used Resynthesizer under GIMP for a while to fill missing parts of the sky or ground when doing panoramic stitching. That's nice from an aesthetic point of view. Indeed, I would be more careful when using it for Mapillary, in the sense that it is creating non-existing data. It may produce artifacts for people wanting to assess road damage or count steps in stairs.
Valid point most definitely @Eric_S.
The semi-transparent glass effect is becoming a more popular design element in the world of VR and AR. One modification to the approach used in this post is using Generative AI to remove the photographer and then add a circular "glass" logo to the nadir point that slightly blurs the generated area. This would create a nice "aesthetic" effect and obfuscate the generated area enough to not cause any confusion in the data.
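To illustrate the idea, here is a minimal Pillow sketch of how such a "glass" disc could be composited onto the polar-projected nadir crop. The file names, disc radius and opacity are placeholder assumptions rather than a tested workflow:

from PIL import Image, ImageDraw, ImageFilter

# Polar-projected (top-down) nadir crop, assumed square, e.g. 1500x1500 (placeholder file names)
nadir = Image.open("nadir_polar.png").convert("RGBA")
logo = Image.open("logo.png").convert("RGBA")

size = nadir.size[0]
center = size // 2
radius = int(size * 0.25)  # assumed size of the "glass" disc

# Circular mask covering the disc area at the nadir point
disc = Image.new("L", nadir.size, 0)
ImageDraw.Draw(disc).ellipse(
    (center - radius, center - radius, center + radius, center + radius), fill=255
)

# Blur only the area under the disc to obfuscate the generated pixels
blurred = nadir.filter(ImageFilter.GaussianBlur(8))
nadir.paste(blurred, (0, 0), disc)

# Drop a roughly 50% opaque logo on top for the "glass" look
logo = logo.resize((radius * 2, radius * 2))
logo.putalpha(logo.getchannel("A").point(lambda a: a // 2))
nadir.alpha_composite(logo, (center - radius, center - radius))
nadir.save("nadir_with_glass_logo.png")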
Another approach I would like to explore is how to interpolate the ground underneath the photographer based on pictures taken just a few frames before and after, where the same ground is visible. If anyone knows of a technique or paper exploring this process, I am all ears.
@Eric_S what tools and camera do you use for panoramic stitching, and do you have an automated process for batch stitching images?
For spherical panos, I was using a Nodal Ninja head with a DSLR. Manual stitching with Hugin.
About the ground, if it is more or less flat, you should be able to perform a reprojection using neighbouring pictures. With Hugin, you will need to use the translation options. It could be complex, as it would need an intermediate reprojection.
I presumed you did it like that.
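For anyone who wants to experiment with that reprojection idea outside Hugin, here is a rough OpenCV sketch that assumes the ground is roughly planar and estimates a homography between two consecutive top-down nadir crops. The file names are placeholders and this is not a tested pipeline:

import cv2
import numpy as np

# Two polar-projected nadir crops from consecutive frames (placeholder file names)
curr = cv2.imread("nadir_current.png")   # frame with the photographer in it
prev = cv2.imread("nadir_previous.png")  # neighbour frame where that ground is visible

# Match features between the two crops
orb = cv2.ORB_create(4000)
g_prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
g_curr = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)
k1, d1 = orb.detectAndCompute(g_prev, None)
k2, d2 = orb.detectAndCompute(g_curr, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]

src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# A homography is only a reasonable model if the ground is more or less flat
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
warped = cv2.warpPerspective(prev, H, (curr.shape[1], curr.shape[0]))

# 'warped' could then be pasted under a mask of the photographer instead of inpainting
cv2.imwrite("nadir_reprojected.png", warped)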
I have been working on an automated workflow to cover up the nadir. It is not perfect but very close. It uses rembg birefnet-general-lite to create a mask of the subject (me, the cyclist) and iopaint lama to "erase" the subject. What I haven't solved yet is whether it is possible to also automatically remove the shadows.
I used this guide for the polar/depolar side of things: How to Add a Custom Nadir to a 360 Photo Programmatically (using ImageMagick) | Trek View
Sample images:
1. Crop the image so we get the bottom 35% and apply polar distortion so we can get a top-down view of the subject.
2. Generate a mask of the subject so that the inpaint model knows where to operate.
3. Expand the mask area, as the inpaint model works better when the surrounding pixels are also selected.
4. Run the inpaint model.
5. Mask out the background and apply blur to cover up imperfections and, as suggested by others above, to reduce confusion with non-existent detail.
6. Apply depolar distortion to prepare for re-combining back into the image.
7. Overlay the blurry, inpainted result back onto the main image.
Here is what my script looks like:
You will need ImageMagick, iopaint and rembg installed; in my case I installed iopaint and rembg in a Python virtual environment. Lastly, at step 6, set the output size to the width of the 360 image and to its height multiplied by the percentage of the cropped section.
import os
import subprocess
# Configurable variables
MAGICK_PATH = "magick"
REMBG_PATH = r"C:\Users\m\Pictures\Nadir fix\.venv\Scripts\rembg"
IOPAINT_PATH = r"C:\Users\m\Pictures\Nadir fix\.venv\Scripts\iopaint"
SKIP_STEP_1 = False
SKIP_STEP_2 = False
SKIP_STEP_3 = False
SKIP_STEP_4 = False
SKIP_STEP_5 = False
SKIP_STEP_6 = False
SKIP_STEP_7 = False
def ensure_folder_exists(folder):
    os.makedirs(folder, exist_ok=True)

def run_command(command):
    subprocess.run(command, check=True)

def process_images():
    # Define folders
    input_folder = "in"
    step1_folder = "1"
    step2_folder = "2"
    step3_folder = "3"
    step4_folder = "4"
    step5_folder = "5"
    step6_folder = "6"
    output_folder = "out"

    # Ensure required folders exist
    for folder in [step1_folder, step2_folder, step3_folder, step4_folder, step5_folder, step6_folder, output_folder]:
        ensure_folder_exists(folder)

    # Step 1: Process input images with ImageMagick
    if not SKIP_STEP_1:
        print("Running Step 1: Processing input images...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step1_folder, "-gravity", "North", "-chop", "0x65%", "-flop", "-flip", "-geometry", "1500x1500!", "-distort", "Polar", "0", "-format", "png", os.path.join(input_folder, "*.png")])

    # Step 2: Remove background
    if not SKIP_STEP_2:
        print("Running Step 2: Removing background...")
        run_command([REMBG_PATH, "p", step1_folder, step2_folder, "-m", "birefnet-general-lite", "-ppm", "-om"])

    # Step 3: Apply dilation using ImageMagick
    if not SKIP_STEP_3:
        print("Running Step 3: Applying dilation...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step3_folder, "-morphology", "Dilate", "Diamond:40", os.path.join(step2_folder, "*.png")])

    # Step 4: Run inpainting
    if not SKIP_STEP_4:
        print("Running Step 4: Running inpainting...")
        run_command([IOPAINT_PATH, "run", "--image", step1_folder, "--model", "lama", "--device=cuda", "--mask", step3_folder, "--output", step4_folder])

    # Step 5: Apply blurring using masks
    if not SKIP_STEP_5:
        print("Running Step 5: Applying blurring...")
        for filename in os.listdir(step4_folder):
            input_path = os.path.join(step4_folder, filename)
            mask_path = os.path.join(step3_folder, filename)
            output_path = os.path.join(step5_folder, filename)
            if os.path.exists(mask_path):
                run_command([MAGICK_PATH, input_path, mask_path, "-alpha", "off", "-compose", "CopyOpacity", "-composite", "-blur", "0x3", output_path])
            else:
                print(f"Mask for {filename} not found, skipping...")

    # Step 6: Apply depolar distortion
    if not SKIP_STEP_6:
        print("Running Step 6: Applying depolar distortion...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step6_folder, "-distort", "Depolar", "0", "-flip", "-flop", "-geometry", "13888x2430!", "-format", "png", os.path.join(step5_folder, "*.png")])

    # Step 7: Composite images onto the original background
    if not SKIP_STEP_7:
        print("Running Step 7: Compositing images...")
        total_files = len(os.listdir(input_folder))
        processed_files = 0
        for filename in os.listdir(input_folder):
            background_path = os.path.join(input_folder, filename)
            overlay_path = os.path.join(step6_folder, filename)
            output_filename = os.path.splitext(filename)[0] + ".jpg"
            output_path = os.path.join(output_folder, output_filename)
            if os.path.exists(overlay_path):
                run_command([MAGICK_PATH, "composite", "-gravity", "south", overlay_path, background_path, output_path])
                processed_files += 1
                percentage = (processed_files / total_files) * 100
                print(f"Processed {processed_files}/{total_files} ({percentage:.2f}%)")
            else:
                print(f"Overlay for {filename} not found, skipping...")

    print("Done! Press any key to close...")
    input()

if __name__ == "__main__":
    process_images()
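As a small aside on step 6 above, the hard-coded "13888x2430!" geometry can be derived from the source image size and the crop percentage instead. The dimensions below are just an example assuming the usual 2:1 equirectangular ratio:

# Derive the "-geometry WxH!" value used in step 6 from the source 360 image size
crop_fraction = 0.35                   # bottom portion kept by the "-chop 0x65%" in step 1
src_width, src_height = 13888, 6944    # example 2:1 equirectangular dimensions

depolar_width = src_width
depolar_height = round(src_height * crop_fraction)  # 6944 * 0.35 = 2430 (rounded)
print(f"-geometry {depolar_width}x{depolar_height}!")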
Yeah, but also scary.
I am not sure mappers really need this kind of stuff.
Exactly my line of thinking. What do we capture? Or, do we just fill in gaps, where depending on the definition a gap may be the major or minor part of an image? Let's say AI advances to a point where we capture just a handful of images over a kilometer of road and AI fills in the rest with thousands of synthetic images. Have we then captured anything at all? What did we capture? And if we did capture, then why did we capture anything in the first place?
Do we need AI for this? Is this a smart use of AI?
I think this might be a better approach for the mapping use case (though something similar already happens during reconstruction but does not get rendered into an image). And, you could use AI for this too. This approach would also produce more truthful images instead of synthetic electric sheep dreams. So, things very much depend on the AI model being used.
@boris Doesn't Mapillary actually already use @seenone's approach before reconstruction by masking non-static objects (pedestrians, cyclists, cars, etc.) through segmentation?
cc: @manuelknott
Amazing work @seenone! I used the Trek View guide too, but you have taken this to the next level with segmentation of the cyclist to mask them out. I am not familiar with birefnet-general-lite. Is it a branch of Meta's Segment Anything?