I recently experimented with using Generative AI to cover the nadir with a realistic representation of the ground underneath the camera in my 360 images. The process of editing myself out is quite time consuming, so this is mostly worth it when capturing a small area. This was a fun experiment to see how well generative AI handles imaginative infill on images from a top-down perspective. It turns out that at least Firefly from Adobe works pretty well. I also tested Stable Diffusion XL through Clipdrop and ran several Stable Diffusion models locally on my Apple Silicon Mac Studio, which did not turn out so well.
I am still thinking whether I will give this a like.
Impressive results!
I have used Resynthesizer under GIMP for a while to fill missing parts of the sky or ground when doing panoramic stitching. That's nice from an aesthetic point of view. Indeed, I would be more careful when using it for Mapillary, in the sense that it is creating non-existing data. It may produce artifacts for people wanting to assess road damage or count steps in stairs.
Valid point most definitely @Eric_S.
The semi-transparent glass effect is becoming a more popular design element in the world of VR and AR. One modification to the approach used in this post is using Generative AI to remove the photographer and then add a circular "glass" logo to the nadir point that slightly blurs the generated area. This would create a nice "aesthetic" effect and obfuscate the generated area enough to not cause any confusion in the data.
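To illustrate the idea, here is a minimal Pillow sketch of how such a "glass" disc could be composited onto the polar-projected nadir crop. The file names, disc radius and opacity are placeholder assumptions rather than a tested workflow:

from PIL import Image, ImageDraw, ImageFilter

# Polar-projected (top-down) nadir crop, assumed square, e.g. 1500x1500 (placeholder file names)
nadir = Image.open("nadir_polar.png").convert("RGBA")
logo = Image.open("logo.png").convert("RGBA")

size = nadir.size[0]
center = size // 2
radius = int(size * 0.25)  # assumed size of the "glass" disc

# Circular mask covering the disc area at the nadir point
disc = Image.new("L", nadir.size, 0)
ImageDraw.Draw(disc).ellipse(
    (center - radius, center - radius, center + radius, center + radius), fill=255
)

# Blur only the area under the disc to obfuscate the generated pixels
blurred = nadir.filter(ImageFilter.GaussianBlur(8))
nadir.paste(blurred, (0, 0), disc)

# Drop a roughly 50% opaque logo on top for the "glass" look
logo = logo.resize((radius * 2, radius * 2))
logo.putalpha(logo.getchannel("A").point(lambda a: a // 2))
nadir.alpha_composite(logo, (center - radius, center - radius))
nadir.save("nadir_with_glass_logo.png")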
Another approach I would like to explore is how to interpolate the ground underneath the photographer based on pictures taken just a few frames before and after, where the same ground is visible. If anyone knows of a technique or paper exploring this process, I am all ears.
@Eric_S what tools and camera do you use for panoramic stitching, and do you have an automated process for batch stitching images?
For spherical panos, I was using a Nodal Ninja head with a DSLR. Manual stitching with Hugin.
About the ground, if it is more or less flat, you should be able to perform a reprojection using neighbouring pictures. With Hugin, you will need to use the translation options. It could be complex, as it would need an intermediate reprojection.
I presumed you did it like that.
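For anyone who wants to experiment with that reprojection idea outside Hugin, here is a rough OpenCV sketch that assumes the ground is roughly planar and estimates a homography between two consecutive top-down nadir crops. The file names are placeholders and this is not a tested pipeline:

import cv2
import numpy as np

# Two polar-projected nadir crops from consecutive frames (placeholder file names)
curr = cv2.imread("nadir_current.png")   # frame with the photographer in it
prev = cv2.imread("nadir_previous.png")  # neighbour frame where that ground is visible

# Match features between the two crops
orb = cv2.ORB_create(4000)
g_prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
g_curr = cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY)
k1, d1 = orb.detectAndCompute(g_prev, None)
k2, d2 = orb.detectAndCompute(g_curr, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]

src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# A homography is only a reasonable model if the ground is more or less flat
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
warped = cv2.warpPerspective(prev, H, (curr.shape[1], curr.shape[0]))

# 'warped' could then be pasted under a mask of the photographer instead of inpainting
cv2.imwrite("nadir_reprojected.png", warped)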
I have been working on an automated workflow to cover up the nadir. It is not perfect but very close. It uses rembg birefnet-general-lite to create a mask of the subject (me, the cyclist) and iopaint lama to "erase" the subject. What I haven't solved yet is whether it is possible to also automatically remove the shadows.
I used this guide for the polar/depolar side of things: How to Add a Custom Nadir to a 360 Photo Programmatically (using ImageMagick) | Trek View
Sample images:
1. Crop the image so we get the bottom 35% and apply polar distortion so we can get a top-down view of the subject.
2. Generate a mask of the subject so that the inpaint model knows where to operate.
3. Expand the mask area, as the inpaint model works better when the surrounding pixels are also selected.
4. Run the inpaint model.
5. Mask out the background and apply blur to cover up imperfections and, as suggested by others above, to reduce confusion with non-existent detail.
6. Apply depolar distortion to prepare for re-combining back into the image.
7. Overlay the blurry, inpainted result back onto the main image.
Here is what my script looks like:
You will need ImageMagick, iopaint and rembg installed; in my case I installed iopaint and rembg in a Python virtual environment. Lastly, at step 6, set the output size to the width of the 360 image and to its height multiplied by the percentage of the cropped section.
import os
import subprocess
# Configurable variables
MAGICK_PATH = "magick"
REMBG_PATH = r"C:\Users\m\Pictures\Nadir fix\.venv\Scripts\rembg"
IOPAINT_PATH = r"C:\Users\m\Pictures\Nadir fix\.venv\Scripts\iopaint"
SKIP_STEP_1 = False
SKIP_STEP_2 = False
SKIP_STEP_3 = False
SKIP_STEP_4 = False
SKIP_STEP_5 = False
SKIP_STEP_6 = False
SKIP_STEP_7 = False
def ensure_folder_exists(folder):
    os.makedirs(folder, exist_ok=True)

def run_command(command):
    subprocess.run(command, check=True)

def process_images():
    # Define folders
    input_folder = "in"
    step1_folder = "1"
    step2_folder = "2"
    step3_folder = "3"
    step4_folder = "4"
    step5_folder = "5"
    step6_folder = "6"
    output_folder = "out"

    # Ensure required folders exist
    for folder in [step1_folder, step2_folder, step3_folder, step4_folder, step5_folder, step6_folder, output_folder]:
        ensure_folder_exists(folder)

    # Step 1: Process input images with ImageMagick
    if not SKIP_STEP_1:
        print("Running Step 1: Processing input images...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step1_folder, "-gravity", "North", "-chop", "0x65%", "-flop", "-flip", "-geometry", "1500x1500!", "-distort", "Polar", "0", "-format", "png", os.path.join(input_folder, "*.png")])

    # Step 2: Remove background
    if not SKIP_STEP_2:
        print("Running Step 2: Removing background...")
        run_command([REMBG_PATH, "p", step1_folder, step2_folder, "-m", "birefnet-general-lite", "-ppm", "-om"])

    # Step 3: Apply dilation using ImageMagick
    if not SKIP_STEP_3:
        print("Running Step 3: Applying dilation...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step3_folder, "-morphology", "Dilate", "Diamond:40", os.path.join(step2_folder, "*.png")])

    # Step 4: Run inpainting
    if not SKIP_STEP_4:
        print("Running Step 4: Running inpainting...")
        run_command([IOPAINT_PATH, "run", "--image", step1_folder, "--model", "lama", "--device=cuda", "--mask", step3_folder, "--output", step4_folder])

    # Step 5: Apply blurring using masks
    if not SKIP_STEP_5:
        print("Running Step 5: Applying blurring...")
        for filename in os.listdir(step4_folder):
            input_path = os.path.join(step4_folder, filename)
            mask_path = os.path.join(step3_folder, filename)
            output_path = os.path.join(step5_folder, filename)
            if os.path.exists(mask_path):
                run_command([MAGICK_PATH, input_path, mask_path, "-alpha", "off", "-compose", "CopyOpacity", "-composite", "-blur", "0x3", output_path])
            else:
                print(f"Mask for {filename} not found, skipping...")

    # Step 6: Apply depolar distortion
    if not SKIP_STEP_6:
        print("Running Step 6: Applying depolar distortion...")
        run_command([MAGICK_PATH, "mogrify", "-monitor", "-path", step6_folder, "-distort", "Depolar", "0", "-flip", "-flop", "-geometry", "13888x2430!", "-format", "png", os.path.join(step5_folder, "*.png")])

    # Step 7: Composite images onto the original background
    if not SKIP_STEP_7:
        print("Running Step 7: Compositing images...")
        total_files = len(os.listdir(input_folder))
        processed_files = 0
        for filename in os.listdir(input_folder):
            background_path = os.path.join(input_folder, filename)
            overlay_path = os.path.join(step6_folder, filename)
            output_filename = os.path.splitext(filename)[0] + ".jpg"
            output_path = os.path.join(output_folder, output_filename)
            if os.path.exists(overlay_path):
                run_command([MAGICK_PATH, "composite", "-gravity", "south", overlay_path, background_path, output_path])
                processed_files += 1
                percentage = (processed_files / total_files) * 100
                print(f"Processed {processed_files}/{total_files} ({percentage:.2f}%)")
            else:
                print(f"Overlay for {filename} not found, skipping...")

    print("Done! Press any key to close...")
    input()

if __name__ == "__main__":
    process_images()
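As a small aside on step 6 above, the hard-coded "13888x2430!" geometry can be derived from the source image size and the crop percentage instead. The dimensions below are just an example assuming the usual 2:1 equirectangular ratio:

# Derive the "-geometry WxH!" value used in step 6 from the source 360 image size
crop_fraction = 0.35                   # bottom portion kept by the "-chop 0x65%" in step 1
src_width, src_height = 13888, 6944    # example 2:1 equirectangular dimensions

depolar_width = src_width
depolar_height = round(src_height * crop_fraction)  # 6944 * 0.35 = 2430 (rounded)
print(f"-geometry {depolar_width}x{depolar_height}!")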
Yeah, but also scary.
I am not sure mappers really need this kind of stuff.
Exactly my line of thinking. What do we capture? Or, do we just fill in gaps, where depending on the definition a gap may be the major or minor part of an image? Let's say AI advances to a point where we capture just a handful of images over a kilometer of road and AI fills in the rest with thousands of synthetic images. Have we then captured anything at all? What did we capture? And if we did capture, then why did we capture anything in the first place?
Do we need AI for this? Is this a smart use of AI?
I think this might be a better approach for the mapping use case (though something similar already happens during reconstruction but does not get rendered into an image). And, you could use AI for this too. This approach would also produce more truthful images instead of synthetic electric sheep dreams. So, things very much depend on the AI model being used.
@boris Doesn't Mapillary actually already use @seenone's approach before reconstruction by masking non-static objects (pedestrians, cyclists, cars, etc.) through segmentation?
cc: @manuelknott
Amazing work @seenone! I used the Trek View guide too, but you have taken this to the next level with segmentation of the cyclist to mask them out. I am not familiar with birefnet-general-lite. Is it a branch of Meta's Segment Anything?