Mapillary plus audio - OSM features

bob3bob3 · May 9, 2022, 8:55am

For information only. This will interest a narrow audience: those prepared to code, who start from a movie input and probably use mapillary_tools.

Strictly speaking, this is not just about Mapillary imagery but about taking a local copy prior to upload, generating geotagged, linked audio (my own voice) from the same dashcam movie, then using both to add or change OSM features. The audio commentary fills data gaps in the images.

This is just a series of notes rather than a set of scripts etc. I am happy to assist anyone who wishes to implement something similar. This has all been done on Debian Linux.
I run a Mapillary-supplied BlackVue dashcam, eventually uploading my take via mapillary_tools up to 2-3 months later. Waiting that long before doing my own OSM work degrades the result. For some time I have processed and geotagged locally, so that I can scroll through the (1 FPS) images at the end of every day. Unfortunately the camera doesn’t always capture sufficient detail, and scrolling through thousands of images takes time!
I would use Geeqie (image viewer) and a “side” text file (gpscorrelate -m) of image filenames and lat/lon, then copy/paste these directly into the OSM iD editor search box.
Some months ago I started talking into the camera whilst driving, e.g. street addresses, business names, features in rest areas etc. The effort is about a rough cut that removes “no data”, and about being able to find feature adds/changes quickly and accurately. My process is now:
Process/geotag images as before, but also extract a (1 minute) wav file (ffmpeg) from each BlackVue mp4. The wav follows the same file naming standard, so it is indirectly geotagged via the associated image(s).
Move all images and mp4s that show no movement (speed=0) for an entire 1 minute block out of further processing.
Run the wav files through a sox bandpass (sinc) filter to reduce road noise.
Run each wav through sox vad and remove those with no voice on them.
Move the already tagged images in with the matching wav for their 1 minute block.
The directory now contains only 1 minute blocks of audio (wav) and images (jpg) in which there is voice and the vehicle is moving.
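The extraction and filtering steps above could be sketched as follows. This is only an illustration, not the script described in this post: the helper names, the `_bp` output suffix and the 300-3000 Hz pass band are assumptions, and the voice detection (sox vad) step is not shown. The functions only build argument lists, so nothing runs until they are handed to `subprocess.run`.

```python
from pathlib import Path


def ffmpeg_extract_wav(mp4: Path) -> list:
    """Build an ffmpeg command that extracts the audio of one mp4 as 16-bit PCM wav."""
    # Keeping the same base name preserves the link to the geotagged images.
    wav = mp4.with_suffix(".wav")
    # -vn drops the video stream; pcm_s16le is plain 16-bit wav audio.
    return ["ffmpeg", "-i", str(mp4), "-vn", "-acodec", "pcm_s16le", str(wav)]


def sox_bandpass(wav: Path, low_hz: int = 300, high_hz: int = 3000) -> list:
    """Build a sox command applying a sinc band-pass filter (assumed band edges)."""
    # Hypothetical naming: write the filtered audio next to the input.
    filtered = wav.with_name(wav.stem + "_bp.wav")
    # "sinc LOW-HIGH" keeps frequencies between the two values.
    return ["sox", str(wav), str(filtered), "sinc", f"{low_hz}-{high_hz}"]
```

Each list can be executed with `subprocess.run(args, check=True)` once ffmpeg and sox are installed. The pass band would need tuning by ear against the actual road noise.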
All of the above steps are launched by a single bash script on the laptop near the BlackVue. I only need to connect to the BlackVue WiFi and run it. It starts downloading from the camera (curl/wget) and 1-2 hours later the action directory, gpx data and moving mp4s are ready. This data is rsync’d to another laptop for the OSM processing.
I have now set up 3 plugins (toolbar icons) in Geeqie: launch the 1 minute wav file associated with the current image in Audacity, copy the lat/lon to the clipboard, and move the entire 1 minute block of images and wav out of the active/processing directory.
When the wav is played/viewed in Audacity (via the Geeqie icon), the one second tick marks correlate with the image filename, e.g. 20220504_142256_044.jpg will be at the 44 second mark. If there is only one voice peak on the Audacity display I can quickly scroll/roll to it without having to view the intervening images. If the audio track is very dense, as when driving past a long row of shops, I can pause the audio and scroll/roll as needed.
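The filename-to-tick-mark relationship can be expressed in a few lines. This is a minimal sketch, assuming every image name has the form `<block>_<seconds>.jpg` as in the example above; the function name is hypothetical.

```python
from pathlib import Path


def split_image_name(image_name: str) -> tuple:
    """Split an image filename into its 1 minute block id and second offset."""
    stem = Path(image_name).stem            # drop the .jpg extension
    block, offset = stem.rsplit("_", 1)     # the last field is the second within the block
    return block, int(offset)
```

For `20220504_142256_044.jpg` this returns the block id `20220504_142256` and offset `44`, i.e. the position to look at on the Audacity timeline.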
Clicking on the lat/lon icon puts the current image position onto the clipboard, which then gets pasted into the iD editor search box.
I suspect that I could save a further few seconds by URL-launching “remote” in the browser, but I have not done that yet!
When I have completed the feature in iD I click on the Geeqie “move” icon and the entire 1 minute image and wav file set vanishes!
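A “move” action of this kind could look roughly like the sketch below. It is not the actual Geeqie plugin: it assumes that every file in a 1 minute block (images and wav) starts with the same block id, and the function and directory names are hypothetical.

```python
import shutil
from pathlib import Path


def move_block(image: Path, done_dir: Path) -> list:
    """Move every file sharing the image's 1 minute block id into done_dir."""
    # Assumed naming: 20220504_142256_044.jpg belongs to block 20220504_142256.
    block = image.stem.rsplit("_", 1)[0]
    done_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(image.parent.glob(block + "*")):
        if path.is_file():
            target = done_dir / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

Called with the path of the image currently on screen, it clears that block's jpg and wav files from the active directory and leaves all other blocks untouched.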
