Mapillary_tools: Can't upload with it, can't upload without it

Why can’t I upload without it?

Glad you asked.

  • My main camera is a Galaxy S7 which typically takes a bit over 1 picture per second. But the Mapillary app does not refresh the GPS position that often. So over 10% of the pictures have duplicate positions. (this may actually be hard to fix in the app)

  • The Mapillary app also sets the direction flag incorrectly: no matter how many turns on the road the pictures all point in the same direction! That would normally be easy to fix after the upload by “normalizing” the sequences one by one. But because of the duplicate positions this results in some pictures pointing backwards :frowning:

  • Then I also use a Mi Sphere 360 camera. This camera could get the GPS coordinates from the phone but I found that this interferes with the Mapillary app and results in frequent crashes. Plus it drains the 360 camera battery because it has to keep the wifi link up. So the 360 pictures get no GPS coordinates and must be geotagged after the fact.

  • Then this camera takes one picture every 3 to 4 seconds, including when I’m stopped at a traffic light. Clearly some deduplication is needed.

  • The 360 camera also dumps all the pictures taken in a day in the same folder. These then have to be matched with the Mapillary app GPX files. For instance on 2018/12/22 it took 5042 pictures corresponding to 23 Mapillary sequences. Splitting the folder by hand would be doable, but automating it would be much nicer.
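To illustrate the folder-splitting step, here is a minimal Python sketch. It assumes you have already read each photo's capture time and each GPX file's first and last timestamp, all expressed in the same timezone. The function name, the tuple layout and the 5 second slack are made up for the example; none of this is part of mapillary_tools.

```python
from bisect import bisect_right
from datetime import timedelta

def assign_to_sequences(photo_times, sequences, slack=timedelta(seconds=5)):
    """Group photos by the GPX track whose time window contains them.

    photo_times: dict of photo name -> capture time (datetime)
    sequences:   list of (name, start, end) tuples, one per GPX file
    Returns a dict of sequence name -> list of photo names in time order.
    Photos that fall outside every track are listed under the key None.
    """
    ordered = sorted(sequences, key=lambda s: s[1])
    starts = [s[1] - slack for s in ordered]
    result = {}
    for photo, when in sorted(photo_times.items(), key=lambda kv: kv[1]):
        # Index of the last track starting at or before this photo.
        i = bisect_right(starts, when) - 1
        name = None
        if i >= 0 and when <= ordered[i][2] + slack:
            name = ordered[i][0]
        result.setdefault(name, []).append(photo)
    return result
```

The photos listed under None are the ones taken between sequences, which are worth a manual look before deleting them.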

So why can’t I upload with mapillary_tools?

After all the advertisement said all I had to do is ‘mapillary_tools process_and_upload’ and all would be magically solved, right?

Well… no. Not even close.

  • First mapillary_tools will not run without some exotic Python modules like a custom Piexif library and the pymp4 library. The latter is totally useless for anyone looking to simply upload their Mapillary App photos but must absolutely be installed before mapillary_tools will even start. Sure you can totally bypass your Linux distribution packaging system to get those through Pip but if you want to keep your system clean you’re out of luck.
    Fix (not applied yet)

  • Then, whether it is to fix the GS7 image positions or supply them altogether for the 360 camera I need to pass the --geotag_source gpx --geotag_source_path XXX.gpx options.

  • Those require adding the --advanced option. Oh well, that simple ‘process_and_upload’ command was a nice dream.

  • But wait, you cannot even run the process command without supplying your --user_name! Why you would need to provide your mapillary account just to process some local images is beyond me.

  • Next the --geotag* options rely on the gpxpy.py Python library which, until the start of 2019, was losing the second fractions in all of the GPX file timestamps. So gpxpy would turn 15:20:10.950 into 15:20:10.000 and would then correlate that against the image timestamps.
    Fix

  • At the end of the GPX file you would often get two GPS points taken in the same second: one for the last image and one when the GPX file is closed; for instance 15:20:10.102 and 15:20:10.950. Because of the previous gpxpy.py bug those would both get truncated to 15:20:10.000 and mapillary_tools would then dutifully divide by zero!
    Fix (not applied yet)

  • As I said previously the gpxpy.py timestamp truncation bug has been fixed. But at the same time the gpxpy developers added better support for timezones so that all the timestamps they return (datetime objects) are now tagged with the corresponding timezone. But none of mapillary_tools’ own timestamps are tagged with timezone information which makes them incompatible: you cannot compare times if you don’t even know if they are in the same timezone.
    Let me spell out the consequence: AS IS, THE --geotag* OPTIONS WILL EITHER MANGLE YOUR IMAGE POSITIONS OR CRASH.
    Fix (not applied yet)

  • Time zones… If you used --geotag* you probably had this warning at some point:
    Your local timezone is XXX. If not, the geotags will be wrong.

    Sure, if you took the images with the Mapillary app their filename ends with something like ‘+0200’ which clearly indicates the UTC offset. The parent directory also ends with ‘+0200’, and so does the GPX filename. But can the developers who wrote mapillary_tools take advantage of all the information provided by the Mapillary app? No. Apparently that’s too much to ask.
    Fix (not applied yet)

  • It’s not obvious from the mapillary_tools --help usage, but in addition to the positions the --geotag* option also sets the direction. So what is that --interpolate_direction option for you may ask. Presumably it’s there so you can set the direction field when you don’t want to re-geotag your images. What if you want to geotag your images but not override the direction field? Well, you can’t.
    As far as I can tell --geotag_source exif does the same thing as --interpolate_direction except it creates an extra GPX file. So I would remove the --interpolate_direction option entirely. And maybe a --geotag_keep_direction option should be added in case one wants to adjust the image positions but not their direction.

  • In any case combining --geotag* with --interpolate_direction will likely result in precision loss. Here’s a concrete example. The Mapillary app samples the GPS position about once per second but my 360 camera takes a photo only once every 4 seconds. --geotag* will take the two closest GPS points to determine where the camera was pointing whereas --interpolate_direction will instead use two consecutive image positions. But 4 seconds is more than enough time to complete a right turn. So where --geotag* knows a photo taken 1 second before a right turn points straight ahead, --interpolate_direction will think it points at the next photo, somewhere to the right.

  • Then there’s the issue of figuring out the direction when you’re stopped, at a traffic light for instance (remember, my 360 camera takes photos every 4 seconds, no matter what).
    If your GPS is well behaved you will have multiple images with the exact same GPS coordinates. In that case mapillary_tools will typically turn the direction 180 degrees around. And if your GPS wanders around a tiny bit then you will simply get a random direction.
    mapillary_tools should recognize that you’re stopped and thus preserve the last known direction of travel.
    And yes, that issue happens with both --geotag* and --interpolate_direction.
    Fix: None yet

  • While we’re at it, considering that the vehicle travels in a straight line from one GPX point to the next and then instantly turns is a really primitive approximation. I really expected more from a company that deals with geospatial information all day. One could imagine fitting splines through the GPX points or even using the famous Kalman filter to get better results for both the position and direction interpolation.

  • So then you want to eliminate duplicates such as all those photos taken at the traffic light. mapillary_tools does that automatically. But by default it keeps images that have the same position if their angle differs by more than 5 degrees. Remember how I said before that when you’re stopped the direction is essentially random? Yep, that breaks deduplication.
    One workaround is to add --duplicate_angle 360 and give up on the images you took in tight slow turns.
    An alternative is to use the patch below which ignores direction changes for images that are less than 0.5 meters apart. But the real fix would be to correct the direction calculation in the first place.
    Workaround from this pull request (not applied yet)

  • Then I want to review the duplicates before uploading the images. Both because, as the issues above demonstrate, mapillary_tools cannot be trusted; and because when stopped at an intersection I may prefer to keep another image where I don’t have a bus obstructing the view for instance. So I use the --move_duplicates option. Ideally it would make symbolic links for the previous and next images so I can compare them to the duplicates but unfortunately it does not. mapillary_tools also does not set the EXIF GPS coordinates on the duplicates so one cannot do a mock upload to see the images in context. Oh well.

  • After that review it may be necessary to re-run process, for instance if you decided to keep a duplicate image. You may think it’s just a matter of adding --rerun. But no. By default mapillary_tools recurses into subfolders, including the duplicates folder it just created! So not only is it likely to upload the duplicates (something I have not tested yet), but now it’s deduplicating the duplicates and creating a duplicates/duplicates folder! Yuck.
    Fix from this pull request (not applied yet)

  • So add --skip_subfolders. Problem solved. Yep so solved that now mapillary_tools crashes because mapillary_tools uses nonexistent variables.
    Fix from this pull request: https://github.com/mapillary/mapillary_tools/pull/335 (not applied yet)

  • So you’ve run process already but are re-running it for good measure. The good thing is that mapillary_tools detects that it has processed all the images and skips them all. Good, that takes less time. But it leads to this error message:
    Error, capture times could not be estimated to sub second precision, images can not be geotagged.
    While you puzzle over the reason for this error, just enjoy the fact that it’s not fatal and will not prevent you from uploading your images.
    (btw, in this case you can safely ignore this error)
    Fix (not applied yet)

  • I thought --save_local_mapping might provide interesting data. But I gave up on it because it too crashes mapillary_tools (and then seems to do… nothing?). How many other options crash mapillary_tools?
    Fix (not applied yet)

  • Finally, what’s with these ugly underscores in the option names? Why --skip_subfolders instead of --skip-subfolders like every other application? Did the developer think he had to use underscores to get valid Python variable names?
    Fix (not applied yet)
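To make the deduplication item above concrete, here is a minimal Python sketch of the rule it describes: an image is a duplicate if it is too close to the last kept image, unless it has turned enough, and turns are ignored entirely when the two images are less than 0.5 meters apart. This is an illustration under my own assumptions, not the code from the pull request: the function names and the tuple layout are made up, and min_distance and min_angle merely play the role of --duplicate_distance and --duplicate_angle.

```python
import math

def distance_m(a, b):
    """Great-circle (haversine) distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def find_duplicates(images, min_distance, min_angle, stopped_distance=0.5):
    """images: list of (name, lat, lon, bearing) in capture order.

    Returns the names judged to duplicate the last image that was kept.
    """
    duplicates = []
    kept = None
    for name, lat, lon, bearing in images:
        if kept is not None:
            dist = distance_m((kept[1], kept[2]), (lat, lon))
            turned = angle_diff(kept[3], bearing) > min_angle
            # Below stopped_distance the bearing is mostly noise,
            # so a "turn" does not rescue the image.
            if dist < min_distance and (dist < stopped_distance or not turned):
                duplicates.append(name)
                continue
        kept = (name, lat, lon, bearing)
    return duplicates
```

With this rule a photo taken at a standstill with a random bearing is still flagged, while a photo taken a couple of meters further into a tight turn is kept.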

So after applying several mapillary_tools patches we can get something mostly usable with the command below:

./bin/mapillary_tools process --advanced --rerun \
    --user_name MYACCOUNT \
    --import_path 2018_10_24_11_12_48_107_+0200 \
    --skip_subfolders --geotag_source gpx \
    --geotag_source_path 2018_10_24_11_12_48_107_+0200/LEGACY_CAM_0_2018_10_24_11_12_48_133_+0200.gpx \
    --offset_angle 90 --overwrite_EXIF_gps_tag \
    --overwrite_EXIF_direction_tag \
    --duplicate_distance 4 --duplicate_angle 20 --move_duplicates

Some simple process_and_upload command indeed.

If you’re interested in the patches I mentioned, you can get those, and a few more, from the fgouget branch of:

Have fun!


Haha, welcome to the geeky world of GPS and JPEG photos! I also struggled with the python utilities on Windows - and I am a GIS consultant and Python instructor. I agree that the tools are hard to use and are dated, but the extra modules are necessary because there is no standard module to read EXIF tags. Thanks for your fork for the edits for all of us.

My solution was to get a Bluetooth GPS (Garmin Glo) that handled two satellite systems and also SBAS corrections so that the post processing was unnecessary. Sure there are survey grade receivers that would do an even better job but they cost thousands of dollars rather than hundreds.

If you want to avoid all the post processing, just upload direct from your phone when you have a wifi connection. The odd thing is that the images get an EXIF tag with the lat/long and a timestamp as they are taken. You don’t need to do anything. There is another file of gpx coordinates that is separate from the photos. If something overloads (such as the phone overheating) then you still have a track that could be used for post processing, or perhaps tagging an action camera taking photos out the back at the same time.

Timezones are a problem for everyone. The timestamp with an offset is really a hybrid UTC time, and it does not define the time zone. Unfortunately the timestamp inside the EXIF is a local time and does not have a timezone at all! That is a problem with the standard EXIF specification. The best timestamp to use is the image filename, not the internal EXIF time. But if you do use it, it will have to be reformatted into UTC time without an offset. (I know there are sub-second EXIF extensions, if only they were populated and used)

The GPX file contains pure UTC times so you will have to convert the local+offset times to compare.
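As an illustration of that conversion, here is a small Python 3 sketch. It assumes the Mapillary app names follow the pattern seen earlier in this thread (year, month, day, hour, minute, second, milliseconds, UTC offset, separated by underscores). The two helper names are made up. The second helper shows one way to make naive GPX timestamps comparable with timezone-aware ones, on the assumption that the naive ones really are UTC.

```python
from datetime import datetime, timezone

def parse_app_timestamp(stamp):
    """Parse e.g. '2018_10_24_11_12_48_107_+0200' into an aware UTC datetime.

    %f reads the milliseconds as a fraction of a second and %z reads the
    UTC offset, so no precision or offset information is thrown away.
    """
    local = datetime.strptime(stamp, "%Y_%m_%d_%H_%M_%S_%f_%z")
    return local.astimezone(timezone.utc)

def as_utc(when):
    """Return an aware UTC datetime.

    A naive datetime is ASSUMED to already be in UTC (true for GPX times,
    not for EXIF times); an aware one is converted.
    """
    if when.tzinfo is None:
        return when.replace(tzinfo=timezone.utc)
    return when.astimezone(timezone.utc)
```

Once both sides go through as_utc() they can be compared or subtracted without Python raising a TypeError about mixing naive and aware datetimes.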

This week my times will have a +1300 added, but next week after daylight saving ends they will get a +1200. So which timezone am I in? It would have to be found from the lat/long and a map of zones.
In my case I can do a hack because I am always in the same zone, but it needs a general solution for worldwide photos. I notice that the times displayed on my photos in the explorer app are wrong!

The EXIF time does not store sub-seconds. That is clearly a problem, we are trying to get 1 - 5 m accuracy, yet the car is moving at 20 metres/second…

The next issue is that the GPX file sample points are not the same as the photo times or points. Yes, this requires a more sophisticated interpolation than matching by seconds. You take the GPX track, turn it into a linear referencing dataset (a Route) in seconds from midnight and then find the location of Events (photos) down to fractions of a second. This can be done easily enough in QGIS or ArcMap or FME.
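For those without QGIS, ArcMap or FME, the same idea can be sketched in a few lines of Python. This is only an illustration: the function name and the tuple layout are made up, and interpolating latitude and longitude linearly is an approximation that is reasonable only because consecutive GPX points are a few meters apart.

```python
from bisect import bisect_left

def interpolate_position(track, when):
    """Locate a photo along a GPX track by time.

    track: list of (time, lat, lon) sorted by time, sub-second precision kept
    when:  capture time of the photo, comparable with the track times
    Returns the (lat, lon) interpolated linearly between the two fixes
    that bracket `when`, or the nearest end of the track if outside it.
    """
    times = [p[0] for p in track]
    if when <= times[0]:
        return track[0][1:]
    if when >= times[-1]:
        return track[-1][1:]
    # First fix at or after `when`; the fix before it is strictly earlier.
    i = bisect_left(times, when)
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    # t0 < when <= t1, so the span is strictly positive even if the
    # track contains two fixes with the same timestamp.
    f = (when - t0).total_seconds() / (t1 - t0).total_seconds()
    return (lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0))
```

Picking the bracketing fixes with bisect_left also sidesteps the division by zero mentioned earlier in the thread, since the segment used always has a non-zero duration.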


This map has a route with hatches indicating seconds from local midnight and the times of each photo interpolated by time along the track.

I used to worry about duplicate photos while stopped at traffic lights, but a blog post says that Mapillary weeds out identical photos at sequential times, so you could remove them, but it’s no longer necessary.

GPX points are already adjusted by the receiver when creating a track. Have you noticed how the points are always in order? With the random errors every second (no time for averaging) you would not expect such an ordered set of points. Anyway the GPS signal only updates every second so the receiver has to interpolate and also do a running fix to create a useful track. This is not required of Mapillary, the smoothed track is there in the GPX file already. I did not correct the view bearing initially, it can be done online, but I did work out my own way of “running backwards” down the sequence with a cursor so that I can populate the direction, while copying the direction if there is a duplicate point.
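A rough Python sketch of that backward pass, for illustration only. It is not the cursor-based implementation described above; the function names are made up, and "duplicate" here means exactly equal coordinates.

```python
import math

def initial_bearing(a, b):
    """Initial great-circle bearing in degrees from point a to point b, each (lat, lon)."""
    lat1, lat2 = math.radians(a[0]), math.radians(b[0])
    dlon = math.radians(b[1] - a[1])
    x = math.sin(dlon) * math.cos(lat2)
    y = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360

def fill_bearings(points):
    """points: list of (lat, lon) in capture order. Returns one bearing per point.

    Walks the sequence backwards: each point looks at the next one, and a
    point identical to its successor copies the successor's bearing instead
    of computing a meaningless one. Points with nothing ahead of them (the
    last point and any trailing duplicates) reuse the previous bearing.
    """
    bearings = [None] * len(points)
    for i in range(len(points) - 2, -1, -1):
        if points[i] == points[i + 1]:
            bearings[i] = bearings[i + 1]
        else:
            bearings[i] = initial_bearing(points[i], points[i + 1])
    for i in range(1, len(points)):
        if bearings[i] is None:
            bearings[i] = bearings[i - 1]
    return bearings
```

A GPS that wanders by a few centimeters while stopped would defeat the exact equality test, so in practice a small distance threshold is probably needed instead.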

I have no problem with underscores in variable names. But what about a nice GUI in Qt5 so we can avoid the command line completely? Then we can have a dialog to guide the new user through the options and avoid conflicting switches.

While you are forking the tools, what about a Python 3 version?

I personally am more comfortable with commercial tools so I whipped up my own post processing in Safe Software’s FME in a few minutes because it can read and write the formats natively. I then use the recently released laptop based uploader to release my phone for more travelling.


I have recently been looking into doing the first processing steps (BlackVue mp4’s) by other methods and using the tools only for the later steps, i.e. I write all the GPS etc. info into the jpeg EXIF, then let the tools write the EXIF description and upload.

These are my steps:

  • wget both the mp4 and gps (nmea) files from the device
  • ffmpeg extract jpegs at 2FPS
  • touch the first jpg file per mp4 with the first date/time found in the gps file.
  • relative touch the rest of the jpg files at 0.5 sec interval
  • Using jhead create the EXIF date/time stamp from each of the previously touched files. Also add a default thumbnail (as the tools give an error if it’s left empty). Preceding the jhead command with TZ=utc makes the stamp UTC time
  • gpscorrelate each mp4/jpg sequence against a nmea-gpx converted track
  • Parse the gpx file for speed and direction, adding those to the EXIF via exiv2.
  • Move all the day’s jpgs to one directory (typically 8,000-20,000 files)
  • Run a script (exiv2) that moves those images below a cutoff speed (15kph) away from the upload directory. I also allow a 3-4 frame ramp down of speed to cover stopping at an intersection
  • Then run regular tools.
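The speed cutoff step could look roughly like the Python sketch below. It is only one reading of the description above: the function name is made up, the speeds are assumed to have been read from the EXIF already, and the ramp is interpreted as "keep the first few slow frames after the vehicle was above the cutoff".

```python
def select_for_upload(frames, cutoff_kph=15.0, ramp_frames=3):
    """Split frames into those to upload and those to hold back.

    frames: list of (name, speed_kph) in capture order
    Frames at or above the cutoff are kept. After a fast frame, the first
    `ramp_frames` slow frames are kept too, so the approach to an
    intersection is covered. Slow frames at the very start (car park,
    driveway) are never kept.
    """
    keep, hold_back = [], []
    slow_run = ramp_frames  # pretend the ramp is already used up at the start
    for name, speed in frames:
        if speed >= cutoff_kph:
            slow_run = 0
            keep.append(name)
        else:
            slow_run += 1
            if slow_run <= ramp_frames:
                keep.append(name)
            else:
                hold_back.append(name)
    return keep, hold_back
```

The held back names can then be moved out of the upload directory with whatever file tool you prefer.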

Early attempts saw GPS time granularity issues rounding to one second, losing half to the standard duplication code. The last few (only just uploaded) seem to have worked okay, but I have more checking to do. (bob3bob3 29-31 March 2019)

Why did I do it? I prefer to leave the camera running for my Australia wide travels and know that below a certain speed, images will never upload. Very handy to exclude car parks and private places. Duplicate distance in the tools is not useful for me. I disable such detection. It’s also handy to have (almost) full GPS data that can be derived from the nmea track for other reasons.

My next project is to do this processing as I drive, saving battery energy later on. (I have no mains power, only vehicle/motor and solar sources)

I don’t think it makes sense to require every Mapillary contributor to buy extra hardware. Ideally Mapillary would fix their Android app making any post-processing unnecessary. But since the source code is not available there is nothing we can do about that. So the next best thing is for mapillary_tools to clean up the mess.

The result would not be up to my standards for all the reasons I stated. Plus I obviously cannot upload my 360 images from the phone.

True. I was a bit loose with the terminology. Fortunately all we really care about is the UTC offset.

Why? I’m not convinced the subsecond information is going to be any better since it seems like it’s going to be further away from when the picture was taken.

Not necessarily. See the gpxpy code.

Android uses them which means the main use-case is covered at least.

That depends on the GPS device, i.e. on the smartphone. On my GS7 it’s not too bad when driving but is more erratic when walking. Still, even when driving there are some glitches that mapillary_tools does not detect. For instance in this GPS trace generated by the Mapillary app:

218 14:07:03.320  t +1.008 s  d  7.7 m  27.4 km/h  a  0.0 g  b 189 Δ  2   2°/s
219 14:07:04.061  t +0.741 s  d  0.0 m   0.0 km/h  a  1.0 g  b   0 Δ171 231°/s
220 14:07:04.305  t +0.244 s  d  7.4 m 108.9 km/h  a 12.6 g  b 187 Δ173 709°/s
221 14:07:05.319  t +1.014 s  d  7.6 m  26.8 km/h  a  2.3 g  b 188 Δ  1   1°/s
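A check for this kind of glitch can be sketched in Python as follows. It takes the t and d columns of the trace above (seconds and meters since the previous fix), recomputes the speed, and flags any fix that implies an acceleration no car can achieve. The function name and the 1 g limit are assumptions made for the example.

```python
G = 9.80665  # standard gravity in m/s^2

def flag_glitches(fixes, max_accel_g=1.0):
    """fixes: list of (index, seconds_since_previous, metres_since_previous).

    Returns the indices of the fixes whose speed change relative to the
    previous fix implies an acceleration above max_accel_g. The first fix
    is never flagged because there is no earlier speed to compare with.
    """
    flagged = []
    previous_speed = None
    for index, dt, dist in fixes:
        speed = dist / dt  # m/s over the segment ending at this fix
        if previous_speed is not None:
            accel_g = abs(speed - previous_speed) / dt / G
            if accel_g > max_accel_g:
                flagged.append(index)
        previous_speed = speed
    return flagged

trace = [(218, 1.008, 7.7), (219, 0.741, 0.0), (220, 0.244, 7.4), (221, 1.014, 7.6)]
print(flag_glitches(trace))
```

On this trace the sketch flags fixes 219, 220 and 221. Note that 221 is only flagged because it is compared against the bogus speed of 220, so a real implementation would want to drop the bad fix and re-evaluate its neighbors.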

Not variable names. Option names.

I’d rather keep the diff reasonably small to minimize the work to keep it up to date. And of course I’m still hoping Mapillary will integrate some of my patches. When that happens maybe I’ll look into further modifications.
