Mapillary_tools 0.8: no status kept?

I used older mapillary_tools versions a lot. After a break, I updated to 0.8.2.

In older versions, if a processing and upload run was interrupted or failed, it could be re-run on the same directory, and it resumed more or less from where it had left off.
With 0.8, this seems to be broken: running mapillary_tools against the same directory goes through the “Processing”, “Test EXIF writing”, and “Uploading” phases all over again.

Is this indeed correct? Do the tools no longer keep status info, and instead do everything over and over again?

My understanding is that the “status” is now a server-side database, e.g. you shouldn’t be able to upload the same sequence twice. I am not sure what happens with the local JSON file now, as I only upload BlackVue MP4s.

However, I don’t trust the tools to restart a directory with 6 gazillion MP4s in it, so my command line generates a verbose log, which is then parsed for successful uploads, and the uploaded files are renamed. This is handy because right now I reliably see the tools crash every 20-30 MP4s. (My uploads are done by a 3rd party.)
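
A minimal sketch of the log-parse-and-rename approach described above. The log line format (`Uploaded: <path>`) and the `.uploaded` suffix are hypothetical, chosen only for illustration; the real verbose output of mapillary_tools will look different, so the regular expression has to be adapted to it.

```python
import re
from pathlib import Path

# Hypothetical log line format; adjust to match the actual verbose output.
UPLOADED_RE = re.compile(r"^Uploaded: (?P<path>.+\.mp4)$", re.IGNORECASE)


def uploaded_paths(log_text):
    """Return the video paths that the log reports as successfully uploaded."""
    paths = []
    for line in log_text.splitlines():
        match = UPLOADED_RE.match(line.strip())
        if match:
            paths.append(Path(match.group("path")))
    return paths


def mark_uploaded(paths, suffix=".uploaded"):
    """Rename each existing file so that a later run no longer picks it up."""
    renamed = []
    for path in paths:
        if path.is_file():
            target = path.with_name(path.name + suffix)
            path.rename(target)
            renamed.append(target)
    return renamed
```

Renaming rather than deleting keeps the originals around in case an upload later turns out to be incomplete.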

Thanks, I do recall the duplicate check during upload being introduced - but that serves a different purpose. It helps Mapillary not to have duplicate images in the end.
Local status in mapillary_tools is for contributors, so that they do not waste processing power and bandwidth, and do not wear out their disks prematurely.

Uploading many images, sometimes over a slow internet connection, I highly value the ability to keep duplicate operations to an absolute minimum when some step fails or is interrupted.
It also helps to avoid data loss in case some directory/images have a full/partial failure to process or upload.
For example, if a contributor launches mapillary_tools in a loop against 20 directories with images, some directory in between might fail to upload. The contributor should be able to re-run mapillary_tools against all directories and have it pick up only what has not been processed/uploaded, without re-doing everything.

For me, the lack of this functionality is a major regression and a dealbreaker.
@asturksever, sorry to bother you directly, but could this please be escalated, if possible?

@Richlv thanks for bringing this up. Currently, mapillary_tools 0.8 does not keep status. We are aware of this missing feature and we will prioritise it. Please follow the updates in “tools 0.8.0 upload images over and over again” (Issue #450 · mapillary/mapillary_tools · GitHub).

Thank you for the quick reply, greatly appreciated.
That issue only covers repeated uploads. What about the other processing steps, should I create a new GitHub issue?

Regarding the impact of this: I have a few dozen directories with ~1000 images each. A mapillary_tools process and upload run was performed on all of them in a loop. Without status being kept, it seems that I have no way to figure out whether anything there failed, right?
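
One possible stopgap, sketched here under assumptions: wrap the loop in a small script that records which directories ended with an error. The function name `run_over_directories` is hypothetical, and the sketch assumes that the command exits with a non-zero status when a run fails, which has not been confirmed for mapillary_tools in this thread.

```python
import subprocess
from pathlib import Path


def run_over_directories(directories, command):
    """Run `command` once per directory and return the directories that failed.

    Assumption: the command signals failure through a non-zero exit status.
    """
    failed = []
    for directory in directories:
        # The directory is appended as the last argument of the command.
        result = subprocess.run([*command, str(directory)])
        if result.returncode != 0:
            failed.append(Path(directory))
    return failed
```

For example, `run_over_directories(dirs, ["mapillary_tools", "process_and_upload"])` would return the list of directories to retry; the exact subcommand and any further options are assumptions to be checked against `mapillary_tools --help` for the installed version.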

Also, would downgrading to mapillary_tools 0.7 bring back local status tracking?
That is, is tools version 0.7 compatible with the upstream services?

The latest version (v0.8.3) has the duplication check enabled. You can try it out here:

python3 -m pip install --upgrade git+https://github.com/mapillary/mapillary_tools

For example, if a contributor launches mapillary_tools in a loop against 20 directories with images, some directory in between might fail to upload. The contributor should be able to re-run mapillary_tools against all directories and have it pick up only what has not been processed/uploaded, without re-doing everything.

@Richlv The duplication check enables the use case above.

Also @bob3bob3, Mapillary Tools is designed for large uploads. It should be safe to re-upload the same folder of arbitrary size. If not, let me know here or at Issues · mapillary/mapillary_tools · GitHub.

Thank you so much - attempting to upgrade right now still gets 0.8.2. Am I doing something wrong?

Successfully installed appdirs-1.4.4 mapillary-tools-0.8.2

Yes, it looks good.

BTW, v0.9.0 is released: Release v0.9.0 · mapillary/mapillary_tools · GitHub

Hmm, it might not have been good before (0.8.3 was suggested, I only got 0.8.2 after updating), but now 0.9.0 is available :slight_smile:

I haven’t done extensive tests (they are time and resource consuming); I only noticed that the upload step is skipped for fully uploaded sequences.

Could you please confirm whether all of these cases should have their status kept and properly resumed?

a) “Processing” and “Test EXIF writing” steps - if interrupted in the middle, resume from the image where they were interrupted. If completed, don’t parse anything at all.
b) Temporary directory during upload - any processed data there is kept (in case of an interruption or failure). Upload inside any particular sequence is resumed exactly where it was left off.

Such behaviour was present before (with functional differences due to processing and upload being per-image, but that doesn’t change the need), and is extremely useful when travelling, and perhaps only getting short time windows to do image processing and upload.

@czecko We’re working on fixing this issue!

We only keep track of whether a sequence is uploaded or not. There is no status for processing. This is because processing is fast compared to uploading, and processing won’t be interrupted (by fatal errors) as often as uploading (by network errors, for example).

For example, if you are uploading a folder that contains 2 sequences, sequence A and sequence B, the procedure is:

  1. process A
  2. upload A
  3. process B
  4. upload B

So:

  • if you interrupt the program at step 1, the next run will execute steps 1, 2, 3, 4
  • if you interrupt the program at step 4, where 80% of B is already uploaded, the next run will skip steps 1 and 2 because A is uploaded, then run step 3, and then upload the remaining 20% of B’s bytes at step 4
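
The two cases above can be modelled with a short sketch. This is only an illustration of the behaviour as described in this post, not code from mapillary_tools; the function `plan_run` is hypothetical, and the byte-level resume of a partially uploaded sequence is not modelled.

```python
def plan_run(sequences, uploaded):
    """Return the steps a run would perform, given the fully uploaded sequences.

    Only completed uploads are remembered, so any sequence that is not
    fully uploaded is processed again before its upload continues.
    """
    steps = []
    for name in sequences:
        if name in uploaded:
            continue  # fully uploaded: processing and upload are both skipped
        steps.append(f"process {name}")
        steps.append(f"upload {name}")
    return steps


# Interrupted at step 1: nothing is uploaded yet, so everything runs again.
print(plan_run(["A", "B"], uploaded=set()))
# Interrupted at step 4: A is done, so B is re-processed and its upload continues.
print(plan_run(["A", "B"], uploaded={"A"}))
```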

To answer your questions:

a) If it gets interrupted, in the next run it will be processed again
b) No temporary directories are created if you process images. For video processing, samples will be created under a folder called “mapillary_sampled_video_frames”, and it will be kept until you remove them manually.
c) Uploading will be resumed from where it left off.

Thank you for the explanation, greatly appreciated.

Can we please consider adding tracking for all processing as well?
Sometimes, when travelling, I have to upload a directory many, many times. When that happens, the processing ends up being re-done each time.
This adds extra load on my storage devices. I have already lost one external hard drive that I used for Mapillary processing.
Reducing the wear and tear on contributor equipment would be extremely welcome :slight_smile:

Very glad to hear about uploads being resumed. Is this happening by keeping an aggregate file in TMPDIR, or is it some other approach?
