@GITNE Thank you for the excellent feedback and suggestions. They are very helpful for us to keep improving our tools. All your suggestions make sense to me, and some are already on our roadmap.
Do not open and establish a separate TCP/HTTPS connection per image file.
Yes, this is going to be my next optimization. It’s going to take some work, though, as we need to find a way to share HTTP sessions among multiple uploads (and also make sure resumes, retries, and so on continue to work). I think it’s very doable and looks promising. Once it’s there, I’d expect uploading to be even more reliable with fewer connections (or upload workers). cc @boris @nikola.
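To illustrate the idea of reusing one connection for many uploads, here is a minimal sketch using only the Python standard library. It is not how mapillary_tools is or will be implemented: the local test server, the `UploadHandler` class, and the `upload_all` and `run_demo` helpers are all hypothetical names made up for this example. The server echoes back the client's source port, so if every upload reports the same port, they all travelled over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class UploadHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 connections are persistent (Keep-Alive) by default.
    protocol_version = "HTTP/1.1"

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the uploaded bytes
        # Report the client's source port so connection reuse is visible.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def upload_all(host, port, payloads):
    """Upload every payload over ONE persistent connection."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    seen_ports = []
    try:
        for name, data in payloads.items():
            conn.request("PUT", "/" + name, body=data)
            resp = conn.getresponse()
            # The response must be fully read before the connection is reused.
            seen_ports.append(resp.read().decode())
    finally:
        conn.close()
    return seen_ports


def run_demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = ThreadingHTTPServer(("127.0.0.1", 0), UploadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        payloads = {f"img_{i}.jpg": b"x" * 1024 for i in range(5)}
        return upload_all("127.0.0.1", server.server_address[1], payloads)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns one port string per upload. The sketch leaves out retries and resumable uploads, which are exactly the parts that need care when sessions are shared.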
For example, although I am on a 1 Gbps fiber link, I had to set MAPILLARY_TOOLS_MAX_IMAGE_UPLOAD_WORKERS to 8 for my machine’s network stack not to get overburdened.
The default number of upload workers, 64, is an empirical value from my test environments: it is the point at which the bandwidth starts to saturate. Per your comment, it seems hard to find one constant that suits all network environments. Let’s see if the Keep-Alive optimization can help us reduce the default number of workers to 32 or 16. If not, we will see if we can find a dynamic way to increase or decrease the workers. It’s also a good point to expose the envvar as a CLI param so users can adjust the bandwidth usage.
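As a rough sketch of what exposing the envvar as a CLI param could look like, the snippet below resolves the worker count with the precedence CLI flag, then environment variable, then default. The flag name `--max-upload-workers` and the `resolve_workers` helper are hypothetical and are not existing mapillary_tools options; only the envvar name and the default of 64 come from the discussion above.

```python
import argparse
import os

ENVVAR = "MAPILLARY_TOOLS_MAX_IMAGE_UPLOAD_WORKERS"
DEFAULT_WORKERS = 64  # the current default mentioned above


def resolve_workers(argv, environ=None):
    """Precedence: CLI flag > environment variable > default."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # Hypothetical flag name; not an existing mapillary_tools option.
    parser.add_argument("--max-upload-workers", type=int, default=None)
    # Ignore any other arguments so the sketch can sit next to a real CLI.
    args, _unknown = parser.parse_known_args(argv)
    if args.max_upload_workers is not None:
        workers = args.max_upload_workers
    else:
        workers = int(environ.get(ENVVAR, DEFAULT_WORKERS))
    if workers < 1:
        raise ValueError("worker count must be at least 1")
    return workers
```

With this precedence, existing setups that export the envvar keep working, while a one-off run can override it from the command line.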
Orthogonality remains a key design principle for usability and clarity
Yes, we are aware of the challenges. In H2 we are planning to improve the CLI’s UI/UX, including redesigning these envvars and CLI params. Stay tuned.
Enable the --skip_subfolders option with the upload command.
That’s a good point. Let’s leave it for v0.15 as it’s a breaking change.
Unexpectedly, using the --verbose option on the process command basically silences output when correlating GPX tracks to images in the final write step.
Good callout. Fixing in the next version.
To make things even simpler, you do not actually need the --desc_path option with the upload command.
In some cases, users need to upload just a subset of the files extracted in the description JSON file. In that case, upload import_path --desc_path=desc.json can be useful.
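For instance, one way to prepare such a subset is to filter the description file before uploading. The sketch below is only an illustration: it assumes the description file is a JSON array of objects that each carry a `filename` key, and the `write_subset` helper and the file names are made up for this example.

```python
import json
from pathlib import Path


def write_subset(desc_path, subset_path, keep):
    """Copy only the entries for which keep(entry) is true.

    Assumes the description file is a JSON array of objects.
    """
    entries = json.loads(Path(desc_path).read_text(encoding="utf-8"))
    subset = [entry for entry in entries if keep(entry)]
    Path(subset_path).write_text(json.dumps(subset, indent=2), encoding="utf-8")
    return len(subset)
```

A call such as `write_subset("desc.json", "subset.json", lambda e: "day1" in e.get("filename", ""))` would write only the matching entries, and the resulting file could then be passed to the upload command via --desc_path.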