[bugreport] mapillary_tools hangs on unhandled exception

This was reported a long time ago on GitHub, but something similar is still a problem with the latest version (OK, about a week old) of mapillary_tools.

Some connection failures result in an unhandled exception, and the upload does not proceed. This is especially noticeable when a large number of images is left to upload overnight (or “overday”) and the exception occurs on the first directory.

Example error:

Exception in thread Thread-1:====================------------] 80.4% … 98 images left…
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/site-packages/mapillary_tools/uploader.py", line 71, in run
    upload_file(filepath, max_attempts, **params)
  File "/usr/lib/python2.7/site-packages/mapillary_tools/uploader.py", line 608, in upload_file
    response = urllib2.urlopen(request)
  File "/usr/lib64/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1241, in https_open
    context=self._context)
  File "/usr/lib64/python2.7/urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1121, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 438, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 394, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 480, in readline
    data = self._sock.recv(self._rbufsize)
  File "/usr/lib64/python2.7/ssl.py", line 780, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.7/ssl.py", line 667, in read
    v = self._sslobj.read(len)
error: [Errno 104] Connection reset by peer

I can confirm it still happens with the current version. I wrote a wrapper script that kills the zombified process and restarts the upload; no images appear to be lost running this way.
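
The idea of the wrapper is roughly the following (a simplified sketch, not the actual script; the command line, import path and timeout are placeholders):

#!/usr/bin/env python2
# Watchdog sketch: run mapillary_tools and restart it if it stops producing
# output for a long time, which is what a hung upload thread looks like.
# The command line, import path and timeout below are placeholders.
import subprocess
import sys
import threading
import time

CMD = ["mapillary_tools", "upload", "--import_path", "/path/to/images"]
STALL_SECONDS = 15 * 60  # no output for this long => assume the upload is hung

while True:
    proc = subprocess.Popen(CMD, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, bufsize=1)
    last_output = [time.time()]

    def pump(out=proc.stdout):
        # Echo the tool's output and remember when we last saw any of it.
        for line in iter(out.readline, ""):
            last_output[0] = time.time()
            sys.stdout.write(line)

    reader = threading.Thread(target=pump)
    reader.daemon = True
    reader.start()

    # Either the process exits on its own, or it stalls and we kill it.
    while proc.poll() is None:
        time.sleep(30)
        if time.time() - last_output[0] > STALL_SECONDS:
            proc.kill()
            break

    if proc.wait() == 0:
        break  # finished normally, nothing left to retry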

I have submitted the fault to support and get occasional “we are working on it” messages.

Still the same. This makes it harder to contribute: leaving images to upload overnight often leads to a sad surprise in the morning.

Still happens. It is pretty bad on a trip, when storage space is running out.

Upgraded to 0.5.2, and it looks like the exceptions might be handled now; at least I haven’t seen the upload hang after the upgrade.
If that is so, could it be that --max_attempts is not obeyed when an exception happens, and the image immediately gets marked as “failed”?
It would be great to treat an exception as just another attempt instead.
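
To illustrate the behaviour I am suggesting, a rough sketch (not the real uploader code; urllib2.urlopen and max_attempts are taken from the traceback above, the function name and backoff are made up):

# Sketch only: a connection error counts as one failed attempt and the
# request is retried, instead of the image being marked "failed" right away.
import socket
import time
import urllib2


def open_with_retries(request, max_attempts, backoff=5):
    for attempt in range(1, max_attempts + 1):
        try:
            return urllib2.urlopen(request)
        except (urllib2.URLError, socket.error):
            # "Connection reset by peer" and similar errors land here; only
            # give up (and mark the image as failed) after max_attempts.
            if attempt == max_attempts:
                raise
            time.sleep(backoff * attempt)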