Anonymize (blurring faces and license plates) before upload

With the recent acquisition of Mapillary, many users are concerned about Facebook’s access to unblurred images.

One solution could be adding one more step to the workflow: blurring faces and license plates BEFORE uploading to Mapillary (or OpenStreetCam, OpenTrailView, etc.).

Well, I tried this project https://github.com/understand-ai/anonymizer and it works very well. It's open source and multi-platform. You need Python 3.6 on your system (I tried with Python 3.8 and it failed, but I forked the repo and I'll try to add support for recent versions).

Installation
$ python --version
Python 3.6.10

python -m venv ~/.virtualenvs/anonymizer
source ~/.virtualenvs/anonymizer/bin/activate

git clone https://github.com/understand-ai/anonymizer
cd anonymizer

pip install --upgrade pip
pip install -r requirements.txt

Usage
In my test, processing a 10 MB 360° pano JPG image consumed 3 GB of RAM, so it's recommended to close heavy applications on your system before executing.
PYTHONPATH=$PYTHONPATH:. python anonymizer/bin/anonymize.py --input /path/to/input_folder --image-output /path/to/output_folder --weights weights

Replace the input and output folder paths with your own.
If the weights folder does not exist, it is created and the files weights_face_v1.0.0.pb and weights_plate_v1.0.0.pb are downloaded automatically.

I’ll try to build a Docker image for easy installation and usage.


Very nice workflow, thanks.

But it would be nice if Mapillary just gave us a checkbox option to indicate that my original raw upload(s) should be removed from their servers after the blur algorithm has been applied, so they keep only the blurred versions of the uploaded images.

Well done @juanman. This looks promising and could work for my workflow because I use action cams and a 360° camera. Any idea how much time this extra step consumes? I sometimes shoot >10,000 images a day. I agree with @micmin1972, but we’ll have to wait until Mapillary reacts, which could take some time I guess.

Hi
Looks pretty good.
Can many images be anonymized at the same time or does this have to be done for each individual image?
Regards
Dominik

Do drop a line when you have added 3.8 support !

I don’t know if there is a GUI. The script writes a .json file with the coordinates of the detected plates and faces, so it could be the input data for a GUI.
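As a sketch of how that JSON could feed a GUI or further processing, here is a small Python example. The exact schema is an assumption (keys like x_min/y_min/x_max/y_max and kind may differ in your version of the tool):

```python
import json

def load_detections(json_text):
    """Parse an anonymizer detection file (schema assumed) into
    (kind, (x_min, y_min, x_max, y_max)) tuples a GUI could draw."""
    boxes = []
    for det in json.loads(json_text):
        box = (det["x_min"], det["y_min"], det["x_max"], det["y_max"])
        boxes.append((det["kind"], box))
    return boxes

# Example with a hand-written detection file:
sample = '[{"kind": "plate", "x_min": 100, "y_min": 200, "x_max": 180, "y_max": 230, "score": 0.42}]'
for kind, box in load_detections(sample):
    print(kind, box)  # → plate (100, 200, 180, 230)
```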

In my tests, I didn’t see any false positives.

Another project for the same purpose is https://github.com/everestpipkin/image-scrubber. It’s a web GUI, but the blurring is done manually.

Yes, you pass the input directory path, and the script blurs all the images in that directory.

I succeeded in making this tool work under Debian and did some tests with captured sequences. My experience, mainly with plate detection, as it is easier to verify than face detection:

  • Face and plate detection seems to work very well. However, with the default parameters, some plates are not detected. Lowering the detection thresholds from 0.3 to 0.1 results in a close to 100% detection rate. In several hundred photos, I did not find one where a plate or a face had not been detected.
  • In some rare cases, a plate is detected at a bad position and the bounding rectangle does not cover the plate, or covers it only partially. I suppose that this is a bug in the tool.
  • With the default blurring parameters (--obfuscation-kernel 21,2,9), large plates in the foreground are not sufficiently blurred. The size of the Gaussian kernel must be increased significantly to get good results. I got good results with --obfuscation-kernel 47,1,9.
  • Surprisingly (for me), the processing performance of the tool depends mainly on the blurring parameters, especially on the Gaussian kernel size value. On my system, processing a picture with the default values takes about one minute. With --obfuscation-kernel 47,1,9 it takes up to 3 minutes; with --obfuscation-kernel 1,1,1 it takes less than 10 seconds (without any blurring, of course).
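Those per-image timings make it easy to estimate how long a whole day's batch would take (relevant for the >10,000 images/day question above). A trivial calculation, using the approximate numbers from this post:

```python
def batch_hours(n_images, seconds_per_image):
    """Total processing time in hours for a batch of images."""
    return n_images * seconds_per_image / 3600

# Default kernel (~60 s/image) vs. a minimal 1,1,1 kernel (~10 s/image)
# for a 10,000-image day: roughly 167 hours vs. 28 hours.
print(batch_hours(10_000, 60))
print(batch_hours(10_000, 10))
```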

To get a decent processing time I set up this workflow:

  • Process the images with anonymizer --obfuscation-kernel 1,0,1
  • Throw away the output images and keep the JSON files
  • Blur the original images with imagemagick -gaussian-blur using the rectangle coordinates from the JSON files

This is sufficiently fast for my purpose. The blurred areas have sharp borders, which does not bother me. There is probably a way to get smooth transitions with imagemagick.
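The two-pass workflow above can be sketched in Python by building one ImageMagick `convert` command per image. The JSON key names are assumptions based on the boxes the tool reports; the `-region`/`+region` geometry syntax (`WxH+X+Y`) is standard ImageMagick:

```python
def blur_command(image_in, image_out, detections, sigma=8):
    """Build an ImageMagick 'convert' command that Gaussian-blurs each
    detected rectangle (assumed keys: x_min/y_min/x_max/y_max)."""
    cmd = ["convert", image_in]
    for det in detections:
        w = int(det["x_max"] - det["x_min"])
        h = int(det["y_max"] - det["y_min"])
        geometry = f"{w}x{h}+{int(det['x_min'])}+{int(det['y_min'])}"
        # -region limits the next operator to one rectangle;
        # +region resets it before the next detection
        cmd += ["-region", geometry, "-gaussian-blur", f"0x{sigma}", "+region"]
    cmd.append(image_out)
    return cmd

# Example with one hand-written detection:
dets = [{"x_min": 100, "y_min": 200, "x_max": 180, "y_max": 230}]
print(" ".join(blur_command("in.jpg", "out.jpg", dets)))
```

In practice you would load `dets` from the JSON files the anonymizer wrote and run the command with `subprocess.run(blur_command(...), check=True)`, which requires ImageMagick to be installed.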


It’s a good suggestion and one we’re looking into. In the meantime, we’re blurring images as soon as they hit our servers and removing the originals.

More updates to come on privacy and blurring.


Contrary to most, I would like to keep access to my unblurred pictures.
It is sentimentally important to me. And do not forget about wrong blurring.