Classification of Not Suitable for Work Images: A Deep Learning Approach for Arquivo.pt
; Datia, N.
Classification of Not Suitable for Work Images: A Deep Learning Approach for Arquivo.pt, Proc. Portuguese Conf. on Pattern Recognition - RecPad, Evora, Portugal, October 2020.
Arquivo.pt is a Web Archiving initiative, storing contents preserved from
the .pt Web Pages. Among these contents, there are many image files.
Some of these images explicitly depict nudity and pornography, which are offensive to users, and are thus Not Suitable for Work (NSFW) images. In
this paper, we propose a solution to classify NSFW images on Arquivo.pt,
using deep learning approaches. We build an image dataset from Arquivo.pt data, and evaluate and fine-tune the ResNet and SqueezeNet models for the NSFW classification task. These models initially reached accuracies of 93% and 72%, respectively. After a fine-tuning stage, their accuracies improved to 94% and 89%, respectively. This
solution is available at https://arquivo.pt/images.jsp.
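
The accuracy figures quoted above are most naturally read as the fraction of images whose predicted class matches the ground-truth label. The abstract does not state how scores were thresholded or how the evaluation was run, so the following is only a minimal sketch under that assumption. The function names, the 0.5 threshold, and the toy labels and scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def classify(nsfw_scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map per-image NSFW scores to labels: 1 = NSFW, 0 = safe.

    The 0.5 threshold is an assumption made for illustration only.
    """
    return [1 if score >= threshold else 0 for score in nsfw_scores]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Made-up ground truth and model scores for eight images (not paper data).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.91, 0.12, 0.48, 0.77, 0.35, 0.62, 0.88, 0.05]

predictions = classify(scores)
print(f"accuracy = {accuracy(labels, predictions):.2f}")
```

In this toy run two of the eight images are misclassified (one NSFW image scored below the threshold and one safe image scored above it), so the printed accuracy is 0.75. Computing the same metric on a held-out labeled set before and after fine-tuning is one way to obtain a before/after comparison like the one reported for ResNet and SqueezeNet.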