I looked into Yahoo's old NSFW detector and can't help but wonder whether there is a YOLO DNN version, trained on similar (unreleased) datasets, that would detect and locate human nudity in pictures.

Is there at least a public dataset for this, or must I gather my own?

1 Answer

jcezarms On Best Solutions

There has been a recent effort to build a scraper for that kind of data. As described in this article, it produced a dataset of 220k images, which you can find in this repo's /raw_data folder.

That may already be useful for you, but the dataset's categories are very generic and loosely defined, which inspired this newer, better-organized dataset. It has 159 defined categories and a total of 1.58 million imgur URLs. These were taken mostly from subreddits, which, in all of Reddit's categorization glory, did much of the work of placing the tags. The repo's README claims that after data cleaning (e.g. removing duplicate, corrupted, or deleted data) you should end up with ~500 GB and ~1.3 million images.
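To give an idea of what that cleaning step can look like, here is a minimal sketch. It is not the repo's own cleaning script. It assumes the images have already been downloaded into one flat folder, and the names `IMAGE_SIGNATURES`, `looks_like_image`, and `clean_folder` are made up for illustration. It deletes files that do not start with a known image signature (for example an HTML error page saved with a .jpg extension) and files that are byte-for-byte duplicates of one already seen.

```python
import hashlib
from pathlib import Path

# Leading bytes ("magic numbers") of common image formats.
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",        # JPEG
    b"\x89PNG\r\n\x1a\n",   # PNG
    b"GIF87a",              # GIF (1987 spec)
    b"GIF89a",              # GIF (1989 spec)
)


def looks_like_image(data: bytes) -> bool:
    """Return True if the bytes start with a known image signature."""
    return data.startswith(IMAGE_SIGNATURES)


def clean_folder(folder: Path) -> list[Path]:
    """Delete non-image and exact-duplicate files; return the kept paths."""
    seen_hashes = set()
    kept = []
    # Sorted order makes it deterministic which copy of a duplicate survives.
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if not looks_like_image(data) or digest in seen_hashes:
            path.unlink()  # not an image, or identical to a file already kept
            continue
        seen_hashes.add(digest)
        kept.append(path)
    return kept
```

Hashing the raw bytes only catches exact copies, and the signature check does not catch truncated downloads. Resized or re-encoded duplicates would need a perceptual hash, and a full corruption check would need an image library to actually decode each file.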

As for a pretrained YOLO, there's no published work on that. If you're okay with the dependency and cost of delegating that content filtering to Google's Cloud Vision API, they claim to be good at classifying visual adult content. Otherwise, since most work of this nature seems to be kept private, you'd have to train your own.
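If you go the Cloud Vision route, a call could look roughly like the sketch below. It assumes the `google-cloud-vision` Python package (2.x or later) is installed and that application credentials are already configured. The helper names `is_flagged` and `adult_likelihood` and the cutoff at LIKELY are my own choices, not anything the API prescribes.

```python
# The Vision API reports a likelihood from UNKNOWN (0) to VERY_LIKELY (5).
LIKELY = 4


def is_flagged(likelihood: int, threshold: int = LIKELY) -> bool:
    """Return True if the likelihood is at or above the chosen cutoff."""
    return int(likelihood) >= threshold


def adult_likelihood(image_path: str) -> int:
    """Ask the SafeSearch feature how likely the image is adult content."""
    # Imported here so is_flagged() stays usable without the package installed.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.safe_search_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return int(response.safe_search_annotation.adult)
```

Usage would be something like `is_flagged(adult_likelihood("photo.jpg"))`. Keep in mind that SafeSearch rates the image as a whole and returns no bounding boxes, so it covers the "detect" part of the question but not the "locate" part.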