BenEdelman writes "In 2000, Google introduced a feature called SafeSearch, intended to omit from Google's results sites with "pornography and explicit sexual content." My prior research on filtering systems suggests that they all block considerably more content than their stated rules imply, and with that in mind I set out to evaluate SafeSearch's accuracy.
My full report is now available:
"Empirical Analysis of Google SafeSearch".
My research indicates that SafeSearch omits at least tens of thousands of web pages that contain no sexually explicit content, whether graphical or textual. SafeSearch is easily confused by ambiguous words in web page titles -- like "Hardcore Visual Basic Programming," a web page that describes intense programming for experts and contains no sexually explicit content whatsoever. SafeSearch also makes mistakes that are harder to understand -- like filtering the National Middle School Association (nmsa.org) and even the front page of Northeastern University (neu.edu), not to mention numerous sites operated by US federal, state, and local governments. In searches on subjects such as reproductive health, SafeSearch allows some results but not others in a way that seems essentially random; it is difficult to identify any consistent rule that explains which pages are allowed and which are blocked. See highlights of pages whose omission from SafeSearch seems inconsistent with SafeSearch's stated filtering policy.
In addition to a listing of specific URLs excluded from Google SafeSearch, I have provided a testing system that lets users quickly check whether a given URL is excluded from SafeSearch and, for a given keyword search, see which ordinary Google results SafeSearch excludes.
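The keyword comparison described above amounts to a set difference between two result lists for the same query. The following is only a minimal sketch of that idea in Python, not the testing system itself: it assumes the ordinary and SafeSearch result lists have already been collected by some other means, and the function names and example.com URLs are hypothetical placeholders.

```python
from typing import Iterable, List


def excluded_by_filter(unfiltered: Iterable[str], filtered: Iterable[str]) -> List[str]:
    """Return URLs that appear in the unfiltered results but not in the
    filtered results, keeping the unfiltered order and dropping duplicates."""
    allowed = set(filtered)
    seen = set()
    excluded = []
    for url in unfiltered:
        if url not in allowed and url not in seen:
            seen.add(url)
            excluded.append(url)
    return excluded


def is_excluded(url: str, unfiltered: Iterable[str], filtered: Iterable[str]) -> bool:
    """Report whether one URL is among those missing from the filtered list."""
    return url in excluded_by_filter(unfiltered, filtered)


if __name__ == "__main__":
    # Hypothetical result lists for one keyword search (placeholders only).
    ordinary = [
        "http://example.com/a",
        "http://example.org/b",
        "http://example.net/c",
    ]
    with_filter = [
        "http://example.com/a",
        "http://example.net/c",
    ]
    for url in excluded_by_filter(ordinary, with_filter):
        print(url)
```

A comparison like this only shows that a URL is absent from the filtered list that was collected. If the lists are truncated to the first few pages of results, a URL may be missing because its rank shifted, so absence is best confirmed by checking that URL directly.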
Ben Edelman
Berkman Center for Internet & Society
Harvard Law School
"