Privacy guru Richard M. Smith, formerly Chief Technology Officer of the Privacy Foundation, brought an interesting juxtaposition of Amazon's recommendation lists to C|Net's attention (Amazon blushes over sex link snafu). Apparently, one of Christian televangelist Pat Robertson's books was being linked to an anal sex manual for men. Amazon has removed the curious linkage, but the article raises the issue of how easily data mining can be gamed. This is similar to how search engines are gamed.
In the search engine context, webmasters attempt to game the search algorithms in order to improve their ranking and placement, which causes search engines to improve and alter their algorithms, which in turn causes webmasters to find new exploits, and so on, except where the webmaster chooses to sue instead (PageRank by Judicial Decree? SearchKing Sues Google). This arms race is a predictable one. Search engines will strive to improve their algorithms, while webmasters will attempt to game those algorithms. The search engines have managed to stay one step ahead so far.
Smith has pointed out this new exploit, which authors might use for increased marketing. Presumably, Amazon will alter its recommendation services to thwart such gaming of its system. The system works. However, what about people who are trying to game the system not for marketing advantage, but to cause harm to another person? How easy will these systems be to game from a privacy point of view? And what will be the counterbalance to prevent it?
For example, there was a lot of talk in the blogosphere recently about whether your TiVo thought you were gay and what you could do about it. What happens if someone can game these data mining systems to create similar results? Although it isn't clear how Amazon's system could be gamed, what if someone subscribed another person to a number of gay rights causes, for example? Would these data mining databases tag the target as "gay"? If so, how would you change that? What pernicious effects might be caused? This is an issue that is still over the horizon for the most part, but it bears watching as databases become ever more prevalent.