Jeffrey Pawlick and Quanyan Zhu, New York University Tandon School of Engineering
Those who propose the use of obfuscation to protect privacy must understand its consequences and judge when those consequences are morally justifiable. Opponents accuse obfuscators of “the valorization of deceit and dishonesty… wastefulness, free riding, database pollution, and violation of terms of service” [1]. Indeed, these concerns draw upon strong cultural norms against lying and deceit—as well as particularly powerful recent movements against wastefulness and environmental pollution. What criteria can be used to weigh these harms against the potential benefits of obfuscation?
As mathematical game theorists, we typically answer that a utility function can be constructed: the first term in the function can represent privacy benefits, and the second term can represent harms. If the first term outweighs the second—we might argue—then obfuscation is justified. We might imagine that this argument draws on the purity and objectivity of mathematics, and therefore is not burdened by any particular philosophical view.
Of course, that is not the case. Game theorists in this community implicitly adhere to consequentialist or utilitarian points of view, such as those advanced, for instance, by John Stuart Mill. In particular, “act consequentialism is the claim that an act is morally right if and only if that act maximizes the good, that is, if and only if the total amount of good for all minus the total amount of bad for all is greater than this net amount for any incompatible act available to the agent” [2,3]. Extreme versions of consequentialism might be summed up in the vernacular phrase: “the ends justify the means.”
Of course, not all philosophers (or non-philosophers) are consequentialists, and therefore not everyone judges obfuscation solely by its ends. For example, those who criticize obfuscation as dishonest draw upon the idea of lying as malum in se. Others claim that obfuscation violates terms of service which have legal force, and is therefore malum prohibitum—wrong because it is illegal—regardless of its consequences.
First, we address the claims of dishonesty. While it is true that many traditions reject lying as malum in se (e.g., those of Kant and Aquinas), most obfuscation does not involve lying. Lying requires “making a believed-false statement” [4], and obfuscation techniques do not make statements. Obfuscation is indeed deception, but deception is probably not malum in se.
Second, we consider the argument that (online) obfuscation violates terms of use and is therefore malum prohibitum. Our argument is that terms of use do not have legal force, and still less moral force. It would take the average Internet user 76 days to read the privacy policies of all of the websites he or she visits each year [5]. This suggests that online policies are not effectively promulgated. It seems, then, that obfuscation is neither malum in se nor malum prohibitum.
Obfuscation, therefore, passes at least two litmus tests for moral permissibility as a means. Of course, that in itself does not make obfuscation justified. But it suggests that we can return to considering obfuscation’s consequences. Arguments along these lines can perhaps be found in just war doctrine [6] or the principle of double effect [7]. According to both of these ideas, an act which produces bad effects can be tolerated (under certain criteria) if its bad effects are proportional to its intended good. Obfuscation certainly can produce harms; it may waste computational cycles, degrade personalized advertising, or even distract law enforcement. These must be weighed against the good of protecting users’ privacy. But how can we evaluate this proportionality?
Here, we return full circle to game theory. Utility functions are poor tools to capture complex ethical issues, but they are excellent tools to capture proportionality. Game theory is a branch of mathematics which models strategic interactions between two or more rational agents. (See, e.g., [8].) Models in game theory assume that agents choose strategies in a way that anticipates the strategies of the other agents. A game-theoretic equilibrium predicts the strategies at which rational agents will settle: a profile from which no agent can benefit by deviating unilaterally. We will use equilibrium predictions to assess the long-term consequences of obfuscation technologies.
Consider a game-theoretic model with N+1 players: N users and one machine learning agent which computes some statistic of the users’ data. Let $\sigma_i \in [0,1]$ denote the level of perturbation that user $i$ applies to her data. The total utility function of each user is composed of an accuracy term, a privacy term, and a term that reflects the cost of perturbation:

$$u_i(\sigma_i) \;=\; A_i(\sigma_i) \;-\; P_i(\sigma_i) \;-\; c_i\,\mathbf{1}\{\sigma_i > 0\}.$$

Here, $A_i(\sigma_i)$ is proportional to accuracy, $P_i(\sigma_i)$ is inversely related to privacy, and the indicator term implies that user $i$ pays a flat cost of $c_i$ for using obfuscation. If user $i$ perturbs her data maximally ($\sigma_i = 1$), then she receives zero benefit for accuracy and zero loss for privacy, and she pays the cost $c_i$ for using obfuscation. On the other extreme, if user $i$ does not perturb at all ($\sigma_i = 0$), then she gains $A_i(0)$ for accuracy and loses $P_i(0)$ for privacy. The utility function for the learner is similar, except that it does not have a term related to privacy.
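To make the tradeoff concrete, compare the two extremes above. This is only a minimal sketch in the notation just introduced; it ignores intermediate perturbation levels and the dependence of accuracy on the other users’ behavior, both of which are handled in [9, 10]. Holding the other users fixed, user $i$ prefers maximal perturbation to no perturbation exactly when

$$-c_i \;>\; A_i(0) - P_i(0), \qquad \text{i.e., when}\quad P_i(0) \;>\; A_i(0) + c_i,$$

that is, when her privacy loss from submitting truthful data exceeds her accuracy benefit plus the flat cost of obfuscation.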
We have three tasks. The first is to determine the conditions under which user i will employ obfuscation. (See [9, 10] for the mathematical details.) There are three cases to consider. First, if users are much more accuracy-sensitive than privacy-sensitive, then they will never obfuscate. Second, if the opposite is true, then they will always obfuscate. The most interesting case lies in the middle: there, the equilibrium predicts that user i will obfuscate if the other users, on average, obfuscate above a threshold amount. This is important, because it suggests a strategic reason why adoption could cascade, in addition to the epidemic spreading often seen in technology adoption.
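To see how such a threshold can produce cascading adoption, consider the following illustrative best-response simulation. It is only a sketch: the linear accuracy term, the uniformly distributed privacy sensitivities p[i] and obfuscation costs c[i], and the binary obfuscate-or-not choice are hypothetical simplifications for illustration, not the empirical-risk-minimization model of [9, 10].

# Illustrative best-response dynamics for obfuscation adoption.
# All functional forms and parameters are hypothetical; see [9, 10] for the actual model.
import random

random.seed(0)
N = 1000

# Per-user privacy sensitivity and flat cost of obfuscation (assumed uniform).
p = [random.uniform(0.4, 0.9) for _ in range(N)]
c = [random.uniform(0.15, 0.25) for _ in range(N)]

def accuracy_benefit(avg_obfuscation):
    # Assumption: the benefit of submitting truthful data shrinks as the average
    # level of obfuscation (and hence the learner's error) grows.
    return 1.0 - avg_obfuscation

def best_response(i, avg_obfuscation):
    # User i obfuscates iff her privacy loss from truthful data exceeds
    # her accuracy benefit plus the flat cost of obfuscation.
    return 1 if p[i] > accuracy_benefit(avg_obfuscation) + c[i] else 0

def equilibrium_adoption(initial_fraction):
    adopters = int(initial_fraction * N)
    x = [1] * adopters + [0] * (N - adopters)
    for _ in range(50):  # iterate best responses until the profile settles
        avg = sum(x) / N
        x = [best_response(i, avg) for i in range(N)]
    return sum(x) / N

for f0 in (0.0, 0.3, 0.5, 0.7, 0.9):
    print(f"initial adoption {f0:.1f} -> equilibrium adoption {equilibrium_adoption(f0):.2f}")

Below a critical level of initial adoption, the population falls back to no obfuscation; above it, adoption cascades to nearly everyone. This is the strategic contagion described above.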
Our second task is to ask how a machine learning agent can avoid this large-scale adoption of obfuscation. For if many users perturb, then the learning agent’s accuracy will be greatly decreased. We find that if the learning agent proactively provides a sufficient level of privacy protection, then the users will have no incentive to obfuscate. Their obfuscation was only a tool to protect their privacy, and if the learning agent provides that protection himself, then the users are content to submit truthful data.
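In the simple notation used above (again only a sketch, not the formal treatment in [9, 10]): if the learner’s proactive protection reduces each user’s privacy loss from submitting truthful data to a level $\tilde{P}_i(0) \le A_i(0) + c_i$, then, by the earlier comparison, no user strictly prefers obfuscation, and the cascade never starts.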
The third task is to analyze whether providing this protection is incentive-compatible for the machine learning agent. In other words, if he concedes some accuracy in order to protect privacy to some degree, can he improve his outcome in the long run by avoiding cascading adoption of obfuscation?
We find that he can—but only under certain circumstances. Protection is incentive-compatible if obfuscation is cheap for the learner and expensive for the users. Perhaps more surprisingly, protection is also incentive-compatible to the degree that the learner is accuracy-sensitive, because this sensitivity makes him more cautious about cascading adoption of obfuscation. Finally, incentive compatibility becomes more difficult the more privacy-sensitive the users are: the more sensitive they are, the more the learning agent will have to perturb, and the more this will cost him.
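A back-of-the-envelope inequality captures this tradeoff. It is a heuristic, not the equilibrium analysis of [9, 10], and all symbols here are introduced only for illustration. Write $\alpha_L$ for the learner’s accuracy sensitivity, $d_{\mathrm{prot}}$ for the accuracy he concedes by protecting privacy proactively, $d_{\mathrm{obf}}$ for the accuracy he would lose under cascading obfuscation, and $k_L$ for his cost of implementing protection. Protection is roughly worthwhile when

$$\alpha_L\, d_{\mathrm{prot}} + k_L \;<\; \alpha_L\, d_{\mathrm{obf}},$$

which is easier to satisfy when protection is cheap for him ($k_L$ small) and when he is highly accuracy-sensitive ($\alpha_L$ large, since $d_{\mathrm{obf}} > d_{\mathrm{prot}}$ in the cases of interest), and harder to satisfy when privacy-sensitive users demand heavier protection, which drives $d_{\mathrm{prot}}$ up.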
While much work remains to be done, we have shown that if 1) users care about privacy enough to cause cascading adoption of obfuscation and 2) obfuscation is sufficiently cheap (and accuracy is sufficiently important) for the learning agent, then it is optimal for learning agents to avoid obfuscation by protecting privacy themselves. In this case, the threat of obfuscation is sufficient to accomplish its objective; and this satisfies the type of proportionality that makes obfuscation morally justified.
References
[1] Brunton, Finn and Helen Nissenbaum. Obfuscation: A User’s Guide for Privacy and Protest. The MIT Press, Cambridge, Massachusetts, 2015.
[2] Sinnott-Armstrong, Walter. “Consequentialism,” The Stanford Encyclopedia of Philosophy (Winter 2015 Edition), Edward N. Zalta (ed.). (Online. Available: https://plato.stanford.edu/archives/win2015/entries/consequentialism/.) 2015.
[3] Moore, George Edward. Ethics. Oxford University Press, New York. 1912.
[4] Mahon, James Edwin. “The definition of lying and deception,” The Stanford Encyclopedia of Philosophy (Winter 2016 Edition), Edward N. Zalta (ed.). (Online. Available: https://plato.stanford.edu/cgi-bin/encyclopedia/archinfo.cgi?entry=lying-definition.) 2016.
[5] McDonald, Aleecia and Lorrie Faith Cranor. “The cost of reading privacy policies,” Privacy Year in Review, 2008.
[6] Catechism of the Catholic Church: Paragraph 2309, 1992.
[7] McIntyre, Alison. “Doctrine of double effect,” The Stanford Encyclopedia of Philosophy (Winter 2014 Edition), Edward N. Zalta (ed.). (Online. Available: https://plato.stanford.edu/cgi-bin/encyclopedia/archinfo.cgi?entry=double-effect.) 2014.
[8] Fudenberg, Drew and Jean Tirole. Game Theory. The MIT Press, Cambridge, Massachusetts, 1991.
[9] Pawlick, Jeffrey and Quanyan Zhu. “A Stackelberg game perspective on the conflict between machine learning and data obfuscation,” IEEE Workshop on Information Forensics and Security, 2016. (Available: https://arxiv.org/pdf/1608.02546.pdf.)
[10] Pawlick, Jeffrey and Quanyan Zhu. “A Mean-Field Stackelberg Game Approach for Obfuscation Adoption in Empirical Risk Minimization,” Submitted to IEEE Global SIP, Symposium on Control & Information Theoretic Approaches to Privacy and Security, 2017. (Available: https://arxiv.org/pdf/1706.02693.pdf.)