The large-scale monitoring of computer users’ software activities has
become commonplace, e.g., for application telemetry, error reporting,
or demographic profiling. This paper describes a principled systems
architecture—Encode, Shuffle, Analyze (ESA)—for performing such
monitoring with high utility while also protecting user privacy. The
ESA design, and its Prochlo implementation, are informed by our
practical experiences with an existing, large deployment of
privacy-preserving software monitoring.
With ESA, the privacy of monitored users’ data is guaranteed by
processing it in a three-step pipeline. First, the data is encoded to
control scope, granularity, and randomness. Second, the encoded data
is collected in batches subject to a randomized threshold, and blindly
shuffled, to break linkability and to ensure that individual data
items get “lost in the crowd” of the batch. Third, the anonymous,
shuffled data is analyzed by a specific analysis engine that further
prevents statistical inference attacks on analysis results.
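As a rough illustration of the three steps above, the following Python sketch encodes reports, thresholds and shuffles them, and then counts the survivors. It is a toy under stated assumptions, not Prochlo's implementation: the function names, the use of the normalized value as a crowd ID, and the Gaussian noise added to the cutoff are all hypothetical, and the cryptographic and hardware protections (SGX-based oblivious shuffling, secret sharing, blinding) are omitted entirely.

```python
import random
from collections import Counter, defaultdict


def encode(raw_value):
    """Encode step (hypothetical): coarsen the value and attach a crowd ID.

    Assumption: the normalized value itself serves as the crowd ID, so
    identical reports fall into the same crowd.
    """
    item = raw_value.strip().lower()
    return {"crowd_id": item, "payload": item}


def shuffle(batch, threshold=3, noise_sd=1.0, rng=None):
    """Shuffle step (hypothetical): drop small crowds, then permute the rest.

    Assumption: the "randomized threshold" is modeled as a fixed cutoff
    plus Gaussian noise drawn independently for each crowd.
    """
    rng = rng or random.Random()
    crowds = defaultdict(list)
    for report in batch:
        crowds[report["crowd_id"]].append(report["payload"])

    released = []
    for payloads in crowds.values():
        cutoff = threshold + rng.gauss(0, noise_sd)
        if len(payloads) >= cutoff:
            # Only payloads are forwarded; crowd IDs and arrival order are dropped.
            released.extend(payloads)

    rng.shuffle(released)  # break any link between position and sender
    return released


def analyze(shuffled):
    """Analyze step (hypothetical): a histogram over the anonymous items."""
    return Counter(shuffled)


if __name__ == "__main__":
    raw = ["Chrome"] * 5 + ["firefox "] * 4 + ["rare-app"]
    batch = [encode(v) for v in raw]
    print(analyze(shuffle(batch, threshold=3, noise_sd=1.0)))
```

In this toy, a value reported by a single user is very unlikely to clear the noisy cutoff, which is one simple way to picture items being "lost in the crowd". A real analyzer would also need to bound what the released histogram reveals.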
ESA extends existing best-practice methods for sensitive-data
analytics, by using cryptography and statistical techniques to make
explicit how data is elided and reduced in precision, how only
common-enough, anonymous data is analyzed, and how this is done for
only specific, permitted purposes. As a result, ESA remains compatible
with the established workflows of traditional database analysis.
Strong privacy guarantees, including differential privacy, can be
established at each processing step to defend against malice or
compromise at one or more of those steps. Prochlo develops new
techniques to harden those steps, including the Stash Shuffle, a novel
scalable and efficient oblivious-shuffling algorithm based on Intel’s
SGX, and new applications of cryptographic secret sharing and
blinding. We describe ESA and Prochlo, as well as experiments that
validate their ability to balance utility and privacy.

This is joint work with Andrea Bittau, Úlfar Erlingsson, Petros
Maniatis, Ilya Mironov, David Lie, Mitch Rudominer, Ushasree Kode,
Julien Tinnes, and Bernhard Seefeld.

Ananth Raghunathan is a computer scientist interested in cryptography,
security, and privacy. At Google, he is a Senior Research Scientist
working on novel techniques for privacy-preserving data collection,
post-quantum security, and topics at the intersection of security and machine
learning. Prior to joining Google, Ananth received his Ph.D. from
Stanford where his research focused on modeling and building secure
deterministic and searchable encryption schemes.