skip to main content
10.1145/1989284.1989290acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

Pan-private algorithms via statistics on sketches

Published:13 June 2011Publication History

ABSTRACT

Consider fully dynamic data, where we track data as it gets inserted and deleted. There are well developed notions of private data analyses with dynamic data, for example, using differential privacy. We want to go beyond privacy, and consider privacy together with security, formulated recently as pan-privacy by Dwork et al. (ICS 2010). Informally, pan-privacy preserves differential privacy while computing desired statistics on the data, even if the internal memory of the algorithm is compromised (say, by a malicious break-in or insider curiosity or by fiat by the government or law).

We study pan-private algorithms for basic analyses, like estimating distinct count, moments, and heavy hitter count, with fully dynamic data. We present the first known pan-private algorithms for these problems in the fully dynamic model. Our algorithms rely on sketching techniques popular in streaming: in some cases, we add suitable noise to a previously known sketch, using a novel approach of calibrating noise to the underlying problem structure and the projection matrix of the sketch; in other cases, we maintain certain statistics on sketches; in yet others, we define novel sketches. We also present the first known lower bounds explicitly for pan privacy, showing our results to be nearly optimal for these problems. Our lower bounds are stronger than those implied by differential privacy or dynamic data streaming alone and hold even if unbounded memory and/or unbounded processing time are allowed. The lower bounds use a noisy decoding argument and exploit a connection between pan-private algorithms and data sanitization.

References

  1. G. Cormode, M. Datar, P. Indyk, and S. Muthukrishnan. Comparing data streams using hamming norms (how to zero in). TKDE, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. G. Cormode and S. Muthukrishnan. An improved data stream summary: The count-min sketch and its applications. Journal of Algorithms, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. I. Dinur and K. Nissim. Revealing information while preserving privacy. In PODS, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. C. Dwork. Differential privacy in new settings. In SODA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. C. Dwork, F. Mcsherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. C. Dwork, F. McSherry, and K. Talwar. The price of privacy and the limits of lp decoding. In STOC, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Dwork, M. Naor, T. Pitassi, G. Rothblum, and S. Yekhanin. Pan-Private streaming algorithms. In ICS, 2010.Google ScholarGoogle Scholar
  8. C. Dwork, T. Pitassi, M. Naor, and G. Rothblum. Differential privacy under continual observation. In STOC, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. A. Gupta, K. Ligett, F. McSherry, A. Roth, and K. Talwar. Differentially private combinatorial optimization. In SODA, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. P. Indyk. Stable distributions, pseudorandom generators, embeddings, and data stream computation. In Journal of ACM, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. D. Kane, J. Nelson, and D. Woodruff. An optimal algorithm for the distinct elements problem. In PODS, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. A. McGregor, I. Mironov, T. Pitassi, O. Reingold, K. Talwar, and S. Vadhan. Limits of two-party differential privacy. In FOCS, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. F. Mcsherry and K. Talwar. Mechanism design via differential privacy. In FOCS, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. I. Mironov, O. Pandey, O. Reingold, and S. Vadhan. Computational differential privacy. In CRYPTO, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. J. P. Nolan. Stable Distributions - Models for Heavy Tailed Data. Birkhauser, 2010. In progress, Chapter 1 online at academic2.american.edu/~jpnolan.Google ScholarGoogle Scholar
  16. S. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal of American Statistical Association, 1965.Google ScholarGoogle Scholar

Index Terms

  1. Pan-private algorithms via statistics on sketches

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            PODS '11: Proceedings of the thirtieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
            June 2011
            332 pages
            ISBN:9781450306607
            DOI:10.1145/1989284

            Copyright © 2011 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 13 June 2011

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article

            Acceptance Rates

            PODS '11 Paper Acceptance Rate25of113submissions,22%Overall Acceptance Rate642of2,707submissions,24%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader