ABSTRACT
We propose the first differentially private aggregation algorithm for distributed time-series data that offers good practical utility without any trusted server. This addresses two important challenges in participatory data-mining applications where (i) individual users collect temporally correlated time-series data (such as location traces, web history, personal health data), and (ii) an untrusted third-party aggregator wishes to run aggregate queries on the data.
To ensure differential privacy for time-series data despite the presence of temporal correlation, we propose the Fourier Perturbation Algorithm (FPAk). Standard differential privacy techniques perform poorly for time-series data. To answer n queries, such techniques can result in a noise of Θ(n) to each query answer, making the answers practically useless if n is large. Our FPAk algorithm perturbs the Discrete Fourier Transform of the query answers. For answering n queries, FPAk improves the expected error from Θ(n) to roughly Θ(k) where k is the number of Fourier coefficients that can (approximately) reconstruct all the n query answers. Our experiments show that k << n for many real-life data-sets resulting in a huge error-improvement for FPAk.
To deal with the absence of a trusted central server, we propose the Distributed Laplace Perturbation Algorithm (DLPA) to add noise in a distributed way in order to guarantee differential privacy. To the best of our knowledge, DLPA is the first distributed differentially private algorithm that can scale with a large number of users: DLPA outperforms the only other distributed solution for differential privacy proposed so far, by reducing the computational load per user from O(U) to O(1) where U is the number of users.
- AGRAWAL, R., FALOUTSOS, C., AND SWAMI, A. N. Efficient similarity search in sequence databases. In FODO (1993). Google ScholarDigital Library
- BARAK, B., CHAUDHURI, K., DWORK, C., KALE, S., MCSHERRY, F., AND TALWAR, K. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In PODS (2007). Google ScholarDigital Library
- BLUM, A., DWORK, C., MCSHERRY, F., AND NISSIM, K. Practical privacy: The suLQ framework. In PODS (2005). Google ScholarDigital Library
- CRAMER, R., DAMGARD, I., AND NIELSEN, J. B. Multiparty computation from threshold homomorphic encryption. In EUROCRYPT (2001). Google ScholarDigital Library
- DWORK, C. Differential privacy: A survey of results. In TAMC (2008). Google ScholarDigital Library
- DWORK, C., KENTHAPADI, K., MCSHERRY, F., MIRONOV, I., AND NAOR, M. Our data, ourselves: Privacy via distributed noise generation. In EUROCRYPT (2006). Google ScholarDigital Library
- DWORK, C., MCSHERRY, F., NISSIM, K., AND SMITH, A. Calibrating noise to sensitivity in private data analysis. In TCC (2006). Google ScholarDigital Library
- EISENMAN, S. B., MILUZZO, E., LANE, N. D., PETERSON, R. A., AHN, G.-S., AND CAMPBELL, A. T. The bikenet mobile sensing system for cyclist experience mapping. In ACM SenSys (2007). Google ScholarDigital Library
- EVFIMIEVSKI, A., GEHRKE, J., AND SRIKANT, R. Limiting privacy breaches in privacy preserving data mining. In PODS (2003). Google ScholarDigital Library
- FOUQUE, P., POUPARD, G., AND STERN, J. Sharing decryption in the context of voting or lotteries. In FC: International Conference on Financial Cryptography (2000), LNCS, Springer-Verlag. Google ScholarDigital Library
- GANTI, R. K., PHAM, N., TSAI, Y.-E., AND ABDELZAHER, T. F. PoolView: stream privacy for grassroots participatory sensing. In ACM SenSys (2008). Google ScholarDigital Library
- GOLDREICH, O. Secure multi-party computation.Google Scholar
- GRITZALIS, D. Secure electronic voting.Google Scholar
- GUHA, S., REZNICHENKO, A., HADDADI, H., AND FRANCIS, P. Serving ads from localhost for performance, privacy, and profit. In HotNets (2009).Google Scholar
- HAY, M., RASTOGI, V., MIKLAU, G., AND SUCIU, D. Boosting the accuracy of differentially-private histograms through consistency. In VLDB 2010.Google ScholarDigital Library
- HULL, B., BYCHKOVSKY, V., ZHANG, Y., CHEN, K., GORACZKO, M., MIU, A., SHIH, E., BALAKRISHNAN, H., AND MADDEN, S. Cartel: a distributed mobile sensor computing system. In ACM SenSys (2006). Google ScholarDigital Library
- KRUMM, J. A survey of computational location privacy. Personal and Ubiquitous Computing, 2008 (2008). Google ScholarDigital Library
- KRUMM, J., AND HORVITZ, E. Predestination: Where do you want to go today? IEEE Computer (2007). Google ScholarDigital Library
- MACHANAVAJJHALA, A., GEHRKE, J., KIFER, D., AND VENKITASUBRAMANIAM, M. l-diversity: Privacy beyond k-anonymity. In ICDE (2006). Google ScholarDigital Library
- PAPADIMITRIOU, S., LI, F., KOLLIOS, G., AND YU, P. S. Time series compressibility and privacy. In VLDB (2007). Google ScholarDigital Library
- RASTOGI, V., HAY, M., MIKLAU, G., AND SUCIU, D. Relationship privacy: Output perturbation for queries with joins. In PODS 2009. Google ScholarDigital Library
- RASTOGI, V., AND NATH, S. Differentially private aggregation of distributed time-series with transformation and encryption. Tech. Rep. MSR-TR-2009-186, Microsoft Research, 2009.Google Scholar
- RASTOGI, V., SUCIU, D., AND HONG, S. The boundary between privacy and utility in data publishing. In VLDB (2007). Google ScholarDigital Library
- SHAMIR, A. How to share a secret. Communications of the ACM 22, 11 (1979). Google ScholarDigital Library
- SHILTON, KATIE, B. J., ESTRIN, D., GOVINDAN, R., AND KANG, J. Designing the personal data stream: Enabling participatory privacy in mobile personal sensing. In 37th Research Conference on Communication, Information and Internet Policy (TPRC) (2009).Google Scholar
- SWEENE, L. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems (2002). Google ScholarDigital Library
- XIAO, X., WANG, G., AND GEHRKE, J. Differential privacy via wavelet transforms. In ICDE 2010.Google ScholarCross Ref
- YAO, A. C.-C. Protocols for secure computations (extended abstract). In FOCS (1982). Google ScholarDigital Library
Index Terms
- Differentially private aggregation of distributed time-series with transformation and encryption
Recommendations
Relationship privacy: output perturbation for queries with joins
PODS '09: Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsWe study privacy-preserving query answering over data containing relationships. A social network is a prime example of such data, where the nodes represent individuals and edges represent relationships. Nearly all interesting queries over social ...
Bayesian Differential Privacy on Correlated Data
SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of DataDifferential privacy provides a rigorous standard for evaluating the privacy of perturbation algorithms. It has widely been regarded that differential privacy is a universal definition that deals with both independent and correlated data and a ...
Publicly Verifiable Private Aggregation of Time-Series Data
ARES '15: Proceedings of the 2015 10th International Conference on Availability, Reliability and SecurityAggregation of time-series data offers the possibility to learn certain statistics over data periodically uploaded by different sources. In case of privacy sensitive data, it is desired to hide every data provider's individual values from the other ...
Comments