A Privacy Model for Data Mining
Over the past half decade, privacy preservation has emerged as one of the more intriguing aspects of data mining. This is due both to rising concerns about rights violations through data mining and to the emergence of important markets (e.g., homeland security, cross-company production-chain data mining) for this type of application. This area of research is rapidly maturing. Unfortunately, recent studies all point to one major deficiency -- the lack of a well-defined way of modeling the privacy retained by a privacy-preserving data mining algorithm.
In this work we approach the modeling problem by extending an existing privacy model -- $k$-anonymity -- which was originally considered in the context of anonymous communication and then transferred to the context of data table releases. We show how this model can be extended to apply to various models of a data table. Beyond its immediate contribution to the analysis of the privacy of practically any data mining model, our extension is also useful for the development of new data anonymization techniques and of new privacy-preserving data mining algorithms.
*Joint work with Assaf Schuster and Arik Friedman.