Abstract
This paper considers the problem of providing security to statistical databases against disclosure of confidential information. Security-control methods suggested in the literature are classified into four general approaches: conceptual, query restriction, data perturbation, and output perturbation. Criteria for evaluating the performance of the various security-control methods are identified. Security-control methods that are based on each of the four approaches are discussed, together with their performance with respect to the identified evaluation criteria. A detailed comparative analysis of the most promising methods for protecting dynamic-online statistical databases is also presented. To date no single security-control method prevents both exact and partial disclosures. There are, however, a few perturbation-based methods that prevent exact disclosure and enable the database administrator to exercise "statistical disclosure control." Some of these methods, however introduce bias into query responses or suffer from the 0/1 query-set-size problem (i.e., partial disclosure is possible in case of null query set or a query set of size 1). We recommend directing future research efforts toward developing new methods that prevent exact disclosure and provide statistical-disclosure control, while at the same time do not suffer from the bias problem and the 0/1 query-set-size problem. Furthermore, efforts directed toward developing a bias-correction mechanism and solving the general problem of small query-set-size would help salvage a few of the current perturbation-based methods.
Keywords
Affiliated Institutions
Related Publications
Microdata disclosure limitation in statistical databases: query size and random sample query control
A probabilistic framework can be used to assess the risk of disclosure of confidential information in statistical databases that use disclosure control mechanisms. The authors s...
Secure statistical databases with random sample queries
A new inference control, called random sample queries, is proposed for safeguarding confidential data in on-line statistical databases. The random sample queries control deals d...
Knowledge Discovery in Databases
From the Publisher: Knowledge Discovery in Databases brings together current research on the exciting problem of discovering useful and interesting knowledge in It spans many ...
Revealing information while preserving privacy
We examine the tradeoff between privacy and usability of statistical databases. We model a statistical database by an n-bit string d1,..,dn, with a query being a subset q ⊆ [n] ...
A learning theory approach to noninteractive database privacy
In this article, we demonstrate that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of...
Publication Info
- Year
- 1989
- Type
- review
- Volume
- 21
- Issue
- 4
- Pages
- 515-556
- Citations
- 966
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1145/76894.76895