Data mining techniques can discover useful knowledge in large collections of data. However, sharing data between organizations risks disclosing sensitive information, so the balance between legitimate mining needs and the protection of confidential knowledge must be carefully managed whenever data is released or shared. In this paper, we study privacy preservation in association rule mining. We propose a new distortion-based method that hides sensitive rules by removing items from selected transactions in a database, reducing the support or confidence of each sensitive rule below a specified threshold. To minimize side effects on non-sensitive knowledge, the supporting transactions are sorted by the number of non-sensitive itemsets each one contains, and the candidates that contain fewer non-sensitive itemsets are preferentially selected for modification. To reduce the degree of data distortion, we derive the minimum number of transactions that must be modified to conceal a sensitive rule. Comparative experiments on real datasets show that the new method achieves satisfactory results with fewer side effects and less data loss.
- Association rule hiding
- Privacy preserving data mining
- Sensitive association rules
- Side effects
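
The hiding strategy summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names, the rule of always deleting the alphabetically first consequent item, and the brute-force search for the number of transactions to modify are assumptions made for illustration only. The sketch assumes a rule X -> Y counts as hidden once its support or its confidence falls strictly below the corresponding threshold, and that deleting a consequent item from a supporting transaction lowers the support count of X ∪ Y by one while leaving the support count of X unchanged.

```python
def min_transactions_to_modify(db, antecedent, consequent, min_sup, min_conf):
    """Smallest k such that removing a consequent item from k supporting
    transactions pushes support or confidence below its threshold.
    (Illustrative search, not the closed form derived in the paper.)"""
    n = len(db)
    rule_items = antecedent | consequent
    s_xy = sum(1 for t in db if rule_items <= t)  # transactions supporting X ∪ Y
    s_x = sum(1 for t in db if antecedent <= t)   # transactions supporting X
    if s_xy == 0:
        return 0  # the rule has no support, nothing to hide
    for k in range(s_xy + 1):
        remaining = s_xy - k
        if remaining / n < min_sup or remaining / s_x < min_conf:
            return k
    return s_xy


def hide_rule(db, antecedent, consequent, nonsensitive_itemsets, min_sup, min_conf):
    """Return a sanitized copy of db in which the rule is hidden."""
    rule_items = antecedent | consequent
    k = min_transactions_to_modify(db, antecedent, consequent, min_sup, min_conf)
    supporting = [i for i, t in enumerate(db) if rule_items <= t]
    # Prefer transactions that contain fewer non-sensitive itemsets,
    # so that fewer non-sensitive patterns lose support.
    supporting.sort(
        key=lambda i: sum(1 for s in nonsensitive_itemsets if s <= db[i])
    )
    victim = sorted(consequent)[0]  # assumption: always delete this item
    sanitized = [set(t) for t in db]
    for i in supporting[:k]:
        sanitized[i].discard(victim)
    return sanitized


if __name__ == "__main__":
    # Toy database, invented for this example.
    db = [
        {"a", "b", "c"},
        {"a", "b"},
        {"a", "b", "d"},
        {"a", "c"},
        {"b", "c"},
        {"c", "d"},
    ]
    nonsensitive = [{"b", "c"}, {"a", "c"}, {"b", "d"}]
    for t in hide_rule(db, {"a"}, {"b"}, nonsensitive, 0.25, 0.5):
        print(sorted(t))
```

In this toy database the rule a -> b starts with support 3/6 and confidence 3/4, and the first transaction is left untouched because it contains the most non-sensitive itemsets. A fuller implementation would also choose which item to delete and would measure the resulting side effects on non-sensitive rules.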