Data Anonymization software for healthcare organizations and data warehouse solutions

MSc Defense by Andrea Del Popolo and Simone Leomann


Although the literature offers many scientific contributions on anonymization and privacy-preserving techniques, we have noticed a lack of tools that can be integrated with data warehouse and business solutions. The main goal of this work is to adapt and integrate such methodologies into a tool capable of meeting the business needs driven by Extract, Transform and Load processes and data warehouse construction.

In this dissertation we analyze the state of the art regarding:

· privacy and attack models

· anonymization operations

· anonymization algorithms

· information metrics

· risk analysis

We also investigate the need for anonymization in the contexts of data warehouses, healthcare datasets and business intelligence, analyzing the problems related to such environments and providing a formal definition of the latter.

We then report on the design and subsequent implementation of our anonymization tool, addressing the following aspects:

· requirements elicitation

· case study on datasets provided by Statens Serum Institute

· project management

· software architecture by viewpoints and perspectives

· software validation through testing

· performance interpretation

· future work

External Examiner:

Philippe Bonnet, ITU


Ken Friis Larsen, DIKU

Kennie Nybo Pontoppidan, Rehfeld