Anonymising your data

Have you thought about what you will need to do in order to anonymise your data adequately?

It is a good idea to take advice from relevant staff in your organisation, such as the manager responsible for data protection.  Check whether there are any standard data management protocols that you should be following.

You can also find very good advice on anonymising data from the UK Data Archive, which provides specific guidance on anonymising both qualitative and quantitative data.

Rather than attempting to reproduce this excellent guidance here, we suggest you check the appropriate information on that site.  However, the key points to note are as follows:

Quantitative data

  • Remove direct identifiers (e.g., personal information such as addresses)
  • Aggregate or reduce the precision of variables that might identify someone (such as postcode)
  • Generalise text variables to reduce identifiability
  • Restrict the upper or lower ranges of continuous variables to hide outliers
  • Pay particular attention to anonymising relational data: some anonymised variables may become identifiable when considered in combination
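
As a rough illustration of the points above, the following Python sketch removes direct identifiers, coarsens postcode and age, caps income, and then flags combinations of variables shared by very few records. It is only a sketch under stated assumptions: the field names, the income cap of 100,000, the ten-year age bands and the threshold of 2 are all hypothetical choices made for the example, not values taken from the UK Data Archive guidance, and the sample records are invented.

```python
from collections import Counter

# All field names, the income cap and the threshold used below are
# hypothetical; choose values that suit your own data and protocols.
DIRECT_IDENTIFIERS = {"name", "address"}
INCOME_CAP = 100_000


def postcode_area(postcode):
    """Reduce a UK postcode to its outward part, e.g. 'CF10 3AT' -> 'CF10'."""
    compact = "".join(postcode.split()).upper()
    return compact[:-3]  # the inward part is always the last three characters


def age_band(age, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymise_record(record):
    """Return a copy of one record with identifiers removed or coarsened."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["postcode_area"] = postcode_area(cleaned.pop("postcode"))
    cleaned["age_band"] = age_band(cleaned.pop("age"))
    cleaned["income"] = min(cleaned["income"], INCOME_CAP)  # top-code outliers
    return cleaned


def rare_combinations(records, keys, k):
    """Return combinations of the given variables shared by fewer than k records."""
    counts = Counter(tuple(r[key] for key in keys) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Invented sample data, for illustration only.
records = [
    {"name": "A. Example", "address": "1 Example Street",
     "postcode": "CF10 3AT", "age": 34, "income": 28_000},
    {"name": "B. Example", "address": "2 Example Street",
     "postcode": "CF10 1EP", "age": 37, "income": 31_000},
    {"name": "C. Example", "address": "3 Example Street",
     "postcode": "SA1 8PP", "age": 62, "income": 450_000},
]

anonymised = [anonymise_record(r) for r in records]
risky = rare_combinations(anonymised, ("postcode_area", "age_band"), k=2)
```

In this example the third record is the only one in its postcode area and age band, so it appears in `risky` even though its name and address have been removed. A combination flagged in this way may need further aggregation, or may be better protected through restrictions on access to the data.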

Qualitative data

Anonymisation of qualitative data can be particularly complex, and is not simply a matter of removing personal information such as names or addresses, or of using pseudonyms.  As one of the interviewees who advised on the development of this guidebook observed, you do not need much of somebody’s life history to work out who they are if you know them, or if they are distinctive in some way.  A distinctive event or combination of descriptions in a qualitative account could make somebody recognisable.  Qualitative data may therefore need some editing to protect participants’ anonymity, but the UK Data Archive warns that:

Whenever editing is done, researchers need to be aware of the potential for distorting the data. For example, deleting all possible identifiers from text or sound recordings is a simple but blunt tool that creates data that are confidential but may be unusable.

This guidance notes that it may be better to use a reasonable level of anonymisation alongside other measures, such as restrictions on data access, to ensure that the assurances of confidentiality and anonymity that you gave to participants can realistically be maintained.  See the UK Data Archive guidance on access restriction for a discussion of these considerations.