I set out here a proposal (unfinished) for the standards to be used when anonymising case law.
The present approach is haphazard: some anonymisation takes place, but it seems to me only occasionally effective.
The anonymisation itself is not that difficult, though I happily acknowledge that it may be tediously time-consuming and that it requires its own techniques, rules and care. I am concerned here merely with the selection of names.
The problem appears to arise from a failure to recognise that the way people look up case law has changed fundamentally. The days of consulting a paper index are long gone. The main systems are now primarily digital. Digital indices work differently, and the advantages of such indices are being discarded. Take, for example, a case regarding a child, say Emma Booth. The first and obvious choice might be ‘E (a child)’.
The purpose of the case name is to help the reader get to the relevant case as easily as possible. This has two elements: finding the candidate cases, and excluding cases of no interest. ‘E (a child)’ does neither effectively. The use of a single letter simply does not help. There are very many such cases, and the name does very little to help you get there.
It would help first to suggest (as often happens) that once anonymisation is granted, the new choice of names should be a complete restart. ‘E (a child)’ should never allow a thought that ‘E’ may actually be the correct initial. Someone wrongly attempting to reconstruct the identity by piecing together bits of information should be in no doubt that there is no connection between the initial chosen and the actual name.
It is necessary to be able to trace an individual case through its various stages. (Is it?)
Suggest an agreed list of unacceptable or sensitive codes (swear words etc.). This needs doing because it is all too easy to use a word with an unexpected meaning.
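As a rough illustration only, the two ideas so far (a name chosen with no connection to the real one, and a check against an agreed list) might be sketched as follows. Everything here is an assumption made for the example: the three-letter code format, the names `BLOCKED`, `is_acceptable` and `random_code`, and the placeholder entries in the list, which stand in for whatever list is eventually agreed.

```python
import secrets

# Placeholder entries only (an assumption for this sketch); a real list
# would be agreed and maintained centrally.
BLOCKED = {"BUM", "DIE", "SOD"}

# Letters available for codes. Dropping I, O and Q to avoid confusion with
# digits is an illustrative choice, not part of the proposal.
ALPHABET = "ABCDEFGHJKLMNPRSTUVWXYZ"


def is_acceptable(code, blocked=BLOCKED):
    """Return True if no blocked word appears anywhere inside the code."""
    upper = code.upper()
    return not any(word in upper for word in blocked)


def random_code(length=3, blocked=BLOCKED):
    """Draw a random code, retrying until it passes the blocklist.

    The real name is deliberately not an input, so the result cannot
    carry any trace of it.
    """
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if is_acceptable(code, blocked):
            return code


if __name__ == "__main__":
    print(random_code())
```

The point of the design is that `random_code` never sees the real name, so nobody can reason backwards from the code to an initial.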
What do computers look for? Exact and inexact matches.
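To make the distinction concrete, here is a minimal sketch of the two kinds of lookup. The case names are invented for the example, the function names are hypothetical, and the similarity measure from Python's standard `difflib` module merely stands in for whatever a real legal database does; none of this describes an actual system.

```python
import difflib

# Invented case names, purely for illustration.
CASE_NAMES = [
    "E (a child)",
    "E (a child)",
    "E (children)",
    "Re E (a child)",
    "Kestrel (a child)",
]


def exact_matches(query, names):
    """Names identical to the query, ignoring letter case."""
    q = query.casefold()
    return [n for n in names if n.casefold() == q]


def inexact_matches(query, names, cutoff=0.8):
    """Names whose similarity to the query is at least `cutoff` (0 to 1)."""
    q = query.casefold()
    return [
        n
        for n in names
        if difflib.SequenceMatcher(None, q, n.casefold()).ratio() >= cutoff
    ]


if __name__ == "__main__":
    # An exact search on a single-letter name returns more than one case.
    print(exact_matches("E (a child)", CASE_NAMES))
    # A misspelt distinctive name can still be found by an inexact search.
    print(inexact_matches("Kestral (a child)", CASE_NAMES))
```

A single-letter name gives an exact search nothing to distinguish one case from another, while a longer, distinctive name can be found exactly and also survives a typing error under an inexact search.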