This article describes work we did in collaboration with the French administration (DINSIC) and a French supreme court (Cour de cassation) on two well-known Named Entity Recognition (NER) libraries, spaCy and Zalando Flair. spaCy's accuracy was too limited for our needs, and Flair was too slow. In the end, we optimized Flair to the point where its inference time was divided by 10, making it fast enough to anonymize a large inventory of French case law.
The main ideas behind our approach are described below.
Source: towardsdatascience.com