A Manual Approach to Anonymising and De-Anonymising Data

I made this because I don't have Copilot. Here's a demo of the tool.

Overview

The tool runs locally (you just double-click the HTML file) and needs no external services or NLP technologies to identify and replace named entities. Instead, it provides a manual, user-driven process, so you can tailor how data is handled to specific privacy policies or requirements. There's nothing fancy in this tool, but the code is straightforward enough for anyone to customise.

Features

  • Manual Anonymisation: Users can input categories of data such as attendee names, teams, and company identifiers they wish to anonymise. By entering entities like personal names or specific team references, users can replace these with generic placeholders directly through the tool's interface.
  • Process and Outcome: Hit the anonymisation button and identifiable information is substituted with placeholders (e.g., Attendee A, Team B, Company C), effectively stripping the text of direct identifiers. This is particularly useful when preparing documents for processing through AI or other third-party services where data privacy is a concern.
  • De-Anonymisation Capability: The tool also reverses the anonymisation process. This feature is essential for restoring the original context to anonymised content after it has been processed, so that the final content is useful while the privacy of the individuals involved is still respected.
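
The swap described in the list above can be sketched in a few lines of JavaScript. This is a minimal sketch, not the code from SwapScribe itself: the function names (buildPairs, swap, anonymise, deAnonymise), the shape of the categories object, and the "label plus letter" placeholder scheme are all assumptions made for illustration. Matching here is exact and case-sensitive.

```javascript
// Sketch only: function names are hypothetical, not taken from SwapScribe.

// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Turn { Attendee: ["Alice Smith", "Bob"], ... } into
// [["Alice Smith", "Attendee A"], ["Bob", "Attendee B"], ...].
function buildPairs(categories) {
  const pairs = [];
  for (const [label, entities] of Object.entries(categories)) {
    entities
      .filter((entity) => entity.trim() !== "") // ignore blank entries
      .forEach((entity, index) => {
        // Letters A-Z only, so this sketch handles up to 26 entities per category.
        pairs.push([entity, `${label} ${String.fromCharCode(65 + index)}`]);
      });
  }
  return pairs;
}

// Replace every occurrence of each pair's first item with its second, in one pass.
function swap(text, pairs) {
  if (pairs.length === 0) return text;
  const lookup = new Map(pairs);
  const pattern = [...lookup.keys()]
    .sort((a, b) => b.length - a.length) // longest first: "Alice Smith" before "Alice"
    .map(escapeRegExp)
    .join("|");
  return text.replace(new RegExp(pattern, "g"), (match) => lookup.get(match));
}

// Anonymising and de-anonymising are the same swap with the pairs flipped.
const anonymise = (text, pairs) => swap(text, pairs);
const deAnonymise = (text, pairs) =>
  swap(text, pairs.map(([original, placeholder]) => [placeholder, original]));

// Example round trip with made-up names.
const pairs = buildPairs({
  Attendee: ["Alice Smith", "Bob"],
  Team: ["Platform"],
  Company: ["Acme Ltd"],
});
const masked = anonymise("Alice Smith and Bob from Platform met Acme Ltd.", pairs);
const restored = deAnonymise(masked, pairs);
```

Doing the replacement in a single pass, rather than one entity at a time, avoids a later replacement accidentally rewriting a placeholder that an earlier one inserted. The same list of pairs has to be kept for the return trip, which is why the de-anonymise step only works on text that was anonymised with that configuration.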

Application

The primary use case for this tool is in environments where sensitive information is frequently handled and needs to be anonymised before being sent to AI services for analysis, summarisation, etc. It serves as a straightforward solution for maintaining privacy while still leveraging the capabilities of AI technologies.

Configuration Saving

A nice feature of the tool is its ability to save configurations. Users can create and store settings for specific types of content on a project-by-project basis, which makes anonymising similar content quicker in the future.
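
As a rough illustration of what a saved configuration could look like, here is a minimal sketch that turns a named set of categories into JSON and reads it back. How SwapScribe actually stores its configurations isn't described here, so the object shape (version, name, categories) and the function names serialiseConfig and parseConfig are assumptions.

```javascript
// Sketch only: the config shape and function names are assumptions.

// Serialise a named configuration to a JSON string that could be stored or downloaded.
function serialiseConfig(name, categories) {
  return JSON.stringify({ version: 1, name, categories }, null, 2);
}

// Parse a saved configuration, rejecting anything that is not the expected shape.
function parseConfig(json) {
  const config = JSON.parse(json);
  const valid =
    config !== null &&
    typeof config === "object" &&
    typeof config.name === "string" &&
    config.categories !== null &&
    typeof config.categories === "object" &&
    !Array.isArray(config.categories) &&
    Object.values(config.categories).every(
      (entities) =>
        Array.isArray(entities) &&
        entities.every((entity) => typeof entity === "string")
    );
  if (!valid) throw new Error("Unrecognised configuration format");
  return config;
}

// Example: save a per-project configuration and load it again.
const savedJson = serialiseConfig("Weekly stand-up", {
  Attendee: ["Alice Smith", "Bob"],
  Team: ["Platform"],
});
const loadedConfig = parseConfig(savedJson);
```

In a browser, the resulting string could be kept with localStorage.setItem or offered as a file download. Bear in mind that a saved configuration contains the real names, so it is as sensitive as the original document.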

And...

This tool provides a basic yet effective approach to data anonymisation and de-anonymisation, focusing on manual input and customisation. While it might not feature the sophistication of automated NLP tools, its simplicity and direct control over the anonymisation process offer a level of precision and security that some users may find beneficial for their specific needs. Whether for adhering to privacy policies or simply for personal data protection, this tool offers a straightforward method for managing sensitive information in documents before and after processing with AI services.

Now, you might not find this of immediate benefit to you, or to your industry. But in sectors such as education, where people want to benefit from generative AI and budgets are so constrained that Copilot is out of reach while free GPT tools are available, a free, simple tool like this can help keep you inside policy.

I've created a repository for this on GitHub:

GitHub - ryanmcdonough/SwapScribe

Here's a link to the hosted version of the tool: https://65c4ee22df1243169ac20a06--loquacious-zuccutto-019ed2.netlify.app - simply go to File -> Save As in your browser and, hey presto, you have a copy locally.