How to stop staff leaking data to ChatGPT and AI tools

You stop it with policy and controls together, not a ban. Write a short acceptable use policy that says what staff may and may not paste into AI tools, give them a sanctioned tool that does not train on your data, and back it with Microsoft 365 data loss prevention and sensitivity labels, browser and network controls, and brief, practical training. A ban alone does not work, because the tools are too useful and people route around it.

By Daniel McClure Fisher, Founder. CISSP, Chartered member of the Institute of Information Security (MCIIS). Updated May 2026

The short version

Your staff are already using ChatGPT and similar tools, whether you have approved them or not. The risk is not the technology itself. It is what people paste into it: client records, contracts, source code, financial figures, and anything else that gets the job done faster. Once that text goes into a public model, you have lost control of where it sits and who can see it.

A blanket ban feels like the safe answer, but it rarely holds. The tools are genuinely useful, so people find a way, often on a personal phone or a home laptop where you have no visibility at all. The approach that works is to make the safe path the easy path: a clear rule, a sanctioned tool that does not learn from your data, and a few technical controls that catch the obvious mistakes.

  • The real risk is staff pasting sensitive data into public models, not the existence of AI.
  • A ban on its own pushes the behaviour into the shadows, where you cannot see or control it.
  • Policy, a sanctioned tool, and a handful of technical controls together are what actually reduce the risk.

What staff are actually pasting in

To control the risk you have to be honest about what it looks like day to day. It is rarely malicious. It is a busy person trying to save time, and not thinking about where the text ends up. The common cases are worth naming plainly.

  • Client and customer data. Pasting a list of contacts to draft a mail merge, or a support ticket full of personal details to get a tidier reply. That is personal data leaving your control, which is a UK GDPR problem.
  • Contracts and commercial documents. Dropping in a supplier agreement or a tender response to summarise it or check the wording. Confidential and often covered by an NDA.
  • Source code. Developers pasting proprietary code to debug it or write tests. Your intellectual property, handed to a third party.
  • Financial and board material. Management accounts, forecasts, or a sensitive email, pasted in to rewrite or analyse.
  • Internal documents. Strategy notes, HR cases, and anything marked confidential, used to draft a quick summary.

The worry is not that the model gives a wrong answer. It is that the input is retained, that on the consumer tiers it may be used to train future versions, and that it could surface or be exposed later. You cannot recall it, and in a regulated context you may have to disclose that it happened.

Why a ban on its own fails

The instinct to block the lot is understandable, and on a managed work device you can do it. The problem is what happens next. The tools save people real time, so a hard block without an alternative does not remove the demand. It just moves it somewhere you cannot see.

Staff switch to a personal phone, a home machine, or a browser extension you have not accounted for. The work still gets done with AI, but now it happens completely outside your controls, with no logging and no policy. This is shadow IT, and a ban often makes it worse, not better. The honest goal is not zero AI use. It is AI use you can see, govern, and trust.

What actually works

Reducing this risk is a layered job. No single control does it. The combination below is what we put in place for clients, in roughly the order that gives the fastest return.

1. An acceptable use policy people will actually read

Start with a short, plain-English policy that says what staff may and may not put into AI tools. Not a ten-page document nobody opens, but a one-page rule: never paste client data, personal data, contracts, code, credentials, or anything marked confidential into a public AI tool, and here is the sanctioned tool to use instead. Name the approved tools, name the banned behaviours, and say who to ask when someone is unsure. A policy people understand changes behaviour. One they never see does not. This sits inside your wider policy frameworks, alongside your acceptable use and data handling rules.

2. Give them a sanctioned tool with data controls

This is the step most organisations skip, and it is the one that makes the rest stick. If you give staff an approved tool that is safe to use, the temptation to use the consumer version drops sharply. The business and enterprise tiers of the main providers, for example ChatGPT Enterprise, Microsoft 365 Copilot, and the equivalents, contractually commit not to train their models on your data and give you administrative control. The free consumer tiers are the ones to keep your data out of. Provide the safe option, make it easy to reach, and most people will take it.

3. Turn on Microsoft 365 data loss prevention and sensitivity labels

If you run Microsoft 365, you already have strong tools for this, often unused. Data loss prevention (DLP) policies can detect sensitive content, such as card numbers, National Insurance numbers, or anything matching your own patterns, and warn or block when someone tries to share or paste it. Sensitivity labels let you classify documents and email as, say, confidential, and then enforce what can be done with them. Microsoft Purview extends this to detect sensitive data being typed into AI tools in the browser. These controls do the watching so a tired employee does not have to remember the rule at five on a Friday.
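To make the idea of pattern-based detection concrete, here is a minimal Python sketch of the kind of check a DLP rule performs. It is an illustration only, not how Microsoft Purview is implemented: the function names, the simplified National Insurance pattern, and the "block or allow" outcome are all assumptions made for the example.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Simplified shape of a UK National Insurance number: two letters, six digits,
# and a suffix letter A to D. Assumption: this checks the shape only and does
# not exclude the prefixes that are never issued.
NINO = re.compile(r"\b[A-Z]{2} ?\d{2} ?\d{2} ?\d{2} ?[A-D]\b", re.IGNORECASE)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum used by payment cards."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive(text: str) -> list[str]:
    """Return the kinds of sensitive data found in the text (hypothetical labels)."""
    findings = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):  # the checksum cuts false positives from order numbers
            findings.append("card_number")
            break
    if NINO.search(text):
        findings.append("national_insurance_number")
    return findings


def decide(text: str) -> str:
    """Return 'block' if anything sensitive is found, otherwise 'allow'."""
    return "block" if find_sensitive(text) else "allow"
```

Calling `decide()` on a draft that contains a valid card number or something shaped like a National Insurance number returns "block", while ordinary prose returns "allow". A real DLP product adds confidence scoring, surrounding keywords, and a warn-with-override option rather than a simple yes or no.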

4. Add browser and network controls

On managed devices you can go further. Browser management can restrict risky extensions and apply policies in the browser itself, which is where most of this activity happens. Network and web filtering can block the consumer AI sites you have not sanctioned while allowing the approved one, and a secure web gateway can inspect and limit what leaves the network. These controls are blunt on their own, which is exactly why they pair with a sanctioned alternative rather than replacing it. They sit naturally alongside your email and endpoint security.
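The filtering logic described above can be sketched in a few lines of Python. This is a simplified illustration, not the configuration of any particular web filter or secure web gateway, and every hostname in it is a made-up placeholder.

```python
from urllib.parse import urlsplit

# Hypothetical hostnames, for illustration only.
SANCTIONED_AI_HOSTS = {"ai.example-corp.internal"}
UNSANCTIONED_AI_HOSTS = {"chat.example-consumer-ai.com", "example-free-ai.app"}


def host_matches(host: str, domains: set[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def filter_decision(url: str) -> str:
    """Return 'allow', 'block', or 'defer' for a requested URL."""
    host = (urlsplit(url).hostname or "").lower()
    if host_matches(host, SANCTIONED_AI_HOSTS):
        return "allow"  # the approved tool is always reachable
    if host_matches(host, UNSANCTIONED_AI_HOSTS):
        return "block"  # consumer AI sites you have not sanctioned
    return "defer"  # not an AI site on either list; other filtering rules decide
```

The sanctioned list is checked first so that the approved tool stays reachable even if a broader block rule would otherwise catch it. Matching on the whole hostname or a dot-separated suffix avoids blocking an unrelated site whose name merely ends with the same characters.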

5. Train people, briefly and concretely

Most leaks are honest mistakes, so a little awareness goes a long way. Keep it short and specific. Show the real examples above, explain in one line why pasting a client list into a public tool is a data breach, and point everyone at the sanctioned tool. People follow a rule far more readily when they understand the reason behind it, and when the safe option is genuinely easier than the risky one.

The governed alternative

The controls above stop the leaks. The opportunity is to go one step further and give your people AI that is both useful and safe by design. That is the work we do: we build AI you can audit, on infrastructure you control, with approval gates, a full log of what the assistant did, and your data kept inside your own boundary rather than handed to a public model.
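As a rough picture of what an approval gate and a full log mean in practice, here is a minimal Python sketch. It is an assumption-laden illustration rather than a description of any real product: the class name, the event names, and the shape of the log entries are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class GovernedAssistant:
    """Runs an action only after approval and records every step (hypothetical design)."""

    approver: Callable[[str], bool]  # returns True if the action is approved
    audit_log: list[str] = field(default_factory=list)  # append-only JSON lines

    def _record(self, event: str, action: str) -> None:
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "action": action,
        }
        self.audit_log.append(json.dumps(entry))

    def run(self, action: str, execute: Callable[[], str]) -> Optional[str]:
        """Log the request, ask for approval, and execute only if approved."""
        self._record("requested", action)
        if not self.approver(action):
            self._record("denied", action)
            return None
        self._record("approved", action)
        result = execute()
        self._record("completed", action)
        return result
```

In this sketch the approver is a plain function, but it could equally be a person clicking a button. The point is that nothing runs before the approval check, and that denied requests are logged as carefully as completed ones.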

Done this way, you do not have to choose between productivity and control. Staff get a capable assistant for the real work, and you get an audit trail an insurer or a regulator will accept. If you want the benefit of AI without the data risk, the answer is not to ban it. It is to adopt it deliberately. See how we approach AI adoption, governance included as standard.

Common questions

Should we just ban ChatGPT at work?

A ban on its own rarely works, because the tools are useful enough that people route around it on personal devices, where you have no visibility at all. The better approach is to give staff a sanctioned tool that does not train on your data, set a clear policy on what they must not paste in, and back it with technical controls. The goal is AI use you can see and govern, not zero AI use you cannot enforce.

Is it a data breach if an employee pastes client data into ChatGPT?

It can be. If the text contains personal data, putting it into a public tool sends that data to a third party outside your control, which is a likely breach of UK GDPR and may be reportable. On the consumer tiers the input can also be retained and used to train future models. Treat any client, personal, or confidential information pasted into an unsanctioned AI tool as a data incident. That is exactly the behaviour a clear policy and data loss prevention are there to prevent.

Do the paid versions of AI tools train on our data?

Generally no. The business and enterprise tiers, such as ChatGPT Enterprise and Microsoft 365 Copilot, contractually commit not to train their models on your data and give you administrative control. The free consumer tiers are the ones to be careful with, as they may use your inputs to improve the model. Always confirm the current terms of the specific tier you are buying, because providers change them, and choose the tier that matches the sensitivity of your data.

How does Microsoft 365 help stop AI data leaks?

Microsoft 365 includes data loss prevention and sensitivity labels through Purview. DLP policies can detect sensitive content and warn or block when someone tries to share or paste it, and sensitivity labels let you classify documents and control what can be done with them. Newer features can detect sensitive data being entered into AI tools in the browser. Most organisations already pay for these capabilities and have not turned them on.

What should an AI acceptable use policy cover?

Keep it to a page. Say plainly what staff must never paste into a public AI tool, such as client data, personal data, contracts, source code, credentials, and anything marked confidential. Name the sanctioned tool they should use instead, and say who to ask when they are unsure. A short policy people understand changes behaviour far more than a long one nobody reads, and it should sit inside your wider data handling and acceptable use frameworks.

Can we use AI safely at all then?

Yes, and you should. The point is to adopt it deliberately rather than let it leak in. Give people a sanctioned tool that keeps your data private, set clear rules, and add controls that catch the obvious mistakes. Going further, AI can be built on infrastructure you control, with approval gates and a full audit trail, so you get the productivity without handing data to a public model. That is the difference between AI that is a risk and AI you can prove is safe.

Want AI your business can actually trust?

Tell us how your people are using AI today, and what you need to keep private. We will help you put the policy and controls in place, and show you how AI can be adopted safely, with the evidence to prove it. We reply within one working day, and you will speak to an engineer, not a salesperson.

Reading, Berkshire  /  UK based and accountable  /  reply within one working day