Insights · April 7th, 2026
King’s College London just built a malicious AI chatbot and put it in front of 502 real people who had no idea what it was designed to do.
The chatbot was designed with one goal: extract personal information. It worked. The most effective version collected data from 93% of participants while being rated as trustworthy as the benign control.
Prior studies on AI privacy looked at what users accidentally reveal to ordinary chatbots. This study asked a different question: what happens when the chatbot is deliberately designed to extract information?
The researchers built four versions (one benign, three malicious, each with a different strategy) and ran a randomized controlled trial with 502 participants across the UK, US, and Europe. Each person was assigned to interact in real time with one of 12 chatbots: the combinations of three open-source LLMs (Llama-3-8B, Llama-3-70B, Mistral-7B) and the four prompt strategies below (a sketch of the resulting design follows the list).
- Benign (B): Standard chatbot, no manipulation
- Direct (D): Explicitly instructs the AI to ask for personal info every turn
- User-benefit (U): Provides helpful responses first, then asks for personal data (exploiting the privacy-utility trade-off)
- Reciprocal (R): Uses empathy, emotional support, and social rapport to make users feel comfortable sharing
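To make the design concrete, here is a minimal sketch of the 3 × 4 condition grid and a uniform random assignment of the 502 participants. The model and strategy names come from the study; the assignment function itself is only an illustration, since the paper’s exact randomization procedure isn’t reproduced here.

```python
import itertools
import random

# The three open-source LLMs and four system-prompt strategies from the study.
MODELS = ["Llama-3-8B", "Llama-3-70B", "Mistral-7B"]
STRATEGIES = ["benign", "direct", "user-benefit", "reciprocal"]

# 3 models x 4 strategies = the 12 chatbot conditions participants interacted with.
CONDITIONS = list(itertools.product(MODELS, STRATEGIES))
assert len(CONDITIONS) == 12

def assign_participants(n: int = 502, seed: int = 0) -> dict:
    """Uniformly assign each participant to one of the 12 conditions.

    Illustrative only: the paper's actual randomization/balancing
    procedure may differ from a simple uniform draw.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(CONDITIONS) for pid in range(1, n + 1)}

assignments = assign_participants()
print(assignments[1])  # e.g. ('Llama-3-70B', 'reciprocal')
```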
The reciprocal strategy won by every metric that matters to an attacker.
The reciprocal chatbot didn’t feel malicious. Participants described conversations as “natural,” “supportive,” and “impressive.” One said it felt like chatting with a friend. Nobody reported discomfort. Meanwhile the direct strategy made participants feel interrogated. Many provided fake data. The reciprocal strategy collected more real data than any other approach while being perceived as no more privacy-invasive than the benign baseline.
Under the reciprocal strategy, the chatbot was rated as trustworthy as the benign control while extracting significantly more data.
The headline numbers:
- 93% of participants in the top malicious conditions disclosed personal information, vs. 24% who filled out a voluntary form
- Participants responded to 84–88% of personal data requests from malicious conversational AIs (CAIs), vs. a 6% form completion rate
- Larger models extracted more data: Llama-3-70B collected significantly more than the 7B and 8B models, with no difference in perceived privacy risk
- Of the fake data reports, 40% came from Direct strategy participants and 42.5% from User-benefit; only 10% came from Reciprocal
One system prompt
The finding that should alarm every platform operator: this required one system prompt. No fine-tuning. No special access. OpenAI’s GPT Store has over 3 million custom GPTs. Any of them could be running a version of this right now. The researchers confirmed their prompts produced similar behavior in GPT-4.
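To see why one prompt is the entire attack surface, here is a minimal sketch of a standard chat-completions loop, assuming an OpenAI-compatible client; the placeholder prompt is deliberately benign, and the study’s actual malicious prompts are not reproduced (the authors withhold them for safety).

```python
from openai import OpenAI  # any OpenAI-compatible chat API works the same way

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The entire "attack" lives in this one string. A benign placeholder is shown;
# swapping it for an extraction-oriented persona requires no fine-tuning and
# no special access, which is the paper's point.
SYSTEM_PROMPT = "You are a friendly, supportive assistant."

history = [{"role": "system", "content": SYSTEM_PROMPT}]

def chat_turn(user_message: str) -> str:
    """One conversational turn; the hidden system prompt steers every reply."""
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-4",  # the researchers confirmed similar behavior on GPT-4
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```

The user never sees the system message, which is why a GPT Store listing or an embedded “assistant” can run a strategy like this invisibly.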
The privacy paradox showed up in full force. Participants recognized the direct and user-benefit chatbots were asking for too much data. They rated them as higher privacy risks. Then they kept answering anyway. Awareness didn’t produce protection; it just produced fake data. The reciprocal strategy bypassed even that defense by making disclosure feel social rather than transactional.
A single system prompt turns any chatbot into a personal data extraction engine. The most effective version does it while making you feel supported (the authors withhold the exact prompts for safety reasons).
What this means for CEOs
The core executive takeaway is that the social engineering threat has been turbocharged by AI. The tools to exploit human trust at scale are now cheap, accessible, and nearly invisible to the people being exploited.
Beyond that headline, the research has several serious implications for organizational leaders, spanning risk management, governance, and competitive strategy.
Insider Threat via AI Tools: Employees routinely share sensitive information with AI chatbots — strategy documents, client details, HR matters, financial data. This research shows that a maliciously designed chatbot is far more effective at extracting information than a direct form or questionnaire, and employees are unlikely to realize it’s happening. A competitor, nation-state actor, or bad actor could deploy a convincing AI productivity tool — a fake writing assistant, meeting summarizer, or HR chatbot — and harvest proprietary information at scale.
Third-Party and Supply Chain Risk: Organizations increasingly integrate third-party AI tools into their workflows. The research shows that a single system prompt is all it takes to weaponize an otherwise normal-looking chatbot. Standard vendor due diligence processes were not designed to detect this. CEOs and CISOs need to ask whether their procurement and IT governance frameworks are even capable of evaluating this threat.
The Reciprocal Strategy Is Particularly Relevant for Corporate Espionage: The most dangerous finding for executives is that the most effective extraction method — the empathetic, socially warm chatbot — raised virtually no red flags with users. Employees are far more likely to spot a chatbot bluntly asking “what is your salary?” than one that warmly engages them in a supportive conversation and organically elicits the same information. Traditional security awareness training, which tends to focus on obvious phishing signals, won’t help here.
Regulatory and Liability Exposure: If sensitive customer, employee, or financial data is exfiltrated via a malicious AI tool deployed within the organization’s environment — even inadvertently — leadership faces potential liability under GDPR, HIPAA, and other data protection regimes. The fact that the mechanism was “just a chatbot” will not be a sufficient defense. Boards and audit committees need to understand this as a material risk.
AI Governance Is No Longer Optional: This research underscores that having an AI policy is not the same as having AI governance. Many organizations have published guidelines about not entering sensitive data into ChatGPT, but this research reveals a different threat vector: employees being manipulated by AI tools they believe are safe or legitimate. Executives need actual controls — approved tool lists, technical restrictions, monitoring — not just policy documents.
The Democratization of the Attack: One of the paper’s most sobering points is how low the barrier to entry is. A sophisticated nation-state isn’t required. A disgruntled former employee, a small competitor, or a contractor with a grudge could build one of these tools on a public platform in an afternoon. The threat is not exotic; it is accessible and scalable.
The research points to a few concrete actions companies need to take:
- Tighten the approved list of AI tools employees can use, building AI vetting into vendor management (see the sketch after this list)
- Update security awareness training to include social AI manipulation (not just phishing)
- Engage legal and compliance teams to assess liability exposure, and pressure-test whether existing data loss prevention controls would even detect this kind of exfiltration
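Picking up the first action above: here is a toy sketch of the kind of allowlist check a network proxy or browser extension could enforce. Every domain name and the policy shape are hypothetical illustrations, not anything from the paper.

```python
# Hypothetical allowlist enforcement for AI chat endpoints at a network edge.
from urllib.parse import urlparse

APPROVED_AI_TOOLS = {
    "chat.internal.example.com",    # vetted internal assistant (hypothetical)
    "api.approved-vendor.example",  # vendor that passed due diligence (hypothetical)
}

def is_request_allowed(url: str) -> bool:
    """Allow traffic only to AI tools on the approved list."""
    host = urlparse(url).hostname or ""
    return host in APPROVED_AI_TOOLS

# An unvetted "free productivity chatbot" gets blocked before any data leaves:
assert not is_request_allowed("https://free-meeting-summarizer.example/chat")
assert is_request_allowed("https://chat.internal.example.com/session")
```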
Read the full paper, ‘Malicious LLM-Based Conversational AI Makes Users Reveal Personal Information’.
About Nikolas Badminton
Nikolas Badminton is the Chief Futurist & Hope Engineer at futurist.com. He’s a world-renowned futurist keynote speaker, consultant, author, media producer, and executive advisor who has spoken to, and worked with, over 500 of the world’s most impactful organizations and governments.
Nikolas is an artificial intelligence expert and his 2026 keynote ‘The AI Leader: Create Incredible Productivity, Profit & Growth’ is the level up for the modern CEO and executive leader.
Please contact futurist speaker and consultant Nikolas Badminton to discuss your engagement.