April 24, 2026

Deepfakes and voice phishing: when AI impersonates your boss to drain your accounts

In February 2024, an employee at Arup — a UK-based engineering firm with 18,000 staff — receives an email apparently sent by the company’s CFO in London. The message mentions an urgent, confidential transaction. The employee suspects a classic phishing attempt and, to be safe, requests a video call to verify.

The call takes place. The CFO is there, on screen. Several colleagues are present too — all familiar faces. The conversation flows naturally, and the instructions are clear: transfer 200 million Hong Kong dollars to five bank accounts. The employee complies. Fifteen transfers in total.

A week later, after contacting headquarters, the truth comes out: every participant on that call was a deepfake. The CFO, the colleagues — all AI-generated impersonations built from publicly available video footage. $25 million gone.

This isn’t a movie plot. It happened just two years ago.

What we’re starting to see in companies

At CreativMinds, we’ve been supporting Swiss SMEs with their cybersecurity challenges for the past seven years. Email phishing is nothing new to us — it’s part of the daily landscape. But since 2023, we’ve been seeing a shift that changes everything: artificial intelligence is making voice and video attacks alarmingly convincing.

The concept is easy to grasp, even if the technology behind it is complex. A deepfake is AI-generated audio or video that mimics a real person. The term combines “deep learning” and “fake.” Voice, face, expressions, accent — everything can be replicated from existing recordings. A YouTube interview, a webinar, a few minutes of conference footage can be enough to train a model.

So how does it actually work? Many of these systems rely on what are known as generative adversarial networks (GANs). Two algorithms compete against each other: one generates content, the other tries to detect whether it’s fake. They continuously improve in response to one another until the output becomes nearly indistinguishable from reality. What was still academic research five years ago is now accessible to anyone with an internet connection.
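For readers who like to see things in code, here is a deliberately tiny sketch of that loop in PyTorch (our own illustrative example, not anyone’s production system). The generator only learns to mimic a simple one-dimensional distribution rather than a voice or a face, but the adversarial feedback between the two networks is the same principle:

```python
# Toy GAN: a generator learns to mimic a 1-D Gaussian "real" distribution
# while a discriminator learns to tell real samples from generated ones.
import torch
import torch.nn as nn

latent_dim = 8

# Generator: random noise in, fake sample out.
generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 1))
# Discriminator: sample in, probability that it is real out.
discriminator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0      # "real" data: N(3, 0.5)
    fake = generator(torch.randn(64, latent_dim))

    # 1) Train the discriminator to separate real from fake.
    opt_d.zero_grad()
    d_loss = (loss_fn(discriminator(real), torch.ones(64, 1))
              + loss_fn(discriminator(fake.detach()), torch.zeros(64, 1)))
    d_loss.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator.
    opt_g.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()

# After training, generator(noise) produces samples close to N(3, 0.5).
```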

And voice phishing — or “vishing” — has been completely transformed as a result. In the past, attackers needed someone who could convincingly imitate a voice. Today, a few audio samples and an online tool are enough. Some solutions can clone a voice with less than a minute of recording.

The numbers are staggering: according to several studies, deepfake fraud attempts increased by 3,000% in 2023. Voice deepfakes alone? Up 680% over the same period. And the average cost of a successful attack exceeds $500,000. Deloitte estimates that losses linked to generative AI fraud could reach $40 billion in the United States by 2027.

Three cases that highlight the scale of the problem

The Arup case (2024) — $25 million

As mentioned in the introduction, what stands out here is that the employee did exactly what’s typically recommended: verify via video call. Except the video call itself was compromised. The attackers had downloaded publicly available videos of the individuals involved and used AI to recreate their voices and faces in real time.

When attackers can simulate an entire meeting room, traditional reflexes are no longer enough.

The Ferrari case (2024) — attempted attack, successfully stopped

A Ferrari executive receives WhatsApp messages from CEO Benedetto Vigna. The profile picture checks out — the CEO in front of the Ferrari logo, in a suit, arms crossed. Urgent tone, confidential acquisition story. “Be ready to sign the NDA our lawyer will send you. The Italian regulator and the Milan Stock Exchange have already been informed. Absolute discretion.”

Then a call comes in: the voice perfectly mimics Vigna’s distinctive southern Italian accent. The attacker explains he’s using a different number for confidentiality reasons and asks for a foreign exchange hedging operation to be executed.

But something feels off — slightly mechanical intonations, almost imperceptible. The executive asks a simple question: “What book did you recommend to me last week?” The attacker hangs up immediately.

A personal question. That’s all it took to avoid a disaster.

The UK energy company case (2019) — $243,000 lost

This is one of the first documented cases of deepfake voice fraud. The director of a UK subsidiary receives a call from the CEO of its German parent company. The voice is flawless — tone, accent, speech patterns. The instruction: transfer $243,000 to a Hungarian supplier, urgently, with immediate reimbursement to follow.

The director complies. A second call comes in, requesting another transfer. This time, something raises suspicion: the call is coming from an Austrian number, and the first reimbursement never arrived. Too late for the initial $243,000 — the money had already been routed through Hungary to Mexico before disappearing.

That was in 2019. Seven years ago. The technology has advanced significantly since then.

Why SMEs are affected

You might think: “Ferrari, Arup — those are large multinationals. We’re a 50-person SME in French-speaking Switzerland. Why would anyone target us?”

That mindset is exactly what creates vulnerability.

First, the tools have become widely accessible. Creating a voice deepfake now costs a few dollars and takes only minutes. Attackers no longer need to go after big targets to turn a profit.

Second, SMEs often have less formalized validation processes than large organizations. An urgent transfer request from the CEO? In many companies, it goes through without a second check. Relationships are more direct, trust is more immediate — which is a strength day to day, but a weakness when facing this type of attack.

Finally — and this is critical — SMEs have far less room to absorb losses of 50,000, 100,000, or 200,000 francs. What would be an embarrassing incident for Ferrari could seriously threaten a family-owned business. And unlike large corporations, SMEs typically don’t have dedicated legal teams or advanced cyber insurance to handle the aftermath.

There’s also the reputational impact. In a local environment where word of mouth matters, this kind of fraud can cause damage far beyond the immediate financial loss.

How to protect yourself in practice

The good news: effective countermeasures exist — and they don’t require massive investment. Most of them rely on human processes, not sophisticated technology.

Set up a verbal password

The idea comes straight from the Ferrari case. Agree with your key team members on a word or question that only you know. Something that doesn’t exist anywhere online: a shared memory, an inside joke, the name of a former colleague, a detail from a recent conversation.

If there’s any doubt during a call, ask the question. A deepfake can’t answer what it has never learned. It’s simple — almost low-tech — and that’s exactly why it works.

Enforce a systematic call-back policy

For any urgent financial request received by phone or video call, require a call-back using the person’s usual number — not the number displayed on the incoming call, but the one saved in your professional contacts.

Is it inconvenient? Yes. Does it slow things down? A bit. But it’s exactly what would have prevented the $25 million loss at Arup. The time spent verifying is negligible compared to the time spent dealing with fraud.
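The rule is simple enough to express as code. Here is a minimal sketch (contact names and numbers are hypothetical placeholders) whose only real point is that the number you dial back comes from your own directory, never from the incoming call:

```python
# Minimal call-back rule: always dial the number on file, never the one displayed.
TRUSTED_CONTACTS = {
    "cfo": "+41 21 XXX XX XX",   # numbers recorded in advance, by you
    "ceo": "+41 22 XXX XX XX",
}

def callback_number(role: str, displayed_number: str) -> str:
    """Return the number to dial back for verification."""
    on_file = TRUSTED_CONTACTS.get(role)
    if on_file is None:
        raise ValueError(f"No trusted number on file for '{role}': escalate.")
    if displayed_number != on_file:
        print("Warning: the incoming number does not match the one on file.")
    return on_file   # the displayed number is never trusted

# Urgent transfer request "from the CFO"? Dial this before doing anything:
print(callback_number("cfo", "+41 79 XXX XX XX"))
```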

Train teams to recognize warning signs

A few signs can give away a voice deepfake:

  • Unusual micro-pauses in the conversation, as if there’s slight latency
  • A speech pattern that feels slightly off or too consistent
  • Audio quality that fluctuates in strange ways
  • An inability to answer personal or context-specific questions
  • Excessive pressure around urgency and confidentiality — the classic levers of social engineering

These signals are subtle. But once you know they exist, they become easier to spot. The goal isn’t to turn everyone into detection experts, but to build a reflex: when something feels “not quite right,” take the time to verify.
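To make the first of these signals concrete, here is a rough sketch that measures the silent gaps in a recording. It assumes the open-source librosa audio library and a hypothetical file named meeting.wav, and it is emphatically not a deepfake detector: just a way of putting numbers on what latency-like micro-pauses look like.

```python
# Measure the silent gaps between speech segments in an audio clip.
# Thresholds are arbitrary; treat any result as one weak hint among many.
import librosa
import numpy as np

y, sr = librosa.load("meeting.wav", sr=16000)

# Non-silent intervals (start, end) in samples; "silence" is below 30 dB.
intervals = librosa.effects.split(y, top_db=30)

# Pauses between consecutive speech segments, in seconds.
gaps = [(intervals[i + 1][0] - intervals[i][1]) / sr
        for i in range(len(intervals) - 1)]

if gaps:
    print(f"pauses: n={len(gaps)}, mean={np.mean(gaps):.2f}s, std={np.std(gaps):.2f}s")
    # Natural speech varies; a run of near-identical short gaps is unusual,
    # but it is never proof on its own.
```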

Separate validation channels

If a request comes via email, confirm it by phone. If it comes by phone, confirm it by email or in person. The idea is simple: never approve a sensitive action through a single channel — especially if that channel could be compromised.

This rule has long existed in banking procedures. It’s now essential for any organization handling funds or sensitive data.
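In code terms, this is just a two-of-N check. A minimal sketch, with hypothetical channel names:

```python
# A sensitive request is approved only once it has been confirmed
# through at least two different, independent channels.
APPROVED_CHANNELS = {"email", "phone", "in_person", "sms"}

def is_validated(confirmations: set[str]) -> bool:
    """True only if the request was confirmed on two distinct channels."""
    return len(confirmations & APPROVED_CHANNELS) >= 2

# A transfer requested by email, then confirmed by a call-back:
print(is_validated({"email"}))            # False: single channel
print(is_validated({"email", "phone"}))   # True: two independent channels
```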

The legal framework — still unclear

A quick note on the legal side, since we’re often asked: what does the law say about deepfakes?

The honest answer: not much that’s specific — at least for now. In Europe, the AI Act introduces transparency requirements for AI-generated content, but its obligations are only coming into force gradually. In the United States, some state laws require AI-generated content to be labeled. But there’s no global harmonization, and in most jurisdictions, using a deepfake for fraud falls under existing fraud statutes — not dedicated legislation.

In practice, this means protection has to come from within. Waiting for regulation to catch up with the technology leaves you exposed for years.

What this means for tomorrow

Let’s be honest: deepfake technology will keep improving. What currently requires under a minute of audio to clone a voice will soon take just a few seconds. Quality will keep getting better, and automated detection will always lag behind the latest generation techniques.

That doesn’t mean we’re powerless. It means we need to integrate this reality into the way we work.

Authentication is becoming just as critical for humans as it is for IT systems. Default trust — “it sounds like my boss, so it must be them” — is no longer something we can afford.

And paradoxically, this may be an opportunity to return to verification practices we’ve abandoned for the sake of convenience. Calling someone back to confirm a transfer, asking a personal question before taking action, taking an extra thirty seconds before making a decision — that’s not paranoia, it’s professional hygiene.

Key takeaways

Deepfakes and AI-powered voice phishing are no longer theoretical threats. Some companies are losing millions, while others avoid disaster thanks to simple reflexes.

Protection relies less on technology than on processes: verbal passwords, systematic call-backs, multi-channel validation, and team training.

And the best defense may be the oldest one: when something feels urgent and unusual, that’s precisely when you should slow down.

Practical checklist

Urgent financial request by phone or video? → Call back using the usual number before taking action

Doubt about someone’s identity? → Ask a personal question only they would know

Pressure around confidentiality and urgency? → That’s a red flag, not a reason to rush

Email + call about the same sensitive topic? → Confirm through a third channel (SMS, in person)

Something sounds “off” in the voice? → Trust your instinct and verify
