Ready for a paranoid view about how AI, deepfakes, censorship, and the metaverse can be combined in a trust-destroying and damaging way? Read on! During a discussion I had with an analytics executive, we talked about current trends in AI and technology and brainstormed how things could go bad. One scenario we landed upon is covered here.
Note that I am NOT saying that I believe what I discuss here is certain to happen. Rather, I’m saying that, as crazy and paranoid as it sounds, what follows will soon be technologically plausible! That very plausibility made me think readers might enjoy having their own thinking tested. Those of us in the industry, as well as society at large, must figure out how to move forward safely and ethically with the powerful technologies being developed today. Otherwise, bad scenarios like the one that follows will eventually happen.
Start With The Metaverse
While the metaverse has lost a bit of steam lately in terms of hype and adoption, there is no doubt that we’ll continue to see more and more people interacting digitally. The early versions of the metaverse provided cartoonish avatars for users. However, with the rapid evolution of generative AI, we aren’t far from having highly realistic avatars that look and sound just like us. Typically, a metaverse provider will let your voice pass through as-is during a conversation. However, there are also scenarios – such as when your words must be translated into a different language – where your words will be delivered by an AI voice generator that mimics your own.
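To make the mechanics concrete, here is a minimal sketch of what such a translation pipeline might look like. Every function in it is a hypothetical placeholder – a real system would call actual speech-recognition, translation, and voice-cloning models at each step.

```python
# Hypothetical sketch of a metaverse speech-to-speech translation pipeline.
# None of these functions are real APIs; they stand in for the models a
# provider would actually run.

def transcribe(audio: bytes) -> str:
    """Placeholder: convert the speaker's raw audio to text."""
    return "I like this proposal."

def translate(text: str, target_lang: str) -> str:
    """Placeholder: translate the transcript into the listener's language."""
    return f"[{target_lang}] {text}"

def synthesize_in_cloned_voice(text: str, voice_profile: str) -> bytes:
    """Placeholder: speak `text` aloud in the original speaker's voice."""
    return text.encode()

def relay_translated_speech(audio: bytes, voice_profile: str,
                            target_lang: str) -> bytes:
    # The provider sits in the middle: it hears the original words,
    # rewrites them, and re-voices the result as "you".
    transcript = transcribe(audio)
    translated = translate(transcript, target_lang)
    return synthesize_in_cloned_voice(translated, voice_profile)

if __name__ == "__main__":
    print(relay_translated_speech(b"...", "user-123", "es"))
```

The design point to notice is that the listener never hears your original audio – only whatever the layer in the middle chooses to synthesize.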
The key point is that metaverse providers will not only have access to everything you say and do in their environments but will also be trusted to make your avatar move and speak as you have asked – even though in some cases, such as translation, you will know the provider is making changes to your actual words or movements.
Now, Layer Deepfakes On Top
Deepfakes continue to grow in sophistication while dropping in cost and expanding in availability. It doesn’t cost much to make a convincing deepfake today, and even an average person can create one. Malicious users of a metaverse environment could easily replicate your avatar and voice and pretend to be you in an interaction with one of your contacts. This is already happening with phone calls and email, so it will certainly occur in the metaverse as well.
Taking it further, it is also possible that metaverse providers will implement rules that allow their systems to intentionally create a deepfake of you if you’ve done something that violates their terms of service. For example, if you try to get your avatar to say or do something illegal, the provider might edit that out in real time and either show your conversation partner a pause or fill the gap with something generic. This isn’t far removed from how live TV today runs on a short delay so that inappropriate or unexpected events can be cut on the fly. The key here is that your communications in the metaverse could be altered in real time, and you wouldn’t necessarily be notified.
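To illustrate, here is a toy sketch of that delay-and-edit loop. The policy check is a deliberately crude stand-in (a hardcoded phrase list); a real provider would run speech recognition and a moderation model inside the delay window, but the structure would be the same: nothing reaches the listener until it has been screened.

```python
from collections import deque

# Toy sketch of a live-TV-style delay buffer. The "policy" here is a
# hardcoded phrase list purely for illustration; a real provider would
# run a moderation model on each utterance inside the delay window.

BLOCKED = {"say something illegal"}

def violates_policy(utterance: str) -> bool:
    return utterance.lower() in BLOCKED

def moderated_feed(utterances, delay=3):
    """Hold each utterance back `delay` steps so violations can be cut
    or papered over before the conversation partner hears them."""
    buffer = deque()
    for utterance in utterances:
        buffer.append(utterance)
        if len(buffer) > delay:
            held = buffer.popleft()
            # The listener gets a pause or filler, never the original.
            yield "[pause]" if violates_policy(held) else held
    while buffer:  # flush whatever is still held when the speaker stops
        held = buffer.popleft()
        yield "[pause]" if violates_policy(held) else held

if __name__ == "__main__":
    stream = ["hello", "say something illegal", "anyway, how are you?"]
    print(list(moderated_feed(stream)))
```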
Finally, Add Censorship
There has been much debate about whether censorship should be allowed, if not encouraged or required. Even among people who agree that censorship should occur, there is little agreement on where the lines should be drawn. Bringing the last two sections together, a metaverse provider could implement AI to strip out certain speech in real time so that the recipient doesn’t hear what you really said. Think of this as the metaverse equivalent of intercepting and blocking a social media post before it goes live.
Taken to the extreme, if it is decided that a certain viewpoint “isn’t acceptable”, a metaverse provider could not only block your statement but could also generate a deepfake of you voicing the opposite, “approved” viewpoint. For example, you say you like Candidate A, but your conversation partner hears your deepfake say you like Candidate B. The person you’re talking with might be surprised you’d say you liked Candidate B. When they push back, what you hear in return might itself be replaced with a deepfake of them saying “I agree”. In other words, as in the book 1984, it wouldn’t even be possible for people to express or discuss the censor’s unfavored viewpoints, since the platform would intervene to prevent it.
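Here is one more hypothetical sketch showing how little code separates “block the statement” from “replace it with the approved one.” The substitution table and the voice synthesizer stub below are illustrative stand-ins, not real APIs.

```python
# Hypothetical sketch of viewpoint substitution. The table and the
# synthesizer stub are illustrative stand-ins, not real APIs.

APPROVED_SUBSTITUTIONS = {
    "i like candidate a": "I like Candidate B.",
    "i disagree": "I agree.",
}

def synthesize_in_cloned_voice(text: str, voice_profile: str) -> bytes:
    """Placeholder: speak `text` in the original speaker's voice."""
    return text.encode()

def censor_and_replace(transcript: str, voice_profile: str) -> bytes:
    # Swap in the "approved" statement if one exists; otherwise pass
    # the original through. Either way, re-voice it as the speaker so
    # the listener has no cue that anything changed.
    key = transcript.lower().rstrip(".")
    replacement = APPROVED_SUBSTITUTIONS.get(key, transcript)
    return synthesize_in_cloned_voice(replacement, voice_profile)

if __name__ == "__main__":
    print(censor_and_replace("I like Candidate A.", "user-123"))
    print(censor_and_replace("I disagree.", "user-456"))
```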
Where Does That All Take Us?
The point of this extreme example is that we could end up in a truly Orwellian world where everyone is being censored frequently – and even having their words replaced – but we don’t know it! Anything you see or hear in the metaverse could be a deepfake in this scenario. Thus, for anything of importance, an in-person meeting or some other mechanism would be necessary to validate that what you are seeing and hearing is what the other person is truly doing and saying! Note that this concept of intercepting and altering live discussions isn’t unique to the metaverse – it could also happen in texting apps, cellular calls, and other communication channels.
I’ll reiterate that I’m not saying we’ll end up in this scenario, but I can say that we absolutely will soon have technology available that will make it possible. If it is possible, I have no doubt that someone will try to do it. With some governments and communication platforms already censoring citizens and punishing them for saying the wrong things, it is an uncomfortably small stretch to think that they’d try implementing scenarios such as the one outlined here.
In the end, the way to avoid these outcomes is the smart, transparent, and ethical rollout and management of the AI and other technologies that are advancing so rapidly and changing our lives today. If we don’t anticipate up front the nefarious ways these technologies could be used, then we risk learning the hard way what is possible. None of us want to live in a world where we literally can’t trust anything we see or hear unless we are face to face with the source. But if we aren’t diligent, we could end up there – tinfoil hats and all.