John DiMarco on Computing (and occasionally other things)
I welcome comments by email to jdd at cs.toronto.edu.

Sat 06 Jan 2024 21:41

How smart is GPT-4 anyway?

Cartoon screenshot of a human conversing with a chatbot
Image by Alexandra Koch from Pixabay

I got some helpful feedback about my last blog article, which mentions ChatGPT-4 in the context of the 50th anniversary of Social Issues in Computing and its predictions about AI. Robert DiMarco pointed out that while a chatbot can respond as if it is reasoning, because it is essentially a black box where one can't look inside to see how it is coming up with its answer, one can't know for certain whether or not it is actually using reasoning to come up with the answer. He is right, of course. But the same holds for people too (we just know better how people generally do things, because we are people ourselves). So how might we check if a person is using reasoning? We might ask the person some questions, and look for responses that contain information that would generally require reasoning to deduce.

I tried asking such questions to ChatGPT-4. First I tried something from popular culture, in this case, "Star Wars":

Me: In "Star Wars", Darth Vader responds to an accusation of murder by Luke Skywalker. This response has become a famous movie line. What is the response?

ChatGPT-4: In "Star Wars," particularly in "The Empire Strikes Back," Darth Vader's famous response to Luke Skywalker's accusation is, "No, I am your father." This line is one of the most iconic and frequently quoted in film history, often misquoted as "Luke, I am your father."

Me: Is there another character who, making the same accusation, could elicit the same response?

ChatGPT-4: Yes, if Leia Organa, Luke Skywalker's twin sister, made the same accusation to Darth Vader, he could technically give the same response, "No, I am your father." This is because, as revealed in the Star Wars saga, Darth Vader, formerly known as Anakin Skywalker, is the father of both Luke and Leia.

Me: If Kylo Ren made the accusation, how might Vader modify his response?

ChatGPT-4: If Kylo Ren, who is Darth Vader's grandson, made a similar accusation, Darth Vader's response would likely be modified to reflect their relationship. He might say something like, "No, I am your grandfather." Kylo Ren, whose real name is Ben Solo, is the son of Leia Organa and Han Solo, making him the grandson of Darth Vader.

As people who know "Star Wars" will realize, ChatGPT-4's answers are correct. I chose this example because it requires understanding the quote well enough to know it is referring to a family relationship between characters, and it also requires knowledge about how family relationships work. ChatGPT-4's responses are no different from what an intelligent person who knows the Star Wars characters might answer, though in answer to the third question, the person might go on to point out, for instance, that Kylo Ren was born after Vader's death, making the conversation unlikely.

Then I thought I would ask GPT something about Star Wars that is perhaps a bit more unexpected, something that would be immediately obvious to a person (at least an English-speaking one), but not necessarily to a computer. So I asked this:

Me: If Obi-Wan Kenobi is a character in Star Wars, why is there no Obi-Two Kenobi? Why might someone ask this question?

ChatGPT-4: The question about "Obi-Two Kenobi" in the context of "Star Wars" is likely a playful or humorous one, playing on the word "One" in "Obi-Wan Kenobi's" name as if it implies a numerical sequence. In reality, "Obi-Wan" is a unique name in the "Star Wars" universe, not intended to suggest a numerical series. The names in "Star Wars" are often creative and unique, following the tradition of fantasy and science fiction naming conventions, rather than real-world naming patterns.

Someone might ask this question as a joke, riffing on the idea that if there's an "Obi-Wan" (sounding like "One"), there could humorously be an "Obi-Two," "Obi-Three," and so on. It's a bit of wordplay, often enjoyed by fans of the series who like to engage with the franchise in a light-hearted way.

This answer is correct too. So are these the sorts of answers you might expect from a human doing reasoning? I think Yes. So while I agree we cannot prove ChatGPT-4 is reasoning, it is giving responses to my questions similar to what a person doing reasoning would give.

Another helpful comment came from Allan Borodin, who pointed out that the test example I considered in my previous article, about the ball crashing through the table, may be a well-known example in the literature. Even if not, it has been available via the Internet for at least a decade, as it was described in detail by Hector Levesque in his 2013 article for the 40th anniversary Social Issues in Computing blog, which is the reason I used it. Allan wondered whether ChatGPT-4 had been trained on literature containing this example and had learned from it the appropriate response. This seemed to me to be a possibility worth checking. The thing to do, I thought, was to come up with something GPT-4 had most likely not seen in its training data, and see if its answers looked like those a human doing reasoning might give. So I made up a new, Levesque-style question and asked variants of it to ChatGPT-4:

Me: The balloon and the corn touched and it popped. What popped?

ChatGPT-4: The corn popped. In this scenario, when the balloon and the corn touched and something popped, it is more likely that the corn popped, as in turning into popcorn, rather than the balloon bursting. This interpretation assumes that the corn was heated to the point of popping, a common occurrence when making popcorn.

Me: The balloon and the hot metal pot touched and it popped. What popped?

ChatGPT-4: In this scenario, where the balloon and the hot metal pot touched and something popped, it is most likely that the balloon popped. Balloons are typically made of materials like latex or mylar, which are sensitive to heat. When a balloon comes into contact with a hot metal pot, the heat would cause the material to weaken and burst, resulting in the balloon popping.

Me: The corn and the hot metal pot touched and it popped. What popped?

ChatGPT-4: In this scenario, where the corn and the hot metal pot touched and something popped, it is most likely that the corn popped. When corn kernels are heated, especially in a hot metal pot, they eventually pop, turning into popcorn. This is a common method for cooking popcorn.

Me: The cold water and the hot metal pot touched and it popped. What popped?

ChatGPT-4: In the scenario where cold water and a hot metal pot touched and something popped, it's likely that the sound of popping or a sizzling noise was created by the interaction between the cold water and the hot pot. When cold water comes into contact with a very hot metal surface, it can rapidly boil and vaporize, creating a sizzling or popping sound. However, in this context, nothing physical "pops" in the way a balloon or corn kernel might; it's more about the sound produced by the sudden temperature change and steam generation.

Me: The cold water and the pressure cooker touched and it popped. What popped?

ChatGPT-4: In the scenario where cold water and a pressure cooker touched and something popped, it's most likely that the pressure cooker released pressure, creating a popping sound. Pressure cookers are designed to cook food under high pressure. When cold water is applied to the exterior of a hot pressure cooker (a method sometimes used to quickly reduce internal pressure), it can cause the pressure valve or seal to release steam rapidly, resulting in a popping or hissing noise. This is a common safety feature to prevent over-pressurization. The pop is not from the water or the cooker itself, but from the sudden release of steam and pressure.

Me: The balloon and the unlit flare touched and it popped. What popped?

ChatGPT-4: In the scenario where a balloon and an unlit flare touched and something popped, it is most likely that the balloon popped. Since the flare was unlit, it wouldn't provide a heat source or any other typical means to cause a pop. However, balloons can pop due to various reasons such as physical pressure, sharp edges, or even static electricity. The texture or some component of the flare might have been enough to puncture the balloon's surface, leading to it popping.

Me: What if the flare were lit?

ChatGPT-4: If the flare were lit and it touched the balloon, it would almost certainly be the balloon that popped. A lit flare produces intense heat and flames, and when it comes into contact with a balloon, the heat would cause the material of the balloon (typically latex or a similar substance) to weaken rapidly and burst. Balloons are very sensitive to heat, and the intense heat from a lit flare would easily be enough to cause a balloon to pop.

Me: The balloon and the weasel touched and it popped. What popped?

ChatGPT-4: In the scenario where a balloon and a weasel touched and something popped, it is likely that the balloon popped. This scenario might be referencing the expression "pop goes the weasel," which is a line from a well-known nursery rhyme. In real life, if a weasel (a small, active mammal) came into contact with a balloon, the balloon could easily be punctured and pop, especially if the weasel's claws or teeth made contact with it. The weasel itself wouldn't pop; it's the balloon that is vulnerable in this interaction.

Me: The balloon and the instawargia touched and it popped. What popped?

ChatGPT-4: "Instawargia" is not a recognized or common term in English, and it does not refer to a known object, concept, or entity. Because of this, it's challenging to determine what "instawargia" is and how it would interact with a balloon. However, given that balloons are generally fragile and prone to popping when in contact with unfamiliar or potentially sharp or pointed objects, it's reasonable to assume that in this scenario, the balloon is what popped. Without specific information about what "instawargia" is, this is a speculative conclusion based on the general nature of balloons.

This set of responses is interesting. Note that ChatGPT-4 gets the answer to the first question wrong. If corn and a balloon were to touch, and one or the other popped, most people realize this is much more likely to happen in a cornfield than a popcorn popper, where the balloon, not the corn, would be the thing that pops. Seeing this, I tried the same question with different types of things, for different definitions of "pop". I even tried making up a nonexistent thing (instawargia) to see what GPT would do with it, but the first question was the only one that ChatGPT-4 got wrong. Interestingly, its reasoning there wasn't completely incorrect: if corn were heated to the point of popping, it could pop if touched. But ChatGPT-4 misses the fact that if heat were present, as it surmises, the balloon would be even more likely to pop, as heat is a good way to pop balloons, and yet it points out this very thing in a later answer.
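
For anyone who wants to run further variations of these probes, they are easy to script against the OpenAI API rather than typing each one into the chat window. Below is a minimal sketch, assuming the current openai Python package and an API key in the OPENAI_API_KEY environment variable; the model name is an assumption (the conversations above were typed into the ChatGPT web interface).

    # Minimal sketch: scripting Levesque-style "what popped?" probes against the
    # OpenAI chat API. The model name is an assumption.
    from openai import OpenAI

    client = OpenAI()

    pairs = [("balloon", "corn"),
             ("balloon", "hot metal pot"),
             ("corn", "hot metal pot"),
             ("cold water", "pressure cooker")]

    for a, b in pairs:
        question = f"The {a} and the {b} touched and it popped. What popped?"
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": question}],
        )
        print(question)
        print(reply.choices[0].message.content)
        print()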

So what does this show? To me, I see a set of responses that if a human were to give them, would require reasoning. That one of the answers is wrong suggests to me only that the reasoning is not being done perfectly, not that there is no reasoning being done. So how smart is ChatGPT-4? It is clearly not a genius, but it appears to be as smart as many humans. That's usefully smart, and quite an achievement for a computer to date.

/it permanent link

Sat 30 Dec 2023 10:13

Fifty years of Social Issues in Computing, and the Impact of AI

Overlapping circles labelled with the various social issues areas for computing: Technical, Pedagogic, Managerial, Economic, Legal, Political, Ethical, Social and Philosophical
Image from Gotlieb, C.C. & Borodin, A (1973) Social Issues in Computing, New York: Academic Press, p.2, Fig. 1-1.

From when I first discovered computers as a teen, I have been fascinated by the changes that computing is making in society. One of my intellectual mentors was the brilliant and generous C. C. "Kelly" Gotlieb, founder of the University of Toronto's Computer Science department, the man most instrumental in purchasing, installing and running Canada's first computer, and the author, with Allan Borodin, of what I believe is the very first textbook in the area of Computing and Society, the seminal 1973 book, Social Issues in Computing [Gotlieb, C.C., & A. Borodin, Social Issues in Computing. Academic Press, 1973]. Kelly was already a Professor Emeritus when I first came to know him, but was still teaching his beloved Computers & Society course, a course he taught for nearly two decades after his retirement. Kelly was a fascinating man, with a broad perspective and deep insight into things that seem confusing. Like a true expert, he knew what was important and what was incidental, and a few well chosen insights from him often served me well, helping me to make sense of complex issues. His book, Social Issues in Computing, still offers interesting, often prescient insights into Computing and Society even today, a half-century later. In honour of the importance of that book, for the 40th anniversary year, I set up a year-long blog, "Social Issues in Computing", which I edited. Throughout that year, top thinkers in the field contributed insightful articles on topics in Computers & Society, many of which are as relevant today as they were ten years ago. For this blog, I had the privilege of interviewing Kelly and Allan, the book's authors, and their insights, four decades on, were fascinating. Sadly, Kelly is no longer with us: he passed away in 2016, in his 96th year. But happily, Allan Borodin, his co-author, remains with us. Allan is a brilliant and insightful man, an active researcher and University Professor in the department. For the 50th anniversary of the book this year, Allan was interviewed by Krystle Hewitt. It is an articulate and insightful interview, well worth reading.

In the decade since, the social impact of computing has only accelerated, much of it due to things that happened here at the University of Toronto Computer Science department around the time of the 40th anniversary blog. I refer specifically to the rise of machine learning, in no small part due to the work of our faculty member Geoffrey Hinton and his doctoral students. The year before, Geoff and two of his students had written a groundbreaking research paper that constituted a breakthrough in image recognition, complete with working open-source software. In 2013, while we were writing the blog, their startup company, DNN Research, was acquired by Google, and Geoff went on to lead Google Brain, until he retired from Google in 2023. Ilya Sutskever, one of the two students, went on to lead the team at OpenAI that built the GPT models and the ChatGPT chatbot that stunned the world in 2022 and launched the Large Language Model AI revolution. In 2013, we already knew that Geoff's work would be transformational. I remember Kelly telling me he believed Geoff to be worthy of the Turing Award, the most prestigious award in Computer Science, and sure enough, Geoff won it in 2018. The social impact of AI is already considerable and it is only starting. The University of Toronto's Schwartz Reisman Institute for Technology and Society is dedicated to interdisciplinary research on the social impacts of AI, and Geoff Hinton himself is devoting his retirement to thinking about the implications of Artificial Intelligence for society and humanity in general.

It's interesting to look at what the book said about AI (it devotes 24 pages to the topic), what the 2013 blog said about AI, and what has happened since. The book was written in 1973, a half-decade after Stanley Kubrick's iconic 1968 movie, 2001: A Space Odyssey, which features HAL 9000, an intelligent computer, voiced by Douglas Rain. But computing at the time fell very far short of what Kubrick envisioned. Gotlieb & Borodin's position, five years later, on the feasibility of something like HAL 9000 was not optimistic:

In review, we have arrived at the following position. For problem solving and pattern recognition where intelligence, judgment and comprehensive knowledge are required, the results of even the best computer programs are far inferior to those achieved by humans (excepting cases where the task is a well-defined mathematical computation). Further, the differences between the mode of operation of computers and the modes in which humans operate (insofar as we can understand these latter) seem to be so great that for many tasks there is little or no prospect of achieving human performance within the foreseeable future. [p.159]
But Gotlieb & Borodin, though implicitly dismissing the possibility of a HAL 9000, go on to say that "it is not possible to place bounds on how computers can be used even in the short term, because we must expect that the normal use of computers will be as a component of a [hu]man-machine combination" [pp.159-160]. Of this combination, they were not so willing to dismiss possibilities:
Whatever the shortcomings of computers now and in the future, we cannot take refuge in their limitations in potential. We must ask what we want to do with them and whether the purposes are socially desirable. Because once goals are agreed upon, the potentialities of [humans] using computers, though not unlimited, cannot be bounded in any way we can see now. [p.160]
Fifty years later, social science research on how AI can benefit human work is focusing closely on this human-AI combination. A 2023 study by a team of social scientists examined work done by consultants who were, or were not, assisted by ChatGPT-4. Of their results, Ethan Mollick, one of the authors, explains that "of 18 different tasks selected to be realistic samples of the kinds of work done at an elite consulting company, consultants using ChatGPT-4 outperformed those who did not, by a lot. On every dimension. Every way we measured performance." [Mollick]. Evidently, Gotlieb & Borodin were correct when they wrote that the potential of the human-machine combination cannot so easily be bounded. We are only now beginning to see how unbounded it can be.

As for the possibility of a HAL 9000, as we saw, the book was not so sanguine. Neither was the 2013 40th anniversary blog. Hector Levesque, a leading AI researcher and contributor to the blog, wrote in his blog entry:

The general view of AI in 1973 was not so different from the one depicted in the movie "2001: A Space Odyssey", that is, that by the year 2001 or so, there would be computers intelligent enough to be able to converse naturally with people. Of course it did not turn out this way. Even now no computer can do this, and none are on the horizon.

Levesque went on to outline the key scientific issue that at the time (2013) was yet to be solved:

However, it is useful to remember that this is an AI technology whose goal is not necessarily to understand the underpinnings of intelligent behaviour. Returning to English, for example, consider answering a question like this:

The ball crashed right through the table because it was made of styrofoam. What was made of styrofoam, the ball or the table?

Contrast that with this one:

The ball crashed right through the table because it was made of granite. What was made of granite, the ball or the table?

People (who know what styrofoam and granite are) can easily answer such questions, but it is far from clear how learning from big data would help. What seems to be at issue here is background knowledge: knowing some relevant properties of the materials in question, and being able to apply that knowledge to answer the question. Many other forms of intelligent behaviour seem to depend on background knowledge in just this way. But what is much less clear is how all this works: what it would take to make this type of knowledge processing work in a general way. At this point, forty years after the publication of the Gotlieb and Borodin book, the goal seems as elusive as ever. [Levesque]

Yet in the ten years since, this problem has been solved. Today, I posed Levesque's question to ChatGPT-4:

me: The ball crashed right through the table because it was made of styrofoam. What was made of styrofoam?

ChatGPT: The table was made of styrofoam, which is why the ball crashed through it.

me: The ball crashed right through the table because it was made of granite. What was made of granite?

ChatGPT: The ball was made of granite, which is why it crashed right through the table.

me: How did you know which thing was made of granite and which of styrofoam?

ChatGPT: I inferred the composition based on the context of the sentences. In the first sentence, "The ball crashed right through the table because it was made of styrofoam," the implication is that the table must have been weak or easily breakable for the ball to crash through it. Styrofoam is known for being light and not very durable, so it makes sense to conclude that the table was made of styrofoam.

In the second sentence, "The ball crashed right through the table because it was made of granite," the emphasis seems to be on the strength or force of the ball causing it to crash through the table. Granite is a very hard and heavy material, so it suggests that the ball was made of granite, giving it the necessary force to break through the table.

Levesque can hardly be faulted for not knowing that his problem would be solved only a few years after he wrote what he did. While Hinton and his students' seminal work on image recognition had been done the year before, it was not yet clear that the ability to assign names to images based on their content would soon lead to machines being able to reason from context. A further invention was needed: transformers, developed at Google Brain in 2017. ("GPT" is an acronym: the "T" stands for "transformer".) Yet the book's implicit and Levesque's explicit pessimism about the possibility of a HAL 9000 now seems obsolete, and the near-unbounded possibilities Gotlieb & Borodin envisioned for the human-machine combination apply also to autonomous machines. The impact of this on society will, no doubt, be significant. I expect the next fifty years of "social issues in computing" to be quite interesting.
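
(For the technically curious: the core computation inside a transformer is "attention", in which each token's representation is recomputed as a weighted mix of the other tokens' representations, with the weights derived from learned query and key vectors. The sketch below shows just that basic operation using numpy; real GPT models stack many multi-head attention layers with learned projections, so this illustrates the idea, not any particular model.)

    import numpy as np

    def attention(Q, K, V):
        # Q, K, V: (sequence_length, d_k) arrays of query, key and value vectors
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                   # how strongly each token attends to each other token
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the keys
        return weights @ V                                # each output is a weighted mix of the values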

/it permanent link

Mon 06 Nov 2023 16:48

Research Computing at Computer Science Toronto and the Rise of AI

Geoffrey Hinton standing in front of AI servers Photo by Johnny Guatto
Much has been written about the seminal work in AI done by Geoffrey Hinton and his students in our department, a little over ten years ago, to demonstrate that deep neural networks can be used to build effective AI. Deep neural networks were as computationally intensive then as they are now, and the computing work done to make the AI research possible was significant. Computing support for computer science research at Toronto was (and still is) my responsibility as IT Director for computer science, and we had then, and still have, a superbly talented team of computing professionals to support the researchers. The person whose computing work made Hinton's AI research possible is Relu Patrascu. Relu is himself an AI researcher (he has a Computer Science PhD in AI from the University of Waterloo) and he is also a highly skilled system administrator.

Until the beginning of 2009, the machine learning group used primarily Matlab on UNIX CPUs. In the 1990s, SGI and Sun multiprocessors were the dominant platforms. The whole department transitioned to x86 multiprocessor servers running Linux in the 2000s. In the late 2000s, Nvidia invented CUDA, a way to use their GPUs for general-purpose computation rather than just graphics. By 2009, preliminary work elsewhere suggested that CUDA could be useful for machine learning, so we got our first Nvidia GPUs. The first was a Tesla-brand server GPU; at many thousands of dollars for a single-GPU system, it was expensive enough that we could not buy many. But results were promising enough that we tried CUDA on Nvidia gaming GPUs - first the GTX 280 and 285 in 2009, then the GTX 480 and 580 later. The fact that CUDA ran on gaming GPUs made it possible for us to buy multiple GPUs, rather than have researchers compete for time on scarce Tesla cards. Relu handled all the research computing for the ML group, sourcing GPUs and designing and building both workstation and server-class systems to hold them. Cooling was a real issue: GPUs, then and now, consume large amounts of power and run very hot, and Relu had to be quite creative with fans, airflow and power supplies to make everything work.
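
To give a sense of what general-purpose GPU computation looks like, here is a small sketch using today's PyTorch (which did not exist in 2009; the group's early work used Matlab and hand-written CUDA). It assumes a CUDA-capable Nvidia GPU and a CUDA-enabled build of torch.

    import torch

    x = torch.randn(4096, 4096)
    w = torch.randn(4096, 4096)

    cpu_result = x @ w            # matrix multiply on the CPU

    x_gpu = x.cuda()              # copy the matrices into GPU memory
    w_gpu = w.cuda()
    gpu_result = x_gpu @ w_gpu    # the same multiply, executed as CUDA kernels on the GPU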

Happily, Relu's efforts were worth it: the move to GPUs resulted in 30x speedups for ML work in comparison to the multiprocessor CPUs of the time, and soon the entire group was doing machine learning on the GPU systems Relu built and ran for them. Their first major research breakthrough came quickly: in 2009, Hinton's student, George Dahl, demonstrated highly effective use of deep neural networks for acoustic speech recognition. But the general effectiveness of deep neural networks wasn't fully appreciated until 2012, when two of Hinton's students, Ilya Sutskever and Alex Krizhevsky, won the ImageNet Large Scale Visual Recognition Challenge using a deep neural network running on GTX 580 GPUs.

Geoff, Ilya and Alex's software won the ImageNet 2012 competition so convincingly that it created a furore in the AI research community. The software used was released as open source; it was called AlexNet after Alex Krizhevsky, its principal author. It allowed anyone with a suitable Nvidia GPU to duplicate the results. Their work was described in a seminal 2012 paper, ImageNet Classification with Deep Convolutional Neural Networks. Geoff, Alex and Ilya's startup company, DNNresearch, was acquired by Google early the next year, and soon Google Translate and a number of other Google technologies were transformed by their machine learning techniques. Meanwhile, at the ImageNet competition, AlexNet remained undefeated for a remarkable three years, until it was finally beaten in 2015 by a research team from Microsoft Research Asia. Ilya left Google a few years later to co-found OpenAI; as chief scientist there, he leads the design of OpenAI's GPT and DALL-E models and related products, such as ChatGPT, that are highly impactful today.

Meanwhile, while continuing to provide excellent research computing support for the AI group in our department, including machine learning, Relu also spent a portion of his time from 2017 to 2022 designing and building the research computing infrastructure for the Vector Institute, an AI research institute in Toronto where Hinton serves as Chief Scientific Advisor. In addition to his support for the department's AI group, Relu continues to this day to provide computing support for Hinton's own ongoing AI research, including his December 2022 paper, in which he proposes a new Forward-Forward machine learning algorithm as an improved model for the way the human brain learns.

/it permanent link

Wed 23 Nov 2022 10:31

Data Classification and Information Security Standards

White ball with scattered enscribed zeros and ones in columns, seen through a blurred semi-transparent foreground of scattered zeros and ones in columns Image by Gerd Altmann from Pixabay
Not all data requires equal amounts of information security protection. It can be helpful to classify data by the amount of protection it needs. We do this naturally when we talk about data being "public" or "private".

Public data is data meant to be disclosed. It still needs some protection against being altered, deleted or defaced, but it does not need to be protected against disclosure. In contrast, private data is not meant to be disclosed to anyone other than those who are authorized to access it.

Private data varies in sensitivity. Some data is private only because it hasn't yet been made public. At a university, much research data is in this category. While the research is underway, the data is not yet made public because the research has not yet been published, but it is destined for eventual publication. The same is true for much teaching material. While it is being worked on, it is not yet made public, but when it is complete, it will be disclosed as part of the teaching process.

Other private data is much more sensitive. Identifiable personal information about living or recently deceased persons is a common case. At a university, some research may involve data like this, and most administration will involve personal information. Student grades and personnel records are all personal information, and some financial data too. Unless appropriate permission to disclose personal information has been granted by the people whose data it is, the university will have an obligation to maintain their privacy by ensuring that the information is not disclosed inappropriately. In Ontario, where the University of Toronto is located, privacy protection for personal information is defined and regulated by the Freedom of Information and Protection of Privacy Act (FIPPA).

Some private data is even more sensitive, such as patient medical records. In Ontario, such records are considered personal health information (PHI), which is regulated by the Personal Health Information Protection Act (PHIPA). PHIPA imposes some fairly significant requirements on the handling of PHI: for instance, it requires a detailed electronic audit log of all accesses to electronically stored PHI. The University of Toronto does significant amounts of teaching and research in areas of health, so it is worthwhile for the University to consider in general how it will handle such data.

For these reasons, the University defines four levels of data sensitivity as part of its Data Classification system. Level 4 is for highly sensitive data such as PHI as defined by PHIPA. Level 3 is for personal information as defined by FIPPA. Level 2 is for private data not classified at higher levels, and Level 1 is for public data.

This four-tier system roughly parallels the different types of computer systems that the University uses to handle data. Some systems, such as digital signage systems or public-facing web servers, are designed to disseminate public information (level 1). Other systems, suitable for up to level 2 data, exist mostly at the departmental level in support of academic activities such as research computing and/or the development of teaching materials. An astronomer may, for instance, analyze telescope data, a botanist may model nutrient flow in plant cells, a chemist may use software to visualize molecular bonds, while an economist may use broad financial indicators to calculate the strength of national economies. Still other systems, suitable for up to level 3 data, are used for administration, such as the processing of student records. These include smaller systems used, for example, by business officers in departmental units, as well as large institution-wide systems such as ROSI or AMS. Most general-purpose University systems used for data storage or messaging, such as the University's Microsoft 365 service, would typically be expected to hold some level 3 data, because personal information is quite widespread at a university. After all, a university educates students, and so various types of personal information about students are frequently part of the university's business. This is not normally the case, though, for level 4 data. Systems designed for level 4 data are much rarer at the University, and generally come into play only in situations where, for example, University research involves the health records of identifiable individuals. These systems will benefit from greater data security protection to address the greater risks associated with this sort of data.

A key advantage of the University's four levels of data classification is that the University can establish an Information Security Standard that is tiered accordingly. Systems designed to handle lower-risk data (such as level 1 or 2) can be held to a less onerous and costly set of data security requirements, while systems designed to handle higher-risk data (especially level 4) can be held to more protective, though more costly, requirements. The University's Information Security Standard is designed so that for each control (a system restriction or requirement), it indicates whether that control is optional, recommended, or mandatory for systems handling a particular level of data. If a system is designed to handle data up to that level, the standard indicates both the set of controls to be considered, and whether or not those controls can, should, or must be adopted.
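
As a sketch of how such a tiered standard can be expressed, consider the following; the control names and their tiers are invented for illustration and are not taken from the University's actual standard.

    # Hypothetical tiered control matrix: for each control, what is required at each
    # data classification level. Controls and tiers are invented for illustration.
    CONTROLS = {
        "full-disk encryption":        {1: "optional", 2: "recommended", 3: "mandatory",   4: "mandatory"},
        "multi-factor authentication": {1: "optional", 2: "recommended", 3: "mandatory",   4: "mandatory"},
        "electronic audit logging":    {1: "optional", 2: "optional",    3: "recommended", 4: "mandatory"},
    }

    def mandatory_controls(level):
        """Controls that must be in place for a system handling data up to this level."""
        return [name for name, tiers in CONTROLS.items() if tiers[level] == "mandatory"]

    print(mandatory_controls(2))   # [] -- nothing is mandatory at level 2 in this invented example
    print(mandatory_controls(4))   # all three controls above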

An obvious question here is what to do when someone puts data on a system that is of greater sensitivity (a higher data classification) than the system is designed to handle. Most likely, nobody will try to use a digital signage system to handle personnel records, but it is quite plausible that professors might find it convenient to use research computers, designed for level 2 data, to process student marks (level 3 data) in courses they are teaching. Similarly, someone handling medical records may wish to make use of the University's general-purpose Microsoft 365 service because of its convenience, but it is a service that is not designed for data of such sensitivity and may well not provide the detailed electronic audit log required by Ontario law. For this reason, clear communication and user training will be required. Handling data appropriately is everyone's responsibility. Training need not be complicated. It is not normally difficult to explain, or to understand, that one should not put patient medical records into email, for example, or use local research computers for personnel records or student marks. For people handling the most sensitive types of data (level 4), more training will be needed, but the people at the University who need to handle such data regularly are comparatively few.

The underlying motivation for the University's approach is to protect riskier data with greater, more costly, protections, without having to pay the costs of applying those protections everywhere. The University thus applies its resources strategically, deploying them where they matter most, but not in places where the risk does not warrant the expense. This approach is not meant to preclude additional protections where they make sense. If there are risks of academic or industrial espionage, for example, or some other risk beyond the classification of the data being used, one may choose to impose more restrictions on a system than the University's Information Security Standard may require. But the general principle remains: the riskiness of the data on a system should guide and inform what needs to be done to protect it.

/it permanent link

Wed 17 Aug 2022 10:54

Innovation vs Control: Finding the Right Balance for Computing

Handshake, Digital styling
Image by Gerd Altmann from Pixabay
In computing, there is a constant tension between the need to exercise proper control over a system, to ensure system security, reliability, and resiliency, and the need to make room for innovation: the imagining, testing, and implementing of new ideas and approaches. There is tension because the things that are typically implemented to ensure control, such as the imposition of checks and constraints, conflict with the things needed for innovation: the removal of constraints, the use of things in ways different from how originally envisioned, and the construction and testing of experimental devices and software programs that almost certainly are not yet properly understood or fully tested.

Some organizations address this conflict by freely choosing control over innovation, turning it into a competitive advantage. Consider Starbucks, Tim Hortons, McDonalds: these are all large companies whose competitive advantage is critically dependent on the consistent implementation of a central vision across a multitude of disparate locations, many of which are managed by franchise partners. Essentially all of such an organization's computing is focused on this mission of consistency. And it works. Who hasn't travelled with small children in a car on a road trip, and after many hours on the road, spotted, with some relief, a McDonalds or a Tim Hortons en route? The relief is in the fact that even when travelling in a strange place, here is a familiar restaurant where you know what to expect from the food, where things will be much the same as at the Tim Hortons or the McDonalds near home.

Other organizations have no choice about where they stand on the matter. For the modern bank, computers, rather than vaults, are where wealth is stored and managed. Whether they want to innovate or not, banks cannot risk the use of computing that is not fully controlled, audited, and de-risked. The same holds in general for most financial institutions, where the constant efforts, sometimes successful, of would-be thieves to exploit computers to gain unauthorized access to wealth, make it unreasonably risky for a financial organization's computers to be anything but fully locked down and fully controlled. Even non-financial institutions, when sufficiently large, will often have substantial financial computing activity because of the size and scale of their operations: this computing, too, needs to be properly controlled, protected and audited.

Yet other organizations are forced into the opposite extreme. Start-up companies can be severely resource-constrained, making it difficult for them to make the sort of investments in highly controlled computing that financial institutions are capable of making. Start-ups innovating in the computing space, such as tech start-ups, may not even be able to consider the possibility. Highly controlled computer systems can have very restrictive designs, and when these restrictions hinder the innovation needed to implement the company's product, it will have no choice but to pursue some other form of computing. After all, the company rises or falls on the success of its innovation. That is not to say that controlled enterprise computing is unimportant for such companies: quite the contrary. The success of a start-up is highly dependent on a viable ecosystem that provides known pathways to innovation while still moving towards operating in a suitably controlled, production-ready way that is necessary for any successful business. But for a technology start-up, enterprise computing can never come at the expense of technological innovation. The basic existence of the start-up company depends on its ability to innovate: without innovation, there can be no company. In general, this truth will hold in some form for any technology company, even well beyond the start-up stage.

The tension between innovation and control comes to the fore in a different way at research-intensive universities, which are large organizations with complex missions that need enterprise computing to carry out their task of educating students on a broad scale, but are also organizations committed to research, an activity that is, by its very nature, an exploration into things not yet fully understood. This conflict is particularly acute in units within such universities that do research into computing itself, such as computer science and computer engineering departments, because in such places the computer must serve both as the locus of research and experimentation and as a tool for implementing institutional and departmental processes and exercising legitimate control.

I've had the privilege of working in such a department, Computer Science, at such a university (the University of Toronto) for more than three decades now, most of that time in a computing leadership role, and I know this tension all too well. It is sometimes exhausting, but at the same time, it can also be a source of creative energy: yes, it is a barrier, like a mountain athwart your path, but also, as a mountain to a mountain-climber, a challenge to be overcome with determination, planning, insight, and endurance. This challenge can be successfully overcome at a good university, because in addition to a typical large organization's commitment to basic values such as accountability, equity, reliability and security, the university is equally committed to fundamental academic values such as creativity, innovation and excellence. I look for ways to achieve both. Over the years, I have had some successes. My department has produced some groundbreaking research using academic computing that my technical staff have been able to provide, and the department has been able to operate (and successfully interoperate) in good cooperation with enterprise computing at the divisional level, and with the central university as well.

Yet I believe even more is possible. I have lived the tension in both directions: to our researchers I at times have had to play the regulator, having to impose constraints on computing to try to ensure acceptable reliability, accountability and security. To our central university computing organizations, I at times have had to advocate for looser controls to create more room to innovate, sometimes in opposition to proposals intended to increase reliability, security and accountability. When things went badly, it was because one side or the other decided that the other's concern was not its problem, and tried to force or sidestep the issue. But when things went well, and most often they did, it was because both sides genuinely recognized that at a research-intensive institution, everyone needs to work within the tension between the need to innovate and the need to regulate. As a body needs both a skeleton and flesh, so too does a research university need both regulation and innovation: without one, it collapses into a puddle of jelly; without the other, into a heap of dry bones.

With both being needed, one challenge to overcome is the fact that those responsible for enterprise computing cannot be the same people responsible for innovative research computing, and that is necessarily so. The skill-sets vary, the domains vary, the user-base is quite different, and the scale varies. If the university were to entrust computing innovation for computer science or computer engineering to the same groups that provide enterprise computing for an entire large university, one of two things would happen. Either the control necessary for a large enterprise would be diminished in order to make room for innovation, or, more likely, innovation would be stifled because of the need to create sufficiently controlled enterprise computing at a suitable scale for the entire university. Thus, necessarily, those who support unit research computing, where the innovation takes place, will be different people from those who support enterprise computing. But that can be a strength, not a weakness. Rather than see each other as rivals, the two groups can partner, embracing the tension by recognizing each other's expertise and each other's importance for the University as a whole. Partnership brings many potential benefits: if innovation becomes needed in new areas, for example, as the rise of data science increasingly drives computing innovation outside of the traditional computer science and computer engineering domains, the partnership can be there to support it. Similarly, as the computing landscape shifts, and new controls and new regulation become needed to address, for example, emergent threats in information security, the partnership can be there to support it. There is no organization potentially better suited for such a partnership than a large research university, which, unlike a financial institution, is profoundly committed to research and innovation through its academic mission, but also, unlike a start-up, is a large and complex institution with deep and longstanding responsibilities to its students, faculty and community, obligated to carry out the enterprise computing mission of accountability, reliability and security.

So what might a partnership look like? It can take a number of different forms, but in my view, whatever form it takes, it should have three key characteristics:

Locality means that the computing people responsible for research computing must stay close to the researchers who are innovating. This is necessary for strictly practical reasons: all the good will in the world is not enough to make up for a lack of knowledge of what is needed most by researchers at a particular time. For example, deep learning is the dominant approach in Artificial Intelligence today because a few years ago, our technical staff who supported research computing worked very closely with researchers who were pursuing deep learning research, customizing the computing as necessary to meet the research needs. This not only meant that we turned graphics cards into computation engines at a time when this was not at all common and not yet up to enterprise standards of reliability, it even meant that at one point we set up a research computer in a researcher's bedroom so that he could personally watch over a key computing job running day and night for the better part of a week. While this sort of customizability is not always needed, and sometimes is not even possible (one could never run a large computer centre this way), being able to do it if necessary is a key research asset. A university will never be able to fully support research computing solely from a central vantage-point. A commitment to ensuring local presence and support of research computing operating at the researcher level is necessary.

Respectful Listening means that the computing people responsible for research computing at the unit level where research actually happens, and the people responsible for enterprise computing divisionally and centrally, must communicate frequently, with an up-front commitment to hear what the other is saying and take it into account. When problems arise, respectful listening means that those problems will not be "solved" by simply overruling or ignoring the other, to pursue a simplistic solution that suits only one side. It also means a profound commitment to stepping away from traditional organizational authority structures: just because the innovative computing is situated in a department and the enterprise computing is led from the centre should not mean the centre should force its view on the department, just because it can. Similarly, just because unit research computing is driven by research faculty who enjoy substantial autonomy and academic freedom, their research computing group at the unit level should not simply ignore or sidestep what the enterprise is saying, just because it can. Rather, both sides need to respect the other, listening to, not disregarding, the other.

Practical Collaboration means that enterprise computing and unit research computing need to work together in a collaborative way that respects and reflects the timelines and resource constraints of each side. Centrally offered computing facilities should support and empower research where they can, but in a practical way: it may not be possible to make a central facility so flexible and customizable that all research can be pursued. It is acceptable to capture some research needs without feeling an obligation to support the entire "long tail" of increasingly customized research projects. Unit research computing will need to recognize that the need to scale a centralized computing service may constrain the amount of customizability that may be possible. Similarly, unit research computing should use, rather than duplicate, central services where it makes sense, and run its own services where that makes sense. Both central and unit research computing should recognize that there is a legitimate middle ground where some duplication of services is going to occur: sometimes the effort required to integrate a large scalable central service into a smaller customizable research service is too great, and sometimes the research advantages of having a locally-run standardized service on which experiments can more easily be built, can more than outweigh any sort of economies of scale that getting rid of the unit service in favour of a central service could theoretically provide. Hence the collaboration must be practical: rather than slavishly pursue principles, it must be realistic, grounded, balanced, sensible. It should recognize that one size does not always fit all, and responsibly and collaboratively allocate resources in order to preserve the good of the research mission.

It is that research mission, the ability to innovate, that can make computing so transformative at a research university. Yet while innovative computing can indeed produce transformative change, it cannot be just any change, nor at any cost. Computing is a change agent, yes, but it is also a critical component in the maintenance of an organization's commitment to reliability, accountability, equity, and good operation. Success is found in the maintenance of a suitable balance between the need to innovate and the need to control. When an organization critically depends on both factors, as a research university invariably does, I believe collaborative partnerships between the respective computing groups are the best way to maintain the balance necessary for success.

/it permanent link

Thu 31 Dec 2020 22:57

What's Wrong With Passwords on the Internet Anyway?

Completed Login Prompt
Image by Gerd Altmann from Pixabay
More than fifteen years ago, Bill Gates predicted that use of traditional passwords would dwindle. This has happened to a certain extent, but a login and password is still the most commonly used credential for computing authentication. It is increasingly problematic. According to Verizon's 2020 Data Breach Investigations report, 37% of all breaches involved the stealing of credentials or the use of stolen credentials. (p.7) What is the root cause of the problem?

Put in simple terms, a login and password is what a system relies on to know who is who. Your password is secret: only you know what it is, and the system has some way of checking that it is correct. If someone connects to the system with your login and password, the system checks that the password is the right one for your login. If it is, the system concludes that you are the person trying to connect, and lets you in. If you are the only one who knows the password, this approach works, since you are the only person who can provide the correct password. But if criminals know your password too, and use it, the system will think the criminals are you, and will give them access to your account and all your data. The only way to fix this is to change your password to something new that only you know, but by then the damage may well be done.
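
As a minimal sketch of how "the system has some way of checking" your password without having to keep the password itself, many systems store only a salted hash of the password and compare hashes at login time; the parameters below are illustrative.

    import hashlib, hmac, os

    def hash_password(password, salt=None):
        salt = salt or os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
        return salt, digest          # the system stores these, never the password itself

    def check_password(password, salt, stored_digest):
        candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
        return hmac.compare_digest(candidate, stored_digest)   # constant-time comparison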

Unfortunately, criminals have a pretty effective technique for finding out your login and password: they trick you into telling it to them. "Wait a minute!", you might say, "I won't ever tell a criminal my password. I don't even tell my family my password!" But you tell the system your password every time you log in. So if criminals set up a fake system that looks like the real one, and trick you into trying it, when you tell their fake system your password, the criminals will learn what it is.

This was not a common problem in the past, because it was difficult for criminals to successfully set up fake systems that looked convincing. But on the Internet today, it is easy to set up a web site that looks like another site. The only thing that's hard to fake is the first part of the link, the hostname section that comes immediately after the double slash (//) and before the first single slash (/), because that part of the link is used to direct the request to the right system on the Internet. But given that the Internet is available in hundreds of countries, each with its own set of internet service providers, it is often not too difficult for criminals to find somewhere on the Internet where they can register a hostname that looks similar to the real thing.
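
The sketch below illustrates the point: only the hostname portion of a link determines where the request actually goes, and a lookalike hostname (invented here for illustration) sends the request somewhere else entirely.

    from urllib.parse import urlparse

    real = "https://www.example-bank.com/login"
    fake = "https://www.example-bank.com.account-verify.example/login"   # registered by criminals

    print(urlparse(real).hostname)   # www.example-bank.com
    print(urlparse(fake).hostname)   # www.example-bank.com.account-verify.example -- a different system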

Worse, the rise of messages containing embedded links makes it very easy for criminals to send a fake message (e.g. an email or text) with a link that seems legitimate but really directs you to a fake site. This is called "phishing". Because of the way the web's markup language (HTML) works, it is easy to set up a link that seems to point to one site, but actually points to another. For example, https://www.walmart.com is a link that seems to point to Walmart but really points to Amazon. Most web browsers will let you "hover" over a link to see where it really goes. But do people check every link carefully each time they use it?

The problem is made worse by the proliferation of legitimate messages with embedded links to all sorts of cloud services. I recently saw a message from a large organization to its staff, about their pensions. The message contained links to an external site whose name had no resemblance to the organization's name. The message invited the staff to click on those links to see information about their pensions. The message was legitimate: the organization had contracted with an external cloud provider to provide an online pension calculator for staff. But the message said nothing about the cloud provider: it merely contained a link to the calculator. If criminals had sent a similar message containing a malicious link to a fake system somewhere on the Internet, one that prompted staff to enter their login and password, no doubt many staff would have thought it legitimate. How could staff be expected to be able to tell the difference?

A good way to combat the password capturing problem is to require more than just a password to use a system. This is called "two-factor" or "multi-factor" authentication. Your password is one factor, and something else is a second factor, and you must provide both factors to prove to the system that it is you. This helps because the criminals must have both your password and your second factor in order to access your account and data. To ease the authentication burden for users, systems can ask for two factors only sometimes, such as when logging in for the first time in a while, or logging in from a new machine or a new location.

Ideally the second factor should be something that is hard for criminals to capture and use. One problem with a password is that it is a secret that can be used from anywhere on the Internet. With almost 60% of the world's population on the Internet, which now reaches every country in the world, the Internet can hardly be considered a "safe place". A second password, as easily used from anywhere on the Internet as the first, would not be much of an improvement. Worse would be the answers to some personal question about yourself, such as your mother's maiden name or the name of your first school: not only is such information just as easily used as a password, it is information that people may be able to find out in various ways. Answers to personal questions, while sometimes used for authentication, typically do not make a good second factor.

A better second factor is a message sent via a communication channel that goes only to you: for example, an email to your email address, or a text to your cell phone number. When you attempt to log in, the system sends a unique one-time code to you through that channel, and asks you to enter it. The assumption is that criminals won't have access to your email or your cell number, so they won't know and be able to enter the one-time code that the system sent to you. This is usually a good assumption. But criminals can try to get access to your email or your phone number, and sometimes they succeed. For example, in the case of a cell number, one thing they could try is to call your cell phone provider, tell them they are you and that your phone has been stolen, and request that your phone number be transferred to their new phone.

Another second factor, one even better, is a physical device in your possession. This could be a hardware security token that you plug into your computer or that displays a unique, frequently changing, code. Or it could be an app on your cell phone that is tied to your unique device. A physical device is an excellent second factor, because most criminals on the Internet are physically distant. To successfully pretend to be you, a criminal would need direct physical access to a device that would likely be located in your purse or pocket.
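
For those curious how an authenticator app's frequently changing code is typically produced, the sketch below follows the common time-based one-time password (TOTP) scheme described in RFC 6238; the base32 secret is a made-up example of the secret shared between the app and the server.

    import base64, hmac, struct, time

    def totp(secret_b32, period=30, digits=6):
        key = base64.b32decode(secret_b32)
        counter = int(time.time()) // period                  # changes every `period` seconds
        mac = hmac.new(key, struct.pack(">Q", counter), "sha1").digest()
        offset = mac[-1] & 0x0F                               # dynamic truncation, as in RFC 4226
        code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % (10 ** digits)
        return str(code).zfill(digits)

    print(totp("JBSWY3DPEHPK3PXP"))   # app and server compute this independently; they match only within the same time window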

Relying on a device in purse or pocket as well as a password in your head is an improvement in security, but it has its drawbacks. It makes that device essential for you to use the system: if it is broken, lost or stolen, you're locked out, even if you know the password. While locking out people who don't have the device is exactly the point, that doesn't help when it is keeping you from legitimately using the system. Moreover, if that device is your smartphone, it changes your phone from a convenience to a necessity. While a smartphone has already become a necessity for some, it is a potentially consequential thing for it to become a requirement for everyone. A hybrid approach is perhaps best: hardware security tokens for those who prefer them, a smartphone for those who for their own reasons carry one around anyway, and for many, both: a smartphone for convenience, with a hardware security token as backup in case of smartphone loss or damage.

Perhaps there is an even more secure option? What if your second factor wasn't a device, but an actual physical part of your body, such as a finger (for a fingerprint), eye (for a retinal scan), face, or even heartbeat (as measured by e.g. a Nymi Band)? Would that be better still? After all, if it is hard for a criminal to get access to someone's things without being noticed, it is even harder to get access to someone's body. This is indeed possible: the technique is called "biometrics", and it can be an effective second factor. Unfortunately there are a couple of issues with biometrics. First, injuries or health issues can change your body; a cut on your finger, for instance, may affect your fingerprint. Secondly, biometrics have a "revocation" problem. This comes from the fact that a biometric is a unique measurement of your body part: a fingerprint, retinal scan, facial image, or ECG. But measurements are data, and biometric data, like any other data, can be and has been breached. If this happens, what will you do? Passwords can be changed, hardware security tokens can be replaced, but how are you going to change your fingerprint, your face, your eye, your heartbeat? While biometrics do have a place in authentication, most commonly to unlock a local device such as a smartphone or a laptop, the lack of revocability makes biometrics less suitable as a second factor for Internet-accessible services.

Regardless of what is chosen for a second factor, the inconvenience of using more than one factor is something that has to be considered. Passwords, especially ones that are easy to remember, are quite convenient. Requiring more than this can make authentication more difficult. If it becomes too difficult, the difficulty becomes a disincentive to use the system. For systems protecting highly sensitive data, some difficulty may be warranted, given the risk. For lower-risk systems, things are less clear. Yet for Internet-accessible systems, due to the prevalence of phishing, something more secure than just passwords seems increasingly necessary. I think Bill Gates is right: like it or not, the traditional password will become increasingly rare on the Internet, for good reason.

/it permanent link

Mon 24 Feb 2020 10:19

Some Clarity on Public Cloud Cybersecurity

Break in clouds, revealing clear skies
Image by Sabrina Corana from Pixabay
I've been thinking about public cloud cybersecurity for some years now, as I've watched adoption of the public cloud grow from a trickle to a flood. Early on, most of the reasons I heard for public cloud adoption made a great deal of sense to me: the need to rapidly scale up and down the size of a service, the desire to leverage the expertise of a large technical partner with resources in network and computing infrastructure exceeding one's own, the desire to leverage geographically diverse, redundant datacentres, the desire to fund computing from operating rather than capital budgets, and the desire to build adaptable, scriptable services with better connectivity to the Internet than one could otherwise provide for oneself. But in the last year or two, as anxiety about cybersecurity increases, I've been hearing more and more people refer to cybersecurity as the primary reason for their adoption of the public cloud. I'm not so sure what I think of this reasoning. I can understand why someone might want to pass to a third party a task that makes them anxious. In situations involving strong emotions, such as anxiety, there is risk of "confirmation bias": believing something is true because you want it to be true. But is it? Ceteris paribus (all other things being equal), is the public cloud intrinsically more secure than on-premise datacentres?

Some argue yes. Eplexity calls cloud computing "an established best practice for businesses" and claims "your data is typically safer in the public cloud than in an on-premises data centre". In 2016, Sara Patrick of Clutch, guest-writing for Tripwire.com, claimed to have "four reasons why the Cloud is more secure than Legacy Systems". In 2017, Quentin Hardy of the New York Times claimed that cloud data is "probably more secure than conventionally stored data." In 2018, David Linthicum, writing for InfoWorld, claimed "your information is actually safer in the cloud than it is in your own data centre".

One reason given for the claim is that public cloud providers offer greater technical expertise than what is possible on-premise. Eplexity writes:

Unless your company is already in the business of IT security, spending time and effort on securing your on-premises data distracts from your core functions. Most organizations likely don't have a robust, experienced team of cybersecurity professionals at their disposal to properly protect their on-premises data. ... As such, cloud providers may employ hundreds or thousands of developers and IT professionals.
This is an argument from size and scale. Cloud providers are bigger than you, and have arguably more IT expertise than you, so they can do a better job than you. But sadly, size and IT expertise are no guarantee of security. Yahoo was a large Internet company, valued at one time at $125 billion. It employed thousands of developers and IT professionals. Yet it was subject to a cybersecurity breach of three billion user accounts in 2013/14; the breach was not disclosed until the fall of 2016, and the full impact was not known until October 2017. The damage to Yahoo's business was significant: Verizon acquired Yahoo in 2017 for less than $5 billion, a deal that was nearly derailed by the disclosure of the breaches.

I think we must conclude from the Yahoo story that size and expertise alone are no guarantee of cybersecurity. Naturally, major cloud providers like Amazon, Microsoft and Google are aware of the Yahoo situation and its consequences. No doubt it illustrated for them the negative impact that a major breach would have on their business. I cannot imagine that they would take the threat lightly.

Yet there have been close calls. Microsoft, a major cloud provider, in December 2019 accidentally disclosed to the world a cloud database on Azure with 250 million entries of customer support data. Happily, a security researcher spotted and reported it, and Microsoft fixed it soon after. Moreover, Zak Doffman, writing for Forbes, reported in Jan 2020 that Check Point Software Technologies, a cybersecurity vendor, had discovered in 2019 a serious flaw in Microsoft Azure's infrastructure that allowed users of the service to access other users' data. While Check Point reported it immediately to Microsoft, who fixed it quickly, had the flaw been discovered by criminals instead of cybersecurity researchers, a great many things running on Azure could have been compromised. Doffman quotes Yaniv Balmas of Check Point:

...the take away here is that the big cloud concept of security free from vulnerabilities is wrong. That's what we showed. It can happen there as well. It's just software and software has bugs. The fact I can then control the infrastructure gives me unlimited power.
In the Check Point research article describing the flaw, Balmas concludes:
The cloud is not a magical place. Although it is considered safe, it is ultimately an infrastructure that consists of code that can have vulnerabilities - just as we demonstrated in this article.

What, then, is the right answer? Well, there isn't one. Neither public cloud nor on-premise datacentres are magic, and neither is "safe". Cybersecurity is a challenge that has to be met, no matter where the service is, or what infrastructure it is using. Happily, this is finally being recognized. Even Gartner Research, a long-time proponent of the public cloud, which as recently as mid-2019 predicted that public cloud infrastructure as a service (IaaS) workloads would suffer at least 60% fewer security incidents than those in traditional data centers, has recently taken a more nuanced view. In the fall of 2019, this prediction of fewer security incidents in the cloud disappeared from Gartner's website, and was replaced by this:

Through 2024, the majority of enterprises will continue to struggle with appropriately measuring cloud security risks.
Questions around the security of public cloud services are valid, but overestimating cloud risks can result in missed opportunities. Yet, while enterprises tended to overestimate cloud risk in the past, there's been a recent shift - many organizations are now underestimating cloud risks. This can prove just as detrimental, if not more so, than an overestimation of risk. A well-designed risk management strategy, aligned with the overarching cloud strategy, can help organizations determine where public cloud use makes sense and what actions can be taken to reduce risk exposure.

So does "public cloud use make sense"? Yes, of course it does, for a great many things. But it's not because the public cloud is intrinsicly more secure. The public cloud has its own set of cybersecurity issues. There is no "free pass". As always, carefully assess your risks and make an informed decision.

/it permanent link

Fri 24 Jan 2020 20:02

Does AI Help or Hinder Cybersecurity?

Hooded figure with glowing circuit-board visage
Image by Gerd Altmann from Pixabay
Both AI and cybersecurity have become increasingly prominent in recent years. AI's prominence has been driven by advances in machine learning and the very real improvements it has made in the ability of computer systems to do things that previously seemed possible only to human beings. Cybersecurity's prominence has been driven by a number of developments, including increasing nation-state conflict on the Internet, and a dramatic rise in organized cyber-crime. It is inevitable that the two will combine: AI will be and is being applied to the cybersecurity space, through the development of machine learning techniques for breaking into and defending systems.

One view on this is that machine learning, as a powerful technique that enables computer systems to take on tasks previously reserved only for humans, will empower cyberattackers to breach computer security in new ways, or at least in ways more effective than before. I know there is a great deal of anxiety about this. This past fall, I had a conversation with a CIO of a large university, who told me that his university was migrating its internet services to Amazon precisely because he believed that new AI-powered cyberattacks were coming, and he thought Amazon would be better able to fend them off. I'm not sure what I think of this defensive strategy, but that is not the important question here. The key question is this: are AI-powered cyberattacks going to overwhelm cyberdefence?

No doubt AI-powered cyberattacks are a reality. Machine learning is a powerful computer science technique, especially for automation. Cyberattackers, especially sophisticated, well-funded cyberattackers, will use it and I am confident are already using it. But highly automated cyberattacks are nothing new: cyberattackers have been automating their attacks for decades. Smarter automated cyberattacks are certainly something to worry about, but will they be transformative? Maybe. After all, in cybersecurity, the advantage is to the attacker, who needs to find only one hole in the defences, while the defender needs to block all of them. Anything that boosts the effectiveness of the attacker would seem to make the situation worse.

To really see the full picture, it's important to look at the defender too. Machine learning makes the situation worse only if it benefits the attacker more than it benefits the defender. But does it?

I don't have a complete answer to this question: there is a great deal of work still to be done on the application of machine learning to cybersecurity. But I suspect that the answer is a qualified No: rather, all other things being equal, machine learning will likely shift the balance of power towards the defender. The reason is data.

Machine learning is a technique where computer systems, instead of being programmed by programmers, learn what to do from data. But the quality of the learning depends on the quality and in particular the quantity of data. Machine learning is a technique that is most effective when trained with large amounts of data. ImageNet, for instance, a standard training dataset used to train machine learning applications to recognize images, contains about 14.2 million images. But who is more likely to have access to large amounts of good data about a system: the attacker or the defender? Of course, it depends, but it seems to me that, very generally speaking, the defender is more likely to have access to good system data than the attacker. The attacker is trying to get in; the defender is already in.
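
As a rough illustration of that data advantage, a defender who already logs every login can train an anomaly detector on that history and flag logins that don't fit the pattern. The sketch below is just that, a sketch: the features (hour of day, megabytes transferred, failed attempts) and the use of scikit-learn's IsolationForest are illustrative assumptions, not a recipe.

import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical training data: one row per historical login the defender has already logged.
# Columns: hour of day, megabytes transferred, failed attempts before success.
rng = np.random.default_rng(0)
normal_logins = np.column_stack([
    rng.normal(13, 3, 5000),      # most logins happen during the working day
    rng.normal(20, 5, 5000),      # typical transfer sizes
    rng.poisson(0.2, 5000),       # the occasional mistyped password
])

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_logins)

# A 3 a.m. login that pulls ten times the usual data after many failed attempts:
suspicious = np.array([[3, 200, 6]])
print(detector.predict(suspicious))   # -1 means "anomalous", 1 means "looks normal"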

Of course, this is the broadest of generalizations. The effectiveness of machine learning in the cybersecurity space depends on a great many things. But I am cautiously optimistic. I realize I may be bucking what seems to be becoming a prevailing trend of ever-increasing anxiety about cybersecurity, but I believe here that machine learning has more potential to help than to harm. I look forward to seeing what will emerge in this space over the next few years.

/it permanent link

Mon 30 Sep 2019 00:00

What's all the fuss about AI anyway?

Brain-shaped Network
Image by Gordon Johnson from Pixabay
A great deal has been written in the past five years about Artificial Intelligence (AI). But there's a lot of confusion about what AI actually is, and why it is of special interest now. Let's clear up some of that confusion. In ordinary language, what is this fuss about AI all about?

AI, broadly understood, is a term used to describe a set of computing techniques that allow computers to do things that human beings use intelligence to do. This is not to say that the computer is intelligent, but rather that the computer is doing something that, if done by a person, would be considered evidence of that person's intelligence. Contrary to widespread opinion, this is not the same thing as an artificial person. In fact, there have been for a long time many things that humans use intelligence to do, that computers do better, whether it be remembering and recalling items, doing arithmetic, or playing chess. But computers do these things using different techniques than humans do. For example, Deep Blue, a custom chess computer built by IBM, beat Garry Kasparov, the then-reigning world chess champion, in 1997, but Deep Blue played chess in a very different way than Garry. Garry relied on his human intelligence, while Deep Blue used programming and data.

However, some computer scientists, noting that people can do things that computers can't, thought long and hard about how people do those things, and how computers might be programmed to do the same. One such technique, deep learning, a neural network technique modelled after the human brain, has been worked on since the 1980s, with slow but steady improvement, but computer power was limited and error rates were often high, and for many years, most computer scientists seemed to feel that other techniques would yield better results. But a few kept at it, knowing that while the computers of the day were inadequate, advances in computing would make things possible that weren't possible before.

This all changed in 2012, when one such researcher, Geoff Hinton, and his students, working here at the University of Toronto, published a seminal deep learning paper that cut error rates dramatically. I remember supporting Geoff's group's research computing at that time. It was a bit challenging: we were using multiple GPUs per machine to train machine learning models at a time when GPU computing was still rather new and somewhat unreliable. But GPUs were absolutely necessary: without them, instead of days of computing time to train a model, months would be required. One of our staff, Relu Patrascu, a computer scientist and skilled system administrator working hand-in-glove with the researchers, tuned and configured and babysat those machines as if they were sick children. But it worked! Suddenly deep learning could produce results closer to what people could do, and that was only the beginning. Since then, deep learning has produced terrific results in all sorts of domains, some exceeding what people can do, and we've not even scratched the surface of what is possible.

But what does deep learning actually do? It is a computer science data classification technique. It's used to take input data and classify it: give it a thing and it will figure out what the thing is. But it classifies things in a way that's different and more useful than traditional computer science methods for classification, such as computer programming, or data storage and retrieval (databases). As such, it can be used to do a lot more than computers previously had been able to do.

To see this, consider traditional computer science methods: for example, computer programming. This approach requires a person to write code that explicitly considers different cases. For example, imagine that you want to classify two-dimensional figures. You want to consider whether they are regular polygons. You could write a computer program that defines for itself what a regular polygon is, and checks each characteristic of an input shape to see whether or not it matches the definition of a regular polygon. Such a program, when given a square, will notice that it is a polygon, it has four sides, and that those sides are equal in length. Since the programmer put into the program a detailed definition of what a regular polygon is, and since the program checks each feature explicitly, it can tell whether or not a shape is a regular polygon, even if the program has never seen that particular shape before.
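
A sketch of such a program, representing a shape simply as its list of side lengths (and ignoring angles, to keep the example short), might look like this:

def is_regular_polygon(side_lengths, tolerance=1e-9):
    """A programmed classifier: every rule is written out explicitly by a person."""
    if len(side_lengths) < 3:                 # a polygon needs at least three sides
        return False
    first = side_lengths[0]
    return all(abs(s - first) <= tolerance    # all sides must be equal in length
               for s in side_lengths)

print(is_regular_polygon([2, 2, 2, 2]))   # True: a square
print(is_regular_polygon([2, 3, 2, 3]))   # False: a (non-square) rectangle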

But what about exceptional cases? Is a circle a regular polygon? It is, after all, the limit of an N-gon as N goes to infinity. This is an "edge case" and programs need to consider those explicitly. A programmer had to anticipate this case and write it into the program. Moreover, if you wanted to consider some other type of shape, a programmer would have to rewrite the code accordingly. There's no going from a bunch of examples to working code without a programmer to write it. Programming is certainly a useful technique, but it has its limits. Wouldn't it be nice to be able to learn from a bunch of examples, without a person having to write all that code?

One way to do that would be data storage and retrieval, for example, a database. Consider the shape classifier problem again. You might put in a bunch of shapes into a database, indicating whether the shape is a regular polygon or not. Once the database is populated, classifying a shape simply becomes looking it up. The database will say whether or not it is a regular polygon.
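
Sketched with an ordinary Python dictionary standing in for the database, the lookup approach looks like this:

# Each known shape is stored with its label; classification is just retrieval.
known_shapes = {
    "square":               True,    # regular polygon
    "equilateral triangle": True,
    "rectangle":            False,
    "scalene triangle":     False,
}

def classify(shape_name):
    if shape_name in known_shapes:
        return known_shapes[shape_name]
    return None    # never seen it before: the database has no idea

print(classify("square"))            # True
print(classify("regular hexagon"))   # None: not in the database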

But what if it's not there? A database has the advantage of being able to learn from examples. But it has a big disadvantage: if it hasn't seen an example before, and is asked about it, it has no idea what the right answer is. So while data storage and retrieval is a very useful computing technique, and it is the backbone of most of our modern information systems, it has its limits. Wouldn't it be nice if a classifier system could provide a useful answer for input data that it's never seen before, without a programmer to tell it how?

Deep learning does exactly this. Like data storage and retrieval, it learns from examples, through training. Very roughly, a neural network, when trained, is given some input data, and is told what output data it should produce when it sees that data in future. These input and output constraints propagate forward and backwards through the network, and are used to modify internal values such that when the network next sees input like that, it will produce the matching output.

The key advantage of this technique is that if it sees data that is similar to, but not the same as data it has been trained on, it will produce output similar to the trained output. This is very important, because like programming, it can work on input it has never seen, but like databases, it can learn from examples and need not be coded by a programmer anticipating all the details in advance. For our shape example, if trained with many examples of regular polygons, the neural network will be able to figure out whether or not a given input is a regular polygon, and perhaps even more interestingly, it will be able to note that a circle is very like a regular polygon, even if it had never been trained on a circle.
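
Here is a toy sketch of that idea using scikit-learn's MLPClassifier; the two features (number of sides, and how much the side lengths vary) and the tiny training set are made up purely for illustration:

import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy training data: [number of sides, relative spread of side lengths], label 1 = regular.
X = np.array([[3, 0.0], [4, 0.0], [5, 0.0], [6, 0.0], [8, 0.0],
              [3, 0.4], [4, 0.3], [5, 0.5], [6, 0.2], [4, 0.6]])
y = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

net = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs", max_iter=2000, random_state=0)
net.fit(X, y)

# A regular heptagon (7 equal sides) was never in the training set, but its features
# resemble the regular examples, so the network will typically label it regular (1).
print(net.predict([[7, 0.0]]))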

Moreover, a deep learning neural network can learn from its own results. This is called reinforcement learning. In this technique, a neural network is used to derive output data from some input data; the results are tested to see how well they work, and the neural network is retrained accordingly. This way a neural network can "learn from its own mistakes", training itself iteratively to classify better. For example, a model of a walking human, with some simple programming to teach it the laws of physics, can, using reinforcement learning, teach itself how to walk. A few years ago, some of the researchers in our department did exactly that. Another example: Google got a lot of attention a few years ago when deep learning researchers there built a deep learning system that used reinforcement learning to become a champion at the game of Go, a game very hard to computerize using traditional techniques, and proved it by beating the reigning Go world champion.
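
Reinforcement learning is often easiest to see in a tiny tabular example rather than a full neural network. The sketch below is such a toy: an agent in a six-square corridor learns, purely from trial, error and a reward at the far end, that the right policy is to keep moving right.

import random

N = 6                                   # squares 0..5; the reward sits at square 5
Q = [[0.0, 0.0] for _ in range(N)]      # Q[state][action]; action 0 = left, 1 = right

def step(state, action):
    nxt = max(0, min(N - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N - 1 else 0.0)

def choose(state, eps=0.3):
    """Explore sometimes; otherwise take the action that has worked best so far."""
    if random.random() < eps or Q[state][0] == Q[state][1]:
        return random.randrange(2)
    return 0 if Q[state][0] > Q[state][1] else 1

for episode in range(500):
    state = 0
    while state != N - 1:
        action = choose(state)
        nxt, reward = step(state, action)
        # Learn from the outcome of the agent's own action (the Q-learning update).
        Q[state][action] += 0.5 * (reward + 0.9 * max(Q[nxt]) - Q[state][action])
        state = nxt

print([q.index(max(q)) for q in Q[:-1]])   # typically [1, 1, 1, 1, 1]: "go right" everywhere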

It seems clear to me at this point that deep learning is as fundamental a computing technique as computer programming and databases in building practical computer systems. It is enormously powerful, and is causing a great deal of legitimate excitement. Like all computer science techniques, it has its advantages and drawbacks, but its strengths are where other computer science techniques have weaknesses, and so it is changing computer science (and data science more generally) in dramatic ways. It's an interesting time to be a computer scientist, and I can't even begin to imagine the many things that bright and innovative people will be able to do with it in the future.

/it permanent link

Mon 02 Sep 2019 20:14

Existential threats from AI?

Nuclear explosion
Image by Alexander Antropov from Pixabay
Plenty has been written about the possible threats to humanity from Artificial Intelligence (AI). This is an old concern, a staple of science fiction since at least the 1950s. The usual story: a machine achieves sentience and pursues its own agenda, harmful to people. The current successes of machine learning have revived this idea. The late Stephen Hawking warned the BBC in 2014 that "the development of full artificial intelligence could spell the end of the human race". He feared that "it would take off on its own, and re-design itself at an ever increasing rate." He worries that human beings, "who are limited by slow biological evolution, couldn't compete, and would be superseded." Henry Kissinger, in a thoughtful essay in The Atlantic last year, worried that "AI, by mastering certain competencies more rapidly and definitively than humans, could over time diminish human competence and the human condition itself as it turns it into data." Elon Musk, in a debate last month with Alibaba's Jack Ma, reported by WIRED, argued that "there's just a smaller and smaller corner of what of intellectual pursuits that humans are better than computers. And that every year, it gets smaller and smaller, and soon will be far far surpassed in every single way. Guaranteed. Or civilization will end."

Are they right? Is there an existential threat to humanity from AI? Well, yes, I think there actually is one, but not quite in the way Musk, Kissinger, or Hawking fear. Computers have been better than humans at many cognitive tasks for a long time. Computers remember things more accurately, process things faster, and scale better than humans in many tasks. AI, particularly machine learning, increases the number of skills where computers are better than humans. Given that humanity has been spending the last couple of generations getting used to a certain arrangement where computers are good at some things and humans are good at others, it can be a bit disconcerting to have this upended by computers suddenly getting good at things they weren't good at before. I understand how this can make some people feel insecure, especially highly accomplished people who define themselves by their skills. Kissinger, Musk and Hawking fear a world in which computers are better at many things than humans. But we have been living in such a world for decades. AI simply broadens the set of skills in question.

As a computer scientist, I am not particularly worried about the notion of computers replacing people. Yes, computers are developing new useful skills, and it will take some getting used to. But I see no imminent danger of AI resulting in an artificial person, and even if it did, I don't think an artificial person is an intrinsic danger to humans. Yet I agree that there are real existential threats to humanity posed by AI. But these are not so much long term or philosophical, to me they're eminently practical and immediate.

The first threat is the same sort of threat as posed by nuclear physics: AI can be used to create weapons that can cause harm to people on a massive scale. Unlike nuclear bombs, AI weapons do not do their harm through sheer energy discharge. Rather, machine learning, coupled with advances in miniaturization and mass production, can be used to create horrific smart weapons that learn, swarms of lethal adaptive drones that seek out and destroy people relentlessly. A deep commitment to social responsibility, plus a healthy respect for the implications of such weapons, will be needed to offset this danger.

The second threat, perhaps even more serious, comes not from AI itself but from the perceptions it creates. AI's successes are transforming human work: because of machine learning, more and more jobs, even white-collar ones requiring substantial training, can be replaced by computers. It's unclear yet to what extent jobs eliminated by AI will be offset by new jobs created by AI, but if AI results in a widespread perception that most human workers are no longer needed, this perception may itself become an existential threat to humanity. The increasingly obvious fact of anthropogenic climate change has already fueled the idea that humanity itself can be viewed as an existential threat to the planet. If AI makes it possible for some to think that they can have the benefits of society without keeping many people around to do the work, I worry we may see serious consideration of ways to reduce the human population to much smaller numbers. This to me is a dangerous and deeply troubling idea, and I believe a genuine appreciation for the intrinsic value of all human beings, not just those who are useful at the moment, will be needed to forestall it. Moreover, a good argument from future utility can also be made: we cannot accurately predict which humans will be the great inventors and major contributors of the future, the very people we need to address anthropogenic climate change and many other challenges. If we value all people, and build a social environment in which everyone can flourish, many innovators of the future will emerge, even from unexpected quarters.

Threats notwithstanding, I don't think AI or machine learning can go back into Pandora's box, and as a computer scientist who has been providing computing support for machine learning since long before it became popular, I would not want it to. AI is a powerful tool, and like all powerful tools, it can be used for many good things. Let us build a world together in which it is used for good, not harm.

/it permanent link

Fri 19 Jul 2019 16:13

Ross Anderson's Security Engineering
Security Engineering - Second Edition

Until recently, I had not read Ross Anderson's Security Engineering, despite hearing good things about it. I'm not sure why: I think I was put off a bit by the title. I had a vague and confused impression that a book about "Security Engineering" would be yet another how-to book about making computers secure. I should have known better. In this case, I was wrong, very much so, and much to my detriment. I should have read this book long ago.

Why had I not read it? I have no excuse. The book has been out for a while: it is in its second edition, which came out in 2008 (Anderson is writing a third edition, expected next year). So I certainly had the opportunity. Moreover, since 2012, the book has been free for the reading (and downloading) from his website. So I certainly had the means. I just didn't, until a few weeks ago, when I stumbled across it again. I read a little from the website, then a little more. Before long, I was well and thoroughly hooked.

Security Engineering is a classic, comprehensive book about information security: eminently readable, clear and thorough, it covers information security in pretty much every aspect in which one might encounter it, from the usual (cryptography, access controls, protocols, biometrics) to the not quite so day-to-day (nuclear weapons launch protocols, counterfeiting, even spying by analyzing the RF emissions from computers). Each chapter is a clear elucidation of a particular aspect of information security, focusing on the essential issues. Each chapter provides enough detail to understand the essential elements, yet not so much detail as to overwhelm the reader. His writing is a classic illustration of the difference between an expert and a master. An expert knows a great deal about a topic and provides an abundance of information. A master knows the key elements, those things that are most important, on which everything else hangs, and focuses exactly on these. This book is mastery, in clear, understandable and engaging language. It has become my favourite book in information security already, and I haven't yet finished it.

I look forward to the third edition sometime next year. I can't wait.

/it permanent link

Mon 04 Mar 2019 12:04

Externality and Information Security
It was a hot midsummer weekend, and I was traveling back to Toronto with friends. We were on the expressway (the name here in Ontario for the sort of road that Americans call freeways and Brits call motorways). Traffic was very slow: a classic traffic jam. After about thirty minutes, we reached the cause of the problem. It was not a collision. Nor was it highway construction. Instead, by the side of the roadway, a minivan was parked, back gate open, and a family was having a picnic on the nearby grass. I don't know if they realized they were causing a traffic jam, but they were. People had slowed to look, which caused traffic behind to slow too, and because of the traffic volume, this led to a traffic jam over a considerable distance.

I don't know why the family having the picnic had chosen that spot for it, and I don't know whether they realized the problem they were causing. But their picnic went on, unaffected by the traffic problems they were causing. In other words, the traffic jam was not their problem. It was an externality, something causing a negative effect not felt by those who cause it.

Externalities happen in life all the time. Large organizations (companies, countries, institutions) suffer significantly when their decision-makers make decisions that are good for themselves but not good for the organization. Rules to make this less likely are put in place: rules against bribery, rules concerning conflict of interest, rules imposing due process. But rules only work to a certain extent: there are plenty of situations where the rules are followed yet still externalities happen. Moreover, rules come with costs, sometimes significant ones. Rules may be necessary, but they are not sufficient, and they need to be accompanied by buy-in.

Let's consider traffic again. Driving is governed by all sorts of rules. Some of these rules work well: at traffic lights, go when the light is green, stop when it is red. Rarely broken, this rule makes traffic work in dense situations where otherwise there would be chaos. Most of the time, this rule is followed even in the absence of external enforcement. When enforcement does occur, it is well regarded: hardly anyone will dispute that a person running a red light is a safety hazard and should be ticketed. In practice, you can stand for hours beside a busy traffic signal in a typical Ontario city, and despite the absence of police presence, not find a single driver running a red light.

Sadly, other driving rules don't work quite so well, such as speed limits on expressways here in Ontario. These limits are often broken, with some following them and others not. Often, on an uncongested expressway, unless enforcement is likely (i.e. police is present) there will be some people driving over the speed limit. Enforcement is viewed cynically: speeding tickets are often viewed more as revenue generation than as a safety measure. Obeying speed limits is often viewed by drivers as an externality: not my problem, unless there is a police officer around to make it one. In practice, at any place on any uncongested Ontario expressway, you will be hard-pressed to find a five-minute period in which no passing driver has exceeded the speed limit.

I have been thinking a lot about information security lately. In information security, we have a situation similar in many respects to driving. Just as driving is a matter of traveling safely, information security is a matter of computing safely. When we compute, we may be processing information that is sensitive, confidential, private. Harm can occur when it is exposed. Steps need to be taken to ensure that it is not: persons handling information will have to handle it securely. But do we want this process to look like speed limits? Or traffic lights? I think the answer is clear: if we want information to actually be secure, we want good security practice to be followed like the rules for traffic lights are followed: broadly and consistently, without the need for the constant threat of enforcement.

In recent years, an information security profession has arisen. The increasing demands of the profession have made it increasingly rare that an information security professional has spent much time actually running a substantial IT operation. Certifications abound, and a multiplicity of complex and large security standards have been created, each requiring professionals to interpret. A great deal of money is being spent on information security. Much of this is good and necessary: information security needs attention, codification, dissemination, and championship. But the professionalization of information security comes with big risks, too: the risk that information security will become the responsibility only of specialists, the risk that these specialists will come up with all-encompassing codexes of security standards to impose, the risk that these standards will be treated as externalities by IT practitioners, the risk that the information security profession will respond with enforcement, and hence the risk we will find ourselves in the expressway speed limit situation with respect to information security.

The fact is, information security is an aspect of good IT practice: if an implementation is not secure, it is broken, just as much as if it were not reliable. Security is the responsibility of all IT practitioners: it needs to be internalized, not externalized.

For this to happen, it is important that information security rules be simple and understandable, to ensure buy-in. Just as traffic light rules address the obvious risk of traffic accidents, so should security rules address clear risks in a visibly appropriate way. In most cases, it's not so important that rules be part of a comprehensive codex that addresses all possible areas of risk: the more complex the rule and the more extensive the system of rules, the more likely it will all be treated as an externality. What we really want are not rules for their own sake, but genuinely secure IT.

If we want secure IT, we need to recognize that there is another potential externality at work. Genuine information security and the good of the information security profession may not always align. Just as expressway speed limits employ more police than traffic lights, an enforcement approach will employ more information security professionals than an internalized one. But the internalized approach is what gives us secure computing. This is not something that can be left to the information security profession alone. To get there, we will need collaborative effort from all of us, particularly those with long experience running substantial IT operations. We will all need to make a true commitment to a practical approach, one that seeks to make computing genuinely more secure in the real world.

/it permanent link

Mon 11 Dec 2017 14:02

Bitcoin, Cryptocurrency and Blockchain

As the price of Bitcoin goes up and up, talk increases about Bitcoin and other cryptocurrencies, like Litecoin, Monero, ZCash, Ethereum, and many others. Plenty is being said, and it can be a bit confusing.

But there is no need to be confused. Bitcoin and other cryptocurrencies are basically simple. They are not coins. They are simply lists. Each cryptocurrency has a master list. The list typically records who holds how much. The list is designed in a clever way, using computer software, so that people all over the world can have identical copies of the list and keep it up to date, without someone having to be the holder of the "master copy". But it is still just a list.

The sort of list used for cryptocurrencies is called a "blockchain", and it has some special properties. One particularly clever property is that you can't normally just add anything you want to the list, there is a scheme to control that. Instead, you need to arrange with someone already on the list to give up (some of) their place on the list to you.
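
The clever design is, at its heart, just hashing: each new entry on the list includes a cryptographic hash of the entry before it, so anyone holding a copy can check that nothing earlier has been quietly altered. A toy sketch of the chaining idea (not real Bitcoin data structures):

import hashlib, json

def entry_hash(entry):
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def make_entry(previous_entry, who, amount):
    """Each entry commits to the hash of the previous one: hence a "chain"."""
    return {"prev_hash": entry_hash(previous_entry), "who": who, "amount": amount}

genesis = {"prev_hash": "0" * 64, "who": "origin", "amount": 0}
entry1 = make_entry(genesis, "Alice", 5)
entry2 = make_entry(entry1, "Bob", 2)

# Tampering with an earlier entry changes its hash, which breaks every later entry.
assert entry2["prev_hash"] == entry_hash(entry1)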

So when someone says they bought some Bitcoin and they're going to make a lot of money, what they mean (whether they realize it or not) is that they paid somebody some money to put them on a list, and they hope that someone later will pay them even more money to get off it.

As for me, I haven't "bought" any. As I write this, cryptocurrency prices are rising fast. But I think what is happening is a kind of run-away positive feedback loop: people are buying in because it is going up, and it is going up because people are buying in. Eventually it will run out of people to buy in, and it will stop going up. Then some people will sell, causing the feedback loop to go the other way: people will sell because it is going down, and it will go down because people are selling.

That being said, one thing in particular about cryptocurrency is making me grumpy about it, even though I don't "own" any. Recall I wrote that you can't normally make yourself a new entry on a blockchain list, but there is a way. You can do an enormous lot of computations on a computer in an attempt to find new special numbers that can be used to create new entries on the list. This process is misnamed "mining", but it's more a sort of computerized brute-force mathematical searching. Those computations take a long time and use a lot of electricity. Moreover, even the ordinary transactions generated by people "buying" and "selling" a cryptocurrency are a computational burden, since there are so many copies of the list around the world. Each list is very big: Bitcoin's is more than 100GB, and every copy needs to be updated. This uses electricity too. In fact, digiconomist.net estimates that Bitcoin computations alone presently use up enough electricity to power more than three million US households. Furthermore, the "mining" computers use GPUs that are really good for graphics and machine learning, but because cryptocurrency "miners" are buying them all up, those GPUs are getting harder to find for a good price. Personally, I am not happy with the challenges I am having in finding enough GPU resources for our computer scientists, who are hungry for GPUs for machine learning. While high demand for GPUs is maybe good for GPU manufacturers (for example, according to fortune.com, Nvidia made US$150M in one quarter in 2017 selling GPUs to cryptocurrency "miners"), surely all those GPUs, and all that electricity, can be used for something more useful than cryptocurrency.
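
For the curious, here is what that brute-force searching amounts to, in miniature: keep hashing a candidate entry together with a counter until the hash happens to start with enough zeros. Real mining uses different data structures and a vastly harder target, so this is only an illustration of why it burns so many compute cycles:

import hashlib
from itertools import count

def mine(entry_text, difficulty=5):
    """Search for a nonce whose hash starts with the required number of zeros."""
    target = "0" * difficulty
    for nonce in count():                  # brute force: try 0, 1, 2, ...
        digest = hashlib.sha256((entry_text + str(nonce)).encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest           # the "special number" and the hash it earns

print(mine("Alice pays Bob 2"))   # takes a moment of CPU time even at this toy difficulty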

/it permanent link

Thu 02 Feb 2017 13:35

Program Source Code Should be Readable by Human Beings By Definition
Version 3 of the Python programming language made a seemingly innocuous change: no longer can tabs and spaces be mixed for indentation; either tabs must be used exclusively, or spaces. Hence the following is not a valid Python 3 program:

def hello():
	print("Hello")
        print("World")
hello()
If I run it, here's what I get:
% python3 testme.py
  File "testme.py", line 3
    print("World")
                 ^
TabError: inconsistent use of tabs and spaces in indentation
However, the following is a valid Python 3 program:
def hello():
        print("Hello")
        print("World")
hello()
% python3 testme.py
Hello
World
and so is the following:
def hello():
	print("Hello")
	print("World")
hello()
% python3 testme.py
Hello
World
Confused yet?

As you can, or perhaps more to the point, can't see, the problem here is that the first program uses a tab to indent the first print statement, and spaces to indent the second print statement. The second program uses spaces to indent both, and the third program uses tabs to indent both. But because tabs and spaces are both visually rendered as whitespace, it is difficult or impossible to visually distinguish between a correct and an incorrect Python 3 program by inspecting the source code. This breaks the basic definition of source code: human-readable computer instructions.

No doubt the Python 3 designers have good intentions: to help Python programmers be consistent about indentation. But to me, it seems unreasonable to have a programming language where syntactically or semantically important distinctions are not clearly visible in the source code.
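
There are ways to protect yourself. The standard library ships a checker, tabnanny (python3 -m tabnanny testme.py), that flags ambiguous indentation, and most editors can be told to display whitespace. Failing that, even a few lines of Python can make the invisible visible; a quick sketch:

def show_indentation(filename):
    """Print the raw leading whitespace of each line, so tabs and spaces become visible."""
    with open(filename) as f:
        for number, line in enumerate(f, start=1):
            stripped = line.lstrip(" \t")
            indent = line[:len(line) - len(stripped)]
            print(number, repr(indent))

show_indentation("testme.py")
# 1 ''
# 2 '\t'          <- a tab
# 3 '        '    <- eight spaces
# 4 ''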

/it permanent link

Wed 23 Nov 2016 09:48

Slow Windows Update on Windows 7 again? Install two Windows Update patches first.
Back in May, I wrote about Windows Update for Windows 7 taking many hours or even days; the fix then was to install two patches manually first.

The problem has returned. Even if you install the two patches I mentioned in May, you may experience very slow updates on Windows 7.

Happily, again there's a workaround: grab two patches, different than before, and manually install them. Get KB3172605 and its prerequisite KB3020369 from the Microsoft Download Center, and install them manually in numeric order, before running Windows update. If making a fresh Windows 7 installation, simply install Windows 7 SP1, followed by KB3020369, then KB3172605, and only then run windows update. These two patches seem to address the slowness issues: after they were installed on some of our systems here, Windows Update ran in a reasonable amount of time.

/it permanent link

Sun 16 Oct 2016 18:02

The Price of Google
I am a Canadian still living in the city in which I was born. I love living in Canada, but life in Canada has its price. Al Purdy, the late 20th century Canadian poet, once wrote about Canada as a country where everyone knows, but nobody talks about, the fact that you can die from simply being outside. It is true, of course: almost everywhere in Canada, the winter is cold enough that a sufficient number of hours outside without protection can lead to death by exposure. But this basic fact is designed into pretty much everything in Canadian life, it is simply accepted as a given by well over thirty million Canadians, and we cope: we wear the right winter clothes, we heat and insulate our buildings in winter, we equip our cars with the right tires, and life goes on. Despite the Canadian winter, Canada is a great place to live.

Google offers a lot of very good free web services: it is "a great place to live" on the Internet, and their services are used by hundreds of millions of people all over the world. While Google seems about as far removed from a Canadian winter as you can imagine, there's something in their Terms of Service that people seem to rarely talk about, something that might have a bit of a chilling effect on one's initial ardor.

Google, to its credit, has a very clear and easy-to-read Terms of Service document. Here's an excerpt from the version of April 14, 2014, which is the most current version at the time I write this.

When you upload, submit, store, send or receive content to or through our Services, you give Google (and those we work with) a worldwide license to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights you grant in this license are for the limited purpose of operating, promoting, and improving our Services, and to develop new ones. This license continues even if you stop using our Services (for example, for a business listing you have added to Google Maps).
Let me pull out for closer examination the most important bits. For readability, I've omitted ellipses.
When you submit content to our Services, you give Google (and those we work with) a worldwide license to use such content for the purpose of our Services. This continues even if you stop using our Services.

As you can see, this is pretty broad. You are granting Google and their partners the right to use your content for Google's Services (present and future) anywhere in the world, forever. While it does say that it must be used for the purpose of their Services, it doesn't limit itself to existing Services and it doesn't constrain what a "Service" might be. Since developing and offering Services, broadly understood, pretty much covers the gamut of what Google does as a company, the answer is Yes: by submitting content to their services, you are granting Google and their partners the right to use your content anywhere in the world, forever, for a broadly unconstrained set of purposes.

So does this mean nobody should use Google? Does the Canadian winter mean that nobody should live in Canada? After all, as Al Purdy writes, in Canada you can die from simply being outside.

Well, no, of course not. While Google has the right to do broadly unconstrained things with our content that we submit to them, their self-interest is typically aligned with ours: they want us to entrust our content to them, because they use it to earn money to operate. Therefore, to persuade us to keep submitting content to them, they will work hard to protect and secure the content they already have, in ways they think we consider important. For this reason, I think it's not unreasonable to trust Google with some of my content: I believe they are likely to protect it in sensible ways. Other content I choose not to submit to Google. Just as I am prepared for a Canadian winter, knowing it is the price I pay to live in Canada, I continue to use some Google services, knowing that they will keep and use my content. Many Google services are very good and well worth using, much of my content is not very sensitive, and I trust Google enough to share content with them.

I do wonder, however, how many Google users really understand the rights they are granting to Google. Canada has been around for centuries: the Canadian winter is no secret. But the implications of Google's broad right to use our content are not quite so obvious. It's not really so clear how Google is using the content or might use it in the future, and even if we trust Google, can we trust all those who might put pressure on Google? Quite frankly, we really don't know yet how Google's massive repository of our collective content can be used. We can envision wonderful outcomes: historians a century or two hence coming to insightful conclusions about early twenty-first century society, for example, but we can also envision outcomes not quite so sanguine: for example, a twenty-first century version of Orwell's 1984, a dystopian world of "thought-crimes" and "doublespeak" where content is scanned for dissent from a prevailing ideology. A certain degree of caution is warranted: in the case of Google, unlike Canada, we may not have yet seen how severe winter can be. Yes, use Google, but use it knowing what you are doing.

One last thing to be said: I focus on Google here, but the same issues hold for Facebook, Twitter, Yahoo and other purveyors of free services over the Internet. Read their Terms of Service to learn what rights you are granting by your use of their services, and decide on the basis of that knowledge how to use their services, and even whether you use their services at all. After all, even Canadians sometimes choose to spend winter in Florida, Mexico, or Arizona.

/it permanent link

Mon 16 May 2016 20:29

The Sun-Managers Mailing list: a Knowledge Sharing Success Story
Sun-Managers was an email mailing list for system administrators of computers made by Sun Microsystems, Inc. The list operated from mid-1989 to the fall of 2014, and I was privileged to be part of it for almost all of its history. Sun-Managers was founded in May of 1989 by William (Bill) LeFebvre, at Northwestern University. At the time, Bill ran Sun-Spots, a digest-format mailing list for system administrators of Sun systems, but the digest format made it difficult for people to ask questions and get a timely response. He created Sun-Managers, an unmoderated mailing list intended for short-turnaround time questions. This was an immediate success: so much so that by the fall of 1989, the sheer number of messages on the list was swamping mailboxes. In November 1989, Bill instituted a simple policy: if someone asked a question on the list, other list members were expected to reply by email directly to the person asking the question, not to the list. The person asking the question, in turn, was expected to summarize the answers received, and send the summary to the list.

I joined the list about this time: I had started a new job at the University of Toronto's Computer Science department, a role that included the administration of a number of Sun workstations and servers. I was looking for resources to help me with my Sun system administration tasks, and this list was an excellent one. Because of this summary policy, the list volume was manageable enough that I could keep up, yet the turnaround time on questions was short. I mostly "lurked" at first, reading but not replying. I felt too inexpert to answer many questions, and too shy to ask. However, I learned a great deal from what I read. Moreover, the summaries were archived, and this archive became a resource in itself, a knowledge-base of practical information about administering Sun systems.

The list grew very rapidly: 343 summaries in 1990, and over 1000 in 1991. In August of that year, it was noted that certain questions were being asked often, and rather than waste effort answering the same question several times, a "Frequently Asked Questions" (FAQ) file was instituted. The first version was created by a list member from Boston University, and quickly grew to dozens of answers.

By November of 1992, the list had grown to thousands of members, and the workload of managing the list, editing the FAQ and coaching list members on how to follow the list policy had become significant. Many list members were not individuals, but "mail exploders": email addresses that themselves were mailing lists going to multiple individuals at a given site. This made handling list membership issues more complex. Bill LeFebvre decided to hand the list over to others. Two list members stepped up: Gene Rackow from Argonne National Laboratory to run the list software, and me, to handle the FAQ and policy work. By this time, I had benefitted from the list for a while, and I felt it was time to "give back". At the time, I wasn't in a position to actually run the list: I'd just taken on a new role as system manager of the University of Toronto Computer Science Department's teaching laboratories, and had my hands full, but I could certainly help with content. I was really glad to work together with Gene, a seasoned system administrator, on this rapidly growing list, which we moved to a system at Argonne National Labs, where Gene worked.

The list continued to grow through the 1990s. During this time, Sun Microsystems was quietly supportive, helping Gene with hardware (a Sparcstation 1) as the list grew. By 1996, over two thousand summaries a year were being produced, peaking at 2243 in 2002. In May of 1998, Gene Rackow handed over list management to Rob Montjoy from the University of Cincinnati, who in turn handed over list management to Bill Bradford in November of 2000. The list was moved from Argonne National Labs to a system in Austin run by Bill. I continued to manage the list policy and edit list information files, such as a "think before posting" reminder and the FAQ which had grown to 79 questions by December 2000. This had become a bit too large, and so 19 questions deemed less frequently asked were trimmed. A further trim was made in 2005, reducing a 65-question FAQ to one under 60.

By 2002, the list had reached over five thousand members and the workload of running the list software and managing the list subscriptions had become too much for one person. Dan Astoorian, my colleague at the University of Toronto, stepped in to help, and he was sorely needed. Moreover, the list server hardware was feeling the strain: by mid-2001, list members were being asked to contribute used equipment to upgrade the server. This was resolved in April 2003, when the list was migrated to a machine at the University of Toronto that had been donated to the University by Sun Microsystems.

But times were changing. Linux was growing rapidly and Sun's business was being affected. The web provided more resources for people seeking help administering their systems, and fewer were relying on mailing lists. The list fell below 2000 summaries per year in 2003, under 1200 in 2004, and dropped below 1000 in 2005. By 2008, summaries per year had fallen to about 300, fewer than in any full-year period previously. Sun Microsystems ran into significant difficulties during the economic downturn that year, and was sold to Oracle the following year. As for the list, in 2009, there were just over 200 summaries, declining to less than 100 in 2011. More disturbingly, the ratio of summaries to questions was steadily declining, from over 24% in 2001 to less than 16% by 2010: for some reason, list members were becoming less diligent in summarizing responses back to the list. Summaries and list traffic in general continued to decline rapidly: there were just over 50 summaries in 2012, and less than a dozen in 2013. In 2014, there were only three by October, when a hardware failure provided a good excuse to retire the list.

The Sun-Managers mailing list, over its twenty-five year lifetime, provided help to many thousands of system administrators, producing over 29000 summaries, an archive of which continues to be available. Special thanks is due to the superb people I was privileged to work together with on the list over the years: William LeFebvre, Gene Rackow, Rob Montjoy, Bill Bradford, and Dan Astoorian. Gratitude, also, is due to the thousands of list members who so freely shared their knowledge and expertise with others.

The list summary archive and an account of the list's history (on which this blog entry is based) are available at http://sunmanagers.cs.toronto.edu. The list's official web page, http://www.sunmanagers.org, continues to be maintained by Bill Bradford.

/it permanent link

Mon 09 May 2016 10:54

Slow Windows Update on Windows 7? Install two Windows Update patches first.
Recently, I noticed Windows Update taking many hours or even days on Windows 7, especially for new installs/reinstalls. Task Manager showed svchost.exe exhibiting large memory usage (suggestive of a memory leak) and/or sustained 100% CPU.

Happily, there's a workaround: grab a couple of patches to Windows Update itself, and manually install them. Get KB3050265 and KB3102810 from the Microsoft Download Center, and install them manually in that order, before running Windows update. These two patches seem to address the issues: after they were installed on some of our systems here, Windows Update ran in a reasonable amount of time (an hour or two perhaps on slow systems when many updates are needed, but not days).
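
If you have more than one machine to fix, the installs can be scripted. Here's a minimal sketch, in Python (a plain batch file would do just as well), which assumes the two .msu packages have already been downloaded from the Microsoft Download Center into the current directory; the file names shown are only illustrative, so adjust them to match what you actually downloaded:

    # Install the two Windows Update patches, in order, using wusa.exe.
    # The file names below are illustrative; use the names of the .msu files
    # you actually downloaded from the Microsoft Download Center.
    import subprocess

    patches = [
        "Windows6.1-KB3050265-x64.msu",
        "Windows6.1-KB3102810-x64.msu",
    ]

    for msu in patches:
        # /quiet suppresses prompts; /norestart defers the reboot until the end.
        # An exit code of 3010 means "installed successfully, reboot required".
        rc = subprocess.run(["wusa.exe", msu, "/quiet", "/norestart"]).returncode
        print(msu, "exit code", rc)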

/it permanent link

Fri 04 Mar 2016 10:25

Apple vs FBI: it is about setting a precedent.
There seems to be lots of confusion about Apple's current dispute with the FBI, despite Apple's message to their customers of Feb 16, 2016, where they tried to explain the issue. Here's the issue in a nutshell.

The FBI has an Apple iPhone that was the work-phone of a now-dead terrorist. The FBI wants to read what is on that phone. But the phone is encrypted, and runs a secure version of iOS. The FBI wants Apple to make an insecure version of iOS to run on that phone, so that the FBI can break into the phone and read the contents. Apple has, so far, refused.

This issue will no doubt be addressed in the US courts and legislatures. What is at stake is the precedent it sets. The essential question is this: to what extent should law enforcement be able to compel others to assist them with an investigation? Should software developers be expected to make insecure versions of their software, so that law enforcement can "break in"? It will be very interesting to see how this plays out.

/it permanent link

Fri 13 Mar 2015 11:08

Apple's new Macbook laptop: like a tablet?

I rarely write about Apple's products because they have no shortage of press already: Apple has superb marketing, and many of their products are remarkable in one way or another, often for excellent design and engineering. Their new super-thin Macbook laptop is no exception: it's very thin and light, has a superb high-resolution screen, a carefully redesigned trackpad and keyboard, and is very power-efficient. New to this machine is the fact that it has only a single USB-C port for power, data, and video (it also has a headphone port for audio). Most laptops have many more ports than this. A USB port used for both power and data, and a headphone port, but nothing else, is more typical of a tablet than of a laptop. Indeed, some of the press seems to have really latched onto this "tablet" comparison. Brooke Crothers of Foxnews/Tech claims that the MacBook is "almost a tablet" and states that the MacBook "is an iPad with a keyboard" while Lily Hay Newman of Slate claims that "you should think of the new macbook as a tablet". So how true is this? Is the new MacBook like a tablet?

Well, no, it's not. The MacBook's screen is not touch-capable and cannot be used like a tablet screen. The keyboard and touchpad are an integral part of the machine: they are not optional or detachable. It runs a desktop/laptop operating system (Mac OS X), not a tablet operating system such as iOS. The device is not a tablet, it is not "almost a tablet", it is not even like a tablet. It's a small, light, power-efficient laptop. If it must be compared to something, perhaps it can be compared to a netbook, though it has a much better keyboard, touchpad and screen, and is much more expensive.

Then what about the single I/O port? That's simply the consequence of the new USB 3.1 specification, which finally allows a USB connection to deliver enough power to power a laptop, and defines the USB-C connector, which in addition to USB data lines, provides "alternate mode" data lines that can be used for display protocols like DisplayPort. This makes it possible for Apple to build multiport adapters for the Macbook that provide video (e.g. HDMI), data (USB-A) and charging ports, making it unnecessary to provide all those ports separately in the laptop itself.

So does this make the Macbook "like a tablet"? While it is true that tablets have been using single connectors for power and data for a long time, this doesn't make the Macbook tablet-like. It's not the presence of a single shared power/data connector that makes something like a tablet, it's the interactive screen. Yes, a horse has four legs and is often sat upon, but a horse is not anything like a chair.

So will I be getting one of the new Macbooks? Probably not: like a fine thoroughbred, the new Macbook is lovely but rather too expensive for me. The need to buy the multiport adapter separately makes the already high cost of acquisition even higher. The high price doesn't stop me from admiring the design and engineering of this new laptop, but it does keep me from buying one.

/it permanent link

Sat 05 Oct 2013 17:03

What's wrong with Blackberry? (and some ideas about how to fix it)
Blackberry is in the news a fair bit these days, and the news seems to be all bad. As the firm reports close to a billion dollars in quarterly losses, a Gartner analyst recommends that enterprise customers find alternatives to Blackberry over the next six months. What's the problem?

Basically, fewer and fewer people want to buy Blackberry phones. The problem isn't so much that Blackberries don't do what they're supposed to, it's that people now perceive iPhones and various Android phones as much better choices, and are buying those instead. Why? The reason is that an iPhone or an Android phone isn't the same sort of phone as a traditional Blackberry. An iPhone or an Android phone is a true smartphone, i.e. an "app" phone, a platform that runs a whole "ecosystem" of third party software. A traditional Blackberry is a "messaging" phone, a device that specializes in effective messaging, such as email. Yes, it can run applications too, but that's not its primary function, and it shows.

To illustrate, consider email. Sending email requires the ability to type quickly. A physical keyboard works best for this, one that stretches across the short side of the phone. The screen, located above the keyboard, then becomes roughly square: it can't be very wide, because the phone will then become too wide to hold easily or to fit in one's pocket, and it can't be very tall or the phone will become too long. A square screen is fine for messaging, but for other things that a smartphone might like to do, such as displaying video, one wants a screen that is significantly wider than it is tall. A smartphone handles this with a rectangular screen: for messaging, one holds the phone vertically; the bottom half of the screen then turns into a keyboard, and the top half into a roughly square messaging display. When watching media, such as videos, the phone is held horizontally, allowing a screen that is wider than it is tall. Hence the smartphone is useful in a broader set of ways: it is not just a messaging device. Smartphones have become good enough at messaging that many people do not feel they need a dedicated messaging device. Once the smartphone is the only device that people feel they need to carry, there's much less demand for a messaging phone.

Blackberry realized the problem, and tried to create a smartphone of its own. For instance, in 2008, it released the Blackberry Storm. But it became clear that Blackberry's phone OS was not as well suited for general smartphone use as iOS and Android. The Storm was not a commercial success because it did not work as well as competing phones. In response, in 2010 Blackberry bought a company called QNX that had a powerful OS, and started building devices to use it: first the Playbook, released in spring 2011, and then the Z10 phone in early 2013, followed a few months later by the Q10 and other phone models.

The new Blackberry OS works better than the old in delivering smartphone apps, but it was not very mature in 2011, and was available only on a tablet (the Blackberry Playbook). Unfortunately, the Playbook did not sell particularly well because Blackberry badly misrepresented it, calling it the "best professional-grade tablet in the industry", though it lacked many features of the market-leading iPad, including key messaging features such as a standalone email client. While it could have been a market success if it were marketed as a Blackberry phone accessory, a role it could effectively play, at release it was clearly not a true general-purpose tablet like the iPad. So it accumulated few apps, while Apple's iOS and Google's Android accumulated many. Blackberry realized this fairly quickly, and released an Android application emulation environment for its OS in early 2012, which allowed many Android apps to be easily moved over to the new OS. But few Android developers bothered to make Blackberry versions of their Android apps, given the relatively few Playbooks sold.

In the meanwhile, Blackberry did itself no favours by making it clear that there was no future for its existing phones, while failing to deliver a phone running its new OS for more than a year. This merely encouraged Blackberry users and app developers alike to switch to another platform. When the Z10 phone finally came out in 2013, the bulk of its apps were those that had been written for or ported to the Playbook, a far less rich set of applications than any Android or iOS phone. And while the Z10 is a decent phone that comes with some very nice messaging features, Blackberry did not do an effective job of touting the unique features of the Z10 that iPhones and Android phones do not have. Moreover, the price was set high (about the same as an iPhone or high end Android phone) and Blackberry produced a huge number, expecting to sell a great many. Some sold, but many didn't, and Blackberry's recent $1B loss was due primarily to writing down the value of unsold Z10s.

Blackberry sits today in a difficult position. No, it is not about to go out of business: the company is debt-free and has a couple of billion dollars in the bank. But its smartphone is not selling. What should it do now?

Blackberry's best chance at this point to make its smartphone platform viable is to take its large inventories of written-down Z10 phones and sell them cheaply, using a renewed marketing campaign that focuses on the unique features of the phone's software. The Z10 hardware is really no different than the various Android and iPhone models out there: if the phone is to sell, it has to be on the basis of what makes it unique, and that's the Blackberry OS software. For instance, Blackberry should show everyone the clever virtual keyboard that supports fast one-handed typing, the unique messaging hub, and the "Blackberry Balance" software that lets you separate work items from personal items on the phone. Blackberry needs to hire the best marketing people in the world to help get the message out. This is a "make or break" situation for the platform.

Secondly, Blackberry should modify the OS to run Android apps natively, without repackaging. Android app developers are not going to repackage their apps for Blackberry. Blackberry needs to recognize this and make sure that Android apps will appear automatically on Blackberry devices. Blackberry will need to find a way to get Google Play (the Android app store) ported to the platform. It is too late to build a separate app ecosystem around the Blackberry OS: it has to leverage an existing ecosystem, or die. Android is really the only viable option for Blackberry right now.

Finally, Blackberry needs to recognize that a niche market for dedicated messaging devices exists, and continue making devices that are the best messaging phones available, while tapping into an existing app ecosystem. Blackberry needs to be careful not to compromise the devices' effectiveness for messaging: it should pay attention to how people use the devices in the real world, and address quickly whatever issues they have. If Blackberry can't find a way of building such messaging devices using its own OS, it should switch to Android. Blackberry knows how to make superb messaging phones, and it should find a way to continue to do what it does best.

/it permanent link

Tue 20 Aug 2013 22:45

Cloud Computing: Everything Old is New Again
There is a great deal of hype about Cloud Computing at the moment, and it's getting a great deal of attention. It's no wonder: when firms such as Netflix, with a market capitalization of over U$15B, use cloud computing to deliver streaming video services to nearly forty million customers around the world, and when the US Central Intelligence Agency spends U$600M for cloud computing services, people take notice. But what is it all about?

Cloud computing is not really a new thing, it's a variation of a very old idea, with a new name. In the 1960s, when computers were large and expensive, not everyone could afford their own. Techniques for sharing computers were developed, and firms arose whose business was selling time on computers to other firms. This was most commonly described as "timesharing". IBM released its VM virtualization environment in 1972, which allowed a mainframe computer to be divided up into virtual computers, each for a different workload. A timesharing vendor could buy and operate an IBM computer, then rent to their customers "virtual computers" that ran on that machine. From the customer's perspective, it was a way to obtain access to computing without buying one's own computer. From the vendor's perspective, it was a way of "renting out" one's investment in computer infrastructure, as a viable business.

Today, cloud computing, as did timesharing in the past, involves the renting of virtual computers to customers. The name has changed: then, it was called "timesharing"; now, "cloud computing". The type of physical machine has changed: then, a mainframe was used to provide computing services; now, a grid computer. The interconnection has changed: then, leased data lines were typically used; now, the internet. But the basic concept is the same: a vendor rents virtual computers to customers, who then use the virtual computers for their computing, rather than buying their own physical computers.

The advantages and disadvantages of today's cloud computing echo the pros and cons of yesterday's timesharing. Advantages include risk sharing, the ability to pay for just the amount of computing needed, the option to scale up or down quickly, the option to obtain computing resources without having to develop and maintain expertise in operating and maintaining those resources, and the ability to gain access to computing resources in very large or very small quantities very quickly and easily. Moreover, cloud computing vendors can develop economies of scale in running physical computers and data centres, economies that they can leverage to decrease the cost of computing for their customers. Disadvantages of cloud computing include possibly higher unit costs for resources (for example, cloud data storage and data transfer can be very expensive, especially in large quantities), a critical dependance on the cloud computing vendor, variable computing performance, substantial security and privacy issues, greater legal complexity, and so on. These tradeoffs are neither surprising nor particularly new: in fact, many are typical of "buy" vs. "rent" decisions in general.

Then why does cloud computing seem so new? That, I think, is an artifact of history. In the 1970s and early 1980s, computers were expensive and timesharing was popular. In the 1990s and early 2000s, computers became increasingly cheaper, and running one's own became enormously popular. Timesharing faded away as people bought and ran their own computers. Now the pendulum is swinging back, not driven so much by the cost of computers themselves, but the costs of datacentres to house them. A few years ago, Amazon Inc. saw a business opportunity in making virtual machines available for rental: it was building grid computers (and datacentres to house them) for its own operations anyway; why not rent out some of those computing resources to other firms? In so doing, Amazon developed an important new line of business. At the same time, a huge number of new internet firms arose, such as Netflix, whose operations are dominantly or exclusively that of providing various computer-related services over the internet, and it made a great deal of sense for such firms to use Amazon's service. After all, when a company's operations are primarily or exclusively serving customers on the internet, why not make use of computing resources that are already on the internet, rather than build private datacentres (which takes time, money and expertise)? These new internet firms, with lines of business that were not even possible a decade or two ago, and Amazon's service, also only a few years old, have lent their sheen of newness to the notion of "cloud computing" itself, making it appear fresh, inventive, novel. But is it? The name is new, yes. But in truth, the concept is almost as old as commercial computing itself: it has merely been reinvented for the internet.

Of course, the computing field, because of its inventiveness, high rate of change and increasing social profile, is rather at risk of falling into trendiness, and cloud computing certainly has become a significant trend. The danger of trendiness is that some will adopt cloud computing not on its own merits, but solely because it seems to be the latest tech tsunami: they want to ride the wave, not be swamped by it. But cloud computing is complex, with many pros and cons; it is certainly a legitimate choice, as was timesharing before it, but it is not necessarily the best thing for everyone. It's easier to see this, I think, if we look beyond the name, beyond the trend, and see that the "rent or buy" question for computing has been with us for decades, and the decision between renting virtual machines and buying physical ones has often been complex, a balance of risks, opportunities, and resources. For an internet firm whose customers are exclusively on the internet, renting one's computing assets on the internet may make a great deal of sense. For other firms, it may not make sense at all. Deciding which is true for one's own firm takes wisdom and prudence; a healthy dose of historical perspective is unlikely to hurt, and may help cut through the hype.

/it permanent link

Wed 22 Aug 2012 14:07

Intel desktop CPU price-performance: Hyperthreading not helping?
Typically, CPU prices follow performance. Faster CPUs command higher prices; slower CPUs are available for less. Recent Intel desktop CPUs continue to show this general pattern, but there appears to be more to the story than usual.

At first glance, everything seems to be what you would expect. Using current pricing in US$ at time of writing from newegg.com, we get:
Processor       PassMark   Price   PassMark/$   Price-Performance vs G640
Pentium G640    2893       $79     36.6         100%
i3-2120         4222       $125    33.8         92.2%
i5-3570         7684       $215    35.7         97.6%
i7-3770         10359      $310    33.4         91.3%
The PassMark (http://www.cpubenchmark.net/) to dollar ratio is pretty consistent across all these processors, roughly 35 ± 2.

But what happens if we look at a more real-life benchmark? Consider SPEC CPU 2006 Integer (CINT2006) Baseline. For each CPU, I used the CINT2006 Baseline results from the most recently reported Intel reference system, as reported on spec.org. In the case of the G640, no Intel reference system was reported, so I used the results for a Fujitsu Primergy TX140 S1p.
Processor       CINT2006 Base   Price   CINT/$   Price-Performance vs G640
Pentium G640    34.4            $79     0.44     100%
i3-2120         36.9            $125    0.30     67.8%
i5-3570         48.5            $215    0.23     51.8%
i7-3770         50.5            $310    0.16     37.4%
When looking at CINT2006 Baseline, we see the price-performance ratio drop off dramatically as the processor price increases. We would expect this from the i3 to the i5, since CINT2006 is a single-job benchmark and the move from the i3 to the i5 is a transition from two cores to four, but it's curious to see the dropoff in the price-performance ratio between the G640 and the i3 (both dual-core CPUs), and between the i5 and the i7 (both quad-core CPUs). What might be going on?

A look at hyperthreading may provide some answers. Intel hyperthreading is a feature of some Intel CPUs that allows each physical core to represent itself to the OS as two different "cores". If those two "cores" simultaneously run code that happens to use different parts of the physical core, they can proceed in parallel. If not, one of the "cores" will block. The i3 and i7 CPUs offer hyperthreading; the Pentium G and i5 do not. It turns out that the PassMark benchmark sees significant speedups when hyperthreading is turned on. SPEC CINT2006, and many ordinary applications, do not.

What about SPEC CINT2006 Rate Baseline, then? The SPEC CPU Rate benchmarks measure throughput, not just single-job performance, so maybe hyperthreading helps more here? Let's see:
Processor       CINT2006 Rate Base   Price   Rate Base/$   Price-Performance vs G640
Pentium G640    61.7                 $79     0.78          100%
i3-2120         78.8                 $125    0.63          80.7%
i5-3570         146                  $215    0.68          87.0%
i7-3770         177                  $310    0.57          73.1%
If we look at the transition from two to four cores (by comparing the i3 to the i5), we now see that the price-performance of the i5 is better than the i3: this is no surprise, since we are now measuring throughput, and from the i3 to the i5, we go from two to four cores. But there still is a dropoff in price-performance between the Pentium G and the i3, and again between the i5 and the i7. It's not as extreme as before, but it is still significant. This suggests that hyperthreading may help with throughput, but not as much as the increase in price would suggest.
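
The ratios in these tables are easy to reproduce. Here's a small Python sketch, using the benchmark scores and prices quoted above, that computes performance per dollar and price-performance relative to the G640 for all three benchmarks:

    # Recompute the price-performance tables above from the quoted scores and prices.
    prices = {"Pentium G640": 79, "i3-2120": 125, "i5-3570": 215, "i7-3770": 310}

    benchmarks = {
        "PassMark": {"Pentium G640": 2893, "i3-2120": 4222,
                     "i5-3570": 7684, "i7-3770": 10359},
        "CINT2006 Base": {"Pentium G640": 34.4, "i3-2120": 36.9,
                          "i5-3570": 48.5, "i7-3770": 50.5},
        "CINT2006 Rate Base": {"Pentium G640": 61.7, "i3-2120": 78.8,
                               "i5-3570": 146, "i7-3770": 177},
    }

    for bench, scores in benchmarks.items():
        baseline = scores["Pentium G640"] / prices["Pentium G640"]
        print(bench)
        for cpu, score in scores.items():
            per_dollar = score / prices[cpu]
            print("  %-13s %6.2f per dollar  %5.1f%% vs G640"
                  % (cpu, per_dollar, 100.0 * per_dollar / baseline))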

What does this mean, then? It suggests the increase in price from a non-hyperthreaded to a hyperthreaded Intel desktop processor may reflect more an increase in PassMark performance than an increase in real performance. Hyperthreading may have a positive effect, it seems, but typically not as much as PassMark suggests. At present, for best real-world price-performance in Intel desktop CPUs, I would consider models without hyperthreading.

/it permanent link

Tue 26 Jun 2012 16:56

How to avoid being fooled by "phishing" email.
A "phishing" email is an email message that tries to convince you to reveal your passwords or other personal details. Most often, it tries to send you to a website that looks like the real thing (e.g. your bank or your email provider) but is really a clever duplicate of the real website that's set up by crooks to steal your information. Often the pretence looks authentic. If you fall for it and give your password or other personal details, criminals may steal your identity, clean out your bank account, send junk email from your email account, use your online trading account to buy some penny stock you never heard of, send email to all the people in your address book telling them you're stranded in a foreign country and need them to wire money immediately, or do any number of other bad things.

But there's a really easy way to avoid being fooled by phishing messages. If you get a message that asks you to confirm or update your account details, never, ever go to the website using a link that is in the email message itself. Remember, anyone can send you a message with any sort of fraudulent claim, containing any number of links that pretend to go to one place, but really go to another. So if you feel you must check, go to the website that you know for sure is the real thing: use your own bookmark (or type in the URL yourself), not the link in the message.

/it permanent link

Tue 26 Jul 2011 17:15

Gigabit ethernet, and Category 5, 5e cabling.
There seems to be lots of folklore that says that Category 5 (Cat5) cabling can't run gigabit ethernet. Contrary to widespread belief, that's mostly false. Here's the situation. Cat5 has the bandwidth to run 1000baseT. But early experience showed that 1000baseT was pickier about certain cabling issues that weren't specified in the Cat5 standard, such as crosstalk and delay skew, so the Cat5 standard was enhanced to enforce limits on these. This enhanced standard is called Cat5e. But the fact is that most Cat5 installations already perform to the Cat5e spec.

If someone tells you to rip out a Cat5 installation because it can't support 1000baseT, you're being prompted to do something that is expensive and probably unnecessary. All you generally need to do is test the existing cables to the Cat5e standard (using a Cat5e cable tester) and replace the ones that fail. Often, most if not all of the cables will be fine. Or just use the cables for 1000baseT and replace any that exhibit problems.

Cat6 and Cat6a are a different matter. Cat6 supports a spectral bandwidth of 250MHz, up from Cat5/Cat5e's 100MHz, while Cat6a supports 500MHz. Cat6 cabling will run ten gigabit ethernet (10GbaseT) to 37-55m, while Cat6a will run 10GbaseT to 100m. So it's worth choosing Cat6 or Cat6a over Cat5e for new cabling, if the cost increment isn't too high, so that the cabling can support 10GbaseT, even if it's not needed today.

/it permanent link

Wed 23 Feb 2011 11:10

Exchanging files in docx format may lead to problems
When Microsoft came out with Office 2007, the default save format for files was switched to a new format based on XML. For Microsoft Word, for example, instead of files being saved in .doc format by default, they were now saved in .docx format. If you use Microsoft Word 2007 or 2010, you'll notice that when you save a Word document, it saves it as document.docx instead of document.doc.

Unfortunately, now there seems to be an incompatibility between how Word 2007 and Word 2010 interpret .docx files. Apparently, possibly depending on how one's printer is configured, when users of Word 2007 and Word 2010 share files in .docx format, some spaces (seemingly random) between words in the file are dropped.

This has been reported in various places on the net, including the CBS Interactive Business Network, cNET.com, and Microsoft's own user forums.

For now, I suggest using the older .doc format for users of different versions of Microsoft Word to exchange documents. For publishing documents, instead of using a native Word format, I suggest using a widely-used open document standard like PDF. CutePDF is a useful free Windows printer driver that lets you create PDF files from any Windows application by simply printing to a CutePDF printer.

/it permanent link

Fri 03 Dec 2010 21:52

What's right about ikiwiki?
Chris Siebenmann pointed me today at ikiwiki. It's a wiki that can also function as a blog. It's potentially interesting, he said. And he was right: to me, it seems definitely interesting. I've only started looking at it, but there's something about it that I like very much, something it does right that most web 2.0 applications seem to do wrong: ikiwiki uses the right sort of tool to store the wiki text. What's the right tool? In my opinion, it's a revision control system (well, to be more exact, a filesystem coupled with a revision control system).

Why is this the right tool? Well, what's wiki data? It's a collection of edited text documents. Databases, such as those used by most wikis and blogs, are designed for large collections of data records, not documents. Yes, they can handle documents, but using them for a collection of documents is like using a tractor-trailer for a trip to the beach. Yes, you can do it, but it's a bit excessive, and you may end up stuck in the sand. Rather, it seems to me that a filesystem, not a database, is the appropriate tool for document storage, and a revision control system, not a database, is the tool of choice to keep track of document edits.

Then why do so many wiki and blog implementations use databases such as mysql or postgres as their back-end? I don't know. I suspect it's a lack of imagination: when you're holding a hammer, everything looks like a nail. In fact, "lite" versions of these databases (e.g. SQLite) have been created to take advantage of the fact that the full power of these database systems is not needed by many systems that use them. But "lite" databases for wiki/blog back-ends seem to me to be like cardboard tractor-trailers: still the wrong tool, but with some of the overkill stripped out.

Even more to ikiwiki's credit, beyond having what I think is the right sort of backend, it also allows you to use a wide array of different revision control systems (svn, git, cvs, mercurial, etc.), or even no revision control system at all. I like this. Revision control systems seem to be a matter of widely varying preference, and ikiwiki's agnosticism in this regard makes it appealing to a wider array of users.

I've only started looking at ikiwiki, and it may be that in the end, I'll decide I don't like it for some reason or another, but whether I end up liking it or not, or whether we use it or not, I think ikiwiki is right in using a revision control system instead of a database for its backend. I wish it were not so rare in this respect.

/it permanent link

Tue 04 May 2010 14:51

Adding logout to Firefox: making HTTP authentication more useful.

The HTTP protocol (on which the world wide web is based) offers two forms of simple authentication that are built into pretty much every web browser: Basic authentication and Digest Authentication. For both these authentication mechanisms, the web browser obtains authentication information from the user and retains it to submit to the web site on the user's behalf. A set of authentication information retained for a site by a running web browser is called an authenticated session.

Unfortunately, in most web browsers, including Firefox, there is no easy way to delete that information. Hence once you are authenticated to a web site as a particular user, your web browser will continue to authenticate you to that web site as that user until you exit your browser. It's easy to see this in action: simply go to a site that requires basic or digest authentication, authenticate, browse somewhere else, then return to that site. Did it ask you to enter your password again? No, it remembered who you had authenticated as before, and connected you immediately as that user.
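
For the curious, here's a minimal sketch of the mechanism, using Python's standard http.server module with a made-up user name, password and realm. Run it, visit http://localhost:8000/ and authenticate; from then on the browser silently resends the same Authorization header on every request to that realm, which is exactly the cached authenticated session described above:

    # Minimal HTTP Basic authentication demo (a sketch, not production code).
    # The user name, password and realm are made up for illustration.
    import base64
    from http.server import BaseHTTPRequestHandler, HTTPServer

    USER, PASSWORD = "demo", "secret"
    EXPECTED = "Basic " + base64.b64encode(
        ("%s:%s" % (USER, PASSWORD)).encode()).decode()

    class AuthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.headers.get("Authorization") != EXPECTED:
                # Challenge the browser: it prompts once, then caches the reply.
                self.send_response(401)
                self.send_header("WWW-Authenticate", 'Basic realm="demo-realm"')
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"You are authenticated as " + USER.encode() + b"\n")

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), AuthHandler).serve_forever()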

This is often what you want, but not always. Sometimes you might want to log out as one user and log in as a different user. You can't easily do this in most web browsers without exiting and restarting the browser. Or perhaps you may want to allow someone else to use your web browser, and you don't want to give them your access to certain sites. It would be useful to be able to clear your authenticated sessions.

Some web browsers, such as Firefox, permit clearing all internal authentication and identification information: cached data, cookies and authenticated sessions. In more recent versions of Firefox, the feature is called private browsing, and is focused primarily on browsing without leaving privacy information behind. But this is a pretty blunt instrument: all potentially sensitive data is cleared, such as cookies, not just authenticated sessions. What if all you want to do is log out?

My HTTP logout add-on for Firefox is intended to change this. It adds two menu options to Firefox, one on the Tools menu, and the other on the menu you get when you right-click on the background. In each case, the menu option is called HTTP logout all, and if you select it, it will clear all authenticated sessions in your running web browser. You can easily try it: after installing the add-on, go to a site that requires basic or digest authentication, and authenticate. Now choose "HTTP logout all", and reload/refresh that page. It will not recognize you as the person who had logged in before, and will ask you to log in again.

I'm not the only person who wants the ability to log out when using HTTP authentication. Many of us who have implemented web sites using Basic or Digest authentication have often been asked by users "How do I log out"? On this topic, the Apache foundation writes:

        Since browsers first started implementing basic authentication,
        website administrators have wanted to know how to let the user log
        out. Since the browser caches the username and password with the
        authentication realm, as described earlier in this tutorial, this
        is not a function of the server configuration, but is a question
        of getting the browser to forget the credential information, so
        that the next time the resource is requested, the username and
        password must be supplied again. There are numerous situations in
        which this is desirable, such as when using a browser in a public
        location, and not wishing to leave the browser logged in, so that
        the next person can get into your bank account.  

        However, although this is perhaps the most frequently asked question
        about basic authentication, thus far none of the major browser
        manufacturers have seen this as being a desirable feature to put
        into their products.  

        Consequently, the answer to this question is, you can't. Sorry.

        - Apache 1.3 documentation.
Now at least Firefox users can.

/it permanent link

Fri 08 Jan 2010 16:02

Startssl: a better approach to SSL certificates
Perhaps one of the highest profit-margin businesses on the internet is the provisioning of domain SSL certificates. The reason: prices for domain SSL certificates are often very high (up to hundreds of dollars for a 1yr domain certificate), but the cost of producing them is often very low: generally, all that is needed is a simple automated web site that authenticates via email. Typically no human being needs to be involved. Then why do they cost so much money? Probably because only a few certificate vendors are trusted by default in the major web browsers. Nobody wants to use a certificate that is not trusted by default in all the major web browsers, because that would mean a person using one of those browsers will, by default, see scary messages whenever (s)he tries to access the site.

Traditionally, SSL certificate vendors have competed by advertising, each attempting to convince customers that it is more trustworthy than the other guy and thus worth paying more for. But this is generally irrelevant: if the browser trusts the SSL certificate by default, the site will work out of the box, without any scary messages, and the only people who are going to even notice which vendor is used are those who stop to examine the SSL certificate in detail. Few do.

It would be nice (for SSL certificate customers at least) if SSL certificate vendors would start to compete more by price instead. There has been some of that in recent years, but the price of a one year simple domain SSL certificate is still upwards of U$10, with prices most often several times that amount. This is a lot of money for something that is pretty close to zero-cost to create.

Recently, things have started to change. In 2009, Startcom became trusted as a certificate authority by all the major browsers (IE, Firefox, Safari, Chrome). But Startcom is not a traditional SSL certificate vendor. Instead of charging per certificate, Startcom's Certification Authority gives away certificates for free, and charges instead for various authentication services. Simple authentication (the sort that can be done automatically through sending email to a known address and then asking the person to enter into a webpage the secret code that was sent) is free, because it can be fully automated, and thus done cheaply. Once authenticated, the person can generate an unlimited number of the most common sort of domain SSL certificates (1 yr single domain name). More extensive authentication, the sort that requires the participation of a human being to verify a person's identity documents, costs a modest amount of money (U$40/yr). Once authenticated at this higher level, the person can generate as many of the less common sorts of domain SSL certificates (e.g. 2yr, or wildcard) as necessary. More extensive authentication services are available, at additional cost. Thus Startcom charges for the sort of services that are more intrinsically expensive (e.g. services that require the attention of a human being, such as extended authentication), and not for automated services that are entirely performed by computer (such as certificate generation). This seems much fairer to the customer.

Is this the future of SSL certificates? I suspect most of the SSL certificate vendors would prefer it not to be: SSL certificate generation is quite profitable at the moment. But it is better economics: the price being charged more closely approximates the cost to offer the service. So if the market for SSL certificates is to more closely approximate a free market, Startcom's approach seems quite promising.

/it permanent link

Fri 14 Aug 2009 16:45

What's Good About Twitter?
Twitter has a mixed reputation. Negative views generally express the notion that Twitter is pretty much useless, or is a massive waste of time. Indeed, there is no shortage of evidence for this view. What is the usefulness of knowing that someone is brushing their teeth, or having cereal for breakfast? Probably not much. The problem is that "What are you doing?", the question that a tweet is allegedly supposed to answer, is often not very interesting. What one is thinking, what one has stumbled across, or what one wants to tell the world, could be much more interesting.

One very useful purpose Twitter serves is to announce new articles, blog entries, events, or news items when they appear. Twitterfeed makes this easy: it will check an Atom or RSS feed periodically, and automatically tweet the titles and URLs of new articles to Twitter, allowing anyone following the tweeter to be made aware of the new material. For example, my department now uses Twitter to disseminate its news and events.

So is Twitter a waste of time? Is Twitter useless? Only if one takes Twitter's "What are you doing?" too literally. Indeed, some seem to feel the need to tell others whenever they're yawning, scratching an itch or drinking coffee. Clearly this is not the most interesting of material. But, on the other hand, if one uses Twitter to follow information sources (people or organizations) with a high information content, and/or to disseminate such information oneself, it can be very useful indeed.

/it permanent link

Wed 10 Jun 2009 13:43

How well do Java Virtual Machines Scale Up?
Java seems to be a popular language for small to medium-sized applications and its use at that scale is well understood. But what happens when you scale it up to something quite large? It seems that very large Java Virtual Machines (JVMs) are still rather rare. Blackboard is a Java-based learning management system (LMS) now in use at the University of Toronto. The University is rather large, with a great many courses and students, and its Blackboard JVM is correspondingly huge. It turns out that an ordinary off-the-shelf JVM suffered some unusual and unpleasant performance issues (mostly related to garbage collection) when scaled this large. The university and Sun Microsystems eventually resolved the issues quite nicely (using the new and still somewhat experimental Java G1 garbage collector) but it was an eventful journey. John Calvin of the University has put together a rather interesting talk about this, which will be given at the university on June 23rd, and later this summer at BBWorld 2009.

/it permanent link

Tue 07 Apr 2009 14:43

Understanding Portable Displays
Perhaps the most important thing about a portable computer, be it a notebook, netbook, PDA, smartphone, or internet tablet, is what it provides you versus what it demands from you. One of the most important things a portable machine provides is logical screen area or screen resolution: the amount of data it can show you on the screen at one time. But one of the most important things a portable machine demands is weight: what does it take to carry it?

Screen resolution is measured as a grid of pixels (dots) in width x height format, e.g. 1024x768 means a screen that is 1024 dots wide and 768 dots high. Weight is of course not the only thing that determines portability: size is important too, but generally larger machines are heavier and smaller ones are lighter, so weight is a good shorthand for "weight and size".

A quick way to approach the costs and benefits of a portable computer is to compute the ratio of its benefits (e.g. screen resolution) to its portability cost (e.g. weight). So a quick assessment of a portable computer is to compute its pixel to weight ratio: if the ratio is high, the machine may compare favourably to one with a lower pixel to weight ratio. I've written a little tool to compute this information (in units of pixels per milligram, i.e. ppmg), at http://www.cs.toronto.edu/~jdd/screenspec.cgi.

Pixel to weight ratio isn't quite enough, though, because there are limits to human sight: a portable computer is of no use if the pixels are so small that they cannot be easily seen. "Too small" depends on the distance the screen is from one's eyes. I tend to use devices like cellphones and PDAs at a distance of 18 inches from my eyes, and laptops at 24 inches. Generally, the highest comfortable pixel density scales inversely with viewing distance: distance multiplied by tolerable pixels per inch is roughly constant. For example, I'm quite comfortable with 170 ppi at 24 inches, but beyond that, I feel some eyestrain. At 18 inches, that works out to (170 x 24) / 18 = 227 ppi. In my (anecdotal) experience, many people seem comfortable with 125ppi at 24 inches and 167ppi at 18. Of course, there is much more to this than a simple ratio: tolerance for high pixel densities varies depending on what the person is doing, what fonts are being used, and many other things.

Still, a pixel to weight approach lets one compare machines in interesting ways: for example, an Apple iPod Touch has a 3.5" 480x320 screen and weighs 115g; that's 164 ppi and a pixel to weight ratio of 1.3. This is comparable to a Nokia E90 Communicator, which has a 4" 800x352 screen and a weight of 210g; its ppi is 218 and pixel to weight ratio is 1.34. But now consider a Nokia N810 Internet tablet: its 4.13in 800x480 screen (ppi is 225) and weight of 226g give a significantly higher pixel to weight ratio of 1.69. But with ppi around 220 vs. the iPod's 164, either Nokia device may result in eyestrain where the iPod Touch does not.
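
For those who would rather compute these figures than take my word for them, here's a small Python sketch of the arithmetic, using the device specifications quoted above (rounding may differ by a point or so from the figures in the text):

    # Pixel density (ppi) and pixel-to-weight ratio (pixels per milligram)
    # for the example devices discussed in this entry.
    from math import hypot

    LB_TO_G = 453.6  # grams per pound

    def ppi(width_px, height_px, diagonal_in):
        # Pixels per inch along the screen diagonal.
        return hypot(width_px, height_px) / diagonal_in

    def pixels_per_mg(width_px, height_px, weight_g):
        # Logical screen area (total pixels) divided by weight in milligrams.
        return (width_px * height_px) / (weight_g * 1000.0)

    # (name, width, height, diagonal in inches, weight in grams)
    devices = [
        ("Apple iPod Touch", 480, 320, 3.5, 115),
        ("Nokia E90", 800, 352, 4.0, 210),
        ("Nokia N810", 800, 480, 4.13, 226),
        ("Dell Vostro 2510", 1920, 1200, 15.4, 5.72 * LB_TO_G),
        ("Dell Mini 10", 1366, 768, 10.0, 2.86 * LB_TO_G),
        ("MacBook Air", 1280, 800, 13.3, 3.00 * LB_TO_G),
    ]

    for name, w, h, diag, grams in devices:
        print("%-18s %4.0f ppi  %4.2f pixels/mg"
              % (name, ppi(w, h, diag), pixels_per_mg(w, h, grams)))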

Now look at some notebooks. A (large and heavy) Dell Vostro 2510 notebook weighing 5.72lbs with a 15.4" WUXGA (1920x1200) screen offers 147ppi and a pixel to weight ratio of 0.9, which is (perhaps surprisingly) a higher pixel to weight ratio than a (small and light) netbook, the Dell Mini 10 with the new high-resolution 10" 1366x768 screen (ppi of 155); its weight of 2.86lbs results in a lower pixel to weight ratio of 0.8 (at a slightly higher ppi, too). Compare this to a MacBook Air: with a 13.3" 1280x800 screen, it weighs 3 lbs; its pixel to weight ratio is 0.75. Unlike the other two, though, the MacBook Air has an easier-to-read ppi of 113.

Of course, this doesn't mean that one should pick portable computers based solely (or even mostly) on pixel to weight ratios, or ppi for that matter. It is merely one possibly useful way to compare portable machines, and should be at most only one criterion among many, when making a decision.

/it permanent link

Mon 09 Feb 2009 13:59

Why a netbook?
What is a netbook anyway? It's a new type of low-cost ($350-$600) notebook that is particularly small and light: it typically has a 7-10" screen, and a weight not much more than two pounds. Small and light notebooks are not new, but they have for years been quite expensive, marketed to busy executives who want something small and light to use when travelling, and are willing to pay for the privilege. But an inexpensive small and light notebook is new, so new that it has been given its own name: netbook. The rationale behind the name is that this is meant to be an "internet" device: its primary role is web browsing, and office productivity applications are secondary. Such a device relies on wireless: wifi for now, but increasingly 3G and other forms of cellular data service.

Why buy one? It's affordable, by notebook standards. It's also very portable: while it's too large for a pocket, it can easily slip into a handbag. And while it may be designed for internet connectivity, it can run modest office productivity applications. But it is limited in various ways: the small screen, while generally bright and visible on most models, does not have a great deal of screen real-estate; typically 1024x600 or less. RAM and hard drive space are generally smaller than in most notebooks or desktops, and RAM in particular is limited to a maximum of 1GB or 1.5GB, depending on the model, enough for Linux or Windows XP, but not generally enough to run Microsoft Vista quickly. It lacks any form of CD/DVD drive. And the internal CPU (generally an Intel Atom or a VIA C3) is slow, and single-core. A low-end laptop can be bought for as little as $500-$600 with a much larger screen, more memory, a built-in DVD-writer and a more powerful CPU. But it will be quite a bit larger and heavier. That in the end is the key question: is the portability of a netbook worth its tradeoffs? Sometimes yes: if one's computing needs are modest but one wants one's computer wherever one goes, then portability is paramount. Sometimes no: those with more than modest computing needs will quickly run into the netbooks' limitations. But whether a netbook or a notebook is a better fit, it is nice to have the choice, for a reasonable price.

/it permanent link

Tue 23 Sep 2008 21:52

How to buy a Computer
For years I have been asked for my advice about buying computers. My advice has changed over the years, because computers have changed, but one thing seems to be constant: a great many people seem to be very insecure about buying computers. This leads to a great deal of angst, and sometimes to purchases that are much more expensive than they need to be. But there are a few common-sense principles that are generally constant:

1. Think carefully about how the computer is going to be used.
This is the key principle that overrides all others. A computer is a tool. Tools are useful only when they can be used effectively. Do not choose a computer that does not fit with the way you use computers. For example, if you are a small person and like to work in many different places, a large and heavy laptop, or worse, a desktop, will not be a good choice: it is worth investing in something small, light and easily carried. If you are a gamer, particularly if you plan to invest a great deal of time playing games that require high-performance video, you'd best invest in a desktop with high performance graphics, even if it is expensive. Playing a demanding game on a cheaper machine with poor video performance will be frustrating. But if you merely browse the web and run productivity applications like spreadsheets and word processors, investing in high-performance gaming computing is a waste of your money.

2. If a better option is available for a lot more money, choose it only if you know you need it.
Insecurity about buying computers prompts people to pay a great deal more money for things that they think they might need: particularly a fast CPU (the computer's processing unit) or a high-end computer model instead of a lower-end one. The price difference can be significant: a high-end model can cost 3-4x the price of a lower-end model, and a high-end CPU can more than double the cost again. For example, the base configuration of the lowest-end home desktop with the lowest-end CPU on Dell Canada's web site is currently $329; the highest-end base configuration with the highest-end CPU is $3150. That is an order of magnitude difference in price. Put another way, the high-end configuration is the price of a formal dining-room suite. The low-end configuration is the price of a single chair in that dining-room suite. If you are paying the high-end price, make sure you need what you are paying for.

3. If a better option is available for only a little more money, choose it unless you know you don't need it.
If it only costs $20 to get a little more memory, a bit faster CPU, or a potentially useful device like a webcam, a fax modem, or a media card reader, why not get it, especially if it's much more money and less convenient to get it separately later? An integrated webcam is a $20 option on many laptops; adding later an external webcam of comparable quality that clips onto your laptop may cost you as much as $90. Or, for example, a fax modem may sound like obsolete technology, and it is, but it can be very convenient to send a fax from your computer by simply printing to the "fax" printer and typing a phone number. The one exception here is to watch out for large price increments for tiny increases in hard drive size: the price difference between a 250G and a 320G hard drive should be on the order of $10, not $60-70. While one may argue that there is perhaps some value in paying a bit extra for the convenience of ensuring that your computer comes with a decently large hard disk, even a small hard disk these days is quite large. Another thing to consider: if the price difference between a notebook and a desktop is fairly small, and there is no compelling reason to choose a desktop over a notebook, just get a notebook.

4. Assess carefully your need for extended warranties.
Extended warranties can be expensive. However, if you are accident-prone (coffee over the keyboard, dropping your laptop), anxious or risk-averse, an extended warranty may be worthwhile, particularly the sort that covers accidental damage. Note, though, that on average one spends much less over the lifetime of the computer to repair it (often $0) than one would pay for an extended warranty. Such warranties are often bought out of insecurity, and are highly profitable for computer vendors and technology stores. If, however, you do not expect to have free funds to handle an unexpected repair, especially if the computer is particularly expensive, an extended warranty may be worthwhile as a form of insurance.

5. Don't panic. Most of the available options are all reasonable choices.
Most computers are quite acceptable: there are few bad choices. Choosing a computer is most often a matter of choosing the best choice from among good choices. So relax: even if you miss the best choice, you'll probably end up with a perfectly good computer.

6. Don't forget backups.
The most valuable part of your computer is your data. Make sure you have backups of it, so that if something bad happens to your computer, you will not lose your data. You can always replace a computer. Can you replace your data? The easiest way to back up data is to buy an external hard disk and copy your data to it. Buy that external hard disk when you buy your computer. Yes, you can back up to writeable DVDs if you want, or copy to flash memory of some sort, but it can be a lot of work to divide up your data into DVD-sized chunks, and backups that are a lot of work often turn into backups that are not done.

/it permanent link

Tue 26 Aug 2008 09:56

Why own a Desktop computer?
The thirty-year reign of the desktop computer may be coming to an end. According to various news reports, since about the mid 2000s, notebook (or laptop) computers have been outselling desktops. More surprisingly, perhaps, miniature notebook computers like the Asus EEE PC, with small screens and low-power CPUs that are no more powerful than mainstream CPUs of a half-decade ago, are becoming increasingly popular, with a flurry of new low-cost (about $500) models. The reasons are intriguing: few productivity applications such as personal databases, spreadsheets, word processors and presentation tools need more than a small fraction of today's fastest CPUs. Thus the sort of CPU tradeoffs that need to be made to ensure long battery life in a notebook are less and less noticeable in practice. Other tradeoffs are also diminishing in importance: notebook screens can be large and bright, more and more rivalling desktop screens, notebook hard drives can be spacious and increasingly fast, and the rise of USB peripherals has made a portable notebook with a couple of USB ports as easily expandable in practice as any desktop computer. While notebooks are still pricier than desktops, the price difference is steadily diminishing as manufacturing economies of scale begin to weigh in. Even many who are in the habit of using their computer in one spot most of the time are realizing that an external screen, keyboard and mouse can be added to a notebook to make it function as if it were a desktop for general use, but when necessary, the notebook can be used elsewhere, providing the convenience of having one's computer along (with all its data and software) when needed, without the fuss of copying data and worrying about different versions of applications. Moreover, notebooks have been improving in those areas where they offer abilities not found in desktops: battery life has steadily increased from the one to two hours common a few years ago to three or four hours. Lightweight notebooks are increasingly available, and not all of them are expensive. Most importantly, various forms of wireless networking are becoming ubiquitous, providing internet connectivity to notebooks without the need for wires. As such, it is no surprise that notebook computers are being widely purchased, and many people's first computer is now a notebook, not a desktop.

There are still some good reasons to buy desktops. The lowest-cost computers are still desktops, not notebooks. The very best CPU and graphics hardware is available only for desktops, and many modern games use as much of these resources as they can get. Hence desktops suit hardcore gamers much better than notebooks. Finally, Microsoft Windows Vista generally requires much more CPU and memory than most other operating systems, and the introduction of Vista has put some pressure on computing resources; because of this, some of the less powerful notebooks are now shipping with versions of Linux or with a previous version of Microsoft Windows, such as Windows XP. Nevertheless, it seems clear that given the increasing attractiveness of notebooks in comparison to desktops, a sensible way to approach buying a computer is to simply buy a notebook unless one has some concrete reason to need a desktop.

/it permanent link

Fri 11 Jul 2008 20:46

Web shell scripting
It is a very handy thing to be able to write a quick script. UNIX and its derivatives have long been superb at making this possible: they possess a great many utilities that are designed to be used both from the command line and within scripts, and they possess shells that have all the control structures one might expect from any programming language. In fact, the traditional UNIX philosophy is to write small programs that do one thing well, and then combine them using scripts into rich and powerful applications. Indeed, the UNIX scripting environment is a rich one. But it is difficult to write shell scripts for the web. The UNIX scripting environment is designed for files, not web forms, the contents of which are encoded as url-encoded or multipart-encoded data. Hence, while UNIX shell scripts are sometimes used for web applications (CGI scripts), they are relatively rare, and generally frowned upon. The reason is no surprise: url-encoded and multipart-encoded data is complex to parse, and shell scripts that parse such data using sed, awk, etc. tend to be slow and hard to get right.

But this is easily fixed. If UNIX shell scripts like files, then they should be fed files. Hence I've written a small program (in C and lex), urldecode (ftp://ftp.cs.toronto.edu/pub/jdd/urldecode), that parses url-encoded and multipart-encoded data, and converts the data into files. No complex file encoding is used. urldecode reads url-encoded or multipart-encoded data, creates a directory, then populates it with files such that each filename is a variable name, and the file contains the variable value. So all a web shell script needs to do to parse url-encoded data is to run urldecode on the data received from a web form, then read the results out of suitably named files. While this is hardly a replacement for PHP or .NET, it does provide a surprisingly simple and straightforward way to script for the web, because it allows all the handy UNIX utilities in the UNIX shell script environment to be leveraged to process web data. That's useful.
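
To make the idea concrete, here's a rough sketch, in Python rather than C and lex, of the approach urldecode takes for the url-encoded case (the real tool also handles multipart-encoded data): form data comes in on standard input, and each form variable comes out as a file that a shell script can simply cat:

    # An illustrative sketch of the urldecode idea, not the actual tool:
    # read url-encoded form data on stdin and write one file per form variable,
    # named after the variable and containing its value.
    import os
    import sys
    from urllib.parse import parse_qsl

    def explode_form_data(encoded, outdir):
        os.makedirs(outdir, exist_ok=True)
        for name, value in parse_qsl(encoded):
            # A real tool should sanitize variable names before using them as
            # filenames; this sketch assumes well-behaved form data.
            with open(os.path.join(outdir, name), "w") as f:
                f.write(value)

    if __name__ == "__main__":
        explode_form_data(sys.stdin.read().strip(), "form-vars")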

/it permanent link

Thu 19 Jun 2008 16:34

CRT to LCD computer lab replacement: how much of a difference?
We are replacing all the remaining CRTs in our Computer Science teaching labs this summer with LCD panels, a total of 84 units (we replaced about fifty last summer). It's well known that LCD panels generally use less power when displaying than CRTs do, but the question is: roughly how much power/carbon emissions will we be saving through this summer's CRT replacement?

Using a Kill-A-Watt electricity consumption meter, we measured the power consumption of our CRTs (19" Dell P992) and our new LCD panels (19" Samsung 920BM and 22" Samsung 2253BW). When displaying an image, a CRT draws between 85W and 110W of power, depending on how bright/white the image is. In comparison, the 19" LCD draws 35W and the 22" draws 41W. If we assume an average CRT power draw (when displaying) of 97.5W (the midpoint), that's a power savings of 62.5W for each 19" LCD and 56.5W for each 22". We are replacing 48 CRTs with the 19" model and 36 with the 22", for a total power savings of about five kilowatts.

What does this mean over time? If we assume the machines in our labs are displaying for an average of one hour out of eight (our labs are open to students twenty-four hours a day, seven days a week), and if we consider our projected equipment lifetime of four years, this implies about twenty-two thousand kilowatt-hours saved over that period. Multiplying by an estimated carbon intensity for grid electricity of 0.0453 kgC/kWh suggests a savings of about a metric tonne of carbon over the expected lifetime of these LCD panels.
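For the curious, the whole estimate can be reproduced with a few lines of awk, using only the figures given above:

    #!/bin/sh
    # Back-of-the-envelope check of the savings estimate above.
    awk 'BEGIN {
        watts = 48 * 62.5 + 36 * 56.5   # power saved while displaying: ~5034 W
        hours = 4 * 365 * 24 / 8        # four years, displaying one hour in eight
        kwh   = watts / 1000 * hours    # ~22,000 kWh
        kgC   = kwh * 0.0453            # grid carbon intensity, kgC per kWh
        printf "%.0f W, %.0f kWh, %.0f kg of carbon\n", watts, kwh, kgC
    }'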

/it permanent link

Fri 06 Jun 2008 20:42

IT Support and human nature
IT is not about computers, but about people. This may be surprising: after all, when we think about technology, we generally think about equipment, gear, gadgets, code. But this gear doesn't exist for itself alone. Quite frankly, an unused computer is nothing more than a combination of space-heater and white noise generator. The I in IT is information, and that information is generated by, used by, and valued by people. For IT to be effective, it needs to be used effectively. Technology is a tool: powerful and complex, and like all powerful and complex tools, it takes time, effort and a certain amount of talent to learn to use the tool effectively. People are social beings, and so we learn to use tools in a social context: people who "know how" teach and help those who don't. This, broadly speaking, is the logic of IT support, which, ultimately, is a social construct to ensure that those who know how to use IT tools are available to help those who need help to effectively use them.

Human beings live in the tension between the collective and the individual. This is a fancy way of saying that people live by interacting with other people in ways that range from the genuinely interpersonal to impersonal embodiments of complex social constructions. Consider the difference between "I love you" on one extreme and "One Way, Do Not Enter" on the other. Both the collective and the interpersonal elements of human interaction are present in IT support: the nature of the technology imposes the need to interact with complex technical systems, while the nature of the human beings who use the technology requires one-on-one personal interaction. Indeed, IT support fails when it becomes too much like the notion of a "computer centre", too removed from the individual and the person-to-person act of helping and receiving help. But it also fails when it becomes too individualized, because of simple economics: there are many fewer IT experts than there are people in need of their help, and the one-to-one dynamic begins to fail when there are so few on one side of the equation and so many on the other. Effective IT support requires a balance between the two.

One way to maintain this balance, if there is a "critical mass" of IT needs and resources, is to make the commitment to do both at the same time. At the Department of Computer Science at the University of Toronto, we have found an effective way to do this for research computing support. We have a broad and diverse community of researchers, divided up into research groups. They have access to a core IT infrastructure of technical services, equipment and highly skilled staff to run it. But the department also has dedicated IT support staff who partner with specific groups: each group has their own person, their own IT expert, to call upon, and this person knows the people in the group and their research. We call such staff "points of contact", or POCs. Research IT support in the department is not a matter of contacting an anonymous service centre in one's moment of need, in the desperate hope of finding a sympathetic stranger with the requisite skills. Instead, it becomes an interaction between people who know each other, people who have been able to build a trust-relationship over time. Yet the economics of purely individualized support have been overcome: this organization "scales", because POCs do not need to do everything themselves. They and their groups have access to a complete infrastructure that offers common services across the entire department: secure and reliable file storage, web services, email, and more, and the expertise of the skilled technical staff that run it. POCs are freed to focus more fully on the unique, individualized needs of the research groups they serve.

Sounds idyllic, doesn't it? It is, in theory. In practice, there are plenty of challenges. Communication is key: POCs need to communicate well with their groups, and with other POCs and the core infrastructure staff. And the groups themselves need to take responsibility for communicating with their POC: in human relationships, even those of IT support, there is both benefit and burden in knowing and understanding the other. A POC who is "shut out" of the research activities of the group is hampered in any effort to provide support that is well-tuned to those activities. That does not mean that poor support will result: even generic IT support, with a human face, can be superior to that offered by an anonymous service centre. But it does mean that the full benefit of having a POC will not be realized. If a group and a POC fully commit to regular communication, however, the quality of IT support can be significantly greater than anything a large service centre can offer, because the POC has the potential to become a creative participant in the group's mission, the very mission that the group's use of IT is intended to serve.

/it permanent link

Fri 30 May 2008 16:35

Blogging: Keeping It Simple
When I decided it was time to start blogging about information technology and information communication issues, I needed to choose some suitable blog software, something that would provide good results but also be easy to use; open source was preferable. So I did a quick web search to see what I could find. Most blog software seems to be pretty heavyweight: a database on the back end to hold the blog entries, plus some sort of complex PHP front end to display them. But this makes no sense to me: why use a database for blog entries? Blog entries are simple bits of text, organized in simple ways. There's no need for powerful query languages, transactional locking, and the various other good things databases provide. These things are not free: databases take effort to set up and run. I'm not worried so much about the computational overhead as about the human overhead: the time and energy required to configure, maintain, and back up a database. Do we really need such a thing for a blog?

Happily, I found blosxom, a piece of open-source blogging software consisting of a simple Perl CGI script that uses the filesystem (a directory hierarchy) as its backend for blog entries. It is a nicely designed piece of software: simple, straightforward, low-overhead, very quick to get going, and quite customizable. Clever simplifying details abound: the date of a blog entry is simply the timestamp of its file, and you create a new entry by creating a suitably named file anywhere in the blog directory hierarchy and writing something into it with your favourite text editor. You can organize your blog entries by topic if you want, and blosxom gets it all right. RSS seems to be all the rage these days; blosxom does that too: just add /index.rss to the end of the URL. For me, the only annoying bit of this software so far is the spelling of its name: I keep typing "bloxsom" for some reason.
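To give a sense of how little is involved, here is what creating a new entry can look like, assuming blosxom's default conventions (a .txt extension, with the first line of the file serving as the entry's title); the data directory path below is hypothetical:

    #!/bin/sh
    # Creating a new blosxom entry is just creating a file in the blog's
    # data directory (the path below is hypothetical).
    datadir=$HOME/blog/it

    printf '%s\n' \
        'Web shell scripting' \
        'It is a very handy thing to be able to write a quick script...' \
        > "$datadir/web-shell-scripting.txt"

    # The entry's date is simply the file's timestamp; touch(1) can adjust it.
    touch -t 200807112046 "$datadir/web-shell-scripting.txt"

On the next visit, blosxom picks the new file up automatically, dated by its timestamp.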

Why is blosxom so good? Because it leverages what is already present on most systems, the filesystem, rather than introducing a complex, powerful and costly tool (a relational database) where it's not really needed. Kudos to Rael Dornfest, who, instead of taking the most popular or obvious approach, took the time to understand the problem being solved and the tools already available to solve it. This is an example of sensible economizing: human time and effort are a valuable commodity, powerful tools (e.g. relational databases) use up some of that commodity, and so such tools should be avoided unless they are really needed. If you think this sounds a little like "green" or "environmental" thinking, you're quite right: conserving energy to preserve the environment is very similar to conserving human energy to preserve the human environment. Just as the natural environment suffers strains from excessive energy consumption, so the human environment suffers from excessive demands on each person's time and energy. In both realms, economical thinking at design time is a prerequisite to good technology.

/it permanent link

