Caleb Booker

Business in Virtual Worlds

ROI in Virtual Worlds 1 - Why Webcams Fail

This is the first of a series of blog posts dedicated to answering two questions:

  1. Why are virtual worlds a good alternative to existing technologies?
  2. How can one best get a Return On Investment (ROI) from virtual world ventures?

I hope to give you lots of references as I go, but for this first one please forgive my off-the-top-of-my-head handling of the subject matter as time is a bit short. The source material for all of this comes from dozens of studies, articles and experiences I’ve come across or done myself over the past few years. If you have something that applies, please place it in the comments!

UPDATE: there is now an archive page for other entries in the ROI in Virtual Worlds series.


Why Not Just Use Webcams?

One of the most common questions I get about virtual worlds is simply: don’t webcams do the same thing, but better?

There’s a certain logic to this, after all. On a webcam you can read facial expressions, track body language, and communicate in most of the same ways you can in real life. Why should we use a virtual space with an avatar if webcams are so cheap, easy to use, and hold a proven track record of stability?

As it turns out, there are some serious psychological and physiological reasons why webcam experiences are necessarily limited to casual conversation. To understand why, we need to take a good look at how the mind interacts with the computer screen.

When you look at a screen, your mind bends space

The easiest way to understand what happens when a person looks at a computer or television screen is that, in their minds at least, only that screen exists. This is why you can sit fifteen feet away from a 30 inch faded display in a meeting and still get something out of the training video. While this may be the scene from the back of the room…

… your mind does a lot of filtering and resizing so that all you really see is this:

Right away, marketing folk should be paying attention here. As we all know, the more “space” your message or logo occupies in a customer’s perception the more mindshare you actually have.

So then does the shape of the screen matter?

This phenomenon also goes a long way toward explaining why screens are getting wider instead of just bigger: most people see with two eyes, one next to the other. This makes our field of vision a rectangle close to the shape of a car’s windshield.

This means that a 16×9 widescreen feels much more natural and comfortable than a regular 4×3 of the same size. Mentally you’re able to “zoom in” with a widescreen much more easily. This is also why most memorable corporate logos are square or wide, but rarely tall.

That’s not to say that 4×3 screens are going anywhere. They’re taller so they’ll continue to sell as they seem bigger at first glance. Besides, when you’re scrolling through a lot of web pages with text information organized in columns, tall is important and practical.
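If you want to see the "same size" comparison in numbers, here is a quick sketch. It assumes "same size" means the same diagonal, which is how screens are usually sold; the 20-inch figure and the `screen_dimensions` helper are just made up for illustration.

```python
import math


def screen_dimensions(diagonal, aspect_w, aspect_h):
    """Return (width, height) for a screen with the given diagonal and aspect ratio."""
    # The aspect ratio fixes the shape; scale it so the diagonal matches.
    scale = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * scale, aspect_h * scale


# Hypothetical example: two screens sold as the same 20-inch diagonal.
wide_w, wide_h = screen_dimensions(20, 16, 9)
std_w, std_h = screen_dimensions(20, 4, 3)

print(f"16x9: {wide_w:.1f} in wide, {wide_h:.1f} in tall")
print(f"4x3:  {std_w:.1f} in wide, {std_h:.1f} in tall")
```

Under those assumptions the 4×3 comes out roughly two inches taller, while the 16×9 is a bit over an inch wider, which fits the "seems bigger at first glance" point above.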

What happens, however, when you’re not just trying to dump text on people? What happens when you’re trying to interact with them and make an emotional connection? What happens when you want them to feel like they’re actually immersed in your experience?

Was LBJ a “close talker”?

There’s another problem with the whole webcam phenomenon. Remember how we discussed the “mental zoom” people apply to screens, filtering out all surrounding data? Well, they’ll do that with images inside the screen too!

Let’s say you take a webcam-enabled VOIP call from LBJ here:

Sure, it’s just a small 300-pixel-wide image. After talking for a while, though, you’ll find you just have to look away to continue the conversation because his face is completely filling your vision. It’s like his face is just inches from yours, uncomfortably close.

Socially this gets pretty awkward, and strictly speaking you can’t really look away anyhow because they’ll see you “not paying attention”. That issue comes up again as your conversation continues and you notice LBJ never makes eye contact: he’s looking at his screen, not you! (Well, he is looking at you, but he sees you’re not looking at him either… kind of.)

This all happens in your unconscious of course, but the end result is that one of two things happens:

  1. You start to feel really awkward, or…
  2. You stop taking the conversation too seriously, instead opting for the “crowded party” feeling.

This is where avatars come in

The irony is that you can take a conversation into a virtual space filled with fluorescent colors and have a serious conversation far more easily than you could on a webcam.

The entire virtual world phenomenon works because it accomplishes one simple thing: the perception of space. This is one of the most underestimated and wildly powerful tools of the past decade. Without even needing 3D glasses, a virtual space moves another person’s “presence” to a comfortable distance while still creating a sense that you are somehow physically together.

Take a good look at this pic of one of Torley’s meetings I went to a while back. In fact, stop now and click on the image to see a larger version in order to follow along (this one will open in a new window).

You get the sense here that you’re actually standing just behind the guy in the pinstripe and the girl with the yellow hair. Stand there for a moment and take in the scene. A conversation is underway. Take a second now and think: what’s your first impulse?

Most people want to walk in and have a seat. After that they’ll look around the circle and participate in the conversation via typed text or with voice, and sometimes both. They’ll be “close” to the people they want to be close to without that “close talker” feeling.

So we worked around the distance issue, but something else happened. Did you catch it?

Three learning modes in one conversation

Humans tend to be biased toward one of three basic learning modes: audio, visual, or kinesthetic. You probably understand this intuitively already but let’s break it down real quick:

  • Audio learners need to hear things for them to ring true.
  • Visual learners need to see something for it to become clear.
  • Kinesthetic learners learn by doing, or at the very least by feeling like they experienced something.

These aren’t absolute for any one person (contrary to what some might tell you, by the way), nor are you really stuck in any one mode forever. People do, however, tend to have a preference at any given time.

This means that if you’re on a webcam conversation, you’re overwhelmed if you’re in visual mode and squirming if you’re in kinesthetic mode. Audio learners may not notice a problem at all here, but the person they’re speaking to might.

By way of contrast, the virtual meeting above works with a full mix. Visual folk can look around the room to “place” the voice they’re hearing or the text they’re reading (critical for them if they want to remember anything that happened!). Auditory people can just sit back and chat, occasionally glancing at the typed text. As for the kinesthetic people, well, they’re in absolute heaven.

The thing is, you’d be surprised how many leaders tend to be kinesthetic. It’s a mode that lends itself to getting things done, after all.

The best part: these people are all communicating in whatever mode works for them, and they all “see eye-to-eye” or “hear what’s being said” or “feel they have an understanding”. The entire “room” leaves having fully appreciated the conversation. Even in a real-world meeting that might not have happened, as non-auditory people wouldn’t have had the option to use text chat!

Display dimensions, revisited

We also get a bit of a cheat with virtual worlds: the display is inadvertently a wider rectangle than your standard webcam image. This is actually a happy accident of interface design. Application designers discovered a long time ago that they needed to put the menus and controls either at the top or the bottom, and rarely use the sides for anything other than “content”.

You’ll see this general philosophy in word processors, video editing suites, and virtual spaces. If you’re going to be immersed in what you’re doing, the content must be wider than tall. It’s only in applications like Adobe Photoshop, where most of the users use wide screens already, that you can get away with putting controls on both sides.

(NOTE: I realize that heads-up displays are almost an exception here. They tend to sit on the sides comfortably, although we usually try to “dock” them at the top or bottom. They have the advantage of being interfaces that we can look right through, so we can cheat the whole issue there.)

Conclusions

Virtual space experiences work better than a webcam experience for three reasons:

  1. You can maintain some “personal space”.
  2. Whatever learning mode you’re in, chances are you’ll do fine.
  3. The experience fills your field of vision far more readily.

Again, this entire post is off the top of my head. If I’ve missed something or if you’ve written something that can expand on these points, feel free to add a link in the comments.

Next time on ROI in Virtual Worlds: “Meetings 101”.

27 Responses to “ROI in Virtual Worlds 1 - Why Webcams Fail”

  1. Cherisa Burk Says:

    I like your three reasons, Caleb, but there are two others I can think of off the top of my head, too. 1. Aesthetic. Sometimes, the lighting in the office seems harsh or someone’s having a bad hair day or a bit of lunch stuck in their teeth, and hey, I’d rather not be seeing them in that state (or showing myself in that state, either, for that matter). 2. Having an avatar stand-in for me and for the other attendees at a virtual meeting is more fun. The element of fun, however slight, is engaging. One is more likely to participate in something fun, and there’s your ROI for you.

  2. Caleb Booker Says:

    You know what Cherisa, you’re 100% right. Not having to worry about your appearances is a big big deal, and fun is even bigger. Good call!

    Maybe the irony is that if you give those two reasons to your non-tech-savvy boss for why he should approve the budget of your virtual world experiment project, they’ll actually be arguments against. “We have a dress code around here for a reason, and we’re here to work, not play! Now excuse me while I go grumble about someone not doing their job grumble grumble grumble…”

    Or not. Depends on the boss. I think you might have given me a few ideas for future blog posts though. Your points are valid and I don’t think many realize how important they really are!

  3. Peter Quirk Says:

    Very good analysis Caleb!
    Another point to note is that direct eye contact is often difficult in web conferencing because the camera is off-axis.
    I’ve yet to see a webcam in the center of the visual field. By contrast, in virtual worlds your avatar can look directly at the other party even while you’re looking down at your keyboard to type or glancing at another window or screen.
    Some additional thoughts here: http://tinyurl.com/6kxuzg

    – Peter

  4. Peter Quirk Says:

    Off-axis eye contact is solved with this kludge: http://www.datenform.de/blog/2009/01/here-is-looking-at-you-kid.html

  5. HatHead Rickenbacker Says:

    Some great points made here Caleb!

    I don’t think that web cams and avatars are necessarily mutually exclusive. For example, I have done some experiments broadcasting live video of my real life head onto the head of my Second Life avatar. This combination of shared space together with real facial expression is very powerful and compelling.

    Also, expect the emergence of 3D displays - already being shipped in some high-end TVs for example - to make huge impact in this area in the next 3-4 years, for both live video and virtual worlds.

    Cheers!

  6. Caleb Booker Says:

    Peter: hey good points! I tried to address the off-axis issue but I’m glad you’ve highlighted it a little better here. That kludge is BRUTAL though!

    HatHead: You know, with the new polarized lens 3D tech I wouldn’t be surprised if it was more commonly used, but having been to a few (bad) 3D movies I’m not entirely convinced they create a greater sense of immersion. I found myself acutely aware of the bottom and top of the screen the whole time - far more than I am in a regular movie.

    I am, however, *very* interested in your experience with the webcam-head-on-an-avatar. Let me know if you have upcoming events you’re planning on doing this at, I’d love to come and get a sense of it.

  7. Kyle G Says:

    Great article Caleb! If only our avatar eyes moved with our own eyes as we scan the virtual world around us and our virtual lips synched as we used audio we could then bridge the gap between webcam & virtual world more. Simply great writing!

  8. skribe Says:

    Excellent post and analysis. Coincidentally I was speaking with someone about this very issue earlier today. They were relating a story about how one of their clients was moving their operations to virtual environments because they hated doing telepresence. The idea of having to get dressed up in a suit and being stuck in front of a camera in the middle of the night - often in a deserted building in a foreign city - just so they could talk to someone half-way around the world isn’t exactly appealing. Especially when using an avatar you can do it from the comfort of your home/office/hotel room on your laptop while dressed in your fluffy bunny slippers and Mickey Mouse PJs =).

    Cheers,

    skribe

  9. Deep Semaphore Says:

    I address how to merge webcams and virtual worlds here in an attempt to address the issues raised
    http://tinyurl.com/apbqq2

  10. John Norris Says:

    I wonder if some of Scott McCloud’s theories on abstraction also come into play here. The difference between a photograph of someone and an ‘icon’ (a more abstract representation of them, a comic or avatar).

    The icon helps focus our attention, by eliminating what is not important.

    The icon forces us to fill in the gaps ourselves. We become further engaged with our own avatar, but also with the other because we impart a bit of ourselves in them.

    John

  11. Peter Quirk Says:

    Apple has a patent for the webcam behind the screen: http://www.appleinsider.com/articles/09/01/08/apple_files_patent_for_camera_hidden_behind_display.html

  12. Troy McLuhan Says:

    I’m not sure if it’s covered by the reasons you give, but another advantage is that you don’t have to worry about your (RL) appearance when using an avatar. You can be wearing pajamas and skip makeup and nobody will be any the wiser.

    In fact, not only does your RL appearance not matter, but you have a *lot* of control over your avatar’s appearance. That can increase your confidence.

  13. Mark Young Says:

    Your points about the flaws of the typical webcam view make a lot of sense but I would not throw the baby out with the bath water. A huge number of cues for social perception are conveyed by facial motion and body motion. There is no better interface for conveying facial expressions than your own face - a video camera is a simple method to transmit those cues to your communication partner but there is room for creative and technical alternatives. The simplest thing to do to skirt your criticisms would be to apply some image processing tricks to cut and paste the human part of a headshot into a different view.

  14. John Oeffinger Says:

    Excellent post Caleb. Three other thoughts you might consider. Two of the three trace their roots back to the 80’s and computer-conferencing environment where we had to help people create a 3D metaphor environment for 2D computer-conferences.

    First, avatars and an immersive 3D environment destroy individual pre-conceptions. In RL or telepresence (previously videoconferencing), people tend to size others up by their appearance. They establish pre-conceptions which determine how they will interact or become engaged in the meeting. This is especially true if most of the participants are meeting F2F for the first time. Avatars destroy pre-conceptions of people based on a person’s RL appearance, look, age, gender, etc.

    Second, 3D immersive environments enable sidebar conversations in context within a meeting without disrupting the flow of the meeting. In many cases, these private sidebar chats can improve the meeting’s outcome. In the 90’s, people training corporate organizations to change and go to a more bottom-up or asymmetric decision-making environment talked about making it “safe” to trade notes in F2F meetings or have sidebar conversations. In 3D environments, this is greatly expanded using private IMs, chat, and voice. This advantage can take the meeting in entirely new and beneficial directions, increasing the ROI value-prop.

    Finally, 3D worlds provide community to people who have been shut out or severely limited in their community activities. Individuals that are disabled or homebound, even stay-at-home parents, now have the ability to interact and engage in meetings that prevented their participation in the past.

    One additional thought regarding webcams and telepresence operations - you are still very limited in the number of sites you can have online at the same time. Webcams are still limited to fewer than 5 (I think) and there is a telepresence unit limitation based on cost.

    Thanks for a great post. Cheers!

  15. ROI in Virtual Worlds - Anatomy of an Avatar « Caleb Booker Says:

    [...] your frame. Show me a user that doesn’t ever want to do that and I’ll eat my hat. (See last week for more info on [...]

  16. The SLENZ Update - No 44, February 4, 2009 « Second Life Education in New Zealand Says:

    [...] On Mark Kingdon’s case (above) for the benefits of holding real world meetings in virtual worlds Metaverse developer Caleb Booker has provided a compelling argument for the use of virtual worlds like Second Life for real world education environments and meeting spaces.( http://www.calebbooker.com/blog/2009/01/27/roi-in-virtual-worlds-1-why-webcams-fail/) [...]

  17. Dabbler Says:

    The one issue not mentioned here is the propensity for deception. Whether it’s your avatar’s eye color or gender, the person viewing the scene is not interacting with a truthful perception. Imagine interviewing a potential hire for your company using SL. In my book the genuine interactions of a webcam transaction override all the other benefits of VR.

  18. Sitearm Says:

    On the Space Thing:
    1. The ability to freely interact visually in 3D to check people and environments out: front, back, top, bottom, sideways, extreme zoom in, mid zoom, extreme zoom out. Priceless sensations.

    On the Personal Space Thing:
    2. The ability to check things out visually with no one else being able to tell where you are looking.

    On the Trustworthy Thing:
    3A. I agree you get certain cues “up close and personal” BUT
    3B. My illusions have been shattered that these cues have ever reliably told me what the person will do for true
    3C. Bottom line, in real and virtual interactions, is “By Our Actions We Are Known” and a Promise Made that is a Promise Kept speaks louder than any “visual cues”

  19. Daniel Says:

    Not sure I am with you on this Caleb. The first 17 responses seem to be a self-selected survey…only those who want to live in what the boss accurately calls a “game world”. Video conferencing is intended to provide as much of the face to face experience as possible without being there. The Star Wars example of using full body holographs projected onto real furniture seems most useful to emulating the real life experience. Avatars are all about hiding who you are, what you are really doing, behind a make-believe facade. We have enough of that in business relationships without adding to it. Notice how all of your virtual examples are game scenarios and Alice in Wonderland visualizations. If this technology was actually a replacement for live meetings, it would have real business meeting examples. The “close-talker” problem can be solved by projecting people at an accurate, or life-size dimension, in all three dimensions. That doesn’t mean holographs, it means sitting at a real table and having the screen at the other side of the table with the projection at life size for three to five feet away. You also need to set the volume of the speakers and sensitivity of the microphone to mimic a live conversation.

  20. The SLENZ Update - No 45, February 10, 2009 « Second Life Education in New Zealand Says:

    [...] I referred to one  in my previous blog (SLENZ Update, No 44)  by Caleb Booker ( ROI in Virtual Worlds 1 - Why Webcams Fail (http://www.calebbooker.com/blog/2009/01/27/roi-in-virtual-worlds-1-why-webcams-fail/) [...]

  21. Working in the Virtual World « Official Second Life Blog Says:

    [...] but there are some psychological aspects that limit its potential. Caleb Booker recently blogged on this very topic. He posits two very interesting theories. First, usually when you’re in a [...]

  22. amanda van nuys Says:

    Hey Caleb, Wanted you to know that I really appreciated your blog entry & referenced it in my blog post today. Check it out (http://blog.secondlife.com). Looking for great dialogues ahead! Cheers, Amanda

  23. Dr. Ralph E. Chatham Says:

    There is another reason for distant screens being different from close ones. The following insight comes from research I funded at the Defense Advanced Research Projects Agency a few years back. I called it the “Does Size Matter — in Displays” project. The brain has two circuits (at least two) that process visual information. Visual-Cortical and Visual-motor. One makes sure that your hand touches what you are reaching for or that your foot comes down at the right angle on the slope you are climbing. The other is the part that the ego thinks of as seeing with - it influences planning and complex cortical processes, like reading.
    Turns out that the visual-cortical circuit has very strange biases: you see slopes as very much steeper than they actually are. If you have ever driven up the back side of Lombard Street in San Francisco, you will estimate that the slope is 45 degrees, then back down and say it really couldn’t be that, maybe 30 degrees. There is no slope in SF roads that is steeper than 18 degrees. Your visual-cortical circuit exaggerates the slope to help you discriminate among the slopes you might actually do something with. You can’t walk up a grassy slope greater than 30 degrees, so the cortical perception expands the zero-to-30 degree segment of the circle to make better discriminations. AND if you are tired, your perception adds another 10 degrees to the slope that your ego sees. But if your foot or hand is asked to estimate it (hidden from sight), the visual-motor system ensures that they get the slope right.
    One more thing needed to finish this thought. The visual-motor circuit stops working at ranges farther than you can touch. So the upshot is that you get two circuits confusing you when you look at a computer screen, but on the wall, only the visual-cortical circuit is processing the information. Your visual-motor circuit says to the brain “this PC image is a toy, not a real world thing” but it isn’t around to say anything about an image projected on the wall even if it has the same visual angle and the same number (or even fewer) pixels.
    I can proofread on the wall as well as I can on paper, but I can not proofread at all on the computer screen. Others who have tried it agree. My guess is that it has something to do with the visual-motor circuit checking out when the handwriting is on the wall. (Yes, I can’t explain why I can proofread when the text is on paper, so this is not that simple, but that doesn’t change the visual-motor/visual-cortical confusion issue relative to screen distance.)
    Look up papers by Dennis Proffitt of the University of Virginia to get more of this.

  24. Current Vibes in Technology and Marketing » Blog Archive » Virtual Conferences and Events - A Brief Review Says:

    [...] Booker has done a great job recently of breaking down the basic benefits of virtual worlds for meetings. April 28th, 2009 in [...]

  25. Working in the Virtual World - Hypergrid Business Says:

    [...] but there are some psychological aspects that limit its potential. Caleb Booker recently blogged on this very topic. He posits two very interesting theories. First, usually when you’re in a [...]

  26. Andrew Chapman Says:

    Using an Avatar can in fact reveal more about a person than most think. Our choices of presentation, how we want others to perceive us, is an important insight.
    The ability to use a webcam representation of our faces within an avatar is something I have been experimenting with for some time, simply because there is no substitute for facial expression, it goes beyond the phrase ‘A Picture can tell a thousand words’. How many of us were brought up with the visuals of the film Tron?
    I believe in the future, we will all have ‘Carbon Footprint Ration books’ whereby we can ‘spend’ our allotments on things we believe vital. No longer will large Corps be able to hold extravagant meetings (I am not speaking about financial luxury) simply because they have the money, it will be rated upon the effects on the planet and the human race.
    Figure how much time is spent just going between meetings, I am not talking outside the office. Being able to click an icon on your desktop, you can attend a virtual meeting with other members of your company, negating an accumulation of those 15 minutes leaving your desk, printing documents out and travelling between floors/buildings, to attend a meeting with fellow workmates. Add the cost of those spaces being maintained, the frustration of coordinating groups of people based upon their schedules and the right room’s availability? With Virtual meeting spaces, they are always available and can handle Ad Hoc meetings.
    These are the goals we are meeting with the launch of our own virtual environments, secure and non-secure.

  27. links for 2009-06-04 « Boris in Wonderland Says:

    [...] ROI in Virtual Worlds 1 – Why Webcams Fail « Caleb Booker webcams vs virtual worlds (tags: gpjinsight Dogear-Nation virtualworlds Secondlife collaboration webcam meetings education) [...]
