Enhancing User Interaction With First Person User Interface


Though many computer applications and operating systems make use of real-world metaphors like the desktop, most software interface design has little to do with how we actually experience the real world. In lots of cases, there are great reasons not to directly mimic reality. Not doing so allows us to create interfaces that enable people to be more productive, communicate in new ways, or manage an increasing amount of information. In other words, to do things we can’t otherwise do in real life.

But sometimes, it makes sense to think of the real world as an interface: to design user interactions that make use of how people actually see the world, to take advantage of first person user interfaces.

First person user interfaces can be a good fit for applications that allow people to navigate the real world, “augment” their immediate surroundings with relevant information, and interact with objects or people directly around them.

Navigating the Real World

The widespread integration of location detection technologies (like GPS and cell tower triangulation) has made mobile applications that know where you are commonplace. Location-aware applications can help you find nearby friends or discover someplace good to eat by pinpointing you on a map.

When coupled with a digital compass (or a similar technology) that can detect your orientation, things get even more interesting. With access to location and orientation, software applications not only know where you are but where you are facing as well.
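At its core, this combination reduces to a small calculation: given a GPS fix and a compass heading, work out where a destination sits relative to the direction you are facing. A minimal sketch (the function names are my own, not from any particular product):

```python
import math

def bearing_to(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees
    clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360

def relative_bearing(heading, target_bearing):
    """Where the target sits relative to the user's facing direction:
    0 = straight ahead, 90 = to the right, 270 = to the left."""
    return (target_bearing - heading) % 360
```

With only a location, an application can draw you on a map; add `relative_bearing` and it can draw the map (or an arrow) from your point of view.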

This may seem like a small detail but it opens up a set of new interface possibilities that are designed from your current perspective. Consider the difference between the two screens from the TomTom navigation system shown below. The screen on the left provides a two-dimensional, overhead view of a driver’s current position and route. The screen on the right provides the same information but from a first person perspective.

This first person user interface mirrors your perspective of the world, which hopefully allows you to more easily follow a route. When people are in motion, first person interfaces can help them orient quickly and stay on track without having to translate two-dimensional information to the real world.

TomTom’s latest software version goes even further toward mirroring our perspective of the world by using colors and graphics that more accurately match real surroundings. But why re-draw the world when you can provide navigation information directly on it?

Nearest Tube is a first person navigation application that tells you where the closest subway station is by displaying navigation markers on the real world as seen through your phone’s camera.

As you can see in the video above, the application places pointers to each subway station in your field of vision so you can quickly determine which direction to go. It’s important to note, however, that the application actually provides different information depending on your orientation.

When you hold the phone flat and look down, a set of arrows directs you to each subway line. Holding the phone in front of you shows the nearest subway stations and how far away they are. Tilting the phone upwards shows stations further away from you.
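This orientation-dependent behavior amounts to mapping the device’s pitch to a display mode. A sketch of the idea; the thresholds and mode names here are illustrative assumptions, not taken from the actual application:

```python
def display_mode(pitch_degrees):
    """Map device pitch to a Nearest Tube-style display mode.

    Pitch convention (assumed): 0 = held flat, screen up;
    90 = held upright in front of you; >90 = tilted up past vertical.
    """
    if pitch_degrees < 30:
        return "line-arrows"       # looking down: arrows to each subway line
    elif pitch_degrees <= 90:
        return "nearest-stations"  # held ahead: closest stations and distances
    else:
        return "distant-stations"  # tilted up: stations further away
```

The point is that the user never chooses a mode explicitly; the natural gestures of looking down, ahead, and up do the selecting.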

Making use of the multiple perspectives naturally available to you (looking down, looking ahead, looking up) is an example of how first person interfaces allow us to interact with software in a way that is guessable, physical, and realistic. Another approach (used in Google Street View) is to turn real world elements into interface elements.

Street View enables people to navigate the world using actual photographs of many major towns and cities. Previously, moving through these images was only possible by clicking forward and back arrows overlaid on top of the photos. Now (as you can see in the demo video below), Street View allows you to use the real-world images themselves to navigate: just place the cursor on the building or point you want to view and double-click.

Augmented Reality

Not only can first person user interfaces help us move through the world, they can also help us understand it. The information that applications like Nearest Tube overlay on the world can be thought of as “augmenting” our view of reality. Augmented reality applications are a popular form of first person interface that enhances the real world with information not visible to the naked eye. These applications present user interface elements on top of images of the real world using a camera or heads-up display.

For example, an application could augment the real world with information such as ratings and reviews or images of food for restaurants in our field of vision. In fact, lots of different kinds of information can be presented from a first person perspective in a way that enhances reality.

IBM’s Seer application provides a way to navigate this year’s Wimbledon tennis tournament more effectively. Not only does the application include navigation tools but it also augments your field of vision with useful information like the waiting times at taxi and concession stands.

Wikitude is an application that displays landmarks, points of interest, and historic information wherever you point your phone’s camera. This not only provides rich context about your surroundings, it also helps you discover new places and history.

These augmented reality applications share a number of design features. Both IBM Seer and Wikitude include a small indicator (in the bottom right corner of the screen) that lets you know what direction you are facing and how many points of interest are located near you. This overview gives you a quick sense of what information is available. Ideally, the data in this overview can be manipulated to zoom further out or closer in, adjust search filters, and even select specific elements.

Wikitude allows you to manage the size of this overview radius through a zoom in/out control on the left side of the screen. This allows you to focus on points of interest near you or look further out. Since it is dealing with a much smaller area (the Wimbledon grounds), IBM Seer doesn’t include (or need) this feature.
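Under the hood, adjusting the overview radius is just a distance filter over the available points of interest. A minimal sketch, with sample data of my own invention:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in metres."""
    r = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def points_in_overview(user_lat, user_lon, points, radius_m):
    """Keep only the points of interest inside the overview radius.
    `points` is a list of (name, lat, lon) tuples."""
    return [p for p in points
            if haversine_m(user_lat, user_lon, p[1], p[2]) <= radius_m]
```

Zooming the overview in or out simply re-runs this filter with a different `radius_m`, which is why the control can feel instantaneous.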

In both applications, the primary method for selecting information of interest is clicking on the icons overlaid on the camera’s viewport. In the case of IBM Seer, different icons indicate different kinds of information like concessions or restrooms. In Wikitude, all the icons are the same and indicate information of interest and distance from you. Selecting any of these icons brings up a preview of the information. In most augmented reality applications, a further information window or screen is necessary to access more details than the camera view can display.

When many different types of information can be used to augment reality within a single application, it’s a good idea to allow people to select what kinds of information they want visible. Otherwise, too many information points can quickly overwhelm the interface.

Layar is an augmented reality application that allows you to select what kinds of information should be displayed within your field of vision at any time. The application currently allows you to see houses for sale and rent, local business information, available jobs, ATM locations, health care providers, and more. As the video below highlights, you can switch between layers that display these information points by clicking on the arrows on the right and left sides of the screen.

Layar also provides quick access to search filters that allow you to change the criteria for what shows up on screen. This helps narrow down what is showing up in front of you.
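Conceptually, a layer plus a search filter is one combined query over a tagged point set. A sketch of the pattern; the layer names and sample points below are hypothetical, not from the actual Layar feed:

```python
POINTS = [
    # (layer, name, distance_m) -- invented sample data
    ("atms", "First Bank ATM", 120),
    ("jobs", "Barista wanted", 300),
    ("housing", "2-bed flat for rent", 80),
    ("atms", "City Credit ATM", 950),
]

def visible_points(points, active_layer, max_distance_m=None):
    """Show only the points in the currently selected layer,
    optionally narrowed by a distance filter."""
    result = [p for p in points if p[0] == active_layer]
    if max_distance_m is not None:
        result = [p for p in result if p[2] <= max_distance_m]
    return result
```

Restricting the view to one layer at a time is what keeps the camera overlay legible: the full point set would bury the scene it is meant to augment.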

Interacting with Things Near You

First person user interfaces aren’t limited to helping you navigate or better understand the physical space around you; they can also enable you to interact directly with the people and objects residing within that space. In most cases, the prerequisite for these kinds of interactions is identifying what (or who) is near you. As a result, most of the early applications in this category focus on getting that part right first.

One way to identify objects near you is to explicitly provide information about them to an application. An application like SnapTell can identify popular products like DVDs, CDs, video games, and books when you take a picture of the product or its barcode. The application can then return prices, reviews, and more to you.

This approach might eventually also be used to identify people as illustrated in the augmented ID concept application from TAT in the video below. This proposed application uses facial recognition to identify people near you and provides access to digital information about them like their social networking profiles and updates.

While taking pictures of objects or people to get information about them is a more direct interaction with the real world than typing a keyword into a search engine, it only partially takes advantage of first person perspective. Perhaps it’s better to use the objects themselves as the interface.

For example, if any of the objects near you can transmit information using technologies like RFID tags, an application can simply listen to how these objects identify themselves and act accordingly. Compare the process of inputting a barcode or picture of an object to the one illustrated in this video from the Touch research project. Simply move your device near an RFID tagged object and the application can provide the right information or actions for that object to you.
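In software terms, “listen to how objects identify themselves” is a dispatch table keyed on tag IDs. A minimal sketch; the tag identifiers and handlers here are entirely hypothetical, since the Touch project’s actual tag format isn’t described in the article:

```python
# Hypothetical tag IDs mapped to handlers -- invented for illustration.
def show_timetable(tag_id):
    return f"timetable for {tag_id}"

def show_product_info(tag_id):
    return f"product info for {tag_id}"

HANDLERS = {
    "poster:bus-stop-12": show_timetable,
    "product:dvd-0001": show_product_info,
}

def on_tag_read(tag_id):
    """Dispatch on whatever the object says it is; unknown tags are ignored."""
    handler = HANDLERS.get(tag_id)
    return handler(tag_id) if handler else None
```

The interaction cost drops to almost nothing: instead of framing a photo or typing a barcode, the user just brings the device near the object.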

This form of first person interface enables physical and realistic interactions with objects. Taking this idea even further, information can be displayed on the objects themselves instead of on a device. The 6th Sense project from the MIT Media Lab does just that by using a wearable projector to display product information on the actual products you find in a library or store.

Though some of these first person interfaces are forward-looking, most are available now and ready to help people navigate the real world, “augment” their immediate surroundings, and interact with objects or people directly around them. What’s going to come next is likely to be even more exciting.

The next time you are working on a software application, consider if a first person user interface could help provide a more natural experience for your users by bringing the real world and virtual world closer together.


LukeW is an internationally recognized Web thought leader who has designed or contributed to software used by more than 600 million people. He is currently Senior Director of Product Ideation & Design at Yahoo! Inc., author of two popular Web design books, and a top-rated speaker at conferences and companies around the world. You can follow Luke on Twitter at lukewdesign or by using RSS.
