Above The Surface Interaction

I have not failed. I’ve just found 10,000 ways that won’t work.
– Thomas Edison

This past year, I’ve had the wonderful opportunity to work on several projects that did not go anywhere. I would start getting discouraged when it was clear a project was not going to arrange itself into a nice academic paper-sized package, and I would have trouble letting go. However, over the past year I have become better at taking away “lessons” from a “failed” piece of work. Here is one of those bits of work, which I called Above the Surface Interaction.

Back from my internship at Microsoft Research, I had this idea that a future of interaction with multi-touch input on the same medium as visual output (let’s call it “direct multi-touch”) would be incredibly inefficient. While direct multi-touch feels expressive and rewarding when playing around, it gets tedious for larger interfaces and when you actually have a task in mind that you want to accomplish. Much like playing with a balloon, it feels fun on its own, but when you want to do something with it (say, sort blue and red balloons into different piles) it quickly becomes ridiculous.

So I thought “let’s keep direct multi-touch, but understand its role in an ecology of different input techniques”. After seeing the G-stalt system at CHI, I was also intrigued by the use of free-hand (i.e. no touch) gestures. It seemed there was a lack of work on combining Above-the-Surface interaction and touch. There also was not much discussion on how touch could be combined with the keyboard and the mouse. There is work on comparing mouse and touch, but it tends to assume you are using either one or the other, not going back and forth.

I assume the computer of the future will have direct multi-touch, a mouse, a keyboard and coarse hand tracking in the rectangular prism in front of the screen, via a time-of-flight camera or another system.

Here’s a diagram I drew to represent what I thought the interaction space looked like:

So anytime you are transferring between physical interaction artifacts, you are really in the empty space. I found this idea pretty intriguing at the time. However, I get the sense now that users aren’t really aware that they are “in the empty space”. Being in the empty space is not the user’s goal; the goal is the end destination of the movement. So, maybe there is a benefit in tracking the user’s hands to predict their next action, but it may not be possible to use their position in empty space as explicit input.

I am not discouraged, because every wrong attempt discarded is another step forward.
– Thomas Edison

So here’s a video of what I did:

Here are a few of the key ideas I developed in Above the Surface Interaction:

mirror metaphor representation of the hands, contrasting with a shadow metaphor representation. I did this so you can show stuff being held in the hand.
holding and tossing items with your hands (think multiple clipboards)
leaving items in the “space between”
displacing the representation of your hand for laziness (by mouse right button hold)

I think the idea of displacing the representation of your hand (my left hand, in this case) is pretty compelling. Traditionally, if you want to manipulate something in a complex way, you have to get it within reach of your hand. Simple cursors (like the mouse pointer) can be moved to items that are far away. However, if you want to manipulate something that is far away in a complex way, you have to move your whole arm, which is energy intensive. So, I like the idea of moving just your hand. While my left hand is the manipulator, my left arm does not move. My right arm moves the mouse slightly (there is a significant gain from the mouse to the representation on the Microsoft Surface). There has been bimanual work similar to this in dgp.
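To make the displacement idea concrete, here is a minimal sketch of how mouse movement could shift the hand representation while the right button is held. This is not the code I ran on the Surface: the class name `HandOffset`, the gain value of 4.0, and the choice to keep the offset after the button is released are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class HandOffset:
    """Displacement of the on-screen hand representation from the tracked hand.

    Hypothetical sketch: the names and the default gain are assumptions.
    """
    gain: float = 4.0  # assumed; small mouse motion maps to large on-screen motion
    dx: float = 0.0
    dy: float = 0.0

    def on_mouse_move(self, mouse_dx, mouse_dy, right_button_held):
        # Accumulate displacement only while the right button is held.
        if right_button_held:
            self.dx += self.gain * mouse_dx
            self.dy += self.gain * mouse_dy

    def displayed_position(self, hand_x, hand_y):
        # Draw the representation at the tracked position plus the offset.
        return (hand_x + self.dx, hand_y + self.dy)


offset = HandOffset()
offset.on_mouse_move(10, -5, right_button_held=True)     # displaces the hand
offset.on_mouse_move(100, 100, right_button_held=False)  # ignored
position = offset.displayed_position(200, 300)
```

The point of the gain is that the right arm only nudges the mouse while the left hand stays where it is comfortable, yet the representation can still reach items on the far side of the display.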

While the idea of keeping items in the space between is mildly compelling, I found it hard to tell where an item was in 3D space. It didn’t “pop”; instead, it looked like the item was just sitting on the interface itself, faded out. I thought of doing shadows, or having Brownian motion as a function of where the object was in space, but that didn’t communicate position clearly. The other option would be 3D goggles. But unless you have contact-lens-size goggles or a holographic display, goggles are stupid.

If at first you don’t succeed, redefine what you did as success.
– Stephen Colbert

Fortunately, during this project, I was inspired to try out two other ideas, which have turned into very interesting projects. One is motivated by how to make simple multi-touch interaction more efficient, and another is based on combining touch and text entry. I consider the ideas from “Above-The-Surface-Interaction” on hold for now, slow-cooking in the back of my brain. I am going to work at HP Research in India this summer on some combination of gesture and voice input, and I expect the lessons I have learned will help me there!