Touching and Gesturing on the iPhone

NOTE: This post is out of date.
Read our updated version of this post for more up to date information!

Everyone who owns an iPhone (or who has been holding out for an iPhone 3G) is bound to be excited about a lot of the new things the device can finally do, particularly the introduction of third-party applications. But those of us in the web development community have been itching for something further still: good web applications on the iPhone. This means we need a suitable replacement for mouse events. And boy did we get one! Though at first the APIs seem a little sketchy, once you’ve learned them you should be able to do amazing things in your application.

I’ll start with how to set up the iPhone console, since I found it invaluable while testing. Under Settings > Safari > Developer, you can turn it on or off. Simple log, error, and warn functions are provided (as part of the console object), all of which accept a single object.
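Since each of these functions takes a single argument, one way to log several values at once is to join them into one string first. The following is only a sketch; formatForConsole is a made-up helper name, not something the iPhone console provides.

```javascript
// Hypothetical helper: collapse several values into one string, because the
// console functions described above accept a single argument.
function formatForConsole() {
  var parts = [];
  for (var i = 0; i < arguments.length; i++) {
    parts.push(String(arguments[i]));
  }
  return parts.join(" ");
}

console.log(formatForConsole("touch at", 120, "/", 48));
```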

My quest to understand the API led me to this Apple Developer Connection page that, while providing pretty thorough documentation about what’s available, left me a little confused about the details. Also, if you aren’t a member of ADC, trying to follow this link will leave you even more confused.

Clearing it Up

Apple introduced two new ideas with this API: touches and gestures. Touches are important for keeping track of how many fingers are on the screen, where they are, and what they’re doing. Gestures are important for determining what the user is doing when they have two fingers on the screen and are either pinching, pushing, or rotating them.

Touches

When you put a finger down on the screen, it kicks off the lifecycle of touch events. Each time a new finger touches the screen, a new touchstart event happens. As each finger lifts up, a touchend event happens. If, after touching the screen, you move any of your fingers around, touchmove events happen.

We have the following touch events:

touchstart: Happens every time a finger is placed on the screen
touchend: Happens every time a finger is removed from the screen
touchmove: Happens as a finger already placed on the screen is moved across the screen
touchcancel: The system can cancel events, but I’m not sure how this can happen. I thought it might happen when you receive something like an SMS during a drag, but I tested that with no success

node.ontouchstart = function(evt){
  console.log(evt.pageX + "/" + evt.pageY);
  // OH NO! These values are blank, this must be a bug
}

My first mistake was monitoring these events and trying to get location information from the events (pageX, pageY, etc). After consulting the ADC documentation again, I learned about three event lists that come attached to the object. But I wasn’t sure what they did, so I went back to testing, logging, and experimenting.

It helped when I figured out the problem the Apple developers were trying to solve. With a mouse, you really only have one point of contact: through the cursor. With your hand, you can keep two fingers held down on the left of the screen while you keep tapping the right side of the screen.

Our event object has a list, and this list contains information for every finger that’s currently touching the screen. It also contains two other lists, one which contains only the information for fingers that originated from the same node, and one which contains only the information for fingers that are associated with the current event. These lists are available to every touch event.

We have the following lists:

touches: A list of information for every finger currently touching the screen
targetTouches: Like touches, but is filtered to only the information for finger touches that started out within the same node
changedTouches: A list of information for every finger involved in the event (see below)
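Given these lists, the coordinates that were missing from the event itself can be read from an entry in one of them. A minimal sketch, assuming the finger of interest is the first entry in changedTouches; pointFromTouchEvent is a hypothetical helper name.

```javascript
// Read the page coordinates of the first finger that caused the event.
// Returns null when the event carries no changedTouches entries.
function pointFromTouchEvent(evt) {
  var list = evt.changedTouches;
  if (!list || list.length === 0) {
    return null;
  }
  var touch = list[0];
  return { x: touch.pageX, y: touch.pageY };
}

// In a page this could be wired up as:
// node.ontouchstart = function(evt){
//   var point = pointFromTouchEvent(evt);
//   console.log(point.x + "/" + point.y);
// };
```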

To better understand what might be in these lists, let’s go over some examples quickly:

When I put a finger down, all three lists will have the same information. It will be in changedTouches because putting the finger down is what caused the event
When I put a second finger down, touches will have two items, one for each finger. targetTouches will have two items only if the finger was placed in the same node as the first finger. changedTouches will have the information related to the second finger, because it’s what caused the event
If I put two fingers down at exactly the same time, it’s possible to have two items in changedTouches, one for each finger
If I move my fingers, touches and targetTouches keep the same entries, while changedTouches contains information for each finger that moved (at least one)
When I lift a finger, it will be removed from touches and targetTouches, and it will appear in changedTouches since it’s what caused the event
Removing my last finger will leave touches and targetTouches empty, and changedTouches will contain information for the last finger
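One way to watch these cases play out on a device is to log the size of each list from every handler. The sketch below does that; describeTouchLists is a hypothetical helper, and the two plain objects only stand in for real events so the snippet can run outside a browser.

```javascript
// Summarize how many entries each list holds for a given touch event.
function describeTouchLists(evt) {
  return evt.type + ": touches=" + evt.touches.length +
    " targetTouches=" + evt.targetTouches.length +
    " changedTouches=" + evt.changedTouches.length;
}

// Stand-ins for the first and last cases described above.
var firstFingerDown = { type: "touchstart", touches: [{}], targetTouches: [{}], changedTouches: [{}] };
var lastFingerUp = { type: "touchend", touches: [], targetTouches: [], changedTouches: [{}] };

console.log(describeTouchLists(firstFingerDown));
// touchstart: touches=1 targetTouches=1 changedTouches=1
console.log(describeTouchLists(lastFingerUp));
// touchend: touches=0 targetTouches=0 changedTouches=1
```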

Using these lists, I can keep very close tabs on what the user is doing. Imagine creating a(nother) Super Mario clone in JavaScript. I’d be able to tell what direction the user currently has his or her thumb on, while also being able to watch for when the user wants to jump or shoot a fireball.

I’ve been saying that these lists contain information about the fingers touching the screen. These objects are very similar to what you’d normally see in an event object passed to an event handler, but only a limited set of properties is available in them. Following is the full list of properties for these objects:

clientX: X coordinate of touch relative to the viewport (excludes scroll offset)
clientY: Y coordinate of touch relative to the viewport (excludes scroll offset)
screenX: Relative to the screen
screenY: Relative to the screen
pageX: Relative to the full page (includes scrolling)
pageY: Relative to the full page (includes scrolling)
target: Node the touch event originated from
identifier: An identifying number for the touch, so the same finger can be recognized from one event to the next
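These properties are enough to keep a running record of every finger on the screen. The following is a rough sketch, assuming a finger keeps the same identifier from touchstart through touchend; createTouchTracker and its methods are hypothetical names.

```javascript
// Track every finger currently on the screen, keyed by its identifier.
function createTouchTracker() {
  var active = {}; // identifier -> { x, y }
  var count = 0;

  return {
    // Feed every touchstart, touchmove, touchend and touchcancel event here.
    handle: function (evt) {
      var lifted = evt.type === "touchend" || evt.type === "touchcancel";
      for (var i = 0; i < evt.changedTouches.length; i++) {
        var touch = evt.changedTouches[i];
        var known = active.hasOwnProperty(touch.identifier);
        if (lifted) {
          if (known) {
            delete active[touch.identifier];
            count--;
          }
        } else {
          if (!known) {
            count++;
          }
          active[touch.identifier] = { x: touch.pageX, y: touch.pageY };
        }
      }
    },
    // Number of fingers currently tracked.
    count: function () {
      return count;
    },
    // Last known page position of a finger, or null if it is not down.
    position: function (identifier) {
      return active.hasOwnProperty(identifier) ? active[identifier] : null;
    }
  };
}
```

In a page, the same handle function could be assigned to all four touch handlers of a node.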

For those of you coming from the normal web design world, in a normal mousemove event, the node passed in the target attribute is usually what the mouse is currently over. But in all iPhone touch events, the target is a reference to the originating node.

One of the annoyances of writing web applications for the iPhone has been that even if you set a viewport for your application, dragging your finger around will move the page around. Fortunately, the touchmove event object has a preventDefault() function (a standard DOM event function) that will make the page stay absolutely still while you move your finger around.
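Blocking every touchmove would stop the user from scrolling at all, so it can help to call preventDefault() only for the moves you care about. A minimal sketch of one such policy; handleTouchMove and the isDraggable predicate are assumptions made for illustration, not part of the touch API.

```javascript
// Hypothetical policy: block page scrolling only for single-finger moves
// that started inside a node the application treats as draggable.
// Returns true when the default scrolling behavior was blocked.
function handleTouchMove(evt, isDraggable) {
  if (evt.touches.length === 1 && isDraggable(evt.touches[0].target)) {
    evt.preventDefault();
    return true;
  }
  return false;
}

// In a page this could be wired up as:
// document.ontouchmove = function(evt){
//   handleTouchMove(evt, function(node){ return node.className == "draggable"; });
// };
```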

Drag and Drop with the Touch API

We don’t have to worry about keeping track of down/up events as we do with mousemove since the only way touchmove is triggered is after touchstart.

node.ontouchmove = function(e){
  if(e.touches.length == 1){ // Only deal with one finger
    e.preventDefault(); // Keep the page from scrolling while we drag
    var touch = e.touches[0]; // Get the information for finger #1
    var node = touch.target; // Find the node the drag started from
    node.style.position = "absolute";
    node.style.left = touch.pageX + "px";
    node.style.top = touch.pageY + "px";
  }
}
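The handler above puts the node’s top-left corner directly under the finger, so the node can jump when the drag begins. One possible refinement is to remember where inside the node the finger landed and subtract that offset on every move. This is only a sketch: grabOffset and dragPosition are hypothetical helpers, and it assumes the node’s left and top are measured from the page origin.

```javascript
// On touchstart: remember where inside the node the finger landed.
function grabOffset(touch, nodeLeft, nodeTop) {
  return { dx: touch.pageX - nodeLeft, dy: touch.pageY - nodeTop };
}

// On touchmove: compute style values that keep that same spot under the finger.
function dragPosition(touch, offset) {
  return {
    left: (touch.pageX - offset.dx) + "px",
    top: (touch.pageY - offset.dy) + "px"
  };
}

// A node at (100, 200) grabbed at (110, 220), then dragged to (150, 260).
var offset = grabOffset({ pageX: 110, pageY: 220 }, 100, 200);
var position = dragPosition({ pageX: 150, pageY: 260 }, offset);
console.log(position.left + ", " + position.top); // 140px, 240px
```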

Gestures

This was much easier to figure out than the touch API. A gesture event occurs any time two fingers are touching the screen. If either finger lands in the node you’ve connected any of the gesture handlers (gesturestart, gesturechange, gestureend) to, you’ll start receiving the corresponding events.

scale and rotation are the two important keys of this event object. While scale gives you the multiplier the user has pinched or pushed in the gesture (relative to 1), rotation gives you the amount in degrees the user has rotated their fingers.
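Because both values are relative to the state at the start of the gesture, they are applied to whatever size and angle the node had before the gesture began. A small worked sketch; applyGesture is a hypothetical helper, and the plain object stands in for a real gesture event.

```javascript
// Combine the node's state from before the gesture with the relative
// scale and rotation reported by a gesture event.
function applyGesture(base, gesture) {
  return {
    width: base.width * gesture.scale,
    height: base.height * gesture.scale,
    rotation: (base.rotation + gesture.rotation) % 360
  };
}

// A 100x200 node, pushed to 1.5 times its size and turned a quarter turn.
var gestureResult = applyGesture(
  { width: 100, height: 200, rotation: 0 },
  { scale: 1.5, rotation: 90 }
);
console.log(gestureResult.width + "x" + gestureResult.height + " at " + gestureResult.rotation + "deg");
// 150x300 at 90deg
```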

Resizing and Rotating with the Gestures API

We’ll be using WebKit’s transform property to rotate the node.

var width = 100, height = 200, rotation = 0;

node.ongesturechange = function(e){
  var node = e.target;
  // scale and rotation are relative values,
  // so we wait to change our variables until the gesture ends
  node.style.width = (width * e.scale) + "px";
  node.style.height = (height * e.scale) + "px";
  node.style.webkitTransform = "rotate(" + ((rotation + e.rotation) % 360) + "deg)";
}

node.ongestureend = function(e){
  // Update the values for the next time a gesture happens
  width *= e.scale;
  height *= e.scale;
  rotation = (rotation + e.rotation) % 360;
}

Conflicts

Some readers might have noticed that a gesture is just a prettier way of looking at touch events. It’s completely true, and if you don’t handle things properly, you can end up with some odd behavior. Remember to keep track of what’s currently happening in a page, as you’ll probably want to let one of these two operations “win” when they come in conflict.
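One simple way to let an operation “win” is to keep a flag that is set while a gesture is running and to ignore drags until it clears. The following is a sketch of that idea only; createInteractionArbiter and its method names are hypothetical.

```javascript
// Hypothetical arbiter: while a gesture is active, drags are ignored.
function createInteractionArbiter() {
  var gestureActive = false;
  return {
    // Call from the gesturestart handler.
    gestureStarted: function () {
      gestureActive = true;
    },
    // Call from the gestureend handler.
    gestureEnded: function () {
      gestureActive = false;
    },
    // Call from touchmove with e.touches.length: a drag is allowed only
    // with exactly one finger down and no gesture running.
    canDrag: function (touchCount) {
      return !gestureActive && touchCount === 1;
    }
  };
}
```

With this in place, a touchmove handler would check canDrag(e.touches.length) before moving the node, so a pinch never gets mistaken for a drag.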

In Action

I put together a quick demo:

This is a simple application that showcases the incredible flexibility and power of these APIs. It’s a simple gray square that can have its colors and borders restyled, can be dragged around, and can be resized and rotated.

Load http://tinyurl.com/sp-iphone up on your iPhone and try the following:

Keep a finger over one of the colored squares, and put another finger on one of the border squares
Try the same thing using two colored squares or two border squares
Use one finger to drag the square around the page
Pinch and rotate the square
Start dragging the square, but put another finger down and turn it into a pinch and rotate. Lift one of your fingers back up, and resume dragging the square around

Can I Do X?

I’m not sure what sort of APIs we’ll be able to build on top of what Apple has provided for us. What I do know is that Apple has given us a very well thought out API.

mousedown and mouseup are events we can easily emulate with this new API. mousemove is a beast. First of all, we only get touch events after the finger has made contact (the equivalent of mousedown) while we get mousemove events regardless of whether the button is down or not. Also, preventing the page from jumping around isn’t something we can automate. Attach a handler to the document and the user wouldn’t be able to scroll at all!
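To make the easy part concrete, here is a rough sketch of translating a touch event into a mouse-like object. It is an illustration rather than a complete emulation layer: toMouseLike is a hypothetical helper, it follows only the first changed touch, and it cannot produce the button-up mousemove events described above.

```javascript
// Which mouse event each touch event most closely resembles.
var TOUCH_TO_MOUSE = {
  touchstart: "mousedown",
  touchmove: "mousemove", // only covers movement while the "button" is down
  touchend: "mouseup"
};

// Build a plain mouse-like object from a touch event, or return null when
// the event type has no mouse counterpart or no finger is attached.
function toMouseLike(evt) {
  if (!TOUCH_TO_MOUSE.hasOwnProperty(evt.type) || evt.changedTouches.length === 0) {
    return null;
  }
  var touch = evt.changedTouches[0];
  return {
    type: TOUCH_TO_MOUSE[evt.type],
    target: touch.target, // still the originating node, not the node under the finger
    pageX: touch.pageX,
    pageY: touch.pageY,
    clientX: touch.clientX,
    clientY: touch.clientY
  };
}
```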

Which brings us to DnD in general. Even though DnD only cares about mousemove in the context of the mouse button being down (the way that touchmove works), we don’t have any way to tell what node the user’s finger is over at the end of the drag (since target refers to the originating node). If a DnD system is to be used, it would have to be for registered drop targets that are aware of their position and size on the page.
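A registry like that could be as simple as a list of rectangles checked against the finger’s last position. A minimal sketch, assuming each target records its own left, top, width, and height in page coordinates; findDropTarget and the target objects are hypothetical.

```javascript
// Return the first registered drop target whose rectangle contains the
// touch's page coordinates, or null when the finger is over none of them.
function findDropTarget(targets, touch) {
  for (var i = 0; i < targets.length; i++) {
    var t = targets[i];
    var insideX = touch.pageX >= t.left && touch.pageX < t.left + t.width;
    var insideY = touch.pageY >= t.top && touch.pageY < t.top + t.height;
    if (insideX && insideY) {
      return t;
    }
  }
  return null;
}

// Two made-up targets sitting side by side.
var dropTargets = [
  { name: "trash", left: 0, top: 0, width: 100, height: 100 },
  { name: "inbox", left: 100, top: 0, width: 100, height: 100 }
];
```

In a touchend handler, the touch to test would come from changedTouches, since the lifted finger is no longer in touches.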
