Offline Gmail and Blogger Using the Dojo Offline Toolkit

[Note: This blog post is out of date. For up to date information on Dojo Offline please see the official web page.]

Introduction

The best way to start development on a programming framework is to ground it in the kinds of user interfaces it will be used in. This ensures that you don’t create astronaut architectures that have no real world use.

The first step in creating the Dojo Offline Toolkit is therefore to figure out what offline web applications might look like. This will help us determine what to include in the Dojo Offline API and what to leave out. The resulting user interface will also be bundled as a simple HTML/CSS template that developers can easily drop into their applications.

SitePen kicked this off by creating offline-enabled mockups of three popular and useful web applications: Gmail, Blogger, and a corporate portal named Adenine. For all three the goal is to find a consistent, simple user interface for offline web applications.

Gmail

We start with a mockup of an offline-enabled Gmail:

The first thing to notice is the addition of a new widget on the left-hand side of the page, the Offline Info widget. This widget encapsulates all of our offline functionality for Gmail.

Dojo Offline Toolkit Status Report for Week Ending January 7th, 2007

Overview

  • Every week we will be sending out a status report on the Dojo Offline Toolkit project to keep folks abreast of what SitePen has accomplished over the last week and what is planned for the week ahead.

Last Week

  • On Tuesday we kicked off the project with a blog post on the SitePen blog (view in HyperScope). Feedback from the blogosphere and in the comments was great; as of today there are almost 70 responses on the SitePen blog. Here are a few snippets of coverage:
    • Ed French: “I like what you’re doing with this, good luck. With 2/3rds of PCs sold now being notebooks using webapps offline is clearly going to be important.”
    • Jadon: “Awesome! Finally, an open-source implementation of an off-line client toolkit! I’ve been seriously wanting this.”
    • Chris Barber: “This is really awesome news! It’s the stuff like offline support that make Dojo the Javascript toolkit! Good luck!”
    • Najier: “Beautiful…This would open up many opportunities in the mobile Health Care space. I will explore some potential products to develop based on this platform.”
    • Andy Dean: “Wonderful project. I lose connection with the Internet not only on an airplane, but many other ways, especially outside the US.”
    • Jonathan: “Awesome idea! It would be neat to be able to WordPress while away from a network. That, and I like the name.”
    • Peter van Dijk: “Seems like 2007 will indeed be the year of offline access. Finally!”
    • Dan Moore: “…the Dojo Offline Toolkit promises to take things to a whole new level.”
    • Jeroen Coumans: “It looks like 2007 will become the year in which we’ll get offline storage and offline functionality in web applications.”
    • IBM developerWorks Blog: “[The] Dojo Offline Toolkit has the potential to allow for disconnected use in a lot of common web applications, such as email, calendering, and more. To the users and developers out there, how would you use such a technology?”
    • christmasgorilla: “Competition for Adobe — and Open”
    • cwinters: “enable AJAX applications to work offline, using an OS-specific component; could be HUGE”
    • dbt60626: “I’ll be following this effort closely. Wow.”
    • James Governor: “wow. Dojo takes on the synchronised web”
    • tazzzzz: “I’ve looked previously at using proxies for offline web app usage, but the coupling of a proxy and dojo.storage really makes a powerful combo.”
  • The big goal last week was to figure out, on paper, common user interfaces for offline web applications. This is to ensure that the technology and API behind Dojo Offline are focused on what end users actually need. On paper we created Dojo Offline-enabled mockups of several web applications:
    • Gmail
    • Blogger
    • Google Docs
    • A Corporate Portal
    • Sales Application
  • The reason for several Google apps is that I tried to modify apps we actually use, and we use those three applications all day long.
  • We created an on-screen mockup of Gmail with offline access provided by the Dojo Offline Toolkit. Some work still needs to be done on it, but you can see a document with some of the UI possibilities here: http://codinginparadise.org/clients/sitepen/dojo_offline/SPOT_Mockups/overview.html

This Week

  • The big goal this week is to finish the UI mockups. For the Gmail mockup, I think I am going to add a Work Online/Work Offline option to what is there. I am also going to create on-screen mockups of one more application, perhaps Blogger or the corporate portal example I have on paper. I might also work out a simple UI for resolving merge conflicts; if this gets too hairy I will back out quickly, since it might be out of scope for this release.
  • The other big goal is to determine an API for Dojo Offline.

Outstanding Issues

  • We still need to figure out where in the Dojo Subversion repository this will live, since Polipo is currently under the GPL and we need to get a contributor license agreement from its author.

SitePen Presents…

For the past year, SitePen has been providing Dojo training at our clients’ request, and more recently we’ve seen an increase in individuals requesting open workshops. Developers who are looking for a solid training course but whose companies have neither the capacity nor the resources to hold private onsite workshops need look no further. We’re going to try out a few one- and two-day workshops that will allow lone developers to gain expertise from industry leaders without fear of persecution by the person signing off on their expense reports.

So…without further ado, SitePen presents:

http://sitepen.com/services/training.php

Going Data-Driven

Dylan’s last post on performance is only one in a series we’ll be running on the topic, and as promised this post is all about the tools of the trade for doing Ajax app performance tuning and how to use them. Here at SitePen, we often get called into a project when the heat is really on: after most of the code is written and just before the hoped-for (or worse, already slipped) ship date. Needless to say, we like this situation even less than our clients do. We always prefer to be involved early enough in the development process to be able to steer clients towards architecture decisions that will scale and perform better and still fit the budget, but sometimes the damage is done. What then?

Let the data guide you.

It sounds simple and naive, but most developers who have been through their share of performance tuning crunches will have horror stories of hours or days lost to phantom performance “problems” that turned out to be nothing more than the hunch of one developer. It’s no use optimizing your database configuration or adding more expensive storage systems if the bottleneck is at the JavaScript or HTTP levels. Likewise, tuning your HTTP server to the hilt may have no effect if the bottleneck is storage contention or TCP fragmentation. Getting to a root cause requires defining the goal of a tuning project, testing each change to isolate causality, and keeping a log book handy and up-to-date.

It may be necessary to write custom tools to help you diagnose problems in some environments, but there’s a stable of tools that we always seem to fall back on here at SitePen. I’m going to do a quick run-through of what we do at each step when we’re analyzing webapp performance and scalability problems (note: they’re not the same thing!). Here’s the short list, and why we can’t live without them:

  • Firebug 1.0 Beta
    • To users, perceived performance is the only thing that matters, and that means that investigation should examine the system from the user’s perspective and work backward from there. There was a time when the Firefox TamperData extension ruled the roost for this, but no more. Now that page loading requests can be graphed inside of Firebug, things are getting a lot easier. Not only can the graph view show 404 requests and slow responses, it often lays bare the synchronous nature of script execution and requests and the browser’s limit of two simultaneous HTTP connections per server. Generating “before” and “after” evaluations for clients has never been so easy.
  • Venkman and Firebug 1.0 Beta
    • Now that Firebug includes some profiling and debugging support, Venkman may finally be on the way out, but whichever tool you use it’s highly valuable to be able to profile in-browser JavaScript performance at a function level. HTTP and server-side problems are often a source of perceived latency, but simple testing with full caches can easily point to client-side performance issues. Any logging or profiling system will impact overall page performance, but you should be using these tools to get relative timing data. There’s a special mode for the Dojo package loader that can be used to get accurate function names and line numbers. While the timing information may not translate 100% across FF, IE, Opera, and Safari, the relative timings tend to be in line.
  • dojo.profile
    • The dojo.profile module lets you do tic/toc timings of JavaScript code and provides a table showing averages and total timings. We use this to verify relative timings across browsers once Venkman/Firebug point out bottlenecks and to validate fixes in a cross-browser way.
  • Tsung and Apache Bench
    • As I noted earlier, HTTP-level performance problems can seriously impact application latency. Neither tool can pinpoint fundamental problems like outbound bandwidth saturation (making the system more scalable doesn’t matter if you can’t send more data across your link), but when the problem is one of scale and not instantaneous performance, these tools let you begin to validate assumptions. Apache Bench is great for testing balls-to-the-wall concurrency of a single script, but very often you’re more interested in full-app performance under more realistic workloads. While there are commercial tools available that can do this kind of load testing in a “real world” way, Tsung provides a highly-capable “replay” proxy mode that will generate workloads that can be used to monitor system performance from a variety of angles. Since we’re most often interested in “how many users can it handle?” rather than “how many times a second can I request foo.php?”, Tsung is an invaluable ally. As a downside, Tsung often requires a proxy and an Erlang build.
  • bonnie++
    • Databases and web servers alike need good I/O performance, and bonnie++ lets us determine if we’re getting anything like the theoretical disk performance out of a system. Knowing the “shape” of your workload is essential, but I find that when remediating disk I/O issues bonnie++ usually finds its way into my analysis.
    • Please, please, please make sure that your file systems are running with noatime set.
  • “EXPLAIN” statements and slow-query logs
    • SQL is the ubiquitous abstraction that most of the web runs on, and every database system today provides information on how well it’s returning what you request. A thousand other things can niggle your SQL server performance to death, but nothing should get done without logs and EXPLAIN output to guide you.
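
As a rough illustration of the tic/toc style of timing mentioned under dojo.profile above, here is a minimal standalone sketch. It does not use the dojo.profile API: the createProfiler, start, end, and report names are invented for this example, and the clock is passed in as a parameter only so that the arithmetic can be checked with a fake clock.

```javascript
// Minimal tic/toc profiler sketch. All names are hypothetical;
// this is NOT the dojo.profile API, only the same general idea.
function createProfiler(now) {
  // `now` returns the current time in milliseconds (defaults to Date.now).
  const clock = now || (() => Date.now());
  const running = Object.create(null); // label -> start time of the open timing
  const totals = Object.create(null);  // label -> { count, total }

  return {
    // "tic": mark the start of a timed region.
    start(label) {
      running[label] = clock();
    },
    // "toc": close the region and accumulate its elapsed time.
    end(label) {
      if (!(label in running)) {
        throw new Error("end() called without start() for " + label);
      }
      const elapsed = clock() - running[label];
      delete running[label];
      const entry = totals[label] || (totals[label] = { count: 0, total: 0 });
      entry.count += 1;
      entry.total += elapsed;
      return elapsed;
    },
    // One row per label: call count, total time, and average time.
    report() {
      return Object.keys(totals).sort().map((label) => ({
        label: label,
        count: totals[label].count,
        total: totals[label].total,
        average: totals[label].total / totals[label].count
      }));
    }
  };
}

// Usage: time a small loop and print the summary table.
const profiler = createProfiler();
profiler.start("sum");
let sum = 0;
for (let i = 0; i < 100000; i++) {
  sum += i;
}
profiler.end("sum");
console.log(profiler.report());
```

The absolute numbers from a timer like this will differ from browser to browser, which is why the relative timings between labels are the useful part.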

System development and tuning need to go hand-in-hand, and expert help can clearly make a huge difference. The tools above are some of the most visible artifacts of the process, but it’s discipline in the process itself that’s of paramount importance. Let the data guide you and everything else is likely to work out…assuming you know where the goal line is.

Performance tuning can easily drive you crazy should you not have a goal in mind. Without a goal, there will always be another tweak, another 3% to be eked out of the system. Given the marathon sessions that seem to lead nowhere, it’s important that developers doing performance work remember to keep their eye on the ball and to take a walk or a nap or just stop for the day when there isn’t forward progress for an hour or so. That, of course, means having a ball to keep an eye on. So before you start your tuning adventure (or call us in to help), you need to know what your budget is, what your responsiveness goals are, and what your scalability targets are.

For more thorough treatments of how to build things that both perform well and can be made to scale, I strongly recommend Cal Henderson’s “Building Scalable Web Sites”, Theo Schlossnagle’s “Scalable Internet Architectures”, and Jeremy Zawodny’s “High Performance MySQL”.

Next time: why the Dojo build system matters, why the x-domain package loader is awesome, and other stupid HTTP tricks.

When Vendors Attack! Film at 11

I’m sorry to interrupt the performance post series, but this seems to warrant a timely response.

Before I go any further, I should note that once-upon-a-time I was deeply involved in the webapp security community. As an engineer at a small MSSP in ’02 and ’03, I contributed to OWASP, led one of their main projects, and participated in the associated discussions. I’ve audited web software for security flaws and worked to secure new and existing systems. These days, my involvement in the security world is reduced to reading interesting papers from the various conferences and my occasional trawl of CiteSeer. I have tremendous respect for the security community and many of the smart and talented people I had a chance to work with in those days.

But all is not right with the world of web app security. Di Paola and Fedon’s paper is an amalgam of other people’s research (response splitting) and a sprinkling of idiomatic JavaScript. When it can get to the front page of Slashdot with “the web2.0 is falling!” billing, it only feeds the FUD flames. Pablum as revolution is disturbing. When it’s widely read, it’s urban legend in the making.

Here’s what Di Paola and Fedon tried to side-step:

  1. Response Splitting attacks aren’t that common (no, really)
    • The scariest bits of the presented paper require a complicit, b0rken proxy.
    • Mitigating the threat therefore means fixing the proxies, not the clients. This is comparatively good news as it implies fewer nodes to upgrade to remove the immediate-term threat. This matters to everyone interested in mitigating and managing risk (not eliminating it).
  2. The fundamental root-of-trust issue here is still an XSS attack. If you are subject to an XSS, the same-origin policy already ensures that you’re f’d. An XSS attack is the “root” or “ring 0” attack of the web. This is the fundamental weakness of the web’s security model today, and one that is difficult to solve (e.g., requires upgrading all clients). That there are problems associated with being rooted should surprise no one.
  3. Characterizing the replacement of existing functions as a “design flaw” in JavaScript is comical. The assumption is malicious code in the same execution scope as the code being attacked (see #2), and that’s not tractable by disallowing redefinition. Even if JavaScript didn’t allow it, any environment that would allow runtime event handlers to be registered would suffice, and since there is no way (in current JS) to determine if code is “valid”, the gig would still be up. Just register a malicious onreadystatechange handler. The only change would be that you might have to target applications more narrowly.
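
To make point 3 concrete, here is a small sketch of why forbidding redefinition would not help once hostile code shares a scope with the application. The transport object and its send and onresponse members are hypothetical stand-ins for an XMLHttpRequest-style API, not a real library, and the "server" is simulated by an echo.

```javascript
// Hypothetical XMLHttpRequest-like transport; names are illustrative only.
function createTransport() {
  return {
    onresponse: null, // application-supplied callback
    send(body) {
      // Pretend the server echoed the body back.
      if (typeof this.onresponse === "function") {
        this.onresponse("echo:" + body);
      }
    }
  };
}

// Case 1: redefinition is allowed, so same-scope code can wrap send().
const seenByWrapper = [];
const open = createTransport();
const originalSend = open.send;
open.send = function (body) {
  seenByWrapper.push(body); // observes the outgoing data
  return originalSend.call(this, body);
};
open.send("hello");

// Case 2: send() is locked against redefinition, but a callback can
// still be registered, which is all that same-scope code needs.
const seenByHandler = [];
const locked = createTransport();
Object.defineProperty(locked, "send", { writable: false, configurable: false });
locked.onresponse = function (data) {
  seenByHandler.push(data); // observes the incoming data instead
};
locked.send("hello");

console.log(seenByWrapper, seenByHandler);
```

In both cases the code doing the observing already runs in the page, which is the situation described in point 2; locking down individual functions only changes which hook gets used.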

What really makes me sad though is that the work of folks like H.D. Moore, Thor Larholm, and Jeremiah Grossman gets lost in the noise when chaff like this is published. By not providing an honest evaluation of the real-world potential of a threat vector, the authors of a paper like this create a sort of seismograph that can’t tell magnitudes, only number of things shaking. Without magnitude information, an instant market is created for people to stand on the tops of roofs and yell down how bad it is (or in this case, how bad it could have been had they not been valiantly standing there).

Threat information is only valuable when there is enough data about it to manage and mitigate risk. Yes, security problems are real, and web app security problems aren’t going away any time soon, but without level-headed analysis of the threat vectors, the real-world risk profiles, and the root-of-trust that is being attacked, there is very little reason for clients to view the security community as anything but a freakish collection of opportunists, wolves, and disillusioned techno-utopianists. Accurate data builds trust, and trust builds relationships that allow you to effectively mitigate risk. It’s high time that the security industry developed a code of ethics that prevents FUD-slinging. OWASP could even lead the way, although I suspect there’s not a chance in hell of it happening.

The view from the roof is pretty good, after all.

The Dojo Offline Toolkit

[Note: This blog post is out of date. For up to date information on Dojo Offline please see the official web page.]

Introduction

I’m proud to announce the kick off of the Dojo Offline Toolkit, which SitePen has graciously agreed to sponsor and fund. SitePen is a leader in pushing the web browser in new directions, and I’m extremely excited to be working on this project with the SitePen crew.

Last month, in December, I came up for air after finishing HyperScope 1.1 and touched base with Dylan Schiemann, CEO of SitePen, about consulting with them. On the phone I mentioned offhand to Dylan that I had been prototyping and playing with some ideas around bringing true offline access to web applications in a simple, generic way. Dylan mentioned that SitePen would be very interested in such a framework, since it would help them bring in new clients, and offered to fund full-time development of it for the next three months. Wow; what a mensch.

Starting today, I will be working full-time the next three months on bringing the Dojo Offline Toolkit from the drawing board to reality, thanks to SitePen. The Dojo Offline Toolkit will be an open source library that brings true, offline access to web applications, in a simple, generic way that developers can easily bring into their web applications. Users will be able to access their web applications and work with their data even if no network connection is available, just like desktop applications.

What is the Dojo Offline Toolkit?

The Dojo Offline Toolkit will be a small, cross-platform, generic download that enables web applications to work offline.

Let’s look at the Dojo Offline Toolkit from a user’s perspective. Imagine Alex is using a web-based real estate application for realtors built with the Dojo Offline Toolkit. In the upper-right corner of this web application is a button that says “Work Offline.” The first time Alex clicks on this button, a small window appears informing him that this web application can be accessed and used even if he is offline. If Dojo Offline has never been installed, Alex is prompted to optionally install a small download (100K to 300K) that is automatically selected for his OS, whether Windows, Linux/x86, or Mac OS X/Universal Binary.

Once Dojo Offline is installed with the included installer, the web-based real estate application prompts Alex to drag a hyperlink to his desktop and bookmark the web application’s URL. As Alex works online, anything that should be available offline is simply stored locally. If Alex is offline, he can reach his application by simply double-clicking the link on his desktop, opening its bookmark, or by simply typing in its normal web address. The application’s user-interface will magically appear in the browser, even if the user is offline, and all offline data will be retrieved from and stored into local storage. Dojo Offline detects when the network has reappeared, allowing the web application to send any data stored in local storage to the web server.
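
The store-locally-then-sync flow described above might look roughly like the following sketch. None of these names come from Dojo Offline or Dojo Storage; createOfflineQueue, save, goOnline, goOffline, and pending are assumptions made for illustration, the in-memory localStore array stands in for real persistent local storage, and sendToServer stands in for a real network call.

```javascript
// Sketch of "store locally while offline, sync when the network returns".
// Every name here is hypothetical; this is not the Dojo Offline API.
function createOfflineQueue(sendToServer) {
  const localStore = []; // stands in for persistent local storage
  let online = false;

  function flush() {
    // Send pending records oldest-first; stop at the first failure
    // so nothing is lost and ordering is preserved.
    while (online && localStore.length > 0) {
      const accepted = sendToServer(localStore[0]);
      if (!accepted) {
        online = false; // treat a failed send as losing the network
        break;
      }
      localStore.shift();
    }
  }

  return {
    save(record) {
      localStore.push(record); // always write locally first
      flush();                 // does nothing while offline
    },
    goOnline() {
      online = true;
      flush();
    },
    goOffline() {
      online = false;
    },
    pending() {
      return localStore.length;
    }
  };
}

// Usage: two edits made offline are delivered once the network reappears.
const delivered = [];
const queue = createOfflineQueue(function (record) {
  delivered.push(record);
  return true; // pretend the server accepted the record
});
queue.save({ id: 1, note: "listing updated" });
queue.save({ id: 2, note: "price changed" });
console.log(queue.pending()); // 2 while offline
queue.goOnline();
console.log(queue.pending()); // 0 after syncing
```

Writing to local storage first and treating the server as something to catch up later keeps the online and offline code paths the same, at the cost of having to think about conflicts when the same data changes in two places.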

Local storage is done using Dojo Storage, which allows web applications to store hundreds of kilobytes or megabytes of information inside the browser, with the user’s permission. Dojo Storage is complete and works across 95% of the existing installed base of the web, including Firefox, Safari, and Mozilla on Windows, Linux, and Mac OS X. The Dojo Offline Toolkit will come bundled with Dojo Storage.

Once Dojo Offline has been installed, it will work for any web application that codes to it — it is completely generic and has no application specific information in its download. Applications have a consistent, simple API they can code to, the Dojo Offline and Dojo Storage APIs, to enable offline ability. Even better, since the user always interacts with the web application through its domain name, rather than through a file:// URL or http://localhost domain name, the web application runs under the same security policies as standard web sites, which means a user’s machine will not be compromised by an untrusted web application. The Dojo Offline Toolkit will work in Internet Explorer, Firefox, and Safari, and will run on Windows, Linux/x86, and Mac OS X/Universal Binary.

The Dojo Offline Toolkit will be fully open source, available under the same licenses as Dojo: the BSD and the AFL.
