[Note: This blog post is out of date. For up to date information on Dojo Offline please see the official web page.]

Introduction

I’m proud to announce the kickoff of the Dojo Offline Toolkit, which SitePen has graciously agreed to sponsor and fund. SitePen is a leader in pushing the web browser in new directions, and I’m extremely excited to be working on this project with the SitePen crew.

Last month, in December, I came up for air after finishing HyperScope 1.1 and touched base with Dylan Schiemann, CEO of SitePen, about consulting with them. On the phone I mentioned offhand to Dylan that I had been prototyping and playing with some ideas around bringing true offline access to web applications in a simple, generic way. Dylan mentioned that SitePen would be very interested in such a framework, since it would help them bring in new clients, and offered to fund full-time development of it for the next three months. Wow; what a mensch.

Starting today, I will be working full-time for the next three months on bringing the Dojo Offline Toolkit from the drawing board to reality, thanks to SitePen. The Dojo Offline Toolkit will be an open source library that brings true offline access to web applications, in a simple, generic way that developers can easily bring into their web applications. Users will be able to access their web applications and work with their data even if no network connection is available, just like desktop applications.

What is the Dojo Offline Toolkit?

The Dojo Offline Toolkit will be a small, cross-platform, generic download that enables web applications to work offline.

Let’s look at the Dojo Offline Toolkit from a user’s perspective. Imagine Alex is using a web-based real estate application for realtors built with the Dojo Offline Toolkit. In the upper-right corner of this web application is a button that says “Work Offline.” The first time Alex clicks on this button, a small window appears informing him that this web application can be accessed and used even if he is offline. If Dojo Offline has never been installed, Alex is prompted to optionally install a small 100K to 300K download that is automatically selected for his OS: Windows, Linux/x86, or Mac OS X/Universal Binary.

Once Dojo Offline is installed with the included installer, the web-based real estate application prompts Alex to drag a hyperlink to his desktop and bookmark the web application’s URL. As Alex works online, anything that should be available offline is simply stored locally. If Alex is offline, he can reach his application by simply double-clicking the link on his desktop, opening its bookmark, or by simply typing in its normal web address. The application’s user-interface will magically appear in the browser, even if the user is offline, and all offline data will be retrieved from and stored into local storage. Dojo Offline detects when the network has reappeared, allowing the web application to send any data stored in local storage to the web server.

Local storage is done using Dojo Storage, which allows web applications to store hundreds of K or megabytes of information inside the browser, with the user’s permission. Dojo Storage is complete and works across 95% of the existing installed base of the web, including Firefox, Safari, and Mozilla on Windows, Linux, and Mac OS X. The Dojo Offline Toolkit will come bundled with Dojo Storage.

Once Dojo Offline has been installed, it will work for any web application that codes to it — it is completely generic and has no application specific information in its download. Applications have a consistent, simple API they can code to, the Dojo Offline and Dojo Storage APIs, to enable offline ability. Even better, since the user always interacts with the web application through its domain name, rather than through a file:// URL or http://localhost domain name, the web application runs under the same security policies as standard web sites, which means a user’s machine will not be compromised by an untrusted web application. The Dojo Offline Toolkit will work in Internet Explorer, Firefox, and Safari, and will run on Windows, Linux/x86, and Mac OS X/Universal Binary.

The Dojo Offline Toolkit will be fully open source, available under the same licenses as Dojo: the BSD and the AFL.

What Are Some Example Offline Applications?

True offline access for web applications is one of the holy grails of web development. I believe that providing the web with true, reliable offline access will open the door to many exciting, powerful possibilities; this is why I have devoted the last few years to figuring out how to bring this ability to the contemporary web. Consumer web sites, such as GMail and Blogger, can continue their relentless march to being the central applications users work with day to day by operating even when users don’t have a network; companies can make their corporate portals and sales CRM systems available offline for a mobile work-force, such as sales agents out in the field; and the web itself can begin to replace many custom, vertical Visual Basic and Java Swing applications that could only be built as desktop applications because they had to work offline. With Dojo Offline these can now migrate to the web, bringing the web’s operational and development cost-savings to many companies’ important internal applications.

Let’s quickly see how many common web applications can be updated to work offline if they adopt the Dojo Offline Toolkit.

GMail

Imagine a version of GMail with a “Work Offline” button on the left-hand side of the screen. When pressed, GMail downloads 100 of your most recent emails into Dojo Offline, along with pieces of its user-interface. A user can now close their browser and leave the network, stepping on an airplane for example. Once in the air, the user can then simply open their laptop and browser and type in mail.google.com. The GMail UI magically appears, along with their 100 most recent emails. A user can read these mails, compose new ones, or reply to existing ones. A flight attendant announces that the plane will land soon; the user closes their browser and laptop. Later, when they are back on the network, they can click the “Work Online” button, which will send all of their locally written emails to the GMail server.

Corporate Portal

Imagine you are a saleswoman, out on the road with your laptop, visiting suppliers and potential customers. Your company has set up a corporate portal that lists potential sales leads, contacts, opportunities, new products, important documents, and more; the information on this portal means the difference between making a big sale that pays the rent this month with its sales commission or working every weekend to make ends meet. What if your corporate portal could download important information into your Dojo Offline cache before hitting the road, so that when that potential customer asks about Widget X you can quickly pull it up in your browser without a network, making the sale?

Google Docs

If online office suites, such as Google Docs and Spreadsheets, don’t have offline access, they can never truly compete with Microsoft Office. This is an easy one to imagine; simply select which documents you want to have locally. Later, open your browser and navigate to docs.google.com, working from anywhere you want, even without a network. When you are done, press the “Sync” button to send your changes back to the server once the network reappears. A more sophisticated UI might be available to merge changes that others have made to documents while you were offline.

Blogger

I love to blog. I commonly have an inspiration for a blog post while walking around, and carry my laptop in my backpack. Many times I am at a book store, coffee shop, or friend’s house, and would love to quickly write a new blog post or lightly edit an existing one even if I don’t have a WiFi network, which is very common. I know there are custom desktop applications I can download to work offline, but I don’t want to learn a new user-interface. Why can’t I use Blogger offline? Imagine a Blogger that works with the Dojo Offline Toolkit; when I start Blogger, it automatically downloads its UI and most recent blog posts into Dojo Offline. Later, when inspiration hits, I can simply pop open my laptop, open my browser, and navigate to blogger.com; the blogger.com UI magically appears, informing me that I am working offline. I can now write a new blog post or edit the ones already stored in Dojo Offline, and my changes are saved locally. When I hit the network again, I simply hit the “Sync” button on the Blogger page, which uploads my new posts and edited ones to the server.

How Dojo Offline Works

Problem: How can a user access a web application’s user-interface while offline?

I have been working on this problem for years, trying many different configurations. The solution provided by the Dojo Offline Toolkit is surprisingly simple. We don’t need to adopt radically different or exotic programming models, such as loading Single Page Applications like TiddlyWiki from the filesystem, adopting Adobe’s Apollo framework, or downloading huge, entire web servers with specialized application logic that run locally, such as Zimbra’s offline solution.

Instead, Dojo Offline’s answer is to simply use a very small, standard web proxy that runs locally. Web proxies are perfect; they speak standard HTTP/1.1 between a web browser and a server, caching files that wish to be cached for later access without hitting the network. Many companies run a web proxy on their networks, caching commonly accessed pages for later access; why can’t this web proxy run on a user’s local machine, caching a web application’s UI for offline access? A web server can simply turn on standard HTTP/1.1 caching headers on its user-interface files, which the proxy dutifully caches. If the browser comes up but the network is down, the local web proxy will simply hand back its cached UI files. Even better, the proxy will automatically update any of its cached files if they have been updated, based on their caching headers, which means the UI gains auto-update for free — no new standards are needed.
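As a rough illustration of the server side of this idea, here is a minimal sketch of how a web application might choose standard HTTP/1.1 caching headers so that a caching proxy keeps its UI files. The helper name, the list of file extensions, and the one-hour lifetime are all assumptions made for the example; nothing here is part of Dojo Offline itself.

```javascript
// Hypothetical helper: pick HTTP/1.1 caching headers for a response.
// The extension list and max-age value are illustrative assumptions.
const UI_EXTENSIONS = [".html", ".js", ".css", ".png", ".gif"];

function cachingHeadersFor(path, lastModified) {
  const isUiFile = UI_EXTENSIONS.some((ext) => path.endsWith(ext));
  if (isUiFile) {
    // Cacheable UI file: a proxy may store it and later revalidate it
    // against the Last-Modified date.
    return {
      "Cache-Control": "public, max-age=3600",
      "Last-Modified": lastModified.toUTCString(),
    };
  }
  // Everything else (for example, dynamic data) is marked as not storable.
  return { "Cache-Control": "no-store" };
}
```

Once the max-age lifetime has passed, a proxy can revalidate the file using the Last-Modified date, which is how the UI would pick up new versions without any extra mechanism.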

How do we configure the web browser to talk to our local web proxy? We use Proxy AutoConfiguration (PAC), a standard from the late nineties that is not widely known but has deep, mature support in all browsers. A PAC file is a small bit of JavaScript that is invoked on each browser request. This JavaScript can decide how to resolve the address, either by directly talking to the web site or by using a proxy. For Dojo Offline, we only want to talk to the local proxy and cache files for Dojo Offline web applications, not for all web sites, so that we don’t fill up our hard drive. Our PAC file will therefore talk to the local web proxy for any domain names that want to work offline, and will ignore the proxy for all other addresses; this will be a simple JavaScript if/else statement in the PAC file. We programmatically register our PAC file for a user’s browser. This PAC file is actually generated dynamically by the local Dojo Offline proxy.
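To make the if/else routing concrete, here is a sketch of what such a generated PAC file might look like. FindProxyForURL, with its “PROXY host:port” and “DIRECT” return values, is the standard PAC entry point; the proxy port and the list of offline-enabled hosts are assumptions invented for this example.

```javascript
// Sketch of a generated PAC file. The host list and the proxy port
// are illustrative assumptions, not Dojo Offline's actual values.
var OFFLINE_HOSTS = ["offline.dojo.web.app", "realestate.example.com"];
var LOCAL_PROXY = "PROXY 127.0.0.1:8123";

// The browser calls this function for each request it makes.
function FindProxyForURL(url, host) {
  var normalizedHost = host.toLowerCase();
  for (var i = 0; i < OFFLINE_HOSTS.length; i++) {
    if (normalizedHost === OFFLINE_HOSTS[i]) {
      // Offline-enabled application: route through the local proxy.
      return LOCAL_PROXY;
    }
  }
  // Every other web site bypasses the proxy entirely.
  return "DIRECT";
}
```

Because the file is generated by the proxy, adding a new offline-enabled application would only mean regenerating the host list.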

How does a web application add itself to the PAC file so it can work offline? We have to be very careful here. We don’t want to create an attack vector to the user’s local computer by having the web application “talk” to localhost, such as “http://localhost:1234/add-web-app?url=mywebapp.com”, nor do we want one web application to be able to spoof another and add it to the PAC file against its wishes. The entire focus of security for Dojo Offline is to keep the surface area of trust as narrow and small as possible, constraining privilege to just the small web proxy, which only runs on the loopback address and never touches the real network; everything else must use standard domain names, forcing them into the browser’s standard, restricted web privilege level. Further, the Dojo Offline Toolkit’s proxy is completely generic and does not have to be tailored for individual applications.

Dojo Offline’s PAC file comes bundled with a single, magical bootstrap domain name initially, “offline.dojo.web.app,” that a web application can invoke to add itself to the PAC file. The PAC file routes any request for this domain to the local proxy, and the Dojo Offline proxy checks the referer (sic) header for the domain name to be added offline. Normally the referer field can be spoofed, but there is no way for a web application to spoof the referer field from inside the web browser. The predefined offline.dojo.web.app domain name also exposes other services a web application can use, such as knowing whether it is on- or off-line. Access to these services is mediated by a thin, easy-to-use Dojo Offline JavaScript API, bundled with the web application itself.
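The referer check can be sketched as follows. The real proxy is C-based, so this JavaScript is only an illustration of the logic; the function name, the header map, and the in-memory host set are hypothetical. The point it tries to show is that the domain to add is taken from the Referer header alone, never from a URL parameter that a page could set to someone else’s domain.

```javascript
// Hypothetical proxy-side logic, written in JavaScript for readability.
// All names here are illustrative; the real proxy is not implemented this way.
const BOOTSTRAP_HOST = "offline.dojo.web.app";
const offlineHosts = new Set([BOOTSTRAP_HOST]);

// Returns true if the requesting application's domain was registered.
function handleBootstrapRequest(requestHost, headers) {
  if (requestHost !== BOOTSTRAP_HOST) {
    return false; // only the bootstrap domain may register applications
  }
  const referer = headers["referer"];
  if (!referer) {
    return false; // no Referer header, so there is nothing to register
  }
  let refererHost;
  try {
    refererHost = new URL(referer).hostname;
  } catch (err) {
    return false; // malformed Referer value
  }
  if (!refererHost) {
    return false;
  }
  // The domain comes from the Referer only, so a page can register
  // just the domain it was actually served from.
  offlineHosts.add(refererHost);
  return true;
}
```

After a successful registration, the proxy would regenerate the PAC file so that the new domain is routed through it.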

The web browser does not know the difference between whether you are on- or off-line, since the proxy serves up the UI either way. Dojo Storage can save hundreds of K or megabytes of application-level data, and is keyed off of the domain name for security; because the domain name is the same either way, Dojo Storage is “tricked” into not knowing the difference and exposes the same data store in both cases. Applications can use this persistent, megabyte-capable store for all offline data needs, accessing the same information whether you are on- or off-line.
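Here is one way an application might use such a store to hold changes made offline and send them once the network is back. This is only a sketch: the store below is an in-memory stand-in with made-up put and get methods, not the Dojo Storage API, and the key name and the send callback are likewise assumptions.

```javascript
// In-memory stand-in for a persistent key/value store. This is an
// assumption for the example; a real application would use Dojo Storage.
function createStore() {
  const data = new Map();
  return {
    put: (key, value) => data.set(key, JSON.stringify(value)),
    get: (key) => (data.has(key) ? JSON.parse(data.get(key)) : null),
  };
}

const PENDING_KEY = "pendingChanges"; // hypothetical key name

// Record a change locally; this works the same on- or off-line.
function saveChange(store, change) {
  const pending = store.get(PENDING_KEY) || [];
  pending.push(change);
  store.put(PENDING_KEY, pending);
}

// Try to upload every pending change. `send` is a hypothetical callback
// that returns true when the server accepted the change.
function syncPending(store, send) {
  const pending = store.get(PENDING_KEY) || [];
  const remaining = pending.filter((change) => !send(change));
  store.put(PENDING_KEY, remaining);
  return pending.length - remaining.length; // number of changes sent
}
```

Changes that fail to upload stay in the queue, so a later sync can retry them.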

The last step is to wrap the Dojo Offline Toolkit into a small installer for each target platform, and to have it start up silently on system startup. The download size will be only 100 to 300K, making it extremely easy to download and try; an uninstaller will also exist for each platform, bundled with the download. Everything is automated, hands-off, and easy.

The important pieces of Dojo Offline have already been prototyped and found to work; all that remains is engineering work. An off-the-shelf, open source (GPL), C-based web proxy named Polipo will be used, saving months of development time creating a custom HTTP/1.1 proxy. Polipo compiles to only 150K and is portable to Windows, Linux, and Mac OS X; it is the smallest, most full-featured web proxy available. There are a few bugs in Polipo that will be cleaned up for Dojo Offline. The open source Nullsoft Scriptable Install System (NSIS) will be used for the Windows installer, while Linux installation will be through Firefox’s Cross-Platform Installer (XPI) technology; the Mac OS X installer technology has not been determined yet.

Development Details

I hope you are as excited about the prospects of the Dojo Offline Toolkit as I am. Here’s the plan:

SitePen will be sponsoring my full-time development of Dojo Offline over the next three months.

The goal will be to get running code as quickly as possible, generating prototypes that are iteratively refined and made more reliable with each pass, starting with Windows. Regular builds and demos will be posted, and weekly Dojo Offline status reports will be blogged on my weblog every Monday.

The final deliverable will consist of:

  • the Dojo Offline proxy
  • installers and uninstallers
  • PAC file generation and registration
  • the Dojo Offline API for easy, application-level access
  • the Dojo Offline web-based installer UI for downloading Dojo Offline
  • documentation
  • a sample application, Moxie, modified to work with Dojo Offline
  • QA and bug fixing

I will target Windows and Mac OS X/x86 initially, with builds for Linux/x86 and Mac OS X/PowerPC if time allows. I will also explore whether automatic network up/down notification is feasible.

I will be attempting to maintain copies of most of my documentation inside the HyperScope, to see how it performs in the context of a project. This will be useful for folks who want to point deeply into the Dojo Offline Toolkit docs using granular addressability.
