The Dojo Offline Toolkit

By Brad Neuberg on January 2, 2007 10:54 pm

[Note: This blog post is out of date. For up to date information on Dojo Offline please see the official web page.]

Introduction

I’m proud to announce the kick off of the Dojo Offline Toolkit, which SitePen has graciously agreed to sponsor and fund. SitePen is a leader in pushing the web browser in new directions, and I’m extremely excited to be working on this project with the SitePen crew.

Last month, in December, I came up for air after finishing HyperScope 1.1 and touched base with Dylan Schiemann, CEO of SitePen, about consulting with them. On the phone I mentioned offhand to Dylan that I had been prototyping and playing with some ideas around bringing true offline access to web applications in a simple, generic way. Dylan mentioned that SitePen would be very interested in such a framework, since it would help them bring in new clients, and offered to fund full-time development of it for the next three months. Wow; what a mensch.

Starting today, thanks to SitePen, I will be working full-time for the next three months on bringing the Dojo Offline Toolkit from the drawing board to reality. The Dojo Offline Toolkit will be an open source library that brings true offline access to web applications in a simple, generic way that developers can easily adopt. Users will be able to access their web applications and work with their data even if no network connection is available, just like desktop applications.

What is the Dojo Offline Toolkit?

The Dojo Offline Toolkit will be a small, cross-platform, generic download that enables web applications to work offline.

Let’s look at the Dojo Offline Toolkit from a user’s perspective. Imagine Alex is using a web-based real estate application for realtors built with the Dojo Offline Toolkit. In the upper-right corner of this web application is a button that says “Work Offline.” The first time Alex clicks on this button, a small window appears informing him that this web application can be accessed and used even if he is offline. If Dojo Offline has never been installed, Alex is prompted to optionally install a small 100K to 300K download that is automatically selected for his OS: Windows, Linux/x86, or Mac OS X/Universal Binary.

Once Dojo Offline is installed with the included installer, the web-based real estate application prompts Alex to drag a hyperlink to his desktop and bookmark the web application’s URL. As Alex works online, anything that should be available offline is simply stored locally. If Alex is offline, he can reach his application by double-clicking the link on his desktop, opening its bookmark, or typing in its normal web address. The application’s user-interface will magically appear in the browser, even though he is offline, and all offline data will be retrieved from and stored into local storage. Dojo Offline detects when the network has reappeared, allowing the web application to send any data stored in local storage to the web server.

Local storage is done using Dojo Storage, which allows web applications to store hundreds of kilobytes or even megabytes of information inside the browser, with the user’s permission. Dojo Storage is complete and works across 95% of the web’s existing installed base, including Firefox, Safari, and Mozilla on Windows, Linux, and Mac OS X. The Dojo Offline Toolkit will come bundled with Dojo Storage.

Once Dojo Offline has been installed, it will work for any web application that codes to it — it is completely generic and has no application-specific information in its download. Applications have a consistent, simple API they can code to, the Dojo Offline and Dojo Storage APIs, to enable offline use. Even better, since the user always interacts with the web application through its domain name, rather than through a file:// URL or an http://localhost domain name, the web application runs under the same security policies as standard web sites, which means a user’s machine will not be compromised by an untrusted web application. The Dojo Offline Toolkit will work in Internet Explorer, Firefox, and Safari, and will run on Windows, Linux/x86, and Mac OS X/Universal Binary.

The Dojo Offline Toolkit will be fully open source, available under the same licenses as Dojo: the BSD and the AFL.

What Are Some Example Offline Applications?

True offline access for web applications is one of the holy grails of web development. I believe that providing the web with true, reliable offline access will open the door to many exciting, powerful possibilities; this is why I have devoted the last few years to figuring out how to bring this ability to the contemporary web. Consumer web sites, such as GMail and Blogger, can continue their relentless march to being the central applications users work with day to day by operating even when users don’t have a network; companies can make their corporate portals and sales CRM systems available offline for a mobile work-force, such as sales agents out in the field; and the web itself can begin to replace many custom, vertical Visual Basic and Java Swing applications that were only built as desktop applications because they had to work offline. With Dojo Offline these can now migrate to the web, bringing the web’s operational and development cost savings to many companies’ important internal applications.

Let’s quickly see how some common web applications could be updated to work offline if they adopt the Dojo Offline Toolkit.

GMail

Imagine a version of GMail with a “Work Offline” button on the left-hand side of the screen. When pressed, GMail downloads 100 of your most recent emails into Dojo Offline, including pieces of its user-interface. A user can now close their browser and leave the network, stepping on an airplane for example. Once in the air, the user can simply open their laptop and browser and type in mail.google.com. The GMail UI magically appears, along with their 100 most recent emails. The user can read these emails, compose new ones, or reply to existing ones. A flight attendant announces that the plane will land soon; the user closes their browser and laptop. Later, when they are back on the network, they can click the “Work Online” button, which will send all of their locally written emails to the GMail server.

Corporate Portal

Imagine you are a saleswoman, out on the road with your laptop, visiting suppliers and potential customers. Your company has set up a corporate portal that lists potential sales leads, contacts, opportunities, new products, important documents, and more; the information on this portal means the difference between a big sale whose commission pays the rent this month and working every weekend to make ends meet. What if your corporate portal could download important information into your Dojo Offline cache before you hit the road, so that when a potential customer asks about Widget X you can quickly pull it up in your browser without a network, making the sale?

Google Docs

If online office suites, such as Google Docs and Spreadsheets, don’t have offline access they can never truly compete with Microsoft Office. This is an easy one to imagine: simply select which documents you want to have locally. Later, open your browser and navigate to docs.google.com, working from anywhere you want, even without a network. When you are done and the network reappears, press the “Sync” button to send your changes back to the server. A more sophisticated UI might be available to merge changes that others have made to documents while you were offline.

Blogger

I love to blog. I commonly have an inspiration for a blog post while walking around, and carry my laptop in my backpack. Many times I am at a book store, coffee shop, or friend’s house, and would love to quickly write a new blog post or lightly edit an existing one even if I don’t have a WiFi network, which is very common. I know there are custom desktop applications I can download to work offline, but I don’t want to learn a new user-interface. Why can’t I use Blogger offline? Imagine a Blogger that works with the Dojo Offline Toolkit; when I start Blogger, it automatically downloads its UI and my most recent blog posts into Dojo Offline. Later, when inspiration hits, I can simply pop open my laptop, open my browser, and navigate to blogger.com; the blogger.com UI magically appears, informing me that I am working offline. I can now write a new blog post or edit the ones stored locally inside Dojo Offline. When I hit the network again, I simply hit the “Sync” button on the Blogger page, which uploads my new and edited posts to the server.

How Dojo Offline Works

Problem: How can a user access a web application’s user-interface while offline?

I have been working on this problem for years, trying many different configurations. The solution provided by the Dojo Offline Toolkit is surprisingly simple. We don’t need to adopt radically different or exotic programming models, such as loading Single Page Applications like TiddlyWiki from the filesystem, adopting Adobe’s Apollo framework, or downloading huge, entire web servers with specialized application logic that run locally, such as Zimbra’s offline solution.

Instead, Dojo Offline’s answer is to simply use a very small, standard web proxy that runs locally. Web proxies are perfect; they speak standard HTTP/1.1 between a web browser and a server, caching files that ask to be cached for later access without hitting the network. Many companies run a web proxy on their networks, caching commonly accessed pages; why can’t this web proxy run on a user’s local machine, caching a web application’s UI for offline access? A web server can simply turn on standard HTTP/1.1 caching headers for its user-interface files, which the proxy dutifully caches. If the browser comes up but the network is down, the local web proxy simply hands back its cached UI files. Even better, the proxy will automatically update any of its cached files when they change, based on their caching headers, which means the UI gains auto-update for free — no new standards are needed.
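
For example, a server might mark one of its UI files as cacheable with ordinary response headers like these (an illustrative response; the header values are invented):

    HTTP/1.1 200 OK
    Content-Type: text/javascript
    Last-Modified: Tue, 02 Jan 2007 10:00:00 GMT
    Cache-Control: public, max-age=86400
    ETag: "ui-v42"

While the network is up, the proxy revalidates such files with ordinary conditional GETs (If-Modified-Since/If-None-Match); while it is down, the proxy serves them straight from its cache.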

How do we configure the web browser to talk to our local web proxy? We use Proxy AutoConfiguration (PAC), a standard from the late nineties that is not widely known but has deep, mature support in all browsers. A PAC file is a small bit of JavaScript that is invoked on each browser request. This JavaScript can decide how to resolve the address, either by talking directly to the web site or by using a proxy. For Dojo Offline, we only want to talk to the local proxy and cache files for Dojo Offline web applications, not for all web sites, so that we don’t fill up the hard drive. Our PAC file will therefore talk to the local web proxy for any domain names that want to work offline, and will ignore the proxy for all other addresses; this is a simple JavaScript if/else statement in the PAC file. We programmatically register our PAC file with the user’s browser. This PAC file is actually generated dynamically by the local Dojo Offline proxy.
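
A minimal sketch of what such a generated PAC file could look like is below; the offline-enabled host names and the proxy port are placeholders (8123 happens to be Polipo’s default), since the real file is generated dynamically by the proxy:

    function FindProxyForURL(url, host) {
        // Send requests for offline-enabled applications through the
        // local Dojo Offline proxy...
        if (host == "mail.google.com" || host == "www.blogger.com") {
            return "PROXY 127.0.0.1:8123";
        }
        // ...and let every other request go straight to the network.
        return "DIRECT";
    }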

How does a web application add itself to the PAC file so it can work offline? We have to be very careful here. We don’t want to create an attack vector to the user’s local computer by having the web application “talk” to localhost, such as “http://localhost:1234/add-web-app?url=mywebapp.com”, or make it possible for one web application to spoof another and have it added to the PAC file against its will. The entire focus of security for Dojo Offline is to keep the surface area of trust as narrow and small as possible, constraining privilege to just the small web proxy, which only runs on the loopback address and never touches the real network — everything else must use standard domain names, forcing them into the browser’s standard, restricted web privilege level. Further, the Dojo Offline Toolkit’s proxy is completely generic and does not have to be tailored for individual applications.

Dojo Offline’s PAC file initially contains a single, magical bootstrap domain name, “offline.dojo.web.app,” that a web application can invoke to add itself to the PAC file. The PAC file routes any request for this domain to the local proxy, and the Dojo Offline proxy checks the referer (sic) header for the domain name to be added offline. Normally the referer field can be spoofed, but there is no way for a web application to spoof the referer field from inside the web browser. The predefined offline.dojo.web.app domain name also exposes other services a web application can use, such as knowing whether it is on- or off-line. Access to these services is mediated by a thin, easy-to-use Dojo Offline JavaScript API, bundled with the web application itself.
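
To make the bootstrap step concrete, here is a hypothetical sketch of the opt-in from the web application’s side; the /enable-offline path is invented, since the real API had not been published at the time:

    // Hypothetical sketch: a page served from mywebapp.com opts itself in
    // for offline use. A simple image beacon works across domains; the PAC
    // file routes the request to the local proxy, which reads the
    // unspoofable Referer header ("mywebapp.com") to learn which domain
    // to begin caching.
    var beacon = new Image();
    beacon.src = "http://offline.dojo.web.app/enable-offline";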

The web browser cannot tell the difference between being on- or off-line, since the proxy serves up the UI either way. Dojo Storage can save hundreds of kilobytes or megabytes of application-level data, and is keyed off of the domain name for security; because the domain name never changes, Dojo Storage is likewise “tricked” into not knowing the difference, and the same data store is accessible either way. Applications can use this persistent, megabyte-capable store for all offline data needs, accessing the same information whether they are on- or off-line.
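
As a rough illustration of what this looks like to application code, here is saving and loading a value with the Dojo Storage API of that era (Dojo 0.4); the key and value are invented:

    // Save a draft email locally; the user may be prompted to grant
    // storage permission the first time.
    var draft = { to: "alex@example.com", subject: "Widget X pricing" };
    dojo.storage.put("pendingEmail", draft, function(status, key) {
        if (status == dojo.storage.FAILED) {
            alert("The user denied storage, or an error occurred");
        }
    });

    // Later, on- or off-line, the same call returns the same data:
    var saved = dojo.storage.get("pendingEmail");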

The last step is to wrap the Dojo Offline Toolkit into a small installer for each target platform, and to have it start silently at system startup. The download size will be only 100 to 300K, making it extremely easy to download and try; an uninstaller will also exist for each platform, bundled with the download. Everything is automated, hands-off, and easy.

The important pieces of Dojo Offline have already been prototyped and found to work; all that remains is engineering work. An off-the-shelf, open source (GPL), C-based web proxy named Polipo will be used, saving the months of development time that creating a custom HTTP/1.1 proxy would require. Polipo compiles to only 150K and is portable to Windows, Linux, and Mac OS X; it is the smallest, most full-featured web proxy available. There are a few bugs in Polipo that will be cleaned up for Dojo Offline. The open source NullSoft Install System (NSIS) will be used for the Windows installer, while Linux installation will be through Firefox’s Cross-Platform Installer (XPI) technology; the Mac OS X installer technology has not been determined yet.

Development Details

I hope you are as excited about the prospects of the Dojo Offline Toolkit as I am. Here’s the plan:

SitePen will be sponsoring my full-time development of Dojo Offline over the next three months.

The goal will be to get running code as quickly as possible, generating prototypes that are iteratively refined and made more reliable with each pass, starting with Windows. Regular builds and demos will be posted, and Dojo Offline status reports will be blogged on my weblog every Monday.

The final deliverable will consist of:

  • the Dojo Offline proxy
  • installers and uninstallers
  • PAC file generation and registration
  • the Dojo Offline API for easy, application-level access
  • the Dojo Offline web-based installer UI for downloading Dojo Offline
  • documentation
  • a sample application, Moxie, modified to work with Dojo Offline
  • QA and bug fixing

I will target Windows and Mac OS X/x86 initially, with builds for Linux/x86 and Mac OS X/PowerPC if time allows. I will also explore whether automatic network up/down notification is feasible.

I will be attempting to maintain copies of most of my documentation inside the HyperScope, to see how it performs in the context of a project. This will be useful for folks who want to point deeply into the Dojo Offline Toolkit docs using granular addressability.

Comments

  • You’re just a genius, Brad. Thanks to SitePen for helping you with this great project.

  • Ah, you’re too nice JB! Thanks for the kind words :)

  • Go go go Brad, good luck, and thanks to you and to SitePen!

  • I like what you’re doing with this, good luck.
    With two-thirds of PCs sold now being notebooks, using web apps offline is clearly going to be important. My guess is that doing neat synchronisation between sets of data is going to be the hardest bit (e.g. calendars); is that being left to the individual application, or will there be some supporting tools?

  • ff

    Great idea! I hope you have a lot of success in this great project!

  • Excellent news!

    Not addressed in the post, what kind of code will devs need to write in order to “re-post” form submissions when the app goes back online? Can these be automatically replayed in order, or is the stored data submitted all at once, or…?

    FYI, OSX has a standard installer, but most Mac folks would rather just have a self-contained .app bundle they can drag to their Applications folder. Depends on what you need the installer to do, I guess.

  • Awesome! Finally, an open-source implementation of an off-line client toolkit! I’ve been seriously wanting this.

    My thought was to simply use a servlet server, e.g. Tomcat, like what the eXo Platform does. That doesn’t sound too different from what you are proposing.

    I also thought of modifying the proxy.pac file, but be sure you continue to support the existing proxy.pac. Don’t forget about auto-discovery.

    For Dojo, I’d expect you to be able to avoid the native application installation, since Scrybe seems to be able to do that. Of course, Scrybe could simply be leaving that part out of their demo video. This is one reason why I think we need a WebOS.

    Sorry for the long comment, I’m so excited I couldn’t contain myself. :)

  • This is really awesome news! It’s the stuff like offline support that make Dojo the Javascript toolkit! Good luck!

  • I am already using Dojo. I was thinking of using Apollo to convert my web-based application to a desktop one, but now that the Dojo Offline Toolkit is coming, I would definitely like to check it out first. Brad, best of luck.

  • Michael Schall

    This sounds very cool. I’m excited to see how the project unfolds.

    Any thoughts about using the Wix toolkit (http://wix.sourceforge.net/) for the Windows install? It would create an MSI instead of a setup .exe. MSIs are nicer in the corporate world, as they can run with elevated privileges; a lot of people are not administrators of their machines.

  • How many web applications can really maintain functionality without that live connection to the server-side logic and data? And that’s not to mention the incredible security concerns that arise. This is such a cool idea, but it seems that you’re following the MS mentality of valuing function over security.

    In my opinion, anything that gives a web application elevated permissions to my local system/resources is a BAD idea outside the realm of an intranet business application.

  • Excellent stuff! I’m very happy to see this happening.

    A few things come to mind:
    1. I already use PAC for our corporate proxy detection. It would be most excellent, then, to retain or respect existing proxy configurations.

    2. What’s more, I’m running a locked-down desktop where a user does not have access to change proxy settings. Is there an alternate entry point, such as URL-based access to the proxy e.g. http://localhost:32891/site=gmail.com?

  • Beautiful.

    This would open up many opportunities in the mobile Health Care space. I will explore some potential products to develop based on this platform.

    Question: Wouldn’t this require a client side MVC framework?

    Keep up the good work guys.

  • Hey Brad, sounds great. Any thoughts on increasing storage limits? As you know, we are developing an offline version of Zimbra’s AJAX web client and currently use Derby to back our storage layer. This enables us to sync and index multi-GB mailboxes. Using a common storage access layer like Dojo would be helpful and might save us time, but the current storage limits prevent that. Also, any idea what the APIs to access the storage will look like? Could this be made pluggable for those who want something more than get/put?

  • Great news, and looking forward to seeing Dojo Offline Toolkit develop.

    Taking a similar approach in web2os, but added in a javascript / spidermonkey MVC framework and SQLite database in the proxy. Haven’t set up Dojo server-side on the proxy yet, but will give it a go.

  • EdFrench: Dojo Offline will just address the offline part for now; a Dojo Sync package is a future possibility.

    Jadon: I actually used a local servlet container, a modified version of Jetty, for a project a few years ago called P2P Sockets (http://p2psockets.dev.jxta.org/). Unfortunately, a major requirement for Dojo Offline is a very small download, 100K to 300K, and the only way to achieve this was by using the C-based Polipo proxy. Last month I explored other options, including gcj and compiled Python, but they are still too large and have cross-platform issues. Web Proxy AutoDiscovery is interesting, but complicated for me to support. Unfortunately, there is no way I know of to support an existing PAC file; I was going to clobber it, and save its value so it can be restored during uninstallation. Do you have a better idea on what to do here? For Scrybe to do true offline access, they most definitely must have some kind of download; they are using Flash storage for local storage, but Flash doesn’t have native offline access. There are some browser cache tricks you can do to have offline access without a download, which Julien Couvreur discovered and which I explored last year in conjunction with Dojo Storage, but they are not reliable enough for deployed apps.

    Ritesh: The impending arrival of Adobe Apollo is definitely one of the major reasons for the Dojo Offline Toolkit; I want to make sure the web itself is competitive with Apollo and Microsoft’s WPF — gotta keep things open and non-proprietary :)

    Michael: Thanks for the installer info; I didn’t know the difference between MSIs and setup.exe’s. I’ll check that package out.

    Andrew: In the blog post I gave some examples of web apps that would be very useful offline, with example functionality. Also, I am actually very focused on security as well. The Dojo Offline Toolkit is a generic proxy that is not application specific; web applications use it by working through the browser’s normal security policy for foreign domains, such as http://www.foobar.com, versus working with a localhost or file: URL, to minimize the capabilities of the web application.

    Mike: I’m not sure how to work with a pre-existing PAC file configuration; I wasn’t going to address that during this three-month interval, but if you have some ideas that are realistic for me to implement I might reconsider. A super locked-down desktop might also be difficult to work with. If you’ve got some fresh ideas I’d love to hear them.

    Najier: We don’t need a client-side MVC framework, because the browser is mostly tricked into thinking it’s still talking to a web site; you can use any Ajax toolkit you want, including Dojo, YUI, and other libraries that run predominantly on the client side. When the Dojo Offline Toolkit comes out, it will be a focus of mine to keep its JavaScript API small so that you can easily combine it with any JavaScript/Ajax library you are already using; you can simply mix it in.

    KevinH: Storage limits are based on the underlying storage provider, which allows several megabytes once the user has given permission. I plan on doing a bunch of benchmarking over the next three months to see how scalable the storage is, based on the discussions we have had through email and the blogosphere about how Zimbra used things. I want to know how performance is affected as I increase the number of records (Q: how does Dojo Storage work at 100 records? 10,000? 100,000?). I will test this across the platforms and browsers I normally test on (Windows, Linux, and Mac with Safari, IE, and Firefox). About the API, that is my first task. I am going to start with real mocked-up applications, to keep me honest. I’d _love_ to know more about your real-world requirements, because there are only three real offline web apps that currently exist: yours, SalesForce.com, and Scrybe (which is mostly a demo movie at this point). Can we talk on the phone today (Wednesday)? My # is 510-938-3263.

    To everyone else: Thanks for all the kind comments and excitement about the project! I really appreciate it.

    Best,
    Brad

  • Ooh, I haven’t used PAC for anything since hacking Greasemonkey-alike functionality in the nineties. Great revival, Brad, and good luck! :-D

  • Unfortunately, there is no way I know of supporting the existing PAC file; I was going to clobber it, and save the value for returning it to its old value during uninstallation. Do you have a better idea on what to do here?

    Corporate intranets won’t work great if you don’t continue to support their PAC files. You could simply wrap the existing FindProxyForURL function with your own implementation. That will avoid you needing to cache/proxy sites that you don’t process. You’ll need a JavaScript interpreter to run the existing function so that you can determine the right URL for the real host yourself. If you don’t process the existing PAC file, you’ll just break browsers around the world.
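
    Roughly something like this (an untested sketch; the generated file would paste the original corporate PAC body in under another name, and isOfflineEnabled() is a stand-in for however the offline host list gets checked):

        // The original corporate PAC logic, renamed and embedded verbatim.
        function corporateFindProxyForURL(url, host) {
            /* ...body of the existing corporate PAC file... */
            return "DIRECT";
        }

        function FindProxyForURL(url, host) {
            // Offline-enabled hosts go to the local Dojo Offline proxy...
            if (isOfflineEnabled(host)) {
                return "PROXY 127.0.0.1:8123";
            }
            // ...everything else is delegated to the corporate rules.
            return corporateFindProxyForURL(url, host);
        }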

    Since you are going to have to do a trusted install, I still don’t understand the 100-300K requirement. It isn’t like the proxy can be stuck in the Dojo client-side library.

  • Jadon: How many corporate intranets use PAC files? Is it in use by major companies, like Sun?

    About the size, the reason I want to keep it small is that users hate downloading things; Flash was very successful by keeping its plugin download to about 700K.

  • Andrew Grey

    If I remember correctly this is what Morfik demonstrated/talked about in 2005 – offline Gmail.

  • I work at Texas Instruments and we use a proxy.pac file hosted on a web server. I think this must be common or it wouldn’t be such a prevalent feature in the browser. I’d say TI is pretty big (over 30,000 employees).

    700K sounds like a fair target. I’d personally give priority to the architecture roadmap, rather than hitting a particular implementation target. I can see the need to get something out there that is practical early too.

    Is the hope for long-term compatibility with other storage standards? Will using Dojo give us all a wrapper we can use for remote storage (using something like Amazon S3), local storage with your browser plug-in, local storage in Firefox 3.0 (which I believe will be Moz Storage), and whatever standard ends up in all browsers?

    If there is a well-defined interface to the locally-hosted proxy server, I’d think I could still provide a Java+JavaScript-based solution that was compatible. If you read my blog, you’ll know I think standardizing that interface is critical.

  • Wonderful project. I lose connection with the Internet not only on an airplane, but in many other ways, especially outside the US. Is it possible to choose BOTH local storage and website storage at the same time, a kind of tandem or dual-server approach? That way I can go happily forward whether it is the Internet or my local machine that suddenly becomes unavailable.

  • Great project Brad!
    Keep up the good work!

  • Kevin, SitePen has been doing use-case analysis on what the requirements of an offline email client might be, such as GMail or Zimbra; is it really necessary to download _all_ emails? We have two major use cases right now:
    * Read your newest emails (perhaps 100)
    * Create a new email or respond to an existing one

    I know that the Zimbra client supports search across emails when offline; while useful, the technological requirements to support this offline are large, as you have found: you need a local copy of Lucene, indexed versions of your full email box, plus the emails themselves to display once Lucene finds them. I don’t think the user need for offline search is strong enough to justify its technological heaviness.

    With the two requirements above, you can do quite a lot offline within the technological limits of what Dojo Offline and Dojo Storage provide. Based on your experience with offline email and Zimbra, do you think this is enough?

    Best,
    Brad

  • Cool to see that you’re always involved in interesting projects Brad: is it one more step towards the Paper Airplane browser you envisioned 2 years ago?-)

    P@

  • Hi Patrick! Great to hear from you; I’ve been following the good work you’ve been doing for Google the last few years.

    It would be nice to build something as collaborative as Paper Airplane someday (btw, Patrick is talking about a research project Hoang Dinh and I did a few years ago: http://codinginparadise.org/paperairplane). Dojo Offline isn’t really connected to that vision; perhaps in the sense of pushing the browser in new directions it is, but not in terms of human collaboration.

    Are you in the Bay Area now? We should get a beer sometime if you are; you should drop by the coworking space too (http://coworking.pbwiki.com). Call me at 510-938-3263.

    Best,
    Brad

  • Andy: The web application itself could choose to replicate everything both locally and remotely; in fact, I hope it does, since that’s one of the use cases of Dojo Offline :)

    Best,
    Brad

  • Jadon: I explored what it would look like to put an S3 storage system behind Dojo Storage. It’s an interesting idea, but I realized it pushes the framework in a direction it’s not designed for. Dojo Storage is a generic API that needs to accommodate all of its storage types; if I allow remote, network-based storage, I have to make a lot of the methods asynchronous, such as the get() method. I think that level of generality would make it a pain to use, even though it seems cool on paper, so I’ve decided to make Dojo Storage be just about local storage rather than a completely generic storage system that could be remote as well.

    Best,
    Brad

  • Jadon, BTW, Dojo 0.4.1 now comes bundled with both a FlashStorageProvider, which uses a hidden Flash file for storage, and a WhatWGStorageProvider, which will automatically use Firefox 2’s native storage abilities. There is actually a third storage provider in the Dojo repository, a FileStorageProvider, which uses ActiveX or XPCOM to store files if a page is loaded from a file:// URL. I finished this provider, but I actually think I’m going to deprecate and remove this code. It’s cool, but I don’t think anyone will actually use it, and I want to reduce the bloat of Dojo Storage (since it will be used by Dojo Offline). See this page for full details on the available Dojo Storage storage providers: http://manual.dojotoolkit.org/WikiHome/DojoDotBook/Book50

    Best,
    Brad

  • Andrew: Do you have some pointers to actual screenshots of the Morfik offline GMail? I looked on the net but can’t find any. I looked over the Morfik developer documentation, but it doesn’t seem like they have solved the hard parts of having a web application that is occasionally connected; I know you can run your app either locally or remotely, but I don’t see much help in there on the real-world UI and programming issues involved in doing so.

    Brad

  • Michael Schall: Wix is amazing; thanks for pointing me to that. I was going to use NSIS, but I think I will use Wix + MSI for the Windows Installer.

  • Brad,

    Sorry, I hadn’t dropped by this thread in a day and didn’t have a chance to call. I think the use case SitePen and you have described is a bit different from what we envision. While it may be more of a power-user or enterprise use case, it’s a core use case we want to cover.


    A user has a multi-gigabyte mailbox. Folders have tens or hundreds of thousands of items. Search, tags, etc. are the primary way you interact with email. My mailbox has around 40K emails, and I use search to pull out unread items and the various conversations that group related items. Our use case is that you are a busy exec who gets on a plane and flies for 10+ hours. Having just 100 messages really won’t let you take a few hours and catch up on mail. You at the very least need a big portion of your inbox, and ideally need search to navigate it as if you were online. Maybe this is really only a power-user need, and most users won’t be offline for this long or need so much of their mailbox in a consumer app. For long trips I use POP to get my entire gmail account offline. Using Zimbra I get both my Zimbra mail and gmail (through our POP aggregation) in a fully searchable manner, just as if I were connected. Any mail sent or composed while offline will get sent when I reconnect.

  • KevinH, does a user really need multiple gigabytes of all their email at all times? That seems like a use-case requirement that will force you to use pretty heavy-duty technology on the client side; if you relax it, you can use a much simpler implementation. Is this based on actual studies you have done? It seems like hitting the low-hanging fruit of what offline email access needs is much easier.

    Thanks for responding!

    Best,
    Brad

  • Jon Smirl

    You don’t need to use Lucene. Rhys is on the right track with SQLite which now has full-text search support.
    http://www.sqlite.org/cvstrac/wiki?p=FtsOne

    Build a special version of this for Firefox. You can build the transparent proxy into Firefox with a few hundred lines of C. Then the browser cache and the proxy cache become the same cache. Since Firefox is around you can take advantage of it shipping with SQLite and Javascript to add those features to the off-line server. Check out Rhys’ model http://www.web2os.com/

    I’ve built a system similar to what you describe but never did anything with it. I even used Polipo in the first version. If you do the Firefox version, an IE user can download it and run the local proxy using the Firefox libraries without ever actually running the browser. Piggybacking off the Firefox libraries is simply an efficient way to get JavaScript, proxy, and SQLite support installed.

  • Jon, lots of good ideas in your post. I need to take a closer look at web2os. It’s not clear to me whether that project is open source or not, and what its platform and browser support is. Thanks for pointing me to it though.

    Best,
    Brad

  • Hey Brad;

    Just surfing the Web to keep up to date with your project. Seems there is a great amount of interest in this project. Good luck and much success.

    Love ya;

    D@D

  • Jon Smirl

    My thinking went along these lines. The first pass was to build a local transparent proxy. Polipo was perfect for this. But then I ran into the problem that off-line apps’ storage needs to be segregated. Plus the off-line app needed some kind of programming support. This caused me to add SpiderMonkey and SQLite to Polipo.

    Next I started talking to Brendan (Mozilla) about this and he said the download size was unacceptable. These conversations led to building the off-line proxy on top of the Mozilla run-time. This eliminates 99% of the download. It also gets around corporations restricting the install of binaries since most have approved the Firefox binary.

    Note that the proxy app really has nothing to do with Firefox, it is only using the Firefox libraries. Mozilla has already done the work of cross platform porting and locking Javascript into a secure sandbox. It’s also a pain to build a different one of these apps for Safari, IE, Opera, etc. Then you end up with a different Javascript engine in each one.

    I found it simpler to turn off all caching in the browser (Mozilla or IE or whatever) and then configure the browser to always use the local proxy. Local proxy is only bound to the loopback device so it is not visible on the net. Local proxy then takes over the cache function of the browser.

    To implement something like offline GMail: you browse to gmail.com, and a page loads into your browser. Embedded JavaScript uses some method to test for the presence of the offline cache. If the cache is there, it hits some URLs which bring down pages of JavaScript with a new MIME type. The proxy server caches these pages, and when they are served it executes them instead of serving them. The executed code builds the browser pages by reading the local database. This code has now been inserted into the middle of the browser/server session; it can watch all of the interaction with GMail and build a local cached copy of the mail database. Of course there are a bunch of other ways to accomplish loading the offline cache with code.

    Adding the offline proxy to the Mozilla download only adds a few KB of code. They are not averse to building it into the browser download once it is proven secure.

    There is no money in the base platform and I can’t see building a startup around it. It would be much better if the people in this space agreed on a common, open source platform and then concentrated on building the apps that load into it.

    I stopped working on this because I got involved with an unrelated startup doing embedded Linux work.

  • Brad, Jon,

    Web2os will be released as open source, to coincide with the public beta. The platform support is Windows, Mac (universal binary) and Linux, with broad browser support – Firefox, IE, Safari, Opera (i.e. anything that can talk to the proxy).

    The current version is focusing on what the architecture and api should look like, together with the security and deployment model. Getting the open platform right, and building the first set of apps for it, are the main priorities – the optimisation of the codebase and download size will follow once the api is fixed and we have the feedback + experience of running real apps.

    The initial set of applications that I’m working on falls into two categories:
    – new hybrid applications that make use of the platform api
    – offline and desktop capability added to existing web applications

    For these two categories, I am currently testing:

    – adding offline and desktop integration to Google web apps (calendar, mail, docs and spreadsheets). The approach is similar to what Jon describes, with code running in the middle of the browser/server session on the proxy. The proxy can also add the required code to existing apps without having to change any of the client or server code.

    – examples that mirror demo applications based on the Adobe Apollo platform, since it’s a good exercise to compare the same type of application over different (competing?) platforms.

    – desktop integrated web apps which make use of the web2os security model for Applescript, COM objects and cross-site ajax.

    I’d also like to see how best to help / incorporate Dojo support both for the Offline Toolkit and for proxy-level Dojo javascript applications. A development path that follows Jon’s suggestion of using the Firefox libraries should ultimately give a less-than 300k download if Firefox was already present.

    Rhys

  • Jon Smirl

    Note that a local transparent proxy is similar to how Akamai works, it may be useful to study how you write Akamai edge server apps.

  • Hey Brad, good to see you two Ajax Experiences ago — and great to see your brilliant work here! Proxy Auto-Config was a one-week hack by Ari Luotonen and myself back in the Netscape 2 days, but it has proven insanely useful in many settings (e.g., Sandia uses a seriously big PAC file). You should be able to wrap the existing FindProxyForURL function somehow, and call it instead of returning “DIRECT” from your function.

    As Jon notes, I have been arguing for offline support to be added to the browser tier, rather than inventing a new tier for web programmers to learn (sure, there is new API, but it’s a delta to the existing DOM and browser APIs; it’s not a different execution model and set of APIs for a loopback proxy or local server). Small download increment is critical, and it sounds like you’re on track there. With a few hundred K download, we could bundle your work with Firefox 3.

    We should talk more tomorrow — phone or email, your choice.

    /be

  • Brendan, thanks for the kind words! I appreciate it; oh, and thanks for Proxy Auto-Config as well ;)

    I don’t think I will wrap an existing PAC file for the first release; I think that’s a bit too ambitious. I’ll try to hit that in a point release after I iron out other more important things first, like getting the API right, making sure my understanding of what offline UIs look like is correct, making sure the proxy is reliable, etc. However, I will save the URL for the old PAC file so it can be restored after uninstallation.

    I’m in complete agreement with you about not creating a new programming model. The Dojo Offline Toolkit is really meant to be a very small shim, a tiny set of APIs that can be added to existing Ajax/DHTML-oriented JavaScript applications. While I respect what Rhys is doing with his Web2OS framework (which is very cool and elegant, by the way, Rhys!), the Dojo Offline Toolkit is not meant to be the ‘next new programming API’ like Apollo strives to be. As I was telling Dylan a few days ago, I hope you and the browser folks put me out of business in a few years by bringing this functionality into the browser itself. While the implementation behind the Dojo Offline Toolkit is a small web proxy, I’m not really going to expose that through the API — it should be a cleaner set of methods than that. The API won’t require a new MVC-style framework or buy-in to a whole new programming abstraction, for example; it should be pretty straightforward and not require the programmer to drink any significantly new kool-aid. Very evolutionary.

    I’d love to talk to you this week; let’s actually talk later in the week, after I’ve done some more API design. My big task last week was to design a consistent offline UI across different web applications, since the UI is the most important thing. I want to make sure I don’t create tech that isn’t backed by a reasonable UI. What SitePen and I did was create on-paper offline UIs for several representative online apps that we actually use: Gmail, Blogger, Google Docs, and a corporate portal. SitePen’s goal is to find the commonality between these different offline UIs, so that we can tailor a small, focused framework and drop-in UI. We’re mostly done; you can see some of the raw thinking and mockups of an offline-based Gmail here: http://codinginparadise.org/clients/sitepen/dojo_offline/SPOT_Mockups/overview.html We’ve got a bit more work to bring some of the mockups from paper onto the computer, and I think I want to add an explicit Work Offline/Work Online toggle.

    Starting on Tuesday, SitePen and I will probably start to do some API design; I want to understand the kind of developers who will use this API, what their most common needs are, what are the easy things that should be easy, and perhaps what are the hard things that should be possible. I should have a better handle on the API by the end of the week; perhaps we can talk by phone on Thursday or Friday. My # is 1-510-938-2363 (PST). If you have some feedback before then that can help tailor the UI or API, definitely post it here for now.

    Thanks Brendan! :)

    Best,
    Brad

  • BTW, just to make sure folks understand: the local web proxy is completely generic. We don’t use it for data storage; that’s Dojo Storage’s domain, which actually uses a hidden Flash file, or native browser storage in Firefox 2, to give a programmer the illusion of a persistent, storable hashtable. Instead, the web proxy just caches the user-interface files, like the JavaScript, CSS, HTML, etc. The web proxy doesn’t have any magic, application-specific code you have to put there, like Apollo, Web2OS, etc. would require; you don’t have to write JavaScript to some new programming model. Instead, the web proxy has some small hooks to securely indicate that you want to start caching the UI files of some web app based on its domain name, such as mail.google.com. While offline, your web app stores its _data_ into Dojo Storage, such as your emails, blog posts, etc., and the browser is ‘tricked’ into thinking it’s still online because the UI files (CSS, HTML, etc.) are simply returned by the local web proxy.

  • Jon Smirl

    If you are storing all of the data in a simple hash, how do you generate data that needs to be computed, for example searching the data stored in the hash? The model still needs to work if I transfer the one million messages I have in my gmail account into the off-line storage. What if I want to store GPS waypoints and search them? etc…

    As for Brendan’s comment on building off-line capabilities into the browser: that would be great in a perfect world, but how are we going to get a certain 800lb gorilla to cooperate? If the off-line capability is in a separate module like web2os, the gorillas don’t need to agree.

    From my perspective I would like to see the system architected to handle large off-line apps from the beginning so that we don’t have to rev through a bunch of APIs with increasing capability.

  • MASA

    Does it have to be on a different port than 80? My network blocks almost all other ports (imagine how hard it is to work with online stuff).

  • Jon: gorillas have to compete in the ecosystem too. In a perfect world there would have been no IE stagnation for years after Netscape died; on the other hand, in the real world, we shipped Firefox and took back market share from IE, which was unthinkable according to most realists. It didn’t require perfection to do that.

    As for offline via yet another large-download runtime, the problem of distribution is challenging. Over-designing for uncommon scale will take longer and result in a big download. Better to get the high-usage (especially among leading-edge sites) browsers adopting a thin layer. In the worst case, until the gorilla competes, the extra download burden is imposed only on its users.

    Searching stored data can be done with SQLite in Firefox 3. Whether this level of API will make it to other browsers remains to be seen.

    Speaking of thin, if Brad’s proxy is just for avoiding cache misses when offline, then Firefox 3 should not require it. We plan to extend the link tag, cache controls, etc. so that an offline-enabled app can ensure that the content it needs will be resident when it is offline, never missed or evicted from the cache.

    /be

  • Jon Smirl

    I’m happy with a model based on using the Firefox download to service IE users. But that model requires running a separate proxy task. The extra code to build the proxy on top of the Firefox run-time was less than 100K in my attempt, and probably could be made a lot smaller.

    The question becomes: where does the application code run, in the browser process or in the proxy process? I think it has to run in the proxy in order to service IE, Opera, etc. from a single programming environment. Since it is the Firefox environment, it makes things like SQLite available to all of the browsers.

    Plus we might get them to try Firefox too this way.

  • Jon Smirl

    Here’s another way of explaining the concept.

    Let’s take the existing Firefox code and rearrange it. Divide it into two piles: browser and off-line support. Now build the two piles into two separate processes with a transparent proxy pipe between them. Rearranging Firefox this way won’t change the size of the download at all. Normal Firefox users shouldn’t notice the change.

    Now an IE/Safari/Opera user wants to run a slick web app built on the Firefox model. They download Firefox and only run the proxy task. Point their favorite browser at the proxy task and now those browsers can run the slick web app too. No need to get the gorillas to agree on adding a new API to their browsers.

  • Jon: without repeating my argument against yet another tier, I want to note that it’s not the case that other browsers can’t be extended to support offline apps in-process. IE is perhaps the most extensible. Plus, competition means browser vendors who are early to the party do the work for us, avoiding the need for big extension downloads.

    Let’s talk turkey, preferably in a mailing list or newsgroup. I don’t agree with your premise that browsers can’t be extended, only faked with a proxy; I don’t see the point of creating a completely new tier, distinct in execution model, security policy, and APIs.

    /be

  • Unfortunately, putting a proxy (even a loopback one) between Firefox and its HTTP cache will hurt performance. If you duplicate the cache code, then besides the footprint cost (we use a static build), you also impose MP locking costs on cache accesses.

    Wouldn’t it be simpler to just use the one true cache? IE’s cache has a COM API that can be used. Safari and Opera can be hacked too, or just play catch-up. Market share leverage means Firefox first, with IE emulation in a small (but not as small as the Firefox incremental cost) download.

    /be

  • Jon: Transferring your one million emails and searching over them for offline access is not a use case. That’s just too hard to do right now; I’m hitting the low-hanging fruit instead, and we can work incrementally toward the harder stuff like that. Instead, you can derive a lot of value by downloading a few hundred of your newest emails, perhaps including their attachments, and being able to respond, read, and create new emails offline.

    About the 800lb gorillas: if Firefox builds offline access into the browser, then the Dojo Offline JavaScript API will simply use the native browser capabilities when they are present, while requiring the offline proxy for Internet Explorer and Safari when they are not. In fact, this is already what Dojo Storage does: it uses a hidden Flash file for permanent storage in IE, but uses native browser storage for Firefox 2+.

    About architecting for the high end versus the low end: I believe in evolvable applications and platforms. I believe that the low end iteratively develops over time to take over the high end. In fact, if I start too large and tackle too much, the surface area of the API will be so large that no one will adopt it. If instead I hit the low-hanging fruit that is useful, hits real needs, and is clean and small, then I can grow over time to tackle harder areas like search (though I’d bet that the use case for folks needing something like that is perhaps 5%, which might not justify the generic, horizontal approach Dojo Offline provides).

    MASA: The proxy doesn’t run on port 80; it runs on your local machine, so a firewall or NAT shouldn’t affect your app’s communication. In fact, when the web app is back on the network and starts synchronizing, everything will happen over a normal HTTP channel, probably port 80 if the web application is served from port 80. The only thing you will need is local install privileges, and probably admin privileges as well so that the Dojo Offline proxy can start on system startup. If you don’t have those there isn’t much I can do.

    Brendan: the distribution problem is a very hard one, which you focus on as well. That’s why I’m keeping the download small, and making the proxy generic, so that once one app has installed the Dojo Offline Toolkit, it is available for any other web app (there will be a standard way to ‘query’ whether Dojo Offline is running, so that it doesn’t have to be installed if it is already there). I’m also hitting the low-hanging fruit, and not over-engineering for this release – keep it focused and keep it simple.

    Also, you are correct that the proxy is mostly for avoiding cache misses; if I wanted to get my hands dirty, I could go into Firefox and create a mechanism to indicate that some resources should not be cleared when you clear the cache, since they are ‘offline’ accessible or something. Also, I believe the usability of the Work Offline switch accessible from the File menu is flawed and confusing; if you were to modify the cache, don’t require the user to indicate they are offline. In Safari, you don’t have to indicate whether you are on- or off-line in the browser; instead, the cache will return a file if it has the right HTTP/1.1 headers and is cached, even if you are on the network. Much better usability. In general, I want all of Dojo Offline to simply disappear into the browser; we don’t need a new set of infrastructure. The web proxy is simply a small, focused hack to get offline access soon. If Safari and Firefox absorb its APIs in some form or other, the web proxy will only be needed for IE and older versions of Firefox and Safari.

    About extending the link tag, are you thinking about creating a new ‘rel’ type, such as Internet Explorer’s old ‘Offline’ one, that could refer to some XML file listing resources to be used offline (they used CDF in their case)? I’m thinking about perhaps doing something similar for the API design next week. I’d love to hear your thoughts on that. It could just be a very simple XML file, kept dumb and simple, that has a list of URIs to be placed into the cache for offline access. My concern, though, is that this usurps HTTP/1.1’s cache-control rules; however, it’s much easier for programmers to add a list of files to some XML file than to modify their web server to return the right caching headers on all of the resources that must be cached offline. What do you think?
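
    Concretely, I’m imagining the manifest could be as dumb as something like this (a purely hypothetical format, just to make the idea concrete):

        <!-- Hypothetical manifest, referenced from a page with something
             like <link rel="offline" href="offline-resources.xml"> -->
        <offline-resources>
          <uri>/index.html</uri>
          <uri>/scripts/app.js</uri>
          <uri>/styles/main.css</uri>
        </offline-resources>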

    Jon: I don’t believe any application logic for these kinds of things should run in the proxy; I believe this is insecure, creates distribution problems for web apps (how do you get it into the proxy?), and creates a new programming model that I think will be difficult to get people to use.

  • Brendan: Let’s say that instead of installing Polipo and a PAC file, the Dojo Offline Toolkit modifies the cache-control mechanisms of Firefox 1.5+, Internet Explorer 6+, and Safari. What do you think this would involve for Firefox? Are there XPCOM interfaces that expose that level of fine-grained control, or would I have to do some serious C++ surgery on the cache system? What’s the situation for Internet Explorer? You mentioned there are COM interfaces, but do I have control over preventing the entire cache from being wiped out by the user through the Internet Explorer settings dialog, or by things like Norton, which come configured to wipe out the cache when the user closes Internet Explorer? I had toyed with this design approach when I started to put together the DOT design, but thought it might be difficult to come up with a design for how a web app indicates it wants its resources placed truly offline to avoid cache misses. Do we add a new x- HTTP header? Do we just use standard HTTP/1.1 caching semantics with a link rel=”Offline” file that is much simpler than CDF? I’d love to hear your ideas on the implementation details of ‘hacking’ this into IE, Safari, and Firefox, so that it can be done on the existing web with the Dojo Offline Toolkit download. Can I ensure it will be reliable, and that it won’t be a maintenance nightmare for me? If it is relatively simple, it means the DOT download would be even smaller.

  • Kevin

    Perhaps I missed the point about how to associate the PAC file with IE or Firefox. If I recall, there are two ways for the PAC file to be discovered by the browser: the first is to check “Auto-detect proxy settings for this network” (on Firefox), and the second is to set the “Automatic proxy configuration URL.” In the latter case, the URL points to wherever the .pac file is located on your network; in this case, that would be the proxy server you describe. In the former case, the browser uses the Web Proxy Auto-Discovery Protocol (WPAD), which won’t know to look at your local proxy server. Either way, it sounds like there needs to be manual configuration on the browser side to point at the PAC file. Does the user have to manually configure the browser, or have I missed a major point?
    Thanks in advance, Kevin

  • Jon Smirl

    Brad, you can get a page into the proxy by defining a new MIME type. When the proxy encounters these pages, it runs them instead of passing them through. You can disable the browser cache by altering pages as they pass through the proxy to add a no-cache header. Security is not as bad as you think, because everything is isolated by domain and you have the normal Firefox JavaScript sandbox, which has been well tested.

    Brendan, in the Firefox case you could optimize out the loopback: make it still look like a proxy and use shared memory to move the pages. I do wonder if a couple of milliseconds in the cache code is going to make much difference compared to rendering time. On a dual core machine they can happen in parallel.

    Everything boils down to adding another tier or not. I’m for adding the tier to achieve uniformity across all browsers/platforms. Adding a tier also makes it possible to build much larger apps.

  • Hi Kevin; during installation, I use a variety of ways to auto-register the PAC file with the browser — forcing the user to do this manually would obviously not be very usable. On IE, I set a registry setting during installation. I haven’t completely determined the method for Firefox, but I can probably append it to one of the preference JS files that Firefox keeps around. For Safari, there is an AppleScript way of setting this during installation.
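    For the curious, the PAC file itself is just a bit of JavaScript that the browser evaluates for each request. A minimal one pointing everything at the local proxy might look like the following sketch; the port number is only an example, standing in for whatever the bundled proxy actually listens on:

        // Route all requests through the local web proxy; if it is not
        // running, fall back to a direct connection.
        function FindProxyForURL(url, host) {
            return "PROXY 127.0.0.1:8123; DIRECT";
        }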

  • This sounds like a really interesting idea; my first thought was “will it work with HPC?” I have a web-based application that eventually needs a data entry component that works when not attached to the server. They want to use HPC devices (a Dell Axim, if it matters) to enter this data. The only current option is to go with a traditional client app, unless this could provide the offline capability.

  • Hi Mark; tell me more about HPC: do you mean High Performance Computing, or some kind of high-availability thing? Tell me more about how you might envision the Dojo Offline Toolkit working with your setup, so I can see if the API and UI I’m putting together this week work with your needs.

    Brad

  • Brad: Mozilla’s plans are sketched at http://wiki.mozilla.org/OfflineAppsSummit2006, and I believe there is already preliminary work happening based on that summit plan, in the code and bugzilla. You will see the [link rel=”offline”] tag mentioned there. I will give you a call on Thursday, unless you’d rather email sooner.

    Jon: if you define “tier” to mean “Ajax library, possibly including a small loopback proxy for certain browsers”, then sure — everything is a tier. document.getElementById is a tier. C’mon, there’s a huge difference between *extending* the browser’s DOM and other APIs, and *creating* a new local server programming model with its own execution rules, security mechanism and policies, scripting languages, APIs, etc.

    Obviously we all, especially Brad, want uniformity across browsers. That says absolutely nothing about whether the uniform new APIs can be implemented in-browser or out-of-browser. Implementation depends on browser-by-browser extensibility costs vs. the out-of-browser solution’s download size, and other trade-offs. Means vs. ends.

    The success of Ajax libraries, in particular Brad’s work on dojo.storage to go beyond what browsers can do, proves that you can extend the browser model without foisting a novel middle tier on developers and users.

    I completely agree with Brad about not engineering for a hard case (megabytes of text to index locally). We’ll get there, but not by grand planning or over-designing the first version, creating an unwanted programming and content (new MIME type?!) tier, and risking a big download size.

    /be

  • Jon Smirl

    Brendan, I don’t want this to go on forever, but I see two philosophies for this problem and both schemes have their merits. My viewpoint is that the browser is a presentation engine with a little bit of scripting to make the presenting better. I would actually remove some of the things that are in Mozilla today and move them into the transparent proxy.

    I believe that applications always belong in a server. In the off-line case the server is hiding inside the proxy engine. Using proxies is just a good way to switch between the servers.

    I can’t even conceive of someone trying to rebuild all of MS Office inside a browser in off-line mode. But I can see someone building it inside the off-line server and then using the browser for the UI. Apps always expand without limit. The local server model allows for that. I’m worried that we’ll build something that isn’t scalable and then have to do it all over again in a couple of years.

  • Jon: you may be right that a separate local server would divide labor more effectively for some apps, or at some scale — or both. Time will tell. If you build something powerful along these lines and people download it, more power to you.

    My disagreement is not just about how to proceed (I’m with Brad: small evolutionary jumps). I think your last paragraph misapprehends web apps, offline-enabled or not. Web apps won’t *ever* do anything like “all of MS Office”. Nor should they.

    Also, your last sentence expresses a common worry, that Parkinson’s Law will bite us in the long run. Yet we don’t know exactly what the long run really looks like, what its requirements are. If we guess wrong and malinvest in a “big system” solution, we’ll both take too long and miss the target.

    Look at Chandler (OSAF) for something that tried to design for five years out, and cover numerous cases. There are other examples, arguably the Mozilla suite among them. They take too long and miss the target.

    On the other hand, who says applications always belong in servers? I’m an old Unix hacker. I say servers are just processes, and so is the web client. It may or may not help scale to put application code in a separate process from the browser. But right now, we have web apps that download more than just presentational logic to the client. We wish to help some of those web apps operate offline. The shortest distance looks like the right path.

    /be

  • Jon Smirl

    Brendan, I am in the grid model camp: build a set of interconnected servers and then have a thin presentation layer for accessing them. A local proxy server is just another node in the grid. Do this right and the code can even migrate. For example, how are you going to do an off-line mashup with the in-browser model?

    As for experience in designing the environment: we have been designing server environments for twenty years, so I’d hope by now we have a pretty good idea what features this environment needs. I view the in-browser model as the one breaking new ground.

    I made some posts about using Rhino and Sun’s mobile JVMs instead of Tamarin. That would be a very interesting model if you could get Sun to fix the license so that the JVM can be incorporated into Mozilla.

  • Brad,
    Thanks for the response.
    Sorry for the confusion; I am talking about the Handheld PC / Pocket PC.
    http://www.microsoft.com/windowsmobile/ may be a good place to go for specifications, to see if it is even possible.
    The model I have has 640×480 resolution; there are also several cell phones that use this operating system.

  • I would like to know why a proxy is needed instead of just using the browser cache. I know you mentioned that the proxy serves content when there are cache misses, but with proper cache header directives on server responses, shouldn’t cached content be available through the browser’s native cache? Are there gotchas with the native cache? Is it that cache size limits let the browser purge cached content (or that the cache can be cleared manually)? Or is the intent to avoid having to alter cache header directives (since Dojo is a frontend framework, not a backend one)?

  • “I believe that the usability of the Work Offline switch accessible from the File menu is flawed and confusing; if you were to modify the cache, don’t require the user to indicate they are offline. In Safari, you don’t have to indicate whether you are on- or offline in the browser; instead, the cache will return a file if it had the right HTTP/1.1 headers and is cached, even if you are on the network.”

    This is actually the case in Firefox and IE, right now. The “offline status” is only useful to the app so that it doesn’t attempt to do XmlHttp requests to the server (which are going to fail and make the browser prompt you to go offline).

    With the Dojo Offline proxy, should the application use navigator.onLine to check the network status, or should it use a Dojo API? If you use a Dojo API, I’m assuming that the proxy would have some smart connectivity detection logic, but how does that information get communicated from the proxy back to the JavaScript running in the browser?

  • Julien: I’m avoiding hooking into the browser’s File > Work Offline flag because of its usability issues, as you point out; plus, if it is on, the browser won’t talk to the local proxy, which I don’t want. This means I won’t be checking navigator.onLine.

    For the Dojo API, I’m still putting this together. I _might_ do automatic detection against the remote website to see if the link has gone down, but I’m not sure yet; this is hard to do reliably, especially cross-OS.
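    To sketch what I mean by automatic detection (one possible approach, not a committed design, and ignoring the IE ActiveXObject variant for brevity): ping the remote site with a cache-busted request and treat a failure or timeout as being offline. The probe URL and the five-second timeout here are assumptions.

        // Probe a known URL on the remote site; call back with true if
        // the server answered, false on error or timeout.
        function checkNetwork(probeUrl, callback) {
            var xhr = new XMLHttpRequest();
            var finished = false;
            xhr.onreadystatechange = function() {
                if (xhr.readyState == 4 && !finished) {
                    finished = true;
                    callback(xhr.status == 200);
                }
            };
            // The timestamp defeats the proxy and browser caches, so only
            // the real server can answer this request.
            xhr.open("GET", probeUrl + "?ts=" + (new Date()).getTime(), true);
            xhr.send(null);
            setTimeout(function() {
                if (!finished) {
                    finished = true;
                    xhr.abort();
                    callback(false);
                }
            }, 5000);
        }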

    Kris: The browser cache will indeed cache resources that have the correct HTTP/1.1 headers, as Julien cleverly discovered, but unfortunately the reliability of this is not strong enough for production applications. There are significant usability problems: what happens when you try to go to the website offline, but a key file is missing? It’s clever, but not robust enough for deployed applications, in my opinion. The proxy is meant to be a more reliable way to achieve this. In the long run, I think browsers will support “pinning” certain items in the browser cache for offline access, as Brendan has suggested.

  • Indeed, reliability is certainly a good reason to use your proxy instead of relying on the browser cache, and it will surely make quite a useful tool. Is it possible to use your proxy without Dojo? Could an application behave just as it would when relying on the browser cache for offline mode, with end users then optionally downloading the proxy to get the reliability of “pinned” files? It seems like it would really increase the breadth of your proxy’s appeal if web apps built without Dojo, but designed around the browser’s cache for offline mode, could include a little download button encouraging users to install your proxy for reliability.
    Does anyone have any idea just how unreliable the browser’s cache actually is, though? Does anyone know the algorithm for maintaining the browser cache? Brendan, do you know for Firefox? I assume (and hope) it is more sophisticated than plain LRU or LFU, since expiration dates and last-modified dates would seem to enter into the equation as well…

  • Jon: in-browser mashups are here, see http://simile.mit.edu/wiki/Piggy_Bank. What’s a “server” and what is a “client” in this Firefox add-on? Does it matter to the user? It’s not as if we need those 20 years of experience building scalable servers just to mash up data for one browser to consume.

    There are many ways to skin a cat. The compelling arguments for an out-of-browser-process “server” are not mash-ups or code migration. I think they come down to browser independence, which has its pluses (works alike with all browsers) and minuses (new tier to program and make secure, plus big distribution problem).

    Sorry, the JVM is too big as well as years too late and the wrong language. Tamarin already does most of JavaScript, which is and will remain the default VM-hosted extension language for browsers (since browsers have to fit on constrained devices, where J2ME may or may not be present, but JS must be if you want to use the full web).

    /be

  • Parting shot at the server special pleading: too much of those 20 years of server experience was misspent on closed-source boondoggling of the three-tier kind decried by Phil Greenspun and the 37signals guys. It’s neither needed nor welcome on my desktop for offline web apps!

    BTW, a full-text-indexing local server is on many desktops already: Google Desktop or the like (increasingly OS-integrated). Why reinvent that wheel?

    /be

  • Kris: There will be a small JavaScript API that web applications bring in to interact with the local web proxy. This will be based on Dojo, including Dojo Storage, but one of my design goals is to keep it relatively small, and to allow it to be used in heterogeneous, non-Dojo environments (where you can mix this small Dojo Offline JS into your YUI, GWT, etc. applications).
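    As a sketch of the kind of surface area I have in mind: the dojoOffline object and its hooks below are invented placeholders rather than a final API, and myApp stands in for your application’s own code; only the dojo.storage call is the existing Dojo Storage API.

        // Hypothetical Dojo Offline hooks a non-Dojo app could wire up.
        dojoOffline.onOnline = function() { myApp.syncPendingChanges(); };
        dojoOffline.onOffline = function() { myApp.showOfflineBanner(); };

        // Real, existing Dojo Storage call for saving application data.
        dojo.storage.put("draftEmail", draftText, function(status, key) {
            if (status == dojo.storage.FAILED) {
                alert("Could not save your draft locally");
            }
        });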

  • Brendan: Adding to the JVM beat-up: it also still takes too long to start up, and it freezes the main UI thread while it does so, creating a terrible usability experience. Plus, Sun has just never gotten how important download size is; it’s simply way too damn big. Do I really need a CORBA ORB in there (perhaps in 1997, when Netscape bundled an ORB from Visibroker in the browser…), or JDBC in the JRE?

    I’ve been a Java developer since the beta days, so I’m not casting stones at something I don’t know about.

    Literally, while writing this, my brother, who I live with, came into my room saying “My computer is messed up! I’m trying to install something called Java, but my computer says ‘too many Java processes are running on this machine, causing errors.’” The errors kept Java from installing. Sigh.

  • Jon Smirl

    There is more than one JVM; I was thinking about using one of the mobile JVMs as a standard part of the browser. These are small, and performance is OK for scripting. Then if someone wants more performance, they could download the full JVM, and the browser would transparently switch to using it. There’s no intention to bring along the rest of Java; the JVM would just be used to implement JavaScript.

    I just see hundreds of man-years of effort in a piece of code that is probably going to be ignored. But I do agree that GPL’ing Java was done too late. Too many people have already chosen other paths to follow.

  • Jon wrote “These are small and performance is ok for scripting.” No, it’s not: http://tinyurl.com/yd6nub (out of date, but not likely to have changed much; note also Igor’s comments about . Note Tamarin vs. SpiderMonkey performance:

    http://www.playercore.com/pub/Tamarin/Avmplus_vs._javascript.htm

    Java has lost to Flash on the browser in many ways (e.g. virtual tours that don’t suck). It’s too late to fix this. It is no coincidence that the JS-like engine in Flash 9 is the one contributed to Mozilla, the one we’re unifying with SpiderMonkey for Mozilla 2. Java is simply not an option.

    /be

  • I wrote “note also Igor’s comments about .” and left off the rest of the sentence: “… about general C vs. Java program performance.” The bulk of all real, competitive browsers is implemented in C++ and C; the Ice browser is dead, RIP. Embedding a JVM to host a JS VM costs way too much, especially in human terms: all the manually maintained root set elements and other GC invariants in the C++ code must be right, or you have an exploitable free memory read.

    I repeat: Java is not an option. As Jon said, “other paths” have been followed, and path dependence effects are strong on the Internet. I suggest that such path dependence is why Dojo Offline Toolkit, which also follows a shorter path than competing systems, will probably succeed over bigger, slower-to-be-finished, costlier-to-deploy alternatives.

    /be

  • Antonio Mota

    This is what I wrote in what was a “hot discussion”, in July 2007:

    “An Ajax app should be capable of working off-line, on “occasionally connected” clients, like laptop computers, or other devices with browsing capabilities and no local storage.”

    “(In an app I’ve made) the user can even connect a laptop to the net, load the data, disconnect, do the work he has to do, and then reconnect to send the data.”

    This “app” is a data-entry grid for a time-sheet application that *in fact* can work offline.

    If someone cares to read the discussion look at this link [link removed by Brad Neuberg because it was too long and broke the UI]

  • Antonio Mota

    That was not in July 2007, of course; it was 2005. Here’s a better link.

    http://groups.google.de/group/ajax-web-technology/browse_thread/thread/e40317a830ad841d/0dd346a49b92e33a

    (if moderation could correct the other post please do so)

  • Brendan,
    Any chance you could tell us what the caching algorithm is in Firefox? Does it incorporate expiration dates, last access, and last modified to determine which files to release from the cache? I am just curious what level of predictability in utilizing cached files is achievable from a web app.
    Brad,
    Is there really any interdependency between the storage and the proxy caching? Is there any reason AMASS (or nothing) couldn’t be used for storage while the proxy simply provides reliable caching?

  • Brad, I just realized that you wrote AMASS, and Dojo Storage is its replacement. Guess you have a monopoly on the JS local storage industry :).

  • Kris, Dojo Storage is completely independent of Dojo Offline and has existed for about a year (a year and a half if you include development on AMASS). It is pretty stable these days and is a good option for persistent client-side storage; it works across about 95% of the existing installed base of the web, cross-browser and cross-platform.
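    Basic usage looks roughly like the following sketch, written against the 0.4-era API and assuming the result handler may be passed as null; check the Dojo Storage docs for exact signatures:

        dojo.require("dojo.storage.*");

        // Save a value; the handler reports whether the underlying
        // provider (Flash, WHATWG storage, etc.) accepted it.
        dojo.storage.put("lastOpenedDoc", { id: 42, title: "Q4 report" },
            function(status, key) {
                if (status == dojo.storage.FAILED) {
                    alert("Could not save " + key);
                }
            });

        // Read it back later, even in a different browser session.
        var doc = dojo.storage.get("lastOpenedDoc");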

  • Kris: Mozilla’s network cache uses LRU-SP (http://citeseer.ist.psu.edu/cheng00lrusp.html) for its replacement algorithm, last I checked, and of course the networking code follows the HTTP spec (http://www.faqs.org/rfcs/rfc2616.html), with lots of real-world quirk fixes for inter-operation (see Mark Pilgrim’s comment at http://www.intertwingly.net/blog/2006/12/13/Tolerance#c1166802157).

    The code can be browsed via http://lxr.mozilla.org/mozilla/source/netwerk/cache/.

    But before you study this code, I should point out that web apps really can’t count on the cache not being blown away by some activity in another tab or window. Wherefore the need to pin cache entries for reliable offline operation, which Firefox 3 will support, and which Brad’s proxy provides for downrev browsers.

    /be

  • Tom

    Very interesting, but I’m really curious how you will solve the “submit” problem.

    Suppose I send an email in Google Mail while offline: the submit cannot be posted, so it must be cached until the network is back. The submit is not processed at that time, so what page should the browser show next? The to-be-offlined application must have an alternate flow for offline mode.

    Then, when all 10 emails (submits) are finally sent and 5 of them fail, how do you handle the failures? Retry until when? Continuations would be an interesting approach here.

    And speaking of submits, what if the website uses Ajax?

    I see a lot of hurdles and am curious how generic the solution can stay. Best wishes!
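    One pattern I could imagine (just thinking out loud, assuming something like Dojo Storage is available for the local queue; none of this is promised by Dojo Offline): record each submit locally, replay the queue when the network returns, and hand failures back to the application instead of retrying blindly.

        // Sketch of an action queue on top of local storage.
        var queue = dojo.storage.get("pendingSubmits") || [];

        function queueSubmit(action) {
            queue.push(action);
            dojo.storage.put("pendingSubmits", queue, null);
        }

        // sendFn posts one action and returns true on success; the app
        // decides what to do with whatever is left over.
        function replayQueue(sendFn, onFailure) {
            var failed = [];
            for (var i = 0; i < queue.length; i++) {
                if (!sendFn(queue[i])) {
                    failed.push(queue[i]);
                }
            }
            queue = failed;
            dojo.storage.put("pendingSubmits", queue, null);
            if (failed.length > 0) {
                onFailure(failed);
            }
        }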

  • C.V. Vick

    Brad,

    This is an idea whose time has finally come. In the past I did something similar: I used the .pac file to “rewrite” URLs when the system was offline, i.e. change http://www.google.com into local.google.com or google.com.local (the DNS names were redirected through the hosts file (ugh)). Certainly not as robust or capable as what you propose.

    Since your project is targeted at mobile systems, you may want to consider other aspects of mobility.

    I’ve been working on some context-awareness libraries that provide information and events relating to losing and re-establishing network connectivity (regardless of media type), power levels and state (AC/DC), storage, display, bandwidth, location, etc. Besides .NET, Java, and C++ interfaces, we also provide JavaScript.

    Anyway, I think it would be interesting to add some of these capabilities to your work. Perhaps we could talk and exchange some ideas?

  • Brad,
    I am trying to start a new pattern section in ajaxpatterns.org for object persistence. I thought you might be interested in contributing to it since local storage is an important aspect of this pattern:
    http://ajaxpatterns.org/wiki/index.php?title=Object_Persistence
    Thanks,
    Kris

  • C.V.: I’ve seen your work with Intel on occasionally connected computing; it’s very good work that I consulted when putting together the UI and requirements for the Dojo Offline Toolkit. It was unclear to me whether these libraries are open source and whether there are licensing fees. Do you have more details?

  • Tron

    I have been playing with TiddlyWiki and the Dojo JavaScript Toolkit. Can you imagine what I have seen?

    No! You cannot imagine it unless you fall in love with TiddlyWiki and give DojoStorageSystem a chance.

  • Tron, I too have been playing with TiddlyWiki and the Dojo JavaScript Toolkit, and I can imagine what you have seen!

    I too have fallen in love with TiddlyWiki and given DojoStorageSystem a chance. :)


  • Tom

    I’m a little confused. On the one hand, the Dojo Storage documentation talks about lots of different storage mechanisms, including Flash and WHATWG. On the other hand, here you talk about using a proxy for storage. Which one is it actually?

    I’m concerned about the use of a proxy for storage because that will likely simply fail for me: I rely on FoxyProxy and pattern-based proxy selection to access various intranets, and the Dojo proxy simply won’t have the necessary information.

  • Tom: we use Dojo Storage for storing application-level data, such as emails, tasks, etc. The local proxy is only for storing UI resources, such as JavaScript, HTML, CSS, etc.

    Also, what you describe is a known issue, where you already have a proxy configured. Here’s what we do now: during installation, we save all of your old proxy settings and add our own; during uninstallation, we restore your old ones. We don’t currently work in a chained-proxy scenario. However, the local proxy has support for connecting to other proxies; I just haven’t turned it on, in order to focus on core functionality right now. Scenarios where you already have a PAC setting will be more difficult to support, but there are possibilities. Even though the core of Dojo Offline is done, we need volunteers for edge cases like that.

    Best,
    Brad

  • dipen

    Runtime client for Linux? The auto-generated page only lists Windows and Mac downloads.

  • PR

    The question becomes where the application code should run: in the browser process or in the proxy process? I think it has to run in the proxy in order to serve IE, Opera, etc. from a single programming environment. And since the proxy is a Firefox environment, it makes things like SQLite available to all of the browsers.
