Inside the Dojo Toolbox

By on September 9, 2008 12:02 am

Building the Dojo Toolbox allowed us to dive into Adobe® AIR™, and to create a blended toolchain of JavaScript, PHP, Python and Rhino (JavaScript on the Java Virtual Machine) for developing an amazing desktop application using open web technologies. Read about how we built the Toolbox and what we really think of AIR.

The Initial Whirlwind

We started in late May with zero experience developing an Adobe AIR application and a goal of getting a great application developed in about one month. We had to figure out what exactly we wanted the Toolbox to do and what it could do. We also had to begin figuring out how to make it all happen. Just about all software projects start out that way, right?

We learned quite a bit in those first few days. For example, our initial plan called for having a plugin system. The idea was that people could make Toolbox plugins and post their plugin code and a descriptor file on their own website. You would point the Toolbox at that plugin, and the code would be downloaded and installed for use. The SitePen-developed Toolbox functionality would also be implemented as plugins.

While it is certainly possible to implement a plugin system in an AIR application, it’s not necessarily a good idea. AIR’s security model makes it impractical to do certain types of operations (e.g. dynamic inclusion of code such as plugins) and we decided that it was more trouble than it was worth. The Toolbox is still modular (each tool is in its own directory and has a descriptor file – just like a plugin would). By not making plugins dynamic, we could save ourselves effort and make it easier for other people to contribute tools that do great things for Toolbox users. I’ll write more about AIR’s security model when talking about the Builder.

Though we certainly experimented with some ideas along the way, our initial few days of discovery and our first milestone build set the stage for all of the work leading up to the Dojo Toolbox 1.0 release.

Some thoughts about Adobe AIR

Within days of starting the project, we already had a toolbox with a functional API viewer and proof-of-concept Builder. This is a strong testimony in favor of the approach Adobe has taken with AIR. Everything we knew about developing applications for the web translated directly into being able to create desktop applications with AIR. Dojo 1.1 added support for Adobe AIR, thanks to Adobe sponsoring the work to make that happen. AIR’s use of WebKit gave us a familiar browser, since WebKit is the basis of Apple’s Safari browser. And, we could use Dojo, giving us a familiar JavaScript tool set.

One of the most noticeable things you’ll see when moving from typical browser-based development to AIR is that you only have one browser to worry about. Dojo does a great job of masking browser JavaScript API differences, but there are still enough differences in CSS and other aspects of application development that it is somewhat refreshing to have only one platform to develop against. Also, since AIR includes WebKit, it has one of the fastest JavaScript implementations around and offers numerous useful experimental CSS properties that you can use in the AIR context. Apple has invested a lot in WebKit development, and AIR will naturally inherit those benefits when Adobe next upgrades the included WebKit.

File manipulation is a core need for the Builder tool, and we were very pleased with the air.File class provided in the runtime. air.File contains a very useful collection of methods that makes it easy to find the directory or file you need and then take action. To get an idea of just how rich some of AIR’s APIs are, take a look at some of the properties and methods from air.File:

applicationDirectory applicationStorageDirectory creationDate
desktopDirectory documentsDirectory exists
extension icon isDirectory
isHidden isPackage isSymbolicLink
lineEnding modificationDate name
nativePath parent separator
size systemCharset type
url userDirectory addEventListener
browse browseForDirectory browseForOpen
browseForOpenMultiple browseForSave cancel
canonicalize clone copyTo
copyToAsync createDirectory createTempDirectory
createTempFile deleteDirectory deleteDirectoryAsync
deleteFile deleteFileAsync dispatchEvent
download getDirectoryListing getDirectoryListingAsync
getRelativePath getRootDirectories hasEventListener
moveTo moveToAsync moveToTrash
moveToTrashAsync removeEventListener resolvePath

Table: air.File provides high-level functionality

Even though Adobe AIR essentially provides a browser interface, there are many times when you’ll want to open a page in the user’s normal browser. After all, AIR is not really a browser as much as it is an application platform for desktop apps. The user’s browser has their bookmarks, passwords and a familiar interface. AIR’s URLRequest object makes it quite easy to create a URL and launch it in the native browser.
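
As a rough sketch of that hand-off, the helper below wraps a URL in a URLRequest and passes it to navigateToURL. It assumes that `air` is the namespace object set up by AIRAliases.js. The helper name openInSystemBrowser is made up for illustration, and the namespace is passed in as a parameter only so the function can be exercised outside the AIR runtime.

```javascript
// Hypothetical helper: opens `url` in the user's default browser.
// `air` is assumed to be the namespace defined by AIRAliases.js.
function openInSystemBrowser(air, url){
	var request = new air.URLRequest(url);
	air.navigateToURL(request);
	return request;
}
```

Inside an AIR page, a link's click handler could call something like openInSystemBrowser(air, link.href) and cancel the default navigation, so the page opens in the user's own browser instead of in the application window.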

AIR includes the SQLite database, which is a great little database. It’s fast, keeps the data in a convenient single file, supports a reasonable subset of SQL and even supports transactions. When SQLite was first added to AIR, they gave it an asynchronous interface, using callbacks for most database operations. SQLite is fast enough that it’s rarely necessary to perform your database operations asynchronously. Thankfully, Adobe added a synchronous interface, making it very easy to use a high-quality database to store your data reliably. The synchronous interface even uses the same basic API as the asynchronous one, so you don’t have to re-learn the API if you need to switch from one to the other.

Late in the development of the Toolbox, Adobe released the AIR Update Framework, which is a library you can use for adding automatic update capability to your application. This package is very well done, and offers both a library to use behind your own user interface and a version of the library with a complete user interface on top. Just plug it in and go!

AIR is a well-designed package and we had very few complaints. As Dojo users, one thing that tripped us up is that we typically use dojo.connect to listen to events. However, many parts of the AIR API are actually ActionScript objects and not JavaScript objects and we found that dojo.connect does not work with that type of interface. However, using the “addEventListener” mechanism works just fine and is not difficult to use. It just means that we sometimes need to be aware of whether we’re dealing with a JavaScript object or an ActionScript object.
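
A minimal sketch of the workaround, assuming an ActionScript-backed object such as an air.File instance: instead of going through dojo.connect, the handler is attached with the object's own addEventListener. The helper name listen and the handle it returns are illustrative and are not part of Dojo or AIR.

```javascript
// Hypothetical helper: attach a handler to an ActionScript-backed object
// (anything exposing addEventListener/removeEventListener) and return a
// handle that can be used to detach it again later.
function listen(target, eventType, handler){
	target.addEventListener(eventType, handler);
	return {
		remove: function(){
			target.removeEventListener(eventType, handler);
		}
	};
}
```

With an air.File object this might be used as listen(file, air.Event.SELECT, onDirectoryChosen) before calling file.browseForDirectory, where onDirectoryChosen is whatever callback the application supplies.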

AIR includes the “AIR Introspector” which is a JavaScript file you can include in your pages that provides something like Firebug for debugging your application. Unfortunately, for those of us who are used to Firebug or Safari’s new debugging tools, the Introspector is not quite as powerful. We really missed the ability to edit CSS on the fly, for example.

There are a couple other things that would have made our lives easier as we created the Builder, such as an easier way to spot sandbox bridge violations or a Gears-like WorkerPool implementation. I’ll talk about these in more detail in the Builder section.

Those complaints are small in the grand scheme of things. AIR provides a powerful environment for the development of desktop applications.

The API Viewer

Of the various pieces of the Dojo Toolbox 1.0, the API Viewer easily has the most moving parts. Dojo’s API documentation generator is written in PHP, and we have a PHP-based system for creating a static export of that documentation. Then, we wrote a Python program that used the wonderful lxml library to reformat some of the HTML and extract the text to build a full-text search index stored in an SQLite database. The result of that Python process gets zipped up and placed on a server. The Toolbox JavaScript code downloads that zip file, and an ActionScript-based library (FZip) is used to unzip it.

The full-text search feature took some effort to implement. We initially used the search index generation and JavaScript search code from the Sphinx documentation tool project. That worked in the prototype phase, but we soon started hitting “out of memory” errors. That system worked by creating a search index in JSON format that gets loaded into memory when you run a search. Not counting extremely common words, the index covers more than 20,000 different words with more than 500,000 occurrences of those words in more than 10,000 files. It was a bit much for the JSON-based format to handle.

The next thought was to use SQLite to store the search index. Some versions of SQLite have a full-text search engine built in. The one that ships with AIR does not, however. Our solution was to modify the indexer script in Python to generate an SQLite database file. We then modified the JavaScript to use the SQLite data rather than the JSON. This demonstrated one of the great benefits of AIR using SQLite: there are bindings for just about any language and all of the major platforms. So, it’s easy to move database files around from other tools into AIR. It turns out that we could run the search query in a single SQL query. That means that the entire search runs in highly optimized C code against an indexed database. The out of memory errors went away and search results are now returned almost instantly.

Below, you can see how this query is set up and what the AIR database API is like:

// Create a "connection" to our database file
var conn = new air.SQLConnection();
conn.open(databaseFile);

// Create a statement, used to query the database
var stmt = new air.SQLStatement();

var query = "select filename, title, " +
	"sum(titlecount)*10+sum(totalcount) as score from " +
	"word_matches, files where files.id=word_matches.file_id " +
	"and word_matches.word_id in (select id from words where ";

// Extend the query for each search term presented by the user
for(var i = 0; i < searchwords.length; i++){
	if(i>0){
		query += " or ";
	}
	query += "word = :word" + i;
}
query += ")";

// Extend the query further for each "excluded" search term
// (ones that start with '-' that mean that they should not
// be included in the result)
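if(excluded.length > 0){
	query += " and word_matches.file_id not in (select file_id " +
		"from words, word_matches where word_matches.word_id = " +
		"words.id and (";
	for(i = 0; i < excluded.length; i++){
		if(i>0){
			query += " or ";
		}
		query += "word = :eword" + i;
	}
	// Close the parenthesized "or" group, then the subselect. Without
	// the inner parentheses, "and" would bind tighter than "or" and
	// only the first excluded word would be tied to the join condition.
	query += "))";
}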

query += " group by filename order by score";

stmt.text = query;

// We use prepared statement style, which avoids SQL injection
// issues. Here, we plug in the actual search words provided
// by the user
for(i = 0; i < searchwords.length; i++){
	stmt.parameters[":word" + i] = searchwords[i];
}
for(i = 0; i < excluded.length; i++){
	stmt.parameters[":eword" + i] = excluded[i];
}

stmt.sqlConnection = conn;
this.showActivity("Loading search index");
stmt.execute();

var results = stmt.getResult();
results = results.data;
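
The query above expects two arrays, searchwords and excluded. The article does not show how they are produced, so the following is only a sketch of one way to split the user's input, assuming terms are separated by whitespace and excluded when prefixed with '-'. The function name parseSearchTerms is hypothetical.

```javascript
// Hypothetical helper: split raw user input into the two arrays the
// query-building code expects. A leading "-" marks a term to exclude.
function parseSearchTerms(input){
	var terms = { searchwords: [], excluded: [] };
	var parts = input.split(/\s+/);
	for(var i = 0; i < parts.length; i++){
		var part = parts[i];
		if(part.charAt(0) === "-"){
			// skip a bare "-" with nothing after it
			if(part.length > 1){
				terms.excluded.push(part.substring(1));
			}
		}else if(part.length > 0){
			terms.searchwords.push(part);
		}
	}
	return terms;
}
```

Because the query always opens a "where" clause for the included terms, a caller would want to skip the search entirely when searchwords comes back empty.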

We also had the problem of distributing the API Viewer documentation set. The first build of the Toolbox included the documentation right in the .air file, which weighed in at more than 25MB. After we moved the API documentation to a separate zip file and removed unused parts of Dojo (a.k.a. “Dojo Mini”), the final .air file came in at just 3.5MB.

The Builder

Dojo's build system runs on Rhino, the JavaScript interpreter for the Java Virtual Machine. Porting the build system from Rhino to AIR required four areas of change:

  1. File handling and other Java-specific portions of the code
  2. Changes to accommodate AIR's security model
  3. A need to display some notion of progress in the user interface
  4. Customizations made to Rhino for ShrinkSafe

The majority of the build system code is written in standard JavaScript. Rather than sprinkling file access code throughout the build system, all of that code was in a single file called "fileUtil.js". This design proved to be a great help in moving the build to AIR, because we just needed to produce a new version of fileUtil that used the AIR file functions instead of Rhino's. As an example, here is the Rhino version of the fileUtil.deleteFile function:

fileUtil.deleteFile = function(/*String*/fileName){
	//summary: deletes a file or directory if it exists.
	var file = new java.io.File(fileName);
	if(file.exists()){
		if(file.isDirectory()){
			var files = file.listFiles();
			for(var i = 0; i < files.length; i++){
				this.deleteFile(files[i]);
			}
		}
		file["delete"]();
	}
}

And here is the equivalent function in AIR:

fileUtil.deleteFile = function(/*String*/fileName){
	// summary: deletes a file or directory if it exists.
	// convert the fileName to an air.File object, relative
	// to the airbuild/buildscripts directory in the Dojo
	// directory being built
	var file = fileUtil.getFileObj(fileName);
	if(fileUtil.fileExists(file)){
		if(file.isDirectory){
			file.deleteDirectory(true);
		} else {
			file.deleteFile();
		}
	}
}

For people coming from a Java environment, the move to AIR is straightforward, and the air.File class tends to allow for more succinct code than you get with Java's file classes.

The second challenge was moving to AIR's security model. AIR provides two different kinds of sandboxes that code can execute in: the "application" sandbox and the "non-application" sandbox. The windows that you see in the Dojo Toolbox all execute their code in the application sandbox. By AIR's rules, that means that they're allowed to access any site on the internet and any files on disk. What that code isn't allowed to do is dynamically evaluate more JavaScript code.

It turns out that the build system dynamically loads Dojo partway through the process. And, of course, in order to actually "build" anything, it needs to be able to write files. In other words, it needs features of both the application sandbox and the non-application sandbox. AIR has a mechanism for this called the "sandbox bridge". To make the build run, we copy all of our AIR-specific build code into a directory called "airbuild" under your Dojo directory. We know this code is safe, because we put it there ourselves. The Builder creates an iframe in the non-application sandbox to run the build, so that it can load the Dojo code as needed. All of the functions from fileUtil get placed onto the sandbox bridge, so that the build code in the iframe can write the files that need to be written. The code below, from airbuild.js, will give you an idea of how the sandbox bridge is populated and passed to the child iframe:

// set up the parentSandboxBridge for the buildframe
var bridge = {
	showActivity: dojo.hitch(this, function(title, options) {
		this.showActivity(title, options);
	}),
	hideActivity: dojo.hitch(this, this.hideActivity),
	buildDone: dojo.hitch(this, this._buildDone),
	buildAborted: dojo.hitch(this, this._buildAborted),
	onBuildFrameLoad: dojo.hitch(this, this._buildKickoff),
	onBuildFrameUnload: dojo.hitch(this, this._buildReset)
};

// copy each function from fileUtil over. Only functions and simple
// objects are allowed, so we can't just put fileUtil itself on
// the bridge.
for(var item in fileUtil){
	bridge[item] = fileUtil[item];
}
bridge.trace = air.trace;

// place the bridge on the iframe window
var buildframe = document.getElementById("buildframe");
buildframe.contentWindow.parentSandboxBridge = bridge;

Getting all of that right was a bit tricky. The sandbox bridge has rules about what can be passed across (only functions and "simple" objects are allowed). We had a couple of hard-to-track-down bugs that turned out to be JavaScript RegExp objects that were converted to just "Object" after crossing the sandbox bridge. Eventually, though, we ironed out the issues and the Builder could both run code and manipulate files.
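
One way to sidestep that kind of bug, sketched here rather than taken from the Toolbox source, is to flatten a RegExp into a simple object of strings before it crosses the bridge and rebuild it on the other side. The names packRegExp and unpackRegExp are hypothetical, and the sketch assumes that an object holding only strings counts as "simple" under the bridge's rules.

```javascript
// Flatten a RegExp into a plain object that holds only strings.
function packRegExp(re){
	var flags = (re.global ? "g" : "") +
		(re.ignoreCase ? "i" : "") +
		(re.multiline ? "m" : "");
	return { source: re.source, flags: flags };
}

// Rebuild an equivalent RegExp from the packed form on the receiving side.
function unpackRegExp(packed){
	return new RegExp(packed.source, packed.flags);
}
```

The sending side would call packRegExp before handing a pattern to a bridge function, and the receiving side would call unpackRegExp before using it.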

If you'd like some more depth on this subject, one of the Dojo Toolbox's developers, Sam Foster, wrote an article about AIR's sandbox bridges.

The next challenge was to present the user with some indication of build progression, and that it was even going at all! The Rhino-based build tool runs in the console, so it just prints out information as it moves along. For the Builder tool, we wanted a progress bar with a text display to give the user an idea of what was happening.

Browser-based JavaScript runs in a single-threaded model. Generally speaking, as long as there is code running, the display is not updated and user interface events are not fired. Running a build takes a while (possibly more than a minute, depending on what you're building and the speed of your computer). The Gears project provides a "WorkerPool" allowing you to run code in a background thread. Unfortunately, though, Gears does not work inside of AIR and AIR itself does not offer a WorkerPool. Our solution was to break the build process up into a collection of functions that were joined together by setTimeout calls. By placing a short timeout between the functions, WebKit has a chance to update the screen before the next part of the build runs.
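
The idea can be sketched as a small step runner. This is an illustration of the technique using the standard setTimeout timer, not the Builder's actual code. The name runSteps, its callbacks, and the 10 ms delay are all assumptions, and the optional schedule parameter exists only so the sketch can be driven without a real timer.

```javascript
// Run an array of step functions one at a time, handing control back to
// the event loop between steps so the display can be updated.
// `schedule` defaults to a short setTimeout.
function runSteps(steps, onProgress, onDone, schedule){
	schedule = schedule || function(fn){ setTimeout(fn, 10); };
	var index = 0;
	function next(){
		if(index >= steps.length){
			onDone();
			return;
		}
		steps[index]();
		index++;
		// report how many steps have finished out of the total
		onProgress(index, steps.length);
		schedule(next);
	}
	schedule(next);
}
```

A progress bar could be driven from onProgress, since each call arrives just before control returns to the event loop and the screen gets a chance to repaint.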

The final challenge in creating the Builder is one that cannot be directly addressed within the AIR environment. Dojo's build system includes a powerful JavaScript compression tool called ShrinkSafe. ShrinkSafe works wonderfully because it uses a customized version of the Rhino interpreter to allow it to work directly with the JavaScript parse tree. This makes it safe and accurate, because it's not working with brittle regular expressions, but rather a true view of the file, as seen by a real JavaScript interpreter. AIR uses WebKit's JavaScript interpreter, so it doesn't have access to the parse tree, and certainly not to Dojo's customizations in Rhino. The only solution to bringing ShrinkSafe into the Builder is to have some kind of process that runs outside of AIR, and that is exactly the approach we plan to take.

The rest of the Toolbox

The Resources tool was clearly the easiest to create, since it is effectively a static list of links. The biggest work there was creating the tabs and the custom styling.

The tool launcher (the main window you see when you start the Toolbox) went through some iterations while it was in development. Initially, it was a native window. As the project progressed and we wanted to add some more polish to the Toolbox, we changed the native window to a transparent window that fills the screen, with the tool launcher itself as a DOM element on that transparent surface.

There are trade-offs to this approach. We chose to implement it this way so that we could use all of Dojo's facilities for drag and drop and animation. Dojo understands how to work with DOM elements, not native windows. Using a native window means writing our own code for things that Dojo already knows how to do. We have found two drawbacks to this approach:

  1. If you have multiple monitors, you can't drag the tool launcher window onto your second monitor.
  2. We have heard a couple of reports from Linux users that certain window managers do not pass events through the tool launcher's transparent windows. This is quite likely just behavior of the current AIR alpha release for Linux and will likely be fixed in the near future.

We will be investigating solutions to the multiple monitor issue.

I love it when a plan comes together

The Dojo Toolbox 1.0 was a fun project and is a good foundation for future development of easy-to-use, graphical tools for Dojo users. Adobe AIR proved to be fertile ground for exploration and a tool powerful enough to keep us going until we reached our goal of a functional Builder and an incredibly useful API Viewer.

Comments

  • Psy

    Hi, I’ve also debuted on Air using Dojo, and my experience is overall good, but I get the feeling that AIR is more suitable for Flex projects than framework-enhanced Javascript applications.
    When you use a third-party framework whose internal code is not familiar to you, the sandbox limitations become a headache. Things sometimes work and sometimes don’t, and you must be very aware of the code flow of both AIR and the framework to understand WHEN a feature can be used without AIR complaining.
    Another thing I’ve noticed is that AIR’s WebKit is not fully implemented. For example, SVG support is nonexistent in AIR’s WebKit, perhaps because Adobe’s preference goes toward Flash.

  • It’s possible that Adobe paid more attention to making the Flex stack work ideally within AIR when launching AIR. They definitely take the JavaScript stack seriously, though, and have been very receptive to input that we’ve had.

    I think AIR’s support for JavaScript apps will only get smoother over time, and it’s not bad right now.

    The sandbox can be a pain, but I can certainly see why it’s there. They don’t want it to be easy for malicious code to move from the web to your filesystem via AIR. Most applications don’t really need to dynamically run code… but some JavaScript libraries may use eval as a shortcut, because it works fine in the browser. You’re right that that kind of thing does get tricky when you move into AIR.

    Your comment about SVG makes me wonder if SVG is actually part of the WebKit project. As I understand it, Google implemented their own SVG support for Chrome. If WebKit just inherently supported SVG, I would think that AIR, Chrome and Safari would have equivalent support for SVG.

    Thanks for the comment!
    Kevin

  • micha

    I think I’ve read that AIR’s getting an update soon, with major work done on the HTML / JS side … if only I could find the darn link …

  • Psy

    Thanks for reply Kevin, and yes, you’re right: seems that the official integration of SVG on webkit (based on KSVG2) is still experimental (http://webkit.org/projects/svg/index.html). That means that you must use Flash or Canvas to implement vector graphics. BTW: dojox.chart rendering on canvas works very well on Air.

    About the sandbox and limitations on Air’s JS, I totally understand why they are there and agree, iframe child sandbox looks like a joke the first time you read it but actually works. I think that the real problem I faced was the wrong assumption that Air is just a JS compiler and will accept anything you throw in.

    BTW: a pic of my first Dojo+Air widget :)
    http://snurl.com/3owuh

  • @micha: Flash Magazine mentioned this:

    “AIR 1.5 does not have a date and it’s also a kind of maintenence release, but it will bring two important updates – Flash Player 10 and a more recent release of WebKit (with video-tag support). There will be some bugfixes as well with this release.”

    http://www.flashmagazine.com/news/detail/air_roadmap/

    WebKit has been evolving very quickly. For example, Squirrelfish landed after AIR 1.0 was released. Just keeping up with WebKit changes will be an evolution of the AIR platform.

    @Psy: Cool looking widget!

  • John Wood

    Hi Kevin,

    Thanks for posting this article!
    I have couple questions:

    1) Did you modify builtin Sqlite?

    2) If it’s not a problem, could you please elaborate on how Python was used to modify the indexer script to generate the FT SQLite DB?

    Thanks,
    John

  • Hi John,

    I don’t believe it would be possible to modify the builtin sqlite. The AIR runtime really doesn’t provide any way to call native code.

    Creating the fulltext search DB was quite easy. The indexer script was already generating “tables” of information in JSON form. I just converted those tables to a relational DB form.

    The code for that is here:

    https://projects.sitepen.com/svn/toolbox/trunk/apigen/index.py

    The database just needs to match up words with the files the words appear in.

    I’d be happy to go into more detail on any part of that, if it would be useful.

    Kevin

  • John Wood

    Hi Kevin,
    Thanks for quick reply. I see that index.py generates new db and index tables using words parsed by sphinx’s beautifulsoup parser. I really like how u use your own ft tables!

    Did you try to use FTS3 that comes with sqlite3?

    How did you package the generated db with air file?

  • John Wood

    Hi Kevin,

    I forgot to ask where in the code do you use index.py?

    Thanks,
    John

  • Hi John,

    FTS3 is an optional part of sqlite, and AIR does not ship with it. If it did, I probably would never have created my own full-text search solution.

    One of the great features of sqlite is that its databases are just single, platform-portable files. After the Python code generates the database, I just package up the index database along with the documentation itself. Then, the API viewer just opens a connection to that database as needed.

    The pavement.py file creates a script called “apigen”. That script runs the main function in apigen/transform.py, which is what uses the indexer code. All of that runs completely independently of the AIR code. apigen generates the search index database and it also massages the API docs that we get to better tune it for what we need in the Toolbox.

    Kevin

  • John Wood

    Hi Kevin,

    Where do you store the index database on windows?

    Thanks,
    John

  • John Wood

    I never used Paver before but it looks like it is very powerful.

    John

  • I think Paver is convenient and powerful, but I’m biased :)

    The index database is packaged up in the API doc zip file. That is downloaded by the Toolbox and unzipped into the “Application Storage” directory (which is located in a platform-appropriate space by Air).

    Kevin

  • John Wood

    Thanks I found it.