Dojo on the client and Python on the server make for a great combination. They’re easy, productive and powerful. In this article, I’ll show you how to use Python + Dojo to cut the number of requests to your server by 95% and simplify development and deployment while you’re at it.

The Sample Project

I wanted to be able to highlight the build aspects of this without the extra noise of a big project, so the sample project is about as simple as it gets. It’s a trivial WSGI application that uses Luke Arno’s static package to serve up a static page that uses a single Dojo function to populate the page content. By using the wsgiref server included in Python 2.5, you can get started with a minimum of fuss.

The first step is to download the sample project from http://www.sitepen.com/labs/code/pydojo/PyDojo-1.0.tar.gz.

Once you’ve done that, you can use the bootstrap script to get a virtualenv set up, so that you don’t need to fill up your system’s Python directory. Virtualenv creates an isolated Python environment based on your system’s Python environment. Anything installed in the virtualenv (using easy_install or python setup.py install) is stored completely in the lib and bin directories of the virtualenv directory you create.

python bootstrap.py

Note that this tutorial was written on a Mac. Linux should work similarly. If you’re running Windows, all of the scripts will be in a directory called Scripts rather than bin. Windows users just need to type Scripts/(command) wherever you see bin/(command).

The pavement.py build script used in the sample project is designed to pull a specific version of Dojo out of Dojo’s Subversion repository. As long as you have Subversion installed and available by typing “svn” at the command line, the following command will get the parts of Dojo that the project needs:

bin/paver dojo

If you don’t have svn, you can download Dojo and copy the dojo and util directories to pydojo/static/js.

After you’ve finished this setup, the final directory structure within the sample project will look something like this:

bin (or Scripts) location for all of the commands in the virtualenv
lib the virtualenv’s library location
pydojo Python package that contains the .py files
pydojo/static Location of all of the static files
pydojo/static/js JavaScript files
pydojo/static/js/dojo Dojo’s core code
pydojo/static/js/pydojo JavaScript for the sample project
pydojo/static/js/util Dojo’s util package, including build scripts

With Dojo in place, you’re ready to start the server:

bin/start-server

Why make a custom build?

Now that the server is running, we can navigate to the page at http://localhost:8080. In your console, you’ll see the request log (times removed to make it more compact):

"GET / HTTP/1.1" 200 334
"GET /js/dojo/dojo.js HTTP/1.1" 200 5410
"GET /favicon.ico HTTP/1.1" 404 13
"GET /js/dojo/_base/_loader/bootstrap.js HTTP/1.1" 200 16242
"GET /js/dojo/_base/_loader/loader.js HTTP/1.1" 200 24962
"GET /js/dojo/_base/_loader/hostenv_browser.js HTTP/1.1" 200 11555
"GET /js/dojo/_base.js HTTP/1.1" 200 325
"GET /js/dojo/_base/lang.js HTTP/1.1" 200 8506
"GET /js/dojo/_base/declare.js HTTP/1.1" 200 6754
"GET /js/dojo/_base/connect.js HTTP/1.1" 200 10110
"GET /js/dojo/_base/Deferred.js HTTP/1.1" 200 12386
"GET /js/dojo/_base/json.js HTTP/1.1" 200 3968
"GET /js/dojo/_base/array.js HTTP/1.1" 200 7616
"GET /js/dojo/_base/Color.js HTTP/1.1" 200 4859
"GET /js/dojo/_base/browser.js HTTP/1.1" 200 633
"GET /js/dojo/_base/window.js HTTP/1.1" 200 4429
"GET /js/dojo/_base/event.js HTTP/1.1" 200 16900
"GET /js/dojo/_base/html.js HTTP/1.1" 200 40635
"GET /js/dojo/_base/NodeList.js HTTP/1.1" 200 18218
"GET /js/dojo/_base/query.js HTTP/1.1" 200 36053
"GET /js/dojo/_base/xhr.js HTTP/1.1" 200 22454
"GET /js/dojo/_base/fx.js HTTP/1.1" 200 17295
"GET /js/layer.js HTTP/1.1" 200 28
"GET /js/pydojo/core.js HTTP/1.1" 200 145

Wow. For a page that says “Dojo is working!”, that’s an awful lot of requests. It’s also more than 260K.
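If you want to tally a log like this yourself, here is a minimal sketch. It assumes log lines in the compact format shown above; the summarize function and the three-line sample are illustrative and not part of the sample project.

```python
import re

# Matches the compact log lines shown above:
#   "GET <path> HTTP/1.1" <status> <bytes>
LOG_LINE = re.compile(r'"GET (\S+) HTTP/1\.[01]" (\d{3}) (\d+)')


def summarize(log_text):
    """Return (request_count, total_bytes) for the matching lines in log_text."""
    count = 0
    total = 0
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            count += 1
            total += int(match.group(3))
    return count, total


# A three-line excerpt of the log, used only to exercise the function.
sample = '''
"GET / HTTP/1.1" 200 334
"GET /js/dojo/dojo.js HTTP/1.1" 200 5410
"GET /favicon.ico HTTP/1.1" 404 13
'''

requests, size = summarize(sample)
print("%d requests, %d bytes" % (requests, size))
```

Fed the full log above, the same function counts 24 requests and 269,830 bytes.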

But that’s actually just fine, as long as we’re in development. We want well-organized, commented, fully readable code to give us more power and flexibility during development.

It’s worth noting that the requests above are what we see with a Dojo source release. If you use a prebuilt version of Dojo or Dojo from AOL’s Content Delivery Network, all of those /js/dojo requests are reduced to a single, 26K gzipped request.

When we deploy our application, we want to reduce how much data the user needs to download and, just as importantly, we want to reduce the number of requests. Browsers limit how many requests they will make in parallel, and some set that limit as low as 2. Given the latency of each request, grabbing that many separate files will really slow things down.
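To get a feel for why the request count matters, here is a rough back-of-the-envelope sketch. It assumes every request costs the same latency and that requests go out in full batches of the parallel limit; the 100 ms latency and the 4-request page are hypothetical numbers chosen for illustration, and real browsers behave in more complicated ways.

```python
import math


def estimated_wait(num_requests, max_parallel, latency_seconds):
    """Rough estimate of time spent on request latency alone.

    Assumes requests are issued in batches of max_parallel and that
    each batch costs one round of latency. Transfer time is ignored.
    """
    rounds = int(math.ceil(num_requests / float(max_parallel)))
    return rounds * latency_seconds


# 24 requests (as in the development log above), 2 parallel connections,
# and a hypothetical 100 ms of latency per request.
dev = estimated_wait(24, 2, 0.1)

# The same page if it needed only 4 requests (hypothetical).
built = estimated_wait(4, 2, 0.1)

print("development: %.1f s, built: %.1f s" % (dev, built))
```

Under these assumptions the latency cost scales directly with the number of batches, which is why collapsing many small files into a few large ones pays off even before compression.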

OK, let’s package it up!

Rather than starting with all of the explanation, let’s jump straight ahead to the fun part and see where we’re going. We want to reduce how many requests are made to the server and also shrink our JavaScript files. Let’s build the project:

Press Control-C to stop the server.

bin/paver bdist_egg

If you’ve used setuptools before, or even Python’s traditional distutils, you’re probably familiar with the python setup.py bdist_* commands. Using distutils and setuptools, it’s non-trivial to override the behavior of those commands. With Paver, making new behavior for bdist_egg is as easy as creating a function called bdist_egg.

This bdist_egg command will first run a Dojo build. Dojo’s build system uses a customized version of the Rhino JavaScript engine and therefore requires Java to run. After running the Dojo build, it will package up the result in a Python egg file. Once it’s done, you can see the egg we ended up with:

-rw-r--r--  1 admin  staff  1068065 Jun  3 14:41 PyDojo-1.0-py2.5.egg

Let’s see what we’ve got

To try out the build, let’s create a new virtualenv so that we don’t install any code globally on your computer.

bin/virtualenv tmp

Virtualenv will create a new directory called ‘tmp’ that is its own private Python environment. We’ll install our newly built egg in there:

cd tmp
bin/easy_install ../dist/PyDojo-1.0-py2.5.egg

Having done that, we’ll now have a start-server command inside the ‘tmp’ virtual environment. Let’s run that:

bin/start-server

Now, navigate to http://localhost:8080/ and let’s take a look at the request log output:

"GET / HTTP/1.1" 200 334
"GET /js/dojo/dojo.js HTTP/1.1" 200 76857
"GET /favicon.ico HTTP/1.1" 404 13
"GET /js/layer.js HTTP/1.1" 200 675

Whoa! That’s a lot better! We have only a small fraction of the requests and the total size has gone from 260K to 76K, and that’s without gzip compression, which would take it down much further still. Not coincidentally, that 76K pre-gzip dojo.js is the same as what you get when you download a pre-built Dojo release. And you’ll note that we didn’t have to change a single line of our code and had only one file to copy over to our servers and install.

And that is why using the Dojo build tool along with Paver, setuptools and eggs is a good thing.

This tiny example project shows some of the benefits of making a Dojo build, but the benefits for a larger project are much, much greater. Applications that make heavy use of Dijits will see an even greater boost due to making a layered build because Dijit templates are pulled straight into the .js file, and CSS files are compressed as well. The techniques used in this article are exactly the same for projects both large and small, and the benefits just increase as your project grows!

Python Eggs

Since 2005, Eggs have been on the rise as a useful distribution format for Python code. An egg file is a zip file that typically contains compiled Python .pyc files, but can also contain platform-specific compiled modules and other resources such as JavaScript files.
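Because an egg is an ordinary zip archive, the standard zipfile module can look inside one. The sketch below builds a small in-memory archive with an egg-like layout so that it runs without a real egg on disk; the file names are illustrative and are not the actual contents of the PyDojo egg.

```python
import io
import zipfile


def list_archive(fileobj):
    """Return the sorted names stored in a zip archive (such as an egg)."""
    with zipfile.ZipFile(fileobj) as archive:
        return sorted(archive.namelist())


# Build a tiny in-memory archive with an egg-like layout.
# These file names are made up for the example.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("EGG-INFO/PKG-INFO", "Name: Example\n")
    archive.writestr("example/__init__.py", "")
    archive.writestr("example/static/js/layer.js", "// built layer\n")

buffer.seek(0)
names = list_archive(buffer)
for name in names:
    print(name)
```

To inspect a real egg, you could pass its path to list_archive instead of the in-memory buffer.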

Eggs and the easy_install command make getting and installing Python packages a snap. Combine Eggs with virtualenv or zc.buildout, and you’ve got a nice, isolated environment for exploring Python packages and deploying them in reproducible ways.

How It Works

There are three main tricks to making this work without requiring any code changes between development and deployment:

  1. Use a layer.js file that brings in your modules
  2. Use pkg_resources to find the static files
  3. Do a directory switcheroo before creating the egg

The app itself is quite small, so I encourage you to look over all of the files. I’m going to focus on the tricks above.

layer.js

Dojo custom builds group files together in terms of layers. We have one layer in our simple app, and it’s in a file called layer.js.

The index.html file requests two JavaScript files: dojo.js and layer.js. dojo.js is obvious. Here’s what’s in layer.js:

dojo.require("pydojo.core");

Yep, one line. But that’s because our app is very simple. It could have more lines, or the additional required modules could be listed in pydojo.core. The basic idea with this module is that during development, it’s a bunch of dojo.require statements to load up all of the JavaScript our app needs.

After the build, however, layer.js will actually contain all of the required JavaScript itself. That’s how we can greatly reduce the number of requests. The generation of this all-inclusive layer.js is described by the build.profile.js file:

  dependencies = {
  	layers: [
  		{
  			name: "../layer.js",
  			dependencies: [
  				"pydojo.core"
  			]
  		}
  	],
  	prefixes: [
  		["pydojo", "../pydojo"]
  	]
  }

This file tells the Dojo build system that we want to generate a file called layer.js that should include the “pydojo.core” module and all of the modules pydojo.core requires. We also tell the build system where to find pydojo.

Finding the static files

setuptools provides a module called pkg_resources that helps you locate files that your app needs, whether you’re in development or deployed in production. In pydojo/app.py, we can see how the WSGI application is put together and exactly how the static files are found:

import pkg_resources
from static import Cling

_static_directory = pkg_resources.resource_filename("pydojo", "static")

app = Cling(_static_directory)

After the imports, we use pkg_resources.resource_filename to find the location of the “static” directory within the “pydojo” Python package. When we’re in development, that will point to the directory in our development tree. In production, it will point to the directory where the egg has been installed.

The next line defines the WSGI app itself, which is an instance of static.Cling rooted at our static directory.
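The start-server script itself isn’t shown in this article, so the following is only a sketch of how any WSGI app can be handed to the standard library’s wsgiref server. The hello_app function is a stand-in for the Cling instance, and serve and call_app are hypothetical helper names.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Stand-in WSGI app; a real project would pass its own app instead."""
    body = b"Dojo is working!"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def serve(app, host="localhost", port=8080):
    """Serve a WSGI app with the reference server. Blocks until interrupted."""
    server = make_server(host, port, app)
    print("Serving on http://%s:%d" % (host, port))
    server.serve_forever()


def call_app(app, path="/"):
    """Invoke a WSGI app directly, without opening a socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(hello_app)
print(status)
```

Calling serve(hello_app) would start the server on port 8080; call_app is handy for checking an app quickly without a browser.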

The directory switcheroo

Finally, we get to the part where we build Dojo and then build our egg. The sequence that we follow in this step is:

  1. Run the Dojo build, putting the output into the dojoBuild directory
  2. Move the pydojo/static/js directory to a temporary location
  3. Move our dojoBuild directory into pydojo/static/js
  4. Collect up a proper list of files to package up for the bdist_egg command
  5. Run bdist_egg
  6. Switch our directories back

All of this is quite straightforward, particularly with Paver’s built-in path support (using the path.py module by Jason Orendorff). Here is bdist_egg as defined in pavement.py:

@task
@needs('build_dojo')
def bdist_egg():
    """Package up the egg with a built Dojo."""
    curdir = path.getcwd()
    dojoBuild = curdir / "dojoBuild/dojo"
    jsdir = options.dojo.destination
    tempjsdir = curdir / "jsDir"
    # start by moving the built Dojo into place
    jsdir.move(tempjsdir)
    dojoBuild.move(jsdir)

    # we need to update the "package_data" information with the filenames
    # of our built Dojo. Currently, package data can only be updated
    # by changing the Distribution object, because the options.setup
    # information has already been read in by the time this runs.
    runtime.dist.package_data=setuputils.find_package_data('pydojo', package='pydojo', 
                                    only_in_packages=False)


    call_task("setuptools.command.bdist_egg")
    # now move the directories back
    jsdir.move(dojoBuild)
    tempjsdir.move(jsdir)

The code above is responsible for the directory switching, gathering up of the file list and calling the bdist_egg command. Running the Dojo build itself is just a matter of running the same command that would be run on the command line. This is the build_dojo task that is referred to in the @needs line for bdist_egg:

@task
def build_dojo():
    """Runs Dojo's build system."""
    curdir = path.getcwd()
    jsdir = options.dojo.destination
    builddir = jsdir / "util/buildscripts"
    info("cd %s" % builddir)
    builddir.chdir()
    if sys.platform[:3] == "win":
        build_cmd = "build.bat"
    else:
        build_cmd = "./build.sh"
    sh('%s action=release optimize=shrinksafe profileFile="%s" releaseDir="%s"' 
        % (build_cmd, curdir / "build.profile.js", curdir / "dojoBuild/"))
    curdir.chdir()

All the build_dojo task really does is run the build script with the options that we need.
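If you want to check the command line without actually running a build, the string construction can be pulled into a small function. This is a sketch rather than part of pavement.py; dojo_build_command is a hypothetical name, and the paths in the example are placeholders.

```python
def dojo_build_command(platform, profile_file, release_dir):
    """Return the Dojo build command line for the given sys.platform value."""
    # Windows uses the batch file; everything else uses the shell script.
    script = "build.bat" if platform.startswith("win") else "./build.sh"
    return ('%s action=release optimize=shrinksafe profileFile="%s" releaseDir="%s"'
            % (script, profile_file, release_dir))


# Placeholder paths, for illustration only.
command = dojo_build_command("darwin", "build.profile.js", "dojoBuild/")
print(command)
```

Note that "darwin" does not start with "win", so a Mac correctly gets the shell script.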

Deploy away!

In this article, I have shown how simple it is to automate building an egg that includes a custom-built Dojo and is ready to deploy in production. The actual deployment is up to you, because there are many ways to serve up WSGI applications, but I hope that now you’ll find it easy to get your app bundled up and ready to go.

Bonus! Serving the static files directly

While eggs make for an excellent way to package up your application and move it around, you often don’t want Python serving up your static files. nginx, Apache and lighttpd can serve static files much, much faster than Python. I found an easy way to serve up the static files from your eggs, at least on systems that support symbolic links.

The exact details of how you set it up depend on how you deploy your application (buildout? virtualenv? mod_wsgi? mod_proxy?). But the idea is simple: you create a symlink that points to the static directory in your egg. Make sure that your pavement.py includes a zip_safe=False option in the setup section of the options. Eggs can be installed as a single zip file, and the symlink trick won’t work then. By setting zip_safe=False, the easy_install machinery will unzip the egg on installation.

Whenever you update your application or start it up or whenever it’s convenient, you just run the create_symlinks function below (replacing the rebuild_link line at the end with a reference to your static directory).

import os
import pkg_resources

def create_symlinks():
    exists = os.path.exists
    join = os.path.join
    abspath = os.path.abspath

    sl = "static-links"
    
    if not exists(sl):
        print "Creating the static-links directory"
        os.mkdir(sl)


    def rebuild_link(name, package, path):
        # lexists also catches a dangling symlink left behind by an old egg
        if os.path.lexists(join(sl, name)):
            print "Removing old %s symlink" % name
            os.unlink(join(sl, name))
        static_dir = pkg_resources.resource_filename(package, path)
        print "Creating new %s symlink" % name
        os.symlink(abspath(static_dir), join(sl, name))

    rebuild_link("pydojo_static", "pydojo", "static")

Then you just configure your web server to route static requests directly through that symlink. Make sure you turn on the follow symlinks option, if your web server has one (Apache does, for example).