We Are Django

I’ve always thought that the community is one of the greatest things about Django. Not because Django isn’t great but because the people are just so awesome! Django People is a brilliant new site by Simon Willison and Natalie Downe which brings together Djangonauts all over the world. (Interestingly, I was talking about exactly this same idea yesterday with my friend, in the lines of “there should definitely be something like that”. Great minds think alike (I wish) 🙂

I had the pleasure of meeting Simon and Natalie during the Europython conference last summer. Like every other Django people at the conference, they were super-nice and fun to share thoughts with. The various conversations we had with the Web-gang there were definitely one of the highlights of my last year. I hope that Django People will help in finding more same-minded people, both near you and when traveling around the world.

If you are Djangonaut, add yourself to the site! And when you do, please include your picture and some information about yourself. It’s so much nicer to look at real faces than empty rectangles 🙂

Here’s to many more Django friends.

Offline Development With Django

Coming to Django from the PHP-world, running a local development server (as opposed to Apache or a full LAMP-machine set up for just testing) and doing real offline development is something that takes a little bit of learning. After two years of active development with Django, I’d like to share some of my learnings.

Why Offline?

There are many benefits for developing your site someplace other than the same server which powers the site. I’m sure most of us do development this way.

Generally speaking offline development could mean any development that doesn’t happen on the production server. The meaning for offline development in this article is more literal: by offline I mean literally offline, that is without [requiring] a connection to the Web.

A well configured development environment helps you write better code efficiently — anywhere.

On a side note, don’t blame me if you end up coding Django your whole winter vacation at an idyllic remote cottage 😉

Best Practices

In the same way that Django lets you separate models, templates and views, it also lets you easily separate production and testing environments. Django also offers several tools for local development, such as the built in Web server and DEBUG-mode. In short, Django encourages you to follow best practices.

One thing that I’ve been trying to unlearn is the PHP-esque way of doing small modifications on the production server and at the same time accidentally breaking the site from two other places. Luckily when using Apache and modpython the lure of doing this is a bit smaller because every modification to a python file needs a server restart. By keeping the development strictly off the production server, the probability of breaking something on the live site reduces significantly. (Because you _do want to test your changes before deploying, right?)

Loosely related to local development is version control. When working with version control, you generally don’t want to check in non-functional code. That means that you must test the code before you check it in. Having a local development environment helps with this 🙂

Prepare Your Site — Thoroughly

I’ve been doing web development for nearly ten years now. Everything I do goes trough dedicated testing servers and version control. I’ve always thought my sites to be well prepared for offline development. Then, in spring 2007, I had some problems switching ISPs and I was cut off from web for two weeks. (What a long two weeks they were 🙂 Turns out this was a very good thing since I discovered tons of problems while trying to work really offline.

Use the Settings, Luke

Using different settings for development and production makes it possible to do truly offline development. At work we keep different settings for every development machine in the project-root and separate them by naming convention of settings-hostname.py and settings-production.py. We then symlink the appropriate settings on the machine as settings.py and everything Just Works. You might also want to learn to keep things portable.

Furthermore, there is a good tip in Django documentation about limiting serving static files to DEBUG=True. The given example is a bit un-DRY, though. And it also adds the static url as last element of the URLconf, which sometimes just doesn’t work. Here is what I use:

 from django.conf.urls.defaults import * from django.conf import settings  if settings.DEBUG:     # Serve all local files from MEDIA_ROOT below /localmedia/     urlpatterns = patterns('',         (r'^localmedia/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.MEDIA_ROOT, 'show_indexes': True}),     ) else:     urlpatterns = patterns('',)  urlpatterns += patterns('',     # your urlpatterns here )

I don’t like the else-part because it looks ugly, but it works. Point is that you want to add the media path as a first urlpattern so it won’t get overwritten by any of your other urls.

Also remember to configure INTERNAL_IPS so you’ll get debug-variable to your templates. This template snippet is a good example how to use that debug data.

In addition to these built-in settings, I also use a custom LOCALDEV boolean for explicitly handling situations that may not work well without internet connection. This way I can just ignore things that should do something over the web when I want without breaking the site. (Ie. if DEBUG and LOCALDEV, don’t fetch this over the Web but use this fixed variable instead, etc.) The combination of these two settings add up to very easy to use and flexible development environment.

Avoid Hardcoding Media Files

For me, most typical problems with offline development come from media links (that is links to images, CSS and JavaScript). Often these files are stored on a separate server(s), like recommended in Django documentation. Instead of hardcoding the links (relative or absolute) to media files, you should use your settings file and let Django take care of the rest. If you’re using SVN-version of Django, you probably already have a default context processor that sets MEDIA_URL in your RequestContext.

If you’re using older version of Django, here is how to do this yourself:

 # myprojects/misc/mediacontext.py from django.conf import settings  def media_url(request):     """ Returns MEDIA_URL url to context."""     return { 'MEDIA_URL': settings.MEDIA_URL }  # and in settings.py TEMPLATE_CONTEXT_PROCESSORS = ("django.core.context_processors.auth", "django.core.context_processors.debug", "django.core.context_processors.i18n", "myproject.misc.mediacontext.media_url")

Now you have MEDIA_URL variable in all your templates that have been rendered with RequestContext instance. Generic views use RequestContext, but unfortunately helpers like render_to_response use Context, not RequestContext, so we’re out of luck there. Luckily these kind of things are really easy to come by in Django. One easy solution is to use a simple wrapper to render_to_response method.

To get your media links work offline, just put MEDIA_URL it in front of any media links like this: <img src="{{ MEDIA_URL }}images/myimage.png" alt="" />. That way the media links Just Work — also offline (if you have the needed media files locally on your machine, of course).

Note that this technique works only for content that is rendered via Djangos template-engine. This means that with CSS-files, for example, you have to separate the parts that have media URLs in them and render them with Django templates. This works great for medium and small sites, but on a high-volume sites you’ll definitely want to make different arrangements to let those static elements be delivered via a separate media server.

Use Sample Data

Local developing means also that you don’t have access to the production database. It’s often necessary to have some data in the database before you can do any development at all. Django provides you a way for setting basic initial data automatically after syncdb-command, and it also helps you move all your data across different databases via fixtures.

A fixture is a collection of data that Django knows how to import into a database. You can export your whole production (Django-)database (or just one app) to a fixture with dumpdata-command. You can then move this fixture to your local machine and import it with loaddata. This way you can easily make copies of your production database and use them in your local development.

I’ve found two compelling reasons for using fixtures with development:

Using a copy of real data from a live production server in development is great because that way you’ll be able to work with those kind of inputs from real users that you’d never dream up writing yourself in your tests — before something breaks and you have to. It also feels nice to work on a site with real content instead of endless lorem ipsum paragraphs.
Also, lately I’ve been developing small sites entirely offline and deployed them via fixtures; I start up with an empty site, test and iterate it offline, add data and finally dump it (mostly from sqlite), and then load it to MySQL or PostgreSQL on the production server. Being able to move data easily from one database backend to another is great!

Conclusions

Offline-development is all about agility and portability. By keeping things not dependent on any specific database, media server or development machine, you’re giving yourself more freedom. In addition to easier development, portability adds to easier deployment, too.

Django provides great tools for fully offline-development. Hopefully this post gave you some ideas why you’d want to do it. I’ve been trying to better my own developing practices with Django for over two years. I’ve learned a lot, but I also believe that there’s still much to learn. Your tips and experiences are more than welcome in the comments!

Defaults Are Important

An article titled Don’t Pollute User Space reminded me of something I love about Mac OS X.

Here is the default folder structure of an user in Windows Vista:

    \Users\User\        (\AppData)           \Local           \Roaming        \Contacts        \Desktop        \Documents        \Downloads        \Favorites        \Links        \Music        \Pictures        \Saved Games        \Searches        \Videos

And this is what user folder looks like in Mac OS X 10.5 Leopard:

   /Users/Usernme/    /Desktop    /Documents    /Downloads         /Library         /Movies         /Music         /Pictures         /Public         /Sites

The Sites-directory is only one that I doubt most of Mac users need, but all the other ones are really needed and they make organizing easier, not harder.

Best thing about this, however, is that Mac users tend to actually use the given structure. Maybe it’s because Windows apps are so shitty that almost every Windows user that I know don’t actually keep their data in My Documents (or Documents in Vista) but instead scattered around the machine. (My data was, too, when I was a Windows user, many years ago.)

Good defaults are important.

Being Robust

Writing software that interacts with other peoples code is hard. To be robust, Postel’s Law suggests to be conservative in what you do; be liberal in what you accept from others. What follows is is a good example of what happens if you don’t.

When I posted my first Flickr pictures in 2005, Flickr photo_ids were counted in millions. Year later, they were in hundreds of millions. December last year, they topped 2.1 billion, which also happens to be the maximum value of signed integer type in some programming languages.

Here are some examples from my own pictures and their photo_ids from Flickr:

`6,029,771`	March 2005
`289,332,856`	November 2006
`2,165,862,620`	December 2007

After reading about someones problems with the 2,1 billion mark, I reviewed my own code. When I first integrated Flickr API to my homemade photo application in early 2006, I was smart enough to use unsigned integers (that would get me as far as 4,294,967,295) as field type for photo_id but not smart enough to read API documentation that explicitly advices to treat photo_id and other IDs as strings, because “format of the IDs can change over time, so relying on the current format may cause you problems in the future“.

This time I took the advice and fixed my code and database. All OK now. Or so I thought.

Yesterday someone left a (local) comment on the latest photo. I got a notification mail via my forked django.contrib.comments-app, but something was wrong. The related object id was OK in the email, but in the database it was pointing to a nonexistent object. That’s weird, I thought. After few minutes of poking around the code, I found out the cause of the problem. A line from the contrib.comments models.py:

object_id = models.IntegerField(_('object ID'))

(Sidenote: Yes, django.contrib.comments does not work at the moment with HUGE object_ids or non-integer primary keys. The comment framework is currently being re-written for newforms and this is hopefully fixed in the upcoming version.)

Somehow it feels good to know that even much smarter people than me make mistakes in evaluating robustness sometimes. I’m sure that whoever wrote Djangos great (and very un-Django-like totally undocumented) commenting framework didn’t see the need for object_ids greater than two billion. I’m also quite convinced that they didn’t expect that in just a couple of years, that same app would be used by thousands of Django-powered sites around the globe. It’s quite impossible to imagine all the possible situations where people might want to use it.

In Ellington CMS and Lawrence.com, where the surroundings are pretty much controlled, it makes sense to use (nothing but) integer-based IDs on generic related objects. With Flickr and many other not-so-common cases, and when being most liberal in what you accept from others, it makes much more sense to use strings.

I think this taught me to be more broad-sighted when developing and using APIs. Maybe you should, too?