EuroPython 2008, day 1

After a long year of exciting projects and never-ending days at the office, I found myself once again at the EuroPython conference in Vilnius. The first conference day was full of interesting happenings, mainly reviving old connections and making new ones.

Of all the sessions I attended during the day, I took notes on only a few. Here are some basic notes on selected ones:

Build an App in a Week by Marcin Kaszynski

  • favpico.com etc
  • Surprisingly many of the attendees (about half) use Django
  • Use Django admin for users, too (not only for admins)
  • “Commit early, commit often”
  • Share code only via vcs
  • Use conventions
  • @with_template decorator
  • Note: you can use admin docs (template name)
  • Tests do save time
  • self.login(), get, find
  • coverage.py
  • Instant Django

This talk was interesting but didn’t really offer much that was new (at least for me). There was a discussion about how it makes sense to follow the convention of naming templates like app/viewname.html, but I pointed out that if you for some reason want to depart from that convention (which I believe most Django developers use), you can just point your designer to the help section of the Django admin site and document the fact in your model docs. No phone calls necessary, and you still have full control of your customized software.
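I don’t have the code from the talk, so the following is only a minimal sketch of what a decorator like @with_template could do with the app/viewname.html convention. The names template_name_for and photo_list are hypothetical, and the sketch returns the template name instead of rendering anything, so it runs without Django.

```python
import functools


def template_name_for(module_path, view_name):
    """Build "app/viewname.html" from a dotted module path and a view name.

    Assumption: the app is the package that contains the views module,
    e.g. "gallery.views" belongs to the app "gallery".
    """
    parts = module_path.split(".")
    app_name = parts[-2] if len(parts) >= 2 else parts[0]
    return "%s/%s.html" % (app_name, view_name)


def with_template(view):
    """Hypothetical decorator: pair a view's context with its conventional template."""
    template_name = template_name_for(view.__module__, view.__name__)

    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        context = view(request, *args, **kwargs)
        # A real implementation would render the template here; this sketch
        # only returns the pair so the naming convention stays visible.
        return template_name, context

    return wrapper


@with_template
def photo_list(request):
    return {"photos": ["a.png", "b.png"]}
```

With a helper like this the view only returns its context, and the template path never has to be spelled out by hand.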

My God, it’s Full of Files by Tommi Virtanen

  • “I want to work with files in a uniform way no matter where they are”
  • path.py
  • eagain.net

This talk was interesting, but we clearly need someone who says “yes, I can do this”. I, for one, will buy a beer for anyone who makes file and directory handling in Python more uniform and easier.

py.test — Rapid Testing with Minimal Effort by Holger Krekel

  • looping tests
  • HTML generation
  • javascript tests

This talk was really interesting. It made me want to try py.test with Django’s test framework, but I suspect the two are not compatible. The talk also made me miss my old two-display setup. Automatically looping tests are very neat.

Data Portability and Python — Christian Scholz

  • OpenId
  • Yadis
  • Oauth
  • FOAF
  • RDF
  • microformats
  • pydataportability.net

Very nice talk in every possible way, but there was nothing new for me this time.

Keynote, Guido van Rossum

  • “Open source needs to move or die”
  • what breaks and why
  • new features
    • argument annotations
    • abstract base classes
    • extended iterable unpacking
    • new string format method
  • enables future evolution
  • 2to3 tool
  • switch when 1) you’re ready and 2) all your dependencies have been ported
  • “we’ll help libraries to upgrade”
  • do now
  • use dict.iterkeys()
    • “Django has been separating unicode for years now”
    • 2.6 warns you with -3 flag
  • q&a
    • documentation is changing (will be up for 2.6)
    • distributing different versions is easy

Guido has been brainwashing us. Only a couple more rounds of this same talk and I’ll actually agree with most of the features and even want to try them.

For me this day was a success. I made a couple of new friends, hooked up with old friends and got to listen to some brilliant sessions. During the night we went to a couple of small bars in the old town and had even more fun.

The only unsure thing about this evening was if we’re going to see the girls again or not 🙂

We Are Django

I’ve always thought that the community is one of the greatest things about Django. Not because Django isn’t great but because the people are just so awesome! Django People is a brilliant new site by Simon Willison and Natalie Downe which brings together Djangonauts all over the world. (Interestingly, I was talking about exactly this same idea yesterday with my friend, along the lines of “there should definitely be something like that”. Great minds think alike (I wish) 🙂)

I had the pleasure of meeting Simon and Natalie during the EuroPython conference last summer. Like all the other Django people at the conference, they were super-nice and fun to share thoughts with. The various conversations we had with the Web-gang there were definitely one of the highlights of my last year. I hope that Django People will help in finding more like-minded people, both near you and when traveling around the world.

If you are a Djangonaut, add yourself to the site! And when you do, please include your picture and some information about yourself. It’s so much nicer to look at real faces than empty rectangles 🙂

Here’s to many more Django friends.

Offline Development With Django

Coming to Django from the PHP-world, running a local development server (as opposed to Apache or a full LAMP-machine set up for just testing) and doing real offline development is something that takes a little bit of learning. After two years of active development with Django, I’d like to share some of my learnings.

Why Offline?

There are many benefits to developing your site someplace other than the server that powers it. I’m sure most of us do development this way.

Generally speaking offline development could mean any development that doesn’t happen on the production server. The meaning for offline development in this article is more literal: by offline I mean literally offline, that is without [requiring] a connection to the Web.

A well configured development environment helps you write better code efficiently — anywhere.

On a side note, don’t blame me if you end up coding Django your whole winter vacation at an idyllic remote cottage 😉

Best Practices

In the same way that Django lets you separate models, templates and views, it also lets you easily separate production and testing environments. Django also offers several tools for local development, such as the built in Web server and DEBUG-mode. In short, Django encourages you to follow best practices.

One thing that I’ve been trying to unlearn is the PHP-esque way of doing small modifications on the production server and at the same time accidentally breaking the site in two other places. Luckily, when using Apache and mod_python the lure of doing this is a bit smaller because every modification to a Python file needs a server restart. By keeping development strictly off the production server, the probability of breaking something on the live site drops significantly. (Because you do want to test your changes before deploying, right?)

Loosely related to local development is version control. When working with version control, you generally don’t want to check in non-functional code. That means that you must test the code before you check it in. Having a local development environment helps with this 🙂

Prepare Your Site — Thoroughly

I’ve been doing web development for nearly ten years now. Everything I do goes through dedicated testing servers and version control. I’ve always thought my sites to be well prepared for offline development. Then, in spring 2007, I had some problems switching ISPs and I was cut off from the web for two weeks. (What a long two weeks they were 🙂) Turns out this was a very good thing since I discovered tons of problems while trying to work really offline.

Use the Settings, Luke

Using different settings for development and production makes it possible to do truly offline development. At work we keep different settings for every development machine in the project-root and separate them by naming convention of settings-hostname.py and settings-production.py. We then symlink the appropriate settings on the machine as settings.py and everything Just Works. You might also want to learn to keep things portable.
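We do the selection with a symlink, but the naming convention itself is easy to sketch in code. The following is only an illustration: the function settings_filename and the production_hosts tuple are assumptions of mine and not part of our actual setup.

```python
import socket


def settings_filename(hostname=None, production_hosts=("www",)):
    """Pick a settings file by the settings-hostname.py naming convention.

    production_hosts is a hypothetical list of short host names that
    should get settings-production.py instead of a per-machine file.
    """
    if hostname is None:
        hostname = socket.gethostname()
    short_name = hostname.split(".")[0]
    if short_name in production_hosts:
        return "settings-production.py"
    return "settings-%s.py" % short_name


# The file this would pick on the current machine:
print(settings_filename())
```

The result is the file you would symlink as settings.py on that machine.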

Furthermore, there is a good tip in Django documentation about limiting serving static files to DEBUG=True. The given example is a bit un-DRY, though. And it also adds the static url as last element of the URLconf, which sometimes just doesn’t work. Here is what I use:

    from django.conf.urls.defaults import *
    from django.conf import settings

    if settings.DEBUG:
        # Serve all local files from MEDIA_ROOT below /localmedia/
        urlpatterns = patterns('',
            (r'^localmedia/(?P<path>.*)$', 'django.views.static.serve',
                {'document_root': settings.MEDIA_ROOT, 'show_indexes': True}),
        )
    else:
        urlpatterns = patterns('',)

    urlpatterns += patterns('',
        # your urlpatterns here
    )

I don’t like the else-part because it looks ugly, but it works. The point is that you want to add the media path as the first urlpattern so it won’t get overridden by any of your other urls.
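Why the order matters: URL patterns are tried from top to bottom and the first match wins. Here is a framework-free sketch of that behaviour. The resolve function and the catch-all page pattern are made up for the illustration and are not Django code.

```python
import re


def resolve(path, patterns):
    """Return the name of the first pattern that matches, like a URLconf does."""
    for regex, view_name in patterns:
        if re.search(regex, path):
            return view_name
    return None


# Hypothetical patterns: a greedy catch-all and the local media pattern.
catch_all = (r"^(?P<slug>[\w/.-]+)$", "page_detail")
media = (r"^localmedia/(?P<path>.*)$", "static_serve")

print(resolve("localmedia/css/site.css", [media, catch_all]))
print(resolve("localmedia/css/site.css", [catch_all, media]))
```

With the media pattern first the request reaches the static file view. With the catch-all first it never gets there.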

Also remember to configure INTERNAL_IPS so you’ll get the debug variable in your templates. This template snippet is a good example of how to use that debug data.

In addition to these built-in settings, I also use a custom LOCALDEV boolean for explicitly handling situations that may not work well without an internet connection. This way I can just skip things that would do something over the web, whenever I want, without breaking the site. (I.e. if DEBUG and LOCALDEV, don’t fetch this over the Web but use this fixed variable instead, etc.) The combination of these two settings adds up to a very easy-to-use and flexible development environment.
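A minimal sketch of the DEBUG and LOCALDEV check described above. The names latest_photos, FIXED_PHOTOS and fetch_remote are hypothetical, and in a real project the two flags would come from settings.py.

```python
# Stand-ins for the values that would live in settings.py.
DEBUG = True
LOCALDEV = True  # custom flag, not a built-in Django setting

# Fixed sample data to use when there is no connection.
FIXED_PHOTOS = [{"id": "6029771", "title": "Factory Philosophy"}]


def latest_photos(fetch_remote, debug=DEBUG, localdev=LOCALDEV):
    """Return fixed data when developing offline, otherwise call the fetcher."""
    if debug and localdev:
        return FIXED_PHOTOS
    return fetch_remote()


def unreachable_service():
    raise IOError("no network")


# Offline: the fetcher is never called, so no network is needed.
print(latest_photos(unreachable_service))
```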

Avoid Hardcoding Media Files

For me, the most typical problems with offline development come from media links (that is, links to images, CSS and JavaScript). Often these files are stored on one or more separate servers, as recommended in the Django documentation. Instead of hardcoding the links (relative or absolute) to media files, you should use your settings file and let Django take care of the rest. If you’re using the SVN version of Django, you probably already have a default context processor that sets MEDIA_URL in your RequestContext.

If you’re using an older version of Django, here is how to do this yourself:

    # myproject/misc/mediacontext.py
    from django.conf import settings

    def media_url(request):
        """Return MEDIA_URL for the template context."""
        return {'MEDIA_URL': settings.MEDIA_URL}

    # and in settings.py
    TEMPLATE_CONTEXT_PROCESSORS = (
        "django.core.context_processors.auth",
        "django.core.context_processors.debug",
        "django.core.context_processors.i18n",
        "myproject.misc.mediacontext.media_url",
    )

Now you have the MEDIA_URL variable in all templates that are rendered with a RequestContext instance. Generic views use RequestContext, but unfortunately helpers like render_to_response use Context, not RequestContext, so we’re out of luck there. Luckily these kinds of things are really easy to work around in Django. One easy solution is to use a simple wrapper around the render_to_response method.
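The idea behind such a wrapper, sketched without Django so it runs anywhere: always merge the output of the context processors into the context before rendering. Everything here (render_with_processors, the render callable, the placeholder URL) is a stand-in of mine, not Django’s API.

```python
def media_url(request):
    """Stand-in context processor; the URL value is a placeholder."""
    return {"MEDIA_URL": "/localmedia/"}


def render_with_processors(render, request, template_name, data,
                           processors=(media_url,)):
    """Merge each processor's variables into the context, then render.

    `render` is any callable taking (template_name, context); in a real
    project it would be the framework's own rendering helper.
    """
    context = dict(data)
    for processor in processors:
        context.update(processor(request))
    return render(template_name, context)


def fake_render(template_name, context):
    # Just show what the template would receive.
    return template_name, context


print(render_with_processors(fake_render, None, "photos/list.html", {"title": "Latest"}))
```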

To get your media links to work offline, just put MEDIA_URL in front of any media links like this: <img src="{{ MEDIA_URL }}images/myimage.png" alt="" />. That way the media links Just Work, offline too (if you have the needed media files locally on your machine, of course).

Note that this technique works only for content that is rendered via Django’s template engine. This means that with CSS files, for example, you have to separate the parts that have media URLs in them and render them with Django templates. This works great for small and medium sites, but on high-volume sites you’ll definitely want to make different arrangements to let those static elements be delivered via a separate media server.

Use Sample Data

Local development also means that you don’t have access to the production database. It’s often necessary to have some data in the database before you can do any development at all. Django provides a way of loading basic initial data automatically after the syncdb command, and it also helps you move all your data across different databases via fixtures.

A fixture is a collection of data that Django knows how to import into a database. You can export your whole production (Django) database (or just one app) to a fixture with the dumpdata command. You can then move this fixture to your local machine and import it with loaddata. This way you can easily make copies of your production database and use them in your local development.
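To give an idea of what travels between the machines, here is a sketch that builds a tiny fixture in Django’s JSON fixture layout (a list of objects with model, pk and fields). The model label photos.photo and the field values are made up for the example.

```python
import json


def make_fixture(model_label, rows):
    """Build fixture entries from (pk, fields) pairs."""
    return [
        {"model": model_label, "pk": pk, "fields": fields}
        for pk, fields in rows
    ]


# Hypothetical app "photos" with a model "Photo".
fixture = make_fixture("photos.photo", [
    (1, {"title": "Winter is Here"}),
    (2, {"title": "Keeping warm"}),
])
fixture_json = json.dumps(fixture, indent=2)
print(fixture_json)
```

A file with this structure is what dumpdata writes and loaddata reads when the JSON format is used.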

I’ve found two compelling reasons for using fixtures with development:

  • Using a copy of real data from a live production server in development is great because that way you’ll be able to work with the kinds of input from real users that you’d never dream up yourself when writing tests, at least not before something breaks and you have to. It also feels nice to work on a site with real content instead of endless lorem ipsum paragraphs.
  • Also, lately I’ve been developing small sites entirely offline and deployed them via fixtures; I start up with an empty site, test and iterate it offline, add data and finally dump it (mostly from sqlite), and then load it to MySQL or PostgreSQL on the production server. Being able to move data easily from one database backend to another is great!

Conclusions

Offline-development is all about agility and portability. By keeping things not dependent on any specific database, media server or development machine, you’re giving yourself more freedom. In addition to easier development, portability adds to easier deployment, too.

Django provides great tools for fully offline-development. Hopefully this post gave you some ideas why you’d want to do it. I’ve been trying to better my own developing practices with Django for over two years. I’ve learned a lot, but I also believe that there’s still much to learn. Your tips and experiences are more than welcome in the comments!

Being Robust

Writing software that interacts with other people’s code is hard. To be robust, Postel’s Law suggests: be conservative in what you do; be liberal in what you accept from others. What follows is a good example of what happens if you don’t.

When I posted my first Flickr pictures in 2005, Flickr photo_ids were counted in millions. A year later, they were in the hundreds of millions. In December last year, they passed 2.1 billion, which also happens to be about the maximum value of a signed 32-bit integer (2,147,483,647), the default integer type in some programming languages.

Here are some examples from my own pictures and their photo_ids from Flickr:

photo_id         Date             Title
6,029,771        March 2005       Factory Philosophy
289,332,856      November 2006    Winter is Here
2,165,862,620    December 2007    Keeping warm

After reading about someone’s problems with the 2.1 billion mark, I reviewed my own code. When I first integrated the Flickr API into my homemade photo application in early 2006, I was smart enough to use unsigned integers (which would get me as far as 4,294,967,295) as the field type for photo_id, but not smart enough to read the API documentation, which explicitly advises treating photo_id and other IDs as strings, because the “format of the IDs can change over time, so relying on the current format may cause you problems in the future”.
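The arithmetic is easy to check. This small sketch keeps the IDs as strings, as the API documentation advises, and only converts them to compare against the 32-bit limits. The helper name fits is mine.

```python
SIGNED_32_MAX = 2**31 - 1    # 2,147,483,647
UNSIGNED_32_MAX = 2**32 - 1  # 4,294,967,295

# The three photo_ids from my own pictures, stored as strings.
PHOTO_IDS = ["6029771", "289332856", "2165862620"]


def fits(photo_id, limit):
    """Return True if a photo_id (kept as a string) is within the limit."""
    return 0 <= int(photo_id) <= limit


for photo_id in PHOTO_IDS:
    print(photo_id, fits(photo_id, SIGNED_32_MAX), fits(photo_id, UNSIGNED_32_MAX))
```

Only the December 2007 ID falls outside the signed range, and it still fits in an unsigned column. A string field has no such ceiling.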

This time I took the advice and fixed my code and database. All OK now. Or so I thought.

Yesterday someone left a (local) comment on the latest photo. I got a notification mail via my forked django.contrib.comments app, but something was wrong. The related object id was OK in the email, but in the database it was pointing to a nonexistent object. That’s weird, I thought. After a few minutes of poking around the code, I found the cause of the problem. A line from the contrib.comments models.py:

object_id = models.IntegerField(_('object ID'))

(Sidenote: Yes, django.contrib.comments does not work at the moment with HUGE object_ids or non-integer primary keys. The comment framework is currently being re-written for newforms and this is hopefully fixed in the upcoming version.)

Somehow it feels good to know that even much smarter people than me sometimes make mistakes in evaluating robustness. I’m sure that whoever wrote Django’s great (and, very un-Django-like, totally undocumented) commenting framework didn’t see the need for object_ids greater than two billion. I’m also quite convinced that they didn’t expect that in just a couple of years, that same app would be used by thousands of Django-powered sites around the globe. It’s quite impossible to imagine all the possible situations where people might want to use it.

In Ellington CMS and Lawrence.com, where the surroundings are pretty much controlled, it makes sense to use (nothing but) integer-based IDs on generic related objects. With Flickr and many other not-so-common cases, and when being most liberal in what you accept from others, it makes much more sense to use strings.

I think this taught me to be more broad-sighted when developing and using APIs. Maybe you should, too?

The Global Django

One thing that annoys me in many US-based software projects is the narrow-mindedness of developers who seem to forget that there is life outside the US, too. Django has been a very international project from day one, with the admin interface translated into 43 languages and contrib.localflavor full of great localized stuff like validators for Finnish social security numbers.

After Adrian’s official announcement of the worldwide Django sprint (that will be held on Friday, Sept. 14, 2007) there has been a steady flow of volunteers adding their names to the wiki page. As I write this, over a hundred volunteers from 26 countries have signed up for the event. It’s just amazing.

I wish more open source projects had as great a community as Django has. And greetings to everyone attending the sprint next Friday: let’s have fun! 🙂

Lightning talk: Hacking EuroPython.org (with Django)

I gave a lightning talk yesterday at the EuroPython conference about my little MyEuroPython mashup. The intention of the talk was not to promote the site but to raise a conversation about how to make better conference websites (and especially EuroPython.org).

The main point of my presentation was that conferences are about communication and interaction. The current EuroPython.org does not give any tools for that, and it would be great to have something more 2.0 for the site.

For example:

  • Simple site structure with good (live)search
  • Registration and personal preferences using OpenID
  • Allow commenting on sessions
  • Provide a personalizable timetable (that is useful for example when using a mobile phone)
  • Free the data; RSS feeds (and possibly other APIs?) from everything
  • Aggregate blog entries, images, links to the main site

The good thing was that there was talk about the presentation afterwards, and we actually volunteered to build EuroPython.org with Django next year. Hopefully the discussion lives on!

At EuroPython 2007

I arrived in Vilnius, Lithuania with my colleague on Saturday. We spent the weekend exploring the beautiful old city and today we’re getting down to business with EuroPython.

Only a couple of weeks late, I today published MyEuroPython, which provides an alternative to the EuroPython conference schedule (powered by Django — of course). It’s very untested and unpolished at the moment, but at least I’m using it myself 🙂 I’m giving a lightning talk about the site later on today.

Hopefully we’ll get some kind of Django meetup arranged this week. I’ll be posting more on the conference soon. Meanwhile, my Jaiku page updates somewhat often and Flickr images, too.

Django: Say Hello to Unicode

After weeks of testing, the Django unicode-branch was merged into trunk today. This changeset brings huge improvements to unicode-awareness of Django and it also fixes a lot of unicode-related bugs. From the announcement at django-users list:

This should be backwards-compatible for all practical purposes (providing you only use ASCII data). The only real difference you will notice in that case is that model fields are Unicode strings instead of bytestrings in type, but since they are ASCII data anyway, that shouldn’t make any real difference.

See Unicode data in Django and Porting Applications (The Quick Checklist) for more.

Furthermore, there was also another great commit today fixing a bug that has always been in the top five of my personal “The things I hate most about Django” list. Changeset 5608 finally adds a “unicode-aware slugify filter (in Python) and better non-ASCII handling for the Javascript slug creator in admin”. Until today, the slugify function converted a typical non-English title like “Tässä on älyttömästi ääkkösiä” into the (totally unreadable) “tss-on-lyttmsti-kksi”, which of course sucks big time when every other slugify function on the planet makes it something like “tassa-on-alyttomasti-aakkosia” (which is totally readable).
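The difference between the two behaviours comes down to one normalization step. The following is a sketch of the technique using only the standard library. It is not the code from the changeset, and the function names are mine.

```python
import re
import unicodedata


def _hyphenate(text):
    """Drop punctuation, lowercase, and join the words with hyphens."""
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    return re.sub(r"[-\s]+", "-", text)


def ascii_only_slug(title):
    """Old behaviour: non-ASCII characters are simply thrown away."""
    composed = unicodedata.normalize("NFC", title)
    return _hyphenate(composed.encode("ascii", "ignore").decode("ascii"))


def unicode_aware_slug(title):
    """Decompose accented letters first, so the base letter survives."""
    decomposed = unicodedata.normalize("NFKD", title)
    return _hyphenate(decomposed.encode("ascii", "ignore").decode("ascii"))


title = "Tässä on älyttömästi ääkkösiä"
print(ascii_only_slug(title))     # tss-on-lyttmsti-kksi
print(unicode_aware_slug(title))  # tassa-on-alyttomasti-aakkosia
```

NFKD splits a letter like ä into a plain a plus a combining mark, and only the mark is lost when encoding to ASCII.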

I’m really, really happy that Django is slowly but firmly maturing into a unicode-friendly framework. Kudos to Malcolm Tredinnick for his huge efforts on the unicode-branch and also big thanks to everyone who helped with testing and bugfixes!

Using Capistrano to Deploy Django Apps

Capistrano is a great tool for automating tasks on one or more remote servers. It’s mainly used for deploying Rails apps but it can fairly easily be used for other tasks, too. After reading the new getting started docs for upcoming 2.0 version, I created a simple script for deploying Django apps with Capistrano.

Feeling Lazy

Let’s be honest. Deploying Django apps is dead easy. There really isn’t much stuff that would need automation. But there are many situations where you are doing much more work on the remote server than you’d really want to. The more commands you execute, the greater the chance of screwing up. And personally, I’m just too lazy to do simple monkeywork on the command line again and again. This is where Capistrano comes along: it can do most of the work for you.

Do it Capistrano Way

Capistrano, like Rails, assumes many things about your project. For example, it assumes that you want to deploy a Rails app, and that you are using subversion for version control. Also the current documentation is mostly written from Rails viewpoint. Fortunately the docs have been improved lately and Capistrano is starting to present itself as a true multipurpose tool — which it really is.

I generally like the Rails-way of doing things. When things are always done in a consistent way, developing becomes really easy. (Don’t underestimate the weight of cognitive load!) Until you want to be different. At that point everything usually falls apart. Fortunately Capistrano is fairly flexible in this regard — you can do things your own way.

Capistrano has built-in functions for doing stuff like deploying (a Rails app), controlling FCGI servers, and simple rollbacks. I wanted to do something very basic and in a way that could be easily modified and expanded. After learning the basics and reading the Capistrano chapter of the Rails manual, I wrote and tested the following script in just a few hours. It uses just basic shell commands, none of those fancy Capistrano functions. I think that this script is a good starting point for building smarter and more complete scripts.

Run Me Baby One More Time

Okay, let’s cut the crap. In brief, what I wanted to do with Capistrano was ease the upgrading of production sites by automating a set of things I do when upgrading a production site.

I work for a small company where we run dozens of Django sites on many different servers. (I believe we are if not the only one, at least one of the few companies in Finland that specialize in Django.) A typical project consists of one subversion repo and a server setup of Apache+mod_python and dedicated servers for database and static files. All of our projects are configured for easy off-line development and most of them usually need some tweaking on the production server after a subversion checkout.

Personally I like to do development with real data, so normally just before deployment I import the live database to my local machine, modify it if needed, and finally upload the modified dump back to the production server. Naturally, this only works on small sites, but for those (and for me) it works great.

So, my first Capistrano script automates what I normally do manually for an upgrade of a small site. The steps are:

  1. Log in to a remote server
  2. Run svn update in project directory
  3. If there are changes in models:
    1. Backup old database
    2. Upload new database to server
    3. Import new database
    4. Delete the just loaded sql-file
  4. Move settings.py in place
  5. Reload apache
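The steps above, including the conditional database part, can be written down as a plain plan. This Python sketch is only a dry run for illustration: the function upgrade_steps is hypothetical, it produces step descriptions and not real commands, and it is not the Capistrano script itself.

```python
def upgrade_steps(models_changed):
    """Return the ordered upgrade steps; the database steps are conditional."""
    steps = [
        "Log in to a remote server",
        "Run svn update in project directory",
    ]
    if models_changed:
        steps += [
            "Backup old database",
            "Upload new database to server",
            "Import new database",
            "Delete the just loaded sql-file",
        ]
    steps += [
        "Move settings.py in place",
        "Reload apache",
    ]
    return steps


for number, step in enumerate(upgrade_steps(models_changed=True), start=1):
    print(number, step)
```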

Most importantly, Capistrano does all this in a way that in a case of “OMG! OMG!11 I JUST BROKE THE ENTIRE SITE!11” (which of course never happens because I’m perfect and I never make mistakes) I can roll back all of the changes in a matter of seconds.

The Capistrano script that does all the above looks like this.

This script can be run from the local development machine. I put mine in the project’s root directory. When upgrading the site, I run the upgrade_project task that I wrote by typing cap upgrade_project. Capistrano then asks for my password, logs in to the remote server(s) and executes every command defined in the given task, outputting the results on the terminal as it goes. It’s finished before you can say “freaking cool”. For added security you can (and should) use SSH keys for authentication, and Capistrano even lets you define a gateway server for piped connections.

The capfile is a Ruby script that defines one or more tasks. A task consists of one or more commands that will be executed on one or more remote servers. (Capistrano can run a task simultaneously on several remote servers.) A small task could be something like running ls on a remote server, while a complex one could do everything needed for deploying an application on a virgin box. There really aren’t many limits to what Capistrano can do.

IM IN UR DJANGO

The above example is a quick & dirty proof-of-concept type of script. I know it can be improved quite a bit. It would be nice to see more Django-related scripts out there. If you have used Capistrano for Django-related work, share your experiences! Improving the Capistrano documentation wouldn’t hurt, either. The Django wiki is one good place for sharing.

For something to think about, with a little bit of extra work it would be possible to do a totally virgin deployment, including configuring Apache and media servers, too. I’m also pretty sure that it would be possible to use some of Capistrano’s built-in helpers for deploying non-Rails apps. Google for using Capistrano for more info.

I hope this post got you at least a bit interested in Capistrano. If you have any suggestions or comments, please add them below 🙂

Django unicode-branch: testers wanted

The long-awaited unicode-branch is finally at a stage where wider community testing is needed. Read the notification on the django-users mailing list.

Malcolm has done a terrific job with the branch and there is already fairly solid documentation available. For most people, the short checklist (five steps, maximum!) is all you need to convert your applications to handle unicode well. If you want more information, check the detailed documentation in the trunk.

Using this branch means an end for the numerous unicode-related problems (for most of them, anyway) when using Django. So, this is a must for every djangonaut who is living in the Real World 😉

Go on, get on with it! 🙂