Building Site Search And An OpenSearch Plugin With Django

While browsing through Technorati I noticed a blue glow on the Firefox search bar: Firefox had autodiscovered an OpenSearch plugin for Technorati. How could I implement this functionality on my own site? It turns out this is ridiculously easy to do with Django. The result for this blog looks like this:

Again, it’s very easy to do. (I know, because I could do it! 🙂)

Start with search

First, you need a search function for your site. Unfortunately Django doesn’t have any generic search functionality built in (yet; a search API is on its way). There is, however, very nice search functionality in the admin app, which can easily be adapted into a site search.

After a bit of tweaking, this is the view code that I eventually came up with:

    from django.shortcuts import render_to_response
    from django.db.models import Q
    from django.db.models.query import QuerySet
    from django.http import HttpResponseRedirect
    from django.template import loader, Context
    from unessanet.english.models import BlogEntry
    import operator


    def hoyci_search(request, terms=None):
        if request.POST:
            # Form searches arrive as POST: redirect to the bookmarkable result URL.
            return HttpResponseRedirect('/en/hoyci/search/%s/' % request.POST['query'].replace(" ", "+"))
        elif terms:
            query = terms.replace("+", " ")
            if query.strip():
                search_fields = ['title', 'strapline', 'excerpt', 'body']  # your search fields here
                other_qs = QuerySet(BlogEntry)  # your class here
                for bit in query.split():
                    # A term matches if any of the search fields contains it.
                    or_queries = [Q(**{'%s__icontains' % field_name: bit}) for field_name in search_fields]
                    # Chain the filters so that every term has to match, not just the last one.
                    other_qs = other_qs.filter(reduce(operator.or_, or_queries))
                search_results = other_qs.filter(is_draft=False)
            else:
                search_results = None
            return render_to_response('english/search.html', locals())
        else:
            return render_to_response('english/search.html')

Let’s chew it up a bit. Firstly, in my urls.py I have a line

(r'^search/(?P<terms>\S+)/$', 'hoyci_search'),

which passes the terms variable to the search function. When the function is called, it checks whether the request was a POST or a GET. Searches via the web form are done with POST, while other searches (like Firefox’s OpenSearch) are GET. If the request is a POST, the view redirects it to a new URL, which is also the result page.
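To make the URL handling concrete, here is a small standalone sketch in plain Python 3 (no Django, and not the Python 2 of the view itself) of what the pattern captures and how the space/plus round trip works. The helper names terms_to_path and path_to_terms are made up for illustration; only the regular expression and the replace calls come from the code above.

```python
import re

# Same pattern as the urls.py line above.
SEARCH_URL = re.compile(r'^search/(?P<terms>\S+)/$')


def terms_to_path(query):
    """Hypothetical helper: build the path the POST branch redirects to."""
    return 'search/%s/' % query.replace(' ', '+')


def path_to_terms(path):
    """Hypothetical helper: recover the query the way the GET branch does."""
    match = SEARCH_URL.match(path)
    if match is None:
        return None  # the urlconf would not route this path to the view
    return match.group('terms').replace('+', ' ')


path = terms_to_path('django opensearch')
print(path)                 # search/django+opensearch/
print(path_to_terms(path))  # django opensearch
```

One side effect of this scheme is that a literal plus sign in a query comes back as a space.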

All the heavy lifting is done when processing the GET request. If you want to use this code, modify the search_fields variable and the first instance of the other_qs variable to set up the right model class and its fields. After that (and after putting together the necessary template files), everything should just work! You don’t have to worry about, for example, SQL injection, because the Django database API takes care of things like that for you. How convenient 🙂
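The matching rule itself can be illustrated without a database. The sketch below is plain Python 3 over a list of dictionaries, and it assumes the rule used by the admin search that the view is adapted from: every term has to occur, case-insensitively, in at least one of the search fields. The sample entries and the function names matches and search are invented for the example.

```python
def matches(entry, query, search_fields):
    """True if every term occurs (case-insensitively) in at least one field."""
    for term in query.split():
        needle = term.lower()
        if not any(needle in entry.get(field, '').lower() for field in search_fields):
            return False
    return True


def search(entries, query, search_fields):
    """Return the entries that match all terms of the query."""
    return [e for e in entries if matches(e, query, search_fields)]


# Invented sample data standing in for BlogEntry rows.
ENTRIES = [
    {'title': 'Django search', 'body': 'Building site search with Q objects'},
    {'title': 'OpenSearch', 'body': 'A Firefox plugin for Django sites'},
    {'title': 'Holiday photos', 'body': 'Nothing technical here'},
]
FIELDS = ['title', 'body']

hits = search(ENTRIES, 'django firefox', FIELDS)
print([e['title'] for e in hits])  # ['OpenSearch']
```

In the Django view the same idea is expressed with Q objects, so the filtering happens in SQL rather than in Python.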

(The locals()-trick was found from the Django Book, btw.)

Implementing OpenSearch With Django

The Mozilla Developer Center has a good article about creating OpenSearch plugins for Firefox. Basically, the “plugin” is just an XML file plus a rel link that tells the browser about it. I copied the example XML into a Django template and filled in the data. My description file looks like this:

    <OpenSearchDescription
      xmlns="http://a9.com/-/spec/opensearch/1.1/"
      xmlns:moz="http://www.mozilla.org/2006/browser/search/">
      <ShortName>Same Con To Hoyci</ShortName>
      <Description>Search entries from Hoyci</Description>
      <InputEncoding>UTF-8</InputEncoding>
      <Image width="16" height="16" type="image/png">http://kuvat.unessa.net/stuff/unessa_ulogo_16px.png</Image>
      <Url type="text/html" method="GET" template="http://www.unessa.net/en/hoyci/search/{searchTerms}"></Url>
      <moz:SearchForm>http://www.unessa.net/en/hoyci/search/{searchTerms}</moz:SearchForm>
      <Language>en-us</Language>
    </OpenSearchDescription>
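If you would rather generate the description than type it by hand, something like the following could work. It is a minimal sketch in plain Python 3 using the standard library's ElementTree, not what this site does (the site uses a Django template), and build_description is a made-up name. It only emits the core elements and leaves out the icon and the Mozilla-specific SearchForm.

```python
import xml.etree.ElementTree as ET

OS_NS = 'http://a9.com/-/spec/opensearch/1.1/'


def build_description(short_name, description, search_url):
    """Hypothetical helper: build a minimal OpenSearch description as a string."""
    # Serialize the OpenSearch namespace as the default (prefix-less) namespace.
    ET.register_namespace('', OS_NS)
    root = ET.Element('{%s}OpenSearchDescription' % OS_NS)
    ET.SubElement(root, '{%s}ShortName' % OS_NS).text = short_name
    ET.SubElement(root, '{%s}Description' % OS_NS).text = description
    ET.SubElement(root, '{%s}InputEncoding' % OS_NS).text = 'UTF-8'
    ET.SubElement(root, '{%s}Url' % OS_NS, {
        'type': 'text/html',
        'method': 'GET',
        'template': search_url,
    })
    return ET.tostring(root, encoding='unicode')


xml_text = build_description(
    'Same Con To Hoyci',
    'Search entries from Hoyci',
    'http://www.unessa.net/en/hoyci/search/{searchTerms}',
)
# Parsing the result back is a cheap well-formedness check.
parsed = ET.fromstring(xml_text)
print(parsed.find('{%s}ShortName' % OS_NS).text)  # Same Con To Hoyci
```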

This file needs to be served to the browser, so we need another line in the urlconf and another view. The easiest way would be to call django.views.generic.simple.direct_to_template, but because Django is for perfectionists, we’ll do it with a custom view to get the right MIME type. Like this:

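    from django.http import HttpResponse


    def hoyci_opensearch(request):
        # Serve the description with the OpenSearch MIME type instead of text/html.
        response = HttpResponse(mimetype='application/opensearchdescription+xml')
        t = loader.get_template('english/hoyci_opensearchxml.html')
        response.write(t.render(Context()))  # the template needs no variables
        return response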

Quite painless. Now the only thing missing is the autodiscovery link in the head of the site template. It looks like this: <link rel="search" type="application/opensearchdescription+xml" title="Hoyci Search" href="http://www.unessa.net/en/hoyci/opensearch/">. And that’s it. Now you can search your site from the Firefox search bar!

Final notes

This was yet another “hack together something interesting in half an hour”-type of hack. It’s not 100% optimized or fully tested, but for me it works like a charm.

The search functionality has some known limitations (aka “features”):

  • It only searches from one class. It would be nice to be able to search multiple classes.
  • Due to URL limitations, my solution breaks in Firefox on characters like ‘/’, because Firefox percent-encodes them and Django doesn’t seem to understand the result. Django chokes, for example, on “foo%2Fbar”, which is not nice.
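One possible way around the second limitation, which I have not tried on this site, is to carry the terms in a query string instead of in the path, so that an encoded slash never has to pass through the urlconf. The sketch below is plain Python 3 using only the standard library; the ?q= parameter name and both helper names are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base, query):
    """Hypothetical helper: put the raw query in a ?q= parameter."""
    return '%s?%s' % (base, urlencode({'q': query}))


def read_search_query(url):
    """Hypothetical helper: recover the query from the ?q= parameter."""
    params = parse_qs(urlsplit(url).query)
    return params.get('q', [''])[0]


url = build_search_url('/en/hoyci/search/', 'foo/bar baz')
print(url)                     # /en/hoyci/search/?q=foo%2Fbar+baz
print(read_search_query(url))  # foo/bar baz
```

In a Django view the decoded value would be read from request.GET, and the Url template in the description file would end in ?q={searchTerms} instead of a path segment.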

The Firefox OpenSearch implementation has a nice suggestion feature. It would be quite easy to build one with Django using a simple searches class and JSON output to the browser. Maybe a sequel to this howto?-)
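For the curious, the suggestion response is a small JSON array: the typed prefix followed by a list of completions. Here is a rough sketch in plain Python 3 of how such a response could be produced. The list of past searches stands in for the searches class mentioned above, and the function name suggest is invented.

```python
import json


def suggest(prefix, past_searches, limit=10):
    """Return an OpenSearch-suggestions style JSON string: [prefix, [completions]]."""
    needle = prefix.lower()
    # De-duplicate, keep only case-insensitive prefix matches, sort for stable output.
    completions = sorted(
        {s for s in past_searches if s.lower().startswith(needle)}
    )[:limit]
    return json.dumps([prefix, completions])


# Invented data standing in for stored searches.
PAST = ['django', 'django templates', 'opensearch', 'Django admin']
print(suggest('dj', PAST))  # ["dj", ["Django admin", "django", "django templates"]]
```

A real view would return this with the application/x-suggestions+json content type, and the description file would need a second Url element of that type pointing at the view.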

Any tips on the limitations, and any other comments and suggestions, are appreciated!

Speeding With Django

The past few weeks have been quite Django-full. We’ve gradually started converting our customers’ sites to Django, and what fun it has been!

Nordic Bandwagon (in Finnish) was the first site we did fully with Django. The whole site is roughly 100 lines of Python, with a very nicely working content management system, RSS feeds for news and very clean templates to present the data. Very compact, very elegant and unbelievably fun to code.

I’ve had very interesting feedback from my Django demos. In fact, building Django models and starting up the automatic admin site in front of the customer during a sales meeting may be the best pitch ever:

“This is production ready. You can start working with this right now, while we’ll get working on the technical stuff.”

I’ve considered hiding a small video recorder somewhere in my suitcase; what would be better material for company Christmas parties than watching the executives’ facial expressions on tape?-)

We’re also cooking up some Really Cool Shit with Django and AJAX. Most, if not all, of it will be published with an open source license, so there’s definitely more Python code coming up in this blog. Stay tuned 🙂

Highlighting Code Using Pygments and Beautiful Soup

Syntax highlighting in blog posts is something that has always bugged me. I don’t like JavaScript-based solutions, so I wrote a quick-and-dirty function that highlights Python code in my blog posts on the server side. The following examples are written for Django, but they should work in any Python software.

The problem

I want to use Markdown and still be able to have automatic syntax highlighting for Python code that’s inline in my blog posts. Markdown alone tends to break HTML-formatted source code (because of indentation, etc.), so a fully working solution needs a bit of tweaking.

The Solution

We’ll need:

  • Markdown (the Python implementation)
  • Pygments
  • Beautiful Soup

With these tools we’re able to build a helper function that looks for source code in a given text, highlights its syntax and applies Markdown filtering to the text without messing up the syntax-highlighted code.

The Code

My (simplified) BlogEntry model looks like this:

    import datetime

    from django.db import models


    class BlogEntry(models.Model):
        title = models.CharField(maxlength=500)
        body = models.TextField(
            help_text='Use <a href="http://daringfireball.net/projects/markdown/syntax">Markdown-syntax</a>')
        body_html = models.TextField(blank=True, null=True)
        pub_date = models.DateTimeField(default=datetime.datetime.now)
        use_markdown = models.BooleanField(default=True)

        class Admin:
            fields = (
                (None, {
                    'fields': ('title', 'body', 'pub_date', 'use_markdown')
                }),
            )

The redundant body_html field is there for performance: instead of running Markdown and syntax highlighting on the body for every request, we do it only on save. (Yes, it could also be done on the body field itself, but I prefer that the content I’m editing does not change every time I save it.)

Next the highlighting function:

        def _highlight_python_code(self):
            from pygments import highlight
            from pygments.lexers import PythonLexer
            from pygments.formatters import HtmlFormatter
            from unessanet.misc.BeautifulSoup import BeautifulSoup

            soup = BeautifulSoup(self.body)
            python_code = soup.findAll("code", "python")

            if self.use_markdown:
                import markdown

                # Swap each code block for a numbered placeholder paragraph.
                index = 0
                for code in python_code:
                    code.replaceWith('<p class="python_mark">mark %i</p>' % index)
                    index = index+1

                markdowned = markdown.markdown(str(soup))
                soup = BeautifulSoup(markdowned)
                markdowned_code = soup.findAll("p", "python_mark")

                # Put the highlighted code back in place of the placeholders.
                index = 0
                for code in markdowned_code:
                    code.replaceWith(highlight(python_code[index].renderContents(), PythonLexer(), HtmlFormatter()))
                    index = index+1
            else:
                for code in python_code:
                    code.replaceWith(highlight(code.string, PythonLexer(), HtmlFormatter()))

            return str(soup)

This function searches for <code> blocks that have a class="python" attribute. It first replaces them with placeholder text, then applies Markdown if necessary, and finally replaces the placeholders with syntax-highlighted code. It may not be the most beautiful code, but it works 🙂
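The placeholder trick is independent of the libraries involved, so here it is once more as a stripped-down sketch in plain Python 3 with only the standard library. toy_markup and toy_highlight are deliberately crude stand-ins for Markdown and Pygments, and the @@CODEn@@ marker format is invented for the example.

```python
import html
import re

CODE_RE = re.compile(r'<code class="python">(.*?)</code>', re.DOTALL)
MARK_RE = re.compile(r'@@CODE(\d+)@@')


def toy_markup(text):
    """Stand-in for Markdown: wraps *word* in <em>. Not a real Markdown parser."""
    return re.sub(r'\*(\w+)\*', r'<em>\1</em>', text)


def toy_highlight(source):
    """Stand-in for Pygments: escapes the code and wraps it in a pre block."""
    return '<pre class="highlight">%s</pre>' % html.escape(source)


def render(text):
    blocks = []

    def stash(match):
        # Remember the code and leave a numbered placeholder behind.
        blocks.append(match.group(1))
        return '@@CODE%d@@' % (len(blocks) - 1)

    protected = CODE_RE.sub(stash, text)
    marked_up = toy_markup(protected)
    # Swap each placeholder for its highlighted code.
    return MARK_RE.sub(lambda m: toy_highlight(blocks[int(m.group(1))]), marked_up)


post = 'Some *bold* claim:\n<code class="python">x = a*b*c < 1</code>'
print(render(post))
```

Without the stash step, the toy markup would turn the *b* inside the code into emphasis, which is exactly the kind of damage the placeholders are there to prevent.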

And finally the save method:

    def save(self):
        # Regenerate the cached HTML every time the entry is saved.
        self.body_html = self._highlight_python_code()
        super(BlogEntry, self).save()

The body_html field is updated on every save. On the template side you can simply use {{ entry.body_html }} without applying any additional filters.

The CSS needed for syntax coloring can be printed out with Pygments, for example like this: css = HtmlFormatter().get_style_defs('.highlight'). It may be wise to save the output and put it in a static CSS file.

Known Limitations

  • Not a bug but a feature: every code tag that has class="python" will be replaced. This was a bit annoying when trying to document this particular function…
  • Unicode strings break the highlighter. Any help on this is appreciated!

This code is published under Creative Commons License. Please share any comments! 🙂

Too Many Blogs

Same Con to Hoyci (aka Too Many Choices) has finally moved under my Unessa.net domain. Things are a bit rough around the edges (how two-point-zero of me) for now, but I think this is a good start. And yes, it’s running on Django, of course.

Maybe I’ll actually pick up writing again now that I’m no longer dependent on WP.com. I don’t know why, but for some reason I really don’t like writing anywhere where I don’t have complete control over the content. At least I have Big Plans… 🙂