The 8 Commandments Of LLMs

Basic things (and some vocabulary) to know when working with LLM tools.

1. The model will lie to you

When an LLM doesn’t know the answer to your question, it can –and will– straight up lie to you. You can design elaborate prompts to try to prevent this, but it will still happen.

2. Most LLM tools don’t know how to browse the Web (without special plugins)

One of the most common things that trips people up when using LLM tools is that they will convincingly seem to summarize a Web page or pull data from a URL, even though they usually can’t browse the Web. In other words, most answers based on URLs you give them will be 100% made up. If the tool doesn’t have a plugin that is specifically designed to read URLs, assume the model can’t browse the Web.

3. When sharing an answer (or a screenshot), always include the full prompt

Sharing interesting answers is usually useful only if you remember to share the full query and prompt as well. If you share a screenshot, make sure the full question (and the prompt if any) is displayed as well so that the answer can be replicated.

4. Don’t ask LLMs to do calculations

As the name implies, LLMs are based on (natural) languages. As powerful as they are with language, most are absolutely terrible with algebra. Do not try to do maths with LLM tools, and if you for some reason need to, force the model to show as many intermediate steps as possible.

5. Instead of a long and complex query, use several shorter ones

Splitting a complex task into several steps usually yields much better results. In general, if you force the model to work in steps, you can guide it more easily and hence get much better results.
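To make the idea of working in steps concrete, here is a minimal Python sketch. It assumes you have some function that sends a prompt to a model and returns the answer as text; `ask`, `run_steps` and `fake_ask` are made-up names for illustration, not part of any real tool. Each short instruction is sent as its own query, and the answer is fed into the next one.

```python
from typing import Callable, List


def run_steps(ask: Callable[[str], str], steps: List[str], text: str) -> str:
    """Run each instruction as its own query, feeding every answer into the next step."""
    current = text
    for instruction in steps:
        prompt = f"{instruction}\n\n{current}"
        current = ask(prompt)
    return current


def fake_ask(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; it just echoes the instruction line."""
    instruction = prompt.split("\n", 1)[0]
    return f"[answer to: {instruction}]"


steps = [
    "List the main claims in the text below.",
    "Pick the three most important claims from the list below.",
    "Write a two-sentence summary based on the claims below.",
]
result = run_steps(fake_ask, steps, "Some long article text.")
print(result)
```

With a real model behind `ask`, you can inspect (and correct) the intermediate answer after each step instead of hoping one giant prompt gets everything right.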

6. The output of the tool is fully based on the data it was trained on

All LLM models are trained on a known dataset and they literally don’t know anything that is not in that dataset. (They might hallucinate answers, though!)

A related issue is the knowledge cutoff. For example, ChatGPT’s knowledge cutoff is September 2021, so it does not know about anything that happened after that date.

7. Mastering these tools takes time and effort

Like any complex tool, LLMs take lots of practice to master. A whole new field of expertise called Prompt Engineering has been born from the experiments and learnings of the early pioneers working on these tools. Some speculate that in the near future talented prompt engineers will be as sought after as talented developers are today.

If you want to learn and keep up, you need to put the time in and start learning.

8. Learn the basic vocabulary

In order to be able to communicate with others and to learn more, you need to know at least the very basic vocabulary. Here are some basic terms to get you started:

AI (Artificial Intelligence) is an umbrella term meaning everything from Siri to ChatGPT. (Think “motorised transport”)
LLM (Large Language Model) is a type of machine learning algorithm for solving problems using computational linguistics and probabilities. (Think “a car” vs a train or a motorcycle)
ChatGPT is a proprietary tool by OpenAI, first published in November 2022. (Think “Porsche 911 GT”)

More advanced terms:

The trained LLM algorithms are called models. Some well-known ones are GPT-3, GPT-4 and Claude.
Prompts are the initial instructions sent to the LLM model when you query it. They can be complex and lengthy, or something very simple like “You are a helpful assistant”. (See Awesome ChatGPT Prompts.)
When the LLM model makes stuff up and lies to you, it’s called a hallucination. All models hallucinate, and often they can be very convincing.
Temperature is a setting of an LLM that describes, in layman’s terms, how “creative” the output is. The scale typically goes from 0 (no creativity) to 1 (very much creativity), though some tools accept higher values.
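For the curious, temperature can be illustrated with a few lines of Python. This is a simplified sketch of the common technique (dividing the model's raw scores by the temperature before turning them into probabilities), not the exact implementation of any particular tool. The function name and the example scores are made up, and treating 0 as "always pick the top choice" is an assumption of this sketch.

```python
import math
from typing import List


def softmax_with_temperature(scores: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; lower temperature favours the top score more."""
    if temperature < 0:
        raise ValueError("temperature must not be negative")
    if temperature == 0:
        # Assumption of this sketch: temperature 0 means "always pick the best option"
        best = scores.index(max(scores))
        return [1.0 if i == best else 0.0 for i in range(len(scores))]
    scaled = [s / temperature for s in scores]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate words
for t in (0, 0.2, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(scores, t)])
```

The lower the temperature, the more of the probability lands on the top candidate, so the output becomes more predictable; higher values spread the probability out and make less likely words show up more often.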

Demoscene Nostalgia

Last night in my weekly 90s Night (Finnish) radio show I played the soundtrack of the 1993 demo Second Reality, which got some of the listeners chatting about their demoscene memories. I was surprised to learn that the code for the demo has been released on GitHub and there’s also a very cool deep dive review of it by Fabien Sanglard.

I sometimes have these proud moments when I hack on something small and obscure on my own, but it would be really cool to do something collaborative and creative like this with other people. I can’t imagine how exciting and fun it must have been to work on something like Second Reality back in 1993 with primitive tools and without even source code management. We have a Finnish saying for doing something just for the love of it: rakkaudesta lajiin.

The Value of Knee-Jerk Reactions

The big news today was that Twitter has accepted Elon Musk’s offer. Twitter doesn’t make it easy, but I’ve always tried to avoid only following people who think exactly like I do. It seems that about 50% of my timeline is feeling really good about the news and the other 50% is talking about migrating to Mastodon or crawling back to the IRC cave with the other luddites. (I’m permanently in that lovely cave myself so I can make fun of these luddites!)

It’s only been a few months since I last watched in disbelief as many people were loudly leaving Spotify because some big corporations and elite (even the White House!) were actively spreading disinformation about Joe Rogan in order to get him cancelled. Full disclosure: I listen to about 50% of Joe Rogan episodes. I strongly disagree with many of his views but I have to say that almost everything that was said about him was either wrong or purposefully quoted in a misleading way. It’s actually super easy to see if you just focus a bit; one side is honest (albeit sometimes intentionally trolling and/or attention-seeking), the other side twists the message and uses traditional propaganda tactics. It’s really very transparent.

I’ll take misinformation over disinformation or plain old censoring every day of the week.

The “funny” thing (I’m not sure what would be the right word here because it’s so tragic) with the Spotify mass migrations was that many people –including Neil Young– jumped over to Amazon, which by almost any measure is much worse than Spotify! Among other things they treat their workers horribly and they go to extremes to avoid paying taxes. In my opinion, at the end of the day, the Washington Post (owned by Jeff Bezos) does much more controlled damage to society than Spotify does.

But back to Twitter and Mastodon. If Elon Musk buying Twitter causes some people to jump off it, I think it only does good for the whole Web. It’s sad that many still don’t seem to know or understand how easy it is to use multiple sites and apps instead of just hanging out at one giant one. It’s good to have alternative communication platforms and different views about things. If these knee-jerk reactions help people find their way out of the great big walled gardens, it’s a net positive for the whole World Wide Web. In the end, even the Spotify migration was probably good for the music industry.

Lastly, I don’t want to come off as being somehow different or better than anyone else. I have my own opinions and pet peeves which get me going. Not too long ago I reacted pretty strongly when 1Password announced some kind of partnership with some crypto thing (as in -coin, not -graphy as it should have been). That was the last straw for me: I cancelled my account just minutes after reading the news and then virtue signaled about it on Twitter. I’m just like everyone else, but I think I just value free speech more than most.

Hello, WordPress

This blog has seen quite a lot of technical changes since it first launched in 2006. It first ran on WordPress.com, I then migrated it to a homemade Django app, then years later to a Gridsome app, and now we’re back to (this time self-hosted) WordPress.

The problem with custom blog engines is that they suck. They’re fine for that “12 minute blog” demo but not for much past that. My Gridsome blog deteriorated in a few months to a state where I couldn’t get the development site running at all (and I’m an experienced developer), so I decided that it was time to find something that would allow me to write more. So here we are, again, 16 years later.

Hello, WordPress!

Unicode and Django RSS Framework

Unicode issues are the most annoying thing about Django. Here is one workaround for a bug in the Django RSS framework.

I have migrated my Ma.gnolia bookmarks and Flickr photos into this site. Both services have tags that contain what Django devs call “funky characters”, that is, non-ASCII characters. Getting these into the database unchanged was a pain in the butt in itself, but after that, I wanted to make my own feeds for both Ma.gnolia and Flickr tags with Django’s wonderful syndication framework. Turns out that the framework doesn’t play well with URLs that have funky characters.

The problem is in the feed class, which automatically adds the appropriate ‘http://’ prefix to any URLs that need one. On creation, the feed object is passed the request object, whose path attribute is unencoded, and this throws an uncaught exception when there are funky characters in the URL. Adding the site domain to the path before passing it to the feed class circumvents the problem.

This is my (stripped down) feeds view:

    from django.contrib.syndication.views import feed

    def my_feeds(request, url):
        from unessanet.links.feeds import *
        from unessanet.photos.feeds import *

        unessanet_feed_dict = {
            'linkit': LatestBookmarks,
            'valokuvat': LatestPhotos,
            'valokuvatagi': PhotosForTag,
        }

        # Fixes a bug in syndication framework
        request.path = 'http://www.unessa.net' + request.path
        return feed(request, url, unessanet_feed_dict)

Now the feeds render properly. Almost.

A feed with an unquoted URL does not validate. It may work, but it doesn’t validate. To fix this, just escape the URL with the quote function found in the urllib module.

This is my feed class for photo tags:

    class PhotosForTag(Feed):

        description_template = "feeds/latest_photos_description.html"
        title_template = "feeds/latest_photos_title.html"

        def get_object(self, bits):
            if len(bits) != 1:
                raise ObjectDoesNotExist
            tag = bits[0]
            return PhotoTag.objects.get(tag=tag)

        def title(self, obj):
            return "Unessa.net Valokuvat: %s" % obj.tag

        def link(self, obj):
            # Quote the url so the feed validates
            from urllib import quote
            return 'http://www.unessa.net/valokuvat/tagit/%s/' % quote(obj.tag)

        def description(self, obj):
            return "Unessa.net Valokuvat: %s" % obj.tag

        def items(self, obj):
            return obj.flickrphoto_set.filter(is_public=True)[:10]

Note that the quoted part of the URL must be unicode or otherwise you’ll end up with a broken URL. But after these fixes, the feeds work as expected, with or without funky characters.
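For comparison, here is a small standalone sketch of the same quoting step on Python 3, where the function lives in urllib.parse instead of urllib. The helper name `tag_url` and the example tags are made up for illustration. Passing `safe=""` is a choice made in this sketch so that a slash inside a tag is escaped too and cannot break the path.

```python
from urllib.parse import quote

BASE_URL = "http://www.unessa.net/valokuvat/tagit/%s/"


def tag_url(tag: str) -> str:
    """Build a feed link for a tag, percent-encoding any funky characters as UTF-8."""
    # safe="" also escapes "/" so a tag always stays a single path segment
    return BASE_URL % quote(tag, safe="")


print(tag_url("mökki"))
print(tag_url("kesä 2007"))
```

Plain ASCII tags pass through unchanged, so the same function can be used for every tag.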

I really, really hope that Django will be converted to use nothing but unicode strings before the long-awaited 1.0 release.

My Space

On the matter of spaces versus tabs our BDFL says:

If it uses two-space indents, it’s corporate code; if it uses four-space indents, it’s open source. (If it uses tabs, I didn’t write it! 🙂

I hate spaces (within source code) for some reason. Fortunately TextMate makes working with tabs and spaces trivial. I can easily convert spaces to tabs and vice versa so this really isn’t a problem with any sane file that uses either consistently.
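The conversion itself is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not how TextMate does it; the function names and the default width of four spaces are assumptions. Only the leading indentation is touched, so spaces inside the code stay as they are.

```python
def spaces_to_tabs(line: str, width: int = 4) -> str:
    """Replace each full group of `width` leading spaces with one tab."""
    stripped = line.lstrip(" ")
    leading = len(line) - len(stripped)
    tabs, leftover = divmod(leading, width)
    return "\t" * tabs + " " * leftover + stripped


def tabs_to_spaces(line: str, width: int = 4) -> str:
    """Replace each leading tab with `width` spaces."""
    stripped = line.lstrip("\t")
    leading = len(line) - len(stripped)
    return " " * (leading * width) + stripped


source = "        return x + 1"
tabbed = spaces_to_tabs(source)
print(repr(tabbed))
print(repr(tabs_to_spaces(tabbed)))
```

As long as a file uses one style consistently, converting there and back gives the original line again.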

But what I’d really like to see is a control character for indenting source code. It shouldn’t be too hard to implement in any modern editor and it would eliminate all (or at least most of) the whitespace hassle if it were mandatory. And as an added bonus, it would force the use of unicode.