Sooner or later, most Python programmers need to save some data, usually program state, user settings and the like. One way of doing this is the standard pickle/cPickle approach, which lets you save (serialize) and retrieve the state of almost any object to and from a file. The drawback of this approach is that the resulting data file is in a binary format. And if you are like me, you want your state data in an easily hackable and debuggable format, that is, plain text.

Another approach is to use a traditional Windows INI-style file, with parameters grouped into sections. This is a reasonably good approach, but it cannot handle nested data, i.e. sections within sections, and nesting is mighty handy if you want to serialize, for instance, a dictionary.
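To make the INI limitation concrete, here is a minimal sketch using Python 3's standard configparser module. The settings text and the names INI_TEXT and settings are made up for illustration. Whatever you put in the file, the result is always exactly two levels deep (section, then option), and every value comes back as a string.

```python
import configparser
import io

# Hypothetical settings, used only for illustration.
INI_TEXT = """
[window]
width = 800
height = 600

[user]
name = guest
"""

parser = configparser.ConfigParser()
parser.read_file(io.StringIO(INI_TEXT))

# Two levels only: section -> option -> string value.
settings = {section: dict(parser[section]) for section in parser.sections()}
print(settings)
```

There is no way to put a section inside [window], so a dictionary nested deeper than this has to be flattened by hand.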

An XML file would seem like the obvious solution, being both text-based and capable of handling nested data. For me, however, there are two issues with it:

  1. XML is not the easiest format to read if your settings file gets big
  2. Although Python provides out-of-the-box support for XML hacking, there is no direct support for saving and reading the aforementioned dictionary. There are several recipes for doing this on the net, but it is still something you will have to get working yourself.
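To give an idea of what the second point means in practice, here is a rough sketch of such a recipe built on the standard xml.etree.ElementTree module. The helpers dict_to_element and element_to_dict are names I made up, not part of any library, and the sketch assumes that every key is a valid XML tag name and that every leaf value is a non-empty string.

```python
import xml.etree.ElementTree as ET

def dict_to_element(tag, value):
    """Recursively turn a nested dict of strings into an XML element."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(dict_to_element(key, child))
    else:
        element.text = str(value)
    return element

def element_to_dict(element):
    """Inverse of dict_to_element: leaf elements become their text."""
    children = list(element)
    if not children:
        return element.text
    return {child.tag: element_to_dict(child) for child in children}

# Hypothetical settings, used only for illustration.
settings = {'window': {'width': '800', 'height': '600'},
            'user': {'name': 'guest'}}

xml_text = ET.tostring(dict_to_element('settings', settings), encoding='unicode')
restored = element_to_dict(ET.fromstring(xml_text))
print(xml_text)
```

It works, but it is code you have to write, test and maintain yourself, and it already loses type information, since numbers come back as strings.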

The solution (well, one of several)

There is, however, another solution which comes with batteries included: YAML, the lesser-known poor cousin of XML.

According to the not-exactly-Web 2.0-like homepage for YAML, it is “a human friendly data serialization standard for all programming languages”. Well, I don’t know how well YAML works for other programming languages, but the YAML implementation for Python (not surprisingly named PyYAML, and installed separately rather than shipped with Python) works great. The latest version at the time of writing is v3.05.

YAML is a much more condensed and readable data format than XML, while retaining most of the benefits of XML, which makes it ideal for settings and profile files.

A simple example

The following example is a simple YAML document describing a tree structure.

# tree format
treeroot:
    branch1:
        name: Node 1
        branch1-1:
            name: Node 1-1
    branch2:
        name: Node 2
        branch2-1:
            name: Node 2-1

As you can probably see, YAML uses indentation to specify levels, much like Python, and thus does not need the opening and closing tags of XML.

To read this data into a Python program, simply execute the following code (we assume that the YAML data is kept in a file called ‘tree.yaml’):

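import yaml

# safe_load only builds plain Python types (dicts, lists, strings, numbers).
# A bare yaml.load(f) without a Loader argument fails on PyYAML 6 and later.
with open('tree.yaml') as f:
    dataMap = yaml.safe_load(f)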

The variable ‘dataMap’ now contains a dictionary with the tree data. If you print ‘dataMap’ using the pprint module, you will get something like the following:

{'treeroot': {'branch1': {'branch1-1': {'name': 'Node 1-1'},
                          'name': 'Node 1'},
              'branch2': {'branch2-1': {'name': 'Node 2-1'},
                          'name': 'Node 2'}}}
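Since the result is an ordinary nested dictionary, you get at the values with plain indexing. Here is a small sketch. To keep it self-contained it parses the same document from a string instead of from ‘tree.yaml’, and it uses yaml.safe_load, which is my choice here. TREE_YAML and walk are names made up for illustration.

```python
import yaml

TREE_YAML = """
# tree format
treeroot:
    branch1:
        name: Node 1
        branch1-1:
            name: Node 1-1
    branch2:
        name: Node 2
        branch2-1:
            name: Node 2-1
"""

dataMap = yaml.safe_load(TREE_YAML)

# Plain dictionary indexing reaches any node.
leaf_name = dataMap['treeroot']['branch1']['branch1-1']['name']

def walk(node, path=()):
    """Yield (path, name) for every node that has a 'name' entry."""
    for key, value in node.items():
        if key == 'name':
            yield '/'.join(path), value
        elif isinstance(value, dict):
            yield from walk(value, path + (key,))

names = dict(walk(dataMap))
print(leaf_name)
print(names)
```

No special YAML objects are involved, so anything that works on a dictionary works on the loaded data.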

Saving data

So now we have seen how to get data into our Python program. Saving data is just as easy:

# The with block closes the file even if dump raises an error.
with open('newtree.yaml', 'w') as f:
    yaml.dump(dataMap, f)
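A quick way to convince yourself that nothing gets lost is to save a dictionary and read it straight back. The sketch below makes a few assumptions of my own: it writes to a temporary directory so no files are left behind, it uses a shortened version of the tree, and it swaps in yaml.safe_dump and yaml.safe_load, which stick to plain Python types.

```python
import os
import tempfile

import yaml

# A shortened version of the tree, used only for illustration.
dataMap = {'treeroot': {'branch1': {'name': 'Node 1'},
                        'branch2': {'name': 'Node 2'}}}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'newtree.yaml')

    with open(path, 'w') as f:
        # default_flow_style=False asks for the indented block layout
        # on every level instead of inline {key: value} notation.
        yaml.safe_dump(dataMap, f, default_flow_style=False)

    with open(path) as f:
        text = f.read()

restored = yaml.safe_load(text)
print(text)
```

If restored equals dataMap, the round trip worked, and the printed text is what you would be editing by hand in the settings file.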

Summing up

YAML is an excellent data format for Python that can be used for a wide range of purposes, including application configuration files, saved program state, user settings, profiles and much more.