The Guardian Engineering Blog - Tags are magic!

Last week we began a series of Developer Blog posts looking at how the tags on guardian.co.uk power a wide variety of features across the site and various platforms. In part two, we look at how we use folders to build these tags into a taxonomy.

Using tags as a taxonomy

We can create useful clusters of topics by dropping tags into folders. These can be used to populate pages in a variety of ways.

We can make new pages dynamically by combining tags. If you’re on the Paris page, you’ll find a block of links pointing to the most heavily populated combiner pages for “[Paris]+[tags placed in a Trip planning folder]”: Paris+Hotels, Paris+Restaurants, Paris+Short breaks, Paris+City breaks, Paris+Cultural trips and so on.

Screenshot of the guardian.co.uk Paris Travel page — The suggestions panel on our Paris page is driven by calculating popular combinations of tags. *Photograph: guardian.co.uk*

If you’re on a “Type of trip” tag page, such as Skiing, you’ll find a block of links to the combiner pages for [Skiing]+[the destinations that most frequently share a tag with skiing]: France, Switzerland, Austria etc.

Because these links are generated based on frequency of tags in common, they will change over time to reflect our output, without any editor ever having to intervene. Categorising travel keyword tags into “Types of trip”, “Trip planning” and “Places” makes obvious sense, because users tend to browse travel content with very clear goals in mind.
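
To make the frequency idea concrete, here is a minimal Python sketch, not the actual guardian.co.uk implementation. It assumes each piece of content is simply a set of tag names and a folder is a set of tags; `ARTICLES`, `PLACES_FOLDER` and `top_combiners` are hypothetical names invented for this illustration.

```python
from collections import Counter

# Hypothetical sample data: each article is just the set of tags applied to it.
ARTICLES = [
    {"Skiing", "France", "Hotels"},
    {"Skiing", "France"},
    {"Skiing", "Switzerland"},
    {"Skiing", "Austria", "Short breaks"},
    {"Skiing", "France", "Switzerland"},
    {"Paris", "Hotels"},
]

# Hypothetical folder: the set of tags an editor has dropped into "Places".
PLACES_FOLDER = {"France", "Switzerland", "Austria", "Paris"}


def top_combiners(page_tag, folder, articles, limit=3):
    """Return the folder tags that most often share an article with page_tag."""
    counts = Counter()
    for tags in articles:
        if page_tag in tags:
            # Count only tags that belong to the folder, never the page tag itself.
            counts.update((tags & folder) - {page_tag})
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]


links = [f"Skiing+{tag}" for tag in top_combiners("Skiing", PLACES_FOLDER, ARTICLES)]
print(links)  # ['Skiing+France', 'Skiing+Switzerland', 'Skiing+Austria']
```

Because the ranking is recomputed from whatever content exists, publishing more pieces tagged with both Skiing and Austria would move that link up the block with no manual step.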

Screengrab of guardian.co.uk business sectors round-up page — Tags allow us to automatically group together business stories by industry sector *Photograph: guardian.co.uk*

On news sections, where the paths users may choose to take are less predictable, we tend to use folders to generate pages rather than potential paths. A group of tags in a folder can be used to populate a roundup page such as Business sectors, Middle East or the US states. You can add a tag to as many folders as you like (Egypt appears on both the Africa and Middle East roundup pages). We also use folders to generate list pages, such as the people page, genre pages and most of our A-Zs.
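
A rough sketch of how a folder could drive both a roundup page and an A-Z list, assuming a folder is a named set of tags and that a tag may belong to several folders. The data and the function names `roundup` and `a_to_z` are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical folders: a folder is a named set of tags, and a tag may sit in several.
FOLDERS = {
    "Africa": {"Egypt", "Kenya"},
    "Middle East": {"Egypt", "Israel"},
}

# Hypothetical content: (headline, tags applied to the piece).
CONTENT = [
    ("Piece 1", {"Egypt"}),
    ("Piece 2", {"Kenya"}),
    ("Piece 3", {"Israel"}),
]


def roundup(folder_name):
    """Map each tag in the folder to the headlines carrying that tag."""
    page = {}
    for tag in sorted(FOLDERS[folder_name]):
        page[tag] = [headline for headline, tags in CONTENT if tag in tags]
    return page


def a_to_z(tags):
    """Group tag names under their initial letter for an A-Z list page."""
    index = defaultdict(list)
    for tag in sorted(tags):
        index[tag[0].upper()].append(tag)
    return dict(index)


print(roundup("Africa"))
print(roundup("Middle East"))
print(a_to_z(FOLDERS["Africa"] | FOLDERS["Middle East"]))
```

Egypt shows up in both roundups because it sits in both folders; nothing about the tag itself has to change.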

Screengrab of the guardian.co.uk 'All life and style keywords page' — The A-Z index pages on guardian.co.uk are automatically generated based on the tags belonging to specific sections *Photograph: guardian.co.uk*

Parent/child relationships

A second method of creating tag relationships is “parent/child”. If one subject always implies another (Rap ⇒ Music, Labour ⇒ Politics, Chocolate ⇒ Food and drink) then we can class the first tag as a “child” of the second. Paris (Travel) is a child of France (Travel) and therefore a “sibling” of other “children” of France. This allows us to add automated links on Travel’s Paris page to other French cities and regions.
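
The sibling lookup can be sketched in a few lines of Python. This assumes a simple child-to-parent mapping; the `PARENT` table and the `siblings` helper are hypothetical, and the Lyon and Provence entries are made up for the example.

```python
# Hypothetical parent map: child tag -> its parent tag.
PARENT = {
    "Paris (Travel)": "France (Travel)",
    "Lyon (Travel)": "France (Travel)",
    "Provence (Travel)": "France (Travel)",
    "Rap": "Music",
}


def siblings(tag):
    """Return the other children of this tag's parent, in alphabetical order."""
    parent = PARENT.get(tag)
    if parent is None:
        return []  # A tag with no parent has no siblings.
    return sorted(t for t, p in PARENT.items() if p == parent and t != tag)


print(siblings("Paris (Travel)"))  # ['Lyon (Travel)', 'Provence (Travel)']
```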

Annual events such as festivals, awards and sporting events have an undated “parent” tag (giving us a single resource users can return to year after year) and “child” tags for the individual years (made after the event for archiving). On the undated pages we pull in a “child”-driven block of links to previous years’ coverage.
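
As an illustration only, the "previous years" block could be built like this. The sketch assumes each dated child tag ends in a four-digit year, which is an assumption made for the example rather than a documented naming rule; `CHILDREN`, `previous_years` and the "Example festival" tags are hypothetical.

```python
import re

# Hypothetical tags: an undated parent and its dated children, in no particular order.
CHILDREN = {
    "Example festival": [
        "Example festival 2008",
        "Example festival 2010",
        "Example festival 2009",
    ],
}


def previous_years(parent):
    """Return (year, tag) pairs for the parent's dated children, newest first."""
    dated = []
    for child in CHILDREN.get(parent, []):
        match = re.search(r"(\d{4})$", child)  # Assumes the year is the tag's suffix.
        if match:
            dated.append((int(match.group(1)), child))
    return sorted(dated, reverse=True)


for year, tag in previous_years("Example festival"):
    print(year, tag)
```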

We also use parent/child tag relationships in our tools to propose parent tags when a child tag has been selected. So if an editor picks the “London Evening Standard” tag, the tags “Local newspapers”, “Freesheets”, “Newspapers”, “Press and publishing”, and “Media” are all proposed and can be applied with one click, saving the editor from having to mentally climb a taxonomic tree and speeding up the process of adding tags.
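
One way to sketch this proposal step is a breadth-first walk up the parent links. The exact shape of the hierarchy below is assumed for the example (the post only lists which tags get proposed, not how they are linked), and `PARENTS` and `propose_parents` are hypothetical names.

```python
from collections import deque

# Assumed hierarchy: each tag maps to its direct parents. A tag may have several.
PARENTS = {
    "London Evening Standard": ["Local newspapers", "Freesheets"],
    "Local newspapers": ["Newspapers"],
    "Freesheets": ["Newspapers"],
    "Newspapers": ["Press and publishing"],
    "Press and publishing": ["Media"],
}


def propose_parents(tag):
    """Collect every ancestor of tag, nearest first, without duplicates."""
    proposed = []
    seen = {tag}
    queue = deque(PARENTS.get(tag, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # Reached through another branch already.
        seen.add(current)
        proposed.append(current)
        queue.extend(PARENTS.get(current, []))
    return proposed


print(propose_parents("London Evening Standard"))
# ['Local newspapers', 'Freesheets', 'Newspapers', 'Press and publishing', 'Media']
```

The `seen` set matters because "Newspapers" is reachable through both "Local newspapers" and "Freesheets", and it should only be proposed once.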

While many of Travel’s parent/child and folder-driven components are hard-coded, we also embed configurable components into our page templates. Folders, parent/child relationships, single tags and tag+tag combinations can be used to pull in trails (a headline plus a short description of the piece) for:

• A single tag such as a related blog
• The page’s topic filtered by a single tag such as comment, news, features etc.
• A combination of any two tags
• Any content tagged with any of the tags in a folder
• Content tagged with parent, child or sibling tags
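
The options above reduce to two kinds of query: content carrying all of a set of tags, and content carrying any of a set of tags. A minimal sketch, assuming an in-memory list of trails; `CONTENT`, `with_all` and `with_any` are hypothetical names, not a real component API.

```python
# Hypothetical content store: (trail headline, tags applied to the piece).
CONTENT = [
    ("Trail 1", {"Paris", "Hotels"}),
    ("Trail 2", {"Paris", "Comment"}),
    ("Trail 3", {"Lyon"}),
    ("Trail 4", {"Egypt"}),
]


def with_all(*tags):
    """Trails carrying every listed tag: a single tag, or a tag+tag combination."""
    wanted = set(tags)
    return [headline for headline, applied in CONTENT if wanted <= applied]


def with_any(tags):
    """Trails carrying at least one tag: a folder, or a set of related tags."""
    wanted = set(tags)
    return [headline for headline, applied in CONTENT if wanted & applied]


print(with_all("Paris"))             # single tag
print(with_all("Paris", "Comment"))  # topic filtered by a second tag
print(with_any({"Lyon", "Egypt"}))   # any tag from a folder or sibling set
```

A parent, child or sibling block would feed `with_any` the set of related tags, so the same two functions cover all five bullet points.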

There’s enough flexibility built into these options to automate large chunks of the site, allowing us to concentrate resources on the areas that are of most importance to us and our audience.

In part three of this series, we’ll look at how we manage our tags to keep them in line with editorial style, and to keep them useful for the audience.
