Making The Long Good Read "Data Paper" Covers.


For the past six weeks I've been working on a project with the @guardian and @newspaperclub to revive @thelonggoodread in the form of a "Data Paper". I shan't go over the reasons again as they're covered well enough here: The Guardian experiments with a robot-generated newspaper with The Long Good Read, and here: On Algorithmic Newspapers and Publishing. But I did want to cover the covers (heh) quickly, because they damn well took a bit of effort, and all good efforts should have words thrown at them.

The weird thing about the covers is that they're actually a failure, in that I didn't manage to do with them what I was planning to at all. The contents of the paper are algorithmically created from a big pool of articles and variables; what I wanted the covers to be was a visualisation of the algorithm and variables that ended up producing the whole paper, while also being aesthetically appealing as a piece of cover art in itself.

Each week the plan was for the cover to be stylistically similar, holding the "brand" from week to week, but visually affected by the content. Each one could stand on its own as an image, but if a reader learnt the language of the cover they'd be able to decipher it and see at a glance what the contents of the paper were likely to be.

A quick example: pretend the paper ran for a few years. When the Olympics arrived, the stories algorithmically picked would most likely be sports related, which have their own bunch of related tags and are represented in the Guardian by the colour green. Then a while later Paris Fashion Week hits, and so there are different tags, relationships and colours. Some weeks nothing much "newsy" happens; in others there's a new breaking news story each day. There are cycles and explosions to the flow of news that I've learnt over years of measuring this stuff, to the point of it being a background hum to everything else I do.

It was that which I was hoping to bring to the covers. But alas, while I have my hands on the data, I just never found the time to make it pretty in the way I wanted.

They were also supposed to be fully automated, allowing a user to visit a "paper control panel", preview the algorithmically selected stories, decide on a cover "theme" and then press a button that would throw all the stories into Newspaper Club's ARTHR tool and go off and generate the cover.

Meanwhile, I'm happy with what I ended up with instead, especially given that the whole point of the project was to produce a paper each week in a tiny amount of time. The covers were more representative of the newspaper and all the data as a whole. There just wasn't quite enough time to get to where I wanted to be.

But anyway, a quick rundown on what we have instead.

* * *

Issue #001

Issue One

This one was a bit short notice. We had some designs for the front cover, but they were things that looked like data without actually being data, in much the same way that computer interfaces in movies look like (HUGE) interfaces but aren't actually real. As I didn't have anything lined up, I instead wanted to represent the process of putting together what's inside the paper on the outside. The original Long Good Read did a similar thing: the title/logo at the top, and headlines and short descriptions of each story in the bottom half. This was a more visual version of that.

Tools: InDesign, Photoshop.

* * *

Issue #002 & #003

The week came round too fast for the next issue, so I dug up some code I'd used previously which produced a "barcode" type of image. This time I rotated it to have more of a DNA chart look about it, as though it was showing you the underlying building blocks for life (of a newspaper).

Turns out that's not too much of a stretch: there are seven strips, each representing a 24-hour day, Monday on the left, Saturday & Sunday on the right. Midnight to midnight runs down from top to bottom. That's roughly how many articles the Guardian publishes, and when.

All Six Issues of The Long Good Read

From a code point of view, it's written in JavaScript, which calls the Guardian API and draws to a huge canvas element. I then use the Chrome extension Full Page Screen Capture to grab it and shove it into Photoshop to add the title over the top.
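The day-strip layout boils down to a simple mapping from an article's publish time to a position on the canvas. This isn't the original cover code, just a minimal sketch of that mapping; `STRIP_WIDTH` and `CANVAS_HEIGHT` are made-up values.

```javascript
// Hypothetical sketch: day of week picks the strip (Monday leftmost),
// time of day picks the vertical position (midnight top, midnight bottom).
const STRIP_WIDTH = 100;    // assumed pixel width of one day-strip
const CANVAS_HEIGHT = 1400; // assumed canvas height

function stripPosition(publishedAt) {
  const d = new Date(publishedAt);
  // getUTCDay(): 0 = Sunday … 6 = Saturday; shift so Monday becomes strip 0
  const strip = (d.getUTCDay() + 6) % 7;
  // seconds since midnight, scaled to the full canvas height
  const secs = d.getUTCHours() * 3600 + d.getUTCMinutes() * 60 + d.getUTCSeconds();
  const y = Math.round((secs / 86400) * CANVAS_HEIGHT);
  return { x: strip * STRIP_WIDTH, y };
}
```

Each article then becomes one small coloured mark at `(x, y)` on its day's strip.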

For Issue #003 I added the colour of each section the articles were published into. I also started messing around with the title/logo. Normally you'd take a while establishing the logo before you mess around with it, but given our compressed timeline it seemed worth just getting on with blowing it away.

Tools: Guardian API, JavaScript.

* * *

Issue #004

I had a bit more time to produce the cover this time and had been wanting to throw slabs of colour at it to see how they looked.

Issue Four

This is probably the most popular cover as it happens, misusing treemaps to show the number of words published by section over the previous seven days. It's one of those covers that works well once but probably doesn't make much sense to use regularly.

I didn't have a chance to write any custom code for this one, instead using Many Eyes to create the treemap, which was then taken into Photoshop again and essentially drawn on top of. Once more the Guardian API was used to grab the word counts for each section, spitting out a CSV file used as the source data.
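The data prep for this one is just a tally-and-dump. A rough sketch, not the code I actually used: the article objects here mimic the shape of Guardian API results (`sectionName`, `fields.wordcount`), but the aggregation and CSV output are my own invention.

```javascript
// Hypothetical sketch: total up word counts per section, then emit CSV
// suitable for feeding into a treemap tool.
function wordCountsBySection(articles) {
  const totals = {};
  for (const a of articles) {
    // the API returns wordcount as a string, so parse it
    const words = parseInt(a.fields.wordcount, 10) || 0;
    totals[a.sectionName] = (totals[a.sectionName] || 0) + words;
  }
  return totals;
}

function toCsv(totals) {
  const rows = Object.entries(totals).map(([section, n]) => `${section},${n}`);
  return ["section,words", ...rows].join("\n");
}
```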

Tools: Many Eyes, Guardian API, JavaScript & Photoshop.

* * *

Issue #005

Probably my favourite cover, and the closest to what I was aiming for: a semi-abstract design that's actually based on real, hard data.

Issue Five

This was playing with some graphing code (and d3js) I'd already written, which calculates the strength of connections, and the clusters, between tags used on the "top" stories over the last seven days. I used d3 to position all the nodes, then dumped out the positions and sizes once it had finished. I then took those positions and re-rendered the image at a much higher resolution with ProcessingJS, because I wanted gradated lines, which I couldn't hack into d3's SVG output.
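The "strength" between tags is essentially co-occurrence. Here's a minimal sketch (assumed, not my original graphing code) of that step: count how often each pair of tags appears together on a story, producing the weighted links a d3 force layout could then position.

```javascript
// Hypothetical sketch: build weighted tag-pair links from a list of stories,
// where each story has a .tags array of tag names.
function tagLinks(stories) {
  const counts = new Map();
  for (const story of stories) {
    // sort tags so each pair gets one canonical key regardless of order
    const tags = [...story.tags].sort();
    for (let i = 0; i < tags.length; i++) {
      for (let j = i + 1; j < tags.length; j++) {
        const key = `${tags[i]}|${tags[j]}`;
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  // d3-style links: { source, target, strength }
  return [...counts.entries()].map(([key, strength]) => {
    const [source, target] = key.split("|");
    return { source, target, strength };
  });
}
```

Frequently co-occurring tags end up with heavier links, and the force layout pulls them into the clusters that give the cover its shape.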

Out of all the designs this is the one closest to being totally automated with the code creating and outputting a high-resolution image.

Oh, and at some point I need to add the tag names back onto the image.

Tools: Guardian API, JavaScript, d3js, ProcessingJS.

* * *

Issue #006

Similar to #004, but using article counts instead of word counts.

Issue Six

I actually spent a bit more time on this one, poking at something I'd only read about: getting Processing, via various messy JavaScript hackery, to play with Illustrator. I managed to get it to the point of drawing the correctly sized and coloured circles, and then ran out of development time.
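The sizing itself is the only maths involved: if a circle's area is to be proportional to a section's article count, the radius has to scale with the square root of the count. A sketch of that, nothing more; `MAX_RADIUS` is a made-up constant, not from the actual cover.

```javascript
// Hypothetical sketch: radius per section such that circle AREA is
// proportional to that section's article count.
const MAX_RADIUS = 200; // assumed radius of the largest circle

function circleRadii(countsBySection) {
  const max = Math.max(...Object.values(countsBySection));
  const radii = {};
  for (const [section, count] of Object.entries(countsBySection)) {
    // area ∝ count  =>  radius ∝ sqrt(count)
    radii[section] = MAX_RADIUS * Math.sqrt(count / max);
  }
  return radii;
}
```

Using the raw count as the radius instead would exaggerate the big sections, since area grows with the square of the radius.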

Then, because I don't really use Illustrator and do use Photoshop, I pulled it into PS to move them around and put the text on top.

On the automation side, I'd managed to get enough done to see that getting Processing to build the whole file, text and everything, in Illustrator would be possible… just not by the end of the day.

Tools: Guardian API, JavaScript, Processing, Illustrator, Photoshop.

* * *

Lessons Learnt

Mainly that a tool which generates something for you automatically in a matter of minutes or seconds takes days to create.

That and weeks come around really quickly. Each week I thought "I know, I'll have some time to code a thing before next week, then it'll be easy" and before you (i.e. me) know it, next week has arrived and I'm all "Agggh, it's newspaper making day again today, what am I going to do?". Repeat.

And indeed repeat as I get to do it all over again next year.

* * *

Photos for issues #001 & #004 by Newspaper Club used under a Creative Commons, Attribution NonCommercial NoDerivs License.

* * *