Getting Philosophical About a Line Chart. Data Visualisation from Scratch P.3

Today’s “from scratch” example with D3 is a must-have element of any data visualisation portfolio: a line chart. Line charts are great for visualising changes in data over time. Just as in the previous posts in the series, my visualisation is a variation of a piece of code I found on the web. I started with a basic template created by Mike Bostock and then reworked some of its elements to boost its usability and readability. As with the previous examples, all the code can be downloaded, reused and adjusted, and it scales up and down to include extra data series or to remove one.

The small victory of this week is that I’ve moved the blog from wordpress.com to a self-hosted domain. Now I can paste my JavaScript snippets directly into the post area. No more pixelated screenshots!

Now back to the line chart: in general I’m happy with the final effect, though there are things I feel need to be revisited on a per-project basis (e.g. highlighting multiple data points at once). Here’s a review of the things I have adjusted in the original code:

Colour scheme & shape of lines. I’ve used an accessible colour palette for my data series. Since it’s not wise to rely solely on colour to distinguish between the series, I’ve added a variation of the stroke type too: that’s the dashed line on the graph. In the code it’s been baptised a “trendline”. A dashed stroke looks very nice for trendlines as it sort of blends in behind the main series. Changing the stroke type in multi-series graphs is actually quite easy to do: first you select all the line elements in the svg, then you use each element’s index to pick the one you want to modify and assign it a class. Then you style that class however you like in the CSS file.
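The index-then-class idea can be sketched roughly as below. This is a minimal illustration rather than the exact code behind the chart: the helper name `lineClass`, the `path.line` selector, the index `2` and the dash pattern are all assumptions made for the example.

```javascript
// Hypothetical helper: picks the class for the series at position `index`.
// `trendlineIndex` is the position of the series that should be dashed.
function lineClass(index, trendlineIndex) {
  return index === trendlineIndex ? "line trendline" : "line";
}

// With D3 loaded in the page, the helper could be applied like this
// (the "path.line" selector and the index 2 are assumptions):
//   d3.selectAll("path.line").attr("class", (d, i) => lineClass(i, 2));
// and the stylesheet would then carry the dash pattern:
//   .trendline { stroke-dasharray: 6 4; }

// Quick check without a browser: classes for three series, the third dashed.
console.log([0, 1, 2].map((i) => lineClass(i, 2)));
```

Keeping the dash pattern in the CSS file means the trendline’s look can be changed without touching the script.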

Y axis starting point. At first, I adjusted the Y axis to start at 0. It felt more intuitive because I remembered that in maths classes we’d always do it this way. But then I read this thread and changed my mind: the point of a line chart is to show variation over time, not the values’ position relative to zero, which would be the point of a bar chart, for example. With values higher than the ones I’ve used, a chart starting at 0 would be unnecessarily squashed at the top, so any variation would be concealed for the sake of showing the meaningless distance from 0. The only thing I’ve adjusted is the offset from the x axis, to remove any doubt about the lines reaching 0.
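One possible way to get a Y axis that follows the data and still keeps the lowest point clear of the x axis is to pad the domain a little on both ends. The sketch below is only an assumption about how that could be done: `paddedDomain` and the 10% padding are made up for the example, and the values are not the chart’s real data.

```javascript
// Hypothetical helper: returns [low, high] for the Y scale, based on the
// data's own range instead of starting at 0. `padFraction` adds breathing
// room at both ends so the lowest point does not sit on the x axis.
// Assumes `values` is a non-empty array of numbers.
function paddedDomain(values, padFraction = 0.1) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const pad = (max - min) * padFraction;
  return [min - pad, max + pad];
}

// With D3 (v4 or later) the result could feed a linear scale, e.g.:
//   const y = d3.scaleLinear().domain(paddedDomain(values)).range([height, 0]);

// Illustrative values only.
console.log(paddedDomain([100, 120, 150]));
```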

Lines from curved to straight. That’s a matter of preference and of the audience the graph is aimed at. I’ve decided in favour of straight lines, as curves can suggest there are data points in between where there are none. A less pedantic viewer than me might appreciate a friendlier chart with smoothed interpolation between the data points.
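To make the difference concrete, here is a small sketch of what “straight” means for the path: every data point becomes a vertex and nothing is drawn in between except straight segments. The function `straightPath` is a hypothetical stand-in written for this example; in D3 itself the choice is made with the line generator’s `.curve()` setting.

```javascript
// Hypothetical helper: builds an SVG path string with straight segments
// between consecutive [x, y] pixel positions ("M" = move to, "L" = line to).
function straightPath(points) {
  return points
    .map(([x, y], i) => (i === 0 ? "M" : "L") + x + "," + y)
    .join("");
}

// With D3 (v4 or later) the equivalent choice is a curve factory:
//   d3.line().curve(d3.curveLinear)     // straight segments
//   d3.line().curve(d3.curveMonotoneX)  // one of the smoothed alternatives

// Illustrative pixel positions only.
console.log(straightPath([[0, 10], [50, 40], [100, 20]]));
```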

Point labels, point size transition & grid lines. Here I had a dilemma: I didn’t know the best way of annotating the data points. I liked the idea of showing three points at once from an example I found on StackOverflow. It’s excellent for comparing multiple values across the series. However, if the graph tracks seasonality, multiple data points highlighted at once could get in the way of studying the cycle variation. I first imported the code, only to realise that its simplicity is its enemy too. Let me explain: the graph tracks the mouse movement to compute the x & y values. It works great in series of many data points with little space in between. In my case I wasn’t comfortable with the scale of interpolation it produced: I was given too many data points where in the dataset there were none. Perhaps changing the code to display only the actual data points but keep the vertical line could be an alternative. In the end I’ve opted for grid lines for visual guidance, and added a per-point tooltip and size emphasis for displaying the exact value of the highlighted datum. The tooltip is appended to an invisible circle with a radius set to 10px, which makes the mouse-over area larger for hovers on desktops and for taps on mobile screens.
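Both ideas from this paragraph can be sketched in a few lines: snapping the mouse position to the closest real data point instead of interpolating, and treating a circle of radius 10 around a point as its hover area. This is a rough illustration under assumptions, not the code used in the chart: the names `nearestPoint` and `isInHitArea` are made up, and the points are assumed to be already converted to pixel coordinates.

```javascript
// Hypothetical helper: returns the actual data point whose x is closest to
// the mouse, so no in-between values are ever shown.
// Assumes `points` is a non-empty array of { x, y } pixel positions.
function nearestPoint(points, mouseX) {
  return points.reduce((best, p) =>
    Math.abs(p.x - mouseX) < Math.abs(best.x - mouseX) ? p : best
  );
}

// Hypothetical helper: true when the mouse is inside the invisible circle
// around a point. The default radius of 10 mirrors the enlarged hover area.
function isInHitArea(point, mouseX, mouseY, radius = 10) {
  return Math.hypot(point.x - mouseX, point.y - mouseY) <= radius;
}

// Illustrative pixel positions only.
const points = [{ x: 0, y: 50 }, { x: 40, y: 30 }, { x: 80, y: 60 }];
const hovered = nearestPoint(points, 47);
console.log(hovered, isInHitArea(hovered, 47, 33));
```

In the chart itself the browser does the hit-testing on the invisible circle; the second function only spells out what that larger target amounts to.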

I liked the series’ names appended to the lines: these labels are unmissable. A legend would work as well, but it wouldn’t be as effortless to read. First a legend has to be located on the screen, then the reader has to go back and forth a few times to decode the graph’s colours based on the legend information. This is especially tiresome and prone to decoding errors if there are many series (and many colour values).
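Placing a name at the end of its line boils down to taking the last data point of the series and converting it to pixel coordinates. Here is a minimal sketch of that step, under assumptions: the `endLabel` helper, the `{ id, values }` shape of a series and the 5px gap are invented for the example, and the scales are passed in as plain functions.

```javascript
// Hypothetical helper: computes where the series name should be drawn,
// just to the right of the last data point.
// `series` is assumed to look like { id: "name", values: [{ x, y }, ...] };
// `xScale` and `yScale` map data values to pixels (a D3 scale would fit).
function endLabel(series, xScale, yScale, gap = 5) {
  const last = series.values[series.values.length - 1];
  return { text: series.id, x: xScale(last.x) + gap, y: yScale(last.y) };
}

// Illustrative data and stand-in scales only.
const series = { id: "Series A", values: [{ x: 1, y: 10 }, { x: 2, y: 30 }] };
const label = endLabel(series, (v) => v * 100, (v) => 200 - v);
console.log(label);
```

The resulting `x` and `y` could then position a text element next to the line, for instance through a `translate` transform.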

After this whole philosophical detour, here’s my graph in its full glory:

 

-Eve-
