You can now take a look at my D3 projects at bl.ocks.org! I will soon add some more visualisations in version 5 of the D3 library.
Spoiling you as usual, I have another exciting D3 example for today: merging historical maps! I’ve been meaning to cover this topic ever since I developed a similar project for my Master’s thesis 3 years ago. Merging maps is a worthy challenge for every D3 enthusiast because it requires a number of things to be aligned: the data format should be compatible with D3.js, and the maps should be drawn in the same projection and cover the same time period, as country or regional boundaries are far from static. I will demonstrate the idea by mashing up two maps: a digitised map of the Second Polish Republic from 1934 and a map of European boundaries from 1939.
Off topic: If you sent me an email via the contact form in the past and I seem to have ignored you – believe me, I haven’t. It was the Google Mail filter that treated all incoming messages from the blog as spam (until a week ago). If yours was a fun or supportive message, please resend! I am also on Twitter as @EveTheAnalyst
A pretty specific title, huh? The versioning is key in this map-making how-to. D3.js version 5 has gotten serious with the Promise class, which resulted in some subtle syntax changes that proved big enough to cause confusion among both the D3.js old dogs and the newcomers. This post guides you through creating a simple map in this specific version of the library. If you’d rather dive deeper into the art of making maps in D3, try the classic guides produced by Mike Bostock. Continue reading “Making a Map in D3.js v.5”
Point-in-polygon is a textbook problem in geographical analysis: given a list of geocoordinates, return those that fall within the boundary of an area. You could feed the algorithm a list of all European cities and it will recognise which of them belong to Sri Lanka and which to a completely random shape you drew on planet Earth. It applies to many scenarios: analyses that aren’t based on administrative boundaries, situations in which polygons change over time, or problems that aren’t geographical at all, like computer graphics. Not so long ago, I turned to point-in-polygon to generate a set of towns and villages to plot on a map of Poland from 1933. Such a list has not been made available on the web and I wasn’t super keen on typing out thousands of locations. Instead, I used that mathematical cookie-cutter to extract only those locations from today’s Poland, Ukraine, Belarus, and Russia that fell within the boundaries of interwar Poland. In this post I will show how to perform a point-in-polygon analysis in R and possibly automate a significant chunk of data preparation for map visualisations. Continue reading “R | Point-in-polygon, a mathematical cookie-cutter”
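To make the cookie-cutter idea concrete, here is a minimal sketch of one common way to test point-in-polygon, the ray-casting rule. It is written in plain Python rather than R, so it is not the method used in the post itself, and the boundary and place names are made up purely for illustration. It also treats coordinates as flat x/y values, which is an assumption that only holds reasonably well for small areas.

```python
def point_in_polygon(point, polygon):
    """Ray casting: return True if point (x, y) lies inside polygon.

    polygon is a list of (x, y) vertices; the last vertex is implicitly
    joined back to the first. Points exactly on an edge are not handled
    specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point
            if x < x_cross:
                inside = not inside
    return inside


def filter_points(points, polygon):
    """Keep the names of the points that fall inside the polygon."""
    return [name for name, coords in points.items()
            if point_in_polygon(coords, polygon)]


# Hypothetical rectangular boundary and hypothetical places
boundary = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
places = {"A": (1.0, 1.0), "B": (5.0, 1.0), "C": (3.5, 2.5), "D": (-1.0, 2.0)}

inside_names = filter_points(places, boundary)
print(inside_names)
```

A point is inside when a ray drawn from it crosses the boundary an odd number of times, which is why the flag simply flips on every crossing.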
More often than not, geographical data visualisation is performed on a single country or a cluster of countries rather than on all 195 of them. Just as typically, acquired datasets have more features than are needed for the analysis. While D3.js allows for filtering the datasets so that we have full control over the visualisation’s output, the size of the original datasets can slow down your website’s load times. To reduce this impact, datasets can be cropped beforehand. This post will explain how to shrink a standard Eurostat geographical dataset to just a handful of countries with OGR2OGR. Continue reading “Extracting countries from GeoJSON with OGR2OGR”
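The cropping idea can also be sketched without OGR2OGR. The snippet below is a plain Python stand-in, not the OGR2OGR workflow the post describes: it keeps only the features of a GeoJSON FeatureCollection whose country code is in a chosen set. The property name `CNTR_ID` and the tiny inline dataset are assumptions for illustration; check the property names in your own file before relying on them.

```python
import json


def crop_feature_collection(collection, keep_codes, code_key="CNTR_ID"):
    """Return a new FeatureCollection with only the wanted countries.

    code_key is an assumed property name holding the country code.
    """
    kept = [
        feature for feature in collection["features"]
        if (feature.get("properties") or {}).get(code_key) in keep_codes
    ]
    return {"type": "FeatureCollection", "features": kept}


# Hypothetical, heavily simplified dataset (points instead of polygons)
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"CNTR_ID": "PL"},
         "geometry": {"type": "Point", "coordinates": [19.0, 52.0]}},
        {"type": "Feature", "properties": {"CNTR_ID": "DE"},
         "geometry": {"type": "Point", "coordinates": [10.0, 51.0]}},
        {"type": "Feature", "properties": {"CNTR_ID": "CZ"},
         "geometry": {"type": "Point", "coordinates": [15.5, 49.8]}},
    ],
}

cropped = crop_feature_collection(sample, {"PL", "CZ"})
kept_codes = [f["properties"]["CNTR_ID"] for f in cropped["features"]]

# The smaller collection serialises to a smaller payload for the browser
print(kept_codes, len(json.dumps(cropped)) < len(json.dumps(sample)))
```

For a real file you would load it with `json.load` and write the result back with `json.dump`; a dedicated tool remains the better choice for large datasets or format conversion.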
Summary: Intro | Measuring the importance of gossip | Too good to be true | Arbitrary measures produce arbitrary results | tl;dr
Disclaimer: The probability computation of the article is a recap of a talk delivered by Professor Mark Whitehorn at the University of Dundee in 2015, and at PASS Business Analytics Conference in San Jose, CA in 2014. Opinions expressed are my own.
Aren’t we post-NPS hype yet? Such was my thinking until a random article came up on my feed: as one of its core objectives, a tech giant was planning to improve its Net Promoter Score by 2020. A quick internet search told me there are some companies very excited about increasing their NPS. Google Trends suggests that interest in the Net Promoter methodology has been growing steadily since 2004; a mortal blow to my presumption. There is something problematic about the Net Promoter methodology that I’d like to talk about: on the one hand it is an indicator of outstanding business delivery, on the other a possibly dangerous framework for workforce assessment. This article decomposes the NPS algorithm, reviews its criticism, and tests its validity from the probability perspective. I have based the scenario and the probability computation on an excellent talk delivered by Professor Mark Whitehorn. If you happen to be a manager or a person whose performance is scored with NPS, if you are into probability computations, or if you simply like debunking managerial fads, then this is a tale for you.
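For readers who have not met the score before, the standard calculation can be sketched in a few lines. Respondents answer on a 0 to 10 scale; those who answer 9 or 10 count as promoters, those who answer 0 to 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. The responses below are invented for illustration, and this sketch covers only the score itself, not the probability computation from the talk.

```python
def net_promoter_score(scores):
    """Compute NPS from a list of 0-10 survey answers.

    Promoters answer 9 or 10, detractors answer 0 to 6; answers of
    7 or 8 (passives) count towards the total only.
    """
    if not scores:
        raise ValueError("at least one response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical survey responses
responses = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]
nps = net_promoter_score(responses)
print(nps)

# A team where everyone answers 8 scores zero despite high satisfaction
all_passives = net_promoter_score([8, 8, 8, 8, 8])
print(all_passives)
```

The second call hints at why the thresholds matter: answers of 7 and 8 contribute nothing to the numerator, so a uniformly satisfied group can score no better than a sharply divided one.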
Summary: Intro | A case of tl;dr | Where was the graph police? | A quick fix
Today’s “from scratch” example with D3 is a must-have element of any data visualisation portfolio: a line chart. Line charts are great for visualising changes in data over time. Just as in the previous posts in the series, my visualisation is a variation of a piece of code I found on the web. I started with a basic template created by Mike Bostock and then re-worked some of its elements to boost its usability & readability. As with the previous examples, all code can be downloaded, reused, and adjusted, and it scales up and down to include extra data series or to remove one.
Summary: Intro | A Simple Bar Chart | A Multi-Series Bar Chart
If this post were a painting, it would probably be one of Mark Ryden’s works: it seems I have just gone and done one detailed blog post. The funny thing is that it’s about bar charts, and everything has already been said about bar charts. In fact, a bar chart is a graph so simple that this post should never have been written: yet the simplicity of a bar chart is actually its most dangerous trap. It’s very easy to overdo, and with so few elements it’s tempting to tweak or enhance at least some of them. So this blog post is, above all, about resistance. I will look at what constitutes a good bar chart and why, what the best practices are, and how to fight the horror vacui of a simple plot. We will use D3.js and the blank canvas we built with zero coding skills in the last post to create a reusable template of a simple bar graph, and then of a multi-series bar graph. This is part of a data visualisation with D3 series, throughout which we will create a set of graphics that can be easily re-purposed for data visualisation projects.
Summary: Intro | About D3.js | Initial Setup & Python Server | Canvas Setup
In the following series I will cover the basics of data visualisation. There are many data visualisation tools available (free & paid versions) on the market, so for an everyday analyst the knowledge of how to build graphs from scratch is not essential. However, most (if not all) of these pre-built tools fall short as soon as any customisation is required: it could be a graph type that is not supported, or the design that cannot be adjusted to follow the company branding guidelines. Therefore, there are cases when the knowledge of how to build something yourself is essential.