Merging Historical Maps in D3.js v.5

Spoiling you as usual, I have another exciting D3 example for today: merging historical maps! I’ve been meaning to cover this topic ever since I developed a similar project for my Master’s thesis 3 years ago. Merging maps is a worthy challenge for every D3 enthusiast, as it requires a number of things to be aligned: the data format should be compatible with D3.js, and the maps should be drawn in the same projection and cover the same time period, since country and regional boundaries are far from static. I will demonstrate the idea by mashing up two maps: a digitised map of the Second Polish Republic from 1934 and a map of European boundaries from 1938.

Off topic: If you sent me an email via the contact form in the past and I have ignored you – believe me I haven’t. It was the Google Mail filter that treated all incoming messages from the blog as spam (until a week ago). If that was a fun or supportive message – please resend! I am also on twitter as @EveTheAnalyst

This is where we will get at the end of the exercise:

Data preparation tasks

#1 Find historical maps

We need two historical maps that cover a similar area in the same or close-enough time period. Sourcing historical material is a rather complicated task if you are not a GIS professional but only rely on the Internet. The choices on the web in regard to historical material are very limited, although more and more archives and historical institutes open their datasets to the public. For this project, I was lucky to find maps that are from a decently documented period of history and only within 4 years from each other. Those few years will produce visible inconsistencies in the map visualisation as, especially in turbulent times, country frontiers can change significantly just over months.

My sources will be:

Interwar Poland, 1934: a TopoJSON file containing country and district borders of the Second Polish Republic in 1934. The map has been created by Paul Dziemela at the University of Wisconsin and published under the Open Data Commons Public Domain Dedication and License in 1997.

European borders of 1939: a shapefile with European borders in 1939. The file was created by Michael De Groot at Stanford University in 2010 as part of The Spatial History Project initiative and made available for download for non-profit educational uses.

Go ahead and download the files if you plan to follow the example.

#2 Decide on the time period

At times, geographical sources will come as part of a database spanning a larger period of time. That’s the case of the Stanford collection. The package includes maps ranging from 1938 to 1944. As the Polish dataset solely covers the state in 1934 we should aim to get a map that’s closest to that period.

The base files we will use are:

  • April_30_1938.shp
  • Interwar_Poland_1934_20142.json

#3 Understand the file format

Once the maps are decided we need to validate that their format is D3.js-compatible.

The interwar Poland file is coded in a JSON format, TopoJSON to be precise. TopoJSON looks like GeoJSON, but instead of storing a separate list of coordinates for every shape it treats the data as one topology, keeping the geometries in a single large arcs array that neighbouring shapes can share. In Mike Bostock’s words, TopoJSON is substantially more compact than GeoJSON. TopoJSON can also be more efficient to render since shared control points need only be projected once. Both TopoJSON and GeoJSON are D3.js friendly.
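To make the arcs idea a bit more concrete, here is a minimal sketch of how one quantized TopoJSON arc turns back into plain coordinates. The topology below is a hand-made toy, not an excerpt from the Polish file, and decodeArc is a hypothetical helper name; in a real project a library such as topojson-client does this work for you.

```javascript
// Toy topology with a single arc. When a "transform" is present, arc points
// are stored as integer deltas: each pair is an offset from the previous point.
const topology = {
  type: "Topology",
  transform: { scale: [0.5, 0.5], translate: [20, 50] },
  objects: {},
  arcs: [
    [[0, 0], [2, 0], [0, 2]]
  ]
};

// Hypothetical helper: rebuild absolute [x, y] positions for one arc.
function decodeArc(topo, index) {
  const kx = topo.transform.scale[0];
  const ky = topo.transform.scale[1];
  const tx = topo.transform.translate[0];
  const ty = topo.transform.translate[1];
  let x = 0;
  let y = 0;
  return topo.arcs[index].map(function (delta) {
    x += delta[0]; // accumulate the deltas...
    y += delta[1];
    return [x * kx + tx, y * ky + ty]; // ...then scale and shift
  });
}

const decoded = decodeArc(topology, 0);
console.log(JSON.stringify(decoded)); // [[20,50],[21,50],[21,51]]
```

Two neighbouring regions would both point at this arc by its index, which is where the space saving comes from.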

European borders are stored in a shapefile. A shapefile is an Esri (I love Esri) vector data storage format for storing the location, shape, and attributes of geographic features. You will notice that for April 30th 1938 there are 7 files with various extensions: .dbf, .prj, .sbn, .sbx, .shp, .shp.xml, and .shx. Without going into too much detail, the files we should focus on are .shp and .prj. The .shp file is the feature geometry itself, which we will use to draw the map. The .prj file is the projection description, which includes information about the coordinate reference system used. The shapefile format needs to be translated to a different format if we are to use it with D3.js.

I have decided to use GeoJSON for this project as I might have to adjust bits and pieces on the map, and working with plain coordinates is just simpler. TopoJSON is a great choice too – it’s all up to you.

The great news is that geographical data format translation has never been easier! To move a file between formats simply upload it to mapshaper.org and download it in the desired format. Additionally, Mapshaper will show you exactly what the map looks like.

That’s how the 1938 European boundaries look when loaded into Mapshaper:

At the end of this step you should have two files:

  • April_30_1938.json
  • Interwar_Poland_1934_20142.json

#4 Decide on the visualization set

We have talked about it in the past – the less data we load in, the faster our website will render the visualisation. Let’s review the datasets and decide which geographical features to keep.

I would say that the European map is ready – the area is also cropped to the subset of countries we are interested in, so there is no need for further filtering.

That said there is always a potential to cut some things out and the console utility in Mapshaper can help you with exactly that. For example to keep only Germany and Poland, type:

-filter 'Name=="Poland" || Name=="Germany"'

To exclude Iceland and Portugal from the set, run:

-filter 'Name!="Iceland" && Name!="Portugal"'

(Note the &&: with || the condition would be true for every country, so nothing would be removed.)

You can also use the ogr2ogr utility as described in my earlier post, Extracting countries from GeoJSON with OGR2OGR.
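The same keep and exclude filters can be reproduced in plain JavaScript, which is handy if you would rather trim the features after loading the file. This is only a sketch: it assumes a FeatureCollection whose features carry a Name property, as in the European file, and the four-country collection is made up for illustration.

```javascript
// Made-up stand-in for the European file: only the Name property matters here.
const collection = {
  type: "FeatureCollection",
  features: ["Poland", "Germany", "Iceland", "Portugal"].map(function (name) {
    return { type: "Feature", properties: { Name: name }, geometry: null };
  })
};

// Small helper to print just the names.
function names(features) {
  return features.map(function (f) { return f.properties.Name; });
}

// Keep only Poland and Germany: a feature passes if EITHER test is true.
const kept = collection.features.filter(function (d) {
  return d.properties.Name === "Poland" || d.properties.Name === "Germany";
});

// Exclude Iceland and Portugal: a feature passes only if BOTH tests are true.
const withoutTwo = collection.features.filter(function (d) {
  return d.properties.Name !== "Iceland" && d.properties.Name !== "Portugal";
});

// The pitfall: every name differs from at least one of the two, so || keeps everything.
const pitfall = collection.features.filter(function (d) {
  return d.properties.Name !== "Iceland" || d.properties.Name !== "Portugal";
});

console.log(names(kept));       // [ 'Poland', 'Germany' ]
console.log(names(withoutTwo)); // [ 'Poland', 'Germany' ]
console.log(pitfall.length);    // 4
```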

In regard to the Polish dataset, you will notice that Mapshaper recognizes 4 layers in the file, namely Towns, Countries, Palatinates, and Districts:

The layer we are interested in is Palatinates. Let’s download the single layer and save it as palatinates.json.

At the end of this step you should have two files:

  • April_30_1938.json
  • palatinates.json

#5 Standardize the map projection

In order to merge the maps, we need to bring them to the same map projection. A map projection is a way to visualise the earth surface and always requires distortion of some sort (it really would be easier if the world was flat). As the focus of this project is the area occupied by Poland in 1934, we should pick a projection that centers around that region. I will keep the original projection from Dziemela’s project but feel free to consult D3 in Depth’s Geographic Projection Explorer to choose the projection, center, scale, and zoom that you like best.
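To get a feel for what center, scale, and translate do, here is a rough sketch of a Mercator projection in plain JavaScript. It is meant to illustrate the idea only and is not D3's actual implementation; the mercator function is a hypothetical helper, and it assumes no rotation is applied. The numbers (scale 2600, centre [22, 52], a 1400 by 700 canvas) are the ones used in the map.html listing further down this post.

```javascript
// Sketch of a Mercator projection: longitude maps linearly to x,
// latitude is stretched by ln(tan(pi/4 + lat/2)) and flipped, because
// SVG's y axis points down.
function mercator(options) {
  const rad = Math.PI / 180;
  const scale = options.scale;
  const lon0 = options.center[0];
  const lat0 = options.center[1];
  const tx = options.translate[0];
  const ty = options.translate[1];

  function rawY(lat) {
    return Math.log(Math.tan(Math.PI / 4 + (lat * rad) / 2));
  }

  // The centre point lands exactly on the translate point.
  return function (point) {
    return [
      tx + scale * (point[0] - lon0) * rad,
      ty - scale * (rawY(point[1]) - rawY(lat0))
    ];
  };
}

const w = 1400;
const h = 700;
const project = mercator({ scale: 2600, center: [22, 52], translate: [w / 2, h / 2] });

console.log(project([22, 52])); // [ 700, 350 ], the middle of the canvas

// Approximate coordinates of Warsaw: slightly west and north of the centre,
// so it should land left of and above the middle.
const warsaw = project([21.01, 52.23]);
console.log(warsaw[0] < w / 2, warsaw[1] < h / 2); // true true
```

Raising the scale zooms in, and moving the centre pans the map, which is all the Projection Explorer sliders really change.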

Before we can reproject a map, we need to understand what its current projection system is.

In the Polish boundaries JSON file declaration we read:

"crs":{"type":"name","properties":{"name":"urn:ogc:def:crs:OGC:1.3:CRS84"}}

  • crs stands for Coordinate Reference System.
  • CRS:84 can be decoded as WGS84, i.e. World Geodetic System 84. It is equivalent to EPSG:4326, apart from the axis order (CRS84 lists longitude first, then latitude).
  • EPSG is a standard way of describing a coordinate system, introduced by the European Petroleum Survey Group. The registry is now maintained by the International Association of Oil & Gas Producers (IOGP).

There are some great open databases that aid research into geographical systems. To get the EPSG code for WGS84 I ran the query through EPSG.IO, an open-source web service with a database of coordinate systems provided by Klokan Technologies. I also found Spatialreference.org pretty helpful in similar searches.

The European borders map specifies the coordinate system in the .prj file. The .prj file is a simple text file that can be opened in any text editor. April_30_1938.prj reads:

PROJCS["Europe_Albers_Equal_Area_Conic",GEOGCS["GCS_European_1950",DATUM["D_European_1950",SPHEROID["International_1924",6378388.0,297.0]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Albers"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",10.0],PARAMETER["Standard_Parallel_1",43.0],PARAMETER["Standard_Parallel_2",62.0],PARAMETER["Latitude_Of_Origin",30.0],UNIT["Meter",1.0]]

We get a lot of information here. Sometimes the EPSG code is saved under the AUTHORITY tag, but not in this case. Instead, the projection is stored under the PROJCS attribute (that literally translates to PROJected Coordinate System). Here, it’s Europe Albers Equal Area Conic. That’s great – metadata is great! – but how does it relate to the other dataset’s projection? We need to have it translated to a common denominator. Since we know the EPSG code for the map of Poland, let’s find one for Europe.

Typing “Europe Albers Equal Area Conic” into EPSG.IO returns the code EPSG:102013. That was easy.

If in doubt, use R. R can read shapefiles and will print the exact projection of the dataset. Just fire up RStudio, install the rgdal library, and run:

library(rgdal)

# dsn is the folder that holds the shapefile, layer is the file name without an extension;
# readOGR puts together all files associated with this name, not only the .shp
mymap <- readOGR(dsn = "<path>", layer = "April_30_1938")

mymap@proj4string

When I ran the program, I got the following CRS arguments:
+proj=aea +lat_1=43 +lat_2=62 +lat_0=30 +lon_0=10 +x_0=0 +y_0=0 +ellps=intl +units=m +no_defs

This matches the PROJ.4 definition associated with EPSG:102013.
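If you don't have R at hand, the link between the .prj text and those CRS arguments can also be seen with a few lines of JavaScript. The sketch below pulls the PARAMETER values out of the WKT string quoted above and lines them up with the corresponding PROJ keys. The mapping is written out by hand for this particular Albers definition only, it skips the ellipsoid and units, and wktParameters is a hypothetical helper rather than a general WKT parser.

```javascript
// The contents of April_30_1938.prj, split across lines for readability.
const wkt =
  'PROJCS["Europe_Albers_Equal_Area_Conic",' +
  'GEOGCS["GCS_European_1950",DATUM["D_European_1950",' +
  'SPHEROID["International_1924",6378388.0,297.0]],' +
  'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],' +
  'PROJECTION["Albers"],' +
  'PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],' +
  'PARAMETER["Central_Meridian",10.0],' +
  'PARAMETER["Standard_Parallel_1",43.0],PARAMETER["Standard_Parallel_2",62.0],' +
  'PARAMETER["Latitude_Of_Origin",30.0],UNIT["Meter",1.0]]';

// Hypothetical helper: collect every PARAMETER["name",value] pair.
function wktParameters(text) {
  const params = {};
  const pattern = /PARAMETER\["([^"]+)",(-?[\d.]+)\]/g;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    params[match[1]] = Number(match[2]);
  }
  return params;
}

const projectionName = /^PROJCS\["([^"]+)"/.exec(wkt)[1];
const p = wktParameters(wkt);

// Hand-written mapping from the WKT parameter names to PROJ keys.
const projString = [
  "+proj=aea",
  "+lat_1=" + p.Standard_Parallel_1,
  "+lat_2=" + p.Standard_Parallel_2,
  "+lat_0=" + p.Latitude_Of_Origin,
  "+lon_0=" + p.Central_Meridian,
  "+x_0=" + p.False_Easting,
  "+y_0=" + p.False_Northing
].join(" ");

console.log(projectionName); // Europe_Albers_Equal_Area_Conic
console.log(projString);     // +proj=aea +lat_1=43 +lat_2=62 +lat_0=30 +lon_0=10 +x_0=0 +y_0=0
```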

Now that we have the source projection of the European dataset, we can translate it to the system that the Polish map is using. I will use ogr2ogr for this task – a handy command-line tool to which I have dedicated a whole post in the past: Changing dataset projection with OGR2OGR.

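ogr2ogr -f GeoJSON europe.json -s_srs 'EPSG:102013' -t_srs 'EPSG:4326' April_30_1938.json

The convention is: ogr2ogr -f <output format> <output file> -s_srs '<original projection>' -t_srs '<new projection>' <input file>

Note that -s_srs declares the projection of the input file, whereas -a_srs only assigns a projection label to the output without reprojecting anything. If your GDAL version does not recognise EPSG:102013, try ESRI:102013 instead, as the code comes from Esri's list rather than the EPSG registry.

(You can also run the command directly on the shapefile.)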
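A quick way to check that a reprojection did what it should is to look at the numbers: WGS84 coordinates are degrees, so every longitude must sit within ±180 and every latitude within ±90, whereas a projected file in metres will blow well past that. Here is a small sketch of such a check; looksLikeLonLat and eachPosition are hypothetical helper names, the two sample features are invented, and the code assumes every feature has a geometry with a coordinates array.

```javascript
// Walk any nested GeoJSON coordinate array and call visit on each [x, y] position.
function eachPosition(coords, visit) {
  if (typeof coords[0] === "number") {
    visit(coords);
    return;
  }
  coords.forEach(function (child) { eachPosition(child, visit); });
}

// True when every position in the collection fits the longitude/latitude ranges.
function looksLikeLonLat(featureCollection) {
  let ok = true;
  featureCollection.features.forEach(function (feature) {
    eachPosition(feature.geometry.coordinates, function (position) {
      if (Math.abs(position[0]) > 180 || Math.abs(position[1]) > 90) {
        ok = false;
      }
    });
  });
  return ok;
}

// Invented samples: one square in degrees, one line in projected metres.
const inDegrees = {
  type: "FeatureCollection",
  features: [{
    type: "Feature",
    properties: {},
    geometry: { type: "Polygon", coordinates: [[[20, 51], [22, 51], [22, 53], [20, 53], [20, 51]]] }
  }]
};
const inMetres = {
  type: "FeatureCollection",
  features: [{
    type: "Feature",
    properties: {},
    geometry: { type: "LineString", coordinates: [[800000, 2500000], [810000, 2510000]] }
  }]
};

console.log(looksLikeLonLat(inDegrees)); // true
console.log(looksLikeLonLat(inMetres));  // false
```

Passing this test does not prove the projection is right, but failing it is a sure sign that the file is still in the old coordinate system.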

At the end of this step you should have two files:

  • europe.json
  • palatinates.json

Yay, the maps are ready!

Drawing the maps in D3.js

#1 Drafting maps in Mapshaper

We’ve already drawn a map in the chapter Making a Map in D3.js v.5 so some of the parts will come as no surprise. However, plotting two maps on the same svg creates some new challenges.

If we just plot both maps we will end up with overlapping boundaries – as both carry the Polish borders in them. As the files present the world 4 years apart they are likely to have at least slightly different outlines. Merging our sets would produce the following:

To get rid of this horrid effect we need to remove the outer boundaries from the Interwar Poland set. This is very easily achieved with… again, Mapshaper (also available as a command line tool).

Removing the outer boundaries is done with a single command innerlines:

The command will produce a set of polylines. The data can be downloaded in a GeoJSON format. I saved it as innerlines.json.
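The idea behind inner lines is easy to picture in topology terms: a border segment used by two regions is an inner boundary, while a segment used by only one region lies on the outside. The sketch below illustrates that idea on a toy data set in TopoJSON style. It is a conceptual illustration under my own assumptions and not Mapshaper's implementation; innerArcs is a hypothetical helper. If you work with TopoJSON directly, topojson.mesh with a filter such as (a, b) => a !== b is, as far as I know, built on the same principle.

```javascript
// Toy data: each region's ring is a list of arc indexes.
// A negative index ~i means "arc i, traversed backwards".
const regions = [
  { name: "West", arcs: [0, 1] },
  { name: "East", arcs: [~1, 2] }
];

// Hypothetical helper: return the ids of arcs used by more than one region.
function innerArcs(regionList) {
  const users = new Map(); // arc id -> number of times it is used
  regionList.forEach(function (region) {
    region.arcs.forEach(function (index) {
      const id = index < 0 ? ~index : index; // undo the "reversed" encoding
      users.set(id, (users.get(id) || 0) + 1);
    });
  });
  return Array.from(users.keys())
    .filter(function (id) { return users.get(id) > 1; })
    .sort(function (a, b) { return a - b; });
}

console.log(innerArcs(regions)); // [ 1 ]
```

Arc 1 is the shared border between West and East, so it is the only one kept; arcs 0 and 2 form the outer outline and are dropped.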

At the end of this step you should have three files:

  • europe.json
  • palatinates.json
  • innerlines.json

#2 Drawing maps in D3.js

I won’t go through the general mechanics of map making in D3.js v.5 as these are covered in my previous post. I will only highlight some bits and pieces that are particularly tricky when merging two maps.

Here’s the full code – scroll down for comments!

map.html

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>My map</title>
<script type="text/javascript" src="https://d3js.org/d3.v5.min.js"></script>
<link rel="stylesheet" type="text/css" href="map.css">
<style></style>
</head>
<body>
<div id="container" class="svg-container"></div>
<script type="text/javascript">
var w = 1400;
var h = 700;
var svg = d3.select("div#container").append("svg").attr("preserveAspectRatio", "xMinYMin meet").style("background-color","#4365bd")
.attr("viewBox", "0 0 " + w + " " + h)
.classed("svg-content", true);
var projection = d3.geoMercator().translate([w/2, h/2]).scale(2600).center([22,52]);
var path = d3.geoPath().projection(projection);

// load data
var europe_data = d3.json("Sources/europe.json");
var region_data = d3.json("Sources/innerlines.json");
var area_data = d3.json("Sources/palatinates.json");
var layer1 = svg.selectAll('path');
var layer2 = svg.selectAll('path');

Promise.all([europe_data, region_data, area_data]).then(function(values){

// draw the map of Europe
var europe = layer2.append("g")
.data(values[0].features)
.enter()
.append("path")
.attr("class","europe")
.attr("d", path);

// draw the map of Poland
var poland = layer1.append("g")
.data(values[0].features.filter(function(d){return d.properties.Name=="Poland"}))
.enter()
.append("path")
.attr("class","poland")
.attr("d", path);

// draw the Polish regions
var regions = layer1.append("g")
.data([values[1]])
.enter()
.append("path")
.attr("class","regions")
.attr("d", path);

// print the country names
var countrynames = svg.selectAll("text")
.data(values[0].features)
.enter()
.append("text")
.text(function(d) {return d.properties.Name;})
.attr("transform", function(d) { return "translate(" + path.centroid(d) + ")"; })
.attr("dx", "-3em")
.attr("dy", "0.5em")
.attr("class","countrynames");

// print the region names
// select by class so the country labels are not matched and every region gets a label
var regionnames = svg.selectAll("text.regionnames")
.data(values[2].features)
.enter()
.append("text")
.text(function(d) {return d.properties.PALATINATE_NAME;})
.attr("transform", function(d) { return "translate(" + path.centroid(d) + ")"; })
.attr("dx", "-2em")
.attr("dy", "0em")
.attr("class","regionnames");
});

</script>
</body>
</html>

map.css

.europe {
fill: #e2d2ab;
stroke: #bc9a42;
stroke-width: 0.5;
}

.regions {
fill: none;
stroke: #967b35;
stroke-width: 0.5;
}

.poland {
fill: #cdb370;
stroke: #bc9a42;
stroke-width: 0.5;
}

.regionnames {
text-transform: uppercase;
font-family: "Franklin Gothic Book", Arial;
font-size: 12px;
fill: #967b35;
}

.countrynames {
font-family: "Franklin Gothic Book", Arial;
font-weight: 300;
text-transform: uppercase;
letter-spacing: 4px;
font-size: 16px;
fill: #967b35;
}

Insight #1 Getting the order right

I found it very tricky to get the order of the layers right. Loading is not the culprit: Promise.all waits until all three files have arrived and passes them to the callback in the order they were listed, whichever finished first. What decides the stacking is the order in which elements are appended, because SVG paints later elements on top of earlier ones. The old trick of setting up layers and appending maps to them did not seem to work for me here, so in the end I relied on a combination of things: the layer definition and the right order of drawing. You will notice that I start with the bottom visualisation, the map of Europe, and move up to the inner boundaries of the regions. Together, these produced the desired effect.

Insight #2 Polish borders are drawn separately

The reason for singling out Poland from the main set was to differentiate the focus area from the rest of Europe. I used a darker colour to emphasize the location.

Filtering Poland from the GeoJSON collection is achieved with a single line:

.data(values[0].features.filter(function(d){return d.properties.Name=="Poland"}))

Insight #3 Polylines not polygons!

Note that when we removed the Polish borders from the original map, we essentially got rid of polygons, and got a set of lines instead. Polylines have to be treated differently in the visualisation as we no longer can access the features of a polygon. Instead, to draw the regional lines, data is called as .data([values[1]]).
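Whether a file exposes .features depends on its top-level GeoJSON type: a FeatureCollection has a features array, a GeometryCollection has a geometries array, and a bare geometry has neither. d3.geoPath can render any of these objects in one go, which is why wrapping the whole file in a one-element array gives a single path. The sketch below shows a hypothetical helper, drawables, for the opposite case where you want one path per item. The sample objects are invented, and I am assuming rather than asserting that a lines-only export comes out as a GeometryCollection.

```javascript
// Hypothetical helper: return an array suitable for .data(), one entry per item.
function drawables(geojson) {
  if (geojson.type === "FeatureCollection") {
    return geojson.features;
  }
  if (geojson.type === "GeometryCollection") {
    return geojson.geometries;
  }
  return [geojson]; // a single feature or geometry
}

// Invented samples of the three shapes a file can take.
const polygons = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { Name: "A" }, geometry: { type: "Polygon", coordinates: [[[0, 0], [1, 0], [1, 1], [0, 0]]] } },
    { type: "Feature", properties: { Name: "B" }, geometry: { type: "Polygon", coordinates: [[[1, 0], [2, 0], [2, 1], [1, 0]]] } }
  ]
};
const lines = {
  type: "GeometryCollection",
  geometries: [
    { type: "LineString", coordinates: [[1, 0], [1, 1]] }
  ]
};
const single = { type: "LineString", coordinates: [[0, 0], [1, 1]] };

console.log(drawables(polygons).length); // 2
console.log(drawables(lines).length);    // 1
console.log(drawables(single).length);   // 1
console.log(lines.features);             // undefined, hence no .features to bind
```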

Insight #4 Names are centered

You can center the names of the countries with a useful translate function:

.attr("transform", function(d) { return "translate(" + path.centroid(d) + ")"; })

I don’t think this worked perfectly in my visualisation – some of the country names should be adjusted manually.
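One simple way to do the manual adjustment is a lookup table of pixel nudges that is applied on top of the centroid. This is just a sketch: labelTransform is a hypothetical helper, and the offsets below are placeholders, not values tuned for this map.

```javascript
// Placeholder nudges in pixels, keyed by the Name property: [dx, dy].
const labelOffsets = {
  Germany: [40, -10],
  Poland: [0, 25]
};

// Hypothetical helper: build the transform string from a centroid plus its nudge.
function labelTransform(name, centroid) {
  const offset = labelOffsets[name] || [0, 0]; // unlisted names stay on the centroid
  const x = centroid[0] + offset[0];
  const y = centroid[1] + offset[1];
  return "translate(" + x + "," + y + ")";
}

console.log(labelTransform("Germany", [100, 200])); // translate(140,190)
console.log(labelTransform("France", [100, 200]));  // translate(100,200)
```

In the D3 code this would slot in as .attr("transform", function(d) { return labelTransform(d.properties.Name, path.centroid(d)); }).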

Insight #5 The boundaries don’t quite line up

Perhaps the biggest problem of the map is the visible map discrepancy: some of the region boundaries clearly don’t close on the country border. I think this could be manually adjusted in the json file to create a sleeker (but further from the truth) visual effect.

#3 Enjoy the visualisation

If you didn’t get distracted or stuck somewhere along the way, this is how your visualisation should look by the end of the exercise:

Hope you enjoyed the post. Let me know in the comments how you’d improve the process or the visualisation itself!

You can also check out the visualisation in action on bl.ocks.org. 

Eve
