XML.com: XML From the Inside Out

Hacking Election Maps with XML and MapServer

May 31, 2005

Editor's note: Simon St.Laurent dabbles in map generation here to show you a cool way to hack local maps. If this kind of project has you yearning for more, find your way to O'Reilly's Where 2.0 Conference, where application developers, vendors, and consumer web companies engaged in technologies like GPS, RFID, WLAN, and more, will be gathering to discuss how mapping and location technologies are being used right now, and what we can expect to see in the future. Join us in San Francisco June 29 - 30 to discover who's who and what's what in the location space.

All of those American red-blue political maps are pretty boring, right? After all, it's not likely that the two sides are going to form separate countries, and they produce all kinds of charming stereotypes. The more detailed county-by-county or even the shades of purple maps are nice too, but to be honest, I don't find the regional details that exciting except in my own region. Fortunately, I've managed to come up with a way to create that excitement.

By day, I edit books at O'Reilly Media, Inc. By night, I run a weblog about politics in my town, and I'm the Democratic Committee Chair for the town as well. Sometimes these two sets of challenges mix nicely, giving me the opportunity to apply skills I've learned from the books I edit to the projects I do in local politics. I've been editing Mapping Hacks and Web Mapping Illustrated, and learned a number of ways to create maps along the way, even the maps I want. I'm no GIS specialist, however, as will probably be clear.

First things first: map data

For this project, I didn't want to draw new maps. Instead, I wanted to color in existing maps with the colors of my choice. In my particular area, I'm lucky, because the county GIS office posts a wide variety of map data, including election districts, to the Cornell University Geospatial Information Repository. (The district map is a few years old, which created a few minor issues I was able to ignore because they took place in part of the county where politics is reliably uniform.)

If you don't have access to election district data - and you really need recent districts - you may be forced to resort to tracing a paper map and building your own foundations. In my case, I was mostly lucky. The files were in the slightly inconvenient Arc Export (*.E00) format, but I opened them up in Manifold and saved them out as a shapefile. (When I figure out how to do these kinds of maps using Manifold, a $245 Windows GIS program, that'll be another article, I guess.)

Another option if you can't find district information is to do this at the municipality level. Municipal boundaries also change regularly, but the Census Bureau and other organizations tend to share recent-enough municipal boundary data freely.

When I exported the shapefile, it produced three files: tompkinsElectDists.shp, containing the polygon outlines; tompkinsElectDists.shx, containing an index of offsets into the .shp file; and tompkinsElectDists.dbf, a dBase file containing the attribute table for the polygons (the metadata, loosely speaking). Since I needed to find identifiers to use for the polygons, preferably interpretable ones, I opened the .dbf file in Excel.

Figure 1. The Shapefile metadata, viewed in Microsoft Excel.

I found my preferred labels in the DISTNAME column - regular, but human interpretable. (As it turned out, I might have spared myself a union query later in an Access version of this if I'd picked ELDISTKEY instead.) All of the towns in the county, and wards in the City of Ithaca, had an abbreviation, and the election districts within them were indicated by a number after a dash. The abbreviations for Wards 1-5 in Ithaca were their Roman numeral equivalents, Dryden was DRY, Enfield was EN, and so on.
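
The naming scheme is regular enough to take apart mechanically. As a minimal sketch (the function name is hypothetical, and the 'IV-2' value is an illustrative label for an Ithaca ward rather than one copied from the data), a DISTNAME can be split into its municipality abbreviation and district number like this:

```python
def split_district_name(distname):
    """Split a DISTNAME such as 'DRY-4' into ('DRY', 4).

    Assumes the format described above: an abbreviation, a dash,
    then the election district number.
    """
    prefix, _, number = distname.rpartition("-")
    return prefix, int(number)


print(split_district_name("DRY-4"))  # ('DRY', 4)
```

Splitting on the last dash keeps the sketch safe if an abbreviation ever contains a dash of its own.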

So now I had my shapefile data and a key I could use to connect additional information to it.

Generating a first map with MapServer

Following the recommendation in Web Mapping Illustrated, I wandered over to http://fwtools.maptools.org/, where I downloaded the FWTools package. FWTools, maintained by Frank Warmerdam, is a great way to grab a pre-compiled set of mapping tools for Windows or Linux. I'd be using its MapServer installation, as well as the OpenEV viewer and the ogrinfo utility.

For starters, I opened the shapefile in OpenEV, and was rewarded with a view of Tompkins County and its election districts:

Figure 2. Election districts for Tompkins County in OpenEV.

The data, of course, had a surprise for me here. While the metadata for the election district files told me the edges of the maps were from (42.6375N, 76.7W) to (42.25N, 76.2375W), that didn't line up with any of the locations in the file - note the bottom right-hand corner of the OpenEV window, with its 784274.13E. That's a lot of times around the world for longitude.

Fortunately, since I was only using one map layer, I didn't need to worry about figuring out what projection I was working with and changing it - all I needed was to figure out the boundaries of the data. I could do this in OpenEV by moving the cursor to the far corners of the map, but a more precise route uses another tool in the FWTools package, OGR. From the FWTools shell, I asked:

C:\Program Files\FWTools0.9.8\dryden>ogrinfo -all -summary tompkinsElectDists.shp

and I was rewarded with:

INFO: Open of `tompkinsElectDists.shp'
using driver `ESRI Shapefile' successful.

Layer name: tompkinsElectDists
Geometry: Polygon
Feature Count: 76
Extent: (789673.630000, 824525.750000) - (913602.630000, 957226.130000)
Layer SRS WKT:
(unknown)
ID: Real (18.0)
AREA: Real (18.6)
PERIMETER: Real (18.6)
[...much more information about fields...]

The key pieces I needed were the extent information. Extent in hand, I could now build a simple MapServer map file for generating whatever maps I wanted of this shapefile. My initial map file - a little ambitious for including labels and a different color for the district in which I happen to live - looked like:

MAP
  SIZE 384 384
  EXTENT 789673.630000 824525.750000 913602.630000 957226.130000
  LAYER
    NAME districts
    TYPE POLYGON
    STATUS DEFAULT
    DATA tompkinsElectDists.shp
    LABELITEM 'DISTNAME'
    CLASSITEM 'DISTNAME'

    CLASS
      EXPRESSION 'DRY-4'
      STYLE
        OUTLINECOLOR 255 0 0
        COLOR 255 0 0
      END
      LABEL
        MINFEATURESIZE 40
      END
    END

    CLASS
      STYLE
        OUTLINECOLOR 0 0 0
      END
      LABEL
        MINFEATURESIZE 40
      END
    END
  END
END

The map file uses nested keyword blocks, each closed by its own END, to describe features. At the top, I list basic information about SIZE (in pixels) and the EXTENT the map should cover, using a reformatted version of the information I got from ogrinfo.

Next, I defined the one LAYER this map will have, giving it the NAME districts, specifying that it's polygons, telling it where to find the DATA, and specifying that both the LABELITEM and the CLASSITEM should be the DISTNAME field I'd found earlier.

After providing the general information about the data sources, I have two CLASSes which tell MapServer how to draw the map. MapServer draws each feature with the first CLASS that matches it, and a CLASS with no EXPRESSION matches everything, so the specific CLASS has to come first. That first CLASS is specific to my district, identified with the EXPRESSION 'DRY-4', and it sets the OUTLINECOLOR to red and the COLOR to red as well. The second is the default, setting an outline color of black for all of the other districts, as well as a label where the size of the district seems adequate to hold the label.
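
The reformatting of the ogrinfo extent into an EXTENT line is simple enough to do by hand, but it can also be scripted. This is only a sketch: the function name is hypothetical, and it assumes the Extent line has exactly the shape shown in the ogrinfo output above.

```python
import re


def mapserver_extent(ogrinfo_output):
    """Turn ogrinfo's 'Extent: (minx, miny) - (maxx, maxy)' line
    into a MapServer 'EXTENT minx miny maxx maxy' line."""
    match = re.search(
        r"Extent:\s*\(([-\d.]+),\s*([-\d.]+)\)\s*-\s*\(([-\d.]+),\s*([-\d.]+)\)",
        ogrinfo_output,
    )
    if match is None:
        raise ValueError("no Extent line found in ogrinfo output")
    # Keep the numbers as text so no precision is lost in the round trip.
    return "EXTENT " + " ".join(match.groups())


sample = "Extent: (789673.630000, 824525.750000) - (913602.630000, 957226.130000)"
print(mapserver_extent(sample))
# EXTENT 789673.630000 824525.750000 913602.630000 957226.130000
```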

While MapServer is frequently thought of as a tool for creating interactive maps on the Web, its shp2img command-line interface makes it easy to create static maps as well. To try this out, I typed:

C:\Program Files\FWTools0.9.8\dryden>shp2img -m electionDist.map -o electionDists.png -i PNG

The shp2img program gets the map file from electionDist.map, and will generate a result file, electionDists.png, in the PNG format. The result looks like:

Figure 3. Election districts for Tompkins County, generated by MapServer.

It's not cartographic genius by any means, but it's a good enough foundation to build on.

Gathering election information

For my first map, I figured I'd plot the results of the 2004 election in Tompkins County. Tompkins County is a blue (Democratic) county in a blue state (New York), but just from talking with people around the county you'd know that blueness is far from uniform. The further you get from the City of Ithaca, the more conservative it generally gets. I wanted a little more detail than that general sense, however.

I probably could have asked my local Board of Elections for the 2004 election data district by district in some convenient form, like Excel or CSV, but since I was doing this late on a Friday night, I printed out the PDF with the results and added up the Republican and Conservative lines for George Bush and the Democratic and Working Families lines for John Kerry. (New York State is fond of having many party lines on our proudly full-face ballot.) Then I entered them into an Excel spreadsheet, producing this:

Figure 4. Election results for Tompkins County in Excel.

The raw votes are in the BUSH and KERRY columns. I turned these into percentages in the BUSHPER and KERRYPER columns. (Sorry, I left out the third-party candidates for this simple red-blue exercise.) Next, I multiplied those by 255 - one of the nice things about this whole red/blue thing is that RGB colors can easily be specified as Red value, Green zero, and Blue value. The last column used an IF calculation to list the winner in that particular election district. There was one small hitch, because two districts (2 and 8 in Lansing) were recorded with a single number. Since I was charting percentages, I just split the votes in half to cover both districts.
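
The spreadsheet arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and two details are assumptions rather than anything taken from the spreadsheet itself: the color values are truncated rather than rounded (which is consistent with the sample values shown later for CAR-1), and a tie is arbitrarily labeled KERRY.

```python
def district_colors(bush, kerry):
    """Return (red, blue, winner) for one district's two-party vote."""
    total = bush + kerry
    bush_share = bush / total    # the BUSHPER column
    kerry_share = kerry / total  # the KERRYPER column
    # Scale each share to the 0-255 range of an RGB channel,
    # truncating to a whole number (an assumption, see above).
    red = int(bush_share * 255)
    blue = int(kerry_share * 255)
    winner = "BUSH" if bush > kerry else "KERRY"
    return red, blue, winner


# CAR-1: 139 votes for Bush, 169 for Kerry.
print(district_colors(139, 169))  # (115, 139, 'KERRY')
```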

Originally I was thinking I would combine this with the DBF file from the shapefile, but it proved easier to go a different route. I saved my Excel spreadsheet out as SpreadsheetML, then hit that with the following XSLT stylesheet:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">

  <xsl:output method="xml" omit-xml-declaration="yes" indent="yes" />

  <xsl:template match="/">
    <xsl:apply-templates select="ss:Workbook"/>
  </xsl:template>

  <xsl:template match="ss:Workbook">
    <districts>
      <xsl:apply-templates select="ss:Worksheet[@ss:Name = 'tompkinsElectDists']"/>
    </districts>
  </xsl:template>

  <xsl:template match="ss:Worksheet">
    <xsl:apply-templates select="ss:Table" />
  </xsl:template>

  <xsl:template match="ss:Table">
    <xsl:apply-templates select="ss:Row[position() > 1]" />
  </xsl:template>

  <xsl:template match="ss:Row">
    <dist>
      <distID><xsl:apply-templates select="ss:Cell[1]" /></distID>
      <distName><xsl:apply-templates select="ss:Cell[2]" /></distName>
      <bush><xsl:apply-templates select="ss:Cell[3]" /></bush>
      <kerry><xsl:apply-templates select="ss:Cell[4]" /></kerry>
      <bushPercent><xsl:apply-templates select="ss:Cell[5]" /></bushPercent>
      <kerryPercent><xsl:apply-templates select="ss:Cell[6]" /></kerryPercent>
      <red><xsl:apply-templates select="ss:Cell[7]" /></red>
      <blue><xsl:apply-templates select="ss:Cell[8]" /></blue>
      <winner><xsl:apply-templates select="ss:Cell[9]" /></winner>
    </dist>
  </xsl:template>

</xsl:stylesheet>

All this really does is strip the SpreadsheetML down to a more natural XML, which looks in part like:

<districts xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<dist>
<distID>414</distID>
<distName>CAR-1</distName>
<bush>139</bush>
<kerry>169</kerry>
<bushPercent>0.45129868664614659</bushPercent>
<kerryPercent>0.54870129691980096</kerryPercent>
<red>115</red>
<blue>139</blue>
<winner>KERRY</winner>
</dist>
<dist>
<distID>410</distID>
<distName>CAR-2</distName>
<bush>272</bush>
<kerry>342</kerry>
<bushPercent>0.44299673545607926</bushPercent>
<kerryPercent>0.5570032564218188</kerryPercent>
<red>112</red>
<blue>142</blue>
<winner>KERRY</winner>
</dist>
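
For readers more comfortable in a scripting language than in XSLT, the same stripping-down can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration, not the route I took: the function name and the tiny two-row sample are hypothetical, and unlike the stylesheet it trims whitespace around each cell value.

```python
import xml.etree.ElementTree as ET

SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}
# Element names in column order, matching the stylesheet above.
FIELDS = ["distID", "distName", "bush", "kerry", "bushPercent",
          "kerryPercent", "red", "blue", "winner"]


def simplify(spreadsheetml, sheet_name="tompkinsElectDists"):
    """Reduce a SpreadsheetML document to a <districts> element."""
    workbook = ET.fromstring(spreadsheetml)
    districts = ET.Element("districts")
    for sheet in workbook.findall("ss:Worksheet", NS):
        if sheet.get("{%s}Name" % SS) != sheet_name:
            continue
        for table in sheet.findall("ss:Table", NS):
            # Skip the first row of each table, which holds column headings.
            for row in table.findall("ss:Row", NS)[1:]:
                cells = row.findall("ss:Cell", NS)
                dist = ET.SubElement(districts, "dist")
                for i, field in enumerate(FIELDS):
                    text = ""
                    if i < len(cells):
                        text = "".join(cells[i].itertext()).strip()
                    ET.SubElement(dist, field).text = text
    return districts


sample = (
    '<Workbook xmlns="%s" xmlns:ss="%s">'
    '<Worksheet ss:Name="tompkinsElectDists"><Table>'
    '<Row><Cell><Data>ID</Data></Cell><Cell><Data>DISTNAME</Data></Cell></Row>'
    '<Row><Cell><Data>414</Data></Cell><Cell><Data>CAR-1</Data></Cell></Row>'
    '</Table></Worksheet></Workbook>' % (SS, SS)
)
print(ET.tostring(simplify(sample), encoding="unicode"))
```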

Creating maps from the data

For my next trick, I used a second XSLT stylesheet to turn those results into a MapServer map file. Map files are text files, but XSLT doesn't mind, thanks to xsl:output. I was able to take the map file I'd created to display the district boundaries, and tinker with it a bit to produce a map file that shows the district contents as well.

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
><xsl:output method="text" encoding="UTF-8" /><xsl:template match="districts">MAP
  SIZE 384 384
  EXTENT 789673.630000 824525.750000 913602.630000 957226.130000
  LAYER
    NAME districts
    TYPE POLYGON
    STATUS DEFAULT
    DATA tompkinsElectDists.shp
    LABELITEM 'DISTNAME'
    CLASSITEM 'DISTNAME'

<xsl:apply-templates select="dist"/>

  END
END
</xsl:template>

<xsl:template match="dist">
CLASS
EXPRESSION '<xsl:apply-templates select="distName" />'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR <xsl:choose><xsl:when test= "winner = 'BUSH'">255 0 0</xsl:when>
         <xsl:when test= "winner = 'KERRY'" >0 0 255</xsl:when></xsl:choose>
 END
 LABEL 
   MINFEATURESIZE 40
 END
END

</xsl:template>
</xsl:stylesheet>

The results, abbreviated, look like:

MAP
  SIZE 384 384
  EXTENT 789673.630000 824525.750000 913602.630000 957226.130000
  LAYER
    NAME districts
    TYPE POLYGON
    STATUS DEFAULT
    DATA tompkinsElectDists.shp
    LABELITEM 'DISTNAME'
    CLASSITEM 'DISTNAME'


CLASS
EXPRESSION 'CAR-1'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR 0 0 255
 END
 LABEL 
   MINFEATURESIZE 40
 END
END


CLASS
EXPRESSION 'CAR-2'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR 0 0 255
 END
 LABEL 
   MINFEATURESIZE 40
 END
END

...

 END
END
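
The per-district part of that transformation amounts to filling in a text template once for each dist element. A rough Python equivalent, offered as a sketch rather than as what I actually ran, might look like the following. The names are hypothetical, and where the stylesheet would emit an empty COLOR for an unexpected winner, this version raises a KeyError instead.

```python
import xml.etree.ElementTree as ET

# Fill colors keyed by the value of the <winner> element.
WINNER_COLORS = {"BUSH": "255 0 0", "KERRY": "0 0 255"}

CLASS_TEMPLATE = """CLASS
EXPRESSION '{name}'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR {color}
 END
 LABEL
   MINFEATURESIZE 40
 END
END
"""


def class_blocks(districts_xml):
    """Build one MapServer CLASS block per <dist> element."""
    root = ET.fromstring(districts_xml)
    blocks = []
    for dist in root.findall("dist"):
        name = dist.findtext("distName")
        color = WINNER_COLORS[dist.findtext("winner")]
        blocks.append(CLASS_TEMPLATE.format(name=name, color=color))
    return "\n".join(blocks)


example = ("<districts><dist><distName>CAR-1</distName>"
           "<winner>KERRY</winner></dist></districts>")
print(class_blocks(example))
```

The output would still need to be wrapped in the MAP and LAYER lines shown above before MapServer could use it.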

Next, I ran this map file through MapServer with shp2img -m electionRedBlue.map -o electionRedBlue.png -i PNG. The results look like:

Figure 5. Election districts for Tompkins County with 2004 red/blue, generated by MapServer.

Doing the same thing in shades of purple is easy. Just use a slightly different stylesheet, using the red and blue elements rather than setting the color based on the winner. The stylesheet looks like:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
><xsl:output method="text" encoding="UTF-8" /><xsl:template match="districts">MAP
  SIZE 384 384
  EXTENT 789673.630000 824525.750000 913602.630000 957226.130000
  LAYER
    NAME districts
    TYPE POLYGON
    STATUS DEFAULT
    DATA tompkinsElectDists.shp
    LABELITEM 'DISTNAME'
    CLASSITEM 'DISTNAME'

<xsl:apply-templates select="dist"/>

  END
END
</xsl:template>

<xsl:template match="dist">
CLASS
EXPRESSION '<xsl:apply-templates select="distName" />'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR <xsl:apply-templates select="red" /> 0 <xsl:apply-templates select="blue" />
 END
 LABEL 
   MINFEATURESIZE 40
 END
END

</xsl:template>
</xsl:stylesheet>

Now the CLASSes in the resulting map file will look more like:

CLASS
EXPRESSION 'DRY-4'
STYLE
   OUTLINECOLOR 255 255 255
   COLOR 72 0 182
 END
 LABEL 
   MINFEATURESIZE 40
 END
END

The COLOR property now has nonzero values in both the red and blue channels. When I generate a map using this combination of red and blue, I get a lot of shades of purple.

Figure 6. Election districts for Tompkins County with 2004 purple, generated by MapServer.

More to do

This is just a start, but I've already reused these techniques once, to generate a related map of voter registration patterns using data I had in Microsoft Access. I used a similar approach, gathering my data there and exporting it as XML before feeding it through the stylesheets and into MapServer.

I've focused on MapServer even though it may not be the ideal tool for this project for a number of reasons. First, it's free, even though most of the rest of the tools I'm using in this project aren't. Second, I'm planning to use it for a lot more information, and this is an excellent way to get practice with map files and the process of using map data.

Eventually I'll look further into ways to create these maps with desktop applications like Manifold, Quantum GIS, or uDig. For now, though, I have some interesting maps and a flexible set of ways to create more.


