
New home of the blog

I’ve forgotten to mention that I’ve been posting on a new version of this blog at

The new home, powered by Jekyll, allows me to embed web maps in posts and pages, and to make other customizations.

I’ll migrate the content on here over there one day.

Updating an alias to update a developmental install of mapbox-studio

(This assumes that you have some knowledge of GitHub and git, and are comfortable using Linux.)

I’ve been using Mapbox’s mapbox-studio (previously known as tm2), and while it was undergoing rapid development, I got tired of updating it (by using git) and reinstalling it. So I made a bash alias (an alias is a shortcut that you make in Linux for your command line) to delete the node_modules folder as Mapbox suggests, pull the newest changes from Mapbox (upstream) on GitHub into my local fork, and then reinstall mapbox-studio through npm.

Here’s the alias, a chain of commands in which each one runs only after the one before it has completed:

alias udtm2='cd ~/prg/tm2/ && rm -rf ~/prg/tm2/node_modules && git pull upstream mb-pages && npm install'

(By using && after a command, the following command won’t run until the first has completed successfully. Neato trick!)
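To see the && behaviour in isolation, here is a minimal sketch. The function name run_chain is made up for illustration; 'true' and 'false' are the standard commands that always succeed and always fail.

```shell
# Hypothetical demo of '&&': the second command runs only if the first succeeded.
run_chain() {
  # "$1" is the first command to run: 'true' succeeds, 'false' fails.
  "$1" && echo "second command ran" || echo "chain stopped: first command failed"
}

run_chain true    # the echo after && runs
run_chain false   # the echo after && is skipped; the || branch reports it
```

This is also why a failed git pull in the alias leaves npm install untouched: the chain stops at the first command that fails.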

Within the past couple weeks, mapbox renamed the repository from tm2 to mapbox-studio and they also switched the default branch from master to mb-pages.

So, when I ran my bash alias this morning, I received this error:

fatal: Couldn’t find remote ref master

So, I fixed my bash alias by doing 2 things:

1] First, I went into the .git folder within my local repo and edited the text that said:

remote = origin
merge = refs/heads/master

I replaced master with mb-pages in both places.

2] Updated my alias.

You can update an alias simply by defining it again:
alias udtm2='rm -rf ~/prg/tm2/node_modules && git pull upstream mb-pages && npm install'

Now, I can update my mapbox-studio with a simple command, udtm2 :)

Defining a neighborhood’s identity in cleveland – rough draft

(this is a rough draft)

A discussion about defining neighborhoods, and about Cleveland’s neighborhood names and boundaries, led me to gather up some of my thoughts and observations from recent years.

[quote author=8ShadesofGray link=topic=2492.msg712498#msg712498 date=1403364432]
Time will tell whether the name Hingetown will stick Not the case in Ohio City, but I’d argue at this point, Gordon Square probably has a stronger brand than Detroit Shoreway and Asiatown a much stronger brand than the city’s official designations of the area as Goodrich-Kirtland or Payne-Sterling.

(context: Hingetown, a name for a neighborhood based on West 29th and Church Ave, where Rising Star Coffee is located).

As for Hingetown, I’m not a fan of the name itself (aesthetically, like ‘hingetown?’) but I understand its functionality and Graham’s motivation for it. The area north of Lutheran Hospital unfortunately still has a stigma (I am not saying that it is justified) of being unsafe. Instead of strongly tying in with Ohio City, he decided to create a new name (to fight the stigma), and perhaps he thought OC was too geographically large and needed a sub-neighborhood (I could agree with the latter point).

A single business owner creating the neighborhood’s name isn’t the healthiest thing to do (did he consult any other local stakeholders? the residents, other businesses in the area?). I’ve seen the name used so far by Ohio City’s twitter and coolcleveland; that’s about it. Its position, right across from the DS Bridge and leading to downtown, is really the only characteristic that I see distinguishing it from the rest of Ohio City.

With regard to defining neighborhoods, geographical features are certainly an influence, but they are not the only one. (They also act as borders; for example, the wide cliff between Brooklyn Centre and Old Brooklyn.) Housing stock and age, businesses, and other establishments that are unique to an area are also key influences. Most importantly, the residents of these neighborhoods, who should be the largest influence in determining a neighborhood’s name, aren’t using the names or self-identifying as residents of that neighborhood.

Definitely agree with you on that point. City of Cleveland planning officially has ‘statistical planning areas’ (36!) that the city describes as functionally equivalent to ‘neighborhoods’.

In the case of the City of Cleveland, its influence on defining neighborhoods is minimal.

In several cases (Corlett, Jefferson, Goodrich-Kirtland, Euclid-Green), the names appear only in city planning documents.

Others (Fairfax, Cuddell) have more use and identity as a neighborhood: they’re used by the CDCs and in the names of parks, rec centers, and public libraries, and maybe a local business or 2. (The level to which they’re used by local businesses, stakeholders, and residents varies.)

Then, I’d argue there’s a 3rd tier: others on that list (Mount Pleasant, Hough, Ohio City, Old Brooklyn, West Park, Tremont, Collinwood) are extremely popular, are used in the names of local businesses or stakeholders (churches, local non-profits), have numerous signs in the area that identify the neighborhood, and have residents who identify as being from there.

How I measured Cleveland’s road length (with postgis and OSM)

A couple weeks ago, I became interested in how many miles of roads there are in Cleveland, with all of the complaining that people do about the number of potholes in Cleveland and because, well, I was curious.
By roads, I am referring to roads that are publicly accessible to vehicles.

I have been an active contributor to OpenStreetMap (OSM), a global geographic database that anyone can edit (think the Wikipedia of Google Maps), in the Cleveland area, so Cleveland’s data was very current: I had updated all of the changes to the Innerbelt and other semi-permanent road closures in Cleveland, and a few new streets and re-openings (the roundabout in the Flats and the West 3rd Bridge, for example). It is likely the most current database of Cleveland roads that exists, even compared to the county’s own database.

The following will guide you through doing the same analysis for your city. Note that some cities may not be as current as Cleveland is in OSM.
This assumes you know how to use postgis and osm2pgsql, and are able to create an osm2pgsql database that only contains your city. To learn how to create a postgis database of OSM data only containing your city via osm2pgsql, read this tutorial that I wrote:

This tutorial is also ideal for people who are learning how to do some basic queries in postgis/postgresql from OSM data as I explain what columns and functions I use.

So, once you created your database of your city:

First, I crafted this query after a lot of trial and error to ensure that I was selecting all of the roads in Cleveland:

select highway, name, way, st_length(st_transform(way,3637)) AS length FROM planet_osm_line WHERE highway NOT IN ('construction', 'footway', 'path', 'steps', 'track', 'cycleway', 'pedestrian', 'abandoned', 'disused') AND (service NOT IN ('parking_aisle', 'driveway') OR service is null) AND (access NOT IN ('no', 'private') or access is null)


So, what does this SQL all mean?

select highway, name, way, st_length(st_transform(way,3637)) AS length FROM planet_osm_line

I selected the highway and name columns. These columns are created by osm2pgsql and filled with the values from each node/line in OSM. (In OSM, objects are described with tags, which are written out as “key=value”. For example, the street that the Simpsons live on could have the tags highway=residential and name=Evergreen Terrace.)

planet_osm_line is the name of the table (generated by osm2pgsql) that contains all of this data. The ‘way’ column, also generated by osm2pgsql, contains the geometry (the coordinates of where these lines are located).

You can use this query to browse some different streets in your city and make sure that all of the streets that you wish to include are listed.

Don’t worry if you see a street listed twice. It wasn’t included one time too many; a road that is a single, straight road in real life may consist of 2 or 3 connected ways in OSM, each with the same name, depending on a variety of factors (for example, part of the road may be on a bus route, or part of the road may be one-way).

st_length measures the geometry that is inside the parentheses. What does it measure in: centimeters? nautical miles? The units depend on the coordinate system of the geometry inside it (in our case, meters). At first, I just tried st_length(way), using a test/sample road near where I grew up to ensure my results would be correct. With st_length(way), I was receiving 862 meters… not good.
I had measured my test/sample road in Google Maps and in JOSM, using its measurement plugin. My test/sample road was 646 meters with JOSM’s measurement plugin.

my test/sample road query:

select highway, service, way, st_length(st_transform(way,3637)) AS length FROM planet_osm_line WHERE name in ('Name of the street Ave');

Back to st_length… I realized that was because st_length was using the projection that osm2pgsql had put my database in, 900913 (this is set by default). So, I had to change the geometry to another projection. I found that I could do this by using st_transform, passing it the column with the geometry and the coordinate system.
I knew that the EPSG code for northern Ohio was 3637, so I decided to transform to that… and I was right on the money! 647.247 meters. A little more than 1 meter off on 600 meters. I’ll take that margin of error :)
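The inflated 862 meters is roughly what one would expect: the default osm2pgsql projection (spherical Mercator) stretches lengths by about 1/cos(latitude). Here is a quick sketch of that arithmetic. The latitude of about 41.5° N for Cleveland is my assumption, not a figure from the measurements above, and the function name is made up.

```shell
# Approximate length inflation of spherical Mercator at a given latitude (degrees).
mercator_scale() {
  awk -v lat="$1" 'BEGIN {
    pi = atan2(0, -1)                      # pi, since awk has no built-in constant
    printf "%.4f\n", 1 / cos(lat * pi / 180)
  }'
}

# Scale factor near Cleveland (assumed latitude).
mercator_scale 41.5

# Dividing the Mercator length by that factor approximates the true length.
awk -v len=862 -v k="$(mercator_scale 41.5)" 'BEGIN { printf "%.0f\n", len / k }'
```

The corrected figure lands close to the JOSM and st_transform measurements, which suggests the projection was the whole story. This is only a rule of thumb; transforming to a local projection as described above is the more careful approach.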

WHERE highway NOT IN ('construction', 'footway', 'path', 'steps', 'track', 'cycleway', 'pedestrian', 'abandoned', 'disused')

This selects all lines that have the “highway” key and are not tagged as highway=construction, footway, path, steps, track, cycleway, pedestrian, abandoned, or disused. Steps, dirt paths, sidewalks (known as footways in OSM lingo), cycleways (dedicated bike paths), and other very specific tags are excluded from my query because I was not trying to measure these.

As mentioned earlier, I just wanted to measure ways that were open to the public, ones that people would drive on, that are publicly maintained.

AND (service NOT IN ('parking_aisle', 'driveway') OR service is null) AND (access NOT IN ('no', 'private') or access is null)
I didn’t want to include ways that were driveways (which are tagged as service=driveway) or the roads within parking lots (known as service=parking_aisle in OSM). In most instances, a highway will not have a service tag, so I also included OR service is null. I also didn’t want roads that were not accessible to the public (such as roads inside industrial plants in Cleveland’s industrial valley).

So… my selection of roads was what I wanted, and I now had to figure out how to add them all up…

Now to find the length of these:

I didn’t know the proper syntax for sum, so I tried a couple things like:
select highway, service, way, st_length(st_transform(way,3637)) AS length, sum(length) FROM planet_osm_line


select highway, service, way, sum(st_length(st_transform(way,3637)))
– received a message in pgadmin: “column "planet_osm_line.highway" must appear in the GROUP BY clause or be used in an aggregate function LINE 1: select highway, service, way, sum(st_length(st_transform(way…”

After reading about aggregate functions, I learned that I couldn’t include highway, service, and way in the first part of the query,
and I realized, I didn’t need to at this point; I just wanted the sum.

select sum(st_length(st_transform(way,3637))) from planet_osm_line where highway NOT IN ('construction', 'footway', 'path', 'steps', 'track') AND (service NOT IN ('parking_aisle', 'driveway') OR service is null) AND (access NOT IN ('no', 'private') or access is null)

And this query worked! It gives the sum, 2,361,057 meters which is 1467.09 miles.
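The meters-to-miles step is a single division (1 statute mile is 1609.344 meters). A small sketch, with a made-up function name, that checks the figure:

```shell
# Convert meters to statute miles (1 mile = 1609.344 meters exactly).
meters_to_miles() {
  awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 1609.344 }'
}

meters_to_miles 2361057   # the sum returned by the query above
```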

But if you want to break it down by what kinds of roads there are, you need to use a group by.

select highway, sum(st_length(st_transform(way,3637))) from planet_osm_line where highway NOT IN ('construction', 'footway', 'path', 'steps', 'track', 'cycleway', 'pedestrian', 'abandoned', 'disused') AND (service NOT IN ('parking_aisle', 'driveway') OR service is null) AND (access NOT IN ('no', 'private') or access is null) group by highway

I’m grouping by the highway column, which comes from the highway tag in OSM.

highway | sum
unclassified | 91555.5506847939
primary | 79289.4305883909
secondary | 108610.45598608
motorway | 148936.906333119
tertiary | 190819.531876563
tertiary_link | 1749.67663661471
motorway_link | 104634.677558715
secondary_link | 2884.04084138583
primary_link | 372.479453999729
service | 140106.53820064
residential | 1480741.60563354
(11 rows)

(The sums above are in meters.)

Or in other terms:

Freeways: 92.54 miles
Freeway on/off ramps: 65.01 miles
Service roads (alleys): 87 miles
Residential streets: 920 miles
Main arteries (examples: Chester, Lorain, East 55th, Pearl; all of these roads in Cleveland are 3+ lanes): 238.43 miles
Unclassified (two-lane roads with no/few houses on them): 57 miles
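The mileage figures can be reproduced from the per-tag sums with a little awk. This is a sketch under one assumption of mine: that "main arteries" means the primary, secondary, and tertiary tags plus their _link ramps, which is the grouping that comes closest to the 238.43 figure. The function name is made up.

```shell
# Roll per-tag sums (meters) up into broader groups and print miles.
# Input lines look like the psql output: "tag | meters".
# The grouping of tags into "main_arteries" is an assumption for illustration.
roll_up_miles() {
  awk -F'|' '
    {
      gsub(/ /, "", $1); gsub(/ /, "", $2)   # strip padding around each column
      group = $1
      if ($1 ~ /^(primary|secondary|tertiary)(_link)?$/) group = "main_arteries"
      meters[group] += $2
    }
    END {
      for (g in meters) printf "%s %.2f\n", g, meters[g] / 1609.344
    }
  ' | sort
}

roll_up_miles <<'EOF'
unclassified | 91555.5506847939
primary | 79289.4305883909
secondary | 108610.45598608
motorway | 148936.906333119
tertiary | 190819.531876563
tertiary_link | 1749.67663661471
motorway_link | 104634.677558715
secondary_link | 2884.04084138583
primary_link | 372.479453999729
service | 140106.53820064
residential | 1480741.60563354
EOF
```

The last digit may differ by one from the hand-converted list, depending on whether the figures are rounded or truncated.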

Results and caveats:
I was a bit surprised that freeway on/off ramps were that long.
Roads that have a physical separation between them (Chester, Euclid) are counted twice, because they are classified as two separate roads in the OSM database.
These results include roads within cemeteries! I’m working to update my results so they are not included.

Anthony Bennett’s historically poor start

(I normally don’t write about sports on here, but I watch from a distance and have little emotional investment.)

I have given up on Anthony Bennett.

Giving up on Anthony Bennett’s potential within 3 months of the season seems premature but no respectable* player in NBA’s modern era (86-87 to present, that’s all I can find stats for) has started out as poorly or had as long of a sustained stretch of mediocrity as Anthony Bennett has had.

Quantifying someone’s contribution to an NBA game is subjective, but one stat that is out there is a player’s game score. I know it’s not perfect because it doesn’t highlight the contributions of defensive-minded players like Bruce Bowen, Kurt Thomas, Ben Wallace, etc. But it’s used quite often and represents how a player contributes to a game.
(here’s the formula for it: search for GmSc)
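For reference, the commonly published form of John Hollinger’s Game Score can be sketched as a small shell function. The function name and argument order are my own choices, and the stat line in the usage example is made up for illustration; it is not one of Bennett’s games.

```shell
# Game Score (GmSc), in its commonly published form:
#   PTS + 0.4*FG - 0.7*FGA - 0.4*(FTA - FT) + 0.7*ORB + 0.3*DRB
#       + STL + 0.7*AST + 0.7*BLK - 0.4*PF - TOV
# Arguments, in order: PTS FG FGA FT FTA ORB DRB STL AST BLK PF TOV
game_score() {
  echo "$@" | awk '{
    pts = $1; fg = $2; fga = $3; ft = $4; fta = $5; orb = $6
    drb = $7; stl = $8; ast = $9; blk = $10; pf = $11; tov = $12
    s  = pts + 0.4*fg - 0.7*fga - 0.4*(fta - ft)       # scoring efficiency
    s += 0.7*orb + 0.3*drb + stl + 0.7*ast + 0.7*blk   # other contributions
    s -= 0.4*pf + tov                                  # fouls and turnovers
    printf "%.1f\n", s
  }'
}

# Hypothetical line: 10 pts on 4-of-10 shooting, 2-of-2 FT, 1 ORB, 3 DRB,
# 1 STL, 2 AST, 0 BLK, 2 PF, 1 TOV
game_score 10 4 10 2 2 1 3 1 2 0 2 1
```

A modest line like that one comes out at 6.8, which gives a sense of how low a bar "below 5" is.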

Let’s take a look at Anthony Bennett so far this year.

He has only had 1 game with a game score above 5 (Dec. 31, 2013, vs. the Pacers), and in all but 2 of these 27 games, he’s played more than 5 minutes.
Anthony Bennett has had streaks of 15 and 10 consecutive games with this mediocre performance of game score less than 5.
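Counting a streak like this is a simple run-length problem. Here is a sketch with made-up names and sample data; it assumes that a game with too few minutes breaks the streak, which may not be how the lists were originally compiled.

```shell
# Longest run of consecutive games with at least `min` minutes played and a
# game score below `thr`. Input: one game per line, "minutes gmsc", in
# chronological order. Usage: longest_streak [thr] [min]
longest_streak() {
  awk -v thr="${1:-5}" -v min="${2:-5}" '
    $1 >= min && $2 < thr { run++; if (run > best) best = run; next }
    { run = 0 }               # any other game ends the current run
    END { print best + 0 }    # "+ 0" prints 0 when there were no qualifying games
  '
}

# Hypothetical game log, not real box scores.
printf '10 2.1\n12 3.0\n20 7.5\n8 1.0\n9 0.5\n15 4.9\n3 0.0\n' | longest_streak
```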

So, here’s a list of players who have had the most consecutive games of a game score below 5 and have at least played 5 minutes in a game… The players that Bennett has had company with:

1988-89 through the 93-94 seasons

94-95 through 99-2000

00-01 through 04-05

04-05 through 08-09

08-09 through 13-14

Heck, if we remove the 5 minute requirement, Bennett’s streak was at 24 games!

Here are those lists for consecutive game scores less than 5, regardless of minutes played.
08-09 through 13-14:

(Look whose names are on there! Former Cavs Sasha Pavlovic and Diop; fellow recent lottery draft busts Jan Vesely and Austin Rivers…)

02-03 to 07-08:

96-97 to 01-02

90-91 TO 95-96

Out of these lists of hundreds of players who have played as lousy for such a sustained stretch, does anyone respectable appear? Do you even recognize any of those names?! (Yes I do, and that’s not a compliment to me..)
Would you want any of those players on your team or heck, even as a starter who would also receive minutes in crunch time and the fourth quarter?

What respectable players share these honors with AB? Mike Bibby, Ben Wallace (who, I’d argue, was the best defensive player of the 2000s; he is an anomaly since his strength was so much on defense and he took very few shots, leading to a low game score), Lou Williams, Rashard Lewis, Andrew Bynum, and Elden Campbell are some notable names out of the hundreds who appeared on those lists. There are a few more (Shawn Kemp) whose names appear AFTER they hit their prime and were then scrubby role players.

What lottery picks are on this list? Anyone who deserved to be taken as a #1 pick?!

Fact is, only a very small percentage of players who were at least as good as a respectable starter have ever played that poorly over such a sustained stretch at any point in their career.

Despite such a small sample size, AB has already committed such a sustained stretch of mediocrity that very few NBA players have experienced regardless of the success of their career.

His probability to be a future star or even a viable starter or key rotation player in the NBA is dwindling by the day.

Follow along as I create a map of where I’ve been in 2013

(Follow along in my process of creating a simple heat map – from initial idea, to brainstorming, trial and error, coding, designing, and more trial and error in between)

Before I start:
goal: create a heat map that displays all of my traces of where I was in 2013.

what’s the map’s purpose: – create something purdy and find out where I have been the most in 2013.

Context: In 2013, I had taken 140 GPS traces* using OSMTracker, my favorite GPS logging software, on Android. Nearly all traces were taken while I was driving or riding my bicycle.
(This does not include traces that I had taken in Haiti in May-June while with the Humanitarian OpenStreetMap Team.)

Here’s some brainstorming:
– what tool to use? Most of my projects and cartographic exploration have been with Tilemill.
In this case, using tilemill isn’t a smart move here, because I had hundreds of layers that i’d have to manually add.

I thought of manually combining all of the traces together – which I did using this simple bash script and a
gpx file manipulator called gpsbabel:

gpsbabel -i gpx $(echo $* | for GPX; do echo " -f $GPX "; done) \
-o gpx -F appended.gpx

I added the merged gpx file as a new layer to a tilemill project and received an error:
“This datasource has multiple layers:
(pass layer= to the Advanced input to pick one)”

Instead of looking for a workaround for the error, I thought, hey, let’s give CartoDB a shot!
I have been looking for an excuse to use CartoDB for ages. I had played around with it minimally over the past year and a half, but between my comfort with TileMill and not having had a project where I thought it would be a really appropriate tool for the job, the opportunity hadn’t come up.

Let’s give cartodb a shot now…
uh-oh :(
appended.gpx is too large (it’s 5.6mb). You can import files up to 4.53 MB.

Perhaps I should simplify the GPX traces, I thought? I also thought, let’s just convert it to GeoJSON and see if GeoJSON would reduce its size… so, using ogr2ogr, I tried:

ogr2ogr -f "GeoJSON" traces.json appended.gpx
ERROR 6: "GeoJSON driver doesn't support creating more than one layer"
ERROR 1: Terminating translation prematurely after failed
translation of layer routes (use -skipfailures to skip errors)

What?! I thought “what do you mean by multiple layers?”
– Does ‘layers’ in the error message mean geometry types (points, linestrings) ? or multiple geometry collections? Some extensive google searching just led to the gdal source code which didn’t give me any more clues..

I walked away for an hour or 2, then thought something might have been wrong in my gpx file.
I opened the gpx file in qgis, and saw, hey it asked me which layer to open of the file and it referred to:


The layers created from my GPX file consisted of 5 layers, including points where I made voice recordings, text notes, or had taken a picture. The layer that I want – the linestrings that I recorded, ended up being the Tracks layer. (The other layers: routes was blank; track_points were the points that make up my linestring; waypoints were points where I had taken a voice recording, text notes, or a picture)

My traces and points in qgis

So, I thought I could just specify the layer within my geojson with the following
ogr2ogr -clipsrclayer tracks -f "GeoJSON" tracksonly.json appended.gpx

I thought this would select the layer that I’m exporting to, but it does not.

Alas, that didn’t work, I received an error as mentioned earlier in the post.
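For what it’s worth, ogr2ogr takes the names of the layers to convert as trailing arguments after the source file, so one thing to try is naming the tracks layer there rather than using -clipsrclayer (which refers to a layer in a separate clipping datasource, not the layer being exported). A sketch, wrapped in a made-up helper function; I have not run it against this particular GPX file.

```shell
# Hypothetical helper: export only the "tracks" layer of a GPX file to GeoJSON.
tracks_to_geojson() {
  src="$1"
  dst="$2"
  if ! command -v ogr2ogr >/dev/null 2>&1; then
    echo "ogr2ogr is not installed" >&2
    return 127
  fi
  # Layer names listed after the source file limit the export to those layers.
  ogr2ogr -f "GeoJSON" "$dst" "$src" tracks
}

# Usage: tracks_to_geojson appended.gpx tracksonly.json
```

Since GeoJSON output holds a single layer, exporting just one named layer should also sidestep the "more than one layer" error.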

So, fellow readers:

Any suggestions for tools to use or workflow for this?

(Updates to this post will be made as I progress along)

What I plan to do :

– Some more google and gis.stackexchange searching for some inspiration.
– save the gpx file (selecting only the tracks layer) as JSON in qgis and open in cartodb and/or Tilemill.

I also wonder:
Simply convert each of them individually to geojson files, and then add all of them as a layer in a leaflet instance?

Update 1, 2014/01/02, 7pm:
– I converted the tracks layer in my gpx file to JSON (only 1.3mb!) and created a cartodb project for it . Success!
– I’ve also created a tilemill project and loaded my JSON in there. (I’ll post links soon)

I thought of a new idea: increasing or changing a section of the line based on its proximity to another line…
For example, if there’s another line within 60 meters of it at a specific point, increase the intensity of color of the line for a few meters…

This cannot be done out of the box in tilemill or cartodb… but I have a hunch that this sort of calculation could be done in postgis… I would then reimport this data into cartodb and style there.

HMM, this also looks interesting –

Some thoughts on NACIS, 2013.

I went to NACIS, the annual conference of the North American Cartographic Information Society.
This was my 2nd NACIS; my first was 2012 in Portland.

This experience was much more positive and enriching for a few reasons, mostly personal:

Looking back, I was intimidated last year. At the time, I was an unemployed, struggling freelancer at my first cartography conference and what was my third professional conference. Many of my mapping heroes were there, none of whom I had met before. Since last year, I’ve learned a lot more and felt more comfortable with what was being discussed. I recall last year there were a couple talks where I felt completely lost in what was being presented (bivariate maps, for example). No instances of that this year :)
The crowd was a bit younger – meaning there were people in similar situations as myself or similar aged. Last year, I felt pretty young, outside of the college students.
Additionally, there were several people from the twitter and web sphere that I recognized (many more than last year), had met online, or whose work I knew, and I had the chance to see them in person or catch up with them (Alan M, andy woodward, mele, mike foster, aj, ian, and dane from mapbox, matt mckenna, mamata akella).
All friendly people. I even drank and played pool with them. Interestingly, those who tweeted more often also tended to be more extroverted.
Plus, it helps when your local mentor also attends the conference this year:)

Presentations were more relevant to my background: I wanted to see practical things that I could implement in my work (of web maps), see things that push the boundaries of what a typical map is – particularly on the web, and what can be mapped in new ways (to wide audiences) and how. I’m not generally a theory person. My entire experience with using ESRI products is about 30 minutes. Last year, there was much more of an academic slant with a focus on theoretical talk and things on historical mapping.
This year, I’m really glad that I shelled out the extra $90 or $100 for Practical Cartography Day. It was quite practical, almost all on web maps of some sort, and was in fact likely the best day of the conference. Even serendipitous conversations became really relevant: a discussion over lunch that evolved into one about neighborhood boundaries (one theoretical topic that I find really intriguing) was great.

– Even outside of PCD, this year had much more focus on web maps and using open-source tools that I’ve been quite comfortable with or am anxious to use:
Tilemill, leaflet, D3, qgis. Someone (I forget who) mentioned that, this year, established, older cartographers are finally accepting that the internet (specifically through web maps, not just static images) is the primary medium for contemporary cartography and it’s here to stay, and/or they’re even beginning to use some of these tools.

– Wednesday night featured a map gallery: 30-50 printed maps done by students, professionals, and anyone in between, on display. While it was great to see, I would love for the map gallery to include online web maps, even if that means setting up dozens of computers to see them.

– My only other criticism: presenters, although it’s not the same as attending the talks, slides can be of at least some use to those who didn’t attend. Share them! You just need to add a link to your talk on Lanyrd!

– I presented on the Humanitarian and lower income countries’ web map, designed by the Humanitarian OpenStreetMap Team (Presentation available at: ). It went well, although it was very difficult to see the contrasting colors on the projected screen; you couldn’t tell whether a road was yellow or beige, or see a street’s outline very well.

In retrospect, I should have tried a dry run with the projector, and maybe I could have made simple modifications (like dimming the lights). More interestingly, I felt like I belonged more since I presented, or at least that I had something to offer this time. Last year was definitely a feeling of impostor syndrome. Discussing someone’s presentation is also a great conversation starter.

– Although at least 15% of talks were rescheduled or cancelled, the organizers did a great job of making things run as smoothly as possible. Without the last-second adjustments, this could have been a trainwreck.

Notable absences:
– Code for America had a much smaller presence than last year, although that could be because its annual conference was the following week. Additionally, CfA could have had fewer map-related projects this year.
– The Feds. :( Several presenters were employees of the US government and, as a result, couldn’t present because of the shutdown. laaame! I missed out on a few talks as a result (3 of the 4 in the same session as my presentation were cancelled!). Kudos to those who filled in at the last minute.

– An interesting point that Eric Thiesse, Matt Mckenna, and I discussed at the Friday banquet dinner, the last night of the conference: we all do web mapping, and we marvelled at how there’s so much change in the geospatial world even as we try to keep abreast of new technologies, tools, and mapping libraries. How do you keep up and stay sharp on them? Then I remembered what Tom MacWright tweeted months ago regarding this: you don’t. I’ve had this in the back of my mind since then. It’s impossible to keep up. You pick your battles. Now that my time to work on geospatial projects and mapping is greatly reduced (thanks, day job!), I’m picking my battles, being a little more discerning on what to learn or what projects to spend time on. Slowly coming to terms that I will fall behind on some things.

Counting the Use of tags in an osm2pgsql database.

Earlier this week, I was talking with a friend who just moved from Massachusetts to Cleveland; they were a bit surprised that middle school and junior high were both used to describe schools consisting of Grades 6-8.

I was curious about this myself and wondered which was used more often in Ohio. The general outline that I use here for counting the use of middle school and junior high in Ohio can be applied if you want to see which tag is used more often in a specific state, country, or other place.

Of course, there are multiple ways to do this, and I wish there were an easier way, but here’s what I did…

Because I was comparing 2 tags within an entire state, I couldn’t use the USA implementation of taginfo which uses the entire USA, and I couldn’t open an OSM file containing the entire state of ohio in an osm editor like josm..

So, for my state of ohio, I figured this answer out by:

1. Download an extract of my state from Geofabrik.

2. Create the postgis database (I have ubuntu 12.04, postgis 2.0, postgresql 9.1, and I assume you already have these installed; commands are different if you’re using postgis 1.5… I should explain this step more clearly in a different post, since I never found good introductory documentation for postgis/postgresql/osm when I first started learning this back in ’11.)

2a. createdb nameofyourdatabase
2b. psql -d nameofyourdatabase -c "CREATE EXTENSION postgis;"

3. Fill your database with the OSM data through the osm2pgsql software with the following command: osm2pgsql -s -d nameofyourdb ~/path/to/data.osm.pbf

4. Using pgadmin’s GUI, I connected to my database, clicked on the magnifying glass with SQL within it, and entered the following SQL:

SELECT name from planet_osm_polygon WHERE lower(name) ~ 'junior high' UNION ALL select name from planet_osm_point WHERE lower(name) ~ 'junior high' ORDER BY name

(Thanks to Paul Norman for assisting me with the proper SQL query syntax.)

SELECT name from planet_osm_polygon WHERE lower(name) ~ 'junior high'
This means that I selected all closed ways that have ‘junior high’ in their name (case insensitive).

Now here are the finer points of what the syntax means:
  • name – This is the column ‘name’. osm2pgsql creates columns based on the first half, also called the ‘key’, of an OSM tag. OSM tags are written out as key=value.

Other columns generated by your osm2pgsql include highway, amenity, leisure, and many more. Because it’s highly unlikely that there’s a store named junior high and the sake of simplicity, I didn’t need to specify that a tag must have amenity=school and have junior high in its name.

  • planet_osm_polygon – This is the name of a table in osm2pgsql that contains all closed ways. Here’s the names of the other tables in osm2pgsql.
  • WHERE – specifies the condition on which I want to select from the name column. If I wanted to query a simple tag that has a standard key and value, like amenity=parking, I could simply do WHERE amenity IN ('parking'). But since ‘junior high’ occurs in the middle of a text phrase, the tilde (~) will search for the pattern ‘junior high’ within the tag value.

So, this will return results for: name=Mooney Junior High School; name=Junior High School; name=wilkens junior high ; regardless of case sensitivity. (I admit I don’t fully understand this aspect of the syntax, so someone clarify if I’m incorrect !)

  • UNION ALL – this allows you to do multiple queries within one and include the results of both queries at once .

select name from planet_osm_point WHERE lower(name) ~ 'junior high' ORDER BY name

Because OSM objects can be tagged as either nodes or as ways, I need to also search for any nodes that have junior high in its name ! The name of the nodes’ table is planet_osm_point and the structure of the syntax is nearly the same.

  • ORDER BY – this is simple, it merely sorts the results by a column. In this instance, I want to sort them in alphabetical order, so I did ORDER BY name.
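On the case-sensitivity question: in PostgreSQL, ~ is a case-sensitive regular-expression match, so it is the lower(name) call that makes the search case-insensitive here (the ~* operator is the case-insensitive variant). The same idea can be tried outside the database with grep. In this sketch the function name is made up, and the last two school names are invented for the example.

```shell
# Count names containing a phrase, ignoring case (similar in spirit to
# lower(name) ~ 'junior high' in the query above).
count_matches() {
  grep -ci -- "$1"
}

count_matches 'junior high' <<'EOF'
Mooney Junior High School
Junior High School
wilkens junior high
Example Middle School
Sample Elementary
EOF
```

The first three names match regardless of capitalization, so the count printed is 3.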

Now, we can execute our query by clicking “Execute query” (its icon looks like a play button on a DVD/VCR).

Now in pgadmin’s lower-right hand corner, will be the number of results that are returned and the names of all of the schools with junior high in it…

So, we see: 219 uses of Junior High in Ohio! There are a few duplicates, which is interesting. Some may be the same name in 2 different places; some may be duplicate nodes of the same school.

And we repeat the process again for middle school, and… 389 !

Middle school is used more often in Ohio than Junior High… :)

As you continue using osm2pgsql and working with OSM data, you’ll realize that if you are interested in generating statistics on tags or creating maps from OSM data in PostgreSQL, most of your interaction with PostgreSQL will be writing queries with SELECT statements. You’ll want to focus your learning on that.

A day in the life of a HOT junior coordinator (teaching OSM in Haiti)

(This was originally published on July 9, 2013 on OpenStreetMap Diaries).

While in Limonade, Haiti for 6 weeks, April 30 – June 22, I was a part of HOT’s project, in partnership with the Universite d’Etat d’Haiti and USAID, to help teach mapping and OSM skills to 60 mappers and establish an OSM community in Northern Haiti. Here’s a snapshot of what a typical day consisted of (although most days were not typical, this is as typical as it got).

Follow along with this umap

For photos, check out these flickr sets: One and Two

6:45a – Shower with water from the house well, get dressed, eat breakfast. The 15 housemates – 12 Advanced Mappers, two local coordinators, and 3 other HOT coordinators – begin to wake up within the hour. Read and reply to new emails.
I also used this time to learn Python, thanks to Zed Shaw’s Learn Python the Hard Way.

8:10 – All 16 of us – 12 Advanced Mappers, two local coordinators, and 4 HOT coordinators – pile into 2 vans headed to the Universite d’Etat d’Haiti’s computer lab, which we worked out of for the majority of our time there. The 12 advanced mappers and the 2 local coordinators were participants in last year’s St. Marc project and were relatively experienced with editing in OSM.

We established a system of 6 teams, each consisting of 2 advanced mappers supervising 10 novice mappers (participants who had not edited OSM before they were hired at the beginning of the program in March). The teams’ daily activities rotated every 3 days. Each day, 2 teams spent the day in the field surveying; 2 other teams remained at the computer lab, tracing buildings in anticipation of the following day’s field surveying or adding POIs that they had recorded via GPS traces during their field surveys; and the remaining 2 teams’ novice mappers were off. The 4 advanced mappers of the teams that were off would assist at the computer lab and continue editing. As necessary, we held workshop days where all 60 mappers attended workshops and presentations on QGIS, cartography, and GIS concepts.

8:25 – Our day at the university begins in the computer lab!

The advanced mappers print out field papers and set up the equipment (GPS units, laptops) for the day. Our 40 novice mappers would be arriving for the day at 9:00. I’d cross my fingers for the internet at the university to be functional for the day. As the 1,800 students began to arrive for classes at the university, the internet would slow to a crawl.

Throughout the day, I troubleshot whether the technical trouble of the moment, usually mappers being unable to load Bing imagery or map data from OSM or to upload changesets in JOSM, was caused by a problem within the local network on our end and, if so, what I could do to fix it.

Besides the on-site tech support, I’d help mappers by answering questions about what tags to use for features (for example, a window factory, a video arcade, a driving school), basic JOSM usage, and other OSM-related editing questions.

The project also focused on improving how HOT can more effectively display and classify OSM data within the contexts of Haiti, other less developed areas, and humanitarian work. During the day, I also worked on the HDM JOSM style, a custom map style that changes how OSM data is displayed in JOSM as users draw buildings, waterways, paths, and other points of interest to upload into OSM. Later, I also helped ybon (Yohan Boniface) with the HOT web map rendering described here. It’s still a work in progress, but I’ve been stoked to work on this with Yohan and it’s coming along great. See this umap instance for an updated demo and an overview of its features. Check it out on github if you’re interested in contributing.

Fellow coordinators and I would also occasionally browse the changesets of our mappers to make sure they were mapping correctly. We had a couple of RSS feeds set up for the area through whodidit to check that mappers were actually uploading changesets and how much they were doing. The changeset history analyzer (currently offline) was also helpful for examining changesets to make sure our mappers’ changes were correct. OSMHV’s ability to display the specific tag changes that were made to an object was really useful as well.

Noon – Lunch time! Outside the university grounds was an assortment of makeshift roadside stands selling rice and beans, fried chicken, egg sandwiches, and more.

3:45 – Our novice mappers leave for the day. Advanced mappers continue to map, then inventory and begin to store the equipment used during the day by the novice mappers and the teams in the field surveying.

5:00 – A wrap-up meeting with the advanced mappers on the day’s events: how field surveying went, updates on future meetings, mapping parties, and other events.

6:00 – Head back to the house for a combination of: surfing the internet, an informal meeting with coordinators over Prestige (Haiti’s national beer), reading, or hanging out with the housemates. Many mappers had not used Ubuntu before and quickly fell in love with it. Several mappers who had their own personal laptops asked me and Yohan to walk them through installation and basic usage.

8:15 – Dusk, dinner at the roadside stalls is now available.

10-12:30 – Sleep awaits.

Some updates

Some assorted thoughts, happenings of the past couple weeks/months:

(Originally written on April 26, 2013 – forgot why this never posted..)

…..I’m stoked to announce that I’ll be joining HOT (Humanitarian OpenStreetMap Team) in Cap-Haitien, from May to June. I’ll be with some immensely talented and great folks. We’ll be doing great work down there and working with local Haitian mappers. …..Alas, I’ll miss FOSS4G-NA, SOTM-US, mais, c’est la vie.

I’m coming to the realization that there’s simply so much to learn and that, with increasing specialization, I should focus my learning efforts more on web mapping instead of print mapping.

Note to self: I’ll be updating the site. I’ll finally get a domain name, and I’m strongly considering moving the site to Jekyll.


Also, looking over the popular apps on the droid market and seeing that most of the apps on the top lists were games made me reflect: what has the droid really enabled us to do?

What struck me is that there aren’t many “killer apps” but rather general hardware improvements like:

– being able to check your email, communicate online, 24/7 thanks to mobile data plans and increased presence of wifi in public spaces.

– GPS

– an improved camera that allows higher resolution, image stabilization, etc.

The lack of killer apps is also an indication that software is becoming less app-based and more web-based, with software-as-a-service and social networks that are OS-independent, like Dropbox, Twitter, and Facebook.

Now, granted, there are a few things that are close to ‘killer apps’ for me.

– finding local reviews for restaurants when you’re out of town… (Yelp) and SoundHound and Shazam. Software on your phone that can identify a song from just a 10-second clip of it was a pipe dream just 10 years ago…