• Home
  • About
Blue Orange Green Pink Purple

Geocoding part 1 - Getting the longitude and latitude of all australian postcodes from google maps

Posted in code. on Tuesday, August 26th, 2008 by Chris Hulbert
Aug 26

As part of a grand plan to be able to map a bunch of addresses onto a graphical map of sydney or australia, firstly you need all the australian postcodes and their longitude and latitude. Now in this article i describe how to screen scrape this list from google maps (please forgive me Mr Google!).

Firstly you'll need the list of all postcodes from Australia Post. You can get it here - it is a zip of a CSV file. It contains a whole bunch of non-physical addresses such as PO boxes - if you're inclined, open it in Excel and remove these. What you want to end up with is a file called 'pc-full.csv' which we'll use later.

For the next step you'll need Curl installed. You can get it here. If you're behind a firewall, there's some hints here on how to get Curl/Wget to play nicely with firewalls (even though the instructions are for wget, it still applies to curl). Once you think you've got it sorted, try this from the command prompt:

curl "http://maps.google.com.au/maps?f=q&hl=en&geocode=&q=2768+australia&output=js"

You should get a printout of a whole bunch of javascript with 'center:{lat:xxx,lng:yyy}' in it somewhere. Sweet - you've got curl working.

Next up we want to write a script to do all the work. I've used ruby, because it's awesome, but you can use whatever you want. The following code opens up the 'pc-full.csv' file, reads the list of distinct postcodes, runs curl against each one, parses the longitude and latitude from each, and outputs a nice CSV file with 3 columns: Postcode, Latitude, Longitude.
Now i'll have to apologise for the sloppy code, but it does work (updated to fix wordpress' curly quotes):

def IsGood(fname)
  # Does a file contain the longitude eg did it connect to gmaps correctly?
  # Ironically, in C# this code would be a one liner: File.ReadAllText(fname).Contains("center:{lat:");
  r = "Missing"
  if File.exist?(fname)
    f = File.new(fname)
    lines = f.read
    f.close
    if lines.include?("center:{lat:")
      r = 'Good'
    else
      r = 'Bad'
    end
  end
  r
end

# get the list of unique postcodes from the CSV file downloaded from australia post
@postcodes = []
File.new('pc-full.csv').readlines.each {|l|
  x = l.gsub(/[\",]/,"").to_i
 @postcodes << x if x>0 && !@postcodes.include?(x)
}
puts "Total #{@postcodes.length} unique postcodes"

# Scrape them all from google maps
@postcodes.each {|postcode|
  fname = "data_#{postcode}.txt"
  if !File.exist?(fname)
   system "curl -o #{fname} \"http://maps.google.com.au/" +
     "maps?f=q&hl=en&geocode=&q=#{"%04d" % postcode}+australia&output=js\""
   puts "Any good? #{IsGood(fname)}"
  end
}

# Go through the resultant files, parsing the longitude and latitude
@results = ["Postcode,Lat,Lng"]
@postcodes.each {|postcode|
  fname = "data_#{postcode}.txt"

  status = IsGood(fname)
  puts "#{fname} : #{status}"

  # Grab the long & lat
  if status=='Good'
    f = File.new(fname)
    lines = f.read
    f.close
    m = /center:\{lat:([\-.0-9]*),lng:([\-.0-9]*)\}/.match(lines)
    @results << "#{postcode},#{m[1]},#{m[2]}"
  end
}

# Write it out to a CSV file
File.open("PostcodeLatLng.csv","w") {|f|
  @results.each {|line|
    f.write line
    f.write "\n"
  }
}

Note: Put curl.exe in the same folder as your ruby file above, if it's not in the path.
And your output from running all that should be a 'PostcodeLatLng.csv' file with the contents something like this:

Postcode,Lat,Lng
2000,-33.869027000000003,151.21024499999999
2001,-37.808776999999999,144.94928899999999
2002,-25.335448,135.74507600000001
2004,-33.891787999999998,151.17625100000001

Bob's your uncle! You should now have the longitude and latitude of all australian postcodes. In the next article, i'll show how i use this to make a map of australian post offices overlaying a map of australia, for instance. Here: Geocoding part 2

11 Comments

  1. Alan Lamb on August 27th, 2008

    Thankyou for the code, but need help getting it to work.

    I think part of the problem is that in your code, the quotes have turned strange…
    ie your code has - ‘Good’
    but should be - 'Good'

    This results in lots of errors during syntax checking

    And I don't know ruby enough to resolve these issues.

    Could you confirm this issue for me please.

  2. Alan Lamb on August 27th, 2008

    With respect to my prior comment regarding quotes/apostrophes

    In the comment you cannot see the difference between the quote symbols.

    But when I copy and paste them into SciTE thay definately generate syntax errors.

    If I delete and retype them, the quote symbol is slightly different on screen and the syntax error goes away.

    I hope this better explains my issue

  3. Chris Hulbert on August 27th, 2008

    Hi,
    It looks like wordpress has magically converted the normal quotes into curly quotes. I'll see if i can work around this.
    Thanks for letting me know!

  4. Chris Hulbert on August 27th, 2008

    Okay it should be sorted out now. Again, thanks for letting me know. Props to these guys for a workaround: http://www.semiologic.com/software/wp-tweaks/unfancy-quote/

  5. Alan Lamb on August 27th, 2008

    Thanks Chris

    The data is out there, with a bit of searching

    http://www.corra.com.au/support/downloads/

    But I'm not 100% confident with the data

    For example, look for postcode 5273 via google maps or http://www.australianpostcodesearch.com/

    Then try the Lat/Lon from the data set -34.803467 138.755016

    Google maps says these locations are 336km away from each other!

  6. Alan Lamb on August 27th, 2008

    Thank you

    Syntax error resolved

    Also I needed to place curl.exe into the Ruby folder

    It's running now - will comment on data accuracy later

  7. Alan Lamb on August 28th, 2008

    Thanks Chris

    Data integrity is excellent

    Regards, Alan

  8. Chris Hulbert on August 28th, 2008

    Well, you really should thank google, not me, but cheers anyhow.

  9. bhavin on November 24th, 2008

    Thanks Alan
    this is really helpful information which i was looking for since last 1 week.
    Regards, Bhavin

  10. Pete on November 28th, 2008

    Great article Chris. The exorbitant fees a vendor out there wants for this data (ok - plus a little more) has really irked me for ages.

  11. Pete on November 28th, 2008

    If you install the database on the corra.com.au website into MySQL (and the distance function), this perl code can get you a list of suburbs within x km. Note there's gotcha with some of the data (eg. Bendigo), and you'll need DBD::mysql installed as well. There might be more efficient ways to do this - just a 10 minute job!

    #!/usr/bin/perl -w
    use strict;
    use DBD::mysql;

    # Connect to your database (au_postcodes), on your machine, on the standard MySQL port
    my $dsn = 'dbi:mysql:au_postcodes:localhost:3306';
    # set the user and password
    my $user = 'youruseridhere';
    my $pass = 'yourpasswordhere';

    my $dbh = DBI->connect($dsn, $user, $pass)
    or die "Sorry - cannot connect to the DB: $DBI::errstr\n";

    # What suburb state?
    my $search_suburb='ABBOTSFORD';
    my $search_state='VIC';
    # Suburbs how far away?
    my $search_distance = 5.00;

    # Lookup the postcode for the search suburb/state
    my $get_postcode_qry = $dbh->prepare("select postcode from postcode_db where suburb='" . $search_suburb . "' and state='" . $search_state . "'");
    $get_postcode_qry->execute;
    my $search_postcode = $get_postcode_qry->fetchrow_array();

    print "Now looking for suburbs within $search_distance kilometres of $search_suburb in $search_state\n";

    # Now get all the postcodes - narrow down to state for a bit of efficiency if you're using small search distances.
    my $postcodes_qry = $dbh->prepare("select distinct postcode,suburb from postcode_db where state='" . $search_state . "'");
    $postcodes_qry->execute;

    # Get each postcode and suburb in this state….
    while (my($pc,$suburb) = $postcodes_qry->fetchrow_array()) {
    # and find out it's distance from your search suburb
    my $dist_qry = $dbh->prepare("select postcode_dist($pc,$search_postcode)");
    $dist_qry->execute;
    my $dist = $dist_qry->fetchrow_array();
    # If we are closer than the search distance, print it out
    if ($dist<=$search_distance) {
    print "Suburb : $suburb\t $pc\tDistance: $dist\n";
    }
    }



Leave a Reply

Chris' Babble

  • About
    Hi, i'm Chris Hulbert, a software guy from Sydney Australia.
  • Categories
    • code
    • Portfolio
    • Uncategorized
  • Recent Articles
    • How to implement DES and Triple DES from scratch
    • How to use Cookies in Struts 2 with ServletRequest and ServletResponse
    • How to use sessions with Struts 2
    • Using Quartz Scheduler in a Java web app (servlet)
    • Javascript date picker that Doesn't Suck!(tm)
    • Using Oracle XE with Hibernate


  • Home
  • About

© Copyright Chris' Babble. All rights reserved.
Designed by FTL Wordpress Themes brought to you by Smashing Magazine

Back to Top