Converting country names to country codes in SQL

We often have to export customer data from all sorts of obscure databases for migration to our e-commerce platform, which is based on Magento.
Our tools are quite sophisticated, but sometimes a quick brute force solution is what’s needed. Today we had to extract customer data for the umpteenth time from some bespoke PoS system.

country-codes-iso-3166-1

Magento requires country codes to be in 2-letter code format, e.g. US, UK, AU, DE. I wrote a quick SQL script to do it at the source and send a clean file to Magento customer import.

  • _address_country_id is the name of Magento field
  • @cust is the name of the table the data was in
  • Replace both as needed

I replaced longer official names with simpler patterns and checked for uniqueness. You are probably unlikely to find many in your e-commerce database. The best way to check is to run select distinct [your address field name] from [your db name].

Interesting observations

  • There are South Georgia and the South Sandwich Islands, population 30, but there are no North Sandwich Islands.
  • The two Koreas are hard to tell apart using their official names. One is called Democratic People's Republic of Korea and the other is simply Republic of Korea.
  • You can probably safely map ‘%Korea%’ to KR for South Korea
  • The UK can be United Kingdom or Great Britain. Both can be used in other country names for dependent territories.

The following countries are included in the ISO standard, but not in the UN countries list, so they can be safely removed.

  • UNITED STATES MINOR OUTLYING ISLANDS, UM
  • TAIWAN, PROVINCE OF CHINA, TW
  • HEARD ISLAND AND MCDONALD ISLANDS, HM
  • SOUTH GEORGIA AND THE SOUTH SANDWICH ISLANDS, GS
  • VIRGIN ISLANDS, BRITISH, VG
  • FRENCH SOUTHERN TERRITORIES, TF
  • CHRISTMAS ISLAND, CX
  • COCOS (KEELING) ISLANDS, CC
  • BOUVET ISLAND, BV
  • ANTARCTICA, AQ

Alternative ways of converting country names to codes

edoceo.com offers a more permanent way of replacing country names with codes would be to create a table with name/code combinations and do the matching using those records. The matching script would be much shorter.

Pierre-Luc Soucy has a nicely formatted ISO 3166 country list in CSV and SQL formats. It creates an InnoDB table named countries with MySQL syntax.

Commerce Guys have an open source address cleansing project on GitHub. It is quite comprehensive.

Read more about: