MyFIRST MySQL
Posted on Thu 01 October 2015 in blog
Description
In this post, my goals are: 1. Get a SQL database using bash 2. Try some basic SQL commands I'll be using a public database of population statistics (from this site).
First, get the data using bash
sudo mysql -uroot
wget http://downloads.mysql.com/docs/world.sql.zip
unzip world.sql.zip
Now, use mysql scripts to create/load the database
mysql> /*This is a comment*/
mysql> CREATE database world;
mysql> USE world;
mysql> SOURCE /home/ubuntu/world.sql;
Toes dipping slowly into the water
So what's in the database? Let's take a look.
mysql> show tables;
+-----------------+
| Tables_in_world |
+-----------------+
| City |
| Country |
| CountryLanguage |
+-----------------+
3 rows in set (0.00 sec)
Looks like there are three tables
within world
. Let's explore one of the tables, City
.
mysql> select * from City limit 10;
+----+----------------+-------------+---------------+------------+
| ID | Name | CountryCode | District | Population |
+----+----------------+-------------+---------------+------------+
| 1 | Kabul | AFG | Kabol | 1780000 |
| 2 | Qandahar | AFG | Qandahar | 237500 |
| 3 | Herat | AFG | Herat | 186800 |
| 4 | Mazar-e-Sharif | AFG | Balkh | 127800 |
| 5 | Amsterdam | NLD | Noord-Holland | 731200 |
| 6 | Rotterdam | NLD | Zuid-Holland | 593321 |
| 7 | Haag | NLD | Zuid-Holland | 440900 |
| 8 | Utrecht | NLD | Utrecht | 234323 |
| 9 | Eindhoven | NLD | Noord-Brabant | 201843 |
| 10 | Tilburg | NLD | Noord-Brabant | 193238 |
+----+----------------+-------------+---------------+------------+
10 rows in set (0.00 sec)
What are the top ten biggest cities in the dataset?
mysql> select CountryCode, Name, Population from City
-> order by population desc
-> limit 10;
+-------------+-------------------+------------+
| CountryCode | Name | Population |
+-------------+-------------------+------------+
| IND | Mumbai (Bombay) | 10500000 |
| KOR | Seoul | 9981619 |
| BRA | São Paulo | 9968485 |
| CHN | Shanghai | 9696300 |
| IDN | Jakarta | 9604900 |
| PAK | Karachi | 9269265 |
| TUR | Istanbul | 8787958 |
| MEX | Ciudad de México | 8591309 |
| RUS | Moscow | 8389200 |
| USA | New York | 8008278 |
+-------------+-------------------+------------+
10 rows in set (0.01 sec)
How many cities are there in the dataset? How many countries?
mysql> select count(Name), count(distinct(CountryCode)) from City;
+-------------+------------------------------+
| count(Name) | count(distinct(CountryCode)) |
+-------------+------------------------------+
| 4079 | 232 |
+-------------+------------------------------+
1 row in set (0.36 sec)
Looks like 4079 cities total and 232 countries.
What countries have the highest total populations?
mysql> select CountryCode, sum(Population) from City
-> group by CountryCode
-> order by sum(Population) desc
-> limit 10;
+-------------+-----------------+
| CountryCode | sum(Population) |
+-------------+-----------------+
| CHN | 175953614 |
| IND | 123298526 |
| BRA | 85876862 |
| USA | 78625774 |
| JPN | 77965107 |
| RUS | 69150700 |
| MEX | 59752521 |
| KOR | 38999893 |
| IDN | 37485695 |
| PAK | 31546745 |
+-------------+-----------------+
10 rows in set (0.01 sec)