Well, I just loaded all of the data. I made a little mistake. I had added a zip plus 4 field, and that’s an integer. I was also going to convert the standard zip field to a numeric column, but forgot, so I’ll have to reload the data again (a few late nights recently
However. Here’s a quick test for zip code 10128
select count(*) from anchor where ZIPPLUS4 between 101280000 and 101289999;
+----------+
| count(*) |
+----------+
| 74604 |
+----------+
1 row in set (0.28 sec)
select count(*) from anchor where ZIP = ‘10128’;
+----------+
| count(*) |
+----------+
| 74604 |
+----------+
1 row in set (14.16 sec)
The first query is the numeric field. I crammed the zip and the plus 4 part togeter to get a number (101280000-101289999). That .28 seconds is the first query run on that column. The second query is against the varchar field. 14.16 seconds. So, roughly 50 times as long? I’ll post the numbers after I fix my little problem with the zip field not being numeric as well.
A semi-similar query:
select count(*) from anchor where fipscode = 36061;
+----------+
| count(*) |
+----------+
| 1706440 |
+----------+
1 row in set (0.11 sec)
select count(*) from anchor where state = ‘NY’ and countycd = ‘061’;
+----------+
| count(*) |
+----------+
| 1706440 |
+----------+
1 row in set (4.75 sec)
Obviously not the same, as the second query has an ‘and’ condition. However, the first is a numeric field that I put together from the parts of the second. That’s .11 seconds, hitting hundreds of millions of records. Pretty snazzy.