select * from table_1 a inner join country_code c on a.ip1 between c.start and c.end where date_time between '2009-03-01 00:00:00' and '2009-04-01 00:00:00' and pkt_count < 4
Table_1 is 3 billion lines, country_code is 90k rows
Reasonable memory (about 12G) and plenty of disk (cache disk is 60G)
Any chance to change it into an equality join? Looks like the intermediate join result is just too large. (Plenty of disk space is not enough for 20 trillion of rows, as I suspect).
The join result can have up to 3 000 000 000 * 90 000 rows. Each row consumes several bytes in memory during join. So the join result can be well over 60GB (pessimistically more than 1 PetaByte).
Surely, the conditions in where and join should eliminate some rows, but apparently they do not limit the join result well enough…
If you supply the query logs from bh.err, we may be able to see how many rows are going to be produced.
mysql> describe table_1; +------------+------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +------------+------------+------+-----+---------+-------+ | date_time | datetime | NO | | | | | subsecond | int(11) | NO | | | | | ip1 | bigint(10) | NO | | | | | port1 | int(11) | NO | | | | | ip2 | bigint(10) | NO | | | | | port2 | int(11) | NO | | | | | pkt_count | int(11) | NO | | | | | byte_count | int(11) | NO | | | | +------------+------------+------+-----+---------+-------+ 8 rows in set (0.00 sec)
mysql> describe geoip; +-------+------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+------------+------+-----+---------+-------+ | start | bigint(10) | NO | | | | | end | bigint(10) | NO | | | | | cc | char(2) | NO | | | | | cn | char(50) | NO | | | | +-------+------------+------+-----+---------+-------+ 4 rows in set (0.05 sec)
mysql>
I want the ‘cc’ (country code) field for ‘ip1’ in table1, hence the inner join. If there is a more efficient way of constructing the query then please let me know. The ‘start’ and ‘end’’ in the geoip table are decimal representations of IP address, as is ‘ip1’ in table_1
Nothing of interest in the bh.err logs. Do I need to enable more logging somehow?
Janusz Borkowski - 29 May 2009 03:57 AM
Hi!
The join result can have up to 3 000 000 000 * 90 000 rows. Each row consumes several bytes in memory during join. So the join result can be well over 60GB (pessimistically more than 1 PetaByte).
Surely, the conditions in where and join should eliminate some rows, but apparently they do not limit the join result well enough…
If you supply the query logs from bh.err, we may be able to see how many rows are going to be produced.
I suppose this is guaranteed that start-end intervals are disjoint (to make sure the join will give an unequivocal answer). However, right now the join will produce n:m intermediate answer and then will make the final 1:n result. It will be optimized in some of the next releases.
Right now, the only workaround I can see would be to split the query into smaller pieces, joined by UNION ALL (preferably by date_time, e.g. by hours). And it is better to select only columns which are actually needed - for column-based engines it does make the difference.
Try setting “ControlMessages = 2” in “brighthouse.ini”. Then restart server and execute the queries again. A log file bh.err will be created next to the brighthouse.ini file. This log file will contain information how the queries have been processed. Please post this file, so we can see how each of the queries has been handled by the engine. Then some conclusions on the resulting performance could be derived.
And indeed, instead of “select * ..” use “select col1, col2 ..” specifying only columns really needed. The performance difference can be large.
OK. I tried using a sub-query to limit the JOIN and I have the bh.err log.
See below:
select a.date_time, a.ip1, g.cc from ( select date_time, ip1 from table_1 where date_time between '2009-03-01 00:00:00' and '2009-04-01 00:00:00' and pkt_count < 4 ) a inner join geoip g on a.ip1 between g.start and g.end
The subquery probably will not help. Please try this:
select a.date_time, a.ip1, c.cc from table_1 a inner join country_code c on a.ip1 between c.start and c.end where date_time between '2009-03-01 00:00:00' and '2009-03-03 00:00:00' and pkt_count < 4;
So, that is my original query with only selected columns rather than ‘*’ - I was being lazy!
Well, that worked. Or at least it didn’t but now I can see why!!
2009-05-29 07:41:35 [7] Warning: a big intermediate object created (~268 GB)
I don’t have a spare 268GB on this machine….but that is alright as at least I understand why it is failing
I guess I can chop my time from one month to one week etc etc
Thanks again for all the help
Jakub Wroblewski - 29 May 2009 09:40 AM
Hi,
The subquery probably will not help. Please try this:
select a.date_time, a.ip1, c.cc from table_1 a inner join country_code c on a.ip1 between c.start and c.end where date_time between '2009-03-01 00:00:00' and '2009-03-03 00:00:00' and pkt_count < 4;
That worked, but not sure how to progress one hour at a time
A more fundamental question: when I parse the raw data to produce the input file, maybe I should lookup the country code at that point in the preparation phase, and store it in the table rather than try and do a JOIN like this? Although it kinda defeats the object of the ‘warehouse’ doesn’t it? Or is that the correct thing to do?
Well, it would be definitely the faster way to have this column precalculated. And it is not a bad thing in data warehousing - in general, precalculated things (aggregations etc.) are common in this area.
A reason not to precalculate it: it would be a denormalization, redundancy etc. But in ICE this column should be well compressed.
I guess a second reason NOT to pre-calculate is if the country code mapping changes - as it could easily do.
Given it works when I do one hour, is there any further optimisation I could do to make that one hour run faster? A query covering one hour takes 12 minutes. So to cover one day I could do a UNION ALL on 24 queries it would take about 288 minutes (24 x 7).
Can I do any optimisation?
(Until I get considerably bigger disks!!)
Jakub Wroblewski - 29 May 2009 10:16 AM
Well, it would be definitely the faster way to have this column precalculated. And it is not a bad thing in data warehousing - in general, precalculated things (aggregations etc.) are common in this area.
A reason not to precalculate it: it would be a denormalization, redundancy etc. But in ICE this column should be well compressed.