Joinutility seperatorLogin utility separator Infobright.com
   
1 of 2
1
Brighthouse out of resources error: Out of resources on JOIN.
Posted: 28 May 2009 03:49 PM   Ignore ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

Me again!

Got the above error with this query

select from table_1 a
inner join country_code c on a
.ip1 between c.start and c.end
where date_time between 
'2009-03-01 00:00:00' and '2009-04-01 00:00:00'
and pkt_count 

Table_1 is 3 billion lines, country_code is 90k rows

Reasonable memory (about 12G) and plenty of disk (cache disk is 60G)

Any ideas?

Profile
 
Posted: 28 May 2009 10:27 PM   Ignore ]   [ # 1 ]  
Jr. Member
Avatar
RankRank
Total Posts:  96
Joined  2008-08-18

Please post your full/complete table definitions and some sample data.

Signature 

Brian Beharry, QA
Infobright

Profile
 
Posted: 29 May 2009 01:43 AM   Ignore ]   [ # 2 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  764
Joined  2008-08-18

Hi,

Any chance to change it into an equality join? Looks like the intermediate join result is just too large. (Plenty of disk space is not enough for 20 trillion of rows, as I suspect).

Regards,

Signature 
Profile
 
Posted: 29 May 2009 03:57 AM   Ignore ]   [ # 3 ]  
Super Duper Member
Avatar
RankRankRankRankRank
Total Posts:  966
Joined  2008-08-18

Hi!

The join result can have up to 3 000 000 000 * 90 000 rows. Each row consumes several bytes in memory during join. So the join result can be well over 60GB (pessimistically more than 1 PetaByte).
Surely, the conditions in where and join should eliminate some rows, but apparently they do not limit the join result well enough…
If you supply the query logs from bh.err, we may be able to see how many rows are going to be produced.

Profile
 
Posted: 29 May 2009 08:19 AM   Ignore ]   [ # 4 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

Thanks everyone.

Here are the table details:

mysqldescribe table_1;
+------------+------------+------+-----+---------+-------+
Field      Type       Null Key | Default | Extra |
+------------+------------+------+-----+---------+-------+
date_time  datetime   NO   |     |         |       | 
subsecond  int(11)    | NO   |     |         |       | 
ip1        bigint(10) | NO   |     |         |       | 
port1      int(11)    | NO   |     |         |       | 
ip2        bigint(10) | NO   |     |         |       | 
port2      int(11)    | NO   |     |         |       | 
pkt_count  int(11)    | NO   |     |         |       | 
byte_count int(11)    | NO   |     |         |       | 
+------------+------------+------+-----+---------+-------+
8 rows in set (0.00 sec)

mysqldescribe geoip;
+-------+------------+------+-----+---------+-------+
Field Type       Null Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
start bigint(10) | NO   |     |         |       | 
end   bigint(10) | NO   |     |         |       | 
cc    char(2)    | NO   |     |         |       | 
cn    char(50)   | NO   |     |         |       | 
+-------+------------+------+-----+---------+-------+
4 rows in set (0.05 sec)

mysql

I want the ‘cc’ (country code) field for ‘ip1’ in table1, hence the inner join. If there is a more efficient way of constructing the query then please let me know. The ‘start’ and ‘end’’ in the geoip table are decimal representations of IP address, as is ‘ip1’ in table_1

Cheers

Profile
 
Posted: 29 May 2009 08:21 AM   Ignore ]   [ # 5 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

Nothing of interest in the bh.err logs. Do I need to enable more logging somehow?

Janusz Borkowski - 29 May 2009 03:57 AM

Hi!

The join result can have up to 3 000 000 000 * 90 000 rows. Each row consumes several bytes in memory during join. So the join result can be well over 60GB (pessimistically more than 1 PetaByte).
Surely, the conditions in where and join should eliminate some rows, but apparently they do not limit the join result well enough…
If you supply the query logs from bh.err, we may be able to see how many rows are going to be produced.

Profile
 
Posted: 29 May 2009 08:45 AM   Ignore ]   [ # 6 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  764
Joined  2008-08-18

Hi,

I suppose this is guaranteed that start-end intervals are disjoint (to make sure the join will give an unequivocal answer). However, right now the join will produce n:m intermediate answer and then will make the final 1:n result. It will be optimized in some of the next releases.
Right now, the only workaround I can see would be to split the query into smaller pieces, joined by UNION ALL (preferably by date_time, e.g. by hours). And it is better to select only columns which are actually needed - for column-based engines it does make the difference.

Regards,

Signature 
Profile
 
Posted: 29 May 2009 08:52 AM   Ignore ]   [ # 7 ]  
Super Duper Member
Avatar
RankRankRankRankRank
Total Posts:  966
Joined  2008-08-18

Do I need to enable more logging somehow?

Try setting “ControlMessages = 2” in “brighthouse.ini”. Then restart server and execute the queries again. A log file bh.err will be created next to the brighthouse.ini file. This log file will contain information how the queries have been processed. Please post this file, so we can see how each of the queries has been handled by the engine. Then some conclusions on the resulting performance could be derived.

And indeed, instead of “select * ..” use “select col1, col2 ..” specifying only columns really needed. The performance difference can be large.

Profile
 
Posted: 29 May 2009 09:29 AM   Ignore ]   [ # 8 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

OK. I tried using a sub-query to limit the JOIN and I have the bh.err log.

See below:

select a.date_timea.ip1g.cc from 
    
(    select date_timeip1 from table_1 
        where date_time between 
'2009-03-01 00:00:00' and '2009-04-01 00:00:00'
        
and pkt_count 4
    
)  a
inner join geoip g on a
.ip1 between g.start and g.end 
2009-05-29 07:17:45 [3] T:-TABLE_ALIAS(T:1,"table_1")
T:-TMP_TABLE(T:-1)
F:CREATE_FILTER(T:-2,TERM(t:-1 a:0),BETWEEN,TERM(vt:2009-03-01 00:00:00),TERM(vt:2009-04-01 00:00:00),WHERE)
F:= AND(F:0,TERM(t:-1 a:6),<,TERM(vn:4),TERM(<null>))
T:-2.FILTER(F:0)
A:-T:-2.ADD_COLUMN(TERM(t:-1 a:0),LIST,"date_time","ALL")
A:-T:-2.ADD_COLUMN(TERM(t:-1 a:2),LIST,"ip1","ALL")
T:-TMP_TABLE(T:-2)
T:-TABLE_ALIAS(T:0,"geoip")
T:-3.JOIN(T:-4)
F:CREATE_FILTER(T:-3,TERM(t:-2 a:-2),BETWEEN,TERM(t:-4 a:0),TERM(t:-4 a:1),WHERE)
T:-3.FILTER(F:1)
A:-T:-3.ADD_COLUMN(TERM(t:-2 a:-1),LIST,"date_time","ALL")
A:-T:-3.ADD_COLUMN(TERM(t:-2 a:-2),LIST,"ip1","ALL")
A:-T:-3.ADD_COLUMN(TERM(t:-4 a:2),LIST,"cc","ALL")
RESULT(T:-3)

2009-05-29 07:17:45 [3] Initial execution plan (non-join):
2009-05-29 07:17:45 [3] Cnd(0):   t:-1 a:6 BETvn:-inf AND vn:3        (9.81)
2009-05-29 07:17:45 [3] Cnd(1):   t:-1 a:0 BETvn:2.761408958e+16 AND vn:2.761494857e+16       (18.98)
2009-05-29 07:17:46 [3] Packs/packrows after KN evaluation:
2009-05-29 07:17:46 [3] (t0Pckrows49200susp2705 (46495 empty 0 full). Packs opened in 2 cond.: 2760
2009
-05-29 07:18:07 [3] Packrows after exact evaluation (WHERE):
2009-05-29 07:18:07 [3] (t0): 49200 all packrows2695 to open (including 0 full)
2009-05-29 07:18:52 [3] Initial execution plan (non-join):
2009-05-29 07:18:52 [3] Packs/packrows after KN evaluation:
2009-05-29 07:18:52 [3] (t0Pckrows512susp512 (empty 0 full). Packs opened in 0 cond.: 0
2009
-05-29 07:18:52 [3] (t1Pckrows2susp(empty 0 full). Packs opened in 0 cond.: 0
2009
-05-29 07:18:52 [3] (t0-t1Pack2Pack pairs100possible.
2009-05-29 07:18:52 [3] Packrows after exact evaluation (WHERE):
2009-05-29 07:18:52 [3] (t0): 512 all packrows512 to open (including 512 full)
2009-05-29 07:18:52 [3] (t1): 2 all packrows2 to open (including 2 full)
2009-05-29 07:18:52 [3] Join execution plan:
2009-05-29 07:18:52 [3] Cnd(0):   t:-2 a:->= t:-4 a:0         (48.83)
2009-05-29 07:18:52 [3] Cnd(1):   t:-2 a:-<= t:-4 a:1         (48.83)
2009-05-29 07:19:12 [3] ErrorOut of resources on JOIN
Profile
 
Posted: 29 May 2009 09:36 AM   Ignore ]   [ # 9 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

If I run only the sub-query without the JOIN it works. Here is the bh.err log

2009-05-29 07:29:10 [3] T:-TABLE_ALIAS(T:0,"table_1")
T:-TMP_TABLE(T:-1)
F:CREATE_FILTER(T:-2,TERM(t:-1 a:0),BETWEEN,TERM(vt:2009-03-01 00:00:00),TERM(vt:2009-04-01 00:00:00),WHERE)
F:= AND(F:0,TERM(t:-1 a:6),<,TERM(vn:4),TERM(<null>))
T:-2.FILTER(F:0)
A:-T:-2.ADD_COLUMN(TERM(t:-1 a:0),LIST,"date_time","ALL")
A:-T:-2.ADD_COLUMN(TERM(t:-1 a:2),LIST,"ip1","ALL")
T:-TMP_TABLE(T:-2)
A:-T:-3.ADD_COLUMN(TERM(t:-2 a:-1),LIST,"date_time","ALL")
A:-T:-3.ADD_COLUMN(TERM(t:-2 a:-2),LIST,"ip1","ALL")
RESULT(T:-3)

2009-05-29 07:29:10 [3] Initial execution plan (non-join):
2009-05-29 07:29:10 [3] Cnd(0):   t:-1 a:6 BETvn:-inf AND vn:3        (9.81)
2009-05-29 07:29:10 [3] Cnd(1):   t:-1 a:0 BETvn:2.761408958e+16 AND vn:2.761494857e+16       (18.98)
2009-05-29 07:29:11 [3] Packs/packrows after KN evaluation:
2009-05-29 07:29:11 [3] (t0Pckrows49200susp2705 (46495 empty 0 full). Packs opened in 2 cond.: 2760
2009
-05-29 07:29:18 [3] Packrows after exact evaluation (WHERE):
2009-05-29 07:29:18 [3] (t0): 49200 all packrows2695 to open (including 0 full)
2009-05-29 07:29:47 [3] Displaying result33537982 rows
Profile
 
Posted: 29 May 2009 09:40 AM   Ignore ]   [ # 10 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  764
Joined  2008-08-18

Hi,

The subquery probably will not help. Please try this:

select a.date_timea.ip1c.cc  from table_1 a
inner join country_code c on a
.ip1 between c.start and c.end
where date_time between 
'2009-03-01 00:00:00' and '2009-03-03 00:00:00'
and pkt_count 4

Regards,

Signature 
Profile
 
Posted: 29 May 2009 09:49 AM   Ignore ]   [ # 11 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

So, that is my original query with only selected columns rather than ‘*’ - I was being lazy!

Well, that worked. Or at least it didn’t but now I can see why!!

2009-05-29 07:41:35 [7] Warninga big intermediate object created (~268 GB

I don’t have a spare 268GB on this machine….but that is alright as at least I understand why it is failing

I guess I can chop my time from one month to one week etc etc
Thanks again for all the help

Jakub Wroblewski - 29 May 2009 09:40 AM

Hi,

The subquery probably will not help. Please try this:

select a.date_timea.ip1c.cc  from table_1 a
inner join country_code c on a
.ip1 between c.start and c.end
where date_time between 
'2009-03-01 00:00:00' and '2009-03-03 00:00:00'
and pkt_count 4

Regards,

Profile
 
Posted: 29 May 2009 09:52 AM   Ignore ]   [ # 12 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  764
Joined  2008-08-18

Hi,

It was a three day period. So please try with one day or so…

Regards,

Signature 
Profile
 
Posted: 29 May 2009 10:10 AM   Ignore ]   [ # 13 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

...or even a one hour period!

2009-05-29 07:48:15 [10] Initial execution plan (non-join):
2009-05-29 07:48:15 [10] Cnd(0):   t:-1 a:6 BETvn:-inf AND vn:3     (9.81)
2009-05-29 07:48:15 [10] Cnd(1):   t:-1 a:0 BETvn:2.761408958e+16 AND vn:2.761408958e+16     (10.09)
2009-05-29 07:48:15 [10] Packs/packrows after KN evaluation:
2009-05-29 07:48:15 [10] (t0Pckrows49200susp36 (49164 empty 0 full). Packs opened in 2 cond.: 72
2009
-05-29 07:48:15 [10] (t1Pckrows2susp(empty 0 full). Packs opened in 0 cond.: 0
2009
-05-29 07:48:15 [10] (t0-t1Pack2Pack pairs0.07possible.
2009-05-29 07:48:16 [10] Packrows after exact evaluation (WHERE):
2009-05-29 07:48:16 [10] (t0): 49200 all packrows16 to open (including 0 full)
2009-05-29 07:48:16 [10] (t1): 2 all packrows2 to open (including 2 full)
2009-05-29 07:48:16 [10] Join execution plan:
2009-05-29 07:48:16 [10] Cnd(2):   t:-1 a:>= t:-3 a:0     (52.01)
2009-05-29 07:48:16 [10] Cnd(3):   t:-1 a:<= t:-3 a:1     (52.01)
2009-05-29 07:49:00 [10] Sorting roughly multiindex...
2009-05-29 08:00:29 [10] Sorting roughly multiindex...
2009-05-29 08:00:29 [10] Updating P2P...
2009-05-29 08:00:29 [10] Tuples after 2 condinner join 0-1 [mix.]16060
2009
-05-29 08:00:30 [10] Displaying result16060 rows.
2009-05-29 08:00:30 [10] Total data packs actually loaded (approx.): 4
2009
-05-29 08:00:30 [10] ---------------------------------------------------------------------------- 

That worked, but not sure how to progress one hour at a time grin

A more fundamental question: when I parse the raw data to produce the input file, maybe I should lookup the country code at that point in the preparation phase, and store it in the table rather than try and do a JOIN like this? Although it kinda defeats the object of the ‘warehouse’ doesn’t it? Or is that the correct thing to do?

Thanks

Profile
 
Posted: 29 May 2009 10:16 AM   Ignore ]   [ # 14 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  764
Joined  2008-08-18

Well, it would be definitely the faster way to have this column precalculated. And it is not a bad thing in data warehousing - in general, precalculated things (aggregations etc.) are common in this area.

A reason not to precalculate it: it would be a denormalization, redundancy etc. But in ICE this column should be well compressed.

Regards,

Signature 
Profile
 
Posted: 29 May 2009 10:27 AM   Ignore ]   [ # 15 ]  
Newbie
Rank
Total Posts:  49
Joined  2009-05-19

I guess a second reason NOT to pre-calculate is if the country code mapping changes - as it could easily do.

Given it works when I do one hour, is there any further optimisation I could do to make that one hour run faster? A query covering one hour takes 12 minutes. So to cover one day I could do a UNION ALL on 24 queries it would take about 288 minutes (24 x 7).

The bit that takes the time is

2009-05-29 07:49:00 [10] Sorting roughly multiindex... 

Can I do any optimisation?
(Until I get considerably bigger disks!!)

Jakub Wroblewski - 29 May 2009 10:16 AM

Well, it would be definitely the faster way to have this column precalculated. And it is not a bad thing in data warehousing - in general, precalculated things (aggregations etc.) are common in this area.

A reason not to precalculate it: it would be a denormalization, redundancy etc. But in ICE this column should be well compressed.

Regards,

Profile
 
   
1 of 2
1