Joinutility seperatorLogin utility separator Infobright.com
   
 
Performance issue
Posted: 10 August 2009 08:30 AM   Ignore ]  
Newbie
Rank
Total Posts:  20
Joined  2009-06-19

Hi,

    I have the table with 742000000 rows…both in MyISAM and Infobright. I have run the following query in both tables.

Infobright:
select id,count(id),sum(scheduled_payment),sum(current_balance) from loan_info where id = 372636 group by id;
1 row in set (6.49 sec)

MyISAM:
select id,count(id),sum(scheduled_payment),sum(current_balance) from loan_info where id = 372636 group by id;
1 row in set (2.20 sec)

Why infobright takes long time to execute the query?

See the attached brighthouse.ini file setting with 8GB ram, 64 bit server & ICE 3.2(RC1)

Is possible to fine tune the config variables?

What is KNFolder variable in the config? Will it must to set?

Thanks.

Regards,
S.Ashokkumar.

File Attachments 
brighthouse.txt  (File Size: 3KB - Downloads: 408)
Profile
 
Posted: 11 August 2009 12:05 PM   Ignore ]   [ # 1 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  736
Joined  2008-08-18

Hi,

Could you, please, copy the bh.err log file for this query? Namely, the part between “Initial execution plan” and the final horizontal line.

Probably no tuning will be needed in any config files. Usually only the memory settings are important to be set, and your settings look OK.

Regards,

Signature 
Profile
 
Posted: 12 August 2009 01:12 AM   Ignore ]   [ # 2 ]  
Newbie
Rank
Total Posts:  20
Joined  2009-06-19
2009-08-12 10:06:21 [5] T:-TABLE_ALIAS(T:0,"loan_info")
T:-TMP_TABLE(T:-1)
VC:-2.0 CREATE_VC(T:-2,PHYS_COL(T:-1,A:0))
A:-T:-2.ADD_COLUMN(VC:-2.0,GROUP_BY,"id","ALL")
A:-T:-2.ADD_COLUMN(VC:-2.0,COUNT,"count(id)","ALL")
VC:-2.1 CREATE_VC(T:-2,PHYS_COL(T:-1,A:9))
A:-T:-2.ADD_COLUMN(VC:-2.1,SUM,"sum(scheduled_payment)","ALL")
VC:-2.2 CREATE_VC(T:-2,PHYS_COL(T:-1,A:4))
A:-T:-2.ADD_COLUMN(VC:-2.2,SUM,"sum(current_balance)","ALL")
VC:-2.3 CREATE_VC(T:-2,EXPR("372636"))
F:CREATE_FILTER(T:-2,VC:-2.0,=,VC:-2.3,<null>,WHERE)
T:-2.FILTER(F:0)
RESULT(T:-2)

2009-08-12 10:06:21 [5] Initial execution plan (non-join):
2009-08-12 10:06:21 [5] Cnd(0):   VC:0(t0a0BETVC:4(372636) AND VC:5(372636)         (0)
2009-08-12 10:06:21 [5] Packs/packrows after KN evaluation:
2009-08-12 10:06:21 [5] (t0Pckrows11324susp486 (10838 empty 0 full). Packs opened in 1 cond.: 486
^[[2009-08-12 10:06:25 [5] Packrows after exact evaluation (WHERE):
2009-08-12 10:06:25 [5] (t0): 11324 all packrows69 to open (including 0 full)
2009-08-12 10:06:25 [5] Aggregating355383 tuples left0 grfound so far
2009
-08-12 10:06:27 [5] Aggregated (1 gr). Omitted packrows0 partiallyout of 69 total.
2009-08-12 10:06:27 [5] Displaying result1 rows.
2009-08-12 10:06:27 [5] Total data packs actually loaded (approx.): 624
2009
-08-12 10:06:27 [5] ---------------------------------------------------------------------------- 


This is the log for that query.

Profile
 
Posted: 12 August 2009 06:01 AM   Ignore ]   [ # 3 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  736
Joined  2008-08-18

Thanks!

What we may deduce from the log: the most of time (4 s) consumes calcultion of condition “id = 372636”. We were able to filter out 95% of rows thanks to Knowledge Grid (note that logs report 486 suspicious data packs out of 11 thousands). Then the resulting 355 thousand rows are aggregated, which takes another 2 s.

The log shows that the data are quite well organized. On the other hand, they may be organized a bit better - if they are loaded in the way “id” column is ordered, then we would need only 6-7 data packs to be opened.

BTW, the query may be optimized a bit, by getting rid with GROUP BY (as we have only one group):

select 372636count(*), sum(scheduled_payment), sum(current_balancefrom loan_info where id 372636

Regards,

Signature 
Profile