Joinutility seperatorLogin utility separator Infobright.com
   
 
Load never returns
Posted: 20 October 2008 06:09 PM   Ignore ]  
Newbie
Rank
Total Posts:  3
Joined  2008-09-16

Help!!!!!!

I’m trying to load a 2 million row input file into a table and it just never returns.

I did a sample load on a few hundred rows and it returns very quickly.  However, when I try to load the entire file I can leave it for hours and it just doesn’t come back.

Here’s the ddl for the table:

mysql> show full columns from tst_load;

+-------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+--------------------------------------------+
Field       Type        Collation         Null Key | Default | Extra Privileges                      Comment                                    |
+-------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+--------------------------------------------+
user_agent  varchar(30) | latin1_swedish_ci YES  |     | NULL    |       | select,insert,update,references LOOKUPSize[MB]0.1Ratio0.00unique 
ip          varchar(15) | latin1_swedish_ci YES  |     | NULL    |       | select,insert,update,references Size[MB]0.1Ratio0.00unique         
new_visitor tinyint(4)  | NULL              YES  |     | NULL    |       | select,insert,update,references Size[MB]0.1Ratio0.00unique         
created     datetime    NULL              YES  |     | NULL    |       | select,insert,update,references Size[MB]0.1Ratio0.00unique         
+-------------+-------------+-------------------+------+-----+---------+-------+---------------------------------+--------------------------------------------+ 

4 rows in set (0.00 sec)

Profile
 
Posted: 21 October 2008 04:38 AM   Ignore ]   [ # 1 ]  
Super Duper Member
Avatar
RankRankRankRankRank
Total Posts:  964
Joined  2008-08-18

Hi !

Thanks for contacting us. I would like to begin with some questions…

- Could you tell if the loader hangs (zero activity), or it loops (100% CPU usage), or there is huge disk activity ? You may run “vmstat 1” in a separate console to get activity reports.

- What memory setting in brighthouse.ini you have, and what is the memory size in your system?

- What is the syntax of the load command you use?

- Can you post sample data?

Profile
 
Posted: 21 October 2008 04:45 AM   Ignore ]   [ # 2 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  738
Joined  2008-08-18

argyllsock,

Could you please run:

SELECT COUNT(DISTINCT user_agentFROM tst_load

I suspect that your load is being impacted by the maintenance of the LOOKUP table on the user_agent field.  LOOKUP is designed to be used where the number of unique values is less than 10,000.  For numbers greater than 10,000, you will see the degradation in load performance that was experienced.  If the result of your query exceeds 10,000, please consider removing the LOOKUP attribute from this field and reload.

Best regards,

Signature 
Profile
 
Posted: 21 October 2008 09:33 AM   Ignore ]   [ # 3 ]  
Newbie
Rank
Total Posts:  3
Joined  2008-09-16

Wow, that’s a lot of things to try out for this one…

So, I took the easy one first:

I ran the select count distinct and it returned 1,732,221 distinct values for user_agent.

So, I removed the LOOKUP from the table and reloaded.  Worked great - the load finished in a few minutes.

Thanks.

Profile
 
Posted: 22 October 2008 11:24 AM   Ignore ]   [ # 4 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  738
Joined  2008-08-18

argyllsock,

That’s great news.  Best of luck with your continued use of INFOBRIGHT.

Signature 
Profile