Joinutility seperatorLogin utility separator Infobright.com
   
1 of 2
1
Infobright Uknown Error on Load
Posted: 25 June 2010 02:27 PM   Ignore ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15

Hello:

We are running the community version of Infobright - IB_3.3.1_r6997_7017(ice). We have this issue during bulk load of a certain table. The error we get is:

ERROR 1402 (XA100) at line 1: The Infobright storage engine has encountered an unexpected error. The current transaction has been rolled back. For details on the error please see the brighthouse.log file in
the /mnt/infobright-3.3.1-x86_64/data/ directory.

Now looking at the logs did not provide any clues. All it says in the logs is:

2010-06-24 21:43:39 The Infobright storage engine has encountered an unexpected error. The current transaction has been rolled back.
2010-06-24 21:43:39 Error: Unknown error.

Now, to narrow down the issue, what I did was break the large file into smaller junks of 10K lines and then try to load them. Doing this—loading the smaller chunks of 10K each succeeded.


Does any one know what maybe causing this issue?

TIA,
M.

Profile
 
Posted: 25 June 2010 07:45 PM   Ignore ]   [ # 1 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  738
Joined  2008-08-18

The error message is just a translation of the MySQL error message, which is not very specific, but would lend itself to an ‘out of resources’ scenario.

Questions:
- How large is a 10k line file?  How large are all of them combined?
- What platform are you running on?  Windows/Linux?  32- or 64-bit?  How much memory?  How much memory allocated to the database server?

Sorry to ask for so many details.  Just trying to be thorough and efficient in a single response.

Signature 
Profile
 
Posted: 28 June 2010 01:42 PM   Ignore ]   [ # 2 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15

Hello David,

Thanks for your reply. Sorry I should have provided more information in the OP.


(i) Each individual file is ~2.7 MBytes and the total big file is: ~30MBytes. The big file has 113,614 lines in it.
(ii) We are running on 64-Bit CentOS with 32GB RAM. These are the parameters in brighthouse.ini
ServerMainHeapSize = 24000
ServerCompressedHeapSize = 4000
LoaderMainHeapSize = 800


TIA,
M

Profile
 
Posted: 06 July 2010 11:00 AM   Ignore ]   [ # 3 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15

Friendly bump smile

Profile
 
Posted: 07 July 2010 04:03 AM   Ignore ]   [ # 4 ]  
Super Duper Member
Avatar
RankRankRankRankRank
Total Posts:  912
Joined  2008-08-18

Hi !

Do you have a scenario (data and statements) to reproduce the problem ? Finding the reason may require deeper investigation…

Profile
 
Posted: 14 October 2010 01:57 PM   Ignore ]   [ # 5 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15
Janusz Borkowski - 07 July 2010 04:03 AM

Hi !

Do you have a scenario (data and statements) to reproduce the problem ? Finding the reason may require deeper investigation…

We have upgraded to 3.4.2 on CentOS (64-bit) and we are consistently getting this issue. If you like I can send you the file and the table schema. Please let me know where I could send this to.

TIA,
M

Profile
 
Posted: 14 October 2010 02:37 PM   Ignore ]   [ # 6 ]  
Administrator
Avatar
RankRankRank
Total Posts:  145
Joined  2008-08-18

Hi aceeca;

one suggestion, and another question:


1) could you try setting
ServerMainHeapSize = 16000
and then running your load?
(you’ll need to restart infobright)

2) is autocommit set =0 or 1 ?


I’ll also inquire about getting a site for uploading.

tw

Profile
 
Posted: 14 October 2010 02:57 PM   Ignore ]   [ # 7 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15
Todd Wongkee - 14 October 2010 02:37 PM

Hi aceeca;

one suggestion, and another question:


1) could you try setting
ServerMainHeapSize = 16000
and then running your load?
(you’ll need to restart infobright)

2) is autocommit set =0 or 1 ?


I’ll also inquire about getting a site for uploading.

tw

1) I was using 24000, since I had a 32GB box. I’ll try with 16000.

2) autocommit is 0,

TIA,
M

Profile
 
Posted: 14 October 2010 03:01 PM   Ignore ]   [ # 8 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15

16000 is also failing.

Profile
 
Posted: 15 October 2010 01:54 PM   Ignore ]   [ # 9 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15

Further digging, I suspect that what is mentioned by the OP at:

http://www.infobright.org/Forums/viewthread/1797/

could also be the cause of my issue. I will try changing to UTF8

Profile
 
Posted: 20 October 2010 01:09 PM   Ignore ]   [ # 10 ]  
Newbie
Rank
Total Posts:  9
Joined  2008-10-15
Todd Wongkee - 14 October 2010 02:37 PM

Hi aceeca;

one suggestion, and another question:


1) could you try setting
ServerMainHeapSize = 16000
and then running your load?
(you’ll need to restart infobright)

2) is autocommit set =0 or 1 ?


I’ll also inquire about getting a site for uploading.

tw

After passing on the relevant files to “TW”, he figured that the issue was with having a “lookup” for some of the columns. Removing this definition from the columns resulted in successful loading of the tables. Thanks TW, you are the best.

—Aceeca

Profile
 
Posted: 19 November 2010 06:29 AM   Ignore ]   [ # 11 ]  
Newbie
Rank
Total Posts:  9
Joined  2009-12-07

Question for David Lutz, Janusz Borkowski, and Todd Wongkee:
Can you please comment on aceeca’s last post?
(...with having a “lookup” for some of the columns. Removing this definition from the columns resulted in successful loading of the tables)
As far as I understood, defining columns as “lookup” is a major advantage in what regards the queries speed.
—-
I have the same problem. In our case:
1. we have ICE v.3.3.1 running on a server
2. we have ICE v 3.4.2 installed on a CEntOS 5 - x86_64
3. we need to migrate a DB from the first server to the second one
We’ve started by SELECT INTO OUTFILE with a limit of 10M (10,000,000) rows.
Then tried to LOAD DATA INFILE on the second server
First chunk loaded fine. The second returns the error mentioned in this thread. We’re stuck.

aceeca mentions that he keeps having this error after upgrading ICE to 3.4.2. Can you please comment on this?

Profile
 
Posted: 19 November 2010 09:54 AM   Ignore ]   [ # 12 ]  
Sr. Member
Avatar
RankRankRankRank
Total Posts:  738
Joined  2008-08-18
mariusg - 19 November 2010 06:29 AM

1. we have ICE v.3.3.1 running on a server
2. we have ICE v 3.4.2 installed on a CEntOS 5 - x86_64
3. we need to migrate a DB from the first server to the second one
We’ve started by SELECT INTO OUTFILE with a limit of 10M (10,000,000) rows.
Then tried to LOAD DATA INFILE on the second server
First chunk loaded fine. The second returns the error mentioned in this thread. We’re stuck.

mariusg,

Can I assume that the DDL for the tables on the different servers is the same, meaning that lookups are defined on the same columns in both databases?  If so, it’s possible that the table in 3.3.1 was loaded in multiple increments instead of a single large load (or loads) as you are attempting in 3.4.2, and not experienced the condition I describe below.

The lookup issue could appear here if the cardinality (uniqueness) of any of the columns defined with lookups exceeds 10,000.  If cardinality exceeds 10,000, you will experience slow loads and unpredictable load behavior.  If cardinality exceeds 50,000, the loader has been known to crash during load execution.

To determine cardinality for a given column, on 3.3.1 run

SELECT COUNT(DISTINCT colname)
FROM table

If it’s greater than 10,000, remove the lookup from that column on the table in 3.4.2.  (You cannot later alter the column to add the lookup.)

Also, the uniqueness of the column’s values should be 10% or less of the total number of values in the column (= records in the table).  You can determine this by comparing the cardinality from above against the total number of values returned from:

SELECT COUNT(colname)
FROM table

Bear in mind, that this measurement and decision is made at a point in time.  Lookups are best used when there is a finite range of possible values that meets the cardinality recommendation - states or provinces in a country, gender, product codes, status codes, etc.  They are not appropriate for columns where the cardinality is initially low but grows over time - names, cities, free-form text fields, etc. - if the total table size will exceed 10,000 records.

And another thing, in general, the lower the cardinality the better.

Signature 
Profile
 
Posted: 19 November 2010 12:16 PM   Ignore ]   [ # 13 ]  
Newbie
Rank
Total Posts:  9
Joined  2009-12-07

Hi David,

Much thanks for the quick reply. Answers to your questions:
1. Yes, same DDL. I have used 2 consoles. One one I have issued: “show create table…”, on the second I have pasted the result.
2. Yes, the column satisfies the conditions you mention:
InfoBright>select count(distinct COL) from TABLE;
+—————————————+
| count(distinct COL) |
+—————————————+
|              1369 |
+—————————————+
1 row in set (0.30 sec)
InfoBright>select count(COL) from TABLE;
+————————-+
| count(COL) |
+————————-+
|    157673933 |
+————————-+
1 row in set (0.00 sec)

It’s a bit strange, since last week we have to move in a similar fashion several DBs, of which one had 1.3G rows in the largest table. And this behavior struck out of the blue at the very last DB…

Profile
 
Posted: 02 December 2010 12:47 PM   Ignore ]   [ # 14 ]  
Administrator
RankRankRankRank
Total Posts:  413
Joined  2010-09-22

Hi mariusg,

Can you provide some sample data, your ddl, and load statement?

Cheers,
Jeff

Signature 

jeff kibler

Profile
 
Posted: 03 December 2010 06:01 AM   Ignore ]   [ # 15 ]  
Newbie
Rank
Total Posts:  9
Joined  2009-12-07

Hi Jeff,

I would rather not post sensitive information on a public forum. If there’s an alternative, just let me know.
However, I will post here excerpts from the scripts used for dumping / loading data:

# Dumps an InfoBright DB table; $1 is the DB; $2 is the table; use 10M rows in a chunk
TOT=$(mysql-ib -uUSER -pPASSWD -s -e “select count(*) from “$1”.”$2”;”);
OFF=o
INC=10000000
FILE=100
while [ $OFF -lt $TOT ]; do
  let STOP=$OFF+$INC
  mysql-ib -uUSER -pPASSWD -e \
    “set @bh_dataformat = ‘txt_variable’; \
    select * from “$1”.”$2”  LIMIT “$OFF”,”$INC” \
    into outfile ‘/tmp/”$1”.”$2$FILE”.csv’ \
    fields terminated by ‘|’ enclosed by ‘NULL’;”
  let OFF=$STOP;
  let FILE=$FILE+1;
done

Loading is done as per official documentation. It simply loops for all FILE
mysql-ib -uUSER -pPASS $1 -e \
  “LOAD DATA INFILE ‘/tmp/”$1”.”$2$FILE”.csv’ \
  INTO TABLE “$2” \
  FIELDS TERMINATED BY ‘|’ \
  ENCLOSED BY ‘NULL’;”

best regards,
Marius

Profile
 
   
1 of 2
1