error HY000 while load a txt file of 70GB
Posted: 27 November 2012 11:03 AM
[ Ignore ]
Newbie
Total Posts: 16
Joined 2012-11-27
hi,
i am using a txt file of 70GB to open to load the data,
i have 32GB of ram, is it possible it is a memory problem?
is there some other way i need to load it instead of:
LOAD DATA INFILE ‘D:\\data\\tpch\\lineitem\lineItem.tbl’ INTO TABLE LINEITEM FIELDS TERMINATED BY ‘| ‘;
Posted: 27 November 2012 11:33 AM
[ Ignore ]
[ # 1 ]
Administrator
Total Posts: 352
Joined 2012-01-18
In your brighthouse.ini, what is your setting for
LoaderMainHeapSize=
ServerMainHeapSize=
ControlMessages =
Please set your ControlMessages = 5, and let me know what the other settings are please.
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 27 November 2012 11:39 AM
[ Ignore ]
[ # 2 ]
Newbie
Total Posts: 16
Joined 2012-11-27
brighthouse.ini:
LoaderMainHeapSize= 800
ServerMainHeapSize= 24000
ControlMessages =0
following your request i set it to 5, do i need to restart the service?
Posted: 27 November 2012 12:05 PM
[ Ignore ]
[ # 3 ]
Administrator
Total Posts: 352
Joined 2012-01-18
Yes, stop & restart the service, ALSO set up the Reject file per
http://www.infobright.org/wiki/Reject_File/
It is very easy and will handle any data issues at the row level. Then run your LOAD statement and send me the bh.err file please.
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 28 November 2012 07:33 AM
[ Ignore ]
[ # 4 ]
Newbie
Total Posts: 16
Joined 2012-11-27
hi,
i used:
set @BH_REJECT_FILE_PATH = ‘d:\\data\\reject_file.txt’;
set @BH_ABORT_ON_COUNT = 10;
LOAD DATA INFILE ‘D:\\data\\tpch\\lineitem\\lineItem.txt’ INTO TABLE LINEITEM FIELDS TERMINATED BY ‘| ‘;
i got error:
ERROR 2 (HY000): Wrong data or column definition. Row: 10, field: 1.
and bh.err:
Loading configuration for Infobright instance ...
Option: AllowMySQLQueryPath, value: 0.
Option: AutoConfigure, value: 0.
Option: CacheFolder, value: D:\Infobright\cache.
Option: ControlMessages, value: 0.
Option: InternalMessages, value: 0.
Option: KNFolder, value: BH_RSI_Repository.
Option: KNLevel, value: 99.
Option: LicenseFile, value: <unknown>.
Option: LoaderMainHeapSize, value: 800.
Option: PushDown, value: 1.
Option: ServerMainHeapSize, value: 24000.
Option: UseMySQLImportExportDefaults, value: 0.
Infobright instance configuration loaded.
121128 11:21:52 [Note] Event Scheduler: Loaded 0 events
121128 11:21:52 [Note] Infobright Community Edition is ready for connections.
socket: ‘’ port: 5029 build number (revision)=IB_4.0.7_r16961_17249(ice) (shared)
Posted: 28 November 2012 11:52 AM
[ Ignore ]
[ # 5 ]
Administrator
Total Posts: 352
Joined 2012-01-18
Row 10, means that part of the load is already happenning, but since you set it at 10 for abort, all rows are not loading.
It sounds to me like you may not have the correct column terminators, line terminitaors and escaped by settings properly for the file.
Can you please send me a small set of the file, maybe the first 5 lines?
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 28 November 2012 12:19 PM
[ Ignore ]
[ # 6 ]
Newbie
Total Posts: 16
Joined 2012-11-27
hi,
thanks for your fast reply, attached my sample text file.
File Attachments
Posted: 28 November 2012 12:44 PM
[ Ignore ]
[ # 7 ]
Administrator
Total Posts: 352
Joined 2012-01-18
Can you please email me the DDL? I think i may see the issue. This looks like it was generated by some older unix environment (maybe SCO server, or UNIX or something like that) , the line terminators are different. Send me the DDL and I will write the statment for you.
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 28 November 2012 01:26 PM
[ Ignore ]
[ # 8 ]
Newbie
Total Posts: 16
Joined 2012-11-27
hi ,
1. what dll? please provide full name.
2. i am using a windows environment.
thank
Posted: 28 November 2012 01:27 PM
[ Ignore ]
[ # 9 ]
Administrator
Total Posts: 352
Joined 2012-01-18
Can you please send me the Create Table statement for the table you are loading this data into?
DDL = Data Definition Language
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 28 November 2012 02:35 PM
[ Ignore ]
[ # 10 ]
Newbie
Total Posts: 16
Joined 2012-11-27
CREATE TABLE LINEITEM ( L_ORDERKEY INT NOT NULL,
L_PARTKEY INT NOT NULL,
L_SUPPKEY INT NOT NULL,
L_LINENUMBER INT NOT NULL,
L_QUANTITY DECIMAL(15,2) NOT NULL,
L_EXTENDEDPRICE DECIMAL(15,2) NOT NULL,
L_DISCOUNT DECIMAL(15,2) NOT NULL,
L_TAX DECIMAL(15,2) NOT NULL,
L_RETURNFLAG CHAR(1) NOT NULL,
L_LINESTATUS CHAR(1) NOT NULL,
L_SHIPDATE DATE NOT NULL,
L_COMMITDATE DATE NOT NULL,
L_RECEIPTDATE DATE NOT NULL,
L_SHIPINSTRUCT CHAR(25) NOT NULL,
L_SHIPMODE CHAR(10) NOT NULL,
L_COMMENT VARCHAR(44) NOT NULL);
Posted: 28 November 2012 02:36 PM
[ Ignore ]
[ # 11 ]
Newbie
Total Posts: 16
Joined 2012-11-27
again, thank for the fast reply.
Posted: 28 November 2012 02:44 PM
[ Ignore ]
[ # 12 ]
Administrator
Total Posts: 352
Joined 2012-01-18
LOAD DATA INFILE ‘C:\\Projects\\InfoBright\\tempLineItem-first15lines.txt’
INTO TABLE LINEITEM FIELDS TERMINATED BY ‘|’
This statement worked for me. Two things to note:
1. There is an additional space in your fields terminated by ‘| ’ <- note the space after the pipe
2. Looking at your text file in a binary editor or hex editor, I found that there are 2 characters that are NOT visible that is your line break, ‘\n\r’. You may need to specify this just incase.
Signature
Craig Trombly
Community Manager
60 Hazelwood Dr Champaign, IL 61820
(JavaScript must be enabled to view this email address)
Posted: 29 November 2012 07:22 AM
[ Ignore ]
[ # 13 ]
Newbie
Total Posts: 16
Joined 2012-11-27
thanks , it works perfect!