Joinutility seperatorLogin utility separator Infobright.com
   
 
A pack for every column including nulls?
Posted: 31 December 2008 05:23 PM   Ignore ]  
Newbie
Rank
Total Posts:  8
Joined  2008-12-24

My application is atypical, in that on every query, I need every column of data, and 40 or more columns (out of 52) are likely to be null.  In the bh.err excerpt below, I noticed that loading 1 row of data seemed to have opened 52 packs, even though most of the returned columns were null.  Is there any way around this?

2008-12-28 14:14:15 [1] T:-1 = TABLE_ALIAS(T:0)
T:-2 = TMP_TABLE(T:-1)
F:0 = CREATE_FILTER(T:-2,TERM(t:-1 a:4),<=,TERM(vn:1992),TERM(<null>),WHERE)
F:0 = AND(F:0,TERM(t:-1 a:5),>=,TERM(vn:1992),TERM(<null>))
F:0 = AND(F:0,TERM(t:-1 a:0),=,TERM(vn:1),TERM(<null>))
F:0 = AND(F:0,TERM(t:-1 a:1),=,TERM(vn:234),TERM(<null>))
F:0 = AND(F:0,TERM(t:-1 a:2),=,TERM(vn:84),TERM(<null>))
F:0 = AND(F:0,TERM(t:-1 a:3),=,TERM(vn:1899),TERM(<null>))
F:0 = AND(F:0,TERM(t:-1 a:6),=,TERM(BSHG),TERM(<null>))
T:-2.FILTER(F:0)
A:-1 = T:-2.ADD_COLUMN(TERM(t:-1 a:0),LIST,“CategoryId”,“ALL”)
A:-2 = T:-2.ADD_COLUMN(TERM(t:-1 a:1),LIST,“SubcategoryId”,“ALL”)
A:-3 = T:-2.ADD_COLUMN(TERM(t:-1 a:2),LIST,“MakeId”,“ALL”)
A:-4 = T:-2.ADD_COLUMN(TERM(t:-1 a:3),LIST,“ModelId”,“ALL”)
A:-5 = T:-2.ADD_COLUMN(TERM(t:-1 a:4),LIST,“FromYear”,“ALL”)
A:-6 = T:-2.ADD_COLUMN(TERM(t:-1 a:5),LIST,“ToYear”,“ALL”)
A:-7 = T:-2.ADD_COLUMN(TERM(t:-1 a:6),LIST,“BrandId”,“ALL”)
A:-8 = T:-2.ADD_COLUMN(TERM(t:-1 a:7),LIST,“BrandName”,“ALL”)
A:-9 = T:-2.ADD_COLUMN(TERM(t:-1 a:8),LIST,“PartTerminologyId”,“ALL”)
A:-10 = T:-2.ADD_COLUMN(TERM(t:-1 a:9),LIST,“PartTerminologyName”,“ALL”)
A:-11 = T:-2.ADD_COLUMN(TERM(t:-1 a:10),LIST,“PartNumber”,“ALL”)
A:-12 = T:-2.ADD_COLUMN(TERM(t:-1 a:11),LIST,“Quantity”,“ALL”)
A:-13 = T:-2.ADD_COLUMN(TERM(t:-1 a:12),LIST,“AspirationName”,“ALL”)
A:-14 = T:-2.ADD_COLUMN(TERM(t:-1 a:13),LIST,“BedLength”,“ALL”)
A:-15 = T:-2.ADD_COLUMN(TERM(t:-1 a:14),LIST,“BedType”,“ALL”)
A:-16 = T:-2.ADD_COLUMN(TERM(t:-1 a:15),LIST,“BodyNumDoors”,“ALL”)
A:-17 = T:-2.ADD_COLUMN(TERM(t:-1 a:16),LIST,“BodyTypeName”,“ALL”)
A:-18 = T:-2.ADD_COLUMN(TERM(t:-1 a:17),LIST,“BrakeABSName”,“ALL”)
A:-19 = T:-2.ADD_COLUMN(TERM(t:-1 a:18),LIST,“BrakeSystemName”,“ALL”)
A:-20 = T:-2.ADD_COLUMN(TERM(t:-1 a:19),LIST,“BrakeTypeFront”,“ALL”)
A:-21 = T:-2.ADD_COLUMN(TERM(t:-1 a:20),LIST,“BrakeTypeRear”,“ALL”)
A:-22 = T:-2.ADD_COLUMN(TERM(t:-1 a:21),LIST,“CylinderHeadTypeName”,“ALL”)
A:-23 = T:-2.ADD_COLUMN(TERM(t:-1 a:22),LIST,“DisplayOrder”,“ALL”)
A:-24 = T:-2.ADD_COLUMN(TERM(t:-1 a:23),LIST,“DriveTypeName”,“ALL”)
A:-25 = T:-2.ADD_COLUMN(TERM(t:-1 a:24),LIST,“EngineBaseId”,“ALL”)
A:-26 = T:-2.ADD_COLUMN(TERM(t:-1 a:25),LIST,“EngineVINName”,“ALL”)
A:-27 = T:-2.ADD_COLUMN(TERM(t:-1 a:26),LIST,“EngineVersion”,“ALL”)
A:-28 = T:-2.ADD_COLUMN(TERM(t:-1 a:27),LIST,“EngineMfr”,“ALL”)
A:-29 = T:-2.ADD_COLUMN(TERM(t:-1 a:28),LIST,“FuelDeliveryTypeName”,“ALL”)
A:-30 = T:-2.ADD_COLUMN(TERM(t:-1 a:29),LIST,“FuelDeliverySubTypeName”,“ALL”)
A:-31 = T:-2.ADD_COLUMN(TERM(t:-1 a:30),LIST,“FuelSystemControlTypeName”,“ALL”)
A:-32 = T:-2.ADD_COLUMN(TERM(t:-1 a:31),LIST,“FuelSystemDesignName”,“ALL”)
A:-33 = T:-2.ADD_COLUMN(TERM(t:-1 a:32),LIST,“FuelTypeName”,“ALL”)
A:-34 = T:-2.ADD_COLUMN(TERM(t:-1 a:33),LIST,“IgnitionSystemTypeName”,“ALL”)
A:-35 = T:-2.ADD_COLUMN(TERM(t:-1 a:34),LIST,“MfrLabel”,“ALL”)
A:-36 = T:-2.ADD_COLUMN(TERM(t:-1 a:35),LIST,“Position”,“ALL”)
A:-37 = T:-2.ADD_COLUMN(TERM(t:-1 a:36),LIST,“RegionId”,“ALL”)
A:-38 = T:-2.ADD_COLUMN(TERM(t:-1 a:37),LIST,“RestraintTypeName”,“ALL”)
A:-39 = T:-2.ADD_COLUMN(TERM(t:-1 a:38),LIST,“SpringTypeFront”,“ALL”)
A:-40 = T:-2.ADD_COLUMN(TERM(t:-1 a:39),LIST,“SpringTypeRear”,“ALL”)
A:-41 = T:-2.ADD_COLUMN(TERM(t:-1 a:40),LIST,“SteeringSystemName”,“ALL”)
A:-42 = T:-2.ADD_COLUMN(TERM(t:-1 a:41),LIST,“SteeringTypeName”,“ALL”)
A:-43 = T:-2.ADD_COLUMN(TERM(t:-1 a:42),LIST,“SubModelName”,“ALL”)
A:-44 = T:-2.ADD_COLUMN(TERM(t:-1 a:43),LIST,“TransferCaseId”,“ALL”)
A:-45 = T:-2.ADD_COLUMN(TERM(t:-1 a:44),LIST,“TransferCaseBaseId”,“ALL”)
A:-46 = T:-2.ADD_COLUMN(TERM(t:-1 a:45),LIST,“TransferCaseMfrCode”,“ALL”)
A:-47 = T:-2.ADD_COLUMN(TERM(t:-1 a:46),LIST,“TransmissionControlTypeName”,“ALL”)
A:-48 = T:-2.ADD_COLUMN(TERM(t:-1 a:47),LIST,“TransmissionElecControlled”,“ALL”)
A:-49 = T:-2.ADD_COLUMN(TERM(t:-1 a:48),LIST,“TransmissionMfrCode”,“ALL”)
A:-50 = T:-2.ADD_COLUMN(TERM(t:-1 a:49),LIST,“TransmissionNumSpeeds”,“ALL”)
A:-51 = T:-2.ADD_COLUMN(TERM(t:-1 a:50),LIST,“TransmissionTypeName”,“ALL”)
A:-52 = T:-2.ADD_COLUMN(TERM(t:-1 a:51),LIST,“ValvesPerEngine”,“ALL”)
A:-53 = T:-2.ADD_COLUMN(TERM(t:-1 a:52),LIST,“WheelBase”,“ALL”)
RESULT(T:-2)

2008-12-28 14:14:15 [1] Initial execution plan:
2008-12-28 14:14:15 [1] Cnd(0):  t:-1 a:3 BET. vn:1899 AND vn:1899   (7.28)
2008-12-28 14:14:15 [1] Cnd(1):  t:-1 a:6 BET. vn:3459 AND vn:3459   (7.59)
2008-12-28 14:14:15 [1] Cnd(2):  t:-1 a:2 BET. vn:84 AND vn:84   (9.7)
2008-12-28 14:14:15 [1] Cnd(3):  t:-1 a:1 BET. vn:234 AND vn:234   (11)
2008-12-28 14:14:15 [1] Cnd(4):  t:-1 a:0 BET. vn:1 AND vn:1   (12.96)
2008-12-28 14:14:15 [1] Cnd(5):  t:-1 a:4 BET. vn:-inf AND vn:1992   (15.93)
2008-12-28 14:14:15 [1] Cnd(6):  t:-1 a:5 BET. vn:1992 AND vn:+inf   (16.11)
2008-12-28 14:14:15 [1] Packs/packrows after KN evaluation:
2008-12-28 14:14:15 [1] (t0) Pckrows: 153, susp. 1 (152 empty 0 full). Packs opened in 7 cond.: 5
2008-12-28 14:14:16 [1] Packrows after exact evaluation (WHERE):
2008-12-28 14:14:16 [1] (t0): 153 all packrows, 1 to open (including 0 full)
2008-12-28 14:14:16 [1] Displaying result: 1 rows.
2008-12-28 14:14:16 [1] Total data packs actually loaded (approx.): 52
2008-12-28 14:14:16 [1]——————————————————————————————————————

Profile
 
Posted: 31 December 2008 05:32 PM   Ignore ]   [ # 1 ]  
Jr. Member
Avatar
RankRank
Total Posts:  87
Joined  2008-08-18

Hi,

When you load data, IB updates the metadata for every column, whether there is a value or not.  So, what you see in your log is the updating of every column’s metadata to indicate the addition of another element.  So, in a column where all values are null, while my min and max values do not change, the count of values stored in the last data pack for the column is incremented by one.

Now, when you run a query, IB will evaluate the data packs to determine which values meet the criteria.  In the case where all values of a column are NULL, IB will know the result to return immediately.  On very small volumes of data, this may seem inefficient, but as your table grows in size, you should find very fast performance as IB will already ‘know’ the value to return for many columns, since they are primarily null.

Additionally, you should see very high compression rates on this data!

Cheers,

Profile
 
Posted: 02 January 2009 07:43 AM   Ignore ]   [ # 2 ]  
Member
Avatar
RankRankRank
Total Posts:  191
Joined  2008-08-18
jcaristi - 31 December 2008 05:23 PM

2008-12-28 14:14:15 [1] Packs/packrows after KN evaluation:
2008-12-28 14:14:15 [1] (t0) Pckrows: 153, susp. 1 (152 empty 0 full).

That shows that without accessing data we eliminated 152 out of 153 packs for every column (so called packrows=groupings of 65K records)

jcaristi - 31 December 2008 05:23 PM

Packs opened in 7 cond.: 5
2008-12-28 14:14:16 [1] Packrows after exact evaluation (WHERE):
2008-12-28 14:14:16 [1] (t0): 153 all packrows, 1 to open (including 0 full)

After “touching” data (5 data packs) from this packrow to evaluate exactly WHERE condition, we see that some records meet the critieria while the others don’t. So, one packrow still needs to be accessed while computing output columns (SELECT list)

jcaristi - 31 December 2008 05:23 PM

2008-12-28 14:14:16 [1] Displaying result: 1 rows.
2008-12-28 14:14:16 [1] Total data packs actually loaded (approx.): 52
2008-12-28 14:14:16 [1]——————————————————————————————————————

Thus, for this single packrow we accessed every column out of 52 to generate the result set. From this we can see that none of the 52 columns contained nulls only for this particular grouping of records. It means that for some other packrow (different WHERE condition) we might potentially load less data packs, provided that some column has some uniform value (like NULL) for this packrow.

Thanks,
Piotr

Profile
 
   
 
 
‹‹ Can I speed up my join queries?      scaling ››