Users are asking whether to use surrogate keys in ICE. And if so, when should they be used.
Some useful links describe surrogate keys and slowly changing dimensions. These links provide good context before discussing the benefits of surrogate keys.
http://en.wikipedia.org/wiki/Slowly_changing_dimension
http://en.wikipedia.org/wiki/Surrogate_key
To level set on terminology, the use of surrogate keys is a database design technique; it does not necessarily imply the use of indices. ICE does not need nor does it support indices. But there are times when sing surrogate keys would be very useful.
Surrogate keys are good for:
1. Replacing compound PKs & FKs with a single column surrogate key. It is more efficient to join on 1 column than multiple columns.
2. Replacing an inefficient data type (e.g. VARCHAR (128) with a BIGINT). It is more efficient to join on INTEGER than VARCHAR.
Both techniques #1 and #2, above, may speed up Infobright queries, so we recommend them for these cases. are both , To level set on terminology,
And surrogate keys serve more purposes than facilitating efficient joins. They are also a necessity for a production data warehouse that maintains Type II SCDs. That is reason enough to use them. Keep in mind: Type II SCD = surrogate keys. If your end users wish to keep historical data (i.e. Type II SCD), surrogate keys are very useful.
On a related note, someone asked, “Would something like MySQL’s AUTO_INCREMENT be enough or should we consider a wider range of possibilities?”
Assuming a client wants to use a surrogate key with sequentially ascending numbers, there are two main ways of incrementing numbers prior to loading the surrogate key into the target database:
1. Use procedural logic in the ETL layer
2. Use AUTO_INCREMENT in a MyISAM staging database

