Hi,
You generate an image of the entire dimension, the way you want it to look after the update, into a CSV file. Then drop and reload the table.
This is usually not so bad because dimensions tend to not be as large as your main fact table. Still, they can get quite large.
Example:
1) Select all the rows that have not changed (based on some criteria, JOIN or ETL stage) and all closed products
SELECT p.product_id, p.product_name, p.start_dt, p.end_dt FROM product_dim p
JOIN current_product_list cp on p.product_id = cp.product_id where end_dt IS NULL
UNION ALL SELECT p.product_id, p.product_name, p.start_dt, p.end_dt FROM product_dim p
WHERE end_dt IS NOT NULL
INTO OUTFILE '/tmp/unchanged.csv' fields terminated by '\t' enclosed by '"';
2) Select all the rows that need to be closed out (based on not being found in current_products)
SELECT p.product_id, p.product_name, p.start_dt, current_timestamp FROM product_dim p
LEFT OUTER JOIN current_product_list cp on p.product_id = cp.product_id WHERE cp.product_id IS NULL
INTO OUTFILE '/tmp/dropped.csv' fields terminated by '\t' enclosed by '"';
3) Recreate product_dim (empty)
CREATE TABLE product_dim_tmp LIKE product_dim;
DROP TABLE product_dim;
CREATE TABLE product_dim LIKE product_dim_tmp;
DROP TABLE product_dim_tmp;
4)
LOAD DATA INFILE '/tmp/dropped.csv' INTO product_dim
fields terminated by '\t' enclosed by '"';
LOAD DATA INFILE '/tmp/unchanged.csv' INTO product_dim
fields terminated by '\t' enclosed by '"';
5) Load any new products that weren’t previously in
Hope this helps.
Geoffrey