This week I‘ve been giving Oracle Warehouse Builder training at a company in Dublin, Ireland. The people are extremely nice and likeable and Dublin has a lot going for it and those Irish pubs are to die for and and and the Guinness is excellent!
It has been an interesting experience since this is the first time that I‘ve ever trained on a Standard Edition One (SE1) database. Using it a lot has surfaced that I hadn‘t really thought about before.
OWB is a great ETL tool and has many incredibly powerful features. As Oracle so often does, it‘s standard features aren‘t that great, the ones that just come out of the box – the core functionality. So in order to use stuff like automatic data cleansing and rules. Slowly changing dimensions, type 2 and 3, you will have to buy the Data Quality Option and the ETL option. For a database that is a SE1 this is no small investment. The options themselves cost somewhere in the range 10 – 20. 000 USD (I don‘t know exactly) per CPU but that is not all. You can not use these options unless you buy the Entreprise edition of Oracle and THAT, my friend, is where it hurts.
Still, even using SE1, you can get a lot done without these extras.
We created dimensions, mappings, cubes and implemented these as tables and packages. What you have to be aware of is that i.e. bitmap indexes are created automatically for you on fact tables which will then fail when you try to deploy the table since bitmap indexes are an enterprise option. Same goes for the dimension objects, you simply can not deploy them. You can deploy the underlying tables and create the mappings using the dimension objects, which makes life so much easier. Just don‘t deploy the dimensions. All they are used for in a relational implementation is that they are used for query rewrite and query rewrite is – you guessed it – an enterprise option.
Talking about dimensions and query rewrite. If you have a table that is 365 million rows, 1 millon per day and you often run queries like:
Select count(*), t.yyyymm
from fact_table f
join time_dim t on (t.time_key = f.time_key)
group by t.yyyymm
to get a listing like
31.000.000 200801
29.000.233 200802
etc. You would benefit from a materialized view that would summarize this for you into 12 rows or so to save you from scanning the full 365 million rows. Oracle would then automatically redirect this for you to that table. Great stuff. If you have defined a dimension on the time_dim table with a hierarchy i.e. YYYY-YYYY.Q-YYYY.MM-DATE
then Oracle would also know that you could calculate the quarter level numbers from the 12 rows in the mview instead of scanning the 365 million rows for the answer. Without a dimension, youd be in deep excrement...I will put an example up next week.
On that note, Microsoft does the exact same thing with their giveaway of i.e. Analysis Services. It will run OK on the same machine as your relational DW for a while but eventually you will move either Reporting Services, Analysis Services or both onto a new server which in most cases means that you will have to buy a new license. Same excrement, different smell.