Another day in Doha Qatar. We had a breakthrough today almost completing our method of populating 1 fact table containing about 400 indicators! That has to do with the fact that one of the dimensions is an indicator dimension allowing us to create basically one basic query to retrieve any data we need. As the table is fully solved (a total exists for every possible combination of dimensions) no aggregates are necessary but the query predicates can become quite tricky so that you will have to know at which level your answer is. For example if you issue a query that asks for a total for age=10-14, gender=All and municipality=’London’, you will retrieve just one row and no aggregates. (assuming these are the only dimensions, there really are more). Instant response, regardless of the size of the underlying table (which will be less than 100.000.000 rows, probably much less). The gender is at a “Total” level and instead of summing up for Male and Female, you retrieve this directly.
Complex? Yes. Fast? Yes. Slow to populate? No. Initial tests indicate 30 seconds for a million rows Indicating 60 million rows in 30 minutes. This may be optimistic, but we will not run into a load window issue since we can load one indicator at a time into its own partition using PEL (Partition Exhange Loading). Using PEL we can make this appear instant.
I will probably post the SQL to load tomorrow or Thursday. It ended up pretty straight forward but we still have some testing to do to make sure this works.