Dimensional Modeling
This week's textbook reading on Dimensional Modeling was quite eye opening. At first, my reaction to non-normalized database tables (with respect to the dimension tables) was "noooooooo!", not that I'm a huge fan of crazy ER diagrams...
However, the star schema approach described makes a lot of sense. When one focusses on the ease of data access and the ability to accommodate changing requirements, the simplicity of the Fact and Dimension tables and their relationships is compelling.
The Four Step Process
The four-step process outlined in the lecture materials really helped solidify the concepts of dimensional modeling. Steps 1 and 2 seem especially important, as they guide the rest of the process.
One surprising thing in the lecture was the idea of a "Date/Time Dimension" being present in almost every data warehouse and how it is one of the most important dimensions in a data mart. I understand why a date/time is important, but I would have thought that having a table to contain the values described would be unnecessary. To me, the data in this dimension can be easily and quickly calculated and to actually store this in a table would have been a waste of space. It is easy to convert from UNIX epoch time to things like the week number, like this in Perl:
my $weekNum = POSIX::strftime("%V", someTimeValue);
Of course, I will admit that 'holiday indicator' would be a little more difficult to determine programatically. After the discussion on slicing and dicing, it does make a little more sense to have this information ready and available for use in queries (although I'm not totally sold yet on the idea!)
OLAP Operations
I enjoyed the section on data cube operations (slice, dice, roll-up, drill-down, and pivot) and found that the examples really helped me understand the concepts.
Overall Impressions
The approach to dimensional modeling reminds me a lot of the principles outlined in agile software development (e.g. see Armson, 2012), specifically the ability for the system to accommodate change. Approaching dimensional modeling as an iterative process also harkens to the agile software methodology.I think that the publishing metaphor at the beginning of the textbook was especially apt, where the responsibilities were framed for the role of data warehouse manager ((chapter 1, page 6). The responsibilities outlined really help drive home the point that data warehouses are there to support the needs of the business and thus need to be business-user focussed.
No comments:
Post a Comment