Part 12: Particularity – Hierarchies

A hierarchy is an arrangement of items (objects, names, values, categories, etc.) in which the items are represented as being “above,” “below,” or “at the same level as” one another. In business environments, we do handle hierarchies every day. For example, Apple sells products in the category “mac”, “ipad”, “iphone”, “watch”, “tv”, “music”, “ipod” and other elements. But there are levels above and below this product level. “Hardware” and “software” could be the level above, “big” and “small” below.

A hierarchy is important because of the analytical approaches later. When you want to analyze the sales for hardware, especially for ipads and for the “big” ones, then you need a hierarchy in your data model. Very often (I made this experience) an ERP-system or another system does not provide such a hierarchy and/ or you have to implement it in the data warehouse data model. But what is to do then?

There are a lot of hierarchies – it depends of the people`s view

There are a lot of hierarchies and my experience showed me that it depends on the people`s mind. When a sales rep talks about a hierarchy in the product segment, it will be different from someone who is part of production. So, a hierarchy is an artificial organization of elements which does not exist in the real world except human being creating it.

But when you create a hierarchy, you should evaluate which hierarchy models are necessary and have to be implemented. My suggestion for Apple`s product hierarchy based on the marketing perspective would be:

Example of a Product Hierarchy for Apple Products

Example of a Product Hierarchy for Apple Products

How to implement a hierarchy

There are several ways how can implement technically a hierarchy. The two most common ways are to build a “flattened dimension” or a “snowflake”. Both should be chosen wisely.

Snowflake vs. Flattened

Snowflake vs. Flattened

Flattened Dimension: If the number of levels and attributes is few, then consider flattening the hierarchies into a denormalized dimension. This is what is done in a pure star. In this all the levels are in one row. The denormalized dimension must have all the attributes sufficient to handle the maximum number of levels. Reporting hierarchies do not usually have the depth of a manufacturing bill of materials so it is usually not a problem to flatten them this way in terms of the width of the dimension. The issue is what to do about missing levels. If a level is missing, there are two approaches: 1) use a skip flag to tell the query to skip this level; SQL Server does it this way. 2) store the next higher level in the missing Level.

Snowflake.:If the number of levels is few and fairly predictable, and the number of attributes is many, then consider creating fixed but separate tables for each level of the dimension. In essence, I’m saying snowflake the dimension. This works well for regular or symmetric hierarchies, but will be complex for ragged hierarchies. Clive Finklestein in his first book, Information Engineering, shows a manufacturing example of this but is absurdly complex to do this. So this is not a good solution if the hierarchy is really ragged.

There are more ways how you can implement a hierarchy. More is here.

 

About Tobias Riedner
Tobias Riedner foundet WYCDWD.com in 2015. He works and worked as innovator, consultant, analyst and educator in the fields of business intelligence and data warehousing. He learned a lot from the best consultants, managers und educators in the past and shares his knowledge worldwide. He works for a steady growing traditional company which is a leader in industry 4.0.