For some people, between Business Intelligence (BI) and Data Warehousing (DWH) is huge difference and for others they are quite the same. In my humble opinion it depends on the educational background and the experience of each person. For me, it is important to bring a defintion before talking in depth about the characteristics of each terms.
My definition of Business Intelligence:
„Business Intelligence is an initiative in user companies in order to collect, transform and provide data for users to plan, control and steer the company to achieve the company`s goals.“
My definition of Data Warehousing:
„Data Warehousing includes the collection, transformation and the provision of consolidated data in conceptual, logical, and physical models to support business goals and end-user information needs.“
As you can see (or acutally read), Data Warehousing and Business Intelligence is a pretty good match. Data Warehousing is part of the Business Intelligence initiative and is a success factor for it.
In the header, there is the part “architecture of an analytical landscape”. It stands for the business intelligence architecture which I will describe on a very high level. But this high level contemplation is the early stage before I will explain the different types of business intelligence architectures. In the figure below, you can see five layers Source System(s), Collecting & Transforming, Storage, Provisioning and Application(s).
Let´s start with the source system(s). A source system is an OLTP (Online Transaction Processes). OLTP is a class of systems that supports or facilitates high transaction-oriented applications. OLTP’s primary purpose is to control and run fundamental business tasks as well as handles inserts into rows of tables and updates short and fast initiated by end users. The database design is highly normalized with many tables. To shorten it and not explained in all details, OLTP systems are not made for business intelligence.
But the data in the OLTP systems are the base of the business intelligence environment which starts with the Collecting & Transforming Layer. In this layer, the data will be extracted out of the source systems, often transformed and loaded into the Storage Layer. The data is often pushed in packages (PK). It can be a full load or only a delta load (which means only changed data in the source system is pushed). The StorageLayer is often called the data warehouse, but that is totally correct. The Storage Layer stores the extracted and transformed data from the source systems in a neutral and independent way.
Only when data gets pushed to the Provisioning Layer, it will be transformed further and stored application related. The data in this layer can be stored in one or more data base(s) or only be referenced.
The Provisioning Layer is the starting point for the Application Layer. There are six general application classes which focus different approaches of presenting data and gaining information out of it. This six classes are
– DB: Dashboards
– SR: Standard Reports
– AD: Ad hoc Analysis
– AN: Analysis
– PS: Planning and Simulation
– DM: Data Mining
There are differnt types of BI architectures which are described as Scenarios from I to VI.
Business Intelligence Architecture I
We start with architecture I. It contains all compontents of the reference architecture which means Source System(s), Collecting & Transforming, Storage, Provisioning and Application(s).
This architecture is a typical approach in big environments. In general huge companies used this type of architecture to spread the data from various source systems to the end users all over the world.
A typical combination in the past was to use a relational data bases as Data Warehouse and a multidimensional data base as Data Mart. The Data Warehouse stored the data independant and neutral, which means the data model was not application related. Data Marts contain data which was modeled by application requirements. The management needs different facts, dimensions, data granularity as marketing, production, human resources and so on. But, more or less, every person or department can use the data of the data warehouse in future steps. Part of this combination is software from different vendors in the application layer.
The requirements for this architecture are openness, scalability and integration. The cost drivers are licences, the initial and continous project costs and high coordination and communication expenses. The advantage of it is a high scalabilty, high performance and a broad support of application classes. In general, end users can choose which vendor software they want to use.
Business Intelligence Architecture II
We proceed with architecture II. It contains the compontents of the reference architecture which means Source System(s), Collecting & Transforming, Provisioning and Application(s). It waives the Storage layer, the central platform for data collecting and provisioning. The question is why?
In the real world, this architecture is often placed in environments in big companies with strong departments, which (want to) work indipendently from the IT-department. It is also known in smaller and medium sized companies (SMEs) which do not need a central platform for data collecting and provisioning. There are a lot of reasons to waive it but most of the time the case was that the client had only one source system.
A typical combination of the components are small relational data bases or multidimensional data bases as data marts together with and non-integrated or integrated application software. My experience teached me, that in smaller companies the departments implemented the business intelligence environment without the IT-department and licenced only one application software and one data base (relational or multidimensional). In bigger companies the departments did not work together very closely and decided on their own. That led to more than one application software and more than one data base.
The advantage of this approach are high performance, low initialer costs for installation and implementation as well as lower communication and coordination expenses. The disadvantages are double or more licence costs, most of the time more than one redundant ETL-processes and different data models, analysis results and key performance indicators (kpi). That provides within the department any problems in general, but when departments exchange information, kpi discussions are ubiquitous.
Business Intelligence Architecture III
Next is architecture III. It contains the same components as in the reference architecture Source System(s), Collecting & Transforming, Storage, Provisioning and Application(s). Compared to architecture II, there are two differences in architecture III:
– Central data platform called Data Warehouse
– Semantic layer and NOT Data Marts
Your question should be:
– What is a semantic layer?
– What is the difference between Data Marts and semantic layers?
– Where are the advantages?
– When should I use this concept?
So, first of all, in this architecture the core element is the Data Warehouse, which is the central data platform. The DWH stores the data for every stakeholder and has at least one ore more data model(s). The semantic layer is the access layer for the business users in the departments when this user group wants to do ad hoc reporting. Ad hoc reporting means simple analysis on data if a specific business questions raises in the departments and shall be answered quickly. I will describe it in the next blog entries in depth.
In general, business user do not have the overview about the data in the data warehouse like an data warehouse manager or a IT-driven employee. They do not know how the data is collected, transformed and loaded into the DWH. For them, it is enough to have a simple view on some data model measures and dimension with business user friendly names. At the end, the semantic layer supports this requirement.
The difference between a Data Mart and a semantic layer cannot be easily described. The Data Mart has the same taks as the semantic layer. In the real world you may say that a semantic layer is only a virtual provider, so the data is not stored persistently in a data base while Data Marts store the data persistently in relational or multidimensional data bases. The semantic layer often is part of the Application software and the DWH is a relational data warehouse.
The use case of this architecture is in the provisioning of mass data, usage of existing data base technologie and providing easy access for business users. The IT department is responsible for the ETL-processes and the DWH, the business departments for the Application(s) and then for the semantic layer. This architecture raised in the 1990`s when a lot of business intelligence Application vendors build their software for Data Warehouses which were located in the IT departments. The standard connector was the SQL interface and the DWH fields were hard to read for non IT emplyoees.
Business Intelligence Architecture IV
Next architecture is architecture IV. It waives completely the components Collecting & Transforming, Storage and Provisioning. The last layer is the Application(s) layer. Compared to all architectures in the last weeks, it is the slimmest business intelligence architecture.
This business intelligence architecture supports the usage of the Application(s) directly on the Source Systems. What you can expect is that this approach does guarantee the shortest implementation time of all. But without additional layers, the Application will always connect directly to the data base(s) of the Source Systems. That leads to additional utilization of the Source System capacities. Intensiv OLAP analysis can lead to too high demands and the OLTP system can break down. So, in general, the Application layer will never be supported by the IT department for business intelligence projects.
But there is an exception: sometimes it is necessary to do a operational reporting. In general, OLTP system supports analysis, but often only in limited ways. If the requirements are operational reporting and the use case is small AND the analysis does not overstrain the Source System, then this approach is fine. But testing and only restricted usage is mandatory!
Business Intelligence Architecture V
Last is architecture V, which is a combination of all BI architecture I to IV. It contains all components Source Systems, Collecting & Transforming, Storage, Provisioning and Application(s). This is the common classical BI and DWH architecture in every bigger company in the whole world.
Business Intelligence environments in bigger companies use this combination of BI architectures because of historical reasons and actual requirements out of the IT department or business departments.
Historical reasons could be that the DWH was implemented and has been maintained since a lot of years. Sometimes companies were bought from or sold to other companies and DWHs and the Collecting and Transformation layer had to be redesigned for a lot of years. Sometimes there are new requiremtents out of the business departments and the IT department does not solve them. The result is that business departments developt their “own BI environemnt”, which is only a small picture of all processes and and iniatives.
But what I discovered in the last years was that in this huge environments a lot of money was burned by the companies themselves. I have three examples:
– In the Application layer, there are more than one tool licensed. The biggest number I counted was 15 different tools for business intelligence front end software. You do not want to know how many times they were on licensed.
– Then there are different DWHs and a lot of more Data Marts (technologies). Often the Data Marts and the front end tools were bought and licenced in the business departments, while the IT bought software for DWH and front end requirements.
– Additionally, there is a lot of effort for ETL jobs as well as a lot of logic in each layer which leads to different key performance indicators for report recipients. In general, there is not only ETL process, where data is extracted, transformed and loaded. There are sometimes more than 100 or 1000 and anyone in a company has an overview about all the steps which are done automatically.
The reason why it happens in regularly in big companies? One good reason is that often no one is responsible for the business intelligence architecture. The next reason is that most employees do not have the educational background and the current knowledge about tools, technology and vendors to simplify an architecture. The last reason I often saw but was never spoken out was that the departments want develop and manage their own Business Intelligence Environment.
About Tobias Riedner
Tobias Riedner foundet WYCDWD.com in 2015. He works and worked as innovator, consultant, analyst and educator in the fields of business intelligence and data warehousing. He learned a lot from the best consultants, managers und educators in the past and shares his knowledge worldwide. He works for a steady growing traditional company which is a leader in industry 4.0.