BI@WITTENSTEIN – Don't show dashboards without real data

In the last weeks the team developed a lot of dashboards. After finalization, the team showed the dashboards to the product owner to check release readiness. The feedback was sobering: they could not release the dashboards because they did not see any real data.

We struggle with the parallel implementation of SAP ERP

At the end of September 2016, WITTENSTEIN will introduce the new SAP ERP system. With this ERP system, the old one will be shut down and new processes will be released. While the ERP team implemented and customized the SAP ERP system, they used demo data for customer transactions like orders, deliveries and so on. The transaction data was expanded with master data for customers, products and companies.

We don't have credible data

Demo data in SAP ERP was the base for our ETL processes and data modeling steps as well as for the visualization. As you can see, our dashboard was finally designed and completed. It showed all the data that was in our enterprise data warehouse. Still, the whole dev team had concerns that we did not have credible data.

Daily Dashboard without real data

No data – no dashboard

The feedback of our product owners and key users was clear. They received the dashboards via the BI launch pad and could investigate the look and feel, the created functions, the navigation and everything else. But the only feedback was that they could not release them – the data was missing. And without the permission of the product owner or key user, we do not release any application. That was a hard hit. We are now waiting for the go-live on Monday in order to check the data and hopefully release the dashboards.

About Tobias Riedner
Tobias Riedner founded WYCDWD.com in 2015. He works and has worked as an innovator, consultant, analyst and educator in the fields of business intelligence and data warehousing. He learned a lot from the best consultants, managers and educators in the past and shares his knowledge worldwide. He works for a steadily growing traditional company which is a leader in industry 4.0.

“If you do BI not right from the beginning, you do it the whole time wrong”

BI@WITTENSTEIN – Self Service Business Intelligence with SAP BusinessObjects Software

Self Service BI tools are getting more and more attractive for end users. On one side, the rising stars on the business intelligence software market are products like Tableau, QlikView and others, which are loved by end users. On the other side, market analysts like Forrester, Gartner and BARC (a CXP company) rate these products very high in customer satisfaction and maturity. And at conferences and in online discussions, the term SSBI comes up constantly. Often, people think that with SSBI every problem in a BI initiative will be solved. But that is a lie. SSBI should be evaluated wisely.

SSBI consists of three components: “Dashboards and Reports”, “Data Models” and “Source System”. Every component has two parts:

  • Dashboards and Reports
    • End users want to filter data in dashboards and reports on their own
    • End users want to create dashboards and reports on their own
  • Data Models
    • End users want to create their own queries on data models
    • End users want to create their own measures and dimensions
  • Source System
    • End users want to use the data of their own data sources
    • End users want to have the responsibility for the data in their source system

When evaluating tools, you should think about these components and their parts. I made a short evaluation of the business intelligence tools we use at WITTENSTEIN: SAP BO Analysis for Office, SAP BO Design Studio, SAP BO Analysis Templates (based on Design Studio) and SAP BO Web Intelligence. You can see my assessment in the picture below.

Self Service BI – SAP BI BusinessObjects

What you should see is that only the component “Dashboards and Reports” is fulfilled by the tools, except by Analysis for Office. That tool is an Excel client and not really suitable for dashboards and reports. The other components are ignored completely by the tools. This is not a problem of the tools but of the SAP BW architecture, which should be considered when talking about SSBI in an SAP BI stack.

Download the slide deck about Self Service Business Intelligence with SAP BusinessObjects Software:

Self Service BI with SAP BI BO Overview


BI@WITTENSTEIN – SAP BW on HANA architecture – part2

I attended an SAP class about SAP Business Warehouse powered by SAP HANA, and I want to share my personal insights and ideas with you. You'll find the first part here.

My summary

  • SAP's BI product portfolio never was and never will be state of the art!
  • SAP BW on HANA is, at the moment, either not necessary or too expensive
  • SAP BO in combination with SAP BW on HANA is a huge mess
  • SAP does not know how to sell its products SAP BW and SAP HANA at the same time

The class SAP Business Warehouse powered by SAP HANA

The teachers were Marc Hartz and Ulrich Christ. Marc is a member of the product management of SAP HANA, where he focuses on SAP BW as a product and on the topic of data warehousing itself. Ulrich is a colleague of Marc. The course had five chapters with a lot of content about SAP BW on HANA. The topic I want to describe here was part of chapter five and was called “Modern Data Warehouse Architecture – Mixed Scenarios”.

The key findings about Mixed Scenarios for Modern Data Warehouse Architectures

SAP recommends the usage of SAP BW on HANA in six possible ways, but they do not tell you the advantages and disadvantages of each scenario. I guess they do not want to because it is their marketing strategy, which reminds me of a sentence I heard yesterday on the radio: “everything is possible, nothing has to be”. I will explain the last three scenarios:

Scenario 4: Modeling in SAP HANA, usage of SAP BW as master data manager

This scenario brings dramatic architectural shifts in SAP BW landscapes as well. When you use this scenario, you completely give up the “Analytical Manager” and “BEx Queries”, like in scenario 3. The data modeling is done in SAP HANA and only the master data remains on the SAP BW side. The advantage is that ETL and master data modeling are still done with SAP BW techniques and the security concept can be handled via SAP BW. The data access is provided only via SQL. One more time: you can use every SQL tool in the world to access the data of SAP BW. What you add here is the possibility to access SAP ERP data in real time and combine it with data from SAP BW.

Modeling in SAP HANA, usage of SAP BW as master data manager

The disadvantage is that the data modeling quality now relies more and more on the HANA view modeler, who has to be a quite experienced data modeler. I think this combination is quite expensive at the moment. Further, all investments in SAP BW Queries are lost because they will no longer be used.

Scenario 5: Use of SAP HANA and provisioning via Analytic Manager

This scenario is the craziest approach. You do the whole data modeling in SAP HANA with SAP HANA views. You build a data warehouse or data mart with persistent or logical data. Afterwards, you access the data with SAP's BI tools via the Analytical Manager on BEx Queries. This scenario only makes sense if you are keen on BEx Queries – but who is?

Use of SAP HANA and provisioning via Analytic Manager

Scenario 6: SAP HANA Data Warehouse without SAP BW content and tools

This scenario describes the most advanced approach of SAP. You give up the whole SAP BW environment and use only SAP HANA as the data warehouse platform. In the end, it is no longer SAP BW on HANA; it is a data warehouse on SAP HANA.

At the moment, the advantage is that you can access SAP ERP and other data sources in real time. The data modeling in SAP HANA is comparable to Microsoft Access, which I really like, and I think others will like it, too. Rapid prototyping is very easy and the “time-to-market” can be really short.

SAP HANA Data Warehouse without SAP BW content and tools

The disadvantage is that this scenario is not really practicable in large environments and therefore not for enterprise BI. SAP HANA does not support enterprise BI security. In addition, this approach has only been taken up by early adopters, and the concept still has to establish itself on the market. Early investments can be lost through future decisions by SAP.

 

Explanation of the summary

  • SAP's BI product portfolio never was and never will be state of the art!

SAP is too big, earns too much money with the old products and loses itself in “everything is possible” with SAP BW, SAP HANA and SAP BO. The SAP products are developed by teams all over the world (the USA, France and Germany are the biggest locations), and they work on tools which shall support a lot of business cases. SAP HANA is, at first, only a database but is sold as a development platform for apps as well as a platform for large enterprise data warehouses. The whole concept sounds great and looks great on slides, but in real life it is hard to manage.

  • SAP BW on HANA is, at the moment, either not necessary or too expensive

The scenarios show the difficulties of the status quo. While SAP tells customers they can use SAP BW on HANA however they want, I do not see a use case for customers in general. In my eyes, customers with an enterprise BI concept stay with scenario 1, and in this scenario SAP HANA is useless. Customers with smaller BI approaches build a data mart instead. But to build data marts, there are a lot of other possibilities that are easier to realize and cheaper.

  • SAP BO in combination with SAP BW on HANA is a huge mess

As described above: bringing everything together – SAP BW, SAP HANA, the SAP BO products as well as a portal to share applications and reports – is complex and expensive, and my experience showed me that bugs and failures are only fixed after long development cycles.

  • SAP does not know how to sell its products SAP BW and SAP HANA at the same time

SAP tells the customer everything is possible, and customers believe what SAP says. This credibility was established through the last 30 years of SAP ERP, and it is justified. But customers should really ask SAP whether the whole SAP BI roadmap is the right way to a good user experience.

 


BI@WITTENSTEIN – SAP BW on HANA architecture

I attended an SAP class about SAP Business Warehouse powered by SAP HANA, and I want to share my personal insights and ideas with you.

My summary

  • SAP's BI product portfolio never was and never will be state of the art!
  • SAP BW on HANA is, at the moment, either not necessary or too expensive
  • SAP BO in combination with SAP BW on HANA is a huge mess
  • SAP does not know how to sell its products SAP BW and SAP HANA at the same time

The class SAP Business Warehouse powered by SAP HANA

The teachers were Marc Hartz and Ulrich Christ. Marc is a member of the product management of SAP HANA, where he focuses on SAP BW as a product and on the topic of data warehousing itself. Ulrich is a colleague of Marc. The course had five chapters with a lot of content about SAP BW on HANA. The topic I want to describe here was part of chapter five and was called “Modern Data Warehouse Architecture – Mixed Scenarios”.

The key findings about Mixed Scenarios for Modern Data Warehouse Architectures

SAP recommends the usage of SAP BW on HANA in six possible ways, but they do not tell you the advantages and disadvantages of each scenario. I guess they do not want to because it is their marketing strategy, which reminds me of a sentence I heard yesterday on the radio: “everything is possible, nothing has to be”. I will explain the first three scenarios:

Scenario 1: The classic SAP BW architecture in SAP's technology stack

You use SAP BW with all the classic layers and methods as well as classic SAP BO products to access the data in the data warehouse. SAP HANA is only the database which accelerates the data calculations. The performance is comparable to the Business Warehouse Accelerator (BWA). The advantage is the use of predefined models and application settings, and the complexity of the whole architecture is low. It is possible to establish a data warehouse quite fast without a lot of expertise.

Classical SAP BW in SAP technology stack

The disadvantage is that you can only use SAP BO tools for accessing data via the proprietary interface “Analytic Engine” on BW Queries. Further, as a customer of SAP, you are totally dependent on the SAP BW and SAP BO product portfolio. If you want to secure your investments, you have to follow the SAP product roadmap (whether you want to or not).

Scenario 2: Use of SAP HANA Views for SQL requests

This scenario brings the first real change in SAP BW history. You can access data of the BW via SQL requests, and this is the biggest advantage of SAP HANA. You still use all the mechanisms and modeling techniques of SAP BW but allow access via SQL. That sounds great, because now you can use practically every business intelligence tool in the world (Microsoft, IBM, Information Builders, MicroStrategy, Oracle and all the others) to access the data without creating a new data warehouse. In theory, the whole data and user security remains in SAP BW.

Use of SAP HANA Views for SQL requests

It sounds really great, but I cannot imagine how SAP has solved the problem of translating a SQL request into the BEx Query language. I can imagine that a lot of features and functions are not available. Other disadvantages are that the BW architecture is broken up and a new data mart layer could be introduced by other administrators. And I guess as well that the performance is not the same as via the proprietary interface, because you have one additional layer.

Scenario 3: Usage of SAP HANA Views for Virtual Data Marts

This scenario brings the first dramatic architectural shifts in SAP BW landscapes. When you use this scenario, you completely give up the “Analytical Manager” and “BEx Queries”. The data modeling is done in SAP HANA and only the master data remains on the SAP BW side. The advantage is that ETL and master data modeling are still done with SAP BW techniques and the security concept can be handled via SAP BW. The data access is provided only via SQL. One more time: you can use every SQL tool in the world to access the data of SAP BW.

Scenario 3: Usage of SAP HANA Views for virtual Data Marts

The disadvantage is that the data modeling quality now relies more and more on the HANA view modeler, who has to be a quite experienced data modeler. I think this combination is quite expensive at the moment. Further, all investments in SAP BW Queries are lost because they will no longer be used.

Next week's post will describe the last three mixed scenarios. Come back next week and be curious.

 


BI@WITTENSTEIN – KPI definitions

A Key Performance Indicator is a measurable value that demonstrates how effectively a company is achieving key business objectives. Companies use KPIs at multiple levels to evaluate their success at reaching targets. High-level KPIs may focus on the overall performance of the enterprise, while low-level KPIs may focus on processes in departments such as sales, marketing or a call center. For more detailed definitions, check out Wikipedia's definition of KPI.

Are KPIs still relevant?

KPIs often have a negative connotation associated with them. Unfortunately, many business users are beginning to see KPI monitoring as an obsolete practice. This is because KPIs fall victim to that most human of all problems: lack of communication.

The truth is that KPIs are only as valuable as you make them. KPIs require time, effort and employee buy-in to live up to their high expectations. Bernard Marr, best-selling author and enterprise performance expert, sparked an interesting conversation on this subject in his article, “What the heck is a KPI?” The comments make it clear that while KPIs may have fallen out of favour (depending on who you ask), their potential value remains in the hands of those who use them.

Why does WITTENSTEIN want to use KPIs?

KPIs are important at WITTENSTEIN to get an impression of the daily business. They require time and effort, but the management supports the definition and creation process and wants to drive the business based on data.

What is the problem of a KPI definition?

The problem with a KPI definition is bringing every stakeholder to the same level. Examples are always a good way to give a short impression. Let's talk about customers. A short list of possible KPIs:

  • Customers [#]: number of customers
  • New Customers [#]: number of new customers
  • Regular Customers [#]: number of regular customers
  • Lost Customers [#]: number of lost customers

So far pretty easy… But how do you calculate the measures?

  • Customers [#]: COUNT DISTINCT (CustomerID)

My first question is: is a customer a customer when he registers in the ERP or CRM system, or does he only become a customer with his first order? And what is he called before that? An interested person?

  • New Customer [#]: COUNT DISTINCT (CustomerID) AND ??

And then the journey begins. It is quite hard to define a “new” customer. Typically, a customer is a new customer after his first order. But how long does he stay in this segment? Only one day? Does he become a regular customer the day after? Or after 30 days? Or 90 days? My question is specific because the resulting revenue analysis is tremendously different: while customers who count as new for only one day contribute just one order with one order amount, new customers over a 90-day period accumulate a much bigger order value. That will be a totally different report.
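A minimal sketch of how both readings could be computed side by side. The customer IDs, dates and window lengths below are illustrative assumptions, not WITTENSTEIN's actual rules; the point is only that the “new customer” count changes with the window you pick:

```python
from datetime import date, timedelta

# Hypothetical order records: (customer_id, order_date).
orders = [
    ("C1", date(2016, 5, 2)),
    ("C1", date(2016, 5, 20)),
    ("C2", date(2016, 5, 28)),
    ("C3", date(2016, 3, 1)),
]

def customer_counts(orders, as_of, new_window_days=30):
    """Customers [#] as COUNT DISTINCT over customer IDs, and New
    Customers [#] as those whose FIRST order falls inside the window."""
    first_order = {}
    for cid, d in orders:
        if cid not in first_order or d < first_order[cid]:
            first_order[cid] = d
    cutoff = as_of - timedelta(days=new_window_days)
    total = len(first_order)
    new = sum(1 for d in first_order.values() if d >= cutoff)
    return total, new

total, new = customer_counts(orders, as_of=date(2016, 6, 1))
```

With a 30-day window only two of the three customers count as new; widen the window far enough and all three do, which is exactly why the window has to be agreed with every stakeholder.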

What does WITTENSTEIN do?

We calculate both measures and will discuss the results with the management. I hope I can tell you more in a few weeks.

 

BI@WITTENSTEIN – What Is “The Revenue”?

What is revenue?

“The income generated from sale of goods or services, or any other use of capital or assets, associated with the main operations of an organization before any costs or expenses are deducted. Revenue is shown usually as the top item in an income (profit and loss) statement from which all charges, costs, and expenses are subtracted to arrive at net income. Also called sales, or (in the UK) turnover.” Read more here.

Sounds easy, doesn't it? But the problem always lies in the details. As in my last business intelligence projects, calculating the right revenue is harder than it sounds.

What is the problem with calculating the revenue?

The problem is hard to describe, so we will begin with the basics. The revenue of an order is simple and can be calculated via:

Revenue = amount of sold goods * price per good

Ta-da, and that's it? At WITTENSTEIN, we have far more than two parameters which influence the value of the revenue. We have, for example, an extra charge for sets; price reductions for specific customers, amounts and contributions; as well as extra charges for carrier services, insurances, roundings and clearings. In addition, various types of commissions follow, as well as cash discounts and transfer pricing. And the final question: do you show sales as gross sales or net sales?

What is a possible calculation schema?

Last week, I began to create a calculation schema like the one in the picture below. As you can see, we have a basic price, a selling price, positions w/o tax 1 and 2, a final amount, and gross and net sales. It took me one working day of talking with people from the business (the financial department) to coordinate the calculation.

  Selling Price                100,00 €
= Basic Price                  100,00 €
- Set: extra charge             10,00 €
= Selling Price 2               90,00 €
- Reduction: Customer type      10,00 €
- Reduction: Amount             10,00 €
- Reduction: Contribution           – €
= Position W/O Tax              70,00 €
- Set Reduction                     – €
+ Carrier Service               30,00 €
= Position W/O Tax 2           100,00 €
+ VAT                           19,00 €
+ Roundings                         – €
+ Clearing                          – €
= Final Amount                 119,00 €
- Commission 1                  10,19 €
- Commission 2                      – €
- Cash discount                  2,38 €
- Transfer pricing                  – €
= Gross Sales                  106,43 €
- VAT                           16,99 €
= Net Sales                     89,44 €
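The schema can be sketched as a small calculation. The values below are the sample figures from the schema above (the zero positions are left out); exact decimal arithmetic is used because binary floats are a bad fit for money:

```python
from decimal import Decimal

D = Decimal  # exact decimal arithmetic for money values

# Sample values taken from the calculation schema above.
selling_price     = D("100.00")
basic_price       = selling_price
selling_price_2   = basic_price - D("10.00")                  # set: extra charge
position_wo_tax   = selling_price_2 - D("10.00") - D("10.00") # customer-type and amount reductions
position_wo_tax_2 = position_wo_tax + D("30.00")              # carrier service
final_amount      = position_wo_tax_2 + D("19.00")            # VAT
gross_sales       = final_amount - D("10.19") - D("2.38")     # commission 1, cash discount
net_sales         = gross_sales - D("16.99")                  # VAT taken out again
```

Turning the schema into code like this makes it easy to rerun the discussion with the financial department on real order data instead of one hand-picked example.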

 

What is to be done next?

The next step is to do a first calculation with data from our SAP ERP in our BI environment, so that the calculation schema comes to life and business people can talk about it. My experience tells me that we will not deliver a final and perfect solution but a first shot that we can discuss further.

BI@WITTENSTEIN – To Start New Is Always Hard

Tobi's new job – Business Intelligence Analyst at Wittenstein AG

My last post is from October 2015, so now I want to give an update on my current situation and the changes I have been through in the last months. In short: I have a new job since 1 May 2016, and this blog post is written in my second week at my new employer. My impression is that I work for a highly professional company in a completely new environment. But, as I promised in my time at FLYERALARM, I will present thoughts, ideas and decisions in an absolutely clear and focused way.

Who is Wittenstein?

WITTENSTEIN AG develops customized products, systems and solutions for highly dynamic motion, maximum-precision positioning and smart networking in mechatronic drive technology. It is a so-called “hidden champion” of the German “Mittelstand” and has a growing revenue of 300 million euros per year. The headquarters are based in Harthausen, pretty near to Bad Mergentheim, a small town in Germany located in the triangle of Frankfurt, Nuremberg and Stuttgart.

What is the situation of Business Intelligence at Wittenstein?

The situation of BI is quite easy to describe: there is nothing except an idea of how it should be done. At the moment, we have neither a clear BI strategy nor an organizational concept, nor an architecture. We do have business requirements, though, like a “Deckungsbeitrag” (contribution margin) for products as well as the creation of specific key measures.

What is my role at the moment?

My role is to analyze the business requirements and to transfer them into tasks for external developers. We started this week with the definition of measures and dimensions. Well, I thought at FLYERALARM it was hard to do this, but at WITTENSTEIN it is even harder. The reason is simple: the organization is older and has worked for years with KPIs in the headquarters as well as in the business units. At FLYERALARM, the financial department defined the KPIs completely from scratch.

What have I learned about SAP HANA, SAP BW and BusinessObjects?

We have now started to create data models based on the business requirements. I was able to create a data model via SAP HANA Studio, and so far it was a success. The modeling technique reminds me of Microsoft Access, which I worked with as a student. The possibility to join the HANA models with BW data was quite impressive. We will proceed, and I will give you deeper insights in the next weeks.


Tuesday BI Basics #25: Kimball vs Inmon

This chapter is a blog entry by Dr. Mark Whitehorn, who specializes in the areas of data analysis, data modeling and business intelligence (BI). Based in the UK, he works as a consultant for a number of national and international companies and is a mentor with Solid Quality Mentors. In addition, he is a well-recognized commentator on the computer world, publishing articles, white papers and books. Whitehorn is also a senior lecturer in the School of Computing at the University of Dundee, where he teaches the master's course in BI. His academic interests include the application of BI to scientific research. I totally agree with his opinion about Kimball and Inmon.

When you start to design a data warehousing and business intelligence (BI) system, you need to make several fundamental decisions right at the outset. Get any of them wrong and you risk disaster: as we all know from any design process, the more fundamental the error, the more expensive it is to fix. (So, no pressure then!)

Of these fundamental design decisions, probably the most crucial is: ‘Inmon or Kimball?’

The Inmon vs Kimball debate is about how you structure the data in your data warehouse. The Kimball approach favours a dimensional structure, while the Inmon data warehouse methodology favours a relational structure. Both assume that the data warehouse will be underpinned by a relational database (Oracle, SQL Server, DB2, etc). But a Kimball data warehouse will use a relational OLAP implementation of a dimensional model – that is, a mixed set of dimension and fact tables.

Clearly, there is more to it than that. Both Bill Inmon and Ralph Kimball have written extensively on this subject, and neither makes bald statements without backing them up. Here I am compressing a huge collection of well-reasoned, carefully structured arguments into a very brief description, so inevitably, the detail is lost. If you want to know more about the arguments, the obvious place to go is back to the source material – which a quick search of Amazon will happily yield.

If you listen to a zealot on either side, you could be forgiven for assuming that the alternative is doomed to certain failure. This is demonstrably not the case. I have worked on both Kimball- and Inmon-style warehouses, and I can tell you that both approaches work. Nor am I alone in making that observation.

A couple of years ago, I gave a webcast on data structuring in BI systems and was able to poll the 100 or so participants, who were asked this question: “Data warehouses (as distinct from data marts) can be structured as relational (Bill Inmon) or dimensional (Ralph Kimball). Which do you favour?”

The answers were:

Relational: 9%
Dimensional: 39%
Use either/both as appropriate: 52%

The results were encouraging because more than half the respondents were open-minded enough to make the decision based on the particular circumstances they found, rather than on dogma. (And some of those who chose ‘relational’ or ‘dimensional’ might still be open-minded – they just might have happened to work on several warehouses where the best solution was one or the other.)

So how do you decide: Inmon or Kimball?
The Inmon vs Kimball choice should have nothing to do with dogma and everything to do with the characteristics of the two approaches. And simply calling those characteristics ‘pros and cons’ would be pejorative – if you choose a dimensional structure for a data warehouse, your choice will have a number of consequences that can only be classified as pros and cons in the context of the particular warehouse you’re building.

In practice, there is a whole collection of factors that you need to weigh up. To get a feel for the kind of issues you need to think about, let’s just take a look at one – data restructuring.

The data in your transactional systems is almost bound to be relational. And no matter which data warehouse model you choose, your data marts are almost certainly going to be dimensional. So it follows that at some point in your BI system, you’re going to have to change the structure of the data. If you go Kimball, you have to change the structure during the extract, transform and load (ETL) process. If you go Inmon, the change is put off until the data flows from the warehouse to the marts.

Now let’s think about data volumes for a minute. Almost all BI systems use ETL tools to move data from transactional systems once a day. They don’t move it all – they only move data that is new or that has been changed since the last extraction. So, a warehouse may contain, say, 10 years’ worth of data amounting to 3.5 TB, while each nightly extract may only be moving 1 GB of information.

Before data can be loaded into a Kimball data warehouse, you have to transform it into a dimensional structure. That will slow down the ETL process. With an Inmon data warehouse, on the other hand, the ETL process is faster and easier because you’re simply moving data from one relational structure to another.

But wait a minute: we haven’t finished yet. Either way, you’re eventually also going to have to move the data down into your data marts for business users to query and analyse. Now the tables (if you’ll pardon the pun) are turned. With Kimball, the amount of data restructuring required to populate the marts is minimal. That isn’t the case with Inmon, making data mart population slower.

Not only that, think about the volumes of data. During the ETL process, we were only talking about 1 GB per night. When populating data marts, you may, in many cases, be moving hundreds of gigabytes of data to several different marts during a single night.
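To make the restructuring argument concrete, here is a minimal sketch of what a Kimball-style transformation does (the record and field names are illustrative, not from the article): flat transactional rows are split into a customer dimension table with surrogate keys and a fact table that references it.

```python
def to_star_schema(transactions):
    """Split flat transactional rows into a dimension and a fact table."""
    dim_customer = {}  # natural key (customer name) -> surrogate key
    fact_sales = []
    for row in transactions:
        name = row["customer"]
        if name not in dim_customer:
            dim_customer[name] = len(dim_customer) + 1  # assign surrogate key
        fact_sales.append({"customer_key": dim_customer[name],
                           "amount": row["amount"]})
    return dim_customer, fact_sales

rows = [{"customer": "Acme",   "amount": 100},
        {"customer": "Acme",   "amount": 50},
        {"customer": "Globex", "amount": 75}]
dim, fact = to_star_schema(rows)
```

Whether this work happens on the small nightly delta during ETL (Kimball) or later, on much larger volumes, when populating the marts (Inmon) is exactly the trade-off the paragraphs above describe.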

Assessing the likely impact of choosing Inmon or Kimball
Once you understand the fundamentally different effects that the Kimball and Inmon approaches have on data flow and restructuring, you can look at a particular data warehouse and determine the likely impact of choosing one over another. For example, in a large warehouse (perhaps that of a telecoms company like Orange), data volume and transformation are likely to be crucial, while for an SME it may be entirely inconsequential.

This question of where you restructure data is not, of course, the only factor you need to consider, or even the most important factor – it is simply one of many. But the choice between a Kimball data warehouse structure and an Inmon one should be made on the basis of reason rather than dogma – and sadly, some people tend to give inflexible support to one methodology or the other.

It is perhaps worth remembering that while Ralph Kimball and Bill Inmon differ in their views, there is no evidence that they themselves take dogmatic positions. Indeed, the fact that Inmon wrote a foreword for the first edition of Kimball’s book The Data Warehouse Toolkit tells us that they probably take a balanced view. We’d all do well to take the same approach.

The original blog entry was published here.

Tuesday BI Basics #24: Physical data model schema

Physical schema is a term used in data management to describe how data is to be represented and stored (files, indices, et al.) in secondary storage using a particular database management system (DBMS) (e.g., Oracle RDBMS, Sybase SQL Server, etc.).

In the ANSI/SPARC three-schema architecture, the internal schema is the view of data that involves data management technology. This is as opposed to an external schema, which reflects an individual's view of the data, or the conceptual schema, which is the integration of a set of external schemas.

Subsequently the internal schema was recognized to have two parts:

The logical schema was the way data were represented to conform to the constraints of a particular approach to database management. At that time the choices were hierarchical and network. Describing the logical schema, however, still did not describe how physically data would be stored on disk drives. That is the domain of the physical schema. Now logical schemas describe data in terms of relational tables and columns, object-oriented classes, and XML tags.

A single set of tables, for example, can be implemented in numerous ways, up to and including an architecture where table rows are maintained on computers in different countries.
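The split can be illustrated with SQLite: the CREATE TABLE statement defines the logical schema (which tables and columns exist), while an index is purely a physical-schema choice that changes how the data is stored and accessed, not what queries can ask. The table and index names here are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: which tables and columns exist.
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")

# Physical schema: how the data is stored and accessed.
conn.execute("CREATE INDEX idx_customer ON orders (customer)")

# Queries are written against the logical schema only; the DBMS decides
# whether to use the index (a physical detail) to answer them.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('orders')")]
```

Dropping the index would change query performance but not a single query result, which is the essence of the logical/physical separation described above.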

Tuesday BI Basics #23: Logical data model schema

A logical schema is a data model of a specific problem domain expressed in terms of a particular data management technology.

Without being specific to a particular database management product, it is in terms of either relational tables and columns, object-oriented classes, or XML tags. This is as opposed to a conceptual data model, which describes the semantics of an organization without reference to technology, or a physical data model, which describes the particular physical mechanisms used to capture data in a storage medium.

The next step in creating a database, after the logical schema is produced, is to create the physical schema.

In the next weeks I will provide a simple training model.