Business Intelligence and Data Warehousing – Data Warehouse Concepts
Lastly, we discussed Business Intelligence Tools. Today, we will see the correlation Business Intelligence and Data Warehousing. In our attempt to learning Business Intelligence and its aspect, we must learn the important technology i.e. data warehousing. In this lesson, we will learn both the concepts of business Intelligence and data warehousing. Also, we will see how they work in tandem as well. Moreover, we will look at components of data warehouse and data warehouse architecture.
So, let’s start Business Intelligence and Data Warehousing Tutorial.
Stay updated with latest technology trends
Join DataFlair on Telegram!!
1. What is Business Intelligence?
The term Business Intelligence refers collectively to the tools and technologies used for the collection, integration, analysis, and visualization of data. The raw data which we collect from different data sources transform into comprehensible data or meaningful information using BI technologies. This information interprets strategically by looking for trends and patterns in order to make business decision supported by facts revealed by the analyzed data.
To simplify the concept, we collect raw data from various sources and with the help of Business Intelligence tools transform it into meaningful information. We can store such data in data files, databases, data warehouses or data lakes in specific data structures. From the data warehouses, we can retrieve stored data in the form of a report, query, make a dashboard to conduct data analysis. We do this with the process known as ETL (Extract, Transform, Load). Thus, enterprise executive can use the extracted, transformed and loaded data on different levels. He uses this to draw insights and fuel their decision making with the useful insights revealed by analyzing the data.
2. What is Data Warehousing?
Data warehousing is the process of storing data in data warehouses, which are databases following the relational database model. Data is selected from different data sources, aggregated, organized and managed to provide meaningful insights into data for analysis & queries. A holistic approach to deal with and manage immense amounts of data that we use at enterprise levels. In such a wholesome approach, data does not simply fetches from data sources for operational or transactional tasks but transform in a certain way that we use for analytical and comparison purposes.
A data warehouse is known by several other terms like Decision Support System (DSS), Executive Information System, Management Information System, Business Intelligence Solution, Analytic Application. We call it Decision Support System as it provides useful insights and patterns shown by data as a result of the analysis which makes taking important decisions in business easy and safe. Therefore, in almost all the enterprises, a data warehouse maintains separately from the operational database. We use it only for transactional purposes which is more objective in nature.
3. How does Data Warehousing Work?
In a normal operational database are fully normalized data or is in the third normal form (3NF). In a 3NF state, every field of the table in a database is functionally dependent on only the primary key and does not contain any indirect associations. This means highly ramify data and so fetching data in such a condition is a slow process. In data warehousing, data is de-normalized i.e. it is converted to 2NF from 3NF and hence, is called Big data. We call it big data because of data redundancy increases and so, data size increases. The sole purpose of creating data warehouses is to retrieve processed data quickly. Also, to provide aggregate data like totals, averages, general trends etc for enterprises to analyze and make decisions good for their business and functioning in the industry.
4. Components of Data Warehouse
A data warehouse has several components that work in tandem to make data warehousing possible.
- Operational Systems: These are the different operational domains in an enterprise which serve a unique purpose and contribute in their ways for the proper functioning of the enterprise. Different operating systems can be marketing, sales, Enterprise Resource Planning (ERP), etc. All of these systems have their own normalized database.
- Integration Layer: The normalized data is present in the operational systems must not be manipulated. Instead, a copy of that we take data into an integration layer staging area where manipulate and transform it in specific ways. One basic operation done is bringing the copied data into a single standardized format because, in the operational systems, data is not present in the same format. For instance, in a data field, the data can be in pounds in one table, and dollars in another.
- Data Warehouse: The transformed and standardized data flows into the next element, known as the data warehouse which is a very large database. So, the data stores from all over the enterprise in this data vault in the second normal form having a certain uniform format and structure.
- Data Marts: These are the purpose-specific sub-databases of the data warehouse containing only some parts of the entire big data. In each data mart, only that data which is useful for a particular use is available like there will be different data marts for analysis related to marketing, finance, administration etc. Each of these databases does not coincide or share their data with each other and operations performed in each of them does not influence the other. This makes fetching data from the data marts much faster than doing it from the much larger data warehouse.
A data warehouse is a comprehensive database as it contains processed data information which could be directly taken up by BI tools for analysis. As opposed to this, if you fetch raw data, directly from the data source, you might face issues with the uneven formatting of data, data being unstructured and not sorted. To prevent all of this from happening, data warehouses work as an intermediary data source between the original database and the BI tool. Data warehouses merge the data fetched from different sources and give it structure and meaning for the analysis. It also helps in conducting data mining which is finding patterns in the given data.
5. Business Intelligence and Data Warehousing
Data warehousing and Business Intelligence often go hand in hand, because the data made available in the data warehouses are central to the Business Intelligence tools’ use. BI tools like Tableau, Sisense, Chartio, Looker etc, use data from the data warehouses for purposes like query, reporting, analytics, and data mining.
In any enterprise, Business Intelligence plays a central role in the smooth and cost-effective functioning of it. Thus, BI is helpful in operational efficiency which includes ERP reporting, KPI tracking, risk management, product profitability, costing, logistics etc. And also, helps in customer interaction which includes, sales analysis, sales forecasting, segmentation, campaign planning, customer profitability etc.
When a user needs data related as a result to the queries like when did an order ship? How many of the product X items have been sold this month? Etc. Data from the traditional database using the Online Transaction Processing (OLTP) is used. That is, such data retrieval is done when you need data as an answer to direct questions or queries. Whereas, if you need data for more subjective and holistic queries like factors affecting order processing time, the contribution of each product line in the gross profits etc., data warehouses are used. From our prior discussions, we know that data warehouses store processed and aggregated data which is best used as an answer to the subjective queries mentioned above. Business Intelligence tools require such data from the data warehouses. The data is transported through the Online Analytical Processing (OLAP). Data warehousing and OLAP has proved to be a much-needed jump from the old decision-making apps which used OLTP. As at that time, data was unstructured, not in a standardized format, of poor quality. Also, decentralized data and data retrieval from the source was a slow process. And so, almost all of the enterprises switched to using OLAP and data warehouse model.
6. Business Intelligence and Data Warehousing – Architecture and Process
In this section, we will see how to extract, transform and load raw data into data warehouses. Also, we discuss how BI tools use it for analytical purposes. Refer to the image given below, to understand the process better.
Step 1: Extracting raw data from data sources like traditional data, workbooks, excel files etc.
Step 2: The raw data that is collected from different data sources are consolidated and integrated to be stored in a special database called a data warehouse. A data warehouse is conceptually a database but, in reality, it is a technology-driven system which contains processed data, a metadata repository etc. The process by which we fetch the data into data warehouses from the source is ETL (Extract, Transform, Load). This extracts raw data from the original sources, transforms or manipulates it different ways and loads it into the data warehouse.
Step 3: If you wish to use data from the data warehouse for specific purposes like marketing analysis, financial analysis etc., subsets of the data warehouse are created known as data marts and data cubes. Data from the data warehouse to the data marts also goes through the ETL.
Step 4: From both data warehouse and data marts, data is redirected to data or OLAP cubes which are multi-dimensional data sets whose data is ready to be used by front-end BI tools or clients. At the front-end, exists BI tools such as query tools, reporting, analysis, and data mining. These BI tools query data from OLAP cubes and use it for analysis.
7. Current Scenario of Business Intelligence and Data Warehousing
As technologies change and get better with time, alternatives to data warehousing have also been introduced into the market. They are data lakes, ELT process, and automated data warehouses for faster data processing and analysis. Data lakes and technologies like Hadoop follow Extract-Load-Transform which comparatively more flexible process than ETL. Whenever a BI tool needs the data, we take it from the data lakes and transform accordingly to conduct the analysis. However, enterprises still need data warehouses for analysis which needs structured and processed data.
So, this was all about Business Intelligence and Data Warehousing.
Thus, Business Intelligence and Data Warehousing are two important pillars in the survival of an enterprise. It helps to keep a check on critical elements like CRM, ERP, supply chain, products, and customers. The Business Intelligence and Data Warehousing technologies give accurate, comprehensive, integrated and up-to-date information on the current situation of an enterprise which supports taking required steps and making important decisions for the company’s growth. Hope you liked the explanation. If you have any query related to BI and Data Warehousing, ask in the comment tab.
You must read –