BI is a vast topic, but for starters, let us understand, in brief, how Business Intelligence operates.
- A business often has data collected from a wide range of sources. The data usually sits in various databases and can come from different departments within the organization as well as from external vendors. Because these sources use different representations, codes and formats, the quality of the data varies and the differences need to be reconciled. Hence, the problem of integrating, cleansing and standardizing data for BI becomes quite complicated.
CRUX: A business has a lot of branches, and each of them has a lot of departments. So, in short, there is a lot of data. The way this data is stored differs from department to department. But when we want to derive knowledge from this data, we need to bring all of it together. Since the format and representation of the data differ, preparing it for further BI operations becomes quite a task.
- Although it is complex, we do need to prepare the data. The tools and back-end technologies used to prepare data for BI are collectively referred to as “Extract-Transform-Load” (ETL) tools. Over time, however, the need has come up to support BI tasks in real time, i.e. the moment the data is collected. As a result, specialized engines such as Complex Event Processing (CEP) engines have emerged.
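
To make the extract-transform-load idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two sources (`sales_dept`, `web_store`), their field names, the region codes and the target layout are hypothetical, and a real ETL tool does far more. It only shows the core idea of reconciling different representations into one common format.

```python
from datetime import datetime

# Extract: hypothetical raw records from two sources with different conventions.
sales_dept = [
    {"cust": "C001", "amount": "1,200.50", "date": "03/15/2023", "region": "N"},
    {"cust": "C002", "amount": "310.00", "date": "03/16/2023", "region": "S"},
]
web_store = [
    {"customer_id": "c001", "total_cents": 4599,
     "ordered_on": "2023-03-17", "region": "north"},
]

# Assumed mapping from the sales department's region codes to full names.
REGION_CODES = {"N": "north", "S": "south"}


def transform_sales(rec):
    """Standardize a sales-department record into the common layout."""
    return {
        "customer_id": rec["cust"].upper(),
        "amount": float(rec["amount"].replace(",", "")),  # "1,200.50" -> 1200.5
        "order_date": datetime.strptime(rec["date"], "%m/%d/%Y").date().isoformat(),
        "region": REGION_CODES[rec["region"]],
        "source": "sales_dept",
    }


def transform_web(rec):
    """Standardize a web-store record into the common layout."""
    return {
        "customer_id": rec["customer_id"].upper(),
        "amount": rec["total_cents"] / 100,  # cents -> currency units
        "order_date": rec["ordered_on"],     # already in ISO format
        "region": rec["region"].lower(),
        "source": "web_store",
    }


def run_etl():
    """Transform every record; the returned list stands in for the load step."""
    rows = [transform_sales(r) for r in sales_dept]
    rows += [transform_web(r) for r in web_store]
    return rows


warehouse_rows = run_etl()
for row in warehouse_rows:
    print(row)
```

After the transform step, every row has the same keys, the same date format and the same region names, whichever department it came from. That is the state the data needs to be in before it is loaded for analysis.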

- Once the data has been prepared, it is loaded into a repository called a data warehouse, which is managed by one or more data warehouse servers. On these servers, relational database management systems (RDBMS) are typically used for storing and querying the data. Over time, several other data structures, optimizations and query processing techniques have emerged for executing complex SQL queries over large volumes of data. Large data warehouses typically deploy parallel RDBMS engines so that SQL queries can be executed over large volumes of data with low latency (here, the time between submitting a query and getting its result). Meanwhile, more and more of the data being generated is digital. Handling this huge amount of digital data is often referred to as the “big data challenge”. As a result, engines based on the MapReduce paradigm are also being used to run complex, SQL-like queries over such data.
- Data warehouse systems are then complemented by a set of mid-tier servers that provide specialized functionality for different BI scenarios.
4.1 Online Analytical Processing (OLAP) servers expose a multi-dimensional view of the data to applications, enabling common BI operations such as filtering, aggregation, drill-down and pivoting.
4.2 Reporting servers enable the definition, efficient execution and rendering of reports, which give the business better insight by allowing clearer comparisons.
4.3 Enterprise search engines support the keyword search paradigm over text and structured data in the warehouse.
4.4 Data mining engines enable in-depth analysis of the data, allowing the user to answer predictive questions such as: which customers are likely to respond to my upcoming catalog?
4.5 Text analytic engines analyze large amounts of text data and extract the required information, which would otherwise require a lot of manual effort.
Other front-end applications that allow the user to perform BI tasks include spreadsheets, enterprise portals, etc.
Other BI technologies include Web analytics, which helps in understanding how customers interact with a particular web page, and Customer Relationship Management (CRM), which gives insight into which customers are most likely and least likely to repurchase a particular product.
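
The OLAP operations mentioned in 4.1 (filtering, aggregation, drill-down and pivoting) can be sketched with plain SQL. The example below uses Python's built-in `sqlite3` module as a stand-in for a data warehouse. The `sales` table, its columns and its figures are invented for illustration, and a real OLAP server would provide these operations over far larger data through its own interfaces.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("north", "laptop", "Q1", 1000.0),
        ("north", "laptop", "Q2", 1500.0),
        ("north", "phone", "Q1", 700.0),
        ("south", "laptop", "Q1", 900.0),
        ("south", "phone", "Q2", 400.0),
    ],
)

# Aggregation: total sales per region.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Drill-down: break each region's total down by product.
by_region_product = conn.execute(
    "SELECT region, product, SUM(amount) FROM sales "
    "GROUP BY region, product ORDER BY region, product"
).fetchall()

# Filtering: restrict the same aggregation to a single quarter.
q1_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales WHERE quarter = ? "
    "GROUP BY region ORDER BY region",
    ("Q1",),
).fetchall()

# Pivoting: turn the quarter values into columns, one row per region.
pivot = conn.execute(
    """
    SELECT region,
           SUM(CASE WHEN quarter = 'Q1' THEN amount ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN amount ELSE 0 END) AS q2
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()

print(by_region)
print(by_region_product)
print(q1_by_region)
print(pivot)
```

Each query looks at the same facts along a different dimension (region, product, quarter), which is what the multi-dimensional view of the data refers to.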
Happy Learning 🙂
