OLAP

OLAP is an acronym for online analytical processing, an approach to answering complex database queries quickly. It is used in business reporting for sales, marketing, management reporting, data mining and similar areas.

The reason for using OLAP to answer queries is speed. A properly normalized relational database stores entities in discrete tables. This structure suits operational databases, but complex multi-table queries against it are relatively slow. A dimensional database is a better model for querying, though a worse one for operational use.

OLAP takes a snapshot of a relational database and restructures it into dimensional data, against which the queries can then be run. It has been claimed that for complex queries OLAP can produce an answer in around 0.1% of the time taken by the same query on relational data.

An OLAP structure created from the operational data is called an OLAP cube. The cube is created from a star schema of tables. At the centre is the fact table, which lists the core facts that make up the query. Numerous dimension tables are linked to the fact table. These tables indicate how the aggregations of relational data can be analysed. The number of possible aggregations is determined by every possible manner in which the original data can be hierarchically linked.
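
The star schema described above can be sketched in a few lines of Python. This is only an illustration, not how any particular OLAP product stores its data: the table contents, the names customers, products and sales, and the helper rollup are all hypothetical.

```python
from collections import defaultdict

# Dimension tables (hypothetical data): each key maps to the members
# of its hierarchy that the fact rows can be grouped by.
customers = {
    1: {"city": "Leeds", "country": "UK"},
    2: {"city": "York", "country": "UK"},
    3: {"city": "Lyon", "country": "France"},
}
products = {
    10: {"product": "Tea", "category": "Drinks"},
    11: {"product": "Coffee", "category": "Drinks"},
    12: {"product": "Bread", "category": "Bakery"},
}

# Fact table: one row per sale, holding a key into each dimension
# table plus the measure being aggregated (the sale amount).
sales = [
    (1, 10, 5.0),
    (1, 12, 2.0),
    (2, 11, 4.0),
    (3, 10, 6.0),
    (3, 12, 3.0),
]


def rollup(customer_level, product_level):
    """Sum the sale amounts, grouped by one level from each dimension."""
    totals = defaultdict(float)
    for customer_id, product_id, amount in sales:
        key = (
            customers[customer_id][customer_level],
            products[product_id][product_level],
        )
        totals[key] += amount
    return dict(totals)


by_country_and_category = rollup("country", "category")
```

Calling rollup with a different pair of levels, such as "city" and "product", gives a different aggregation of the same facts; each such pair corresponds to one way of analysing the data.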

For example, a set of customers can be grouped by city, by district or by country; with 50 cities, 8 districts and two countries there are three hierarchical levels with 60 members (50 + 8 + 2). These customers can be considered in relation to products; if there are 250 products with 20 categories, three families and three departments then there are 276 product members (250 + 20 + 3 + 3). With just these two dimensions there are 60 × 276 = 16,560 possible aggregations. As the data considered increases the number of aggregations can quickly total tens of millions or more.
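
The arithmetic in this example can be checked with a short Python sketch. The level counts are those given in the text; the function names member_count and aggregation_count are illustrative only.

```python
from math import prod

# Member counts per hierarchy level, taken from the example in the text.
customer_levels = {"city": 50, "district": 8, "country": 2}
product_levels = {"product": 250, "category": 20, "family": 3, "department": 3}


def member_count(levels):
    """Total members in one dimension: the sum over its hierarchy levels."""
    return sum(levels.values())


def aggregation_count(*dimensions):
    """Possible aggregations: the product of the per-dimension member counts."""
    return prod(member_count(levels) for levels in dimensions)


customer_members = member_count(customer_levels)
product_members = member_count(product_levels)
total_aggregations = aggregation_count(customer_levels, product_levels)
print(customer_members, product_members, total_aggregations)
```

Because the counts multiply, passing a third dimension to aggregation_count scales the total by that dimension's member count, which is why the figure grows so quickly.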

The calculated aggregations and the base data together make up an OLAP cube, which can potentially contain the answer to every query that can be answered from the data. Because the number of aggregations can be so large, often only a predetermined number are fully calculated in advance, while the remainder are calculated on demand.
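
A minimal sketch of this trade-off follows, assuming a tiny hypothetical fact table. The class PartialCube and its methods are invented for illustration; storing an on-demand result for reuse is an assumption of the sketch, not something the description above requires.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, country, category, amount).
FACTS = [
    ("Leeds", "UK", "Drinks", 5.0),
    ("Leeds", "UK", "Bakery", 2.0),
    ("York", "UK", "Drinks", 4.0),
    ("Lyon", "France", "Drinks", 6.0),
]
# Position of each groupable level within a fact row.
COLUMNS = {"city": 0, "country": 1, "category": 2}


class PartialCube:
    """Holds some aggregations calculated up front; the rest on demand."""

    def __init__(self, facts, precompute):
        self.facts = facts
        self.store = {}
        self.on_demand = 0  # number of aggregations calculated at query time
        for levels in precompute:
            self.store[levels] = self._aggregate(levels)

    def _aggregate(self, levels):
        totals = defaultdict(float)
        for row in self.facts:
            key = tuple(row[COLUMNS[level]] for level in levels)
            totals[key] += row[-1]
        return dict(totals)

    def query(self, levels):
        # Serve a stored aggregation if present; otherwise calculate it
        # now and keep it for later queries (an assumption of this sketch).
        if levels not in self.store:
            self.on_demand += 1
            self.store[levels] = self._aggregate(levels)
        return self.store[levels]


cube = PartialCube(FACTS, precompute=[("country",)])
by_country = cube.query(("country",))                # calculated in advance
by_city_category = cube.query(("city", "category"))  # calculated on demand
```

Choosing which aggregations go in the precompute list is the design decision: each one costs storage and processing time up front but saves work at query time.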

Beyond the basic concept there are three types of OLAP: multidimensional OLAP (MOLAP), relational OLAP (ROLAP) and hybrid OLAP (HOLAP). MOLAP is the 'classic' form of OLAP and is sometimes referred to as just OLAP. It uses a summary database, has a specific dimensional database engine and creates the required schema as a dimensional set of both base data and aggregations. ROLAP works directly with relational databases: the base data and the dimension tables are stored as relational tables, and new tables are created to hold the aggregation information. HOLAP uses relational tables to hold base data and multidimensional tables to hold the speculative aggregations.
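
The ROLAP arrangement, in which aggregations live in ordinary relational tables, can be illustrated with Python's built-in sqlite3 module. This is a sketch under assumed names: the tables customer_dim, sales_fact and sales_by_country and their contents are hypothetical, and real ROLAP products manage such tables in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Base data and dimension data are stored as relational tables.
CREATE TABLE customer_dim (
    customer_id INTEGER PRIMARY KEY,
    city TEXT,
    country TEXT
);
CREATE TABLE sales_fact (
    customer_id INTEGER REFERENCES customer_dim,
    amount REAL
);
INSERT INTO customer_dim VALUES
    (1, 'Leeds', 'UK'), (2, 'York', 'UK'), (3, 'Lyon', 'France');
INSERT INTO sales_fact VALUES (1, 5.0), (1, 2.0), (2, 4.0), (3, 6.0);

-- A new relational table is created to hold the aggregation.
CREATE TABLE sales_by_country AS
SELECT d.country AS country, SUM(f.amount) AS total
FROM sales_fact AS f
JOIN customer_dim AS d USING (customer_id)
GROUP BY d.country;
""")

# Later queries read the stored aggregation instead of re-joining the facts.
rows = conn.execute(
    "SELECT country, total FROM sales_by_country ORDER BY country"
).fetchall()
print(rows)
```

A MOLAP engine would instead hold both the facts and the totals in its own dimensional store, and a HOLAP engine would keep sales_fact relational while holding the totals in a multidimensional structure.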

Each type has certain benefits, although providers disagree about the specifics. MOLAP is better on smaller sets of data: it is faster at calculating the aggregations and returning answers, but it creates enormous amounts of data. ROLAP is considered more scalable and uses the least space, but is the slowest at pre-processing and query performance. HOLAP falls between the two in all areas, but it can pre-process quickly and scale well. The difficulty in implementing OLAP lies in forming the queries, choosing the base data and developing the schema; as a result most modern OLAP products come with huge libraries of pre-configured queries. Another problem lies in the base data, which must be complete and consistent.

The first product to perform OLAP queries was Express, which was released in 1970 and was later acquired by Oracle. However, the term itself was not coined until 1993, by Ted Codd, who has been described as "the father of the relational database". The best-known OLAP product, Arbor's Essbase (now owned by Hyperion), was actually released a year before the term was coined. Arbor paid Codd to write his OLAP paper, and his "twelve laws of online analytical processing" were explicit in their reference to Essbase.

Other well-known OLAP products include Microsoft Analysis Services (previously called OLAP Services, part of SQL Server), IBM's DB2 OLAP Server, SAP BW and products from Brio, Business Objects, Cognos and MicroStrategy. Around 50% of OLAP licences are never deployed. Business performance management software is also a major presence in the OLAP market.