Agile data warehousing

Agile data warehousing is a method of architecting data warehouses and building efficient reporting systems that meet the needs of a demanding market. Such reporting systems allow changes to be introduced into the data warehouse quickly, so that it keeps up with demand for information on customers, products, and finances at the pace of modern business.
Agile data warehousing contrasts with the waterfall model of business and data management: it emphasizes support for a rapidly developing business and always puts business value first.

Principles of Agile data warehousing

An Agile data warehouse should fulfill several main principles, although some of them may already be in place in non-Agile data warehouses.

  • Agile data warehousing can only be performed by an Agile-oriented team. Introducing Agile methods may be difficult, so the staff must be ready for change and able to cope with it. The team must therefore be selected very carefully: its members have to understand the changes and expect a continuous stream of new tasks, requests, and requirements.
  • Metadata-driven development tools support business development, speed up the introduction of changes to an existing data warehouse, and accelerate the implementation of new Agile data warehouses. When changes have to be made, metadata facilitates the delivery of fresh information. Metadata tools store information about how the metadata relates to the source data tables, about the extraction, transformation, and loading (ETL) methods used, and about the relationships between data tables for query purposes. In Agile data warehousing the metadata files should be open and available to other programs, so that a broader spectrum of tools can be used. The most popular format is the Common Warehouse Metamodel (CWM).
  • Isolating data tables speeds up the introduction of changes. In Agile data warehousing the data tables should be divided into groups by subject, which reduces the amount of knowledge needed to understand the data and, subsequently, to introduce changes. Interactions between groups should be limited so that a developer can work within a certain area without having to understand the whole warehouse. To achieve this, foreign keys that reference entities or dimensions are usually introduced. Together with isolating data tables, the data can also be divided into smaller portions stored in denormalized tables. This allows non-business developers to understand the relations between the data and to make changes in the warehouse.
  • Using surrogate keys enables the integration of data from different programs. They allow information from multiple sources to be combined in existing data warehouse tables. This helps to avoid duplicate key generation and keeps the data properly identified and separated.
  • In Agile data warehousing, frequent source system feeds help to gain the maximum return on investment (ROI). Keeping information fresh is necessary to adjust actions to the market’s demands. To determine which information is required, set a maximum age and provide customers with information no older than that limit. Moreover, by expanding the use of the data warehouse itself, the load on the application server can be reduced, which increases the warehouse's return on investment.
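
Two of the principles above, surrogate keys and a maximum data age, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a reference implementation: the class SurrogateKeyRegistry, the function is_stale, and the source system names "crm" and "billing" are hypothetical and do not come from any particular warehousing tool. The sketch keeps the key mapping in memory, whereas a real warehouse would typically persist it in a key-mapping table.

```python
from datetime import datetime, timedelta
from itertools import count


class SurrogateKeyRegistry:
    """Maps (source system, natural key) pairs to warehouse-wide surrogate keys."""

    def __init__(self):
        self._next_key = count(1)  # monotonically increasing integer keys
        self._keys = {}            # (source, natural_key) -> surrogate key

    def key_for(self, source, natural_key):
        """Return the existing surrogate key for the pair, or assign a new one."""
        pair = (source, natural_key)
        if pair not in self._keys:
            self._keys[pair] = next(self._next_key)
        return self._keys[pair]


def is_stale(last_loaded, now, max_age):
    """Return True when the last feed is older than the agreed maximum age."""
    return now - last_loaded > max_age


registry = SurrogateKeyRegistry()

# Two hypothetical source systems happen to use the same natural key, 1001.
crm_key = registry.key_for("crm", 1001)
billing_key = registry.key_for("billing", 1001)

# Asking again for the same source and natural key returns the same surrogate key.
repeat_key = registry.key_for("crm", 1001)

# Freshness check against an assumed maximum age of four hours.
last_feed = datetime(2024, 1, 1, 0, 0)
stale = is_stale(last_feed, datetime(2024, 1, 1, 5, 0), timedelta(hours=4))
```

Because each surrogate key is assigned per source system and natural key, records from different programs that share a natural key remain separately identified in the warehouse tables, and repeated loads of the same record do not generate duplicate keys.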