Data warehouse

Data warehouse definition

A data warehouse is a centralized repository for data collected from various sources. Its primary goal is to give the data gathered across an organization a common point of reference, so that it can be compared and is therefore useful for business analysis.

See also: sensitive information, extraction, data custodian, data mining, predictive data mining, machine learning

Real-world data warehouse uses

  • Consolidating data from multiple systems (such as apps or databases) into a single location.
  • Keeping large volumes of historical data to let organizations analyze trends and patterns over time.
  • Letting users share customized reports based on the available data.
  • Providing a platform for business analysis and comparison. The resulting insights help executives and departments make informed decisions.
  • Providing an environment for AI training, including deep learning and machine learning.
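As a rough illustration of analyzing historical data over time, the sketch below totals sales per month. It uses an in-memory SQLite database as a stand-in for a warehouse; the `sales` table, its columns, the sample rows, and the `monthly_totals` helper are all hypothetical and chosen only for this example.

```python
import sqlite3


def monthly_totals(conn):
    """Return (month, total_cents) pairs, oldest month first."""
    return conn.execute(
        "SELECT substr(sale_date, 1, 7) AS month, SUM(amount_cents) "
        "FROM sales GROUP BY month ORDER BY month"
    ).fetchall()


# Hypothetical historical data, already consolidated into one table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, amount_cents INTEGER, sale_date TEXT)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("acme", 12050, "2024-01-15"),
        ("globex", 5000, "2024-01-20"),
        ("acme", 9900, "2024-02-15"),
    ],
)

print(monthly_totals(conn))
```

Because every row shares one date format and one currency unit, the comparison across months is a single grouped query. A production warehouse would typically use a dedicated analytical database rather than SQLite.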

How data warehouses operate (the ETL process)

  • Extraction (E) is the collection of data from various sources for transformation. Extraction involves identifying the relevant data, filtering it, and preparing it for processing.
  • Transformation (T) is the act of standardizing the extracted data into a common format. Transformation ensures that data from different sources has a common frame of reference and can be used for comparison.
  • Loading (L) involves storing and organizing the transformed data (for example, by arranging it into tables).
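
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the two source systems (`CRM_ROWS`, `SHOP_ROWS`), their field names, and the target `sales` table are assumptions made up for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source data: two systems record sales with different
# field names, currency units, and date formats.
CRM_ROWS = [
    {"customer": "Acme", "amount_usd": "120.50", "date": "2024-01-15"},
    {"customer": "Globex", "amount_usd": "", "date": "2024-01-16"},  # no amount
]
SHOP_ROWS = [
    {"buyer": "acme ", "total_cents": 9900, "day": "15/02/2024"},
]


def extract():
    """Extraction: collect rows from each source and filter out unusable ones."""
    crm = [r for r in CRM_ROWS if r["amount_usd"]]
    shop = [r for r in SHOP_ROWS if r["total_cents"] is not None]
    return crm, shop


def transform(crm, shop):
    """Transformation: map both sources onto (customer, amount_cents, sale_date)."""
    rows = []
    for r in crm:
        cents = round(float(r["amount_usd"]) * 100)  # dollars -> cents
        rows.append((r["customer"].strip().lower(), cents, r["date"]))
    for r in shop:
        day, month, year = r["day"].split("/")  # DD/MM/YYYY -> YYYY-MM-DD
        rows.append(
            (r["buyer"].strip().lower(), int(r["total_cents"]), f"{year}-{month}-{day}")
        )
    return rows


def load(conn, rows):
    """Loading: store the standardized rows in a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount_cents INTEGER, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run extract, transform, and load in order; return the loaded rows."""
    crm, shop = extract()
    rows = transform(crm, shop)
    load(conn, rows)
    return rows


conn = sqlite3.connect(":memory:")
for row in run_etl(conn):
    print(row)
```

After the transformation step, the "Acme" record from the first source and the "acme " record from the second share the same customer key, unit, and date format, which is what makes them comparable once loaded.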