After we studied the concepts of infoobjects and Persistent Staging Area(PSA), lets move on to DSO's and Infocubes. But before we move on to this topic lets revisit what are infoproviders and data providers in SAP BW.
Infoproviders:
These are objects which provide data for queries. They are basically the objects on which you do reporting.
Data Providers
Data providers are objects which are used for intermediate data staging but not for reporting.
Let us now study DSO and Infocubes which can be both data providers and infoproviders
DATASTORE OBJECTS(DSO)
A DSO is just a transparent table in SAP backend which works similar to a RDBMS. The primary key equivalent in a DSO is called as a key field. A composite key(primary key which is a combination of fields in primary key) are also represented as key fields. We can have a maximum number of 16 key fields in a DSO. A DSO can have overwrite/summation property and this is where it differs from a normal RDBMS where duplicate primary key entries are not supported(violation of rule). In case of a DSO, if you enter a record with the same primary key combination twice, it overwrites the previous value with a new one. A DSO can also have summation property but it is hardly used. As of SAP BW 7.3 there are three types of DSO's which are available for designing:
Infoproviders:
These are objects which provide data for queries. They are basically the objects on which you do reporting.
Data Providers
Data providers are objects which are used for intermediate data staging but not for reporting.
Let us now study DSO and Infocubes which can be both data providers and infoproviders
DATASTORE OBJECTS(DSO)
A DSO is just a transparent table in SAP backend which works similar to a RDBMS. The primary key equivalent in a DSO is called as a key field. A composite key(primary key which is a combination of fields in primary key) are also represented as key fields. We can have a maximum number of 16 key fields in a DSO. A DSO can have overwrite/summation property and this is where it differs from a normal RDBMS where duplicate primary key entries are not supported(violation of rule). In case of a DSO, if you enter a record with the same primary key combination twice, it overwrites the previous value with a new one. A DSO can also have summation property but it is hardly used. As of SAP BW 7.3 there are three types of DSO's which are available for designing:
- Standard DSO : It is the most widely used DSO for intermediate data staging. A Standard DSO is further broken down into 3 tables in the backend.
- New Table(Activation Queue): New table holds data as it arrives during a data load. The primary key of the New table comprises of Technical Characteristics such as Data Packet ID, Request ID and SID.
- Change log: The change log table is a sort of calculator. This table keeps a track of changes occuring during data load. Delta loads through further targets happens through the infocube.
- Active table: An active table contains data which can be used for reporting/furthering data to other infocubes. Data enters the active table by the process termed activation. This process links Transaction to Master data. After activation, each request contains unique surrogate id (SID).
- Write optimised DSO: A write optimized DSO contains only one table i.e active table. This skips the time consuming activation process. This is used for intermediate staging of data.
- Direct update DSO: This also works similar to W-DSO the difference being it can be updated not by DTP's but by Analysis Process Designs(APD) or any third party tools.
INFOCUBES
Info-cubes are multidimensional structures which are used for reporting on multiple dimensions.. For example, you need to analyse sales across countries, regions, time. With each one being a dimension, managers/decision makers can use one/multiple dimensions for reporting and making decisions. This is the prime reason why infocubes are used over DSO's for reporting.
With the advent of SAP HANA, all DSO's and Infocubes are replaced by Advanced DSO's (ADSO).
An Infocube follows the Extended Star schema. Meaning the fact table is linked to the dimension table using Dimension id's. These dimension tables are further linked to their base tables using surrogate id's(SID). This sometimes increases data reading times and hence Line item dimensions(LID) are used for this purpose. A LID clearly surpasses the layer of DIM ID's and the underlying table is directly linked using SID's which greatly reduces data reading time.
1.1 SAP Infocube structure(Extended Star Schema)
An infocube is further classified as standard cube and Real Time cube. RT cubes are used for planning purposes(data directly from external APO's) RT infocubes do not have a flow as such but get data directly. A standard infocube on the other hand follows a data flow and gets data from multiple layers beneath it.