
Showing posts with label Basic concepts. Show all posts

Sunday, June 26, 2016

Conquering Basics: Part 03 DataStore Objects(DSO) and Infocubes

Having covered infoobjects and the Persistent Staging Area (PSA), let's move on to DSOs and infocubes. But before we do, let's revisit what infoproviders and data providers are in SAP BW.

Infoproviders:

These are objects that provide data for queries. In other words, they are the objects on which you do reporting.

Data providers:

Data providers are objects that are used for intermediate data staging but not for reporting.

Let us now study DSOs and infocubes, which can act as both data providers and infoproviders.

DATASTORE OBJECTS(DSO)

A DSO is essentially a transparent table in the SAP backend that works much like a table in an RDBMS. The equivalent of a primary key in a DSO is called a key field. A composite key (a primary key made up of several fields) is likewise represented by key fields. A DSO can have a maximum of 16 key fields. Where a DSO differs from a normal RDBMS table is in how it treats duplicate keys: an RDBMS rejects a second record with the same primary key as a constraint violation, whereas a DSO applies an overwrite or summation rule. With overwrite, if you load a record with the same key combination twice, the new values replace the previous ones. With summation, the key figures are added up instead, but this option is rarely used. As of SAP BW 7.3, three types of DSO are available for designing:


  • Standard DSO: the most widely used DSO for intermediate data staging. A standard DSO consists of three tables in the backend:
    1. New table (activation queue): holds data as it arrives during a data load, before it is activated. Its primary key is made up of technical fields such as the request ID (stored as a SID), the data package ID and the record number.
    2. Change log: a sort of calculator that keeps track of the changes that occur as loaded data is activated. Delta loads to further targets are supplied from the change log.
    3. Active table: contains the data that can be used for reporting or passed on to infocubes and other targets. Data enters the active table through a process called activation. If SID generation is switched on, activation also links transaction data to master data by determining surrogate IDs (SIDs) for the characteristic values.

  • Write-optimized DSO: contains only one table, the active table, so the time-consuming activation process is skipped. It is used for intermediate staging of data.

  • Direct update DSO: works similarly to a write-optimized DSO (W-DSO), the difference being that it is updated not by DTPs but by the Analysis Process Designer (APD) or third-party tools.
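
To make the activation step of a standard DSO more concrete, here is a minimal Python sketch. It is a toy model, not SAP code: the class `StandardDSO`, its field names and the "new"/"before"/"after" labels are all assumptions made for illustration, and the change log handling is deliberately simplified.

```python
from dataclasses import dataclass, field


@dataclass
class StandardDSO:
    """Toy model of a standard DSO; every name here is illustrative only."""
    mode: str = "overwrite"                              # or "summation"
    activation_queue: list = field(default_factory=list)
    active_table: dict = field(default_factory=dict)    # key fields -> key figure
    change_log: list = field(default_factory=list)

    def load(self, key, amount):
        # Loaded records wait in the activation queue until activation runs.
        self.activation_queue.append((key, amount))

    def activate(self):
        for key, amount in self.activation_queue:
            old = self.active_table.get(key)
            if self.mode == "summation" and old is not None:
                new = old + amount
            else:
                new = amount                              # overwrite, or first record
            if old is not None:
                # The before image cancels the old value for delta-capable targets.
                self.change_log.append((key, "before", -old))
                self.change_log.append((key, "after", new))
            else:
                self.change_log.append((key, "new", new))
            self.active_table[key] = new
        self.activation_queue.clear()


dso = StandardDSO(mode="overwrite")
dso.load(("ORDER1",), 100)
dso.load(("ORDER1",), 120)       # same key combination loaded twice
dso.activate()
print(dso.active_table)          # {('ORDER1',): 120}
print(dso.change_log)
```

Adding up the change log entries for a key gives the value currently in the active table, which is why a further target can be supplied with deltas from the change log alone.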

INFOCUBES

Infocubes are multidimensional structures used for reporting across multiple dimensions. For example, you may need to analyse sales across countries, regions and time. With each of these being a dimension, managers and decision makers can use one or more dimensions for reporting and making decisions. This is the prime reason why infocubes are used over DSOs for reporting.
With the advent of SAP BW on HANA, classic DSOs and infocubes are superseded by Advanced DSOs (ADSOs).
An infocube follows the extended star schema, meaning the fact table is linked to the dimension tables using dimension IDs (DIM IDs). These dimension tables are in turn linked to the master data tables using surrogate IDs (SIDs). This extra layer sometimes increases data reading times, which is why line item dimensions (LIDs) are used. A LID bypasses the layer of DIM IDs: the fact table is linked directly to the master data using SIDs, which greatly reduces data reading time.



Figure 1.1: SAP infocube structure (extended star schema)
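
The lookup chain of the extended star schema can be sketched in a few lines of Python. This is only an illustration with made-up data: the table names (`fact_table`, `dim_region`, `sid_country`) and all values are assumptions, and real infocube tables carry many more fields.

```python
# SID table on the master data side: surrogate ID -> characteristic value.
sid_country = {1: "DE", 2: "IN"}

# Dimension table: DIM ID -> SIDs of the characteristics in that dimension.
dim_region = {10: {"country_sid": 1}, 11: {"country_sid": 2}}

# Fact table: DIM IDs plus key figures.
fact_table = [
    {"dim_region": 10, "sales": 500.0},
    {"dim_region": 11, "sales": 300.0},
    {"dim_region": 10, "sales": 200.0},
]


def sales_by_country(facts):
    """Two hops: fact -> dimension table (DIM ID) -> master data (SID)."""
    totals = {}
    for row in facts:
        sid = dim_region[row["dim_region"]]["country_sid"]
        country = sid_country[sid]
        totals[country] = totals.get(country, 0.0) + row["sales"]
    return totals


# With a line item dimension the fact row carries the SID itself,
# so the hop through the dimension table disappears.
fact_table_lid = [
    {"country_sid": 1, "sales": 500.0},
    {"country_sid": 2, "sales": 300.0},
    {"country_sid": 1, "sales": 200.0},
]


def sales_by_country_lid(facts):
    """One hop: fact -> master data (SID)."""
    totals = {}
    for row in facts:
        country = sid_country[row["country_sid"]]
        totals[country] = totals.get(country, 0.0) + row["sales"]
    return totals


print(sales_by_country(fact_table))          # {'DE': 700.0, 'IN': 300.0}
print(sales_by_country_lid(fact_table_lid))  # {'DE': 700.0, 'IN': 300.0}
```

Both functions return the same totals; the line item variant simply needs one lookup less per fact row.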


An infocube is further classified as either a standard cube or a real-time (RT) cube. RT cubes are used for planning purposes (data comes directly from external systems such as APO). RT infocubes do not have a data flow as such but receive data directly. A standard infocube, on the other hand, follows a data flow and gets its data from the layers beneath it.


Thursday, December 17, 2015

Conquering Basics:Part 02 Persistent Staging Area and Infoobjects

Before moving on to loading issues, we will first quickly introduce the SAP objects used in the various staging layers. A comprehensive description of these objects can be found on the official SAP website, and I won't repeat it here. Instead, I will explain them in my own way, with real-world examples.

Persistent Staging Area
The PSA (Persistent Staging Area) is the entry point into the BW system. Data from the source system (ECC) first arrives here, where we cleanse it according to certain rules (no special characters allowed, no lowercase letters allowed, etc.). We edit these records so that they are ready to be sent on to the reporting layers. Of the ETL steps, extraction happens at this stage, along with some of the transformation. A transfer structure represents a 1:1 mapping between the source and the BW system, and data is transferred through it from the source (where users post) to BW. For each DataSource (DS) in ECC, the same DS is created in BW, and it contains a transparent table (the PSA). A transparent table is one that is defined as a single, independent table in the database. SAP also has pooled tables and cluster tables.

Example: Data stored in the PSA is like an employee kept on probation by a company. Probation is not the final fate of an employee: either he is made permanent or he is removed from the company. Similarly, each data record in the PSA is either transferred to the subsequent reporting layers or discarded. Just as an employee on probation doesn't enjoy the rights given to a permanent employee, a record in the PSA cannot be used for reporting until it is transferred onward.
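
The "cleanse, then forward or discard" idea can be pictured with a small Python sketch. It is not how BW implements the PSA: the rule (`DISALLOWED`), the function `cleanse` and the sample records are hypothetical and only stand in for whatever rules a real system applies.

```python
import re

# Hypothetical rule: keep letters, digits, spaces, hyphens and underscores only.
DISALLOWED = re.compile(r"[^A-Za-z0-9 _-]")


def cleanse(record):
    """Return a cleaned copy of the record, or None if a field ends up empty."""
    cleaned = {}
    for name, value in record.items():
        fixed = DISALLOWED.sub("", value).strip()
        if not fixed:
            return None          # nothing usable left: discard the record
        cleaned[name] = fixed
    return cleaned


psa_records = [
    {"customer": "JOE#01", "item": "TOWEL"},
    {"customer": "###", "item": "SHIRT"},
]

# Records that survive cleansing move on; the rest stay behind.
forwarded = [r for r in map(cleanse, psa_records) if r is not None]
print(forwarded)   # [{'customer': 'JOE01', 'item': 'TOWEL'}]
```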

Infoobject
An infoobject is a business evaluation object that is responsible for distributing and dividing data logically. Infoobjects include characteristics (company code, fiscal year period, region), key figures (amount, number, quantity), etc. These are reflected in multidimensional cubes when reporting. If a characteristic has texts, attributes and hierarchies, then the infoobject is sure to contain master data. Why?
Consider the example below and decide whether transactional data could have attributes, texts and hierarchies.
Example: Suppose I load employee master data. Here the MD attributes will be employee name, employee number, etc. The descriptive details for these are stored in MD texts. Hierarchies are nothing but an organisational structure according to which data is classified. A hierarchy can be thought of as the following flow (Organisation --> Continental-level branches --> Country-wise branches --> Region-wise branches in a country).
Note: Hierarchies play a very important role in roll-up and drill-down operations in BW reporting. Roll-up and drill-down are features for which many organisations opt for a BI system.
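
As a rough illustration of roll-up and drill-down along such a hierarchy, here is a minimal Python sketch. The node names, the sales figures and the helper functions `roll_up` and `drill_down` are invented for this example and do not correspond to any BW API.

```python
# Hierarchy stored as child -> parent, following the flow described above.
parent = {
    "Pune": "India", "Mumbai": "India", "Berlin": "Germany",
    "India": "Asia", "Germany": "Europe",
    "Asia": "Organisation", "Europe": "Organisation",
}

# Key figure values exist only at the lowest level (regions).
sales = {"Pune": 100, "Mumbai": 250, "Berlin": 400}


def roll_up(leaf_values, parent):
    """Add each leaf value to every ancestor on its path to the root."""
    totals = dict(leaf_values)
    for node, value in leaf_values.items():
        current = node
        while current in parent:
            current = parent[current]
            totals[current] = totals.get(current, 0) + value
    return totals


def drill_down(node, parent):
    """Return the direct children of a node, one level further down."""
    return sorted(child for child, p in parent.items() if p == node)


totals = roll_up(sales, parent)
print(totals["India"], totals["Organisation"])   # 350 750
print(drill_down("India", parent))               # ['Mumbai', 'Pune']
```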

Let's continue the basics in the upcoming article.

Monday, August 24, 2015

Conquering basics:Part 01:Master and Transactional data

Well, well, well: if you thought I was going to repeat what is written on all those forums or on the official SAP site, then you are probably wrong. I will try to keep it as close to the real world as possible for better understanding. So, let's begin.
Data is basically divided into two categories:

1)Master Data
2)Transaction Data

Master data is the type of data that is referenced by numerous transactions. Let's take an example to make it clear:
Suppose you go shopping in Store XYZ. The store's backend system basically stores two types of information:

1)Your information such as name, address, email, telephone number, etc. (which hardly changes)
2)The details of the goods that you buy on each visit (which are dynamic and change with each of your visits)

You might well have guessed that the first point is nothing but master data. It is called master data because it is the reference for transaction data, without which transaction data has no real meaning. For instance, if I just say that somebody bought a towel, does it make any sense to the store? But if I say that a towel was bought by Mr. Joe, then it makes sense to the business when evaluating sales. This reference may help the business make useful decisions based on, for example:

  1. The annual business from Mr. Joe
  2. Whether Joe is the top purchaser (star purchaser)
  3. The perks to be given to Joe based on his purchases


Transaction data, as stated above, deals with variable data, i.e. data that is highly dynamic. Suppose I go to shop XYZ and the first time I buy a towel and a shirt. The next time I go to the same shop, I buy a basketball and a T-shirt. Isn't it changing? Yes.
To sum up, transactional data without master data is unreferenced data that has no meaning by itself.

For a better understanding, think of master data as a CUSTOMER_TABLE with CUST_ID as the primary key and fields such as CUST_NAME and CUST_ADDRESS, and of transaction data as a TRANSACTION_TABLE with TRANSACTION_ID as the primary key, other fields such as ITEM_BOUGHT, COST_PER_PIECE and TOTAL_COST, and CUST_ID as a foreign key referring to CUST_ID of the CUSTOMER_TABLE. Surprisingly, though, in SAP you do not have the concept of foreign key violations, by which I mean you can load transaction data with no reference to master data without any error. But there will surely be inconsistencies in the report output, as there is no reference to master data. We will elaborate on this issue later on.
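
The two tables and the effect of a missing reference can be tried out with Python's built-in `sqlite3` module. This is only an analogy built on the table and field names above: SQLite is used as a stand-in, foreign key enforcement is switched off explicitly to mimic the behaviour described, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Turn foreign key enforcement off explicitly, to mimic loading without checks.
conn.execute("PRAGMA foreign_keys = OFF")
conn.executescript("""
CREATE TABLE CUSTOMER_TABLE (
    CUST_ID      INTEGER PRIMARY KEY,
    CUST_NAME    TEXT,
    CUST_ADDRESS TEXT
);
CREATE TABLE TRANSACTION_TABLE (
    TRANSACTION_ID INTEGER PRIMARY KEY,
    ITEM_BOUGHT    TEXT,
    COST_PER_PIECE REAL,
    TOTAL_COST     REAL,
    CUST_ID        INTEGER REFERENCES CUSTOMER_TABLE(CUST_ID)
);
""")

conn.execute("INSERT INTO CUSTOMER_TABLE VALUES (1, 'Mr. Joe', 'Some Street 1')")
conn.executemany(
    "INSERT INTO TRANSACTION_TABLE VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Towel", 5.0, 5.0, 1),          # references an existing customer
        (2, "Basketball", 20.0, 20.0, 99),  # no such customer, yet no error
    ],
)

# A report that joins transactions to master data shows the gap.
rows = conn.execute("""
    SELECT t.TRANSACTION_ID, t.ITEM_BOUGHT, c.CUST_NAME
    FROM TRANSACTION_TABLE AS t
    LEFT JOIN CUSTOMER_TABLE AS c ON c.CUST_ID = t.CUST_ID
    ORDER BY t.TRANSACTION_ID
""").fetchall()
print(rows)   # [(1, 'Towel', 'Mr. Joe'), (2, 'Basketball', None)]
```

The second transaction loads without complaint, but the report has no customer name to show for it, which is the kind of inconsistency meant above.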

Please note that SAP BI/BW itself is not a relational database. SAP BW can be described as software that sits on top of a database management system (DBMS) and integrates data from numerous platforms in an organised form. The backend databases used can be Oracle, MaxDB, MS SQL Server, etc.

I end this post hoping that I was clear enough in explaining these concepts. We will delve deeper into them in the coming articles. Should you have any queries, feel free to comment.