SAP Datasphere Data Integration - Part 1 - Introduction and Integration using Remote Tables
SAP Datasphere provides a large set of default connections for accessing data from a wide range of sources: SAP sources as well as non-SAP sources and partner tools, residing in the cloud or on-premise.
Connections are defined in a separate view, which can be found in the main navigation pane on the left. Connections are individual objects in SAP Datasphere. They are created and maintained per SAP Datasphere Space, which means that only members of a specific Space can use the related connections.
These connections can then be used within the different tools (like the SQL View Builder, Graphical View Builder or Data Flow) to create your data models.
SAP Datasphere can integrate with both SAP and non-SAP source systems, whether they run in the cloud or on-premise. The following integration options are available:
- Remote Tables - Remote tables help you build views. During import, the source tables are deployed as remote tables. Depending on the connection type, you can use remote tables to:
  - directly access data in the source (remote access)
  - copy the full set of data (snapshot or scheduled replication)
  - copy data changes in real time (real-time replication)
- Data Flows - After you have created a connection, a modeler can add a source object from the connection to a data flow in the data flow editor of the Data Builder to integrate and transform the data.
- External Tools - SAP Datasphere is open to SAP and non-SAP tools for integrating data into SAP Datasphere.
- Model Import - Model import is a special feature for SAP BW/4HANA or SAP S/4HANA Cloud as a source. It supports importing metadata without having to rebuild it manually. After you have created a connection, a modeler can import source metadata from the connection from the entry page of the Business Builder.
Integration using Remote Tables:
Most SAP sources can be integrated into SAP Datasphere as remote tables. For this data integration approach, the SAP HANA federation framework is currently based mainly on SAP HANA Smart Data Integration (SDI) and its data provisioning framework (dpServer + dpAgent). Some sources can instead set up a direct connection (e.g. SAP SuccessFactors, SAP HANA Cloud) or use SAP HANA Smart Data Access (SDA) with the Cloud Connector (e.g. SAP HANA on-premise). In practice, however, most SAP Datasphere connection types that support creating views and accessing or replicating data via remote tables rely on the SDI Data Provisioning Agent (DP Agent). The appropriate setup must be completed before you create any connections.
What are DP Agents?
The DP Agent is a lightweight component that runs outside the SAP Datasphere environment. It hosts data provisioning adapters for connectivity to remote sources, enabling data federation and replication. It acts as a middleman (gateway) to SAP Datasphere, providing secure connectivity between the database of your SAP Datasphere tenant and the adapter-based remote sources. DP Agents are managed by the DP Server. Typically, all SDI-based connections need a DP Agent.
Through the Data Provisioning Agent, the pre-installed data provisioning adapters communicate with the Data Provisioning Server for connectivity, metadata browsing, and data access. The Data Provisioning Agent connects to SAP Datasphere using JDBC. It needs to be installed on a local host in your network and needs to be configured for use with SAP Datasphere.
Now that we understand what DP Agents are and how they support SDI connections to SAP Datasphere for remote tables, let us look at how SDA connections work for remote tables.
SDA connections are used only for remote tables based on SAP HANA on-premise. They use the Cloud Connector as the link between SAP Datasphere and SAP HANA on-premise.
What are Cloud Connectors?
The Cloud Connector serves as a link between SAP Datasphere and your on-premise sources. It is required for connections that you want to use for the following use cases:
- Data flows
- Model import from SAP BW/4HANA Model Transfer connections (Cloud Connector is required for the live data connection of type tunnel that you need to create the model import connection)
- Remote tables, in rare cases: only for SAP HANA on-premise via SDA
The picture above gives an overview of the SAP Datasphere connection types that support remote table federation.
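The connectivity rules described in this section can be summarized as a small lookup. The following is only an illustrative sketch in Python, not an SAP Datasphere API: the names `Connectivity`, `DIRECT_SOURCES`, `SDA_SOURCES` and `remote_table_connectivity` are hypothetical, the source names are just the examples mentioned in the text, and the fallback to the DP Agent is an assumption based on the statement that most connection types rely on it.

```python
from enum import Enum


class Connectivity(Enum):
    DP_AGENT = "SDI via Data Provisioning Agent"
    CLOUD_CONNECTOR = "SDA via Cloud Connector"
    DIRECT = "direct connection"


# Examples named in the text; this is not a complete list of connection types.
DIRECT_SOURCES = {"SAP SuccessFactors", "SAP HANA Cloud"}
SDA_SOURCES = {"SAP HANA on-premise"}


def remote_table_connectivity(source: str) -> Connectivity:
    """Return the connectivity path the text describes for remote tables."""
    if source in DIRECT_SOURCES:
        return Connectivity.DIRECT
    if source in SDA_SOURCES:
        return Connectivity.CLOUD_CONNECTOR
    # Assumption: every other source falls back to the DP Agent,
    # because the text says most connection types rely on it.
    return Connectivity.DP_AGENT
```

Always check the documentation of the specific connection type before relying on such a rule, since the required setup depends on the source and on the feature you want to use.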
Data Access (Remote/Replicated):
By default, when you import a remote table, its data is not replicated and must be accessed using federation each time from the remote system.
You can improve performance by replicating the data to SAP Datasphere and you can schedule regular updates (or, for many connection types, enable real-time replication) to keep the data fresh and up-to-date.
During replication, data is physically copied from the source into SAP Datasphere.
- For snapshots, replicated means that data is read from the replica table which is not updated in real-time.
- For real-time replication, replicated means that data is read from the replica table and expected to be updated in real-time if changes occur in the original system.
Virtual access is also possible with remote tables: a remote table points to a table in an external system without copying the data. The data is therefore transferred over the network each time a query is executed, because it is read directly from the source through the virtual table.
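To make the difference between the three access modes concrete, here is a toy model in Python. It is a sketch for illustration only and does not use any SAP Datasphere API; `AccessMode`, `RemoteTableModel` and their methods are hypothetical names, and a plain list stands in for the table in the source system.

```python
from enum import Enum


class AccessMode(Enum):
    REMOTE = "remote"        # federation: read from the source on every query
    SNAPSHOT = "snapshot"    # replica refreshed only on demand or by schedule
    REAL_TIME = "real-time"  # replica updated whenever the source changes


class RemoteTableModel:
    """Toy model of a remote table; not an SAP Datasphere API."""

    def __init__(self, source_rows, mode=AccessMode.REMOTE):
        self._source = list(source_rows)  # stands in for the source system table
        self.mode = mode
        # Replicated modes start with a full copy; remote access keeps no copy.
        self._replica = [] if mode is AccessMode.REMOTE else list(source_rows)

    def change_source(self, row):
        """Simulate an insert in the source system."""
        self._source.append(row)
        if self.mode is AccessMode.REAL_TIME:
            self._replica.append(row)  # the change is propagated immediately

    def refresh_snapshot(self):
        """Simulate a scheduled or manual snapshot load."""
        if self.mode is AccessMode.SNAPSHOT:
            self._replica = list(self._source)

    def read(self):
        if self.mode is AccessMode.REMOTE:
            return list(self._source)  # fetched from the source on each query
        return list(self._replica)     # served from the local replica
```

In this model, a snapshot table keeps returning the old rows after `change_source` until `refresh_snapshot` is called, while the remote and real-time variants return the new row right away. The remote variant pays for that with a round trip to the source on every read.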
Data Integration from SAP BW Bridge as Remote table:
A dedicated connection called BWBRIDGE, of type SAP Datasphere, SAP BW Bridge, is available for importing SAP BW bridge objects into SAP Datasphere as remote tables. This SAP BW bridge connection differs from other connection types in that it can't be created or modified. SAP generates it by default when the SAP BW bridge component is provisioned.