Upsert [sic] with SAP Data Services? Don’t cry though….

upsert2So, recently I find myself heavily involved with SAP Data Services and data migrations, I know what you’re thinking… thats not real code! And you’re damn straight …. it isn’t, but I have always leaned heavily towards the data end of the spectrum rather than the UX end which is filled with sorcery, witch craft and more importantly users who are like little pixies with no respect for the sanctity of code; data is data though and it doesn’t press buttons marked ‘Nuclear Button, Do Not Press’ just for kicks. I thus don’t complain too  much (ahem) when ETL jobs come my way. One of the biggest challenges with ETL in the SAP world is to keep everything down at the database level, it’s very easy to slip up and watch your execution times rattle up to unusable very quickly all because you used a feature for which there is no direct database  equivalent and at times it is frustrating at how the simplest functions are not covered; bygones.

The idea for this blog came about when I was trying to establish how to delete (archive) a load of data which is of course easy if you use the Map Operation transform, the issue with that of course is that none of this gets pushed down to the database server. To my knowledge there is currently no way of achieving this (without recourse to the SQL transforms which are of course frowned upon). I needed to think outside the box a little, luckily as  I always use intermediary tables to compile my data  prior to completion that answer was fairly simple:-

  1. Compile data as normal in intermediary table
  2. Mark any candidate records in intermediary table as ‘Archived’ using a database field
  3. Omit ‘Archived’ records when inserting records from intermediary table into main target table

And that led me into using a feature that (until you know the toolset a little better) appears to be missing, namely how to execute an Update statement (using FULL pushdown) against existing data. Oh sure, again it’s very easy to just add a Map Operation and convert a flow of data into an ‘Update’ but this of course results in a RBAR operation as the BODS engine will iterate over each row of data and then change its op code so that it is an ‘update’ not a ‘insert’ . The end result is that the BODS engine then creates individual update statements and we quickly have a non -performant data transformation; furthermore each row of data will compound the problem and before very long we will have a drawing board to which we will have returned.

I must admit that how to achieve Updates with a SQL pushdown did escape my sights for a little while before I realised that the guys from SAP have allowed for it. There’s a nice feature called ‘Merge’ that was introduced in SQL Server 2008 (and I’m certain that other database providers also allow for it) which allows you to easily specify within one SQL statement how to match a series of records from a data source with another series of rows from a datatable, you then specify how to join these disparate and distinct sets of data and in the case of a match or a non match from the above join also specify how to update/insert the data.  For me the SQL is a little verbose for everyday use and I instead prefer to break it into distinct operations for everyday use but then I’m a little bit old school…. The real bonus about this merge methodology is that with very little effort SAP have encapsulated and used this functionality to offer a true database level Upsert mechanism in Data Services Designer. Of course the normal restrictions apply in that you must be operating against and joining on data that has a unique key but then that is relevant to the database provider and not to data services itself

So lets take a look at how to drive this functionality, quite simply we start with our simple transformation as shown below:-

2017-01-10_0912

As you can see we are using the same FactSalesIntermediary table for our source data and target data, in this instance we are inner joining to the other tables to establish which data can be logically archived off.  In our scenario as the FactSalesIntermediary is the driver table we can be sure that we will only ever be updating records and so in this instance the Insert part of the upsert methodology is not actually used at all (this is actually a reasonably normal use case as using upsert is the only effective way in SAP Data Services Designer to push an update down entirely to the database server). If we take a look at the configuration of the table that we are updating you should see the following:-

2017-01-10_0913

Note that ColumnComparison is set to ‘Compare by Name’ and that the only two fields that we are interested in are the SID (which is our unique key) and the archive flag which details the fact that we are to archive off the record with the current SID. If you think about this from a purely SQL point of view this is enough information to compile the following:-

UPDATE FactSalesIntermediary SET Archived = 1 WHERE SID = 10002

The rest of the fields as they are not inputs will of course be ignored. You should also note the other two advanced feature settings that have been highlighted, namely  ‘Auto Correct Load’ and ‘Allow Merge or Upsert’, selecting both of these allows the Data Services engine to create an Upsert SQL statement (instead of the normal SQL ‘Insert’ statement) that can be pushed directly down to the database thus avoiding any RBAR processing. Lets take a look at the SQL that is generated:-

MERGE INTO "DBO"."FACTSALESINTERMEDIARY" s 

USING
(SELECT DISTINCT "FACTSALESINTERMEDIARY"."SID" , 1
FROM (("DBO"."STAGINGSAPDELETIONS" "STAGINGSAPDELETIONS" INNER JOIN "DBO"."DIMSAPDOCUMENTNUMBER" "DIMSAPDOCUMENTNUMBER" ON ( "STAGINGSAPDELETIONS"."SAPDOCUMENTNUMBER" = "DIMSAPDOCUMENTNUMBER"."SAPDOCUMENTNUMBER" )) INNER JOIN "DBO"."FACTSALESINTERMEDIARY" "FACTSALESINTERMEDIARY" ON ( "DIMSAPDOCUMENTNUMBER"."SID" = "FACTSALESINTERMEDIARY"."DIMSAPDOCUMENTFK" )) INNER JOIN "DBO"."DIMSTATUS" "DIMSTATUS" ON ( "DIMSTATUS"."SID" = "FACTSALESINTERMEDIARY"."DIMTRANSACTIONTYPEFK" ) AND
( "STAGINGSAPDELETIONS"."STATUS" = "DIMSTATUS"."STATUS" )
) n (SID , ARCHIVED)

ON ((s.SID = n.SID))

WHEN MATCHED THEN
UPDATE SET s."ARCHIVED" = n.ARCHIVED

WHEN NOT MATCHED THEN
INSERT ("ARCHIVED" )
VALUES (n.ARCHIVED);

So lets break it down (each colour representing a ‘part’):-

  1. As you can see the first part of the SQL creates a Merge Statement which sets up the whole syntactical structure and also aliases the table we wish to Update to ‘S’.
  2. We than have a SELECT statement that we built in our query detailing which source data to use and again aliasing all of this to a virtual table called ‘N’.
  3. We then specify how these two aliased sources join together, in this instance joining the ‘A’ and ‘N’ together by the SID fields.
  4. We then have the nuts and bolts of the upsert itself, we start with the WHEN MATCHED  which effectively gives us our ‘UPDATE’ statement
  5. And finally we have the WHEN NOT MATCHED which for this scenario we actually ignore but in other circumstances would give us our INSERT statement.

And that really is all there is to it, when the example is run I see that 150 records are updated and none are inserted which is what I expect to see. I can then when I populate the Main FactSales table ignore all of the ‘archived’ records

Upsert? Please don’t cry…..

Array Properties in SAP Design Studio Custom Components

ALMA CorrelatorWhilst coding a custom component for a complex SAP Design Studio Dashboard I was vaguely aware that coding one property per required column property was not exactly what you’d call, scalable! What I needed was to be able to capture arrays of data instead of just single properties, that way I could just have iterative code in the component which would take care of my component generation instead of these hardcoded single properties. In this instance I wanted  to capture the colours for a given column header ( for a data table). In my situation I had 10 headers but as this is a customisable component I guess it would be better to express this as ‘n’ rather than 10. Despite posts to the contrary this is indeed made possible by use of the Array data type.

First we start with writing the property into the contribute.xml file. This is really the key to our issue and once this is correct the rest is really plain sailing. We start by writing the following property XML (I find the xml to be much easier than using the GUI) into our file:-

        <property id="columnColours" type="Array" title="Column Colours Array">
            <property id="arrayEntry" type="String" title="Column Colour">
            </property>
        </property>

The outer xml snippet entitled ‘columnColours’ creates an array to house the array data whilst the inner section entitled ‘arrayEntry’ details what I consider to be an array template/type definition for the type of data you wish to store; in this instance the type of data that we wish to store is a string value representing a colour value.  Once this has been defined and the changes saved your array is nearly there. All that remains is to back this up with the accessor function to allow the Design Studio to inject your settings into the main body of the custom component .js file. This function is defined like any other accessor function, identical!

    
    this.columnColours= function(value) {
        if (value === undefined) {
            return columnColours_data;
        } else {
            columnColours_data = value;
            return this;
        }
    };

From here you can now access your property_data which instead of containing a primitive data type now contains an array of the type you defined. You can thus use iterative methods to access the values. Of course, you are all hardened professionals in the javascript world… You won’t do what I do and use pop/shift on the passed array and then wonder why subsequent redraws failed to work;-) That little problem I solved by cloning my source array using the very simple syntax to return an new shallow copy.

var newArray = sourceArray.slice(0);

//Magic happens here, EVERY TIME, not just first time!

Once your code has all been defined you can now  run up the custom component in SAP Design Studio and if you look at the properties grid for your custom component you should see the new property there showing 0 entries. If you then click on the ellipsis to the right of the edit field you should be provided with a dialog similar to that below which allows you to specify the members of your array.  Once you have defined your members your code should be good to go.

2016-04-13_2217

Far more scalable than 9 separate properties, I’m sure you’ll agree.

Go do it.

Change Data Sources at Run Time in SAP Design Studio

LOLZ.Recently whilst working on quite a complex SAP Design Studio Dashboard for a valued client we came across the situation whereby the  custom graphs and charts that we had developed needed to be changed at run time. This seemed, and in reality was, quite simple although information was sparse and so it took a little bit of digging to find the answer to our solution.

What was very clear was that the out of the box components provided by SAP all had a mechanism by which this could be done , by use of the .setDataSource() method which takes the name of  a data source within your Design Studio project. What was also clear was that the custom components created by us (based upon the samples given, in this case the simple table example)  did not expose this property. We went thorough the normal responses,

1. Ignoring the fact that a squiggly line underneath our setDataSource call meant that it would not resolve at run time… and tried to call it anyway.

2. Cursing. A good option always… I am well versed.

3. Considering replicating all of our components, one for each data source.

4. More cursing….Cursing the fact that we had even considered 3 an option given the complexity of our situation.

5. Abused Google with various incorrectly worded searches.

6. Considered becoming a monk, there are many many reasons why I could not do so, mainly cursing and the fact that I don’t believe in God

7. Trawled the forums looking for examples.

In the end solution 7 yielded a nugget that prevented the opening of the ‘First Order of Cursing Monks’ (Good band name right?!)

The answer was in fact ridiculously simple as all solutions are once you know them! Firstly we need to return to the custom component editor which is of course Eclipse,  within your custom component is a file entitled contribution.ztl.  If you open this file and look at the class that this extends from it is by default Component. Component of course knows nothing about how to deal with DataSources and so instead you need to change this file to inherit from DataBoundComponent like so:-

class com.dscallards.sap.uitable.UITable extends DataBoundComponent {
}

You can then save and run up the custom component in Design Studio and now you may observe that upon writing the following code you are left with no squiggly line to indicate that you have a compilation error.

myCustomTable.setDataSource(myDataSource);

And better yet, when you run the code you can see that the data source does indeed change when your event (in this sample case a button onClick)  is triggered.

Go do it.