Why Do You Have To Be So Sensitive?

Time Crystals

For this latest blog I have my soon to be ex-colleague (Splitter!) Luke ‘Maximo Park’ Johnson to thank. He asked me a question with regard to SAP Data Services which has always bugged me but that I have never managed to resolve, A quick google search on my part confirmed my previous line with regard to this issue and I left it at that, he sent me a link sometime later which I investigated which contained a nugget that had thus far eluded me. So I investigated and his nugget became fully a fledged gold bar. I therefore decided to blog this in order that others may benefit from this knowledge…. and I won’t forget!

So, what exactly was this problem? The question he asked me was ‘Why does Data Services create all of its tables within the data store with an upper case name. This has been my biggest gripe by far (apart from the god awful SQL that gets generated!) because being an everyday human being we use shortcuts for everything, consciously and subconsciously. Let’s step back a little to step forward…. When I create a database table dealing with Product Derivations I have a few ways I can do this, I could use the following names:-

Product Derivations



Its very much a matter of taste of course but as I have an abundance of taste I would always use the latter of these options! Why? Well option 1 with its embedded space means that when it comes to using SQL (which is an environment I live extensibily in)  you would have to use [] or “” around every table name which is just hideous and more more more work

SELECT * FROM “Product Derivations”


SELECT * FROM [Product Derivations]

Eugh… So option 2 is definitely less work as regards to these extra characters but I left visual basic behind a long time ago. To be perfectly honest it’s just amateurish and doesn’t really aid in the readability, all those silly little underscores which to me break just serve to the message up

SELECT * FROM Product_Derivations

No, I would use the final Camel Case option every time, it says exactly what it needs to and no more nor no less. Data Services, however, gets in the way of this as out of the box (and seemingly until today) it creates all Template Tables fully upper case which is just a bit of a car crash…Wait, I hear you say, surely it doesn’t matter? But actually, it really really does…  It all goes back to those shortcuts that we subconsciously do. I have read a LOT of words in my lifetime, many many millions I suspect and I know for a fact that not one of us reads every single letter in a word before arriving at ‘the word’ unless we are just learning that word or training our brains like children do. In general terms we just don’t work like that, we shortcut using context and we also shortcut using word shapes the word ‘balloon’ looks a lot different to the word ‘battlestation’ and we see that shape pretty much instantly. When I am therefore looking through a list of potentially hundreds of database tables, all beginning with P, my eyes are scanning for all of those that start with a shape like ‘Product’ and then once that shape has been recognised the rest of ‘the word’ will also be matched meaning that matching the words ‘ProductDerivation’ is a far easier job than it otherwise would have been.


What happens though if we take the case sensitivity out of the equation?


I’m sure you’ll all agree much harder to read…

And if we go to the opposite extreme and do what Data Services does and upper case everything things go from bad to shocking! Practically unreadable at speed as we don’t work in upper case really.


So, the question is how do I make Data Services behave and give me the tables EXACTLY as I define them? Well, the answer is very simple. WE can make one configuration change and normal service is resumed. We start by locating the DSConfig.txt file which is located within the Data Services/conf folder. On my machine this was within this directory:-

C:\Program Files (x86)\SAPBusinessObjects\DataServices\conf

We then open this file up and look for the [AL_Engine] Section heading which will look like this:-


AlEnginePathName =…..

You then need to add a new setting called ODBCAlwaysCaseSensitive and give this a value of ‘TRUE’

Your file should now look a little like this:-


AlEnginePathName =…..

ODBCAlwaysCaseSensitive = TRUE

And that’s it as far as changes are concerned. We just save the changes before stopping and restarting the SAP Data Services windows service available from the Windows Services Manager dialog. Lo and behold, you’re all done. What you should now find is that upon running your data transformations any template tables that you have defined will now upon closer examination within SQL Server have been created using the casing you specified at design time. I don’t know about you but for me, this will now be the first thing I change in any new data service installations….


There Is Nothing Like A Dame…


So there I was on a Friday (I’m lying it was a Thursday actually) minding my own business and trying to implement a like clause in SAP Data Services thinking… It really shouldn’t be this hard! But it was, it turns out that the double pipe characters   (||) which are the usual concatenation expressions within Data Services aren’t valid in the join expression. This was slightly problematical as I needed to query with a like as well as with exact matches. Faced with my like clause problems I would see the day quickly slipping away from me. The choices were as follows:-

  1. 1. Move all of the logic for my transform out to a Stored Procedure . I have no doubt that this would work but would have resulted in much reworking as my data transformations were reasonably complex and there were many of them. In addition it always feel like a cheat too as you end up losing the Data Lineage which to be honest had already gone to hell in a hand cart in this instance!
  2. 2. Not use a like, nor give a fig and just give up and run away to the South of France for the rest of my life. In all honesty, I'm struggling to find a downside here...
  3. 3. Add a new derived which performed the concatenation and created a new field call FuzzyData. I could then use the phrase OR FieldToCompare LIKE FuzzyData. Theretically this could work but I could see this extra query step potentially preventing my pushdowns from working and with the size of this data set that would be completely unworkable.

Now if you’re anything like me you stopped reading at point 2, made for the airport and was enjoying a cheeky red before embarkation; actually scratch that. I think that a red is probably way too classy and haughty to be considered cheeky. But I digress…. If you are also like me I would guess that Mr Lloyds also probably had you forcibly ejected  from the aeroplane United Airlines style; Something about better repaying those airline tickets size dents in ‘his’ money

So lets state the problem, we have a set of spreadsheet data that contains data returned from a client pertaining to products that they have ordered which are defective in some way; this data has been moved into a staging table within our data warehouse to allow for performant querying. There is of course much data within this table but the main identification field is the barcode which is always in the format ‘nnnnnnnn’ n being a numeric value of the barcode as exposed to the client.

Spreadsheet Barcodes


Also within our data warehouse is a DimProducts dimension table which has loaded the full inventory from our own LOB database. The barcodes within this data are generally  in exactly the same format although there are a few that have 11 characters instead of the more usual 8 (format ‘nnnnnn-nnn’). The client however will only ever see the (first) 8 characters of any barcode including these anomalies.

Dimension Barcodes


The difficulty is that we wish to map the data contained within the ‘spreadsheet’ to the data within the ‘LOB database’ so that for every defective complaint sent out we can associate it to the product within our inventory. We can then contact manufacturers etc. Remember that in the main there will be a direct match between these two sources, it will only be ‘edge cases’ where the ‘spreadsheet’ data will match on the first 8 characters of our 11 character barcodes. As you can see from the sample barcodes above if we were to perform the below join we would only match on the first two records

 'Spreadsheet.Barcode = Database.Barcode'

The final barcode would not be resolved as 09812344 is not the same as 09812344-908. In order to match against this we would need a join like this:-

'Spreadsheet.Barcode LIKE Database.Barcode || '%''

The percentage is of course required as SQL server requires wildcard characters to utilise fuzzy match on a LIKE statement otherwise it is essentially just an equal to statement. Nice and simple so far but just try and use that phrase inside an SAP data flow join. Not very helpful errors ensue and the afternoon disappears in much the same way that my salary does… alarmingly fast.

As I alluded to earlier in this blog post the easiest way to address this is to split your processing into two joined queries, in the first we select only the fields we are interested in from our spreadsheet staging table and derive a new field entitled FuzzyBarcode. This field uses the following formula for its text

Spreadsheet.Barcode || '%'

This removes the need for monkeying around with pipe characters in the joins at a later stage. Next off in the second query we bring in the data from the first query and join that to the data within our products table using the barcode join as shown below:-

'Dimension.Barcode LIKE Spreadsheet.FuzzyBarcode'

This data can then be output into your fact table and as you should see something magical happens. Not only does your data resolve in the manner that you wished the execution of this workflow (so long as you have not done anything else too complex) gets entirely pushed down to the database leaving you with a statement Similar to this pseudo SQL

INSERT INTO FactComplaints (DimProductFK,RejectionReason)
SELECT D.Sid,S.RejectionReason 
FROM Spreadsheet AS S INNER JOIN DimProducts AS D
ON D.Barcode LIKE S.FuzzyBarcode

Perfect, a like clause within SAP BODS and everything gets pushed down to the database for maximum efficiency.

Selectively Executing Jobs with SAP Data Services


Time…. Its constantly against us in every working day and it matters not how productive you manage to be somebody will always rain on your parade with ‘just one more job’ to eat away at your life blood. So here’s the deal… I’ll tell you how to save a little bit of time during your day but…. Shhhhhhhhh, keep quiet about! Maybe you could use the time you will doubtless save to start planning your takeover of the world, lets face it you could hardly make more of a hash of it than the current crop of lunatics are, could you?. Agreed? Good, now without wasting more time lets improve the world….

When it comes to Data Services one can spend an awful lot of time rerunning the same data imports over and over again and the cost added up over time can be considerable. Sure we can use the ‘Bypass’ functionality but as anyone with more than a passing familiarity to this can testify it can be more hassle than its worth; and of course once you start moving jobs between repositories everything goes further south than The Duke Boys on a day trip to the Alamo. In a chance conversation with my colleague Angus Menter (if he hasn’t lived there, it probably doesn’t exist) the frustrations of a particular user were relayed to me and I was asked how I coped with this. The answer is in fact relatively simple once you understand but there is a little bit of setting up.

So lets state the problem, I have a small but complex job that moves data from a line of business database to a data warehouse for reporting purposes. This is rather simplistically accomplished in 3 workflows.

  1. Destage Data (Move data from source database to destination database for reasons of performance)
  2. Create Dimension Table
  3. Create Fact Tables (using data created in step)


Now imagine that i had one massive view to move over in Stage 1, which I then needed to split into over 500 dimension tables in step 2 before creating one fact table referencing all of these dimension tables in step 3. Imagine now that Stage 1 takes 15 minutes to upload… It doesn’t take a genius to see that when it comes to designing and testing and running stage 2 that initial 15 minutes download is going to get very old very fast, as indeed will I. Ideally we would probably only want to run this initial stage once a day during the development process, if that. This we would accomplish  by changing our main workflow to use ‘Conditional Workflows’ overarching our 3 main workflows.

We start by removing the standard workflows from our main canvas (don’t worry, you won’t lose them) and replacing them with three conditional workflows named in a similar manner to the deleted workflows as shown in the following image.


We then open up each conditional workflow in turn and using the local object library’s ‘Workflow’ tab we drag our previous normal workflow (named the same) into the ‘IF’ window of the conditional workflow. What this will mean is that if the logical statement (which we will write later) equates to true then we SHOULD run this workflow otherwise we will run whatever is in the ‘ELSE’ window, in this instance…. Nothing at all. Your 3 conditional workflows should look a little like the following 3 images.

This slideshow requires JavaScript.

Great, so thats really coming along but of course we need to manage the logic regarding whether or not these jobs run. How do we do this? Well, there are a number of ways that it could possibly be achieved but I think for me the easiest and most flexible way is to use ‘Substitution Parameter Configurations’. For this of you not familiar with these it is essentially a set of parameters that are assigned at run time and fed through to the executing job. The job may then query these values to find out ‘how it should run’. Lets start off by defining our parameters, we want three one for each workflow so that we can control whether or not each runs. We thus open the Substitution Parameter Configurations dialog from the tools menu as shown below.


We will then be furnished with a dialog like the following where we can define our three parameters, I’m calling them:-

  1. $$DestageData
  2. $$LoadDimensions
  3. $$LoadFactsData


And I think that their usage is fairly obvious by their names. We set them all to have the value YES indicating that as things stand they SHOULD run. We then press OK and dismiss the dialog. We then return to each of the Conditional workflows in turn and set the ‘IF expression using the ellipsis button surrounded with a green square toward the top right of the following dialog. We then flip to the variables tab, expand the Substitution Parameters item and drag our ‘$$DestageData’ parameter into the formula editor before finishing the formula as shown below:-


Note how both the parameter name and the value MUST be enclosed in single quotes as they are text values and not numeric.  We then do this for each of the three conditional workflows as shown below:-

This slideshow requires JavaScript.

Great strides! We’re nearly there. In fact, lets just test that what we have created works as intended. If you save and run your job you should see that it does indeed function as it did previously but of course we are STILL running all of the jobs.

We could leave it at this point if you wished, and omit the running of a stage simply by opening and changing the Parameter Configurations which is perfectly acceptable. However, we will go a little further and with a little bit more set up eliminate the manual editing entirely.  We start by once again loading up the Substitution Parameter Configuration dialog and renaming our Configuration to something a bit more obvious, in this instance ‘Run All’ and as you can see all of the stages are set to run.


We then create an exact duplicate of this configuration using the indicated button which we rename to RunAllExceptDestage. Yep as you’ve probably guessed we then change the ‘$$DestageData’ value to ‘NO’ indicating that it should NOT run.


Brilliant, so lets run it right? Hold on Tiger… we’re not there yet.

We then need to utilise the System Configurations as shown below:-


These are amazingly useful things that allow you to create configurations with associated substitution parameters and data stores, you could thus work against a live database, or a test one or run in a different manner entirely. In our instance we don’t have different data stores and so we are only really interested in changing the substitution parameter configurations that are used. We thus create two System Configurations and entitle them rather imaginatively ‘RunAll’ and ‘RunAllExceptDestage’. Each configuration then uses different substitution parameters meaning that one will execute the full process, whilst the second will omit the ‘Destaging’ operation.

This slideshow requires JavaScript.

Nice and simple, we save those changes and now upon executing our job we get a new dropdown to pick from:-


From this we select ‘RunAllExceptDestage’. Once executed you should see that we no longer run the Destage operation and instead jump straight to the Dimensions job instead.


Amazing eh? Enjoy your executions……




SAP Data Services – Importing Image Data

Quite recently I was working on an SAP Data Services project which involved importing a great deal of data about fruits including photographs of the different varieties. There was a large amount of work to do and I must admit that whenever I thought about this photographic element of the import I developed a kind of coders twitch; I would hit google, notice the total lack of suggestion as to how this can be achieved and I would then push it very deliberately to the bottom of my pile. This is not normally how I work, I generally tend to do the hardest things first make stunning progress and then realise how much I had underestimated the complexities of the ‘simpler tasks’! But this was image data which is probably my least favourite thing, I like data… Eventually there came a point when I had no pile left under which to push this job any longer and so I had to assert myself to the job in hand.

Lets start by stating the problem, which lets face it, could have been much worse. We were given a spreadsheet containing many different varieties of fruits along with an awful lot of data, most of which is to be frank, completely irrelevant to this blog. This document also contained a relative path for each variety of fruit which described the location of a jpg file which was the picture itself. My task was to get all of this data into a database to allow the reporting services  to report against. So lets take a simplistic look at the data:-

As you can see we have a Variety Column which describes the name of the fruit Variety  along with a Product column which describes the Type of fruit. Row 1 is thus a variety of ‘Mango’ called ‘Amelie’ whilst Row 2 is a variety of ‘Passion Fruit’ called ‘Edulis’. For each of these varieties we also have a ‘Photo’ column which in this instance stores a relative path (from the context of the location of the document itself). The base path of this document is in this instance:-


Helpfully the first  row we look at in our earlier screen capture has a null value but the next row contains a valid relative path of

Exotic photos\Edulis.JPG

Using this relative path and the known location of the document itself we can thus build a path that details the location of the image relating to the Edulis Passion Fruit:-

K:\ImportFiles\Shared\Photos\Exotic photos\Edulis.JPG

Nice and simple, so we have the path now how do we go about translating that path into BLOB data  representing the actual image itself? This is where I struggled to find information, I scanned the function library looking for a ‘LoadImage’ method or similar before hitting the internet itself, after turning over some pretty scabby stones I found a forum post that was pretty illegible though it did contain a morsel of gold dust. Turns out there is no specific load function within the SAP library which I had already ascertained. There was however a fairly simple, but equally obtuse method for loading image data which involves writing the path name to a Text File  using a certain format and then opening and importing that data using a standard workflow. Lets investigate just how that works. We start by processing our spreadsheet data into a format that can be written to our text file, the transformation  I wrote looks like this:-

As you can see when you examine this first query I am merely removing any leading/trailing blanks and ensuring that null values are replaced with an empty string. Nothing tricky there nor even necessarily relevant to you.

The next operation is where the real work is done.

For the purposes of this demonstration we are only really interested in the ‘Photo’ field where as you can see we are building a string that looks a lot like this:-

<<K:\ImportFiles\Shared\Photos\Exotic photos\Edulis.JPG>>

The appending of the angle brackets will later tell SAP data services to treat the data contained within them as the path for an image that should be loaded. You needn’t worry yourself with the replace function that I have had to use, this is peculiar to my data alone. In reality your code will look more like this:-

'<<K:\ImportFiles\Shared\Photos\\' ||  Query.PHOTO || '>>'

This data can then be written out to a staging file, my file is formatted like this, note how the Photo field we are really interested in has been defined as a blob.

We then need to import the data, we thus create a new data flow looking much like this one using our new staging file type as the source.

The query component with regard to the Photo field itself simply maps the data together, in my instance I wanted to remove some blanks in another field which is why the query is even necessary and the data is then simply written into a blob field in the target database.

And that my friends, is mission accomplished…. This blog will self destruct in 10 seconds.


Lost in Time? Reach For The Time Tables


So….fairly recently whilst working on a regular clients data they expressed an interest in displaying times for use in their dimensional database. We are all too familiar with using Date tables in a dimensional manner but times are much less used, at least in my experience. It is with this in mind, and with an hour to kill due to your average ‘Microsoft 71% progress bar’ that I decided to blog exactly how I went about doing this as I was pleased with the results and the simplicity of the solution. Apologies, this solution is using an older version of SSIS but this methodology can be tailored for the ETL tool of your choice. First lets lake a look at the issue, we are all familiar with a date table where for each day of the year between x & y we represent each day with a single individual record. The table below shows how that would work starting from the 1st January 1900 (I’m nothing if not thorough!) Thus instead of using a date time column in our fact table to (for instance) detail a ‘Sail Date’ we can instead use the SID value to refer to a record within the DimCalendars table. The advantage of this is both speed and the fact that we can dice the date within this calendar table in many different ways (Week, Period,Month, Year, FinancialPeriods etc)


Thus we could also do the same with a DimCalendarTimesTable that would expose every second of the day from 00:00:00 to 23:59:59 thus allowing us to report on every single time that could potentially be encountered within a dataset. In our instance this data will look something like this starting from the top of the day:-



As you can see TimeToTheSecond reports every single second within the day (in order which can come in useful when dealing with push down in SAP Data Services… bygones). Time to the Minute only changes with every whole minute so that at 00:01:00.000000 is the value used for every minute from 00:01:00.000000 to  00:01:59.000000. This allows you to report on all events at say 13:53 without worrying about the seconds value. Taking this one stage further we also have a TimeToTheHourfield which only changes once every hour allowing us to report on every event in the 10:00pm window without caring what the second or minute values were. In addition we have two extra fields , SID being the unique identifier and TimeString being a special field that we join to in order to properly resolve the Foreign Key values in the first instance; the primary use of this field  is that it allows us to properly resolve NULL date values.


Below are displayed the tail of our DimCalendarTimes table along with that special null record. Note how the Time to the minute for all except the nulls are set to 23:59:00 and the TimeToTheHour is 23:00:00.


2017-04-10_1443You can of course dice this in many other ways (shift patterns for instance) as you can see,  but this is what we required. So how was this achieved and how complex… Well the answer is really not that complex at all. In fact it took just two Sql Statements:- The first to put in all of those times:-

INSERT INTO DimCalendarTimes (TimeToTheSecond,TimeToTheMinute,TImeToTheHour)
 CAST(DATEADD(second,Data.SecondsAfterMidNight,0) AS TIME) As UniqueSeconds, 
 CAST(DATEADD(minute,(Data.SecondsAfterMidNight/60),0) AS TIME) As UniqueMinutes, 
 CAST(DATEADD(hour,(Data.SecondsAfterMidNight/3600),0) AS TIME) As UniqueHours

(SELECT top (86400) 
 (ROW_NUMBER() OVER (ORDER BY t1.id)-1) AS SecondsAfterMidNight

FROM sys.sysobjects AS t1 cross join sys.sysobjects AS t2 ) AS Data

And the second to add that NULL value record

INSERT INTO DimCalendarTimes (TimeToTheSecond,TimeToTheMinute,TImeToTheHour)
 NULL As UniqueSeconds, 
 NULL As UniqueMinutes, 
 NULL As UniqueHours

Lets dissect that first SQL statement a little, we start off with the simple INSERT statement that determines which fields we are inserting:-

INSERT INTO DimCalendarTimes 

Nice and simple, we then need to generate these values which we do by generating one second for each in the day, there being 86,400 seconds in a day.  The 86400 seconds are generated as is usual in tally tables by cross going two large tables together (in SQL Server we generally use the sys objects tables) and generating a unique row number for each record.

  SELECT top (86400) (ROW_NUMBER() OVER (ORDER BY t1.id)-1) 
      AS SecondsAfterMidNight
  FROM sys.sysobjects AS t1 cross join sys.sysobjects AS t2 
) AS Data

This sql statement will simply yield 86400 rows numbered from 0 to 86,399. We then simply need to generate the time values using the following simple pieces of date addition:-

 CAST(DATEADD(second,Data.SecondsAfterMidNight,0) AS TIME) 
     As UniqueSeconds, 
 CAST(DATEADD(minute,(Data.SecondsAfterMidNight/60),0) AS TIME) 
     As UniqueMinutes, 
 CAST(DATEADD(hour,(Data.SecondsAfterMidNight/3600),0) AS TIME) 
     As UniqueHours

They first column simply adds the row count value (or second) to the seed value of 0 and casts the resultant value to a TIME object. This will generate a true time starting at 00:00:00 right through to 23:59:59. The second and third columns  do exactly the same except that they will divide the result by 60 and 3600 respectively to give the time to the minute or to the hour. And that really is the crux of it. We of course also generate our null value record as below to allow us to resolve data rows with no time data (dimensional databases should ALWAYS resolve to a foreign key for reasons of performance.)

Resolving the times in our staging data into the foreign key is thus as easy as the following pseudo SQL. This will deal with both valid times and also with NULL times:-

FROM (SELECT CAST(S.TimeValue AS nvarchar(30)) AS CastTime,* 
      FROM StagingData AS S 
) AS S
INNER JOIN (select SID,CAST(TimeString AS TIME) AS TimeString 
            FROM DimCalendarTimes
) AS D
       ON COALESCE(LTRIM(RTRIM(UPPER(S.CastTime))),'--') 
        = COALESCE(LTRIM(RTRIM(UPPER(D.TimeString))),'--')

Perfectly simple. I hope that this escape into the dimensions of time has been both useful and informative. For me, it has certainly proved a distraction… I hear that 76% is the new 71% and besides, I feel sure that ‘Windows moon on a stick’ will be worth the tediom.

Scheduling the Awesome.


Welcome to the final part of my primer on creating a production package using SAP Data Services. In part 1 we looked at just how you would configure your data service repositories to suit team development, in part 2 we dived in a little deeper and created our central repositories and in part 3 we looked at how you would interact with these repositories to create ‘Releases’ of software ready for testing and for the production environment.

All being well you should currently be at the stage where you have created a QA central repository where your candidate release code will be helped. You should also have a local production repository from which the production code can be executed (remember central repositories cannot be used to execute code) The previous blog should then have walked you through the process of synchronising your local repository with the candidate release ETL code ready for execution.

So what now…. Well, you can, of course, wake up really really early every morning, go to work and then run this package before your accounts department log onto the system to ensure that their reporting data is up to date before business opens, you may laugh….. Alternatively, you may do what the majority of normal people would do and set up a scheduled service to run the ETL on a given schedule. Let’s do this instead of the former for the purposes of this blog, maybe I’ll cover the other one in a later blog (although given my approach to early mornings don’t hold your breath!);-)

We start by logging onto the machine that contains the SAP CMC, generally but not always, this machine also contains a program called ‘Data Service Manager’ like so:-


Once you click on this the following webpage should be invoked, you should log on using the appropriate login for your installation:-


At this point, things might start looking a little familiar as we see for the second time at least the following screen…


Again we want to select the administrator option to interact with the service settings for the Data Services service application. Upon invoking this you should be presented with a tree view down the left-hand side with a Batch item, this contains all of the batch instructions for each configured LOCAL repository. In this instance we want to set up a schedule against our production repository where our candidate release code is located. We thus open the tree like so and select the ‘Production’ repository.


To the right-hand side of this tree view the following detail page should be displayed showing all of the latest package executions that have been invoked:- Yours may well be blank:-


We need to set up a batch job and so we need to select the ‘Batch Job Configuration’ tab as selected which will detail us a list of all of the ETL Packages that are available to run. In this instance we have only the one ‘bjSalesIntegration’. In order to set up a schedule we then select the ‘Add Schedule’ item as indicated.


Nice and simple so far, and indeed it remains so. We merely choose a name for the package and set it to ‘Active’ so that upon saving it is run. When it comes to selecting a scheduler I tend to choose the Data Services scheduler, no reasoning behind that, just why not! Now we come to the scheduling, this is relatively comprehensive in that you can choose:-

  1. Days of Week to perform the  ETL on.
  2. Days of Month to perform the ETL on.
  3. Execute once a day at a given time
  4. Execute from a given time of the day for the duration specified

In this instance we are running the package daily at 8:30.

In addition we can also specify to use a given ‘System Configuration’ which we have detailed in a previous blog.


Once you press apply the scheduled job should appear like so on the ‘Repository Schedules’ meaning that the scheduled batch job has been configured and may/may not be currently active.


And voila, you’re all set and ready to go. If you now look at the top batch job status for my configuration you will see that the last execution was successful (with warnings but I expect that) and started at approx 8:30 pm as requested executing for about 6 and a half minutes.


You have now scheduled the awesome, the world is yours…. Get up late, that’s an order!

Dogs, Guard Dogs & Toblerones



Tris in a pensive moment.

As is usually the case this blog post started life through a totally off topic conversation about whether stereotypes are always negative. My fiancee is currently taking a language degree and part of the course relates to  languages & culture. When at first challenged with this question initially you DO think that stereotypes are negative; then you start to delve just a little deeper and you start to realise they are not always. Moreover they are actually an essential part of life as well as being an innate component of the learning process. We are born pretty much a blank slate and slowly over time our brain fills with experiences, observations and nuggets of information, it is these that inform our world view. One of the most amazing things about the human being is our adaptability, it is innate within us and allows us to hit the ground running in any of the different cultural situations we find ourselves in. There are some medical conditions that from a very early age prevent us from assimilating within a social group but in the main we are all able to cope with whatever the crucible of life throws at us. It is my belief that the blank slate approach is critical, if we had some level of pre-programming that allowed us to ‘fit in’ culturally I believe that our adaptability would be severely compromised, culture is of course also constantly evolving. It is my belief that pre programming would hamper our learning model to a huge degree.

When I was about 2 or 3 years ago I was of course impossibly cute, but even then I had a liking for chocolate; the only difference was I had a metabolism that could offset my interest. The following short vignette is one of my earliest memories and in the past it has always perfectly summed up my second memorable experience of disappointment. I was in the bath (cue sniggers from the back row…) and left to my own devices I seem to recall (don’t judge… it was the 70’s, kids bounced really well back then). I remember that by the side of the bath towards the tap end was a triangular package, cream coloured and looking almost familiar to my dopamine receptors. I remember taking the package and opening it expecting to find a Toblerone instead finding some sort of hair care product (I think it was Clairol but I am not sure some 4o odd years later!). As I have already said until fairly recently I have in the past always associated this story with disappointment. I have now come to realise that there was also a critical part of the learning process  in play here, ‘Generalisation’ a form of stereotyping.

At that tender point in my life I had not got all of the tools that I needed to make a reliable informed guess as to what this object was but that didn’t stop 3 year old me from trying did it?  I had not, at this age, acquired the necessary linguistic skills to read the label; sure I may have been able to read the letters but I’m guessing that THE THING with a Toblerone to a 3 year old is that it is triangular…That alone at this early stage informed my choice. I also had no parent with me at that point in time to ask whether the sacred object was chocolate and more to the point… could I have some. Sure, there was probably a picture of a woman with nice hair on the wrapper but then again in our society, especially back then, packaging can often display attractive men/women to help  sell any product. The context of the bathroom also never came into my mind, I may have thought

"Chocolate?    In the bath?"

But then… as now, I can see no down side to that! The most important and informing thing was THAT triangular package. With hindsight my brain was performing a fairly fundamental skill inherent in our ability to learn, it was generalising and using ‘the available’ data to make an assumption about the world it found itself in… Preferably a chocolate rich one. That it was not, was a learning experience to me; In retrospect I would have learnt a few things:-

  1. Not all triangular packets are Toblerone.
  2. Toblerones are probably not found in the bathroom, unless you take them there.
  3. Toblerone wrappers probably don’t have women on them.

As with much that happened in earlier life after this point I have no idea what happened but it was certainly not an educational dead end. My generalisations with regard to confectionery would have been honed still further and over time as I encountered distinctive wrappers in a strange place I would have paid more heed to the context. The experience would have have further informed my world.


The Dogtor

This behaviour is not of course limited to humans, the animal world has exactly the same principles as anyone with a dog would testify too. Yes they can be happy as anything but their experience informs their world view. We have a beautiful chocolate labrador called Tris, slightly obsessed with water, walks and dragging towels around…. when there is no food at hand of course. If I am to don my trapper hat, and am not competing with <insert any food here>  she will quite correctly, in general, draw the conclusion that walking is in her immediate future. That is her world view informed by all the experience that living has taught her. Similarly a couple of the older girls are keen, as girls of a certain age are, to experiment with how they wear their hair. On a few occasions upon doing so and emerging from the black holes of their teenage universes’ Tris will bark at them confused by these strange people. For a split second she does not ‘know’ who they are. Her informed world does not contain teenage girls with mad hair, it just doesn’t, and for a split second she can’t read the faces either. Maybe they are pulling a slightly different face and maybe of course being a dog she just does not have quite the same complex ‘face software’ that we have.

In the same vein my Dad once had a big German Shepherd called General who was not a family dog at all. At around about the same time he also had a Jensen Interceptor which if you have never heard one has a very distinctive engine tone. This was sleepy Shropshire in the 1970’s and  7.2 Chrysler V8’s were in short supply. There were in fact at that time a grand total of two Jensens in Shrewsbury, my Dads being one and my ‘Uncle Jacks’ being the other. General thus became used to the engine tone of the car and as you could hear it from quite a distance above the general noise of the area he would know a long time before you arrived that you were coming. At some point General’s services were no longer required and so he was sold on to a local(ish) scrapyard. That’s where this story could have ended and the punchline for my story would never have existed…. However cars being cars, we use them. Some unspecified time later, my Dad had occasion to be in the area of this scrapyard late one night driving the Jensen, loving his shepherds as he does he drove slowly past the scrapyard looking to see if he could see General. The sound of the car spooked all of the dogs on guard that night with the result that they were all giving it some with the woof woof noise….. all except one that is, who just sat there with his head cocked to one side listening to the engine tone. It was of course General, he had generalised and correctly so on this occasion, that the engine tone heralded the return of his old master. With the available data  that HE had, he saw no threat, the other dogs obviously did not have this generalisation built in to their experience and so with out further data THEY could only surmise that a threat was imminent and unknown. Remember this was a scrapyard and so they must have heard cars of all sorts coming and going all the time, one  engine tone probably sounded like every other….. and more importantly  the engine tones were probably not attached to any important memories.

Generalisation is of course a thing in software too,  albeit slightly different and much more focused. We use interfaces to describe and encapsulate a common problem and to generalise behaviours, programmers then implement shared behaviours using abstractions and finally using a concrete class will define the specific behaviours of an entity. The core of the code will neither know nor care exactly how these concrete classes work, but it can generalise and rely on them adhering to a certain behavioural pattern defined by the base interfaces. Lets say for instance we have an IEmployee interface which our core system exposes, this interface could define one method called CalculateWageSlip which the system would call once a month in order to populate a WagesDue database table. External processes could then query this data table in order to generate the necessary banking records to effect salary payments. In essence this sounds simple, the wages preparation system iterates over all of the Employee Objects and instructs them to prepare their due salary. The system though here has generalised away the real complexities…

  1. ProductionEmployees may be paid according to how many hours they worked during the week and at what rates those hours were chargeable at.
  2. ExecutiveEmployees may be paid on Salary regardless of performance or actual hours worked
  3. SalesEmployees may be paid (in part) on results within a timeframe and according to complex commission structures which may vary according to length of service, sales levels and responsibilities.

In the above instance we  have three different ‘concrete’ employee types who are ‘generally’ all paid in the same way (by calling CalculateWageSlip). The logic within these three different classes would be very different though and each object type encapsulates its own rules. The point is the system has been able to generalise (through design) that employees are paid, I guess that Milton from Office Space would attract his own rules…. If you don’t know who Milton is, you’re probably not a techie.