Training Module 16: Effective Dated Lookups

The slides used in the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR.  This is Module 16, Effective Dated Lookups

Slide 2

Upon completion of this module, you should be able to:

  • Describe uses for effective-dated lookups
  • Read a Logic Table and Trace with effective-dated lookups
  • Explain the Function Codes used in the example
  • Debug lookups with effective dates

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

Many reporting applications perform date effective processing to:

  • Spot trends
  • Perform “What-if” analysis
  • Analyze changes in positions
  • Reconstruct historical perspectives for audit and other purposes

As you can see in this example, the report shows the number of customers by division by year.  SAFR’s date effective joins provide a powerful way to perform these types of functions.  Changes in reference data can be maintained to allow reports to be recreated as of a point in time.  This can be done in individual views, or a single view might include multiple “as of” dates.

Slide 5

We’ll expand our example by including a new table, the Customer Name table.  This table is keyed by the Customer ID.  The Customer system tracks changes to data via effective dates, to record changes to people’s names, perhaps when they get married.  To do so, it has an additional key to the table, the Effective Start Date.  Thus the same Customer ID can have two names, each effective for a specific period.

Slide 6

When using tables structured for SAFR effective dated joins, the same primary key value can have two different sets of attributes.  The effective start date is used to distinguish between the two values, meaning when each record became effective.  A start date of all zeroes means the record has always been effective; in other words it has been effective since the earliest possible “date”. 

In this example, two different start dates are used to search the reference table for customer 5.  The first is 2012, and so it returns the record which started in 1985.  Customer 5’s name in 2012 was Sally Rogers Smith. 

The second query is for 1984, and so the earlier record, for the earliest possible date, is returned.  Customer 5’s name from the earliest possible date until June 22, 1985 was Sally Rogers.  SAFR assumes that the effective “end” date of the first record is the start date of the next record.

In many applications, exact binary matches of keys require the data values to be exactly the same, including the effective date.  This can require many duplicate records to cover all possible dates.  SAFR doesn’t require this.  Instead, it “falls back” from the key date provided to the most recent effective-dated record.  This can significantly reduce the size of effective-dated reference files, which thus use less memory while still performing very high speed joins.
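
To make the fall-back search concrete, here is a small illustrative sketch in Python.  This is not SAFR code; the table contents and the bisect-based search are assumptions made for the example.

    import bisect

    # Reference table sorted by (customer ID, effective start date).
    # A start date of "0000-00-00" means the record has always been effective.
    table = [
        ("0000000005", "0000-00-00", "Sally Rogers"),
        ("0000000005", "1985-06-22", "Sally Rogers Smith"),
    ]
    keys = [(r[0], r[1]) for r in table]

    def effective_lookup(cust_id, key_date):
        # "Fall back" from the key date to the most recent effective record.
        i = bisect.bisect_right(keys, (cust_id, key_date))
        if i > 0 and keys[i - 1][0] == cust_id:
            return table[i - 1][2]        # found
        return None                       # not found

    print(effective_lookup("0000000005", "2012-01-01"))  # Sally Rogers Smith
    print(effective_lookup("0000000005", "1984-01-01"))  # Sally Rogers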

Slide 7

SAFR also provides the ability to use an effective End Date when an explicit end date is stored on each record.  This is useful when the effective “end” date of one record should not simply be assumed to be the start date of the next record.  This functionality allows for gaps during which no effective-dated record is in force.

When using end date processing, SAFR starts with the same “fall back” logic to find the earliest start dated record.  It then performs one more test, checking to see if the date provided is also less than the end date.  If it is not, SAFR returns a not found condition.

In this example we repeat the same search for customer 5’s name.  The 2012 search returns the same results because the end date of 9999-99-99 is greater than 2012; in fact, it is the highest possible date.  This record is effective forever. 

The 1984 search returns a Not Found condition, because the effective date of 1984 is between the 1983 end date of the first record, and the 1985 start date of the next record.

This same functionality in other tools can consume a lot of CPU resource testing for valid date ranges.  SAFR effective date search is very efficient, on the order of 2 additional CPU instructions for each effective date test.
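
Under end-date processing, the sketch shown earlier gains exactly one additional comparison.  Again, this is an illustration only; the specific end dates are assumptions for the example.

    import bisect

    # Each record now carries an explicit effective end date.
    table = [
        ("0000000005", "0000-00-00", "1983-01-01", "Sally Rogers"),
        ("0000000005", "1985-06-22", "9999-99-99", "Sally Rogers Smith"),
    ]
    keys = [(r[0], r[1]) for r in table]

    def effective_lookup_with_end(cust_id, key_date):
        i = bisect.bisect_right(keys, (cust_id, key_date))
        if i > 0 and keys[i - 1][0] == cust_id:
            record = table[i - 1]
            if key_date < record[2]:      # the one additional end-date test
                return record[3]          # found
        return None                       # not found: the key date falls in a gap

    print(effective_lookup_with_end("0000000005", "2012-01-01"))  # Sally Rogers Smith
    print(effective_lookup_with_end("0000000005", "1984-01-01"))  # None (gap)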

Slide 8

The Effective Dates, both Start and End, are defined on the Logical Record.  There can be only one Start and one End effective date on each LR.  Both parameters are optional, but an End Date requires a Start Date be defined.  They can be located anywhere on the LR. 

In this example, the Customer Name LR has both a Start and an End date defined. 

Slide 9

The Start Date and End Date parameters should not be placed on the Lookup Path, and do not affect how the lookup path is constructed.  Fields listed in the lookup path are used for an exact binary match.  Because the Start and End dates are not exact binary matches, they are not placed on the Lookup Path.  However, remember the start and end dates are marked as effective dates on the LR, instructing SAFR to use them during processing.

In this example, only the field to be used to search the 10 byte Customer ID is defined in the Lookup Path.

Slide 10

Use of a Start Date, or Start and End Dates, affects the Join Phase processing.  It is not possible to use just an End Date.  The values in these fields are added to the RED Core Image file at the end of the LR-defined key data.  All effective dates are converted to binary, and SAFR will automatically convert all dates to the same content code of CCYYMMDD regardless of the source data format.  This is done to reduce the amount of memory they use, and to ensure the effective date comparisons test years, then months, then days.  (Note that even though dates are converted to binary format, no exact match of dates is required.)

Note that use of a Start Date parameter is the only case where duplicate keys are allowed in a SAFR Join Reference file.  In this example, although there are two records for Customer 5, the start date ensures uniqueness in the search results.  

Like all other RED Core Image files, the file must be in sorted order by (1) the key, such as the Customer ID in this example, (2) the start date, and (3) the end date, if used. 

Slide 11

The Effective date used in the search key can come from three different places. 

  • A constant placed in the view
  • A constant fed to Performance Engine programs in the job stream, or
  • The value from a field on the Source Event File

Each of these will be discussed in the following slides.

Slide 12

A constant “as-of” effective date value can be hard coded in the view.  To do so, place it immediately below the path in the column parameters.  If used, the same value will be used for all lookups for that column.  The value would have to be updated manually if the view is reused, for example, to report processing results at the end of every month. 

In this example all customer names as of 2010-01-01 would be extracted in column 2. 

This approach is effective for (1) unit testing views, (2) one-time views that produce a particular report for analysis, or (3) future-dated views used for what-if analysis of reorganizations.

Slide 13

A more frequently used option is to use the RUNDATE parameter.  The RUNDATE can be set in the GVBMR86 step of the Selection Job of the Performance Engine.  If the parameter is not set, GVBMR86 will default to the system date.  Thus each day a new VDP is created, the RUNDATE will be advanced to the current date. 

In our example then, although the view might have been created in 2010, if GVBMR86 is run on 2012-10-03, all customer names as of that date will be extracted by our example view. 

Often, rather than using the system date, which changes at midnight, small programs are created to pass GVBMR86 a “fiscal date” of some predetermined kind.  A fiscal date allows all views running at the same time to use the same date.  Also, if a set of views needs to be re-run on another date, they can still be “as of” the required date.  For example, year-end reports for December 31 can be run even though the system date is January 1st.
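
As a sketch of the kind of small program described here, the helper below derives an end-of-prior-month fiscal date.  The period-end convention and the mechanism for passing the value to GVBMR86 are assumptions, since both are site-specific.

    import datetime

    def fiscal_date(today=None):
        # Assumed convention: the fiscal date is the last day of the prior
        # month, i.e. the first of the current month minus one day.
        today = today or datetime.date.today()
        return today.replace(day=1) - datetime.timedelta(days=1)

    # Run on January 1st, the views are still "as of" December 31:
    print(fiscal_date(datetime.date(2013, 1, 1)))  # 2012-12-31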

Slide 14

A field on the source Event File can also be used as an effective date.  This is similar to using it in a lookup path, although it is not placed in the path and exact match of values is not required.  If used for the date effective join, the date used can vary record by record, unlike the other two date constants described.

In this example, the Date Became Customer field is used to find the customer’s name when they became a customer.  The report is not as of 2010, or 2012, but varies customer by customer depending upon the date stored in the customer Event file.

Each use of a look-up path containing an effective date in a view requires specifying which date should be used for that join, either a constant, the run date, or from a field value.  Thus a single view can use all three sources for different joins.

Next, we’ll examine the Logic Table generated by these different options.

Slide 15

The View Constant and Run Date effective joins use an LKDC Logic Table Function Code.  Remember LK functions build look-up keys prior to performing searches of core image files.  LKD functions build lookup keys containing dates for effective dates.  LKDC builds a lookup Effective Date from a Constant. 

The only difference between the View Constant and Run Date is the source of the constant.  The View Constant is the value placed in the view itself.  The Run Date constant is placed there by GVBMR86 at run time.  One cannot detect which source was used for the effective date by examining the logic table.  In this example, the View constant is 2010-01-01.  The Run Date constant is 2014-04-04.

Because all effective dates are converted to a four-byte binary field (an FM type 6 on the logic table report) which can potentially hold 10 digits, an additional “00” is appended to the end of the date constants for the join processing; this does not affect the source or target date.
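
The conversion can be pictured with the small sketch below.  It is illustrative only; the big-endian unsigned representation is an assumption rather than a statement of SAFR’s internal format.

    import struct

    def pack_effective_date(ccyy_mm_dd):             # e.g. "2010-01-01"
        digits = ccyy_mm_dd.replace("-", "") + "00"  # "2010010100", 10 digits
        return struct.pack(">I", int(digits))        # four bytes, byte-comparable

    # Year, then month, then day digit order means a plain byte compare
    # also orders the dates correctly:
    assert pack_effective_date("2010-01-01") < pack_effective_date("2014-04-04")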

Slide 16

The view using a source Event File Field for the date effective join uses an LKDE Logic Table Function Code, which builds a Lookup Key effective Date using the value from a source Event File Field. Rather than seeing a specific date in the logic table, the value to be used will be moved from a field. This is like an LKE function, but used for an effective date.

In our Event File Field example the field ID 68, DATE_BECAME_CUSTOMER, will be moved to the Lookup Key Effective date position.

Slide 17

As noted earlier, all these dates are stored in a four-byte binary field (“FM” Format code 6), with a CCYYMMDD content (“CN” Content code 3).  The constants are put into these formats in the logic table to save CPU cycles at run time, thus they are listed with these formats as SOURCE.  The date from the Event File field will be converted to this format as the data is read during execution.

Slide 18

Next we’ll trace records through these logic tables.  We’ll use this reference file as the target of our join, searching against customer 1, Michael Clark.  We’ll show what the results of the date effective search would be for various date combinations.

Slide 19

The Logic Table Trace for one record against all views is shown here.  Each view executes an LKE to move the Customer ID “1” to the lookup key.  Then each executes the effective date build, either an LKDC or an LKDE.  The LUSM lookup function is performed, the customer last name is moved to the extract file through a DTL Data Lookup function, and the record is then written using the WRDT function code, which writes the extract record data area. 

We’ll examine the details of each on the following slides.

Slide 20

Let’s examine the LKDC and LKDE functions more closely, and what effective date is used in each. 

At the bottom of the screen are shown the Effective Start Date for the target reference file record, 1961-03-15, and its End Date, 9999-99-99.  Since all three LKD effective dates, 2010 for the View Constant, 2014 for the Run Date, and 1999 for the Event File Field value, are within this date range, the results for all three views are found conditions.

Note that the value for the LKDE is not shown in binary in the trace because the source data is not binary.  Conversion to binary takes place as part of this logic table function processing. 

Slide 21

In this example, we’ve changed the reference file record effective date to start in 2011 rather than 1961. 

  • The View constant compares the LKDC 2010, which is less than the start date of 2011, resulting in a Not Found condition
  • The Run Date compares the LKDC 2014 to the start date of 2011, finds the reference file record is effective, and so results in a Found condition
  • The LKDE Field value contains 1999, which is less than the start date of 2011, and thus results in a Not Found condition. 

Thus views 1 and 3 use the DTC function code to move a Data Constant of spaces to the extract file. 

Slide 22

In this example, we’ve changed the reference file effective END date (not the START date) from the far future, 9999-99-99, to a date in the past, 2000-01-01. 

  • The View constant compares the LKDC 2010, which is greater than the end date of 2000, resulting in a Not Found condition
  • The Run Date compares the LKDC 2014 to the end date of 2000, finds the reference file record is no longer effective, and so results in a Not Found condition
  • The LKDE Field value contains 1999, which is less than the end date of 2000, so the reference file record is still effective, and thus results in a Found condition. 

Thus views 1 and 2 use the DTC function code to move a Data Constant of spaces to the extract file. 

Slide 23

In this example, we’ve added another reference file record for customer 1, reflecting a name change to CLARK-SMITH.  However, there is a gap between the two records.  One still ends in 2000, but the new one does not start until 2011.  The view still requires an “as of” date of 2010-01-01.

There is no change to the View Constant process.  The lookup key is primed with customer 1 and an LKDC effective date of 2010.  The search finds the CLARK record because the new record doesn’t start until 2011.  After finding this record, the program compares the view’s LKDC effective date constant of 2010, which is greater than the end date of 2000, resulting in a Not Found condition.

Slide 24

The Run Date view primes the key with customer 1 and an LKDC effective date of 2014.  Therefore, the binary search returns the second record with the effective start date of 2011.  The performance engine then tests the LKDC effective date of 2014 with the end date of 9999, and results in a Found condition.  The customer name of CLARK-SMITH is used in the output column.

Slide 25

The Field value view primes the key with customer 1 and a LKDE field value of 1999.  The search returns the first reference file record with its start date of 1961.  A comparison of the end date of 2000 to 1999 shows the record is still effective and thus results in a Found condition.  The view uses the last name Clark to build the output record.

Slide 26

In this example we’ve modified the Customer Name Logical Record so there is no Effective End Date field.  If no end date field is defined on the LR, the start date of the next record is the implicit end date of the prior record.

  • The View Constant searches with an LKDC constant of 2010, which is less than the 2011 of the 2nd record, but greater than the 1961 of the first record.  Thus the 1961 record is effective.  It uses the last name of “Clark” to build the output record
  • The Run Date view searches with an LKDC value of 2014 date which returns the 2nd record with a start date of 2011.  Because there is no explicit End Date, this record is assumed to be effective forever.  The view uses the “Clark-Smith” value to build the output record
  • The Field Value view with a LKDE date of 1999 has the same results as the View Constant.

Thus all three views have found conditions.

Slide 27

This Trace shows Event File records beyond the first record for the Event File Field view.  Note that the trace shows that the Date used in the LKDE Function Code changes with each event file record, whereas the constants used in LKDC would remain the same throughout a run.  The relationship between these dates might be summarized as:

  • The View Constant date is always the same, even across different executions of the Performance Engine, unless the view is manually updated.
  • The Run Date constant is the same within an execution of the Performance Engine, but can change between executions depending on the system date or the parameters passed to GVBMR86.
  • The Event File Field date can vary within an execution of the Performance Engine, as the dates within the source Event File change.

Slide 28

The following are a few of the more common reasons why you may encounter an error in effective date processing, and possible approaches to each:

  • Effective dates should not be included in the lookup path.  If they are, the dates will be moved by LKE functions instead of LKDE or LKDC functions, requiring an exact match, not an effective date “match”.
  • Because the source for an LKDE is a field specified in a view rather than on the lookup path, it is possible to use different LR fields, resulting in different output results per view when similar results were expected. 
  • Improper content code definition for LR dates, either the Join or the Event LR, such as MM-DD-YY, will result in not found conditions.
  • Views can be hardcoded with effective dates, rather than using the Run Date parameter or a Field value.
  • Because each use of a look-up path containing an effective date in a view requires specifying the source of that date, inconsistent use of constants, the run date, or a field value can produce unexpected results. 
  • Improper parameters passed to GVBMR86 can set an incorrect Run Date.

Slide 29

This module has introduced the following Logic Table Function Codes:

  • LKDC, builds a date effective lookup key from a constant
  • LKDE, builds a date effective lookup key from the source Event record

The following function code was not shown in an example:

  • LKDL, builds a date effective lookup key from a looked up record

Slide 30

This module described effective dated lookups. Now that you have completed this module, you should be able to:

  • Describe uses for effective-dated lookups
  • Read a Logic Table and Trace with effective-dated lookups
  • Explain the Function Codes used in the example
  • Debug lookups with effective dates

Slide 31

Additional information about SAFR is available at the web addresses shown here. This concludes Module 16, Effective Dated Lookups

Slide 32

Training Module 15: Lookups using Constants and Symbolics

The slides used in this video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR.  This is Module 15, Lookups using Constants and Symbolics

Slide 2

Upon completion of this module, you should be able to:

  • Read a Logic Table and Trace with constants and symbolics in lookup key fields
  • Explain the Function Codes used in the example
  • Debug lookups with symbolics
  • Use selected Logic Table Trace Parameters

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

Most lookups use key values provided by other LRs, starting with the Source Event LR, and the values in the source and target fields are the same.  In some cases, data values in the input file and those in the reference file cannot simply be matched for a join.  Yet through the use of logic, the relationship can be programmed.  The person building the join and view must know the data to properly use these types of joins.

If, in our example, we did not have a Customer Table to give us the customer’s gender code, the value used to find the Gender would need to be provided through the view in some way. 

This can be accomplished in two ways: 

  • A Constant can be added to the Lookup Path when defined.  The view cannot change the constant
  • A Symbolic can be defined in the Lookup Path.  The value of the symbolic is assigned in the view

In either case, the value remains the same throughout the processing of the view.

Slide 5

We’ll use a similar example to the prior module, where we wish to output the Order ID, Customer ID and the Customer Gender Code.  In our example, we won’t change either the input or output data, so a typical join could be used.  We’ll show how a constant could be used if either the source field or the target field contained different values.

Slide 6

Both types of joins still require the source and target logical records to be specified, even when no fields from the source are used in the search.  Constants and symbolics can be used in combination with fields to perform a join, or they can be the only value to search the target reference file.  They can also be used in any step of a join.

A Constant can be used if the same value should be used in every search of the reference file.  The value to be used in the search for a matching record is placed directly in the Look-up Path definition.  The data type format must also be specified correctly to match the key data of the target reference file.

For our example we have defined this lookup path to use a constant of “M” for the Gender Code key.  We will also use another lookup path with a constant of “F”, which is not shown.

Slide 7

A symbolic join is also defined in the lookup path.  But unlike the constant join, the value to be used in the search is not yet defined.  Rather, the symbolic is a key word that will be used in the view to assign what value should be used to search the target reference table.  In other words, the symbolic is a variable that is resolved using logic text in the view. 

Constants, symbolics and LR fields can be used together in the same join in any order or any combination.

In our example, we’ve defined a symbolic of GENCD for the Gender Code which we will assign in Logic Text.

Slide 8

Lookup paths that contain constants or symbolics are used like any other paths in views.  In both our example views we will use them in logic text.

We’ll use view 129 to demonstrate how paths with embedded constants operate.  In Column 3 of this view, our logic text will first look up to the Customer table using the Customer ID to find the Gender Code of either “M” or “F”.  Instead of using this as the key to the Gender Description table, we will use our new constant paths.

Slide 9

Symbolic views require logic text in order to assign the symbolic value.  The symbol (or variable) defined in the path is assigned a value within the logic.  This is done by using the path name, followed by a semicolon, a dollar sign, the symbol name, an equal sign, and then an assigned value as a constant.  Multiple symbolics can be assigned values one right after another.

Our example view 130 shows how symbolics work.  As shown, the path name is used, followed by a semicolon, a dollar sign and the symbolic name “$GENCD”, an equal sign and a constant.  Notice that instead of requiring two paths on the prior view, one for “M” and the other for “F”, we now are able to use a single path for both conditions.

Slide 10

The Constant and Symbolic functions both populate a lookup key using a constant value rather than a value from an input file field, whether from the source Event File or from a looked-up record.

A Constant value is hardcoded in the Lookup Definition, which in turn simplifies view definition with no logic text, but requires more lookup definitions.

A Symbolic allows many possible constant values to be assigned in Logic Text through a symbolic variable, which means fewer total components with one lookup definition, but each use requires logic text for the assignment.

Slide 11

Constant and Symbolic joins do not affect the JLT process in any way.  Core image reference files are produced as they would be for any other join.  Constant and symbolic joins only affect the XLT, and as we will see, in very small ways.

Slide 12

This is the complete logic table for the constant view, view 129.   The view has no extract record filtering.  Thus it is composed almost entirely of column dependent logic table rows.  Column 1 requires one row, a DTE to populate the Order ID.  Column 2 requires logic table rows 5 – 10, a join to the Customer Table to obtain the Customer ID.  Column 3 requires rows 11 through 33 to populate the Customer Gender Code using our lookup paths with embedded constants.

Slide 13

This is the complete logic table for the symbolic view, view 130.   It has by and large the same structure as the constant view. One difference is the LT row numbers differ because view 130 is executed in the same Performance Engine run as view 129.  Its LT rows are therefore after those of view 129.

We’ll focus our attention in this module on Column 3, and compare and contrast the differences between these two views and their different paths.  For this logic table row comparison, we’ll change view 130 row numbers to be consistent with view 129.

Slide 14

Inside column 3 logic, both views perform the same lookup to the Customer file to determine the Gender of the Customer.  The first test is for “M”, and a second test for “F”.  Both views also have the same patterned DTL and DTC functions for populating the column based upon lookup results.  When we eliminate all these rows of the logic table for our comparison, we are down to very few rows to analyze.

Slide 15

Lookup paths containing Constants have an LKC, Lookup Constant, Logic Table Function.  This function code, similar to the LKE and LKL, builds a lookup key value to be used when searching a core image file.  The LKC simply places a constant in the key. 

In our view 129 example two LKCs are used, one for each join required by the COLUMN = keyword.  One assigns a value of “M” to the key, the other a value of “F”.

Symbolic paths contain LKS Lookup Symbolic Logic Table Functions.  This function performs the same operation as the LKC, placing a constant in the lookup key before searching a core image file. 

Example view 130 similarly has two LKS functions, one assigning an “M” and the other an “F” before the respective LUSM functions.

Slide 16

Before examining the trace results, let’s look at the Extract Engine GVBMR95 control report to see the results of running these views.

Both views have written 12 output records.  But the found and not found counts are not the same.  Remember that Found and Not Found counts are the results of LUSM functions, and join optimization might simply cause view 130 to perform fewer of these functions. 

Both views wrote their output to the same output file, DD Name OUTPUT06.  Thus records from view 129 are interspersed with records from view 130.  If we examine the file carefully, it appears that each record is essentially duplicated, which is what we would expect because the views create effectively the same output in slightly different ways.  And we can see that there appear to be two records, one for each view, which do not have a Customer ID or Gender Code.

Lastly, because we were looking up to the Gender Description table, we would have expected to see “MALE” and “FEMALE” on the records, not the “M” and “F” displayed.  Let’s use the Logic Table Trace to examine these potential issues.

Slide 17

In the Introduction to Trace Facilities we briefly discussed the additional parameters available to control what is traced.  We’ll use those parameters highlighted in blue to investigate our view results. 

Although multiple trace parameters can be used at one time, only one global parameter is permitted.  The global parameter is the one with no VIEW keyword on it.  Besides the one global parameter, all trace parameter rows require a VIEW parameter or they will be ignored in trace processing.

In the top example, the global parameter TRACEINPUT is used, and then two views each have individual trace parameters. 

In the bottom example, only a global parameter is used which will affect all rows in the logic table for all views. 

Note that an asterisk in the first position of the MR95PARMs marks a comment, which can be useful for saving prior trace parameters.
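
Putting the keywords from this slide together, a parameter set might look something like the following.  The exact delimiter syntax here is an assumption for illustration only, and should be checked against the MR95PARM documentation.

    * prior parameters saved as comments (asterisk in the first position)
    * VIEW=130,LTFUNC=DTL
    TRACEINPUT=ORDER001
    VIEW=129,LTFUNC=LUSM
    VIEW=130,LTFUNC=LUSM

The first uncommented line is the single global parameter, with no VIEW keyword; the two VIEW lines are per-view parameters.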

Slide 18

The TRACEINPUT parameter causes GVBMR95 to print the source Event File record.  The parameter is followed by the DD Name of the Event File to be traced.  In the Trace output the DD name is followed by the record number being printed.

The headings for the Trace Report have been removed in this example to make the printout clearer, and because of the report width it is shown in two parts. 

The input record is printed in two forms, as HEX characters which can allow viewing of things like packed and binary numbers, and as display characters which are easier for reading text. 

The SEQ NUM column before the hex characters contains an index to the character positions of the printed hex record.  In this example, the first hex value of “F0F0F0F0” begins at offset “000000”.  On the next row, the hex value of “00000582” begins at offset 000020 from the start of the record.  Note that hex 20 is decimal 32, so this value begins 32 bytes from the beginning of the record. 

Slide 19

The Trace View Parameter indicates which view should be traced.  In this example, only view 129 will be traced.  Notice that the trace output does not include any trace rows from view 130 even though it was contained in the same logic table and was processed by the Extract Engine at the same time.  This is a simple way of reducing the trace output to a single view of interest.

Slide 20

Another useful keyword is the LTFUNC parameter. 

Multiple VIEW parameters can be used at the same time. In this example, all LUSM functions for both views are shown.  Because we are looking for the same function codes in all the views in the logic table, the same thing could have been accomplished by coding the LTFUNC parameter on the Global Trace Parameter line.

Notice on this trace that for the last two records in the input file, records 11 and 12, only view 129 performed any LUSM functions.  This might explain the differences between the Found and Not Found lookup counts on the GVBMR95 Control Report.

Slide 21

Multiple trace parameters can be used together with different sub parameters as well.

  • In this example the orange boxes highlight the Global parameter request for a print of the input record from the ORDER001 DD Name
  • The green boxes show the VIEW=129 parameters print only the LUSM function code rows on input records 11 and 12 and the output.
  • The blue boxes highlight the VIEW=130 parameters requesting the trace to print all LT function codes from Logic Table Rows 54 to 65, on input records 11 and 12.

The output from this last set of parameters shows that only the JOIN function code of LT Row 54 and the DTC of LT Row 65 were executed.  The LUSM and DTL function codes that also lie in this section of the Logic Table were not executed.  Thus we can confirm that, because the LUSM was not executed, the Found and Not Found counts between view 129 and view 130 should be different.

Slide 22

When we inspected the output, we noted it only contained the letters “M” and “F” at the end of the file, not the full reference file values of “MALE” and “FEMALE”.  If we set our trace parameters to trace only the DTL functions associated with the last column shown in this example, the trace output shows that Value 1 for the DTL function code is a 20-byte source value of “MALE” or “FEMALE” from the full reference file field.  However, the target length is only 1 byte.  The field has been truncated because the view definition was coded incorrectly.  Adjusting the view column width and re-running the view should correct the error.

Slide 23

The following are descriptions of other trace parameters we have not examined in detail in this module.

  • The DDNAME parameter will trace only the specified input file.  While TRACEINPUT shows the input records for a DD Name, this parameter shows only the executed logic table rows against that file.
  • The VPOS, VLEN, and VALUE parameters trace only when the data at the given position and length on the source record equals a specific value.  These parameters can consume a great deal of CPU resources against very large files.  Use them with care.
  • LTABEND will cause GVBMR95 to produce a dump for debugging if it executes a specific logic table row.  This may be useful when debugging user exits which we’ll discuss in a later module.
  • And MSGABEND will cause MR95 to abend if it produces a specific error number, such as an 0C7 data exception.

Slide 24

This module has introduced the following Logic Table Function Codes:

  • LKC, builds a look up key using a constant
  • LKS, builds a look up key using a symbolic

Slide 25

This module described lookups with constants and symbolics.  Now that you have completed this module, you should be able to:

  • Read a Logic Table and Trace with constants and symbolics in lookup key fields
  • Explain the Function Codes used in the example
  • Debug lookups with symbolics
  • Use selected Logic Table Trace Parameters

Slide 26

Additional information about SAFR is available at the web addresses shown here. This concludes Module 15, Lookups using Constants and Symbolics

Slide 27

Training Module 14: Multi-Step Lookups

The slides for this video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR.  This is Module 14, Multi-Step Lookups

Slide 2

Upon completion of this module, you should be able to:

  • Describe a multi-step lookup
  • Read a Logic Table and Trace with multi-step lookups
  • Explain the Function Codes used in the example
  • Identify lookup and other patterns in a larger logic table
  • Debug a multi-step lookup not-found condition

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

In this module, we will extend the example used in the prior module, where we joined from the Order file to the Customer file to find Customer related data, including the Customer Gender Code, either M or F.  In this module, we’ll use the Gender Code to look up to the Gender Description table to find a description of the M or F, either Male or Female.

SAFR Lookup Definitions specify the full “path” or route from the source data to the final target data.  In the last module we used the “Customer from Order” Lookup Path.  In this module we’ll use the “Gender Description from Order” Lookup Path to go through the Customer table to find the Gender Description.

Slide 5

Creating a multi-step look up requires using the “Steps” portion of the workbench.  Each step allows definition of a new target table.  All fields from prior steps can be used as sources for key values with which to search these tables. 

Care should be taken when building paths with respect to required data format conversions.  All SAFR joins use binary searches, meaning the key is matched byte for byte against a potential target record.  Joins on numeric keys can fail if the source field formats do not match the target formats.  SAFR will perform the necessary data conversions in the logic table when building keys, converting field formats if they are specified properly in the Look-up Path definition. 
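
To see why a byte-for-byte match makes format conversion matter, consider the value 123 in two standard mainframe encodings.  This is an illustration in Python outside SAFR; the byte values are the standard EBCDIC zoned and packed decimal forms.

    # The same number, 123, in two EBCDIC numeric encodings:
    zoned  = bytes([0xF1, 0xF2, 0xC3])   # zoned decimal; C in the last byte is the positive sign
    packed = bytes([0x12, 0x3C])         # packed decimal; the trailing C is the positive sign

    # A byte-for-byte comparison of the two will never match:
    print(zoned == packed)               # False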

Slide 6

Our example lookup View reads the Order logical record as the source event file.  It has three columns, containing the:

  • Order ID from the source event file
  • Customer ID from the Customer Logical Record
  • The Customer Gender Description from the Gender Description Logical Record

The lookup to the Customer Logical Record requires the Customer ID from the source event file, a single step lookup.  The Gender Description requires the Gender Code from the Customer logical record, thus creating a multi-step lookup.

Slide 7

The view also contains logic, both Extract Record Filtering and column logic.

The Extract Record Filtering logic selects records for customers with a Gender type of “M” and orders where the Order Date is later than the new customer date, or orders which have no associated customer record.  Like column 2, this logic only requires a single step lookup.

The column logic, for column 3, inserts the Customer Gender description either “MALE” or “FEMALE” if found, and the value “UNKNOWN” for those orders with no associated customer record.  This logic requires a multi-step lookup.

Slide 8

GVBMR90 recognizes that the view requires two reference tables because it uses the “Gender Description from Order” Lookup Path.  It then generates additional Extract Only views in the JLT, creating another Core Image file by reading and extracting data from the Gender Description file, including the 1-byte Gender Code key and the 20-byte Gender Description. 

The REH file also will contain two header records with counts and sizes of reference data for the Customer Reference table, and the Gender Description table.  Both these Core Image files will be loaded into memory during the Extract phase processing.

Note that the standard Performance Engine Process does not sort the Core Image files, but these files must be in key sequence before being loaded into memory or an error will occur.  Therefore these files should be sorted prior to beginning the Join Phase process.

Slide 9

This is the complete logic table.    Note that this single view requires 38 logic table rows (40 rows less the HD and RENX rows which are shared by all views in this logic table). Logic tables for complex views can be hundreds of rows in length.  Determining which part of a view generated the logic table rows can be important. 

Rows 4 – 14 are generated from the extract record filter logic.   These rows have no sequence numbers.  Rows 15 – 38 have column sequence numbers assigned.  They contain column logic and build column values.

We’ll examine each of these portions individually.

Slide 10

When the same look up path is used multiple times within extract record filter or a column, the lookup path steps are duplicated multiple times in the logic table.  Join optimization may skip rows, but the logic table still contains them. 

In our example, the Extract Record Filter logic requires three lookups to the Customer LR.  Each is accomplished using the same Lookup Path.  Thus each has the same Join, LKE and LUSM functions and the target Customer LR ID is 7.

  • Rows 4 – 6 test for “M” type customer Gender Codes, generating a lookup to the Customer LR.
  • Rows 8 – 10 test the dates.
  • Rows 12 – 14 test for not-found customers in the OR condition.

Slide 11

Extract Filtering Lookups are often followed by CF functions to test for inclusion or exclusion of records.  In our example, the lookups for clauses 1 and 2 are each followed by Compare Field functions. 

  • The test for an “M” requires a CFLC function, Compare Field Looked-up value to Constant.  The CFLC compares the looked-up Customer Gender Code to the constant of “M”.  If the test is true (the field equals “M”), the CFLC GOTO Row 1 continues the test on row 8 for the AND condition.  If the test is false, it branches to row 12 to begin the OR test for a Not Found condition.
  • Testing the dates requires a CFEL function, Compare Field Event file field to Looked-up field value.  The comparison type is a “003”, meaning a greater-than test.  If this test proves true, the record should be selected; there is no need to test the OR Not Found condition.  Thus GOTO Row 1 jumps to the row 15 DTE function to begin building columns.  If the date test proves false, then the OR condition is tested, beginning on row 12.
  • Although not shown in this example, the CFEE and CFLL functions are similar in format to the CFEL.  A CFEE compares two source Event File fields, for equal, not equal, greater than or less than, etc.  A CFLL compares two looked-up field values.

Slide 12

The NOT FOUND keyword of the filter logic clause 3 also determines if records are selected for processing or not.  However, it does not generate a Compare Field function.  Rather the Found and Not Found rows of the LUSM cause the selection.  In other words, the true and false rows typically placed on a CF function are placed right on the LUSM (and the Join in case Join Optimization occurs).

In this example, if the LUSM test on row 14 finds a corresponding customer record in the lookup, then the record will be skipped by jumping to logic table row 40, the ES function.  If a corresponding customer record is not found (GOTO Row 2), then the column build processing will begin on logic table row 15.

If instead of the NOT FOUND keyword, FOUND had been used, the GOTO Rows 1 and 2 on the LUSM function would have been switched.

Slide 13

Being able to identify GOTO Row patterns typical of AND and OR logic can help in debugging. 

In this example, the repeated GOTO Row 2 value of 12, pointing to a CF function, is indicative of an OR condition. 

The nearly sequential nature of GOTO Row 1 for clauses 1 and 2 are indicative of an AND condition because both must be satisfied before logic can jump from logic table row 11 to 15 to build the columns. 

Slide 14

Column logic can also be broken into segments by looking at the pattern of sequence numbers.  In this example, the three columns of our view require three different types of logic to be completed. 

  • Column 1, the Order ID from the source event file, only requires a DTE function.
  • Column 2, the Customer ID from the Customer Logical Record, requires a single step lookup, followed by DTL and DTC functions.
  • Column 3, the Customer Gender Description from the Gender Description Logical Record, requires a multiple step look up.  We’ll focus our attention here.

Slide 15

Even the logic for a single column can be broken down into its individual parts. 

In this example, Column 3 logic can be divided by the clauses of the logic text: 

  • The Found test to find a matching Gender Description record
  • Locating the actual Gender Description for use in the extract file
  • Or placing the value “UNKNOWN” in the column if the Gender Description is not found

Next, let’s focus on the logic for clause A.

Slide 16

Patterns of single step joins can often be found as pieces of multi-step joins.  In this example the single step join from column 2 is nearly identical to the first part of our multi-step join, which also joins the Order LR to the Customer LR.  They have nearly identical JOIN, LKE and LUSM functions.  When debugging joins, the correct Lookup path must be identified.

The one significant difference is that the GOTO ROW 1 and 2 are substantially larger on the multi-step join, 6 to 16 steps away as opposed to 3 to 5 steps.  This is because column 3’s multi-step Join covers the entire join, not just the first step.  During Join Optimization, if the join has been performed before, these GOTO ROW 1 and 2 values are used immediately.

Slide 17

The second and subsequent steps of multi-step joins all begin with LKLR functions, Prepare a Lookup to an LR.  This, like the JOIN function code, identifies which core image file in memory to search.  In our example, the next step in the join goes from the Customer LR, ID 7, to the Gender Description table, LR ID 63.

Often a multi-step join includes an LKL function, build a Lookup Key from a Looked Up value.  This moves a value from a looked up field to the search key for the core image file.  In this example, the value in Gender Code field ID 478 on the Customer table will be used to search the Gender Description table for a value, an “M” or an “F” depending upon the customer.

Our example target reference file includes only a single field for the key, the “M” or “F” values.  Multi-key tables require multiple LKE and LKL functions, one right after the other in the logic table.  These build the keys for binary searches from left to right.
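
The chain of steps can be pictured with the small sketch below.  Here simple dictionary lookups stand in for SAFR’s binary searches, and the data values are assumptions made for the example.

    # Step-one target: the Customer table, keyed by Customer ID.
    customers = {"0000000001": {"gender_code": "M"}}
    # Step-two target: the Gender Description table, keyed by Gender Code.
    gender_descriptions = {"M": "MALE", "F": "FEMALE"}

    customer = customers.get("0000000001")            # step 1: LKE + LUSM
    if customer is not None:
        # Step 2: the LKL takes the Gender Code from the looked-up record
        # and uses it as the key for the next search.
        description = gender_descriptions.get(customer["gender_code"])
        print(description if description else "UNKNOWN")
    else:
        print("UNKNOWN")                              # not found at step one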

Slide 18

Again repeated patterns are visible in the logic table, even within logic for one column.  Column 3 of our example performs the same look up twice. 

  • Test to see if a join is successful using the Found Keyword
  • If it is, then use the join to place the Gender Description in the column

Thus our multiple step join is repeated in the column logic.

Slide 19

Some join steps will never be executed during a run because of join optimization.  There can be other functions which will never be executed as well. 

In this example, there is a default Not Found DTC, the companion to the DTL moving the Gender Description code.  Note that it does not have the constant “UNKNOWN” assigned to it.  This DTC will never be executed because the GOTO ROW2 on both LUSM rows 30 and 33 would never be triggered; rather, jumps to row 38 would have already happened on the JOIN on row 22, or LUSM rows 24 or 27.

Having completed our analysis of the logic table, next we’ll execute this view with the Trace Parameter enabled.

Slide 20

This trace example output does not show the record number of the Event DD Name, in order to show the Value 1 and Value 2 output.

Most of the records from the input file meet the selection criteria of being Male Customers who were recorded as Customers before placing orders.  The trace for the CFLC, the Compare Field Lookup to Constant, shows the looked-up value as Value 1 and the Constant value as Value 2.  The CFLC function was an equal test; because the two values are equal, the record continues evaluation.

For the CFEL function, Compare Field Event Field to Lookup Field, Value 1 is the source Event field value, and Value 2 is the Looked-up field value.  In this case Value 1 and Value 2 are different, but the comparison is a Greater Than comparison, and thus the records are selected for processing.

Having passed the record filtering logic, the trace shows the DTE function for the first column.

Slide 21

The join required for Column 2 to obtain the Customer ID is the same join already performed to find the customer Gender Code for the record filter test.  Thus the LKE and LUSM functions are skipped by join optimization.  The Join function on row 16 immediately jumps to the DTL function on row 19 to populate column 2 with the Customer ID from the Customer file.

Slide 22

The multi-step join is then executed to test for a Found condition on the Customer Gender Description, by first joining to the Customer table.  Having done so successfully, the join uses the Gender Code key for the LKL function to search the Gender Description table. 

Join optimization shortens the next lookup.  The same lookup path just used to test for a Found condition is used again to actually get the Gender Description, starting on Logic Table row 28.  Thus, through optimization, the clause B logic of this column requires only the row 28 Join function.

The DTL function then moves the Gender Description of “MALE” to the extract record.

Slide 23

Let’s move on to a new Event File record.  A number of customers are Female according to the Customer file.  On these records, the trace shows the Extract Record Filtering CFLC Compare Field Lookup to Constant test failing; the “F” value does not match the constant of “M”.

LT row 12 is also executed, to test whether the join is Not Found and the record thus should be included in the output; but of course the lookup was successful, and so the record is skipped. 

Slide 24

For multi-step lookups, not found conditions can occur anywhere along the lookup.  First we’ll examine a failure in the first step, the same failure noted in the prior module.  The join from the Order file to the Customer file fails for column 3 because no customer “0000000000” is found.  Instead of proceeding to the LKLR function to prepare for the second join, execution jumps to the row 38 DTC to place the “UNKNOWN” Gender Description in the output file.

Slide 25

To simulate a second-step failure, let’s modify the Customer Gender Code for Customer 1 from “M” to “O”.  If we do this, the first lookup, Logic Table Row 24, completes successfully.  We then execute the LKLR, LKL and the second LUSM to search the Gender table.  The LKL function uses a Gender Code of “O”, which is not found on the Gender table, causing the second LUSM to branch to the same DTC used in the first-step failure we just examined, resulting in the same output value.

Thus the final value in the output file may not indicate which step of a join failed.  Only by tracing through the logic table can this be determined.  The trace might highlight incorrect input data, like a Gender code of “O” or a different data format for numeric keys.  It may also highlight an incorrect LR or Join definition, or improper logic in the view. 

Slide 26

As noted earlier, incorrect number format conversion can cause lookups to fail.  For example, source and target fields that use Binary, Packed and Zoned Decimal formats can all have the proper field lengths for use in a lookup path.  However, if the sign characteristics are not properly converted, the correct matching record will not be found.  Therefore it is important for LRs to be defined correctly to match the actual data type.  SAFR will do the necessary conversions between data types.

The GVBMR95 Trace output prints the binary data for LKE and other functions.  At times this data is not printable, but using the appropriate command to display hex data on output data (such as “Set Hex On” depending upon the JES configuration) will show the data for debugging.

Slide 27

The Extract Engine GVBMR95 control report again shows the results of Lookup processing.   Remember that the Found and Not Found counts are the results of LUSM functions, whether in 1st step or subsequent steps of joins, and do not include Join optimizations. 

In our example we used two different lookup paths, the single step lookup for the Customer ID, and the multi-step lookup for the Gender description.  Thus although we have only two output records with not found conditions on them, we have 4 not founds in the count, 2 for the Customer ID join failure and 2 for the Gender Description join.

Slide 28

This logic table and module have introduced the following Logic Table Function Codes:

  • CFEL, Compares fields, Event File Field to Looked-up Field
  • CFLC, Compares fields, Looked-up field to Constant
  • LKLR, like the Join function, identifies join targets for multi-step joins
  • LKL, builds a look up key using data from a looked-up record

Other related function codes without examples include:

  • CFEE, Compares fields, Event File Field to Event File Field
  • CFLL, Compares fields, Looked-up field to Looked-up field
  • CNL, a Class Test Numeric on a looked up value, similar to a CNE

Slide 29

This module described multi-step lookup processes. Now that you have completed this module, you should be able to:

  • Describe a multi-step lookup
  • Read a Logic Table and Trace with multi-step lookups
  • Explain the Function Codes used in the example
  • Identify lookup and other patterns in a larger logic table
  • Debug a multi-step lookup not-found condition

Slide 30

Additional information about SAFR is available at the web addresses shown here. This concludes Module 14, Multi-Step Lookups

Slide 31

Training Module 13: Single Step Lookups

The slides used in the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR.  This is Module 13, Single Step Lookups

Slide 2

Upon completion of this module, you should be able to:

  • Describe the Join Logic Table
  • Describe the advantages of SAFR’s lookup techniques
  • Read a Logic Table and Trace with lookups
  • Explain the Function Codes used in single step lookups
  • Debug a lookup not-found condition

Slide 3

Remember from Module 4 that SAFR allows users to combine or “join” data together from different files for selection tests or inclusion in an output file. Joins or Lookups require a Lookup Path to instruct SAFR which data from the two files to match, the source and the target. The fields in the lookup path are used in the Logic Table to build a key which is then used to search the target table for a match.  All lookups begin by using fields from the Source or Event LR or constants.

Slide 4

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 5

The Performance Engine flow changes when joins are performed.  GVBMR90 generates two logic tables, one for use in the Extract Phase, called the XLT for Extract Phase Logic Table. This logic table contains the views selected for processing from the SAFR meta data repository. We’ve analyzed XLT Logic Tables in prior modules.

When lookups are required, GVBMR90 also generates a Join Logic Table, called a JLT.  This logic table contains artificially generated views used to preprocess the reference data.  These views are never stored in the SAFR Metadata repository, and are regenerated every time GVBMR90 is run. 

In Join Processing, the Extract Engine is executed twice, once with the JLT and a second time with the XLT.  The Join Phase GVBMR95 reads the raw Reference Data files as its source data.  The outputs from the Join Phase are used as in-memory datasets by the Extract Phase to perform high speed joins. 

Slide 6

In this and following modules we’ll use a simple order example.  We’ll create a file that contains the Order ID, Customer E-mail Address, and Customer Gender Code. 

Available data includes

  • Customers’ orders are recorded on the Order file with an Order ID as the key. 
  • Customer data is recorded in a separate file, keyed by Customer ID. 

Each customer can make more than one Order, so the records in the Order file are more numerous than those in the Customer file.  SAFR is designed to scan the more detailed file as the source event file and look up keyed data in another file.  Thus we’ll define the Order file as the Source Event File, and the Customer file as a Reference File.  We’ll use these files to produce a very simple output.

Slide 7

SAFR loads target join data into memory in order to increase efficiency.  Memory access is thousands of times faster than disk access.  However, memory is more expensive than disk.  Loading the entire source reference data into memory, including fields that are not needed, can be difficult or expensive.  The purpose of the JLT is to create a Core Image File with only those fields needed for Extract Phase processing.  In our example, only the Customer ID, which is the key to the table, the Customer E-mail Address, and the Customer Gender Code fields are needed in the core image file during Extract Phase processing.
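
The effect of the JLT can be sketched as a simple projection.  The field names below are assumptions made for illustration.

    # Keep only the fields the Extract Phase will need, dropping the rest
    # of the customer record before it is loaded into memory.
    def to_core_image(customer_record):
        return {
            "customer_id": customer_record["customer_id"],   # the key
            "email":       customer_record["email"],
            "gender_code": customer_record["gender_code"],
        }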

Slide 8

When no joins are required by any view being executed, the MR90 Control Report will show an empty Join Logic Table.  When joins are required, the MR90 Control Report will contain a printed copy of the JLT. 

For our example, the JLT will contain two Extract Only type views, one to extract the reference data needed in a core image file for the Extract Phase and one to create a header record.  These Extract Only views are generated by GVBMR90 at run time only.  They are never stored in the SAFR Metadata Repository.  In this example, the JLT contains temporary views 25 and 26.

The core image view has three DTE Logic Table Functions, (1) for the Customer ID field, (2) for the Customer E-mail Address and (3) for the Gender Code. It will also have some DTC and other functions to properly format the core image file.

Slide 9

In addition to the core image files, which are formally named Reference Extract Data (RED) files, the second JLT view produces the Reference Extract Header file, or REH.  The REH is used in the Extract Phase to allocate memory for loading the Core Image Files.

Each REH record contains counts of RED records and other information, like the key length and offset, and the total bytes required.  These records are optionally printed after the Join Phase by GVBMR71.

These control records are produced by an additional view for each reference file in addition to the Extract Only view in the JLT.   This view contains some special Logic Table function codes that allow it to accumulate record counts.  It also only writes one record no matter how many input records are in the reference file.

Slide 10

Our example look up view reads the Order logical record as the source event file.  It has three columns, containing the:

  • Order ID from the source event file
  • Customer E-mail Address and
  • Customer Gender Code, both from the Customer file

Obtaining the Customer E-mail Address and Gender requires a join to the Customer logical record.  The join requires using the Customer ID on the Order logical record to search for the corresponding customer record in the Customer reference file.

As noted, the JLT will read the Customer reference file.  One view will create the core image RED file which will primarily have the Customer ID and Customer Email Address field data in it.  Another view will produce the REH record with a count of the number of records in the Customer core image file.

In the remainder of this module, we’ll focus exclusively on the Extract Logic Table, the XLT, for this view.

Slide 11

The XLT contains the standard HD Header, RENX Read Next, NV New View functions as well as the concluding ES End of Source and EN End of Logic Table functions.

Additionally, it contains a DTE Data from Event File field function.  This data is for column 1 of the view, indicated by the Sequence Number.  The source data starts at position 1 for a length of 10 bytes.  Under the Target section of the report, we can see that this data consumes 10 bytes in the target extract record.

Slide 12

Lookups always begin with a Join function code. The Join indicates where the Lookup Path information has been inserted into the view.  The steps required to perform the join are not coded in the view directly. 

The Sequence Number following the Join indicates which column requires this join if the logic is column specific.  If the logic is general selection, the sequence number will be blank.  The Join also indicates the ultimate Target LR for the join. 

The Join also includes Goto rows.  We’ll discuss these later in this module. 

In our example, the output for Column 2 is the Customer E-mail Address, which requires a join to the Customer file.  Note that the NV indicates the source event file LR is 10, the Order LR.  The join target is LR ID 7, the Customer reference file.

Slide 13

A Join is almost always followed by one or more LKE functions.  An LKE function instructs the Extract Engine to build a Lookup Key from an Event File field. The LKE lists which LR the data should be taken from.  The Lookup Key, as opposed to the Extract record, is temporary memory containing a key used to search the core image file for a matching value. 

Note in our example that although the Join function’s target is the Customer table, LR ID 7, the data for the key build is taken from LR ID 10: the Customer ID field on the Order file.  This data is found on the source record at position 11 for a length of 10 bytes. 
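
A minimal sketch of the key-build step, using the positions from this example (Customer ID at position 11 for 10 bytes on the Order record); the record contents here are made up:

    def build_lookup_key(event_record: str, start: int, length: int) -> str:
        """Copy bytes from the event record into the temporary key buffer."""
        return event_record[start - 1 : start - 1 + length]

    #               Order ID       Customer ID
    order_record = "ORD0000001" + "0000000005" + "...rest of record..."
    key = build_lookup_key(order_record, start=11, length=10)
    print(key)   # "0000000005" -- used to search the core image file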

Slide 14

A Join typically ends with an LUSM.  The LUSM function performs a binary search with the key against the Join’s target table in the RED Core Image File to find a matching record.  The function code stands for Lookup to Structure in Memory.

The GOTO rows for the LUSM function are the Found and Not Found conditions.  Execution branches to GOTO Row 1 if a matching reference table record is found for the key built by the LKE function.  Execution branches to GOTO Row 2 if no matching record is found in the reference structure.

In this example, if a matching Customer record is found for the supplied Customer ID, execution will continue at LT Row 8.  If no customer is found, row 10 will be executed.
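
Conceptually, the LUSM behaves like the binary search sketched below in Python; the row numbers follow this example (8 for Found, 10 for Not Found), and everything else is illustrative:

    import bisect

    def lusm(keys, records, key, found_row=8, not_found_row=10):
        """Binary-search sorted keys; return (next logic table row, record)."""
        i = bisect.bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return found_row, records[i]
        return not_found_row, None

    keys    = ["0000000001", "0000000002", "0000000005"]   # must be sorted
    records = ["customer 1", "customer 2", "customer 5"]
    print(lusm(keys, records, "0000000001"))   # (8, 'customer 1'): Found
    print(lusm(keys, records, "0000000009"))   # (10, None): Not Found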

Slide 15

LUSM functions are typically followed by a set of functions similar in structure to the CFxx functions.  You’ll remember that if the comparison proves true, a move Data from Event field DTE or similar function often follows, then a GOTO function to skip over the ELSE condition.  The ELSE condition is often a DTC, a move of Data from a Constant default value.

In this lookup example, if the LUSM finds a corresponding Customer ID in the Customer table the LUSM is followed by a DTL, move Data from a Looked-up field value.  If no matching Customer ID is found, a constant of spaces will be placed in the extract file by the DTC function.

Slide 16

The field Source start positions on DTE functions reflect where the data resides in the event file.  This is not true for the DTL functions, because the data is pulled from the RED. The data positions in the RED are not the same as the Reference Data Logical Record. The JLT process moves fields from the LR positions to the RED position, and these positions are determined by the order in which the views use the data.

In our example, the DTL function moves the 40-byte Customer Email Address to the extract record from a starting position of 1 in the RED, whereas on the Customer LR it begins at position 39. 

If the LR definition of the reference file is incorrect, the RED Key and all the data may be wrong.  At times it is necessary to inspect the RED file to see if the data exists.
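
The position shift can be pictured with a small sketch, assuming hypothetical layouts; the same field has one position on the reference LR and a different one in the RED:

    # The e-mail address starts at position 39 on the Customer LR, but the JLT
    # wrote it at position 1 of the RED, so the DTL uses the RED position.
    LR_POSITIONS  = {"EMAIL_ADDRESS": (39, 40)}   # reference LR layout
    RED_POSITIONS = {"EMAIL_ADDRESS": (1, 40)}    # core image layout

    def dtl(red_record: str, field: str) -> str:
        start, length = RED_POSITIONS[field]      # RED, not LR, positions
        return red_record[start - 1 : start - 1 + length]

    red_record = "jsmith@example.com".ljust(40) + "M"
    print(dtl(red_record, "EMAIL_ADDRESS").rstrip())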

Slide 17

Looked-up fields can be placed in columns in two ways.  They may be assigned directly, as shown in this view example and on this logic table.  Or they may be coded in Logic Text. 

If they are assigned directly in the view editor and not in logic text, then SAFR automatically generates a default DTC in case of a not found condition.  The default value depends upon the format of the target column: alphanumeric columns receive a default value of spaces, and numeric columns a default value of zeroes.

If the field is coded in Logic Text, the user can define a default ELSE condition, which would be assigned to the DTC value.

In our example, if the lookup fails and no Customer Email Address is found, the DTC function supplies the default value of spaces.
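
A sketch of the generated default, assuming just two format classes for simplicity:

    def dtc_default(column_is_numeric: bool, length: int) -> str:
        """Default a not-found lookup: spaces if alphanumeric, zeroes if numeric."""
        return ("0" if column_is_numeric else " ") * length

    print(repr(dtc_default(False, 5)))   # '     ' for an alphanumeric column
    print(repr(dtc_default(True, 5)))    # '00000' for a numeric column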

Slide 18

Lookup paths are often reused within views, which means lookup logic is often repeated. 

Our view requires two lookups: one for the Customer Email Address in column 2, and a second for the Customer Gender Code in column 3.  After the first lookup, the logic table contains very similar logic for the second lookup.  Except for changes in logic table and GOTO rows and Sequence (column) numbers, the second set of Join, LKE, and LUSM functions is identical to the first set.

The DTL, GOTO, and DTC functions are also very similar between the two lookups, with the same row and sequence number differences, though their source field and target positions and lengths also differ.

Slide 19

The first goal of SAFR is efficiency.  Developers recognized that repeating the LKE key build and the LUSM binary search over and over again for the same key was inefficient, particularly when the lookup had just been performed and would have the same results.  SAFR has just found the Customer record with the Customer E-mail Address, and can use this same record to find the Customer Gender Code.  Thus SAFR has automatic Join Optimization built into every lookup. 

This is done through the Join function GOTO rows.  Note that the GOTO Row 1 and Row 2 numbers on the Join function are the same as those on the LUSM: the Found and Not Found rows.  Before the Join executes, it very efficiently tests whether this join has already been performed for this record.  If it has, the Join immediately branches to either the Found or Not Found row based upon the last lookup.  This avoids using CPU cycles to do unnecessary work.

This optimization is not limited to repeat joins within a view, but operates across all joins for all views being run against the same source event file.  The greater the number of views and joins, the greater the efficiencies gained.
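
The optimization amounts to remembering the last result per target table, something like this hypothetical Python sketch:

    last_result = {}   # target LR ID -> (key, found record or None)

    def join(target_lr, key, search):
        """Skip the LKE/LUSM work when the same key was just looked up."""
        cached = last_result.get(target_lr)
        if cached is not None and cached[0] == key:
            return cached[1]              # branch straight to Found/Not Found
        record = search(key)              # otherwise do the real binary search
        last_result[target_lr] = (key, record)
        return record

    customers = {"0000000001": "customer 1"}
    print(join(7, "0000000001", customers.get))   # real search performed
    print(join(7, "0000000001", customers.get))   # reused, no search performed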

Slide 20

The last view-specific logic table function is the WRDT, which Writes the Data section of the Extract record.  The record will include the DTE data for the Order ID, the DTL Customer Email Address or DTC spaces if there is not one, and the DTL Customer Gender Code or DTC spaces if that is missing.

Having examined the simple lookup logic table, let’s turn to the Logic Table Trace and examine the flow. 

Slide 21

For the first record from the Order File, rows 3 through 9 are executed in order. No rows are skipped.  (The HD and RENX functions of rows 1 and 2 are not shown in the trace). 

The Trace shows the DTE function moving the Order ID of 0000000001 from the Order Source Event File to the Extract record.

Slide 22

Because this is the first source event file record read, and the Join function on row 5 is the first join in the logic table, no join has been performed.  The Join performs a test to see if the join has been performed, and finding that it has not, the Join falls through to the LKE function to build the key for the join. No branching from the Join is performed.

Slide 23

Because this is the first Join, the LKE function is used to build a key to search for a matching customer record.  The trace shows the LKE value from the Customer ID field on the Order file, which is moved to the lookup key buffer prior to doing the search.  The LUSM function will then search for customer number 1.

Slide 24

The results of the LUSM binary search can be seen by noting the next row executed.  In our example row 8 is the next row in the trace; from this it is clear that the LUSM function found a corresponding customer record with customer ID 1 in the Customer Reference File.  Now any Logic Table functions that reference Customer Logical Record ID 7 (called Record ID in the Trace Report) will use this record until another LUSM is performed for the Customer Logical Record.

Slide 25

The DTL function moves looked-up data from the Customer Email Address field on the Customer Reference File to the Extract record.  The trace report shows a portion of the value that is moved to the extract record.

The GOTO function on Logic Table Row 9 then skips the DTC function which would have placed spaces in the Extract record if no Customer Record had been found. Remember that the trace only shows executed logic table rows.  Rows skipped are not printed in the trace.

Slide 26

The second Join is required for the Gender Code.  It begins with the Join function testing whether a join to Logical Record ID 7 has been performed previously.  Since the lookup for the Customer E-mail Address was just performed and resulted in a Found condition, the LKE and LUSM functions do not need to be performed.  Rather, the GOTO Row 1 on the Join is used, just as if a second LUSM had been performed and found Customer ID 1 again in the Customer record.  In this particular logic table, rows 12 and 13 will likely never be executed during processing. 

Slide 27

The second DTL function moves the Customer Gender Code from the Customer Reference File to the Extract record.  The trace report shows the “M” value that is moved to the extract record.  The GOTO function on Logic Table Row 16 then skips the DTC function, which would have placed spaces in the Extract record if no Customer record had been found.

The WRDT function writes this completed Extract Record to the Extract File.  Execution then resumes at the RENX function for the next Event Record.

Slide 28

Processing for the next four records continues following the same execution pattern as record number 1.  The only changes in the trace are the Event Record number, and the Values on the DTE, LKE, and both DTL functions.  The values from the Order and Customer files are moved to the Extract File.

Slide 29

The last two Orders in our example Source Event File do not have corresponding Customer records.  Thus on these two records, the first LUSM functions result in Not Found conditions, and branch to the DTC functions to populate the extract record Customer Email Address with the default value of spaces.  Likewise, the following Join functions detect that a Not Found occurred on the lookup, and also branch to the DTC functions to populate the extract record with a default space for the Customer Gender code.

We can manually perform the search function by taking the value shown in the LKE function, “0000000000”, and searching the Customer Reference File, finding that there is no Customer record for Customer ID “0000000000”.  There can be numerous causes for this, such as non-SAFR issues like mismatches between the Order and Customer Reference files. 

If all lookups fail for a particular reference file it is likely a structural problem.  For example, the Join LR may be wrong.  Inspecting the RED can highlight this problem.  If data displayed in the LKE functions is incorrect, either the Event LR or the path may be incorrect.  Partially found lookups may indicate a data issue external to the SAFR configuration. 

Slide 30

The Extract Engine GVBMR95 control report shows the results of processing.  Twelve records were read from the input file, and twelve records written to the output file.  For each view against each event file, it reports the total lookups found and not found.  The process required 12 total lookups, 10 of which found corresponding records, and 2 which did not.

Note that these counts are the results of LUSM functions, and do not include Join optimizations.  If Join optimization had not been performed, the total number of lookups would have been 24: two lookups per extract record written, times 12 extract records. 
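
The arithmetic behind those counts, spelled out in a small sketch:

    records_read = 12
    found, not_found = 10, 2
    total_lusm_searches = found + not_found      # 12, as the report shows
    joins_per_record = 2                         # e-mail address and gender
    without_optimization = joins_per_record * records_read
    print(total_lusm_searches, without_optimization)   # 12 24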

Slide 31

There are many advantages to SAFR’s lookup processes, allowing it to perform millions of joins or lookups each CPU second.  The following are some reasons for this:

  • Memory-based lookups are more efficient and rapid than lookups to data on disk, but require efficient memory usage
  • Join optimization and single-pass architecture allows reuse of lookups, including across different views
  • All reference data is loaded once, even when performing parallel processing; multiple threads share the same table in memory

Slide 32

This logic table and module have introduced the following Logic Table Function Codes:

  • JOIN, identifies the join target and tests for already-executed joins
  • LKE, builds a lookup key using data from the source Event record
  • LUSM, performs a binary search of an in-memory reference data table using the Lookup Key
  • DTL, moves a looked-up input field to the output buffer

Slide 33

This module described single step lookup processes. Now that you have completed this module, you should be able to:

  • Describe the Join Logic Table
  • Describe the advantages of SAFR’s lookup techniques
  • Read a Logic Table and Trace with lookups
  • Explain the Function Codes used in single step lookups
  • Debug a lookup not-found condition

Slide 34

Additional information about SAFR is available at the web addresses shown here.  This concludes Module 13, Single Step Lookups.

Slide 35

Training Module 4: Introduction to Lookups

The slides used in the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is module 4: Introduction to Lookups.

Slide 2

This module provides you with an overview of SAFR lookup processing. Upon completion of this module, you should be able to:

  • Describe a typical SAFR lookup path
  • Create a lookup path in the SAFR Workbench
  • Include join data, using the lookup path in a view
  • Update and run the SAFR Join phase JCL
  • Interpret the results of join processing

Slide 3

SAFR allows users to combine or “join” data together from different files for selection tests or inclusion in an output. This requires telling SAFR which data from the two files to match: the source and the target. These files typically have different layouts, described by differing SAFR logical records, or LRs. SAFR requires these source and target relationships to be predefined in a SAFR lookup path. The lookup path is similar to an SQL “where” clause in a relational database.
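
As a loose analogy only (the Workbench defines paths declaratively, not in code), a lookup path pairs source key fields with a target, roughly like this Python sketch:

    from typing import NamedTuple

    class LookupPath(NamedTuple):        # hypothetical stand-in for a path
        source_lr: str
        target_lr: str
        key_fields: tuple                # source fields that build the key

    ORDER_TO_CUSTOMER = LookupPath("Order", "Customer", ("CUSTOMER_ID",))

    def resolve(path, source_record, target_index):
        key = "".join(source_record[f] for f in path.key_fields)
        return target_index.get(key)     # the matching target record, or None

    order = {"ORDER_ID": "0000000001", "CUSTOMER_ID": "0000000005"}
    customers = {"0000000005": {"EMAIL": "sally@example.com"}}
    print(resolve(ORDER_TO_CUSTOMER, order, customers))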

Slide 4

Lookup paths can be used by many different views. To view existing lookup paths, click Lookup Paths in the Navigator pane and open a lookup path in the metadata list at the bottom of the screen.

To create a new lookup path, select Administration, then New, and then Lookup Path from the menu.

Slide 5

The General tab of the selected lookup path shows the lookup path ID, name, comments, and status, either active or inactive.

When you create a new lookup path, the lookup path ID is assigned by the Workbench and is not editable. It is used to make each lookup path unique and appears along with the lookup path name in views.

Slide 6

The name of the lookup path is visible when you are using it in views. When you name a lookup path, it is good practice to include the source-LR-to-target-LR mapping performed by the path. Also, if more than one lookup path in your environment is used to join the same source LR and target, you should add additional qualifiers to the name to provide uniqueness.

Slide 7

Lookup paths are created with an Inactive status and remain inactive until they are completely defined. They can then be activated, allowing for use in views.

If an active lookup path changes, all views referencing that lookup path are deactivated to prevent views from performing inaccurate joins. Select the Make Lookup Path Inactive check box to deactivate the lookup; this also deactivates all the views using the lookup path.

Slide 8

From the General tab, we’ve clicked the Lookup Path Definition tab, where we can select the source logical record and the target LR and its specific logical file. We can then select the source fields that should be used to create a key to search the target to find a matching record. The accumulated length of the source fields must equal the total key length before the lookup path can be activated and used for processing in a view.
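
The activation rule can be stated as a one-line check; a sketch with made-up lengths:

    def can_activate(source_field_lengths, target_key_length):
        """A path activates only when the source fields exactly fill the key."""
        return sum(source_field_lengths) == target_key_length

    print(can_activate([10], 10))      # True: a 10-byte Customer ID fills the key
    print(can_activate([10, 8], 10))   # False: the path stays inactive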

Slide 9

The Source Field Properties area is displayed when the Selected Source Fields list is clicked. Here you can select a source field from the source logical record or specify a constant that should be used in the key. Constants can be either static, as shown here, or symbolic, which allows setting the value in the view when the lookup path is used. When specifying a constant, carefully consider the format of the target key when setting the format (data type), length, and so on.

We’ve now completed the lookup path. Next, we’ll see how to use it in the view.

Slide 10

On this slide, we’ve switched to the View Editor, creating a view that reads a file that is described by the Order LR. In column 3, we require a field that is on the Customer LR. After inserting the column, we select Lookup Field as the source for the column.

Slide 11

Next, the Lookup LR box lists all the LRs that can be accessed from the Order LR, based upon the lookup paths that have been created. Because we created the Order to Customer lookup path, the Customer LR is available in the list.

Slide 12

Next, we select the lookup path that we want to use to perform the join. This is necessary because multiple methods of joining from one LR to another might exist. For example, the Order LR might have a Purchasing Customer ID field and a Ship to Customer ID field, and either of these might be used as the basis for a different lookup path to find the customer address. In this instance, we’ve selected the lookup path that we have just created, with a lookup ID of 3.

Slide 13

Finally, we select the actual field we want to place on the extract file. All the fields on the Customer LR are available for use. In this instance we’ve selected the customer email address to be put out on the extract file.

Note that the Column Source Value field is populated with Customer[7].Order_to_Customer[3].Customer_Email_Address[70]. This is the LR, path, field names, and IDs. This pattern is used in numerous places in the Workbench.

Slide 14

We’ve used the lookup path in the view, resulting in a join. Looked-up fields can be placed in output columns, used as sort and aggregation fields, or used in filtering processes or calculations using similar steps.

We’ve now completed the view. Next, we’ll see how to run it in the Performance Engine.

Slide 15

Recall from module 2 that the first step in running the Performance Engine is running MR86, which takes in the View List and the SAFR Metadata Repository and creates an XML file of the view.

Slide 16

The next program is GVBMR84, the COMPILE, which produces the MR84 VDP. The addition of join processing makes no significant changes to this process.  

Slide 17

Next, MR90 is run. Because our view requires a lookup, two logic tables are produced. The Extract Logic Table (XLT) is used in the actual Extract phase as the view scans the Order file and performs joins to the Customer LR. The Join Logic Table, or JLT, on the other hand, simply prepares the Customer lookup file for use in the Extract phase.

The VDP tells MR90 that only the customer email address and the customer ID, the join key, are required for extract processing. The Join phase will use the JLT to extract only these two fields to be loaded into memory during the Extract phase processing to perform the joins.

Slide 18

The MR90 Control Report in the “Performance Engine Overview” module showed an empty JLT when no joins were performed in the view. Now the MR90 Control Report shows the JLT statistics and a printed version of the logic table.

Note that the view numbers listed in the JLT are run-time-only views for this execution of the Performance Engine. These example JLT views, 25 and 26, are never stored in the SAFR Metadata Repository.

Slide 19

GVBMR95 produces two files in the Join phase. The RED file contains one row for every row in the input join file (in this example, the customer file), but only the fields required for extract processing. The REH file contains a single control record for each join file.

The following slides detail how to locate the DD name for the input join files for the Join phase GVBMR95 step and modify the JCL.

Slide 20

First, within the view, locate the lookup path used for any joins. For this example view, we are using the Order to Customer lookup path ID 3. The path name is highlighted on the slide.

Columns containing a joined field are highlighted with a Join icon to speed identification of required joins.

Slide 21

  • Next, click the Lookup Path icon in the Navigator pane. A list of all paths is shown at the bottom of the screen.
  • Select the path from the list. The Edit Lookup Path screen opens.
  • Click the Lookup Path Definition tab.
  • Note the target logical file defined for the lookup path. This value is used in the next step.

Slide 22

Next, locate the logical file by performing the following steps.

  • First, select Logical Files in the Navigator pane and then select the appropriate logical file from the list of logical files.
  • Next, find the associated physical file. The DD name is listed in the physical file definition, which we’ll find next.

Note that join files can be associated with only one physical file.

Slide 23

  • Next, click the Physical File icon and select the specific file from the Associated Physical Files list.
  • Last, locate the input DD name.

This DD name should be placed in the join file job for use by GVBMR95, as shown at the bottom of this slide. The DD name must point to the file containing customer data. Note that join files must be in sorted order by the key specified in the logical record. An error is issued in the Extract phase if the files are not in sorted order.
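
A sketch of the sort requirement, assuming the key occupies the first 10 bytes of each record:

    def check_sorted(records, key_start=1, key_len=10):
        """Raise if any record's key is lower than the one before it."""
        keys = [r[key_start - 1 : key_start - 1 + key_len] for r in records]
        for prev, cur in zip(keys, keys[1:]):
            if cur < prev:
                raise ValueError(f"join file out of order: {cur!r} after {prev!r}")

    check_sorted(["0000000001 ...", "0000000002 ..."])   # passes silently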

Slide 24

After execution of the MR95 in the Join phase job, the MR95 Control Report shows the number of records read from the input Customer join file. It also shows how many records were written to the RED file (DD name GREF003) and the REH file (DD Name GREFREH). These files are then used in the Extract phase.

Slide 25

Following the Join phase execution of MR95, the Extract phase is run. Its inputs include the XLT, the REH and RED files, and the source or event file data (in this example, the Order file).

Slide 26

The Extract phase MR95 Control Report shows the processing of view 24 reading the Order file. Note that, on the same row showing how many records were extracted from the Order file (12), the control report also shows 10 lookups found (designated by the letter “F”) and 2 lookups not found (NF). This means two orders contained customer IDs that did not have corresponding customer records in the Customer file.

The report also shows that a total of 12 lookups were performed.

Slide 27

This module provided an introduction to lookups. Now that you have completed this module, you should be able to:

  • Describe a typical SAFR lookup path
  • Create a lookup path in the SAFR Workbench
  • Include join data, using the lookup path in a view
  • Update and run the SAFR Join phase JCL
  • Interpret the results of join processing

Slide 28

Additional information about SAFR is available at the web addresses shown here.

Slide 29

Slide 30

This concludes Module 4. Thank you for your participation.