Training Module 22: Using Pipes and Tokens

The slides used in the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is Module 22, Using Pipes and Tokens.

Slide 2

Upon completion of this module, you should be able to:

  • Describe uses for the SAFR piping feature
  • Read a Logic Table and Trace with piping
  • Describe uses for the SAFR token feature
  • Read a Logic Table and Trace with tokens
  • Debug piping and token views

Slide 3

This module covers three special logic table functions: the WRTK, which writes a token; the RETK, which reads a token; and the ET function, which signifies the end of a set of token views. These functions are analogous to the WR functions, the RENX function, and the ES function for non-token views.

Using SAFR Pipes does not involve special logic table functions.

Slide 4

This module gives an overview of the Pipe and Token features of SAFR.

SAFR views often read data from, or pass data to, other views, creating chains of processes, with each view performing one portion of the overall transformation.

Pipes and Tokens are ways to improve the efficiency with which data is passed from one view to another.

Pipes and Tokens are only useful to improve efficiency of SAFR processes; they do not perform any new data functions not available through other SAFR features.

Slide 5

A pipe is a virtual file passed in memory between two or more views. A pipe is defined as a Physical File of type Pipe in the SAFR Workbench.

Pipes avoid unnecessary input and output. If View 1 writes a disk file and View 2 reads that file, time is wasted on output from View 1 and input to View 2. This configuration would typically require two passes of SAFR: one pass processing View 1, and a second pass processing View 2.

If a pipe is placed between View 1 and View 2, the records stay in memory, no I/O time is wasted, and both views are executed in a single pass.

The pipe consists of blocks of records in memory.

Slide 6

Similarly, a token is a named memory area. The name is used for communication between two or more views.

Like a pipe, it allows passing data in memory between two or more views.

But unlike pipes, which are virtual files and allow asynchronous processing, tokens pass one record at a time, synchronously, between views.

Slide 7

Because pipes simply substitute a virtual file for a physical file, the views writing to a pipe run in a thread independent of the views reading the pipe.

Thus, for pipes, the two threads run asynchronously.

Tokens, though, pass data one record at a time between views. The views writing the token and the views reading the token are in the same thread.

Thus, for tokens, there is only one thread, and both views run synchronously.

Slide 8

An advantage of pipes is that they enable parallelism, which can decrease the elapsed time needed to produce an output. In this example, View 2 runs in parallel with View 1. After View 1 has filled block 1 with records, View 2 can begin reading from block 1. While View 2 is reading from block 1, View 1 is writing new records to block 2. This is one form of parallelism in SAFR which improves performance. Without it, View 2 would have to wait until all of View 1's processing was complete.
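To illustrate the double-buffering behavior described above, here is a minimal sketch in Python (illustrative only, not SAFR code; the block size, record values, and function names are hypothetical). The writer fills fixed-size blocks; the reader consumes each completed block while the writer fills the next.

```python
import queue
import threading

BLOCK_SIZE = 4                 # records per pipe block (hypothetical)
pipe = queue.Queue(maxsize=2)  # two in-flight blocks, as in the slide example

def view1_writer(records):
    """Fill a block; once full, hand it to the reader and start the next."""
    block = []
    for rec in records:
        block.append(rec)
        if len(block) == BLOCK_SIZE:
            pipe.put(block)    # View 2 may begin reading this block now
            block = []
    if block:
        pipe.put(block)        # flush the final, partially filled block
    pipe.put(None)             # end-of-pipe marker

def view2_reader():
    """Consume completed blocks while the writer fills the next one."""
    while (block := pipe.get()) is not None:
        for rec in block:
            print("View 2 processed", rec)

t1 = threading.Thread(target=view1_writer, args=(range(12),))
t2 = threading.Thread(target=view2_reader)
t1.start(); t2.start()
t1.join(); t2.join()
```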

Slide 9

Pipes can be used to multiply input records, creating multiple output records in the piped file, all of which will be read by the views reading the pipe. This can be used to perform an allocation-type process, where a value on one record is divided across multiple other records.

In this example, a single record in the input file is written by multiple views into the pipe.  Each of these records is then read by the second thread.

Slide 10

Multiple views can write tokens, but because a token can only contain one record at a time, the token reader views must be executed after each token write. Otherwise the token would be overwritten by a subsequent token-writing view.

In this example, on the left a single view, View 1, writes a token, and views 2 and 3 read the token. View 4 has nothing to do with the token, reading the original input record like View 1.

On the right, both views 1 and 4 write to the token.  After each writes to the token, all views which read the token are called. 

In this way, tokens can be used to “explode” input event files, like pipes.
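By contrast with pipes, token processing is synchronous within a single thread. The sketch below (again illustrative only; the view functions are hypothetical) shows the call pattern: each write of the one-record token immediately drives every token-reading view, before the next write occurs.

```python
# A token is a named, one-record memory area shared by writer and reader views.
token = {"record": None}

def write_token(record, reader_views):
    """WRTK-style write: store one record, then run every reader at once."""
    token["record"] = record       # overwrites any previous token record
    for reader in reader_views:    # synchronous reads, same thread, in order
        reader(token["record"])

def view2(rec): print("View 2 read token:", rec)
def view3(rec): print("View 3 read token:", rec)

readers = [view2, view3]
for event_record in ["order-1", "order-2"]:
    # Views 1 and 4 both write the token; the readers run after each write,
    # so each event record is effectively "exploded" for the reading views.
    write_token(f"{event_record} (written by view 1)", readers)
    write_token(f"{event_record} (written by view 4)", readers)
```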

Slide 11

Both pipes and tokens can be used to conserve or concentrate computing capacity. For example, CPU-intensive operations like look-ups or lookup exits can be performed once, in the views that write to pipes or tokens, and the results added to the record that is passed on to dependent views. Thus the dependent, reading views do not have to perform these same functions.

In this example, the diagram on the left shows the same lookup performed in three different views. Using piping or tokens, the diagram on the right shows how this lookup may be performed once, the result stored on the event file record, and used by subsequent views, thus reducing the total number of lookups performed.
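A small sketch of the idea (conceptual only; the lookup table and field names are hypothetical): the writing view performs the expensive lookup once and stores the result on the record, so the reading views simply use the stored value.

```python
def expensive_lookup(account):
    # Stand-in for a CPU-intensive join or lookup exit.
    return {"111": "AssetAcct", "211": "LiabAcct"}.get(account, "Unknown")

def writing_view(event_record):
    """Look up once; carry the answer on the record passed down the pipe."""
    enriched = dict(event_record)
    enriched["account_type"] = expensive_lookup(event_record["account"])
    return enriched

def reading_view_a(rec): return "A sees " + rec["account_type"]
def reading_view_b(rec): return "B sees " + rec["account_type"]

rec = writing_view({"account": "111", "amount": 100})
# Both dependent views reuse the stored result; the lookup ran only once.
print(reading_view_a(rec), "/", reading_view_b(rec))
```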

Slide 12

A limitation of pipes is the loss of visibility to the original input record read from disk; because thread 1 and thread 2 execute asynchronously, which record is being processed in thread 1 at any moment is unpredictable.

There are instances where a view consuming the output of another view needs visibility to the original input record. As will be discussed in a later module, the Common Key Buffering feature can at times allow use of records related to the input record. Exits can also be employed to preserve this type of visibility. Token processing, because it is synchronous within the same thread, can provide this capability.

In this example, on the top the token View 2 can still access the original input record from the source file if needed, whereas the Pipe View 2 on the bottom cannot.

Slide 13

Piping can be used in conjunction with Extract Time Summarization, because the pipe-writing views process multiple records before passing the buffer to the pipe-reading views. Because tokens process one record at a time, they cannot use Extract Time Summarization; a token can never contain more than one record.

In the example on the left, View 1, using Extract Time Summarization, collapses four records with like sort keys from the input file before writing the single record to the Pipe Buffer, whereas token processing, on the right, passes each individual input record to all token-reading views.

Note that to use Extract Time Summarization with piping, the pipe-reading views must process Standard Extract Format records, rather than data written by a WRDT function; summarization only occurs in CT columns, which are only contained in Standard Extract Format records.

Slide 14

Workbench set-up for writing to, and reading from, a pipe is relatively simple. It begins with creating a pipe Physical File and a view which will write to the pipe.

Begin by creating a physical file with a File Type of “Pipe”.

Create a view to write to the Pipe, and within the View Properties:

  • Select the Output Format of Flat File, Fixed-Length Fields,
  • Then select the created physical file as the output file

Slide 15

The next step is to create the LR for the Pipe Reader Views to use.

The LR used for views which read from the pipe must match exactly the column outputs of the View or Views which will write to the Pipe.  This includes data formats, positions, lengths and contents.

In this example, Column 1 of the view, "Order ID", is a 10-byte field beginning at position 1. In the Logical Record this value will be found in the Order_ID field, beginning at position 1 for 10 bytes.

Preliminary testing of the new Logical Record is suggested: make the view write to a physical file first, inspect the view output, ensure the data matches the Pipe Logical Record, and then change the view to actually write to the pipe. Examining positions in an actual output file is easier than using the trace to detect whether positions are correct, as the trace of the pipe only shows data referenced by a view, not the entire record written by the pipe-writing view.

Slide 16

The view which reads the Pipe uses the Pipe Logical Record.  It also must read from a Logical File which contains the Pipe Physical File written to by the pipe writer view.

In this example, the view reads the Pipe LR (ID 38) and the Pipe LF (ID 42), the Logical File which contains the Physical File of type Pipe written to by the pipe-writing view. The view itself uses all three of the fields on the Pipe Logical Record.

Slide 17

This is the logic table generated for the piping views. The pipe logic table contains no special logic table functions; the only connection between the ES Sets is the shared Physical File entity.

The first RENX – ES set is reading Logical File 12.  This input event file is on disk.  The pipe writing view, view number 73, writes extracted records to Physical File Partition 51. 

The next RENX – ES set reads Logical File 42.  This Logical File contains the same Physical File partition (51) written to by the prior view.  The Pipe Reader view is view 74.

Slide 18

This is the trace for piped views. In the example on the left, the thread reading the event file, with a DD Name of ORDER001, begins processing input event records. Because the pipe writer view, number 73, has no selection criteria, each record is written to the pipe by the WRDT function. All 12 records in this small input file are written to the pipe.

When all 12 records have been processed and Thread 1 has reached the end of the input event file, it stops processing and turns the pipe buffer over to Thread 2 for processing.

Thread 2, the Pipe Reader thread, then begins processing these 12 records. All 12 records are processed by the pipe reader view, view number 74.

As shown on the right, if the input event file had contained more records, enough to fill multiple pipe buffers, the trace would still begin with Thread 1 processing, but after the first buffer was filled, Thread 2, the pipe-reading view, would begin processing. From this point, the printed trace rows for Thread 1 and Thread 2 may be interspersed as both threads process records in parallel against differing pipe buffers. This randomness is highlighted by each NV function against a new event file record.

Slide 19

Workbench set-up for writing to, and reading from, a token is very similar to the set-up for pipes. It begins with creating a physical file with a type of Token, and a view which will write to the token.

Begin by creating a physical file with a File Type of “Token”.

Create a view to write to the token, and within the View Properties, select the Output Format of Flat File, Fixed-Length Fields.

Then, in the properties for the Logical Record and Logical File to be read, select the output Logical File and the created Physical File as the destination for the view.

Slide 20

Like pipe usage, the next step is to create the Logical Record for the Token Reader Views to use. In our example, we have reused the same Logical Record used in the piping example.  Again, the logical record which describes the token must match the output from the view exactly, including data formats, positions, lengths and contents.

Slide 21

Again, the view which reads the Token uses this new Logical Record.  It also must read from a Logical File which contains the Token Physical File written to by the token writer view.

Note that the same Logical File must be used for both the token writer and token reader; using the same Physical File contained in two different Logical Files causes an error in processing. 

Slide 22

This is the logic table generated for the token views.

Any views which read a token are contained within an RETK – Read Next Token / ET – End of Token set at the top of the logic table.

The token writer view or views are listed below this set, in the typical threads according to the files they read; the only change in these views is the WRTK logic table function, which writes a token, shown at the bottom of the logic table in red.

Although the token-reading views are listed at the top of the Logic Table, they are not executed first in the flow. Rather, the regular threads process, and when they reach a WRTK Write Token logic table function, the views in the RETK thread are called and processed. Thus the RETK – ET set is like a subroutine, executed at the end of the regular thread processing.

In this example, view 10462 is reading the token written by view 10461.  View 10462 is contained within the RETK – ET set at the top of the logic table.  View 10461 has a WRTK function to write the token.  After execution of this WRTK function, all views in the RETK – ET set are executed.

Slide 23

This is the Logic Table Trace for the Token Views.

Note that, unlike the pipe example, which has multiple Event File DD Names within the trace, because tokens execute in one thread the Event File DD Name is the same throughout, ORDER001 in this example.

The WRTK Write Token function (on view 10461 in this example), which ends the standard thread processing, is immediately followed by the token-reading view or views (view 10462 in this example). The only thing that makes these views look different from the views reading the input event file is that they are processing against a different Logical File and Logical Record, Logical File ID 2025 and Logical Record ID 38 in this example.

Slide 24

The trace on the right has been modified to show the effect of having multiple token writing views executing at the same time. 

Note the multiple executions of View 10462, the token-reading view, processing the same event record 1 from the same Event DD Name, ORDER001, after each of the token-writing views, 10460 and 10461.

Slide 25

The GVBMR95 control report shows the execution results from running both the Pipe and Token views at the same time.

At the top of the report, the input files listed include:

  • the Disk File, the event file for both the pipe and token writing views,
  • the Token (for the Token Reading views), and
  • the Pipe (for the Pipe Reading views)

The bottom of the report shows the final output disposition, showing records were written to:

  • an Extract File (shared by the pipe and token reading views),
  • the Token (for the Token Writing View), and
  • the Pipe (for the Pipe Writing View)

The center section shows more details about each one of the Views.

Slide 26

The top of this slide shows the final run statistics from the GVBMR95 control report.

Note that a common error message may be encountered if view selection for the pass is not carefully constructed, for either pipes or tokens. Execution of a pass which contains only Pipe Writers and no Pipe Readers will result in this message. The opposite is also true, with no Pipe Writers and only Pipe Readers selected. The same holds for token processes. Pairs of views, matched by Logical Files containing matching Physical Files, must be executed for either process to execute correctly.

Slide 27

This module described using piping and tokens. Now that you have completed this module, you should be able to:

  • Describe uses for the SAFR piping feature
  • Read a Logic Table and Trace with piping
  • Describe uses for the SAFR token feature
  • Read a Logic Table and Trace with tokens
  • Debug piping and token views

Slide 28

Additional information about SAFR is available at the web addresses shown here. This concludes Module 22, The SAFR Piping and Token Functions.

Slide 29

Training Module 19: Parallel Processing

The slides for the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is Module 19, Parallel Processing.

Slide 2

Upon completion of this module, you should be able to:

  • Describe SAFR’s Parallel Processing feature
  • Read a Logic Table and Trace with multiple sources
  • Set Trace parameters to filter out unneeded rows
  • Configure shared extract files across parallel threads
  • Set thread governors to conserve system resources
  • Debug a SAFR execution with multiple sources

Slide 3

In a prior module we noted that the Extract Engine, GVBMR95, is a parallel processing engine, while the Format Phase engine, GVBMR88, is not. It is possible to run multiple Format Phase jobs in parallel, one processing each extract file, yet each job is independent of the others. They share no resources, such as look-up tables for join processing. In this module we will highlight the benefits of GVBMR95 multi-thread parallelism, and how to enable, control, and debug it.

Slide 4

Parallelism allows more resources to be used to complete a task in a shorter period of time. For example, if a view requires reading 1 million records to produce the appropriate output, and the computer can process 250,000 records a second, producing this view will require at least 4 seconds. If the file is divided into 4 parts, the output could be produced in 1 second.

Doing so requires adequate system resources, in the form of I/O channels, CPUs, and memory. If, for example, our computer has only 1 CPU, then each file will effectively be processed serially, no matter how many files we have. All parallel processing is resource constrained in some way.

Slide 5

GVBMR95 is a multi-task parallel processing engine. It does not require multiple jobs to create parallelism. Using the Logic Table and VDP, GVBMR95 generates individual programs from scratch and then asks the operating system to execute each program as a subtask, one for each input Event File to be scanned for the required views. This is all done within one z/OS job step. Each of these subtasks independently reads data from its Event File and writes extract records to extract files for all views reading that event file.

In this example, the combination of views contained in the VDP and Logic Table require reading data from four different event files.  These views write output data to six different extract files.  The main GVBMR95 program will generate four different sub tasks, corresponding to the four different input event files to be read.

Slide 6

Multi-threaded parallelism can provide certain benefits over Multi-job parallelism.  These benefits include:

  • Shared Memory.  Only one copy of the reference data to be joined to must be loaded into memory.  Sharing memory across different jobs is much less efficient than within a single job.  Thus all subtasks can efficiently access the memory-resident tables
  • Shared I/O.  Data from multiple input files can be written to a single extract file for that view, allowing for shared output files
  • Shared Control.  Only one job needs to be controlled and monitored, since all work occurs under this single job step
  • Shared Processing.  In certain cases, data may be shared between threads when required, through the use of SAFR Piping or exit processing

Slide 7

Recall from an earlier lesson that:

  • A physical file definition describes a data source like the Order Files
  • A logical file definition describes a collection of one or more physical files
  • Views specify which Logical Files to read

The degree of potential parallelism is determined by the number of Physical Files.  GVBMR95 generates subtasks to read each Physical File included in every Logical File read by any View included in the VDP.

Slide 8

The SAFR metadata component Physical File contains the DD Name to be used to read the input event file.  This DD Name is used throughout the GVBMR95 trace process to identify the thread being processed.

Slide 9

Remember that GVBMR95 can also perform dynamic allocation of input files. This means that even if a file is NOT listed in the JCL, if the file name is listed in the SAFR Physical File entity, GVBMR95 will instruct the operating system to allocate this named file. GVBMR95 first tests whether the file is listed in the JCL before performing dynamic allocation; thus the SAFR Physical File entity file name can be overridden in the JCL if required. Dynamic allocation can allow an unexpected view included in the VDP to run successfully even though no updates were made to the JCL.

Slide 10

The same Physical File can be included in multiple Logical Files. This allows the SAFR developer to embed meaning in files that may be useful for view processing, like a partition for stores within each state. Another Logical File can be created which includes all states in a particular region, and another to include all states in the US.

In this module's example, we have the ORDER001 Logical File reading the ORDER001 physical file, the ORDER005 Logical File reading the ORDER005 physical file, and the ORDERALL Logical File reading the ORDER001, 002, 003, 004, and 005 physical files.

Slide 11

In this module we have three sample views, each an Extract-Only view with the same column definitions (3 DT columns). The only difference is the Logical File each reads. View 148 reads only the ORDER001 file, view 147 reads only the ORDER005 file, and view 19 reads all Order files, 1 through 5.

Slide 12

In the first run, each view in our example writes its output to its own extract file. Output files are selected in the view properties tab. The SAFR Physical File metadata component lists the output DD Name for each, similar to the input Event Files.

Slide 13

As GVBMR90 builds the logic table, it copies each view into the “ES Set” which reads a unique combination of source files. Doing so creates the template needed by GVBMR95 to generate the individualized program for each source. Note that output files are typically a single file for an entire view, and thus are typically shared across ES Sets.

In our example, three ES Sets are created.  The first includes views 148 and 19, each with their NV, DTEs, and WRDT functions.  The second ES Set is only view 19.  And the third is views 19 and 147.  The Output04 file will be written to by multiple ES Sets, which contain view 19.

Slide 14

This is the generated logic table for our sample views. It contains only one HD Header function, and one EN End function at the end. But unlike any of the previous Logic Tables, it has multiple RENX Read Next functions and ES End of Set functions. Each RENX is preceded by a File ID row, with a generated File ID for the unique combination of files to be read. Between the RENX and the ES is the logic for each view; only the Goto True and False rows differ for the same logic table functions between ES Sets.

Our example begins with the HD function. ES Set 1, reading file “22” (a generated file ID for the ORDER001 file), contains the view logic for views 148 and 19. The second ES Set, for file “23” (Order files 2, 3, and 4), contains only view 19, and the third set, for file ID “24” (the ORDER005 file), contains the view logic for views 147 and 19 and ends with the EN function.

Next we’ll look at the initial messages for the Logic Table Trace.

Slide 15

When Trace is activated, GVBMR95 prints certain messages during initialization, before parallel processing begins.  Let’s examine these messages.

  • MR95 validates any input parameters listed in the JCL for proper keywords and values
  • MR95 loads the VDP file from disk to memory
  • MR95 next loads the Logic Table from disk to memory
  • MR95 clones ES Sets to create threads, discussed more fully on the next slide
  • MR95 loads the REH Reference Data Header file into memory, and from this table then loads each RED Reference Data file into memory.  During this process it checks for proper sort order of each key in each reference table
  • Using each function code in the logic table, MR95 creates a customized machine code program in memory.  The data in this section can assist SAFR support in locating in memory the code generated for specific LT Functions
  • Next MR95 opens each of the output files to be used.  Opening of input Event files is done within each thread, but because threads can share output files, they are opened before beginning parallel processing
  • MR95 updates various addresses in the generated machine code in memory
  • Having loaded all necessary input and reference data, generated the machine code for each thread, and opened output files, MR95 then begins parallel processing.  It does this by instructing the operating system to execute each generated program.  The main MR95 program (sometimes called the “Mother Task”) then goes to sleep until all the subtasks are completed.  At this point, if no errors have occurred in any thread, the mother task wakes up, closes all extract files, and prints the control report.

Slide 16

The GVBMR90 logic table only includes ES Sets, not individual threads.  When GVBMR95 begins, it detects if multiple files must be read by a single ES Set.  If so, it clones the ES Set logic to create multiple threads from the same section of the logic table.

In our example views, the single ES Set for the Order 2, 3 and 4 files is cloned during MR95 initialization to create individual threads for each input file.
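Conceptually, the cloning step looks something like the following sketch (hypothetical names; not the actual generated machine code): one ES Set covering several physical files becomes one thread per file, each thread running the same view logic against its own file.

```python
import threading

def read_event_file(dd_name):
    # Stand-in for reading a physical file by its DD name.
    return [f"{dd_name}-record-{i}" for i in range(1, 4)]

def view19_logic(record):
    print("view 19 extracted", record)

def clone_es_set(es_set_logic, dd_name):
    """Clone one ES Set's logic into a thread dedicated to one event file."""
    def run():
        for record in read_event_file(dd_name):
            es_set_logic(record)
    return threading.Thread(target=run, name=dd_name)

# The single ES Set reading ORDER002, ORDER003 and ORDER004 becomes 3 threads.
threads = [clone_es_set(view19_logic, dd)
           for dd in ("ORDER002", "ORDER003", "ORDER004")]
for t in threads: t.start()
for t in threads: t.join()
```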

Slide 17

During parallel processing, GVBMR95 prints certain messages to the JES Message Log:

  • The first message shows the number of threads started, and the time parallel processing began
  • As each thread finishes processing, it also prints a message showing:
      • the time it completed,
      • the thread number completed,
      • the DD Name of the input file being read, and
      • the record count of the input file records read for that thread

Each thread is independent of (asynchronous with) every other thread, and with the sleeping Mother Task during thread processing. The order and length of work each performs is under the control of the operating system. A thread may process one or many input event records in bursts, and then be swapped off the CPU to await some system resource or higher priority work. Thus the timing of starting and stopping for each thread cannot be predicted.

Slide 18

The Trace output is shared by all processing threads. Because threads process independently, under the control of the operating system, the order of the records written to the trace is unpredictable. There may be long bursts of trace records from one thread, followed by a long burst from another thread. This would be the case if only two threads were being processed (for two input event files) on a one-CPU machine (there would be no parallel processing in this case either). How long each thread remains on the CPU is determined by the operating system.

Alternatively, records from two or more threads may be interleaved in the trace output as they are processed by different CPUs. Thus the EVENT DDNAME column becomes critical to determining which thread a specific trace record belongs to. The DD Name value is typically the input Event file DD name. The Event Record count (the next column) always increases within a particular thread; thus record 3 is always processed after record 2 for one specific DD Name.

In this example,

  • The ORDER002 thread begins processing, and processes input event records 1 – 13 for view 19 before any other thread processes. 
  • It stops processing for a moment, and ORDER001 begins processing, but completes only 1 input event record for views 148 and 19 before pausing. 
  • ORDER002 picks back up and processes record 14 for its one view. 
  • ORDER005 finally begins processing, completing input record 1 for both views 147 and 19. 
  • ORDER001 begins processing again, record 2 for both views 148 and 19.

Note that this portion of the trace does not show any processing for thread ORDER003 and ORDER004.

Slide 19

In a prior module we showed examples of Trace parameters which control what trace outputs are written to the trace file.  One of the most useful is the TRACEINPUT parameter, which allows tracing a specific thread or threads.  It uses the thread DD Name to restrict output to that thread.

Slide 20

The heading section of the GVBMR95 control report provides information about the GVBMR95 environment and execution.  It includes:

  • SAFR executable version information
  • The system date and time of the execution of the SAFR process.  (This is not the “Run Date” parameter, but the actual system date and time)
  • The GVBMR95 parameters provided for this execution, including overall parameters, environment variables, and trace parameters
  • zIIP processing status information

Slide 21

The remainder of the control report shows additional information when running in parallel mode:

  • The number of threads executed, 5 in this example, is shown
  • The number of views run is not simply a count of the views in the VDP, but of the replicated ES Set views.  In this example, because view 19 reads 5 sources, it counts as 5; with views 147 and 148, that equals 7
  • The greater the number of views and threads run, the larger the size, in bytes, of the generated machine code programs
  • The input record count for each thread is shown, along with its DD name.  Because GVBMR95 ran in full parallel mode, the record counts for the threads are the same as the record counts for each input file.  Later we’ll explain how these numbers can be different.
  • The results of each view against each input DD Name (thread, in this case) are shown, including lookups found and not found, records extracted, the DD name to which those records were written, and the number of records read from that DD Name.  Certain additional features, explained in a later module, can cause the record counts read for a view to differ from the record counts for the input file.

Slide 22

Because SAFR allows sharing output files across threads, it is possible to have many views writing to a single output file. The outputs may either all be the same record format, or differing formats written as variable-length records. This allows complex view logic to be broken up into multiple views, each containing a portion of the logic for a particular type of record output.

Slide 23

This control report shows the results of changing our views to write their outputs to one file. The total count of all records written remains the same; they are now consolidated into one single file, rather than 3 files.

Similar to the Trace output, the order of the records written to the extract file by two or more threads is unpredictable. 

As we’ll discuss in a later module, there is a performance impact for multiple threads writing to a single output file.  Threads ready to write a record may have to wait, and the coordination of which thread should wait consumes resources.  Therefore it can be more efficient to limit the amount of sharing of output files when possible, and combine files (through concatenation, etc.) after SAFR processing.

Slide 24

In this execution, we have included additional views, reading the same input event files, but which produce Format Phase views. The rows shown in red are added to the prior control report as additional outputs from the single scan of the input event files. This also demonstrates how the standard extract files can be shared across multiple threads.

Slide 25

SAFR provides the ability to control the overall number of threads executed in parallel without changing the views. The GVBMR95 Disk and Tape Threads parameters specify how many parallel threads GVBMR95 should execute. Disk threads control the number of threads reading input Event Files on disk, as specified in the SAFR Physical File metadata; tape threads control those input Event Files which are on tape.

The default value for these parameters is 999, meaning effectively all threads should be executed in parallel. If one of the parameters is set to a number less than the number of input Event Files of that type, SAFR will generate only that many threads. GVBMR95 will then process multiple event files serially within one of those threads: as one event file completes processing, the thread examines whether more event files remain to be processed and, if so, processes another event file under the same thread.
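The governor behavior can be sketched as a shared work queue (conceptual only; the parameter and file names are hypothetical): with fewer threads than event files, each thread serially takes the next unprocessed file until none remain.

```python
import queue
import threading

DISK_THREADS = 1   # governor setting; the default 999 means "all in parallel"
event_files = ["ORDER001", "ORDER002", "ORDER003", "ORDER004", "ORDER005"]

work = queue.Queue()
for f in event_files:
    work.put(f)

def governed_thread(thread_id):
    """Process event files serially until the shared queue is empty."""
    while True:
        try:
            dd_name = work.get_nowait()
        except queue.Empty:
            return
        print(f"thread {thread_id} processed {dd_name}")

threads = [threading.Thread(target=governed_thread, args=(i,))
           for i in range(min(DISK_THREADS, len(event_files)))]
for t in threads: t.start()
for t in threads: t.join()
```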

Slide 26

The GVBMR95 control reports show the results of the thread governor. In the top example, the disk governor parameters and the number of records read by each thread are shown. Because each thread processed only one event file, and all were done in parallel, the records read from each event file and by its thread are the same.

In the bottom example, the thread governor has been set to 1, meaning only one thread will be run. In this instance the control report shows that the total records read for the thread equal the total records read for all event files, because this one thread processed each event file serially until all were complete.

Slide 27

The above highlights the key z/OS statistics often tracked from GVBMR95 runs. These include CPU time, elapsed time, memory usage, and I/O counts.

The top set of statistics is from the run with parallelism, the bottom from a run using the Thread Governor with no parallelism. The impact of parallelism can be seen in the difference between the elapsed time for a job with parallelism and one without: parallelism cut the elapsed time in half.

The Extract Program, GVBMR95, is subject to all standard z/OS control features, including job classes, workload classes, priorities, etc. In other words, even with parallel processing, the z/OS job class or priority may be set to such a low level, and the other workloads in higher classes on the machine may be so significant, that effectively no parallel processing occurs even though GVBMR95 instructs z/OS to execute multiple threads. Each of these threads receives the same job class, priority, etc. as the entire Extract Phase job. In this way the operator and system programmer remain in control of all SAFR jobs.

Slide 28

This module described SAFR Parallel Processing. Now that you have completed this module, you should be able to:

  • Describe SAFR’s Parallel Processing feature
  • Read a Logic Table and Trace with multiple sources
  • Set Trace parameters to filter out unneeded rows
  • Configure shared extract files across parallel threads
  • Set thread governors to conserve system resources
  • Debug a SAFR execution with multiple sources

Slide 29

Additional information about SAFR is available at the web addresses shown here. This concludes Module 19, Parallel Processing.

Slide 30

This page does not have audio.

Training Module 18: Extract Time Summarization and Other Functions

The slides for the following video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is Module 18, Extract Time Summarization and Other Functions.

Slide 2

Upon completion of this module, you should be able to:

  • Describe uses for Extract-Phase Record Aggregation (formerly known as Extract-Time Summarization, or ETS)
  • Read a Logic Table and Trace with ETS views
  • Describe how other SAFR functions affect the Logic Table
  • Explain the Function Codes used in the example
  • Debug ETS and other function views

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

In the prior module, we learned how the Format Engine creates summarized outputs. Beyond producing Standard Extract File formatted records, the Extract Engine typically plays no role in summarization. The one exception is Extract Time Summarization, or ETS. With ETS, some level of summarization occurs at extract time.

This can have significant benefits if the final summarized output file is relatively small, but the number of event records required to produce it is very large. ETS reduces the I/O required to write all the detailed extract records, and then for the sort utility to read all those records again. This is a common situation where high-level summaries are required for initial analysis of results, before investigating greater detail.

Slide 5

Use of ETS is specified in the View Properties pane, in the Extract-Phase Record Aggregation parameter. The view developer specifies the use of Extract Time Summarization, and also how many summarized sort keys the Extract Engine should hold in memory during extract time. Only the sort keys for records held in this buffer at any one time are eligible for summarization.

Specifying a large number of records may result in greater summarization during the extract phase. However, the Extract Engine allocates memory equal to the number of records multiplied by the number of bytes in each extract record (multiplied by the number of input partitions the view reads, when running in parallel mode). If large buffers are specified, or many views use ETS, the Extract Engine may require substantial amounts of memory.

Slide 6

Although ETS collapses some records, complete record aggregation is only assured in the Format Phase. ETS may still produce duplicate records, depending on the order of the input records, because the keys required may overflow the ETS buffer. If the total number of output keys exceeds the buffer size, the Extract Engine dumps the record with the least used key to the extract file.

In this example, the ETS record buffer is set to 3 records.  However, the total number of output records in the final file will be 4.  Because of the order of the input records, the extract file will contain two records with the same key. 
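The buffer mechanics can be sketched as follows (a simplification, not the actual GVBMR95 implementation; this version evicts the first-inserted key as a stand-in for SAFR's least-used choice). A key that is evicted and later reappears produces a duplicate key in the extract file, exactly as described above.

```python
BUFFER_RECORDS = 3   # ETS buffer size, as in the slide's example

def ets_extract(records):
    """Accumulate CT amounts by sort key in a bounded buffer; spill on overflow."""
    buffer = {}          # sort key -> accumulated CT amount
    extract_file = []
    for key, amount in records:
        if key in buffer:
            buffer[key] += amount        # collapse into the buffered record
        else:
            if len(buffer) == BUFFER_RECORDS:
                # Buffer full: dump one record to the extract file.
                old_key = next(iter(buffer))
                extract_file.append((old_key, buffer.pop(old_key)))
            buffer[key] = amount
    extract_file.extend(buffer.items())  # dump the remaining buffer at the end
    return extract_file

# "AAA" is evicted and then reappears, so it occurs twice in the output.
data = [("AAA", 1), ("BBB", 2), ("CCC", 3), ("DDD", 4), ("AAA", 5)]
print(ets_extract(data))
```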

Slide 7

After the extract process, the Sort step of the Format Phase places all records with duplicate keys next to each other. Thus, no matter the number of duplicates in the Extract File, all will be eliminated during Format Phase processing.

In the example on the left, the ETS Buffer is set to 3, and two records with the key AAA 111 are written to the extract file. In the Format Phase these records are summarized together. If the ETS Buffer had been set to 5, memory would have been allocated but never used. If the ETS Buffer had been set to 4, no memory would be wasted and no unnecessary I/O would be required.

Slide 8

Consider the following when setting the ETS buffer size:

  • The total number of summarized keys in the output file.  ETS is often used for thousands and even tens of thousands of rows.  However, it is not typically used for hundreds of thousands of rows or more.
  • The sort order of the input file.  If the sort fields for the view are the same as the sort order of the input Event File, then a buffer size of 1 will result in no duplicates in the extract file.  This is because once a row is dumped to the extract file, the same key will not appear again in the input file.
  • The Extract Phase memory available, which will be impacted by:
      • The number of views executing in this Extract Phase
      • The memory required for reference tables being joined to by these views
      • ETS Buffer sizes specified in other views
      • Other items, such as the size of the logic table, buffer sizes for input and output files, Common Key Buffering, etc.

Slide 9

Similar to the Format Phase, ETS only acts upon CT column data for records with the same sort key values. However, unlike Format Phase processing, where multiple column calculations are possible, ETS only performs summarization; multiplication and division of values are not possible in ETS. Also similar to Format Phase processing, the resulting DT values are unpredictable.

In this example the ETS extract file output is only two records, one for “F” and one for “M.” The CT packed values are accumulated to produce the summarized results.

Slide 10

Although Standard Extract File Format is typically used to send data to the Format Phase, Logical Records can be constructed to read a specific view's extract records for additional processing. The ETS output records are often used with SAFR piping to reduce the records processed in downstream threads. Piping will be discussed in a later module.

Slide 11

The only change to the Logic Table when ETS is used is that the WRXT Write Extract Record function is changed to a WRSU Write Summarized Record function. Although a WRXT row can write only one record to the extract file, a WRSU function may write an extract record to the ETS buffer and also a separate overflow record to the extract file if the buffer has overflowed. At the conclusion of Extract Phase processing, all records remaining in the buffer are dumped to the extract file. These additional write functions are not shown in the Logic Table trace; the trace shows this function only once, even though no record, or multiple records, may have been written to the extract file.

Next we’ll examine additional Logic Text keyword functions, and how they impact the Logic Table.

Slide 12

SAFR Logic Text keywords often manipulate the constant portion of a CFEC or CFLC (Compare Field to Constant) function. For example, the BATCHDATE keyword creates a constant of the execution date. Thus the logic text “Selectif(ORDER_DATE<BATCHDATE())”, run on January 3, 2010, creates a CFEC comparing ORDER_DATE to the constant “2010-01-03”.

Many of the keywords allow math. For example, if BATCHDATE contains a +3 inside the parentheses and is run on the same date, the constant would be “2010-01-06” rather than “2010-01-03”.

These functions create very efficient processes, as constant manipulation is not required at run time; only the comparison is required.
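A sketch of this constant folding (a hypothetical resolver, not SAFR code): the keyword and its optional day adjustment are resolved to a literal date string when the logic table is built, leaving only a plain comparison at run time.

```python
from datetime import date, timedelta

def resolve_batchdate(run_date, day_adjust=0):
    """Fold BATCHDATE(+n) into a literal date constant at build time."""
    return (run_date + timedelta(days=day_adjust)).strftime("%Y-%m-%d")

run_date = date(2010, 1, 3)
const_0 = resolve_batchdate(run_date)      # "2010-01-03"
const_3 = resolve_batchdate(run_date, +3)  # "2010-01-06"

# At run time, SELECTIF(ORDER_DATE < BATCHDATE()) is just a comparison
# against the pre-built constant.
order_date = "2010-01-02"
print(order_date < const_0)   # True: the record passes the filter
```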

Slide 13

The Batch Date (also referred to as the “Run Date”) defaults to the system date if not specified in the JCL. It can also be set to a specific value as a JCL parameter to GVBMR86. The date is therefore consistent across all views in that execution of GVBMR86.

This date can also be updated in the Logic Table Build program, GVBMR90. This is useful when VDPs are managed like source code and only created when changes are made to views. In this case the VDP will contain the date when the VDP was created. These dates can be updated in the Logic Table to the current date by running GVBMR90 in the batch flow and using the GVBMR90 parameter.

Because processing schedules may provide inconsistent results in rerun situations or when processing near midnight, some installations have written a small program which accesses an enterprise processing table containing a processing date to create this JCL parameter for use by GVBMR86 or GVBMR90. At the end of successful processing, this same program updates the processing date table with the next day's date.

Slide 14

In addition to BATCHDATE, which returns a full date in CCYYMMDD format, additional keywords provide access to portions of the run date: day, month, quarter, and year. Additionally, each of these keywords accepts a numeric parameter which adds to or subtracts from the current batch date value for use in logic text.

Other keywords allow for calculations and comparisons of dates. The days-, months-, and years-between functions perform the appropriate date math to return the number of days, months, or years between two dates. The returned values can be used to evaluate logic text conditions or placed in columns.

Slide 15

The Fiscal keywords enable “moving” or “sliding” selection criteria using date criteria. For example, if a view should select data for the current month, record filters based on BATCHDATE would need to be updated each month as the value changes. Batch Date returns the actual month value, for example a 9, whereas Fiscal Month is a relative month, with “0” meaning “the current month.” Using the Fiscal Date and Control Record, the resolved constant in the Logic Table changes without requiring any changes to the view.

For example, to select data from the current month, use the logic text SELECTIF(ORDER_DATE = FISCALMONTH(+0)). When this is executed with a Fiscal Date parameter of June, it results in a constant selecting June records. Without updating the view, changing the Fiscal Date parameter to October results in selection of October records. As the fiscal date constant is updated each month, this effectively creates “moving” criteria for the current month. Using the BATCHMONTH keyword instead would require changing the criteria from a value of “6” for June to “10” for October.

Also note that the Fiscal Date parameter is useful when processing on a subsequent day, perhaps past midnight after all business is closed for the prior day. The Fiscal Date can be set to the prior day or month, recognizing that the data is from the last day of the month, while the Batch Date reflects when the process was actually run.

Slide 16

The Fiscal keywords return dates based on the fiscal values in the Control Record for the environment of a view. Each view specifies which Control Record should be used for its fiscal date processing. This means that different views in the same batch run can have different fiscal dates, because they are associated with different Control Records. This is useful for processing views for multiple companies that have differing fiscal year ends. By comparison, RUNDAY is the same for all views in a batch.

In this example, the view is assigned the Fiscal Date for Control Record ID 1.  Other views processed at the same time may be associated with Control Record ID 2.

Slide 17

Similar to the Run Date, the actual fiscal date can be overridden in the JCL parameters. To do so, under the keyword “Fiscal Dates”, the Control Record ID is followed by an equal sign and the override date to be used. Multiple dates for different Control Records can be listed.

Fiscal dates can also be updated through the GVBMR90 parameter for a VDP built previously. If no Fiscal Date parameters are passed to either program, they default to the Run Date for any fiscal date keywords.

In this example, the Fiscal Date for Control Record ID 1 is set to 2010-12-01. Note that the Run or Batch Date is set to June 1, 2010.

Slide 18

Next we’ll show a logic table containing a resolved fiscal date keyword. In this example, we use the FISCALMONTH keyword, which requires a field in date form with a CCYYMM content code. Our selection logic is SELECTIF(ORDER_DATE_CCYYMM = FISCALMONTH(-5)). The Run Date will be set to 2010-06-01, and the Fiscal Date to 2010-12-01. This should result in selecting records from 5 months before the Fiscal Date parameter passed to GVBMR86, or records from fiscal month 2010-07.
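The relative-month arithmetic can be sketched directly (a conceptual helper, not SAFR code): subtract the adjustment from the fiscal date's month, carrying across year boundaries, to produce the CCYYMM constant.

```python
def fiscal_month(fiscal_ccyymm, adjust):
    """Resolve FISCALMONTH(adjust) against a CCYYMM fiscal date."""
    year, month = divmod(fiscal_ccyymm, 100)
    total = year * 12 + (month - 1) + adjust   # months counted from year 0
    return (total // 12) * 100 + (total % 12) + 1

# Fiscal Date 2010-12 with FISCALMONTH(-5) resolves to fiscal month 2010-07.
assert fiscal_month(201012, -5) == 201007
# Year boundaries carry correctly: 2010-01 minus 1 month is 2009-12.
assert fiscal_month(201001, -1) == 200912
print(fiscal_month(201012, -5))
```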

Slide 19

The view logic generates a CFEC function, shown at the top, with a constant of 2010-07 plus trailing digits. Because the comparison is only 6 bytes, the trailing digits have no impact on processing. The Logic Table Trace below this shows the value in HEX mode. In this run, all records failed this test.

Note that had records passed this selection logic, the lookup performed is effective-dated. This can be seen from the LKDC, Lookup Key Date Constant, function in the logic table. Note also that this lookup uses the run date, 2010-06, not the fiscal date or the adjusted fiscal date of the logic text. This may produce inconsistent results between the selection logic and the date-effective join. Care must be taken in creating views to be sure parameters are consistent; viewing the generated Logic Table can help spot these types of inconsistencies.

Slide 20

Like BATCHDATE, additional keywords provide access to portions of the fiscal date, including day, month, period, quarter, and year. These keywords all operate against the Fiscal Date parameter, and can be used in logic text in the view. Each modifies constants in the Logic Table to perform its function. They also allow numeric parameters to perform calendar math, either forwards or backwards.

Slide 21

Other Logic Text functions also manipulate the logic table in particular ways. The string keywords ALL and REPEAT replicate constant parameters in the logic text into constants in the logic table comparison functions. The ISFOUND and ISNOTFOUND functions can change the logic table Goto True and False rows for lookups. Other keywords generate specialized logic table functions, like ISSPACES, which generates a CSL logic table function.

Slide 22

These logic table functions are much less common, but might be seen in logic tables. They perform functions such as declaring accumulator variables, setting those accumulators to values, using them in comparisons, extracting them, and performing math or functions against them. Logic table functions which work against the prior record might also be seen, as might range comparisons or string searches.

Also, if the Logic Text keyword “Begins with” is used, SAFR changes the length of the field being tested to the length of the constant in the logic text. This may result in something like a standard CFEC compare of an Event file field to a constant in the Logic Table, with the adjusted field length.

Slide 23

In this module, we examined the WRSU logic table function, which writes summarized Standard Extract File records.  Other less common logic table functions were also introduced.

Slide 24

This module described extract time summarization and other logic text keyword functions. Now that you have completed this module, you should be able to:

  • Describe uses for Extract-Phase Record Aggregation (formerly known as Extract-Time Summarization, or ETS)
  • Read a Logic Table and Trace with ETS views
  • Describe how other SAFR functions affect the Logic Table
  • Explain the Function Codes used in the example
  • Debug ETS and other function views

Slide 25

Additional information about SAFR is available at the web addresses shown here. This concludes Module 18, Extract Time Summarization and Other Functions.

Slide 26

Training Module 12: The Single Pass Architecture

The slides from the following video are shown below.

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is Module 12, The Single Pass Architecture.

Slide 2

Upon completion of this module, you should be able to:

  • Describe the Single Pass Architecture
  • Read an Extract only Logic Table and Trace
  • Explain the Function Codes used in the example
  • Debug AND and OR selection logic

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

One of the most powerful features of SAFR is the single pass architecture. Input/output processing, or I/O, is often the slowest part of data processing, at times thousands of times slower than other functions. SAFR's single pass architecture allows a single read of the input data to produce many different outputs. Each output can be unique, as specified by individual views.
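The essence of the single pass can be sketched in a few lines of Python (conceptual only; the view functions and field names are hypothetical): each input record is read once and handed to every view's logic, with each view writing to its own output.

```python
def copy_view(record, out):
    out.append(record)   # copy the entire input record

def extract_view(record, out):
    out.append((record["cost_center"], record["amount"]))   # selected fields

def single_pass(event_file, views):
    """One read of the event file drives the logic of every view."""
    outputs = [[] for _ in views]
    for record in event_file:            # the only pass over the input data
        for view, out in zip(views, outputs):
            view(record, out)
    return outputs

events = [{"cost_center": "CC1", "amount": 10},
          {"cost_center": "CC2", "amount": 20}]
copies, extracts = single_pass(events, [copy_view, extract_view])
print(copies)
print(extracts)
```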

Slide 5

In this module, we will continue to build on the prior lesson, where a Copy Only view was processed alone. In this module, we will add an Extract Only view. Neither of these views requires the Reference or Format phases of the SAFR Performance Engine.

In this example, we’ll also execute the Copy View at the same time as the Extract Only View, demonstrating how the SAFR Single Pass Architecture works. The logic table will contain two views in one logic table. The Extract Engine, GVBMR95, will produce two outputs.

Slide 6

Instead of copying all fields on the input records to the output files, the Extract Only view writes selected fields to the output file. Any field may be written to the extract file, in any order, regardless of its order or position on the input file. Field formats may also be changed, for example changing a zoned decimal to a packed field format. These columns cause DTE logic table functions to be generated in the Logic Table. Constants can also be written to the output file.

Constants use DTC functions in the Logic Table.  Extract only views can also contain looked-up fields, which will be explained in the next module, and which generate DTL logic table function codes.

In our example, columns 1, 2, 3 and 5 of the view will extract the Cost Center field, Legal Entity, and the Account and Record Count respectively.

Slide 7

Logic text in Column 4 will cause the output file to contain a constant of either the Asset Account or the Liability Account value. “AssetAcct” will be assigned if the account number field on the input file contains the value “111”, “121”, or “123”. Otherwise, column 4 will contain the Liability account constant, which is the value “LiabAcct”. This logic text will create multiple CFEC functions, introduced in the prior module.

Slide 8

Column 6 contains Logic Text that tests the input amount field. If the amount is a valid number (is numeric), it will be written to the output file. If the amount on the input file is not numeric, a constant of zero will be written to the output file. This logic will generate a CNE function in the logic table.

Slide 9

This is the Logic Table generated for both the Copy View and the Extract Only view. 

The portions of the logic table generated for the Copy Only view example in the last module remain mostly unchanged. It includes the HD Header, RENX Read Next, and logic table rows 3, 4, and 5. The last two rows of the logic table, the ES End of Source (sometimes called End of String) and EN End of Logic Table functions, are very similar as well; only the row numbers on these last two rows have changed.

Slide 10

Our new Extract Only view, number 3262, is inserted in the middle of the logic table.  So each record read from the input file will first be passed through all the Logic Table Functions for the Copy Only view, and then through the logic table functions of our new Extract Only view.

Slide 11

Our Extract only view again includes an NV – New View function, and concludes with a WR function.  In this case, instead of a WRIN function which writes the input record (makes a copy), it is a WRDT function, Write the Data area of the extracted record. 

Slide 12

Columns 1, 2, 3 and 5 simply move fields from the input file to the output file. These columns cause DTE functions to be generated in the logic table. The DTE function simply moves data: the “DT” is derived from “Data”, and the “E” means the source is the input Event file field.

Each DTE function is followed by a Sequence Number.   The Sequence Number for each DTE shows the column number causing that logic to be generated.

Slide 13

Each DTE row also contains the position and length in the source Event file.  These positions, lengths, formats and numbers of decimal places are taken from the Logical Record describing the input file.

Each DTE row also contains the length and format to be placed in the output file.    A difference between the length, format, or number of decimals between the Source and Target columns indicates SAFR has performed a transformation before writing that field. In this example no transformations occurred.

The report does not show the position in the output file. The start position in the final output file is shown in the view editor of the Workbench, but the final output file position may differ from the extract file position, depending on the view type. The extract file position can be calculated by accumulating the lengths of all preceding column outputs.
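That calculation is simple to sketch (hypothetical column lengths): each column's start position in the extract record is one byte past the accumulated lengths of all preceding columns.

```python
def extract_positions(column_lengths):
    """Start position of each column: 1 plus the sum of preceding lengths."""
    positions, start = [], 1
    for length in column_lengths:
        positions.append(start)
        start += length
    return positions

# Columns of 10, 10, 15, 9 and 5 bytes start at positions 1, 11, 21, 36, 45.
print(extract_positions([10, 10, 15, 9, 5]))
```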

Slide 14

The first part of the Logic Text in Column 4 contains the text “If Account = 111 or Account = 121 or Account = 123”. This clause causes multiple CFEC functions (Compare Field, from the Event file, to a Constant) to be generated in the Logic Table. The CFEC functions in logic table rows 10, 11 and 12 are generated by this specific IF statement.

The Sequence Number field of the report shows the column number that contains the logic that created the CFEC.  In this instance, the logic is contained in column 4 of the view.

Slide 15

CFEC functions work together to complete a complex OR or AND test.  In this example, the “OR” statements caused three CFEC functions to be created. 

The first test for Account equal to 111 is performed on Logic Table Row 10.  The second test for Account equal to 121 on Logic Table Row 11, and the third test for Account equal to 123 on Row 12.

Slide 16

Because each CFEC tests the ACCOUNT Field—the “E” or Event File field portion of the CFEC—the position, length and format of the ACCOUNT field is shown in the Source attribute columns.  Because the ACCOUNT is used three times in the logic text, the same position, length and format are used on all three CFEC rows.

Slide 17

The constants (the second “C” of CFEC) are also placed in the logic table. These three constants are placed at the end of their respective logic table rows. All three constants have a comparison type of 001, meaning equal, and all are 3 bytes long.

Slide 18

The OR condition of the logic text determines the numbers placed in the GOTO rows.  If the value in the Account field on the input Event file is “111”, then the result of the first test is “true” and the record should be selected for additional processing within the column.  Thus execution jumps to row 13, as specified in GOTO ROW1, which is the true condition branch.

Slide 19

If the logic table row tests false, then the other tests of the OR condition must be evaluated, including the tests for “121” and “123”.  Thus the false GOTO ROW2 is row 11, the next logic table row, where the next CFEC function tests against a constant of “121” rather than “111”.

Slide 20

The second OR condition creates a similar pattern on the next CFEC function, testing against constant “121” on logic table row 11.  If true, then the next row to be executed is row 13, where the column will use the input record in some way.  If the value is not “121”, then the next row executed is row 12, the next OR condition, which tests against value “123”.

Slide 21

Row 12 tests the Account field on the input Event file for value “123”.  If it tests true, then the next row to be executed is again row 13, which will move a constant to the output record.

If the value is not “123”, then the next row executed is row 15, meaning the ELSE condition for the column value will be used.

Slide 22

The THEN and ELSE portions of the logic text cause additional rows to be generated in the Logic Table.  If a field were to be moved from the input file to the output file, these would be DTE functions, like those generated for columns 1, 2, 3 and 5.  In this example, constants are to be moved to the output record, so DTC functions are generated, DTC meaning Data from a Constant.

Logic Table rows 13 and 15 are both DTC functions.  Row 13 places the constant for Asset Account in the output file if any of the CFEC functions tested true.  Only if the ACCOUNT field on the input file has the value “111”, “121” or “123” will the output column receive the value “AssetAcct”.

On the other hand, row 15 places a value for Liability Accounts.  It is executed if ALL CFEC tests are false.  Thus any ACCOUNT value besides those three tested will result in a value of “LiabAcct” in column four of the output file.

Slide 23

If all rows in the Logic Table were executed sequentially, without skipping any rows, the Liability Account constant would overwrite every Asset Account constant placed in the output record.  To prevent this, a GOTO row is used to skip the Liability Account DTC whenever the Asset Account value has been used.

In our example, Row 14 is a GOTO row.  If Row 13 placed the Asset Account value in the output file, the program naturally falls through to row 14.  The Logic Table then tells the program to jump to row 16, skipping row 15.  This prevents the value of Asset Account from being overwritten with the Liability Account constant.
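
The branching described above can be summarized in a small sketch.  The Python below illustrates how rows 10 through 16 direct execution; the row encoding is invented for this example and is not SAFR’s internal format.

  # Illustrative encoding of logic table rows 10-16 (not SAFR's format).
  # CFEC rows carry (constant, GOTO ROW1 if true, GOTO ROW2 if false).
  rows = {
      10: ("CFEC", "111", 13, 11),
      11: ("CFEC", "121", 13, 12),
      12: ("CFEC", "123", 13, 15),
      13: ("DTC", "AssetAcct"),
      14: ("GOTO", 16),
      15: ("DTC", "LiabAcct"),
      16: ("NEXT",),  # whatever follows column 4
  }

  def column4(account):
      row, output = 10, None
      while row != 16:
          entry = rows[row]
          if entry[0] == "CFEC":
              _, constant, goto_true, goto_false = entry
              row = goto_true if account == constant else goto_false
          elif entry[0] == "DTC":
              output = entry[1]  # move the constant to the output column
              row += 1           # fall through to the next row
          else:                  # GOTO: skip row 15 so "AssetAcct" survives
              row = entry[1]
      return output

  print(column4("111"))  # AssetAcct
  print(column4("456"))  # LiabAcct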

Slide 24

With OR logic, all three CFEC rows branch to row 13, which places the Asset Account value in the output.  If any one of the tests is true, Asset Account is placed in the output.

If our logic text were changed, and we used AND logic to test three different fields, our GOTO ROW1 and ROW2 values would be swapped. AND conditions require that all three rows test true. The effect is that the true and false row numbers switch places for AND versus OR logic.

Slide 25

With AND logic, a true condition on each CFEC causes the logic to continue to the next row of the logic table to continue the test.  After the final test of the AND condition, if all tests have proven true, the DTC function on Logic Table Row 13 is executed to build the Asset Account constant.  If any of the CFEC functions prove untrue, execution continues at Logic Table Row 15, the Liability Account DTC function.

A common debugging problem occurs if logic text requires the same field to contain multiple values, by using AND when OR was intended.  The same field on a single input record can never contain multiple values; for example, the field Account can never equal “111” AND “121”.  The condition would always prove false. Using the Logic Table to read how the logic is interpreted can help uncover these types of problems.
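
The sketch below restates the comparison in plain Python terms, using the values from our example.

  account = "111"

  # OR: true whenever the field holds any one of the listed values.
  print(account == "111" or account == "121" or account == "123")    # True

  # AND on the same single field can never be satisfied, because one
  # field cannot hold three different values at once; always False.
  print(account == "111" and account == "121" and account == "123")  # False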

Slide 26

The last column of the view uses the logic function ISNUMERIC to ensure only valid numeric values are placed in the output file.  The Is Numeric test in column 6 of the view generates a CNE function, a Class Numeric test of an input Event file field.  The CNE function is similar to a CFEC function: it tests a value and directs execution to the GOTO ROW1 or GOTO ROW2 row depending on the result of the test.

The CNE function on row 17 in our example tests whether the input field AMOUNT contains a valid numeric value.  If so, row 18 is executed.  If the input field is NOT numeric, GOTO ROW2 causes row 20 to be executed.

Slide 27

The THEN and ELSE conditions of Logic Text  for column 6 perform different functions.  The THEN Condition causes a field from the input file to be moved to the output file.  The ELSE condition causes a constant of zero to be placed in the output file. 

If the CNE test on row 17 proves true, the THEN condition executes row 18, a DTE function moving the Amount from the input record to the output.

If the value is not numeric, the ELSE condition causes row 20 to be executed, a DTC function placing a constant of zero in the output file.
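
In outline, column 6 behaves like the Python sketch below.  The isdigit test is a simplified stand-in for SAFR’s numeric class test, which also understands packed and other numeric formats.

  def column6(amount_field: str) -> str:
      # CNE row 17: class test numeric on the input AMOUNT field.
      if amount_field.isdigit():   # simplified stand-in for the class test
          return amount_field      # DTE row 18: move the input value
      return "0"                   # DTC row 20: move a constant zero

  print(column6("0042"))   # "0042" - numeric, copied to the output
  print(column6("alpha"))  # "0"    - not numeric, constant used instead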

Slide 28

The final instruction of the Extract Only view is the WRDT function. This function  is generated by default at the end of a view if there is no WRITE Logic Text function in the view.  In these cases, it is always executed. 

In contrast to the WRIN function, which moves the input record to the output file, the WRDT function moves data from the Extract record area to the output file.  All of the data moved to the extract record through the DT functions, both DTCs and DTEs, is actually written to the output file.

The WRDT is followed by a Sequence Number 1, meaning it writes its data to file number 1.  This is the same file the WRIN function of the Copy View uses.  Thus after the first input record is processed, the first record in the output file may be the copy of the input record selected by the Copy View, followed by the Extract Only data of the second view.

Slide 29

Having examined the Logic Table, let’s use our three-record file again to see how it behaves through the trace process.  Trace is turned on by setting the TRACE parameter to “Y” in the MR95 Parameters file.

Slide 30

The first three rows of the trace are for view 3261, our Copy View from the last module. The first input record is compared against the constant in the CFEC function.  The comparison is true, and so the next row of the logic table is executed. 

Slide 31

Because the test proved true, the input record is copied to the output file by the WRIN function.

Slide 32

Because our SAFR execution includes more than one view, instead of looping to the next Event file record for the Copy view to process, the input record is passed to our new Extract Only view, number 3262.

Slide 33

Note that some rows of the logic table are not executed as record 1 is processed in this example.  Rows 11 and 12 are not executed because of the OR condition in the Logic Text; the first condition proved true, so tests 2 and 3 were not necessary.

Also, row 15, which would have placed the Liability Account constant in the output file, was skipped by the GOTO function on row 14.

Row 20 was also skipped, because the amount was a valid number, so it was not replaced by a constant zero.

Note that the trace does not convert packed and other unprintable data to a printable format.  The number tested on row 17 appears to print as a “p”, but if viewed in hex mode, it will display as a valid packed number, based upon the format of the field used in the numeric test.
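
Packed decimal stores two digits per byte, with the sign in the final half byte, so its bytes rarely correspond to printable characters.  The Python sketch below decodes a packed value; the sample bytes are illustrative, not taken from the trace.

  def unpack(packed: bytes) -> int:
      # Each byte holds two half-byte digits; the last half byte is the sign.
      nibbles = "".join(f"{b >> 4:X}{b & 0x0F:X}" for b in packed)
      sign, digits = nibbles[-1], nibbles[:-1]
      return -int(digits) if sign == "D" else int(digits)

  print(unpack(b"\x12\x34\x5C"))  # 12345 (C sign = positive)
  print(unpack(b"\x00\x97\x0C"))  # 970; note the byte 0x97 is "p" in EBCDIC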

Slide 34

The Single Pass architecture allowed the same record to be used to create two output records: one an identical copy, and one containing selected fields and constants. The second record in the output file contains many of the same fields as the input but in a different order; the Cost Center and Legal Entity fields are each built by DTE functions.

The Account value of “111” is written to the same position as in the input record by a DTE function.

The Account Description is next, in this case Asset Account, built by a DTC function. 

Having tested the amount on the input file and found it to be numeric, the view copies it to the last position of the output file using a DTE function.

Both views are able to make use of the same input record, without having to read the file twice.  By making changes to the view, these output records could also have been written to different output files if desired.
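
Conceptually, the single pass behaves like the loop sketched below in Python.  The two view functions stand in for the compiled logic table; the field names and selection value mirror this example, but the code itself is illustrative, not actual SAFR code.

  def single_pass(event_records, views):
      output = []
      for record in event_records:    # the event file is read only once...
          for view in views:          # ...and drives every view in the pass
              result = view(record)
              if result is not None:  # a view may filter a record out
                  output.append(result)
      return output

  def copy_view(record):
      # WRIN: write the input record unchanged if Legal Entity is 999.
      return record if record["legal_entity"] == "999" else None

  def extract_view(record):
      # WRDT: write the fields built in the extract record area.
      return (record["cost_center"], record["legal_entity"], record["account"])

  records = [
      {"legal_entity": "999", "cost_center": "C1", "account": "111"},
      {"legal_entity": "731", "cost_center": "C2", "account": "456"},
  ]
  print(single_pass(records, [copy_view, extract_view]))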

Slide 35

Record 2 follows a similar pattern.  It is passed first to the Copy view, which writes it to the extract file, and then to the Extract Only view.

Note that the AMOUNT field on input record 2 contains the non-numeric value “alpha”.  This causes the Extract Only CNE test to be false, and thus the value is not moved to the output record.  Rather, a packed constant of zero is moved to the output file (which is unprintable in this slide and shown as a series of periods).

Slide 36

Record 3 is read but is NOT selected by the Copy Only view because the Legal Entity test is false; the Legal Entity is 731, not the required 999.  Thus this record is not written to the extract file, and the input record is passed to the Extract Only view.

When record 3 is processed by the Extract Only view, it is written to the output file.

Slide 37

The GVBMR95 Control Report shows that the three records in the input file have become 5 records in the output files:  2 for the Copy Only view, and 3 for the Extract Only view.

The Extract Engine has significantly increased efficiency over alternative methods of producing these two outputs: in a single pass of the file, one set of I/O operations to bring the event file into memory has produced both outputs. Certainly programs can be written to do the same thing, but that requires a programmer to design the program that way. With SAFR, two people can create views independently, and the tool will resolve them efficiently.

Remember, though, that this process does not include any parallelism. View number 2 is executed after view number 1 has seen the event record.  We’ll explain parallelism in a later module.

Slide 38

This logic table and module have introduced the following Logic Table Function Codes:

  • CNE, a Class test Numeric on an event file field.
  • DTE, which moves an input field to the output buffer
  • DTC, which moves a constant from the logic table to the output buffer
  • WRDT, which writes the DT data to the extract file

Slide 39

This module described the single pass architecture and additional logic table processing using the Extract Only view. Now that you have completed this module, you should be able to:

  • Describe the Single Pass Architecture
  • Read an Extract Only Logic Table and Trace
  • Explain the Function Codes used in the example
  • Debug AND and OR selection logic

Slide 40

Additional information about SAFR is available at the web addresses shown here. This concludes Module 12, The Single Pass Architecture.

Slide 41

Training Module 8: Common Performance Engine Errors

The slides used in this video are shown below:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is module 8: “Common Errors Encountered When Running the Performance Engine.”

Slide 2

This module provides you with an overview of common errors encountered when running the SAFR Performance Engine. Upon completion of this module, you should be able to:

  • Diagnose and resolve common Performance Engine errors,
  • Set output record limits to assist in debugging, and
  • Analyze the relationships between components using the Dependency Checker.

Slide 3

Under normal circumstances, the SAFR Performance Engine transforms data and provides useful outputs. However, if a Performance Engine program detects an unusual condition, it will display an error message. Errors can occur for many different reasons.

Recall that the SAFR Performance Engine consists of six phases. Each phase creates data sets that are read by subsequent phases. If the JCL that executes a program does not point to the data set that the program is expecting, or if it is missing a needed data set, an error is reported.

Errors can also occur if keywords in JCL parameters are missing or misspelled. On the following slides, we’ll cover the most common error conditions and describe how to remedy them.

Slide 4

Recall that the Select phase produces a control report named MR86RPT, or the MR86 Report. This control report contains useful statistics for the current run and, if an error occurs, an error message to help you diagnose the problem.

Slide 5

A view must be active and saved to the Metadata Repository to be selected by the MR86 program. If it is not, an error message is displayed in the MR86 Report, specifying the view number and name.

To remedy this problem, find the view in the Workbench, activate the view, save it, and then rerun the job.

Slide 6

The MR86 program looks for views only in the environment specified by the ENVIRONMENTID parameter in the MR86 configuration file. If the MR86 Report displays a message saying that no views can be found, the problem is often that the wrong environment has been specified there. This value must match the value that is displayed in the Workbench status line, which is the bottom line of the Workbench window. We can see here that the true environment ID is 7 and the value for ENVIRONMENTID in the JCL is 7777, which is a typographical error. In such cases, you correct the ENVIRONMENTID value in the configuration parameters and rerun the job.

Slide 7

For the MR86 program to find your views, the database connection parameters in the MR86 configuration file must be specified correctly. One connection parameter is the schema. The value of the schema parameter must match the value that is labeled as “database” in the Workbench status line. In this example, the schema value is SAFRTR01, but the SCHEMA parameter is SAFRTR01XXXX. Because these two values do not match, a database error is displayed in the MR86 Report. In such cases, you should correct the SCHEMA value in the configuration parameters and rerun the job.

Slide 8

In the Compile phase, the MR84 program reads the XML file created by the prior step and converts it to a form that can be executed by the Extract phase later. This process, called compiling, checks for syntax errors and other errors in the XML file and displays associated messages in the MR84 Report. This report seldom displays any error messages because under normal circumstances, views that are read by the MR84 program have already been compiled by the Workbench. If you encounter a problem in this phase, report it to SAFR Support.

Slide 9

In the Logic phase, the MR90 program reads the VDP file created by the prior step and converts it into data sets that are used later by the Reference phase and the Extract phase. Errors that occur in this phase are rare and are usually the result of database corruption. If you encounter a problem in this phase, report it to SAFR Support.

Slide 10

In the Reference phase, the MR95 program converts reference data into a form that is appropriate for lookup operations that are executed subsequently in the Extract phase. MR95 produces a control report known as the MR95 Report, which either indicates successful completion or displays errors that must be corrected before proceeding.

Slide 11

The view shown here reads the ORDER_001 logical file and displays data from two lookup fields: CUSTOMER_ID and NAME_ON_CREDIT_CARD.

Slide 12

We can see from the column source properties that the CUSTOMER_ID field receives its value from data accessed by the CUSTOMER_from_ORDER lookup path.

Slide 13

We can also see from the column source properties that the NAME_ON_CREDIT_CARD field receives its value through a different lookup path named CUSTOMER_STORE_CREDIT_CARD_INFO_from_ORDER.

Slide 14

If the JCL for the Reference job does not contain DD statements for all reference files needed to support the selected views, the job terminates with an abnormal end (or “abends”). The MR95 Report displays a message with the DD name of the missing file, which in this case is CUSTCRDT.

Slide 15

The error condition in this case can be remedied by adding the CUSTCRDT file to the JCL and rerunning the job.

Slide 16

The Reference Extract Header (or “REH”) file, with a DD name of GREFREH, is required for the Reference phase to function correctly. If the DD statement for the REH file is missing, an error message is displayed in the MR95 Report, saying that the DCB information for the file is missing. To remedy the problem, add the DD statement for GREFREH to the JCL and rerun the job.

Slide 17

Similarly, a Reference Extract Detail (or “RED”) file associated with every lookup path is required for the Reference phase to function correctly. DD names for RED files begin with the letters “GREF” and end with a 3-digit number. If the DD statement for a RED file is missing, an error message is displayed in the MR95 Report, saying that the DCB information for the file is missing. In this case, the GREF004 file is missing. To remedy the problem, add the DD statement for GREF004 to the JCL and rerun the job.

Slide 18

For reference files to function correctly in SAFR, there are two rules which must be followed:

  • First, records must have unique keys
  • And second, records must be sorted in key sequence.

If either of these rules is broken, a “lookup table key sequence” error message is displayed in the MR95 Report. To resolve the problem, we must first examine the error message and note the DD name associated with the reference work file in error. In this case, the DD name is GREF004. We can then use this to find the DD name of the source reference file. Scanning upwards in the MR95 Report, we can see that GREF004 is written using records read from the DD name CUSTCRDT.

Slide 19

To research this problem further, we must find the length of the key associated with the logical record for Customer Credit Card Info. In this case, it is 10. We must also find the data set associated with the DD name CUSTCRDT. From the JCL, we can see that it is GEBT.TRN.CUSTCRDT.

Slide 20

If we open the data set GEBT.TRN.CUSTCRDT on an ISPF Edit screen, we can see that records 5 through 9 all have the same key value. But rule #1 stated that each key must be unique. We can also see that the records are not in key sequence, which breaks rule #2. To solve our original problem, we remove the duplicate records and sort the records in key sequence. Doing this and then rerunning the job will remove the error condition.
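
Both rules can be restated as a single check: each record’s key must be strictly greater than the key before it.  The Python sketch below applies that check.  The 10-byte key length comes from the logical record above, but the file name and the assumption that the key starts in position 1 are illustrative.

  KEY_LENGTH = 10  # from the customer credit card logical record
  KEY_OFFSET = 0   # assumed: the key begins in position 1 of each record

  def check_key_sequence(records):
      previous = None
      for number, record in enumerate(records, start=1):
          key = record[KEY_OFFSET:KEY_OFFSET + KEY_LENGTH]
          if previous is not None and key <= previous:
              kind = "a duplicate key" if key == previous else "out of sequence"
              print(f"record {number}: {key!r} is {kind}")
          previous = key

  # Hypothetical local copy of the reference file:
  with open("custcrdt.txt") as f:
      check_key_sequence(line.rstrip("\n") for line in f)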

Slide 21

Sometimes when you are debugging a problem, it is useful to limit the number of records written to an extract file. You can set such a limit in the Workbench on the Extract tab of the View Properties screen for a view. Processing for the view will stop after the specified number of records is written. Remember, you must rerun from the Select phase to incorporate the view changes into the Extract phase.

Slide 22

You can set a similar limit for the final output file that is written in the Format phase. You can set such a limit in the Workbench on the Format tab of the View Properties screen for a view. Processing for the view will stop after the specified number of view output records is written. Again, you must rerun from the Select phase to incorporate the view changes into the Format phase.

Slide 23

The Dependency Checker is a useful tool for debugging views. You can select it from the Reports menu of the Workbench.

Slide 24

In this example, you want to see all direct dependencies for view #71. First, select the appropriate environment, and then select the component type, which in this case is View. Then select view #71 from the list.

For debugging purposes, we normally select the Show direct dependencies only check box. Clearing the check box would cause indirect dependencies to be displayed, too, and this is typically useful only when an impact analysis is needed for a project.

Slide 25

After clicking Check, you see a hierarchical list of all the SAFR metadata components that are used by the view. This helps you determine whether any components that should not have been referenced are referenced and whether any components are missing. Portions of the list can be expanded by clicking the plus sign or collapsed by clicking the minus sign.

Slide 26

In the Extract phase, MR95 executes a series of operations that are defined in the Extract Logic Table, or XLT. The MR95 Trace function causes these operations to be written to a report to aid you in debugging a view. The Trace function can be selected and configured in MR95PARM, the MR95 parameter file.

Trace can be used to diagnose many SAFR problems:

  • LR problems
  • Lookup problems
  • View coding errors, and
  • Data errors

The Trace function will be described in detail in an advanced SAFR training module.

Slide 27

When you are developing a view, you should apply several guidelines to make your work more efficient.

Begin by coding a small subset of the requirements for a view, and then testing that. For example, you may choose to code a view with only a small number of columns or a smaller record filter. Once you have successfully gotten this view to work, you can add more complexity and continue until the view is complete.

It is useful to start testing with only a small number of records. In this way, you can more easily see the results of your view logic.

Verify that the output format for each column is correct. If it isn’t, review the column properties.

Check to make sure that the number of records written is what you expected. If it isn’t, check the record filters to verify that the logic is correct.

Finally, verify that the output value for each column is correct. If it isn’t, examine the column logic.

Slide 28

This module provided an overview of common errors encountered when running the SAFR Performance Engine. Now that you have completed this module, you should be able to:

  • Diagnose and resolve common Performance Engine errors
  • Set output record limits to assist in debugging, and
  • Analyze the relationships between components using the Dependency Checker

Slide 29

Additional information about SAFR is available at the web addresses shown here.

Slide 30

Slide 31

This concludes Module 8. Thank you for your participation.