Proposed GenevaERS 5.0 Architectural Direction

The following are some initial thoughts on where the next version of GenevaERS as an Open Source project might go:

Workbench

Currently the only way to specify GenevaERS processes (called GenevaERS “views”) is through the Workbench, a structured environment for specifying column formats and the values used to populate those columns, including logic written in GenevaERS Logic Text.

GenevaERS developers have known for years that in some cases a simple language would be easier to use. The structured nature of the Workbench is useful for simple views, but becomes more difficult to work with for more complex views.

Specific Proposals

In response, we propose the following enhancements to the front end of GenevaERS:

  • Open up a new method of specifying logic besides the Workbench, initially Java (see the sketch following this list).
  • This language would be limited to the subset of Java functions supported by the extract engines.
  • The current Workbench compiler would be modified to produce a GenevaERS logic table from the Java code submitted to it.
  • Develop plug-ins for major IDEs (Eclipse, IntelliJ) that highlight use of functions GenevaERS does not support in the language.
  • GenevaERS Performance Engine Processes should be able to construct a VDP (View Definition Parameter) file from a mix of input sources.
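
The following is a purely hypothetical sketch of what view logic expressed in a restricted Java subset might look like; the class, record, and method names are invented for illustration and are not part of any current or planned GenevaERS API.

    // Hypothetical sketch only: none of these types exist in GenevaERS today.
    // The idea is that a "view" becomes a small class whose logic is limited to
    // constructs the extract engines can translate into a logic table.
    public class AssetLiabilityView {

        // A stand-in for fields that would be defined by a Logical Record.
        public record OrderEvent(String costCenter, String legalEntity,
                                 String account, long amount) {}

        // Record-level selection logic (general selection in Workbench terms).
        public boolean select(OrderEvent e) {
            return "999".equals(e.legalEntity());
        }

        // Column logic: each array element corresponds to one output column.
        public String[] columns(OrderEvent e) {
            String acctType = switch (e.account()) {
                case "111", "121", "123" -> "AssetAcct";
                default -> "LiabAcct";
            };
            return new String[] { e.costCenter(), e.legalEntity(), e.account(),
                                  acctType, String.valueOf(e.amount()) };
        }

        public static void main(String[] args) {
            AssetLiabilityView view = new AssetLiabilityView();
            OrderEvent event = new OrderEvent("CC01", "999", "121", 500);
            if (view.select(event)) {
                System.out.println(String.join("|", view.columns(event)));
            }
        }
    }

An IDE plug-in of the kind proposed above would flag any construct outside the supported subset, and the modified Workbench compiler would turn the accepted code into a GenevaERS logic table.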

Benefits

Doing this would allow:

  • Storage of GenevaERS process logic in source code management systems, enabling all the benefits of change management
  • Opening up to other languages; often extracts from GenevaERS repositories are more SQL-like because of the data normalization that happens within the repository
  • Taking in logic specified for Spark and other execution engines which conform to GenevaERS syntax, providing higher performance throughput for those processes
  • Beginning to open the possibility of constructing “pass” specifications in logic, rather than defining them only in execution scripting.
  • Perhaps creating in-line “exit”-like functionality wherein the specified logic can be executed directly (outside the Logic Table construct).

Performance Engine

The GenevaERS Performance Engine uses the Logic Table and VDP file to resolve the GenevaERS processes (GenevaERS “views”).

Specific Proposals

Proposed enhancements include:

  • Expanding today’s very efficient compiler to support more functions, covering a broader range of constructs expressed in these languages.  This would include greater control over looping mechanisms, temporary variables, greater calculation capability, and execution of in-line code and called functions within a view under the GenevaERS execution engine.
  • If there is interest and capacity, we may even move in the longer term towards an alternative Java execution engine.  This would be a dynamic Java engine, similar to the GenevaERS extract engine today, not one statically created for specific business functions as discussed below.

Alternative Engines

Existing customers have expressed interest in alternative execution engines for SAFR logic, as demonstrated in the Initial Open Source Work: Metadata Translation. The following are approaches that could be taken to address these requests:

View to Stand-alone Java Program: It is possible to consider creating utilities which translate GenevaERS metadata to another language, each view becoming a stand-alone program that could simply be maintained as a custom process. This would provide a migration path for very simple GenevaERS processes to other tooling when performance is not important.

Multi-View Execution Java Program: A set of GenevaERS views (a VDP) could be converted to a single Java program which produces multiple outputs in a single pass of the input file, similar to what GenevaERS does today.  In other words, it is possible to look at how GenevaERS performs the one-pass architecture, isolates differing business logic constructs from each other, performs joins, etc., and write new code to do these functions.  This would also capture performance benefits learned from the GenevaERS architecture.
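
As a rough illustration of this one-pass idea, the following Java sketch (all names invented) shows the control flow such a generated multi-view program might use: each input record is read once and handed to every view in turn, so a single read of the event file produces several outputs.

    import java.util.List;
    import java.util.function.Consumer;

    // Hypothetical sketch of a generated multi-view, single-pass program.
    // "View" here is simply record-level logic plus an output destination;
    // it is not an actual GenevaERS class.
    public class SinglePassRunner {

        interface View {
            void process(String record, Consumer<String> output);
        }

        // One pass over the input: every record visits every view.
        static void run(Iterable<String> eventRecords, List<View> views,
                        Consumer<String> output) {
            for (String record : eventRecords) {
                for (View view : views) {
                    view.process(record, output);
                }
            }
        }

        public static void main(String[] args) {
            View copyView = (rec, out) -> out.accept(rec);           // copy the input as-is
            View extractView = (rec, out) -> out.accept(rec.trim()); // stand-in transformation
            run(List.of("rec1 ", "rec2 "),
                List.of(copyView, extractView), System.out::println);
        }
    }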

Dynamic Java Program: Today’s GenevaERS compiler, which produces a Logic Table, could be converted to produce a Java (or other language) executable. This might add the benefit of making the processes dynamic, rather than static.  This would have benefits for changing rules and functions in the GenevaERS Workbench, offer some measure of consistent performance for those functions, and potentially benefit from a growing community of GenevaERS developers creating new functions.

Next Steps

These ideas will be discussed at an upcoming GenevaERS community call to gauge interest and resources which might be applied.

Research on Next Generation SAFR: 2017

The following is an R&D effort to define the next generation of SAFR, composed by Kip Twitchell, in March 2017. It was not intended to be a formally released paper, but the ideas expressed herein have continued to resonate with later assessments of the tool direction.

Proposal for Next Generation SAFR Engine Architecture

Kip Twitchell

March 2017

Based upon my study over the last two months, in again considering the next generation SAFR architecture, the need to separate the application architecture from the engine architecture has become clear: the same basic combustion engine can be used to build a pump, a generator, a car, or a crane.  Those are applications, not engines.  In this document, I will attempt to outline a possible approach to the Next Generation SAFR engine architecture, and then give example application uses after that.

This document has been updated with the conclusions from a two-day workshop at SVL, in the section titled “Workshop Conclusions” following the Proposal in Summary.

Background

This slide from Balancing Act is perhaps a simple introduction to the ideas in this document:

(from Balancing Act, page 107)

Key SAFR Capabilities

The SAFR architecture recognizes that building balances from transactions is critical to almost all quantitative analytical and reporting processes.  All financial and quantitative analytics begins with a balance, or position.  Creating these is the key SAFR feature.

Additionally, SAFR enables generation of temporal and analytical business events in the midst of the posting process, and posts these business events at the same time.  In other words, some business events are created from outstanding balances.  SAFR does this very effectively.

Third, the rules used to create balances and positions sometimes change.  SAFR’s scale means it can rebuild balances when rules change: for reclassification and reorganization problems, changes in required analyses, or new, unanticipated analyses.

Fourth, SAFR’s high-speed join capabilities, especially date-effective joins, and the unique processing model for joins on high-cardinality, semi-normalized table structures create capabilities not available in any other tool.

Last, the ability to do all these things in one pass through the data (post initial transactions, generate new transactions, reclassify some balances or create new ones, and join all the dimensions together, while producing multiple outputs), and to do so efficiently, creates a unique set of functions.

This combined set of features is what is enabled by the approach shown in the bottom diagram.

Key SAFR Limitations

SAFR has often been associated with posting or ETL processes, which are typically more closely aligned with the tail end of the business event capture processes.  It is also associated with the report file generation processes at the other end of the data supply chain.  Yet it has not fully embraced either end of this spectrum: it has attempted to provide a simplified user interface for reporting processes that was never completely a language for expressing posting processes, and it has not integrated with SQL, the typical language of reporting processes.

The idea behind this paper is that SAFR should more fully embrace the ends of this spectrum.  However, to be practical, it does not propose attempting to build a tool covering the end-to-end data supply chain, the scope of which would be very ambitious. Covering the entire data supply chain would require building two major user interfaces, for business event capture and then for reporting or analysis.  The approach below contemplates allowing most of the work on either of these interfaces to be done by other tools, which would use the SAFR engine, to reduce the scope of the work involved.

To keep this paper brief, it assumes an understanding of how SAFR processes data today.  This document is intended for an internal SAFR discussion, not a general customer engagement.

Proposal in Summary

In summary, SAFR should be enhanced as follows:

Overall

1) The SAFR engine should be made a real-time engine, capable of real-time update and query.

2) It should also be configurable to execute in a batch mode if the business functions require (system limitations, posting process control limitations, etc.).

Posting Processes

3) The existing workbench should be converted or replaced with a language IDE, such as Java, JavaScript, Scala, etc. The compile process for this language would permit only the subset of those languages’ functions that fits the SAFR processing construct.

4) When the engine is executed in “Compiled Code” mode (either batch or real time), it would generally perform functions specified in this language, and the logic would be static.

5) Real-time mode application architecture would typically use SAFR against indexed access inputs and outputs, i.e., databases capable of delivering specific data for update, and applying updates to specific records.

Reporting Processes

6) SAFR should have a second logic interpreter in addition to the language IDE above, and accept SQL as well.  This would generally be used in an “Interpreted Logic” mode.

7) To resolve SQL, SAFR would be a “bypass” SQL interpreter.  If the logic specified in SQL can be resolved via SAFR, the SAFR engine should resolve the request. 

8) If SAFR cannot satisfy all the required elements, SAFR should pass the SQL to another SQL engine for resolution.  If the data is stored in sequential files and SAFR is the only resolution engine available, an error would be returned.

Aggregation and Allocation Functions

9) SAFR functionality should be enhanced to more easily perform aggregation and allocation functions. 

10) This may require embedded sort processes, when the aggregation key is not the natural sort order of the input file.

11) A more generalized framework for multiple level aggregation (subtotals) should be considered, but may not be needed in the immediate term.

12) Additionally, a generalized framework for hierarchy processes should also be contemplated.

Workshop Conclusions:

In the two-day workshop I made the point that the z/OS version of SAFR does not meet the requirements of this paper, and neither does the SAFR on Spark version.  Beyond this, the workshop concluded the following:

(1) The existing SAFR GUI and its associated metadata tables, language, and XML are all inadequate for the future problems, because the GUI is too simple for programming—thus the constant need for user exits—and it is too complex for business people—thus one customer built RMF;

(2) The database becomes a problem in the traditional architecture because there is an explosion of rows of data in the LR Buffer which then collapses down to the answer sets; with a database in the middle, it all has to be stored before it can be retrieved and collapsed.

(3) Platform is not an issue considered in this paper; the experience of both existing customers seems to demonstrate there is something unique in SAFR that other tools do not have.

(4) We agreed that, although the data explosion and the database in the middle are a problem, taking on the SQL interpretation mode was a bridge too far; we should let the database do the work of satisfying the queries.  SAFR can still produce views which select and aggregate data to be placed in the database for use, as it has historically.

(5) Stored procedures are just too simplistic for the kinds of complex logic in a posting process, so we need to build something outside the database which can access the database for inputs and for storing outputs.

(6) There is not a good reason to require the programmer to go clear to a procedural language to specify the SAFR process; the SAFR on Spark POC SQL is a declarative language that does not put as much burden on the developer.

(7) It is possible to use in-line Java code within the SAFR on Spark prototype when a procedural language is required.  This could eliminate simple user exits.

(8) Spark is a good framework for a distributed workload; z/OS manages the parallel processes very effectively, but requires a big server to do so.  Using another language on another platform would require building that distribution network.  That is expensive.

(9) The real-time capabilities would simply require creation of a daemon that would then execute the SAFR on Spark processes whenever needed.  This is nearly a trivial problem, performance and response time notwithstanding.

(10) We do not have enough performance characteristics to say if the Spark platform and the prototype will scale for any significant needs.

Real-time Engine and IO Models

The following table outlines likely processing scenarios:

                Real-time Mode                                   Batch Mode
Input Types     Database or Messaging Queue                      Sequential Files, Database, or Messaging Queue
Output Types    Database, Messaging Queue, or Sequential Files   Sequential Files

Sequential files in this table include native Hadoop structures.  Output files from batch mode could be loaded into databases or messaging queues beyond SAFR processing, as is typically done today with SAFR.

Recommendation:  SAFR today recognizes different IO routines, particularly on input.  It has IO routines for sequential files, DB2 via SQL, VSAM, and DB2 via VSAM, and has had prototype IMS and IDMS drivers.  This same construct can be carried forward into the new version, continuing to keep the logic specification separate from the underlying data structures, both inbound and outbound.  New data structure accesses can be developed over time.

Enhancement:  An additional processing approach could be possible, but is less likely except in very high volume situations: real-time mode against input sequential files.  In this configuration, SAFR could continually scan the event files.  Each new request would jump into the ongoing scan, noting the input record count where its search begins.  It would continue to the end of the file, loop back to the start of the file, and continue until it reaches the starting record count again, and then end processing for that request.  This would only be useful if very high volumes of table scans and very high transaction volumes made the latency in response time acceptable.
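
The following Java fragment is a minimal sketch (with invented names) of the bookkeeping this circular scan implies: a request joins the continuous scan at the current record position and completes once the scan has wrapped all the way back around to that position.

    // Hypothetical illustration of the "jump into the ongoing scan" idea.
    public class CircularScan {

        public static int answerRequest(String[] eventFile, int currentPosition,
                                        java.util.function.Predicate<String> matches) {
            int hits = 0;
            int n = eventFile.length;
            for (int i = 0; i < n; i++) {                     // exactly one full revolution
                String record = eventFile[(currentPosition + i) % n];
                if (matches.test(record)) {
                    hits++;
                }
            }
            return hits;
        }

        public static void main(String[] args) {
            String[] file = { "A", "B", "A", "C" };
            System.out.println(answerRequest(file, 2, "A"::equals)); // prints 2
        }
    }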

Impediments:  The challenge for a real-time SAFR engine is all the code generation and interpretation processing in the Performance Engine.  Making these functions very fast from their current form would be difficult.  Using an IDE for logic specification has a much better chance of making the performance possible.  Performance for the interpreted mode would require significant speed in interpretation and generation of machine code from the SQL logic.

Compiled Code Mode

Background:  Major applications have been built using SAFR.  The initial user interface for SAFR (then called Geneva, some 25 years ago) was intended to be close to an end-user tool for business people creating reports.  The Workbench (created starting in 1999) moved more and more elements away from business users, and is now used only by IT professionals.

Additionally, because of the increasingly complex nature of the applications, additional features to manage SAFR views as source code have been developed in the Workbench.  These include environment management and attempts at source code control system integration.  These efforts have been adequate for the time, but are not highly competitive with other tools.

SAFR’s internal “language” was not designed to be a language, but rather to provide logic for only certain parts of a view.  The metadata constructs provide structures, linkage to tables or files, partitioning, and join logic.  The view columns provide structure for the fields or logic to be output.  The SAFR language, called Logic Text, only specifies logic within these limited columns and in general record selection.

Recommendation:  Given the nature of the tool, it is recommended SAFR adopt a language for logic specification.  I would not recommend SQL, as it is not a procedural language and would inhibit new functionality over time.

The fundamental unit of compilation would be a Pass or Execution, including the following components:

  • A Header structure, which defines the metadata available to all subprograms or “methods”.  It would contain:
    • The Event and lookup LR structures
    • The common key definition
    • Join paths
    • Logical Files and Partitions
    • Available functions (exits)
  • A View Grouping structure,
    • Defining piping and token groups of views.
    • Tokens and similar (new) functions could potentially be defined as in-line (or view) function calls which can be used by other views.
  • Each view would be a method or subprogram that is called and passed each event file record (see the sketch below).
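
A hypothetical sketch of how such an Execution unit might look as source code follows; every name here is invented for illustration and none of it reflects an existing GenevaERS interface.

    import java.util.List;

    // Hypothetical shape of a compiled "Pass" or "Execution" unit.
    // The Header carries metadata shared by all views; each view is a method
    // invoked once per event record.
    public class DailyPostingExecution {

        // Header: stand-ins for LR structures, common key, join paths,
        // logical files/partitions, and available exits.
        record Header(List<String> logicalRecords, List<String> commonKey,
                      List<String> joinPaths, List<String> logicalFiles,
                      List<String> availableExits) {}

        static final Header HEADER = new Header(
                List.of("ORDER_LR", "CUSTOMER_LR"),
                List.of("LEGAL_ENTITY", "ACCOUNT"),
                List.of("ORDER_LR->CUSTOMER_LR"),
                List.of("ORDER_001", "ORDER_ALL"),
                List.of());

        // View grouping: which views share pipes or tokens (names illustrative).
        static final List<String> TOKEN_GROUP_1 = List.of("view3261", "view3262");

        // Each view is a method called once per event record.
        static void view3261(String eventRecord) { /* copy logic would go here */ }
        static void view3262(String eventRecord) { /* extract logic would go here */ }

        public static void main(String[] args) {
            for (String record : List.of("record1", "record2")) {
                view3261(record);
                view3262(record);
            }
        }
    }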

The final completed Execution code could be compiled, and the resulting executable managed like any other executable; the source code could be stored in any standard source code control library.  All IT control and security procedures could follow standard processes.  This recognizes the static nature of most SAFR application development processes.

Although this approach offloads some portion of the user interface, language definition, and source code management issues, the next gen SAFR code base would have to include compilers for each platform supported.  Additional SAFR executables would be IO modules and other services like join and common key buffer management, again for each platform supported.

Impediments:  A standard IDE and language will likely not include the Header and Grouping structure constructs required.  Can a language and IDE be enhanced with the needed constructs? 

Interpreted Logic Mode

Background:  Certain SAFR processes have been dynamic over the years, in various degrees.  At no time, however, has the entire LR and metadata structure definition been dynamic. In the early days, the entire view was dynamic in that there were fairly simple on-line edits, but some views were disabled during the production batch run. In more recent years fewer pieces have been dynamic.  Reference data values have always been variable and often used to vary function results, in the last two decades always sourced from externally maintained systems like ERP systems.  Most recently, new dynamic aspects have been bolted on through new interfaces to develop business rules facilities.

The SAFR on Spark prototype application demonstrated the capability to turn SAFR logic (a VDP, or the VDP in XML format) into SQL statements quite readily.  It used a CREATE statement to specify the LR structure to be used.  It had to create a new language construct, called MERGE to represent the common key buffering construct.  SQL tends to be the language of choice for most dynamic processes.  SQL also tends to be used heavily in reporting applications of various kinds. 

Recommendation:  Develop a dynamic compile process to turn SAFR-specific SQL into SAFR execution code.  This would allow other reporting tools to gain the benefits of SAFR processes in reporting.  If the data to be accessed by SAFR is in a relational database (either as its only location or replicated there), SAFR could pass off the SQL to another resolution engine when non-SAFR SQL constructs are used.

Impediments:  It is not clear how the MERGE parameters would be passed to the SAFR SQL Interpreter.  If sequential files are used for SAFR, would a set of dummy tables in a relational system provide a catalogue structure sufficient for the reporting tool to generate the appropriate reporting requirements?  Also, can the cost of creating two compilers be avoided?

Enhancement:  CUSTOMER’s EMQ provides a worked example of a reporting application which resolves SQL queries more efficiently than most reporting tools.  It uses SQL SELECT statements for the general selection criteria, and the database functions for sorting (ORDER BY) and some aggregation (GROUP BY).  When considering SAFR working against an indexed access structure, this construct may be similar.  More consideration should perhaps be given.

Compiled Code and Interpreted Logic Intersection

Background:  The newer dynamic, rules-based processes and the older, more static Workbench-developed processes cannot be combined into a single process very easily because of duplicated metadata.  This arbitrary division of functions between Compiled Code mode and Interpreted mode is not desirable; there may be instances where these two functions can be blended into a single execution.

Recommendation: One way of managing this would be to provide a stub method which allows for calling Interpreted processes under the Compiled object.  The IDE for development of the Interpreted Mode logic would be responsible for ensuring conformance with any of the information in the header or grouping structures.

Aggregation and Allocation Processes

Background: Today’s SAFR engine’s aggregation processes are isolated from the main extract engine, GVBMR95, in a separate program, GVBMR88.  Between these programs is a sort step, using a standard sort utility.  This architecture can complicate the actual posting processes, requiring custom logic to table data, sort it (in a sense, through hash tables), detect changed keys, and aggregate the last record before processing new records.
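
For illustration only, the following is a minimal Java sketch (invented names) of the kind of in-memory aggregation the new engine could provide natively, grouping extracted records by key in a hash map so that no separate external sort step is needed:

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch: aggregate extracted amounts by key using a hash map,
    // avoiding the external sort between the extract and aggregation steps.
    public class InlineAggregation {

        record Extracted(String key, long amount) {}

        static Map<String, Long> aggregate(List<Extracted> records) {
            Map<String, Long> totals = new LinkedHashMap<>();
            for (Extracted r : records) {
                totals.merge(r.key(), r.amount(), Long::sum);  // accumulate by key
            }
            return totals;
        }

        public static void main(String[] args) {
            System.out.println(aggregate(List.of(
                    new Extracted("111", 100),
                    new Extracted("121", 50),
                    new Extracted("111", 25))));  // prints {111=125, 121=50}
        }
    }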

Additionally, today’s SAFR does not natively perform allocation processes, which require dividing transactions or balances by some set number of additional records or attributes, generating new records.  Also, SAFR has limited variable capabilities today. 

Recommendation:   Enhancing SAFR with these capabilities in designing the new engine is not considered an expensive development; it simply should be considered and developed from the ground up.

Next Steps

Feedback should be sought on this proposal from multiple team members, adjustments to it documented, and rough order of estimates of build cost developed.

Training Module 12: The Single Pass Architecture

The slides from the following video are shown below.

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR.  This is Module 12, The Single Pass Architecture

Slide 2

Upon completion of this module, you should be able to:

  • Describe the Single Pass Architecture
  • Read an Extract only Logic Table and Trace
  • Explain the Function Codes used in the example
  • Debug AND and OR selection logic

Slide 3

The prior modules covered the function codes highlighted in blue.  Those to be covered in this module are highlighted in yellow.

Slide 4

One of the most powerful features of SAFR is the single pass architecture.  Input / Output processing or IO is often the slowest part of data processing, at times thousands of times slower than other functions.  SAFR’s single pass architecture allows for a single read of the input data to produce many different outputs.  Each output can be unique, as specified by individual views.

Slide 5

In this module, we will continue to build on the prior lesson, where a Copy Only view was processed alone, by adding an Extract Only view.  Neither of these views requires the Reference and Format phases of the SAFR Performance Engine.

In this example, we’ll also execute the Copy View at the same time as the Extract Only View, demonstrating how the SAFR Single Pass Architecture works.  The logic table will contain both views, and the Extract Engine, GVBMR95, will produce two outputs.

Slide 6

Instead of copying all fields on the input records to the output files, the Extract Only view writes selected fields to the output file.  Any field may be written to the extract file, in any order, regardless of the order or position on the input file. Field formats may also be changed, for example changing a zoned decimal to a packed field format. These columns will cause DTE Logic Table functions to be generated in the Logic Table.  Constants can also be written to the output file.  

Constants use DTC functions in the Logic Table.  Extract only views can also contain looked-up fields, which will be explained in the next module, and which generate DTL logic table function codes.

In our example, columns 1, 2, 3 and 5 of the view will extract the Cost Center field, Legal Entity, and the Account and Record Count respectively.

Slide 7

Logic text in Column 4 will cause the output file to contain a constant of either the value Asset Account or Liability Account. “AssetAcct” will be assigned if the account number field on the input file contains the value “111”, “121” or “123”.  Otherwise column 4 will contain the Liability Account constant, which is the value “LiabAcct”.  This logic text will create multiple CFEC functions, introduced in the prior module.

Slide 8

Column 6 contains Logic Text that tests the input amount field.  If the amount is a valid number (is numeric), it will be written to the output file.  If the amount on the input file is not numeric, a constant of a zero will be written to the output file.  This logic will generate a CNE function in the logic table.
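
Expressed in ordinary code, the logic the view specifies for columns 4 and 6 is roughly as follows. This is an illustration only; what actually executes are the generated logic table rows described in the next slides, not Java.

    // Illustrative equivalent of the Logic Text for columns 4 and 6.
    public class ColumnLogicExample {

        // Column 4: account classification constant.
        static String column4(String account) {
            if (account.equals("111") || account.equals("121") || account.equals("123")) {
                return "AssetAcct";   // constant written on the "true" branch
            } else {
                return "LiabAcct";    // constant written on the "else" branch
            }
        }

        // Column 6: pass the amount through if numeric, otherwise write zero.
        static String column6(String amount) {
            return amount.matches("-?\\d+") ? amount : "0";   // numeric class test
        }

        public static void main(String[] args) {
            System.out.println(column4("121") + " " + column6("alpha"));  // AssetAcct 0
        }
    }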

Slide 9

This is the Logic Table generated for both the Copy View and the Extract Only view. 

The portions of the logic table generated for the Copy Only view example in the last module remain mostly unchanged.  They include the HD Header, RENX Read Next, and logic table rows 3, 4 and 5.  The last two rows of the logic table, the ES End of Source (sometimes called End of String) and EN End of Logic Table functions, are very similar as well.  Only the row numbers on these last two rows have changed.

Slide 10

Our new Extract Only view, number 3262, is inserted in the middle of the logic table.  So each record read from the input file will first be passed through all the Logic Table Functions for the Copy Only view, and then through the logic table functions of our new Extract Only view.

Slide 11

Our Extract only view again includes an NV – New View function, and concludes with a WR function.  In this case, instead of a WRIN function which writes the input record (makes a copy), it is a WRDT function, Write the Data area of the extracted record. 

Slide 12

Columns 1, 2, 3 and 5 simply move fields from the input file to the output file.  These columns cause DTE functions to be generated in the logic table.  The DTE function simply moves data.  The “DT” is derived from “Data”, and the “E” means the source is the input Event File field.

Each DTE function is followed by a Sequence Number.   The Sequence Number for each DTE shows the column number causing that logic to be generated.

Slide 13

Each DTE row also contains the position and length in the source Event file.  These positions, lengths, formats and numbers of decimal places are taken from the Logical Record describing the input file.

Each DTE row also contains the length and format to be placed in the output file.    A difference between the length, format, or number of decimals between the Source and Target columns indicates SAFR has performed a transformation before writing that field. In this example no transformations occurred.

The report does not show the position in the output file. The start position in the final output file is shown in the view editor of the Workbench.  But the final output file position may be different from the extract file position, depending on the view type.  The extract file position can be calculated by accumulating the lengths of all preceding column outputs.  For example (using illustrative lengths), if columns 1 through 3 have output lengths of 4, 2, and 6 bytes, column 4 begins at position 13 of the extract record.

Slide 14

The first part of the Logic Text in Column 4 contains the text “If Account = 111 or Account = 121 or Account = 123”.  This clause causes multiple CFEC functions (Compare Field, from the Event file, to a Constant) to be generated in the Logic Table.  The CFEC functions in logic table rows 10, 11 and 12 are generated by this specific IF statement.

The Sequence Number field of the report shows the column number that contains the logic that created the CFEC.  In this instance, the logic is contained in column 4 of the view.

Slide 15

CFEC functions work together to complete a complex OR or AND test.  In this example, the “OR” statements caused three CFEC functions to be created. 

The first test for Account equal to 111 is performed on Logic Table Row 10.  The second test for Account equal to 121 on Logic Table Row 11, and the third test for Account equal to 123 on Row 12.

Slide 16

Because each CFEC tests the ACCOUNT Field—the “E” or Event File field portion of the CFEC—the position, length and format of the ACCOUNT field is shown in the Source attribute columns.  Because the ACCOUNT is used three times in the logic text, the same position, length and format are used on all three CFEC rows.

Slide 17

The constants—the second “C” of the CFEC—are also placed in the logic table.  These three constants are placed at the end of the respective logic table rows.  All three constants have a comparison type of 001, which is equal, and are all 3 bytes long.

Slide 18

The OR condition of the logic text determines the numbers placed in the GOTO rows.  If the value in the Account field on the input Event file is “111”, then the result of the first test is “true” and the record should be selected for additional processing within the column.  Thus execution will jump to row 13, as specified in Goto Row1, which is the true-condition branch.

Slide 19

If the logic table row tests false, then the other tests of the OR condition must be evaluated, including testing for “121” or “123”.  Thus the False GOTO Row is row 11, the next logic table row, where the next CFEC function will test against a constant of “121” rather than “111”.

Slide 20

The second OR condition creates a similar pattern on the next CFEC function, testing against constant “121” on logic table row 11.  If true, then the next row to be executed is row 13, where the column will use the input record in some way.  If the value is not “121”, then the next row executed is row 12, the next OR condition to test against value “123”.

Slide 21

Row 12 tests the Account field on the input Event file for value “123”.  If it tests true, then the next row to be executed is again row 13, which will move a constant to the output record.

If the value is not “123”, then the next row executed is row 15, which means the else condition for the column value will be used.

Slide 22

The THEN and ELSE portions of the logic text cause additional rows to be generated in the Logic Table.  If a field was to be moved from the input to the output file, they would be DTE functions, like those generated for columns 1 and 2 and others.  In this example, constants are to be moved to the output record, so DTC functions are generated, DTC meaning Data from a Constant.

Logic Table rows 13 and 15 are both DTC functions.  Row 13 places the constant for Asset Account in the output file if any of the CFEC functions tested true.  Only if the ACCOUNT field on the Input File has the value “111”, “121” or “123” will the output column receive the value of “AssetAcct” in it. 

On the other hand, row 15 places a value for Liability Accounts.  It is executed if ALL CFEC tests are false.  Thus any ACCOUNT value besides those three tested will result in a value of “LiabAcct” in column four of the output file.

Slide 23

If all rows in the Logic Table were executed sequentially, in order, without skipping any rows, the constant of Liability Account would overwrite all the Asset Account constants placed in the output record.  To prevent this, a GOTO row is used to skip the DTC for Liability Account whenever an Asset Account is used.

In our example, Row 14 is a GOTO row.  If Row 13 placed the Asset Account value in the output file, the program naturally falls through to row 14.  The Logic Table then tells the program to jump to row 16, skipping row 15.  This prevents the value of Asset Account from being overwritten with the Liability Account constant.

Slide 24

With OR logic, all three CFEC rows execute row 13, which places the Asset Account value in the output.  If any one of the rows is true, Asset Account is placed in the output. 

If our logic text were changed, and we used AND logic to test three different fields, our GOTO ROW1 and ROW2 would be swapped. AND conditions require that all three rows test true. The effect is that the True and False row numbers switch places for AND versus OR logic.

Slide 25

With AND logic, a true condition on each CFEC causes the logic to continue to the next row of the logic table to continue the test.  After the final test of the AND condition, if all tests have proven true, the DTC function on Logic Table Row 13 is executed to build the Asset Account constant.  If any of the CFEC functions prove untrue, execution continues at Logic Table Row 15, the Liability Account DTC function.

A common debugging problem occurs if logic text requires the same field to contain multiple values by using AND when OR was intended.  The same field on a single input record can never contain multiple values. For example the field Account can never equal “111” AND “121”.  The condition would always prove false. Using the Logic Table to read how the logic is interpreted can help uncover these types of problems.
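
As an illustration of the branching pattern the GOTO rows implement (not actual engine code), the OR and AND cases behave like the following Java methods; the AND version assumes three different fields are being tested, as described above.

    // Illustrative only: how the GOTO Row 1 (true) / Row 2 (false) pattern behaves
    // for OR versus AND logic across three comparisons.
    public class GotoRowPattern {

        // OR: any true comparison jumps straight to the "AssetAcct" row;
        // only after all comparisons fail does control reach the "else" row.
        static String orLogic(String account) {
            if (account.equals("111")) return "AssetAcct";  // true -> row 13
            if (account.equals("121")) return "AssetAcct";  // true -> row 13
            if (account.equals("123")) return "AssetAcct";  // true -> row 13
            return "LiabAcct";                              // all false -> row 15
        }

        // AND: each true comparison falls through to the next test;
        // any false comparison jumps straight to the "else" row.
        static String andLogic(String field1, String field2, String field3) {
            if (!field1.equals("111")) return "LiabAcct";   // false -> row 15
            if (!field2.equals("121")) return "LiabAcct";   // false -> row 15
            if (!field3.equals("123")) return "LiabAcct";   // false -> row 15
            return "AssetAcct";                             // all true -> row 13
        }

        public static void main(String[] args) {
            System.out.println(orLogic("123"));                 // AssetAcct
            System.out.println(andLogic("111", "121", "999"));  // LiabAcct
        }
    }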

Slide 26

The last column of the view tests to ensure only valid numeric values are placed in the output file using the Logic Function “ISNUMERIC”.  The “Is Numeric” function in Column six of the view generates a CNE function, a Class Numeric test of an input Event file field.  The CNE function is similar to a CFEC function.  It tests a value and directs execution to the GOTO 1 or 2 rows depending on the results of the test.

The CNE function on row 17 in our example tests whether the input field AMOUNT contains a valid numeric value.  If so, row 18 is executed.  If the input field is NOT numeric, GOTO ROW 2 will cause row 20 to be executed.

Slide 27

The THEN and ELSE conditions of Logic Text  for column 6 perform different functions.  The THEN Condition causes a field from the input file to be moved to the output file.  The ELSE condition causes a constant of zero to be placed in the output file. 

The true test of the THEN condition of the CNE test on row 17  will execute row 18, a DTE function, moving the Amount from the input record to the output. 

The false ELSE condition on a non-numeric value causes row 20 to be executed, a DTC function placing a constant of 0 in the output file.

Slide 28

The final instruction of the Extract Only view is the WRDT function. This function  is generated by default at the end of a view if there is no WRITE Logic Text function in the view.  In these cases, it is always executed. 

In contrast to the WRIN function, which moves the Input Record to the output file, the WRDT function moves data from the Extract record area to the output file.  All of the data moved to the extract record through the DT functions, both DTCs and DTEs, is actually written to the output file.

The WRDT is followed by a Sequence Number 1, meaning it writes its data to file number 1.  This is the same file the WRIN function of the Copy View uses.  Thus after the first input record is processed, the first record in the output file may be the copy of the input record selected by the Copy View, followed by the Extract Only data of the second view.

Slide 29

Having examined the Logic Table, let’s use our three record file again to see how it behaves through the trace process.  Trace is turned on by setting the TRACE parameter to “Y” in the MR95 Parameters file.

Slide 30

The first three rows of the trace are for view 3261, our Copy View from the last module. The first input record is compared against the constant in the CFEC function.  The comparison is true, and so the next row of the logic table is executed. 

Slide 31

Because the test proved true, the input record is copied to the output file by the WRIN function.

Slide 32

Because our SAFR execution included running more than one view, instead of looping to the next Event File record for the Copy Input View to process, the input record is passed to our new Extract Only view, number 3262.

Slide 33

Note that some rows of the logic table are not executed as record 1 is processed in this example.  Rows 11 and 12 are not executed because of the OR condition in the Logic Text; the first condition proved true, so tests 2 and 3 were not necessary.

Also, Row 15, which would have placed the Liability Account in the output file, was skipped by the GOTO function on row 14.

Row 20 was also skipped, because the amount was a valid number, so it was not replaced by a constant zero.

Note that the trace does not convert Packed and other “un-printable data” to a printable format.  The number tested on row 17 appears to print as a “p”, but if viewed in HEX mode, will display as a valid packed number based upon the format of the field used in the Numeric test. 

Slide 34

The Single Pass architecture allowed the same record to be used to create two output records, one which was an identical copy, and one which contained selected fields and constants. The second record in the output file contains many of the same fields as the input, but in a different order, for the Cost Center and Legal Entity, each built by DTE functions.

The Account value of “111” is written to the same position as in the input record by a DTE function

The Account Description is next, in this case Asset Account, built by a DTC function. 

Having tested the amount on the input file and found it to be numeric, the view copies it in the last position to the output file using a DTE function.

Both views are able to make use of the same input record, without having to read the file twice.  By making changes to the view, these output records could also have been written to different output files if desired.

Slide 35

Record 2 follows a similar pattern.  Record 2 is passed first to the Copy view which writes it to the extract file.  Record 2 is then passed to the Extract Only View. 

Note that the AMOUNT field on input record two has a non-numeric amount of “alpha” in it.  This causes the Extract Only CNE test to be false, and thus this value is not moved to the output record.  Rather, a packed constant of zero is moved to the output file (which is unprintable in this slide and shown as a series of periods).

Slide 36

Record 3 is read, and is NOT selected by the Copy Only view because the Legal Entity tests false; the Legal Entity is 731, not the 999 required.  Thus this record is not written to the extract file, and the input record is passed to the Extract Only view.

When record 3 is processed by the Extract Only view, it is written to the output file.

Slide 37

The GVBMR95 Control Report shows that the three records in the input file have become 5 records in the output files:  2 for the Copy Only view, and 3 for the Extract Only view.

The Extract Engine in this process has significantly increased efficiency over alternative methods of producing these two outputs, because in a single pass of the file, one IO to bring the event file into memory has allowed both outputs to be produced. Certainly programs can be written to do this same thing, but it requires a programmer to design the program that way. With SAFR, two people can create views independently, and the tool will resolve them efficiently.

Remember, though, that this process does not include any parallelism. View number 2 is executed after view number 1 has seen the event record.  We’ll explain parallelism in a later module.

Slide 38

This logic table and module have introduced the following Logic Table Function Codes:

  • CNE, a Class test Numeric on an event file field.
  • DTE, which moves an input field to the output buffer
  • DTC, which moves a constant from the logic table to the output buffer
  • WRDT, which writes the DT data to the extract file

Slide 39

This module described the single pass architecture and additional logic table processing using the Extract Only view. Now that you have completed this module, you should be able to:

  • Describe the Single Pass Architecture
  • Read an Extract only Logic Table and Trace
  • Explain the Function Codes used in the example
  • Debug AND and OR selection logic

Slide 40

Additional information about SAFR is available at the web addresses shown here. This concludes Module 12, The Single Pass Architecture

Slide 41

Training Module 2: Performance Engine Overview

The following slides are used in this online video:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting (or SAFR). This is module 2, “Performance Engine Overview.”

Slide 2

This module provides you with an introduction to the SAFR Performance Engine and how to use it. By the end of this training, you should be able to:

  • Identify the phases of the Performance Engine
  • Identify the main inputs and outputs for each phase
  • Give a high-level overview of the reports from each phase, and
  • Run a Performance Engine job stream

Slide 3

As we noted in the “Introduction to SAFR Views” module of this training course, SAFR consists of two software components: a PC-based Workbench and a mainframe-based batch process known as the Performance Engine. Developers use the Workbench to build applications that are stored in a metadata repository in an IBM® DB2® database. These applications are then executed by the Performance Engine, which reads data from source files or databases, transforms it, and writes it to output files. In this sense, SAFR is an application development tool and is not fundamentally different from any other tool or language.

Slide 4

The Performance Engine comprises six phases:

  • The SELECT phase, where the desired views are selected from the Metadata Repository
  • The COMPILE phase, where the selected views are converted to an executable form
  • The LOGIC phase, where logic from multiple views is consolidated and optimized for execution
  • The REFERENCE phase, where values are retrieved from lookup tables and optimized for execution
  • The EXTRACT phase, where data from source files is retrieved, merged with lookup data, and transformed for output, and
  • The optional FORMAT phase, where data is sorted, summarized, and formatted if necessary

Slide 5

The primary SAFR program in the Select phase is GVBMR86 (The names of all programs in the SAFR Performance Engine start with the letters “GVB,” so it is common to refer to this program as MR86). MR86 reads a list of view numbers and finds the corresponding views in the SAFR Metadata Repository. It then processes these views and converts them to an Extensible Markup Language (or XML) file. It also produces a control report known as the MR86 Report, which indicates successful completion or displays errors that may have to be corrected before proceeding.

Slide 6

The view numbers for the desired views are placed in the JCL, using a DD name called VIEWLIST. In this example, we are specifying view 90, which is the view we created in the “Introduction to SAFR Views” module of this training course.

Slide 7

For a view to be selected, it must first be set to Active status in the Workbench. If the view is not active, the MR86 Report will contain the error message “View is inactive.” A view that is inactive cannot be run.

Slide 8

When errors occur, MR86 also issues a condition code of 8 for the job step, which typically causes the remaining job steps to be skipped.

Slide 9

Once the view is activated in the SAFR Workbench, MR86 can be rerun and the view should be successfully selected for processing. If no other errors are encountered, the MR86 Report does not contain the word “ERROR” at the beginning of any row, and MR86 returns a condition code of 0 for the job step, allowing remaining job steps to run.

Slide 10

The primary SAFR program in the Compile phase is GVBMR84. MR84 reads the XML file created by the previous step and converts it to a form that can be executed by the Extract phase later. This new file is called the View Definition Parameters (or VDP) file. Because multiple versions of the VDP file are used in the SAFR job stream, this one is referred to as the MR84 VDP to indicate the program that wrote it. MR84 also produces a control report known as the MR84 Report.

Slide 11

The MR84 Report indicates successful completion or displays XML syntax errors and view compile errors that have to be corrected before proceeding. If there are no errors, the process continues. 

The MR84 Report also shows how columns in the view are translated into a specialized SAFR construct known as a logic table.

Slide 12

The primary SAFR program in the Logic phase is GVBMR90. MR90 reads the MR84 VDP file created by the previous step and creates a new VDP and logic tables that are used by the next steps, the Reference and Extract phases. The new VDP is called the MR90 VDP to distinguish it from the input VDP file. The logic tables are known as the MR90 JLT (for “Join Logic Table”) and the MR90 XLT (for “Extract Logic Table”).

The next programs in the Logic phase, which are not shown here, read these files and produce converted versions known as the MR77 VDP, the MR76 JLT, and the MR76 XLT.

Slide 13

The MR90 Report displays summary statistics for the Logic Table Creation process. In this case, the Extract Logic Table contains nine records. This view included no joins, so no records were written to the Join Logic Table. The report also displays, in detail format, the contents of the Extract Logic Table and the Join Logic Table if applicable.

Slide 14

The primary SAFR program in the Reference phase is GVBMR95. MR95 reads the MR77 VDP file and the MR76 JLT file created by earlier steps and one or more Reference Data files. It then produces a Reference Extract Header (or REH) that describes the format of the reference data being processed. It also creates one Reference Extract Detail (or RED) file for each Reference Data file read. These files contain only the data required to perform joins in the Extract phase. Finally, MR95 produces a control report known as the MR95 Report, which indicates successful completion or displays errors that may have to be corrected before proceeding.

Slide 15

If this job stream does not access any Reference Data files, the MR95 Report displays a warning message saying that there is an “empty or abbreviated logic table.” If you do not intend to access any Reference Data files, this is considered a normal condition and you may proceed to the next step.

To be processed correctly, reference data must be sorted in key sequence and have no records with duplicate keys.

Slide 16

The primary SAFR program in the Extract phase is GVBMR95, which is the same program that is run in the Reference phase. In reviewing the job stream output, you should take care not to confuse the outputs of the two executions of MR95.

MR95 reads the MR77 VDP file, the MR76 XLT file, and the REH, and RED files created by earlier steps, along with source data for the process (Source data is sometimes referred to as business event data or just event data).

MR95 executes various transformations and, depending on specifications in the selected views, produces one or more View Output files. It may also produce one or more Extract Work files, which are temporary files processed by the Format phase. Finally, it produces an MR95 Report, which presents summary statistics for the process. 

Slide 17

Before running the Extract phase job, you must make sure that DD statements for any required input files are included, containing the data to be scanned and extracted according to the view requirements. The DD names (such as CUSTOMER or ORDER001) must match those in the Workbench physical files referenced by the views being run.

Slide 18

Similarly, you also must ensure that DD statements for any required output files are included. These DD names are determined by view parameters in the Workbench.

Slide 19

The Extract phase MR95 Report displays statistics that are useful for audit trails and performance tuning. In this example, we can see that 12 records were read from the ORDER001 file, 12 records were processed by view 90, and 12 records were written to the OUTPUT01 file.

Slide 20

A secondary program in the Extract phase is GVBUT90, which produces the UT90 Report. This report displays information from the Extract Logic Table (XLT), which shows each step that is executed in a data transformation. This is useful information for debugging views. A more detailed discussion of the logic table is presented in the training module entitled “Basic Debugging.”

Slide 21

The primary SAFR program in the Format phase is GVBMR88. MR88 reads the MR77 VDP, REH, and RED files created by earlier steps, along with the temporary Extract Work files for the process. It then sorts, summarizes, and formats the data, producing one or more View Output files. Additional details will be provided in the training module entitled “Introduction to the Format Phase.”

Slide 22

This module provided an overview of the SAFR Performance Engine. Now that you have completed this module, you should be able to:

  • Identify the phases of the Performance Engine
  • Identify the main inputs and outputs for each phase
  • Give a high-level overview of the reports from each phase, and
  • Run a Performance Engine job stream

Slide 23

Additional information about SAFR is available at the web addresses shown here.

Slide 24

Slide 25

This concludes Module 2. Thank you for your participation.

Training Module 1: Introduction to Views

The following slides are used in this online video:

Slide 1

Welcome to the training course on IBM Scalable Architecture for Financial Reporting, or SAFR. This is Module 1: Introduction to SAFR Views.

Slide 2

This module provides you with a broad overview of the SAFR product and the basics of its graphical user interface. By the end of this training, you should be able to:

  • Identify the two major software components of the SAFR product
  • Identify the major types of SAFR metadata
  • Navigate the SAFR Workbench, and
  • Create a simple view. 

Slide 3

SAFR is a software application development tool that solves high-volume data analysis and reporting problems. SAFR provides capabilities for data transformation, data mining, database query, and financial reporting.

Slide 4

SAFR consists of two software components: the PC-based Workbench and the mainframe-based batch process known as the Performance Engine. Developers use the Workbench to build applications that are stored in a metadata repository in an IBM®  DB2®  database. These applications are then run by the Performance Engine, which reads data from source files or databases, transforms it, and writes it to output files.

Slide 5

Several types of metadata make up a SAFR application. The most common are the environment definition, the physical file definition (or PF), the logical file definition (or LF), the logical record definition (or LR), the view definition, and the view folder. 

Note that, when discussing SAFR metadata, we often omit the word definition because it is usually clear from the context whether we mean the metadata or the entity it refers to. 

Slide 6

An environment definition describes a logical collection of metadata within the SAFR Workbench. Typical types of environments include development, production, or training environments. Access to an environment can be restricted to a certain set of users. 

Slide 7

A physical file definition, or PF, describes a data source. Examples include customer or order files. A logical file definition, or LF, describes a collection of one or more physical files. A logical record definition, or LR, describes a record layout. In COBOL programs, record layouts are often found in copybooks. In relational databases, they are found in table definitions.

Some examples of these metadata types are shown here. 

  • A logical record for a customer is used to map data to a customer logical file. The customer logical file refers to data in a customer physical file. 
  • An order logical record is used to map data from a logical file named ORDER_001, which refers to data in a single physical file named ORDER_001. 
  • The order logical record can also be used to map data from a logical file named ORDER_ALL. ORDER_ALL refers to a collection of order physical files.

Slide 8

A view definition describes a data transformation. It is analogous to a program or a query. Views are the basic units of work that are performed by the Performance Engine. 

Views are often grouped together into view folders for ease of maintenance. View folders are often named for a particular developer or function. Security can be applied to view folders to prevent unauthorized access. 

Slide 9

The SAFR Workbench is used to add, change, and delete SAFR metadata. It contains a menu and toolbar, and consists of multiple display areas, or frames. 

  • The Navigator area displays the types of metadata available. 
  • The Metadata List area displays a list of items for the selected metadata type. 
  • And the Editor area is the part of the screen where you modify metadata items.

Slide 10

  • By expanding the View Folders item in the Navigator area, you can see a list of all view folders. 
  • The contents of the selected folder are displayed in the Metadata List area.
  • From there, you can select a view for editing and the view will be displayed in the Editor area.

Slide 11

View information is displayed on two separate screens:

  • The View Editor screen, where you can define specific data transformations.
  • The View Properties screen, where you can modify information that applies to the whole view. 

Slide 12

You use the General tab on the View Properties screen to specify the output format – flat file or hardcopy report – and other related information. 

This tab also displays information about when the view was created and last modified and by whom. 

Slide 13

In addition, the General tab displays the name of the view folder where the view is stored.  

Slide 14

You can access advanced features on the Extract Phase tab and the Format Phase tab. You can open these tabs by single-clicking them. 

Slide 15

You can toggle back and forth between the View Properties screen and the View Editor screen by clicking the first icon in the Editor area toolbar, or by pressing the F9 key.

Slide 16

In View Editor mode, the Workbench displays several frames of view information. 

Slide 17

The View Editor grid displays the characteristics of view output columns. These characteristics include the data type, the length, and the alignment, such as left, right, or center. 

Slide 18

You can display information about the data source for the view by right-clicking a blue cell in the View Editor grid. This information includes the logical record and the logical file. 

Slide 19

To open a frame showing the column source properties, you right-click a green cell. The source of a column’s data can be a field in the source file, a constant, a lookup value, or the result of a formula. 

Slide 20

The View Editor incorporates several functions, such as inserting a column or activating a view. You can run a View Editor function in several ways:

  • Select it from the Edit menu or the Action menu for the Workbench
  • Left-click the function icon on the View Editor toolbar
  • Right-click in the View Editor grid and select the function from the pop-up menu
  • Or press the appropriate key combination, which is noted on the Workbench menu and the pop-up menu. 

Choose whichever technique you prefer. 

Slide 21

To add a new view source, you right-click on the grid to display the pop-up menu, and then select Insert and View Source

Slide 22

The Insert View Source window opens. You can select from a list of data sources in the window. 

Slide 23

Now let’s take what you’ve learned and create your own view. The following example is a simple data transformation, reading data from the ORDER001 file and writing out only the Order ID, Customer ID, and Total Amount fields. 

Slide 24

If we were to code a conventional program, we would:

  • Define the file attributes
  • Define the record layouts
  • Code the business logic
  • Compile the program
  • Link the program and
  • Run the program
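
For comparison, a minimal sketch of what the core of such a conventional program might look like in Java follows. The comma-delimited layout, the field positions, and the output file name are illustrative assumptions only; the actual ORDER001 layout is defined by the ORDER logical record in the Workbench metadata.

    // Hypothetical stand-alone equivalent of the simple transformation view.
    // The delimited layout and field positions are assumptions for illustration;
    // the real layout comes from the ORDER logical record.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.PrintWriter;

    public class SimpleTransformation {
        public static void main(String[] args) throws Exception {
            try (BufferedReader in = new BufferedReader(new FileReader("ORDER001.txt"));   // file attributes
                 PrintWriter out = new PrintWriter(new FileWriter("EXTRACT001.txt"))) {    // assumed output name
                String line;
                while ((line = in.readLine()) != null) {
                    String[] field = line.split(",");       // record layout (assumed comma-delimited)
                    String orderId     = field[0];          // ORDER_ID
                    String customerId  = field[1];          // CUSTOMER_ID (assumed position)
                    String totalAmount = field[2];          // ORDER_TOTAL_AMOUNT (assumed position)
                    out.println(orderId + "," + customerId + "," + totalAmount);   // business logic
                }
            }
        }
    }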

Slide 25

With the SAFR tool, the first three steps are performed in the Workbench and the last three are performed for you by the Performance Engine. 

Slide 26

Defining files and records will be covered in the SAFR training module entitled “Creating Metadata.” The Performance Engine will be introduced in the “Performance Engine Overview” module. The topic of coding business logic will be introduced in the next few slides. 

Slide 27

In the example described in the following slides, metadata has been pre-populated in the Workbench for ease of instruction. 

To create a new view, click the Administration menu, select New and then select View.

Slide 28

The View Properties tab opens. 

Slide 29

Enter a descriptive name for the view, such as “Simple_Transformation_View.” Note that embedded spaces are not allowed in names, so you must use underscores to separate words. Next, clear the Format Phase check box; this feature is not needed for this simple view.

Slide 30

To display the View Editor grid, select Show Grid or Properties from the Edit menu (Alternatively, you can click the toolbar icon or press the F9 key).

Slide 31

From the Edit menu, select Insert, and then select View Source (Alternatively, you can right-click and select Insert and then select View Source from the pop-up menu, or you can press the Shift key and the Insert key). The Insert View Source window opens.

Slide 32

From the Logical Record list, select ORDER. Then, from the Logical File list, select ORDER_001.

Slide 33

From the Edit menu, select Add Column (Alternatively, you can click the plus sign icon on the toolbar or press Alt and the Insert key). A new column is added to the grid. 

Slide 34

Click the green cell. The Column Source Properties frame opens on the right. 

Slide 35

In the Column Source Type field, click the list box and select Source File Field.

Slide 36

In the Column Source Value field, click the list box and select ORDER_ID.  

Slide 37

Repeat the previous steps to add columns for Customer ID and Order Total Amount. Then save the view by selecting Save from the File menu, or by clicking the Save icon in the Workbench toolbar, or by pressing Control and S. 

Slide 38

To activate the view, use any of these methods:

  • Select Activate from the Action menu
  • Click the Activate icon on the View Editor toolbar
  • Press F5. 

The view title bar now displays the word “active”. Save the view again to preserve this active state. The view is now ready to be run.

Slide 39

This module provided an overview of SAFR. Now that you have completed this module, you should be able to:

  • Identify the two major software components of the SAFR product
  • Identify the major types of SAFR metadata
  • Navigate the SAFR Workbench and
  • Create a simple view

Slide 40

Additional information about SAFR is available at the web addresses shown here.

Slide 41

Slide 42

This concludes Module 1. Thank you for your participation.

SAFR Overview for Technical Audiences: 2012

Although this deck was started in 2009, from 2012 onward it was the primary tool used in talking about GenevaERS (renamed the IBM Scalable Architecture for Financial Reporting in 2009).

Documentation of its development is contained in the book Balancing Act: A Practical Approach to Business Event Based Insights, available for download from ledgerlearning.com.

SAFR and A Smarter Planet for Financial Reporting: 2011

The following was written as part of a renewed sales initiative for SAFR, building upon the latest financial ledger projects.

This write-up, written for IBM sales efforts, was used as the summary of Balancing Act: A Practical Approach to Business Event Based Insights, Chapter 2: The Problem and Chapter 3: The Solution.

The Problem

The recent financial crisis has exposed the systemic problem that the world’s largest financial institutions cannot adequately account for and report on liquidity, positions, currency exposure, credit, market, and interest rate risk, and product, customer, and organizational performance.  The CFO plays a critical role in correcting this problem by leveraging the financial data they already control, as well as leveraging scale to take out cost.  But even industry insiders do not realize that financial institutions suffer a unique set of domain problems when it comes to financial reporting.

Current financial reporting systems are antiquated and very inefficient. They were designed decades ago simply to track the flow of capital, revenue, and expenses at the company and department levels.  The lack of transparency is evident in the increasing costs of the finance function with few benefits to show for the investment.  Sarbanes-Oxley and other regulations have proven ineffective at getting at the root of the problem, and the regulations resulting from the financial meltdown may well prove similarly ineffective.  These pressures create diseconomies of scale which affect the largest institutions the most.

For the most part, existing systems deliver accurate results in summary, but increased transparency requires line of sight to the underlying causes of those results. Consider if your personal bank account statement or credit card bill only presented the change in balance between periods, but provided no list of transactions. When the statement is as expected, further detail may not be needed. But when the balance is in question, your first response is ‘why,’ and you immediately want to see the transaction detail.  The same issues are at stake when managing the finances of the enterprise – with the associated cost and consequences considerably higher!  A single instance of financial restatement has cost some organizations hundreds of millions of dollars to correct, not counting lost market valuation.

Currently 90% of the money supply in mature markets is represented by digital records of transactions and not hard currency. It’s no wonder that the volume of electronic finance records being kept has exploded compared to when the systems were first created. Yet our approach to these demands has not been to automate the process of keeping and accessing the details of the transactions. Almost all employees in today’s financial institutions are involved in capturing and coding financial details in some way, and a large number of non-finance employees are involved in the investigative process to locate the additional detail so often required. The effort of this manual intervention is incredibly inefficient and costly.

As we see all around us, computing capacities have increased by several orders of magnitude since these finance systems were designed. However, reporting processes have grown organically as a system of transaction summaries in order to continue to bridge multiple financial systems – but have lacked a single unified approach. This has meant that for the most part the business of financial reporting has not benefited from the increase of computing capacities available today.

The Solution

A Smarter Planet is founded on financial markets that provide for greater transparency and comprehension of the financial reporting by bank and non-bank entities, allowing the markets to react to conditions in more informed, less reactionary ways. IBM has spent 25 years refining an approach to this for financial institutions. The IBM® Scalable Architecture for Financial Reporting™ (SAFR) system provides financial reporting that is built bottom-up from the individual business event transactions to provide the finest grained views imaginable.

By harnessing today’s computing power and a straight-through processing approach, the details behind summary data can be made available in seconds rather than days or weeks. Providing nearly instant access to the highest quality financial data at any level of granularity will eliminate the duplicative reporting systems which tend to capture and produce summaries of the same business events for many stakeholders and reporting requirements.

More importantly, it will automate the hidden work of armies of people who are required to investigate details and attempt to explain results, or attempt to reconcile the disparate result of these many reporting systems—a truly wasteful activity caused by the systems themselves. Keeping the details in a finance system that can serve these needs allows for increased control, quality and integrity of audit efforts rather than dissipating them.

Some may question how much detail is the right level of detail.  Others may suggest this is too radical a change to a mature, understood, and tested set of systems.  IBM’s experience with some of the largest financial services companies suggests that building a finance system based on the requirement to instrument the most granular level of transaction detail immediately stems the tide of increasing costs, lowers a variety of risks, and can be a key driver in transforming the bank’s ability to become more agile.  In time this approach begins to provide economies of scale for reporting.

SAFR is: (1) an information and reporting systems theory, (2) refined by 25 years of practical experience in creating solutions for a select group of the world’s largest businesses, (3) distilled into a distinctive method to unlock the information captured in business events, (4) through the use of powerful, scalable software for the largest organizations’ needs, (5) in a configurable solution addressing today’s transparency demands.

The Theory

Companies expend huge sums of money to capture business events in information systems.  Business events are the stuff of all reporting processes.  Yet executives report feeling like they are floating in rafts, crying “Data, data everywhere and no useful information.”  Focusing reporting systems on exposing business event combinations can turn data into information.

The Experience

Although analysis of business events holds the answers to business questions, they aren’t to be trifled with, particularly for the largest organizations.  Reporting processes—particularly financial reporting processes—accumulate millions and billions of business events.  In fact, the balance sheet is an accumulation of all the financial business events from the beginning of the company!  Such volumes mean unlocking the information embedded in business events requires fundamentally different approaches.  The 25 years of experience of building SAFR in services engagements has exposed, principle by principle, piece by piece, and layer by layer, the only viable way.

The Method

This experience has been captured in a method of finding and exposing business events, within the context of the existing critical reporting processes.  It uses today’s recognized financial data like a compass pointing north to constrain, inform, and guide identification of additional reporting details.  It facilitates defining the most important questions to be answered and configuring repositories to provide those answers consistently.  It also explains how to gradually turn on the system without endangering existing critical reporting processes.

The Software

The infrastructure software, a hard asset with hundreds of thousands of lines of source code and a feature set rivaling some of the best-known commercial software packages, is most often what is thought of when someone refers to SAFR. 

The Scan Engine is the heart of SAFR, performing in minutes what other tools require hours and days to do.  The Scan Engine is a parallel processing engine, generating IBM z/OS machine code.  In one pass through a business event repository it creates many business event “views,” providing rich understanding.  It categorizes, through join processes, the business events orders of magnitude more efficiently than other tools.  Built for business event analysis, it consistently achieves a throughput of a million records a minute.  It is highly extensible to complex problems. 
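
The one-pass, many-views idea can be pictured with the hedged Java sketch below. This is not the Scan Engine itself, which generates z/OS machine code; the View interface and OnePassScan class are names assumed purely for illustration. Each event record is read once and offered to every view’s selection and accumulation logic, so many outputs come from a single pass over the repository.

    import java.util.List;

    // Illustrative sketch of one-pass, multiple-view resolution.
    interface View {
        boolean select(String[] event);    // extract-time record selection
        void accumulate(String[] event);   // column and aggregation logic
        void writeOutput();                // emit this view's result
    }

    class OnePassScan {
        static void run(Iterable<String[]> eventRepository, List<View> views) {
            for (String[] event : eventRepository) {   // single pass over the business events
                for (View v : views) {                 // every view sees every record
                    if (v.select(event)) {
                        v.accumulate(event);
                    }
                }
            }
            for (View v : views) {
                v.writeOutput();                       // many outputs from one read
            }
        }
    }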

SAFR Views are defined in the SAFR Developer Workbench, through rule-based processes in the SAFR Analyst Workbench, or in custom-developed applications.  The Scan Engine, executed as a scheduled process, scans the SAFR View and Metadata Repository, selecting the views to be resolved at that time.

The Indexed Engine, a new SAFR component, provides one-at-a-time View resolution through on-line access to Scan Engine and other outputs.  It uses Scan Engine performance techniques.  Report structure and layout are dynamically defined in the Analyst Workbench.  The Indexed Engine creates reports in a fraction of the time required by other tools.  Its unique capabilities allow for a movement-based data store, dramatically reducing the data volumes required both in processing and to fulfill report requests.
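
A hedged sketch of the movement-based idea, simplified to a single balance, follows: only period movements (deltas) are stored, and any as-of balance is derived by accumulating the movements, so balances need not be materialized and stored for every period. The class and method names are assumptions for this example.

    import java.math.BigDecimal;
    import java.util.Map;
    import java.util.TreeMap;

    // Illustrative movement-based store: keep net movements per period,
    // derive balances by accumulation rather than storing them.
    class MovementStore {
        private final TreeMap<Integer, BigDecimal> movements = new TreeMap<>(); // period (e.g., 201103) -> net movement

        void post(int period, BigDecimal amount) {
            movements.merge(period, amount, BigDecimal::add);
        }

        BigDecimal balanceAsOf(int period) {
            BigDecimal balance = BigDecimal.ZERO;
            for (Map.Entry<Integer, BigDecimal> e : movements.headMap(period, true).entrySet()) {
                balance = balance.add(e.getValue());   // balance = sum of all movements to date
            }
            return balance;
        }
    }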

Upon entering Managed Insights, users select parameters to drill down to increasing levels of business events, and perform multidimensional analysis through the Viewpoint Interfaces.  The Insight Viewer enables discovery of business event meaning in an iterative development mode.

The Solution

The SAFR Infrastructure Software has been configured over 10 years for a number of clients to provide an incredibly scalable Financial Management Solution (FMS) for the largest financial services organizations. 

The heart of FMS is the Arrangement Ledger (AL).  An “arrangement” is a specific customer/contract relationship.  The AL, a customer/contract sub-ledger, maintains millions of individual customer/contract level balance sheet and income statements.  This incredibly rich operational reporting system supports a nearly unbelievable swath of information provided by scores of legacy reporting systems in summary, with the added benefit of being able to drill down to business event details if needed.  Doing so allows reporting high quality financial numbers by customer, product, risk, counterparty and other characteristics, all reconciled, audited, and controlled.  

AL is fed daily business events, typically beginning with legacy general ledger entries and then transitioning to detailed product system feeds over time.  The business events become complete journal entries at the customer-contract level, including reflecting the impact of time in the Accounting Rules Engine.  Rules are under the control of finance rather than embedded in programs in source systems, enabling Finance to react to changes in financial reporting standards, including International Financial Reporting Standards (IFRS).

The business event journal entries are posted by the Arrangement Ledger on a daily basis, while it simultaneously generates additional point-in-time journal entries based upon existing balances, including those for multi-currency intercompany eliminations, GAAP reclassification, and year-end close processing.  It accepts and properly posts back-dated entries to avoid stranded balances, and summarizes daily activity to pass to the General Ledger.  The General Ledger provides another control point for the traditional accounting view of the data.  The Arrangement Ledger detects and performs reclassification, keeping the arrangement detail aligned with the summary General Ledger.
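
The posting behavior described above can be illustrated with the hedged Java sketch below; the class, record, and field names are assumptions for this example only. Back-dated entries are applied to their effective period rather than stranded in the current one, and daily activity is summarized by account for the hand-off to the General Ledger.

    import java.math.BigDecimal;
    import java.time.LocalDate;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    // Illustrative posting sketch at the arrangement (customer-contract) level.
    class ArrangementLedgerSketch {
        record JournalEntry(String arrangementId, String account,
                            LocalDate effectiveDate, BigDecimal amount) {}

        // (arrangement, account, effective period) -> posted balance
        private final Map<String, BigDecimal> balances = new TreeMap<>();

        void post(JournalEntry e) {
            String period = e.effectiveDate().withDayOfMonth(1).toString();   // back-dated entries hit their effective period
            String key = e.arrangementId() + "|" + e.account() + "|" + period;
            balances.merge(key, e.amount(), BigDecimal::add);
        }

        // Summarize one day's entries by account for the General Ledger hand-off.
        Map<String, BigDecimal> summarizeForGL(List<JournalEntry> todaysEntries) {
            Map<String, BigDecimal> byAccount = new TreeMap<>();
            for (JournalEntry e : todaysEntries) {
                byAccount.merge(e.account(), e.amount(), BigDecimal::add);
            }
            return byAccount;
        }
    }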

AL also accepts arrangement descriptive information with hundreds of additional attributes to describe each customer-contract, as well as counterparty and collateral descriptive attributes, enabling production of trial balances by a nearly unlimited set of attributes, not just the traditional accounting code block.  Extract processes produce various summaries, perhaps ultimately numbering in the hundreds or thousands, to support information delivery for not only traditional accounting but also statutory, regulatory, management, and risk reporting.  The SAFR one-pass multiple-view capability allows AL to load data, generate new business events, and create extracts all in one process, including loading the incredibly information-rich Financial Data Store. 

Information Delivery includes multiple ways of accessing the Arrangement Ledger and Financial Data Store.  The major window is through SAFR Managed Insights.  This parameter-driven Java application provides thousands of different permutations of the data.  It allows drill-down from summaries to lower and lower levels of data without impacting on-line response time.  It allows dynamic creation of new reports and multi-dimensional analysis of Financial Data Store data.  Extract facilities provide the ability to feed other applications with rules maintained by finance.  Other reports provide automated reconciliation and audit trails.

FMS can be tailored to work within an existing environment, including working within the existing security and reference data frameworks.  FMS can often be a sub-component of an ERP implementation.

Conclusion

This is a financial system architecture for the 21st century. This is the reporting system architecture for the 21st century. Finance transformation starts with finance systems transformation.  Finance systems transformation starts with rejecting the legacy finance systems architecture that provides only summary results.  It is transforming the financial systems—the original enterprise data warehouse—into a system capable of supporting today’s information demands.

__________

Copyright ©2010, 2011, 2015, 2018 by Kip M. Twitchell.

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the author except for the use of brief quotations in a book review or scholarly journal. Posted by permission.

GenevaERS Introduction to IBM: 2003

On October 1st, 2002, IBM closed the acquisition of PricewaterhouseCoopers Consulting division, and with it gained access to GenevaERS.

This deck was built to introduce the product to IBM, wondering if it would stay a consulting asset or if it would move to be part of the Software Group offerings.

This is a very comprehensive (you might read that as “long”) view of the product and service offering, from a decade of work by the dedicated team.

More about this time can be read in Balancing Act: A Practical Approach to Business Event Based Insights, Chapter 54, Crisis.

Global Investment Bank Analyzes Trade Level Detail: 1999

[An investment bank arm], a division of [a major Swiss bank], chose Geneva ERS to manage its Detail Repository, a central component of the Global General Ledger SAP implementation.  This repository provides trade-level detail analysis for internal and external global reporting.  [Bank] management has referred to the global general ledger as “a cornerstone of [the bank]’s strategy and its successful realization globally is vital to all future plans.”

Approximately 1.3 million detailed transaction data records from twenty-two different feeder systems are loaded into the Detail Repository nightly. These transactions are trade-level detail records from Europe, Asia Pacific, and North America.  Geneva ERS scans the repository’s 51 million records in 22 entities and 269 physical partitions.  It extracts 20 million records that are aggregated into approximately 480,000 summary balances.  These summary records are sent to SAP for balance sheet and summary profit and loss reporting.  This process runs in approximately 3 hours of elapsed time and 5½ hours of CPU time and produces 30 different outputs.

A second Detail Repository process uses Geneva ERS and custom programs to satisfy the intricate regulatory requirements.  This system consists of 65 Geneva “Views” or programs, 4 custom programs, and 5 PERL scripts.  Geneva is executed 19 times, with each execution handling a subset of the core business requirements.  During this nightly process Geneva reads 71 million records in 40 gigabytes, extracts 59 million records in 30 gigabytes, and performs 229 million table joins.  The output is created in 12 CPU hours and 8 wall-clock hours.  In comparison, legacy applications required 24 hours to complete a limited number of these processes.

Outputs from these processes are used for US tax and regulatory, Asia-specific regulatory management, and Swiss regulatory reporting.  They include information on:

  • Capital Allocations
  • Average Balancing and Multi-Currency Revaluation
  • Syndicated Loan Netting
  • Federal and Swiss Regulatory Collateral Allocation
  • Residual Maturity
  • Interest Rate Selection
  • Product Risk Weighting
  • Specific Reserve Allocation
  • Unutilized Facility Calculation.

The view outputs include files used in additional processing or to feed other systems, delimited files, tabular reports, and inputs to a sophisticated executive information system.  The executive information system allows users to select which report to view and for what period.  The user is presented with the highest level summaries.  The user can then drill down into specific areas of interest, select ranges of data, sort columns, view other dimensions of the same data, graph the data, and export to spreadsheets.  The executive information system is accessed by as many as 50 users throughout the world.

The Geneva Views are maintained in Sybase tables accessed by the Geneva ERS Visual Basic ViewBuilder front-end.  The software maintains various types of metadata, including record layouts, field types, join relationships between record structures, and logical and physical file partition information, as well as the actual query selection, summarization, and formatting logic.  The business logic contained within the views ranges from simple transform logic to the sophisticated globally defined business rules that make up the global general ledger accounting model.

More about this time can be read in Balancing Act: A Practical Approach to Business Event Based Insights, Chapter 20, Parallelism and Platform.