The following slides were developed while consulting at 12 of top 25 world banks about financial systems from 2013-2016. Although not specific to GenevaERS, it delineates the causes of the problems in our financial systems, and the potential impact of GenevaERS supported systems and the Event Based Concepts.
The following are two solution blueprints by an additional attempted start-up to license Geneva. In this instance, the platform was to be ported to the Intel Itanium processor, putting a very powerful data analysis tool right under the desk of the analyst.
The vision was far sighted, anticipating what happened with cloud and data analytics. But it was ravaged by the technology collapse post Internet bubble.
INTELLIGENCE ENGINE FROM PwC CONSULTING AND DIGINEER™
Consolidating massive amounts of data from multiple sources to gain an integrated view
OI5 DataEngine™ from Digineer™
Enterprise Hardware Platform
Scaleable Intel® Architecture servers
Advanced Transaction Analysis for Business and Government
Crucial data that can help businesses and government operate more efficiently often lies buried within an overwhelming mass of transaction details scattered among numerous computers. Extracting this data using most data mining solutions presents many difficulties, since these solutions lack the flexibility, scope, and sheer processing power necessary to convert uncorrelated data into useful information. The Intelligence Engine solution combines technology expertise from PricewaterhouseCoopers (PwC Consulting), Digineer™, and Intel to help solve this issue. The Intelligence Engine solution spans data repositories across the world, using query-processing techniques to generate data intelligence from large numbers of events and transactions.
FACING THE CHALLENGE
Without the ability to look deep into operational processes, large-scale businesses are hindered in their decision-making and organizational goals––essentially running blind. Where data exists, it often resides on disparate servers in a variety of formats, resisting easy access or useful analysis. Collecting, consolidating, and correlating this data requires tools that can be scaled for massive volumes of data, and adapted easily to expose a wide range of information. Challenges to this end include:
Huge volumes of transaction data––Many conventional data warehousing and data mining solutions collapse under the weight of millions of daily event-level transactions. A robust solution must marry intelligent query processing with sufficient processing power to meet high levels of activity.
Data scattered throughout many repositories––Large organizations frequently store data in a variety of formats in data repositories spread throughout the country––or the world. Finding the key data—the proverbial needle in a haystack—demands a solution scaled to the magnitude of the task.
Need to quickly spot trends and patterns––The accelerating pace of modern business makes it necessary to detect and respond to trends and patterns in days rather than weeks. Knowledge-discovery tools must be flexible and adaptable for this purpose.
Evolving data requirements––Creating data models and report formats can be a very time-consuming and expensive proposition, and updating them can also require excessive time and effort. Tools to reduce this design effort and the associated maintenance help ensure that changing data requirements can be accommodated in minimal time.
MEETING THE CHALLENGE
Being able to accurately detect emerging patterns and trends in daily operations is invaluable to any organization seeking to remain competitive under the pressures of rapid change. The Intelligence Engine solution incorporates Digineer™ OI5 DataEngine™ deployed on Intel® Itanium® processor family-based servers to provide fast and flexible analytical data handling across multiple systems. By providing consolidated views of dispersed data in a variety of formats, this solution lets large organizations make fully informed decisions within their industry by assimilating data from every group or department––regardless of platform. The OI5 DataEngine™ successfully bridges dissimilar data structures and geographically separated servers to turn raw data into useful knowledge, providing a robust solution to help guide leaders of large organizations.
To accelerate the process of collecting and consolidating data, Digineer’s toolset simplifies query-processing setup and report generation. PricewaterhouseCoopers has successfully deployed this solution in a wide range of industries, including retail, financial, healthcare, and government.
The Intelligence Engine solution lets any large organization benefit from improved visibility into the heavy volume of daily transactions and events specific to their business––whether retail sales tallies per product for a nationwide distributor, the weather conditions compared to yields at a juice producer’s orange orchards, or quality control data tracked by a steel manufacturer looking for ways to improve processes.
Clients adopting this solution enjoy improved operations management through rapid analysis of transaction data. They also gain deeper insights into key processes within their organization that may have been inaccessible in the past because of the difficulty in collecting and correlating the relevant data.
FEATURES : BENEFITS
Supports multiple data formats: The OI5 Data Engine™ provides support for a wide range of data formats, allowing flexible handling of data stored in repositories across many different platforms.
Designed for faster, high-volume transaction processing: The Intelligence Engine solution combines the intelligent pre-processing capabilities of the OI5 DataEngine™ with the architecture benefits of the Intel® Itanium® processor family. The result is a solution that scales well for the extremely large volumes of data involved in transaction processing––producing rapid responses to complex data queries.
Adapts easily to changing data requirements: The solution includes tools that simplify query design and report generation, encouraging companies to create new models for data collection and analysis, while meeting the core data extraction requirements for the organization.
Advanced Business Intelligence and Analytics Solution
Retailers, financial institutions, healthcare companies, and government agencies are collecting more data than ever before. This includes sales data from stores, warehouses, e-Commerce outlets, and catalogs, as well as clickstream data from Web sites. In addition, there is data gathered from back office sources––merchandising, buying, accounting, marketing, order processing, and much more. To pull together information from all these different sources and integrate, process, and analyze it, companies require advanced business intelligence and analytical solutions powered by a robust, high-performance infrastructure.
The Intelligence Engine solution addresses this very need, by combining technological expertise from PricewaterhouseCoopers (PwC Consulting), Digineer™, and Intel®. The query-processing capabilities of the Digineer™ OI5 DataEngine™ combined with the processing power of the Intel® Itanium® processor family, enables PwC Consulting to provide companies with the ability to quickly consolidate and integrate massive amounts of data from multiple sources––giving key decision makers a single view of vital business information.
Optimized to run on Intel® Itanium® processor family-based platforms, the Digineer™ OI5 DataEngine™ delivers a reliable, high-performance mechanism for processing and querying vast quantities of raw, event-level data through a software algorithm that employs intelligent pre-processing for improved efficiency. The Explicitly Parallel Instruction Computing (EPIC) technology and 64-bit addressing capabilities found in Intel® Itanium® architecture, deliver new levels of performance when applied to data mining and knowledge discovery operations. Powerful Intel® Itanium® processor familybased servers provide the massive computing power needed to run and consolidate memory-intensive databases across channels. In addition, integrating different processes and organizations is easier and more cost effective with a modular, open environment based on industry standards and Intel® Architecture.
THE BUSINESS CHALLENGE
Today’s competitive advantage will go to companies that make the right decisions at the right time. To succeed, these companies must be able to quickly consolidate and integrate massive amounts of data from multiple sources, into a single view of their business that can increase marketing and operational efficiency.
Turning Data into Dollars, a May 2001 Forrester report, indicates that executives charged with setting the strategic direction of their organization understand the value of business intelligence and analytics, yet are faced with the following challenges:
Mountains of data create processing bottlenecks—The sheer volume of data collected–– often involving terabyte-scale databases––creates the potential for processing bottlenecks in those systems not designed to effectively cope with this quantity of information. The massive data quantities produce such a heavy processing load that many existing solutions rapidly reach throughput limitations. Without sufficient processing power and highly optimized analytical tools, organizations must often relinquish access to important subsets of their data, or must add expensive hardware and software to increase system performance.
Analyzing information from scores of data repositories presents a challenge— Organizations cannot easily access and analyze their operational information because of the difficulty in extracting large quantities of data from multiple data repositories spread throughout the organization. Existing solutions lack the flexibility and robustness for effective data access and extraction.
Inflexible models fail to support evolving data streams—Organizations spend significant amounts of development time creating data models, collecting data, defining reports, and constructing consolidated data warehouses. In highly dynamic environments, these systems may rapidly lose relevancy and require ongoing adjustments at the core software or hardware level. Updating these core data model components to correspond with evolving knowledge discovery requirements necessitates a significant investment in time and effort.
New analytical views must be developed quickly and on demand—Different departments and groups within an organization have unique needs for the information mined from available data, and these needs change frequently, often on an urgent basis. A practical solution must have the capability of easily providing new analytical views from the data to respond to immediate needs from diverse groups.
Extracting key information is inordinately difficult—The ability to drill down through individual transactions and extract the useful patterns and trends is critical to knowledge discovery and analysis. Most existing systems do not adequately provide this capability when there are large quantities of data stored on multiple systems, or in one large system where data access is cumbersome.
The retail environment faces several challenges that can take advantage of a business intelligence solution like Intelligence Engine. For example, in a typical retail environment, data gathered from stores, Web sites, catalogs, as well as suppliers and warehouses provide valuable kernels of knowledge that can increase profitability. Consolidating and analyzing massive amounts of memory-intensive databases requires a robust infrastructure. As a result, many retailers mine only a subset of their available information. To recognize buying patterns over time, massive volumes of sales transactions must be analyzed and correlated by sales channel, region, individual store, product, and customer preference. Within the typical retail environment, this data resides in a number of separate systems.
THE SOLUTION OVERVIEW
The Intel® Architecture-based Intelligence Engine solution provides companies with a fast, affordable, and flexible way to consolidate, integrate, and analyze massive, memoryintensive data across multiple systems.
The Digineer™ OI5 DataEngine™, operating on a platform powered by the Intel® Itanium® processor family, performs high-speed analytical processing of massive amounts of event-level data. The data engine can simultaneously write large numbers of data extracts to downstream data warehouses, online and offline reporting tools, and real-time analytics applications. The 64-bit addressing capabilities of the Intel® Itanium® architecture provides the robust computing power needed to run and consolidate memory-intensive databases across channels. The extraordinary performance of the Intel® Itanium® processor family, coupled with the EPIC features, provide an architecture fully capable of adding capacity as needed, and providing heightened performance.
Historically, retailers have addressed the scale and performance problem by creating large data stores using conventional technology. As the volume of data that must be captured and processed increases––often exponentially––processing the massive volumes of data creates a bottleneck and prevents timely access to the detailed information required to run the business. Once this critical point has been reached, retailers must choose between adding expensive hardware to boost performance, or sacrificing the breadth and depth of captured data.
The OI5 DataEngine™ offers an attractive alternative to the scale and performance problem. Retailers use the solution to analyze key retail metrics, such as customer profitability, recency, frequency, volume analytics, supplier performance, and enterprise financials. The OI5 DataEngine™ derives these analytics from a collection of customer relationship management (CRM), supply chain management (SCM), and point-of-sale (POS) databases–– even if the data is distributed among several different machines. The performance provided in these kinds of implementations costs much less than equivalent systems.
The Digineer™ OI5 DataEngine™ delivers:
Speed—Performs fast and efficient processing of massive volumes of transactional data, resulting in a consolidated data store that accurately represents the organization’s operational dynamics.
Scale—Integrates data from multiple, disparate silos without requiring any modifications or enhancements to the existing data sources.
Depth—Includes powerful data query capabilities that efficiently and economically enable data mining of massive amounts of transaction-level event data.
Breadth—Provides an outstanding price and performance ratio, making the solution accessible to medium- and large-sized organizations.
PwC Consulting fully integrates the OI5 DataEngine™ into the client’s business processes and technologies. PwC Consulting’s background and experience in business intelligence and analytics produces a solution that enables organizations to rapidly collect, aggregate, manage, analyze, filter, customize, and distribute information throughout their value chains. By providing consolidated views of the dispersed data––rapidly and cost-effectively–– organizations can identify trends and quickly respond to changing needs and situations, thus improving overall business performance.
The Intelligence Engine solution runs efficiently on a 4-way Intel® Itanium® processorbased server. Supported operating systems include (Win64) Microsoft* Windows* 2000 server (Q3’02), Linux* (Red Hat*, version 7.2 for the Itanium® processor), and HP-UX* version 11 (Q4’02).
Components of the Digineer™ OI5 DataEngine™ rely on the following technologies:
Core Data Engine™—C++ with multi-threading
Result Set Viewer—Java* using Swing*
MetaData Manager—Visual Basic*
MetaData Database—Any ODBC-compliant database
The OI5 DataEngine™ runs optimally using Intel® EPIC architecture, taking maximum advantage of the Itanium® processor family’s multiple memory registers, and massive on-chip resources. The 128-bit general and floating-point registers excel at supporting the complex operations involved in analytics. Tuning the implementation to achieve maximum performance through parallelism, and tapping into the power of the Itanium® processor family’s multiple execution units makes it possible to rapidly process and analyze the large volumes of data necessary to accomplish this solution.
PwC Consulting solutions employ a variety of technologies using Digineer™ OI5 DataEngine™. By being able to create data stores through the importing and exporting of data files, this solution makes it unnecessary to devise complex, custom integration schemes. Support for each of the following sources and targets is provided:
Microsoft* SQL Server*
Target analysis engines
EMC* Symmetrix Manager*
IBM* Intelligent Miner*
Cognos* Powerplay* and Improptu*
Custom applications built in Visual Basic, Java, C++, and HTML using Microsoft Internet Information Server* (IIS) and Transaction servers
Digineer™ OI5 DataEngine™ has been deployed for use with SAP* and PeopleSoft* ERP packages as sources for data and targets for additional analysis. Future plans include development of interfaces for MQSeries*, and clickstream data sources, scheduled for release by mid-2003.
WHO THE SOLUTION WILL BENEFIT
The Intelligence Engine solution provides rapid analysis of data from multiple sources to gain a better picture of emerging trends and patterns. For a strong competitive advantage, companies will need to develop a quantitative, as well as qualitative, understanding of their business. The Intelligence Engine solution complements existing data warehouse and business intelligence solutions. It can also maximize the performance and extend the functionality of systems already in place.
The key vertical integration areas in which the Intelligence Engine solution can provide significant value include:
Retail—Domestic and international (Western European) retailers with annual sales of more than $1 billion, and large-scale customer, supplier, point-of-sale, and operational databases.
Manufacturers—Mid-market manufacturers with annual sales of more than $500 million and one or more large silo databases.
Financial Services—Financial services organizations with multiple geographic locations and stringent operational and regulatory reporting requirements.
Healthcare—Pharmaceutical and biotechnology companies that depend on data from multiple systems to manage and optimize their operations.
Government—Agencies under pressure to provide more extensive reporting of data, over which the organization has little governance. In addition, data sources continue to propagate, grow in size, and are increasingly complex and divergent from each other. 5
The PwC Consulting Intelligence Engine solution using Digineer™ OI5 DataEngine™ adapts well to environments where operational data is distributed among several different systems––sometimes in different data formats––and where the report requirements change frequently. Implementing the Digineer™ OI5 DataEngine™ on servers powered by the Intel® Itanium® processor family offers these benefits:
Provides faster high-volume transaction processing. Supplies the massive computing power needed to run and consolidate memory-intensive databases across channels. This enables key decision makers and business owners to obtain accurate business intelligence and analytics, and quickly gain deeper insights into the dynamics of their operations.
Supports multiple data formats. Supports multiple applications and loads required to consolidate and analyze data throughout the company. This enables the solution to easily access existing data stores and business intelligence solutions without modifications or retrofitting.
Adapts easily to changing data needs. Accommodates an organization’s evolving data needs and models so that new informational requirements can be met with minimal investment of time and money. Integrates different processes and organizes these in an easier and more cost-effective manner with a modular, open environment based on industry standards and Intel® Architecture. Solutions based on the OI5 DataEngine™ as deployed on servers powered by the Intel® Itanium® processor family provide a highly competitive price and performance ratio. 6
FUNCTIONAL BUSINESS CONCEPT
Maintaining a current and accurate view of an organization’s operational status requires the capability of locating and analyzing information from a very large data store in a short period of time. This problem can be compounded when data resides in disconnected data silos, or has been extracted from a variety of non-correlated reports. Reducing the scope of the data analyzed in order to save time or expedite processing can create an inaccurate view of current operations. Processing bottlenecks and inefficiencies often result from solutions implemented on platforms without a tuned architecture capable of handling data quantities that reach terabyte levels. While increasing the quantity and complexity of a solution’s hardware and software may improve performance, it also raises costs. A better approach is to deploy an implementation on a platform designed to accommodate massive volumes of data with processing power that is suitable to the task. The Intelligence Engine solution using the OI5 DataEngine™ on a platform powered by the Intel® Itanium® processor family performs very high-speed, analytical processing of massive amounts of event data. This event data can be dispersed across any number of data sources, including enterprise resource planning (ERP), SCM, CRM, and legacy systems–– without reducing the effectiveness of the solution. Processed data can then feed downstream data repositories, analytical and business intelligence tools, and executive reporting systems, as shown in the following diagram.
The OI5 DataEngine™ uses a patented software approach that leverages the latest advances in microprocessor technologies featured in the Intel® Itanium® architecture, including 64-bit addressing and parallel processing capabilities.
The open architecture of the OI5 solution allows data to be extracted and transformed directly from legacy systems. OI5 provides a massively scaleable, high-performance solution for the most demanding intelligence and analytics applications
The first step to using OI5 is to define the metadata, including record layouts, files or tables, and relationships between tables. The OI5 design removes the complexities of understanding table relationships. Common keys and other relationships between tables only need to be defined once. Instead of having to understand the “where” statement in SQL to add department names to a sales report, OI5 lets users select the department name and then add it to their report. OI5 automatically locates the correct data.
Next, define the views (queries):
Select the transaction files to read
Specify the appropriate output format
Indicate the appropriate filters to apply
Indicate the data to sort or summarize
Design columns, and indicate any calculations to perform
OI5 does the rest
During data file output, OI5 produces multiple files in one execution of the process. This feature reduces the time needed to supply data to other systems. OI5 views can reformat data,perform logic operations, and make calculations. With the open API, custom functions can be created for other types of transformation. These features make OI5 a powerful data extraction and transformation tool, as well as a flexible application engine.
The OI5 DataEngine™ helps large organizations meet the scaleability and performance challenge of assessing key organizational operations when the number of events being measured and analyzed becomes too great. The solution supports queries directed to extremely large volumes of event-level data in a single pass.
Data exposed by the OI5 DataEngine™ can reside in different physical locations on multiple systems, including ERP, CRM, SCM, and POS systems. The front end of the OI5 DataEngine™ maps these “in-place” data sources into the data engine’s MetaData definition modules. The engine then uses the MetaData definition to intelligently pre-process the queries before generating the data views and outputting data stores for analytical processing. The data output can either be viewed with the OI5 Viewer or directed to other business intelligence and reporting systems.
The core, patented technology of OI5 is configured for refreshing and managing a largescale operational data store (ODS) derived from multiple data sources. As shown in the following diagram, it includes three integrated modules that deliver the full power of the OI5 technology and its massive data throughput: OI5 MetaData Manager, OI5 View Manager, and OI5 Core Processor. The OI5 DataEngine™, as implemented on servers powered by the Intel® Itanium® processor, runs on Microsoft* Windows* 2000 Server (64-bit) and HP-UX systems.
The parallel query-processing capabilities of the system follow a defined software algorithm that is optimized for speed. The algorithm first pre-processes all the data queries based on the data engine’s knowledge of where and how the event data is structured and formatted (determined by the MetaData definition in combination with Read Exits), as well as the required output data stores (known as Views). Optimized code generated by this query pre-processing step gets sequenced into a logic table. When the logic table is processed, the OI5 DataEngine™ spawns parallel query instructions that minimize instruction path lengths to the available microprocessors. This explicitly parallel query-processing directly takes advantage of the EPIC architecture of the Itanium® processor family. The OI5 DataEngine™, using instruction path optimization and single-pass data queries, offers unmatched query-processing and exceptional handling of the data view output.
The OI5 DataEngine™––the basis of this data warehouse analytics solution––is optimized for the Intel® Itanium® processor family. The following diagram shows how information from existing data warehouses is delivered to the OI5 DataEngine™ for advanced analytics processing. The OI5 MetaData and View Manager applications, running on a workstation powered by the Intel® Pentium® 4 processor, define the relations and views for the associated processing and analytics. Processed data can be directed to Web portal solutions, as well as custom applications and reports, or supplied in a form for re-entry into an existing data warehouse.
The Web portal is powered by the Intel® XeonTM processor family. The OI5 DataEngine™ is powered by a 4-way Intel® Itanium® processor-family based server.
PwC Consulting has used the OI5 DataEngine™ in numerous scenarios that have been designed and deployed for clients. This solution achieves favorable results in knowledge discovery applications that involve massive numbers of transactions. Running on servers powered by the Intel® Itanium® processor family, the underlying OI5 software data engine performs very high-speed, batch-analytical processing of large volumes of event data–– while simultaneously writing multiple data views to downstream data warehouses, online and offline reporting tools, and real-time analytical processing applications. When used in combination with Intel® Itanium® processor family-based platforms, the OI5 DataEngine™ offers an exceptional price and performance value.
*Other names and brands may be claimed as the property of others. Information regarding third party products is provided solely for educational purposes. Intel is not responsible for the performance or support of third party products and does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of these devices or products.
An R&D statistical study at a Fortune 100 airline suggested that millions of dollars each month were being lost due to lost revenue. The cause of the lost revenue ranged from simple ticketing mistakes and training issues to intentional fraud. But the approach used in the R&D study had the following limitations:
Statistical sampling does not identify specific cases. To take action specific tickets must be identified. Also, some tests require identifying patterns for individual employees or customers, which requires very large samples, preferably the entire database.
Sources other than the actual ticketing database were used because of its complexity and critical production system availability. But working against anything less than the production ticketing database introduces the possibility that the results can be disputed.
Not all tickets could be inspected because of the volume of ticketing data. Over 500 million records, from approximately 40 entities comprising 40 million tickets needed to be scanned. This had to be done for over 10 different specific lost revenue detection tests.
Certain types of detection tests could not even be attempted in the study because of the complexity of the logic involved.
The airline agreed to put Geneva ERS to the test. In a 14-week effort, a nine-member team performed the following:
The detection test business logic was defined, and data mapping from the business logic to the database performed,
Custom code was developed to scan the CA IDMS ticket database, and execute other complex logic
An architecture was developed, Geneva ERS installed, and the database structures defined within the tool,
Geneva ERS “views” or queries were created to produce four files (virtual and physical) and over 10 violation reports,
The queries were executed and refined dozens of times against test databases about 1/6th the size of the production database.
Executions against the production database required scanning the 500 million records in approximately 1 ½ to 3 hours wall clock time and from 3 to 6 hours CPU time. The ticket database was scanned using 30 parallel processes, ultimately reading 170 different files. All detection tests were resolved in one scan of the production database.
The results validated the dollar values estimated in the R&D study showing that over $6 million annually were being lost in one area alone, and millions more might be reported incorrectly. It also provided insight into some areas that had never been investigated before. But more importantly, Geneva ERS identified specific cases which could be investigated and collected. The evidence was so solid that certain employees were dismissed as a result of the investigation.
The following deck proposed similar projects to other airlines.
The following presentation is composed of materials created for a launch of nGravity, and start-up around the commercialization of Geneva from PricewaterhouseCoopers. Ultimately the restrictions on investors because of the nature of the audit business
[An investment bank arm], a division of [a major Swiss bank], chose Geneva ERS to manage its Detail Repository, a central component of the Global General Ledger SAP implementation. This repository provides trade-level detail analysis to for internal and external global reporting. [Bank] management has referred to the global general ledger as “a cornerstone of [the bank]’s strategy and its successful realization globally is vital to all future plans.”
Approximately 1.3 million detailed transaction data records from twenty-two different feeder systems are loaded into the Detail Repository nightly. These transactions are trade level detail records from Europe, Asia Pacific, and North America. Geneva ERS scans the repositories’ 51 million records in 22 entities and 269 physical partitions. It extracts 20 million records that are aggregated into approximately 480,000 summary balances. These summary records are sent to SAP for balance sheet and summary profit and loss reporting. This process runs in approximately 3 hours of elapsed time and 5 and ½ hours of CPU time and produces 30 different outputs.
A second Detail Repository process uses Geneva ERS and custom programs to satisfy the intricate regulatory requirements. This system consists of 65 Geneva “Views” or programs, 4 custom programs, and 5 PERL scripts. Geneva is executed 19 times with each execution handling a subset of the core business requirements. During this nightly process Geneva reads 71 million records in 40 gigabytes, extract 59 million records in 30 gigabytes, and performs 229 million table joins. The output is created in 12 CPU hours and 8 wall clock hours. In comparison, legacy applications required 24 hours to complete a limited number of these processes.
Outputs from these processes are used for US tax and regulatory, Asia specific regulatory management, and Swiss regulatory reporting. They include information on:
Average Balancing and Multi-Currency Revaluation
Syndicated Loan Netting
Federal and Swiss Regulatory Collateral Allocation
Interest Rate Selection
Product Risk Weighting
Specific Reserve Allocation
Unutilized Facility Calculation.
The view outputs include files used in additional processing or to feed other systems, delimited files, tabular reports, and inputs to a sophisticated executive information system. The executive information system allows users to select which report to view and for what period. The user is presented with the highest level summaries. The user can then drill down into specific areas of interest, select ranges of data, sort columns, view other dimensions of the same data, graph the data, and export to spreadsheets. The executive information system is accessed by as many as 50 users throughout the world.
The Geneva Views are maintained in Sybase tables accessed by Geneva ERS Visual Basic ViewBuilder front-end. The software maintains various types of metadata including record layouts, field types, join relationships between record structures, logical and physical file partition information, as well as the actual query selection, summarization, and formatting logic. The business logic contained within the views ranges from simple transform logic to the sophisticated globally defined business rules that make up the global general ledger accounting model.
Insurance Company Manages Very Large Data Warehouse
Three years ago, a Fortune 100 Insurance Company embarked upon a “ Data Warehouse Program” to “build a single consistent common source of financially balanced data to be used by multiple business users for decision support, data mining, and internal/external reporting.” A major component of this information architecture is the construction of multiple Operational Data Stores or ODS’s. These ODS’s may contain up to 10 years of event level or detailed transaction history as the basis for populating other data warehousing applications. The company uses Geneva ERS to load and extract data from these data stores.
Data is loaded into these repositories nightly. In some cases Geneva ERS is used to extract data from operational source systems. “Source systems” vary from the PeopleSoft Journal Lines DB/2 table to legacy flat files. The DB/2 extract process reads 80 physical DB2 partitions in parallel, scanning over 80 million records in approximately 10 minutes.
This extracted data is edited using a combination of custom Geneva ERS processes and stand-alone custom programs. After the data is edited it is loaded into the repositories in parallel by Geneva ERS. The number of records loaded varies by repository, from 100,000 financial transactions (journal lines) to more than 1,000,000 policy transactions. In total, approximately 2 million transactions a day are added to the repositories.
In the same pass of the data that loads the repositories, Geneva ERS also produces multiple reports and extract files for use by business users and other application systems. One of the repositories is “backed up” by Geneva ERS at the same time. This daily execution of Geneva ERS reads over 450 million records in 23 different entities and 220 physical partitions. It writes over 900 million physical records to over 800 files. This process has run in 1 hour 3 minutes of wall clock time and 3 hours of CPU time.
Other executions of Geneva ERS extract data on a daily, weekly, monthly and annual basis. The output from one execution creates an executive information drill down file accessed via the company Intranet. This web site is accessible by over 300 business users.
The Geneva “Views” executed in all of the above processes are maintained within the Geneva ViewBuilder software. This includes record structures, field types, relationships between structures, and the actual queries themselves. Many queries are very similar to programs in their sophistication and contain complex business logic. Some have over 900 column definitions. These views also utilize custom code for accessing certain files and executing business logic requiring a programming language. Over 100 people have been trained on programming using Geneva ERS, and the company has had up to 10 testing environments at one time.
The most sophisticated use of Geneva ERS emulates a PeopleSoft financial cost allocation process. Custom programs were developed which generate over 6,000 Geneva ERS views based upon over 7,000 PeopleSoft rules. Geneva ERS executes these views to scan the financial repository selecting records eligible for allocation. It then allocates these costs through four allocation layers, such as products, and geographical units. At 1999 year-end, this process read over 50 million records selecting nearly 3 million that were eligible for allocation. These 3 million records were exploded into over 290 million virtual allocation records, of which 186 million summarized records were written to physical files. The process runs in 7½ hours wall clock time and 28½ hours of CPU time.
Financial Services Company Simplifies Complex Processes
Many financial service organizations were early adopters of computer technology. They quickly constructed systems to automate numerous processes. Programs were often added to the fringe of existing processes to solve new problems or provide an additional report or file for analysis. The need to keep multiple types of books and fulfill regulatory reporting requirements added to the complexity of some of these systems. These systems grew before modularization, subsystem definition, or other computing concepts were developed. Furthermore, the number of transactions generated by these organizations always challenged computing capacity. This required creative and complex solutions. Over time these systems metamorphosed into closely intertwined, interconnected, and inflexible systems.
In 1996, a Fortune 100 Insurance Company determined their reporting system was so inflexible that it would become unsupportable in the near future. They decided “…to build a single, consistent source of financially balanced data to be used by multiple business users for decision support, data mining, and internal/external reporting.” After analyzing their information needs they determined that the best approach to satisfy the broadest number of users was to construct a comprehensive data warehouse environment, including extract transformation processes, Operational Data Stores (ODSs), reporting environments, and data marts. The company viewed the ODSs as the heart of the data warehousing effort. ODSs are designed to contain up to 10 years of detailed transactional history. By storing the transactional detail, the company can satisfy requests for individual transactions, summaries using any field from the transaction detail, or snap shots as of any point in time. This robust environment truly enables the “single, consistent source of financially balanced data.”
Although such a data store simplifies and satisfies all types of information needs, it also means tremendous data volumes. Managing such data volumes requires a creative, industrial strength solution. It also requires people who know how to make it happen. This company went looking for a tool and the people that were up to the challenge. They chose PricewaterhouseCoopers and Geneva ERS.
Geneva offers a robust data warehousing solution able to process tremendous amounts of data quickly. It also provides for innovative, flexible solutions that are easily supported by company employees. This company uses Geneva to do the following:
Source system Extract and Transform
ODS Load, Query, Interface production
Detailed Financial Allocation Process
Executive Information Delivery
Extract and Transform
Extracting from legacy systems can often present a challenge to the most robust tools. Especially when source systems include complex data structures and high data volumes. Legacy systems often were designed to support monthly reporting cycles by accumulating and summarizing data before finally kicking out the hard copy report at the end of the month. Changing such systems to provide more recent information can make the difference between a successful information environment and simply a different format for the same old content. However, making sure the results are right can be a very expensive process.
PwC consultants used a new approach to attack this problem. For this client, they first created a program which understood these complex and unique legacy structures, which could be called by the Geneva ERS open API. This program opened up difficult legacy systems to the power of Geneva. Geneva was used to extract data from multiple sources. The Geneva development environment allows definition of source system data structures. Once defined to Geneva, business logic can be applied. The business logic is stored as a Geneva “View.” The Views are organized by logical frequencies like daily or weekly processes or on request processes. Once the environment was created, they focused on understanding the data structures instead of tracing through the entire legacy process. They used an iterative prototyping approach to discover the source of all the data contained in legacy downstream files. They successfully proved that the system could be converted from a monthly to a daily system. They extracted the source system data and populated the ODSs. The iterative prototyping approach used by PwC consultants shaved months off the delivery cycle and hundreds of man-hours spent in legacy system research.
The power of Geneva is evident in the types of source systems from which data is extracted. In addition to the complex legacy structure noted above, a DB/2 extract process reads 80 physical DB/2 partitions in parallel, scanning over 80 million records in approximately 10 minutes. All processing is completed in a fraction of the time it would take another data warehousing tool.
Load, Query and Interface Production
PwC set out to help create an environment that would allow many users, with diverse information needs to pull the desired information from the repositories. They created data and processing models that allowed all queries to be resolved in a single pass of the transaction detail. These models minimizes data storage requirements by eliminating duplicate data from transactions, but combining the data back together for report production in an efficient manner.
In the same pass of the data that loads the repositories, Geneva ERS also produces multiple reports and extract files for use by business users and other application systems. Many output formats are possible including reports, spreadsheets and files. And because the ODSs contain transaction level detail, users are able to choose what level of detail they wish to see. The data model also allows for use of Geneva’s date effective join capabilities. This enables users to create reports as of any point in time. Summaries can be created using organizational structures from any point in time.
The client choose to construct an ODS to support the ERP General Ledger being installed. However, the coding structure for the new ERP package differed significantly from the historical organization and account coding structure. The ODS supported all interface production, translating from old to new and from new to old. Global utilities were constructed that were called from the Geneva API. Because of Geneva’s ability to process the detailed transactions, all fields could be translated at the lowest level of detail. This enabled consistent answer set production for all interfaces.
The number of records loaded varies by repository, from 100,000 financial transactions (journal lines) to more than 1,000,000 policy transactions. In total, approximately 2 million transactions a day are added to the repositories. The single pass architecture even produces back ups using Geneva ERS in the same pass of the ODS. This daily execution of Geneva ERS reads over 450 million records in 23 different entities and 220 physical partitions. It writes over 900 million physical records to over 800 files. This process has run in 1 hour 3 minutes of wall clock time and 3 hours of CPU time.
Detailed Financial Allocation Process
The most sophisticated use of Geneva emulates an ERP financial cost allocation process. The insurance company recognized that with their volume of data, the ERP package could not handle the allocation process. They would have to change their business requirements or find another solution. They looked to PwC and Geneva to supply that solution. Client members and PwC consultants analyzed the allocation process and created custom programs which generate over 6,000 Geneva views based upon over 7,000 allocation rules. Geneva executes these views to scan the financial ODS selecting records eligible for allocation. It then allocates these costs through four allocation layers, such as products, and geographical units.
During the first year of implementation, the year-end allocation process read over 50 million records selecting nearly 3 million that were eligible for allocation. These 3 million records were exploded into 186 million allocation results. The process runs in 7½ hours wall clock time. The Geneva system produces these results 250 times faster than the ERP package. Because of this innovative Geneva solution, the business users were able to have their data represented exactly as they wanted.
Geneva ERS Executive Information System
Providing users with on-line access into repositories holding hundreds of millions of detailed records is a significant challenge. The PwC team developed an innovative approach to give users the access, but not compromise the performance of the ODS processes or require massive new processing capacity. The result was the Geneva ERS Executive Information System.
This system uses Geneva to produce summary “views” of the detailed transactions. These summaries were developed within Geneva, and could summarize by any field in the ODS. Approximately 20 different cuts of the data were developed. During the load and query processes, these queries are executed to refresh the summaries from the detail transactions. Because the summaries are always regenerated from the detail, no sophisticated update processes had to be developed and they also always contain the same consistent answer.
Users access the companies Intranet site, and select which summary to view. The Geneva ERS Java applet allows users to drill down within the report to lower and lower levels of detail. Because of the unique structure of the Geneva output file, data access if very efficient. This web site is accessible by over 300 business users
Most Geneva Views are created and maintained within the Geneva ViewBuilder software. This interface stores table structures, field types, relationships between structures, and the business logic to process the data. Geneva trainers trained over 100 company employees on site on the use of the ViewBuilder, for everything from basic view construction to use of the Geneva API, to key data warehousing concepts.
With PwC assistance the company implemented the first two ODSs. They have now moved on to developing additional warehouses on their own and multiple data marts. The result has been the ability to replace certain legacy systems with much more flexible architecture. The company has made major strides in meeting it objectives of “…a single, consistent source of financially balanced data to be used by multiple business users for decision support, data mining, and internal/external reporting.” They have been able to create business intelligence in an intelligent way.