
Monday 2 June 2008

Data Integration Challenge – Parent-Child Record Sets, Child Updates

Certain sets of records are naturally related. In a banking system, for example, a Loan record can have associated Guarantor records; similarly, in a services-based industry, a Contract has its Contract Components. These sets can be called parent-child records, where one parent record such as a Loan may have zero to many child Guarantor records.
During data modeling, we would have one table for the parent-level records and their attributes, and a separate table for the child records and their attributes.
As part of the data load process, I have seen situations where a complete refresh (delete and insert) of the child records is required whenever certain attributes of a parent record change. This requirement can be implemented in different ways; here we look at one approach that handles it well.
The ETL process involves the following steps:
  1. Read the parent-child record
  2. Determine whether the incoming parent record has changed
  3. If a change has occurred, issue a delete for that parent's set of child records
  4. Write the corresponding incoming child records to a flat file
  5. Once steps 1 to 4 are completed for all parent records, have another ETL flow bulk load the records from the flat file into the child table
We don't insert each new incoming child record immediately after the delete, because the delete would not yet have been committed and the insert can run into a lock on the table. We could issue a commit after every delete and then follow it with the insert, but a commit after each delete is costly; writing the inserts to a file avoids the problem.
Another option, inserting first with a different key and then deleting the older records, would also be costlier because of the work needed to locate the records that need to be deleted.
We could also have updated the records in place instead of deleting them, but then we would at times end up with dead records in the child table: records that have been deleted in the source would still exist in the target child table. Updating a record can also disturb contiguous storage, whereas a delete followed by an insert keeps the pages intact.
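The five steps above can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 and csv modules, not the ETL tool flow itself. The table names (`loan`, `guarantor`), the single `amount` attribute used for change detection, and the function names are all assumptions made for the example; treating a brand-new parent as "changed" and upserting the parent row are also assumptions.

```python
import csv
import os
import sqlite3
import tempfile


def stage_changed_children(conn, incoming, staging_path):
    """Steps 1-4: detect parent changes, delete old children, stage new ones.

    `incoming` is a list of (loan_id, amount, [guarantor names]) tuples.
    """
    cur = conn.cursor()
    with open(staging_path, "w", newline="") as staging:
        writer = csv.writer(staging)
        for loan_id, amount, guarantors in incoming:
            row = cur.execute(
                "SELECT amount FROM loan WHERE loan_id = ?", (loan_id,)
            ).fetchone()
            if row is not None and row[0] == amount:
                continue  # parent unchanged: leave its children alone
            # Parent is new or changed: upsert it and clear its old children.
            cur.execute(
                "INSERT OR REPLACE INTO loan (loan_id, amount) VALUES (?, ?)",
                (loan_id, amount),
            )
            cur.execute("DELETE FROM guarantor WHERE loan_id = ?", (loan_id,))
            # New children go to the flat file, not straight into the table.
            for name in guarantors:
                writer.writerow([loan_id, name])
    conn.commit()  # a single commit covers all the deletes


def bulk_load_children(conn, staging_path):
    """Step 5: load every staged child row in one batch."""
    with open(staging_path, newline="") as staging:
        rows = [(int(loan_id), name) for loan_id, name in csv.reader(staging)]
    conn.executemany("INSERT INTO guarantor (loan_id, name) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def demo():
    """Hypothetical data: loan 1 is unchanged, loan 2 has a new amount."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, amount REAL)")
    conn.execute("CREATE TABLE guarantor (loan_id INTEGER, name TEXT)")
    conn.execute("INSERT INTO loan VALUES (1, 1000.0), (2, 5000.0)")
    conn.execute("INSERT INTO guarantor VALUES (1, 'Asha'), (2, 'Ben'), (2, 'Chen')")
    conn.commit()
    incoming = [(1, 1000.0, ["Asha"]), (2, 7500.0, ["Dev"])]
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        stage_changed_children(conn, incoming, path)
        bulk_load_children(conn, path)
    finally:
        os.remove(path)
    return conn.execute(
        "SELECT loan_id, name FROM guarantor ORDER BY loan_id, name"
    ).fetchall()
```

In the demo, the children of the unchanged loan are left as they were, while the children of the changed loan are replaced by the staged rows. A real warehouse would use the database's own bulk loader for step 5 rather than `executemany`.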


Tuesday 15 April 2008

Using Analytic Hierarchy Process (AHP) for BI Tool Evaluation

An enterprise-wide BI architecture uses a multitude of tools within its landscape, each serving a specific function: Extract, Transform and Load (ETL), Data Cleansing, Metadata Management, Databases (both relational and multidimensional), Reporting and Analytics (OLAP), Data Mining, etc. Taking the OLAP area alone, there are more than 40 different products that can potentially solve a customer problem. You can imagine the number of combinations possible when the tool options are combined across the overall landscape. This is why Tool Evaluation is one of the most challenging and vexing problems in the Business Intelligence domain.
Tool evaluation and selection have become strategic to the implementation of enterprise-wide Business Intelligence. Traditionally, tool selection involved comparing the technical features of the tools, looking at demos by product vendors, reading industry reports, getting word-of-mouth referrals and then making a final decision. In my humble opinion, that is no longer sufficient.
Technical features, though important, cannot be the definitive criteria for selecting a particular tool. More crucial than technical features is what I term as the “Business Fitment Index”. The selected tool should fit with the characteristics of the business process prevalent in the organization and should take into account the requirements of different classes of users. The concept of Business Fitment can be classified as a Multi Criteria Decision Making (MCDM) problem and one of the powerful tools in this category is the Analytic Hierarchy Process (AHP).
AHP is a systematic procedure that helps to:
  1. Represent the elements of any problem, breaking it down into smaller constituents
  2. Assign weightages to each constituent by following a pairwise comparison technique
  3. Leverage expert judgment and intuitive feel into a coherent framework for problem solving
Though AHP can be used in many situations, Hexaware’s BI practice has perfected the art of leveraging its power in the realm of “BI Tools Evaluation”. There are 3 steps to calculating the Business Fitment Index using AHP.
Step 1 – Customer stakeholders perform a pair-wise comparison of the business parameters. The parameters can be things like Real Time Data Integration, Data Volumes, Data Quality, Business Rules Flexibility, etc.
Step 2 – The business parameters are ranked relative to one another using the AHP technique.
Step 3 – Each of the short-listed tools is evaluated against the business parameters, and a final rating is arrived at, taking into account the organization readiness factors.
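To make the three steps concrete, here is a minimal sketch in Python. It uses the row geometric mean, a common approximation to the principal-eigenvector weights of classical AHP, so it is a stand-in rather than the exact method used in any particular engagement. The pairwise judgments, the tool names and the tool scores are all hypothetical, and the organization readiness factors from Step 3 are not modeled.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights using the row geometric mean.

    pairwise[i][j] says how much more important parameter i is than
    parameter j, and pairwise[j][i] is expected to be its reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def fitment_index(weights, scores):
    """Weighted sum of a tool's scores against the business parameters."""
    return sum(w * s for w, s in zip(weights, scores))


# Step 1: hypothetical pair-wise judgments from the stakeholders.
PARAMETERS = ["Real Time Data Integration", "Data Volumes", "Data Quality"]
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

# Step 2: relative ranking (weights) of the business parameters.
WEIGHTS = ahp_weights(PAIRWISE)

# Step 3: hypothetical scores (1-10) of two short-listed tools per parameter.
TOOL_SCORES = {"Tool A": [9, 6, 5], "Tool B": [5, 8, 9]}
RANKING = sorted(
    TOOL_SCORES,
    key=lambda tool: fitment_index(WEIGHTS, TOOL_SCORES[tool]),
    reverse=True,
)
```

With these judgments, Real Time Data Integration carries the largest weight, so the tool that scores best on it ranks first even though the other tool scores higher on the remaining two parameters.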
The bottom line is that the technical features of the tools have to be considered in conjunction with how well the tool fits the characteristics of the business. That alone would ensure the success of the tool for enterprise-wide BI initiatives.
AHP is a simple yet powerful way of arriving at a decision by consensus. There are wide-ranging applications of AHP in BI, and this is a great area for practitioners to take an interest in. If you have some thoughts on other applications of AHP in the BI world, please do share them with us. Thanks for reading!

Monday 24 March 2008

Metadata 101 – For BI Practitioners

For as long as I can remember, the definition given for Metadata is “Data about Data”. We have all said this in interviews, heard it from candidates, seen it on presentations, and (almost) always nodded our heads in agreement.
In the transaction processing world, where “data-in” is the paradigm, the definition is precise. The databases store the business data in the relational format and the system tables / catalogs describe the structure of that data – the columns, type, size, etc. This data about the structure of business data is “Metadata”.
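As a small illustration of "data about data", the sketch below reads a table's structure back from the database catalog. It uses SQLite through Python's built-in sqlite3 module only because it is self-contained; other databases expose the same idea through their own system tables. The `sales` table and the `describe_table` helper are hypothetical names for the example.

```python
import sqlite3


def describe_table(conn, table):
    """Return column name, declared type and nullability from the catalog."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated, so it must come from a trusted source.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [
        {"column": name, "type": col_type, "nullable": not notnull}
        for _cid, name, col_type, notnull, _default, _pk in rows
    ]


def demo():
    """Create a hypothetical business table, then read its metadata back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, "
        "region TEXT NOT NULL, amount REAL)"
    )
    return describe_table(conn, "sales")
```

The rows in `sales` would be the business data; what `describe_table` returns is the metadata describing it.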
In the Business Intelligence world, that definition of metadata is incomplete. A more precise definition of metadata has two components:
Metadata in BI = “Data about Data” + “Information about Information”
The first component “Data about data” is “Technical Metadata” and is similar to the metadata in the OLTP world. Having said that, the technical metadata in BI is arguably more complex, as it not only encompasses the databases but needs to cover the ETL and Reporting tools as well. Each of the tools in the overall BI landscape has its own metadata and this data has to be looked at in a comprehensive fashion to understand data lineage etc.
Even among BI tools, there are different categories: tools that expose their metadata completely, tools that give a handle to their metadata through pre-defined APIs, and tools that do not allow any access to the metadata. Given the industry direction and the evolution of Common Warehouse Metamodel (CWM) compliance standards, it is only a matter of time before tool architectures are designed to expose the technical metadata. CWM is a fascinating topic of its own and you can get a feel for it by visiting this website: http://www.omg.org/technology/cwm/
To me, as a BI practitioner, the second piece of the metadata puzzle is more interesting. The “Information about information” aspect of metadata is “Business Metadata”, and understanding it is crucial to implementing the BI vision in any enterprise.
An analytical information consumer has 2 important requirements:
  1. Direction on where to access the required analytical content
    Example:
    • Where can I get Sales by Product for different locations over the last 2 years?
    • I am interested in Customer related Analytics. Where do I access it?
  2. Once the content is retrieved, guidance on how to make sense of it
    Example:
    • The report shows Forecasted Sales for next quarter in the chart. How is this value calculated?
    • Does the total inventory value displayed in the report include the Raw material inventory or does it exclude it?
Business metadata, when properly organized, should address both of the points mentioned above.
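One way to picture organized business metadata is as a small searchable catalog that answers both questions: where the content lives and how to read it. The sketch below is a toy illustration only; the class, the report locations and the placeholder descriptions are all hypothetical and do not describe any particular metadata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessMetadata:
    metric: str       # business name of the measure
    location: str     # where the analytical content can be accessed
    calculation: str  # how the value is derived, in business terms
    keywords: tuple   # search terms a consumer might use


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    BusinessMetadata(
        metric="Forecasted Sales",
        location="Sales dashboard > Quarterly Forecast report",
        calculation="Placeholder: the metric owner documents the formula here.",
        keywords=("sales", "forecast", "product"),
    ),
    BusinessMetadata(
        metric="Total Inventory Value",
        location="Supply chain dashboard > Inventory Summary report",
        calculation="Placeholder: states whether raw material inventory is included.",
        keywords=("inventory", "raw material"),
    ),
]


def find_content(keyword, catalog=CATALOG):
    """Requirement 1: direct the consumer to the relevant content."""
    needle = keyword.lower()
    return [m.location for m in catalog if any(needle in k for k in m.keywords)]


def explain(metric, catalog=CATALOG):
    """Requirement 2: tell the consumer how to interpret a value."""
    for m in catalog:
        if m.metric.lower() == metric.lower():
            return m.calculation
    return None
```

A call such as `find_content("inventory")` serves the first requirement, and `explain("Forecasted Sales")` serves the second.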
Metadata management in BI deals with the integration of technical and business metadata in a way that is useful for the organization. The challenge of metadata management becomes even more daunting when one considers both structured and unstructured data. Having said that, it is important for BI practitioners to understand the true nature of BI metadata and provide implementable solutions in their specific organizational context.
In future posts, I will discuss this fascinating area of Metadata management, with its manifestation as “Technical and Business Metadata” in both structured and unstructured data domains.

Friday 29 February 2008

BI Strategy – Approach based on First Principles

Defining the Business Intelligence (BI) strategy is typically the first step in an organization’s endeavor to implement BI. This phase is crucial, as the overall execution direction hinges on the decisions taken at this stage.
The precise approach to the BI Strategy definition includes the following steps:
  1. Business Area Identification - Identify and prioritize the business area(s) for which BI is considered. Ex: Human Resource Analytics, Supply Chain Analytics, Enterprise Performance Analytics etc.
  2. Process Mapping Document - Once the business area is identified, map out the individual processes involved in that particular domain. This can be a simple flow-chart that shows the entry and exit criteria for each sub-process.
  3. Business Questions Enumeration – Based on the subject areas involved in the business domain, enumerate the list of questions that are to be answered by the analytical layer.
  4. Data Elements Segregation – For each of the process steps, identify the data elements. These data elements, after subsequent validation (in conjunction with business questions) would translate into dimensions and facts during the data modeling stage.
  5. Data Visualization – Develop a prototype (set of screenshots) on how the data would be visualized for each business question. Business Analysts and domain experts are typically involved at this stage.
  6. BI Architecture Synopsis – At a fundamental level, BI architecture is fairly straightforward. The architecture is almost always a combination of the following processes: Extraction (E), Transformation (T), Loading (L), Cubing (C), and Analyze (Z). The number of layers, type of reporting etc. are a combination of ETLCZ components. Ex: ETLZ, ETLTLCZ, ELTZ, ELCZ are some options for BI architecture definition.
  7. Next Steps Document – The ‘Next Steps’ document would list down the other requirements of / from the analytical infrastructure. These can be points around Tool Evaluation, User profiles, Data volumes, Performance considerations, etc. Each of these requirements would translate to an assessment to be carried out before the actual construction begins.
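The ETLCZ shorthand from step 6 can be read mechanically, as the small sketch below shows. The stage names come from the list above; the function name and the decision to reject unknown letters are assumptions made for the example.

```python
# Stage letters as defined in the BI Architecture Synopsis step.
STAGES = {
    "E": "Extraction",
    "T": "Transformation",
    "L": "Loading",
    "C": "Cubing",
    "Z": "Analyze",
}


def expand_architecture(code):
    """Turn an architecture code such as 'ELTZ' into its ordered stages."""
    unknown = sorted(set(code) - set(STAGES))
    if unknown:
        raise ValueError(f"Unknown stage letter(s): {', '.join(unknown)}")
    return [STAGES[letter] for letter in code]
```

For example, `expand_architecture("ELTZ")` describes a flow that loads the extracted data first and transforms it afterwards, before the analysis layer.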
The most common mistake is to start thinking about technology aspects before the actual business requirement is finalized. A precise definition of business questions goes a long way in designing a scalable and robust BI infrastructure. 

Monday 28 January 2008

Business Intelligence and Six Sigma

I just finished a Six Sigma project and was left wondering why BI practitioners are not using more of the power of Six Sigma in Business Intelligence. Let me delve into this subject a bit more.
The Six Sigma project that I just completed was on “Developing a Function Point based estimation model for ETL loads”. Essentially, I was facing a lot of problems in estimating the effort for ETL (in this case, Informatica) loads that led to “Effort variances” beyond specified limits. So we kicked off a Six Sigma project that had the following DMAIC phases:
1. Define – Definition of the problem (Ex: Estimation process is out of whack)
2. Measure – We measured the effort variances before the start of the project and also set ourselves a target of where it should be.
3. Analyze – Analyzed the root cause of the problem. The solution was to let go of the complexity-based estimation that was done initially and to adopt Function Points. In fact, this FP-based estimation model was presented at the International Software Estimation Colloquium last year and won the Runner-up prize (http://www.qaiasia.com/Conferences/sec2007/leadership.htm)
4. Improve – Based on a pilot within the project, the Function points based linear regression model was arrived at and the team was educated on the estimation process. The improvements to the estimation process (effort variances) were measured on a regular basis.
5. Control – Periodic checks to ensure the institutionalization of the process and also fine-tune wherever necessary.
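For readers who want to see what a Function Point based linear regression looks like in its simplest form, here is a sketch using ordinary least squares on one variable. It is not the model from the project; the function names are hypothetical, and the effort variance formula (actual minus estimated, as a percentage of estimated) is one common definition that I am assuming here.

```python
def fit_line(function_points, efforts):
    """Ordinary least squares fit of effort = intercept + slope * FP."""
    n = len(function_points)
    mean_x = sum(function_points) / n
    mean_y = sum(efforts) / n
    sxx = sum((x - mean_x) ** 2 for x in function_points)
    sxy = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(function_points, efforts)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_effort(model, function_points):
    """Predict effort for a new ETL load from its function point count."""
    intercept, slope = model
    return intercept + slope * function_points


def effort_variance_pct(estimated, actual):
    """Assumed definition: deviation of actual from estimate, in percent."""
    return (actual - estimated) / estimated * 100.0
```

In practice the model would be fitted on the pilot data (function points and actual hours for completed loads), and the variance of later loads would be tracked against the target set in the Measure phase.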
That, in a nutshell, is what my Six Sigma project was all about. Basically, Six Sigma tries to improve process efficiencies by following the phases mentioned above.
Now let’s see the connection to Business Intelligence. Analytics, at this stage of evolution (in the majority of organizations), is being used to find the improvement area at a given point in time. The improvement area can be a problem (Ex: Trend chart showing that the Sales in the West region is dropping by 10% every quarter for the last 3 quarters) or an opportunity (Ex: Market potential for a product is huge and our share is small). BI is reasonably good at providing this information and it will only get better. But BI by itself does not enforce the process / execution rigor that successful organizations require.
To summarize, Six Sigma needs an improvement opportunity as the starting point for it to unleash its power to improve processes. BI generates a lot of these opportunities with its DW/Reporting/Analytics components but does not enforce the process implementation rigor. I feel that there is a lot of synergy in bringing both together: Six Sigma, the left hand, and BI, the right hand, when brought together can earn a lot of claps in the quest to create learning, performing organizations.
Just to sample the power of Six Sigma techniques, please take a look at the following link: http://www.kaushik.net/avinash/2007/01/excellent-analytics-tip-9-leverage-statistical-control-limits.html, which illustrates the use of control charts (one of Six Sigma’s potent tools) in metrics / KPI management. Fascinating!
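As a rough idea of how control limits apply to a KPI, the sketch below computes a center line with limits at three sample standard deviations and flags new values that fall outside them. This is a simplification: a proper individuals (XmR) control chart would typically estimate sigma from the moving range instead. The function names are hypothetical.

```python
import statistics


def control_limits(values, sigmas=3.0):
    """Return (lower limit, center line, upper limit) for a KPI series."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)  # sample standard deviation
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(baseline, new_values, sigmas=3.0):
    """Flag new KPI readings that fall outside the baseline's limits."""
    lower, _center, upper = control_limits(baseline, sigmas)
    return [v for v in new_values if v < lower or v > upper]
```

Readings inside the limits are treated as normal variation; only the flagged ones are candidates for a closer look, which is exactly the kind of improvement opportunity a Six Sigma project needs as its starting point.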
Agree or disagree, have more thoughts on this topic, think this post is good or rubbish? Whatever it is, please do send in your comments.
Information Nugget: Having talked about execution rigor, let me recommend one of the best books I have read in that area: “Execution – The Discipline of Getting Things Done” by Larry Bossidy and Ram Charan (http://www.amazon.com/Execution-Discipline-Getting-Things-Done/dp/0609610570)