Tuesday 21 October 2008

Business Intelligence Value Curve

Every business software system has an economic life. This essentially means that a software application exists for a period of time to accomplish its intended business functionality, after which it has to be replaced or re-engineered. This is a fundamental truth that has to be taken into account whether a product is bought or a system is developed from scratch.
During its useful life, a software system goes through a maturity life cycle – I would like to call it the “Value Curve” to establish the fact that the real intention of creating the system is to provide business value. As a BI practitioner, my focus is on the “Business Intelligence Value Curve” and, in my humble opinion, it typically goes through the following phases, as shown in the diagram.
[Diagram: Business Intelligence Value Curve]
Stage 1 – Deployment and Proliferation
The BI infrastructure is created at this stage, catering to one or two subject areas. Both the process and technology infrastructure are established and there are tangible benefits to the business users (usually the finance team!). Seeing the initial success, more subject areas are brought into the BI landscape, which leads to the first list of problems – lack of data quality, incompleteness and duplication of data across data marts / repositories.
Stage 2 – Leveraging for Enterprise Decision Making
This stage takes off by addressing the problems seen in Stage 1, and the overall enterprise data warehouse architecture starts taking shape. There is increased business value compared to Stage 1 as the Enterprise Data Warehouse becomes the single source of truth for the enterprise. But as the data volume grows, the value is diminished due to scalability issues. For example, the data loads that used to take ‘x’ hours to complete now need at least ‘2x’ hours.
Stage 3 – Integrating and Sustaining
The scalability issues seen at the end of Stage 2 are alleviated and the BI landscape sees much higher levels of integration. Knowledge is built into the setup by leveraging the metadata, and user adoption of the BI system is almost complete. But the emergence of a disruptive technology (for example, BI appliances), a completely different service model for BI (e.g. cloud analytics) or a regulatory mandate (e.g. IFRS) may force the organization to start evaluating completely different ways of analyzing information.
Stage 4 – Reinvent
The organization, after appropriate feasibility tests and ROI calculations, reinvents its business intelligence landscape and starts constructing one that is relevant for its future.
I do acknowledge the fact that not all organizations will go through this particular lifecycle but based on my experience in architecting BI solutions, most of them do have stages of evolution similar to the one described in this blog. A good understanding of the value curve would help BI practitioners provide the right solutions to the problems encountered at different stages.

Friday 10 October 2008

Business Intelligence Challenge – Product Upgrades & Migrations, Object Consolidation – 2

As an initial step, one of the key tasks to be considered in any Business Intelligence product upgrade or migration is ‘Object Consolidation’.
What is Object Consolidation? It is the process of understanding the current BI environment through its metadata and analysing that metadata to determine and eliminate redundant objects. The ‘objects’ in a BI product are its reports and its semantic layer definitions (like the Universe in Business Objects).
Steps Involved in Object Consolidation
1. Locate all objects (reports and semantic definitions). These objects could be in a central repository as well as in individual user folders and desktops
2. Check whether the objects’ metadata is available in relational storage (a metadata repository); if not, build processes that collect the metadata of the objects and store it in a relational structure
3. Run SQL queries against the relational structure (see the sketch after this list) to determine:
a. ‘Duplicates’: objects that have the same metadata elements
b. ‘Clusters’: objects that have similar metadata elements; when objects (reports) differ from one another by only one or two metadata elements, they are grouped as a ‘Cluster’
c. ‘Dormant’: objects that are no longer used
d. ‘Complexity’: the complexity of each object, in terms of factors like the number of metadata elements it uses
4. Share the object consolidation findings with the users for verification and confirmation
5. Prepare the consolidated list of objects by eliminating the duplicate and dormant objects and including only the prime object from each cluster:
a. Duplicate objects are removed directly
b. From each cluster, only the key object is considered for upgrade; after the key object has been upgraded, the rest of the objects in the same cluster are derived from this upgraded key object
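For illustration, here is a minimal sketch of the kind of queries run in step 3. It assumes a hypothetical metadata repository table OBJECT_METADATA with a METADATA_SIGNATURE column (a concatenation or hash of an object’s metadata elements) and a LAST_RUN_DATE column; the table and column names, the 12-month dormancy cut-off and the Oracle-style date function are all illustrative assumptions, not a specific product’s repository schema.
-- 'Duplicates': report objects sharing exactly the same metadata signature
SELECT METADATA_SIGNATURE, COUNT(*) AS OBJECT_COUNT
FROM   OBJECT_METADATA
WHERE  OBJECT_TYPE = 'REPORT'
GROUP BY METADATA_SIGNATURE
HAVING COUNT(*) > 1;
-- 'Dormant': objects not run during the last 12 months (the cut-off is an assumption)
SELECT OBJECT_ID, OBJECT_NAME
FROM   OBJECT_METADATA
WHERE  LAST_RUN_DATE < ADD_MONTHS(SYSDATE, -12);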
The consolidated list of objects and the understanding of the complexity of the existing environment become key inputs for planning the upgrade process.
Benefits of Object Consolidation
1. Eliminates the upgrade of unwanted objects, saving effort and cost
2. Enables building a clean system in the newer version or platform, ensuring easier system maintenance
3. Enables effective upgrade planning based on the understanding of the environment
4. Improves the understanding of the existing environment through the metadata links
Object Consolidation Challenge: Accessing the metadata of the objects can be a challenge since many BI products don’t expose metadata that can be queried through SQL. But almost every product provides an SDK through which the metadata can be accessed, or exposes the metadata as XML files. We would need to build tools that pull the metadata using the SDKs or, in the case of XML files, build XML readers/parsers to extract the required metadata.

Friday 3 October 2008

Data Integration Challenge – Storing Timestamps

Storing timestamps along with a record, indicating its new arrival or a change in its value, is a must in a data warehouse. We take this for granted, adding timestamp fields to table structures and tending to miss how much storage space a timestamp field can occupy: in many databases like SQL Server and Oracle, a timestamp takes almost double the storage of an integer data type, and if we have two fields, one for the insert timestamp and another for the update timestamp, the required storage doubles again. There are many instances where we could avoid using timestamps, especially when they are used primarily for determining incremental records or are stored just for audit purposes.

How to effectively manage the data storage and also leverage the benefit of a timestamp field?
One way of managing the storage of timestamp fields is by introducing a process id field and a process table. Following are the steps involved in applying this method to the table structures as well as to the ETL process.
Data Structure
  1. Consider a table named PAYMENT with two fields of timestamp data type, INSERT_TIMESTAMP and UPDATE_TIMESTAMP, used for capturing the changes for every record present in the table
  2. Create a table named PROCESS_TABLE with columns PROCESS_NAME Char(25), PROCESS_ID Integer and PROCESS_TIMESTAMP Timestamp
  3. Now drop the fields of the TIMESTAMP data type from the table PAYMENT
  4. Create two fields of integer data type in the table PAYMENT, INSERT_PROCESS_ID and UPDATE_PROCESS_ID
  5. These newly created id fields INSERT_PROCESS_ID and UPDATE_PROCESS_ID are logically linked to the table PROCESS_TABLE through its field PROCESS_ID (see the DDL sketch after this list)
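A minimal DDL sketch of the resulting structures is given below; Oracle-style syntax is assumed, and only the columns relevant to this approach are shown.
CREATE TABLE PROCESS_TABLE
(
PROCESS_NAME      CHAR(25),
PROCESS_ID        INTEGER,
PROCESS_TIMESTAMP TIMESTAMP
);
-- PAYMENT after the change: timestamp columns dropped, process id columns added
ALTER TABLE PAYMENT DROP COLUMN INSERT_TIMESTAMP;
ALTER TABLE PAYMENT DROP COLUMN UPDATE_TIMESTAMP;
ALTER TABLE PAYMENT ADD (INSERT_PROCESS_ID INTEGER, UPDATE_PROCESS_ID INTEGER);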
ETL Process
  1. Let us consider an ETL process called ‘Payment Process’ that loads data into the table PAYMENT
  2. Now create a pre-process which runs before the ‘payment process’; in the pre-process, build the logic by which a record is inserted with values like (‘payment process’, sequence number, current timestamp) into the PROCESS_TABLE table. The PROCESS_ID in the PROCESS_TABLE table could be generated by a database sequence
  3. Pass the newly generated PROCESS_ID of PROCESS_TABLE as ‘current_process_id’ from the pre-process step to the ‘payment process’ ETL process
  4. In the ‘payment process’, if a record is to be inserted into the PAYMENT table, the current_process_id value is set in both the columns INSERT_PROCESS_ID and UPDATE_PROCESS_ID; if a record is getting updated in the PAYMENT table, the current_process_id value is set only in the column UPDATE_PROCESS_ID
  5. The timestamp values for the records inserted or updated in the table PAYMENT can now be picked from the PROCESS_TABLE by joining its PROCESS_ID with the INSERT_PROCESS_ID and UPDATE_PROCESS_ID columns of the PAYMENT table (sketched below)
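Here is a minimal sketch of the pre-process insert and of the lookup join described in step 5; PROCESS_ID_SEQ is a hypothetical database sequence and Oracle-style syntax is assumed.
-- Pre-process: register the run; PROCESS_ID_SEQ is a hypothetical sequence
INSERT INTO PROCESS_TABLE (PROCESS_NAME, PROCESS_ID, PROCESS_TIMESTAMP)
VALUES ('payment process', PROCESS_ID_SEQ.NEXTVAL, CURRENT_TIMESTAMP);
-- Picking the insert/update timestamps for PAYMENT records through the process ids
SELECT p.*,
       ins.PROCESS_TIMESTAMP AS INSERT_TIMESTAMP,
       upd.PROCESS_TIMESTAMP AS UPDATE_TIMESTAMP
FROM   PAYMENT p
JOIN   PROCESS_TABLE ins ON ins.PROCESS_ID = p.INSERT_PROCESS_ID
JOIN   PROCESS_TABLE upd ON upd.PROCESS_ID = p.UPDATE_PROCESS_ID;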
Benefits
  • The fields INSERT_PROCESS_ID and UPDATE_PROCESS_ID occupy less space when compared to the timestamp fields
  • Both the columns INSERT_PROCESS_ID and UPDATE_PROCESS_ID are Index friendly
  • It is easier to handle these process id fields in terms of picking the records for determining the incremental changes or for any audit reporting (see the example below)
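As an example of the last point, incremental changes can be picked with a simple range condition on the process id. This assumes the process ids come from an increasing sequence; :LAST_LOADED_PROCESS_ID is an illustrative bind variable holding the id recorded at the previous load.
-- Incremental extraction using the process id recorded at the previous load
SELECT p.*
FROM   PAYMENT p
WHERE  p.UPDATE_PROCESS_ID > :LAST_LOADED_PROCESS_ID;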

Tuesday 30 September 2008

Business Intelligence – The Reusability Gene

One issue that confronts me time and again while executing BI projects is “Reusability”, actually the lack of it. Let me give an example. 
In the many migration and upgrade projects that Hexaware (my company) has executed, I always find that the number of reports finally migrated/upgraded to the new environment is only 40-50% of the number provided to us by the customer initially. Report rationalization has become such a critical step that we have developed many specific metadata tools that help rationalize the reporting environment. Coming back to the topic – the reason for such a divergence between the final number of reports and the initial number is the lack of ‘reusability’. Business users have their own versions of standardized (?) reports stored on their desktops which are nothing but small variations (usually with a new filter added) of an already existing report.
Another similar example on the data integration side is the creation of ad-hoc ETL routines as and when required. This results in duplication of ETL jobs and a non-standard BI environment.
Lack of re-use causes two major problems:
1) The BI environment becomes bloated with a growing number of unwanted components that consume valuable computing resources, resulting in delays in the availability of more important information.
2) Any attempt at upgrading or re-engineering the existing system results in high costs and undesirable heartburn among business users
The Prescription:
1) Establish a corporate level BI team whose primary responsibility is to ensure that any component addition (ETL, Reports, and Models etc.) is justified based on its purpose. This team has to ensure that existing standards and components are reused to the maximum extent.
2) Strengthen the “Business Metadata” architecture within the organization. In one of my earlier posts, I had explained my view of BI metadata and that is very relevant to the task of improving reusability.
Basically, the “Reusability gene” seems to be a little muted in its functioning among BI practitioners. It is time that BI teams within organizations and system integrators like Hexaware look at reusability as a critical parameter while developing and deploying BI solutions.

Wednesday 17 September 2008

Business Intelligence Challenge – Product Upgrades and Migrations – I

Product Upgrades are situations where we are moving from one version of the product to the latest version of the same product. Upgrades happen
  • to ensure support from the product vendor
  • to leverage new features provided by the latest version in terms of performance and user experience
  • because some other new product being added to the architecture doesn’t talk to the existing versions
Product Migrations are situations where we are moving from a platform of one vendor to another vendor’s platform. Migrations happen
  • as ‘BI Standardization’ initiatives drive organizations to move towards a common platform to operate BI systems at a lower cost and provide uniform user experience
  • because of bad experience with the current product not meeting the business needs in terms of performance or usability or product support or license cost
  • because of recent mergers and acquisitions, which lead organizations to think of a ‘safer’ platform
Is an upgrade a challenge? Newer versions of every major product, especially ones like Business Objects and Cognos, undergo such rapid change that the newer version of the same product comes out on a different architecture with an entirely new set of components. Upgrades are no longer just upgrades; they have become effort-intensive product migrations, almost similar to moving from one BI product vendor to another.
Let us call either an upgrade or a migration an ‘Upgrade’, as any such initiative is meant to provide an upgraded experience for both the business and IT.
“Can we do this upgrade next year?” is a common question when an IT team requests a Business Intelligence product upgrade. The upgrade is one of the key items that will definitely come up for discussion during BI budget allocation in every organization. Fears persist among business users that upgrade projects will consume many of their hours without much benefit to them. For IT, an upgrade is a bigger challenge due to the unpredictability of the problems they may face during the course of the project and the need to ensure minimal disturbance to the business team. Hence, BI initiatives related to product upgrades go through multiple rounds of scrutiny before budget approval. Such projects are seen as an IT initiative, and a clear definition of business benefits becomes difficult to build.

Tuesday 9 September 2008

Business Intelligence – The Unconquered Territories

Bill Bryson, one of my favorite authors, writes this way in the book “A Short History of Nearly Everything” and I quote:
“As the nineteenth century drew to a close, scientists could reflect with satisfaction that they had pinned down most of the mysteries of the physical world: electricity, magnetism, gases, optics, kinetics, and statistical mechanics, to name just a few. If a thing could be oscillated, accelerated, perturbed, distilled, combined, weighed or made gaseous they had done it, and in the process produced a body of universal laws so weighty and majestic that we still tend to write them out in capitals. The whole world clanged and chuffed with the machinery and instruments that their ingenuity had produced. Many wise people believed that there was nothing much left for science to do”
Now we all know how much science did invent / discover in the 20th century.
Sitting now in 2008, sometimes when I hear people speaking about BI, I get a feeling that we are on the verge of accomplishing everything in this space. Alas! That is as far from the truth as it gets – there are so many “unconquered territories” in BI that if you were thinking the past was challenging enough, it is time to get rejuvenated for wrestling with bigger challenges in the future.
My top ten “Unconquered Territories” for BI Practitioners are:
1) The majority of BI decision making is geared towards the analysis of structured data. Usage of unstructured data is minimal at best and non-existent in many cases.
2) There is still a lot of work to be done in integrating the process rigor of Six Sigma or a quality management methodology (say CMMI) into the BI paradigm. Unless that is done, BI will not be sustainable in the long run.
3) Lack of valuation techniques. BI systems are corporate assets like human resources, brands, etc. and there have to be concrete models for valuing them.
4) Predictive analytics / data mining are used effectively by only a handful of organizations. There is no shortage of techniques, but the world is probably short of people who can apply high-end analytical techniques to solve “common-sense”, real-world business problems.
5) Let’s face it – There are technology limitations. Operational BI (Lack of real-time data access), Guided analytics (Lack of comprehensive business metadata), Information as a Service (Lack of SOA based BI architecture) are some of those technology limitations that come to my mind.
6) Data Quality is a nightmare in most organizations. Either the data is already ‘dirty’ or there is really no governance process which leaves the only option that data will become ‘dirty’ eventually.
7) Here is a mindset challenge – BI Practitioners, in my view, need to develop a higher level of “business process” oriented thinking that seems to be lacking given the ever increasing technology complexity of BI tools.
8) Simulations!! – Businesses run with a lot of interdependent variables. Unless a simulation model of the business is built into the analytical landscape, there is really no way of having a handle on the future state of business. Of course, ‘Black Swans’ will continue to exist but that’s a different subject matter altogether.
9) On-demand analytics – I accept that I am being a little unfair here in expecting BI to catch up with the nascent world of “cloud” computing so early. But the fact remains that much work can be done in this area of “Cloud Analytics”.
10) Packaged analytics is a step in the right direction – Organizations can quickly deploy analytical packages and spend more time on how to optimize business decisions. Having said that, the implementation difficulty combined with the lack of flexibility in packages are areas of concern to be alleviated.
Each one of us will have our own list of “unconquered territories”. Probably it is worthwhile to put everything down on paper and nudge your BI environments towards conquering all those areas and beyond.

Monday 1 September 2008

Business Intelligence Challenge – Understanding Requirements, System Object Analysis

In the earlier discussion we had looked at understanding BI requirements through User Object Analysis, now let us look at another aspect.
The uniqueness of building BI systems, compared to other systems, is that BI systems are built over the data collected by transaction (source) systems to enable effective data analysis. In principle a BI system should enable any kind of analysis on the data from the source(s), but in many cases we initially pull only the required elements into the data warehouse based on predefined analyses and get the BI system up. The requirement for a BI system is to define the scope in terms of which business processes, scenarios and data are of immediate need and to make them available for analysis.
Even though many system owners or functional experts provide the details of the transaction system, there are still many data elements and relationships that are not reachable through the inputs from the business. We have all experienced new scenarios pointed out by the business, like “this data element should not be updated” or “we need the value to be populated based on a certain flag”, that emerge during the testing phase or in production. Such surprises occur not because the requirements keep changing but because of a lack of understanding of the actual scenarios present in the source system’s data.
The means of understanding the business process and the system functions of a source system by looking at its data elements and their values is called ‘System Object Analysis’.
Following are the steps in ‘System Object Analysis’
1. Collect all tables from the source system along with their physical structure metadata, like table name, column name, data type, etc. (see the catalog query sketch after this list)
2. Define descriptions in terms of the kind of data each of these tables stores
3. Group the tables by function, based on an understanding of the descriptions or on naming conventions present among the tables. Certain tables or groups can be eliminated here through interaction with the users; also, a table can belong to multiple groups
4. Reverse engineering the underlying data model would be useful as well
5. Perform data profiling for each of the tables
6. Understand the domain values, their significance in terms of when each value can occur, and the relationships between tables
7. Determine the different scenarios by which data arrives into each table
8. Determine the facts, dimensions and dimension attributes within each functional area/group
9. Now, with clear details on each group and the facts and dimensions they contribute, prepare questions that the business can get answered within and across the functional areas (groups). Validate the questions and possibly collect more questions from the business
10. Present to the business what can be done with the system, prioritize, and prepare the implementation plan
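As an illustration of steps 1 and 5, here is a minimal sketch of the kind of queries used to collect the physical structure metadata and to profile a column. Oracle catalog views are assumed (other databases expose similar catalogs, such as INFORMATION_SCHEMA), and SOURCE_SCHEMA, ORDERS and STATUS_CODE are placeholder names, not objects from any specific source system.
-- Step 1: physical structure metadata for all tables in the source schema (Oracle catalog assumed)
SELECT c.TABLE_NAME, c.COLUMN_NAME, c.DATA_TYPE, c.DATA_LENGTH, t.NUM_ROWS
FROM   ALL_TAB_COLUMNS c
JOIN   ALL_TABLES t ON t.OWNER = c.OWNER AND t.TABLE_NAME = c.TABLE_NAME
WHERE  c.OWNER = 'SOURCE_SCHEMA'
ORDER BY c.TABLE_NAME, c.COLUMN_ID;
-- Step 5: simple profiling of a single column to understand its domain values
SELECT STATUS_CODE, COUNT(*) AS ROW_COUNT
FROM   SOURCE_SCHEMA.ORDERS
GROUP BY STATUS_CODE
ORDER BY ROW_COUNT DESC;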
Based on the analysis of the tables, the groups or functional areas defined initially can change in terms of the tables within a group, or even a new group can emerge. During these steps, regular interaction with the business users happens and the requirements of the BI system get defined.
Benefits of System Object Analysis
Ensures a complete understanding of the process by which data gets modified in the source system, enabling the team to deliver more than what the business asks for
Helps group and prioritize requirements, build the case for dependencies, and prepare the roll-out plan
Provides a means to trigger requirements definition from users through an interactive process, letting us raise many questions to the business about their systems and processes
Many a time, the requirement defined by the business is simply to build an ad-hoc query environment for a transaction system; System Object Analysis, which enables users to navigate the requirements with input from the technical team, therefore becomes almost mandatory for building an effective BI system.

Saturday 30 August 2008

Introduction of External Business Component (EBC)

In almost every implementation of the Siebel application we come across the requirement: “Bring the data from an external source and display it.” In Siebel, the solutions for this are the External Business Component (EBC) and the Virtual Business Component (VBC). In this blog the content is specific to EBC, so let me give the definition of EBC first and then the process of creating it.
Definition:
An EBC is a method used when there is a need to display data that is external to Siebel. The external schema is imported into Siebel tables using a wizard. Once the external schema is imported, displaying this data in an applet requires the same configuration as creating a BC, an applet, a view, etc.
Steps to Create an EBC:
1. Get the DDL file for your external table.
Here is how a sample DDL file looks:
CREATE TABLE TPMS.EBC_VEC
(
demo1 VARCHAR2(20),
demo2 VARCHAR2(20),
demo3 NUMBER(10)
)
2. Use the Siebel object creation wizard to create this table:
File –> New Object –> External Table Schema Import
3. The wizard will ask for the following inputs:
i. Select the project this table will be part of from the list
ii. Select the database where the external table resides – for this example it is Oracle Server Enterprise Edition
iii. Specify the full path of the file where the table definition resides
iv. Specify a 3-digit batch code for this import – e.g. 001
v. Click on Next and then click on Finish
4. This will create your external table with a name like EX_001_0000001. The names of external tables begin with “EX_”, the next 3 characters are the batch code, and the rest is just a serial number.
* The Type field will be “External” for this table.
* You will also have to map one of the table columns to Siebel’s Id field. To do this, go to the desired table column and, in the “System Field Mapping” column, select “Id”.
5. Changes now need to be made in the cfg file; follow the steps below:
  • Create an entry for the new data source under the [DataSources] section:
TPMS = TPMS
  • Add a new section [TPMS] to describe the data source parameters:
[TPMS]
Docked = TRUE
ConnectString = VECDEV
TableOwner = TPMS_INT
DLL = abc.dll
SqlStyle = OracleCBO
DSUserName = vecdev
DSPassword = vecdev
  • Now that you have defined the data source in the cfg file, go back to Siebel Tools and add the data source to your external table. Go to your external table, then to Data Source, and add a new record:
    Name = TPMS
  • The external table is now ready for use in an EBC.
Use the Siebel object wizard to create a BC based on this table. Once the BC is created, change the Data Source property of the BC to “TPMS”. You are now ready to use this BC in an applet/view.
The above process describes an external data source called TPMS, from which we fetch data into Siebel.
But what if we come across a slightly more complex requirement: suppose the data is in Siebel but it should not be modified from the front end or from the back end (unless one has the right to do so), just like an external data source schema. Or, to put it another way, can we create an EBC on the same database we are working on, i.e. the Siebel database?
The answer and solution are the same: yes, we can create an EBC based on the same database, but for that we need to create a different DSN and then follow the steps given above.
Please feel free to post comments/questions/ideas.