
Wednesday 10 December 2008

Business Objects Security

In the current business scenario, securing data and controlling which rows and columns of data each user can and cannot see is very important. Rows of data are secured through row-level security, which some people call 'fine-grained access control'. Columns of data are secured through column-level security, popularly known in Business Objects as 'object-level security'.
ROW LEVEL SECURITY
There are various ways through which the row level security can be implemented in a Business Objects environment.
One way is to secure the datamart itself. In this approach, the security policies and rules are written in the datamart. Technically, a security table is created and maintained holding the users / groups and their corresponding access rights. Security policies then compare the active logged-in user against this security table, so every user accessing the datamart sees only the data he or she is entitled to. The security policies and rules can also be embedded in a view. A good example of row-level security: non-managers cannot see the data of co-workers, whereas managers can see the data of their subordinates. In Oracle (for example), we can create non-manager and manager views with the security rule (<security_table.user> = "USER"). These security views are imported into the Business Objects (BO) universe, and the reports use them through the universe. The main ADVANTAGE of securing the datamart is that the security rules can also be used by other BI tools (Cognos, MicroStrategy) because the rules live in the datamart and NOT in Business Objects.
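As a rough illustration of the datamart-side approach, here is a minimal Oracle-style sketch. The table and column names (EMPLOYEE_FACT, SECURITY_TABLE and so on) are hypothetical placeholders, not objects from any specific project.

```sql
-- Hypothetical security table: one row per user and the employee data he / she may see
CREATE TABLE SECURITY_TABLE (
    USER_NAME   VARCHAR2(30),
    EMPLOYEE_ID NUMBER
);

-- Secured view: expose only the rows the active logged-in database user is entitled to.
-- USER is Oracle's built-in function returning the current session user.
CREATE OR REPLACE VIEW MANAGER_EMPLOYEE_V AS
SELECT f.*
FROM   EMPLOYEE_FACT f
JOIN   SECURITY_TABLE s
       ON s.EMPLOYEE_ID = f.EMPLOYEE_ID
WHERE  s.USER_NAME = USER;
```

Once such a view is imported into the BO universe, every report built on it inherits the row-level restriction without any change to the reports themselves.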
The second way is to build the security rules in Business Objects itself. Here the rules comparing the logged-in user with the security data are written in a virtual table of the BO universe; these virtual tables are nothing but universe derived tables. BO reports then access the datamart tables through the derived table. Alternatively, we can define security filters in the BO universe; these are called condition / filter objects in the BO universe world. With this approach you can take maximum ADVANTAGE of the BO features; the disadvantage is that when you move to a different BI tool, such as Cognos, you need to rewrite the business security rules in the new tool.
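For the universe-side alternative, a rough sketch of a derived table is below; @Variable('BOUSER') is the standard universe variable carrying the BusinessObjects login name, while the table names remain illustrative.

```sql
-- SQL body of a hypothetical universe derived table (BO Designer).
-- @Variable('BOUSER') is resolved at run time to the logged-in report user.
SELECT f.*
FROM   EMPLOYEE_FACT f
JOIN   SECURITY_TABLE s
       ON s.EMPLOYEE_ID = f.EMPLOYEE_ID
WHERE  s.USER_NAME = @Variable('BOUSER')
```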
In projects that migrate PeopleSoft transactional reporting to Business Objects analytical reporting, we can potentially reuse / import some of the security tables and security policies from PeopleSoft into the analytical datamart. These reusable components save time in building a secured datamart and reporting environment.
COLUMN LEVEL SECURITY
Like row-level security, column-level security can be implemented either in the datamart or in Business Objects. In the financial industry, business users do not want revenue amounts, social security numbers, tax id numbers and other sensitive columns to be shown to unauthorized users. In such cases we can mask the sensitive columns by showing a restricted tag in their place, while non-sensitive columns such as first name, last name, gender and age are shown as-is to the end business user. This logic can be implemented in a Business Objects universe derived table or in datamart views using decode / 'if then else' / case statements.
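A minimal sketch of such masking, assuming a hypothetical CUSTOMER table and a COLUMN_SECURITY table with an IS_AUTHORIZED flag; the same CASE logic could equally sit in a universe derived table.

```sql
-- Mask sensitive columns for unauthorized users; non-sensitive columns pass through as-is
CREATE OR REPLACE VIEW CUSTOMER_SECURE_V AS
SELECT c.FIRST_NAME,
       c.LAST_NAME,
       c.GENDER,
       c.AGE,
       CASE WHEN s.IS_AUTHORIZED = 'Y' THEN c.SSN
            ELSE '***RESTRICTED***' END AS SSN,
       CASE WHEN s.IS_AUTHORIZED = 'Y' THEN TO_CHAR(c.REVENUE_AMOUNT)
            ELSE '***RESTRICTED***' END AS REVENUE_AMOUNT
FROM   CUSTOMER c
JOIN   COLUMN_SECURITY s
       ON s.USER_NAME = USER;
```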
Alternatively, we can use the object restriction feature in BO Designer to define restrictions on universe objects. Whenever a business user drags a restricted object from the universe, the restriction rules are invoked, authorization occurs, and access to the object is granted only if the user is successfully authorized for it.
I'm signing off this BO security blog for now. The contents are based on my knowledge and BO experience across various projects. Thanks for reading. Please share your thoughts on this blog, and let me know about your project experiences with row- and column-level security in Business Objects.

Monday 24 November 2008

Business Intelligence Challenge – Product Upgrades & Migrations, Moving the Code – 4

Last time we discussed Impact Assessment; the next logical step is to perform the actual upgrade or migration of the code.
Moving the Code: Performing Upgrade or Migration of the Objects
When we talk about product upgrades, the product vendor always provides tools with which the objects in the earlier version can be upgraded to the latest version. We would still see some objects fail while using such tools; these are the ones that need rework after the upgrade process.
When we talk about product migration, such as moving from Cognos to Business Objects or from Business Objects to Cognos, there is good scope to automate the code migration. Earlier discussions covered how to leverage the metadata to understand the environment; now we look at how to manipulate or transform that metadata so that an object from platform 'A' becomes compliant with platform 'B'.
Steps involved in building an automated product migration process
Perform metadata-level object mapping between the two platforms and determine the gaps. This is actually a by-product of 'Step 2' in Impact Assessment.
Build individual components that would
  • Read the metadata from the source platform and prepare a repository
  • Have the knowledge of the match & gap between the platforms, could be reference tables
  • Transform the 'source' metadata and write it out in a form understood by the 'target' platform, using the reference tables
Benefits of Automated Migration
  • Helps avoid creation of objects from scratch
  • Ensures availability of time for testing (core task) than code development
  • Enables team to have a flexible skillset
  • A faster way of delivering things when a ‘one to one’ migration from the source platform is seen as a must
Automated Migration Challenges
Transforming the source metadata to the target platform is a challenge, especially with data manipulation functions. Having a good understanding of the gaps helps; a reference table mapping the functions between the platforms is useful. In scenarios where a function cannot be converted to the target platform, a comment can be written into a log file so that it gets quicker attention.
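One way to hold such a function mapping is a simple reference table that the migration component consults while rewriting expressions; a sketch is below, with all table and column names invented for illustration.

```sql
-- Reference table mapping source-platform functions to target equivalents;
-- a NULL target marks a gap that should be logged for manual attention
CREATE TABLE FUNCTION_MAP (
    SOURCE_FUNCTION VARCHAR2(50),
    TARGET_FUNCTION VARCHAR2(50)
);

INSERT INTO FUNCTION_MAP VALUES ('Decode', 'If');            -- has an equivalent
INSERT INTO FUNCTION_MAP VALUES ('Months_Between', NULL);    -- no equivalent: log it

-- Flag every source expression that uses a function with no target equivalent
SELECT r.OBJECT_NAME, r.EXPRESSION_TEXT, m.SOURCE_FUNCTION
FROM   SOURCE_EXPRESSION r
JOIN   FUNCTION_MAP m
       ON INSTR(UPPER(r.EXPRESSION_TEXT), UPPER(m.SOURCE_FUNCTION)) > 0
WHERE  m.TARGET_FUNCTION IS NULL;
```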
We have seen good success in writing such automated migration components, though never 100%. With almost every product providing a good SDK for reading as well as writing metadata, and with support for XML structures, writing such object-migration bridges is getting easier.
Whether or not the objects are migrated/upgraded in an automated way, the activity that follows, 'Validation', plays a key role in ensuring the final quality. Next time, let us discuss some of the means for effective validation…

Thursday 20 November 2008

Zachman Framework for BI Assessments

The Zachman Framework for Enterprise Architecture has become the model around which major organizations view and communicate their enterprise information infrastructure. Enterprise Architecture provides the blueprint, or architecture, for the organization’s information infrastructure. More information on the Zachman Framework can be obtained at www.zifa.com.
For BI practitioners, the Zachman Framework provides a way of articulating the current state of the BI infrastructure in the organization. Ralph Kimball in his eminently readable book “The Data Warehouse Lifecycle Toolkit” illustrates how the Zachman Framework can be adapted to the Business Intelligence context.
Given below is a version of the Zachman Framework that I have used in some of my consulting engagements. This is just one way of using this framework but does illustrate the power of this model in some measure.
[Figure: Zachman Framework adapted for BI assessment]
Some Salient Points with respect to the above diagram are:
  • The framework answers the basic questions of “What”, “How”, “Who” and “Where” across 4 important dimensions – Business Requirements, Conceptual Model, Logical/Physical Model and Actual Implementation.
  • Zachman Framework reinforces the fact that a successful enterprise system combines the ingredients of business, process, people and technology in proper measure.
  • It is typically used to assess the current state of the BI infrastructure in any organization
  • Each of the cells that lies at the intersection of the rows and columns (Ex: Information Requirements of Business) has to be documented in detail as part of the assessment document
  • Information on each cell is gathered through subjective and objective questionnaires.
  • Scoring models can be developed to provide an assessment score for each of the cells; based on the scores, a set of recommendations can be provided to achieve the intended goals (a sketch of one such scoring table follows this list).
  • Another interesting thought is to create an As-Is Zachman framework and overlay it with the To-Be one in situations where re-engineering of a BI environment is undertaken. This helps provide a transition path from the current state to the future.
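As a hedged illustration of the scoring idea, the questionnaire results per cell could be kept in a small table and rolled up per cell; the structure and the 1-5 scale below are assumptions for illustration only.

```sql
-- One row per Zachman cell per assessment question, scored on an assumed 1-5 scale
CREATE TABLE ZACHMAN_SCORE (
    PERSPECTIVE VARCHAR2(40),   -- e.g. 'Business Requirements', 'Conceptual Model'
    DIMENSION   VARCHAR2(10),   -- 'What', 'How', 'Who', 'Where'
    QUESTION_ID NUMBER,
    SCORE       NUMBER(2)
);

-- Average score per cell; the low-scoring cells drive the recommendations
SELECT PERSPECTIVE, DIMENSION, AVG(SCORE) AS CELL_SCORE
FROM   ZACHMAN_SCORE
GROUP  BY PERSPECTIVE, DIMENSION
ORDER  BY CELL_SCORE;
```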
Thanks for reading. If you have used the Zachman framework differently in your environment, please do share your thoughts.

Monday 10 November 2008

Valuing your Business Intelligence System – Part 1

Sample these statements:
  • Dow Jones Industrial Average jumped 200 points today, a 2% increase from the previous close
  • The carbon footprint of an average individual in the world is about 4 tonnes per year which is a 3% increase over last year
  • The number of unique URLs as of July 2008 on the World Wide Web is 1 trillion. The previous landmark of 1 billion was reached in 2000
  • One-day 5% VaR (Value at Risk) for the portfolio is $1 million, compared with a VaR of $1.3 million a couple of weeks back
Most of us buy into the idea of having a single number that encapsulates complex phenomena. Though the details of the underlying processes are important, the single number (and the trend) does act like a bellwether of sorts helping us quickly get a feel of the current situation.
As a BI practitioner, I feel that it is about time that we formulated a way for valuing the BI infrastructure in organizations. Imagine a scenario where the Director of BI in company X can announce thus: “The value of the BI system in this organization has grown 15% over the past 1 year to touch $50 Million” (substitute your appropriate currencies here!).
The core idea of this post is to find a way to “scientifically put a number to your data warehouse”. Here are a few level setting points:
  1. Valuation of BI systems is different from computing the Return on Investment (ROI) for BI initiatives. ROI calculations are typically done using Discounted Cash Flow techniques and are used in organizations to some extent
  2. More than the absolute number, the trends are important which means that the BI system has to be valued using the same norms at different points in time. Scientific / Mathematical rigor helps in bringing the consistency aspect.
My perspective to valuation is based on the “Outside-in” logic where the fundamental premise is that the value of the BI infrastructure is completely determined by its consumption. Or in other words, if there are no consumers for your data warehouse, the value of such a system is zero. One simple, yet powerful technique in the “Outside-in” category is RFM Analysis. RFM stands for Recency, Frequency and Monetary and is very popular in the direct marketing world. My 2-step hypothesis for BI system valuation using the RFM technique is:
  • Step 1: Value of BI system = Sum of the values of individual BI consumers
  • Step 2: Value of each individual consumer = Function (Recency, Frequency, Monetary parameters)
Qualitatively speaking, from the business user standpoint, one who has accessed information from the BI system more recently, has been using data more frequently and uses that information to make decisions that are critical to the organization will be given a higher value. A calibration chart will provide the specific value associated with RFM parameters based on the categories within them. For example: For the Recency parameter, usage of information within the last 1 day can be fixed at 10 points while access 10 days back will fetch 1 point. I will explain my version of the calibration chart in detail in subsequent posts. (Please note that the conversion of points to dollar values is also an interesting, non-trivial exercise)
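A minimal sketch of how such a calibration could be computed in SQL over a hypothetical usage-audit table BI_USAGE_LOG(USER_NAME, ACCESS_DATE, DECISION_VALUE); the point bands and thresholds are placeholders, not my final calibration chart.

```sql
-- RFM points per consumer: recency, frequency and monetary bands are illustrative
SELECT USER_NAME,
       -- Recency: 10 points if used within the last day, 1 point within 10 days
       CASE WHEN SYSDATE - MAX(ACCESS_DATE) <= 1  THEN 10
            WHEN SYSDATE - MAX(ACCESS_DATE) <= 10 THEN 1
            ELSE 0 END
       -- Frequency: one point per access in the last 30 days, capped at 10
       + LEAST(COUNT(CASE WHEN ACCESS_DATE >= SYSDATE - 30 THEN 1 END), 10)
       -- Monetary: points for the business value of decisions driven by the data
       + CASE WHEN SUM(DECISION_VALUE) >= 100000 THEN 10 ELSE 1 END
         AS RFM_POINTS
FROM   BI_USAGE_LOG
GROUP  BY USER_NAME;
```

Summing RFM_POINTS across all consumers (and applying an agreed points-to-currency conversion) gives the single number for the BI system as per Step 1.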
I am sure people acknowledge that valuing data assets is difficult, tricky at best. But then, far more difficult questions about nature and behavior have been reduced to mathematical equations; the day when BI practitioners can apply standardized techniques to value their BI infrastructure is probably not too far off.

Wednesday 29 October 2008

Business Intelligence Challenge – Product Upgrades & Migrations, Impact Assessment – 3

The next step after ‘Object Consolidation’ is Impact Assessment.
What is Impact Assessment? It is the process in which we determine the variations or gaps in the existing objects/reports by comparing them against the target platform.
The gaps or variations in an existing environment could be due to an existing function being replaced by a new function or an existing function not being supported with an equivalent function in the new platform.
Steps Involved in Impact Assessment
We try to perform this 'Impact Assessment' comparison against the target platform in an automated way by leveraging the underlying metadata of the existing environment.
1. Take the ‘object metadata’ collected as part of the Object Consolidation
2. Collect details of the possible issues faced during the upgrade or migration process. The source of details would be from
  • prior experience in executing similar projects
  • the manuals and release notes provided by the product vendor
  • the pilot project executed with the subset of objects from the existing setup
3. Convert the identified 'issues' into a relational table structure.
  • An issue could be an observation such as: the function 'sum' has now been renamed to 'sumif' in the newer version of the product
  • One way of converting this issue into a relational structure is to have two columns, 'issue_case' and 'issue_type', wherein issue_case carries the value 'sum' and issue_type carries the value 'aggregate'.
  • Converting to a relational structure enables using SQL queries for automated search of the impacted objects by joining the ‘object metadata’ table with the ‘issue’ table
4. Run SQL queries joining the 'issue' table with the 'object metadata' table to determine the impacted existing objects (a sample query follows this list)
5. Classify each object by the degree of impact (number of impact points) and decide on the upgrade/migration strategy for each of these groups
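A minimal sketch of steps 4 and 5, assuming the object metadata has been flattened into a table such as OBJECT_METADATA(OBJECT_NAME, EXPRESSION_TEXT) and the issues into ISSUE_TABLE(ISSUE_CASE, ISSUE_TYPE); all names are illustrative.

```sql
-- Step 4: list every existing object whose expression contains a known issue case
SELECT o.OBJECT_NAME, i.ISSUE_CASE, i.ISSUE_TYPE
FROM   OBJECT_METADATA o
JOIN   ISSUE_TABLE i
       ON INSTR(UPPER(o.EXPRESSION_TEXT), UPPER(i.ISSUE_CASE)) > 0
ORDER  BY o.OBJECT_NAME;

-- Step 5: degree of impact per object, i.e. the number of impact points
SELECT o.OBJECT_NAME, COUNT(*) AS IMPACT_POINTS
FROM   OBJECT_METADATA o
JOIN   ISSUE_TABLE i
       ON INSTR(UPPER(o.EXPRESSION_TEXT), UPPER(i.ISSUE_CASE)) > 0
GROUP  BY o.OBJECT_NAME
ORDER  BY IMPACT_POINTS DESC;
```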
Benefits of Impact Assessment
  • Foresee the challenges and be well prepared for the system upgrade or migration
  • Ability to estimate and plan the execution & testing phases very effectively
  • Enable building comprehensive test cases
  • Minimize surprises and provide confidence to the execution team
  • Helps in deciding whether an object should be rebuilt from scratch or upgraded/migrated
Impact Assessment Challenges
The first is gathering knowledge of the issues; the options are to talk to someone who has done a similar upgrade project to collect the issue details, or to perform a quick pilot with an appropriate sample of the objects to determine the issues.
The second is converting the issue logs into a relational structure and running the queries to determine the impacts; both require a good understanding of the underlying metadata structure, so explore the metadata and understand it fully from an analysis standpoint.
Next time, let us discuss another key task in an upgrade project…

Tuesday 21 October 2008

Business Intelligence Value Curve

Every business software system has an economic life. This essentially means that a software application exists for a period of time to accomplish its intended business functionality after which it has to be replaced or re-engineered. This is a fundamental truth that has to be taken into account when a product is bought or for a system that is developed from scratch.
During its useful life, the software system goes through a maturity life cycle. I would like to call it the "Value Curve" to establish the fact that the real intention of creating the system is to provide business value. As a BI practitioner, my focus is on the "Business Intelligence Value Curve", and in my humble opinion it typically goes through the following phases, as shown in the diagram.
[Figure: Business Intelligence Value Curve]
Stage 1 – Deployment and Proliferation
The BI infrastructure is created at this stage, catering to one or two subject areas. Both the process and the technology infrastructure are established, and there are tangible benefits to the business users (usually the finance team!). Seeing the initial success, more subject areas are brought into the BI landscape, which leads to the first list of problems: lack of data quality and completeness, and duplication of data across data marts / repositories.
Stage 2 – Leveraging for Enterprise Decision Making
This stage takes off by addressing the problems seen in Stage-1, and an overall enterprise data warehouse architecture starts taking shape. There is increased business value compared with Stage-1 as the Enterprise Data Warehouse becomes the single source of truth for the enterprise. But as the data volume grows, the value is diminished by scalability issues; for example, data loads that used to take 'x' hours to complete now need at least '2x' hours.
Stage 3 – Integrating and Sustaining
The scalability issues seen at the end of Stage-2 are alleviated and the BI landscape sees much higher levels of integration. Knowledge is built into the set up by leveraging the metadata and the user adoption of the BI system is almost complete. But the emergence of a disruptive technology (for example – BI Appliances) or a completely different service model for BI (Ex: Cloud Analytics) or a regulatory mandate (Ex: IFRS) may force the organization to start evaluating completely different ways of analyzing information.
Stage 4 – Reinvent
The organization, after appropriate feasibility tests and ROI calculations, reinvents its business intelligence landscape and starts constructing one that is relevant for its future.
I do acknowledge the fact that not all organizations will go through this particular lifecycle but based on my experience in architecting BI solutions, most of them do have stages of evolution similar to the one described in this blog. A good understanding of the value curve would help BI practitioners provide the right solutions to the problems encountered at different stages.

Friday 3 October 2008

Data Integration Challenge – Storing Timestamps

Storing timestamps along with a record, to indicate its new arrival or a change in its value, is a must in a data warehouse. We tend to take this for granted, adding timestamp fields to table structures and forgetting how much storage a timestamp field occupies: in many databases, such as SQL Server and Oracle, a timestamp takes almost double the storage of an integer, and if we have two fields, one for the insert timestamp and another for the update timestamp, the required storage doubles again. There are many instances where we could avoid storing timestamps, especially when they are used primarily for determining incremental records or kept just for audit purposes.

How to effectively manage the data storage and also leverage the benefit of a timestamp field?
One way of managing the storage of timestamp fields is by introducing a process id field and a process table. The following are the steps involved in applying this method to the table structures as well as to the ETL process.
Data Structure
  1. Consider a table named PAYMENT with two TIMESTAMP fields, INSERT_TIMESTAMP and UPDATE_TIMESTAMP, used for capturing the changes for every record present in the table
  2. Create a table named PROCESS_TABLE with columns PROCESS_NAME Char(25), PROCESS_ID Integer and PROCESS_TIMESTAMP Timestamp
  3. Now drop the fields of the TIMESTAMP data type from table PAYMENT
  4. Create two fields of integer data type in the table PAYMENT like INSERT_PROCESS_ID and UPDATE_PROCESS_ID
  5. These newly created id fields, INSERT_PROCESS_ID and UPDATE_PROCESS_ID, are logically linked to the table PROCESS_TABLE through its PROCESS_ID field (a DDL sketch of these structures follows this list)
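A hedged DDL sketch of the resulting structures, in Oracle-style types; the business columns of PAYMENT are of course illustrative.

```sql
-- Process table: one row per ETL run, carrying the timestamp only once
CREATE TABLE PROCESS_TABLE (
    PROCESS_NAME      CHAR(25),
    PROCESS_ID        INTEGER,
    PROCESS_TIMESTAMP TIMESTAMP
);

-- PAYMENT after the change: the two TIMESTAMP columns are replaced by
-- two integer ids that point back to PROCESS_TABLE
CREATE TABLE PAYMENT (
    PAYMENT_ID        INTEGER,
    PAYMENT_AMOUNT    NUMBER(12,2),
    INSERT_PROCESS_ID INTEGER,       -- run that inserted the row
    UPDATE_PROCESS_ID INTEGER        -- run that last updated the row
);
```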
ETL Process
  1. Let us consider an ETL process called ‘Payment Process’ that loads data into the table PAYMENT
  2. Now create a pre-process that runs before the 'payment process'. In the pre-process, build the logic that inserts a record with values like ('payment process', sequence number, current timestamp) into the PROCESS_TABLE table. The PROCESS_ID in PROCESS_TABLE can be generated by a database sequence.
  3. Pass the newly generated PROCESS_ID of PROCESS_TABLE as 'current_process_id' from the pre-process step to the 'payment process' ETL process
  4. In the 'payment process', if a record is to be inserted into the PAYMENT table, the current_process_id value is set in both the INSERT_PROCESS_ID and UPDATE_PROCESS_ID columns; if a record is being updated in the PAYMENT table, the current_process_id value is set only in the UPDATE_PROCESS_ID column
  5. So now the timestamp values for the records inserted or updated in the PAYMENT table can be picked up from PROCESS_TABLE by joining its PROCESS_ID with the INSERT_PROCESS_ID and UPDATE_PROCESS_ID columns of the PAYMENT table (a sample of these statements follows this list)
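A minimal sketch of the pre-process insert and of the look-ups described in steps 2 to 5, assuming an Oracle sequence named PROCESS_SEQ; the bind variable :last_processed_id is illustrative.

```sql
-- Pre-process: register the run once and capture the timestamp
INSERT INTO PROCESS_TABLE (PROCESS_NAME, PROCESS_ID, PROCESS_TIMESTAMP)
VALUES ('payment process', PROCESS_SEQ.NEXTVAL, SYSTIMESTAMP);

-- Recover the insert / update timestamps of each PAYMENT record for audit
SELECT p.PAYMENT_ID,
       ins.PROCESS_TIMESTAMP AS INSERT_TIMESTAMP,
       upd.PROCESS_TIMESTAMP AS UPDATE_TIMESTAMP
FROM   PAYMENT p
JOIN   PROCESS_TABLE ins ON ins.PROCESS_ID = p.INSERT_PROCESS_ID
JOIN   PROCESS_TABLE upd ON upd.PROCESS_ID = p.UPDATE_PROCESS_ID;

-- Incremental pick-up: every record touched after a given run
SELECT p.*
FROM   PAYMENT p
WHERE  p.UPDATE_PROCESS_ID > :last_processed_id;
```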
Benefits
  • The fields INSERT_PROCESS_ID and UPDATE_PROCESS_ID occupy less space when compared to the timestamp fields
  • Both the columns INSERT_PROCESS_ID and UPDATE_PROCESS_ID are Index friendly
  • It is easier to handle these process id fields in terms of picking the records for determining incremental changes or for any audit reporting.

Tuesday 30 September 2008

Business Intelligence – The Reusability Gene

One issue that confronts me time and again while executing BI projects is “Reusability”, actually the lack of it. Let me give an example. 
In the many migration and upgrade projects that Hexaware (my company) has executed, I always find that the number of reports finally migrated/upgraded to the new environment is only 40-50% of the number initially provided to us by the customer. Report rationalization has become such a critical step that we have developed specific metadata tools that help rationalize the reporting environment. Coming back to the topic: the reason for such a divergence between the final number of reports and the initial number is the lack of 'reusability'. Business users keep their own versions of standardized (?) reports on their desktops, which are nothing but small variations (usually with a new filter added) of an already existing report.
Another similar example on the data integration side is the creation of ad-hoc ETL routines as and when required. This results in duplicated ETL jobs and a non-standard BI environment.
Lack of re-use causes two major problems:
1) The BI environment becomes bloated with unwanted components that consume valuable computing resources, delaying the availability of more important information.
2) Any attempt at upgrading/re-engineering the existing system results in high costs and undesirable heart-burn among business users
The Prescription:
1) Establish a corporate-level BI team whose primary responsibility is to ensure that any component addition (ETL jobs, reports, models, etc.) is justified by its purpose. This team has to ensure that existing standards and components are reused to the maximum extent.
2) Strengthen the “Business Metadata” architecture within the organization. In one of my earlier posts, I had explained my view of BI metadata and that is very relevant to the task of improving reusability.
Basically, the “Reusability gene” seems to be a little muted in its functioning among BI practitioners. It is time that BI teams within organizations and system integrators like Hexaware look at reusability as a critical parameter while developing and deploying BI solutions.