Monday 19 January 2009

Analytics, its Evolution

What is ‘Analytics’? It is a business intelligence application with ready-to-use components for data analysis; we also refer to it as ‘packaged analytics’. ‘Business Analytics’ refers to analytics applications that support analysis of data collected as part of a business process.
Along similar lines, we can define ‘Personal Analytics’ as an analytics application that supports analysis of data collected as part of a computer user’s daily activity.
Business systems evolved from custom-built applications to configurable, generic Enterprise Resource Planning (ERP) systems. Business intelligence has followed the same path: custom-built business intelligence applications have evolved into configurable, generic applications called ‘Business Analytics’.
ERP systems are designed to collect business data, whereas Business Analytics systems are designed to analyze the collected business data, so one of the key sources for a Business Analytics application is an ERP system. Data analysis is the next logical step after data collection, yet ERP vendors like Oracle, SAP and Microsoft were late in addressing this specific requirement. In the last two years we have seen some of the finer business intelligence products being acquired by the ERP vendors. Clearly, customers who are on ERP products would get a better platform for data analysis, one that can talk to their ERP applications.
In reality, few companies, at least among the larger ones (>USD 500 million), run their entire business on one ERP system. Consolidating all applications onto a single ERP platform will not happen immediately, and more ERP and custom applications get added when a company grows through acquisitions, so the existence of multiple transaction systems cannot be avoided. The number of customers embracing packaged analytics from the ERP vendors will increase as these business analytics applications mature and become flexible enough to accept data from outside applications.
Logical Data Model to Packaged Reports
Business analytics applications grew step by step, as follows:
  • 1. Logical data model – as a first step towards packaged analytics, companies like IBM and Teradata provided industry-specific logical data models (LDMs) to help customers build their enterprise data warehouse. The LDM was based on the business process and provided the jumpstart required to integrate data from multiple source systems effectively. We also have certain industry-endorsed LDMs like the Supply-Chain Operations Reference-model (SCOR) and the Public Petroleum Data Model (PPDM).
  • 2. Metrics definition – LDMs led to the next step of defining metrics to measure the performance of a business process. The data required for the metrics of a specific business process was extracted (virtually or physically) into data marts, as analytic data models in a fact-dimension structure.
  • 3. Semantic layers – the next step was the creation of a semantic layer over the data mart to enable ad hoc querying and report generation.
  • 4. Reports and dashboards – finally, a set of reports and dashboards was delivered over the semantic layer.
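To make steps 2 to 4 a little more concrete, here is a minimal sketch of a fact-dimension data mart and one metric computed over it, using Python's built-in sqlite3 module. The table names, column names and sample figures (dim_region, fact_sales, revenue by region) are purely illustrative assumptions and are not taken from any packaged analytics product.

```python
import sqlite3


def build_mart():
    """Create a tiny in-memory data mart: one dimension table, one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
        CREATE TABLE fact_sales (region_id INTEGER, amount REAL);
        INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
        INSERT INTO fact_sales VALUES (1, 100.0), (1, 50.0), (2, 70.0);
    """)
    return conn


def revenue_by_region(conn):
    """Hypothetical metric: total sales amount per region (fact joined to dimension)."""
    rows = conn.execute("""
        SELECT d.region_name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_region d ON d.region_id = f.region_id
        GROUP BY d.region_name
        ORDER BY d.region_name
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_region(build_mart()))
```

A semantic layer would expose something like "Revenue" and "Region" as business-friendly objects over this join, so that report authors never write the SQL themselves.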
Packaged analytics is still positioned as a data mart application addressing a specific business process like HR or Customer Relationship, unlike ERP systems, which address the complete end-to-end business process of an organization. It will be some time before an Enterprise Analytics Application is established.

Friday 2 January 2009

What is “Safe to Bet On” in Business Intelligence?

While the phrase “Safe to Bet On” is an oxymoron of sorts, it is that time of the year when we first look at the past, derive some insights and look forward to what the future has in store for us. I have no doubt that 2009 will be doubly interesting for BI practitioners as compared to 2008.
Having said that, I decided to do a bit of introspection to figure out what skills (can also be read as competencies) I should be looking at to stay relevant in the Business Intelligence world far into the future, say in 2020. Hopefully that resonates with some of you.
Let me first try to define the skills required for Business Intelligence and Analytics. The trick here is to stay “high-level”: any BI person will acknowledge that once we get down to looking at the trees (rather than the forest), the sheer number of skills required for enterprise-level BI can get daunting.
Taking inspiration from the fact that any business can be condensed into 2 basic functions, viz. Making & Selling, I propose that there are 3 key skills that make for successful BI:
Skill 1 – Business Process Understanding: If you are a core industry expert and can still talk about multi-dimensional expressions, that’s great! But most BI practitioners have their formative years rooted on the technology side and have implemented solutions across industries. The abilities to understand the value chain of any industry, map out business processes, identify optimization areas and translate IT benefits into business benefits are the key sub-skills in this area.
Skill 2 – Architecting BI Solutions: This skill is all about answering the question “What is the blue-print for building the Business Intelligence landscape in the organization?” Traditionally, we have built data warehouses & data marts either top-down or bottom-up, integrated data from multiple sources into physical repositories, modeled them dimensionally, provided ad hoc query capability and we are done! NOT ANYMORE. With ever-increasing data volumes, real-time requirements imposed by Operational BI, increased sophistication of end-user analytics and the clamor for leveraging unstructured data on one hand, and the advent of On-Demand Analytics, Data Mashups, Data Warehouse appliances, etc. on the other, there is no single best way to build a BI infrastructure. So the answer to “What is the blue-print?” is “It depends”. It depends on many factors (some of which are known today and many of which aren’t), and the person or organization who appreciates these factors and finds the best fit for a particular situation is bound to succeed.
Skill 3 – BI Tools Expertise: Once a blue-print is defined and optimization areas are identified, we need the tools that can turn those ideas into reality. BI practitioners have many tools at their disposal, straddling the entire spectrum from Excel spreadsheets at one end to high-end data mining tools at the other. If you bring in the ETL & data modeling tools, the number of industry-strength tools gets into the 50s and beyond. With the convergence of web technologies, XML, etc. into mainstream BI, it probably makes sense to simplify and say “Anything you imagine can be done with appropriate BI tools”. “Appropriate” is the key word here, and it takes a good amount of experience (and some luck) to get it right.
In essence, my prescription for BI practitioners to stay relevant in 2020 is to be aware of developments in these 3 major areas, develop specific techniques / sub-skills for each one of them and, more importantly, respect & collaborate with the BI practitioner in the next cubicle (which translates to anywhere across the globe in this flat world), for he/she would bring in complementary strengths.

Monday 22 December 2008

Business Intelligence Challenge – Product Upgrades & Migrations, Validation – 5

Once the code has been moved to the target platform (Moving the Code), whether it’s an upgrade to a newer version or a migration to another platform, the next step is to validate the objects moved.
The validation process involves verifying or testing the objects in the target platform to ensure that they deliver the same output as the older objects in the source platform.
Validation is the key process by which the migration or upgrade is certified as successful; it is usually laborious and time-consuming. Let us see how the validation process can be broken into different steps and automated to save time and improve accuracy. We can look at the validation process as encompassing three steps:
  • Metadata Validation
  • Run Validation
  • Output Validation
Metadata Validation involves comparison of the metadata definitions between the existing source environment and the target environment. This requires that the metadata of the source and the target environment be captured for the comparison.
Steps Involved:
  • Capture the source metadata into a relational structure; we would already have captured the source metadata as part of Object Consolidation
  • Capture the target platform metadata in a similar way into a relational structure
  • Run SQL queries to automate the metadata comparison process
Metadata comparison would be done at the level of semantic layer definitions and individual reports. Let us take the case of metadata comparison between two semantic layers; in the case of Business Objects, the Universe is the semantic layer definition. After an upgrade from an older version of Business Objects to a newer version, the first level of metadata validation between the universes would be to check whether the object counts match (the classes, the objects, the filters), followed by a further comparison of their definitions.
If there are any differences when comparing the definitions and they fall within the known differences between the two versions (source & target), then they are acceptable; otherwise the upgraded object requires a code fix.
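The two levels of metadata comparison described above (object counts first, then definitions) can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module: the tables source_meta and target_meta, their columns and the sample rows are assumptions made for the example, and a real project would load them from whatever metadata extract the BI tool provides.

```python
import sqlite3

# Hypothetical metadata rows: (object_type, object_name, definition)
SOURCE_ROWS = [
    ("class", "Customer", ""),
    ("object", "Revenue", "SUM(sales.amount)"),
    ("filter", "This Year", "year = 2008"),
]
TARGET_ROWS = [
    ("class", "Customer", ""),
    ("object", "Revenue", "SUM(sales.amt)"),  # definition changed during the upgrade
    # the filter did not make it to the target
]


def build_demo():
    """Load source and target metadata into two relational tables."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("source_meta", SOURCE_ROWS), ("target_meta", TARGET_ROWS)):
        conn.execute(
            f"CREATE TABLE {table} (object_type TEXT, object_name TEXT, definition TEXT)"
        )
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)
    return conn


def count_by_type(conn, table):
    """Number of objects per object type in one metadata table."""
    sql = f"SELECT object_type, COUNT(*) FROM {table} GROUP BY object_type"
    return dict(conn.execute(sql))


def count_mismatches(conn):
    """First level: object types whose counts differ, as {type: (source, target)}."""
    src = count_by_type(conn, "source_meta")
    tgt = count_by_type(conn, "target_meta")
    return {
        t: (src.get(t, 0), tgt.get(t, 0))
        for t in sorted(set(src) | set(tgt))
        if src.get(t, 0) != tgt.get(t, 0)
    }


def definition_differences(conn):
    """Second level: objects present on both sides whose definitions differ."""
    sql = """
        SELECT s.object_type, s.object_name, s.definition, t.definition
        FROM source_meta s
        JOIN target_meta t
          ON s.object_type = t.object_type AND s.object_name = t.object_name
        WHERE s.definition <> t.definition
        ORDER BY s.object_type, s.object_name
    """
    return conn.execute(sql).fetchall()
```

Each reported difference would then be checked against the list of known differences between the two versions before being handed to the code-fixing team.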
Since we usually validate a report by the output it gives, the validation process is limited by the data fed in; we could miss scenarios such as a filter clause not being tested. Metadata validation can overcome this limitation of preparing data for every test scenario. If a report passes metadata validation, we can say with high confidence that it has been upgraded or migrated effectively.
Benefits:
  • Sets up a strong base of metadata understanding, as the objects on the different platforms have to be mapped, and the gaps between them identified, in order to run automated metadata validation
  • Improved accuracy in the validation process, overcomes the limitation in data preparation
  • Enables determining issues without running the report against the data
Run Validation is to perform a dry run of the reports in an automated way to determine whether the reports run (open) successfully or not.
When we give a report to a tester, the first thing he or she would do is run the report; if it doesn’t go through, the problem is reported or analysed further. We try to catch this problem up front in an automated way.
Steps Involved:
  • Have scripts to invoke the reports in batch mode; as soon as the objects are upgraded, invoke (open) all the upgraded reports in batch mode
  • Capture the errors while opening/running the report into a log
  • Classify them into two categories ‘reports that ran’ and ‘reports that failed’
Some reports could fail to open because of incorrect connection details, some because an object is not found, etc. This quick, automated run makes it possible to locate the failed reports immediately and also helps determine the reasons for the failures in one go. Consider limiting the data input while invoking the reports.
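A batch dry run of this kind can be sketched in a few lines of Python. The function that actually opens a report depends entirely on the BI tool's scripting interface, so here it is passed in as a parameter; fake_open_report and the report names are hypothetical stand-ins used only to make the example self-contained.

```python
def run_validation(report_names, open_report):
    """Open every report once; return (reports that ran, {failed report: error})."""
    ran, failed = [], {}
    for name in report_names:
        try:
            open_report(name)
            ran.append(name)
        except Exception as exc:  # log any error raised while opening/running
            failed[name] = f"{type(exc).__name__}: {exc}"
    return ran, failed


def group_by_reason(failed):
    """Consolidate failures by error message so one fix can cover many reports."""
    groups = {}
    for name, reason in failed.items():
        groups.setdefault(reason, []).append(name)
    return groups


def fake_open_report(name):
    """Hypothetical stand-in for the BI tool's 'open report' call."""
    if name == "sales_summary":
        raise ConnectionError("incorrect connection details")
    if name == "hr_headcount":
        raise LookupError("object not found")


if __name__ == "__main__":
    ran, failed = run_validation(
        ["inventory", "sales_summary", "hr_headcount"], fake_open_report
    )
    print("ran:", ran)
    print("failed:", group_by_reason(failed))
```

Grouping the failures by reason is what gives the code-fixing team a consolidated view of the run errors.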
Benefits:
  • Saves time in determining errors due to report opening or running
  • Enables building a common solution for the code fixing team, as the ‘run errors’ are consolidated
Output Validation is to validate the output delivered by the reports. There are two levels of output validation: Format Validation and Data Validation.
Format Validation is to check the format of the data presented, like font size, colour, bold and label location, which doesn’t relate to the data value.
Data Validation is to compare the data values of the two reports cell by cell.
Steps:
  • Run the source report and export the output data to Excel/Word
  • Run the target report and export the output data to Excel/Word
  • Compare the outputs for the format and the data
The best means of comparing the output of two reports is to export them to Excel and then perform a comparison between the two Excel files. If we can export the reports to Word format, we can leverage the Word compare utility; even an export to XML would enable the use of an available comparison utility. In the case of Excel, we would need to build a utility that can compare the two Excel sheets.
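As a rough idea of what such a comparison utility could look like, here is a minimal cell-by-cell comparison in Python. It assumes the two reports have been exported to CSV text rather than native Excel files (an assumption made to keep the sketch dependency-free), and it covers Data Validation only, not Format Validation. The sample report contents are invented.

```python
import csv
import io


def read_cells(csv_text):
    """Parse exported CSV text into a list of rows, each a list of cell strings."""
    return list(csv.reader(io.StringIO(csv_text)))


def compare_cells(source_rows, target_rows):
    """Return (row, column, source_value, target_value) for every differing cell.

    Row and column numbers are 1-based; a cell missing on one side is None.
    """
    diffs = []
    for r in range(max(len(source_rows), len(target_rows))):
        src_row = source_rows[r] if r < len(source_rows) else []
        tgt_row = target_rows[r] if r < len(target_rows) else []
        for c in range(max(len(src_row), len(tgt_row))):
            src = src_row[c] if c < len(src_row) else None
            tgt = tgt_row[c] if c < len(tgt_row) else None
            if src != tgt:
                diffs.append((r + 1, c + 1, src, tgt))
    return diffs


if __name__ == "__main__":
    source = "Region,Revenue\nNorth,150\nSouth,70\n"
    target = "Region,Revenue\nNorth,150\nSouth,75\n"
    for diff in compare_cells(read_cells(source), read_cells(target)):
        print(diff)
```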
The above three validations are some of the key aspects of validating semantic layer objects and reports; let me know your thoughts on other means of validation.

Monday 15 December 2008

The Esoteric World of Predictive Analytics

Let me start with the definition of Predictive Analytics as used in literature: “The nontrivial extraction of implicit, previously unknown and potentially useful information from data”. If that doesn’t sound esoteric enough, you are probably more advanced than this post gives you credit for!
For a BI practitioner, it is important to get an understanding of Predictive Analytics (also known as Data Mining) as this subject definitely deserves a place in the wide spectrum of Business Intelligence disciplines. BI at a broad level is about optimizing business through “Hindsight, Insight and Foresight”. Predictive analytics adds the powerful “Foresight” part to business decision making.
Most BI practitioners tend to equate statistics with predictive analytics, and this post explains why such a view is inaccurate. To understand this, let’s start at the very beginning (a la Alice in Wonderland). Broadly, this world is divided into 2 types of systems:
  • Physical Systems – Have causality and hence can be modeled mathematically with relative ease
  • Human Behavioral Systems – Lack causality and can be modeled only with specialized techniques
Predictive analytics for business decision making is all about modeling human behavioral systems.
Why is Traditional Statistics Insufficient?
Though the entry into predictive analytics requires that we understand the implications of traditional statistical analysis, statistics by itself is insufficient in the business context. Traditional statistical analysis allows us to understand general group behavior and is primarily concerned with common behavior within the group: the central tendencies.
In business we generally develop models to anticipate human behavior of some type. Human behavior is inconsistent and lacks causality, and distributions based on human behavior almost always violate the assumptions of traditional statistical analysis (like normal distribution of data, stability of mean and standard deviation, etc.). The strength of data mining comes from the ability of its techniques to deal with the tails of the distributions, rather than the central tendencies, and to deal with the realities of the data in a more precise manner.
In the realm of predictive analytics, we are concerned with modeling human behavior and hence are interested in the tail of our distribution: the small percentage of the population that responds to a campaign, commits a fraud, leaves our business or purchases the next service.
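A small, entirely made-up example shows why the central tendency is of little help here. Suppose 20 out of 1,000 customers respond to a campaign (the 2% response rate is an assumption chosen for illustration). A "model" that simply predicts the most common behavior for everyone looks very accurate, yet it finds none of the people we actually care about.

```python
def majority_baseline(outcomes):
    """Predict the most common outcome (the 'central tendency') for everyone."""
    majority = max(set(outcomes), key=outcomes.count)
    return [majority] * len(outcomes)


def accuracy(predicted, actual):
    """Share of customers for whom the prediction matches what happened."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def responders_found(predicted, actual):
    """Number of true responders (outcome 1) that the prediction identified."""
    return sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)


def demo():
    # 1 = responded to the campaign, 0 = did not (illustrative data only)
    actual = [1] * 20 + [0] * 980
    predicted = majority_baseline(actual)
    return accuracy(predicted, actual), responders_found(predicted, actual)


if __name__ == "__main__":
    acc, found = demo()
    print(f"accuracy={acc:.2%}, responders found={found}")
```

The techniques mentioned in this post earn their keep by ranking or isolating that small tail, which is something an average cannot do.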
Though there are specialized techniques used for Predictive Analytics (viz. Non-linear statistics, Induction Algorithms, Cluster Analysis, Neural Networks, to name a few), a BI practitioner is only expected to appreciate their usage in different business situations, prepare and model data as required by the tools, and interpret the results correctly (a much less daunting task indeed!).
Typically the model development process involves the following steps – a) Define Project, b) Select Data, c) Prepare Data, d) Transform Variables, e) Process Model, f) Validate Model, g) Implement Model. I will explain these steps in more detail in subsequent posts.
Fundamentally, an end-to-end BI view requires the practitioner to learn the concepts around statistics and predictive analytical techniques as available in tools (like say SQL Server Analysis Services) in addition to their technology bag of tricks around data integration, data modeling and OLAP.

Wednesday 10 December 2008

Business Objects Security

In the current business scenario, it is very important to secure the data and restrict which rows and columns of data each user can and cannot see. We can secure the rows of data with row level security; some people call this ‘Fine grained access control’. We can secure the columns of data with column level security; in Business Objects this is popularly called ‘Object level security’.
ROW LEVEL SECURITY
There are various ways through which the row level security can be implemented in a Business Objects environment.
One way is by securing the datamart. In this approach, the security policies and rules are written in the datamart itself. Technically, a security table can be created and maintained, holding the users / groups with their corresponding access rights. The security policies contain logic that compares the active logged-in user against the security table, and all users accessing the datamart are given access to their data only after the security policies have executed. We can also embed the security policies and rules in a view. A good example of row level security: non-managers cannot see the data of their co-workers, whereas managers can see the data of their subordinates. In Oracle (for example), we can create non-manager and manager views with the security rule (<security_table.user> = “USER”). The security views are imported into the Business Objects (BO) universe, and the reports use these security views through the universe. The main ADVANTAGE of securing your datamart is that your security rules can also be used by many other BI tools (Cognos, MicroStrategy), as the rules are built at the datamart and NOT at Business Objects.
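The security-table idea can be sketched as below, using Python's built-in sqlite3 module in place of a real datamart. The tables employee and security_table, their columns and the sample users are assumptions for illustration. SQLite has no notion of a logged-in database user, so the user name is passed in as a query parameter where an Oracle view would compare against USER.

```python
import sqlite3


def build_datamart():
    """Create a tiny datamart with a data table and a security table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (
            employee_id INTEGER PRIMARY KEY, employee_name TEXT, salary INTEGER
        );
        -- One row per (user, employee whose data that user may see)
        CREATE TABLE security_table (user_name TEXT, employee_id INTEGER);

        INSERT INTO employee VALUES
            (1, 'alice', 90000), (2, 'bob', 50000), (3, 'carol', 55000);
        INSERT INTO security_table VALUES
            ('alice', 1), ('alice', 2), ('alice', 3),  -- manager: self + subordinates
            ('bob', 2),                                -- non-manager: own row only
            ('carol', 3);
    """)
    return conn


def visible_rows(conn, logged_in_user):
    """Return only the rows the security table grants to the given user."""
    sql = """
        SELECT e.employee_name, e.salary
        FROM employee e
        JOIN security_table s ON s.employee_id = e.employee_id
        WHERE s.user_name = ?
        ORDER BY e.employee_name
    """
    return conn.execute(sql, (logged_in_user,)).fetchall()


if __name__ == "__main__":
    conn = build_datamart()
    print("bob sees:", visible_rows(conn, "bob"))
    print("alice sees:", visible_rows(conn, "alice"))
```

Because the rule lives in the join against the security table, any reporting tool that queries through it inherits the same restriction.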
The second way is by building the security rules in Business Objects. Here the security rules comparing the logged-in user against the security data can be written in a virtual table in Business Objects; this virtual table is nothing but a universe derived table. BO reports use the derived table to access the datamart tables. Alternatively, we can define security filters in a BO universe; these filters are called condition / filter objects in the BO universe world. With this approach, you can take maximum ADVANTAGE of the BO features; the disadvantage is that when you move to a different BI tool like Cognos, you need to rewrite the business security rules in the new tool.
In projects dealing with the migration of PeopleSoft transactional reporting to Business Objects analytical reporting, we can potentially reuse / import some security tables and security policies from PeopleSoft into our analytical datamart. These reusable components can save time in building the secured datamart and reporting environment.
COLUMN LEVEL SECURITY
Like row level security, we can implement column level security either at the datamart or in Business Objects. In the financial industry, business users do not want their revenue amounts, social security numbers, tax ID numbers and other sensitive columns to be shown to unauthorized users. In this case, we can mask the sensitive columns by showing a ‘restricted’ tag in their place. Non-sensitive columns like first name, last name, gender and age can be left as they are and shown to the end business user. This logic can be implemented in a Business Objects universe derived table or in datamart views using decode / ‘if then else’ / case statements.
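The masking logic with a case statement might look like the sketch below, again using Python's built-in sqlite3 module as a stand-in for the datamart. The customer table, the dummy social security number and the authorized flag are assumptions for illustration; in practice the flag would be derived from the logged-in user's access rights.

```python
import sqlite3

RESTRICTED = "RESTRICTED"  # tag shown in place of a sensitive value


def build_customer_table():
    """Create a one-row customer table with one sensitive column (ssn)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (first_name TEXT, last_name TEXT, ssn TEXT)")
    conn.execute("INSERT INTO customer VALUES ('Ann', 'Lee', '000-00-0000')")
    return conn


def read_customers(conn, authorized):
    """Return customer rows, masking the sensitive column for unauthorized users."""
    sql = """
        SELECT first_name,
               last_name,
               CASE WHEN :authorized = 1 THEN ssn ELSE :tag END AS ssn
        FROM customer
    """
    params = {"authorized": int(authorized), "tag": RESTRICTED}
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = build_customer_table()
    print(read_customers(conn, authorized=False))
    print(read_customers(conn, authorized=True))
```

The non-sensitive columns pass through untouched, so the same query serves both kinds of user.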
Alternatively, we can use the universe object restriction feature in the BO Designer to define restrictions on the universe objects. Whenever a business user tries to drag a restricted object from the universe, the restriction rules are invoked, authorization occurs, and access to the object is given to the end user only if he / she is authorized to access that object.
I’m signing off this BO security blog for now. The contents are based on my knowledge and BO experience in various projects. Thanks for reading. Please share your thoughts on this blog. Also, please let me know your project experiences pertaining to row and column level security in Business Objects.