Showing posts with label Predictive Analytics. Show all posts

Thursday, 5 September 2013

Business Intelligence - Are you game for the Moneyball Process?

The importance of data-driven decision making, and of looking at data from different angles, was popularized among the general public by the Hollywood movie ‘Moneyball’, which stars Brad Pitt and is based on a true story.

A quick snapshot of the storyline:
“Oakland Athletics general manager Billy Beane (Brad Pitt) is upset by his team’s loss to the New York Yankees in the 2001 postseason. With the impending departure of star players, Beane attempts to devise a strategy for assembling a competitive team for 2002 but struggles to overcome Oakland’s limited payroll. Billy turns baseball on its ear when he uses statistical data to analyze and place value on the players (though not star players) he picks for the team. As a result, the Oakland Athletics set a record of 20 wins in a row. A similar strategy was adopted by the Boston Red Sox, who won the World Series in 2004, their first title since 1918.”

(Sources: Wikipedia, IMDb)

What Beane did differently to turn the game around was to apply sabermetrics (the statistical analysis of data points in the game of baseball). This analytics-led approach helped Beane question traditional measures of evaluation such as RBI (Runs Batted In) and batting average. It took in-depth analysis to conclude that matches were won not by players with a higher batting average but by those with a higher on-base percentage (OBP) and slugging percentage (SLG). Beane formed a team based on these new metrics and other parameters.
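To make the difference between these metrics concrete, here is a minimal sketch in Python using the standard definitions of batting average, OBP and SLG. The two players and all of their season totals are hypothetical numbers invented for illustration; they are not taken from the film or from any real roster.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season totals for one batter (all figures here are hypothetical)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs


def batting_average(b: BattingLine) -> float:
    # AVG = H / AB
    return b.hits / b.at_bats


def on_base_percentage(b: BattingLine) -> float:
    # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    reached = b.hits + b.walks + b.hit_by_pitch
    chances = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    return reached / chances


def slugging_percentage(b: BattingLine) -> float:
    # SLG = total bases / AB, where a single counts 1 base and a home run 4
    total_bases = (b.singles + 2 * b.doubles
                   + 3 * b.triples + 4 * b.home_runs)
    return total_bases / b.at_bats


# Hypothetical contact hitter: many singles, few walks.
player_a = BattingLine(at_bats=500, singles=130, doubles=15, triples=2,
                       home_runs=3, walks=20, hit_by_pitch=1,
                       sacrifice_flies=4)
# Hypothetical patient power hitter: fewer hits, many walks and home runs.
player_b = BattingLine(at_bats=450, singles=70, doubles=25, triples=1,
                       home_runs=25, walks=90, hit_by_pitch=5,
                       sacrifice_flies=5)

for name, p in (("A", player_a), ("B", player_b)):
    print(f"Player {name}: AVG={batting_average(p):.3f} "
          f"OBP={on_base_percentage(p):.3f} "
          f"SLG={slugging_percentage(p):.3f}")
```

In this made-up comparison, player A has the better batting average, yet player B gets on base more often and gains more bases per at-bat. That is the kind of gap a ranking based on batting average alone would hide.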

Monday, 15 December 2008

The Esoteric World of Predictive Analytics

Let me start with the definition of Predictive Analytics as used in the literature: “The nontrivial extraction of implicit, previously unknown and potentially useful information from data”. If that doesn’t sound esoteric enough, you are probably more advanced than this post gives you credit for!
For a BI practitioner, it is important to get an understanding of Predictive Analytics (also known as Data Mining) as this subject definitely deserves a place in the wide spectrum of Business Intelligence disciplines. BI at a broad level is about optimizing business through “Hindsight, Insight and Foresight”. Predictive analytics adds the powerful “Foresight” part to business decision making.
Most BI practitioners tend to equate statistics with predictive analytics, and this post explains why such a view is inaccurate. To understand this, let’s start at the very beginning (a la Alice in Wonderland). Broadly, this world is divided into two types of systems:
  • Physical Systems – Have causality and hence can be modeled mathematically with relative ease
  • Human Behavioral Systems – Lack causality and can be modeled only with specialized techniques
Predictive analytics for business decision making is all about modeling human behavioral systems.
Why Is Traditional Statistics Insufficient?
Though the entry into predictive analytics requires that we understand the implications of traditional statistical analysis, statistics by itself is insufficient in the business context. Traditional statistical analysis allows us to understand the general group behavior and is primarily concerned with common behavior within the group – the central tendencies.
In business we generally develop models to anticipate human behavior of some type. Human behavior is inconsistent and lacks causality, and distributions based on human behavior almost always violate the assumptions of traditional statistical analysis (such as normally distributed data and a stable mean and standard deviation). The strength of data mining comes from the ability of the associated techniques to deal with the tails of the distributions, rather than the central tendencies, and from the techniques’ ability to deal with the realities of the data in a more precise manner.
In the realm of predictive analytics, we are concerned with modeling human behavior and hence are interested in the tail of our distribution: the small percentage of the population that responds to a campaign, commits a fraud, leaves our business or purchases the next service.
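As a rough illustration of why the normality assumption matters for the tail, the sketch below generates a synthetic, right-skewed "customer spend" sample and compares how much of it lies beyond three standard deviations above the mean with what a normal distribution of the same mean and standard deviation would predict. The lognormal shape, the sample size and the variable names are all assumptions chosen for the example, not data from any real business.

```python
import random
from statistics import NormalDist, mean, median, stdev

# Synthetic, right-skewed "spend per customer" values (an assumption made
# purely for illustration; real behavioral data will look different).
rng = random.Random(42)
spend = [rng.lognormvariate(0.0, 1.0) for _ in range(50_000)]

avg = mean(spend)
mid = median(spend)
sd = stdev(spend)

# Cutoff for the "tail": three standard deviations above the mean.
cutoff = avg + 3 * sd

# Share of customers actually observed beyond the cutoff.
observed_tail = sum(x > cutoff for x in spend) / len(spend)

# Share a normal distribution with the same mean and sd would predict.
normal_tail = 1 - NormalDist(mu=avg, sigma=sd).cdf(cutoff)

print(f"mean={avg:.2f}  median={mid:.2f}  sd={sd:.2f}")
print(f"observed share beyond cutoff:   {observed_tail:.4%}")
print(f"normal-theory share beyond it:  {normal_tail:.4%}")
```

On skewed data like this, the mean sits well above the median and the normal approximation badly understates how many cases fall in the extreme tail, which is exactly the segment that campaign response, fraud and churn models care about.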
Though there are specialized techniques used for Predictive Analytics (viz. non-linear statistics, induction algorithms, cluster analysis and neural networks, to name a few), a BI practitioner is only expected to appreciate their usage in different business situations, prepare and model data as required by the tools, and interpret the results correctly (a much less daunting task indeed!).
Typically the model development process involves the following steps: a) Define Project, b) Select Data, c) Prepare Data, d) Transform Variables, e) Process Model, f) Validate Model, g) Implement Model. I will explain these steps in more detail in subsequent posts.
Fundamentally, an end-to-end BI view requires the practitioner to learn the concepts behind statistics and the predictive analytical techniques available in tools (such as SQL Server Analysis Services), in addition to their technology bag of tricks around data integration, data modeling and OLAP.