Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.


Data Warehouse from the Ground Up at SQL Saturday Orlando, FL on Oct. 10th

SQL Saturday #442 is upon us and yours truly will be presenting in Orlando, Florida on October 10th alongside Mitchell Pearson (b|t). The session is scheduled for 10:35 AM and will last until 11:35 AM. I’m very excited to be presenting at SQL Saturday Orlando this year, as it’ll be my first time presenting this session in person and my first time speaking at SQL Saturday Orlando! If you haven’t registered for this event yet, you need to do that. This event will be top notch!

My session is called Designing a Data Warehouse from the Ground Up. What if you could approach any business process in your organization and quickly design an effective and optimal dimensional model using a standardized step-by-step method? In this session I’ll discuss the steps required to design a unified dimensional model that is optimized for reporting and follows widely accepted best practices. We’ll also discuss how the design of our dimensional model affects a SQL Server Analysis Services solution and how the choices we make during the data warehouse design phase can make or break our SSAS cubes. You may remember that I did this session a while back for Pragmatic Works via webinar. I’ll be doing the same session at SQL Saturday Orlando but on-prem! ;)

So get signed up for this event now! It’s only 11 days away!


Create Date Dimension with Fiscal and Time

Here are three scripts that create a Date and Time Dimension and can add the fiscal columns too. Run the Dim Date script first to create the DimDate table. Make sure you change the start date and end date in the script to your preference. Then run the Add Fiscal Dates script to add the fiscal columns. Make sure you alter the Fiscal script to set the date offset amount. The comments in the script will help you with this.

This zip file contains three SQL scripts.

Create Dim Date

Create Dim Time

Add Fiscal Dates

These will create a Date Dimension table and allow you to run the Add Fiscal Dates script to add the fiscal columns if you desire. The Create Dim Time script will create a time dimension with a row for every second of the day, for those who need time-of-day analysis of their data.

Make sure you set the start date and end date in the create dim date script. Set the dateoffset in the fiscal script.
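The downloadable SQL scripts are not reproduced here, but the general idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names, the fiscal_month_offset parameter, and the convention that a fiscal year is named after the calendar year in which it ends are all hypothetical and may not match what the scripts actually do.

```python
from datetime import date, timedelta


def build_dim_date(start, end, fiscal_month_offset=0):
    """Return one dict per calendar day from start to end, inclusive.

    fiscal_month_offset is a hypothetical parameter: the number of months
    by which the fiscal year starts ahead of the calendar year
    (for example, 3 means the fiscal year begins on October 1).
    """
    rows = []
    current = start
    while current <= end:
        # Shift the month forward by the offset, wrapping into the next year.
        shifted = current.month - 1 + fiscal_month_offset
        fiscal_year = current.year + shifted // 12
        fiscal_month = shifted % 12 + 1
        rows.append({
            "DateKey": current.year * 10000 + current.month * 100 + current.day,
            "FullDate": current,
            "CalendarYear": current.year,
            "CalendarQuarter": (current.month - 1) // 3 + 1,
            "MonthNumber": current.month,
            "FiscalYear": fiscal_year,
            "FiscalMonth": fiscal_month,
            "FiscalQuarter": (fiscal_month - 1) // 3 + 1,
        })
        current += timedelta(days=1)
    return rows


# Assumed example range and offset; change these to suit your own calendar.
dim_date = build_dim_date(date(2015, 1, 1), date(2015, 12, 31), fiscal_month_offset=3)
```

A time dimension at one-second grain follows the same pattern and has 24 x 60 x 60 = 86,400 rows.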

Download the script here:



Data Mining Add-ins - Analyze Key Influencers Tool

  • 2 May 2012
  • Author: Mike Milligan

The Analyze Key Influencers tool shows how the column values in a data set might determine the values of a specified target column.  The process creates a temporary mining model in Microsoft SQL Server Analysis Services using the Naïve Bayes algorithm.  It then produces a Main Influencers report, which lists the key influencers for each distinct value of the target column.  You have the option of creating one or more additional Discrimination Reports that compare the influencers for any two distinct values of the target column.  The Discrimination Reports are only useful if your target column contains more than two distinct states.


The Naïve Bayes algorithm is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions.  The naïve part of the name comes from the fact that it assumes all attributes are unrelated to each other, so that each attribute contributes independently to the probabilities it predicts.  For example, a fruit may be considered an orange if it is round, has the color orange, has seeds, grows on a tree, etc.  Even if some of these features depend on the existence of other features, a Naïve Bayes classifier considers these properties to contribute independently to the probability that the fruit is an orange.  One advantage of this algorithm is that it requires only a small amount of data to estimate the parameters needed for classification.
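To make the independence assumption concrete, here is a minimal Python sketch of a Naïve Bayes classifier for categorical columns.  It is not the Microsoft Naïve Bayes implementation used by Analysis Services; the function names, the add-one smoothing, and the tiny fruit data set are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    distinct_values = defaultdict(set)   # attribute -> set of observed values
    for row in rows:
        for attribute, value in row.items():
            if attribute == target:
                continue
            value_counts[(attribute, row[target])][value] += 1
            distinct_values[attribute].add(value)
    return class_counts, value_counts, distinct_values


def predict_proba(model, features):
    """Return P(class | features) under the independence assumption."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, count in class_counts.items():
        score = count / total  # prior P(class)
        for attribute, value in features.items():
            # Add-one (Laplace) smoothing avoids zero probabilities.
            numerator = value_counts[(attribute, cls)][value] + 1
            denominator = count + len(distinct_values[attribute])
            score *= numerator / denominator  # multiply by P(value | class)
        scores[cls] = score
    normalizer = sum(scores.values())
    return {cls: score / normalizer for cls, score in scores.items()}


# Made-up training rows, purely for illustration.
fruits = [
    {"Shape": "round", "Color": "orange", "Fruit": "orange"},
    {"Shape": "round", "Color": "orange", "Fruit": "orange"},
    {"Shape": "round", "Color": "red", "Fruit": "apple"},
    {"Shape": "round", "Color": "green", "Fruit": "apple"},
    {"Shape": "long", "Color": "yellow", "Fruit": "banana"},
]
model = train_naive_bayes(fruits, "Fruit")
probabilities = predict_proba(model, {"Shape": "round", "Color": "orange"})
```

Each attribute simply multiplies its own P(value | class) into the score, regardless of the other attributes.  That multiplication is the "naïve" part.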


This blog post will work through two examples using the sample data provided with the Microsoft SQL Server 2012 Data Mining Add-ins and another example using data from the Contoso sample database.


Example One

Which properties of a customer in the sample data help to predict a customer's level of education?

  • Open the DMAddins_SampleData.xlsx file. 
  • Select the Table Analysis Tools sample sheet, highlight a cell within the table so the ribbon at the top displays the Table Tools, Analyze ribbon, and click the Analyze Key Influencers button. 
  • Select the column Education to analyze for key factors and click the link that says 'Choose columns to be used for analysis.' 
  • Uncheck the ID column.  This is just a sequential number that reflects nothing more than the order in which the row was inserted into the table.  We also want to uncheck any other columns that have nothing to do with the customer's education level, to streamline our analysis and improve our accuracy.  Let's also uncheck the Purchased Bike column.  Click OK, and then Run.
  • Once it finishes thinking, move the Discrimination based on key influencers dialog out of the way for a moment.


The Key Influencers Report for Education shows which columns, and which values of those columns, have a significant impact on the value of the Education column.  According to this report, people between the ages of 37 and 46 who work in Management are very likely to have a bachelor's degree.  People who have only one car and work in a clerical profession are very likely to have attended only some college.  People with two cars who work in a manual occupation and earn less than about 39K per year are likely to have attended only high school.  Similar characteristics apply to those who received only a partial high school education.  People who do not own an automobile are very likely to have completed a graduate degree.
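One rough way to build intuition for a report like this is to compare how often a target state occurs among rows with a given column value against how often it occurs overall.  The Python sketch below ranks column values by that ratio (often called lift).  This is an assumption-laden illustration, not the scoring formula the add-in uses; the function name, the min_support threshold, and the sample rows are all hypothetical.

```python
from collections import Counter


def rank_influencers(rows, target, target_state, min_support=2):
    """Rank (attribute, value) pairs by how much they raise P(target_state)."""
    baseline = sum(row[target] == target_state for row in rows) / len(rows)
    if baseline == 0:
        return []  # the target state never occurs, so nothing can influence it
    seen = Counter()  # (attribute, value) -> rows with that value
    hits = Counter()  # (attribute, value) -> rows with that value and the target state
    for row in rows:
        for attribute, value in row.items():
            if attribute == target:
                continue
            seen[(attribute, value)] += 1
            if row[target] == target_state:
                hits[(attribute, value)] += 1
    ranked = []
    for key, count in seen.items():
        if count < min_support:
            continue  # skip values too rare to say anything about
        conditional = hits[key] / count
        ranked.append((key[0], key[1], conditional / baseline))
    return sorted(ranked, key=lambda item: item[2], reverse=True)


# Made-up rows, not taken from DMAddins_SampleData.xlsx.
customers = [
    {"Cars": 0, "Occupation": "Professional", "Education": "Graduate Degree"},
    {"Cars": 0, "Occupation": "Management", "Education": "Graduate Degree"},
    {"Cars": 1, "Occupation": "Clerical", "Education": "Partial College"},
    {"Cars": 1, "Occupation": "Clerical", "Education": "Partial College"},
    {"Cars": 2, "Occupation": "Manual", "Education": "High School"},
    {"Cars": 2, "Occupation": "Manual", "Education": "High School"},
]
top = rank_influencers(customers, "Education", "Graduate Degree")
```

In this toy data, owning no cars comes out on top for Graduate Degree because every row with Cars = 0 has that education level, while only a third of all rows do.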


Now, back to the Discrimination report dialog that we moved out of the way.  Let's run a discrimination report that compares those who have graduate degrees with those who completed only part of high school.





We can add as many discrimination reports as we want. 




The Table Analysis Tools Sample worksheet contains only 1,000 rows.  When we go through the exact same steps on the Source Data sheet, which has 10,000 rows, we get remarkably similar results.



Example Two

Next, I'll run the tool to see what factors most strongly influence whether or not the customer is likely to purchase a bike.


  • Give the Source Data worksheet focus.  Click the Analyze Key Influencers button.
  • Select BikeBuyer as the column to analyze.  Uncheck ID from the columns to analyze and run the analysis.
  • Go ahead and run a Discrimination report against the Yes/No values.  This will demonstrate that this report is useless for target columns with only two values.



The Key Influencers Report for BikeBuyer shows us that the strongest predictors that a customer is likely to purchase a bike are that the customer doesn't own any cars and is between the ages of 36 and 46.  The strongest predictors that a customer will not buy a bike are that the customer owns two cars and is 64 or older.


The discrimination report shows us essentially the same thing.




Example Three

For the next example, I have imported the V_Customer view from the Contoso Retail demo database which you can download from Microsoft. 


If you import the data using the From Other Sources button on the Data ribbon, it will automatically be formatted as a table, which is required.  If you import your data from a CSV or copy and paste it into a spreadsheet, it may not be formatted as a table.


  • Once the data is in Excel and formatted as a table, click the Analyze Key Influencers button and select HouseOwnerFlag as the column to analyze. 
  • Click the Choose columns to be used for analysis link, uncheck CustomerKey and Consumption, and then run the analysis.




Here we see that MaritalStatus has the most influence on the value of HouseOwnerFlag.  We also see that not having any children is a strong indicator for not owning a home.


I hope this explains how to use the Analyze Key Influencers tool sufficiently.  If you have any questions, please use the comments section below. 


Here are some additional links:

Analyze Key Influencers Video Tutorial

Microsoft BI - Data Mining - Analyze Key Influencers

Categories: Blogs