Business Intelligence Blogs

View blogs by industry experts on topics such as SSAS, SSIS, SSRS, Power BI, Performance Tuning, Azure, Big Data and much more! You can also sign up to post your own business intelligence blog.


DirectQuery in Power BI Desktop

In the latest Power BI Desktop, a new Preview feature was released that allows you to connect using DirectQuery to either SQL Server or Azure SQL Database.  DirectQuery is a really neat feature that allows you to point to the live version of the data source rather than importing the data into a data model in Power BI Desktop.

Normally, when you want to get an updated dataset in Power BI Desktop, you have to manually click the refresh button (this can be automated in the Power BI Service), which initiates a full reimport of your data.  How long this refresh takes depends on how much data you have.  For instance, if you're refreshing a very large table, you may be waiting quite a while to see the newly added data.

With DirectQuery, data imports are not required because you're always looking at a live version of the data.  Let me show you how it works!
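To make the contrast between the two modes concrete outside of Power BI, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for SQL Server, and the helper names are made up for illustration; it is not how Power BI itself is implemented.

```python
import sqlite3

# In-memory SQLite stands in for the SQL Server source (an assumption for illustration).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE DimProduct (ProductKey INTEGER, Name TEXT)")
source.execute("INSERT INTO DimProduct VALUES (1, 'Road Bike')")


def import_snapshot(conn):
    """Import mode: copy every row once; the copy goes stale until refreshed."""
    return conn.execute("SELECT ProductKey, Name FROM DimProduct").fetchall()


def direct_query_count(conn):
    """DirectQuery-style: send a query to the source each time a value is needed."""
    return conn.execute("SELECT COUNT(*) FROM DimProduct").fetchone()[0]


imported = import_snapshot(source)

# A new row arrives in the source after the import.
source.execute("INSERT INTO DimProduct VALUES (2, 'Mountain Bike')")

stale_count = len(imported)                      # still reflects the old snapshot
live_count = direct_query_count(source)          # sees the new row immediately
refreshed_count = len(import_snapshot(source))   # a full re-import catches up
```

The imported copy only sees the new row after a full re-import, while the live query sees it right away, at the cost of a round trip to the source on every request.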

Turning on the DirectQuery Preview

Now, because DirectQuery is still in Preview, you must first activate the feature by navigating to File->Options and settings->Options->Preview Features and then checking DirectQuery for SQL Server and Azure SQL Database.


Once you click OK you may be prompted to restart the Power BI Desktop to utilize the feature.

Using DirectQuery in Power BI Desktop

Next, make a connection to either an on-premises SQL Server or an Azure SQL database.

Go to the Home ribbon and select Get Data then SQL Server.


Provide your Server and Database names then click OK. ***Do not use a SQL statement.  It is not currently supported with DirectQuery***


From the Navigator pane choose the table(s) you would like to use.  I’m just going to pick the DimProduct table for this example and then click Load.  You could select Edit and that would launch the Query Editor where you could manipulate the extract.  This would allow you to add any business rules needed to the data before visualizing it.


Next you will be prompted to select how you want to connect to the data. Again, Import means the data


The Big Data Blog Series

Over the last few years I’ve been speaking a lot on the subject of Big Data. I started by giving an intermediate session called “Show Me Whatcha’ Workin’ With”. This session was designed for people who had attended a one hour introductory session that showed you how to load data, to look at possible applications …

Data Mining Add-ins - Analyze Key Influencers Tool

  • 2 May 2012
  • Author: Mike Milligan

The Analyze Key Influencers tool is used to show how column values in a data set might determine the values of a specified target column.  The process creates a temporary mining model in Microsoft SQL Server Analysis Services using the Naïve Bayes algorithm.  It then produces a Main Influencers report, which shows the key influencers for each distinct value of the target column.  You have the option of creating one or more additional Discrimination Reports that compare the influencers for any two distinct values of the target column.  The Discrimination Reports are only useful if your target column contains more than two distinct states.


The Naïve Bayes algorithm is a simple probabilistic classifier based on applying Bayes' theorem with strong independence assumptions.  The naïve part of the name comes from the fact that it assumes that all attributes are unrelated to each other and that each attribute contributes independently to the probabilities that it predicts.  For example, a fruit may be considered an orange if it is round, has the color orange, has seeds, grows on a tree, etc.  Even if any of these features depend on the existence of other features, a Naïve Bayes classifier considers these properties to contribute independently to the probability that the fruit is an orange.  One advantage of this algorithm is that it requires only a small amount of training data to estimate the parameters (the means and variances of the variables) required for classification.
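To illustrate the independence assumption, here is a minimal Naïve Bayes sketch in Python for categorical attributes. The fruit rows are made up, add-one smoothing is my own choice, and this is not how Analysis Services implements the algorithm; it only shows the idea of multiplying a class prior by one independent probability per attribute.

```python
from collections import Counter, defaultdict


def train(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)   # (attribute, class) -> Counter of values
    distinct = defaultdict(set)           # attribute -> set of values seen
    for row in rows:
        for attr, value in row.items():
            if attr != target:
                value_counts[(attr, row[target])][value] += 1
                distinct[attr].add(value)
    return class_counts, value_counts, distinct


def predict(model, features):
    """Return normalized P(class | features) under the independence assumption."""
    class_counts, value_counts, distinct = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior probability of the class
        for attr, value in features.items():
            # Each attribute contributes its own factor (add-one smoothing).
            score *= (value_counts[(attr, cls)][value] + 1) / (n + len(distinct[attr]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}


# Made-up training rows for illustration only.
rows = [
    {"shape": "round", "color": "orange", "fruit": "orange"},
    {"shape": "round", "color": "orange", "fruit": "orange"},
    {"shape": "round", "color": "red", "fruit": "apple"},
    {"shape": "long", "color": "yellow", "fruit": "banana"},
]
model = train(rows, "fruit")
probs = predict(model, {"shape": "round", "color": "orange"})
best = max(probs, key=probs.get)
```

Here `best` is the class with the highest probability for a round, orange-colored fruit. Shape and color are scored separately, even though in real data they may well be related.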


This blog post will work through two examples using the sample data provided with the Microsoft SQL Server 2012 Data Mining Add-ins and another example using data from the Contoso sample database.


Example One

Which properties of a customer in the sample data help to predict a customer's level of education?

  • Open the DMAddins_SampleData.xlsx file. 
  • Select the Table Analysis Tools sample sheet, highlight a cell within the table so the ribbon at the top displays the Table Tools, Analyze ribbon, and click the Analyze Key Influencers button. 
  • Select the column Education to analyze for key factors and click the link that says 'Choose columns to be used for analysis.' 
  • Uncheck the ID column.  This is just a sequential number that reflects only the order in which the row was inserted into the table.  We also want to uncheck any other columns that have nothing to do with the customer's education level to streamline our analysis and improve our accuracy.  Let's also uncheck the purchased bike column.  Click OK, and then Run.
  • Once it finishes thinking, move the Discrimination based on key influencers dialog out of the way for a moment.


The Key Influencers Report for Education shows which columns and which values of those columns have a significant impact on the value of the Education column.  According to this report, people between the ages of 37 and 46 who work in Management are very likely to have a Bachelor's degree.  People who have only one car and work in a clerical profession are very likely to have attended only some college.  People with two cars who work in a manual occupation and earn less than about 39K per year are likely to have attended only high school.  Similar characteristics apply to those who received only a partial high school education.  People who do not own an automobile are very likely to have completed a graduate degree.
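The idea behind such a report can be sketched in a few lines of Python. This is only an approximation under my own assumptions: the rows are made up, and the score used here (how much a column value raises the probability of a target value over the overall baseline) is not the relative impact figure the add-in computes.

```python
from collections import Counter


def influencers(rows, target, favored):
    """Rank (column, value) pairs by how much they raise P(target == favored)."""
    baseline = sum(r[target] == favored for r in rows) / len(rows)
    if baseline == 0:
        return []  # the favored value never occurs, so nothing can be ranked
    seen = Counter()   # rows containing each (column, value)
    hits = Counter()   # of those rows, how many have the favored target value
    for r in rows:
        for col, val in r.items():
            if col == target:
                continue
            seen[(col, val)] += 1
            if r[target] == favored:
                hits[(col, val)] += 1
    ranked = [(key, hits[key] / seen[key] / baseline) for key in seen]
    return sorted(ranked, key=lambda kv: kv[1], reverse=True)


# Made-up rows for illustration only.
rows = [
    {"Cars": 0, "Occupation": "Management", "Education": "Graduate Degree"},
    {"Cars": 0, "Occupation": "Management", "Education": "Graduate Degree"},
    {"Cars": 1, "Occupation": "Clerical", "Education": "Partial College"},
    {"Cars": 2, "Occupation": "Manual", "Education": "High School"},
    {"Cars": 2, "Occupation": "Manual", "Education": "High School"},
    {"Cars": 1, "Occupation": "Management", "Education": "Bachelors"},
]
top = influencers(rows, "Education", "Graduate Degree")[0]
```

With this toy data, owning no cars comes out as the strongest influencer for a graduate degree, which mirrors the shape of the report, though not its actual numbers.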


Now, back to the Discrimination report dialog that we moved out of the way.  Let's run a discrimination report that compares those with graduate degrees with those who attended only some high school.
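As a rough sketch of what a discrimination report compares, the following Python scores each column value by how much more common it is in one target state than in the other. The rows and the scoring rule are my own assumptions for illustration, not the add-in's actual calculation.

```python
from collections import Counter


def discriminate(rows, target, state_a, state_b):
    """Score each (column, value): positive favors state_a, negative favors state_b."""
    counts = {state_a: Counter(), state_b: Counter()}
    totals = Counter()
    for r in rows:
        state = r[target]
        if state not in counts:
            continue  # rows in any other state are ignored
        totals[state] += 1
        for col, val in r.items():
            if col != target:
                counts[state][(col, val)] += 1
    if not totals[state_a] or not totals[state_b]:
        return []  # nothing to compare if either state is missing
    keys = set(counts[state_a]) | set(counts[state_b])
    scored = [
        (key, counts[state_a][key] / totals[state_a] - counts[state_b][key] / totals[state_b])
        for key in keys
    ]
    return sorted(scored, key=lambda kv: kv[1])


# Made-up rows for illustration only.
rows = [
    {"Cars": 0, "Income": "high", "Education": "Graduate Degree"},
    {"Cars": 0, "Income": "high", "Education": "Graduate Degree"},
    {"Cars": 2, "Income": "low", "Education": "Partial High School"},
    {"Cars": 2, "Income": "low", "Education": "Partial High School"},
    {"Cars": 1, "Income": "high", "Education": "Bachelors"},
]
report = discriminate(rows, "Education", "Graduate Degree", "Partial High School")
scores = dict(report)
```

Values seen only in rows with other target states (here, the single Bachelors row) never enter the comparison, which is why the report is specific to the two states you pick.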





We can add as many discrimination reports as we want. 




The Table Analysis Tools Sample worksheet contains only 1,000 rows.  When we go through the exact same steps on the Source Data sheet, which has 10,000 rows, we get remarkably similar results.



Example Two

Next, I'll run the tool to see what factors most strongly influence whether or not the customer is likely to purchase a bike.


  • Give the Source Data worksheet focus.  Click the Analyze Key Influencers button.
  • Select BikeBuyer as the column to analyze.  Uncheck ID from the columns to analyze and run the analysis.
  • Go ahead and run a Discrimination report against the Yes/No values.  This will demonstrate that this report is useless for target columns with only two values.



The Key Influencers Report for BikeBuyer shows us that the strongest predictors that a customer will purchase a bike are owning no cars and being between the ages of 36 and 46.  The strongest predictors that a customer will not buy a bike are owning two cars and being 64 or older.


The discrimination report shows us essentially the same thing.




Example Three

For the next example, I have imported the V_Customer view from the Contoso Retail demo database, which you can download from Microsoft.


If you import the data using the Data ribbon's From Other data sources button, Excel will automatically format it as a table, which is required.  If you import your data from a CSV or copy and paste it into a spreadsheet, it may not be formatted as a table.


  • Once the data is in Excel and formatted as a table, click the Analyze Key Influencers button and select HouseOwnerFlag as the column to analyze.
  • Click the Choose columns to be used for analysis link, uncheck CustomerKey and Consumption, and run the analysis.




Here we see that MaritalStatus has the most influence on the value of HouseOwnerFlag.  We also see that not having any children is a strong indicator of not owning a home.


I hope this explains how to use the Analyze Key Influencers tool sufficiently.  If you have any questions, please use the comments section below. 


Here are some additional links:

Analyze Key Influencers Video Tutorial

Microsoft BI - Data Mining - Analyze Key Influencers
