tim laqua dot com Thoughts and Code from Tim Laqua

19Nov/092

Wouldn’t it be fun if Cubes could talk?

I didn't say "wouldn't it be useful" because after putting a test together, asking a cube questions with no context tends to return answers that it probably shouldn't have returned. In BI, it is incredibly important to understand what exactly it is you're asking for - if we just say we want "sales" and return an answer, nobody really knows what we meant by "sales." Sure, in various circles, "sales" means the same thing - but once you start talking to different areas, departments, etc - the meaning of the word starts to shift.

But I digress - asking cubes questions is still pretty fun and some of the random things it returns when you point it at your own cubes can be flat out hilarious.

Here's a few questions thrown at the Adventure Works cube in the Adventure Works DW 2008 Analysis Services database

what was sales for nichole nara
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Customer].[Customer].[Nichole Nara]} ON 0 FROM [Adventure Works])
$13,295.38
what were sales for nichole nara in cy 2004
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Date].[Calendar Year].[CY 2004]} ON 0 FROM (SELECT {[Customer].[Customer].[Nichole Nara]} ON 0 FROM [Adventure Works]))
$2,419.06
what were sales for nichole nara in cy 2004 in north america
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Sales Territory].[Sales Territory Group].[North America]} ON 0 FROM (SELECT {[Date].[Calendar Year].[CY 2004]} ON 0 FROM (SELECT {[Customer].[Customer].[Nichole Nara]} ON 0 FROM [Adventure Works])))
No Results Found
what were sales for nichole nara in cy 2004 for bikes
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Product].[Category].[Bikes]} ON 0 FROM (SELECT {[Date].[Calendar Year].[CY 2004]} ON 0 FROM (SELECT {[Customer].[Customer].[Nichole Nara]} ON 0 FROM [Adventure Works])))
$2,384.07
what were sales in north america in fy 2003 for bikes
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Product].[Category].[Bikes]} ON 0 FROM (SELECT {[Date].[Fiscal Year].[FY 2003]} ON 0 FROM (SELECT {[Sales Territory].[Sales Territory Group].[North America]} ON 0 FROM [Adventure Works])))
$1,739,306.95
sales for cy 2004 in cy q1, cy q2, and cy q4 in north america and canada
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Sales Territory].[Sales Territory Group].[North America],[Customer].[Country].[Canada]} ON 0 FROM (SELECT {[Date].[Calendar Quarter of Year].[CY Q1],[Date].[Calendar Quarter of Year].[CY Q2],[Date].[Calendar Quarter of Year].[CY Q4]} ON 0 FROM (SELECT {[Date].[Calendar Year].[CY 2004]} ON 0 FROM [Adventure Works])))
I don't know - Query Failure: Members belong to different hierarchies in the  function.
sales for cy 2004 in cy q1, cy q2, and cy q4 in north america
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Sales Territory].[Sales Territory Group].[North America]} ON 0 FROM (SELECT {[Date].[Calendar Quarter of Year].[CY Q1],[Date].[Calendar Quarter of Year].[CY Q2],[Date].[Calendar Quarter of Year].[CY Q4]} ON 0 FROM (SELECT {[Date].[Calendar Year].[CY 2004]} ON 0 FROM [Adventure Works])))
$3,967,371.16
sales between cy 2003 and cy 2004 for large resellers
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT [Large Resellers] ON 0 FROM (SELECT {[Date].[Calendar Year].[CY 2003]:[Date].[Calendar Year].[CY2004]} ON 0 FROM [Adventure Works]))
$19,561,960.04
sales for top 50 customers in fy 2004 in fy q1 and fy q3
SELECT {[Measures].[Internet Sales Amount]} ON 0  FROM (SELECT {[Date].[Fiscal Quarter of Year].[FY Q1],[Date].[Fiscal Quarter of Year].[FY Q3]} ON 0 FROM (SELECT {[Date].[Fiscal Year].[FY 2004]} ON 0 FROM (SELECT [Top 50 Customers] ON 0 FROM [Adventure Works])))
$177,263.41

The above questions returned pretty much exactly what I was looking for except for the one that threw us an error about members belonging to different hierarchies. That one was because "Canada" was in [Customer].[Country] and "north america" was in [Sales Territory].[Sales Territory Group]. Yeah, it was a dumb question because canada is in north america, but it illustrates that when you don't qualify things, you can get some pretty confusing results.

How's it work?

  1. Enumerate all the Attribute Values for each Attribute Hierarchy and store them as "keywords"
  2. Get some input
  3. Look for the aforementioned "keywords" in the input string
  4. Construct an MDX query based on what "keywords" were found in the string
  5. Make sure there aren't any extra (non-noise) words in the string
  6. Run Query
  7. Return Results

How the queries are constructed

  • Attribute Values just create a subcube - it creates a new nested subcube for each list, range, or named set it finds
  • It only supports one measure as written and it just selects that from whatever quagmire of subcubes it's created

There are tons of issues with this methodology when we have role playing dimensions because the attribute values will be the same for these dimensions - so only the first one enumerated will get in (there's a way to ignore certain dimensions in the app.config - ignore all but one role playing dimension). It throws away context and calls it "noise" - now that's dangerous; if you throw away the context and just look for keywords, it's pretty easy to answer a question that wasn't really asked (oops).

As far as configuring this thing goes - take a look at the app.config and throw in appropriate values for your setup. You'll have to tweak the *Pattern values to fit things you want to match or exclude in your particular cube. It's configured for the Adventure Works cube right now, but I didn't work with it much - I just flat out excluded a few dims and hierarchies that would obviously confuse it.

Also note that, as written, it needs an Analysis Services 2008 cube.

Let me know if you want to give it a try and I'll be more than happy to work with you on configuring/modifying it for your setup. Also post comments on interesting observations you have when asking cubes unqualified questions 😉

Download the VS 2008 Project: SmartCube.zip

Also note, the project uses an interface called IODoodad for getting questions and responding - this has two public methods, GetInput() and Respond(). The original theory here is that we could make an old-school IRC bot type thing. We did manage to hook it to Yammer for its IO, respond to private messages and public messages prefixed with @cube, etc. I didn't really explain the config for YammerIO (there's a custom config section for that that needs a bunch of OAuth stuff). Again, if you're interested in getting that working, let me know and I'll post instructions. The project above is set to use ConsoleIO.

I don't plan on doing any more development here, it was just a quick PoC to see what it looked like and get some initial feedback. If anyone's interested in moving forward with it and making it smarter (the project arguably should have been titled DumbCube at this point), let me know so I can post a link to your work and fork away 😉

Comments (2) Trackbacks (0)
  1. Interesting project, and as always, wink-filled.

    Now.. if it could only be hooked up to Yammer…

  2. This is really cool. There is probably a lot of value in it from a demo perspective.


Leave a comment

No trackbacks yet.