tim laqua dot com - Thoughts and Code from Tim Laqua

23Jan/13

Hiding SSAS Attributes with PowerShell

The challenge here was that we have a cube with some great data as well as some low-level detail (customer detail, think PII) that we want to expose to a larger audience. The problem is that the low-level detail is not something the new audience needs (or, in some cases, is allowed) to have access to. The obvious answer here is Dimension security - why not use that? Two reasons. First, Dimension security is slow. Second, even if it weren't slow, hiding all the data in many (30+) attributes is tedious to set up, and whenever new attributes were added we would have to remember to disable access to those as well. To be clear, we're not just hiding attributes here - we're creating an entire copy of the existing SSAS database and hiding attributes in the copy.

# Import Required Libraries
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") >$NULL
 
# Connect to server
$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect('localhost')
 
# Make a metadata copy of the existing cube
$newdb = $server.Databases.GetByName('POS Data').Clone()
 
# Change ID and Name of SSAS DB
$newdb.Name = 'POS Data - Limited'
$newdb.ID = 'POS Data - Limited'
 
# Drop the existing SSAS DB if it exists (calling Drop() on a missing database throws)
if ($server.Databases.ContainsName('POS Data - Limited')) {
    $server.Databases['POS Data - Limited'].Drop()
}
 
# Add the new copy to the server
$server.Databases.Add($newdb)
 
# Sync our copy of the new database with the server's copy
$newdb.Update([Microsoft.AnalysisServices.UpdateOptions]::ExpandFull)
 
# Grab the cube we want to work with from the new database
$cube = $newdb.Cubes.GetByName('POS Data')
 
# Hide the Location Dimension
$cube.Dimensions.GetByName('Location').Visible = $false
 
# Hide all attributes and hierarchies in the Customer dimension
$cube.Dimensions.GetByName('Customer').Attributes |  %{$_.AttributeHierarchyEnabled=$false; $_.AttributeHierarchyVisible=$false;}
$cube.Dimensions.GetByName('Customer').Hierarchies | %{$_.Visible=$false; $_.Enabled=$false;}
 
# Enable the key attribute in the customer dimension - it won't work if the key isn't enabled
$cube.Dimensions.GetByName('Customer').Attributes['Dim Customer'].AttributeHierarchyEnabled=$true
 
# Enable the Market attribute in the customer dimension
$cube.Dimensions.GetByName('Customer').Attributes['Market'] | %{$_.AttributeHierarchyEnabled=$true; $_.AttributeHierarchyVisible=$true;}
 
# Hide the Location Database attribute in the Product dimension
$cube.Dimensions.GetByName('Product').Attributes['Location Database'] | %{$_.AttributeHierarchyEnabled=$false; $_.AttributeHierarchyVisible=$false;}
 
# Add a new member to the role granting Read Only permissions in the cube
$newMember = new-object Microsoft.AnalysisServices.RoleMember("domain\tlaqua")
$newdb.Roles['Role 1'].Members.Add($newMember)
 
# Push our updates to the server
$newdb.Update([Microsoft.AnalysisServices.UpdateOptions]::ExpandFull)
 
# Process the new database
$newdb.Process([Microsoft.AnalysisServices.ProcessType]::ProcessFull)
 
# Disconnect from the server
$server.Disconnect()
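As a quick sanity check (run it before the Disconnect above, while $cube is still in scope), you can list anything left visible in the Customer dimension. A minimal sketch - the .Attribute.Name drill-through is our assumption about how AMO exposes the underlying dimension attribute:

# List any attributes still visible in the Customer cube dimension
$cube.Dimensions.GetByName('Customer').Attributes |
    Where-Object { $_.AttributeHierarchyVisible } |
    ForEach-Object { $_.Attribute.Name }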

This approach has two notable downfalls. First, you have to think up a different database name, because the original db with the low-level detail still exists on the server. Second, you have to ProcessFull the clone that you made - it doesn't clone the data, it clones the metadata. All in all, it works great for us; this particular cube is only processed once a week, and the 20 minutes we lose processing data into the clone is more than acceptable (it looks really clean to the users as well).

12Jun/12

Upgrading from SQL 2008 R2 Enterprise to SQL 2012 Business Intelligence

With all the new licensing considerations in SQL 2012, everyone's busy trying to figure out the cheapest license they can get by with. For one of our instances, we decided that rather than Enterprise, we could run the Business Intelligence edition to save some cash. That upgrade path is explicitly listed in the Supported Version and Edition Upgrades documentation.

What happened?

We clicked through the upgrade wizard as usual; it got about 85% of the way through and then threw an error:

Wait on the Database Engine recovery handle failed. Check the SQL Server error log for potential causes.

We clicked OK, then it finished, and the results showed that the SQL engine and a few other features failed the upgrade. Great.

15Mar/12

Automating PowerPivot Data Refresh in Sharepoint 2010

Of course we want to do this, it's a fundamental requirement. Dear Sharepoint, please refresh my PowerPivot when the data is ready, not on some arbitrary schedule. Until this functionality is built-in, people will continue to hack away at it to figure out how to make this happen. Is this supported? Certainly not - so continue at your own risk.

So the method we'll be using here is just to mimic what Sharepoint does when you check the "Also refresh as soon as possible" box on the schedule configuration page. To accomplish this, we open Profiler, connect it to whatever instance our Sharepoint PowerPivot database is hosted on, and filter TextData to "%Schedule%" (or filter to just the Sharepoint PowerPivot database - SP2010_PowerPivot_Service_Application in our case). Then open up your test PowerPivot schedule configuration, check the box, click OK, wait for the schedule history to come back up, and stop the trace. Now you know you've got what you need, you just have to find it:
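Once you've found the captured statement, automating the refresh is just a matter of replaying it whenever your data is actually ready. A minimal sketch - the instance name is a placeholder and the SQL itself is deliberately left blank, since what you capture is specific to your farm and PowerPivot version:

# Replay the statement captured from the Profiler trace on our own trigger.
# 'SPSQL01' is a placeholder instance name; paste your own captured statement below.
$capturedSql = @"
-- paste the statement captured from the Profiler trace here
"@
Invoke-Sqlcmd -ServerInstance 'SPSQL01' -Database 'SP2010_PowerPivot_Service_Application' -Query $capturedSql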

27Feb/12

Slowly Changing Dimensions with MD5 Hashes in SSIS

We recently moved away from the 3rd party Checksum component (and all 3rd party components) in SSIS, and I wanted to share the pattern we settled on for maintaining our Type 1 Slowly Changing Dimensions (SCDs). There were two things we wanted to address with the new pattern. First, our previous implementation wasn't performing as well as we needed it to, nor was it generating reliable checksums. Second, we wanted to get away from dependencies on custom assemblies in general. To illustrate the pattern, we're going to build an SCD package off the Adventure Works DW DimCustomer table and skip over the actual source of the business keys and attributes by selecting directly from the completed dimension for now.

First, we assume that our dimension already exists (and we were using some other checksum or MERGE to maintain it). We have to add a column to store the MD5 hash:

ALTER TABLE dbo.DimCustomer ADD
	MD5 VARCHAR(34) NOT NULL DEFAULT ''

Second, we need a staging table to store updated/changed rows. Script out the current dimension as a CREATE, remove all unneeded constraints and indexes, and create a staging table as a heap:

CREATE TABLE [dbo].[Staging_DimCustomer_UpdatedRows](
	[CustomerKey] [int] NOT NULL,
	[GeographyKey] [int] NULL,
	[CustomerAlternateKey] [nvarchar](15) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NULL,
	[NameStyle] [bit] NULL,
	[BirthDate] [datetime] NULL,
	[MaritalStatus] [nchar](1) NULL,
	[Suffix] [nvarchar](10) NULL,
	[Gender] [nvarchar](1) NULL,
	[EmailAddress] [nvarchar](50) NULL,
	[YearlyIncome] [money] NULL,
	[TotalChildren] [tinyint] NULL,
	[NumberChildrenAtHome] [tinyint] NULL,
	[EnglishEducation] [nvarchar](40) NULL,
	[SpanishEducation] [nvarchar](40) NULL,
	[FrenchEducation] [nvarchar](40) NULL,
	[EnglishOccupation] [nvarchar](100) NULL,
	[SpanishOccupation] [nvarchar](100) NULL,
	[FrenchOccupation] [nvarchar](100) NULL,
	[HouseOwnerFlag] [nchar](1) NULL,
	[NumberCarsOwned] [tinyint] NULL,
	[AddressLine1] [nvarchar](120) NULL,
	[AddressLine2] [nvarchar](120) NULL,
	[Phone] [nvarchar](20) NULL,
	[DateFirstPurchase] [datetime] NULL,
	[CommuteDistance] [nvarchar](15) NULL,
	[MD5] [varchar](34) NOT NULL)

Now into SSIS - we will be building:

  1. Execute SQL Task to Truncate our Staging table(s)
  2. Data Flow Task to Insert new rows and Stage updated rows:
     1. OLE DB Source to retrieve our source data
     2. Script Component to Generate Row Numbers
     3. Conditional Split to Evenly Distribute Rows
     4. Script Component to Generate MD5 Hashes (see the hashing sketch after this list)
     5. Union All to Squish it all back together
     6. Lookup to get the existing MD5 Hash (if it exists)
     7. Conditional Split to separate Unchanged and Changed rows
     8. RowCount Transformation
     9. OLE DB Destination for Changed rows
     10. OLE DB Destination for New rows
  3. Execute SQL Task to Update changed rows

Completed Control Flow

Completed Data Flow
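The hashing itself happens in the Script Component in step 4 - in the package it's C#, but the logic is tiny. A minimal sketch of it in PowerShell (the '|' delimiter and Unicode encoding are our assumptions; whatever you pick just has to stay consistent between loads):

# Concatenate the attribute values with a delimiter (so 'ab'+'c' and 'a'+'bc'
# don't hash the same), MD5 the bytes, and format as '0x' plus 32 hex characters.
# That 34-character string is what the MD5 VARCHAR(34) column above is sized for.
$md5 = [System.Security.Cryptography.MD5]::Create()

function Get-AttributeHash {
    param([object[]] $Values)
    $concat = ($Values | ForEach-Object { if ($null -eq $_) { '' } else { $_.ToString() } }) -join '|'
    $bytes = [System.Text.Encoding]::Unicode.GetBytes($concat)
    '0x' + [BitConverter]::ToString($md5.ComputeHash($bytes)).Replace('-', '')
}

# Example: hash a handful of DimCustomer attribute values
Get-AttributeHash 'Jon', 'V', 'Yang', '1971-10-06', 'M'

The Lookup in step 6 then pulls the stored MD5 for each business key, and the Conditional Split in step 7 compares it to the freshly computed hash - equal means unchanged, different means the row is staged for the update in step 3 of the control flow, and business keys with no existing hash are inserted as new rows.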

3Feb/12

SSAS Cache Isn’t Making Cents

I stole the pun from my Business Analyst, Mr. John Seiler 😉 Now on to my issue - when SSAS caches the value for my [Actual] measure, it seems to do so based on the results of the first query that requests that coordinate. This particular cube has bunches of tiny fractions, and depending on how you slice it, it aggregates a little differently. That's a fun problem in itself, but the part that drives me (and the Finance department) crazy is that if you slice on something OTHER than what that first query used to populate the cache, the values don't always add up to the "Grand Total" in Excel - a.k.a. the "All" member.

These are the queries used for this test:

Query A

SELECT
	{Actual} ON 0
FROM [GL]
CELL PROPERTIES VALUE

Query B

SELECT
	{Actual} ON 0,
	NON EMPTY Hierarchize({DrilldownLevel({[Account].[Accounts].[All]},,,INCLUDE_CALC_MEMBERS)}) ON 1
FROM [GL]
CELL PROPERTIES VALUE

Query C

SELECT
	{Actual} ON 0,
	NON EMPTY Hierarchize({DrilldownLevel({[Date].[Fiscal].[All]},,,INCLUDE_CALC_MEMBERS)}) ON 1
FROM [GL]
CELL PROPERTIES VALUE
GO

Results (the cache was cleared with ClearCache on the entire database before each series):

(Results table: Series 1, Series 2, Series 3)
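For reference, a minimal sketch of how the cache can be cleared for the whole database between series, using AMO to send the XMLA ClearCache command (the server name and 'GL' database ID are placeholders):

# Clear the Analysis Services cache for an entire database via XMLA ClearCache
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") > $null
$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect('localhost')
$xmla = @"
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>GL</DatabaseID>
  </Object>
</ClearCache>
"@
$server.Execute($xmla) > $null
$server.Disconnect()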

So basically the Grand Total of this GL cube is a random number based on whoever sneaks the first query in after the cache is cleared (processing, sync, etc.).

And for all of you who think the MDX script breaks everything everywhere - I did comment out my entire MDX script before running these tests.

30Oct/10

Getting a useful FTP error message out of SSIS

The plan: upload a zip file for a bunch of different clients to various user-specified FTP servers - looks something like this:

A quick note on what that "Update FTP Connection" task is doing - it's modifying the properties of the "FTP" connection manager to the appropriate Server/Username/Password for this particular client:

public void Main()
{
    // Re-point the shared "FTP" connection manager at this client's server and credentials
    ConnectionManager ftp = Dts.Connections["FTP"];
    ftp.Properties["ServerName"].SetValue(ftp, (string)Dts.Variables["FTPServer"].Value);
    ftp.Properties["ServerUserName"].SetValue(ftp, (string)Dts.Variables["FTPUser"].Value);
    ftp.Properties["ServerPassword"].SetValue(ftp, (string)Dts.Variables["FTPPassword"].Value);

    Dts.TaskResult = (int)ScriptResults.Success;
}

The problem: anyone who has tried to upload multiple files to multiple FTP sites in an SSIS package has very quickly run into FTP errors (that's what happens when you let users tell you what their FTP URL and authentication are). You can handle these errors and log them via the standard OnError handler, dumping the ErrorCode, ErrorDescription, and usually SourceName out to a flat file or table.

6Apr/10

Excel 2007 Hangs When Trying to Edit an OLAP Cube Filter

Ok, it doesn't hang every time you try to edit an OLAP cube filter, but sometimes it appears to. In reality, I've never seen it permanently hang - it just kind of goes away for a while. Here's the basic symptom the business will report to you:

"Excel freezes (or hangs) when I try to change this filter"

Most of us have seen this at one point or another and shrugged it off as a busy server, or processing going on, or cats clogging up the tubes, etc. Tonight, I finally decided to figure out what's causing it.

15Jan/10

Clone Analysis Services Partitions with PowerShell

Most of us with large Analysis Services cubes partition our cubes by month, year, or some other time-based slice, and we have all, at one point or another, developed some way to create partitions for new months on demand. Often, the solution seems to be a C# console application or an SSIS package using AMO to create a new partition based off an existing partition. The problem I see with this is that maintaining it requires opening up the project or package, making changes, re-compiling, deploying, testing, deploying to production, verifying, etc. It also requires that whoever is going to maintain it is comfortable with C#.

To simplify the maintenance and get rid of the "black box" factor that utility apps like this tend to have, I put together a PowerShell script to do the same thing and a stored procedure to call the script. Really, it doesn't matter what you use, as you're most likely using an almost identical chunk of code to get your new partition created - my argument is that using PowerShell instead of C# or SSIS reduces the cost of maintenance, improves readability, and facilitates better understanding throughout your team.
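The post's full script does this generically; here's a minimal sketch of the core AMO calls, with placeholder object names and assuming query-bound partitions (swap in your own database, cube, measure group, and date slice):

# Clone the latest partition's metadata, re-point it at the new month, and process it.
# All object names and the date slice below are placeholders.
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") > $null

$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect('localhost')

$mg = $server.Databases.GetByName('Sales DW').Cubes.GetByName('Sales').MeasureGroups.GetByName('Sales')

$new = $mg.Partitions.GetByName('Sales 2009-12').Clone()
$new.Name = 'Sales 2010-01'
$new.ID = 'Sales 2010-01'
# Assumes the partition is bound to a query - swap the time slice in its WHERE clause
$new.Source.QueryDefinition = $new.Source.QueryDefinition.Replace("'2009-12'", "'2010-01'")

$mg.Partitions.Add($new)
$mg.Update([Microsoft.AnalysisServices.UpdateOptions]::ExpandFull)
$new.Process([Microsoft.AnalysisServices.ProcessType]::ProcessFull)

$server.Disconnect()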

19Nov/09

Wouldn’t it be fun if Cubes could talk?

I didn't say "wouldn't it be useful" because, after putting a test together, asking a cube questions with no context tends to return answers that it probably shouldn't have returned. In BI, it is incredibly important to understand what exactly it is you're asking for - if we just say we want "sales" and return an answer, nobody really knows what we meant by "sales." Sure, in various circles, "sales" means the same thing - but once you start talking to different areas, departments, etc., the meaning of the word starts to shift.

But I digress - asking cubes questions is still pretty fun and some of the random things it returns when you point it at your own cubes can be flat out hilarious.

Here are a few questions thrown at the Adventure Works cube in the Adventure Works DW 2008 Analysis Services database:

24Oct/09

Charting Analysis Services 2008 Trace Events

The other day I was running some Analysis Services traces for some reason or another and ran across Chris Webb's post, Visualising Analysis Services Trace Information in Reporting Services. After looking over that post, I thought it'd be interesting to visualize various types of processing, queries, etc. - basically, take a trace and then graph out what happened and when. Here are a few samples of what the project yielded:

• Red: Cube Processing on a 2 minute schedule
• Green: Cube Processing on a 1 hour schedule
• Blue: Cube Processing on a 1 hour schedule
• Black: Query Activity

Most of the activity here is from SSRS subscriptions firing around 8AM:

(Chart: 8AM MSRS Subscription Processing)