tim laqua dot com Thoughts and Code from Tim Laqua

18Apr/110

Parallel Cube Processor 1.0

codeplex-logo_3

Parallel Cube Processor (PCP) is an application designed to process multiple Analysis Services databases at the same time. I originally wrote this application with the assumption that it would be blazing fast and essentially warp space and time. As it turns out, SSAS appears to be going as fast as possible at all times, so giving it more stuff to do doesn't really make the whole shebang faster - just different. Play around with it, watch some performance counters - it's an interesting conundrum. The application can handle parallelism using two methods.

The first method is XMLAParallel - in this mode, the application constructs an XMLA command to process the specified number of databases inside of a Parallel element. Only one XMLA command is issued at a time so the applicatoin will not submit another command until the longest processing job completes in each batch.

Tagged as: , , Continue reading
15Apr/112

Dealing With Long Running SSAS Queries using Powershell

powershellFeatured

So, your Analysis Services instance keeps eating all the memory on your server despite proper memory configuration? This happens when queries get too big for their britches - SSAS memory limits are basically soft limits. Whatever an active query requires in terms of resources, SSAS will attempt to provide it. Even if it has to page the entire operating system out to disk to do it. You can mitigate this with good cube design, better aggregations, user training, etc - but it's bound to happen. When your cube gets big enough, one of your users will blow the entire server up.

30Oct/101

Getting a useful FTP error message out of SSIS

The plan: Upload a zip file for a bunch of different clients to various user-specified ftp servers - looks something like this:

A quick note on what that "Update FTP Connection" task is doing - it's modifying the properties of the "FTP" connection manager to the appropriate Server/Username/Password for this particular client:

1
2
3
4
5
6
7
8
9
10
        public void Main()
        {
            ConnectionManager ftp = Dts.Connections["FTP"];
            ftp.Properties["ServerName"].SetValue(ftp, (string)Dts.Variables["FTPServer"].Value);
            ftp.Properties["ServerUserName"].SetValue(ftp, (string)Dts.Variables["FTPUser"].Value);
            ftp.Properties["ServerPassword"].SetValue(ftp, (string)Dts.Variables["User::FTPPassword"].Value);
 
            Dts.TaskResult = (int)ScriptResults.Success;
 
        }

The problem: Anyone who has tried to upload multiple files to multiple FTP sites in an SSIS package very quickly ran in to FTP errors (that's what happens when you let users tell you what their FTP url and authentication is). You can handle these errors and log them via the standard OnError handler dumping the ErrorCode, ErrorDescription, and usually SourceName out to a flat file or table.

6Apr/101

Excel 2007 Hangs When Trying to Edit an OLAP Cube Filter

Ok, it doesn't hang every time you try to edit an OLAP cube filter, but sometimes - it appears to. In reality, I've never seen it permanently hang - just kind of go away for a while. Here's the basic symptom that the business will report to you:

"Excel freezes (or hangs) when I try to change this filter"

Most of us have seen this at one point or another and shrugged it off as a busy time or processing is going on or there are cats clogging up the tubes, etc. Tonight, I finally decided to figure out what's causing it.

15Jan/101

Clone Analysis Services Partitions with PowerShell

Most of us with large Analysis Services cubes partition our cubes by month or year or some other time-based slice and we have all, at one point or another, developed some way to create partitions for new months on-demand. Often, the solution to this seems to be a C# console application or SSIS package using AMO to create a new partition based off an existing partition. The problem I see with this is that maintaining it requires opening up the project or package, making changes, re-compiling, deploying, testing, deploying to production, verifying, etc. It also requires that whoever is going to maintain it is comfortable with C#.

To simplify the maintenance and get rid of the "black box" factor that utility apps like this tend to have, I put together a PowerShell script to do the same thing and a stored procedure to call the script. Really, it doesn't matter what you use as you're most likely using an almost identical chunk of code to get your new partition created - my argument is that using PowerShell instead of C# or SSIS reduces the cost of maintenance, improves readability, and facilitates better understanding throughout your team.

19Nov/092

Wouldn’t it be fun if Cubes could talk?

I didn't say "wouldn't it be useful" because after putting a test together, asking a cube questions with no context tends to return answers that it probably shouldn't have returned. In BI, it is incredibly important to understand what exactly it is you're asking for - if we just say we want "sales" and return an answer, nobody really knows what we meant by "sales." Sure, in various circles, "sales" means the same thing - but once you start talking to different areas, departments, etc - the meaning of the word starts to shift.

But I digress - asking cubes questions is still pretty fun and some of the random things it returns when you point it at your own cubes can be flat out hilarious.

Here's a few questions thrown at the Adventure Works cube in the Adventure Works DW 2008 Analysis Services database

24Oct/090

Charting Analysis Services 2008 Trace Events

The other day I was running some Analysis Services traces for some reason or another and ran across Chris Webb's post on (Visualising Analysis Services Trace Information in Reporting Services). After looking over that post, I thought it'd be interesting to visualize various types of processing, queries, etc - basically take a trace and then graph out what happened and when. Here's a few samples of what the project yielded:

  • Red: Cube Processing on a 2 Minute schedule
  • Green: Cube Processing on a 1 hour schedule
  • Blue: Cube Processing on a 1 hour schedule
  • Black: Query Activity

Most of the activity here is from SSRS subscriptions firing around 8AM
8AM MSRS Subscription Processing

1Oct/090

Reporting Services (SSRS/MSRS) 2008 Error: Set used with the complement operator must have all members from the same level

When you use the Not In operator in a SSRS 2008 MDX query filter to exclude a named set, it uses a the complement operator in the constructed MDX. This is fine as long as "all members [are] from the same level." Since you got this error, they are not ;-) You can get around this by using the Except() MDX function instead of letting SSRS use the Complement operator

In the ReportServerService log, you'll see something like this:
Microsoft.AnalysisServices.AdomdClient.AdomdErrorResponseException: Query (..., ...) Set used with the complement operator must have all members from the same level.

Original filter
Dimension: Time
Hierarchy: Calendar Date
Operator: Not In
Filter Expression: [Today]

New filter
Dimension: Time
Hierarchy: Calendar Date
Operator: MDX
Filter Expression: Except([Time].[Calendar Date].[Calendar Date].MEMBERS, [Today])

23May/093

Maintaining a Type 1 Slowly Changing Dimension (SCD) using T-SQL

UPDATE 2012-03-16: Please also take a look at Slowly Changing Dimensions with MD5 Hashes in SSIS which we have determined to be fastest, most efficient approach to maintaining Type 1 dimensions.

A few days ago, one of our SSIS packages that maintained a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime. Now this doesn't sound too bad, but this is part of our hourly batches, so 15 minutes is 25% of our entire processing window. The package was using the Slowly Changing Dimension Wizard transformation - we were doing the standard OLEDB Source (which basically represented how the SCD "should" look) and then sending it to the SCD transform and letting it figure out what needed to be inserted and updated. One option was to switch to lookups instead of the SCD wizard to speed things up, maybe even some fancy checksum voodoo for the updates (see http://blog.stevienova.com/2008/11/22/ssis-slowly-changing-dimensions-with-checksum/ for an example). Then after thinking about it a little more - why are we sending a million rows down the pipeline every hour? We know only a small percentage of these are new - and another small percentage needs to be updated. Well, we can just write a quick SQL query to get us just those sets and the package would be much more efficient!

Wait a tick - why would we give the rows to SSIS if all it is going to do insert one set and update the other? Let's just do it all in T-SQL:

15May/090

Quick Analysis of Cached Query Plans in SQL Server 2005

I run this one pretty frequently when we need to figure out what procs are killing a complex ETL process and what exactly about them is making the server cry. So basically, if it's on the development server, I'll do a DBCC FREEPROCCACHE and a DBCC DROPCLEANBUFFERS, run the entire set of ETLs, then run this query and then dig deeper in to the query plans that look suspect (high *Scan counts usually, sometimes lots of Hash Matches or Merge Joins). On a production server, the clearing of the proc cache and dropcleanbuffers can be problematic so I'll often just run the query after a scheduled ETL run. If you want to see the query plans mapped out visually, click on the query_plan value and SSMS will open up the XML. Then save that XML file as a .sqlplan file. Once you have that, close the XML and then open the .sqlplan file.