tim laqua dot com Thoughts and Code from Tim Laqua

8 Jun 2011

Revisiting Embedded Text Qualifiers and the SSIS Flat File Connection Manager

To quickly summarize the problem at hand, the Flat File Connection Manager doesn't support embedded text qualifiers. This is a big issue because the most commonly used text qualifier is a double quote (") and that's also a commonly used punctuation mark. That means if you have a CSV to import with lines that look like this:

"one","1","two","2","th""r""ee","3","asdf"
"""left","right""","mid""dle","w"",tf","w,""tf",",""omg","omg"","

You would expect to import:

one 1 two 2 th"r"ee 3 asdf
"left right" mid"dle w",tf w,"tf ,"omg omg",

That's not what you get; you get an error. I've seen a few different approaches to working around this known issue.

What I'd like to explore in this post is using regular expressions to transform the entire file into something that can be natively consumed by SSIS. There are two options for dealing with this:

  • Tokenizing the quotation marks found within the cells
  • Changing the text qualifier

I chose to change the text qualifier because SSIS natively supports multiple-character text qualifiers. This allows me to pick something ridiculous that is incredibly unlikely to appear naturally in the data. Changing the text qualifier also means we won't have to add any extra transformations, as we would if we tokenized the embedded quotation marks (to turn them back into quotation marks).
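To make that concrete, here's a minimal sketch of the qualifier swap, assuming comma-delimited input processed line by line. The file paths, the <~> qualifier, and the exact regular expression are my illustrative choices, not the only way to write it:

    using System;
    using System.IO;
    using System.Text.RegularExpressions;

    class QualifierRewriter
    {
        static void Main(string[] args)
        {
            string inFile = args[0], outFile = args[1];
            const string q = "<~>"; // the "ridiculous" multi-character qualifier

            // One double-quote-qualified field: an opening quote sitting against
            // the start of the line or a comma, a run of non-quote characters or
            // escaped ("") quotes, and a closing quote against a comma or line end.
            var field = new Regex("(?<=^|,)\"((?:[^\"]|\"\")*)\"(?=,|$)");

            using (var reader = new StreamReader(inFile))
            using (var writer = new StreamWriter(outFile))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    // Swap the qualifiers and collapse escaped quotes back to
                    // literal quotes - the new qualifier doesn't need escaping.
                    writer.WriteLine(field.Replace(line, m =>
                        q + m.Groups[1].Value.Replace("\"\"", "\"") + q));
                }
            }
        }
    }

Run that over the raw file (in a Script Task or as a pre-processing step), then point the Flat File Connection Manager at the rewritten file with <~> as its text qualifier.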

20 Apr 2011

Reporting Services Source Control 1.0

Reporting Services Source Control is an application to create copies of all the reports stored in your Report Manager and commit them to source control (SVN is the only system supported in the 1.0 release). The need for this arises for two reasons:

  • First, we empower our users to create reports using Report Builder, and those reports usually exist only in the ReportServer database.
  • Second, Report Builder has matured to the point where there is little benefit to creating a Report Server project in Business Intelligence Development Studio (BIDS).

So now we have intellectual property generated by both end users and developers sitting in Report Manager with no version control whatsoever. Reporting Services Source Control addresses this by scripting out reports, subscriptions, and other properties and committing them to source control.
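Purely to illustrate the extraction half, a loop along these lines pulls every report's RDL out of the catalog via the ReportingService2005 SOAP endpoint - the server URL, export folder, and web-reference name are assumptions, not the tool's actual code:

    using System.IO;
    using ReportService; // assumed name of the web reference

    class ReportExporter
    {
        static void Main()
        {
            var rs = new ReportingService2005();
            rs.Url = "http://myserver/ReportServer/ReportService2005.asmx";
            rs.UseDefaultCredentials = true;

            // Walk the entire catalog and write each report's RDL to disk,
            // ready for an svn add / svn commit pass.
            foreach (CatalogItem item in rs.ListChildren("/", true))
            {
                if (item.Type != ItemTypeEnum.Report) continue;

                byte[] rdl = rs.GetReportDefinition(item.Path);
                string target = Path.Combine(@"C:\ReportExport",
                    item.Path.TrimStart('/').Replace("/", @"\") + ".rdl");
                Directory.CreateDirectory(Path.GetDirectoryName(target));
                File.WriteAllBytes(target, rdl);
            }
        }
    }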

19 Apr 2011

Monitoring Failed Report Server Subscriptions

While we love to empower our users as much as possible, we still need to pay attention to them and what they're up to. The first place I usually see users running amok as Microsoft BI adoption grows in an organization is Report Subscriptions in SSRS. They go nuts with them. I was at one company where 50% of the report subscriptions went to users outside of the company - which means our customers were depending on these subscriptions! That brings up an interesting licensing debacle... but that's another story. Needless to say, when people set up a subscription, they expect it to work. If it doesn't, we REALLY need to let someone know.

Here, we have a procedure to monitor for Reporting Services subscription failures - specifically, email subscription failures. When a failed subscription is detected, an email is sent to the subscription owner and a digest of failures is sent to the specified admin group.
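The procedure itself is in the full post; the detection half boils down to a query against the ReportServer catalog along these lines (a sketch - the failure filter, the connection string, and the actual mail step are assumptions):

    using System;
    using System.Data.SqlClient;

    class SubscriptionMonitor
    {
        static void Main()
        {
            // LastStatus begins with "Failure" when the last delivery attempt failed.
            const string sql = @"
                SELECT c.Path, s.LastStatus, u.UserName AS Owner
                FROM dbo.Subscriptions s
                JOIN dbo.[Catalog] c ON c.ItemID = s.Report_OID
                JOIN dbo.Users u ON u.UserID = s.OwnerID
                WHERE s.LastStatus LIKE 'Failure%'";

            using (var conn = new SqlConnection(
                "Server=myserver;Database=ReportServer;Integrated Security=SSPI"))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                using (var rdr = cmd.ExecuteReader())
                {
                    // Each row feeds both the owner notification and the admin digest.
                    while (rdr.Read())
                        Console.WriteLine("{0}: {1} -> {2}",
                            rdr["Owner"], rdr["Path"], rdr["LastStatus"]);
                }
            }
        }
    }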

18 Apr 2011

Parallel Cube Processor 1.0

Parallel Cube Processor (PCP) is an application designed to process multiple Analysis Services databases at the same time. I originally wrote this application with the assumption that it would be blazing fast and essentially warp space and time. As it turns out, SSAS appears to be going as fast as possible at all times, so giving it more stuff to do doesn't really make the whole shebang faster - just different. Play around with it, watch some performance counters - it's an interesting conundrum. The application can handle parallelism using two methods.

The first method is XMLAParallel - in this mode, the application constructs an XMLA command that processes the specified number of databases inside a Parallel element. Only one XMLA command is issued at a time, so the application will not submit the next batch until the longest-running processing job in the current batch completes.
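For reference, a parallel batch has this shape - here submitted through AMO's Server.Execute, with made-up server and database names:

    using System;
    using Microsoft.AnalysisServices;

    class XmlaParallelDemo
    {
        static void Main()
        {
            // Two full-database process commands wrapped in a single Parallel element.
            const string xmla = @"
    <Batch xmlns=""http://schemas.microsoft.com/analysisservices/2003/engine"">
      <Parallel>
        <Process>
          <Object><DatabaseID>SalesDW</DatabaseID></Object>
          <Type>ProcessFull</Type>
        </Process>
        <Process>
          <Object><DatabaseID>InventoryDW</DatabaseID></Object>
          <Type>ProcessFull</Type>
        </Process>
      </Parallel>
    </Batch>";

            var server = new Server();
            server.Connect("Data Source=localhost");

            // Execute blocks until the whole batch finishes, which is why the
            // next batch has to wait on the slowest job in the current one.
            foreach (XmlaResult result in server.Execute(xmla))
                foreach (XmlaMessage message in result.Messages)
                    Console.WriteLine(message.Description);

            server.Disconnect();
        }
    }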

15 Apr 2011

Dealing With Long Running SSAS Queries using PowerShell

So, your Analysis Services instance keeps eating all the memory on your server despite proper memory configuration? This happens when queries get too big for their britches - SSAS memory limits are basically soft limits. Whatever an active query requires in terms of resources, SSAS will attempt to provide it. Even if it has to page the entire operating system out to disk to do it. You can mitigate this with good cube design, better aggregations, user training, etc - but it's bound to happen. When your cube gets big enough, one of your users will blow the entire server up.
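The post's remedy is a PowerShell script, but the underlying moves translate to anything that can talk to the server: find commands that have run past some threshold via the DISCOVER_COMMANDS DMV, then cancel their sessions. A rough C# rendition via ADOMD.NET, with the threshold and connection string as assumptions:

    using System;
    using System.Collections.Generic;
    using Microsoft.AnalysisServices.AdomdClient;

    class LongQueryKiller
    {
        static void Main()
        {
            const long thresholdMs = 15 * 60 * 1000; // 15 minutes - pick your own limit

            using (var conn = new AdomdConnection("Data Source=localhost"))
            {
                conn.Open();

                // Collect the SPIDs of commands that have been running too long.
                var spids = new List<int>();
                var discover = new AdomdCommand(
                    "SELECT SESSION_SPID, COMMAND_ELAPSED_TIME_MS " +
                    "FROM $SYSTEM.DISCOVER_COMMANDS", conn);
                using (var rdr = discover.ExecuteReader())
                    while (rdr.Read())
                        if (Convert.ToInt64(rdr["COMMAND_ELAPSED_TIME_MS"]) >= thresholdMs)
                            spids.Add(Convert.ToInt32(rdr["SESSION_SPID"]));

                // Ask the server to cancel each offending session via XMLA.
                foreach (int spid in spids)
                {
                    var cancel = new AdomdCommand(
                        "<Cancel xmlns=\"http://schemas.microsoft.com/analysisservices/2003/engine\">" +
                        "<SPID>" + spid + "</SPID></Cancel>", conn);
                    cancel.ExecuteNonQuery();
                    Console.WriteLine("Cancelled SPID " + spid);
                }
            }
        }
    }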

30 Oct 2010

Getting a useful FTP error message out of SSIS

The plan: Upload a zip file for a bunch of different clients to various user-specified FTP servers - the package looks something like this:

A quick note on what that "Update FTP Connection" task is doing - it's modifying the properties of the "FTP" connection manager to the appropriate Server/Username/Password for this particular client:

        public void Main()
        {
            // Repoint the shared "FTP" connection manager at the current client's
            // server before the FTP task runs - the three variables are populated
            // for each client earlier in the loop.
            ConnectionManager ftp = Dts.Connections["FTP"];
            ftp.Properties["ServerName"].SetValue(ftp, (string)Dts.Variables["FTPServer"].Value);
            ftp.Properties["ServerUserName"].SetValue(ftp, (string)Dts.Variables["FTPUser"].Value);
            ftp.Properties["ServerPassword"].SetValue(ftp, (string)Dts.Variables["User::FTPPassword"].Value);

            Dts.TaskResult = (int)ScriptResults.Success;
        }

The problem: Anyone who has tried to upload multiple files to multiple FTP sites in an SSIS package has quickly run into FTP errors (that's what happens when you let users tell you what their FTP URL and authentication are). You can handle these errors and log them via the standard OnError handler, dumping ErrorCode, ErrorDescription, and usually SourceName out to a flat file or table.
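That baseline amounts to a Script Task like this inside the package's OnError event handler - the log path is arbitrary, and the three System:: variables need to be listed in the task's ReadOnlyVariables:

        public void Main()
        {
            // Append the error context SSIS hands to OnError handlers.
            string line = string.Format("{0}\t{1}\t{2}\t{3}",
                DateTime.Now,
                Dts.Variables["System::SourceName"].Value,
                Dts.Variables["System::ErrorCode"].Value,
                Dts.Variables["System::ErrorDescription"].Value);

            System.IO.File.AppendAllText(@"C:\Logs\ftp_errors.log",
                line + Environment.NewLine);

            Dts.TaskResult = (int)ScriptResults.Success;
        }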

6 Apr 2010

Excel 2007 Hangs When Trying to Edit an OLAP Cube Filter

OK, it doesn't hang every time you try to edit an OLAP cube filter, but sometimes it appears to. In reality, I've never seen it permanently hang - it just kind of goes away for a while. Here's the basic symptom the business will report to you:

"Excel freezes (or hangs) when I try to change this filter"

Most of us have seen this at one point or another and shrugged it off - the server is busy, processing is going on, there are cats clogging up the tubes, etc. Tonight, I finally decided to figure out what's causing it.

15 Jan 2010

Clone Analysis Services Partitions with PowerShell

Most of us with large Analysis Services cubes partition our cubes by month or year or some other time-based slice and we have all, at one point or another, developed some way to create partitions for new months on-demand. Often, the solution to this seems to be a C# console application or SSIS package using AMO to create a new partition based off an existing partition. The problem I see with this is that maintaining it requires opening up the project or package, making changes, re-compiling, deploying, testing, deploying to production, verifying, etc. It also requires that whoever is going to maintain it is comfortable with C#.

To simplify the maintenance and get rid of the "black box" factor that utility apps like this tend to have, I put together a PowerShell script to do the same thing and a stored procedure to call the script. Really, it doesn't matter what you use as you're most likely using an almost identical chunk of code to get your new partition created - my argument is that using PowerShell instead of C# or SSIS reduces the cost of maintenance, improves readability, and facilitates better understanding throughout your team.
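That "almost identical chunk of code" is essentially the AMO sequence below, sketched here in C# (the PowerShell script drives the very same objects). The server, database, cube, measure group, partition names, and the query slice are all made-up examples:

    using Microsoft.AnalysisServices;

    class PartitionCloner
    {
        static void Main()
        {
            var server = new Server();
            server.Connect("Data Source=localhost");

            Database db = server.Databases["SalesDW"];
            MeasureGroup mg = db.Cubes["Sales"].MeasureGroups["Internet Sales"];
            Partition template = mg.Partitions["Internet Sales 201012"];

            // Clone the newest partition, rename it for the new month, and
            // rebind its query to the new month's slice of the fact table.
            Partition clone = template.Clone();
            clone.ID = clone.Name = "Internet Sales 201101";
            clone.Source = new QueryBinding(db.DataSources[0].ID,
                "SELECT * FROM FactInternetSales WHERE OrderDateKey BETWEEN 20110101 AND 20110131");

            mg.Partitions.Add(clone);
            mg.Update(UpdateOptions.ExpandFull);
            clone.Process(ProcessType.ProcessFull);

            server.Disconnect();
        }
    }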

19 Nov 2009

Wouldn’t it be fun if Cubes could talk?

I didn't say "wouldn't it be useful" because, after putting a test together, asking a cube questions with no context tends to return answers that probably shouldn't have been returned. In BI, it is incredibly important to understand exactly what it is you're asking for - if we just ask for "sales" and get an answer back, nobody really knows what we meant by "sales." Sure, in various circles, "sales" means the same thing - but once you start talking to different areas, departments, etc., the meaning of the word starts to shift.

But I digress - asking cubes questions is still pretty fun, and some of the random answers that come back when you point it at your own cubes can be flat-out hilarious.

Here are a few questions thrown at the Adventure Works cube in the Adventure Works DW 2008 Analysis Services database:

24 Oct 2009

Charting Analysis Services 2008 Trace Events

The other day I was running some Analysis Services traces for some reason or another and ran across Chris Webb's post, Visualising Analysis Services Trace Information in Reporting Services. After looking over that post, I thought it'd be interesting to visualize various types of processing, queries, etc. - basically, take a trace and then graph out what happened and when. Here are a few samples of what the project yielded:

  • Red: Cube processing on a 2-minute schedule
  • Green: Cube processing on a 1-hour schedule
  • Blue: Cube processing on a 1-hour schedule
  • Black: Query activity

Most of the activity here is from SSRS subscriptions firing around 8 AM.
[Chart: 8AM MSRS Subscription Processing]