tim laqua dot com Thoughts and Code from Tim Laqua

15 Jan 2010

Clone Analysis Services Partitions with PowerShell

Most of us with large Analysis Services cubes partition them by month, year, or some other time-based slice, and we have all, at one point or another, developed some way to create partitions for new months on demand. Often, the solution seems to be a C# console application or SSIS package that uses AMO to create a new partition based on an existing one. The problem I see with this is that maintaining it requires opening up the project or package, making changes, recompiling, deploying, testing, deploying to production, verifying, etc. It also requires that whoever is going to maintain it is comfortable with C#.

To simplify the maintenance and get rid of the "black box" factor that utility apps like this tend to have, I put together a PowerShell script to do the same thing and a stored procedure to call the script. Really, it doesn't matter what you use as you're most likely using an almost identical chunk of code to get your new partition created - my argument is that using PowerShell instead of C# or SSIS reduces the cost of maintenance, improves readability, and facilitates better understanding throughout your team.

22 Dec 2009

Trending SQL Server Agent Job Duration by Hour

Earlier today I noticed a SQL Server Agent job taking a little longer than usual (or what I thought was longer than usual). Let's face it, we're not staring at the Job Activity Monitor all day, so unless you've written a report to monitor job run times, on occasion you end up asking yourself "is that a normal run time for this thing?" The job I was curious about runs throughout the day and should only have real work to do once or twice an hour - and it should run for roughly the same amount of time on any given business day for a given hour (i.e. at 1:00 PM on any given business day, this thing should do the same amount of work).

So I came up with the following query to PIVOT the run duration on the hour the job executed:
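As a rough illustration of the idea (a sketch, not necessarily the query from the full post): job history lives in msdb.dbo.sysjobhistory, where run_time and run_duration are stored as integers in HHMMSS form. The job name and the list of hours below are placeholders to adjust for your own job.

```sql
-- Hypothetical job name; replace with the job you want to trend.
DECLARE @JobName SYSNAME = N'MyHourlyJob';

SELECT RunDate, [8], [9], [10], [11], [12], [13], [14], [15], [16], [17]
FROM (
    SELECT
         h.run_date AS RunDate                      -- integer, YYYYMMDD
        ,h.run_time / 10000 AS RunHour              -- HHMMSS -> HH
        ,(h.run_duration / 10000) * 3600            -- hours   -> seconds
            + ((h.run_duration / 100) % 100) * 60   -- minutes -> seconds
            + (h.run_duration % 100) AS DurationSeconds
    FROM msdb.dbo.sysjobhistory h
        INNER JOIN msdb.dbo.sysjobs j ON j.job_id = h.job_id
    WHERE j.name = @JobName
        AND h.step_id = 0                           -- job outcome rows only, not individual steps
) src
PIVOT (SUM(DurationSeconds) FOR RunHour IN ([8], [9], [10], [11], [12], [13], [14], [15], [16], [17])) p
ORDER BY RunDate;
```

Each row is one day and each column is an hour of that day, so a slow run stands out when you read down a column.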

19 Nov 2009

Wouldn’t it be fun if Cubes could talk?

I didn't say "wouldn't it be useful" because, after putting a test together, I found that asking a cube questions with no context tends to return answers that it probably shouldn't have returned. In BI, it is incredibly important to understand what exactly it is you're asking for - if we just say we want "sales" and return an answer, nobody really knows what we meant by "sales." Sure, in various circles, "sales" means the same thing - but once you start talking to different areas, departments, etc. - the meaning of the word starts to shift.

But I digress - asking cubes questions is still pretty fun and some of the random things it returns when you point it at your own cubes can be flat out hilarious.

Here are a few questions thrown at the Adventure Works cube in the Adventure Works DW 2008 Analysis Services database:

24 Oct 2009

Charting Analysis Services 2008 Trace Events

The other day I was running some Analysis Services traces for some reason or another and ran across Chris Webb's post "Visualising Analysis Services Trace Information in Reporting Services". After looking over that post, I thought it'd be interesting to visualize various types of processing, queries, etc. - basically take a trace and then graph out what happened and when. Here are a few samples of what the project yielded:

  • Red: Cube Processing on a 2 Minute schedule
  • Green: Cube Processing on a 1 hour schedule
  • Blue: Cube Processing on a 1 hour schedule
  • Black: Query Activity

Most of the activity here is from SSRS subscriptions firing around 8AM
8AM MSRS Subscription Processing

15 Oct 2009

Estimating the Size of a Table in SQL Server 2008

I have read this (http://msdn.microsoft.com/en-us/library/ms178085.aspx) article at least 6, maybe 7 times in the past - and every time I say to myself "this is ridiculous - someone has to have written a script to do this by now" and every time, I google for hours and fail to find anything. So I finally gave up and wrote something to do it. Note that I've only verified it on tables made up entirely of fixed-width columns. I compared its output against the actual size of a 600+ million row table and the estimate came out somewhere around 3% higher - fine with me, as I'd rather over-estimate space requirements than under-estimate.
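For the simplest case, a heap with only fixed-width columns, the arithmetic in Microsoft's size-estimation documentation boils down to a few lines. The following is a minimal sketch of that arithmetic, not the script mentioned above; the row count, column count, and row width are made-up inputs, and indexes are ignored entirely.

```sql
-- Hypothetical inputs: adjust to match your own table.
DECLARE @NumRows       BIGINT = 600000000;  -- expected number of rows
DECLARE @NumCols       INT    = 10;         -- total number of columns
DECLARE @FixedDataSize INT    = 60;         -- sum of the byte sizes of all fixed-width columns

-- The null bitmap takes 2 bytes plus one bit per column (integer division rounds down)
DECLARE @NullBitmap  INT = 2 + ((@NumCols + 7) / 8);
-- Row size = data + null bitmap + 4 bytes of row header
DECLARE @RowSize     INT = @FixedDataSize + @NullBitmap + 4;
-- 8096 usable bytes per page; each row also needs a 2-byte slot array entry
DECLARE @RowsPerPage INT = 8096 / (@RowSize + 2);
-- Round up to whole pages
DECLARE @NumPages    BIGINT = CEILING(@NumRows * 1.0 / @RowsPerPage);

SELECT
     @RowSize     AS [RowSizeBytes]
    ,@RowsPerPage AS [RowsPerPage]
    ,@NumPages    AS [NumPages]
    ,@NumPages * 8192 / 1024.0 / 1024.0 / 1024.0 AS [EstimatedGB];  -- pages are 8192 bytes on disk
```

Variable-width columns, clustered indexes, and nonclustered indexes each add further terms, which is where the article gets long.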

1 Oct 2009

Reporting Services (SSRS/MSRS) 2008 Error: Set used with the complement operator must have all members from the same level

When you use the Not In operator in an SSRS 2008 MDX query filter to exclude a named set, it uses the complement operator in the constructed MDX. This is fine as long as "all members [are] from the same level." Since you got this error, they are not ;-) You can get around this by using the Except() MDX function instead of letting SSRS use the complement operator.

In the ReportServerService log, you'll see something like this:
Microsoft.AnalysisServices.AdomdClient.AdomdErrorResponseException: Query (..., ...) Set used with the complement operator must have all members from the same level.

Original filter
Dimension: Time
Hierarchy: Calendar Date
Operator: Not In
Filter Expression: [Today]

New filter
Dimension: Time
Hierarchy: Calendar Date
Operator: MDX
Filter Expression: Except([Time].[Calendar Date].[Calendar Date].MEMBERS, [Today])

25 Sep 2009

Determining when RESTORE DATABASE command will complete (SQL Server 2008)

Ah, I see you just started restoring that 1TB monster and now everyone wants to know when it's going to be finished, where you're at in the process, etc. Fear not, Microsoft is very good at making up fictional numbers for us to use as rough estimates! I usually add 10-20% on top of these estimates just in case the database gremlins wander by to ruin your day again. Or in case you encounter "storage issues."

-- Progress and estimated completion for any RESTORE DATABASE currently running.
-- The DMV reports both time columns in milliseconds.
SELECT 
	 r.percent_complete AS [PercentComplete]
	,r.estimated_completion_time/1000.0/60.0 AS [RemainingMinutes]
	,r.total_elapsed_time/1000.0/60.0 AS [ElapsedMinutes]
	,(r.estimated_completion_time+r.total_elapsed_time)/1000.0/60.0 AS [TotalMinutes]
	,DATEADD(MILLISECOND, r.estimated_completion_time, GETDATE()) AS [EstimatedTimeOfCompletion]
	,st.text AS [CommandSQL]
FROM sys.dm_exec_requests r
	CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
WHERE r.command LIKE '%RESTORE DATABASE%'

25 Sep 2009

Determining how long a database will be IN RECOVERY (SQL Server 2008)

So, your MSSQL service crashed in the middle of a big transaction? Or you bumped the service while it was rolling back some gigantic schema change (like, say, a column add on an 800 million row table)? Well, as you prepare your resume in preparation for the fallout from this debacle, you can use the following query to see how much time you have left. Or, I should say, how much time it thinks you have left... which seems to swing wildly up and down... Microsoft math ftw.

 
-- Database to look for in the error log
DECLARE @DBName VARCHAR(64) = 'databasename'
 
-- LogDate is DATETIME so that ORDER BY sorts chronologically rather than alphabetically
DECLARE @ErrorLog AS TABLE([LogDate] DATETIME, [ProcessInfo] VARCHAR(64), [TEXT] VARCHAR(MAX))
 
-- Arguments: current log (0), SQL Server error log (1), then two search strings
INSERT INTO @ErrorLog
EXEC sys.xp_readerrorlog 0, 1, 'Recovery of database', @DBName
 
-- Most recent progress messages first; the values are parsed out of text like
-- "... (5) is 12% complete (approximately 1234 seconds remain) ..."
SELECT TOP 5
	 [LogDate]
	,SUBSTRING([TEXT], CHARINDEX(') is ', [TEXT]) + 5,CHARINDEX(' complete (', [TEXT]) - CHARINDEX(') is ', [TEXT]) - 5) AS PercentComplete
	,CAST(SUBSTRING([TEXT], CHARINDEX('approximately', [TEXT]) + 13,CHARINDEX(' seconds remain', [TEXT]) - CHARINDEX('approximately', [TEXT]) - 13) AS FLOAT)/60.0 AS MinutesRemaining
	,CAST(SUBSTRING([TEXT], CHARINDEX('approximately', [TEXT]) + 13,CHARINDEX(' seconds remain', [TEXT]) - CHARINDEX('approximately', [TEXT]) - 13) AS FLOAT)/60.0/60.0 AS HoursRemaining
	,[TEXT]
 
FROM @ErrorLog ORDER BY [LogDate] DESC

23 May 2009

Maintaining a Type 1 Slowly Changing Dimension (SCD) using T-SQL

A few days ago, one of our SSIS packages that maintained a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime. Now this doesn't sound too bad, but this is part of our hourly batches, so 15 minutes is 25% of our entire processing window. The package was using the Slowly Changing Dimension Wizard transformation - we were doing the standard OLEDB Source (which basically represented how the SCD "should" look) and then sending it to the SCD transform and letting it figure out what needed to be inserted and updated. One option was to switch to lookups instead of the SCD wizard to speed things up, maybe even some fancy checksum voodoo for the updates (see http://blog.stevienova.com/2008/11/22/ssis-slowly-changing-dimensions-with-checksum/ for an example). Then after thinking about it a little more - why are we sending a million rows down the pipeline every hour? We know only a small percentage of these are new - and another small percentage needs to be updated. Well, we can just write a quick SQL query to get us just those sets and the package would be much more efficient!

Wait a tick - why would we give the rows to SSIS if all it is going to do is insert one set and update the other? Let's just do it all in T-SQL:
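The general shape of a set-based Type 1 load is one UPDATE for rows whose attributes changed and one INSERT for keys the dimension has not seen yet. The following is a minimal sketch of that pattern, not the script from the full post; the table variables, column names, and sample rows are all invented for illustration, and the change test assumes the compared columns are NOT NULL.

```sql
-- Hypothetical dimension and source tables, as table variables so the sketch is self-contained.
DECLARE @DimCustomer TABLE (CustomerKey INT PRIMARY KEY, CustomerName VARCHAR(50) NOT NULL, City VARCHAR(50) NOT NULL);
DECLARE @Source      TABLE (CustomerKey INT PRIMARY KEY, CustomerName VARCHAR(50) NOT NULL, City VARCHAR(50) NOT NULL);

INSERT INTO @DimCustomer VALUES (1, 'Acme', 'Fargo'), (2, 'Initech', 'Austin');
INSERT INTO @Source      VALUES (1, 'Acme', 'Minneapolis'), (2, 'Initech', 'Austin'), (3, 'Globex', 'Springfield');

-- Type 1 update: overwrite attributes in place, touching only rows that actually changed.
-- (<> never matches when either side is NULL, hence the NOT NULL assumption above.)
UPDATE d
SET  d.CustomerName = s.CustomerName
    ,d.City         = s.City
FROM @DimCustomer d
    INNER JOIN @Source s ON s.CustomerKey = d.CustomerKey
WHERE d.CustomerName <> s.CustomerName
    OR d.City <> s.City;

-- Insert: keys present in the source but not yet in the dimension.
INSERT INTO @DimCustomer (CustomerKey, CustomerName, City)
SELECT s.CustomerKey, s.CustomerName, s.City
FROM @Source s
WHERE NOT EXISTS (SELECT 1 FROM @DimCustomer d WHERE d.CustomerKey = s.CustomerKey);

SELECT CustomerKey, CustomerName, City FROM @DimCustomer ORDER BY CustomerKey;
```

Only the changed and new rows are written, so the work scales with the size of the delta rather than with the size of the dimension.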

23 May 2009

Parsing QueryString (GET) Variables From a URL using T-SQL

First of all - I know, don't do this. The application that put the URL in the column in the first place is MUCH better at handling URLs and ideally, you would just add columns for the GET variables you're after and have the application put those in. Parsing them out after the fact from a VARCHAR field is insane.

So now we're here and we need to do that thing that we said we shouldn't do. My approach is to use a table-valued function that returns only one column - the value of the variable, or NULL if it can't find the variable in the querystring.
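One way such a function could look is sketched below. This is an illustration of the approach rather than the function from the full post: the function name and parameter names are made up, no URL decoding is done, a trailing #fragment is not stripped, and whether the variable name matches case-sensitively depends on your collation.

```sql
-- Hypothetical name and signature; an inline table-valued function returning a single [Value] column.
CREATE FUNCTION dbo.ufn_GetQueryStringValue
(
     @Url  VARCHAR(8000)
    ,@Name VARCHAR(256)
)
RETURNS TABLE
AS
RETURN
(
    SELECT
        -- Everything from the start of the value up to the next '&'.
        -- If the variable was not found, ValueStart is NULL and so is the result.
        SUBSTRING(qs.Padded
                 ,pos.ValueStart
                 ,CHARINDEX('&', qs.Padded, pos.ValueStart) - pos.ValueStart) AS [Value]
    FROM (
        -- Wrap the querystring in '&' so every pair looks like '&name=value&'
        SELECT CASE WHEN CHARINDEX('?', @Url) > 0
                    THEN '&' + SUBSTRING(@Url, CHARINDEX('?', @Url) + 1, 8000) + '&'
                    ELSE ''
               END AS Padded
    ) qs
    CROSS APPLY (
        -- Position of the first character after '&name='; NULLIF turns "not found" (0) into NULL
        SELECT NULLIF(CHARINDEX('&' + @Name + '=', qs.Padded), 0) + LEN(@Name) + 2 AS ValueStart
    ) pos
);
GO

-- Example call against a made-up URL
SELECT v.[Value]
FROM dbo.ufn_GetQueryStringValue('http://example.com/page.aspx?id=42&ref=home', 'ref') v;
```

Because the function is inline, it can be applied to a whole column with CROSS APPLY, one call per variable you want to extract.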