Most of us with large Analysis Services cubes partition our cubes by month or year or some other time-based slice, and we have all, at one point or another, developed some way to create partitions for new months on-demand. Often, the solution to this seems to be a C# console application or SSIS package using AMO to create a new partition based on an existing partition. The problem I see with this is that maintaining it requires opening up the project or package, making changes, recompiling, deploying, testing, deploying to production, verifying, etc. It also requires that whoever is going to maintain it is comfortable with C#.
To simplify the maintenance and get rid of the "black box" factor that utility apps like this tend to have, I put together a PowerShell script to do the same thing and a stored procedure to call the script. Really, it doesn't matter what you use as you're most likely using an almost identical chunk of code to get your new partition created - my argument is that using PowerShell instead of C# or SSIS reduces the cost of maintenance, improves readability, and facilitates better understanding throughout your team.
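To give a feel for how little script is involved, here is a minimal sketch of the clone-a-partition pattern in PowerShell with AMO. The server, database, cube, and measure group names, the partition naming convention, and the WHERE-clause style are all assumptions - swap in your own:

```powershell
# Sketch: clone the newest partition in a measure group and repoint its
# source query at a new month. All object names below are placeholders.
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null

$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect("MyOlapServer")

$db           = $server.Databases.FindByName("MyDatabase")
$cube         = $db.Cubes.FindByName("MyCube")
$measureGroup = $cube.MeasureGroups.FindByName("MyMeasureGroup")

# Use the most recently added partition as the template
$template = $measureGroup.Partitions[$measureGroup.Partitions.Count - 1]

$newPartition      = $template.Clone()
$newPartition.ID   = "FactSales_201204"
$newPartition.Name = "FactSales_201204"
$newPartition.Source.QueryDefinition = `
    "SELECT * FROM dbo.FactSales WHERE MonthKey = 201204"

$measureGroup.Partitions.Add($newPartition) | Out-Null
$newPartition.Update()

$server.Disconnect()
```

Because it's just a script, the person maintaining it can open it in a text editor, see exactly what it does, and change the naming convention or source query without recompiling or redeploying anything.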
UPDATE 2012-03-16: Please also take a look at Slowly Changing Dimensions with MD5 Hashes in SSIS, which we have determined to be the fastest, most efficient approach to maintaining Type 1 dimensions.
A few days ago, one of our SSIS packages that maintained a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime. Now this doesn't sound too bad, but this is part of our hourly batches, so 15 minutes is 25% of our entire processing window. The package was using the Slowly Changing Dimension Wizard transformation - we were doing the standard OLEDB Source (which basically represented how the SCD "should" look) and then sending it to the SCD transform and letting it figure out what needed to be inserted and updated. One option was to switch to lookups instead of the SCD wizard to speed things up, maybe even some fancy checksum voodoo for the updates (see http://blog.stevienova.com/2008/11/22/ssis-slowly-changing-dimensions-with-checksum/ for an example). Then after thinking about it a little more - why are we sending a million rows down the pipeline every hour? We know only a small percentage of these are new - and another small percentage needs to be updated. Well, we can just write a quick SQL query to get us just those sets and the package would be much more efficient!
Wait a tick - why would we give the rows to SSIS at all if all it is going to do is insert one set and update the other? Let's just do it all in T-SQL:
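The whole hourly Type 1 maintenance collapses into two set-based statements. This is a sketch only - dbo.DimCustomer, stg.Customer, and the column list are hypothetical stand-ins for your own dimension and staging source, and real columns with NULLs would need ISNULL-style handling in the comparison:

```sql
-- Sketch of the pure T-SQL Type 1 approach (placeholder table/column names).

-- Update only the rows whose attributes have actually changed
UPDATE d
SET    d.CustomerName = s.CustomerName,
       d.City         = s.City
FROM   dbo.DimCustomer d
       INNER JOIN stg.Customer s ON s.CustomerKey = d.CustomerKey
WHERE  d.CustomerName <> s.CustomerName
   OR  d.City         <> s.City;

-- Insert only the rows the dimension has never seen
INSERT INTO dbo.DimCustomer (CustomerKey, CustomerName, City)
SELECT s.CustomerKey, s.CustomerName, s.City
FROM   stg.Customer s
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.DimCustomer d
                   WHERE  d.CustomerKey = s.CustomerKey);
```

Instead of pushing a million rows down the pipeline every hour, the engine touches only the small percentage that is new or changed.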
First of all - I know, don't do this. The application that put the URL in the column in the first place is MUCH better at handling URLs and ideally, you would just add columns for the GET variables you're after and have the application put those in. Parsing them out after the fact from a VARCHAR field is insane.
So now we're here and we need to do that thing that we said we shouldn't do. My approach is to use a Table Valued Function that only returns one column - the value of the variable or NULL if it can't find the variable in the querystring.
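As an illustration of the single-column TVF idea, here is a simplified sketch. The function name is hypothetical, and the parsing is deliberately naive (for example, it would also match a variable name that appears as the tail of a longer name) - a production version needs delimiter-aware matching:

```sql
-- Sketch: returns the value of @VariableName from @QueryString,
-- or NULL when the variable isn't present. Name and parsing rules
-- are illustrative only.
CREATE FUNCTION dbo.fn_GetQueryStringValue
(
    @QueryString  VARCHAR(2048),
    @VariableName VARCHAR(128)
)
RETURNS TABLE
AS
RETURN
(
    SELECT CASE
             WHEN p.Start = 0 THEN NULL
             ELSE SUBSTRING(@QueryString, p.ValueStart,
                            CHARINDEX('&', @QueryString + '&', p.ValueStart)
                                - p.ValueStart)
           END AS VariableValue
    FROM (SELECT CHARINDEX(@VariableName + '=', @QueryString) AS Start,
                 CHARINDEX(@VariableName + '=', @QueryString)
                     + LEN(@VariableName) + 1 AS ValueStart) p
);
```

Because it's a TVF, you can CROSS APPLY it against the table holding the URLs, e.g. `SELECT q.VariableValue FROM dbo.RequestLog r CROSS APPLY dbo.fn_GetQueryStringValue(r.Url, 'sessionid') q` (table and column names here are hypothetical).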
I have always felt that the traditional approach to getting Agent Job status by creating a table and populating it with output from xp_sqlagent_enum_jobs was incredibly crude. So I poked around a bit and came up with a solution using a Table Valued Function that queries sysjobactivity to determine the status of a given job. The reason I use a Table Valued Function is that we have a few tables that contain a list of jobs that need to be started at certain points in a given process (and they may or may not already be running) - with the TVF, you can simply CROSS APPLY the function and Bob's your uncle.
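A minimal sketch of what such a TVF can look like - the function name is hypothetical, and the "running" test is the common sysjobactivity idiom (a start time with no stop time on the latest session):

```sql
-- Sketch: report whether a named Agent job is currently running.
-- TOP 1 / ORDER BY session_id DESC keeps only the most recent
-- sysjobactivity row for the job.
CREATE FUNCTION dbo.fn_GetJobStatus (@JobName SYSNAME)
RETURNS TABLE
AS
RETURN
(
    SELECT TOP 1
           j.[name] AS JobName,
           CASE WHEN a.start_execution_date IS NOT NULL
                 AND a.stop_execution_date IS NULL
                THEN 'Running'
                ELSE 'Idle'
           END AS JobStatus
    FROM   msdb.dbo.sysjobs j
           LEFT JOIN msdb.dbo.sysjobactivity a ON a.job_id = j.job_id
    WHERE  j.[name] = @JobName
    ORDER BY a.session_id DESC
);
```

With a (hypothetical) dbo.JobList table of job names, checking them all is then just `SELECT jl.JobName, s.JobStatus FROM dbo.JobList jl CROSS APPLY dbo.fn_GetJobStatus(jl.JobName) s`.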
I ran into this the other day, did some googling, and really did not like what I saw for workarounds. In SQL Server 2005, when an XMLA job step fails (returns an Exception node in the XML response), the job step still reports success (because it defines success as "did I get a response") - this has been fixed in SQL Server 2008. Common workarounds are using ascmd.exe or SSIS to handle the XMLA commands (ish - both of those solutions add a lot of complexity for a simple problem). So, I came up with a workaround that checks the text of the previous job step for the substring "<Exception ". It's been working thus far, with no issues.
After each XMLA command step, insert a T-SQL step to verify that the XMLA command step succeeded:
DECLARE @JobName VARCHAR(64)
SET @JobName = 'Name Of Job This Step Belongs to'

DECLARE @Message VARCHAR(1024)

SELECT TOP 1 @Message = CAST([message] AS VARCHAR(1024))
FROM msdb.dbo.sysjobhistory a
INNER JOIN msdb.dbo.sysjobs b
    ON a.job_id = b.job_id
    AND b.[name] = @JobName
ORDER BY run_date DESC, run_time DESC, step_id DESC

IF @Message LIKE '%<Exception %'
    RAISERROR (@Message, 17, 1)
UPDATE (2009-04-03): Added ", step_id DESC" to the ORDER BY clause - when the XMLA step fails instantly (say you tried to process a nonexistent partition), run_time doesn't have enough granularity to sort properly.
Once you're done, your job steps will look something like this:
SELECT DirName, LeafName, tp_Email, CheckoutDate, IsCheckoutToLocal
FROM AllDocs d WITH (NOLOCK)
INNER JOIN Webs w WITH (NOLOCK) ON d.WebId = w.Id
INNER JOIN Sites s WITH (NOLOCK) ON w.SiteId = s.Id
INNER JOIN UserInfo u WITH (NOLOCK) ON (s.Id = u.tp_SiteID AND u.tp_ID = CheckoutUserId)
WHERE d.[Type] <> 1
    AND (LeafName LIKE '%.xls' OR LeafName LIKE '%.xlsx')
    AND LeafName NOT LIKE '%template%'
    AND DeleteTransactionId = CAST(0 AS VARBINARY)
    AND IsCurrentVersion = 1
    AND IsCheckoutToLocal = 1
ORDER BY tp_Email ASC