ssis « tim laqua dot com

tim laqua dot com Thoughts and Code from Tim Laqua


Consuming Sharepoint Lists via OData with SSIS

We recently needed to bring data from a Sharepoint list in to SSIS for use in an attribute lookup for a cube dimension. Seems like this should be pretty straight forward since SSRS does it natively now, but no - that wizardry hasn't made its way over to the SSIS team yet. As stated before, we don't care for third party dependencies or external non-standard assemblies of any sort in our packages. That means, as usual, we'll be writing a script component to take care of getting our data out of Sharepoint.

First, navigate to the list you are interested in and export it as a data feed


SSIS, BIDS and 64bit System DSNs: Who’s on first?

After an hour or so of changing settings, package configurations, protection levels, passwords, connection strings, etc - you might just still have yourself one of those dreaded red boxes (and a headache). And you thought inserting data using ODBC was easy...

You might even have yourself some logging that indicates an error like this:
[AS400 [19]] Error: ADO NET Destination has failed to acquire the connection {43E6AE37-24E8-46F6-8AB0-689DB6531167}. The connection may have been corrupted.

Assuming you are using a System DSN for your connection (this article is based on an ODBC DSN using the iSeries Access ODBC Driver to write to DB2, but it applies to any ODBC driver with 32bit and 64bit versions where a System DSN is used for the connection).

Before going on, you should understand and have checked the PackageProtectionLevel property in your package. To get any connection that requires a password to run in BIDS, you can't use DontSaveSensitive (unless you're using package configurations to enter the password or an explicit connection string). This is by far the most common reason connections fail to validate - but there's many resources out there already on that situation, so I won't go in to it.

Now that we know it SHOULD work - why doesn't it? In short, because you don't have a 64-bit version of the System DSN you selected. What sense does that make? Let me show you.

Tagged as: , Continue reading

Consuming an Authenticated JSON Feed with SSIS

For this post, we'll look in to one option for consuming a JSON data feed from a service provider's API. In my case, the provider is AtTask - a web based project management service. This sort of assignment/request is pretty common as various corners of your organization simply want to use the best tool for the job. Far too infrequently does anyone really look at things like how it connects to your current systems. They say things like "we have a well documented API and a strong user community" - yeah, I've heard that a few thousand times. JSON is neat if you're using JQuery. It's horrible if you're trying to suck data down in to a SQL Server table. Which is what we're going to do.

First, we wander over to the AtTask API Documentation and figure out how to authenticate. Notice that every request needs a sessionID - and that sessionID is created by a request to the login url. This means we'll have to make at least two requests - one to nab a sessionID and another to actually get the data that we want.

Tagged as: , , Continue reading

Revisiting Embedded Text Qualifiers and the SSIS Flat File Connection Manager

To quickly summarize the problem at hand, the Flat File Connection Manager doesn't support embedded Text Qualifiers. The reason this is a big issue is because the most commonly used text qualifier is a double quote (") and that's also a commonly used punctuation mark. That means if you have a CSV to import and lines that looks like this:


You would expect to import

one 1 two 2 th"r"ee 3 asdf
"left right" mid"dle w",tf w,"tf ,"omg omg",

That's not what you get, you get an error. I've seen a few different approaches to working around this known issue:

What I'd like to explore in this post is using regular expressions to transform the entire file in to something that can be natively consumed by SSIS. There are two options for dealing with this:

  • Tokenizing the quotation marks found within the cells
  • Change the text qualifier

I chose to change the text qualifier because SSIS natively supports multiple character text qualifiers. This will allow me to pick something ridiculous that is incredibly unlikely to appear in the data naturally. Changing the text qualifier also means we won't have to add any extra transformations as we would if we tokenized the embeded quotation marks (to turn them back in to quotation marks).


Getting a useful FTP error message out of SSIS

The plan: Upload a zip file for a bunch of different clients to various user-specified ftp servers - looks something like this:

A quick note on what that "Update FTP Connection" task is doing - it's modifying the properties of the "FTP" connection manager to the appropriate Server/Username/Password for this particular client:

        public void Main()
            ConnectionManager ftp = Dts.Connections["FTP"];
            ftp.Properties["ServerName"].SetValue(ftp, (string)Dts.Variables["FTPServer"].Value);
            ftp.Properties["ServerUserName"].SetValue(ftp, (string)Dts.Variables["FTPUser"].Value);
            ftp.Properties["ServerPassword"].SetValue(ftp, (string)Dts.Variables["User::FTPPassword"].Value);

            Dts.TaskResult = (int)ScriptResults.Success;


The problem: Anyone who has tried to upload multiple files to multiple FTP sites in an SSIS package very quickly ran in to FTP errors (that's what happens when you let users tell you what their FTP url and authentication is). You can handle these errors and log them via the standard OnError handler dumping the ErrorCode, ErrorDescription, and usually SourceName out to a flat file or table.


Maintaining a Type 1 Slowly Changing Dimension (SCD) using T-SQL

UPDATE 2012-03-16: Please also take a look at Slowly Changing Dimensions with MD5 Hashes in SSIS which we have determined to be fastest, most efficient approach to maintaining Type 1 dimensions.

A few days ago, one of our SSIS packages that maintained a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime. Now this doesn't sound too bad, but this is part of our hourly batches, so 15 minutes is 25% of our entire processing window. The package was using the Slowly Changing Dimension Wizard transformation - we were doing the standard OLEDB Source (which basically represented how the SCD "should" look) and then sending it to the SCD transform and letting it figure out what needed to be inserted and updated. One option was to switch to lookups instead of the SCD wizard to speed things up, maybe even some fancy checksum voodoo for the updates (see for an example). Then after thinking about it a little more - why are we sending a million rows down the pipeline every hour? We know only a small percentage of these are new - and another small percentage needs to be updated. Well, we can just write a quick SQL query to get us just those sets and the package would be much more efficient!

Wait a tick - why would we give the rows to SSIS if all it is going to do insert one set and update the other? Let's just do it all in T-SQL:


Adding Windows Event Log Logging To Existing SSIS Packages via C#

While working with some SSIS logging, it occured to me that there's really no good way to globally apply the same logging to all packages on your server. For our production SSIS server, it would make sense to log all OnError events to the Application Event log. Then, whatever event log monitoring app you use can notify you of the package error (with details!). Thus, this app was born - an app to remove pre-existing LogProviders and add in the intended LogProvider and options.

This is really just an example of the implementation that worked for my purposes - I'm sure some of you would prefer to log elsewhere. This should be enough to get you going. Please drop a line in the Comments if you post alternative approaches.


SSISForcedLogging.Console.exe "Z:\SSIS Packages"
SSISForcedLogging.Console.Exe "Z:\SSIS Packages\Package1.dtsx"

Exporting SSIS Packages Stored on the Server (MSDB) to .DTSX (XML)

We wanted to export an SSIS package that was stored on the server in the msdb.dbo.sysdtspackages90 table to a .dtsx file so we could poke at it.

Here's what we came up with:

declare @SQLcommand varchar(max)
set @SQLcommand = 
    'bcp "SELECT cast(cast(packagedata as varbinary(max)) as varchar(max)) 
     FROM msdb.dbo.sysdtspackages90 
     WHERE name=''PackageName''" queryout "c:\output.dtsx" -T -c'
exec xp_cmdshell @SQLcommand

Alternatively, you can just Export the package via SSMS 😉