<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>tim laqua dot com &#187; scd</title>
	<atom:link href="http://timlaqua.com/tag/scd/feed/" rel="self" type="application/rss+xml" />
	<link>http://timlaqua.com</link>
	<description>Thoughts and Code from Tim Laqua</description>
	<lastBuildDate>Sun, 09 May 2010 15:25:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Maintaining a Type 1 Slowly Changing Dimension (SCD) using T-SQL</title>
		<link>http://timlaqua.com/2009/05/maintaining-a-type-1-slowly-changing-dimension-scd-using-t-sql/</link>
		<comments>http://timlaqua.com/2009/05/maintaining-a-type-1-slowly-changing-dimension-scd-using-t-sql/#comments</comments>
		<pubDate>Sat, 23 May 2009 16:37:35 +0000</pubDate>
		<dc:creator>Tim</dc:creator>
				<category><![CDATA[Scripts & Code]]></category>
		<category><![CDATA[bi]]></category>
		<category><![CDATA[business intelligence]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[scd]]></category>
		<category><![CDATA[slowly changing dimension]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[ssis]]></category>
		<category><![CDATA[t-sql]]></category>

		<guid isPermaLink="false">http://timlaqua.com/?p=224</guid>
		<description><![CDATA[A few days ago, one of our SSIS packages that maintained a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime. Now this doesn't sound too bad, but this is part of our hourly batches, so 15 minutes is 25% of our entire processing window. The [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago, one of our SSIS packages that maintains a Type 1 Slowly Changing Dimension (SCD) of about 1 million rows crept up to 15 minutes of runtime.  That doesn't sound too bad, but the package is part of our hourly batches, so 15 minutes is 25% of our entire processing window.  The package used the Slowly Changing Dimension Wizard transformation: a standard OLE DB Source (which basically represented how the SCD "should" look) fed every row to the SCD transform, which figured out what needed to be inserted and updated.  One option was to switch to lookups instead of the SCD wizard to speed things up, maybe even with some fancy checksum voodoo for the updates (see <a href="http://blog.stevienova.com/2008/11/22/ssis-slowly-changing-dimensions-with-checksum/">http://blog.stevienova.com/2008/11/22/ssis-slowly-changing-dimensions-with-checksum/</a> for an example).  Then, after thinking about it a little more: why are we sending a million rows down the pipeline every hour?  We know only a small percentage of these rows are new, and another small percentage need to be updated.  We could just write a quick SQL query to pull only those two sets, and the package would be much more efficient!</p>
<p>Wait a tick - why would we give the rows to SSIS if all it is going to do is insert one set and update the other?  Let's just do it all in T-SQL:<span id="more-224"></span></p>
<p><em>The following tables and dimension are fictional - I just needed to make up a star schema dimension to illustrate the approach.</em></p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">CREATE</span> <span style="color: #0000FF;">TABLE</span> #Temp_SCDInserts <span style="color: #808080;">&#40;</span><span style="color: #808080;">&#91;</span>PartId<span style="color: #808080;">&#93;</span> <span style="color: #0000FF;">INT</span><span style="color: #808080;">&#41;</span>
&nbsp;
<span style="color: #008080;">-- INSERTS: business keys present in the source but not yet in the dimension</span>
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> #Temp_SCDInserts
<span style="color: #0000FF;">SELECT</span>   
	<span style="color: #808080;">&#91;</span>PartId<span style="color: #808080;">&#93;</span>
<span style="color: #0000FF;">FROM</span> 
	Part a <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartType b <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartTypeId</span> <span style="color: #808080;">=</span> b.<span style="color: #202020;">PartTypeId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplier c <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartSupplierId</span> <span style="color: #808080;">=</span> c.<span style="color: #202020;">PartSupplierId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplierCategory d <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> c.<span style="color: #202020;">PartSupplierCategoryId</span> <span style="color: #808080;">=</span> d.<span style="color: #202020;">PartSupplierCategoryId</span>
<span style="color: #0000FF;">EXCEPT</span>
<span style="color: #0000FF;">SELECT</span>
	<span style="color: #808080;">&#91;</span>PartId<span style="color: #808080;">&#93;</span>	
<span style="color: #0000FF;">FROM</span>
	DW.<span style="color: #202020;">dbo</span>.<span style="color: #202020;">Dim_Part</span> <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span>
&nbsp;
<span style="color: #008080;">-- UPDATES: existing dimension rows where at least one attribute differs from the source</span>
<span style="color: #0000FF;">SELECT</span> 
	 dim.<span style="color: #808080;">&#91;</span>PartKeyId<span style="color: #808080;">&#93;</span>
	,a.<span style="color: #808080;">&#91;</span>PartName<span style="color: #808080;">&#93;</span>
	,b.<span style="color: #808080;">&#91;</span>PartTypeId<span style="color: #808080;">&#93;</span>
	,b.<span style="color: #808080;">&#91;</span>PartTypeName<span style="color: #808080;">&#93;</span>
	,c.<span style="color: #808080;">&#91;</span>PartSupplierId<span style="color: #808080;">&#93;</span>
	,c.<span style="color: #808080;">&#91;</span>PartSupplierName<span style="color: #808080;">&#93;</span>
	,d.<span style="color: #808080;">&#91;</span>PartSupplierCategoryId<span style="color: #808080;">&#93;</span>
	,d.<span style="color: #808080;">&#91;</span>PartSupplierCategoryName<span style="color: #808080;">&#93;</span>
<span style="color: #0000FF;">INTO</span> #Temp_SCDUpdates
<span style="color: #0000FF;">FROM</span> DW.<span style="color: #202020;">dbo</span>.<span style="color: #202020;">Dim_Part</span> dim <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> Part a <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span>
		<span style="color: #0000FF;">ON</span> dim.<span style="color: #202020;">PartId</span> <span style="color: #808080;">=</span> a.<span style="color: #202020;">PartId</span> <span style="color: #008080;">-- Business Key</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartType b <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartTypeId</span> <span style="color: #808080;">=</span> b.<span style="color: #202020;">PartTypeId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplier c <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartSupplierId</span> <span style="color: #808080;">=</span> c.<span style="color: #202020;">PartSupplierId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplierCategory d <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> c.<span style="color: #202020;">PartSupplierCategoryId</span> <span style="color: #808080;">=</span> d.<span style="color: #202020;">PartSupplierCategoryId</span>
<span style="color: #0000FF;">WHERE</span>
	dim.<span style="color: #202020;">PartName</span> <span style="color: #808080;">&lt;&gt;</span> a.<span style="color: #202020;">PartName</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartTypeId</span> <span style="color: #808080;">&lt;&gt;</span> b.<span style="color: #202020;">PartTypeId</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartTypeName</span> <span style="color: #808080;">&lt;&gt;</span> b.<span style="color: #202020;">PartTypeName</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartSupplierId</span> <span style="color: #808080;">&lt;&gt;</span> c.<span style="color: #202020;">PartSupplierId</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartSupplierName</span> <span style="color: #808080;">&lt;&gt;</span> c.<span style="color: #202020;">PartSupplierName</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartSupplierCategoryId</span> <span style="color: #808080;">&lt;&gt;</span> d.<span style="color: #202020;">PartSupplierCategoryId</span>
	<span style="color: #808080;">OR</span> dim.<span style="color: #202020;">PartSupplierCategoryName</span>	<span style="color: #808080;">&lt;&gt;</span> d.<span style="color: #202020;">PartSupplierCategoryName</span>
&nbsp;
<span style="color: #008080;">-- INSERT new records</span>
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> <span style="color: #808080;">&#91;</span>DW<span style="color: #808080;">&#93;</span>.<span style="color: #808080;">&#91;</span>dbo<span style="color: #808080;">&#93;</span>.<span style="color: #808080;">&#91;</span>Dim_Part<span style="color: #808080;">&#93;</span>
	<span style="color: #808080;">&#40;</span><span style="color: #808080;">&#91;</span>PartId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartTypeId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartTypeName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierCategoryId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierCategoryName<span style="color: #808080;">&#93;</span><span style="color: #808080;">&#41;</span>
<span style="color: #0000FF;">SELECT</span> 
	 a.<span style="color: #808080;">&#91;</span>PartId<span style="color: #808080;">&#93;</span>
	,a.<span style="color: #808080;">&#91;</span>PartName<span style="color: #808080;">&#93;</span>
	,b.<span style="color: #808080;">&#91;</span>PartTypeId<span style="color: #808080;">&#93;</span>
	,b.<span style="color: #808080;">&#91;</span>PartTypeName<span style="color: #808080;">&#93;</span>
	,c.<span style="color: #808080;">&#91;</span>PartSupplierId<span style="color: #808080;">&#93;</span>
	,c.<span style="color: #808080;">&#91;</span>PartSupplierName<span style="color: #808080;">&#93;</span>
	,d.<span style="color: #808080;">&#91;</span>PartSupplierCategoryId<span style="color: #808080;">&#93;</span>
	,d.<span style="color: #808080;">&#91;</span>PartSupplierCategoryName<span style="color: #808080;">&#93;</span>
<span style="color: #0000FF;">FROM</span> #Temp_SCDInserts i
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> Part a <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span>
		<span style="color: #0000FF;">ON</span> i.<span style="color: #202020;">PartId</span> <span style="color: #808080;">=</span> a.<span style="color: #202020;">PartId</span> <span style="color: #008080;">-- Business Key</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartType b <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartTypeId</span> <span style="color: #808080;">=</span> b.<span style="color: #202020;">PartTypeId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplier c <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartSupplierId</span> <span style="color: #808080;">=</span> c.<span style="color: #202020;">PartSupplierId</span>
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> PartSupplierCategory d <span style="color: #0000FF;">WITH</span><span style="color: #808080;">&#40;</span>NOLOCK<span style="color: #808080;">&#41;</span> 
		<span style="color: #0000FF;">ON</span> c.<span style="color: #202020;">PartSupplierCategoryId</span> <span style="color: #808080;">=</span> d.<span style="color: #202020;">PartSupplierCategoryId</span>
&nbsp;
<span style="color: #008080;">-- UPDATE existing records</span>
<span style="color: #0000FF;">UPDATE</span> a
<span style="color: #0000FF;">SET</span>
	 <span style="color: #808080;">&#91;</span>PartName<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartTypeId<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartTypeId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartTypeName<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartTypeName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierId<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartSupplierId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierName<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartSupplierName<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierCategoryId<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartSupplierCategoryId<span style="color: #808080;">&#93;</span>
	,<span style="color: #808080;">&#91;</span>PartSupplierCategoryName<span style="color: #808080;">&#93;</span> <span style="color: #808080;">=</span> b.<span style="color: #808080;">&#91;</span>PartSupplierCategoryName<span style="color: #808080;">&#93;</span>
<span style="color: #0000FF;">FROM</span> 
	<span style="color: #808080;">&#91;</span>DW<span style="color: #808080;">&#93;</span>.<span style="color: #808080;">&#91;</span>dbo<span style="color: #808080;">&#93;</span>.<span style="color: #808080;">&#91;</span>Dim_Part<span style="color: #808080;">&#93;</span> <span style="color: #0000FF;">AS</span> a
	<span style="color: #0000FF;">INNER</span> <span style="color: #808080;">JOIN</span> #Temp_SCDUpdates <span style="color: #0000FF;">AS</span> b 
		<span style="color: #0000FF;">ON</span> a.<span style="color: #202020;">PartKeyId</span> <span style="color: #808080;">=</span> b.<span style="color: #202020;">PartKeyId</span>  <span style="color: #008080;">-- Join on Business Keys or Surrogate Key</span>
&nbsp;
&nbsp;
<span style="color: #0000FF;">DROP</span> <span style="color: #0000FF;">TABLE</span> #Temp_SCDInserts
<span style="color: #0000FF;">DROP</span> <span style="color: #0000FF;">TABLE</span> #Temp_SCDUpdates</pre></div></div>
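
<p>One thing to watch in the UPDATES query above: a &lt;&gt; comparison evaluates to UNKNOWN when either side is NULL, so if any of the compared columns is nullable, a change to or from NULL would not be picked up.  Here is a minimal sketch of a NULL-safe comparison that reuses the EXCEPT trick from the inserts.  The #Src and #Dim tables and their rows are made up purely for illustration (they are not part of the schema above), and whether you need this at all depends on whether your columns allow NULLs:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">-- Hypothetical stand-ins for the source query and the dimension
CREATE TABLE #Src (PartId INT, PartName VARCHAR(50) NULL)
CREATE TABLE #Dim (PartId INT, PartName VARCHAR(50) NULL)
&nbsp;
INSERT INTO #Src VALUES (1, 'Bolt')
INSERT INTO #Src VALUES (2, NULL)      -- name was blanked out in the source
INSERT INTO #Src VALUES (3, 'Nut')
INSERT INTO #Dim VALUES (1, 'Bolt')
INSERT INTO #Dim VALUES (2, 'Washer')
INSERT INTO #Dim VALUES (3, 'Nut')
&nbsp;
-- EXCEPT treats two NULLs as equal and NULL vs. a value as different,
-- so the subquery returns a row only when the attribute lists differ
SELECT
	dim.PartId
FROM #Dim dim
	INNER JOIN #Src src
		ON dim.PartId = src.PartId -- Business Key
WHERE EXISTS (
	SELECT src.PartName
	EXCEPT
	SELECT dim.PartName
)
&nbsp;
DROP TABLE #Src
DROP TABLE #Dim</pre></div></div>

<p>Run as written, this returns only PartId 2, which a plain dim.PartName &lt;&gt; src.PartName predicate would skip.  With more attributes, list them in the same order in both halves of the EXCEPT.</p>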

<p>The results?  Our 15-minute SSIS package has been replaced with a few lines of T-SQL that run in less than 90 seconds.  Nice.</p>
]]></content:encoded>
			<wfw:commentRss>http://timlaqua.com/2009/05/maintaining-a-type-1-slowly-changing-dimension-scd-using-t-sql/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
