bi « tim laqua dot com

tim laqua dot com Thoughts and Code from Tim Laqua

8Apr/140 is dead – Create your own using SSRS and JDE metadata tables


Anyone who has used is certainly mourning its demise as JDE, like most ERP systems, is a bear to understand without a really big instruction book. Luckily you can chat with your JDE people and, given the correct libraries, modify the following query to get the same information:

       RTRIM(TDOBNM)   as Table_Name,
       RTRIM(d.SIMD)   as Table_Description,
       RTRIM(d.SIPFX)  as Table_Prefix,
       RTRIM(TDOBND)   as Data_Dictionary_Object_Name,
       RTRIM(TDSQLC)   as SQL_Column_Name,
       RTRIM(b.FRDSCR) as Column_Description,
       RTRIM(e.DRDL01) as Field_Type,
       RTRIM(c.FRDTAS) as Field_Length,
       RTRIM(c.FRDTAD) as Field_Decimals,
       --RTRIM(c.FROWDI) as DD_Item,
		ELSE 'Generic Edit'
	END as Edit_Type,
       --RTRIM(c.FROWER) as Edit_Rule,
       --RTRIM(c.FROER1) as Edit_Rule_Parm1,
       --RTRIM(c.FROER2) as Edit_Rule_Parm2,
       RTRIM(TDPSEQ)   as Program_Sequence_Number
        COPD910.F98711 a
        LEFT OUTER JOIN DD910.F9202 b
                ON a.TDOBND = b.FRDTAI
                   AND FRLNGP = '  '
                   AND FRSYR = '    '
        LEFT OUTER JOIN DD910.F9210 c
                ON a.TDOBND = c.FRDTAI
        LEFT OUTER JOIN OL910.F9860 d
                ON a.TDOBNM = d.SIOBNM
                   AND d.SIFUNO = 'TBLE'
                ON LTRIM(RTRIM(c.FROWTP)) = LTRIM(RTRIM(e.DRKY))
                   AND e.DRSY = 'H98'
                   AND e.DRRT = 'DT'
        RTRIM(TDOBNM) Like ?


Sample RDL can be found here: You will need to change the data source to work in your environment as well as update the following libraries to your environment:

  • COPD910.F98711 - F98711 varies per environment, so make sure you get the right library
  • DD910.F9202 - all the F92 files are in the same library
  • PRODCTL.F0005 - whatever library your UDC table is in
  • OL910.F9860 - the Object Librarian library
Tagged as: , , No Comments

Hiding SSAS Attributes With Powershell

The challenge here was that we have a cube with some great data as well as some low level detail (customer detail, think PII) that we want to expose to a larger audience. The problem is that the low level detail is not something that the new audience needs (or in some cases, is allowed) to have access to. The obvious answer here is Dimension security - why not use that? Two reasons. First, Dimension security is slow. Second, even if it wasn't slow, to hide all the data in many (30+) attributes is tedious to setup, and when new attributes were added we would have to make sure and disable access to those as well. To be clear, we're not just hiding attributes here, we're creating an entire copy of the existing SSAS database and hiding attributes in the copy.

# Import Required Libraries
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") >$NULL

# Connect to server
$server = New-Object Microsoft.AnalysisServices.Server

# Make a metadata copy of the existing cube
$newdb = $server.Databases.GetByName('POS Data').Clone()

# Change ID and Name of SSAS DB
$newdb.Name = 'POS Data - Limited'
$newdb.ID = 'POS Data - Limited'

# Drop Existing SSAS DB if it exists
$server.Databases['POS Data - Limited'].Drop()

# Add the new copy to the server

# Sync our copy of the new database with the server's copy

# Grab the cube we want to work with from the new database
$cube = $newdb.Cubes.GetByName('POS Data')

# Hide the Location Dimension
$cube.Dimensions.GetByName('Location').Visible = $false

# Hide all attributes and hierarchies in the Customer dimension
$cube.Dimensions.GetByName('Customer').Attributes |  %{$_.AttributeHierarchyEnabled=$false; $_.AttributeHierarchyVisible=$false;}
$cube.Dimensions.GetByName('Customer').Hierarchies | %{$_.Visible=$false; $_.Enabled=$false;}

# Enable the key attribute in the customer dimension - it won't work if the key isn't enabled
$cube.Dimensions.GetByName('Customer').Attributes['Dim Customer'].AttributeHierarchyEnabled=$true

# Enable the Market attribute in the customer dimension
$cube.Dimensions.GetByName('Customer').Attributes['Market'] | %{$_.AttributeHierarchyEnabled=$true; $_.AttributeHierarchyVisible=$true;}

# Hide the Location Database attribute in the Product dimension
$cube.Dimensions.GetByName('Product').Attributes['Location Database'] | %{$_.AttributeHierarchyEnabled=$false; $_.AttributeHierarchyVisible=$false;}

# Add a new member to the role granting Read Only permissions in the cube
$newMember = new-object Microsoft.AnalysisServices.RoleMember("domain\tlaqua")
$newdb.Roles['Role 1'].Members.Add($newMember)

# Push our updates to the server

# Process the new database

# Disconnect from the server

This approach has two notable downfalls. First, you have to think up a different database name because the original db with the low level detail still exists on the server. Second, you have to ProcessFull the clone that you made. It doesn't close the data, it clones the metadata. All in all, works great for us, this particular cube is only processed once a week and the 20 minutes we lose processing data in to the clone is more than acceptable (looks really clean to the users as well).


Consuming Sharepoint Lists via OData with SSIS

We recently needed to bring data from a Sharepoint list in to SSIS for use in an attribute lookup for a cube dimension. Seems like this should be pretty straight forward since SSRS does it natively now, but no - that wizardry hasn't made its way over to the SSIS team yet. As stated before, we don't care for third party dependencies or external non-standard assemblies of any sort in our packages. That means, as usual, we'll be writing a script component to take care of getting our data out of Sharepoint.

First, navigate to the list you are interested in and export it as a data feed


Automating PowerPivot Data Refresh in Sharepoint 2010

Of course we want to do this, it's a fundamental requirement. Dear Sharepoint, please refresh my PowerPivot when the data is ready, not on some arbitrary schedule. Until this functionality is built-in, people will continue to hack away at it to figure out how to make this happen. Is this supported? Certainly not - so continue at your own risk.

So the method we'll be using here is just to mimic what Sharepoint does when you check the box to "Also refresh as soon as possible" in the schedule configuration page. To accomplish this we open the profiler, connect it to whatever instance our Sharepoint PowerPivot database is hosted on and filter you TextData to "%Schedule%" or filter to just the Sharepoint PowerPivot database (SP2010_PowerPivot_Service_Application in our case) - then open up your test PowerPivot schedule configuration, check the box, click OK, wait for the schedule history to come back up and then stop the trace. Now you know you've got what you need, you just have to find it:


Slowly Changing Dimensions with MD5 Hashes in SSIS

We recently moved away from the 3rd party Checksum component (and all 3rd party components) in SSIS and I wanted to share the pattern we settled on for maintaining our Type 1 Slowly Changing Dimensions (SCDs). There are two things we wanted to address with our new pattern. First, our previous implementation wasn't performing as well as we needed it to or generating reliable checksums. The second was that we wanted to get away from dependencies on custom assemblies in general. To illustrate the pattern, we're going to build a SCD package off the Adventure Works DW DimCustomer table and skip over the actual source of the business keys and attributes by selecting directly from the completed dimension for now.

First, we assume that our dimension already exists (and we were using some other checksum or MERGE to maintain it). We have to add a column to store the MD5 hash:

ALTER TABLE dbo.DimCustomer ADD

Second, we need a staging table to store updated/changed rows. Script out the current dimension as a CREATE, remove all unneeded constraints and indexes, and create a staging table as a heap:

CREATE TABLE [dbo].[Staging_DimCustomer_UpdatedRows](
	[CustomerKey] [int] NOT NULL,
	[GeographyKey] [int] NULL,
	[CustomerAlternateKey] [nvarchar](15) NOT NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NULL,
	[NameStyle] [bit] NULL,
	[BirthDate] [datetime] NULL,
	[MaritalStatus] [nchar](1) NULL,
	[Suffix] [nvarchar](10) NULL,
	[Gender] [nvarchar](1) NULL,
	[EmailAddress] [nvarchar](50) NULL,
	[YearlyIncome] [money] NULL,
	[TotalChildren] [tinyint] NULL,
	[NumberChildrenAtHome] [tinyint] NULL,
	[EnglishEducation] [nvarchar](40) NULL,
	[SpanishEducation] [nvarchar](40) NULL,
	[FrenchEducation] [nvarchar](40) NULL,
	[EnglishOccupation] [nvarchar](100) NULL,
	[SpanishOccupation] [nvarchar](100) NULL,
	[FrenchOccupation] [nvarchar](100) NULL,
	[HouseOwnerFlag] [nchar](1) NULL,
	[NumberCarsOwned] [tinyint] NULL,
	[AddressLine1] [nvarchar](120) NULL,
	[AddressLine2] [nvarchar](120) NULL,
	[Phone] [nvarchar](20) NULL,
	[DateFirstPurchase] [datetime] NULL,
	[CommuteDistance] [nvarchar](15) NULL,
	[MD5] [varchar](34) NOT NULL)

Now in to SSIS - We will be building:

  1. Execute SQL Task to Truncate our Staging table(s)
  2. Data Flow Task to Insert new rows and Stage updated rows
  1. OLE DB Source to retrieve our source data
  2. Script Component to Generate Row Numbers
  3. Conditional Split to Evenly Distribute Rows
  4. Script Component to Generate MD5 Hashes
  5. Union All to Squish it all back together
  6. Lookup to get the existing MD5 Hash (if it exists)
  7. Conditional Split to separate Unchanged and Changed rows
  8. RowCount Transformation
  9. OLE DB Destination for Changed rows
  10. OLE DB Destination for New rows
  • Execute SQL Task to Update changed rows
  • Completed Control Flow

    Completed Data Flow


    Uncivil UNIONs

    If you asked any SQL Server Developer or DBA what UNION does, they would tell you it combines two result sets. Then if you were to ask them if they knew it removed duplicates without the ALL argument, they would say yes. That's academic, but I dare you to go look at their and your code and see how often UNION without the ALL argument is used. UNION is evil. Why do I say that? Because DISTINCT is evil and blocking operations are evil so it follows that UNION is evil.

    Would you be OK with the following query as the source for a fact table?

      SELECT DISTINCT [c1],[c2] FROM [t1]
      SELECT DISTINCT [c1],[c2] FROM [t2]
    ) a

    No? Then why are you OK with this?

    SELECT [c1],[c2] FROM [t1]
    SELECT [c1],[c2] FROM [t2]

    Personally, I tend to use UNION to combine data from two different sources. For example, if I wanted to combine the sales transactions from two JDE instances - the query might look like:

    SELECT [c1],[c2] FROM [JDE1].[dbo].[F42119]
    SELECT [c1],[c2] FROM [JDE2].[dbo].[F42119]

    Each line here in a sales detail represents a line on an order, so assuming you selected all the PK columns in your query, a UNION won't remove any rows within either set, but when combined - it's possible the same PK could exist in both systems. Now you could split hairs and say it's highly unlikely the rest of the row would be identical, but that's not the point - it's wrong to ever DISTINCT a set of transactions intended for creating a fact table. Sure, in this particular case you could add a "SourceSystem" column that would stop that, but then you still have to accept that the UNION is blocking and expensive. Just because a DISTINCT doesn't modify results doesn't mean it didn't do any work - it actually did a great deal of work. Now consider that we're usually not selecting 2 columns, we're selecting 10-20. Do you want to wait for it to figure out all the rows are distinct when you already knew that? Further, if they're not distinct you probably screwed up a JOIN somewhere.

    As far as coding standards regarding UNION go, I would suggest never using UNION without the ALL clause and doing your DISTINCT elsewhere if that's actually what you intended. Does that seem silly? Yes - but I would wager you won't ever use the 2nd one outside of building Dimensions.

    SELECT [c1],[c2] FROM [JDE1].[dbo].[F42119]
    SELECT [c1],[c2] FROM [JDE2].[dbo].[F42119]
      SELECT [c1],[c2] FROM [JDE1].[dbo].[F42119]
      SELECT [c1],[c2] FROM [JDE2].[dbo].[F42119]

    And don't get me started on unioning subqueries with GROUP BY's - Why don't you just throw half your processers in the garbage, light tempdb on fire and put all your data in heaps.

    Tagged as: , 3 Comments

    Hiding SSRS Schedule Jobs In SSMS

    Everybody hates these things. If you're in the Activity Monitor or browsing via the SSMS tree view, these GUID jobs that represent SSRS subscriptions are really just none of our concern. Sure, I can admit that I've seen people manually fire these to re-send a given subscription, but you can just do that using the AddEvent proc in a query window. Personally - I don't want to see these... usually...

    Connect to the database instance you want to filter the agent jobs out on
    Browse to Databases > System Databases > msdb > Programmability > Stored Procedures > System Stored Procedures
    Right-click on dbo.sp_help_category and select Modify...

    At the top, change the definition of @where_clause to NVARCHAR(MAX)

      DECLARE @where_clause   NVARCHAR(max)

    At the bottom, add in a few lines to append the @where_clause variable with a predicate that filters out the Report Server category when it's you from your workstation (so you can still see that category from another machine if you need to).

      SELECT @cmd = @cmd + N'FROM msdb.dbo.syscategories '
      SET @where_clause += N'
    			name = ''Report Server'' 
    			AND (
    				SELECT RTRIM(nt_username) + ''|'' + RTRIM(hostname) 
    				FROM sys.sysprocesses 
    				where spid = @@spid) = ''tlaqua|TIMSLAPTOP''  THEN 0
    		ELSE 1
    	END = 1 '
      -- Execute the query
      EXECUTE (@cmd + @where_clause + N'ORDER BY category_type, name')
      RETURN(@@error) -- 0 means success

    So, what on earth are we doing here? First, replace my nt_username with yours and replace my hostname with yours. From my less-than-exhaustive trial and error testing, it seems that when either the SSMS Jobs node is expanded or the Job Activity Monitor fetches jobs, two DB calls are made - one to fetch categories and another to fetch jobs. I tried filtering out the jobs portion originally and that yielded some errors. So I'm assuming it's trying to marry the categories and the jobs internally, and it expects there to be jobs for each category the first query returned. By not returning the Report Server category at all, the resulting merged list of jobs doesn't contain any jobs belonging to that category (logically an INNER JOIN).

    Sure, this is a dirty hack, but I don't mind.

    Update (2012-01-11)
    Here is the predicate for hiding those jobs from ALL SSMS clients:

      SET @where_clause += N'
    			name = ''Report Server'' 
    			AND (
    				SELECT program_name 
    				FROM sys.sysprocesses 
    				where spid = @@spid) = ''Microsoft SQL Server Management Studio''  THEN 0
    		ELSE 1
    	END = 1 '
    Tagged as: , 1 Comment

    SSIS, BIDS and 64bit System DSNs: Who’s on first?

    After an hour or so of changing settings, package configurations, protection levels, passwords, connection strings, etc - you might just still have yourself one of those dreaded red boxes (and a headache). And you thought inserting data using ODBC was easy...

    You might even have yourself some logging that indicates an error like this:
    [AS400 [19]] Error: ADO NET Destination has failed to acquire the connection {43E6AE37-24E8-46F6-8AB0-689DB6531167}. The connection may have been corrupted.

    Assuming you are using a System DSN for your connection (this article is based on an ODBC DSN using the iSeries Access ODBC Driver to write to DB2, but it applies to any ODBC driver with 32bit and 64bit versions where a System DSN is used for the connection).

    Before going on, you should understand and have checked the PackageProtectionLevel property in your package. To get any connection that requires a password to run in BIDS, you can't use DontSaveSensitive (unless you're using package configurations to enter the password or an explicit connection string). This is by far the most common reason connections fail to validate - but there's many resources out there already on that situation, so I won't go in to it.

    Now that we know it SHOULD work - why doesn't it? In short, because you don't have a 64-bit version of the System DSN you selected. What sense does that make? Let me show you.

    Tagged as: , Continue reading

    Pattern for Conditionally Sending SSRS Subscriptions

    This is a pretty common request - send the report if there's an issue, don't send it if there's not an issue. This is a simple pattern for doing that using Data Driven Subscriptions. Here, we're not using the data driven subscription to dynamically populate recipient and parameter information, we're just using it to suppress the sending of the report if the report has no value.

    To do this, we only need one thing - a query that returns one row if the report should be sent and zero rows if the report should not be sent. Here is an example - Let's say I want to send a report out if a JDE Address Book record with an AT1 of 'DC' has been updated after a certain julian date.

    First, write a query to return rows meeting that criteria:

    Tagged as: , Continue reading

    Consuming an Authenticated JSON Feed with SSIS

    For this post, we'll look in to one option for consuming a JSON data feed from a service provider's API. In my case, the provider is AtTask - a web based project management service. This sort of assignment/request is pretty common as various corners of your organization simply want to use the best tool for the job. Far too infrequently does anyone really look at things like how it connects to your current systems. They say things like "we have a well documented API and a strong user community" - yeah, I've heard that a few thousand times. JSON is neat if you're using JQuery. It's horrible if you're trying to suck data down in to a SQL Server table. Which is what we're going to do.

    First, we wander over to the AtTask API Documentation and figure out how to authenticate. Notice that every request needs a sessionID - and that sessionID is created by a request to the login url. This means we'll have to make at least two requests - one to nab a sessionID and another to actually get the data that we want.

    Tagged as: , , Continue reading