Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

SCD Type 2

3D_Journey
7 - Meteor

Hi team, 

 

I'm not sure if this is right place to ask question. 

I have slowly change dimension type 2 for historic data working properly in Alteryx but my end date never updates when there is a change to my data. Any thought what I'm missing guys?

5 REPLIES 5
Treyson
13 - Pulsar
13 - Pulsar

Hey @3D_Journey,

 

This is sort of a big question. What type of DB are you using for your warehousing?

 

When you say type 2, I assume it's something like type 2 from this article: https://www.sqlshack.com/implementing-slowly-changing-dimensions-scds-in-data-warehouses/

 

I built a macro for this a few years ago that has been lost to time but essentially we would take the following steps.

 

1) Create a "holding" table where Alteryx would push all of the dimensions. Make sure to truncate the table at the beginning of this process so that you aren't potentially updating with old records. In the example above it would have one record per key and then a field with all of them.

2) Create a procedure that performs a merge/insert onto your SCD table. First checking for existing records where the "EndDate" is "9999-12-31" and has a key on your "holding" table. Also make sure that your insert statement is only updating/inserting when one or more of your dimensions don't match what is on the existing SCD table. The statement would update the value of the "EndDate" to today's date. I would also switch the "IsCurrent" flag to "no" or whatever boolean value you have representing that.

3) Insert all records on the "holding" table into your SCD table. To note, when creating that SCD table I would put rules on the "StartDate", "EndDate" and "IsCurrent" field to auto-populate when the new record is inserted. Which is why we update records that currently exist first.

4) You can call procedures in the output tool in the "post-SQL" section.

 

If you need help with this, please reach out! I have done this (or something like it) on quite a few occasions.

Treyson Marks
Senior Analytics Engineer
3D_Journey
7 - Meteor

Hi Treyson, 

 

Thank you for helping out. To answer your question, I'm using Redshift Discovery as DB and yes type 2 in the article.

I did pretty much your steps, but it seems like something here. Do I suppose to get duplicate results? Please look at the screenshot below!

Treyson
13 - Pulsar
13 - Pulsar

Are each of these records the same key? I think I need to see your scripts. If you send me a DM with your email we can get going on setting up a call. Or if you are okay posting your script here, that would be fine as well. If I was to guess, you might not be truncating the temp table before adding new records in?

Treyson Marks
Senior Analytics Engineer
3D_Journey
7 - Meteor

Sure, call will be great! If you can reply to my email.

Goa311@gmail.com

Treyson
13 - Pulsar
13 - Pulsar

For everyone following this thread. This became something much larger than Alteryx. It requires some understanding of warehousing.

 

We had to start from the ground up, creating a table in SQL Server that could handle slowly changing dimensions and then a process that compared dimensions on each record to see where records were changing between workflow iterations. It then closed out (changing the current field to N and added an end_date) old records and created new, current records.

 

I will, in time, create a webinar about this because I imagine not a ton of people that use Alteryx know what to do for Slowly Changing Dimensions.

Treyson Marks
Senior Analytics Engineer
Labels