
Alteryx Designer Desktop Discussions

SOLVED

Is there a way to do a delete statement in In-DB

jeeva_ganesan
8 - Asteroid

Hi All,

 

Is there a way to do a DELETE statement in-DB in Alteryx? I know we can choose "Delete Data and Append" in the creation mode of Write Data In-DB, but there is no way to specify a WHERE condition or its values there. With the regular (non-In-DB) tools, we could update the pre- or post-SQL statement by placing the tool in a macro, but when I do the same with the In-DB tools, there is no way to specify the SQL statements. Could someone help with this?
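For context, the statement being asked about is an ordinary delete with a predicate; run directly against the database it would look something like this (table and column names are purely illustrative):

```sql
-- Illustrative only: delete a subset of rows, not the whole table
DELETE FROM main_table
WHERE month_flag IN ('JAN', 'FEB', 'MAR');
```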

 

Regards,

Jeeva.

15 Replies
KaneG
Alteryx Alumni (Retired)

Hi Jeeva,

 

 

So, there is no out-of-the-box way to simply delete records In-DB that I know of. Feel free to post it in the Ideas section. I've looked at the issue a couple of different ways, and with the information spread across 10 comments I'm not sure I understand the original use case. One thing to remember if you want to see the SQL being generated: you can attach a Dynamic Output tool to your In-DB stream to view the query being sent to the database. That may help with troubleshooting if you want to try a few methods.

 

You could use a Block Until Done in the macro, so that nothing exits the macro until the Output tool is finished. Of course, if this is an In-DB macro, that all changes...

 

Image 001 - 20160122 - 170101.png

 

However, the other option is to use a Formula tool to set one of the fields to a delete flag, then use that flag in a delete statement in an Output tool's Post SQL, similar to response #7 from @MarqueeCrew. I think I may have your order the wrong way around, though, as you want to delete the records and then write the data In-DB...?

 

Image 001 - 20160122 - 170128.png

 

Kane

jeeva_ganesan
8 - Asteroid

OK, thanks all for your explanations. Let me explain the actual situation I am trying to handle.

 

I have a table that holds data for the past 12 months. Every month I receive a moving 12-month window of data: for example, first JAN-DEC, the next month FEB-JAN, then MAR-FEB, and so on. The most important thing here is that I don't have any key in this data. So the first time, I load the whole dataset into my main table. The next month, I load the newly received data into a staging table and use In-DB tools to compare the main and staging tables to find all the months that have changes in the overlapping period. I then load only the data for the changed months from the staging table into the main table, and delete the existing data for those months from the main table. The problem I face when using In-DB for this is that after I find the months to delete and reload, I am not able to delete those months using the Write Data In-DB tool: it just deletes the whole table, with no option to specify a WHERE condition.
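The delete being described would look something like this (table and column names are hypothetical):

```sql
-- Hypothetical names: remove only the months identified as changed,
-- where changed_months is the result of the In-DB comparison
DELETE FROM main_table
WHERE month_flag IN (SELECT DISTINCT month_flag FROM changed_months);
```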

 

Challenges: I don't want to delete all the data from the main table every time, I don't want to create a new table every time, and I don't want to stream the whole dataset out of In-DB. (All of these constraints exist because the amount of data is huge.)

 

Also, when I try to use a flag, I am not sure how I can update the flag in the table based on the months I found. I also don't know how I can delete based on a flag In-DB, and if I do the delete outside of In-DB, how does the Output tool with the post-SQL know that the data load has completed?

 

Let me know if any clarification is required.

KaneG
Alteryx Alumni (Retired)

In answer to 

| And when I try to use any flag, I am not sure how I can update the flag in table based on the months I found out.

 

The Filter/Formula combo in my last post should do this.

 

| Also, I dont know how I can delete based on flag in In-DB

 

I don't think that you can.

 

So, I would say your method goes something like this:

  1. Stream your new data into a temp table
  2. Use In-DB tools to compare the new data to the old data
  3. Flag data to be removed because it is being replaced
  4. Merge the new data into the table
  5. Flag old data to be removed because it is too old
  6. Stream 1 record out of the DB and into a Block Until Done tool (like the screenshot above)
  7. The first output of the Block Until Done can go to anything (a Browse, for example)
  8. The second output of the Block Until Done goes to your Output tool with the Post SQL
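The Post SQL in step 8 could then be a single flag-driven delete; a sketch with hypothetical table and column names:

```sql
-- Hypothetical: remove the rows the workflow flagged for deletion
DELETE FROM main_table
WHERE delete_flag = 'D';
```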

 

Kane

s_pichaipillai
12 - Quasar

Jeeva

 

Since you don't have key columns, finding the rows that changed among the existing rows in the main table is a little tricky.

I would suggest a small change to your table: add a hash column using HASHBYTES (here I used the SHA2_256 algorithm).

I created a sample table with the hash key as a computed column, like below.

 TableHash.PNG

 

 

[HashKey] AS (CONVERT([bigint], HASHBYTES('SHA2_256',
          [MonthFlag]
  + '|' + [CarrierTrackingNumber]
  + '|' + CONVERT([varchar](255), ISNULL([OrderQty], 0))
  + '|' + CONVERT([varchar](255), ISNULL([ProductID], 0))
  + '|' + CONVERT([varchar](255), ISNULL([SpecialOfferID], 0))
  + '|' + CONVERT([varchar](255), ISNULL([UnitPrice], 0))))) PERSISTED
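For context, a minimal sketch of a table built this way might look as follows; the column types here are assumptions for illustration, not taken from the thread:

```sql
-- Sketch only: a table with a persisted hash key computed from all columns
CREATE TABLE [OrderDetail] (
    [MonthFlag]             VARCHAR(10)   NOT NULL,
    [CarrierTrackingNumber] VARCHAR(25)   NOT NULL,
    [OrderQty]              INT           NULL,
    [ProductID]             INT           NULL,
    [SpecialOfferID]        INT           NULL,
    [UnitPrice]             DECIMAL(19,4) NULL,
    [HashKey] AS (CONVERT(BIGINT, HASHBYTES('SHA2_256',
              [MonthFlag]
      + '|' + [CarrierTrackingNumber]
      + '|' + CONVERT(VARCHAR(255), ISNULL([OrderQty], 0))
      + '|' + CONVERT(VARCHAR(255), ISNULL([ProductID], 0))
      + '|' + CONVERT(VARCHAR(255), ISNULL([SpecialOfferID], 0))
      + '|' + CONVERT(VARCHAR(255), ISNULL([UnitPrice], 0))))) PERSISTED
);
```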

 

 

 

Then connect Data In-DB for the main table, connect the source file that contains the moving data, and join them (make sure you do a full outer join, on all columns).

This gives you a comparison of the data in the source and the main table.

Now flag the rows as new or existing. Here I flagged new rows that do not exist in the main table as "I", and existing rows that need to be deleted as "D".

 

Finally, load them into your staging table using Write Data In-DB. I selected "Delete Data and Append" since this is a staging table.

 

One issue here: a computed column cannot be used while writing In-DB, so when you design your staging table, derive the HashKey in the final SQL script instead.

 

Finally, I created a stored procedure that deletes only the changed rows and inserts all the new rows (an upsert), and executed the SP using SQLCMD.

The proc does the delete and insert in a single MERGE statement:

CREATE PROC usp_OrderDetail_UPSERT
AS
MERGE [OrderDetail] AS TARGET
USING (SELECT [MonthFlag],
              [CarrierTrackingNumber],
              [OrderQty],
              [ProductID],
              [SpecialOfferID],
              [UnitPrice],
              [Flag],
              -- Derive the hash key for the staging rows; the numeric columns
              -- must be converted to varchar before concatenation, matching
              -- the computed column on the main table
              [HashKey] = CONVERT(BIGINT, HASHBYTES('SHA2_256',
                        [MonthFlag]
                + '|' + [CarrierTrackingNumber]
                + '|' + CONVERT(VARCHAR(255), ISNULL([OrderQty], 0))
                + '|' + CONVERT(VARCHAR(255), ISNULL([ProductID], 0))
                + '|' + CONVERT(VARCHAR(255), ISNULL([SpecialOfferID], 0))
                + '|' + CONVERT(VARCHAR(255), ISNULL([UnitPrice], 0))))
       FROM [OrderDetail_Stg]) AS SOURCE
   ON SOURCE.[HashKey] = TARGET.[HashKey]
WHEN MATCHED AND SOURCE.[Flag] = 'D' THEN
  DELETE
WHEN NOT MATCHED AND SOURCE.[Flag] = 'I' THEN
  INSERT ([MonthFlag],
          [CarrierTrackingNumber],
          [OrderQty],
          [ProductID],
          [SpecialOfferID],
          [UnitPrice])
  VALUES (SOURCE.[MonthFlag],
          SOURCE.[CarrierTrackingNumber],
          SOURCE.[OrderQty],
          SOURCE.[ProductID],
          SOURCE.[SpecialOfferID],
          SOURCE.[UnitPrice]);
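The stored procedure can then be invoked from an Alteryx event after a successful run; with SQLCMD the command would be along the lines of sqlcmd -S &lt;server&gt; -d &lt;database&gt; -Q "EXEC usp_OrderDetail_UPSERT;" (server and database names are placeholders):

```sql
-- Executed after the staging load completes, e.g. via an Alteryx "after run" event
EXEC usp_OrderDetail_UPSERT;
```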

Here is what the actual workflow looks like:

Jeeva_InDB.PNG

Call SP.PNG

Here are the steps for how this method actually works:

 

  1. Compare the target and source using a full outer join and flag the new and old rows
  2. Filter only the "D" and "I" flagged rows into the staging table, i.e. the rows to delete and insert
  3. Using MERGE, delete only the rows that changed in the target, then load all the new rows. This way you don't need to delete all the data from the main table
  4. Delete is not a direct option here, because I think the In-DB tools are intended for data retrieval performance
  5. So call the proc using an Alteryx event after a successful run

 

Your challenges addressed here:

  1. "I don't want to delete the whole data every time from the main table": this method does not delete all the data, only the data that has changed
  2. "I don't want to create a new table every time": it reuses the existing staging table, deleting and appending the required data each time
  3. "I don't want to stream the whole data out of In-DB": this does not stream the data out; everything is done In-DB except the stored procedure, which has to be executed via an event. I am sure MERGE does a good job on a huge volume of data

Finally, most DB folks are looking for an SQL tool to call SQL code from Alteryx, so please vote for Chris's idea:

 

http://community.alteryx.com/t5/Alteryx-Product-Ideas/Have-an-SQL-Tool/idi-p/5392

 

I also raised an idea for executing pieces of logic in an Alteryx workflow in sequence, so that we can load the data into staging, do some data cleaning using SQL, then load it into the target. It fits your scenario as well; vote for the one below too if you find it useful:

 

http://community.alteryx.com/t5/Alteryx-Product-Ideas/Control-Order-of-Execution-of-Workflow-Objects...

 

Hope this helps, and let us know if you still need anything.

Happy Data Blending :)

jeeva_ganesan
8 - Asteroid

Thanks, everyone, for your time. It's really useful to see different solutions to a particular problem. I think I have what I need; I am going to try both what Kane and Pichaipillai suggested and see which one runs faster, since I have to do this for millions of records. Happy day, and thanks for the help.

Mychele
7 - Meteor

Thanks!  This is a simple workaround.


| However the other option is to use a formula tool to change one of the fields to a Delete flag, then use that flag in an output tool Post sql delete statement, similar to the response (#7) from @MarqueeCrew...

 

Kane


 
