Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Import into Redshift - DEFAULT data value

Hi,

 

I am trying to use Alteryx as an ETL tool to load data into AWS Redshift.

 

One of the columns in the table is set up with the DEFAULT option,

 

i.e.

datetimeinserted datetime NOT NULL DEFAULT convert_timezone('Australia/Sydney'::text, getdate()),

 

This column is the last column in the table.

 

It does not work when using the Redshift bulk upload tool in Alteryx.

 

The stl_load_errors table in Redshift shows "Delimiter not found".
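For reference, this is roughly the query I'm using to check the load errors (a minimal sketch against the standard stl_load_errors columns; adjust the selection as needed):

-- Pull the most recent load errors; raw_line and err_reason show the
-- offending row and the "Delimiter not found" message.
select starttime, filename, line_number, colname, raw_line, err_reason
from stl_load_errors
order by starttime desc
limit 10;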

 

Any ideas? I've tried passing through a "" (blank) and NULL, but both fail because the column is defined as NOT NULL.

 

The column is designed to represent the true date and time that records were inserted into the table.

 

Using the DATETIMENOW tool in Alteryx is not an option, because there is no way to control when tools execute within a workflow. It could be done by wrapping the tool inside a macro, but even then it is not a true datetimeinserted value.

 

Appreciate the help,                                                                    

Ryan

Alteryx Certified Partner

I was unable to recreate your exact issue, but I did experience something similar.

 

I created the following table in Redshift:

 

CREATE TABLE test_tmp (
first int8 NOT NULL,
second int8 NOT NULL,
third int8 NOT NULL DEFAULT 8
)

 

I used a Text Input tool with two columns and two rows:

 

First | Second
2     | 4
3     | 5

 

Using the Amazon Redshift ODBC connector, I was able to write to the test_tmp table.
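Presumably the ODBC path generates inserts that name only the supplied columns, so Redshift fills the third column from its DEFAULT. Something like this (a sketch of what the write amounts to, not the literal statement Alteryx sends):

-- Column 'third' is omitted from the column list,
-- so its DEFAULT (8) is applied by Redshift.
insert into test_tmp (first, second) values (2, 4), (3, 5);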

 

However, when I switched to the Amazon Redshift bulk loader, I was unable to write to the table and received the following error:

Error: Output Data (7): The number of columns to Append (2) do not match number the number of columns (3) in the table

 

Not the exact same error as you are receiving but still an error.

 

The COPY statement would need to be constructed as follows:

copy test_tmp(first, second) 
from 's3://mybucket/...' 
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter ',';

 

I wonder if the COPY statement is leaving out the explicit field list after the table name.
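In other words, I suspect the bulk loader is issuing something like this (a guess, since the generated statement isn't visible), which forces Redshift to expect a value for every column in the table:

-- Presumed form of the generated COPY: no field list, so all three
-- columns of test_tmp must be present in the staged file.
copy test_tmp
from 's3://mybucket/...'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter ',';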

Thanks for testing in your environment, Michael. You are definitely correct that the COPY command issued by the bulk upload tool is leaving out the explicit field list.

@Alteryx, please change the bulk upload tool to do this, or at least add an option to enable it (I'll whack a post in the Product Ideas forum).

 

The raw_line column in the stl_load_errors table showed something like this:

1|1|2000-01-01 00:00:00|2000-01-01 00:00:00|1|1|1

 

So I created a .txt file with this inside it, put it on S3, and ran the COPY command using the explicit field list as per your example, Michael, as well as setting the delimiter to '|'.

It works.
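For completeness, the manual test looked roughly like this (the table name, column names, and S3 path below are placeholders for my actual ones):

-- Explicit field list leaves out datetimeinserted, so its DEFAULT fires;
-- the delimiter matches the pipe-separated test file.
copy mytable(col1, col2, col3, col4, col5, col6, col7)
from 's3://mybucket/test.txt'
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter '|';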

 

I'll be using the macro below until the bulk upload tool is enhanced:

datetimenow macro.PNG

 

 

I initially tried using the Date Time Now tool, but it executed as soon as the main workflow started, creating a large difference between its timestamp and the actual start time of the bulk upload. The Date Time Now tool does not accept inputs, so it could not be postponed using the Block Until Done tool.

 

The macro will be placed directly before the bulk upload tool so that the DateTimeNow() generated in the macro is as close as possible to the start of the bulk upload to Redshift.

 

bulk upload to redshift.PNG

 

The column in the Redshift table will be renamed from datetimeinserted to bulkupload_starttime, with the DEFAULT option removed.
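Roughly, the revised column ends up like this (a sketch with a placeholder table name and an example id column; the rest of my table is unchanged):

-- No DEFAULT on the timestamp column; the value now comes from the
-- datetimenow macro in the workflow.
create table mytable_new (
    id int8 not null,
    bulkupload_starttime timestamp not null
);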

 

Meteoroid

Hi Guys,

 

I'm late to this conversation, but I would like to know if this problem has been solved, because I'm facing the same issue at one of my clients.

Even though I have the workaround of using the nearest time, I don't want to go with that solution.

 

Thanks
