Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Decoding a Neilsen CSV

Ideoclick36
6 - Meteoroid

Hi, I am still relatively new to Alteryx, and I have a dilemma. 

 

The Neilsen csv data file (very large) I am trying to decode has single rows with a single column of a string of numbers and letters that are assigned a segment of values (see table below).  Since the file is csv I am past the point of using the initial file to segment the data into the respective fields using the flat file method.  For example, for this string: 02000000160216010313930248001330005201010029900000022334POUND 000500000111, there are these parameters in the table below -  where the first number is where the field starts in the string and the second number is where the field stops in the string. 

 

What is the most efficient way in Alteryx to create a database with the separate fields using a csv file with a column of rows of strings?

 

Note the last segment in the first sting has a field "unit of measure" (e.g., POUND) which may vary (eg. OZ), but the space in between the two predefined segments (see example of one string above) allows for a quick change to two columns to decode separately if necessary.

 

That's as far as I've gotten.

 

Help?

 

PANEL ID18
PROCESS PERIOD94
DATE136
MODULE194
NHS OUTLET 234
UPC2712
UNITS392
DOLLARS415
DEAL461
COUPON474
BRAND516
UNIT OF MEASURE576
SIZE637
MULTI703
DEPARTMENT731
TRIP MISSIONS741
6 REPLIES 6
BenMoss
ACE Emeritus
ACE Emeritus

I've approached this problem a few times with some open data and the solution I use is to use the mapping table that you have created to assign each part of the string to a field.

 

I've attached an example workflow using your data.

 

Ben

Ideoclick36
6 - Meteoroid

Thanks, Ben.  This workflow is awesome!

Avinash_K
8 - Asteroid

If you know the string format (i.e ) length of strings you need from a line you could always use Regex to parse the string .

 

Capture.PNG 

Ideoclick36
6 - Meteoroid

Hi, Ben,

 

Back again.

 

I ran the workflow using the Neilsen data in all its glory (some 7M plus rows), and I ended up with only one row of output.

 

How are the additional rows generated?

 

Thanks again,

 

Sharon

Ideoclick36
6 - Meteoroid

Hi, Avinash,

 

Thank you for jumping into the discussion.

 

I tried the workflow with the entire Neilsen file, and I get the attached error message.

 

Any work around you know of?

Avinash_K
8 - Asteroid

The error you got is because of the text input tool .. I used the text input tool in my sample workflow to quickly get the input instead of creating a csv file (my mistake probably) , you can replace the text input tool with an input data tool and choose relevant  CSV file . Attached the workflow with input data tool.

Labels