Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Data Manipulation - break list of known fields by2byte chunks, convert, concatenate field

sjgittins
5 - Atom

Hi all! First post.  I'm relatively new to Alteryx and have found most solutions to my issues reading thru this very helpful community so I have a unique (unable to find anything remotely close) issue that I need help with:

 

Have hundreds of thousands of rows of data with basic information and a large single cell of data that contains many parameters.   Once this large field is broken up into the corresponding 50+ parameters into their own new column, these need to be broken up into 2 byte chunks since the data in each column is a 2d array with different lengths (happen to be even numbers).  Each cell of the array needs to be converted from HEX to Decimal, and concatenated back into a single cell, as a 2d array.

 

The current path takes a single RegEx and splits them into columns based off the location of bytes in the large field.  Then the tedious part begins, I utilize a REgEx tokenize step with (.{2}) as the regex with split to rows, followed by a multi field formula to convert each field to decimal, and then a summary tool to concatenate back into a single cell.   (see attached workflow)

 

Currently, I have to maintain  50+ join blocks, 50+ regex tokenize blocks, 50+ multifield formula blocks, and 50+ summarize blocks which is a nightmare of spaghetti lines.  I spent days tediously joining all 50 of these toolbox grouped steps and finally have it working but at immense frustration and stressful tedious work.

 

Is there a way to implement something like multifield formula to this process?  My current understanding of the multi field formula would not be able to perform the tokenize, convert, and concatenate for a list of signals. 

 

It feels like I'm not going to be able to avoid having to perform each individual calculation with a unique block, and can't lump them all together.  Any hints?

 

Is this something outside the scope of standard blocks, and requires developer integration (like python?)  New to all things alteryx, so any suggestions are welcome.  

 

Thanks greatly in advance!

 

Scott

 

 

 

7 REPLIES 7
apathetichell
18 - Pollux

This transposes all of the elements and runs the same process for each... Does it work? It matched to your sample output for #4 - and for 10.

mpennington
11 - Bolide

I'm not sure if this works or not, but I'm betting I had the same idea as @apathetichell --I haven't checked out the workflow that was created.  

 

2021-04-18_22-45-52.jpg

apathetichell
18 - Pollux

@mpenningtonGreat workflow! one quick fix -  my joins were initially off because I used "(", and")" - you used "{" and "}" -  @sjgittins uses "[" and "]"

mpennington
11 - Bolide

@apathetichell Likewise. Good catch, I thought I should've looked a bit closer to the join bracket. Characters are awful small even a 24" monitor. 

sjgittins
5 - Atom

2 great answers before I even woke up this morning.  Super impressive.  I brought it into my workflow and re-mapped the signal names.  The only thing I needed to add at the end was a cross tab block to bring all the items back into 1 row of data with many columns.  I can't believe how simple it looks.  I ran the full data load thru this with 100k + rows of data and it finished in a record 18 minutes! Cheers!!

 

Lastly, I am still trying to understand how the heck you were able to do this.  Here's my take, and PLEASE correct or add details so I can learn from this and not just take an answer and be blind dumb and happy.

 

  1. The first transpose takes all the elements (aka parameter/name) and puts them into a column.
  2. The record ID (which I'm not very familiar with) creates a new column with unique values for each row in a single column called RecordID.
  3. The select block selects all the items - is this good practice in case I didn't need all of the items, or does this serve some other purpose?
  4. The RegEx block parses each value for each element (aka parameter/name) by row and creates an even longer column of data
  5. Multi-Field Formula self explanatory, just converts the hex values in the column "Value" into decimal
  6. The Summarize tool concatenates each individual 2d element back into the original table by using group by "element / name"     
  7. me - added a cross tab block to regroup the items into a single row
sjgittins
5 - Atom

Thank you both, this works now and I was able to happily delete almost 200 blocks with spaghetti lines going EVERYWHERE.  Cheers! 

mpennington
11 - Bolide

@sjgittins  Love that you are wanting to learn more about how it works.  I think your understanding is pretty good, although for some reason I couldn't see your image. Your bullet points seemed correct, I'll add my thoughts below:

 

  1. The first transpose takes all the elements (aka parameter/name) and puts them into a column .  Yes, you frequently need to work with Transpose and Cross Tab to try to solve many data problems. You often then end up using the opposite tool at the end (as you do with your Cross Tab reference in 7 below)
  2. The record ID (which I'm not very familiar with) creates a new column with unique values for each row in a single column called RecordID. This tool is really invaluable when approaching this type of problem. You typically need to identify elements and this a quick way to do so. So if you are parsing words into letters and you are going to need to get back to a word summary at some point Record ID. Unless it already exists in your data set, if you are doing any parsing or splitting of rows or columns, it is a safe bet that you need this tool. 
  3. The select block selects all the items - is this good practice in case I didn't need all of the items, or does this serve some other purpose? Good practice if you don't need columns or if you need to resize or rename elements. If you are dealing with large data, your workflow can improve speed and efficiency, if you don't pass items you don't need through the workflow. 
  4. The RegEx block parses each value for each element (aka parameter/name) by row and creates an even longer column of data  Correct, this is probably my favorite tool, but it does take some time to figure out the nuance.
  5. Multi-Field Formula self explanatory, just converts the hex values in the column "Value" into decimal  Correct, this and the Multi-Row tool are great for batch applications
  6. The Summarize tool concatenates each individual 2d element back into the original table by using group by "element / name"  Correct, this tool is quite powerful. It can work with strings, spatial objects, in addition to more standard data types.    
  7. me - added a cross tab block to regroup the items into a single row.  Commonly needed approach.  
Labels