This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
It's the most wonderful time of the year - Santalytics 2020 is here! This year, Santa's workshop needs the help of the Alteryx Community to help get back on track, so head over to the Group Hub for all the info to get started!
I am considering the best / most efficient way to turn Null values of multiple rows into 0
I have 2 input files of which one of them comes without a value for two columns called "Increase" and "Decrease".
Then I union these files resulting in for example:
Total Increase Decrease
a "Null" "Null"
b 13 12
c 142 123
Now the first row is from the file without a value of increase / decrease and the other rows are coming from the other file.
I am considering several solutions:
1. Data cleansing for Null fields
This does not work as the fields are not yet numeric values. Later in the workflow I do convert the columns to int32 which would allow me to data cleanse these. However this is after a join with another set of data which leads me to think it's not the most efficient way of processing the data. Furthermore I like doing changes as close to the source as possible to avoid any hiccups in between.
2. Multi-row formula with an expression IF IsNull([Increase]) OR IsNull([Decrease]) THEN 0 ELSE [Decrease] ENDIF.
This gives me "Error: Multi-Row Formula (33): The field "" is not contained in the record."
3. Single formulas are easy, but it does not feel good having to use two of them.
4. Changing the columns data type to int32, then using the data cleaning tool. However this would add another action in the workflow.
5. Adding the two column behind the first input file and putting all rows to 0.
I am leaning towards option 5 because it also avoids us not noticing any 'Null' values from the other two input files.
So this leaves me with the following questions:
1. Which solution would be the most efficient?
2. Why does my formula in the second scenario not work?
I just wanted to add into here as you were interested around efficiencies, and thought I'd pipe in 🙂
The data cleanse tool is actually a macro, if you right click, there is an option to open it up. You'll then see it is in fact a stream of multi-row formulas.
So the most efficient was is that of using one multi-row formula. As it will apply just the logic you need, not any "extras" that the data cleanse may be doing.
I'll also add that if you are eventually converting to a Int32, then why not do this earlier? It can take up less space than storing it as string and for me will keep things cleaner to have the field types representing the data values themselves as early as possible.