I have a dataset with lat/lon information. My problem is that I have multiple duplicate entries and would like to interpolate between them.
For instance say I have 1000 unique GPS points, but 10,000 recorded data points (associated with other unique information that updated faster than GPS when collecting).
It's not always a fixed amount of points between unique GPS points so it has to be somewhat of a dynamic solution. I can probably write a C# or python script to handle it and run it externally from the run command, but that's a pretty ugly solution, is there a way to handle it within alteryx?
I was thinking multi-line formula, but the fact that there are different numbers of duplicates in a row is throwing me off.
Solved! Go to Solution.
Can you please provide some sample or dummy data to illustrate your starting data and expected result? That will help us visualize your question better! Thanks!
NJ
I've attached a file. So every row has unique data, but GPS didn't update fast enough so some rows have old duplicate GPS locations. If possible I'd like to interpolate between them.
For example:
in the attached CSV rows 14,15,16 have duplicate GPS points. I'd like to interpolate 15 and 16 using row 14 and 17
Hi @StephenMack
By interpolate, do you mean take the average of the values found at duplicate point locations?
Assuming the answer to @Philip question is yes, i.e. straight-line interpolation, then I believe the following will work. Although there is very likely a much faster/cleaner way to get there (this felt like a lot of tools to get to a fairly simple result)... it should work.
1. Add RecordID
2. Find Unique Lat/Lon combos
3. Use Multi-Row Formulas to find the difference between each Lat/Lon Combo, then divide by the number of rows between them, i.e. if there is a difference of 0.2 between RecordID 1 and RecordID 3, then the incremental amount that will be added to RecordID 2 when the data is brought back together should be 0.1, or 0.2/(3-1).
4. Find & Replace tool to join the NewLat & NewLon incremental amounts back to the original data by RecordID
5. Sort by Descending RecordID
6. If NewLat/NewLon is null, then use the row above
7. Sort by Ascending RecordID
8. Tile by Unique Value to find groupings by Lat/Lon combo
9. Formula: If the Tile Sequence = 1 then leave Lat/Lon as is, otherwise add the NewLat/NewLon to determine the interpolated location
Let me know if this gets you the data you need, or if additional tweaking/simplification is needed?
Cheers!
NJ
That looks wonderful thank you!
I was having trouble working out the correct way to use the multi-line formula, but I think how you added the recordID and separated out unique values to put it back together later is the right way to approach it.
Thank you!!
What if there are different numbers of rows. For example, what if you have two null value in location, three in another, and 1 in another? In excel, you'd hit F4 to set the second value and row ID but I'm not sure how you would do that in Alteryx.
I'm not totally sure what you mean.
You're saying if you have:
location
1
1
1
2
3
3
4
4
4
4
4
4
5
6
Are you asking how would you interpolate the different number of redundant rows between each unique location? So each individual interpolation could be different lengths? I believe the solution above handles that.
I think I understand, and maybe you can force your problem to fit into what I have in this post.
Can you force all the nulls to equal the last known value? Maybe on a multiline formula you can say
if [Value]=Null then [Row-1:Value] else [Value] endif
That way instead of having something like
1
null
null
8
9
null
11
you'll get
1
1
1
8
9
9
11
and then using the solution in this post all the non-unique sections will be interpolated