Hi, please see attached macro.
Practically what I want to achieve is to read dates in one table and compare them to intervals in the other table. Make a join based on this.
I found the solution in R here: https://stackoverflow.com/questions/41132081/find-which-interval-row-in-a-data-frame-that-each-eleme...
I works well in R studio but not in Alteryx.
Error: R (7): Error in grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): invalid regular expression '(^~)|(^.x %[]% data.frame(intervals[, c("start", "end")]))'
Info: R (7): Calls: %>% -> eval -> eval -> map -> map.poly -> grep
Info: R (7): In addition: Warning message:
Info: R (7): In grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): PCRE pattern compilation error
Info: R (7): 'unmatched parentheses'
Info: R (7): at ')'
Error: R (7): Execution halted
Any idea how to interpret/fix this?
Thank you,
F
Solved! Go to Solution.
Thanks Philip, challenge is this only works for small datasets, but I want a script that can be applied to larger data sets...
The native Alteryx way of doing it, using rows generate is also possible but that one also requires millions of records at least... which cannot be usable if the data periods grow a lot...
How large are the data sets?
Hi @florini,
Do you use the same version of R with RStudio and Alteryx? Is it 3.2.3 or 3.3.2?
Looking at your code, why do you need library("maps") [https://cran.r-project.org/web/packages/maps/maps.pdf]?
Thanks
This is the almost correct code:
You need the tidyverse package for the map function.
You need the intrval package for the %[]% function.
It still gives an error I'm tracking down.
Error: R (7): Error in `mostattributes<-`(`*tmp*`, value = attributes(elts[[1L]])) :
Info: R (7): adding class "factor" to an invalid object
Info: R (7): Calls: %>% ... .intrval -> .get_intrval -> pmin -> mostattributes<-
library(tidyverse) library(intrval) elements = read.Alteryx("#2", mode="list") intervals = read.Alteryx("#1", mode="data.frame") phases_for_elements <- map(elements, ~.x %[]% data.frame(intervals[, c('start', 'end')])) %>% map(., ~unlist(intervals[.x, 'oid'])) result <- unlist(phases_for_elements, recursive = TRUE, use.names = TRUE) final <- data.frame(lu,result) write.Alteryx(final)
i am interested how the R function performs versus the Alteryx method. I ran 10 million records through the Alteryx on my i5 laptop in about 3 minutes.
So I have 3.8 million dates in one table to lookup in 30,000 intervals in the second table... and the Alteryx way (all those ways are literally creating fake lines for the comparison) is creating far too many lines ... so I looked for a scritp to treat one line at a time without mass creating more lines...
Right, I installed the packages but did not called them anymore. I un-commented the lines, all worked well... now I get the same error as you do :(
Thanks for this observation...