Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

R code non parsed well

florini
6 - Meteoroid

Hi, please see attached macro. 

 

Practically what I want to achieve is to read dates in one table and compare them to intervals in the other table. Make a join based on this. 

I found the solution in R here: https://stackoverflow.com/questions/41132081/find-which-interval-row-in-a-data-frame-that-each-eleme...

I works well in R studio but not in Alteryx. 

 


Error: R (7): Error in grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): invalid regular expression '(^~)|(^.x %[]% data.frame(intervals[, c("start", "end")]))'
Info: R (7): Calls: %>% -> eval -> eval -> map -> map.poly -> grep
Info: R (7): In addition: Warning message:
Info: R (7): In grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): PCRE pattern compilation error
Info: R (7): 'unmatched parentheses'
Info: R (7): at ')'
Error: R (7): Execution halted

Any idea how to interpret/fix this?

Thank you,

F

 

 

 

12 REPLIES 12
florini
6 - Meteoroid

I use 3.3.2 in Alteryx, 3.4 in R studio.... but I am sure this is not the cause... any other idea?

 

Philip
12 - Quasar

@florini, if you can't get the R code to work, here is an alternative Alteryx workflow that splits the data into grouping levels. I ran it against your stated 3.8 millions dates and 30,000 intervals and it ran in less than 3 minutes on my laptop (including creating the data sets). You could optimize the indexing levels to improve performance a bit more, and add in an additional level(s) with smaller tiles groups to improve performance more.

 

Find range matches 2 levels.png

florini
6 - Meteoroid

Thanks guys, at the end I took a different R code approach, I used foverlaps, see below the code. Hope this helps someone one day. 

 

I compared this to 'Alteryx way' of adding one line for each day of the period and... Alteryx way was 3x faster in my case... I assume if I grow the number of intervals and their span, the process might reverse...

 

 

#require(data.table)
library(data.table)

access = read.Alteryx("#1", mode="data.frame")
suborders = read.Alteryx("#2", mode="data.frame")

access <-na.omit(access, cols="accesstime_out")
suborders <-na.omit(suborders, cold="startDate", "endDate")

access$accesstime_out2 <- access$accesstime_out

access$accesstime_out <- as.POSIXct(access$accesstime_out)
access$accesstime_out2 <- as.POSIXct(access$accesstime_out2)
suborders$startDate <- as.POSIXct(suborders$startDate)
suborders$endDate <- as.POSIXct(suborders$endDate)

x = data.table(access)
y= data.table(suborders)

setkey(y, cid, startDate, endDate)
w <- foverlaps(x, y, by.x=c("custid", "accesstime_out", "accesstime_out2"),
type="within", mult = "all")

df <- data.frame(w)
df[] <- lapply(df, as.character)


write.Alteryx(df)

 

 

 

 

 

 

 

 

 

 

Labels