Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

R code non parsed well

florini
6 - Meteoroid

Hi, please see attached macro. 

 

Practically what I want to achieve is to read dates in one table and compare them to intervals in the other table. Make a join based on this. 

I found the solution in R here: https://stackoverflow.com/questions/41132081/find-which-interval-row-in-a-data-frame-that-each-eleme...

I works well in R studio but not in Alteryx. 

 


Error: R (7): Error in grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): invalid regular expression '(^~)|(^.x %[]% data.frame(intervals[, c("start", "end")]))'
Info: R (7): Calls: %>% -> eval -> eval -> map -> map.poly -> grep
Info: R (7): In addition: Warning message:
Info: R (7): In grep(regexp, the.map$names, ignore.case = TRUE, perl = (length(regions) < :
Info: R (7): PCRE pattern compilation error
Info: R (7): 'unmatched parentheses'
Info: R (7): at ')'
Error: R (7): Execution halted

Any idea how to interpret/fix this?

Thank you,

F

 

 

 

12 REPLIES 12
florini
6 - Meteoroid

I use 3.3.2 in Alteryx, 3.4 in R studio.... but I am sure this is not the cause... any other idea?

 

Philip
12 - Quasar

@florini, if you can't get the R code to work, here is an alternative Alteryx workflow that splits the data into grouping levels. I ran it against your stated 3.8 millions dates and 30,000 intervals and it ran in less than 3 minutes on my laptop (including creating the data sets). You could optimize the indexing levels to improve performance a bit more, and add in an additional level(s) with smaller tiles groups to improve performance more.

 

Find range matches 2 levels.png

florini
6 - Meteoroid

Thanks guys, at the end I took a different R code approach, I used foverlaps, see below the code. Hope this helps someone one day. 

 

I compared this to 'Alteryx way' of adding one line for each day of the period and... Alteryx way was 3x faster in my case... I assume if I grow the number of intervals and their span, the process might reverse...

 

 

#require(data.table)
library(data.table)

access = read.Alteryx("#1", mode="data.frame")
suborders = read.Alteryx("#2", mode="data.frame")

access <-na.omit(access, cols="accesstime_out")
suborders <-na.omit(suborders, cold="startDate", "endDate")

access$accesstime_out2 <- access$accesstime_out

access$accesstime_out <- as.POSIXct(access$accesstime_out)
access$accesstime_out2 <- as.POSIXct(access$accesstime_out2)
suborders$startDate <- as.POSIXct(suborders$startDate)
suborders$endDate <- as.POSIXct(suborders$endDate)

x = data.table(access)
y= data.table(suborders)

setkey(y, cid, startDate, endDate)
w <- foverlaps(x, y, by.x=c("custid", "accesstime_out", "accesstime_out2"),
type="within", mult = "all")

df <- data.frame(w)
df[] <- lapply(df, as.character)


write.Alteryx(df)

 

 

 

 

 

 

 

 

 

 

Labels