cancel
Showing results for 
Search instead for 
Did you mean: 

Using the select tool to isolate leading number in a string

SOLVED
Highlighted
CB1
Atom

Hi,

 

I have a some strings with numbers at the start, e.g.

 

35.68]/dsfo48,.fdhgk

45joidfo8.980

 

And i'm trying to just extract the leading numbers, so the result for the above would look like;

 

35.68

45

 

I believe 1 way of doing this would be to use the RegEx tool, but it seems like I can achieve the same output by just changing the data type of the field from string to double using the select tool.

 

I was just wondering whether this is a reliable way of performing this operation? Or whether the Alteryx implementation of the data conversion could throw up issues in some cases?

 

Also, is there a repository where things like this are documented? I did try googling, but it seems there's only really documentation of how functions/tools operate on a higher level, not so much the "nuts and bolts" of the implementation.

 

Thanks

 

 

Alteryx Certified Partner
Alteryx Certified Partner

@CB1,

 

Try using a Regular Expression like this:

 

Regex_Replace([String],"(\d+\.{0,1}\d+)\D.*",'$1')

Create a GROUP (indicated by the data enclosed in parenthesis)

That GROUP starts with 1 or more numbers (\d) and is immediately followed by a period (\.) that can occur 0 or 1 times ({0,1}) followed by 1 or more numbers.

This GROUP is followed by 0 or more Non-Numbers (\D*).  That group is identified as GROUP #1.

 

Replace (or output) the data with the string found in GROUP #1 ($1).

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

CB1
Atom

Thanks for the reply Mark.

 

I appreciate that this can be done using RegEx, and I understand its implementation, but my question is more focused on whether the select tool is an acceptable and reliable solution.

 

Given that it's slightly simpler and quicker, my preference would be to use the select tool, which is why I'm asking.

 

 

BenMoss
Alteryx Certified Partner

It does appear to work, I wouldn't say this is a reliable solution, predominantly because of the rules that you would place on your incoming data.

 

What if the number you want isn't always the leading value in the string.

 

What if their are no letters separating the two numeric parts of the string.

 

If you are sure that you can apply the rule 'Alteryx will take only the first numbers of the string and drop everything else', then I guess it's fine.

 

Personally i'd go with regex even if I knew this was the case. You know it works. Have you actually measured the performance difference when using the two methods?

 

Ben

CB1
Atom

Thanks Ben.

 

I'd processed the data prior to this point, so it was in a position where i just wanted the first number and then everything else dropped, but i appreciate what you're saying.

 

Apologies, when i said "simpler and quicker", I should have just said simpler - I haven't actually done a performance comparison. I'm still just getting to grips with Alteryx, but will compare the 2 methods when I start working with enough data where runtime becomes an issue. 

 

I guess in terms of being sure about the actual behavior of the implementation, that's something that's documented anywhere?

BenMoss
Alteryx Certified Partner
Not sure on the documentation side of things.

I think regex provides you with the simplest solution considering you know it will work 👍🏼
JORGE4900
Asteroid

@BenMoss That is a pretty elegant solution; the only thing that I would add to the regex would be the caret symbol to specify that those rules must be enforced at the beginning of the line.

 

^(\d+\.{0,1}\d+)\D.*

CB1
Atom

@BenMoss

 

Given that even in this thread someone has been corrected on the best way to use RegEx, and given my lack of experience with it, I trust the select tool more than I do RegEx/my use of RegEx.

 

But yes, I get your point, why bother with asking this when RegEx can do the job; and that'll be the solution I use when I have more experience with it.

 

Thanks :)