

Extracting series of integers from a statement

jerry239
8 - Asteroid

Hello

 

I am trying to build logic that extracts a series of integers from a statement. The series is at least 4 digits long, with no maximum length, and it can appear anywhere in the statement. For example:

1. This is a random statement 1234

2. This is 12345 a random statement

3. This is a random 123456 statement.

I want my output as:

1. 1234

2. 12345

3. 123456

 

Can someone please help me in building this logic?

5 REPLIES
ShankerV
17 - Castor

Hi @jerry239 

 

One way of doing this is with the RegEx tool.

 

Screenshot 2024-05-03 143844.png

 

Many thanks

Shanker V

jerry239
8 - Asteroid

Hello @ShankerV 

 

Thank you for looking into this. Just one question: will it take into account that the extracted field must be at least 4 digits long? There is a chance that numbers appear in two places, like below:

 

there are 2 numbers in this 1234 sample statement.

Output should be:

1234

Matt_D
10 - Fireball

Hi @jerry239, you could use (\d{4,}) instead of (\d+).

 

I'd make sure you cannot have something like "there are 2254 numbers in this 1234 sample statement.", as the change will then take the first number, 2254, not 1234. Could this happen?

 

REGEX is fantastic but it still relies on a consistent pattern.
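
To make the difference between (\d+) and (\d{4,}) concrete, here is a minimal sketch in Python rather than in the Alteryx RegEx tool. It assumes Python's re module treats these two patterns the same way the tool's regex engine does, and the helper name first_run is made up for illustration.

```python
import re

def first_run(text, pattern):
    """Return the first substring of text matching pattern, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None

one_long = "there are 2 numbers in this 1234 sample statement."
two_long = "there are 2254 numbers in this 1234 sample statement."

print(first_run(one_long, r"\d+"))     # 2    (any run of digits qualifies)
print(first_run(one_long, r"\d{4,}"))  # 1234 (runs shorter than 4 are skipped)
print(first_run(two_long, r"\d{4,}"))  # 2254 (the first qualifying run wins)

# Listing every qualifying run shows the ambiguity in the second statement.
print(re.findall(r"\d{4,}", two_long))  # ['2254', '1234']
```

If a statement can hold more than one run of 4 or more digits, some extra rule (position, surrounding words) is needed to pick the right one.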

aatalai
15 - Aurora

@jerry239 this might work using the Data Cleansing tool, and it is slightly simpler than regex.
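
For comparison, here is a rough Python sketch of what a cleansing-style approach does. It assumes the tool is configured to strip everything that is not a digit (the actual configuration was only shown in an attachment), and strip_non_digits is a hypothetical helper, not an Alteryx function.

```python
import re

def strip_non_digits(text):
    """Drop every character that is not a digit 0-9."""
    return re.sub(r"\D", "", text)

print(strip_non_digits("This is 12345 a random statement"))
# 12345

# With a second, shorter number in the statement, the digits run together.
print(strip_non_digits("there are 2 numbers in this 1234 sample statement."))
# 21234
```

Under that assumption, this works for the three original examples but not for statements that contain another number, since there is no minimum-length check.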

RobertOdera
13 - Pulsar

Please consider the below @jerry239 

 

Option 1 = using the Data Cleansing tool

Option 2 = using Regex \b\d{4,}\b

 

\b asserts a word boundary to ensure that we match whole numbers.
\d{4,} matches a digit (0-9) occurring 4 or more times.
\b asserts another word boundary to ensure that we match whole numbers.
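
The effect of the two \b anchors can be seen in a small Python sketch. It assumes Python's re module handles \b and \d the same way the Alteryx RegEx tool does for these inputs; the sample strings are invented for illustration.

```python
import re

BOUNDED = r"\b\d{4,}\b"   # the pattern from Option 2
UNBOUNDED = r"\d{4,}"     # same pattern without word boundaries

def find_numbers(text, pattern):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

sample = "there are 2 numbers in this 1234 sample statement."
print(find_numbers(sample, BOUNDED))    # ['1234']

# Digits glued to letters are not whole words, so \b rejects them.
glued = "order AB12345 ref 6789"
print(find_numbers(glued, UNBOUNDED))   # ['12345', '6789']
print(find_numbers(glued, BOUNDED))     # ['6789']
```

Whether digits attached to letters should count is a data question; drop the \b anchors if they should.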

 

Please like or mark as a solution if it works for you. Cheers!

 

For_jerry239_1.png

 

For_jerry239_2.png

 

For_jerry239_3.png
