SOLVED

Regex to extract Text and Number into different columns

Pranab_C
8 - Asteroid

Hi Champions, 

 

I need help extracting text and numbers into separate columns from the data below.

 

What I have is this:-

 

" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"

"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"

" Missouri 70 0 (US-MO)"

" Utah (US- 0 0 UT) Total 217 0"

 

What I need is each name in one column and its number in another, e.g. Arizona in one column and 0 in the other:

 

Arizona-0, Colorado-0

Qatar-0, United Arab Emirates-147

Missouri-70

Utah-0, Total-217
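
For readers who want to prototype the requested output outside Alteryx, here is a minimal Python sketch. It is not the method used in the replies. It assumes that any token starting with "(" or ending with ")" is a location-code fragment that can be dropped, that each name is followed by two integers, and that only the first integer is wanted. The names `CODE_FRAGMENT`, `NAME_AND_NUMBER` and `extract_pairs` are made up for illustration.

```python
import re

# Assumption: tokens that start with "(" or end with ")" are location-code
# fragments such as "(US-AZ)", "(QA)", "(US-" or "UT)" and can be discarded.
CODE_FRAGMENT = re.compile(r'\(\S*|\S*\)')

# A name (one or more space-separated words), the first number (captured),
# then a second number (matched but not captured).
NAME_AND_NUMBER = re.compile(r'([A-Za-z]+(?: [A-Za-z]+)*) (\d+) \d+')


def extract_pairs(line):
    """Return a list of (name, first_number) pairs found in one row."""
    no_quotes = line.replace('"', ' ')
    no_codes = CODE_FRAGMENT.sub(' ', no_quotes)
    cleaned = ' '.join(no_codes.split())  # collapse runs of whitespace
    return [(name, int(number))
            for name, number in NAME_AND_NUMBER.findall(cleaned)]


samples = [
    '" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"',
    '"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"',
    '" Missouri 70 0 (US-MO)"',
    '" Utah (US- 0 0 UT) Total 217 0"',
]

for sample in samples:
    print(extract_pairs(sample))
# [('Arizona', 0), ('Colorado', 0)]
# [('Qatar', 0), ('United Arab Emirates', 147)]
# [('Missouri', 70)]
# [('Utah', 0), ('Total', 217)]
```

The sketch only handles whole numbers, as in the four sample rows. Values with decimals would need the `\d+` parts widened.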

 

flying008
14 - Magnetar

Hi, @Pranab_C 

 

Because the first line of your string has no leading space, but your sample data does, you only need to change the parse expression from ^["\s]+([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))? to ^["\s]?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))? , and then it is all done.

Tip: only change the first + to ? .

 

^["\s]*?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?
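
To try this expression against the sample rows outside Alteryx, a small Python harness can be sketched as below. Python's `re` module does not support POSIX classes such as `[[:alpha:]]`, so the sketch substitutes `[A-Za-z]`. That substitution is an assumption, as are the names `POSTED` and `parse`.

```python
import re

# The expression above, with [[:alpha:]] replaced by [A-Za-z] because
# Python's re module has no POSIX character classes (an assumption of
# this sketch, not part of the original expression).
POSTED = re.compile(
    r'^["\s]*?([A-Za-z]+)\s+(\d+)\s[\d\s]+(?:([A-Za-z\s]+)\s+(\d+))?'
)


def parse(line):
    """Return the four capture groups for one row, or None if no match."""
    match = POSTED.search(line)
    return match.groups() if match else None


samples = [
    '" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"',
    '"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"',
    '" Missouri 70 0 (US-MO)"',
    '" Utah (US- 0 0 UT) Total 217 0"',
]

for sample in samples:
    print(parse(sample))
```

With this translation, only the Arizona and Missouri rows match, and only their first name and number are captured. The Qatar and Utah rows return no match, because a parenthesised code sits between the name and its first number.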

 


 

******

If this helps you get what you want, please mark it as a solution and give it a like so more people can find it.

Pranab_C
8 - Asteroid

It should work for each row. I understand the data is not standard, but this is how it looks in most cases. Please see the attached PDF file, which shows how the data would look in 99.9% of cases. Your help in getting this extracted would be greatly appreciated.

flying008
14 - Magnetar

Hi, @Pranab_C 

 

Maybe you can find a readpdf or readword macro in the Gallery.

 


 

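Table_No | RowID | Location | Workdays (W) | COVID-19 Workdays (CW) | Non-Workdays (O) | COVID-19 Non-Workdays (CO) | Not Specified | Total
1 | 1 | Oman (OM) | 0 | 0 | 1 | 0 | 0 | 1
1 | 2 | Qatar (QA) | 0 | 0 | 2 | 0 | 0 | 2
1 | 3 | United Arab Emirates (AE) | 147 | 0 | 66.5 | 0 | 0 | 213.5
1 | 4 | United States Arizona (US-AZ) | 0 | 0 | 8 | 0 | 0 | 8
1 | 5 | United States Colorado (US-CO) | 0 | 0 | 8 | 0 | 0 | 8
1 | 6 | United States Missouri (US-MO) | 70 | 0 | 47 | 0 | 0 | 117
1 | 7 | United States Utah (USUT) | 0 | 0 | 15.5 | 0 | 0 | 15.5
1 | 8 | Total | 217 | 0 | 148 | 0 | 0 | 365
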
Pranab_C
8 - Asteroid

Thank you, but the issue is that this workflow would be run in the Gallery, and the directory would not work in that environment. We are currently using R code to extract the PDF, and it is working fine for the entire PDF except this page. Any suggestions or help would be much appreciated.
