Alteryx Designer Desktop Discussions

SOLVED

Regex to extract Text and Number into different columns

Pranab_C
8 - Asteroid

Hi Champions,

I need help extracting text and numbers into separate columns from the data below.

What I have is this:

" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"

"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"

" Missouri 70 0 (US-MO)"

" Utah (US- 0 0 UT) Total 217 0"

 

What I need is the name in one column and the first number in another, e.g. Arizona in one column and 0 in another:

Arizona-0, Colorado-0

Qatar-0, United Arab Emirates-147

Missouri-70

Utah-0, Total-217

 

13 REPLIES
flying008
14 - Magnetar

Hi, @Pranab_C 

 

Your first line has no leading space, but the rest of your sample data does, so you only need to change the parse expression from ^["\s]+([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))? to ^["\s]?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))? and it is all done.

Tip: only the first + changes to ?. Writing *? instead, as in the expression below, also allows more than one leading quote or space character.

 

^["\s]*?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?
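To see what the three leading quantifiers do, here is a minimal Python sketch. It is only an illustration, not the Alteryx RegEx tool itself: Python's re module does not support [[:alpha:]], so [A-Za-z] is swapped in as an assumption, and the three test rows are hypothetical, simplified rows without the parenthesised codes.

```python
import re

# Shared tail of the expression from the post. [A-Za-z] stands in for
# [[:alpha:]], which Python's re module does not support (assumption).
TAIL = r'([A-Za-z]+)\s+(\d+)\s[\d\s]+(?:([A-Za-z\s]+)\s+(\d+))?'

# The three variants of the leading part discussed above.
PREFIXES = {
    "+": r'^["\s]+',    # one or more leading quote/space characters
    "?": r'^["\s]?',    # zero or one
    "*?": r'^["\s]*?',  # zero or more (lazy)
}

def parse(row, quantifier):
    """Return the four captured groups, or None if the row does not match."""
    m = re.match(PREFIXES[quantifier] + TAIL, row)
    return m.groups() if m else None

# Hypothetical rows: no prefix, one leading space, quote plus space.
rows = [
    'Arizona 0 0 Colorado 0 0',
    ' Arizona 0 0 Colorado 0 0',
    '" Arizona 0 0 Colorado 0 0',
]

for row in rows:
    for q in PREFIXES:
        print(repr(row), q, parse(row, q))
```

In this sketch the + variant rejects the row with no leading character, the ? variant rejects the row that starts with both a quote and a space, and the *? variant accepts all three.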

 

[Attached recording: 录制_2023_12_22_15_30_20_854.gif]

 

******

If this helps you get what you want, please mark it as a solution and give it a like to encourage more sharing.

Pranab_C
8 - Asteroid

It should be in each row. I understand the data is not standard, but this is how it looks in most cases. Please see the attached PDF file, which shows how the data would look in 99.9% of cases. Your help in getting this extracted would be greatly appreciated.

flying008
14 - Magnetar

Hi, @Pranab_C 

 

Maybe you can find a readpdf or readword macro in the Gallery.

 

[Attached recording: 录制_2023_12_23_09_30_39_999.gif]

 

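Table_No | RowID | Location | Workdays (W) | COVID-19 Workdays (CW) | Non-Workdays (O) | COVID-19 Non-Workdays (CO) | Not Specified | Total
1 | 1 | Oman (OM) | 0 | 0 | 1 | 0 | 0 | 1
1 | 2 | Qatar (QA) | 0 | 0 | 2 | 0 | 0 | 2
1 | 3 | United Arab Emirates (AE) | 147 | 0 | 66.5 | 0 | 0 | 213.5
1 | 4 | United States Arizona (US-AZ) | 0 | 0 | 8 | 0 | 0 | 8
1 | 5 | United States Colorado (US-CO) | 0 | 0 | 8 | 0 | 0 | 8
1 | 6 | United States Missouri (US-MO) | 70 | 0 | 47 | 0 | 0 | 117
1 | 7 | United States Utah (USUT) | 0 | 0 | 15.5 | 0 | 0 | 15.5
1 | 8 | Total | 217 | 0 | 148 | 0 | 0 | 365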
Pranab_C
8 - Asteroid

Thank you, but the issue is that this workflow will be run in the Gallery, and a directory would not work in that environment. We are currently using R code to extract the PDF; it's working fine for the entire PDF except this page. Any suggestions or help would be much appreciated.
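
If the raw text lines from the original post are all that is available, one possible post-processing step is sketched below in Python. It is an assumption-heavy illustration, not a tested Alteryx or R solution: extract_pairs is a hypothetical helper, and it assumes every code fragment such as (US-AZ), (US- or UT) is a whitespace-delimited token containing a parenthesis, and that each name is followed by two integers of which the first is wanted.

```python
import re

def extract_pairs(line):
    """Return (name, first_number) pairs from one raw line.

    Hypothetical helper. Assumes code fragments are whitespace-delimited
    tokens containing '(' or ')', and that each name is followed by two
    integers, of which only the first is kept.
    """
    # Drop every token that contains a parenthesis, e.g. (US-AZ), (US-, UT)
    cleaned = re.sub(r'\S*[()]\S*', ' ', line)
    # A name (letters and spaces), then two integers; capture the first
    pairs = re.findall(r'([A-Za-z][A-Za-z ]*?)\s+(\d+)\s+\d+', cleaned)
    return [(name, int(number)) for name, number in pairs]

# The four sample lines from the original post
samples = [
    '" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"',
    '"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"',
    '" Missouri 70 0 (US-MO)"',
    '" Utah (US- 0 0 UT) Total 217 0"',
]

for line in samples:
    print(extract_pairs(line))
# [('Arizona', 0), ('Colorado', 0)]
# [('Qatar', 0), ('United Arab Emirates', 147)]
# [('Missouri', 70)]
# [('Utah', 0), ('Total', 217)]
```

Dropping the parenthesis tokens first is what lets the split code in the Utah line, (US- 0 0 UT), be handled the same way as the others. Lines with decimal values or a different column order would need a different pattern.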
