Hi Champions,
I need help extracting text and numbers into separate columns from the strings below.
What I have is this:
" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"
"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"
" Missouri 70 0 (US-MO)"
" Utah (US- 0 0 UT) Total 217 0"
What I need is each location in one column and its first number in another, e.g. Arizona in one column and 0 in the next:
Arizona-0, Colorado-0
Qatar-0, United Arab Emirates-147
Missouri-70
Utah-0, Total-217
Hi, @Pranab_C
The string on your first line has no leading space, but your sample data does, so you only need to change the parse expression from
^["\s]+([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?
to
^["\s]?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?
and you are done.
Tip: the only change is the first + becoming ? , which makes the leading quote or space optional.
If a line can start with more than one quote or space, use *? instead, which accepts zero or more of them:
^["\s]*?([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?
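For anyone who wants to try the expression outside Alteryx, here is a rough Python sketch of it. Two things in it are my own assumptions and not part of the workflow above: Python's re module has no [[:alpha:]] class, so it is written as [A-Za-z], and the bracketed codes such as (US-AZ), (QA) or the split "(US-" / "UT)" are removed first by a hypothetical CODE pattern, because the expression expects the numbers to come straight after the name.

```python
import re

# Python's re has no POSIX classes, so [[:alpha:]] is written as [A-Za-z].
TAIL = r'([A-Za-z]+)\s+(\d+)\s[\d\s]+(?:([A-Za-z\s]+)\s+(\d+))?'
ORIGINAL = re.compile(r'^["\s]+' + TAIL)   # needs at least one leading quote/space
FIXED = re.compile(r'^["\s]*?' + TAIL)     # leading quote/space is optional

# Assumption (not from the thread): strip code fragments such as
# "(US-AZ)", "(QA)", "(US-" and "UT)" before parsing.
CODE = re.compile(r'\([A-Z-]*\)?|[A-Z-]+\)')

def parse_line(line, pattern=FIXED):
    """Return (location, first number) pairs, or [] if the line does not match."""
    m = pattern.search(CODE.sub('', line))
    if m is None:
        return []
    name1, num1, name2, num2 = m.groups()
    pairs = [(name1, int(num1))]
    if name2 is not None:          # second location is optional
        pairs.append((name2.strip(), int(num2)))
    return pairs

samples = [
    ' Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)',
    'Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)',
    ' Missouri 70 0 (US-MO)',
    ' Utah (US- 0 0 UT) Total 217 0',
]
for s in samples:
    print(parse_line(s))
```

With ORIGINAL passed as the pattern, the Qatar line returns an empty list because it has no leading space, which is the case the changed quantifier fixes. Note that the first capture group only accepts a single word, so a line that starts with a multi-word name such as United Arab Emirates would still not match.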
******
If this helps you get what you want, please mark it as a solution and give it a like so others can find it.
Hi, @Pranab_C
Maybe you can find a readpdf or readword macro in the Gallery.
Table_No | RowID | Location | Workdays (W) | COVID-19 Workdays (CW) | Non-Workdays (O) | COVID-19 Non-Workdays (CO) | Not Specified | Total |
1 | 1 | Oman (OM) | 0 | 0 | 1 | 0 | 0 | 1 |
1 | 2 | Qatar (QA) | 0 | 0 | 2 | 0 | 0 | 2 |
1 | 3 | United Arab Emirates (AE) | 147 | 0 | 66.5 | 0 | 0 | 213.5 |
1 | 4 | United States Arizona (US-AZ) | 0 | 0 | 8 | 0 | 0 | 8 |
1 | 5 | United States Colorado (US-CO) | 0 | 0 | 8 | 0 | 0 | 8 |
1 | 6 | United States Missouri (US-MO) | 70 | 0 | 47 | 0 | 0 | 117 |
1 | 7 | United States Utah (USUT) | 0 | 0 | 15.5 | 0 | 0 | 15.5 |
1 | 8 | Total | 217 | 0 | 148 | 0 | 0 | 365 |
Thank you, but the issue is that this workflow will be run in the Gallery, and directory would not work in that environment. We are currently using R code to extract the PDF, and it works fine for the entire PDF except this page. Any suggestions or help would be much appreciated.