Alteryx Designer Desktop Discussions

craigja · ‎12-23-2019

Hi folks,

I have a challenge! I've sat and looked at this all morning but cant get it to work correctly but I'm sure at least one of you will be up for it 🙂

The data below needs reformatted, it is in one column and in another column I have a primary key (Publication ID). The numbers seem to identify the publisher but I'm not interested in the number, only in using it to split out the data, into separate rows with the primary key. So the pattern I'm looking for is: ignore the number, but take what comes after the number, and before the next number, the first section between the commas is the acronym, I want that, then the next section (before the start of the next number) should be in brackets, which is the full name of the publisher.

A simple example is

Primary Key Data

1234 41,EU ESMA,European Securities Market Authority, 42, FCA, Financial Conduct Authority

Should look like:

Primary Key Data

1234 EU ESMA (European Securities Market Authority)

1234 FCA (Financial Conduct Authority)

Sounds simple! But it soon gets complicated, there are often multiple publishers and sometimes the name of the publisher has commas in it as well, so I really need to use the numbers to signify where the publisher changes.

Here are some more complicated ones!!

Primary Key Data

5678 49,FinCEN,Financial Crimes Enforcement Network,106,US Treasury,US Department of the Treasury

Should look like:

Primary Key Data

5678 FinCEN (Financial Crimes Enforcement Network)

5678 US Treasury (US Treasury,US Department of the Treasury)

Primary Key Data

9876 12,Basel,Bank for International Settlements (BIS), and Basel Committee,19,CFTC,Commodities Futures Trading Commission,46,FDIC,Federal Deposit Insurance Corporation,47,Fed,Federal Reserve Board, (US Central Bank),87,OCC,Office of the Comptroller of the Currency,313,OpCC,Options Clearing

Should look like:

Primary Key Data

9876 Basel (Bank for International Settlements (BIS), and Basel Committee)

9876 CFTC (Commodities Futures Trading Commission)

9876 FDIC (Federal Deposit Insurance Corporation)

9876 Fed (Federal Reserve Board, (US Central Bank))

9876 OCC (Office of the Comptroller of the Currency)

9876 OpCC (Options Clearing)

fmvizcaino · ‎12-23-2019

Hi @craigja ,

I'm using tokenize option from regex and text to columns to achieve the main part of your problem.

Attached is an example showing how to get that done.

Let me know if this works for you.

Best,

Fernando Vizcaino

afv2688 · ‎12-23-2019

Hello @craigja ,

Hope this helps 🙂

craigja · ‎12-23-2019

Yep, that looks good! Is there any way to remove . (full stops) and also sometimes there are 2 commas, where there are 2 next to each other, can I remove 1 of them?

afv2688 · ‎12-23-2019

You can remove commas and fulls tops from the beggining and end using the Trim function:

Trim([Field], ",")

or

Trim([Field], ".")

You can also eliminate easily any character within a string using regex function:

Regex_Replace([Field], <Character to look for>, <Character to delete>)

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Regards

fmvizcaino · ‎12-23-2019

Hi @craigja ,

Take a look at my example as I'm doing exactly what you requested.

Let me know if that works for you.

Best,

Fernando Vizcaino

Alteryx Designer Desktop Discussions

Help with regular expression to split a column into multiple rows with a primary key

Re: Row creation

Re: How to select columns dynamically using number...

Re: Batch macro to read 1000+ .xlsx files with var...

Re: Issue when using Block Until Done and Power BI...

Example workflow for setting up a custom list to u...