Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Processing 100s of regex for string replacement

Fnold
7 - Meteor

Folks,

 

I have a few hundred regular expressions with which to replace text in a field in a sequential fashion. One way might be to hardcode each regex into successive Expressions within a formula tool - but I don't know if a single formula tool can handle hundreds of expressions. And in any case this becomes quite laborious.

 

Or if there were a Find Replace tool that handled regexes, I might have conceivably used that...

 

Alternatively, I thought I might be able to parameterise the formula tool such that I can reuse it as many times as I want, each time inputting my desired regex (e.g. "\s+") into a parameter '<param>' and my desired replacement as <replace> and then having a single expression:

 

Regex_replace([myfield], <param>, <replace>)

 

Admittedly this is also laborious for hundreds of regex.

 

Any suggestions on parametrising? Or alternative approaches?

8 REPLIES 8
patrick_digan
17 - Castor
17 - Castor

@Fnold I immediately thought of an iterative macro. I've attached a very quick working example. 

MarqueeCrew
20 - Arcturus
20 - Arcturus

@Fnold,

 

I don't have an example to post, but do have a picture.  Last year I posted a blog regarding storing formulas in a file (e.g. yxdb or .xlsx) and referencing those formulas in a workflow.  You could easily create 100 formulas by selecting them from your library of expressions and applying them dynamically via the Crew (http://www.chaosreignswithin.com/) Macros.  Check out the dynamic formula.

 

Here's a copy of my post:

 

For the Love of Formulas

In the face of a challenge, I have found that I like choosing the easy route.  The tip that I’m about to share with you is not only easy to implement, but it also is elegant and provides for data governance and simplification of testing.  In this case, the easy method is likely the best method.  The challenge at hand is to take a known set of data through a known set of data computations to generate a known set of variables.  One use case for this would be the data preparation in front of executing a model.  When you’ve got a known set of data you might have a set of standard calculated data elements (e.g. Age in Months or maybe binary flags like Interest in Gardening).  Let’s say that you have 100 or more of these calculated variables in your organization.  What are the chances that everyone will use the same formula to compute the data?  What are the chances that they will configure their workflows correctly without typos or cut/paste errors?  In my experience, this effort is substantial.  Let’s see how we could simplify the configuration of tens, hundreds or thousands of variables.

 

You could use saved expressions.  For those not familiar with saved expressions, please see Alteryx Help (Saved Expressions).  These are terribly convenient and worth learning how to manage.  But they still require you to configure formula tools with the potential for error.  I am going to point you to a CReW (www.chaosreignswithin.com) macro for my tip.  The macro of interest is:  Dynamic Formula.  The dynamic formula tool allows you to dynamically create formulas using a data feed.  It effectively allows you to write formulas by accessing a repository of stored formulas.  Everyone creating a known variable will use the identical logic and virtually no coding is required for any of the formulas.  In our use case of 100 formulas, once you filter to the set of formulas that you’re interested in calculating the answers are ready for downstream processing.

 

The real effort is the initial configuration of each variable’s formula and the subsequent testing of the formulas.  Once the calculation is proven to be accurate, it will remain accurate.  The expectations for these stored formulas are that they have a known set of input variables and that the output calculation is fixed.  Please note that if you wanted to have bespoke (custom) formulas, you can manage them and use them as you see fit.

 

Capture.PNG

 

In this sample workflow we read “Raw Data” that includes a few demographic variables.  In another input file we have stored the mean or slug variable values for use when the “Raw Data” records have null or empty values for the variables.  Yes, you could use the impute tool to calculate each variable (individually), but if you are using a pre-defined mean this is a very efficient way to apply them.  The next step is to convert the “Slug Values” to be appended to each record (you’ll drop these values in a select after you’ve used them.  The next input tool contains the set of variables that you want to calculate.  In this example, you’ll calculate 6 variables.  They’ve been stored in Excel, but could be stored in a secured data table elsewhere.  The dynamic macro looks for the following values (Variable, Type, Size, and Expression).  It reads them from the ‘F’ input and generates each variable on every record.  That’s it.  You’re done.  No coding, no cut and paste; Nothing to do but bask in the simplicity of the workflow.

 

Thanks to Adam Riley for creating this macro.  Using it I can convert SAS or other models into simple Alteryx workflows.  If you’re interested in finding out it is possible to score SAS models in Alteryx using this technique, please comment on this blog and I’ll get back to you with the approach.

 

Cheers,

Mark

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
Fnold
7 - Meteor

@patrick_digan  

 

That looks like an interesting way to accomplish this. How do I make [Field1] in the macro as a general input parameter, such that I can choose which field from the calling workflow will be processed inside the macro?

patrick_digan
17 - Castor
17 - Castor

@Fnold I've attached that change. I made two simple changes in the macro: On my L input I checked the box labeled Show Field Map. This now has you select the field on the macro when you drop it into a workflow. It essentially changes the field name for whatever field you select to __Field1__. Since I'm assuming you want your original field name back, I used an action tool and a select at the end connected to the L input to do a reverse field map update. Note that I changed it from Field1 to __Field1__ to avoid any hiccups with fields that already exist.

 

EDIT: I forgot I had set the number of allowable iterations really low while I was testing. I've reattached a version with a much higher number.

sonseeahray
8 - Asteroid

@MarqueeCrew Just read your post from 2017 on utilizing a macro to dynamically create formulas from a formula 'repository.' I did download the Macro Pack and reviewed very quickly the CReW_DynamicFormula macro. The last version history update is "Updated to Alteryx 9.0 macro interface." Would this be compatible with the latest Alteryx versions?

MarqueeCrew
20 - Arcturus
20 - Arcturus
Absopositively!

Cheers,

Mark
Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
sonseeahray
8 - Asteroid

Wonderful! Thanks Mark!!

 

I will be automating an Excel data model utilizing Alteryx. This model contains over 200 excel formulas that I have maintained in a data dictionary. Will I need to translate these Excel formulas to a RegEx formula in order to use the Dynamic Formula macro?

sonseeahray
8 - Asteroid

That was a distracted moment...while not all of the formulas need to be RegEx, they will all need to be rewritten in Alteryx syntax.

Labels
Top Solution Authors