Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
MarqueeCrew
20 - Arcturus
20 - Arcturus

Editor’s note: @NeilR (Neil) and @MarqueeCrew (Mark) got to talking recently about how some accepted solutions on community are amazing in how they helpfully and concisely answer the OP’s immediate need, but may not fully teach the concepts underlying the solution in order that they may be equipped to solve the next similar problem on their own. We discussed a recent solution Mark provided about how to remove all vowels from a string unless it is the first letter of a word. Mark followed up the solution with a YouTube video detailing an alternate methodology. What follows is a synopsis of our conversation. 

 

 

[Neil] You already answered the question that the OP marked as an accepted solution, what compelled you to create the above video? 

  

[Mark] Neil, providing a solution to the community gives us an opportunity to help someone get unstuck today and hopefully tomorrow, too. This specific post gave me an opportunity to demonstrate how to find your own solution path. Besides my YouTube video, I sent this solved YXMD to the Alteryx Academy for a weekly challenge. This puzzle gives beginners a challenge to use string functions and advanced users a chance to exercise their regex and error handling skills.   

  

Wait a second, the solution you presented in the video is different than the one you shared in the thread. What happened? 

  

Regex was handy from my iPhone. I could readily respond to the post without direct access to my Alteryx desktop. I wanted to help community members think about how to solve. But if you watched the video (past the thank you slide) the regex solution gets explained. This just goes to show you that there isn’t just one way to solve a challenge. 

  

ReplaceChar? Where did that come from? Where do I find all these string functions? 

  

Have you ever reviewed the functions inside of the Formula tool? I have. I also review the help documentation for excitement. OK, that's a stretch. I've had plenty of opportunities to clean dirty data. Knowing string functions has saved me lots of trial and error. The functions have been updated many times since I started using Alteryx. Keep your saw sharp and light bulbs bright by exploring. In the good old days there was no Contains, StartsWith or EndsWith.  

  

So which one is better - regex or not regex? 

  

Regex wins! Generally, I apply regex to unstructured data. When you can use a combination of string functions, I will opt for those instead. Regex consumes more resources than string functions and isn’t so friendly to untrained users. As a consultant, I must make workflows easily explained and maintained for my clients. In this specific example, I wondered about the true complexity of the text data. The regex approach left little room for errors. I am aware of the defect of the YouTube solution.  

  

Input Data 

Expected Results 

Beginner Results 

Regex Results 

Forty-acre 

Frt-acr 

Frt-cr 

Frty-acr 

Apples,Avocados & Apricots 

Appls,Avcds & Aprcts 

Appls,vcds & Aprcts 

Appls,Avcds & Aprcts 

                            

When the “beginner” approach is used, the assumption is that a “word” is preceded by a space. The rules do not specify how to handle examples like hyphenated words. Similarly, a comma might not have a space after it and will cause the first letter of the following word to delete a vowel. The regex expression handles words by boundaries. In simple terms, if the preceding character is not {0-9, A-Z a-z}, then it is a boundary (along with start of field).  

  

REGEX_Replace([Input Data], "\B[aeiouAEIOU]", '')

 

\B = Not a boundary character 

[…] = Set of characters 

  

Explanation: The first character of a word preceded by any character not in the set of {0-9, A-Z a-z} is skipped. All other vowels (as listed) are identified by the regular expression and Alteryx will replace them with ‘’ (nothing). 

  

In this case, regex will outperform the string functions because of the parse and summarize tool expense. In order to make the “beginner” solution work, you’ll need to do some messy work. Experience has taught me to take data into consideration and to Keep It SIMPLE. I wonder if @hellyars considered the data examples and could comment on what the expected results would be. It is possible that he’d want to ignore the punctuation/number and delete the vowel. In that case, SIMPLE wins! 

  

I see you used a Message tool – is that APA?

  

The Message tool is mind-blowing. Don’t underestimate the power of the tool when you are preparing yourself for Analytic Processing Automation (APA). It is a fundamental tool for defensive-configuration. I’m coining the phrase APA Readiness as the ability to repeatedly run workflows with confidence, given a fundamental lack in confidence that the data won’t become trashy in the future.

  

A simple use case for the Message tool and why it is so important is when you read CSV files. All of your input comes to you as V_WString. If you get an amount of $123,456.78 and place a SELECT tool after the input to change the type to DOUBLE, you might get a warning message. If a date of 2021-02-31 comes in you might get a warning message. I say might because you’ve probably got a limit of 10 warnings per occurrence. You might also get it, but if you’re running the workflow on your Server and you don’t check your logs, you might not even know about the problem until you provide your data or report to your boss or client. Enter the Message tool. You can check your data and ERROR, stop or save your @$$ from upstream issues. 

  

For this example, I wanted to demonstrate how the tool works and how you can configure it to both ERROR and display useful messages. If you don’t use a Message or a Test tool in your workflow, you won’t get an A in my virtual class. When you run a workflow on your desktop and you’ve got lots of Browses to visually check for data anomalies, you have a fairly low APA readiness score. You’re spending time doing what Alteryx does faster. 

  

Anything else? 

  

"Anything else?" That’s .* in regex 😉 I’ll keep my eye on community posts for ideas for future deep-dive adventures. I’d really like to encourage readers to comment and to suggest areas where “How To” examples can be helpful. Why did you read this post? If there is desired content, I’d like to contribute what I can to get you involved in Community. 

Comments
Jon_Taylor
8 - Asteroid

Thank you for this article.

 

I believe this is very good topic because as a beginner I find it hard to differentiate between "getting there" and an Expert solution ( most efficient solution). I know there are many ways to come up with a solution but knowing how to distinguish the different methods I think is very valuable.

 

Take Challenge #37 for example. I attempted this challenge and was stuck for a good time. After viewing training lessons and trying over and over, I decided to take a look at the solution. I was amazed to see most all responses used the XML Parse tool for the XML based challenge but the Actual Solution provided did not have an XML Parse Tool at all. At this moment as a beginner I was completely lost and felt as if I did not know what I was doing. Seeing the 2 methods made me think why and which method should I use as well as  how to think about attacking challenges problems.

 

Providing insight into these different approaches and methods can help everyone become better I believe, I will make sure to comment in areas I feel a deep-Dive could be presented.

 

Jon Taylor

MarqueeCrew
20 - Arcturus
20 - Arcturus

@Jon_Taylor ,

 

Thank you for taking the time to give feedback.  Please feel free to shout out any suggested challenges or points of interest that you think would benefit others. I plan on working closely with Neil to dive into more posts. 

cheers,

 

 mark

 

 PS: Alteryx is a journey

WillM
Alteryx Alumni (Retired)

@MarqueeCrew thank you for taking the time to put this together!  I love the video and the blog, and completely agree that there are many paths to solving a problem.  Sometimes the obvious way isn't the best (or even the easiest). This is the perfect way to show that everyone approaches solving in a different capacity and that a marked solution is by no means the only solution, so giving your own perspective can never hurt!

JP_SDAK
8 - Asteroid

This was a really useful article Mark.  I thought I was pretty adept at Regex tools, and used one to solve this weekly challenge.  It worked fine for the Challenge, but not for your examples above where there was a hypen and lower case second word  (mine assumed that if the word started with a vowel it would be capitalized).  Thank you for the insight on \B and word boundaries.

 

JP_SDAK_0-1615833963500.png

 

j_trefethen
8 - Asteroid

@MarqueeCrew echoing what the others said - really appreciate you taking the time to walk through the thought process behind both "beginner" and Regex methods. Also learned about the Message Tool, and will be using that more as well (getting rid of those extra Browse tools).

mceleavey
17 - Castor
17 - Castor

Nice article, @MarqueeCrew .

 

As you know, Regex is my friend.

 

M.

nravik
11 - Bolide

This is a great article and video @MarqueeCrew !! Thank you.