Alteryx Designer Desktop Discussions

Austin09 · ‎09-06-2018

Hi all,

Firstly wanted to say I'm loving using Alteryx so far! Secondly, wanted to apologise in advance for misnaming anything - I'll try my best to explain what I'm trying to do!

Simplified aim: I'm trying to get to a data set of competitor restaurants which are within a given proximity of a particular brand. e.g. Find the set of restaurants near all the McDonald''s in the UK.

Now, to get the data I want it so that I limit the share of restaurants in the competitor set, so one brand doesn't exceed 50% e.g. if there are 5 fast food restaurants in the comp set, and 3 are Burger King, then I'd want to adjust the data.

So far I've tried this manually, and it works, however, as I'm a massive data geek, I want to know if I can create a loop which can do this, without me creating multiple sets.

I have the data ranked by proximity at the moment, so I'm only picking the nearest 7 restaurants to a given McDonald's. Then I have tested if any brand accounts for over 50% of the competitor set. My manual flow says... if it has more than 50% of one brand, add the next nearest restaurant than isn't the brand that has the highest share of compset i.e. if there are 60% Burger King in the set, add the nearest brand that isn't Burger King. I've done this latter part manually, but repeating the flow a few times i.e. at the 7th closest restaurnt... test... add the 8th.... test.

Is there a better way to do this which says..... Test if comp set >50%, if so add the next closest until that condition is met. Is there a tool in the palette I've overlooked, I suspect I've overcomplicated this! Some of the similar posts in here talk of macros, but their problems are more complex than mine.

Thanks in advance to anyone that can help!

tcroberts · ‎09-06-2018

It sounds like you'll want an iterative macro. If you haven't tried working with Macros in Alteryx so far, there's a great tutorial by @PaulT here

If you could provide a mock dataset or a sample workflow I could try to whip something together to give you an idea of what you'd want.

In general though, you'll want something that would test that condition, and then, if the share is >50%, removes that closest restaurant (the one that caused it to be >50% share) from the data stream that feeds into the iteration Output Tool.

Then, on the next run, your potential restaurants would be everything you had looked at previously, except the one that leads it to be >50%.

Note that the macro will keep iterating until it reaches the specified max iterations, or until the iteration output stream does not receive any more records, and so you'll likely want to add a second test which removes everything once your 'selected' restaurants hits the desired size.

Let me know if you have any questions,

Cheers!

Austin09 · ‎09-06-2018

Wow, thanks! That sounds right, the only change i'd make, is I'd prefer to add additional restaurants to the data set to satisfy the 50% condition, as I'd like a minimum of 6 in the comp set.

I'll take a look at that tutorial now.

My workflow is a bit of a mess as I've been trying to solve via trial and error. I'll tidy it up after watching the tutorial and upload it so you can see.

Thanks again for your help!!!

Austin09 · ‎09-06-2018

Thanks again, that is exactly what I was looking for. Really pleased there is a solution for this!

I've attached part of my workflow - sorry it is a mess. This is a similar workflow to the restaurant one I've been working on, with the same principal, just using hotels instead of restaurants. The added complication is that I'm using quantity of rooms rather than # of hotels i.e. needs to be fewer than 50% total rooms. I've just selected the bit that is more relevant, the rest is just to get me to this point and join various data sources. Let me know if you need more steps or walking through my messy workings.

This sums the total properties in a data set - a data set being a brand in a city i.e. brand A in London. Then calculates the share of all competitor brands in that city.

For those that are over 50%, I've created a separate flow, which is what I've attached. The properties are ranked by proximity currently, so I planned to add to the data set the next nearest property and retest. To pick the nearest competitor, I'd like to use a formula like 'min' to pick the closest.

e.g. like my restaurant example - the data set would be London data for McDonald's. In London there are for example 100 McDonald's which would all be included, each with 7 competitors. I'd want to pick the nearest competitor to any of those 100 McDonald's - not sure if my ramblings made sense there.

Let me know if you want me to screenshare or something and explain.

Austin09 · ‎09-06-2018

Thanks again, that is exactly what I was looking for. Really pleased there is a solution for this!

I've attached part of my workflow - sorry it is a mess. This is a similar workflow to the restaurant one I've been working on, with the same principal, just using hotels instead of restaurants. The added complication is that I'm using quantity of rooms rather than # of hotels i.e. needs to be fewer than 50% total rooms. I've just selected the bit that is more relevant, the rest is just to get me to this point and join various data sources. Let me know if you need more steps or walking through my messy workings.

This sums the total properties in a data set - a data set being a brand in a city i.e. brand A in London. Then calculates the share of all competitor brands in that city.

For those that are over 50%, I've created a separate flow, which is what I've attached. The properties are ranked by proximity currently, so I planned to add to the data set the next nearest property and retest. To pick the nearest competitor, I'd like to use a formula like 'min' to pick the closest.

e.g. like my restaurant example - the data set would be London data for McDonald's. In London there are for example 100 McDonald's which would all be included, each with 7 competitors. I'd want to pick the nearest competitor to any of those 100 McDonald's - not sure if my ramblings made sense there.

Let me know if you want me to screenshare or something and explain.

tcroberts · ‎09-06-2018

Great! I'm glad I could help point you in the right direction.

Your workflow seems to be on the right track, unfortunately I can't see the datatypes or fields of anything since I don't have the bit with you data, and I'd assume you don't want to share anything proprietary.

I don't think your complication (quantity of rooms) should cause too much of a hiccup, as your workflow seems to compute it just fine, and thus you'll be able to filter based on that.

Are you using SpatialObjects to map out the locations in order to get distances from one another? If so, you should check out the Find Nearest Tool which may help you, instead of computing distance and then using MIN().

Let me know if I can help with anything else, or you have some questions about setting up the Iterative Macro, they can be intimidating in Alteryx if you haven't used them before,

Austin09 · ‎09-06-2018

Thanks so much again for your time! It is such a fun tool, its just one of those, if you don't know you don't know situations where I'm doing things the hard way right now.

I'm ok to share via email if that works?

I think what I can do next is workout the data for the macro. The steps I have in my workflow to date...

1. Identify the nearest 6 competitors for each property

2. Roll up to the city level and test to see if at the city level, the sum of competitor rooms is over 50%. For the cities with any instance of brands >50% create a separate output - which I like to call 'FAILURES'

3. If so, create 2 data sets - A) the nearest 6 competitors by property B) the remaining properties further away

4. Identify which in list B is the brand with the >50% share e.g. if its Brand Z, don't select brand Z to add to the existing list

I've done up to this point in my workflow

Then i THINK (i'm probably wrong!) I can create the iterative macro, which uses the data from #2 'FAILURES' and adds in the nearest from point 4 i.e. the next nearest (using the tool you mentioned) amongst the set i've identified which doesn't include the brand causing the failure. Then test again, if not at the next nearest.

The last sentence above I'm most confused by, as I don't want to add the smallest one twice. Are you able to help me - not sure if the attached workflow is of much use - my understanding isn't quite there yet so I'm struggling to follow yet sorry! It'll click soon I'm sure!

tcroberts · ‎09-06-2018

No problem, this is a pretty interesting problem and it's quite fun to take a crack at solving it.

You're definitely on the right track, and remember that there's more than 1 way to do this.

I think that I would personally would try and add each and every competitor through the Iterative Macro, although depending on the size of your dataset, its possible that doing it the way you've laid out above might achieve some speedup.

To do it your way, you'll need to remember that the Iterative Macro will cycle until the Iterative Output does not receive any records, or until the maximum number of iterations are reached. This means that you'll need one filter to remove *ALL* records once you've hit the desired number of competitors (with no failures), similar to that first block of your macro example where you do the Append Field. It may alleviate your concern of adding the same one twice if you do two separate filters: one to check for failures, and another to check for the number of records, after you're sure there's no failure. This way, you can remove any records which bring you over 50%, stopping it from going through the iteration again. Just make sure that the F anchor does not connect to anything leading to the iteration output, or you'll get stuck in a loop until you hit the max iterations.

If I'm understanding correctly you're using the Summarize Tool right before the union to add in the next closest competitor. I'm not sure how this works with non-mock data, but I'm assuming you're relying on the fact that they're ordered by distance, and you'll be adding all necessary fields, not just the sum of whatever this mock data represents.

I can probably help a little more, but I'm still not really sure about the format of the data, the types it contains, or what exactly you'll be passing to this macro. It does seem like you've got a great start and just need to work out a few minor details. The plan you have for the macro seems like it should work, and the great part about Alteryx is that there's usually a number of different ways to arrive at a solution, even if its not obvious at first.

Could you enable Private Messaging? Go to your user settings, preferences tab, Private Messenger tab, and make sure Private Messages is checked. I'll send you some more information there.

Good Luck!

Austin09 · ‎09-07-2018

Thanks again, great points, really getting my head scratching going. Certainly a fun task as you say!

I'd like the macro to add an additionally property, until the % share goes to 50% or less. I'm not sure how to get it to recalculate the new share?

I'm reticent to add all competitors as I want to keep the proximity as low as possible.

Is that possible or do you think I'm trying to do too much in one step?

I've tried to enable PM but I think I'm not at a high enough rank yet. Not sure what to suggest there sorry!

tcroberts · ‎09-07-2018

Weird about the PMs, I've never encountered that before.

I certainly think its possible the way you're doing it. Just think about all the things you want to do each iteration:

1) Find next closest competitor

2) compute share

3) test for 50% -> If this test fails, drop the failure property from your list, and feed everything else back into the macro output.

4) test for < 6 competitors -> If this fails, send the successes to the regular output, drop everything else.

Break your Macro into steps and test out each bit, then start combining them together one at a time and test the larger bits as well.

Don't get too bogged down in solving it a particular way though, try alternatives. Like I mentioned, I think it would be easiest to do the entire thing inside the iterative macro, sending the successful nearest competitors to the D output, while the rest cycles back through the iterative one and 'failures' are filtered out entirely. You could force the macro to break at 6 competitors by adding a count column to the input, and have it increase by 1 for every successful competitor sent to the D output (using Count Tool and appending that field)

If you can send me the schema of your data, or make up some mock data I can try to get a more concrete example for you, I'm just a little wary about posting my private or work e-mail in a public forum.

Alteryx Designer Desktop Discussions

Creating a workflow which loops until condition is met