Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Matching Data with Multiple Criteria

DavisData
5 - Atom

Hello Community,

 

First time poster. I have been using Alteryx for about a month and have learned a lot. However, I am having trouble creating an automated workflow for what has historically been a very manual process. My neophyte skills are too limited to find a solution, and I am hoping for the community's assistance. My background is mostly using Excel.

 

Below is an example of my data set, my end goal and a description of the manual process for achieving that end goal. Also below is a sample workflow (though very simplified) for what I am trying to do in Alteryx. Please let me know what questions or additional information would be necessary for solving this problem.

 

Thank you in advance for your insights and time.

 

The goal is to identify jobs that have accumulated more than 1,800 hours in a given period (i.e., "Hours 1" and "Hours 2") and give that job a special project number. If a job does not meet the hours requirement, I need to find a replacement that matches on multiple criteria: "Job", "Term Date" to "Hire Date" (i.e., approximately 60 days), and total hours >= 1,800. In Excel, I would create a pivot from the data and then manually filter and identify replacements that match all of the criteria, then I would manually add them to my end goal list.

 

Actual data sets can have a few thousand lines in them, so the manual process is very time consuming. Automating this is a valuable time saver. I have already done a lot of work manipulating data to get it in the format shown in the below example. But this is where I am getting stuck.

Data Set:

IDJobHireTermHours 1Hours 2
123Accountant1/1/20177/1/201818001000
124Manager4/1/2015 20002000
125Analyst2/15/201710/1/201818001600
126Accountant9/15/2018  300
127Accountant4/1/2018  1000
128Analyst5/30/2018  900
129Accountant6/30/20189/1/2018 750
130Analyst10/15/2018  200

 

End Product:

Project ID #IDJobHireTermHours 1Hours 2
1123Accountant1/1/20177/1/201818001000
1A129Accountant6/30/20189/1/2018 750
1B126Accountant9/15/2018(blank) 300
2124Manager4/1/2015(blank)20002000
3125Analyst2/15/201710/1/201818001600
3A130Analyst10/15/2018(blank) 200

 

Alteryx example is attached.

6 REPLIES 6
danilang
19 - Altair
19 - Altair

Hi @DavisData

 

I think this is what you're looking for. See the Edit below  The purpose of the unique tool is to be able to sort your final output in the order that the Jobs are listed in the input data.  I also made the assumption that your output is ordered by Job and then by Hire

 

Solution.png

 

Note that your output example doesn't include all the rows and your input data had a misspelling of "Analyst" that I corrected.

 

Edit:  Upon rereading your requirements, the workflow does produce the correct output but not according to the process that you describe.  Can you possibly post different sample input and output data that would better demonstrate the process you go through?

 

Dan

DavisData
5 - Atom

Hello @danilang

 

Thank you for your quick response. I saw your answer as soon as you posted it and immediately began to look at your proposed solution. I apologize for just now replying, but it has been a busy day.

 

As you stated in your edit, the workflow that you provided does not produce the same output that my manual process did, but it is close -- which I am extremely grateful for and excited about. Thank you very much for that. I can work with that process to build my end goal as it gets me started in the right direction and is much better than anything I have attempted. However, if you are willing to continue with your support, then I will gladly accept it.

 

I will try to address your questions so that I am more clear about my data and intentions. There is a new data set attached with a few more details about the process and final data that I am working towards. Hopefully it helps.

 

Some quick notes:

The order of my output does not matter except that all connected jobs are grouped together (i.e., 1, 1A, and 1B are listed sequentially). The Analyst misspelling was unintentional. Lastly, I do not want all jobs in the final output, only those that either individually or collectively total 1,800 hours (or greater). In the previous example, Project ID# 1B and 1C are not necessary because 1A takes us over the desired hours target. Lastly, Project ID# 3A would not qualify because the hire date is outside the tolerable date range based on #3's termination date.

 

Please see the attached example for more detail. Thank you again for your help. If nothing more comes from this, you have already done me a tremendous service.

 

Kind Regards,

Scott

danilang
19 - Altair
19 - Altair

This should be close to what your looking for

 

Solution v2.0.png

 

Note that project ids differ, but are sequential within groups.  Also the Filter tool called "Suitable Records" uses PrevTermdate-5 to PrevTermDate+60, because RecordID 129 was hired the day before the previous one terminated

 

Dan

DavisData
5 - Atom

Thank you @danilang.

 

I am going to take a closer look at this tonight. I really appreciate your time spent on this.

 

Kind Regards,

Scott

danilang
19 - Altair
19 - Altair

Good Morning

 

Here's hopefully the final version.  I added in 2 tools to take into account the running total by job(start of the 4th row)

 

Solution v2.1.png

 

Dan

DavisData
5 - Atom

@danilang,

 

It is going to take me a while to digest this, but I can say from my first few minutes in the workflow, it looks incredible! So much more elegant than what I had been building. The cognitive framework for Alteryx is so much different from Excel and other programs that I have used. I am always amazed at how much my thinking about processes and data needs to change so that I can fully utilize this powerful tool.

 

Thank again for sharing so much of your time and knowledge with me.

 

Problem solved!

 

Kindest Regards,

Scott

Labels