Dear All,
I have the following use case and would like to get your recommendation about the best approach.
I have a list of almost 40 types of awards, and more than 40 criteria are evaluated in order to assign an award. I would like to know the best machine learning approach for assigning an award based on the combination of these criteria.
By the way, I know a decision tree might be one approach, but I would really like a more robust and efficient algorithm if there is a better one.
Hi @qais1975 ,
The approach will really depend on your specific case:
1. How much data do you have for training? -> Big impact on algorithm choice.
2. What is the business rule used for determining award? -> Determines whether machine learning should be used at all as opposed to an expert/rule based approach.
Just to name a few.
Hi @martinding ,
Thank you for your reply. The answers to your questions are listed below:
1. How much data do you have for training? I have historical data from 1990 to 2022. This data will be enough to train any model, as well as for testing and validation.
2. What is the business rule used for determining award? I do not think you understood my question. We have criteria that have been used previously to assign awards. We need to automate this process, using the historical data to build that knowledge.
We can also validate the model using SME input.
Thank you.
Hi @qais1975,
Personally, I feel that a rule-based system will be sufficient for automating this award assignment.
If a machine-learning path is to be taken, you might want to consider the following when deciding on the algorithm:
1. Multi-class classification: The algorithm has to be suited to predict 1 of the 40 awards.
2. Interpretable/explainable: This I think is a really important condition to consider, because stakeholders might be interested in understanding why an award was given to an individual (especially for controversial cases).
Therefore you might want to try:
1. Decision Trees: Interpretable and simple to implement.
2. Logistic Regression: There are multinomial implementations of logistic regression, which handle more than two classes.
3. Naive Bayes: A classic algorithm that works best when your criteria/predictors are independent of each other.
4. Random Forest: An ensemble method that is typically more accurate than a single Decision Tree but less interpretable. Luckily, RF has feature importances, and external packages (both in Alteryx and in Python) allow you to calculate SHAP values, which are very useful for interpreting the predictions (e.g. which criteria contributed how much to a particular prediction).
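To make the multi-class and interpretability points concrete, here is a minimal sketch of option 3 (Naive Bayes) in plain Python, with no external packages. The criteria values, award names, and training rows are all made up for illustration, and in a real project you would more likely use a library implementation. The per-award scores it returns show how strongly the criteria point to each award.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Fit a categorical Naive Bayes model with add-one (Laplace) smoothing."""
    class_counts = Counter(labels)
    # value_counts[(award, criterion_index)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # Distinct values seen per criterion, used in the smoothing denominator.
    vocab = defaultdict(set)
    for row, award in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(award, i)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return (best_award, log_scores) for one row of criteria values."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for award, n in class_counts.items():
        score = math.log(n / total)  # log prior of the award
        for i, value in enumerate(row):
            count = value_counts[(award, i)][value]
            score += math.log((count + 1) / (n + len(vocab[i])))
        scores[award] = score
    return max(scores, key=scores.get), scores


# Hypothetical history: (seniority, field, peer_nominated) -> award given.
rows = [
    ("senior", "research", "yes"),
    ("senior", "research", "no"),
    ("junior", "teaching", "yes"),
    ("junior", "teaching", "no"),
    ("senior", "teaching", "yes"),
    ("junior", "research", "yes"),
]
labels = [
    "Research Medal", "Research Medal", "Teaching Prize",
    "Teaching Prize", "Teaching Prize", "Research Medal",
]

model = train_naive_bayes(rows, labels)
award, scores = predict(model, ("senior", "research", "yes"))
print(award)  # Research Medal
```

With 40 awards and 40 criteria the same structure applies; only the number of rows, columns, and labels changes.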
Hi @martinding ,
Thank you for your illustration regarding the ML approach. May I ask what the best scenario is for applying a rule-based approach? Is there a specific Alteryx tool for that? What do I need in order to use this approach?
I might run the two approaches, ML and rule-based, and do a comparison. However, I need to understand the rule-based system first before taking action.
Thank you.
Qais
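One possible way to set up such a comparison is sketched below in Python: score each approach against the awards that were actually given on held-out historical cases, then list the cases where the two approaches disagree so an SME can review them. The award labels and predictions here are placeholder data, not real results.

```python
def accuracy(predictions, actual):
    """Fraction of cases where the prediction matches the historical award."""
    matches = sum(p == a for p, a in zip(predictions, actual))
    return matches / len(actual)


def disagreements(rule_preds, ml_preds):
    """Indices of cases where the two approaches assign different awards."""
    return [i for i, (r, m) in enumerate(zip(rule_preds, ml_preds)) if r != m]


# Placeholder held-out cases: the award actually given,
# and what each approach predicted for the same case.
actual = ["A", "B", "A", "C", "B"]
rule_preds = ["A", "B", "C", "C", "B"]
ml_preds = ["A", "B", "A", "C", "A"]

print(accuracy(rule_preds, actual))         # 0.8
print(accuracy(ml_preds, actual))           # 0.8
print(disagreements(rule_preds, ml_preds))  # [2, 4]
```

Equal accuracy does not mean the approaches behave the same, which is why the disagreement list is worth passing to an SME.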
Hi @qais1975,
You can simply think of a rule-based system as a series of IF-ELSE statements. Let me give you a simple example:
Let's say there are 3 awards and 4 criteria:
Award: Best Actor, Best Actress, Best Film
Criteria: Category, Gender, Acting Score, Film Score
And if you draw it up, it will actually appear like a decision tree. But in essence, it's just if and else.
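As a rough sketch of what those IF-ELSE rules could look like in Python for the three awards above: the category values ("Film", "Person") and the score thresholds (8.0 and 9.0) are assumptions made up for illustration, since the actual business rules would come from the people who define the awards.

```python
def assign_award(category, gender, acting_score, film_score):
    """Return an award name, or None if no rule matches.

    Category values and thresholds are hypothetical.
    """
    if category == "Film":
        if film_score >= 9.0:
            return "Best Film"
        return None
    if category == "Person" and acting_score >= 8.0:
        if gender == "Male":
            return "Best Actor"
        if gender == "Female":
            return "Best Actress"
    return None


print(assign_award("Person", "Female", 8.7, 0.0))  # Best Actress
print(assign_award("Film", None, 0.0, 9.4))        # Best Film
print(assign_award("Person", "Male", 6.1, 0.0))    # None
```

Each nested `if` corresponds to one branch of the tree. With 40 awards and 40 criteria the same pattern holds, but the number of branches to write and maintain by hand grows quickly.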
Thank you @martinding . I think ML will be my choice, as the rule-based approach will be time-consuming and laborious.
Thank you