SMOTE in Designer - Is it possible? - Machine Learning Question
Hello Everyone,
I am working with a heavily imbalanced dataset and am trying to create synthetic data for the minority target classes so that the target is balanced and the algorithms can learn better. I have no Python or R coding skills and was wondering whether there is a tool or macro available in Designer that would let me create synthetic data to balance my target field. It is a multiclass classification dataset, and 3 of the 6 target classes are heavily underrepresented. I looked at the Oversample tool, but that seems to do the opposite of what I am trying to achieve. I think some of the more recent versions of Designer and/or the BI Suite have this capability, but unfortunately I only have access to Designer 2020.2.
Thank you
- Labels:
- Machine Learning
- Predictive Analysis
Hi @alkan
If you have enough data, I think you can use the Oversample tool. If not, the SMOTE approach you mentioned is the way to go.
I found a SMOTE macro that can be used in Designer.
Please check this blog post:
Balancing Act: Classification with Imbalanced Data
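For readers curious what SMOTE does conceptually, here is a minimal standard-library Python sketch of the core idea: each synthetic row is interpolated between a minority-class row and one of its nearest minority-class neighbours. It is a simplified illustration only, not the code inside the macro from the blog. The function name `smote_oversample`, its parameters, and the toy data are all made up for this example, and it assumes purely numeric features.

```python
import random
from collections import Counter


def smote_oversample(rows, labels, minority_label, n_new, k=5, seed=3):
    """Return n_new synthetic rows for minority_label (simplified SMOTE sketch)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    k = min(k, len(minority) - 1)

    def sq_dist(p, q):
        # squared Euclidean distance is enough for ranking neighbours
        return sum((a - b) ** 2 for a, b in zip(p, q))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # the k nearest minority rows to the base row, excluding the base itself
        others = [r for j, r in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda r: sq_dist(base, r))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data: 4 "rare" rows in the unit square, 8 "common" rows elsewhere.
rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
        [5.0, 5.0], [5.2, 5.1], [4.9, 5.3], [5.1, 4.8],
        [5.0, 5.2], [4.8, 4.9], [5.3, 5.0], [5.1, 5.1]]
labels = ["rare"] * 4 + ["common"] * 8

new_rows = smote_oversample(rows, labels, "rare", n_new=4)
balanced_rows = rows + new_rows
balanced_labels = labels + ["rare"] * len(new_rows)
print(Counter(balanced_labels))
```

For a multiclass target, the same call would be repeated once per underrepresented class, with `n_new` chosen to bring that class up to the desired count.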
The SMOTE macro was built in version 2021.1, but you can downgrade it to your version. Please check this blog post:
Making Workflows, Apps & Macros Backwards Compatible
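As a rough illustration of that downgrade step, the sketch below rewrites the version stamp in a workflow or macro file's XML. It assumes the file's root `AlteryxDocument` element carries a `yxmdVer` attribute holding the Designer version; the helper name `set_workflow_version` and the sample XML are hypothetical. Changing the stamp only lets an older Designer open the file. It does not convert tools or packages that the older version lacks.

```python
import re


def set_workflow_version(xml_text, target_version):
    """Rewrite the yxmdVer attribute on the root AlteryxDocument element."""
    pattern = r'(<AlteryxDocument\b[^>]*\byxmdVer=")[^"]*(")'
    new_text, count = re.subn(
        pattern,
        # a function replacement avoids backreference clashes with the digits
        lambda m: m.group(1) + target_version + m.group(2),
        xml_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no yxmdVer attribute found on <AlteryxDocument>")
    return new_text


# Hypothetical file content, standing in for the text of a .yxmc or .yxmd file.
sample = (
    '<?xml version="1.0"?>\n'
    '<AlteryxDocument yxmdVer="2021.1">\n'
    '  <Nodes />\n'
    '</AlteryxDocument>'
)
downgraded = set_workflow_version(sample, "2020.2")
print(downgraded)
```

Keeping a backup copy of the original file before overwriting it is a sensible precaution, since the edit is made outside Designer.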
Hi Akimasa - Your guidance was very helpful. I was able to downgrade the SMOTE family macro and the sample workflow to my version and try them out. However, I am still getting the error below from the R Tool when I run the sample workflow that contains the SMOTE family macro. I would appreciate it if you could take a look and help me out. Below is the error detail.
Thank you again
Hi @alkan
I use Designer 2021.4, in which the R version was upgraded (3.6.3 => 4.0.4), so the SMOTE family package is not supported there.
But my colleague's PC has 2020.4 installed. I tried the macro on that PC and it worked.
I need more time to look into it...
Hi @AkimasaKajitani,
I am on version 2020.2 and could not figure out why the macro is not working. Really appreciate your time looking into this issue.
Thank you,
Hi @alkan
I would like to confirm something.
Please right-click the SMOTE macro and select "Open Macro". This opens smote_family.yxmc.
Then please run smote_family.yxmc on its own.
I want to know what error occurs.
It is asking me to type in a seed value. The configuration for the seed value should default to 3, but I could not figure out why the process does not pick it up.
Hi @alkan
Yes. When the macro is run on its own, the configuration is deliberately not applied.
So this behavior is not a problem. It is expected.
The SMOTE family package also seems to be installed correctly, so I cannot understand why the error in your previous post (the unzip error) occurs.
So I am trying another approach: I made a Python version of the SMOTE macro. It is still being tested, but if you want to try it, please check it.
Hi @AkimasaKajitani,
After trying your Python macro, I am getting the environment error below. I opened the macro in Notepad and changed its version to 2020.2, and I also did some research on the environment issue. It says that starting with 2021.1.4, the environment name changed from JupyterTool to designerbasetools. I opened the macro in Notepad again and changed every occurrence of designerbasetools to JupyterTool, but I still could not get it to work. I really appreciate all your effort and hope you can point me in the right direction.
Thank you
Hi @alkan,
Sorry for the late reply. There were more steps involved than I expected.
Designer 2020.3 is the last version of Designer that uses Python 3.6.8.
The newest SMOTE package needs Python 3.7 or later.
So I decided to use an older version of the SMOTE package.
I installed Designer 2020.3 and built the SMOTE macro on 2020.3.
Please check the attached SMOTE macro.
If it works in your environment, I will publish it as the official version of the macro.
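
To see which side of that Python 3.7 cutoff a given Designer installation falls on, a check along these lines could be run in Designer's Python tool. It is only a sketch: the function name `supports_latest_smote` and the constant are hypothetical, and the 3.7 threshold is taken from the post above rather than from any package documentation.

```python
import sys

# Threshold taken from the post above: the newest SMOTE package needs 3.7+.
MIN_FOR_LATEST = (3, 7)


def supports_latest_smote(version_info=None):
    """Return True if the (major, minor) version meets the 3.7 threshold."""
    if version_info is None:
        version_info = sys.version_info  # the interpreter running this code
    return tuple(version_info[:2]) >= MIN_FOR_LATEST


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets the 3.7 threshold:", supports_latest_smote())
```

Under this check, the Python 3.6.8 interpreter mentioned above falls below the threshold, which is why the older package version was chosen.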
