Features are "Too highly associated with target"
I'm just starting out, and I am attempting to model a sensor that reads 30 times per sample.
Before I have actual data, I am trying to play with faked data.
I assume an actual value, then create 30 "measured" values by adding or subtracting a small random number to simulate noise.
I run the Assisted Modeling, but it tells me that all of the measured values are "Too highly associated with target."
Is there a way to override this?
Thanks,
Steve
Hello,
This is a feature and not necessarily an issue. You want to avoid features that are too highly associated with the target variable because they can lead to overfitting. Overfitting happens when your model learns the details and noise in the training data so closely that it performs well on that data but poorly on new, unseen data.
Think of it this way: You're studying for a test by memorizing the answers to specific questions rather than understanding the underlying concepts. You might do great if the test has those exact questions, but if the questions change even slightly, you might struggle.
My guess here is that the synthetic data is not random enough and follows a pattern that is too highly associated with your target. If you can, try to get data closer to the real-world data the model may encounter.
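To see why faked readings can end up this closely tied to the target, here is a rough Python sketch (not Alteryx code) that builds the kind of data described in the question and measures how strongly a single reading correlates with the actual value. The value range, the noise levels, and the helper names `simulate` and `pearson` are all assumptions made up for illustration, and the cutoff Assisted Modeling actually uses is not something I know, so treat the numbers as a demonstration only.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(noise_sd, n_samples=500, n_readings=30, seed=0):
    """Fake sensor data: one actual value per sample plus n_readings noisy copies.

    Assumption: actual values are uniform on [0, 100] and the noise is Gaussian.
    """
    rng = random.Random(seed)
    targets = [rng.uniform(0, 100) for _ in range(n_samples)]
    readings = [
        [t + rng.gauss(0, noise_sd) for _ in range(n_readings)]
        for t in targets
    ]
    return targets, readings


def first_reading_correlation(noise_sd):
    """Correlation between the first of the 30 readings and the actual value."""
    targets, readings = simulate(noise_sd)
    first = [row[0] for row in readings]
    return pearson(first, targets)


# Small noise relative to the spread of actual values: the reading is
# almost a copy of the target.
low_noise = first_reading_correlation(noise_sd=0.5)

# Noise comparable to the spread of actual values: the link is much weaker.
high_noise = first_reading_correlation(noise_sd=40.0)

print(f"correlation with small noise: {low_noise:.4f}")
print(f"correlation with large noise: {high_noise:.4f}")
```

If the noise you add is tiny compared with how much the actual value varies from sample to sample, each measured column is nearly identical to the target, which is likely what the tool is flagging. Making the noise in your faked data closer to what the real sensor would produce should give a more realistic picture.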
