topic Re: Forest Model & Logistic Regression tools in Alteryx Designer
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304439#M53900
<P>I'll address your question about logistic regression first because it's more theoretical:</P><P> </P><P>When you perform logistic regression, you're computing some value: y = a + bX where a is a bias term, b is a vector of weights, and X is a feature vector. You can then compute a probability by applying:</P><P> </P><PRE>P = (e^y)/(1 + e^y)</PRE><P>This P is the output you see, between 0 and 1.</P><P> </P><P>In a hand-wavy sort of way, you can think of the b values from our formula for y above as the "marginal effect" of that particular feature. In plain English: b_1 is the amount by which a unit change in x_1 will change y.</P><P> </P><P>This means that each b value is tied to the units and magnitude of the X feature it's associated with.</P><P> </P><P>Let's think about this in terms of economics. If I want to model Consumption: C as a function of Income: Y, then I would have something like: C = a + bY.</P><P>The value b tells me how much a single dollar increase in income will increase my consumption. If my income is currently $10 million, a single dollar increase in income probably wouldn't mean I spent an extra dollar in consumption. However, if my income is currently $1000, then my consumption is more likely to increase by nearly that full dollar. Either way, b is expressed per dollar: if I measured income in thousands of dollars instead, the fitted b would be 1000 times larger and the predicted consumption would be unchanged.</P><P> </P><P>The reason I'm giving this example is that the weights computed in your logistic regression inherently take the scale of each feature into account during training. 
If you normalize the predictor variables, these weights will be scaled up (or down, depending on the original values of the predictors) accordingly, and you'll get the same output, at least for a standard unregularized fit (a penalty on the weights, such as L1 or L2 regularization, does make the result depend on scaling).</P><P> </P><P>That being said, if you have some extremely high variance in the values of your predictor variables, your model may be worse at making predictions for values far from the mean. This is something you'll certainly need to diagnose, but normalization won't help you there.</P><P> </P><P>Let me know if you need me to clarify anything above, or if I skipped something. I'll look into your question about the Random Forest model's output in the meantime.</P><P> </P><P>Cheers!</P><P> </P><P> </P>Thu, 20 Sep 2018 21:56:42 GMTtcroberts2018-09-20T21:56:42ZForest Model & Logistic Regression tools
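<P>To make the scaling argument in the reply above concrete, here is a minimal Python sketch. It does not fit anything: the weights a_raw and b_raw and the income values are made up for illustration, and 0-1 (min-max) normalization is assumed. It only shows that for any weights on the raw feature there is an equivalent pair of weights on the normalized feature that produces identical probabilities, which is why an unregularized fit ends up with the same output either way.</P>

```python
import math


def logistic(y):
    """P = e^y / (1 + e^y), written in the equivalent form 1 / (1 + e^-y)."""
    return 1.0 / (1.0 + math.exp(-y))


def predict(a, b, x):
    """Probability for one feature value x, with bias a and weight b."""
    return logistic(a + b * x)


# Hypothetical weights for a model fitted on raw income in dollars.
a_raw, b_raw = -2.0, 0.00005

# Made-up income values.
incomes = [1_000.0, 40_000.0, 120_000.0]

# Min-max normalization to the 0-1 range: x_scaled = (x - lo) / (hi - lo)
lo, hi = min(incomes), max(incomes)
scaled = [(x - lo) / (hi - lo) for x in incomes]

# Equivalent weights on the normalized feature:
# a_raw + b_raw * x == a_scaled + b_scaled * x_scaled
b_scaled = b_raw * (hi - lo)
a_scaled = a_raw + b_raw * lo

p_raw = [predict(a_raw, b_raw, x) for x in incomes]
p_scaled = [predict(a_scaled, b_scaled, x) for x in scaled]

for x, p, q in zip(incomes, p_raw, p_scaled):
    print(f"income={x:>9.0f}  P(raw)={p:.6f}  P(normalized)={q:.6f}")
```

<P>The weight on the normalized feature is much larger than the one on raw dollars, but the two columns of probabilities agree. With a regularized fit the penalty acts on the size of the weights, so this equivalence no longer carries over automatically.</P>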
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304434#M53898
<P>Hi,<BR />Having a few problems setting up Random Forest & Logistic Regression models.<BR /><BR />When I run a Random Forest model, the output report is not giving a confusion matrix.<BR />Just wondered if there was an obvious reason why this is happening.<BR /><BR />Also, in logistic regression models, is it advisable to normalise predictor variables? (e.g. 0-1 range)<BR />(there seems to be conflicting advice when I Google it)<BR /><BR />It's just a fairly straightforward credit default dataset I'm using but the results are all over the place.<BR /><BR />thanks<BR />J</P>Thu, 20 Sep 2018 21:44:20 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304434#M53898datascot2018-09-20T21:44:20ZRe: Forest Model & Logistic Regression tools
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304441#M53901
<P><SPAN>As for your question about the Random Forest Model, you could try out this <A href="https://gallery.alteryx.com/#!app/Model-Comparison/56bbd3013df7da08b8fcd00a" target="_self">Model Comparison Tool</A> </SPAN><SPAN>from the Alteryx Gallery. I believe you can get a confusion matrix out of it. I don't recall whether you can get one out of the Forest Model Tool itself, but you should be able to get precision and recall, which are accuracy-like measures computed from that table.</SPAN></P><P> </P><P><SPAN>Let me know if you need help setting that up.</SPAN></P><P> </P><P><SPAN>Cheers!</SPAN></P>Thu, 20 Sep 2018 22:03:52 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304441#M53901tcroberts2018-09-20T22:03:52ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304446#M53902
Thanks again...that's all very helpful!<BR /><BR />I'll analyse my feature variables bearing in mind your points, and hopefully get a better understanding of what's going on.<BR /><BR />J<BR /><BR />________________________________<BR />The University is ranked in the QS World Rankings of the top 5% of universities in the world (QS World University Rankings, 2016/17)<BR />The University of Stirling is a charity registered in Scotland, number SC 011159.Thu, 20 Sep 2018 22:18:49 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304446#M53902datascot2018-09-20T22:18:49ZRe: Forest Model & Logistic Regression tools
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304448#M53903
<P>great..thanks very much.<BR />I'll try using the model comparison tool as you advise.<BR />I'll be back on this tomorrow...it's been a long day...just going to sleep now!<BR />Your input is much appreciated.</P><P>cheers</P><P>j</P>Thu, 20 Sep 2018 22:20:18 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304448#M53903datascot2018-09-20T22:20:18ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304840#M54019
Hi,<BR />Could I check with you about downloading the Model Comparison Tool?<BR />When I request the download, it imports a workflow into Alteryx (Model Comparison.yxmc), but I can't see any reference to the MCT tool itself?<BR /><BR />At the moment, when I run my Random Forest model, all I get in the report is the Basic Summary, the percentage error for different tree numbers graph & the Variable Importance plot.<BR />That's all...no precision, accuracy etc stats<BR /><BR />I'm probably missing something really obvious here, but getting very confused! 😃<BR /><BR />cheers<BR /><BR />J<BR />Fri, 21 Sep 2018 20:28:49 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304840#M54019datascot2018-09-21T20:28:49ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304850#M54021
<P>Essentially, that tool is treated as a standard Alteryx macro. You'll want to include it in your workflow by right-clicking on the canvas and selecting "Insert Macro", then navigating to that file. Once you've done that you should be able to configure it, connect an input stream, etc.</P><P> </P><P>As for the rest of the outputs from the Forest Tool, which anchor are you looking at? Have you attached browses to all the output anchors?</P><P> </P><P>Let me know if I've misunderstood your question.</P>Fri, 21 Sep 2018 20:58:36 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304850#M54021tcroberts2018-09-21T20:58:36ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304853#M54023
ah ok...thanks!<BR />I misunderstood...I thought the tool was just a standard tool addition which would appear on the "predictive" toolbar.<BR /><BR />I haven't used a macro before in Alteryx so I'll investigate that!<BR /><BR />With the Random Forest workflow that I'm using, I'm looking at the "R" output from the Forest Model tool<BR />(I've attached the workflow)<BR /><BR />thanks again for your help!<BR /><BR />cheers<BR /><BR />J<BR />Fri, 21 Sep 2018 21:12:49 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304853#M54023datascot2018-09-21T21:12:49ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304865#M54026
<P>Oh yes, for some reason I was under the impression that the Forest Tool had an Interactive Report anchor as well, similar to the Decision Tree Tool.</P><P> </P><P>If the Model Comparison tool doesn't work for you, you could try to extract this information out of the O output using the R Tool.</P><P> </P><P>Let me know how it goes,</P>Fri, 21 Sep 2018 21:36:50 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304865#M54026tcroberts2018-09-21T21:36:50ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304867#M54028
<P>thanks...!<BR />I've worked out how to include the macro...so hopefully I should get the information ok from the Model Comparison<BR />Thanks for the tip about the R tool !</P><P>cheers</P><P>J</P>Fri, 21 Sep 2018 21:41:53 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/304867#M54028datascot2018-09-21T21:41:53ZRe: Forest Model
https://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/500995#M103232
<P>Hey folks, I'm using the Forest Model tool and I would like to look at the confusion matrix, precision score, etc. to see how accurate the model is, beyond the MSE = 0.04 that it reports. What can I do to obtain that?</P>Wed, 11 Dec 2019 05:52:50 GMThttps://community.alteryx.com/t5/Alteryx-Designer/Forest-Model-amp-Logistic-Regression-tools/m-p/500995#M103232Outlaws2019-12-11T05:52:50Z
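<P>As a rough illustration of how the precision and recall mentioned earlier in the thread are computed from a confusion matrix, here is a minimal Python sketch. It is not Alteryx functionality: it assumes a binary classification target (1 = default, 0 = no default) and that you already have actual and predicted labels side by side. The labels and the helper names confusion_counts and precision_recall are made up for this example.</P>

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def precision_recall(counts):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels: 1 = default, 0 = no default.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 1]

counts = confusion_counts(actual, predicted)
precision, recall = precision_recall(counts)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)

print("            pred=1  pred=0")
print(f"actual=1    {counts['tp']:>6}  {counts['fn']:>6}")
print(f"actual=0    {counts['fp']:>6}  {counts['tn']:>6}")
print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

<P>These measures only apply when the target is categorical; a report that shows MSE is describing numeric predictions, which would first need to be turned into class labels (for example by thresholding) before a table like this makes sense.</P>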