Hello, I would like to use a date field "date complete" as the target variable in my linear regression. It looks like I can't. Any suggestions on what I can do to be able to use the date field as the target variable?
Linear regression basically wants to predict a floating-point number. Assuming your training data has both a "DateStarted" and a "DateCompleted," you can easily calculate a "HowLongToComplete" as the date difference between the two, expressed as a floating-point number (measured however you wish: seconds, hours, days, whatever gives the best predictions). Then use that as your target variable. Once you have a predicted "HowLongToComplete," you just add it to any proposed "DateStarted" to get the predicted "DateCompleted."
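To make that concrete, here is a minimal sketch in plain Python rather than an Alteryx workflow. The sample rows, the "units" predictor, and the helper names are all made up for illustration, and the one-predictor least-squares fit is just a stand-in for whatever regression tool you actually use.

```python
from datetime import datetime, timedelta

# Hypothetical training rows: (DateStarted, DateCompleted, units of work).
# The "units" predictor is an assumption; use whatever fields you have.
rows = [
    ("2023-01-02", "2023-01-04", 10),
    ("2023-01-05", "2023-01-09", 20),
    ("2023-01-10", "2023-01-16", 30),
]


def parse(text):
    """Parse a YYYY-MM-DD string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d")


# Target variable: HowLongToComplete, in days, as a float.
xs = [float(units) for _, _, units in rows]
ys = [
    (parse(done) - parse(start)).total_seconds() / 86400.0
    for start, done, _ in rows
]

# Ordinary least squares with a single predictor (closed form).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x


def predict_completion(date_started, units):
    """Predict the duration, then add it to the proposed start date."""
    days = intercept + slope * units
    return parse(date_started) + timedelta(days=days)


predicted = predict_completion("2023-02-01", 40)
print(predicted)
```

The model never sees a date as its target, only a duration. The date only comes back at the end, when the predicted duration is added to the start date.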
That's unfortunate. Not knowing the data, I can only guess. Can you calculate a StartDate? For example, for some process type, is there a completion timestamp from which we can assume some sort of continuous processing? In that case the start date is the same as the most recent completion date from another row, and sorting plus a Multi-Row Formula would let you calculate the StartDate (and some sort of "TimeToProcess") very easily. If not that, is there any other logic that would give you a "TimeToProcess"?
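As a rough illustration of the continuous-processing idea, here is a plain Python sketch of the same sort-then-look-at-the-previous-row logic. The process types and timestamps are invented, and it assumes each job starts the moment the previous job of the same type completes, which may not hold for your data.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical rows: (process type, completion timestamp). No start date.
rows = [
    ("A", "2023-01-03 12:00"),
    ("B", "2023-01-02 09:00"),
    ("A", "2023-01-01 08:00"),
    ("A", "2023-01-02 10:00"),
    ("B", "2023-01-02 15:00"),
]


def parse(text):
    """Parse a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Sort by process type, then by completion time.
ordered = sorted((process, parse(stamp)) for process, stamp in rows)

# Within each process type, take the previous row's completion as this
# row's assumed StartDate, and derive TimeToProcess in hours.
results = []
for process, group in groupby(ordered, key=lambda row: row[0]):
    previous = None
    for _, completed in group:
        if previous is not None:
            hours = (completed - previous).total_seconds() / 3600.0
            results.append((process, previous, completed, hours))
        previous = completed

for process, started, completed, hours in results:
    print(process, started, completed, hours)
```

The first row of each process type has no earlier completion to borrow from, so it is dropped here. The remaining rows have a "TimeToProcess" that can serve as the regression target.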