Hello everyone,
I can't find a solution for the following task. I received a comprehensive training data set of used Mercedes Benz cars, and I now have to forecast prices for other used Mercedes Benz cars.
The data set contains several thousand Mercedes Benz vehicles with their prices and attributes (fuel type, workshop visits, km, gearbox, model, etc.).
I aim to build a model that provides accurate predictions for the missing prices.
Specifically, I need price forecasts for several used MBC C-Class cars.
Best wishes and thanks for your help
Niklas
Hi @niklasstepanek ,
I'll try to give a general idea of how to approach this problem.
The first step should be to analyze the data at a general level using the tools in "Data Investigation" (e.g. Field Summary, Association Analysis). This gives you a first impression of the relationships in the data (which measures could influence the price).
Next, you'll have to perform some data cleansing and transformation (simplified version: remove records containing NULL values or illogical values such as a negative price; for practical application, you'll have to decide whether rows/columns should be removed or imputed).
Depending on the previous findings, you should remove all unnecessary fields (e.g. VIN).
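To make the simplified cleansing concrete, here is a small Python sketch of the same two steps: drop rows with missing or illogical values, then drop fields that carry no predictive information. The sample rows, the field names and the helper `clean_records` are assumptions for illustration, and the sketch deliberately removes incomplete rows instead of imputing them.

```python
# Hypothetical raw records; None stands in for a NULL value.
raw = [
    {"vin": "VIN-0001", "km": 45000, "fuel": "diesel", "price": 27500},
    {"vin": "VIN-0002", "km": None, "fuel": "petrol", "price": 31000},
    {"vin": "VIN-0003", "km": 80000, "fuel": "diesel", "price": -500},
    {"vin": "VIN-0004", "km": 120000, "fuel": "petrol", "price": 14900},
]


def clean_records(records, drop_fields=()):
    """Drop rows with NULLs or a non-positive price, then drop unneeded fields."""
    cleaned = []
    for row in records:
        if any(value is None for value in row.values()):
            continue  # simplified: remove incomplete rows rather than impute
        if row["price"] <= 0:
            continue  # illogical value, e.g. a negative price
        cleaned.append({k: v for k, v in row.items() if k not in drop_fields})
    return cleaned


cleaned = clean_records(raw, drop_fields=("vin",))
for row in cleaned:
    print(row)
```

Only the first and the last sample row survive here, because the second has a missing `km` and the third has a negative price.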
The resulting dataset is the input for the Linear Regression tool (or another predictive tool, e.g. SVM, Neural Network). The "quality" of the model is reported by the "R" or "I" output anchor; the model itself is delivered by the "O" output anchor.
You can connect the "O" anchor to a Score tool ("M") and use the test data file as the data input; the result will be the predicted price.
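For orientation, the fit-then-score idea can be sketched in plain Python as below. This is not the implementation of the Alteryx tools; it is an ordinary least squares fit via the normal equations, written with the standard library only so that it runs anywhere. The two features (`km` in thousands, and a 0/1 flag for an automatic gearbox), the toy numbers and the function names are assumptions for illustration.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares with intercept, solved via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]  # [intercept, coef_1, coef_2, ...]


def score(coeffs, rows):
    """Apply a fitted model to new rows and return the predicted prices."""
    return [
        coeffs[0] + sum(c * float(v) for c, v in zip(coeffs[1:], row))
        for row in rows
    ]


# Hypothetical training data: (km in thousands, automatic gearbox 0/1) and price.
train_rows = [(20, 1), (50, 0), (80, 1), (120, 0), (150, 1)]
train_prices = [30000, 25000, 24000, 18000, 17000]

# Hypothetical cars without a price, i.e. the rows to be scored.
test_rows = [(100, 1), (60, 0)]

model = fit_linear_regression(train_rows, train_prices)
predictions = score(model, test_rows)
print("coefficients:", [round(c, 2) for c in model])
print("predicted prices:", [round(p) for p in predictions])
```

Categorical fields such as fuel type or model would have to be encoded as 0/1 flags in the same way as the gearbox before they can enter a fit like this.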
Hope this is helpful as a first introduction.
Best,
Roland