I have a dataset which contains different information such as Price, Rating etc. Rating will be always in range of 0-5, however other data like Price and Availability will be different ranges. I am trying to normalize them so that all the attributes look in the range of 0-5 scale.
Example: US has 3 cities with each city contain Price, Rating and Availability information. I am looking to see a summary information for each country for each attribute (Price, Rating and Availability) in the range of 0-5 based on the average of values within the country. This will help to plot them in a single graph to compare easily
Challenge is how can we adjust the pricing and availability information into a scale of 0-5 similar to how the Rating column is.
Base Data
Country | State | Name | Value |
US | CA | Price | 600 |
US | CA | Rating | 4.2 |
US | CA | Availability | 1 |
US | NY | Price | 800 |
US | NY | Rating | 3.9 |
US | NY | Availability | 1 |
US | PA | Price | 450 |
US | PA | Rating | 4.5 |
US | PA | Availability | 0 |
Portugal | Lisbon | Price | 320 |
Portugal | Lisbon | Rating | 4.1 |
Portugal | Lisbon | Availability | 1 |
Portugal | Porto | Price | 250 |
Portugal | Porto | Rating | 4 |
Portugal | Porto | Availability | 0 |
Expected Output
Country | Name | Value |
US | Price | 3.8 |
US | Rating | 4.2 |
US | Availability | 2.5 |
Hey @jay_viz
I don't quite see how the US Availability (with values 0, 1, 1) would end up being normalized to 2.5. But here's my attempt at normalizing the values. Hopefully this can help start you off in the right direction...
Hello, @jay_viz.
I am interpreting the minimum as 0 and the maximum as 5. This gives a different result, but I believe it's one method for solving the problem.
Hi, @jay_viz
Kindly consider interpolation - keep in mind that you might want to use Non-Zero Averages for all fields except Rating.
I hope you find this helpful - Cheers!