Want to get involved? We're always looking for ideas and content for Weekly Challenges.
A solution to last week's challenge can be found here.
Use Designer Desktop or Designer Cloud, Trifacta Classic to solve this week's challenge.
Recently, your local arcade introduced a new virtual reality (VR) experience for people of all ages. They wanted to know how well the VR headsets were performing, so they conducted a survey last week to gather feedback from users. The arcade wants you to use the dataset from the survey to determine which brand of headsets had the highest rating. They will purchase more of those headsets in the future.
The arcade is currently working with three brands: HTC Vive, PlayStation VR, and Oculus Rift. They would like you to conduct an analysis for each brand.
The arcade owner also wants you to categorize users based on their age ranges. The age groups are as follows:
18–28 years old
29–39 years old
40–50 years old
51 years old and older
In the dataset, each user has a Duration (the length of time the user spent in the VR experience, in minutes) and a Motion Sickness Rating, a self-reported value from 1 to 10, with higher values indicating more motion sickness. Ideally, users would feel very little motion sickness regardless of how long they use the headsets. Apply the Fun Score formula, which combines the time spent in VR with the reported motion sickness:
Fun Score = Motion Sickness/60 * Duration in Minutes
Using the Fun Score formula, list the brand and age group combinations in which more than 20 people have a Fun Score of <1 (LIFE CHANGING!).
Fun Scores:
9 or above = Refund
8 or above = Really sick
7 or above = Sick
6 or above = Dizzy
5 or above = Feeling weird
4 or above = Pretty good
3 or above = Fun
2 or above = Great!
1 or above = AMAZING
<1 = LIFE CHANGING!
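For anyone who wants to sanity-check the scale outside of Designer, here is a minimal Python sketch of the formula and the label list above. The function names and the sample values (a rating of 3 after 15 minutes) are made up for illustration and are not part of the challenge dataset. It multiplies before dividing by 60, which is mathematically the same as the formula as written but keeps scores on a threshold exact, and it assumes each score gets the label of the highest threshold it reaches.

```python
# Thresholds from the Fun Scores list, ordered from highest to lowest.
FUN_SCORE_LABELS = [
    (9, "Refund"),
    (8, "Really sick"),
    (7, "Sick"),
    (6, "Dizzy"),
    (5, "Feeling weird"),
    (4, "Pretty good"),
    (3, "Fun"),
    (2, "Great!"),
    (1, "AMAZING"),
]


def fun_score(motion_sickness, duration_minutes):
    """Fun Score = Motion Sickness / 60 * Duration in Minutes.

    Multiplying first is equivalent and avoids rounding noise at thresholds.
    """
    return motion_sickness * duration_minutes / 60


def fun_score_label(score):
    """Return the label of the highest threshold the score reaches."""
    for threshold, label in FUN_SCORE_LABELS:
        if score >= threshold:
            return label
    # Anything below 1 falls through every threshold.
    return "LIFE CHANGING!"


# Hypothetical user: motion sickness rating of 3 after 15 minutes in VR.
example = fun_score(3, 15)
print(example, fun_score_label(example))  # 0.75 LIFE CHANGING!
```

Checking the thresholds from highest to lowest means the first match wins, so a score of 9.5 is labeled Refund rather than any of the lower labels it also satisfies.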
Source: https://www.kaggle.com/datasets/aakashjoshi123/virtual-reality-experiences
Not sure why it won't let me add my workflow snippet, but here it is:
My solution attached.
Python Tool
import pandas as pd
from ayx import Alteryx

# Read the survey data from the tool's first incoming connection
df = Alteryx.read('#1')

# Cast the fields used below to numeric types
df['Age'] = df['Age'].astype(int)
df['Duration'] = df['Duration'].astype(float)
df['MotionSickness'] = df['MotionSickness'].astype(int)

# Left-closed bins: [0, 29), [29, 40), [40, 51), [51, 999)
labels = ['18-28 years old', '29-39 years old', '40-50 years old', '51 years old and older']
bins = [0, 29, 40, 51, 999]
df['Age Group'] = pd.cut(df['Age'], bins=bins, labels=labels, right=False)

# Fun Score = Motion Sickness / 60 * Duration in Minutes
df['fun_score'] = df['MotionSickness'] * df['Duration'] / 60
df['Fun Score'] = df['fun_score'].apply(lambda x: 'LIFE CHANGING!' if x < 1 else 'Other')
# Alternative: pd.cut(df.fun_score, bins=[0,1,100], labels=['LIFE CHANGING!', 'Other'], right=False)

# Count users per age group, brand, and Fun Score label
df_agg = (
    df.groupby(['Age Group', 'VRHeadset', 'Fun Score'])['UserID'].count()
    .reset_index()
    .rename(columns={'UserID': 'Count', 'VRHeadset': 'Brand'})
)

# Keep the LIFE CHANGING! groups with more than 20 people, largest first
df_agg[(df_agg['Fun Score'] == 'LIFE CHANGING!') & (df_agg['Count'] > 20)].sort_values('Count', ascending=False)
My Solution:
My solution.