Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Finding best combination of rows-columns with less nulls

Highlighted
8 - Asteroid

I have a dataset with 60000+ rows and 150+ columns. This consists of the various test results done on patients in a hospital. 

 

Given below is a small sample of the data. Rows (P1-15) are the individual patients, Gender M-Male, F-Female. Columns T1 to T5 denote the test results done on them. The null values mean the particular test was not done on the patient. 

 

 GenderT1T2T3T4T5
P1M0.4410.134 0.2000.092
P2M0.075  0.6670.811
P3F 0.8900.4240.177 
P4F 0.0810.427 0.802
P5F0.6060.3260.7120.1740.125
P6M0.606 0.577 0.942
P7M0.4510.2940.5940.9910.201
P8M0.045  0.548 
P9F 0.7580.3900.1360.945
P10F 0.3190.3970.5780.728
P11F0.379 0.9180.7900.927
P12M0.222 0.933 0.509
P13F 0.260 0.8840.639
P14F0.6670.712 0.576 
P15M0.7670.650   

 

Now, i would like to find the best combination of patients-tests where there are no/minimum number of nulls with maximum number of patients and tests covered. This could be an optimization problem. Any idea how this can be done in Alteryx?

 

Highlighted
12 - Quasar

Your best bet is to pivot the data (this will produce a large number of records due to your column count), and then use the summarize tool finding the Min/Max (whichever you prefer), grouping by patient.

 

Once complete, then join back to your original dataset to find the trial value. Note, in this instance if there are ties then you'll have to account for that as you will get multiple rows.

 

Highlighted
Alteryx Certified Partner

Hi @GaRaGe

Here i attached my file.

Hope it will give you answer for your question.If any queries let me know.patient.png

Labels