A little less than three years into using Alteryx, I still think about the same advice once given to me by an individual at Alteryx - you’ve probably heard his name. He’s outgoing, really fun to talk to, a great speaker, and doesn’t live in the U.S. (hope that didn’t give it away too much).
At the New York Stock Exchange - we hang out there because we are cool - I asked him, “Hey, there are so many tools out there. There is Alteryx. There is Tableau. There is Thomson Reuters. There is something called Power BI but I’m still not sure what that means. And there is code. Where do I start?”
He replied: “Pick one tool and learn to solve any problem with it.” At the beginning of my data journey (which feels like yesterday), I thought to myself, “I’ll just be an accountant for the rest of my life using columnar paper - ain’t nobody got time for Excel...there is just too much to learn to start a new career path.” Yes, I know I’m a bit dramatic.
However, what this individual said to me changed my entire approach to data analytics, data science, and being a self-learner. I knew I had the mind to problem-solve, think through complicated issues, and listen to and share new ideas, but the vast world of data analytics was overwhelming - until I boiled it all down and decided I’d pick one tool and solve any problem I face using it. That tool for me became Alteryx. Like, obviously.
So, for this use case, I wanted to share one problem I needed to solve using Alteryx. That problem was avoiding spending even more money.
It’s frustrating to be sold a tool only to be told afterwards that the tool doesn’t do “that.” Well, my client purchased Alteryx. Their advisors suggested they hire a data architect to help build a database system that would grow by about 12,000 records each month on average.
Here I am thinking: I need to solve any problem in Alteryx. I want to be the best I can be using Alteryx. I don’t want my client to have spent even more money. How can I help them best utilize the investment dollars they’ve already spent on their innovation journey?
On top of that, I don’t know SQL. I don’t know how to build a “real” database from scratch. But as I’ve said repeatedly, I want to solve problems and deliver my solutions using Alteryx.
To give a little more background, my client would generate about 12,000 records of transactional insurance information each month. The issue was that their outdated ERP system would only export a single month at a time, as an Excel binary workbook (let’s ignore that complicated fact). Regardless, every month the client would have to copy and paste the data into an Excel spreadsheet until the spreadsheet got too big to deal with.
My client wanted a single location containing all of their transactional insurance data that they could run queries on in Alteryx. And again, these are beginners...if you know what I mean.
Here is what we did.
Step 1: create database directory folder
First, we needed to create a database directory folder to hold our database files. We did this on the client’s network drive.
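Outside of Alteryx, this step is nothing more than creating a folder. Here is a minimal Python sketch of the idea - the path is just a made-up example, not the client’s actual drive:

from pathlib import Path

# Hypothetical network location for the database directory folder
db_dir = Path(r"\\client-network-drive\insurance_database")

# Create the folder if it does not already exist
db_dir.mkdir(parents=True, exist_ok=True)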
Step 2: create an empty .yxdb file with a DateTimeNow() timestamp in the filename
We built an analytical application that exports an empty .yxdb file to the new database directory folder.
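In Alteryx this is just an Output Data tool whose filename is built with DateTimeNow(). If you want to see the naming logic spelled out, here is a rough Python sketch - the “master_database” prefix and timestamp format mirror the filename you’ll see in Step 4, and the folder path is the same made-up example as above:

from datetime import datetime
from pathlib import Path

db_dir = Path(r"\\client-network-drive\insurance_database")  # hypothetical folder from Step 1

# Build a filename like "master_database_2020-10-01 17_04_11.yxdb":
# a fixed prefix plus the current date and time, so files sort by creation time
stamp = datetime.now().strftime("%Y-%m-%d %H_%M_%S")
starter_file = db_dir / f"master_database_{stamp}.yxdb"
print(starter_file.name)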
Step 3: the data
Our analytical application asks the user to upload the Excel document they want to add to the database. We confirm that every Excel document being added to the database has the month and year of its records appended to the filename, just before the .xlsb extension.
What this looks like is: the_jake_tool_data_january_2020.xlsb
Our workflow will filter out null records, append a new field containing “january 2020” (taken from the filename) to the dataset, and append a second new field containing the DateTimeNow() value.
So, to wrap up this step, we have our original data, the month and year taken from the filename, and the DateTimeNow() value.
Note: due to the limitations of .xlsb files, Alteryx sometimes takes longer to process them, which can result in records within the same dataset being given different DateTimeNow() values. Therefore, I use a Multi-Row Formula tool to copy the first DateTimeNow() value down to the rest of the records in the dataset.
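If it helps to see Step 3 outside of Alteryx, here is a rough Python sketch of the same idea: pull the month and year out of the filename, tag every record with it, and stamp every record with one shared load timestamp (the same effect the Multi-Row Formula trick achieves). The column names and the use of pandas are my assumptions for illustration, not part of the actual workflow:

import re
from datetime import datetime

import pandas as pd

def tag_upload(df: pd.DataFrame, filename: str) -> pd.DataFrame:
    """Mimic the Step 3 logic on an already-loaded DataFrame."""
    # Drop fully empty records, like the workflow's null filter
    df = df.dropna(how="all")

    # Pull "january" and "2020" out of e.g. "the_jake_tool_data_january_2020.xlsb"
    match = re.search(r"_([a-z]+)_(\d{4})\.xlsb$", filename.lower())
    month, year = match.group(1), match.group(2)
    df["period"] = f"{month} {year}"  # assumed field name

    # One timestamp for the entire upload, so every record agrees -
    # which is exactly what the Multi-Row Formula workaround guarantees
    df["loaded_at"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return df

# Example with a tiny dummy dataset
sample = pd.DataFrame({"policy": ["A-1", "A-2"], "premium": [100.0, 250.0]})
print(tag_upload(sample, "the_jake_tool_data_january_2020.xlsb"))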
Step 4: adding the data to the database
We made sure that our client’s analytical application is linked directly to the database directory folder we already referenced. In our team’s initial build-out, we actually let the user point the analytical application to the database directory themselves.
The workflow inputs all of the .yxdb files in the database directory. Here I’ll refer to the empty database we just created; at this point, our database directory only has that one file, labeled “master_database_2020-10-01 17_04_11.yxdb.”
The workflow then filters for the most recently created file based on the DateTimeNow() timestamp included in the .yxdb filename - though in this situation, there is only one file.
Our workflow will automatically remove the placeholder record with the value “new_file_to_be_used_to_re_start_new_database”, and union the new data to the bottom of the existing data (in this first month, there is no other real data).
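The “most recently created” logic is the heart of Step 4, so here is a rough Python sketch of just that part - deciding which .yxdb is newest purely from the timestamp embedded in its filename. Reading the .yxdb files, dropping the placeholder record, and the union itself all stay inside Alteryx; the folder path is the same made-up example as before:

import re
from datetime import datetime
from pathlib import Path

db_dir = Path(r"\\client-network-drive\insurance_database")  # hypothetical folder

def newest_database_file(folder: Path) -> Path:
    """Return the .yxdb whose filename carries the latest timestamp."""
    def stamp(path: Path) -> datetime:
        # Filenames look like "master_database_2020-10-01 17_04_11.yxdb"
        match = re.search(r"(\d{4}-\d{2}-\d{2} \d{2}_\d{2}_\d{2})", path.name)
        return datetime.strptime(match.group(1), "%Y-%m-%d %H_%M_%S")
    return max(folder.glob("*.yxdb"), key=stamp)

print(newest_database_file(db_dir))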
Step 5: the next month
So, in February, our client would upload “the_jake_tool_data_february_2020.xlsb.” We will now have two files in our database directory folder - the first file we created (that was technically blank), and the file we created with January 2020 information. Our workflow will filter for the most recently created database file - in this case, the January 2020 file - append the February 2020 information to the bottom of that data, and write out a new .yxdb file.
Now, did I lose you? Well, good thing we have Zoom 😉 ha!
So, after we added the February information, we would have three files: the blank starter file, the January 2020 file, and the new file containing both January and February 2020 data.
Our database directory folder will keep getting larger and larger. It’s up to the client whether they want to delete the earlier .yxdb files, but given the size of their network drive, I suggest they keep them around as backups.
Lastly, I want to include some pointers.
Did I upload this data already?
We built in a test that notifies the user if they have already uploaded a specific Excel file into the database. Because we require the Excel filename to include the month and year of the data, we can query our database to see if there is already data from that month and year in the database. If the analytical application sees that there is already information for that month and year, the user can choose, within the analytical application, to replace the existing data for that time frame with the new data in the Excel file being uploaded.
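For anyone curious what that check boils down to, here is a minimal Python sketch, assuming both the database and the upload carry the “period” field from the Step 3 sketch (e.g. “january 2020”). In the real application the replace-or-keep decision is a user selection; here it simply replaces:

import pandas as pd

def merge_upload(database: pd.DataFrame, upload: pd.DataFrame) -> pd.DataFrame:
    """Add an upload to the database, replacing a month that is already there."""
    period = upload["period"].iloc[0]
    if (database["period"] == period).any():
        # The month and year already exist: drop the old rows, keep the new ones
        database = database[database["period"] != period]
    return pd.concat([database, upload], ignore_index=True)

# Example: re-uploading January replaces the existing January rows
existing = pd.DataFrame({"policy": ["A-1"], "period": ["january 2020"]})
rerun = pd.DataFrame({"policy": ["A-1R", "A-2R"], "period": ["january 2020"]})
print(merge_upload(existing, rerun))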
Building further analytical applications
We asked our client to give us some ideas for reports and charts they’d like, built from the information in their new database.
Now, we just include a macro after a Directory tool that references the database directory folder and pulls the latest database file created. With the click of a single button, and no other analytical application input, the user can export graphs and charts created in Alteryx and publish them to PDF files.
Our client can now simply click “finish” on an analytical application to receive a list of all of the vendors included in the database, sent directly to their email address.
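As a flavor of what those follow-on apps do once the latest database file is pulled, here is a tiny Python sketch of the vendor-list query. The “vendor” field name is an assumption on my part, and the emailing and PDF publishing are handled by Alteryx itself:

import pandas as pd

def vendor_list(database: pd.DataFrame) -> list:
    """Return the distinct vendors in the database, sorted alphabetically."""
    return sorted(database["vendor"].dropna().unique())

# Example with dummy data
print(vendor_list(pd.DataFrame({"vendor": ["Acme Ins.", "Beta Re", "Acme Ins."]})))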
The list of additional applications we can now provide our client is endless.
The best part is, everything was done in Alteryx. Our client’s Alteryx expertise has grown, and they feel comfortable making changes and editing anything we share.
Lastly, you might question what happens if we reach the .yxdb size limit. I think that is for another article, but I’d love to hear your ideas in the comment section below. If I have the patience, I’ll definitely share what we are thinking.
Hi,
This looks awesome. Thank you so much for sharing.
However, would you be able to share the workflow so that I can try it on my own? I couldn't get enough detail from the picture you've shared. Also, there are a few steps that I might be able to help with.
Thanks
Hi,
This was phenomenal. I really loved the way you used Alteryx to solve the problem. I would definitely want to hear more about this solution - did it work, and were there any more add-ons that you built to solve more problems? It would be great if you could share the workflow with dummy data so that I could build it too. Further, I would also like to hear about your journey of coming from a totally different background and how you managed it.
Thank you
Hi ThalitaC,
Thanks for sharing your learning journey and how you solved your challenges. Your planning and execution in creating a database are simply awesome.
A very interesting success story - would love to hear more real examples 👍