The Thrill of Solving in Higher Education: Alteryx + Hong Kong Polytechnic University
The Hong Kong Polytechnic University (PolyU) has roughly 29,000 students and 5,500 academic and professional staff from Hong Kong and around the world. PolyU started as a traditional polytechnic institution emphasizing vocational training and practical skills. Today, the university provides a much wider range of programs across professional education, academics, and applied research. Anson Wun, Senior Institutional Research Analyst, is part of a team in the Institutional Research and Planning Office (IRPO). Reporting to the provost, his team consolidates large volumes of data from various internal and external sources and provides analytics for senior management to better understand how the university is performing against key measures and to make strategic decisions about growing the institution. The team is also responsible for reporting key institutional statistics to the government.
PolyU is evolving into a full university, but changing its culture and administrative processes has presented a number of challenges. “We are still going through a phase where we are trying to go from the polytechnic culture to a university that excels in teaching, research, and knowledge transfer, which means there are lots of changes to everything from programs offered, pedagogy, infrastructure, to reforming our administrative processes,” said Anson.
Anson’s team gathers data from academic departments and central administrative units, the government, and other external sources. He estimates that 90 percent of the data files he receives from internal departments are Excel-based. The data from external sources are likewise downloadable only in CSV and Excel formats. “Our team has traditionally relied on Excel, or Microsoft Access for our more technical staff. Beyond that, we’ve had to rely on our centralized IT unit for more advanced data requests. But that was proving to be a lengthy and highly complicated process to get the information we needed.”
Anson’s first major task in the IRPO was to create a balanced scorecard for nearly 30 academic departments, specifically to set KPIs for senior management to evaluate departmental performance. The team began looking at indicators like student-to-staff ratio, student diversity and the percentage of incoming/outgoing exchange students, service-learning opportunities, intake quality, and more. They also looked at research performance, such as income from both public and private sectors, citations in influential journals, and other bibliometrics.
Anson estimates that the “first cut” of blending after receiving data from departments was taking weeks, so his team could run the process only on a quarterly basis, a significant limit on the team’s output. Anson explains, “We had to deal with 100 different sizable files every month and we just reached a point that we couldn’t handle the load. We needed to process data much faster to provide the insights our stakeholders required to make decisions.” Not only was data processing taking too long, but if anyone had follow-up questions about the data or wanted to modify variables to look at the data from other angles, the team had to go back to the original data sources, re-slice the data, and repeat the entire process.
Anson needed a robust self-service data processing and analytics solution that wasn’t reliant on highly technical and error-prone configurations. To cut down the learning curve, Anson was also looking for an intuitive user interface with simple drag-and-drop functionality, plus on-demand online training and support resources. Additionally, he needed the ability to share work and collaborate with colleagues in real time.
Anson learned about Alteryx through Velocity Business Solutions, an Alteryx partner also based in Hong Kong. At the time, he had been using QlikView for data visualizations, and Velocity recommended Alteryx not only to speed up data preparation and blending but also to explore more advanced capabilities such as predictive analytics. Once he downloaded the trial and the Alteryx Starter Kit, it was the point of no return!
Understanding Research Output
Anson shares an example of what Alteryx allows him to achieve. He and his team analyze the research output of PolyU. In the past, the focus was on the number of papers published, on the assumption that the more papers faculty produced, the better it was for the university. Realizing this was not a comprehensive measure, they decided to look at a more robust set of bibliometrics to understand how, where, and how many times papers were cited (referenced) in published material. He can also look at citation patterns and analyze the frequency and lifecycle of these citations to get a better understanding of topic relevance, quality of publication sources, and collaboration impact.
In the past, he could only do one citation analysis at a time, because it took so long just to hunt and slice that data. “With Alteryx I can now look at more metrics such as number of publications in top journals, citations impact, or mutual benefits of collaborations. I can look at the correlation between these metrics, and their relationship to the demographics of the departments and researchers. That was all impossible to do before.”
Looking Backwards, Moving Forward
Every six years, the government assesses research activities of each university, which can result in adjustments to grant monies that universities receive. Due to the assessment model of the exercise and its backward-looking nature, the results may not always reflect the true potential of PolyU’s researchers. So, the IRPO team was tasked with doing deeper analytics. Anson explains, “What we found is, when we benchmarked ourselves, using a wider set of metrics, against other institutions included in the assessment or even the world, we were actually performing quite well in many disciplines. So we were able to do cross-validation with the assessment results, as well as providing additional insights to departments. I can’t imagine being able to achieve this without a powerful tool like Alteryx.”
Using Alteryx, Anson was able to validate senior management’s concern with more comprehensive measures. This analysis has helped balance internal and external expectations across academic departments at PolyU.
According to Anson, “We often need to analyse research metrics of publications. One of the sources we use is Scopus, the world’s largest abstract and citation database of research publications such as journal articles, books, conference proceedings, etc. It allows users to download details of each publication. But there is a limitation in the amount that can be downloaded, and some metadata may not be always available.
Conference proceedings data, in particular, are a case in point. There are tens of thousands of conference proceedings. Due to the limitation of Scopus, we can only download 2,000 records each time. And a bulk of the metadata associated with the conferences are not available this way. There are APIs to get around this, but this implies getting IT to write programmes for extraction and transformation, and the turn-around time is usually long.
So this workflow comes in very handy: it first constructs the API call (using Download Tool) from existing publications data that we have. The returned data will be in JSON format. Then we use the JSON Parse Tool to parse the data into properties-values pairs. A regex in the Formula Tool follows, truncating the properties strings that will next be used as column names (Cross Tab Tool). It’s then finally output to yxdb for further processing.
The workflow may appear to be rather straightforward and innocent-looking, but the amount of time that it saved us is unfathomable. If we resort to the pre-Alteryx way, due to the sheer amount of data and manual work, we can only afford to do it perhaps twice a year. Any changes to the parsing programme will also require extra IT efforts. With this workflow we can do it anytime we want, and can modify the parameters easily by ourselves.”
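For readers who do not use Alteryx, the steps Anson describes can be approximated in plain Python. What follows is only a minimal sketch under stated assumptions, not PolyU's actual workflow: the endpoint URL, the query parameter names, and the sample response are hypothetical placeholders rather than the real Scopus API, and no network request is made. It shows the four ideas in order: build one API call per known publication, flatten the returned JSON into property-value pairs, shorten the property names with a regex, and pivot the pairs into one row of named columns.

```python
import json
import re
from urllib.parse import quote, urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.org/abstract"


def build_url(pub_id, api_key="YOUR_KEY"):
    """Construct one API call per publication ID (stands in for the Download step)."""
    query = urlencode({"apiKey": api_key, "view": "FULL"})
    return f"{BASE_URL}/{quote(str(pub_id))}?{query}"


def flatten(node, prefix=""):
    """Yield (dotted.path, value) pairs from nested JSON (stands in for the JSON parse step)."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}.{index}" if prefix else str(index))
    else:
        yield prefix, node


def short_name(path):
    """Drop leading path segments, keeping the last named key and any list index after it."""
    return re.sub(r"^.*\.(?=[^\d.])", "", path)


def to_row(raw_json):
    """Pivot property-value pairs into one row of columns (stands in for the cross-tab step).

    If two paths shorten to the same name, the later value overwrites the earlier one.
    """
    return {short_name(path): value for path, value in flatten(json.loads(raw_json))}


# Made-up response used in place of a real download.
SAMPLE = json.dumps({
    "response": {
        "id": "P1",
        "conference": {"name": "ExampleConf", "year": 2015},
        "authors": ["A. Author", "B. Author"],
    }
})

if __name__ == "__main__":
    print(build_url("P1"))
    print(to_row(SAMPLE))
```

In a real run, each URL from build_url would be fetched and its response passed to to_row, and the resulting rows written to a file for further processing. Changing the regex in short_name is the kind of parameter tweak the team can now make on its own.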
Alteryx gives Anson’s team a robust and versatile platform to perform analytics in a timely and flexible manner, and make decisions based on data. He says, “With society, demographics, and culture changing so fast, if we don't have a very robust and flexible way to process and analyze data, then we risk falling behind.”
Repeatable Workflows and Self-Service Analytics
Using Alteryx, Anson can create reports and analyses of individual departments using a streamlined workflow. It takes his team only a few hours to compile all the different data sources from various departments, translate the data into consumable formats for visualization, create a model, and update dashboards in real time. A project that used to take two weeks can now be done in just a few hours! As a result, the team can significantly cut the lead time to refresh reports and dashboards. “Some reports that I used to produce quarterly can now be done on a monthly basis,” commented Anson. “This also frees up time to work on other projects, and allows management to identify issues and monitor trends more effectively.”
What’s more, the process is fully repeatable. Anson can make adjustments easily based on alternative ways of looking at data sets, without having to redo the whole process. He is able to share his workflow so that other team members can do any analyses they might want to do, when they want to do it.
With built-in analytics, Alteryx not only gives Anson the power to blend data, but also allows him to add predictive, spatial, and even prescriptive insight to his datasets.
Anson says, “Alteryx has decluttered our team’s processes and deliverables. Before Alteryx, I was running multiple Excel spreadsheets simultaneously, and perhaps throwing in Microsoft Access or Power BI, then other statistical or scripting tools in order to overcome the shortcomings of each of these individual tools. With Alteryx’s comprehensive set of modules, we’ve completely optimized data management and analytics output by bringing many of these tasks under one roof.”
Anson continues to learn new things while building workflows in Alteryx. He says, “Just when I think I have ‘nailed it’, I discover there are smarter ways I can accomplish objectives in the platform.” While learning the latest best practices or other features and tools available, he’s often able to extrapolate even deeper insight from his data sets. “This is really beyond what I expected when I first started using Alteryx.”
When Anson tells colleagues about Alteryx, he says many naturally question what benefits it has over their traditional approaches with spreadsheets. He explains that it is like comparing a smartphone to a feature phone. “Like a smartphone, Alteryx lets us tap into an entirely different culture and mindset. It’s a whole new way to manage, explore, and discover data.” He elaborates, “Alteryx is very humbling, because it shows how much you don't know about your own data.”
Using predictive analytics with Alteryx, his team has embarked on modeling student projections. The team is now identifying factors related to student progress and using them to predict outcomes. With the help of Alteryx’s intuitive workflow interface and powerful integration with R, they can easily modify the underlying parameters and optimize the statistical models. “I think we are now ready to tap into the next level of Alteryx, which is its predictive and prescriptive capabilities. And we are very excited about that,” Anson said.
Anson values the online training materials and finds the Alteryx Community invaluable. He explains, “There’s no better way to improve oneself than from sharing with fellow users and taking advice from the experts!”