2018 Excellence Awards Entry: Alteryx Providing the Data Source Functionality of a Conventional Data Warehouse
Name: Gene Denny
Title: Data Systems Integration Specialist
Company: Des Moines Public Schools
Overview of Use Case:
With no fewer than ten major database systems in play at Des Moines Public Schools, an urban school district of 33,000+ students, the ability to leverage student data points from disparate systems for the purpose of driving decisions and forecasting trends was a challenge. Outside consultants were engaged at different times in an attempt to architect a solution which would afford the district the luxury of using the various data points and systems, independent of the environment in which the data resided. In most of these cases, large scale implementations of full-fledged Data Warehouses were proposed, costing tens to hundreds of thousands of dollars so that the data could be run through an ETL process and subsequently stored in a unified structure. By strategically placing Alteryx between the disparate data sources and Tableau Server, an entirely new world of analytics could be achieved benefiting students, teachers, schools, administrators, and other stakeholders in the community as a whole.
Figure 1: Architecture as proposed by outside consultants.
Describe the business challenge or problem you needed to solve:
The Des Moines Independent Community School District, also known as the Des Moines Public Schools or DMPS, is a large urban school district located in the heart of the Midwest. As a public school district in Iowa, operational funding is provided primarily via property tax valuations. While other funding sources in the form of grants, etc. are also available, school districts in Iowa have a capped amount of Spending Authority based upon student enrollment and that Spending Authority may only be exceeded in certain circumstances. Thus, the need to be creative in the search for solutions to common business problems is of utmost importance.
In the case of DMPS, data points for students are being housed in "Data Islands" based upon the type of data stored. While not an all-inclusive list, the following data systems are in use:
Infinite Campus -- Student Information System housing student Demographics, Attendance, Gradebook, Transcripts, Behavior, Assessment, Health, Fees, Graduation Progress, and State Reporting data elements.
CanvasLMS -- A Learning Management System recently integrated providing the delivery and tracking of curriculum components as desired by classroom teachers. This system is fully customizable by teachers with regards to components and resources provided to students as well as both manual and automated assignment grading.
SunGard BusinessPlus -- A suite of programs for Human Resource Management, Wages, Credentials, Purchase Orders and all required Financial Applications.
NOVAtime -- Automated Time and Attendance program for Time Clock and additional Employee Tracking services.
Follett -- Library Management software for circulation and reference activities.
Education Logistics (EDULOG) -- School bus routing and planning software.
MCS Software -- Point of Sale and Food Service software for school cafeterias.
Qualtrics -- Experience Management Software (student, staff, and community surveys and/or feedback).
DMPS has long been investing in Tableau for data reporting, but the ability to blend data from these varying data environments has been limited at best, partially due to the size of the data environments involved. Infinite Campus, which serves as the Record of Authority for student data elements, currently sits at over 1.3 BILLION rows. Adding in the other data sources, each functioning on their own SQL database (or similar) and the potential data in need of being blended easily tops 4 BILLION rows. While Tableau's ability to handle the amount of data referenced is impressive, even with the advent of Hyper the return time for generated reports is not always sufficient. Tableau data extracts are used and able to solve some of the performance issues, but cannot be updated often enough to allow for Real-Time, or at least Near-Time, data in all instances. Proposed solutions to this challenge typically incorporated either an on-premise or cloud-based Data Warehouse solution, similar to Figure 1, where the necessary data could be unified in a single, contiguous data environment. These solutions were not feasible in the financial climate of public schools at the time. There HAD to be a better way to look at relationships between data points that lived and breathed in different worlds. Take the following questions as examples of the “data island” challenge schools face:
What effect does a teacher's degree level or years of experience have on student achievement? (BusinessPlus, Infinite Campus, CanvasLMS).
What is the Per Pupil Spend at North High School vs Roosevelt High School when supplies, services and salaries are all considered? (Infinite Campus, BusinessPlus).
How does the Per Pupil Spend above correlate to the amount of time students are logged into their coursework and does it fluctuate by time of day? (CanvasLMS, BusinessPlus, Infinite Campus) * Note: Students not engaged in coursework outside of the regular school day can be an indicator of a school or district's "Digital Divide, also known as “The Homework Gap."
How do students perform on standardized assessments when the assessments are given 2 hours prior to lunch? 1 hour? 2 hours after lunch? (MCS Software, Infinite Campus, CanvasLMS).
Describe your working solution:
A solution to these disparate "data islands" was found in the offerings of Alteryx. Initially, DMPS purchased Alteryx Designer as a tool to more easily shape large data sets coming from the Student Information System (Infinite Campus) so that Tableau reports could more quickly be rendered for front line users. As the district became more comfortable with the application, additional data sources were brought into the fold of the existing workflows, resulting in reports which contained data elements from very large data sets previously disparate in nature. (Figure 2)
Figure 2 (End State): Alteryx strategically placed between disparate student data environments and Tableau Server so that current and future reports are able to incorporate blended data from several large data systems. Systems positioned, but not yet connected via Alteryx, are marked with (*).
At about the same time as the realization of Alteryx's abilities, Des Moines Public Schools had also embarked on a mission to identify and implement a full-featured Learning Management System, or LMS. Learning Management Systems have been the emerging software products in education for the past several years and are now starting to enjoy significant success in both the K-12 and higher education marketplace. Their role is to administer, document, track, report and deliver educational courses and/or components to students over a wide range of platforms and have become an essential tool in the toolbox of professional educators. Des Moines Public Schools chose CanvasLMS as the software platform and implementation occurred beginning in early 2017.
One of the abilities that the chosen LMS brought to the table was an additional Gradebook for teachers that lived in the CanvasLMS cloud. While Infinite Campus would continue to be the Record of Authority for student grading and assessment, CanvasLMS provided the ability for educators to grade and assess students from within the same environment as the curriculum itself, often times in an automated fashion. The result? Grading data points for a given student could potentially exist in two disparate locations. Prior to the purchase of Alteryx Designer, this in itself could have presented immense challenges with regards to reporting trends in comprehensive day to day student performance for teachers.
However, because Alteryx Designer is fully implemented, it becomes a straight forward task to connect the two databases via a single Alteryx workflow, blend and shape the data as desired, and subsequently publish the data to Tableau Server to be leveraged by those closest to the student. (Figure 3)
Figure 3: Critical grading and assessment data sources blended for the purpose of trend analysis.
The resulting report, and others which could be constructed in a similar fashion, provides trend data that is INDEPENDENT of where the data points live. Front end Tableau users are able to see, interpret, and recalculate student performance trends on a day to day, week to week, or custom date range basis and remain completely unaware that the data may be coming from multiple sources. (Figure 4)
Figure 4: A student's assignment performance over time and the resulting performance trend. Individual data points are rendered from differing data systems, unbeknownst to the user.
The ability to create reports such as the one referenced in Figure 4 is nothing overly impressive in today’s world of data analytics. Organizations in various market segments have long been able to render such reports with the help of an on-premise or cloud-based Data Warehouse which unifies the data contained in the data systems. The structure of the Data Warehouse provides the flexibility to create reports and analyses from a single, contiguous data source. Additionally, industry-specific interoperability frameworks have allowed for the sharing and propagation of data across systems. Education, as a market space, falls behind in these types of "cooperating" data systems. Furthermore, Data Warehouse solutions, whether on-premise or cloud-based, are often out of reach financially for public education entities. Alteryx Designer is currently able to bridge this gap by providing the functionality of a unified Data Warehouse as it relates to reporting and analysis, but through a virtual architecture.
Describe the benefits you have achieved:
Since the original implementation of Alteryx Designer, Des Moines Public Schools has invested in Alteryx Desktop Automation which has allowed the district to schedule and fully automate the workflows in operation. DMPS currently runs over 20 scheduled, unattended workflows from one to ten times per day. This scheduling of workflows provides DMPS and its users with the reporting functionality of a unified Data Warehouse that is Near-Time in its nature. While DMPS may need to eventually invest in a more conventional Data Warehouse solution for unifying student data, the time for planning and financing such a solution can now be expanded.Figure 5: Disparate data environments blended and published to Tableau Server. Users are not cognizant of the lack of a unified data platform, such as a formal Data Warehouse.
Figure 6: Additional disparate data environments providing the reporting functionality of a conventional Data Warehouse’s unified data architecture.
Figure 7: On-the-fly trend recalculation utilizing data points from disparate data environments leveraging Alteryx as a Virtual Data Warehouse.
As Des Moines Public Schools looks to the future, it is not possible to forecast exactly which applications will be implemented across the entire organization or in individual schools or classrooms. However, with Alteryx Designer with Automation positioned as it is between the myriad data environments and the reporting mechanism (Tableau), DMPS has strategically made use of Alteryx's ability to provide billions of rows of blended data in a clean, quickly-rendered format to provide a "data pump" from each of the available data environments. The result is a data environment for analytics that appears unified in architecture -- a “Virtual” Data Warehouse.