Preface: A commonly asked question with no simple answer.
A friend once asked me: “One question I have is where to start with data analytics. Do we choose a platform or software first, or start with a concept? Also, how will software help, compared to traditional Excel?”
My response to her was:
“I feel there is no one-size-fits-all answer; data analytics is a very broad umbrella term. Personally, I start with my own use cases, what I need, a current-state assessment so to speak, and identify the “block” that is routine in nature yet takes up the most time. Basically, if I need to do something N times, it takes me N or more units of effort.
Depending on the use case, the answer to where to start can be different.
However, if what you have in mind is generic, I suggest you do a broad classification of your “learning path”. Are you aiming to be a data engineer, a data scientist, a business analyst, or user/consumer of the business data analytics or …? Having this clarity helps you to decide how much effort to allocate to which areas. Since it is impossible for one person to know everything, we need to know not only what we can do but also what others in the team or the organization can do, so that we can make the most of the resources at hand.”
Her questions are not unusual.
I am often asked “Where do I start with data analytics?” by friends and colleagues. While the question itself is often agnostic of industry (retail, tech, F&B, VC) and organizational role (strategy, value creation, portfolio management, business development), probing further makes it clear that the underlying use cases driving this desire to learn “analytics” vary vastly.
Rather than discussing the concept, I think it is more useful to share my actual use cases, my learning journey. The aim is not to provide a comprehensive guide to ALL software/systems out there. I am not an expert in that. I can only share my personal journey on learning and applying the analytics software at work.
Hopefully, you will see some aspects that are relevant to your own use cases, which you can adapt and leverage to design your own learning curve.
PART I: WHY ALTERYX?
My use case: Alteryx = superb data blending for due diligence.
PART II: HOW TO GET FROM ZERO TO “GOOD ENOUGH” FOR EXCEL USERS
For Excel users (part 1): “Is Alteryx difficult to learn?”
For Excel users (part 2): “How much time do I need to spend to learn Alteryx?”
For Excel users (part 3): “How to get started?” – Towards Core Certification
For Excel users (part 4): “Going beyond Core to Advanced?”
PART III: BUILDING TEAM CREDS AND USE CASES
How weekly challenges, certifications, and more can boost your analytics creds, and your team’s as well.
Selected use cases. For due diligence and beyond!
MY USE CASE: Alteryx = superb data blending for due diligence (OR WHY YOU NEED MORE THAN EXCEL)
I do financial due diligence on companies for a living.
Amongst other things, financial due diligence involves looking at the historical financial results of the company to assess (i) the key business drivers, (ii) how the trends of historical financial KPIs are linked to the activities and initiatives taken by management, and (iii) impacts of the external factors that influence the business such as relationships with customers, suppliers, and overall macroeconomics.
The most dominant tool of the trade for due diligence has been Excel since before I started working. Excel is really cool when we need flexibility, simple data, and quick, bite-sized analysis.
But… I often need an “Excel-plus” package because:
Data provided for due diligence are “outputs”, i.e. management reports generated for specific purposes, and they often need significant processing to be suitable for due diligence. Often there are multiple reports generated under different business rules, and the due diligence involves unraveling / reverse-engineering them to find the “anchor” set of numbers.
I need to blend financial and operational data. Blending is critical because financial results are manifest outcomes of the operational activities, so operational and financial data need to be blended and analyzed together to understand the key business drivers. Very often the business-as-usual KPI sets need to be challenged and enriched to support/disprove the incoming investor’s hypothesis.
Triangulating and reconciling different datasets: companies are organized by function, and different datasets are often “owned” by different people, do not talk to each other directly, and are aggregated at different levels of detail.
The data is getting larger. The newest version of Excel is limited to 1,048,576 rows by 16,384 columns, which is good enough for most general uses, although, in my experience, performance on a normal laptop degrades quickly once an Excel file exceeds 40MB. For due diligence on GL data or customer transactions, datasets can run into GB territory with hundreds of millions of records.
Lastly, we often receive multiple sets of data and need to quickly update all analyses, including reconciliation checks.
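To make the blending problem concrete for readers who think in code: joining financial results to operational volumes on a shared key is the same operation a row-by-row VLOOKUP performs. A minimal sketch in Python/pandas (all data below is made up for illustration; this is an analogy, not how Alteryx works internally):

```python
import pandas as pd

# Hypothetical monthly financial results from the finance team
financials = pd.DataFrame({
    "month": ["2023-01", "2023-02", "2023-03"],
    "revenue": [120_000, 135_000, 128_000],
})

# Hypothetical operational volumes from a different "owner"
operations = pd.DataFrame({
    "month": ["2023-01", "2023-02", "2023-03"],
    "units_shipped": [1_500, 1_650, 1_540],
})

# Left join on the shared key: the equivalent of one VLOOKUP per row
blended = financials.merge(operations, on="month", how="left")

# Derive a KPI that neither dataset holds on its own
blended["revenue_per_unit"] = blended["revenue"] / blended["units_shipped"]
print(blended)
```

The point is not the three rows here; it is that the same join rule applies identically whether the inputs have three rows or three million, and re-running it on an updated dataset is free.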
All this blending and updating calls for LOTS of VLOOKUPs, Ctrl-C and Ctrl-V.
And the scariest parts of manual blending:
There is no way to guarantee repeatability.
If there is a manual error, there is no way to rectify except to redo because there is no clear segregation between inputs vs. interim vs. output data. The ultra-flexibility of Excel is both a boon and a nightmare.
There is no easy way to document what was done. Imagine this: if you spend one day doing manual data blending in Excel (ouch…), how much time will you need to explain what you did to someone else so that they can repeat the same process correctly? Now imagine explaining it to multiple people, again and again (ouch…).
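The cure for both problems, repeatability and documentation, is to segregate raw inputs, interim steps, and outputs, which Alteryx workflows do by design. A minimal Python sketch of the same discipline (folder layout, column names and cleaning rule are all hypothetical):

```python
from pathlib import Path
import pandas as pd

RAW = Path("data/raw")          # "provided by client" files, never edited
INTERIM = Path("data/interim")  # every intermediate step saved here
OUTPUT = Path("data/output")    # final analysis tables

def run_pipeline(raw_file: str) -> pd.DataFrame:
    """Re-running this on the same input always yields the same output."""
    df = pd.read_csv(RAW / raw_file)

    # Cleaning step: drop incomplete rows (a placeholder rule)
    cleaned = df.dropna()
    INTERIM.mkdir(parents=True, exist_ok=True)
    cleaned.to_csv(INTERIM / f"cleaned_{raw_file}", index=False)  # audit trail

    # Aggregation step: monthly revenue summary
    summary = cleaned.groupby("month", as_index=False)["revenue"].sum()
    OUTPUT.mkdir(parents=True, exist_ok=True)
    summary.to_csv(OUTPUT / f"summary_{raw_file}", index=False)
    return summary
```

Because the raw file is never touched and every interim result is written out, a manual error means fixing one step and re-running, not redoing a day of work, and the script itself is the documentation.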
My fellow financial due diligence practitioners likely will identify with the issues above. As companies amass more data and businesses become more sophisticated, it will become even more important for the due diligence process to (i) “ingest and digest” ever more data, (ii) get to insights quicker and (iii) iteratively process data rapidly without losing the grip on data integrity.
Excel on its own is good, but not good enough. If some of the issues described above echo your feelings, then you will find Alteryx an excellent addition to your tool for the trade.
Alteryx ticks all these boxes, as summarized below (hint: you can even use this table to convince your boss and team members!)

Issue with manual data blending: There is no way to guarantee repeatability.
How Alteryx helps: Workflows ensure the same data is processed the same way, based on the rules you set.

Issue: If there is a manual error, there is no way to rectify it except to redo the work, because there is no clear segregation between input vs. interim vs. output data.
How Alteryx helps: Input, data processing and output are segregated. Your “provided by client” (PBC) files remain exactly as downloaded from the VDR / received from the Client/Target, which makes SPA documentation easier.

Issue: There is no efficient way to save interim steps.
How Alteryx helps: You can view inputs AND outputs at EVERY step. Outputs at every step can be extracted and used for recon checks or as inputs to your analysis.

Issue: It is principally a one-way street, with no possibility of partially reversing some of the steps.
How Alteryx helps: Recyclable workflows make it easy to trim or add steps to test multiple options/configurations of the same data, without compromising data integrity.

Issue: There is no easy way to document what was done to ensure an audit trail and knowledge transfer.
How Alteryx helps: Drag-and-drop tools with clear connectors and almost intuitive logic flows. The workflow itself can serve as documentation.

Issue: Death by a (few) thousand VLOOKUPs, Ctrl-C and Ctrl-V.
How Alteryx helps: Optimized data blending functions with an intuitive interface for the equivalents of VLOOKUP, sampling, joining and more.

Issue: The dilemma: to keep or to break the external file linkage?
How Alteryx helps: No such issue, thanks to the relative path option.

Issue: Limit on the number of records.
How Alteryx helps: No software limit on the number of records.

Issue: Performance issues when processing large files (>40MB) on laptops (based on my personal experience).
How Alteryx helps: No performance issues noted on laptops for files up to 10 GB (based on my personal experience).
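On the record-limit point: tools that stream data never need to hold the whole file in memory, which is why they can process far more than Excel’s 1,048,576 rows. As a rough analogy (not Alteryx itself), here is a hypothetical chunked aggregation in Python/pandas, with made-up column names:

```python
import pandas as pd

def total_by_customer(path, chunk_rows: int = 1_000_000) -> pd.Series:
    """Sum 'amount' per 'customer' from a CSV of arbitrary size,
    reading only chunk_rows rows into memory at a time."""
    totals = pd.Series(dtype="float64")
    for chunk in pd.read_csv(path, chunksize=chunk_rows):
        # Aggregate this chunk, then merge into the running totals
        totals = totals.add(chunk.groupby("customer")["amount"].sum(),
                            fill_value=0)
    return totals
```

Each chunk is reduced to a small per-customer summary before the next one is read, so a file with hundreds of millions of records only ever costs one chunk of memory at a time.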
If you want to use Alteryx but are unsure how to get started (and get up the learning curve quickly), in the next installment I will share my personal learning path and a “review” of some of the resources that I found most useful.
Next installment: Some questions we Excel-users typically ask ourselves before committing the time and energy to learn: “Is Alteryx difficult to learn? How much time do I need to spend to learn Alteryx? How to get started...” Tune in next week to find out!