
Alter Everything

A podcast about data science and analytics culture.
MaddieJ
Alteryx Alumni (Retired)

You’ve heard of the cloud. Now get ready for Alteryx in the cloud, plus some awesome cloud technology like Snowflake, AWS and Azure. We’re joined by Alex Gnibus (Alteryx), Paul Warburg (Alteryx) and Mike Klaczynski (Snowflake) to chat about the benefits of managing data in the cloud, why it’s good to get smart on cloud, and how easy it is to combine Alteryx with other platforms for maximum cloud potential. 

 



Episode Transcription

MADDIE 00:01

[music] Welcome to Alter Everything, a podcast about data science and analytics culture. Today, we’re going to talk about Alteryx in the cloud, from how it works to how you can benefit from it. My colleague Alex Gnibus is going to walk us through everything you need to know. Alex interviewed some amazing folks in the field, including Paul Warburg, Senior Product Marketing Manager at Alteryx, who will share some of the ins and outs of Alteryx Designer Cloud, as well as Mike Klaczynski from Snowflake, who manages their ecosystem of technology partners. Mike will share how Alteryx plus Snowflake is a great example of how you can stack platforms to maximize cloud potential. Let’s get started. Alex, welcome to Alter Everything.

ALEX 00:47

Thanks for having me, Maddie.

MADDIE 00:48

So I’m super excited to chat with you because I feel like we have so much going on at Alteryx right now within the cloud realm, and so let’s unpack everything. And to start, let’s talk about the demand for cloud.

ALEX 01:01

Yes, absolutely. So to start off, I have a question for you. What does Oktoberfest have to do with the cloud?

MADDIE 01:08

Oh.

ALEX 01:08

The answer is actually nothing, probably, unless you happen to be a brewery. I thought a brewery would be a fun way to imagine why the cloud might be useful. So I invented a hypothetical brewery for the purposes of this podcast called Big Data Brewing. And Big Data Brewing is dealing with a lot of data challenges during Oktoberfest. So more on that later in the episode. But before we get to that, I think it’ll help to get some background on what the cloud is and why we should care. Because the cloud is a term we all know. Everybody throws it around. But thinking about how it actually works from a large-scale analytics perspective is a lot different than just having your music on the cloud or your photos on the cloud. So why is there such high demand for cloud across industries? Why do we care about it? And why is it good to know about it for our data work and Alteryx and, in general, for your career? So that’s why I decided to talk to a couple of people knowledgeable about the cloud to get up to speed on how you can use cloud, what you might use it for, and solutions available right in Alteryx that can get you started.

MADDIE 02:07

Perfect. Yes, I would love to hear some insider knowledge.

ALEX 02:11

Great. So for this podcast, I found two people who know a lot about cloud. The first is our Alteryx colleague, Paul Warburg, who is an expert on Designer Cloud powered by Trifacta. That’s Alteryx’s new cloud offering. And the second is Michael Klaczynski from Snowflake, which is a leading cloud data warehouse that many of our listeners might actually already be using. So the first person I spoke with was Paul, and here’s how he likes to think about how the cloud changes your data experience. [music]

PAUL 02:39

I think the key thing for Designer Cloud is that word cloud. So just to talk a little bit about why the cloud is important for analytics, there’s a lot of advantages to working in the cloud. And I’ve actually invented an acronym for the main advantages that kind of makes it a little bit easier to remember. And that acronym is GAMES, which stands for Governability, Accessibility, Manageability, Elasticity, and Scalability. So in the cloud, you can basically do the same thing that you could do on-premise, but you can get a lot of these benefits with that. Basically, with cloud, organizations are able to easily create environments where anyone in the organization can access all of their data for analytics or machine learning or what have you all the time. And that’s going to be done in a way that’s still manageable and that adheres to strict rules of governance and security and user permissions and that sort of thing. What that really does is it unlocks data so that people can actually use it and the organization can actually realize a real tangible benefit from it. If you think about the alternative to cloud, which is kind of storing a lot of data on-premise servers or on desktop machines, that can be an effective strategy, especially if you’re a larger company and you have the resources to kind of manage that infrastructure. But for a lot of organizations, they found that managing their infrastructure is a huge burden. So you have to manage your own servers. You have to keep them up and running. You can only scale as far as servers let you build and then data lands on desktop machines. 
And when that happens, unfortunately, you don’t always know what’s happening to that data or how that data’s being used, or even where the data sits in the real world if it’s on a physical device which could theoretically end up in someone’s car, it could end up in someone’s backpack, and be in a place where it could potentially get stolen and lead to data breaches and things like that. So what the cloud does and Designer Cloud does, by taking advantage of the cloud, is it helps you move your data to a location that’s secure and accessible from anywhere in the globe, doesn’t require you to manage any infrastructure, and it also gives you the ability to scale up server resources depending on your need. And that really takes away the burden of managing complex systems when it comes to providing access and security to data. And it also makes it far easier to know who, what, where, when, how, and why data’s being used at all times. And so if you’re an organization that places a high value on analytics and on analytical insights, in particular, that just means the cloud makes it far easier to democratize data access and democratize data use across an organization. [music]

ALEX 05:16

So Paul is basically saying that cloud tackles a lot of the everyday struggles you might run into as a data worker. You need secure access to data. You need computing power to process those bigger workloads. You need flexibility to use more or less resources, depending on what you’re doing.

MADDIE 05:31

Got it. And it sounds like Alteryx is designed to really maximize these benefits, whether you’re on desktop or in Designer Cloud.

ALEX 05:38

Exactly. We’ve got partnerships and integrations up the wazoo across the cloud ecosystem, from AWS to Azure to Snowflake. And that’s because we want you to be able to use Alteryx while also taking advantage of all these badass things the cloud can do for you.

MADDIE 05:53

Right. So if you’re handling a ton of data in Alteryx, and you want to use the cloud to do some serious computing that you don’t want to do on your own desktop, Alteryx makes it easy to connect with the cloud.

ALEX 06:04

Bingo. And one great example is in-database processing, which is something you can do with both Designer Desktop and Designer Cloud with platforms like Amazon Redshift, Snowflake, Azure Synapse. Paul shared a little bit more about how this works. [music]

PAUL 06:19

Because Designer Cloud is built to natively leverage the cloud, it’s immensely useful when you need to utilize scalable computing to process really large data preparation jobs. So Designer Cloud uses something called push-down processing, which leverages the power of your cloud data warehouse to process data at extremely fast speeds. And so if you’re working with a large data set, this can literally take the processing time from literally hours to just minutes. So kind of looking at what problems do we solve if you’re an organization who wants to empower access to data across your entire organization in a way that’s scalable, is secure, and it’s governed, Designer Cloud really does fit that niche perfectly. And it has been designed to fit all the needs of businesses with that problem.

ALEX 07:05

Got it. So you mentioned in-database push-down processing. I’d like to dig into that just a little bit more. I know that’s a big capability that Alteryx offers with other partners, such as Snowflake, Databricks, Azure. So let’s talk about that a little bit. How does Alteryx work with other partners like that? And what are the benefits of that pushdown processing?

PAUL 07:26

Yeah. So for certain of our providers, like Google BigQuery and Snowflake, like you mentioned, we have the ability, like I said, to leverage the power of that cloud data warehouse to process data. And basically, what that means is that we’re not processing it natively on servers on our product. We’re not processing it on our product’s servers or your servers. We’re actually using the power of a cloud data warehouse, which is able to infinitely scale up computing resources to meet the need of the job without you having to build that infrastructure yourself. And so if you want to run a job and you’ve built it out, let’s say, in Designer Cloud, when we execute that job, we’ll actually turn it into a SQL query and then execute that SQL query using the computing power of the cloud data warehouses that we can push down to. And again, that really just saves a ton of time and a ton of cost when it comes to processing your data. So it’s a really powerful feature.
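
[Editor’s sketch] To make the push-down idea concrete, here is a minimal Python sketch of turning a small workflow into one SQL query and letting the database run it. It is not how Designer Cloud is implemented: the `workflow` step format and the `compile_to_sql` helper are hypothetical, and an in-memory SQLite database stands in for a cloud data warehouse such as Snowflake or BigQuery.

```python
import sqlite3

# Hypothetical workflow: each step is a plain dict standing in for a tool on a canvas.
workflow = [
    {"op": "filter", "condition": "region = 'West'"},
    {"op": "summarize", "group_by": "product", "agg": "SUM(units)", "alias": "total_units"},
]


def compile_to_sql(table, steps):
    """Translate the step list into a single SQL statement (illustrative only)."""
    where = [s["condition"] for s in steps if s["op"] == "filter"]
    summarize = next((s for s in steps if s["op"] == "summarize"), None)
    if summarize:
        select = f'{summarize["group_by"]}, {summarize["agg"]} AS {summarize["alias"]}'
    else:
        select = "*"
    sql = f"SELECT {select} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if summarize:
        sql += f' GROUP BY {summarize["group_by"]} ORDER BY {summarize["group_by"]}'
    return sql


# SQLite is only a stand-in for the warehouse; the data below is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("West", "Marzen", 120),
        ("West", "Pilsner", 80),
        ("East", "Marzen", 200),
        ("West", "Marzen", 30),
    ],
)

query = compile_to_sql("sales", workflow)
rows = conn.execute(query).fetchall()  # the database does the filtering and aggregation
print(query)
print(rows)
```

The point of the sketch is that the client only builds a query string and receives the small aggregated result; the filtering and summing happen where the data lives.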

ALEX 08:19

So if you know how to connect with a cloud-based data warehouse like Snowflake, you can really do your team a solid when it comes to time and budget?

PAUL 08:27

Absolutely. You make the data accessible to them. They don’t have to be in whatever geographical location, connect to the on-premise data, and they can access it anywhere in the world and process it really quickly. It’s really awesome. [music]

ALEX 08:42

So Maddie, I got to tell you; I am actually obsessed with in-database processing. It is the most unsexy-sounding thing, but it works wonders. You’re literally using the power of another platform to run Alteryx. So with this in mind, I asked Mike at Snowflake for more details on how you might leverage in-database with Snowflake.

MADDIE 09:02

Yeah, I want to hear more about it in terms of Snowflake because it sounds like you can get a lot of benefits of cloud on its own, but if you’re doing this at scale, you probably have a data warehouse.

ALEX 09:12

Exactly. Chances are you’re already using some kind of data warehouse or data lake. So it’s good to know how to really take more advantage of in-database. It’s something that I wish more people knew about. Here’s Mike.

MIKE 09:26

So essentially, instead of Alteryx being that engine, whether it’s Alteryx Server or Alteryx Designer on your desktop, you can actually push all that analytics and all that logic that you’re manipulating in Alteryx, and then Snowflake will execute it right on top of the data within Snowflake. So again, back to that scalability perspective, you’re no longer limited by your resources on your desktop computer. You can now put the full power of Snowflake to process that for you. And so, yeah, that push down is huge. We’ve seen some really incredible performance improvements, but also what it helps with is just governance and security, right? Instead of grabbing a subset of that data, bringing it to your desktop, and doing some analysis, you can now do that analysis on all of your data. And your IT team is happy because that data’s not really leaving Snowflake, right? There’s a summary and some aggregations that are coming into Alteryx, but most of that data’s still staying in Snowflake.

ALEX 10:17

Right. Whereas if you’re that end data user who’s pulling things out of spreadsheets, putting it on your desktop, putting it on your personal device, moving it all over the place, instead IT has full visibility into where it is. And it doesn’t even have to leave. It can stay right there in Snowflake, and Alteryx can process right there. Is that what you’re saying?

MIKE 10:37

Exactly. Yep. So that’s enabling that self-service because anybody that has permission to can access that data, but IT is happy because it’s governed and secure.

MADDIE 10:48

So using Snowflake, it’s like this chocolate cake that’s amazing on its own, but then you get to add the frosting and the ice cream on top. It’s a real game changer at that point.

ALEX 10:58

Yes. I love that metaphor. And now I want cake. So you could start to see how partners like Snowflake help us achieve those benefits of cloud that Paul walked us through, like governance and scalability. So now I’ll go ahead and share a little more of my conversation that I had with Mike at Snowflake. And to start, I just asked him some background on what Snowflake is and why it even exists. [music]

MIKE 11:20

They got on a whiteboard and said, “If we were going to design a way to store, analyze, and access data in the cloud, what would that look like?” And so the big kind of revelation, or the concept, was separating compute from storage. So being able to essentially bring as much compute as you want to your data and not having to have those coupled. So traditional databases, what happens is you’ve got your CPU with eight cores and maybe it has a terabyte or two of storage. When you run out of storage space, you now have to go get a new computer that has more storage. And you’re essentially bringing that coupled interface together and upgrading both of those. With Snowflake, your data sits in cloud storage. So on AWS, that’s going to be S3. On the other cloud providers, it’s going to be their native blob store. And then compute are actually these virtual warehouses, and you can mount as many of these as you want on top of that data. And so I think that was really the big solution and the big revelation here is traditionally you had bandwidth issues and there were queues, right? You’d have Monday morning; everybody would log in and they’d want to pull up the report and everybody would just sit there waiting for things to spin up because you had a fixed amount of resources. But with Snowflake’s separation of compute, you now have the ability to say, “Wow, there’s a huge spike. Instead of having 10 virtual warehouses, I’m now going to have 100 or 1,000.” So everybody essentially gets that first-class experience that they want.
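
[Editor’s sketch] The Monday-morning queue Mike describes can be illustrated with a back-of-the-envelope model in Python. This is a simplification, not a description of Snowflake’s scheduler: it assumes every query takes the same time, each virtual warehouse runs one query at a time, and the function name and the numbers are hypothetical.

```python
import math


def time_to_clear_queue(num_queries, minutes_per_query, num_warehouses):
    """Minutes until the last query finishes when identical queries are
    spread evenly across identical compute clusters (toy model)."""
    rounds = math.ceil(num_queries / num_warehouses)
    return rounds * minutes_per_query


# 500 people open the same two-minute report at once (made-up numbers).
fixed = time_to_clear_queue(500, 2, 10)    # a fixed pool of 10 warehouses
scaled = time_to_clear_queue(500, 2, 100)  # compute scaled out for the spike
print(fixed, scaled)
```

Because the data sits in shared storage in this picture, only `num_warehouses` changes between the two calls; nothing about where the data is stored has to be touched to absorb the spike.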

ALEX 12:49

So, Maddie. Here we go. I’d love to reframe all this info into an example. Because it all sounds great in theory, right? But how does it actually apply to you?

MADDIE 12:58

Exactly. Yeah, my brain operates solely on these practical examples.

ALEX 13:02

Same. Same. So here it is, the moment we’ve all been waiting for, Oktoberfest. So Oktoberfest is a really quick time frame. It actually starts in September and ends in early October, so it’s only 16 to 18 days. And if you’re my imaginary hypothetical brewery, Big Data Brewing, Oktoberfest season is exactly what cloud was made for. You’ve got a seasonal spike in demand and a lot of different departments that have to efficiently promote and get beer on the right shelves at the right time. So I went ahead and sprung this example on Paul, and it turns out great minds think alike. So, Paul, another use case I wanted to walk through is actually an imaginary hypothetical one that I invented for the purposes of this podcast.

PAUL 13:43

Perfect.

ALEX 13:43

So it’s Oktoberfest season, and I have come up with a business called Big Data Brewing. And you mentioned scalability and how that’s a big benefit of cloud is how you can scale resources up and down as needed. And Big Data Brewing happens to need to scale up to meet the seasonal demand of Oktoberfest as we brew our seasonal Oktoberfest beer, which is pronounced Marse-Tsen. It’s spelled Märzen, which is embarrassing because that’s how I’ve been ordering it, but it’s pronounced Marse-Tsen. So, Paul, what I’d like to do now is walk through a potential use case with that as the example of how a company might use cloud to scale up to meet the demand. Maybe it’s supply chain, or maybe it’s that the marketing department needs access to data to plan their promotions, or I’m trying to think of a way to best illustrate how you might need to scale up around a seasonal push like that. So is there anything that comes to mind?

PAUL 14:41

So it’s funny you mention this. I don’t think you knew this, Alex, but we actually, on our team, we just did a DIY data webcast where we showed how a brewery can forecast demand for Oktoberfest to [crosstalk] more beer.

ALEX 14:53

There you go.

PAUL 14:54

So I don’t think you knew that.

ALEX 14:56

No. Did not.

PAUL 14:57

But that’s just one advantage of a self-service user coming in, a brewery owner, who doesn’t necessarily have technical analytical skills. And on that episode, we show how they can actually pull in weather data and attach to a self-service machine learning algorithm and use that to forecast demand for how much beer they should produce and how many items they should order to produce that beer. Slightly different from your question. Your question is, I’m a mega brewery and now I’m getting a lot of demand for my product. How does the cloud help me scale up to meet that demand? And the answer is, for seasonal businesses like this, cloud is especially useful because, with cloud, you aren’t required to build the infrastructure to meet the scale. If you’re a business that normally has a low level of demand, but you have seasons of really high demand and you’re not using the cloud, you actually have to build way more infrastructure than you normally need on a daily basis because you need to build for your maximum possible demand. What the cloud allows you to do is leverage other people’s computing resources, other people’s scale. And as you need the power, you can actually pull in other computers from that cloud provider and scale up as you need it. So if you only need it for two days a year but you need a lot of it, you don’t have to go build a bunch of infrastructure that you don’t need. So for organizations that from time to time really have to run really large data jobs but they just don’t want to build the infrastructure and so maybe they haven’t done it because the infrastructure would be too cost prohibitive, they can now take advantage of it and really meet the demands of the season at the time that they need them. [music]
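
[Editor’s sketch] Paul’s point about building for maximum demand can be put into rough numbers. The Python sketch below uses entirely hypothetical figures for the imaginary Big Data Brewing: 18 peak days of Oktoberfest demand, a quiet rest of the year, and the same price per compute unit in both cases (in practice the unit prices differ, so treat this only as an illustration of the shape of the argument).

```python
# Hypothetical daily compute demand for one year, in arbitrary "compute units".
quiet_days, peak_days = 347, 18
daily_demand = [10] * quiet_days + [100] * peak_days

cost_per_unit_day = 1.0  # assumed equal for both options, purely for illustration

# Fixed infrastructure must be sized for the busiest day and paid for all year.
fixed_cost = max(daily_demand) * len(daily_demand) * cost_per_unit_day

# Elastic capacity is paid for only on the days, and in the amounts, it is used.
elastic_cost = sum(daily_demand) * cost_per_unit_day

print(fixed_cost, elastic_cost)
```

Under these made-up assumptions, most of the fixed capacity sits idle for 347 days, which is the gap between the two totals.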

ALEX 16:37

It was cool hearing from Paul how scalability really comes in clutch for times you need to boost your performance, especially if you don’t need the same amount of resources year-round. So I also posed my hypothetical brewery example to Mike at Snowflake. And in addition to what Paul said, he pointed out another couple of benefits of cloud for Big Data Brewing. So for the purposes of my hypothetical brewery, which I’m calling Big Data Brewing for this podcast, you’re saying I could use-- I could use Snowflake and Alteryx together to, let’s say, enable my marketing team who wouldn’t normally have access to data maybe in another organization that doesn’t use Snowflake. They could use Snowflake to get the location data they need, use Alteryx to analyze that data right there in Snowflake, get the insight they need to plan like a location-based promotional campaign for Oktoberfest all on their own. Where without Snowflake and Alteryx, they’d be stuck contacting someone outside marketing, like IT, to get their questions answered.

MIKE 17:32

Exactly. What we found is that customers that have embraced Snowflake, they really see Snowflake as that single source of truth. And your example of the marketing team, right? Instead of them having to figure out, “Where’s our data? Where do we get permission? Who do we ask? Is this data in spreadsheets? Where is it?” They know that that data’s in Snowflake and they know that they have the scalability and resources to go and do their project, right? So they can do something simple, but if they are picking a location and they are going to do deeper analytics and build up data science and machine learning problems, they’re going to want powerful compute to do that, right? And so again, that’s where Snowflake provides that data and then also that engine and that compute to do that. And I mentioned this briefly earlier, but with this data sharing capability, we actually have a data marketplace. So there’s hundreds of providers that provide demographic data, localization data, weather data, whatever sort of data you need. Instead of having to go externally and download that or access that through an API, it’s already natively within Snowflake. So, within a couple of clicks, you can take your business data, combine it with this third-party data, and then get those additional insights. And the best part is, again, with Alteryx anybody can do it.

ALEX 18:44

So my hypothetical marketing team, not only can they access their own data, but they can use the data marketplace to find additional data to uncover new insights, like that weather data. So maybe they can figure out by location what weather patterns might influence their campaign strategy.

MIKE 19:01

Absolutely.

ALEX 19:02

Amazing. That data access piece is huge because Alteryx actually just commissioned an IDC report that showed 82% of organizations indicate that data access policies are only moderately effective or worse. So clearly, it’s an issue. And it’s great that that’s something that Snowflake directly addresses because access to data can be really time-consuming when you’re going around finding, requesting access, collecting that data.

MIKE 19:29

Absolutely. Yeah. And this is actually a really interesting concept. Prior to Snowflake and prior to previous couple of companies, I was at Tableau for about five years. And so we were kind of at the cutting edge of what was happening with democratizing data. And everybody had this clear concept of you either have governance and security or you have self-service and accessibility. And what we realized is actually governance and security enables self-service because people trust the data. They know where they can go to it. It’s essentially prepared for them. And then instead of them having to go to the raw data and search for it and prepare it, it’s already brought to them on a platter, right? So your hypothetical marketing team, instead of them having to dig and combine 15 different data sources, they can go in Snowflake and there should be a data source that’s prepared for them by data engineers at the IT department, and they can just go off and start solving their business problems.

MADDIE 20:22

Okay. This is sounding seriously cool. And with the demand for technology comes the demand for skilled workers to use these products.

ALEX 20:31

Yeah. So this is really important because cloud technology has developed so fast, it’s outpaced the skills available to work in cloud. There’s this huge talent gap. So now you have these data challenges that exist that can be solved using the cloud. Great. But there’s a lack of awareness or skills to be able to actually use cloud to its full potential. So my hypothetical brewery could know all about the benefits of cloud. Yay. But we miss out on actually using those benefits if we don’t have enough skills or know-how. And that’s why there’s a good chance there could be a business challenge like Oktoberfest at your organization that could easily be solved with the cloud that your team just doesn’t know about yet.

PAUL 21:06

Cloud skills are in great demand. And the cool thing about Alteryx is it makes it easy for anyone to expand to the cloud without having to build a bunch of skills, but you make yourself look super valuable because you know certain things. So if you know the advantages of a cloud infrastructure and you use that in your daily workflows, people will see the difference in how fast you’re able to get jobs done and how fast you’re able to process, and how cost-effectively you’re able to process. And the cool thing is you frame this in a way, “What skills do I need to learn?” And the truth is you don’t need to learn that many different skills for the cloud as a data analyst because of tools that Alteryx provides. With Designer Cloud, you’re eventually going to get that same user interface that you already have on Designer Desktop and automatically be able to leverage your existing skills to use it on the cloud. This is super important because technical talent is super expensive for companies. Technical talent is in high demand. And even if organizations can afford it, that doesn’t always mean they can acquire it because the skill gap is so large and so many other organizations are competing for that same limited pool of cloud talent. But Alteryx kind of helps you solve that problem by empowering anybody who knows how to use data to leverage the power of the cloud to process that data and to build data pipelines.

ALEX 22:23

So just by getting started in Alteryx you’re already upskilling yourself on cloud?

PAUL 22:28

Exactly. That’s what we like to say. [music]

ALEX 22:33

I also talked to Mike about this. Here’s what he had to say. So, Mike, you’ve seen over time how data is now for everyone where it used to be limited to a silo or specific team. And so one of the things I want to talk about is upskilling and how cloud skills will be the next in-demand skill. There’s a huge talent gap. And so I’m curious to hear from your perspective, especially with your industry experience, how you’ve seen cloud evolve and what’s next in cloud adoption, and maybe what our listeners should know about what they should be getting smart on when it comes to cloud?

MIKE 23:08

Yeah. Absolutely. So cloud has been talked about now for more than a decade. That’s what I’ve seen in my experience. And initially, there was quite a bit of hesitancy, right? People were like, “I want to be able to go out and grab my server rack and turn it off or pull out the hard drive if I need to, and I want to know who’s coming into that room and who has access to that.” But over time, people realize that these cloud providers, that is their business. They’re able to hire the brightest minds and have better security, better scalability, better reliability, and higher availability than pretty much any other company out there. And so as with anything, it was a gradual adoption, right? You had your early adopters come in, prove it out, and then the rest of the industry was looking at them and realizing they really had a competitive advantage. And so you slowly saw these different industries topple like dominoes, right? And I think the biggest one now has been government and federal. They have lots of very interesting data. And also, financial services, right? People are very protective of their personal or PII data, especially financial data. So now that these industries are doing it, there’s really nothing holding folks back. And I guess the third one would be healthcare and life sciences. So same thing there. And I think what the cloud has really enabled these industries to do is take all that data that was difficult to analyze, right? This concept of big data. They were just building out more and more data. And then the other concept was this data was dark data. Because you had all this data and it was locked away and you had so few people that had access to it, most of that data never saw the light of day. So this kind of goes back to your question of skill gap.

MIKE 24:48

Now, with the cloud, you’ve got all that data in one place. Theoretically, anyone within the organization has the compute and the scalability to be able to analyze that. And now you just need to find people that have the skill set, or better yet, build out centers of excellence and communities internally to help people that have business questions and have business problems translate those into ways of manipulating that data. Again, 10 years or so of being in the analytics and data space, I’ve seen a tremendous shift. A lot of my colleagues at Tableau have actually gone on and written books and taught courses at different universities. And so we’re definitely seeing this big shift of going to colleges and helping people speak the language of data. And now all these people are coming to the workforce, and now they actually have tools like Alteryx that are super accessible, super powerful, and all the data in one place with scalable resources like Snowflake, so they can actually make those business impacts. And I think the other part here is traditionally there’s a big barrier to entry. And if you can lower that barrier to entry by having a tool like Snowflake - and Snowflake is actually very easy - you don’t have to worry about administrative tasks and vacuuming and partitioning, traditional database utilities. You essentially upload the data and it’s ready to go. And similarly with Alteryx, right, you don’t have to necessarily know how to write SQL queries or how to write Python scripts. If you want to use the no-code and low-code capabilities within Alteryx, you can. And then as your skill set gets better and better, well, then you can actually do those more complicated things in Alteryx as well. So it’s a low entry point and a high ceiling. In addition to the integrations we’ve built, we’ve also built out some blueprints and recipes. So depending upon what vertical or industry you’re in, we do have a lot of templates ready to go. 
So again, instead of sitting down in front of a blank slate, you can actually get started with some of these kits right off the bat and start solving your business problems right away.

ALEX 26:42

So you can go ahead and download the Snowflake Starter Kit and get some templates for my real-life use cases for my hypothetical brewery? [music]

MIKE 26:50

Exactly.

MADDIE 26:54

Thanks for walking me through that, Alex.

ALEX 26:56

Any time. This was fun. So, at the risk of sounding like a total cheeseball, cloud is for everyone, including all of you, listening to this. I truly believe that everyone in an enterprise can benefit from cloud in the same way everyone can use data. So for resources about all things cloud, check out our show notes at community.alteryx.com/podcast.

 


This episode was produced by Maddie Johannsen (@MaddieJ), and Mike Cusic (@mikecusic). Special thanks to @andyuttley for the theme music track, and @mikecusic for our album artwork.

Comments
MeganDibble
Alteryx Community Team

Great episode @AlexG and @MaddieJ!