
SydneyF
Alteryx Alumni (Retired)

During my time as a graduate student, I learned to be pretty frugal. This is a habit I haven’t been entirely able to shake off. Fortunately, I believe I can leverage my learned thriftiness for good.

 

Learning a new skill or trade can be expensive. Luckily, in the age of the internet, it doesn’t have to be. Here is an index of no-cost resources for learning data science concepts and skills. I hope you find them helpful!


Courses in Mathematics, Machine Learning, Programming, and Data Science

 

Having a solid foundation in mathematics and computer science will only make you a stronger analyst and data scientist. If you need to brush up on some concepts, or even get exposed to them for the first time, many companies and universities have posted courses around mathematics and machine learning. 

 

MIT OpenCourseWare is a massive library of courses taught at MIT. Many of the classes can be very useful for brushing up on mathematics and computer science. There are two sections of courses particularly worth checking out.

  • Mathematics > Probability and Statistics
  • Engineering > Computer Science > Algorithms and Data Structures / Artificial Intelligence / Data Mining

Khan Academy is a non-profit educational organization dedicated to creating online resources for students. It has posted a number of mathematics courses that you might find helpful.

3Blue1Brown is an awesome collection of videos posted on YouTube, focusing primarily on mathematical concepts. The Neural Networks series is particularly excellent.

 

Learn with Google AI is another vast catalog of resources for machine learning, including tutorials, videos, documents, and courses. Google also offers a Python course designed for people with a little bit of programming experience interested in Python.

 

Amazon AWS recently made all of their online training courses free to take. I recommend checking out the Machine Learning section. There are multiple curated paths under Machine Learning, including a Data Science path and a Developer path (more focused on data engineering). The curated paths can provide a template for learning the skills to become a data scientist.

 

Andrew Ng’s popular Machine Learning course offered through Coursera has two options: you can audit the course for free, or purchase the course and earn a certificate. This is widely considered to be a thorough introductory course in machine learning.

 

Another great open-source course is Practical Deep Learning for Coders. It is put on by fast.ai, an organization whose goal is to make AI accessible to everyone (hence the slogan “Making neural nets uncool again”). This course also has an associated forum of weekly-challenge style posts for hands-on practice.

 

Blogs

 

A popular way to get into data science is through blogs (both reading and writing them). Here are a couple of blogs I’ve found useful or interesting.

 

KDnuggets is a massive blog aggregator. There is always something new to find here.

 

R-bloggers is another blog aggregator, focusing on analysis, tutorials, and examples in the R programming language.

 

Kaggle's No Free Hunch highlights data science news, interviews with Kaggle competition winners (more on Kaggle under the hands-on practice section), and data analysis highlights posted on Kaggle.

 

Medium's Towards Data Science features articles on different aspects of data science from a large number of individual contributors. Some articles are great; some articles are less great. 

 

Books (and Papers)

 

Books are great. Consider looking for them at your local library or searching for open-source copies online.

 

If you’d like to learn R, the book The Art of R Programming will give you a strong foundation.

 

Think Python (available for Python 2 or Python 3) is written for people with no coding background who would like to learn Python as well as how to think like a computer scientist (there is a sister book called Think Java, written for the Java programming language).

 

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems is an excellent book that includes hands-on exercises (in Python) as well as an overview of many popular machine learning algorithms.

 

For a more extensive list of data science books, check out this GitHub page with 40 contributors.

 

Papers are a great way to understand what areas of analytics are being actively researched. Two Minute Papers is a YouTube channel that reviews research papers in around two minutes.

 

If you’d rather read than listen, this curated index of technical papers might be right up your alley (complete with ratings).

 

Hands-on Practice and Datasets

 

For many people, the best way to learn is by doing. Step one to trying hands-on data science is getting data. The following are a few suggestions on where to get data or hands-on practice.

 

The UCI Machine Learning Repository is a database of datasets that have been used for research in AI and machine learning.

 

Google AI Datasets is another repository of datasets used for research in a wide range of computer science disciplines.

 

If you haven’t heard of Kaggle, you’re missing out. Kaggle is a site for online data science competitions. A benefit of Kaggle is that in addition to posting datasets to analyze (you don’t have to submit to a competition if you don’t want to), you can also learn from how other users are approaching different problems.

 

Finally, there's a thread on the Alteryx Community that started in 2015 and is actively updated with good freely available data.

 

Another (recommended) option is to make your own dataset! You may have already heard the statistic that data collection and preparation makes up about 80% of most data science work – a fabulous way to take data science learning head-on is to gather and prepare your own data for analysis. Don’t forget to post your process and code on your GitHub, and write about your results somewhere (maybe even submit it for publication here!).
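If you do go the make-your-own route, the preparation step can start very small. Here is a minimal sketch in Python using only the standard library. The data, the column names, and the `prepare_rows` helper are all made up for illustration; they are assumptions, not part of any resource listed above. It shows the kind of tidying a hand-collected file usually needs: trimming stray whitespace, standardizing capitalization, and dropping rows with unusable values.

```python
import csv
import io

# Hypothetical raw data, standing in for a file you collected yourself.
RAW_CSV = """city, temperature_f ,observed_on
 Denver ,71.5,2019-04-01
boulder,,2019-04-01
 DENVER,68,2019-04-02
Irvine,n/a,2019-04-02
"""


def prepare_rows(raw_text):
    """Return cleaned rows, dropping any row without a usable temperature."""
    reader = csv.DictReader(io.StringIO(raw_text))
    cleaned = []
    for row in reader:
        # Strip stray whitespace from both column names and values.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        try:
            temperature = float(row["temperature_f"])
        except ValueError:
            continue  # missing or non-numeric reading, so skip the row
        cleaned.append(
            {
                "city": row["city"].title(),  # standardize capitalization
                "temperature_f": temperature,
                "observed_on": row["observed_on"],
            }
        )
    return cleaned


rows = prepare_rows(RAW_CSV)
for row in rows:
    print(row)
```

Real data will be messier than this, and whether to drop or repair a bad row is a judgment call worth writing about when you post your process.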

 

Alteryx

 

Of course, I would be remiss if I didn’t mention the many resources spread across the Community. The Alteryx Academy has interactive lessons and weekly challenges. The Designer, Server, Connect, and Promote knowledge bases are invaluable for learning all things Alteryx.


This list is certainly not comprehensive, but hopefully it provides resources that you find helpful on your path to data science mastery. If you've come across a resource you found to be particularly helpful in your journey, please post it in the comments below!

Sydney Firmin

A geographer by training and a data geek at heart, Sydney joined the Alteryx team as a Customer Support Engineer in 2017. She strongly believes that data and knowledge are most valuable when they can be clearly communicated and understood. She currently manages a team of data scientists that bring new innovations to the Alteryx Platform.


Comments
papalow
8 - Asteroid

@SydneyF Thanks for your post. I have already started with Google's Python course.

ankitdixit
5 - Atom

Hey SydneyF, I am also starting to learn data science, and I am very happy about it. I found a resource with the best data science courses and tutorials, submitted and voted on by the programming community. Courses for the mathematics and statistics required for data science are also included there.

Thanks,

Ankit Dixit

karengracias
5 - Atom

Thank you so much for sharing your inputs Sydney. Adding some more to help others out - 

 

Free edX Data Science Courses - https://www.edx.org/course/subject/data-science

 

Free Coursera Data Science Courses - https://www.coursera.org/courses?query=free%20courses%20data%20science

 

Free Udacity Data Science Course - https://www.udacity.com/course/intro-to-data-science--ud359

 

Free Digital Defynd Data Science Courses - https://digitaldefynd.com/best-data-science-certification-course-tutorial/

raghav_poet_writer
7 - Meteor

A fantastic article which I have bookmarked and will be frequently dipping and diving into. I am at the very beginning of my journey so the introductory resources you have mentioned such as the "towards data science" blog and the think python book are likely the ones I'll begin with. Also have been hearing a number of good things about Kaggle and I've heard you can jump right in even as an absolute beginner. They apparently even have learning resources and a problem solving learning path.

 

For now my focus is on Alteryx and predictive analytics. I have just begun my Alteryx core certification journey as part of the ADAPT program, so the course resources will be my primary focus for now.

 

Onward toward growth!

saptarshimisra
5 - Atom

Hey thank you for this post. This is a great collection. I found one recently published as well. Sharing with all of you.

 

Free course on Codegigs: https://www.codegigs.app/free-data-science-course/

It contains a lot of explanations and shared code. I found that this website also shares many interesting data science related posts.

 

Hoping this will help other readers too. 

Thank you for reading

DemandEngineer
8 - Asteroid

Love it. Thank you so much for posting this.  

 

Data science is hard, but anything worthwhile is. There are so many methods, algorithms, models... it's hard to organize everything we learn. So I put this data science method map/framework together to help me and others.

 

I'd like to crowdsource the updating of this shared spreadsheet. There is a column for Alteryx Tools for each method. I'm calling on all of you to help populate and update it, to pay it forward and help others on this learning journey.

 

Post about using the Method Map:

https://marketingoptimized.wixsite.com/guide/post/data-science-method-map-navigate-the-dizzying-plet...

 

Method Map Sheet:

https://docs.google.com/spreadsheets/d/1y8uK2UvM2Raoq17X7WcZkMniMIBQOfJyVLIDRz-p2co/edit?usp=sharing