Many, if not most, supervised-classification problems involve some degree of class imbalance, where at least one class occurs more frequently than the others. The imbalanced-classification problem illustrates the value of approaching data-science problems as empirical (as well as formal) optimization problems, using techniques termed cost-sensitive learning. This post will show you how to do cost-sensitive binary classification.
Most real-world data-science design patterns combine several models to solve a single business problem. This post surveys the most common and effective techniques for combining models. Once you make it through this post (and its predecessors), you'll be ready to take on the design patterns we'll begin learning in 2017.
Cross validation (CV) is a difficult topic. There are many ways to do CV, and articles on the subject can be very technical. This blog post is a gentle introduction to CV. Read it and you'll find it much easier to understand later posts describing data-science design patterns that use CV.
The recent 10.6 Predictive Release includes the introduction of the Prescriptive Category. This blog post will demonstrate different uses and configurations of the three simulation tools (Simulation Sampling, Simulation Scoring, Simulation Summary) via an example use case. This example can be downloaded from the Predictive District here.
At the time I'm writing this, we are putting the finishing touches on the 10.6 release (now available here). Many of the new capabilities introduced with this release are focused on advanced analytics. We are particularly excited about the introduction of four new tools for prescriptive analytics.