Many if not most supervised-classification problems involve some degree of class imbalance, where at least one class occurs more frequently than the others. The imbalanced-classification problem illustrates the value of approaching data-science problems as empirical (as well as formal) optimization problems, using techniques termed cost-sensitive learning. This post will show you how to do cost-sensitive binary classification.
Most real-world data-science design patterns combine several models to solve a single business problem. This post surveys the most common and effective techniques for combining models. Once you make it through this post (and its predecessors), you'll be ready to take on the design patterns we'll begin learning in 2017.
Cross validation (CV) is a difficult topic. There are many ways to do CV, and articles on the subject can be very technical. This blog post is a gentle introduction to CV. Read it and you'll find it much easier to understand later posts describing data-science design patterns that use CV.
The new Optimization tool has just come out in the Alteryx 10.6 Predictive Release (see Dr Dan's post on prescriptive analytics). With this single tool, we can solve linear programming, mixed integer linear programming and quadratic programming problems. The tool offers three input modes (manual input, file input and matrix input) to give you flexibility in defining your model. If you are a baseball fan and an Alteryx fan, I have great news for you! Today I'll show you how to use the Alteryx Optimization tool to solve a fantasy baseball daily lineup problem.