on 01-05-2017 03:06 PM - edited on 03-08-2019 01:06 PM by SydneyF
What are some "Small Data Sets" available over the internet?
Small data is data that is small enough for human comprehension. A few thousand lines of credit data, example marketing segmentation data, or a firm's B2B client contact history are some examples.
Kaggle has started a section called Kaggle Datasets, which hosts public datasets that you can use, since the datasets for its competitions were often restricted from use outside the competition. https://www.kaggle.com/datasets
Kaggle also has scripts for processing the given data sets: https://www.kaggle.com/scripts, which are usually in R or Python. It can be instructive to look at those and discern which parts can be pulled into standard Alteryx tools, and which parts are best left to a custom R call, for instance. The nice thing is that, once you've finished, you can submit your output to the relevant Kaggle competition (even after the fact) to see how your output stacks up against the competition.
Here is an addition from Europe: http://open-data.europa.eu/en/data/ "The European Union Open Data Portal is the single point of access to a growing range of data from the institutions and other bodies of the European Union (EU). Data are free for you to use and reuse for commercial or non-commercial purposes. By providing easy and free access to data, the portal aims to promote their innovative use and unleash their economic potential. It also aims to help foster the transparency and the accountability of the institutions and other bodies of the EU."