I'm trying to create a workflow that can do the following for my data:
Data Quality Assurance: Checking for accuracy, meaning Completeness (i.e. the rate of missing and "null" values) and Correctness (i.e. invalid codes, duplicates, invalid dates, out-of-range values, outliers, and extreme observations).
Data Cleansing / Preparation: Handling missing and "null" values, invalid codes, invalid dates, out-of-range values, outliers, and extreme observations.
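To make the checks above concrete, here is a rough sketch of what I mean, written in plain Python rather than as a workflow. Everything in it is an assumption for illustration only: the helper names, the list of "null" spellings, the `%Y-%m-%d` date format, and the 1.5 x IQR rule for outliers are placeholders, not anything taken from my actual data.

```python
from datetime import datetime
from statistics import quantiles

# Assumed spellings of a "null" value; adjust to whatever the data really uses.
NULL_TOKENS = {"", "null", "NULL", "None", "N/A"}


def is_missing(value):
    """True when a cell is absent or holds a null-like token."""
    return value is None or str(value).strip() in NULL_TOKENS


def missing_rate(rows, column):
    """Completeness check: fraction of rows where `column` is missing."""
    if not rows:
        return 0.0
    return sum(is_missing(row.get(column)) for row in rows) / len(rows)


def is_valid_date(value, fmt="%Y-%m-%d"):
    """Correctness check: does the value parse as a real calendar date?"""
    try:
        datetime.strptime(str(value), fmt)
        return True
    except ValueError:
        return False


def duplicate_keys(rows, key):
    """Correctness check: key values that occur on more than one row."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; needs at least 2 values."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

For the cleansing step, the same helpers could be reused to decide which cells to blank out, impute, or cap, depending on the rule chosen for each column.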
The data comes from multi-sheet Excel files; the sheets share a few common columns but otherwise have different schemas.
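One way I picture combining the sheets is to stack them on the union of their columns, leaving gaps empty and recording which sheet each row came from. The sketch below is only an illustration in plain Python: it assumes each sheet has already been read into a list of row dictionaries, and `stack_sheets` and the `source_sheet` column are made-up names.

```python
def stack_sheets(sheets):
    """Combine sheets with different schemas into one list of records.

    `sheets` maps a sheet name to a list of row dicts (assumed input shape).
    Columns missing from a sheet are filled with None.
    """
    # Build the union of column names, preserving first-seen order.
    columns = []
    for rows in sheets.values():
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    # Re-emit every row against the full column list, tagged with its origin.
    combined = []
    for sheet_name, rows in sheets.items():
        for row in rows:
            record = {name: row.get(name) for name in columns}
            record["source_sheet"] = sheet_name
            combined.append(record)
    return columns, combined


# Example with two tiny, invented sheets that share only the "id" column.
columns, combined = stack_sheets({
    "Sales": [{"id": 1, "amount": 10}],
    "Returns": [{"id": 1, "reason": "damaged"}],
})
```

Keeping the source sheet on every row means the quality checks can later be reported per sheet as well as overall.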