Alteryx Designer Desktop Discussions

tbergama · 01-14-2020

I am using the Download node to download a csv from a url and I am getting ~1500 rows when I am expecting ~400,000.

My workflow (attached) has the following nodes connected in the following sequence:

1) Text Input

Contains the request url.

2) Download

Downloads the csv from the request url using default settings.

3) Text to Columns

Breaks the 'DownloadData' column into rows, using newline as the delimiter.

This node results in 2123 rows, ~400,000 expected.

4) Text to Columns

Breaks the 'DownloadData' column into columns, using comma as the delimiter.

This node results in 1672 rows, input was 2123 rows.
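For comparison, steps 3 and 4 can be approximated outside Alteryx. This is only a minimal sketch, assuming the downloaded body is available as a single string (as in the 'DownloadData' column). The function name `split_rows_and_columns` and the sample data are made up for illustration and are not taken from the workflow or the API.

```python
import csv
import io
from typing import List


def split_rows_and_columns(download_data: str) -> List[List[str]]:
    """Split raw CSV text into rows, then split each row into columns."""
    # Analogue of step 3: csv.reader yields one record per line.
    # Analogue of step 4: each record is split on commas (quoted fields are respected).
    reader = csv.reader(io.StringIO(download_data))
    # Blank lines come back as empty lists, so drop them.
    return [row for row in reader if row]


# Hypothetical sample standing in for the real download.
sample = "STATION_ID,VALUE\nMRZ,1.0\nMRZ,2.0\n"
rows = split_rows_and_columns(sample)
print(len(rows))  # counts the header line plus the data rows
```

If the count from a sketch like this matches the expected total while the Alteryx results window shows fewer rows, the difference is more likely in what is being displayed than in what was downloaded.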

Doing the equivalent in Python, I get the expected ~400,000 rows, so it doesn't seem to be the API.

import pandas as pd

url = "http://cdec.water.ca.gov/dynamicapp/req/CSVDataServlet?Stations=MRZ&dur_code=E&SensorNums=1&Start=2000-01-01"

# Download csv
df = pd.read_csv(url)

# Check length
len(df)
Out[4]: 399788

Any idea what is going on here? Why aren't I getting the full csv? What is happening to those rows in between nodes 3 and 4?

DiganP · 01-14-2020

@tbergama It's actually pulling in the full dataset; you just have to add the Browse tool, which lets you look at the full dataset. You can also use an Output Data tool to write it out to a flat file.
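If you do write the data out to a flat file, one way to check the row count against the Python result is sketched below. It assumes the file is a plain comma-delimited CSV with one header line. The helper name `count_csv_rows` and the temporary demo file are hypothetical, used only to make the example runnable.

```python
import csv
import os
import tempfile


def count_csv_rows(path: str) -> int:
    """Count data rows in a CSV file, excluding the header line."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header line, if present
        # Ignore blank lines so a trailing newline does not inflate the count.
        return sum(1 for row in reader if row)


# Demo on a small made-up file; replace demo_path with the file written by Alteryx.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("STATION_ID,VALUE\nMRZ,1.0\nMRZ,2.0\n")
    demo_path = tmp.name

demo_count = count_csv_rows(demo_path)
os.remove(demo_path)
print(demo_count)
```

Because the header is excluded, the result is comparable to `len(df)` from `pd.read_csv`.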

Digan
Alteryx

tbergama · 01-14-2020

@DiganP That makes a lot of sense. Thanks.


Download node returns orders of magnitude less data than expected
