
SOLVED

Download node returns orders of magnitude less data than expected

tbergama
5 - Atom

I am using the Download node to download a csv from a url and I am getting ~1500 rows when I am expecting ~400,000.

 

My workflow (attached) has the following nodes connected in the following sequence:

 

1) Text Input

Contains the request url.

 

2) Download

Downloads the csv from the request url using default settings.

 

3) Text to Columns

Breaks the 'DownloadData' column into rows, using newline as the delimiter.

This node outputs 2123 rows; ~400,000 were expected.

 

4) Text to Columns

Breaks the 'DownloadData' column into columns, using comma as the delimiter.

This node outputs 1672 rows; its input was 2123 rows.
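
For reference, steps 3 and 4 can be roughly sketched in Python as a newline split followed by a comma split. This is only an illustration of the intended logic, not what the Alteryx tools do internally; the `download_data` string and its column names are made-up stand-ins for the real 'DownloadData' field.

```python
import csv

# Hypothetical stand-in for the 'DownloadData' field (the real response is far larger).
download_data = (
    "station,sensor,value\n"
    "MRZ,1,0.00\n"
    "MRZ,1,0.04\n"
)

# Step 3: break the single text blob into rows, using newline as the delimiter.
rows = download_data.splitlines()

# Step 4: break each row into columns, using comma as the delimiter.
records = list(csv.reader(rows))

header, data = records[0], records[1:]
print(len(rows), len(data))
```

With this approach the number of data rows should equal the number of lines minus the header, so a drop in row count between the two steps would point at the split settings rather than the download.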

 

Doing the equivalent in Python, I get the expected ~400,000 rows, so it doesn't seem to be the API.

 

 

import pandas as pd

url = "http://cdec.water.ca.gov/dynamicapp/req/CSVDataServlet?Stations=MRZ&dur_code=E&SensorNums=1&Start=2000-01-01"

# Download csv
df = pd.read_csv(url)

# Check length
print(len(df))  # printed 399788 when this was run

 

 

Any idea what is going on here? Why am I not getting the full csv? What is happening to those rows between nodes 3 and 4?

 

2 REPLIES
DiganP
Alteryx Alumni (Retired)

@tbergama It's actually pulling in the full dataset; you just have to add the Browse tool, which lets you look at the full dataset. You can also use an Output Data tool to write it out to a flat file.
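
The same sanity check can be sketched outside Alteryx: write the rows to a flat file and count what actually landed on disk, instead of relying on a preview. This is a minimal Python sketch; the sample rows and the `download_check.csv` file name are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical sample: one header row plus 1000 data rows.
rows = [["station", "value"]] + [["MRZ", str(i)] for i in range(1000)]

# Write everything out to a flat file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "download_check.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Re-read the file and count the rows that were actually written.
with open(path, newline="") as f:
    written = sum(1 for _ in csv.reader(f))

print(written - 1)  # data rows, excluding the header
```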


Digan
Alteryx
tbergama
5 - Atom

@DiganP That makes a lot of sense. Thanks.
