I am relatively new to Alteryx and have thus far been successful with most data parsing tasks. I recently started experimenting with some public data available from https://download.bls.gov/pub/time.series/ap/ and it has posed a challenge for me.
The file in question is located here: https://download.bls.gov/pub/time.series/ap/ap.series
This file is a challenge because the field called series_title contains free text whose individual elements are separated by commas.
I am having trouble working out how best to read this file using the Download tool and then properly parse that particular long field. If the field had been quoted on both ends, it would be easier.
The data from that file looks like the following:
series_id area_code item_code series_title footnote_codes begin_year begin_period end_year end_period
APU0000701111 0000 701111 Flour, white, all purpose, per lb. (453.6 gm) in U.S. city average, average price, not seasonally adjusted 1980 M01 2017 M10
APU0000701311 0000 701311 Rice, white, long grain, precooked (cost per pound/453.6 grams) in U.S. city average, average price, not seasonally adjusted 1980 M01 1981 M12
APU0000701312 0000 701312 Rice, white, long grain, uncooked, per lb. (453.6 gm) in U.S. city average, average price, not seasonally adjusted 1980 M01 2017 M10
So my question is how would a person successfully read this file structure directly from the web?
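To illustrate the kind of parsing I am after, here is a rough Python sketch (outside Alteryx, and only a sketch under assumptions). It assumes the first three fields and the last four fields never contain spaces, so that everything in between can be treated as series_title. It also assumes footnote_codes is empty, as it is in the sample rows above; a non-empty footnote code would end up at the tail of the title in the fallback path. If the raw file turns out to be tab-delimited (the tabs may simply have been lost when I pasted the sample), the function splits on tabs instead. The names FIELDS, LINE_RE and parse_series_line are my own, not part of any library, and downloading the file is left out.

```python
import re

# Column names taken from the header row of ap.series.
FIELDS = ["series_id", "area_code", "item_code", "series_title",
          "footnote_codes", "begin_year", "begin_period",
          "end_year", "end_period"]

# Hypothetical pattern: three space-free fields on the left, two
# year/period pairs on the right, and the title in between.
LINE_RE = re.compile(
    r"^(\S+)\s+(\S+)\s+(\S+)\s+(.*?)"
    r"\s+(\d{4})\s+([A-Z]\d{2})\s+(\d{4})\s+([A-Z]\d{2})\s*$"
)

def parse_series_line(line):
    """Return a dict of fields, or None if the line is not a data row."""
    line = line.rstrip("\r\n")
    # Easy case: a tab-delimited line with exactly nine fields.
    if "\t" in line:
        parts = [p.strip() for p in line.split("\t")]
        if len(parts) == len(FIELDS):
            return dict(zip(FIELDS, parts))
    # Fallback: anchor on both ends of the line with the regex.
    m = LINE_RE.match(line)
    if m is None:
        return None  # e.g. the header row
    sid, area, item, title, b_year, b_per, e_year, e_per = m.groups()
    # Assumption: footnote_codes is empty in this path.
    values = [sid, area, item, title, "", b_year, b_per, e_year, e_per]
    return dict(zip(FIELDS, values))

sample = ("APU0000701111 0000 701111 Flour, white, all purpose, per lb. "
          "(453.6 gm) in U.S. city average, average price, "
          "not seasonally adjusted 1980 M01 2017 M10")
row = parse_series_line(sample)
print(row["series_title"])
```

The commas in the title are never used as separators here, so they stay inside series_title. I would like to reproduce the same logic with Alteryx tools if possible.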
I've included the experimental workflow I constructed for reading this file. Things go wrong once it reaches the series_title field, because the commas inside the title get treated as field separators. Hopefully this is an interesting puzzle to someone. I'm curious how best to solve this scenario, since I imagine it will come up from time to time.
Thanks in advance for comments and suggestions.
-Stew