I have an architecture where IoT data flows from Microsoft IoT Central to an Azure Data Lake (i.e. an Azure Data Lake Storage Gen2 account, built on Blob Storage).
The data export functionality is built in and easy to set up with a simple IoT Central configuration.
Once configured, AVRO files (I assume) are continuously sent to the Data Lake. Folders and files are all created, named and appended automatically, in this case into a blob container named data.
All file paths follow the pattern shown in the attached screenshot, where the first string (after data/) is auto-generated.
A new folder is added automatically every full minute, so each file holds a number of rows, all with timestamps within that specific minute.
I would like to aggregate data from these files over some longer time interval, so it is not as clear-cut as pointing to a single .csv file on a known, fixed file path.
So, how do I use wildcards for the folder structure above? And how do I trigger/schedule a read and data aggregation "job" in Designer?
I have downloaded and installed the Azure Data Lake File Import module and am using that. Is that what is recommended?
@tobiaspegbg, you can use the Directory tool to point to the "data" folder and enable the "include subdirectories" option to expose all the files regardless of which subfolder they live in (which, to my understanding, is the auto-generated portion of the file path). From there you can use a Filter tool to keep only the files created after a certain point in time, and then use a Dynamic Input tool to read in the appropriate files (assuming the structure/format is the same for all of them). See the attached workflow.
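In case it helps to see the same three steps outside Designer, here is a minimal Python sketch of the equivalent logic. The local path, the one-hour window, the newline-delimited JSON reader and the "deviceId" column are all assumptions for illustration, not taken from your setup:

```python
from pathlib import Path
from datetime import datetime, timedelta
import pandas as pd

# Equivalent of the Directory tool with "include subdirectories":
# recursively collect every file under a local copy of the "data" container.
# "C:/datalake/data" is a placeholder path.
root = Path("C:/datalake/data")
all_files = [p for p in root.rglob("*") if p.is_file()]

# Equivalent of the Filter tool: keep only files modified within the
# aggregation window (last hour here, purely as an example).
cutoff = datetime.now() - timedelta(hours=1)
recent = [p for p in all_files
          if datetime.fromtimestamp(p.stat().st_mtime) >= cutoff]

# Equivalent of the Dynamic Input tool: read every matching file
# (assuming they all share the same schema) and aggregate.
# read_json(..., lines=True) assumes newline-delimited JSON; swap in the
# reader that matches your actual export format.
frames = [pd.read_json(p, lines=True) for p in recent]
df = pd.concat(frames, ignore_index=True)

# "deviceId" is a hypothetical grouping column for the aggregation step.
summary = df.groupby("deviceId").mean(numeric_only=True)
print(summary)
```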
Regarding scheduling: I can confirm that you CANNOT schedule processes with Designer alone. You'll need to run the workflow manually (click the Run button) unless you have access to Server, which is required for scheduling workflow runs.
Regarding the Azure Data Lake File Import connector: I can confirm that, at this time, it does NOT have the flexibility to read in multiple files at once the way the Directory tool does.
This community post seems to have a workaround using a batch macro that might help you.
The example in the attached workflow can work if you can somehow get the files from the Data Lake onto your machine for use with the Directory tool.
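If you do need to land the files locally first, one option outside Designer is a small script using the azure-storage-blob SDK to mirror the container to a local folder that the Directory tool can then point at. This is just a sketch; the connection string, container name and local folder below are placeholders:

```python
from pathlib import Path
from azure.storage.blob import BlobServiceClient

# Placeholders -- substitute your own values.
CONN_STR = "<your-storage-account-connection-string>"
CONTAINER = "data"                 # the blob container IoT Central exports into
LOCAL_DIR = Path("C:/datalake/data")

service = BlobServiceClient.from_connection_string(CONN_STR)
container = service.get_container_client(CONTAINER)

# Walk every blob in the container (all auto-generated subfolders included)
# and mirror it locally, preserving the folder structure.
for blob in container.list_blobs():
    target = LOCAL_DIR / blob.name
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as f:
        f.write(container.download_blob(blob.name).readall())
```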
Let me know if there's anything above I can clarify for you further.
If this resolves your issue or gives further guidance on how to solve it, please mark this post as the solution so that others in the community can benefit from our collaboration.
Thanks.
Thanks Andrew, I much appreciate it.