Alteryx workflow to write Dataset Metadata to a DCAT RDF file
Introduction:
I am an Enterprise Data Architect. We want to exchange our data catalog with other entities, so we need to store our metadata in a machine-readable format. The W3C DCAT 3.0 standard is particularly suitable for this.
I am looking for an Alteryx workflow example that can write the metadata of a dataset to a DCAT 3.0 RDF file.
Need:
The workflow must perform the following steps:
- Read the metadata of a dataset.
- Convert the metadata to DCAT RDF.
- Write the DCAT RDF to a file with the extension .rdf.
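The three steps above can be sketched in Python using only the standard library. This is a minimal sketch, not an Alteryx workflow: the `meta` dictionary, its keys, and the function names are hypothetical stand-ins for whatever the metadata source actually provides, and the output is RDF/XML (the serialization conventionally stored in .rdf files) covering only a handful of DCAT properties.

```python
import xml.etree.ElementTree as ET

# Namespace IRIs for the vocabularies used in this sketch.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def q(prefix, local):
    """Return an ElementTree qualified name of the form {namespace}local."""
    return "{%s}%s" % (NS[prefix], local)


def dataset_to_rdfxml(meta):
    """Step 2: convert a metadata dict (hypothetical keys) to RDF/XML text."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)
    root = ET.Element(q("rdf", "RDF"))
    ds = ET.SubElement(root, q("dcat", "Dataset"),
                       {q("rdf", "about"): meta["iri"]})
    ET.SubElement(ds, q("dct", "title")).text = meta["title"]
    ET.SubElement(ds, q("dct", "description")).text = meta["description"]
    # One dcat:keyword element per keyword.
    for keyword in meta.get("keywords", []):
        ET.SubElement(ds, q("dcat", "keyword")).text = keyword
    ET.SubElement(ds, q("dct", "publisher"),
                  {q("rdf", "resource"): meta["publisher"]})
    return ET.tostring(root, encoding="unicode")


def write_rdf(meta, path):
    """Step 3: write the RDF/XML text to a file, for example one ending in .rdf."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        handle.write(dataset_to_rdfxml(meta))


if __name__ == "__main__":
    # Step 1: in practice the metadata would be read from a catalog or a file;
    # here it is a hard-coded, hypothetical example.
    example = {
        "iri": "https://example.com/customer-dataset",
        "title": "Customer Dataset",
        "description": "A dataset containing information about customers",
        "keywords": ["customer", "sales", "marketing"],
        "publisher": "https://example.com/organization",
    }
    write_rdf(example, "customer-dataset.rdf")
```

Building the document through an XML API rather than by string concatenation means titles and descriptions are escaped automatically. A dedicated RDF library would be the more robust choice if one is available in the environment.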
Context:
The DCAT 3.0 standard is described here:
https://www.w3.org/TR/vocab-dcat-3/
The Field Info tool is your best friend: https://help.alteryx.com/current/en/designer/tools/developer/field-info-tool.html
As for DCAT RDF, I am not sure what it looks like. Do you have a sample? If it is delimited in some way, or if it can be opened in Notepad++, you can use the Output Data tool, choose ".csv" as the file format, and change the extension in the output path to .rdf or whatever extension is appropriate.
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Below is an example of a DCAT description of a Customer Dataset, together with a definition of the Customer concept used in the dataset.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<https://example.com/customer-dataset>
    a dcat:Dataset ;
    dct:title "Customer Dataset" ;
    dct:description "A dataset containing information about customers" ;
    dcat:keyword "customer", "sales", "marketing" ;
    dct:publisher <https://example.com/organization> ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <https://example.com/customer-dataset.csv> ;
        dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;
        dct:format [
            a dct:MediaTypeOrExtent ;
            rdfs:label "CSV"
        ]
    ] ;
    dcat:landingPage <https://example.com/customer-dataset> ;
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
    dcat:theme [
        a skos:Concept ;
        skos:prefLabel "Customer Management"
    ] ;
    dct:subject [
        a skos:Concept ;
        skos:prefLabel "Customer" ;
        skos:broader [
            a skos:Concept ;
            skos:prefLabel "Person"
        ]
    ] .
Here is a breakdown of the DCAT file:
- The @prefix directives declare the prefixes for the RDF namespaces used in the file. Each directive ends with a period.
- Angle brackets (<>) enclose IRIs (Internationalized Resource Identifiers).
- The a keyword is shorthand for rdf:type and declares the type of a resource.
- A semicolon (;) separates properties of the same subject, a comma (,) separates multiple values of the same property, and a period (.) ends the description.
- Square brackets ([ ]) describe a nested resource that has no IRI of its own (a blank node).
- The dct:title, dct:description, dcat:keyword, dct:publisher, dcat:distribution, dcat:landingPage, dct:license, dcat:theme, and dct:subject properties describe the dataset.
- The skos:prefLabel property gives the preferred label for a concept.
- The skos:broader property links a concept to a broader concept.
Okay, then my previous response may no longer be valid. The DCAT RDF output format looks very unfamiliar to me, and I don't think my recommendation works.
Can this be done with Python or R? If yes, you can port over the script into Alteryx via the R or Python tools and output accordingly as well.
Not sure if relevant: https://community.alteryx.com/t5/Alteryx-Server-Discussions/Is-it-possible-to-execute-Alteryx-workfl...
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
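As a rough illustration of the Python route suggested above, the following sketch turns rows of dataset metadata into Turtle text using only the standard library. The column names (`iri`, `title`, `description`, `keywords`) and the function names are hypothetical. Inside the Alteryx Python tool the rows would presumably come from the incoming data stream (for example via `Alteryx.read("#1")` from the `ayx` package, converted to a list of dicts), but that wiring is an assumption and is left out so the sketch runs anywhere.

```python
# Prefixes used by the generated Turtle.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def turtle_literal(value):
    """Quote a value as a Turtle string literal, escaping special characters."""
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"%s"' % escaped


def rows_to_turtle(rows):
    """Render one dcat:Dataset description per row (hypothetical column names)."""
    lines = ["@prefix %s: <%s> ." % (p, u) for p, u in PREFIXES.items()]
    for row in rows:
        # A comma-separated keyword cell becomes several dcat:keyword values.
        keywords = ", ".join(
            turtle_literal(k.strip())
            for k in row.get("keywords", "").split(",")
            if k.strip()
        )
        lines.append("")
        lines.append("<%s>" % row["iri"])
        lines.append("    a dcat:Dataset ;")
        lines.append("    dct:title %s ;" % turtle_literal(row["title"]))
        if keywords:
            lines.append("    dcat:keyword %s ;" % keywords)
        lines.append("    dct:description %s ." % turtle_literal(row["description"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample_rows = [
        {
            "iri": "https://example.com/customer-dataset",
            "title": "Customer Dataset",
            "description": "A dataset containing information about customers",
            "keywords": "customer, sales, marketing",
        }
    ]
    # Turtle is conventionally saved with a .ttl extension.
    with open("customer-dataset.ttl", "w", encoding="utf-8") as handle:
        handle.write(rows_to_turtle(sample_rows))
```

The sketch does not validate IRIs or cover the nested distribution and concept descriptions; an RDF library such as rdflib, if it can be installed in the Python tool's environment, would handle serialization more safely.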
Thanks for your support, I really appreciate it. I'm a little surprised that there is no information on RDF formats available in Alteryx. DCAT and SKOS are W3C standards and widely used.
Perhaps someone from Alteryx can chip in here. This is beyond what I can answer as a community member, but if others have experience with it, their opinions carry more weight than mine in this case.
Alternatively, if Alteryx really doesn't have anything on this, you can suggest it as an idea. You can also go through your CSM as part of their Voice of the Customer initiative. Your CSM, or even your Alteryx-assigned engineer, is better placed to answer you in that regard.
Hope this helps somewhat @pgrooten
Alteryx ACE
https://www.linkedin.com/in/calvintangkw/
Thanks! Good advice! I will contact my CSM for this.
