Creating Alteryx database yxdb.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I am looking to create a Alteryx database repository. I am aware that the size limits for a yxdb file to grow for a 64bit system is 2GB. This limit will hit soon as i am appending data on to the file. What option do i have in terms of growing the file after that? Does Alteryx or Alteryx server provide any capability of holding file which can grow more than 2 GB
Some thoughts i had were:-
- Creating seperate yxdb files each time limit is hit and while querying data i union the file together? Will that hit performance a lot?
- Move the repository to the SQL databases?
Any suggestion or feedback would be great..
Thank you..!!
Solved! Go to Solution.
- Labels:
- Output
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Here are some file splitting options:
- Logically partition the data
- Partition the data by quantity of records
If you know the size of a record, you can divide that size into 2 GB and set a maximum # per file. When you output the data, there is a record limit option. When reading the file(s) into an input tool, you can wildcard the input and read all files. There won't be a performance hit; other than reading more than 2GB of data. I do wonder if you'll really need to read all the data all of the time. That's why I suggested the logical partitions instead of a record count.
If you embed the filename with your partition logic, you can use a directory tool to read all possible filenames, filter to the sets of data that include your desired input and take only those files into a dynamic input tool. I like this option.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Will Logically partitioning the data in output create a separate file automatically once the data size limit of 2GB is reached?
I am trying to achieve a Alteryx database which would ideally append data with course of time from which we can always query only the data that is required.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Example:
I have a file of 18,000 records that looks like 700KB inside of Alteryx. When I output the file, it sits on 180KB of disk (great compression in the yxdb file format). Each record is roughly 10 Bytes in size. If I divide 2,000,000,000 (approximate #) by 10, I can get roughly 200 million records into that file. That is the configuration (Option #1: Record Limit) that I would use to make each yxdb no more than 200,000,000 records.
If the logical partitioning runs the risk of exceeding the limit (maybe more specific logic is needed), you may still need to include a record limit. When you are reading so many records into the input, I hope that you would consider options for faster read/write operations.
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Before devising a partitioning strategy, you may want to further investigate the 2GB limit. As far as I'm aware for all practical purposes, a YXDB can be as large as your disk space will allow for. We commonly produce YXDBs well into the dozens of GBs and have never ran into an issue.
Best regards,
Ryan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I agree that I've never experienced an issue with large files. Perhaps the help file is in need of an update?
Alteryx Data Files
.yxdb (Alteryx Database)
The Alteryx database format (.yxdb) is a file type that can hold data fields, values, and spatial objects . The .yxdb format is the most efficient file type for reading and writing in Alteryx because it has no size limit, is compressed for maximum speed, and includes additional metadata that references the source of the data and how the data was created. While there is no limit to the number of records or file size, there is a record size limit of 2GB in a 64-bit environment.
See Alteryx Database File Format for more information.
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Ahh @MarqueeCrew, this makes more sense - each individual record (IE: Row) cannot exceed 2GB in size while the YXDB itself has no size limit. It would take an incredibly wide table or very high resolution spatial object to hit the row size limit.
I agree there may be opportunity for a bit of extra wording in the help file to reduce any confusion.
Best regards,
Ryan
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
as @rdoptis points out, don't worry about the total size of the file. I too read that wrong. I've never concerned myself over the size of a yxdb. But if I am reading and writing to the same yxdb and growing it over time, my ADHD will be kicking in and waiting for it to read (especially if over a network) would cause me to twitch. Also, if you don't have backups available you might wish you did at some point in the future.
Alteryx is wonderful for getting you the results that you want quickly. Sometimes, however, architecture for the future is a necessary component.
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you @rdoptis and @MarqueeCrewfor clearing this out for me, it would be a great help to plan things now. Even I was a little confused with the size limits.
