The Issue
Currently, there is not a clear explanation on what the real difference between String and V_String is. The documentation states, "It is more efficient to store strings as variable-length strings." So, why use the String data type?
You can open up the attached workflow, and follow along.
Test
A CSV file containing fixed-length, 4-digit account numbers was saved as four different YXDB files:
The files were then compared by file size to see the difference.
PowerShell script to get file sizes:
Get-ChildItem "*.yxdb" | ForEach-Object {
Write-Host ([string]::Format("{0}`t{1}", $_.Length, $_.Name))
}
Results
Here is the PowerShell output for the file sizes:
53815 String 255.yxdb
30527 String 4.yxdb
49549 V_String 255.yxdb
49545 V_String 4.yxdb
The String 4 data type was the smallest file at 31 KB. The V_String data type files were both 50 KB. And the largest of the four was the String 255 data type file.
So, when you have a field with a fixed number of Latin-1 characters, the String data type set to the number of characters (i.e. String 10) is the most efficient. Also, limiting the max length of the V_String only has a negligible effect on the file size, so it is best to leave the max value high to avoid truncating strings.
Try your test using strings of varying lengths.
The String data type will reserve a static amount of space for the each occurrence of the field (each "cell"), even if the data in the field doesn't not take up all of the space.
Using String(10) when a field holds a maximum length of 4 characters will "waste" the storage space of 6 characters.
V_String adjusts the storage space to match the actual length of each "cell". (or maybe it adjusts to match the maximum length of all values in the field, not cell-by-cell?)
Use V_WString if any of your "cells" will hold unicode data (2 bytes per character).
Chris
User | Count |
---|---|
18 | |
15 | |
13 | |
9 | |
8 |