The Issue
Currently, there is no clear explanation of the real difference between the String and V_String data types. The documentation states, "It is more efficient to store strings as variable-length strings." So why use the String data type at all?
Open the attached workflow to follow along.
Test
A CSV file containing fixed-length, 4-digit account numbers was saved as four different YXDB files:
- With the data type V_String 255
- With the data type V_String 4
- With the data type String 255
- With the data type String 4
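The original CSV is not reproduced here, so anyone repeating the test needs a comparable input. The following is a minimal Python sketch that writes a one-column CSV of random 4-digit account numbers. The function name `write_account_csv`, the column name `AccountNumber`, the default row count, and the seed are all assumptions made for illustration and are not taken from the original test.

```python
import csv
import random


def write_account_csv(path, rows=10_000, seed=0):
    """Write `rows` random 4-digit account numbers to a one-column CSV.

    The row count, seed, and column name are hypothetical; the original
    test file's size and header are not known.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same file
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["AccountNumber"])  # hypothetical column name
        for _ in range(rows):
            # 1000-9999 guarantees exactly four digits without padding
            writer.writerow([str(rng.randint(1000, 9999))])
```

Calling `write_account_csv("accounts.csv")` produces a file that can then be read into the workflow and written out with each of the four field types. Absolute file sizes will differ from the ones reported in this article because the number of rows is a guess.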
The files were then compared by file size to see the difference.
PowerShell script to get file sizes:
# Print each .yxdb file in the current folder as "<size in bytes><tab><file name>"
Get-ChildItem "*.yxdb" | ForEach-Object {
    Write-Host ([string]::Format("{0}`t{1}", $_.Length, $_.Name))
}
Results
Here is the PowerShell output for the file sizes:
53815 String 255.yxdb
30527 String 4.yxdb
49549 V_String 255.yxdb
49545 V_String 4.yxdb
The String 4 file was the smallest at 30,527 bytes. The two V_String files were nearly identical at 49,549 and 49,545 bytes, and the String 255 file was the largest of the four at 53,815 bytes.
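To put these differences in relative terms, the byte counts can be compared against the smallest file. This is a short Python sketch of that arithmetic. The sizes are copied from the PowerShell output above, and the helper name `relative_sizes` is an assumption made for illustration.

```python
# File sizes in bytes, copied from the PowerShell output above.
sizes = {
    "String 255": 53815,
    "String 4": 30527,
    "V_String 255": 49549,
    "V_String 4": 49545,
}


def relative_sizes(sizes):
    """Return each file's size as a multiple of the smallest file."""
    smallest = min(sizes.values())
    return {name: size / smallest for name, size in sizes.items()}


# Print the files from smallest to largest with their ratio to the smallest.
for name, ratio in sorted(relative_sizes(sizes).items(), key=lambda kv: kv[1]):
    print(f"{name:<14}{sizes[name]:>7} bytes  {ratio:.2f}x")
```

By this measure both V_String files are roughly 62% larger than the String 4 file, and the String 255 file is roughly 76% larger. The two V_String files differ by only 4 bytes.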
So, when a field holds a fixed number of Latin-1 characters, a String data type sized to exactly that number of characters (e.g. String 10 for a 10-character field) was the most efficient choice in this test. Limiting the maximum length of a V_String field had only a negligible effect on file size (4 bytes here), so it is best to leave the maximum high to avoid truncating strings.