I am once again asking for your tech support.
I'm working through a huge dataset, trying to match codes from individual records against another file that holds the other portion of the data, using a series of tools. The hangup is a join tool that comes immediately after another join.
The first join doesn't struggle too much, but when its output has to join against the next set of information, the record count balloons to 3.2 billion, and it grows even further in the join after that. The junk gets trimmed out later in the workflow, but with this much data I keep burning through over 500 GB of disk space in temp storage, and the workflow can no longer run.
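For context, here's a minimal sketch of what I think is happening, written in pandas rather than the actual tool (the data and column names are made up): when the join key isn't unique on either side, every matching pair multiplies, so record counts explode long before the junk gets filtered out.

```python
import pandas as pd

# Hypothetical example: the key "A" appears twice on the left
# and three times on the right.
left = pd.DataFrame({"code": ["A", "A"], "left_val": [1, 2]})
right = pd.DataFrame({"code": ["A", "A", "A"], "right_val": [10, 20, 30]})

# A many-to-many join produces one row per matching pair:
# 2 left rows x 3 right rows = 6 output rows for a single key.
joined = left.merge(right, on="code")
print(len(joined))  # 6
```

Chain a second join on top of that intermediate output and the multiplication compounds, which is roughly how I end up at billions of records from much smaller inputs.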
I inherited this workflow when I joined the team and have made alterations over time, but I don't know how to fix this issue. It would also be illegal for me to share the data, which I know makes this a hard ask.