Are you 100% sure those are the same data sets?
In my experience, when something goes wrong I start by narrowing down the possible causes.
Is the number of rows the same? Check.
Is the number of columns the same? Check.
Has every value been confirmed to match exactly, position by position, for example with a join tool?
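If no join tool is at hand, the same three checks can be sketched in a few lines of Python. This is only an illustration: it assumes both data sets can be exported to CSV, and the sample text and the helper name diff_cells are made up for the example.

```python
import csv
import io


def diff_cells(rows_a, rows_b):
    """Return (row, column, value_a, value_b) for every cell that differs."""
    mismatches = []
    for r, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                mismatches.append((r, c, a, b))
    return mismatches


# Hypothetical exports of the two data sets; note the trailing space after "Bob".
text_a = "id,name\n1,Alice\n2,Bob\n"
text_b = "id,name\n1,Alice\n2,Bob \n"

rows_a = list(csv.reader(io.StringIO(text_a)))
rows_b = list(csv.reader(io.StringIO(text_b)))

# Checks 1 and 2: same number of rows, and same number of columns in each row.
same_shape = len(rows_a) == len(rows_b) and all(
    len(a) == len(b) for a, b in zip(rows_a, rows_b)
)

# Check 3: compare every position.
mismatches = diff_cells(rows_a, rows_b)

print(same_shape)
print(mismatches)
```

Here the shapes match but one cell differs by a trailing space, which is the kind of difference a row and column count alone would not reveal.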
The truth is that the data is the data. If the values are exactly the same, the only reason the data size would differ is the declared type of the character columns. A column might be stored as a string of length 7 in one data set, while the same column in the second data set is stored as a string of length 30.
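To see why the declared length matters, here is a rough sketch in Python. It assumes a storage format that pads every string to its declared width, which may or may not be how your tool stores data; the column values and the helper name fixed_width_size are invented for the example.

```python
import struct

# Hypothetical column values, all 7 characters or fewer.
values = ["apple", "kiwi", "banana"]


def fixed_width_size(values, width):
    """Bytes needed when every value is padded to a fixed field of `width` bytes."""
    fmt = f"{width}s"  # struct pads shorter byte strings with null bytes
    return sum(len(struct.pack(fmt, v.encode("ascii"))) for v in values)


size_7 = fixed_width_size(values, 7)    # column declared as length 7
size_30 = fixed_width_size(values, 30)  # same values, column declared as length 30

print(size_7, size_30)  # prints: 21 90
```

The values are identical in both cases, yet the wider declaration takes more than four times the space under this assumption.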
Off the top of my head, I can't think of any other reason the file sizes would differ, apart from the file format, and I assume the format is the same in both cases.