Obviously on this sort of dataset it's pretty easy, but in the real world I have 20 million records (40 GB) in the fact table, so after the join I have over 290 million records (600+ GB), and the next filter drops that down to 13 million records (28 GB).
The biggest problem is that I can't do the join In-DB without some significant refactoring, so I'm hoping there's a solution I haven't thought of yet.
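To make the shape of the problem concrete, here's a minimal pandas sketch of one possible workaround: stream the fact table in chunks and apply the filter immediately after each partial join, so the 290-million-row intermediate never fully materializes at once. The file names, column names, and filter condition are all placeholders, not anything from the actual workflow:

```python
# Sketch only: join chunk-by-chunk and filter right away, so the huge
# post-join intermediate never has to exist in memory or on disk at once.
# "customer_dates.csv", "fact_table.csv", and all column names are hypothetical.
import pandas as pd

dim = pd.read_csv("customer_dates.csv")  # small dimension table: customer_id, event_date

filtered_chunks = []
for chunk in pd.read_csv("fact_table.csv", chunksize=1_000_000):
    joined = chunk.merge(dim, on="customer_id", how="inner")
    # apply the downstream filter immediately, before accumulating anything
    # (assumes ISO-formatted date strings, which compare correctly as text)
    kept = joined[joined["transaction_date"] >= joined["event_date"]]
    filtered_chunks.append(kept)

result = pd.concat(filtered_chunks, ignore_index=True)
print(len(result))
```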
Try Find and Replace: find the Customer ID and replace it with the date(s). The date fields can be concatenated into one field, and your filter can apply parsing logic to determine the dates. Or, preferably, do it In-DB. You could also look at creating surrogate keys in the DB to match the appropriate record (highly recommended).
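Roughly, that lookup idea in pandas would look like the sketch below; the column names, delimiter, and date logic are all hypothetical:

```python
# Sketch of the lookup idea: swap the customer ID for a concatenated date
# string (no row-multiplying join), then parse that string in the filter.
# All column names and the date predicate are made up for illustration.
import pandas as pd

dim = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "event_date": ["2020-01-05", "2020-03-10", "2020-02-01"],
})
fact = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "transaction_date": ["2020-02-01", "2020-01-15", "2020-04-01"],
})

# concatenate each customer's dates into one pipe-delimited field
lookup = dim.groupby("customer_id")["event_date"].agg("|".join)

fact["event_dates"] = fact["customer_id"].map(lookup)

# parsing logic: keep rows whose transaction date follows any event date
def passes(row):
    if pd.isna(row["event_dates"]):
        return False
    return any(row["transaction_date"] >= d for d in row["event_dates"].split("|"))

print(fact[fact.apply(passes, axis=1)])
```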
Sorry, I misunderstood parts of your issue; Find & Replace won't work in this scenario. In-DB would be your best option, or alternatively you could try a Calgary Join, which should speed up the process greatly.
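The point of the In-DB route is to push both the join and the filter into a single SQL statement, so only the ~13 million surviving rows ever leave the database. A minimal sketch, assuming both tables live in the same database; the driver, connection string, table names, and predicate are all placeholders:

```python
# Sketch of the In-DB approach: let the database do the join AND the
# filter, and only stream back the rows that survive.
# sqlite3 stands in for whatever DB-API driver the real database uses;
# "warehouse.db" and the table/column names are hypothetical.
import sqlite3

conn = sqlite3.connect("warehouse.db")
query = """
    SELECT f.*, d.event_date
    FROM fact_table AS f
    JOIN customer_dates AS d
      ON d.customer_id = f.customer_id
    WHERE f.transaction_date >= d.event_date
"""
for row in conn.execute(query):
    pass  # stream rows; in practice, hand them to the next stage
conn.close()
```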