Hello Experts,
I have 100,000 +rows of excel data that needs to be parsed into rows and columns. To distinguish data in the rows - it is indicated by page numbers (ex: page 1 of 700).
Below is the input data and the expected output:
Page 25 of 700 | |
EffDt <= 4 day | RULE NAME |
Left Right | |
Criteria = 1 Criteria = 1 | |
( Column Op Value ) Logical ( Column Op Value ) Logical | |
Debit or = C AND Debit or = D AND | |
Credit Credit | |
Reference 4 = 856 Reference 4 = 859 | |
Left Match Group Right Match Group | |
Tolerance Tolerance | |
effectiveDate | |
Left Item Additional Op Right Item Additional Tolerance | |
Info (L) Info (R) | |
Effective DateYYYYMMDD <= Effective Date YYYYMMDD 4 Days | |
No | Exception Combo Rule |
status | |
Page 26 of 700 | |
Many DR Contract # | MATCH RULE NAME |
Left Right | |
Criteria = 1 Criteria >= 1 | |
( Column Op Value ) Logical ( Column Op Value ) Logical | |
Debit or = C AND Debit or = D AND | |
Credit Credit | |
Reference 4 = 086 AND Transaction = DW AND | |
Type | |
Reference 1 <> Reference 7 <> | |
Left Match Group Right Match Group | |
Tolerance Tolerance | |
Left Item Additional Op Right Item Additional Tolerance | |
Info (L) Info (R) | |
Reference 1 = Reference 7 | |
Effective DateYYYYMMDD = Effective Date YYYYMMDD 30 Days | |
No | Exception Combo Rule |
status | |
Page 27 of 700 |
Output:
RULE NAME | Left Criteria | Column | Op | Value | Logical | Right Criteria | Column | Op | Value | Logical | Left Match Group Tolerance | Right Match Group Tolerance | Left Item | Additional Info (L) | Op | Right Item | Additional Info (R) | Toleranace | Exception Status | Combo Rule |
EffDt <= 4 day | .=1 | Debit or Credit | = | C | AND | .=1 | Debit or Credit | = | D | AND | effectiveDate | Effective Date | YYYYMMDD | <= | Effective Date | YYYYMMDD | 4 Days | No | ||
Many DR Contract # | .=1 | Debit or Credit | = | C | AND | >=1 | Debit or Credit | = | D | AND | ||||||||||
Many DR Contract # | Reference 4 | = | *086 | AND | Transaction Type | = | DW | AND | Reference 1 | = | Reference 7 | |||||||||
Many DR Contract # | Reference 1 | <> | Reference 7 | <> | Effective Date | YYYYMMDD | = | Effective Date | YYYYMMDD | 30 Days | No |
Hi @geeklarokcmie. This is an interesting problem I began to solve but notices that page 26 information appears to appear in 3 seperate rows in the output where has page 25 only occupies one row. Can you please explain the logic to why this is the case?
Regards - Rhys
Appreciate your support in finding a resolution.
The source data is from a txt file that I converted to excel using Power Query.
The data in each page is not consistent - one page may have three rows of data under the column op value section or the additional info section
while other page may only have a single row of data under the columns op value section or the additional info section.
Let me know if this information was helpful.
@geeklarokcmieThank you for your reply: would be very tricky creating parsing logic if the data per each page is inconsistent. We might have to look into parsing the txt file directly itself. Could you share this, or a sample of it, please.
Can you put up the original txt file preview of the same data. The data is broken into multiple rows not by logic but by word, making it difficult to put it together.
@Rhys_Cooper @Gaurav_Dhama_ Please find the attached txt file.