I had to build a data set of Congressional marks from some really low-quality OCR images.
See the table below.
My question involves the last two rows, which have the value Committee Recommended under WHAT, and specifically the second-to-last row, labeled "NEED CALCULATED VERSION". The last row is generated from the OCR scan, but it is full of errors. So I need to generate the second-to-last row (the Budget Request plus the sum of the adjustments) and compare it against the last row to see if there is a difference. If the difference is not 0, then there is an OCR error that needs to be corrected. What is the simplest way to do this?
This is just the walk-down for one line; the real data has hundreds of lines with similar walk-downs.
| | ID | NAME | WHO | WHAT | TXT | AMOUNT |
| HARD CODED | N0181-07-02-4162-A | SLDTA | PBR | Budget Request | Request | 101595 |
| HARD CODED | N0181-07-02-4162-A | SLDTA | HAC | Adjustment | Increase X | -6385 |
| HARD CODED | N0181-07-02-4162-A | SLDTA | HAC | Adjustment | Decrease Y | -3499 |
| HARD CODED | N0181-07-02-4162-A | SLDTA | HAC | Sum Adjustments | Sum of all... | -9884 |
| NEED CALCULATED VERSION | N0181-07-02-4162-A | SLDTA | HAC | Committee Recommended | HAC Recommended | 91,711 |
| ALREADY HARD CODED | N0181-07-02-4162-A | SLDTA | HAC | Committee Recommended | HAC Recommended | 91,711 |
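
To make the check concrete, here is a rough sketch of what I am after, written in plain Python. It rests on several assumptions that are not settled above: the rows are loaded as a list of dicts keyed by the column names, AMOUNT is already an integer (strings like "91,711" would first need something like `int(s.replace(",", ""))`), each ID has exactly one Budget Request row, and the calculated value is the request plus the individual Adjustment rows for that committee. The function name `find_ocr_errors` is hypothetical.

```python
from collections import defaultdict

# Rows transcribed from the table above (only the columns the check needs).
LINE_ID = "N0181-07-02-4162-A"
rows = [
    {"ID": LINE_ID, "WHO": "PBR", "WHAT": "Budget Request", "AMOUNT": 101595},
    {"ID": LINE_ID, "WHO": "HAC", "WHAT": "Adjustment", "AMOUNT": -6385},
    {"ID": LINE_ID, "WHO": "HAC", "WHAT": "Adjustment", "AMOUNT": -3499},
    {"ID": LINE_ID, "WHO": "HAC", "WHAT": "Sum Adjustments", "AMOUNT": -9884},
    {"ID": LINE_ID, "WHO": "HAC", "WHAT": "Committee Recommended", "AMOUNT": 91711},
]


def find_ocr_errors(rows):
    """Return (ID, WHO, calculated, ocr_value, difference) for every mismatch."""
    request = {}                    # ID -> budget request amount
    adjustments = defaultdict(int)  # (ID, WHO) -> sum of individual adjustments
    recommended = {}                # (ID, WHO) -> OCR'd committee recommendation

    for r in rows:
        key = (r["ID"], r["WHO"])
        if r["WHAT"] == "Budget Request":
            request[r["ID"]] = r["AMOUNT"]
        elif r["WHAT"] == "Adjustment":
            adjustments[key] += r["AMOUNT"]
        elif r["WHAT"] == "Committee Recommended":
            recommended[key] = r["AMOUNT"]
        # "Sum Adjustments" rows are skipped here because they also come from
        # OCR; they could be checked against adjustments[key] the same way.

    errors = []
    for (line_id, who), ocr_value in recommended.items():
        calculated = request[line_id] + adjustments[(line_id, who)]
        difference = calculated - ocr_value
        if difference != 0:
            errors.append((line_id, who, calculated, ocr_value, difference))
    return errors


print(find_ocr_errors(rows))  # an empty list means no mismatches were found
```

For the sample line above this finds no mismatch, since 101595 - 6385 - 3499 = 91711. Any line where the OCR value differs from the calculated one would come back with both values and the difference, so it can be corrected by hand.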