Need help. I extracted a 500-page price catalog using the Computer Vision tools. Now I need to get at the data trapped in the extracted text.
Reference the image below.
- I want to extract CODE (column 2) and its associated price (column 3). The "$" is the key to finding the CODE and dollar amount pair.
- I want to extract the name of an item (if it is present). The name of an item will always appear in the row preceding the row that contains the "$" and the name of the item will always be surrounded by "-" as in "- 3000 HS -" in the image.
- Ideally, I also want to pick up the second value pair that appears in the row after the "$". This will usually be a +/- value pair, but it may be just a single + or - value.
- In a perfect world, I also want to capture the item description: all the text at the start of the line with the "$", plus all following lines until the first line that starts with REQS.
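
To make the four rules above concrete, here is a rough Python sketch of the logic I am after. It is only an illustration under assumptions I have not verified against the real output: that the extracted text comes back one row per line, that the columns are separated by whitespace, that CODE contains no spaces, and that the +/- values are signed numbers such as "+12.00". The sample rows and the names `parse_catalog`, `PRICE_ROW`, `NAME_ROW`, and `ADJ` are made up for the example and are not taken from the attached workflow.

```python
import re

# Assumed key row layout: "<description>  <CODE>  $<price>"
PRICE_ROW = re.compile(
    r"^(?:(?P<desc>.*?)\s+)?(?P<code>\S+)\s+\$\s*(?P<price>[\d,]+(?:\.\d+)?)\s*$"
)
# Item name row: text surrounded by "-", e.g. "- 3000 HS -"
NAME_ROW = re.compile(r"^\s*-\s*(?P<name>.+?)\s*-\s*$")
# Assumed +/- values: standalone signed numbers such as "+12.00" or "-5"
ADJ = re.compile(r"(?<!\S)[+-]\d[\d,]*(?:\.\d+)?(?!\S)")


def parse_catalog(lines):
    """Return one dict per row that contains a CODE and "$" price pair."""
    records = []
    for i, line in enumerate(lines):
        m = PRICE_ROW.match(line)
        if not m:
            continue
        rec = {"code": m["code"], "price": m["price"], "name": None}

        # Rule 2: the name, if present, is on the row just before the "$" row.
        if i > 0:
            n = NAME_ROW.match(lines[i - 1])
            if n:
                rec["name"] = n["name"]

        # Rule 3: one or two +/- values on the row just after the "$" row.
        rec["adjustments"] = ADJ.findall(lines[i + 1]) if i + 1 < len(lines) else []

        # Rule 4: description = start of the "$" row plus following rows
        # until a row that starts with REQS. Stopping at the next name or
        # "$" row as well is an extra guard I added, not one of the rules.
        parts = [(m["desc"] or "").strip()]
        for follow in lines[i + 1:]:
            if follow.lstrip().startswith("REQS"):
                break
            if PRICE_ROW.match(follow) or NAME_ROW.match(follow):
                break
            parts.append(ADJ.sub("", follow).strip())
        rec["description"] = " ".join(p for p in parts if p)
        records.append(rec)
    return records


# Invented sample rows, only to exercise the sketch.
sample = [
    "- 3000 HS -",
    "Heavy duty hinge, steel   HS3000   $1,234.50",
    "with 3/4-16 mounting stud           +12.00   -5.00",
    "zinc plated finish",
    "REQS: mounting kit",
]
records = parse_catalog(sample)
```

The same three patterns should carry over to any regex-based parsing step; the row-before and row-after lookups are the part that needs the rows to stay in their original order.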
Clear as mud, right?
Sample workflow attached with real data.
