Hi there, I am trying to extract the invoice number from a bank statement line using the regex tool but cannot figure it out. The invoice number generally starts with 9, and one statement line can have multiple invoice numbers.
Example 1
Input: 343319249320230203001340619062958368/PT/FT/PY/9062958368/OA/USD326306,10/BI/816467004/BN/ABC Ltd. /Country/OB/3582088658001/OB1/Bank Name (Bank Country)B3/FLOOR 32, STANDARD CHARTERED BANKBU/OB4/ILDING 4-4A DES VOEUX ROAD/BO...
Output: 9062958368
Example 2:
Input : 343319020720230203001352969860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/PT/FT/PY/9860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/OA/USD12321333,23/BI/816467004/BN/BNK COUNTRY/ bANK NAME
Output: 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363
Solved! Go to Solution.
You can use regular expressions to extract the invoice numbers that start with 9 from the input strings. Here's a Python code snippet that demonstrates how you can do this:
import re
def extract_invoice_numbers(input_line):
# Use regex pattern to match all numbers starting with 9 that are 10 digits long
pattern = r'\b9\d{9}\b'
invoice_numbers = re.findall(pattern, input_line)
return ' '.join(invoice_numbers)
# Example usage
input_line1 = "343319249320230203001340619062958368/PT/FT/PY/9062958368/OA/USD326306,10/BI/816467004/BN/ABC Ltd. /Country/OB/3582088658001/OB1/Bank Name (Bank Country)B3/FLOOR 32, STANDARD CHARTERED BANKBU/OB4/ILDING 4-4A DES VOEUX ROAD/BO..."
output1 = extract_invoice_numbers(input_line1)
print(output1) # Output: 9062958368
input_line2 = "343319020720230203001352969860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/PT/FT/PY/9860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/OA/USD12321333,23/BI/816467004/BN/BNK COUNTRY/ bANK NAME"
output2 = extract_invoice_numbers(input_line2)
print(output2) # Output: 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363
Here, the regular expression pattern \b9\d{9}\b is used to match numbers that are 10 digits long and start with the number 9. You can modify the pattern if the invoice numbers have different lengths or other specific characteristics.
Hi @sparksun, Any way to get the multiple invoice numbers into 1 line separated by a comma?
Summarize tool is easy to make it.
@sparksun , Thanks but anyway to do it if I have to process over 100 statement lines?