Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Extract 10 digit value from a string seperated by "/"

Ahmediqbal
7 - Meteor

Hi there, I am trying to extract the invoice number from a bank statement line using the regex tool but cannot figure it out. The invoice number generally starts with 9, and one statement line can have multiple invoice numbers. 

Example 1
Input: 343319249320230203001340619062958368/PT/FT/PY/9062958368/OA/USD326306,10/BI/816467004/BN/ABC Ltd. /Country/OB/3582088658001/OB1/Bank Name (Bank Country)B3/FLOOR 32, STANDARD CHARTERED BANKBU/OB4/ILDING 4-4A DES VOEUX ROAD/BO...

Output: 9062958368

Example 2: 
Input : 343319020720230203001352969860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/PT/FT/PY/9860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/OA/USD12321333,23/BI/816467004/BN/BNK COUNTRY/ bANK NAME

Output: 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363

5 REPLIES 5
sparksun
11 - Bolide

regex.jpg

emma_Wilson
6 - Meteoroid

You can use regular expressions to extract the invoice numbers that start with 9 from the input strings. Here's a Python code snippet that demonstrates how you can do this:

import re

def extract_invoice_numbers(input_line):
    # Use regex pattern to match all numbers starting with 9 that are 10 digits long
    pattern = r'\b9\d{9}\b'
    invoice_numbers = re.findall(pattern, input_line)
    return ' '.join(invoice_numbers)

# Example usage
input_line1 = "343319249320230203001340619062958368/PT/FT/PY/9062958368/OA/USD326306,10/BI/816467004/BN/ABC Ltd. /Country/OB/3582088658001/OB1/Bank Name (Bank Country)B3/FLOOR 32, STANDARD CHARTERED BANKBU/OB4/ILDING 4-4A DES VOEUX ROAD/BO..."
output1 = extract_invoice_numbers(input_line1)
print(output1)  # Output: 9062958368

input_line2 = "343319020720230203001352969860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/PT/FT/PY/9860868418 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363/OA/USD12321333,23/BI/816467004/BN/BNK COUNTRY/ bANK NAME"
output2 = extract_invoice_numbers(input_line2)
print(output2)  # Output: 9062990141 9062990142 9860868760 9070853746 9070853747 9070853748 9860868677 9860868678 9860868679 9860866363

 

Here, the regular expression pattern \b9\d{9}\b is used to match numbers that are 10 digits long and start with the number 9. You can modify the pattern if the invoice numbers have different lengths or other specific characteristics.

Ahmediqbal
7 - Meteor

Hi @sparksun, Any way to get the multiple invoice numbers into 1 line separated by a comma?

sparksun
11 - Bolide

Summarize tool is easy to make it.

extract.jpg

Ahmediqbal
7 - Meteor

@sparksun , Thanks but anyway to do it if I have to process over 100 statement lines?

Labels