Speaker1: HELLO MY NAME IS MONA HOW MAY I HELP YOU Speaker2: YES I NEED TO CHECK THE STATUS ON AN ORDER I WAS TOLD THAT IT WAS SUPPOSED TO BE SHIPPED OUT MONDAY AND I WAS SUPPOSED TO RECIEVE IT YESTERDAY AND I DID NOT RECIEVE IT Speaker1: OKAY ONE MOMENT AND ILL BE GLAD TO CHECK THAT STATUS
I will tidy up the output afterwards to make it easier to read, but what I really want is to group the words in the string by speaker without breaking it into more columns. I also need this workflow to convert more than one transcript at a time. I think this can be done with regex, but I have not figured out a solution yet.
I would maybe do something like this. It's not very clean, but it works. Use RegEx to Tokenize your big long field and split it into rows, with "\w+" as the expression to match words (this is where someone could be much more clever).
Then I used a Multi-Row Formula to work out which speaker was speaking on each row, and filtered out the useless lines.
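For anyone who wants to see the logic outside of Alteryx, here is a rough Python sketch of the same idea: tokenize with \w+, carry the most recent speaker label forward (the job the Multi-Row Formula does), and drop the label rows. The function name tokenize_with_speaker and the shortened sample string are made up for illustration, and the sketch assumes the labels always look like "Speaker" followed by a number.

```python
import re

def tokenize_with_speaker(transcript):
    """Return (speaker, word) rows, one per spoken word."""
    rows = []
    current_speaker = None  # carried forward, like a Multi-Row Formula
    for token in re.findall(r"\w+", transcript):
        # Assumption: speaker labels look like "Speaker1", "Speaker2", ...
        if re.fullmatch(r"Speaker\d+", token):
            current_speaker = token
            continue  # the label row itself is one of the "useless lines"
        rows.append((current_speaker, token))
    return rows

# Shortened, made-up sample in the same shape as the transcript above
sample = ("Speaker1: HELLO MY NAME IS MONA "
          "Speaker2: YES I NEED HELP "
          "Speaker1: OKAY ONE MOMENT")
rows = tokenize_with_speaker(sample)
for speaker, word in rows:
    print(speaker, word)
```

Note that \w+ also drops the colon after each label, which is why no separate clean-up step is needed here.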
A touch of RegEx to parse the data into rows, using the space as the delimiter (you can easily do this with Text to Columns too, using \s as the delimiter), then a touch of Formulas to clean the data (remove the number and : with Regex_Replace, and create a speaker column). Assign a row number with Multi Row Formula, effectively marking which sentence each word belongs to, and build the sentences back with a Summarize.
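The same steps can be sketched in Python, in case it helps to see them written out: split on whitespace, clean the speaker label, number each turn, and join the words back together. This is only an approximation of the workflow, not a copy of it. The names group_by_turn and transcripts are hypothetical, the sample strings are shortened, and the clean-up here strips only the trailing colon so the speaker number stays available for the speaker column.

```python
import re

# Assumption: a speaker label is "Speaker", digits, then a colon
SPEAKER_LABEL = re.compile(r"^Speaker\d+:$")

def group_by_turn(transcript):
    """Return (turn_number, speaker, sentence) tuples for one transcript."""
    turns = []
    for token in transcript.split():  # split on whitespace, like \s
        if SPEAKER_LABEL.match(token):
            speaker = re.sub(r":$", "", token)  # clean the label
            # A new label starts a new turn (the "row number" step)
            turns.append({"turn": len(turns) + 1,
                          "speaker": speaker,
                          "words": []})
        elif turns:
            turns[-1]["words"].append(token)
    # Concatenate the words per turn, like a Summarize with Concatenate
    return [(t["turn"], t["speaker"], " ".join(t["words"])) for t in turns]

# Several transcripts at once, keyed by a made-up identifier
transcripts = {
    "call_a": "Speaker1: HELLO MY NAME IS MONA Speaker2: YES I NEED HELP",
    "call_b": "Speaker1: GOOD MORNING Speaker2: HI Speaker1: HOW MAY I HELP",
}
results = {name: group_by_turn(text) for name, text in transcripts.items()}
for name, turns in results.items():
    for turn in turns:
        print(name, turn)
```

Keeping a transcript identifier next to each turn is what lets more than one transcript go through in a single pass. In the workflow, the equivalent would be grouping by a record ID in both the Multi Row Formula and the Summarize.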