I have extracted a dataset from a PDF, which leaves everything in the first cell of each line. The data can easily be delimited with the Text to Columns function, using a space as the delimiter. However, in the middle of the dataset we have the company name, followed by further numbers. When I delimit using a space, it also splits up the company name, so the numbers become misaligned. Is it possible to use the RegEx tool to parse a word separately from a number to prevent this from happening? Example below. Thanks!
Thanks for replying so quickly. Unfortunately this has created several new columns, but they are all [Null]. Some of the numbers are in the thousands and contain a comma (e.g. 1,924). Could this affect the regex formula? Or does the field perhaps need to be in a certain data format to be processed?
Yes - having other characters in the numeric values could impact how this script works.
Here's an example that also treats commas and decimal points as part of numbers (as long as they are not within a company name).
The only change between this and the script I suggested earlier is the addition of ",\." before and after the company name parse. The "\." tells RegEx to look for a literal decimal point, since an unescaped "." is a RegEx special character.
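For anyone who wants to try the idea outside of the RegEx tool, here is a minimal Python sketch of the same approach. It is not the expression from the workflow above: the pattern, the `split_line` name and the sample line are all assumptions made for illustration. It treats any run of digits, commas and decimal points as a number, and keeps everything between the leading and trailing numbers as the company name.

```python
import re

# Assumed definition of a "number": a digit followed by any mix of
# digits, commas and decimal points (e.g. 1,924 or 35.5).
NUM = r"\d[\d,\.]*"

# Leading numbers, then the company name (as short as possible),
# then trailing numbers up to the end of the line.
LINE = re.compile(rf"^((?:{NUM}\s+)*)(.+?)((?:\s+{NUM})*)$")

def split_line(line):
    """Return (leading_numbers, company_name, trailing_numbers) or None."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    return m.group(1).split(), m.group(2), m.group(3).split()

# Hypothetical sample row, not taken from the original dataset.
print(split_line("101 Acme Widgets Ltd 1,924 35.5"))
```

Because a number must be followed by whitespace to count as a leading value, a name that starts with a digit (such as "3M Company") should stay in one piece in this sketch.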
Thanks @Claje, with a couple of extra find/replaces and formulas to remove some punctuation it's pretty much working. Only thing left is to stop the word picking up the brackets of negative numbers that follow the company name, but hopefully I can fix this with something else!
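One possible way to handle the bracketed negatives, sketched in Python rather than in the RegEx tool: allow an optional opening and closing bracket as part of the number pattern, so the brackets stay with the number instead of the company name. The pattern, the function names and the sample line below are assumptions for illustration only, and the sketch assumes the numbers come after the name.

```python
import re

# Assumed number pattern: optional "(" and ")" around digits,
# commas and decimal points, e.g. (1,924) or 35.5.
NUM = r"\(?\d[\d,\.]*\)?"

# Company name (as short as possible), then one or more trailing numbers.
LINE = re.compile(rf"^(.+?)((?:\s+{NUM})+)$")

def to_number(token):
    """Convert '1,924' to 1924.0 and '(1,924)' to -1924.0."""
    negative = token.startswith("(") and token.endswith(")")
    value = float(token.strip("()").replace(",", ""))
    return -value if negative else value

def parse_trailing(line):
    """Return (company_name, [numbers]) or None if the line does not fit."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    return m.group(1), [to_number(t) for t in m.group(2).split()]

# Hypothetical sample row, not taken from the original dataset.
print(parse_trailing("Acme Widgets Ltd (1,924) 35.5"))
```

A company name that itself ends in a bracketed number would be split incorrectly by this pattern, so it is worth checking a few rows by eye.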