I have a field that contains a list of comma separated state abbreviations and alternates (e.g. AZ, CO, DC, LA, MD, MI, NV, NV-WH, NY, NY-ON, RI, TN, WA, WA-MS, WA-SK). Not every record has the same set of abbreviations and they are not in any semblance of order. I want to be able to filter on certain states without flagging the alternates. E.G., I was trying to use a filter with Contains([State],"NY") but I want to make sure that I'm not getting records that contain only NY-ON without NY.
The desired result would include this logic:
State contains "NY" but not "NY-ON" = True
State contains "NY" and "NY-ON" = True
State contains "NY-ON" but not "NY" = False
I tried adding a comma and using Contains([State],"NY,") but that would rule out any record where NY was the only entry or at the end of the list.
I'd appreciate any guidance.
Solved! Go to Solution.
@EEnglish2 Use = operator instead of contains [State]= "NY"
Are you suggesting that I say [State]="NY"? That won't work because the [State] field is not necessarily an individual state abbreviation. It might look like my original example because it is a comma separated list of abbreviations. So, it is incredibly rare that the field would only contain "NY". I want to make sure I capture any record where NY might be in the list of states.
Perhaps if you provide us a sample dataset close to the original along with your desired output, we can better help brainstorm. At the moment from what you described and showed, I’d wager @binuacs ’ way is the one. But since you gave more context, it changes everything.
Based on this sample set of data, I would expect that the following records would be returned as they have NY in the States field:
Douglas Adams
Daniel Brown
Nora Roberts
Bradley Meltzer
Kevin Smith
Harrison Ford
William Murray
Isla Fisher
Elmer Fudd
Samuel Yosemite
Charles Norris
James Rhodes
Pepper Potts
Steven Rogers
Sherman Peabody
Chumley Walrus
Dudley Doright
I would expect the following records to be excluded, as they have only NY-ON in the States field, with no NY present:
Scott Card
Susan Grafton
Mark Hamill
Jessica Rabbit
Bart Conner
Anthony Stark
Nicholas Fury
IN.
'NY'- IN('NY','OH') - true
'OH' - IN('NY',OH') - true
'NY-ON' - IN('NY','OH') - false
this is all assuming you have a more substantive matching mechanism than just ny - otherwise you can search for ny at the start, ny at the end and ,ny, via regex_match([states],"^ny,.*|.*,ny,.*|.*ny$")
but assuming this is more complicated you can:
1) add a recordid
2) use text to columns - split states into rows using comma
create a second text input with your match terms.
join. match terms -> state
use select tool to drop fields from the Join tool except recordid.
join recordid-> recordid before text to columns tool.
Thank you for your response. I believe it's as simple and straightforward as it appears and doesn't require anything more substantive. It's just picking records that contain a specific value. I was hoping to just use the "Contains" function, which works in identifying any state that does not have a variant and the variants themselves. Your RegEx approach looks promising, since it includes the comma and looks at the string in 3 different ways. I'll give that a try.
@EEnglish2
There is a simple solution for it. With a Formula tool create a Flag field and just populate there the 3 conditions that you mentioned about. You will have True and False in that field for each of the rows. Connect it to a Filter tool and filter it based on the Flag Field, on True or False as per what you need. So you do not need to focus on the states but on the condition if it is true or false.
User | Count |
---|---|
107 | |
82 | |
70 | |
54 | |
40 |