Did more regex practice. I wanted to get everything done in a single regular expression, but I'm struggling to make wildcards match across multiple lines of HTML. I managed a multi-column parse in one case, but I clearly have further to go.
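On the multi-line wildcard problem, here is a minimal sketch, assuming Python's `re` module (the note doesn't say which regex engine was used). By default `.` does not match a newline, and the `re.DOTALL` flag changes that. The HTML snippet and the `doctor` class name are made up for illustration.

```python
import re

# Hypothetical markup; the real page structure is not shown in the note.
html = """<div class="doctor">
  <h3>Dr. A</h3>
  <h4>Cardiology</h4>
</div>
<div class="doctor">
  <h3>Dr. B</h3>
</div>"""

# Without DOTALL, "." stops at each newline, so no block is matched.
no_flag = re.findall(r"<div.*?</div>", html)

# With DOTALL, "." also matches "\n". The lazy ".*?" stops at the first
# closing tag, so each block is matched separately rather than as one span.
with_flag = re.findall(r"<div.*?</div>", html, flags=re.DOTALL)

print(len(no_flag), len(with_flag))
```

A greedy `.*` with `re.DOTALL` would instead run from the first `<div` to the last `</div>`, which is a common reason a single-expression parse swallows several records at once.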
Used a tokenizing approach to solve the question. You need to be careful, though: some doctors have only a name in their information, so their sections contain no <h4> or <h5>.
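One way to handle the missing tags is sketched below, assuming Python and made-up markup. The `doctor` class, the `<h3>` name tag, and the helper `first_or_none` are all hypothetical, not taken from the exercise. The idea is to split the page into one chunk per doctor first, then look for each field separately so that an absent <h4> or <h5> yields `None` instead of breaking the match.

```python
import re

# Hypothetical markup: the second doctor has a name only.
html = """
<div class="doctor"><h3>Dr. A</h3>
<h4>Cardiology</h4>
<h5>Room 12</h5></div>
<div class="doctor"><h3>Dr. B</h3></div>
"""

def first_or_none(pattern, text):
    """Return the first captured group, stripped, or None if there is no match."""
    m = re.search(pattern, text, flags=re.DOTALL)
    return m.group(1).strip() if m else None

# Step 1: tokenize the page into one section per doctor.
sections = re.findall(r'<div class="doctor">(.*?)</div>', html, flags=re.DOTALL)

# Step 2: pull each field out of its own section; missing tags become None.
records = [
    {
        "name": first_or_none(r"<h3>(.*?)</h3>", s),
        "h4": first_or_none(r"<h4>(.*?)</h4>", s),
        "h5": first_or_none(r"<h5>(.*?)</h5>", s),
    }
    for s in sections
]

for r in records:
    print(r)
```

Because each field is searched within its own section, a doctor with no <h4> cannot accidentally pick up the <h4> of the next doctor, which is a risk when one expression spans the whole page.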