RegEx match open tags except XHTML self-contained tags

I need to match all of these opening tags:

<p>
<a href="foo">

But not these:

<br />
<hr class="foo" />

I came up with this and wanted to make sure I’ve got it right. I am only capturing the a-z.

<([a-z]+) *[^/]*?>

I believe it says:

  • Find a less-than, then
  • Find (and capture) a-z one or more times, then
  • Find zero or more spaces, then
  • Find any character except /, zero or more times, lazily (the ? after * makes it non-greedy), then
  • Find a greater-than
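A quick way to check that reading is to run the pattern over the sample tags. Below is a minimal sketch in Python; the choice of language, the helper name tag_name, and the last sample (an attribute value containing a slash) are assumptions added for illustration, and the samples use foo as the attribute value.

```python
import re

# The pattern from the question, unchanged.
OPEN_TAG = re.compile(r'<([a-z]+) *[^/]*?>')

def tag_name(text):
    """Return the captured tag name if the pattern matches anywhere in text, else None."""
    m = OPEN_TAG.search(text)
    return m.group(1) if m else None

samples = [
    '<p>',                 # should match
    '<a href="foo">',      # should match
    '<br />',              # should not match
    '<hr class="foo" />',  # should not match
    '<a href="a/b">',      # hypothetical extra case: slash inside an attribute value
]

for s in samples:
    print(s, '->', tag_name(s))
```

On these inputs the pattern captures p and a for the first two samples and finds nothing for the self-closing ones. It also finds nothing for the last sample, because [^/] refuses any slash before the closing >, including one inside a quoted attribute value.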

Do I have that right? And more importantly, what do you think?
