Regular Expression to remove HTML tables

Just figured out this regular expression to remove all table tags from an HTML document (the text inside the cells stays in place):

</?table[^>]*>|</?tr[^>]*>|</?td[^>]*>|</?thead[^>]*>|</?tbody[^>]*>

Extremely useful for cleaning up prehistoric mark-up with a text editor that supports regular expression find-and-replace searches.
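If you would rather script the clean-up than do it in an editor, here is a minimal sketch in Python using the standard `re` module. The function name `strip_table_tags` and the sample string are made up for illustration, and the case-insensitive flag is my own assumption (old mark-up often uses upper-case tags like `<TABLE>`). Note that only the tags are deleted, so cell text ends up run together.

```python
import re

# The pattern from the post. re.IGNORECASE is an assumption added here
# so that upper-case tags such as <TABLE> or <TD> are matched as well.
TABLE_TAGS = re.compile(
    r"</?table[^>]*>|</?tr[^>]*>|</?td[^>]*>|</?thead[^>]*>|</?tbody[^>]*>",
    re.IGNORECASE,
)

def strip_table_tags(html: str) -> str:
    """Delete table-related tags, leaving the text between them untouched."""
    return TABLE_TAGS.sub("", html)

sample = '<table border="1"><tr><td>Name</td><td>Price</td></tr></table>'
print(strip_table_tags(sample))  # NamePrice
```

Replacing with a space instead of an empty string may be the better choice if you want the cell contents to stay separated.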

And to go all the way, this one removes font tags too:

</?table[^>]*>|</?tr[^>]*>|</?td[^>]*>|</?thead[^>]*>|</?tbody[^>]*>|</?font[^>]*>
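The same idea can be written more compactly by building the pattern from a list of tag names. This is only a sketch: `tag_stripper` and `LEGACY_TAGS` are hypothetical names, and `th` and `tfoot` are additions that the patterns above do not cover. The `\b` is also my own tweak, since a bare `</?tr[^>]*>` would match any tag whose name merely starts with "tr", such as `<track>`.

```python
import re

def tag_stripper(tag_names):
    """Build a regex matching opening and closing tags for the given names."""
    names = "|".join(re.escape(name) for name in tag_names)
    # \b requires the tag name to end here, so "tr" cannot match <track>.
    return re.compile(rf"</?(?:{names})\b[^>]*>", re.IGNORECASE)

# th and tfoot are extra tags, not part of the original pattern.
LEGACY_TAGS = ["table", "thead", "tbody", "tfoot", "tr", "th", "td", "font"]
pattern = tag_stripper(LEGACY_TAGS)

html = ('<table><tr><th>Item</th></tr>'
        '<tr><td><font color="red">Pen</font></td></tr></table>')
print(pattern.sub("", html))  # ItemPen
```

In a text editor the equivalent single pattern would be `</?(?:table|thead|tbody|tfoot|tr|th|td|font)\b[^>]*>`, assuming the editor's regex flavour supports non-capturing groups and `\b`.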

5 Responses to “Regular Expression to remove HTML tables”


  1. E_Jim

    Thanks a lot, you have no idea how much time you’ve saved me!

  2. kjdash

    Pretty good, but missing a few things:


    ]*>|]*>|]*>|]*>|]*>|]*>|]*>|]*>

    would be more complete.
    Thanks for the foundation

  3. kjdash

    ugh, it stripped everything else out.

    I added th and tfoot

  4. llll

    Great code

  5. Mark

    You’re a legend, thanks a lot for this, saved me a fair bit of time!

    cheers.
