Removing Inline Styles
CAVEAT: ALWAYS MAKE A BACKUP BEFORE YOU DO THIS KIND OF STUFF.
Problem: I had a site with 1200 articles, and every one of them had hard-coded inline styles. Fonts, font sizes, colors, underlines, you name it. It was a 6-year-old site with multiple contributors, all using every button available in the text editor.
Of course, I wanted the articles to use the styles in my CSS file, but inline styles override style sheets. What to do? I tweeted Peter Van Westen (@NoNumber_nl) of NoNumber and asked whether his excellent tool, dbReplacer, could find and replace all of the styles. He kindly responded with the code below.
Hack:
With regular expressions, use:
\s*style="[^"]*"
or
\s*style=".*?"
(The leading \s* also grabs the extra whitespace before the style attribute.)
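For anyone who wants to try the pattern on a sample string before running it against a live database, here is a minimal Python sketch. It is not how dbReplacer works internally; the function name strip_inline_styles and the sample markup are made up for illustration, and only the regular expression comes from the tip above.

```python
import re

# The first pattern from the tip: optional leading whitespace,
# then a double-quoted style attribute.
STYLE_ATTR = re.compile(r'\s*style="[^"]*"')

def strip_inline_styles(html: str) -> str:
    """Return html with every double-quoted style="..." attribute removed.

    Hypothetical helper for illustration only.
    """
    return STYLE_ATTR.sub("", html)

# Made-up sample article fragment.
sample = (
    '<p style="font-family: Arial; color: red;">'
    'Hello <em style="font-size: 14px">world</em></p>'
)

print(strip_inline_styles(sample))
# prints: <p>Hello <em>world</em></p>
```

Note that this pattern only matches attributes written with double quotes, so anything written as style='...' would be left behind.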
If memory serves, I may have been left with some empty style=" " attributes, but I simply ran dbReplacer again to remove all of those. That left all of the <em>, <strong>, <p>, and <br> tags in place, but no fonts or sizes.
Since taxonomy is important to Search Engine Optimization, I also used dbReplacer to find all of the misused <H1> tags and replace them with a more appropriate tag.
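The same idea can be sketched for the heading cleanup. This is only an illustration under assumptions: the post does not say which tag replaced the <H1> tags, so the default of h2 here is a guess, and demote_h1 is a hypothetical helper, not part of dbReplacer.

```python
import re

# Matches an opening or closing h1 tag in any letter case,
# capturing the optional slash and any attributes.
H1_TAG = re.compile(r'<(/?)h1\b([^>]*)>', re.IGNORECASE)

def demote_h1(html: str, new_tag: str = "h2") -> str:
    """Swap every <h1> / </h1> for new_tag, keeping attributes.

    Hypothetical helper; "h2" is an assumed replacement tag.
    """
    return H1_TAG.sub(
        lambda m: f"<{m.group(1)}{new_tag}{m.group(2)}>", html
    )

print(demote_h1('<H1 class="t">Title</H1>'))
# prints: <h2 class="t">Title</h2>
```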
P.S. I'm also using his Sourcerer plugin to add source code to my articles.