Actually, the best way to do this is probably to use the "word boundary" escape sequence, \b.
For each noise word in your array, put a \b on each side of the word so that it only matches as a whole word:
the => \bthe\b
Mind you, it's been a while since I've used PHP, but that should work. Read
RegExp Pattern Syntax for more information.
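
Here is a rough sketch of the word-boundary idea in PHP, in case it helps. The $noiseWords list and the $text string are made up for illustration and are not from your code; the only real functions used are preg_quote, preg_replace, array_map and trim. Treat it as a starting point rather than a tested solution for your data.

```php
<?php
// Hypothetical noise-word list and sample input (assumptions, not from the question).
$noiseWords = ['the', 'a', 'of'];
$text = 'The theme of the other day was a theory.';

// Build one pattern per word: \bWORD\b, case-insensitive.
// Single quotes keep the backslash in \b literal; preg_quote escapes
// any regex metacharacters that might appear in a word.
$patterns = array_map(
    function ($word) {
        return '/\b' . preg_quote($word, '/') . '\b/i';
    },
    $noiseWords
);

// Remove whole-word matches only, then collapse the leftover whitespace.
$stripped = preg_replace($patterns, '', $text);
$stripped = trim(preg_replace('/\s+/', ' ', $stripped));

echo $stripped, "\n"; // theme other day was theory.
```

Note that "theme", "other" and "theory" are left alone even though they contain "the", which is the whole point of the \b on each side.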