Adding Exclusions to Storm Topology Processing
The Shop now has a group that's continually hitting the different apps and APIs for application security testing and, as a consequence, we're getting a lot of messages that have completely bogus information in them, like a country field with the value " SLEEP(10) " - as if there's a way to hack the system.
This all makes sense, and I can certainly respect their job, but it does mean that someone needs to filter these out, or we are all going to be dealing with partially corrupted data, and getting wrong answers from it. That someone turned out to be me.
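The test wasn't all that hard - we're just looking for a few characters in the field that would be strictly no-good, and then excluding the entire message based on that. The test is really a very simple predicate, a Clojure set, which can be called as a function on each character:
;; build a set of characters from the string; a bare string can't be called as a predicate
(def bad-chars (set "()%\"'"))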
and then it's used very simply:
(if (not-any? bad-chars (:field msg))
  msg)  ; placeholder body: the field is clean, so keep processing the message
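and we can also use the some function for the opposite logic, if needed: called with the same set, it returns the first bad character it finds in the field, or nil if there are none.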
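To make the two forms concrete, here's a minimal sketch of how the check might be wrapped up and applied to a batch of messages. The helper names (bad-value?, clean-msg?) and the :country and :city keys are made up for illustration and aren't from the actual topology; it also assumes bad-chars is a real set of characters.

```clojure
;; Characters that should never show up in a legitimate field value.
(def bad-chars (set "()%\"'"))

(defn bad-value?
  "Returns the first offending character in s, or nil if s is clean.
  Hypothetical helper, not part of the original topology."
  [s]
  ;; (str s) lets nil and non-string values pass through safely
  (some bad-chars (str s)))

(defn clean-msg?
  "True when none of the named fields in msg contains a bad character.
  Hypothetical helper; the field keys are supplied by the caller."
  [msg fields]
  (not-any? bad-value? (map #(get msg %) fields)))

;; Usage: keep only the messages whose fields are all clean.
(def msgs [{:country "US" :city "Chicago"}
           {:country " SLEEP(10) " :city "Chicago"}])

(filter #(clean-msg? % [:country :city]) msgs)
;; => ({:country "US", :city "Chicago"})
```

One thing to keep in mind with a character list like this: a legitimate value with an apostrophe in it would be excluded as well, so which fields get checked matters.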
There was an additional test - looking at the userAgent to see if it was one of the security group's tests - again, pretty simple, and not too hard to add.
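The userAgent check could look something like the sketch below. The marker strings are placeholders, since the actual user agents the security group uses aren't listed here, and the :userAgent key and the security-test? name are assumptions as well.

```clojure
(require '[clojure.string :as string])

;; ASSUMPTION: placeholder substrings, not the real scanner user agents.
(def scanner-agents ["example-scanner" "sec-test-bot"])

(defn security-test?
  "True when the message's userAgent contains one of the known scanner markers.
  Hypothetical helper; comparison is case-insensitive."
  [msg]
  (let [ua (string/lower-case (str (:userAgent msg)))]
    (boolean (some #(string/includes? ua %) scanner-agents))))

;; Usage: drop anything that came from a security scan.
(remove security-test?
        [{:userAgent "Mozilla/5.0"}
         {:userAgent "Example-Scanner/1.2"}])
;; => ({:userAgent "Mozilla/5.0"})
```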