Wednesday, February 13, 2013

"We're listing Starboard Cap'n:" a look at using lists to control data integrity.

introduction

We live in an age where any individual with the right tool can forge credentials through a facebook token and gain access to their target's banking information. Unlike the days of old, when one had to be familiar with a specific technology or programming language to mount an attack, today a simple Google search turns up any number of tools that enable the novice (aka Script Kiddie) to mount a precision attack against almost any site. The worst part of this is that

the tools are free and open source.

While most of this is no surprise to any forward-thinking InfoSec analyst, one trend we are seeing more of is User Generated content: people being able to add their two cents to any page on the web. Wikipedia, facebook, twitter, blogs, news pages with comments, youtube, instagram, pinterest; all fundamentally use input from users to guide, drive, and route information. One alarming note, however, is the general lack of consistent standards for field validation within sites themselves, let alone across multiple sites and corporations. If one pauses to consider the kinds of internal turmoil that many web properties go through, one can quickly understand HOW this could happen. But it isn't until you go farther down, to the individual developer, page, and field, that it starts becoming crystal clear.

a quick note on standards

When a developer starts designing a page for a project, unless there are standards driving that development, it is bound to look like a patchwork quilt. One of the most dangerous areas to be that spotty is security. With that much inconsistency between developers within an organization, it is no wonder that, looking across web and application development overall, one sees vast differences even in code meant to do the same thing. The creativity of the developer is not the problem; the problem is developing blindly, without understanding the code base one is pushing a project into. From there, one has to understand the most likely attacker profile. If one can understand that, one can greatly reduce the number of vectors through which an attacker is likely to gain access.

White, Black, and Red listing

White and black lists are nothing new. The idea of white-listing a server to keep untrusted traffic out is a good concept. However, there is a huge hole in that concept: what happens if the traffic is NOT malicious, but simply not on the white-list?

Let's take the example of Minecraft. In Minecraft multi-player, servers are set up by various organizations, and white-listing is an out-of-the-box feature of the multi-player server. Administrators use white-lists to keep players who are not trusted or well known off the server. For servers that don't use white-listing, there are other systems that monitor and record actions per player, so that one does not have to comb the server logs to find out whether players have been up to inappropriate activity. When a player does something that is deemed "inappropriate use" of the world in Minecraft, they are warned, booted from the server, or banned. If white-listing is the list of acceptable players and banning is a black-list, then booting and warning players would be a separate list altogether.

Let's bring that back to the world of field validation that we discussed briefly with regard to User Generated content. If one implements a white-list of acceptable values for a field, one must also add a black-list to define what IS NOT appropriate. The black-list is arguably more important than the white-list because it defines what is absolutely not allowed in the field.
Ultimately white-listing is no different from or better than black-listing because it is impossible for either humans or computer systems to distinguish good software from bad software.
-Simon Crosby, February 2013, taken from his blog

The difference posed here is in knowing your systems and the information most likely to be targeted. If one can understand that, one can build better white and black lists.
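As a rough sketch of that idea (my own illustration, not from either quote), here is what a combined white-list and black-list check might look like for a hypothetical user-generated "display name" field; the field, the pattern, and the black-listed fragments are all assumptions made for the sake of the example.

    import re

    # White-list: the only shape of input we accept at all for this field.
    # Hypothetical rule: 1-30 characters of letters, digits, spaces,
    # underscores, or hyphens.
    WHITELIST = re.compile(r"[A-Za-z0-9 _-]{1,30}")

    # Black-list: fragments that are never allowed, even if a looser
    # white-list on some other field would let them through.
    BLACKLIST = ("<script", "javascript:", "--", "drop table")

    def validate_display_name(value: str) -> bool:
        """Accept the value only if it matches the white-list pattern
        and contains none of the black-listed fragments."""
        if not WHITELIST.fullmatch(value):
            return False
        lowered = value.lower()
        return not any(bad in lowered for bad in BLACKLIST)

The ordering reflects the layering described above: the white-list narrows the field to an expected shape first, and the black-list then catches anything that is expressly forbidden.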
In order to validate the integrity of the input we need to ensure it matches the pattern we expect. Blacklists looking for patterns such as we injected earlier on are hard work both because the list of potentially malicious input is huge and because it changes as new exploit techniques are discovered. Validating all input against whitelists is both far more secure and much easier to implement. In the case above, we only expected a positive integer and anything outside that pattern should have been immediate cause for concern. Fortunately this is a simple pattern that can be easily validated against a regular expression.
-Troy Hunt, May 2010, taken from his website
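To make the quoted point concrete, here is a minimal sketch (mine, not Troy Hunt's code) of white-list validation for a field that should only ever contain a positive integer, such as a hypothetical record id parameter:

    import re

    # White-list pattern: a positive integer and nothing else.
    # Anything outside this shape is rejected outright.
    POSITIVE_INTEGER = re.compile(r"[1-9][0-9]*")

    def validate_record_id(raw_value: str) -> int:
        """Return the value as an int if it matches the expected
        pattern; otherwise raise ValueError."""
        if not POSITIVE_INTEGER.fullmatch(raw_value):
            raise ValueError("rejected input: not a positive integer")
        return int(raw_value)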

the red-list

The next concept I would like to put out there is the idea of using another list, what I call the "Red List." Red-listing is for input that is not necessarily malicious, but is suspect because of the context, type, or time at which the information is presented. An example can be found back with the Minecraft players. Let's say a player is white-listed and playing on the server, but he starts gathering the specific materials needed to make and use TNT (sand, gunpowder), which is not allowed. Because the player is not yet doing anything inappropriate on the server, he will draw no attention from the server moderators. However, he would be Red-listed: flagged for closer monitoring while still being allowed to continue. The server moderators may then either pro-actively intervene (by asking the player what they are doing) or wait until the player has created the forbidden item and warn or boot him. Red-listing is nothing more than a clearing house for content that could be malicious, but is not overtly so. Just a thought.
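To sketch how a red-list might sit alongside the other two lists (again, my own hypothetical example, with made-up "suspect" patterns), the idea is that input which passes the white- and black-list checks is still accepted, but anything matching a red-list pattern is flagged for a moderator or analyst to review:

    import logging
    import re

    logger = logging.getLogger("redlist")

    # Red-list: patterns that are allowed through, but are suspicious
    # enough in context to be worth flagging for human review.
    REDLIST = (
        re.compile(r"\bselect\b.+\bfrom\b", re.IGNORECASE),  # SQL-like phrasing
        re.compile(r"https?://", re.IGNORECASE),             # unexpected links
    )

    def screen_comment(value: str) -> str:
        """Accept the comment, but log it for review if it matches any
        red-list pattern. Outright rejection is left to the white- and
        black-list checks that run before this one."""
        for pattern in REDLIST:
            if pattern.search(value):
                logger.warning("red-listed input flagged for review: %r", value)
                break
        return value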