Posts tagged ‘security’

Do We Need Longer Passwords?

I noticed an article today on CNN called  How to create a ‘super password’. For those of you who don’t feel like reading the article, it is basically about how researchers at Georgia Tech are able to brute force 8-character passwords in less than a couple of hours, using “clusters of graphics cards”. This is nothing new; graphics cards are being used for all sorts of applications other than graphics. What I think is more interesting is the conclusion the writer draws from the fact that cracking an 8-character password is a simple enough task, provided you have the hardware and the ability to program it. That conclusion being we should “Say goodbye to those wimpy, eight-letter passwords”.

Before I comment on the article I’d like to talk about what goes into cracking a password. The article refers to the brute force method, but fails to mention that in order to carry out such an attack you must already have one piece of information: the password hash. While it is technically possible to carry out a brute force attack without this piece of information, it would mean every time you generate a password you would have to test it using the target platform itself. For example, if you were trying to obtain a users password to their personal computer, your password generating program would have to attempt to login with every password it generated. Even if it weren’t the case that most applications are smart enough to lock a user out after a certain number of attempts, using this method would greatly diminish the number of passwords you could generate in a reasonable amount of time. If you consider attempting to obtain a user’s password to a website it becomes entirely impossible to use this method.

So what is a password hash and how do you obtain it? Most applications do not store passwords in their raw state. Instead they use the password to generate a hash and store that instead. Rather than explain what a hash is myself I’ll let Wikipedia do the work. Obtaining a hash might be as simple as knowing where to look when it comes to desktop applications. For web applications however, there is a bit more to it. When you register on a website the server generates a hash and stores it. It never even touches the users computer. This means that in order for an attacker to have the hash corresponding to your password they must have already compromised that site. This of course does happen; if it didn’t I wouldn’t be writing this.

Once an attacker has the hash they can begin their brute force attack. Now every time a password is generated it is hashed using the same method as the hash obtained from the server. As the article mentions, if your password is 8 characters long (even if you use non-alphanumeric  characters) it will take only a matter of hours with the right hardware to find a match. Simple right? Maybe not. There are a couple of assumptions here that may be the key to whether or not it is necessary to have longer passwords: Does the attacker know the method used to generate the hash and how quickly can a hash be generated using this method.

An attacker cannot necessarily determine the method used to generate a hash from the hash string alone. Seeing some example hashes might help understand why. This is a (very non-comprehensive) list of some hash functions and example outputs (all of them were given the input ‘tinsology’):

Algorithm length (hex digits) Output
crc32 8 ba86b1ef
tiger128 32 13b7a126b69ef08f71a7c3a8ff6cf55b
haval128 32 14e530346f17fb9b2ec292a6d6cb6461
md5 32 bb58ec46cc9049167ab394f131773dde
tiger160 40 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a22
haval160 40 deca5c4ff24586119f80082ea67b3594740cb563
sha1 40 ddb48f4802ac20502d644ad0ef59e3b984f61e05
tiger192 48 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a2224cf7782
haval192 48 d4ff8cd2f8bb7696594db27dd13583bdf9a15899a6955a5f
sha224 56 e1d14af535bbe9991ff87f29355a5c5690ea22a184042419f2420434
haval224 56 36b218f09b72fbaa3171e9e6084a540a4f2d4a5bb71859801071ae71
haval256 64 eb1a125d894b136fa7d125ff23ebbd504dacb333c188815d1e5bd215763aafa3
snefru256 64 ed79e0ccdb0a8064c9b38b80057f37221cbd9730f635649775bd31083d7656dd
sha256 64 3041a80756a26c887db6a5ec5083f0247ce9cce357b498238cab755f0b13e285
ripemd320 80 6d41b9e88392fef66e07874c7ead16052992651d0737654f4b8568758ce2cf775a04f01b4ae81815
sha384 96 7cbc44cc4a0c017ec481ec46f306672d36e290241dcdb81dd61a3a442296d7f1e032ba827bbd1f46b4e1da058f3243fb
sha512 128 83ffea274400557ad24d2a6b50a28fef5ce42b37730a574d4d96b3f8bb96db3f51add41b77ddd656c031bc470c1e8c3ce27582be1c04d7e785f15d42c9d284be
whirlpool 128 4243b927f617890f8fe2d376c28d87d7836f4d71567b786d869375ec1b5f355cb600eed1b920744cbb529325e37d92aa753b88b9c790b143db3061b40e33ffdc
salsa20 128 6c0826f7f717b8e7034302d3991034a109cbdcee47f5a4d6666320a42b2d0a034dcf11666078710a9cf9ddaab7d0ed6f0898b83c62a607858bb4fc8a55589415

It is important to note that not all of the above algorithms are meant for or should be used for password hashing. I’m looking at you md5.

As you can see from the list length alone is not a distinguishing factor. However, knowledge that a particular hash function is more commonly used than others, combined with the length of the hash, might be enough to determine which algorithm was used. For example, if I saw a 40 digit long hash, I would assume it was generated using sha1. This will not be a problem for a well secured application however; there are some simple tricks to making it impossible to determine how a hash was generated simply from looking at the output.

One such method, called salting, involves appending a predetermined string of characters, called a salt, to the hash and rehashing the resulting string. Here is an example using the sha1 has of the string ‘tinsology’ with the salt ‘a$f’:

Step Value
Input tinsology
SHA1 Hash ddb48f4802ac20502d644ad0ef59e3b984f61e05
Append Salt ddb48f4802ac20502d644ad0ef59e3b984f61e05a$f
Rehash aa68c096b60a2646041df5cba4664330cfcea597

An attacker would be correct to assume that the final hash was generated using sha1, but if he were unaware that a salt was used that information would be useless. If he attempted to find a string with a matching hash using brute force it would be equivalent to trying to crack a 43 character password. This would be true regardless of how long the password used as input was.

If an attacker is able to obtain the database containing the hash however, it may be reasonable to assume that they have also obtained the source code revealing how the hash was generated. Even if this is the case you still gain something by salting and rehashing. Increasing the number of operations needed to generate a hash means you reduce the number of passwords that can be tested per second. If an attacker can generate and test X passwords per second, doubling the number of operations needed to generate a hash would reduce that number to somewhere around one half of X.

This fact yields an alternative to using longer passwords. While hashing is a fairly common operation on a webserver, every time a user logs in a hash is generated, it does not compare to the trillions of hashes that need to be generated to crack a password using brute force. If you maximize the amount of time that can reasonably be spent generating hashes on a webserver, you minimize the effectiveness of brute force attacks.

What does all of this say about the question at hand? Do we need to use 12 or more characters passwords in order to be safe from having our passwords cracked? Perhaps, but if we do it isn’t because passwords cannot be secured against brute force attacks. The underlying point in the article is that as technology improves, these types of attacks become more and more feasible. I think that this is a two way street however. If malicious users can throw more hardware at the problem, then so can server administrators. The advantage in this case does not go to the attacker. An attacker must generate trillions of hashes per minute in order to find a match in a reasonable amount of time. Even the largest websites out there don’t even come close to that. Using a slower hashing algorithm is going to slow down a brute force attack a whole lot more than a server.

The catch is how much trust you should put into the websites you are giving your password to. I trust that if I give my password Google I’m not going to find out that someone came along, downloaded their database, and found that my password was being stored unencrypted. This definitely isn’t the case for every website asking for my password. The trick is to have multiple passwords and make sure that you don’t use the same one for your bank account as you do for the website your neighbor made dedicated to his cat.

A Supplement

I’ve just added a section about session management to Creating a Secure Login System the Right Way.

Check it out