Do We Need Longer Passwords?
I noticed an article today on CNN called How to create a ‘super password’. For those of you who don’t feel like reading the article, it is basically about how researchers at Georgia Tech are able to brute force 8-character passwords in less than a couple of hours, using “clusters of graphics cards”. This is nothing new; graphics cards are being used for all sorts of applications other than graphics. What I think is more interesting is the conclusion the writer draws from the fact that cracking an 8-character password is a simple enough task, provided you have the hardware and the ability to program it. That conclusion being we should “Say goodbye to those wimpy, eight-letter passwords”.
Before I comment on the article I’d like to talk about what goes into cracking a password. The article refers to the brute force method, but fails to mention that in order to carry out such an attack you must already have one piece of information: the password hash. While it is technically possible to carry out a brute force attack without this piece of information, it would mean every time you generate a password you would have to test it using the target platform itself. For example, if you were trying to obtain a users password to their personal computer, your password generating program would have to attempt to login with every password it generated. Even if it weren’t the case that most applications are smart enough to lock a user out after a certain number of attempts, using this method would greatly diminish the number of passwords you could generate in a reasonable amount of time. If you consider attempting to obtain a user’s password to a website it becomes entirely impossible to use this method.
So what is a password hash and how do you obtain it? Most applications do not store passwords in their raw state. Instead they use the password to generate a hash and store that instead. Rather than explain what a hash is myself I’ll let Wikipedia do the work. Obtaining a hash might be as simple as knowing where to look when it comes to desktop applications. For web applications however, there is a bit more to it. When you register on a website the server generates a hash and stores it. It never even touches the users computer. This means that in order for an attacker to have the hash corresponding to your password they must have already compromised that site. This of course does happen; if it didn’t I wouldn’t be writing this.
Once an attacker has the hash they can begin their brute force attack. Now every time a password is generated it is hashed using the same method as the hash obtained from the server. As the article mentions, if your password is 8 characters long (even if you use non-alphanumeric characters) it will take only a matter of hours with the right hardware to find a match. Simple right? Maybe not. There are a couple of assumptions here that may be the key to whether or not it is necessary to have longer passwords: Does the attacker know the method used to generate the hash and how quickly can a hash be generated using this method.
An attacker cannot necessarily determine the method used to generate a hash from the hash string alone. Seeing some example hashes might help understand why. This is a (very non-comprehensive) list of some hash functions and example outputs (all of them were given the input ‘tinsology’):
| Algorithm | length (hex digits) | Output |
|---|---|---|
| crc32 | 8 | ba86b1ef |
| tiger128 | 32 | 13b7a126b69ef08f71a7c3a8ff6cf55b |
| haval128 | 32 | 14e530346f17fb9b2ec292a6d6cb6461 |
| md5 | 32 | bb58ec46cc9049167ab394f131773dde |
| tiger160 | 40 | 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a22 |
| haval160 | 40 | deca5c4ff24586119f80082ea67b3594740cb563 |
| sha1 | 40 | ddb48f4802ac20502d644ad0ef59e3b984f61e05 |
| tiger192 | 48 | 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a2224cf7782 |
| haval192 | 48 | d4ff8cd2f8bb7696594db27dd13583bdf9a15899a6955a5f |
| sha224 | 56 | e1d14af535bbe9991ff87f29355a5c5690ea22a184042419f2420434 |
| haval224 | 56 | 36b218f09b72fbaa3171e9e6084a540a4f2d4a5bb71859801071ae71 |
| haval256 | 64 | eb1a125d894b136fa7d125ff23ebbd504dacb333c188815d1e5bd215763aafa3 |
| snefru256 | 64 | ed79e0ccdb0a8064c9b38b80057f37221cbd9730f635649775bd31083d7656dd |
| sha256 | 64 | 3041a80756a26c887db6a5ec5083f0247ce9cce357b498238cab755f0b13e285 |
| ripemd320 | 80 | 6d41b9e88392fef66e07874c7ead16052992651d0737654f4b8568758ce2cf775a04f01b4ae81815 |
| sha384 | 96 | 7cbc44cc4a0c017ec481ec46f306672d36e290241dcdb81dd61a3a442296d7f1e032ba827bbd1f46b4e1da058f3243fb |
| sha512 | 128 | 83ffea274400557ad24d2a6b50a28fef5ce42b37730a574d4d96b3f8bb96db3f51add41b77ddd656c031bc470c1e8c3ce27582be1c04d7e785f15d42c9d284be |
| whirlpool | 128 | 4243b927f617890f8fe2d376c28d87d7836f4d71567b786d869375ec1b5f355cb600eed1b920744cbb529325e37d92aa753b88b9c790b143db3061b40e33ffdc |
| salsa20 | 128 | 6c0826f7f717b8e7034302d3991034a109cbdcee47f5a4d6666320a42b2d0a034dcf11666078710a9cf9ddaab7d0ed6f0898b83c62a607858bb4fc8a55589415 |
It is important to note that not all of the above algorithms are meant for or should be used for password hashing. I’m looking at you md5.
As you can see from the list length alone is not a distinguishing factor. However, knowledge that a particular hash function is more commonly used than others, combined with the length of the hash, might be enough to determine which algorithm was used. For example, if I saw a 40 digit long hash, I would assume it was generated using sha1. This will not be a problem for a well secured application however; there are some simple tricks to making it impossible to determine how a hash was generated simply from looking at the output.
One such method, called salting, involves appending a predetermined string of characters, called a salt, to the hash and rehashing the resulting string. Here is an example using the sha1 has of the string ‘tinsology’ with the salt ‘a$f’:
| Step | Value |
|---|---|
| Input | tinsology |
| SHA1 Hash | ddb48f4802ac20502d644ad0ef59e3b984f61e05 |
| Append Salt | ddb48f4802ac20502d644ad0ef59e3b984f61e05a$f |
| Rehash | aa68c096b60a2646041df5cba4664330cfcea597 |
An attacker would be correct to assume that the final hash was generated using sha1, but if he were unaware that a salt was used that information would be useless. If he attempted to find a string with a matching hash using brute force it would be equivalent to trying to crack a 43 character password. This would be true regardless of how long the password used as input was.
If an attacker is able to obtain the database containing the hash however, it may be reasonable to assume that they have also obtained the source code revealing how the hash was generated. Even if this is the case you still gain something by salting and rehashing. Increasing the number of operations needed to generate a hash means you reduce the number of passwords that can be tested per second. If an attacker can generate and test X passwords per second, doubling the number of operations needed to generate a hash would reduce that number to somewhere around one half of X.
This fact yields an alternative to using longer passwords. While hashing is a fairly common operation on a webserver, every time a user logs in a hash is generated, it does not compare to the trillions of hashes that need to be generated to crack a password using brute force. If you maximize the amount of time that can reasonably be spent generating hashes on a webserver, you minimize the effectiveness of brute force attacks.
What does all of this say about the question at hand? Do we need to use 12 or more characters passwords in order to be safe from having our passwords cracked? Perhaps, but if we do it isn’t because passwords cannot be secured against brute force attacks. The underlying point in the article is that as technology improves, these types of attacks become more and more feasible. I think that this is a two way street however. If malicious users can throw more hardware at the problem, then so can server administrators. The advantage in this case does not go to the attacker. An attacker must generate trillions of hashes per minute in order to find a match in a reasonable amount of time. Even the largest websites out there don’t even come close to that. Using a slower hashing algorithm is going to slow down a brute force attack a whole lot more than a server.
The catch is how much trust you should put into the websites you are giving your password to. I trust that if I give my password Google I’m not going to find out that someone came along, downloaded their database, and found that my password was being stored unencrypted. This definitely isn’t the case for every website asking for my password. The trick is to have multiple passwords and make sure that you don’t use the same one for your bank account as you do for the website your neighbor made dedicated to his cat.
