I noticed an article today on CNN called How to create a ‘super password’. For those of you who don’t feel like reading the article, it is basically about how researchers at Georgia Tech are able to brute force 8-character passwords in less than a couple of hours, using “clusters of graphics cards”. This is nothing new; graphics cards are being used for all sorts of applications other than graphics. What I think is more interesting is the conclusion the writer draws from the fact that cracking an 8-character password is a simple enough task, provided you have the hardware and the ability to program it. That conclusion being we should “Say goodbye to those wimpy, eight-letter passwords”.
Before I comment on the article I’d like to talk about what goes into cracking a password. The article refers to the brute force method, but fails to mention that in order to carry out such an attack you must already have one piece of information: the password hash. While it is technically possible to carry out a brute force attack without this piece of information, it would mean every time you generate a password you would have to test it using the target platform itself. For example, if you were trying to obtain a user’s password to their personal computer, your password generating program would have to attempt to log in with every password it generated. Even if it weren’t the case that most applications are smart enough to lock a user out after a certain number of attempts, using this method would greatly diminish the number of passwords you could generate in a reasonable amount of time. If you consider attempting to obtain a user’s password to a website, where every guess also costs a network round trip, it becomes practically impossible to use this method.
So what is a password hash and how do you obtain it? Most applications do not store passwords in their raw state. Instead they use the password to generate a hash and store that instead. Rather than explain what a hash is myself I’ll let Wikipedia do the work; in short, a hash function is a one-way function that turns input of any length into a fixed-length output. Obtaining a hash might be as simple as knowing where to look when it comes to desktop applications. For web applications, however, there is a bit more to it. When you register on a website the server generates a hash and stores it. The hash never even touches the user’s computer. This means that in order for an attacker to have the hash corresponding to your password they must have already compromised that site. This of course does happen; if it didn’t I wouldn’t be writing this.
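To make the idea of storing a hash instead of the password concrete, here is a minimal sketch in Python. The `users` dictionary and the `register` and `check_login` functions are hypothetical stand-ins for a real user table and login code, and a single unsalted sha1 is used only because it keeps the example short, not because it is a good choice.

```python
import hashlib

# Hypothetical in-memory "user table": username -> stored hash (hex string).
users = {}

def register(username: str, password: str) -> None:
    # Only the digest is stored; the raw password is never kept.
    users[username] = hashlib.sha1(password.encode("utf-8")).hexdigest()

def check_login(username: str, password: str) -> bool:
    # Hash the attempt the same way and compare it to the stored digest.
    stored = users.get(username)
    attempt = hashlib.sha1(password.encode("utf-8")).hexdigest()
    return stored is not None and stored == attempt

register("alice", "tinsology")
print(check_login("alice", "tinsology"))  # True
print(check_login("alice", "wrong"))      # False
```

Anyone who steals `users` gets the digests, not the passwords, which is why the attacker still has work to do after a breach.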
Once an attacker has the hash they can begin their brute force attack. Now every time a candidate password is generated it is hashed using the same method as the hash obtained from the server, and the two hashes are compared. As the article mentions, if your password is 8 characters long (even if you use non-alphanumeric characters) it will take only a matter of hours with the right hardware to find a match. Simple, right? Maybe not. There are a couple of assumptions here that may be the key to whether or not it is necessary to have longer passwords: does the attacker know the method used to generate the hash, and how quickly can a hash be generated using this method?
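The loop described above can be sketched in a few lines of Python. This is a toy, assuming the attacker already knows the hash is a plain sha1 and that the password uses only lowercase letters; the `brute_force` function and its parameters are made up for illustration.

```python
import hashlib
import itertools
import string

def brute_force(target_hex, alphabet=string.ascii_lowercase, max_len=4):
    # Generate every candidate of length 1..max_len, hash it the same way
    # the stolen hash was produced, and stop at the first match.
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
            if digest == target_hex:
                return candidate
    return None  # nothing in the searched space matched

# Pretend this digest was taken from a compromised database.
stolen = hashlib.sha1(b"cat").hexdigest()
print(brute_force(stolen, max_len=3))  # prints "cat"
```

Both assumptions from the paragraph above are baked into this sketch: the hashing method is known, and each hash is cheap to compute.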
An attacker cannot necessarily determine the method used to generate a hash from the hash string alone. Seeing some example hashes might help understand why. This is a (very non-comprehensive) list of some hash functions and example outputs (all of them were given the input ‘tinsology’):
| Algorithm | Length (hex digits) | Output |
|---|---|---|
| crc32 | 8 | ba86b1ef |
| tiger128 | 32 | 13b7a126b69ef08f71a7c3a8ff6cf55b |
| haval128 | 32 | 14e530346f17fb9b2ec292a6d6cb6461 |
| md5 | 32 | bb58ec46cc9049167ab394f131773dde |
| tiger160 | 40 | 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a22 |
| haval160 | 40 | deca5c4ff24586119f80082ea67b3594740cb563 |
| sha1 | 40 | ddb48f4802ac20502d644ad0ef59e3b984f61e05 |
| tiger192 | 48 | 13b7a126b69ef08f71a7c3a8ff6cf55bf7ae6a2224cf7782 |
| haval192 | 48 | d4ff8cd2f8bb7696594db27dd13583bdf9a15899a6955a5f |
| sha224 | 56 | e1d14af535bbe9991ff87f29355a5c5690ea22a184042419f2420434 |
| haval224 | 56 | 36b218f09b72fbaa3171e9e6084a540a4f2d4a5bb71859801071ae71 |
| haval256 | 64 | eb1a125d894b136fa7d125ff23ebbd504dacb333c188815d1e5bd215763aafa3 |
| snefru256 | 64 | ed79e0ccdb0a8064c9b38b80057f37221cbd9730f635649775bd31083d7656dd |
| sha256 | 64 | 3041a80756a26c887db6a5ec5083f0247ce9cce357b498238cab755f0b13e285 |
| ripemd320 | 80 | 6d41b9e88392fef66e07874c7ead16052992651d0737654f4b8568758ce2cf775a04f01b4ae81815 |
| sha384 | 96 | 7cbc44cc4a0c017ec481ec46f306672d36e290241dcdb81dd61a3a442296d7f1e032ba827bbd1f46b4e1da058f3243fb |
| sha512 | 128 | 83ffea274400557ad24d2a6b50a28fef5ce42b37730a574d4d96b3f8bb96db3f51add41b77ddd656c031bc470c1e8c3c… e27582be1c04d7e785f15d42c9d284be |
| whirlpool | 128 | 4243b927f617890f8fe2d376c28d87d7836f4d71567b786d869375ec1b5f355cb600eed1b920744cbb529325e37d92aa… 753b88b9c790b143db3061b40e33ffdc |
| salsa20 | 128 | 6c0826f7f717b8e7034302d3991034a109cbdcee47f5a4d6666320a42b2d0a034dcf11666078710a9cf9ddaab7d0ed6f… 0898b83c62a607858bb4fc8a55589415 |
It is important to note that not all of the above algorithms are meant for or should be used for password hashing. I’m looking at you, md5.
As you can see from the list, length alone is not a distinguishing factor. However, knowledge that a particular hash function is more commonly used than others, combined with the length of the hash, might be enough to determine which algorithm was used. For example, if I saw a 40-digit hash, I would assume it was generated using sha1. This will not be a problem for a well secured application, however; there are some simple tricks that make it impossible to determine how a hash was generated simply from looking at the output.
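The point about length can be checked with a short Python sketch. The algorithm list here is an assumption: it is just a handful that Python's `hashlib` ships with, not the same set as the table above.

```python
import hashlib

# Algorithms chosen for illustration from Python's standard library.
algorithms = ["sha1", "sha256", "sha3_256", "blake2s",
              "sha512", "sha3_512", "blake2b"]

# Length of each algorithm's hex output for the same input.
lengths = {
    name: len(hashlib.new(name, b"tinsology").hexdigest())
    for name in algorithms
}

# Group algorithm names by output length.
by_length = {}
for name, n in lengths.items():
    by_length.setdefault(n, []).append(name)

for n in sorted(by_length):
    print(n, by_length[n])
```

Several algorithms land on 64 and 128 hex digits, so the length narrows the candidates but does not identify the function.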
One such method, called salting, involves appending a predetermined string of characters, called a salt, to the hash and rehashing the resulting string. Here is an example using the sha1 hash of the string ‘tinsology’ with the salt ‘a$f’:
| Step | Value |
|---|---|
| Input | tinsology |
| SHA1 Hash | ddb48f4802ac20502d644ad0ef59e3b984f61e05 |
| Append Salt | ddb48f4802ac20502d644ad0ef59e3b984f61e05a$f |
| Rehash | aa68c096b60a2646041df5cba4664330cfcea597 |
An attacker would be correct to assume that the final hash was generated using sha1, but if he were unaware that a salt was used that information would be useless. If he attempted to find a string with a matching hash using brute force it would be equivalent to trying to crack a 43-character password. This would be true regardless of how long the password used as input was.
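The steps in the table can be sketched in Python as follows. The helper names `sha1_hex` and `salted_rehash` are made up for this example; the salt and input are the ones used above.

```python
import hashlib

def sha1_hex(text: str) -> str:
    # sha1 of a string, returned as 40 hex digits.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def salted_rehash(password: str, salt: str) -> str:
    # Step 1: hash the password. Step 2: append the salt to that hash.
    # Step 3: hash the combined string again.
    first = sha1_hex(password)
    return sha1_hex(first + salt)

plain = sha1_hex("tinsology")
salted = salted_rehash("tinsology", "a$f")

print(plain)
print(salted)
print(len(plain + "a$f"))  # 43, the length of the string that is rehashed
```

The stored value still looks like an ordinary sha1 hash, but the string that produced it is 43 characters long no matter what the user typed.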
If an attacker is able to obtain the database containing the hash, however, it may be reasonable to assume that they have also obtained the source code revealing how the hash was generated. Even if this is the case you still gain something by salting and rehashing. Increasing the number of operations needed to generate a hash means you reduce the number of passwords that can be tested per second. If an attacker can generate and test X passwords per second, doubling the number of operations needed to generate a hash would reduce that number to somewhere around one half of X.
This fact yields an alternative to using longer passwords. Hashing is a fairly common operation on a webserver (a hash is generated every time a user logs in), but that volume does not compare to the trillions of hashes that need to be generated to crack a password using brute force. If you maximize the amount of time that can reasonably be spent generating each hash on a webserver, you minimize the effectiveness of brute force attacks.
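One way to spend more time per hash is to repeat the hashing many times. The sketch below uses PBKDF2 from Python's standard library as a stand-in for that idea; it is not the scheme described above, and the function name `slow_hash`, the salt, and the iteration counts are arbitrary choices for illustration.

```python
import hashlib
import time

def slow_hash(password: str, salt: bytes, iterations: int) -> str:
    # PBKDF2 applies HMAC-SHA256 `iterations` times, so the cost of one
    # hash grows roughly in proportion to the iteration count.
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return derived.hex()

for iterations in (1_000, 100_000):
    start = time.perf_counter()
    slow_hash("tinsology", b"a$f", iterations)
    elapsed = time.perf_counter() - start
    print(f"{iterations:>7} iterations: {elapsed:.4f} s per hash")
```

A login pays this cost once, while a brute force attack pays it for every candidate it tries.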
What does all of this say about the question at hand? Do we need to use passwords of 12 or more characters in order to be safe from having our passwords cracked? Perhaps, but if we do it isn’t because passwords cannot be secured against brute force attacks. The underlying point in the article is that as technology improves, these types of attacks become more and more feasible. I think that this is a two way street, however. If malicious users can throw more hardware at the problem, then so can server administrators. The advantage in this case does not go to the attacker. An attacker must generate trillions of hashes per minute in order to find a match in a reasonable amount of time. Even the largest websites out there don’t even come close to that. Using a slower hashing algorithm is going to slow down a brute force attack a whole lot more than it slows down a server.
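A rough back-of-the-envelope calculation shows how password length and hash speed trade off. Everything numeric here is an assumption made for illustration: a 95-character alphabet (printable ASCII), an attacker managing 10^12 fast hashes per second, and a slow hash that costs 100,000 times as much.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def keyspace(alphabet_size: int, length: int) -> int:
    # Number of distinct passwords of exactly this length.
    return alphabet_size ** length

def years_to_exhaust(alphabet_size, length, hashes_per_second):
    # Worst-case time to try every password in the keyspace.
    return keyspace(alphabet_size, length) / hashes_per_second / SECONDS_PER_YEAR

FAST_RATE = 1e12               # hypothetical attacker rate, fast hash
SLOW_RATE = FAST_RATE / 1e5    # same hardware, hash made 100,000x slower

for length in (8, 12):
    for label, rate in (("fast hash", FAST_RATE), ("slow hash", SLOW_RATE)):
        years = years_to_exhaust(95, length, rate)
        print(f"{length} chars, {label}: {years:.3g} years")
```

Under these assumptions, slowing the hash multiplies the attacker's time by the same factor of 100,000, whereas a server handling a few logins per second barely notices the extra cost.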
The catch is how much trust you should put into the websites you are giving your password to. I trust that if I give my password to Google I’m not going to find out that someone came along, downloaded their database, and found that my password was being stored unencrypted. This definitely isn’t the case for every website asking for my password. The trick is to have multiple passwords and make sure that you don’t use the same one for your bank account as you do for the website your neighbor made dedicated to his cat.
Thank you very much… I have been programming in PHP/MySQL for about 6 years and while I understood what “To Do” and “Not To Do”, I can honestly say I didn’t get the hashing until I read this article and a similar article you wrote. Very helpful.
We don’t need longer passwords, we just need more complex ones. The ones that they are brute forcing over there are most likely lists of common passwords that people use or common words created into passwords.
I use a 10 character hexadecimal for most of my websites, none of which has ever been brute forced or guessed. And I use a 24 character hexadecimal for my wireless network.
‘Complex’ passwords are resistant to dictionary attacks. A brute force attack is comprehensive with regard to its character set. The article I refer to doesn’t explicitly refer to the set of characters used in the attack, but I wouldn’t be surprised if it were completely comprehensive. Because of this even complex passwords are vulnerable.
With regard to using hex values as passwords, you can represent just over a trillion distinct values with 10 hex digits. It sounds like a lot but it really isn’t. If you include upper and lower case letters in your hex string (22 instead of 16 possible values for each character) that number jumps to about 26.6 trillion (2.66 * 10^13). For passwords that can contain any upper case letter, lower case letter, or digit (62 values), that number becomes about 8.39 * 10^17 (for 10 characters).
ha ha
The longer passwords are just to confuse the guy over your shoulder, watching you type it in with one finger, right? I like what you said about not using the same password for everything, it is good to reinforce the idea.
Thank you.
All good points. I would be so happy if we could just stop using MD5 and move to something modern…..
Nice article; it always surprises me how many largish websites take little notice of password hashing.
As a rule of thumb I believe that if a website can provide you (when you reset your password) with your old password, odds are the password is not secured in their database and that website should be avoided.