Tons of websites and applications require us to register a username and password with their system. As a developer, this scares the hell out of me. If I register on 40 websites and one of them gets breached, is my information secure? Does that one breach potentially affect the other 39?
How are you storing your users’ passwords? Read more to see if your password protection can hold up to today’s hackers…
The Methods
Plain Text – If you are storing passwords in plain text, please unplug your internet connection because you are doing more harm than good. People tend to do this for one of two reasons… 1) negligence, or 2) they need to be able to recover the password.
Encryption – If you do need to recover the original passwords, you will want some sort of two-way encryption. The catch is that the decryption key has to live somewhere your application can reach, so if your entire system is compromised, there is a good chance the passwords are too. If you take the route of encryption, I recommend AES, which is the current U.S. government-approved standard for data encryption. Be warned though, proper encryption requires a lot of preparation. Encrypting passwords is rarely necessary. Allowing users to reset their password via email is typically preferred over having a way to actually reverse it. Personally, this also gives me the peace of mind that they are not storing my password.
Hash (One-way encryption) – A hash is a one-way transformation that produces a short, fixed-size representation of the original data. It cannot be reversed directly; an attacker can only guess inputs and compare the results. You can authenticate users by storing a hash of their password, and when they return, the hash of the password they log in with should match the one stored. This is the preferred method of authenticating users (or, if you can, some sort of OpenID implementation with no password exchange).
Home Grown Solution – Unless you have been studying advanced cryptography for several years, please stop. Every home grown solution I’ve seen has been security-by-obscurity at best, and would not stop even an inexperienced malicious user.
Why You Shouldn’t Use md5 or sha1/sha2
MD5 is one of the most popular hashing algorithms, and has been around for a long time now. It is still being widely (ab)used. It creates a 128-bit (32 hex digit) hash of the input string. Generally, you will get a unique hash for every string. SHA1 and its successor, SHA2, are also very popular hashing algorithms. While they are better than MD5, they were not designed for password storage either. These are general-purpose hashing functions built to be fast, which makes them unsafe for dealing with passwords. Here is why…
Rainbow Tables – One of the main flaws of using a standard hash is that hackers can compile a large table of strings and their corresponding hashes ahead of time. A simple test is to search on Google for a hash of your password (or any string) and see what you come up with. This easily cracks common passwords, but what about more advanced ones?
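To make the lookup idea concrete, here is a minimal sketch. The word list and the "leaked" hash are made up for illustration, and this is a plain precomputed lookup table rather than a true rainbow table (which uses hash chains to save space), but the effect on unsalted hashes is the same.

```php
<?php
// Hypothetical word list; real tables hold millions or billions of entries.
$commonPasswords = ['123456', 'password', 'qwerty', 'letmein', 'monkey'];

// Precompute hash => plaintext once. The table can be reused against any
// database that stores unsalted MD5 hashes.
$lookup = [];
foreach ($commonPasswords as $candidate) {
    $lookup[md5($candidate)] = $candidate;
}

// A hash as it might appear in a breached user table (made up for this example).
$leakedHash = md5('letmein');

// Cracking is now a single array lookup, with no hashing at attack time.
$recovered = isset($lookup[$leakedHash]) ? $lookup[$leakedHash] : null;
echo $recovered === null ? "not found\n" : "recovered: $recovered\n";
```

The expensive part (hashing every candidate) happens once, up front, and pays off against every unsalted database the attacker ever gets hold of.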
Adding Salt – Adding “salt” refers to appending or prepending random characters to the password before hashing it, so the result will not match the hash of other common passwords, or even the same password somewhere else. Each account should have a different salt! While this does solve the rainbow tables issue, it is still not foolproof.
Collisions – Unfortunately, md5/sha1/sha2 are not perfect hashing algorithms. Multiple strings can yield the same hash, so in theory a user could log in with a password different from their own. Practical collision attacks have been demonstrated against MD5 and SHA1 (not SHA2), but they rely on crafting two inputs together rather than matching an existing hash, so for password storage this is a much smaller risk than raw speed.
Brute Force – At the end of the day, if an attacker has the hashed password, they can try to brute force it. They’d need a script to generate all character combinations (or download a large db), and compare each hash with the one they are trying to break. If time is the only weapon we have, the slower the hash, the better our chances are. A decent server can crack a 6-character password in under a minute. If you are actually protecting sensitive information, keep in mind that a few thousand dollars allows an attacker to try more than 700 million passwords per second. It will not take long. More and more hackers are renting high-powered computers so they can work faster.
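Here is a rough sketch of what per-account salting looks like. SHA-256 is used only to show the effect of the salt, not as a recommendation, and saltedHash() is a made-up helper name. It assumes PHP 7+ for random_bytes().

```php
<?php
// Illustration only: SHA-256 shows the effect of a salt here; it is still
// a fast hash and not a good choice for password storage.
function saltedHash($password, $salt)
{
    return hash('sha256', $salt . $password);
}

$password = 'letmein';

// Each account gets its own random salt (16 random bytes, hex-encoded).
$saltAlice = bin2hex(random_bytes(16));
$saltBob   = bin2hex(random_bytes(16));

$hashAlice = saltedHash($password, $saltAlice);
$hashBob   = saltedHash($password, $saltBob);

// Same password, different salts, so the stored hashes differ.
var_dump($hashAlice === $hashBob);

// Verification recomputes the hash with the salt stored next to it.
var_dump(hash_equals($hashAlice, saltedHash($password, $saltAlice)));
```

A precomputed table is useless here because the attacker would need a separate table for every salt, but each individual hash can still be brute forced at full speed.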
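The arithmetic behind those numbers is easy to check. This sketch takes the 700 million guesses per second quoted above as given and computes the worst-case time to try every password of a given length; secondsToExhaust() is a made-up helper, and real attacks usually finish sooner because they try likely passwords first.

```php
<?php
// Worst-case seconds to try every password of a given length at a fixed
// guess rate.
function secondsToExhaust($alphabetSize, $length, $guessesPerSecond)
{
    $keyspace = pow($alphabetSize, $length); // number of possible passwords
    return $keyspace / $guessesPerSecond;
}

$rate = 700e6; // the 700 million guesses per second quoted above

// 6 lowercase letters: 26^6 = 308,915,776 candidates
printf("6 lowercase letters: %.2f s\n", secondsToExhaust(26, 6, $rate));

// 6 letters or digits: 62^6 = 56,800,235,584 candidates
printf("6 alphanumeric:      %.2f s\n", secondsToExhaust(62, 6, $rate));

// 8 letters or digits: 62^8 = 218,340,105,584,896 candidates
printf("8 alphanumeric:      %.1f days\n", secondsToExhaust(62, 8, $rate) / 86400);
```

At that rate, six lowercase letters fall in well under a second, six alphanumeric characters in roughly 81 seconds, and eight alphanumeric characters in under four days.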
Speed is the Enemy
The faster a hash is calculated, the quicker a hacker can brute force it. There is an enormous difference in computation power between your home PC, a decent server, and a super-computer that you can leverage for a few hundred dollars per hour. Some computers can calculate the md5 hash of every 6 or 7 character password in a few seconds.
bcrypt is a deliberately slow hashing algorithm based on Blowfish (a popular encryption algorithm). You set a cost between 4 and 31, and bcrypt runs 2^cost iterations of its expensive key setup, so every step up doubles the work. With the costs used in practice, a single password can take anywhere from a tenth of a second up to a few seconds. Just to drill it in… md5: millionth of a second, bcrypt: 0.1s-5s (very rough numbers).
bcrypt uses a salt internally, so it naturally defeats rainbow table attacks, and because every single guess is slow, brute force attacks become impractical against reasonably strong passwords. Even with some sort of super-computer, it could take years to crack the bcrypt hash of a decent password, which is significantly more effective than MD5 or SHA1/2.
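You can measure the cost setting yourself with a rough timing loop like the one below. The fixed salt is for timing only, and the absolute numbers depend entirely on your machine; what should hold everywhere is that each +1 in cost roughly doubles the time.

```php
<?php
// Time one bcrypt hash at several cost settings.
$password = 'TOP SECRET PASSWORD';
$salt = 'abcdefghijklmnopqrstuv'; // fixed salt for timing only, never in production

foreach ([4, 8, 10, 12] as $cost) {
    // The cost must be two digits, so pad single-digit values with a zero.
    $setting = '$2a$' . sprintf('%02d', $cost) . '$' . $salt;

    $start = microtime(true);
    $hash = crypt($password, $setting);
    $elapsed = microtime(true) - $start;

    printf("cost %02d: 2^%d = %d iterations, %.4f s\n", $cost, $cost, 1 << $cost, $elapsed);
}
```

A reasonable way to pick a value is to raise the cost until a single hash takes about as long as you are willing to make a login wait.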
Implementing bcrypt in PHP
bcrypt is a lot more difficult to call than md5(), and has a few dependencies as well. Oddly enough, it is also not very well documented. In order to use it, you call the crypt function (yes, it’s a hash and not an encryption function). In order to set the hash type, you need to pass it a specifically formatted salt.
string crypt ( string $str [, string $salt ] )
$str is your input string to hash, and $salt is a combination of configuration and the hash type to use. Here is the documentation for bcrypt (blowfish):
CRYPT_BLOWFISH – Blowfish hashing with a salt as follows: “$2a$”, a two digit cost parameter, “$”, and 22 base64 digits from the alphabet “./0-9A-Za-z”. Using characters outside of this range in the salt will cause crypt() to return a zero-length string. The two digit cost parameter is the base-2 logarithm of the iteration count for the underlying Blowfish-based hashing algorithm and must be in range 04-31, values outside this range will cause crypt() to fail.
Here is an example:
crypt('string_to_hash', '$2a$10$abcdefghijklmnopqrstuv');
Broken down:
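$password = 'TOP SECRET PASSWORD';
$rounds = 10; // cost: 2^10 iterations, typically ~0.2s; allowed range is 4-31
$salt = 'abcdefghijklmnopqrstuv'; // 22 chars from a-zA-Z0-9 / and . (use a random one per user)
// The cost must be two digits, so pad values below 10 with a leading zero.
$hash = crypt($password, '$2a$' . sprintf('%02d', $rounds) . '$' . $salt);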
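The hard-coded salt above is only there to show the format. A minimal sketch of generating a random one per user might look like this, assuming PHP 7+ for random_bytes(); randomBcryptSalt() is a made-up helper name, not part of PHP.

```php
<?php
// Build a random 22-character bcrypt salt from 16 bytes of CSPRNG output.
function randomBcryptSalt()
{
    $raw = base64_encode(random_bytes(16)); // 24 chars, ends in "=="
    $raw = str_replace('+', '.', $raw);     // '+' is not in the bcrypt alphabet
    return substr($raw, 0, 22);             // drop the "==" padding
}

$password = 'TOP SECRET PASSWORD';
$cost = 10;
$setting = '$2a$' . sprintf('%02d', $cost) . '$' . randomBcryptSalt();
$hash = crypt($password, $setting);

echo $hash, "\n"; // prefix, cost, salt, and hash in one 60-character string
```

Standard base64 and the bcrypt alphabet differ only in "+" versus ".", which is why a single replacement is enough to keep crypt() happy.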
When re-creating the hash for comparison, you’ll need the same salt and cost. The string returned by crypt() already contains the prefix, the cost, and the salt ahead of the hash itself, so you can store that one string in your user table and pass it back as the $salt argument when checking a login. Creating a user and authenticating users may take another 0.2s or so, but that’s really a small (and configurable) price to pay for security.
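Because crypt() embeds the prefix, cost, and salt in its output, a login check can feed the stored hash straight back in as the salt argument. Here is a rough sketch; verifyPassword() is a made-up helper name, and it assumes PHP 5.6+ for hash_equals(). The last two lines use the password_hash() and password_verify() functions that ship with PHP 5.5+ and wrap the same steps.

```php
<?php
function verifyPassword($attempt, $storedHash)
{
    // crypt() reads the prefix, cost, and salt from the stored hash, so the
    // stored hash doubles as the salt argument.
    $recomputed = crypt($attempt, $storedHash);

    // crypt() returns a short failure string on error, so check the length
    // before doing a timing-safe comparison.
    return strlen($recomputed) === 60 && hash_equals($storedHash, $recomputed);
}

// Registration: hash once and store the result (fixed salt only for the demo).
$storedHash = crypt('TOP SECRET PASSWORD', '$2a$10$abcdefghijklmnopqrstuv');

var_dump(verifyPassword('TOP SECRET PASSWORD', $storedHash)); // bool(true)
var_dump(verifyPassword('wrong guess', $storedHash));         // bool(false)

// On PHP 5.5+, the built-in helpers also generate the salt for you.
$builtin = password_hash('TOP SECRET PASSWORD', PASSWORD_BCRYPT, ['cost' => 10]);
var_dump(password_verify('TOP SECRET PASSWORD', $builtin));   // bool(true)
```

If your PHP version has them, the built-in helpers are the less error-prone route, since there is no salt formatting to get wrong.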