Securely storing your users' passwords
What we'll focus on
- Password hashing functions
- Salting
- Password cracking
Pre-requisites
- Programming concepts like functions, loops, etc.
- Basic knowledge of Go
- Knowledge about how the web works (HTTP, forms, HTTP verbs, passwords)
As we build user-facing software systems, we always face the same problem of user identity management: how do we verify that the users trying to access certain data are its rightful owners?
Over the years, many web-based software systems, e.g. LinkedIn, have suffered breaches of user password data. How then have we not learnt to store user passwords properly?
As gatherers of data, we owe our users proper protection of the data entrusted to us. Good intentions are not enough. We have to put proper mechanisms in place to make data security attainable.
Through this article, we'll explore the different password storage mechanisms, the weaknesses of some popular hashing algorithms and ways to overcome the said weaknesses.
Understanding hashing
Hashing is the process through which we transform plain text into hashes. Instead of storing plaintext strings, e.g. password, in our database, we store the hash representation of those strings.
Some popular hashing functions include MD5 (Message-Digest Algorithm 5) and SHA-2 (Secure Hash Algorithm 2), to mention but a few.
For example, suppose my password is thisIsMyPassword and I'm using the MD5 hash function to hash it.
This would be the resultant string: 80d2f0dd3f1caa2e62bab686d6d1d140
Properties of hash functions
- They are one-way functions
Once you have hashed an input, there is no practical way to run the function backwards and recover the input from the hash.
- Collisions are time consuming to find
It takes a considerable amount of time to find two inputs that hash to the same output (referred to as a collision). If you use a secure hash function, finding one would essentially take longer than the age of the earth.
To put the above in context, it's estimated that it will take 36 trillion years to find a collision for the SHA-256 hash function. The universe is only 13.8 billion years old.
- Fixed-size output is produced
Long and short strings of text will result in hashes of the same length. Why then should we insist on using long passwords?
It's because the longer the password, the more difficult it is to crack it. I'll explain below.
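The properties above can be checked with a short sketch. It uses SHA-256 from Go's standard library; the helper name hashHex is made up for this illustration and is not part of any library.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// hashHex is a hypothetical helper: it returns the SHA-256 digest of s as a hex string.
func hashHex(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

func main() {
	short := "a"
	long := strings.Repeat("a long passphrase ", 100)

	// Both digests are 64 hex characters (256 bits), whatever the input length.
	fmt.Println(len(hashHex(short)), len(hashHex(long))) // 64 64

	// Hashing the same input twice gives the same digest.
	fmt.Println(hashHex(short) == hashHex(short)) // true

	// Changing a single character produces a different digest.
	fmt.Println(hashHex("password") == hashHex("Password")) // false
}
```

Note that the digest length says nothing about how strong the password is: a one-character password and a long passphrase both yield 64 hex characters.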
Anatomy of a hash function
I choose not to go into the inner workings of the most popular hash functions. However, here's a list of resources that delve into the details:
Examples of how to hash a string using MD5, SHA-256, SHA-1
- Using SHA-256
package main

import (
	"crypto/sha256"
	"fmt"
)

func main() {
	// Sum256 hashes the whole input in one call and returns a [32]byte digest
	fmt.Printf("%x\n", sha256.Sum256([]byte("password")))
}
// Output: 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
- Using SHA-1
package main

import (
	"crypto/sha1"
	"fmt"
)

func main() {
	// Sum hashes the whole input in one call and returns a [20]byte digest
	fmt.Printf("%x\n", sha1.Sum([]byte("password")))
}
// Output: 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
- Using MD5
package main

import (
	"crypto/md5"
	"fmt"
)

func main() {
	// Sum hashes the whole input in one call and returns a [16]byte digest
	fmt.Printf("%x\n", md5.Sum([]byte("password")))
}
// Output: 5f4dcc3b5aa765d61d8327deb882cf99
Bad approach to password handling
Hash functions are one-way "streets", but they are also deterministic: the same input always produces the same output.
For example, suppose your system has over 100,000 users and about 1% of them are using password as their password.
Since you're using the same hashing function, e.g. MD5, for every account, about 1,000 user accounts will have the same password hash in your database.
In case of a database breach, the attackers may use a rainbow table approach: they compare the password hashes in your database with precomputed hashes, thereby revealing the original passwords the users chose.
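Here is a minimal sketch of that idea in Go. It uses a plain precomputed lookup table, which is a simplification (real rainbow tables use hash chains to save space), and the usernames, candidate list and function names are all made up for the illustration.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// md5Hex returns the unsalted MD5 digest of s as a hex string.
func md5Hex(s string) string {
	return fmt.Sprintf("%x", md5.Sum([]byte(s)))
}

// buildLookup precomputes a digest -> plain text table for candidate passwords.
func buildLookup(candidates []string) map[string]string {
	table := make(map[string]string, len(candidates))
	for _, c := range candidates {
		table[md5Hex(c)] = c
	}
	return table
}

// crack returns the plain text for every leaked hash that appears in the table.
func crack(leaked, table map[string]string) map[string]string {
	found := make(map[string]string)
	for user, hash := range leaked {
		if plain, ok := table[hash]; ok {
			found[user] = plain
		}
	}
	return found
}

func main() {
	// Hypothetical leaked rows: username -> stored MD5 hash.
	leaked := map[string]string{
		"alice": md5Hex("password"),
		"bob":   md5Hex("password"), // same password, so same hash as alice
		"carol": md5Hex("x7!Lq9#vT2mz"),
	}
	table := buildLookup([]string{"123456", "password", "qwerty"})

	found := crack(leaked, table)
	fmt.Println(len(found), found["alice"], found["bob"]) // 2 password password
}
```

The table is built once and can be reused against every breached database that uses the same unsalted hash function, which is what makes this approach cheap for an attacker.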
Password hash cracking demo
I'll use a tool called hashcat to recover the plain text behind a password hash. There are many other tools that can do the same job.
See instructions on how to set up hashcat here
I'm using a Kali Linux VM, which comes with password lists stored in the /usr/share/wordlists directory.
$ hashcat -m 0 hashes.txt /usr/share/wordlists/rock.you.txt --show
- -m selects the hash type (hashcat calls this the hash mode; the attack mode is a separate flag, -a)
- 0 tells hashcat that we're using raw MD5 hashes
- hashes.txt is the file containing the password hash. It contains 5f4dcc3b5aa765d61d8327deb882cf99 as the only value
- /usr/share/wordlists/rock.you.txt is the path to our word list
- --show is a flag to show the output in the format password hash:plain text equivalent
Here is the output:
We can see that hashing user data with a fast one-way hash function alone may not be enough.
A better approach
Very secure systems use hash algorithms that take into account the time and resources required to compute a given password digest. This allows us to create password digests that are computationally expensive to produce on a large scale. The more expensive each calculation, the more difficult it is for an attacker to precompute hashes or to guess passwords in bulk.
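To make the idea concrete, here is a toy sketch using only the standard library. It combines a random per-user salt with repeated hashing, so identical passwords no longer share a digest and every guess costs many hash operations. The names newSalt and stretch and the round count are assumptions made for this illustration; this is not a vetted scheme and should not be used for real password storage.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// newSalt returns n random bytes from the operating system's secure random source.
func newSalt(n int) ([]byte, error) {
	salt := make([]byte, n)
	if _, err := rand.Read(salt); err != nil {
		return nil, err
	}
	return salt, nil
}

// stretch hashes salt+password once, then re-hashes the digest `rounds` more times.
// Toy illustration only: it shows the concept, not a production algorithm.
func stretch(password string, salt []byte, rounds int) [32]byte {
	input := append(append([]byte{}, salt...), password...)
	digest := sha256.Sum256(input)
	for i := 0; i < rounds; i++ {
		digest = sha256.Sum256(digest[:])
	}
	return digest
}

func main() {
	saltA, err := newSalt(16)
	if err != nil {
		panic(err)
	}
	saltB, err := newSalt(16)
	if err != nil {
		panic(err)
	}

	// Same password, different random salts: the stored digests differ,
	// so a single precomputed table no longer covers both users.
	a := stretch("password", saltA, 100000)
	b := stretch("password", saltB, 100000)
	fmt.Println("same digest for two users:", a == b)

	// Same password and same salt: the digest is reproducible,
	// which is what makes a later login check possible.
	fmt.Println("reproducible with stored salt:", a == stretch("password", saltA, 100000))
}
```

The salt is stored next to the digest so that the login check can repeat the same computation. Raising the round count makes every guess slower for an attacker, at the price of a slower login.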
In Go, it's recommended that you use the bcrypt package.
package main

import (
	"fmt"
	"log"

	"golang.org/x/crypto/bcrypt"
)

func main() {
	password1 := "thisIsAPassword"
	// Generate a hash from the password
	hash, err := bcrypt.GenerateFromPassword([]byte(password1), bcrypt.DefaultCost)
	if err != nil {
		// Without a hash there is nothing to store, so stop here
		log.Fatalln("error: ", err)
	}
	fmt.Println("Hash to store:", string(hash))
	// Store this "hash" somewhere

	// Later, a user wants to log in. Check the password they entered against the one you have in the database
	password2 := "given-password"
	storedHash := hash
	if err := bcrypt.CompareHashAndPassword(storedHash, []byte(password2)); err != nil {
		// A non-nil error means the password does not match the stored hash
		log.Println("error: ", err)
		return
	}
	fmt.Println("Password is correct!")
}
Conclusion
I hope this article has made a contribution towards building secure software with user data security in mind.
Let me know your thoughts about this article at hi[at]luigimorel.com. Cheers!