Uses and praxis of encryption & hashing in secure web applications.
Encryption is the process of turning data into other data, data which can only be made meaningful if you know how to decrypt it.
You should encrypt any data your organization has deemed "sensitive", both in transit and at rest.
Data is generally thought of as having three "states". Just like matter can be a liquid, solid or gas, data can be "in transit" (or "in motion"), "in use", or "at rest".
Data "in use" is a reference to data being stored in volatile memory, i.e. RAM. We could extend this metaphor to data displayed and edited in the web browser, but either way it's generally thought of as being the end-user's problem, and there's not much we can do about encrypting it, so we're not going to bother with it much today.
For our purposes, we can think of data "in transit" as being data transmitted from a client (like a browser) to our web server, or the reverse.
Data "at rest" is data stored in non-volatile memory, like in our database.
You should think of it as your responsibility to encrypt data that is "in transit" or "at rest".
Great news! We learned how to do this last week, and it's pretty easy.
`Strict-Transport-Security` HTTP header to make https mandatory.
Ok, step 1: use an encryption library, not some "roll-your-own" solution.
Let's take a look at a (very, very simple, and therefore very, very bad) example of encryption:
If you're just reading along in the notes without the accompanying lecture, make sure you check out the comments in the JavaScript.
While in this example, we used a single formula (letter + 1), a proper encryption algorithm would have a different formula every time it encrypts data. To know the formula used to encrypt a particular block of data, you need the "key".
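Here's roughly what that "letter + 1" formula could look like in Python. This is only a sketch, not the lecture's own JavaScript: the names `shift_letters`, `encrypt`, and `decrypt` are made up for illustration, and the `key` is simply how many letters to shift by.

```python
import string


def shift_letters(text: str, key: int) -> str:
    """Shift every ASCII letter by `key` places, wrapping around at z/Z."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            shifted.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            # Digits, spaces and punctuation are left untouched.
            shifted.append(ch)
    return "".join(shifted)


def encrypt(text: str, key: int = 1) -> str:
    return shift_letters(text, key)


def decrypt(text: str, key: int = 1) -> str:
    # Decrypting is just shifting back by the same key.
    return shift_letters(text, -key)
```

With a key of 1, `encrypt("my-diary")` gives `nz-ejbsz`, which happens to match the odd output file name used later in these notes. There are only 25 useful keys here, so anyone can try them all in seconds, which is exactly why this is a very, very bad cipher.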
If you use the same key to encrypt and decrypt, this is known as "symmetric key encryption".
If different keys are used to encrypt and decrypt (usually a 'public' key and a 'private' key), this is known as "asymmetric encryption". Typically, anyone holding the public key can encrypt, but only the holder of the private key can decrypt. Asymmetric encryption is pretty computationally expensive, so it's usually only used to secure small amounts of data, like keys themselves.
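To make the public/private split a bit more concrete, here's a toy sketch using the arithmetic behind RSA, one well-known asymmetric algorithm. The primes 61 and 53 and the exponent 17 are tiny textbook values chosen purely for illustration; real keys are hundreds of digits long, real libraries add padding, and you should never use anything like this for actual security.

```python
# Toy numbers only: this shows the idea, not a secure implementation.
p, q = 61, 53
n = p * q                  # the modulus, shared by both keys
phi = (p - 1) * (q - 1)    # used only while generating the keys

e = 17                     # public exponent: (e, n) is the public key
d = pow(e, -1, phi)        # private exponent: (d, n) is the private key (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone who knows the public key (e, n) can do this."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key (d, n) can undo it."""
    return pow(ciphertext, d, n)
```

The message has to be a number smaller than `n`, which hints at why asymmetric encryption tends to be reserved for small things like keys.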
"RSA" is an example of asymmetric encryption - brought to you by the same people who make those little authentication fobs.
The current gold-standard of symmetric encryption (encryption with a single key) is AES. AES is the only publicly-available encryption method deemed secure enough to secure top-secret information for the U.S. government, so it's pretty solid.
As previously mentioned, despite the availability of powerful encryption algorithms like AES, you should not be attempting to create your own encryption library.
Currently, there are two libraries that are very well-regarded and available for use in most contexts (Node, PHP, .NET, etc.): OpenSSL and libsodium.
If you're wondering what the difference is - OpenSSL is considered more interoperable and "future-proof", while libsodium is considered more "idiot-proof". It's possible to choose older, less secure encryption through OpenSSL, whereas with libsodium, if you've got it working, you're pretty much ok. That being said, libsodium is harder to keep working as security threats evolve over the years and your application grows more complex.
Let's look at a very simple example, where we use OpenSSL through the command line (one of the many places, including most server-side code, that you can use OpenSSL).
openssl enc -aes-256-cbc -e -salt -in my-diary.txt -out nz-ejbsz.txt
Broken down,
`openssl` calls the command line application 'openssl',
`enc` calls the method for using encryption/decryption ciphers,
`-aes-256-cbc` specifies the algorithm with which we will encrypt our data - in this case, AES with a key 256 bits long, using the Cipher Block Chaining mode,
`-e` means we will be encrypting the data,
`-salt` means we will be adding a "salt" to our data - an extra layer of security, which we'll talk about shortly,
`-in` means the next thing in the command will be our input source, the thing getting encrypted,
`my-diary.txt` is the input source, a file in the working directory (i.e. the same folder where this command is run),
`-out` means we're about to specify the output destination, the place where our output data goes, and finally
`nz-ejbsz.txt` is the file where we'll put our encrypted data.
At this point, you're prompted to enter and confirm a password. After that, `nz-ejbsz.txt` gets created (if it didn't already exist).
This:
10/09/2001
Ugh, it's my turn with these stupid 'travelling pants'. I don't care what Lena, Tibby, Bridget, and Carmen say, they totally stretched them out. I hear some lady wrote a book about us, and it's coming out tomorrow. That will definitely be the worst thing that happens in the whole world. The only thing that could possibly be worse is if they wrote me out of the book.
…becomes this:
5361 6c74 6564 5f5f 8456 721a 878a 5dd8
6a46 a2cc d47d 9268 eb7d beac e1ea 4300
c4f9 49d5 138e 27f8 ddbf e4fd bfce 7abc
e75b 7b2c 0241 29f7 459b c47c 9e91 8ac3
e258 90c2 3693 14a1 4a1b 45bc 9883 b16f
8e37 a854 9699 18cb 7660 5033 1c7f 13ca
599f 3687 f2fc 7dda 5d0d 34c9 db33 16eb
d67f d6b6 bfff b142 31ae a451 1095 6213
68ee fa5a a1b1 5795 0870 8fde c081 2e52
5c10 fcd9 a098 580d e49d 8aa5 7eee f703
de39 8028 669f e62c 944c 3fdd 5eb3 5719
2f3a 420a 7ae1 87b5 1ec4 9d78 829d eb93
a3ec 1592 2761 49b0 e78c 8fe9 6b16 f9b6
e9e7 337a fec0 9a2e 504b 14eb e565 f83e
b5f5 b46e f1b9 5d49 6b41 d6d8 909a c478
86fe b1fe efad 5045 c67d 8496 286a ad0d
08e8 8dc3 eb65 0a44 9f6d e40a 2bc8 002f
b4b8 81c1 9b7e f9f7 37fb a037 58bd e5b8
d160 6239 e306 38e5 5e07 f2d8 b962 a968
3a20 bda3 1c09 6239 9c02 af4c 5909 27cb
9bfc b8ab 22fa 7790 20f4 4712 df29 841e
cdc0 d265 b5ec b7f0 dd56 bc73 ace2 eac9
54eb 5f4e 5514 1fc9 3ab0 b2fb ba24 b82b
50ea b7ed 85a7 80f1 339c 1f24 0dea 5e5a
a62f 3dfb 963b e6bc 6c3d e5f6 5a6b 6908
ad4f ca6e 0808 e25f 5adb 0428 f9d0 b41f
d8c4 87ce 5034 368b 4bc5 23b0 7ca1 b62d
fcb4 8e81 2224 60d2 0c24 3fa3 56d7 5154
cbcc e0c3 27af 6572 69e4 1a99 2d0e 9c6d
58c8 2b1a f040 06dc 5e79 64f8 81b4 bdf5
0735 8660 d286 6c8b e642 e225 8e5c e4d7
31c8 25bf dd49 9a5b 2f5b 716d 7669 9d79
071b 827f 728f 3a0b 4300 ae39 5aab c9f9
3296 e315 e895 ee63 d679 5326 16ac 542f
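That wall of hex isn't entirely random, by the way. When a salt is used, OpenSSL's `enc` output starts with the 8-byte marker `Salted__`, followed by the 8-byte salt itself, and only then the encrypted data. A small Python sketch, using just the first line of the dump above (the variable names are made up for illustration), shows this:

```python
# First 16 bytes of the encrypted file, copied from the hex dump above.
first_line = "5361 6c74 6564 5f5f 8456 721a 878a 5dd8"

raw = bytes.fromhex(first_line)   # fromhex() ignores the spaces

magic = raw[:8]     # the marker OpenSSL writes when a salt is used
salt = raw[8:16]    # the random salt, stored in the clear next to the data

print(magic)        # the marker, as bytes
print(salt.hex())   # the salt, as hex
```

The salt isn't secret. It's there so that the same password produces a different key each time a file is encrypted.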
The command to decrypt is almost identical:
openssl enc -aes-256-cbc -d -in nz-ejbsz.txt -out got-your-diary.txt
The only differences are that `-e` becomes `-d`, the input is set to the encrypted file, and the output is set to a new file.
We're prompted for the password again, and this:
5361 6c74 6564 5f5f 8456 721a 878a 5dd8
6a46 a2cc d47d 9268 eb7d beac e1ea 4300
c4f9 49d5 138e 27f8 ddbf e4fd bfce 7abc
e75b 7b2c 0241 29f7 459b c47c 9e91 8ac3
e258 90c2 3693 14a1 4a1b 45bc 9883 b16f
8e37 a854 9699 18cb 7660 5033 1c7f 13ca
599f 3687 f2fc 7dda 5d0d 34c9 db33 16eb
d67f d6b6 bfff b142 31ae a451 1095 6213
68ee fa5a a1b1 5795 0870 8fde c081 2e52
5c10 fcd9 a098 580d e49d 8aa5 7eee f703
de39 8028 669f e62c 944c 3fdd 5eb3 5719
2f3a 420a 7ae1 87b5 1ec4 9d78 829d eb93
a3ec 1592 2761 49b0 e78c 8fe9 6b16 f9b6
e9e7 337a fec0 9a2e 504b 14eb e565 f83e
b5f5 b46e f1b9 5d49 6b41 d6d8 909a c478
86fe b1fe efad 5045 c67d 8496 286a ad0d
08e8 8dc3 eb65 0a44 9f6d e40a 2bc8 002f
b4b8 81c1 9b7e f9f7 37fb a037 58bd e5b8
d160 6239 e306 38e5 5e07 f2d8 b962 a968
3a20 bda3 1c09 6239 9c02 af4c 5909 27cb
9bfc b8ab 22fa 7790 20f4 4712 df29 841e
cdc0 d265 b5ec b7f0 dd56 bc73 ace2 eac9
54eb 5f4e 5514 1fc9 3ab0 b2fb ba24 b82b
50ea b7ed 85a7 80f1 339c 1f24 0dea 5e5a
a62f 3dfb 963b e6bc 6c3d e5f6 5a6b 6908
ad4f ca6e 0808 e25f 5adb 0428 f9d0 b41f
d8c4 87ce 5034 368b 4bc5 23b0 7ca1 b62d
fcb4 8e81 2224 60d2 0c24 3fa3 56d7 5154
cbcc e0c3 27af 6572 69e4 1a99 2d0e 9c6d
58c8 2b1a f040 06dc 5e79 64f8 81b4 bdf5
0735 8660 d286 6c8b e642 e225 8e5c e4d7
31c8 25bf dd49 9a5b 2f5b 716d 7669 9d79
071b 827f 728f 3a0b 4300 ae39 5aab c9f9
3296 e315 e895 ee63 d679 5326 16ac 542f
…becomes this:
10/09/2001
Ugh, it's my turn with these stupid 'travelling pants'. I don't care what Lena, Tibby, Bridget, and Carmen say, they totally stretched them out. I hear some lady wrote a book about us, and it's coming out tomorrow. That will definitely be the worst thing that happens in the whole world. The only thing that could possibly be worse is if they wrote me out of the book.
Now at this point, if we were able to devote the entire semester to encryption workflows, we might build an encrypted Node or PHP application that can store a user's password and credit card info and deepest, darkest secrets, but I have to draw a line at a certain point to make it really clear that what we're learning is not sufficient to build a secure application. To go much further, we'd be hitting a huge jump in the learning curve, where we talk about…
…and that's just to take the next step into a discussion of secure full-stack architecture. This is absolutely not something you'll be tasked with in your first job. Or third or fifth. Bigger places have security specialists, and smaller places use 3rd-party APIs to handle things like online payments and user authorization.
Today, towards the end of the lesson, we'll look at how service providers handle security, and I'll provide you with a reading list of great materials if you want to keep going ahead with learning about full-stack web application security, but, trust me, you've come a long way in a short while if you can read half of today's lesson without your eyes crossing.
Hashing, like encryption, is turning data into something indecipherable. Unlike encryption, however, with hashing, you can never turn the data back.
So is your data just lost forever? Well, kind of. But it's still useful. It's all about asking the right question.
Let's say I were to use one of the most secure hash functions, SHA-256. I hash 'password1234' and get the value b9c950640e1b3740e98acb93e669c65766f6670dd1609ba91ff41052ba48c6f3
b9c950640e1b3740e98acb93e669c65766f6670dd1609ba91ff41052ba48c6f3 is impossible to reverse-engineer to get the value 'password1234'.
'password1234' is effectively gone, and all we're left with is b9c950640e1b3740e98acb93e669c65766f6670dd1609ba91ff41052ba48c6f3 (which I'll start referring to as theHash for brevity's sake).
However, what you can do is check to see if values match theHash. Hashing always returns the same output when provided the same input.
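Here's a minimal sketch of that "same input, same output" behaviour, using Python's built-in `hashlib`. The helper name `sha256_hex` and the variable names are just for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of `text` as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


the_hash = sha256_hex("password1234")     # what we'd keep on file
this_hash = sha256_hex("password1234")    # what we compute at login time
other_hash = sha256_hex("password1235")   # one character different

print(the_hash == this_hash)    # same input, same hash
print(the_hash == other_hash)   # tiny change, completely different hash
```

Notice there's no `unhash` function anywhere. All we can ever do is hash something else and compare.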
What hashing lets you do is avoid storing sensitive data. You never have to store a user's password at all, but you still have the ability to check if the password they typed in is correct.
In other words, the wrong question is "What is the original value of the hash?"
The right question is "does thisHash equal theHash?"
Okay, but doesn't that still leave us vulnerable to credential stuffing, or dictionary attacks, like trying the top 10 million passwords?
I mean, hopefully you're limiting the rate/number of attempted logins, but if someone were able to get around that, then yes! Hackers have some pretty cool, advanced ways of cracking hashes.
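To see why a plain hash is still guessable, here's a rough sketch of a dictionary attack against a stolen, unsalted SHA-256 hash. The four-entry `common_passwords` list is a stand-in for a real "top 10 million" list, and the function names are made up for illustration.

```python
import hashlib
from typing import Iterable, Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dictionary_attack(stolen_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash each guess and see whether it matches the stolen hash."""
    for guess in candidates:
        if sha256_hex(guess) == stolen_hash:
            return guess      # found the original password
    return None               # not in our word list


# A tiny stand-in for a list of commonly used passwords.
common_passwords = ["123456", "qwerty", "password1234", "letmein"]

stolen = sha256_hex("password1234")
print(dictionary_attack(stolen, common_passwords))
```

The attacker never reverses anything. They just ask "does thisHash equal theHash?" millions of times, and because SHA-256 is fast, that doesn't take long.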
Hashing is funny, because it's kinda like the decryption password for each bit of hashed data is the data itself.
We can defend against attacks on stolen hashes, like dictionary attacks and precomputed lookup tables, by adding an additional password to the user's password. This is known as a 'salt'.
A 'salt' is basically an extra password that you generate for a user, added to the end of their password.
This means that two users who pick the same password will still end up with different hashes, and a hacker can't crack a whole list of hashed passwords in one go with a precomputed table.
Before you hash their password, you add the extra password (the salt) to the end of the password they've supplied. You don't save their password, but you do store the salt.
Salts are "cryptographically secure" random strings generated by your encryption library.
This means that a hacker would need to get into your database, defeat your database encryption, get the users' salts, and then perform their dictionary attack against the passwords.
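Put together, the salting steps described above could be sketched like this. It follows the notes literally (salt appended to the password, then SHA-256); the function names and the dictionary "record" are assumptions for illustration. Production systems typically swap the single SHA-256 call for a deliberately slow function such as `hashlib.pbkdf2_hmac` or `hashlib.scrypt`, so treat this as a teaching sketch only.

```python
import hashlib
import hmac
import secrets


def new_salt() -> str:
    """Generate a cryptographically secure random salt (32 hex characters)."""
    return secrets.token_hex(16)


def hash_password(password: str, salt: str) -> str:
    # The salt is added to the end of the password before hashing.
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def register(password: str) -> dict:
    """Return what we'd store: the salt and the hash, never the password."""
    salt = new_salt()
    return {"salt": salt, "hash": hash_password(password, salt)}


def verify(password: str, record: dict) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    attempt = hash_password(password, record["salt"])
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(attempt, record["hash"])


alice = register("password1234")
bob = register("password1234")
print(alice["hash"] == bob["hash"])     # same password, different hashes
print(verify("password1234", alice))    # correct password checks out
```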
At this point, you've made it much easier for hackers to simply cold-call people and say, "uh, hey, it's… Bill, from the I.T. department - can you tell me your username and password?". In other words, your job is done.
Okay, at this point you've probably noticed that administrative passwords are still important for authenticating who is allowed to decrypt things, verify passwords, etc.
Obviously, you don't want to have to type in a password every time a user creates an account, so those passwords (or keys) need to be stored somewhere. But if a hacker gets into an encrypted database, and the encryption key is stored in that same database, it's like leaving the key to your house just sitting in the lock.
If you're in charge of your own hardware, there are extra-secure devices called "hardware security modules" that you can use to store keys and perform encryption, instead of doing it on your regular server. If someone breaches your server, they still have to get into the hardware security module, too.
Similar devices can also come in the form of small, portable keys for the security-minded person on the go.
At the very least, you should be storing your keys and administrative passwords on a separate server from the encrypted data. That way, a hacker would need to hack into both your web server and your database server.
As I mentioned before, smaller shops (and big ones, too!) will often choose not to handle sensitive data, like credit card transactions, themselves, and hand it off to a third-party service instead. The most popular option for this is PayPal. The second most popular is called Stripe. Today we're going to talk about how Stripe secures their data (because Stripe's documentation is much nicer to read).
HTTPS and HSTS for secure connections
Hey, we know how to do that! HTTPS just means having an SSL certificate, and HSTS just means setting your `Strict-Transport-Security` header.
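As a quick refresher, setting that header might look something like this in a bare-bones Python WSGI app. The one-year `max-age` and the `includeSubDomains` directive are common choices rather than anything these notes prescribe, and `hsts_value` and `app` are made-up names for illustration.

```python
from typing import Callable, Iterable

# One year, in seconds: a commonly used max-age (an assumption, not a requirement).
ONE_YEAR = 31_536_000


def hsts_value(max_age: int = ONE_YEAR, include_subdomains: bool = True) -> str:
    """Build the value of the Strict-Transport-Security header."""
    value = f"max-age={max_age}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


def app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """A tiny WSGI app that attaches the HSTS header to every response."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Strict-Transport-Security", hsts_value()),
    ]
    start_response("200 OK", headers)
    return [b"hello over https\n"]
```

Browsers only honour this header when it arrives over an https connection, so it goes hand in hand with the certificate.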
Encryption of sensitive data and communication
All card numbers are encrypted at rest with AES-256. Decryption keys are stored on separate machines. None of Stripe’s internal servers and daemons can obtain plaintext card numbers but can request that cards are sent to a service provider on a static allowlist. Stripe’s infrastructure for storing, decrypting, and transmitting card numbers runs in a separate hosting environment, and doesn’t share any credentials with Stripe’s primary services (API, website, etc.).
Okay, we don't know how to do all this stuff, but after today's lesson we at least know what it all means.
Vulnerability disclosure and reward program
Stripe maintains a private, invite-only bug bounty program, with the assistance of HackerOne.
This is pretty cool, and a not uncommon technique. "Bug bounties" are rewards that companies offer for finding security problems. In other words, they pay hackers for telling them they're vulnerable.
Sometimes bug bounties are public, and sometimes they're run through an organization like HackerOne, which has paid out more than $100 million in bounties.