Storing (and Attacking) Passwords with PHP (PHPBenelux Conference 2013)

Presentation

Video

Transcript

Excerpt from the automatic transcript of the video generated by YouTube.

So I'm going to talk to you today about password storage and attacking. Yeah, it was pointed out to me that there is a spelling mistake. I'm sorry, spelling is definitely not my strong suit. So this talk, I'm also basing it around a project that

I did specifically for this talk, called password bad web app. It's a bad web application, meaning it has intentional vulnerabilities, and it's designed for educational purposes. So you can go out on GitHub right now, check this out, and follow

along. It doesn't require anything except Apache. It does have a few dependencies that you can install. We're going to look at it a little bit here, but I definitely encourage you to go play around with it after the talk. Apache isn't necessary,

some web server is. And this is what the basic login page is. Think of it like a blog engine: it has extremely primitive login functionality, and it tells you you've logged in. There's no session management, there's no anything,

all it's doing is a quick check on the POST. So it's really designed for experimenting with password hashing. Moving along, let's start from the very, very beginning: what if we use plaintext to store passwords? Obviously, it's storing

them in plaintext, right in the database. In the case of the bad web app it's using a flat file, the data file, but it's storing them in plaintext. What's wrong with that? Can anyone see any problems here? No? No problems? What happens if we have an SQL

injection vulnerability? As part of this bad web app I have a simulated SQL injection vulnerability that simulates SELECT * FROM users, so I can show you what that looks like here. So we see, I just put a little bit of flair, "injected", here to let you know

it's different from the main page. We have our two posts, and then we have user 1 and user 2 and their plaintext passwords. So this is extremely bad, because every single user on your system is immediately exposed to any attacker. Pretty obvious. I hope I don't

need to go too deep into that, and if I do, come talk to me afterwards. So the problem here is that any attack vector that actually gets into your system, so everything beyond a simple cross-site scripting attack, leaks all user credentials. So we

can do better. So if we use MD5, that's the next logical step beyond plaintext: let's just hash this password. It uses the MD5 cryptographic hashing function. What's a hash, though? Think of a hash as a fingerprint. If I want to know who you are and

I take a picture, I can then go around the room with that picture and look and find that, hey, that's you. Whereas with a fingerprint, I need to actually get access to you in order to see if it's the same. And that's the same kind of thought process that

a cryptographic hash uses, in that it's very, very trivial, if I have the original, to get to this fingerprint, but it's very, very difficult, if I have this fingerprint, to get back to the original. Now, there are some attack methods that we can use, and we'll

talk about those in a minute, but just from a mathematical perspective it's very, very inefficient to reverse them. But there's also another key property that we'll come to later: they should also be very efficient to compute. It should be

very easy to take your fingerprint, or take the fingerprint of that password. So what's the problem if we have MD5? The SQL injection vulnerability from before still gives us that hash. But since the hash is one-way, how can we attack it? The first method, and the

easiest method, is with a lookup table. And you use a lookup table every single day: Google. Paste an MD5 into Google and, not all the time but a lot of the time, you will get back a result. So this hash here is the MD5 of "password". Simple lookup tables are incredibly CPU efficient.

They're also incredibly storage inefficient, because you need to have one line with the hash and one line with the password for every single hash that you do. So it's very much on the CPU side of a time-memory trade-off. So to do all passwords

less than or equal to 7 characters, it requires about one and a half petabytes of storage. I mean, just try to think of that kind of scale. And if you were to try to use these practically, only incredibly simple passwords would fall. So we can do better.
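The lookup table attack described above can be sketched roughly as follows. This is an illustration in Python rather than the talk's PHP, and it is not code from the bad web app: the function names, the toy three-letter alphabet, the 95-character printable ASCII alphabet, and the 23 bytes per entry (a 16-byte raw MD5 plus up to 7 password bytes) are all assumptions made for the example. Under those assumptions the size estimate lands in the same range as the figure quoted in the talk.

```python
import hashlib
from itertools import product


def md5_hex(password):
    """Return the MD5 fingerprint of a password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def build_lookup_table(alphabet, max_len):
    """Precompute hash -> password for every password up to max_len characters."""
    table = {}
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            table[md5_hex(candidate)] = candidate
    return table


def estimate_table_bytes(alphabet_size=95, max_len=7, bytes_per_entry=23):
    """Rough storage estimate; all three defaults are assumptions, not talk data."""
    entries = sum(alphabet_size ** n for n in range(1, max_len + 1))
    return entries * bytes_per_entry


# Toy table: every password of 1 to 3 characters over the alphabet "abc".
table = build_lookup_table("abc", 3)

# "Cracking" a leaked hash is then a single dictionary lookup.
leaked_hash = md5_hex("cab")
print(table.get(leaked_hash))

# Estimated size of a full table for passwords of up to 7 characters, in petabytes.
print(estimate_table_bytes() / 1e15)
```

The lookup itself is a single hash-table probe, which is why the attack is cheap on CPU. The cost is entirely in the number of stored entries, which grows exponentially with password length.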

Enter the famous rainbow table. Now, the way a rainbow table solves this problem is by shifting that time-memory trade-off more in favor of less memory, but we're going to use the CPU more. So we take a seed. To start off, let's

guess a random password. We run that through our hash function, and now we have a hash. Then we take a reduce function, which is basically a one-way function that takes a hash as an input and produces something that looks like a password as the output, and we run

that through the hash again. So we hash the input, reduce, and that generates a new password, and then we continue on this chain. In the rainbow table we chain a whole bunch of these together. Sometimes you can do a chain length of a hundred, you can do a chain length of 20,000,

you could do a million hashes as the chain length. So when we want to attack a hash, we have a hash, and what we first do is look at the end of the table: do we have a match? No? OK, run it through reduce and check again. No? Run it through reduce, run it through

reduce again and check again. And you keep iteratively checking up to the chain length. So if the chain is a thousand operations long, it can take a thousand operations to check whether a hash is in the rainbow table. So it's very

CPU inefficient, but it's very space efficient: it uses much, much less storage. So that one and a half petabytes drops down to 64 gigabytes, and this lookup takes roughly a second and a half, whereas the lookup in the lookup table takes microseconds.
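The chain building and lookup described above can be sketched as a toy example. Again this is Python rather than PHP, and everything in it is an assumption made for illustration: the three-lowercase-letter password space, the chain length of 50, the seeds, and the particular reduce function. It also mixes the column number into the reduce step, as rainbow tables conventionally do, which goes slightly beyond the single reduce function mentioned in the talk.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # toy password space: 3 lowercase letters
PW_LEN = 3
CHAIN_LEN = 50  # hypothetical chain length, chosen only for the example


def md5_hex(password):
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def reduce_hash(digest_hex, column):
    """Turn a hash into something that looks like a password.

    This is not an inverse of MD5, just a deterministic mapping
    from hashes back into the password space.
    """
    n = int(digest_hex, 16) + column
    chars = []
    for _ in range(PW_LEN):
        n, idx = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[idx])
    return "".join(chars)


def walk(password, start_col, end_col):
    """Apply hash-then-reduce for columns start_col .. end_col - 1."""
    for col in range(start_col, end_col):
        password = reduce_hash(md5_hex(password), col)
    return password


def build_table(seeds):
    """Keep only endpoint -> seeds; the middle of each chain is thrown away."""
    table = {}
    for seed in seeds:
        table.setdefault(walk(seed, 0, CHAIN_LEN), []).append(seed)
    return table


def lookup(table, target_hex):
    """Try each column the target hash could sit in, last column first."""
    for col in range(CHAIN_LEN - 1, -1, -1):
        endpoint = walk(reduce_hash(target_hex, col), col + 1, CHAIN_LEN)
        for seed in table.get(endpoint, []):
            candidate = walk(seed, 0, col)  # replay the chain up to that column
            if md5_hex(candidate) == target_hex:
                return candidate  # verified match, not a false alarm
    return None


table = build_table(["aaa", "abc", "xyz"])
on_chain = walk("abc", 0, 10)  # a password from the middle of a stored chain
print(lookup(table, md5_hex(on_chain)) == on_chain)
```

Only three table entries are stored here, yet any password that lies on one of the three chains can be recovered, at the price of up to CHAIN_LEN hash-and-reduce steps per column guess. Passwords that happen not to lie on any chain are simply missed.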

And it's also worth noting that it's probabilistic. Because we're using a reduce function, we're not actually iterating over every possible password. We design it in such a way that with a reasonably sized table we're going to theoretically hit

every password, so that's why we say most passwords instead of all passwords. So how do we defend against a rainbow table? Well, the most common method is to add a dose of salt. So, salted MD5: this is just another branch in Git. It uses the MD5 cryptographic hash

[ ... ]

Note: the other 3,212 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.