PyCon 2014

The data structures of the Python standard library

Brandon Rhodes

Transcript

Excerpt from the automatic transcript of the video generated by YouTube.

There needs to be more musical theater at Python, at PyCon. Scary thought. Ladies and gentlemen, just a quick reminder before we get started this morning: if you have a device that makes annoying noises while the rest of the room is silent, it'll probably save you a fair bit of embarrassment if you made it silent now, so please do that. But you can keep talking amongst yourselves until we actually get started; don't be quiet on my part. It's on its way to closing, right? No, no, it's closing, just people are pushing, people are pushing it open. Well, cameraman, are we ready? Great.

Hello everyone, welcome to the morning session of the Friday of PyCon 2014 here in Montreal. It's great to see so many of you in this room. Our first presenter today is here at his seventh PyCon. He's been using Python since '97 or '98, he's not quite sure, and tries to share everything he learns with the community that has given him such a great programming language. I think you've chosen a really good presentation and presenter to come and see first here at PyCon. Please welcome Brandon Rhodes.

Well, thank you. Welcome, everybody, to PyCon. I'm glad that you've all made it safely, and I wish safe travels to any who haven't arrived yet. I'm Brandon Rhodes. This is

my talk, All Your Ducks in a Row: Data Structures in the Standard Library and Beyond. It is sort of a beginner-to-intermediate-level talk, because very often people complain that they don't hear until very much later about useful data structures hiding in the standard library. There's a lot of them that I didn't know about, and so this tries to get them all into one slide deck and all into one place.

What are the purposes of this talk? In case you're thinking of walking next door to one of the other excellent talks, it's good for me to put this at the beginning. Three main points: how do data structures work, how are they actually implemented in memory, and why does that make some operations efficient and some slow? What can they do efficiently for us? And then finally, I will reveal which data structure is Python's most dangerous. In fact, I will reveal it right now, in case you really want to be listening to one of the other excellent speakers: it is the list. To find out why, you'll have to wait a few

minutes.

Computer memory is an array of bytes: little integers that have eight binary digits and hold a value from 0 to 255. Bytes in memory are named by sequential integers called addresses. The processor in your laptop or tablet asks for byte number five and gets it back from memory. To save space I'll be writing bytes in rows of eight, but they are simply a continuous sequence with no break in the computer's hardware.

Now, RAM chips are the one massively parallel computing device in your computer, which is what provides you random access to a billion different locations without having to wait for a billion locations to be visited. It has a grid of lines that it uses: when you throw an address at your memory chip, it makes like a billion simultaneous decisions, so that all 999 million locations that aren't being asked for their data at the moment kind of put their heads down, and the one location whose address lines have just all lit up green puts its value on the outgoing data bus. A massively, massively parallel computation that gives you a billion answers at once: that's how you can get to memory locations with equal speed.

This talk will ignore a lot of the complexities that now exist between the chip, your processor, and RAM, both the question of

compression: the bus to RAM is now so slow compared to your processor that many researchers are compressing their arrays in memory, because the processor has ages to decompress a block once it finally comes in from RAM. And I'm going to ignore the memory hierarchy, but I strongly recommend you see Gustavo Duarte's blog post "What Your Computer Does While You Wait" and Dan Luu's very recent blog post on how misaligning your data can increase performance twelve times, along with a sample script where you can see that work.

Why do I ignore those issues in what is supposed to be a fairly technical talk? Because the standard library's lists and dictionaries don't pay attention to memory hierarchy very much at all, and so I will follow suit.

All right, I was

talking about addresses, these numbers that are used to index memory locations. Since they are integers, you can add to and subtract from one address to visit another address that sits near it. Address arithmetic, then, is used to support the two fundamental data structures at the machine level, which I will call records and arrays.

A record is just when you put some fields, some information, in an agreed order in memory. Every Python object in your CPython memory store always starts its record with eight bytes of reference count on a 64-bit machine, then an 8-byte address of the type: you know, the integer type that has all the integer methods, or the floating-point type that has all the floating-point methods. So this is, I checked, how an int is actually laid out on my 64-bit machine at home: 24 bytes, of which only eight actually contain the integer. Floats are laid out similarly but use a different format, so they can store a floating-point number in their payload.

Some Python data structures are of variable length. Here is one way of storing a string: it has six different eight-byte fields before it then actually includes the payload, the actual string that I want to store. Given a record, you retrieve a field by adding the field's offset to the record's start address. Again, very simple arithmetic.
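Editor's sketch: the record idea described above can be imitated in pure Python with the standard `struct` module. This is only a simulation under stated assumptions, not how CPython itself is inspected: the `memory` bytearray stands in for RAM, and the field names, the start address 40, and the fake type address are all made up for illustration. The 24-byte layout (8-byte reference count, 8-byte type address, 8-byte payload) follows the description in the talk.

```python
import struct

# Hypothetical 24-byte record modeled on the layout described in the talk:
# 8-byte reference count, 8-byte type address, 8-byte integer payload.
# "<" means little-endian with fixed field sizes and no padding.
RECORD = struct.Struct("<qQq")

# Offset of each field from the start of the record (names are illustrative).
FIELD_OFFSETS = {"refcount": 0, "type_address": 8, "value": 16}

# Simulated memory: a flat array of bytes, each holding a value from 0 to 255.
memory = bytearray(64)

# Store one record at an arbitrary address (40 is an assumption).
record_start = 40
RECORD.pack_into(memory, record_start, 1, 0x7F00DEADBEEF, 12345)


def read_field(memory, record_start, field):
    """Fetch a field by adding its offset to the record's start address."""
    address = record_start + FIELD_OFFSETS[field]
    # The type address is unsigned; the other two fields are signed.
    fmt = "<Q" if field == "type_address" else "<q"
    return struct.unpack_from(fmt, memory, address)[0]


if __name__ == "__main__":
    print(RECORD.size)                                 # size of the whole record
    print(read_field(memory, record_start, "value"))   # the payload field
```

Here `read_field(memory, record_start, "value")` reads the eight bytes at address 40 + 16 = 56 and returns 12345. On a typical 64-bit CPython build, `sys.getsizeof(1.0)` should likewise report 24 bytes for a float, though the exact size of an `int` varies between Python versions.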

[ ... ]

Note: the remaining 2,853 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.