DjangoCon 2015

Improve PostgreSQL performance in just 15 minutes

Josh Berkus

Transcript

Excerpt from YouTube's automatic transcription of the video.

I'm going to talk a little bit about some basic things you can do to improve Postgres performance for your Django application. We've got our running elephant here. Few people know that elephants can run at 20 to 25 miles an hour; they're actually

quite fast, and Postgres can do thousands of requests per second. So if you're not getting thousands of requests per second, there's probably a reason why. So what you do is you log into Postgres, and you go into postgresql.conf, and you set the hidden parameter

"go faster" to ten, and then you save it and you're done. OK, so we're done here. Do we have questions? No, seriously, unfortunately it's not that easy. In the Postgres development world we actually spend a lot of time talking about how

can we make things faster. As a matter of fact, we spend more time talking about that than just about anything else, except maybe the commit queue and whose turn it is to review things. And as a result, if there was something that we could do by default to make Postgres

faster, we probably already did it. So the things that you're going to do to make Postgres faster are going to require work; you can't just change a few configuration settings. In fact, one of the things that I do to get paid is I do a lot of performance

tuning and performance engineering on different sites, and this is more or less how I spend my time. It varies a lot per site, but more or less this is how I spend my time, and you'll notice that tuning the configuration is a pretty small minority of how I spend my time,

and the effect that tuning the configuration has is an even smaller minority. As a matter of fact, sometimes, in some environments, changing the postgresql.conf settings will have no measurable effect at all on database throughput. So instead we're

going to talk about some of the other things you can do, which have much larger effects on database throughput and responsiveness. The first one is: do less. The fastest database request is the one that you don't make at all. Anytime that you're adding

a piece of code that's going to work with data, anytime you're referencing the ORM or whatever, you ask first of all: do I really need the database to answer that? Is this something that I could be answering in the individual Django session

without calling out to the database? Now, one of the things that we're talking about there is caching, of course. The obvious one is to just look in the results cache and actually make use of the results cache. That seems obvious, but for some reason

people don't do it as much as they could. If the results cache isn't enough, because you actually need to share things among several different backends, then you can use things like Redis and memcached. Also, don't forget about using CDNs for caching

large objects, even if the large objects are being stored in the database. There are some Django systems out there where people are storing images in the database, they're storing compressed data, and that sort of thing. With that data, when you've retrieved

it once, copy it out of Postgres, load it up into a CDN, and reference it in the CDN by file name instead of retrieving it from the database all the time, because it's a lot easier to scale a CDN than it is to scale a relational database. So the

things that you don't need a relational database for, scale them elsewhere. So caching is sort of our first part of this: do as much caching as you reasonably can with your concurrency and data consistency model. The other thing that we see a

lot in doing less is actually some anti-patterns, better exhibited through common mistakes that people make, one of which is polling. Now this is a very simplified example, but this is something I see a lot with Celery-based apps and others, which is: let's

have every backend poll the database as fast as it can, that is, polling for new jobs with no wait or with a little tiny wait like 10 milliseconds. Well, this generates thousands to hundreds of thousands of database requests per second, frequently the majority

of your traffic. It's not a good thing to do. Another anti-pattern is requesting data you already have. Believe it or not, this is from a real-life example: let's look up users by the user ID and then return the user ID, more or less the SQL equivalent

of SELECT id FROM users WHERE id = something. I see that a lot. The database does not need to be in the loop here at all; you can save yourself that request. Also, data you don't need: for example, returning an entire table in order to get the first row looks

great when there's only one row in the table. When there are 10 million rows in the table, it does not work so well. So avoid some of this. If you have very wide tables, use some of the values_list methods in order to return only the columns you need, particularly

[ ... ]
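
Editor's sketch (not part of the talk): the polling anti-pattern described in the excerpt, where every backend asks the database for new jobs with little or no wait, can be softened by backing off while the queue is empty. The version below is a minimal illustration in plain Python under stated assumptions: `poll_with_backoff`, `fetch_job`, `handle_job`, and the wait values are hypothetical names and numbers chosen for the example, and `fetch_job` stands in for whatever single database query the application would run.

```python
import time
from typing import Callable, List, Optional


def poll_with_backoff(
    fetch_job: Callable[[], Optional[str]],
    handle_job: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    min_wait: float = 0.5,
    max_wait: float = 30.0,
    max_polls: int = 100,
) -> List[float]:
    """Poll for jobs, doubling the wait after every empty poll.

    fetch_job is a hypothetical callable that returns a job, or None when
    there is nothing to do; in a real app it would issue one database query.
    Returns the waits (in seconds) that were slept, so callers can inspect them.
    """
    wait = min_wait
    waits: List[float] = []
    for _ in range(max_polls):
        job = fetch_job()
        if job is not None:
            # Work arrived: process it and reset to the shortest wait.
            handle_job(job)
            wait = min_wait
            continue
        # Queue is empty: sleep, then back off, capped at max_wait.
        sleep(wait)
        waits.append(wait)
        wait = min(wait * 2, max_wait)
    return waits
```

With these assumed defaults, an idle backend settles at one request every 30 seconds instead of issuing requests continuously, and it returns to the short wait as soon as a job shows up. Passing `sleep` in as a parameter is only there so the loop can be exercised without real delays.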

Note: the remaining 2,378 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.