Memory Management in Ruby Applications (Garden City Ruby 2014)

Presentation

Video

Transcript

Excerpt from the automatic transcript of the video generated by YouTube.

HARI KRISHNAN: So, thank you very much for being here on a Saturday evening, this late. My talk got pushed to the last, but I appreciate you being here, first. My name's Hari. I work at MavenHive. So this is a talk about the Ruby memory model. So before I start, how many of you have heard about memory models and know what one is? Show of hands, please.

OK. Let's see where this talk goes. So why did I come up with this talk topic? So I started my career with Java, and I spent a good many years with Java, and Java has a very clearly documented memory model. And it kind of gets to you because with all that, you don't feel safe enough doing multi-threaded programming at all.

So with Ruby, we've always been talking about, you know, doing multi-process for multi-process parallelism, rather than multi-threaded parallelism, even though the language actually supports, you know, multi-threading semantics. Of course we know it's called single-threaded and all that, but I just got curious, like, what is the real memory model behind Ruby, and I just wanted to figure that out.

So this talk is all about my learnings as I went through, like, various literatures, and figured out, and I tried to combine, like, get a gist of the whole thing. And cram it into some twenty minutes so that I could, like, probably give you a very useful session, like, from which you can further do more digging on this, right.

So when I talked to my friends about memory model, the first thing that comes to their mind is probably this - heap, heap, non-heap, stack, whatever. I'm not gonna talk about that. I'm not gonna talk about this either. It's not about, you know, optimizing your memory, or searching for memory leaks, or garbage collection.

This talk is not about that either. So what the hell am I gonna talk about? First, a quick exercise. So let's start with this and see where it goes. Simple code. Not much to process late in the day. There's a shared variable called 'n', and there are a thousand threads over that, and each of those threads wants to increment that shared variable a hundred times, right.
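The slide code is not part of this transcript. A minimal sketch of the exercise as described might look like the following; the method name `run_counter` and its keyword arguments are assumptions for illustration, not the speaker's code.

```ruby
# Hypothetical reconstruction of the exercise described in the talk.
# `run_counter`, `thread_count` and `increments` are illustrative names.
def run_counter(thread_count: 1000, increments: 100)
  n = 0 # the shared variable, captured by every thread's block

  threads = Array.new(thread_count) do
    Thread.new do
      # Unsynchronized read-modify-write on the shared variable.
      increments.times { n += 1 }
    end
  end

  threads.each(&:join) # wait for every thread before reading n
  n
end

puts run_counter
```

The printed number depends on the Ruby implementation and on thread scheduling, so no expected output is shown here. The only thing the sketch guarantees is that the result cannot exceed `thread_count * increments`.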

And what is the expected output? I'm not gonna question you, I'm just gonna give it away. It's 100,000. It's fairly straightforward code. I'm sure all of you have done this, and it's no big deal. So what's the real output? MRI is very faithful, it gives you what you expected.

100,000, right. So what happens next? I'm running it on Rubinius. This is what you see. And it's always going to be a different number every time you run it. And that's JRuby. It gives you a lower number. Some of you may be guessing already, and you probably know it, why it gives you a lower number.

So why all this basic stupid code and some stupid counter over here, right? So I just wanted to get a really basic example to explain the concept that increment is not a single instruction, right. The reason why I'm talking about this is, I love Ruby because the syntax is so terse, and it's so simple, it's so readable, right.

But it does not mean every single instruction on the screen is going to be executed straight away, right. So at least, to my junior self, this is the first advice I would give, when I started, you know, multi-threaded programming. So at least three steps.

Load, increment, store, right. That's even for a really simple piece of code like, you know, a plus-equals, right. So this is what we really want to happen. You have a count, you load it, you increment it, you store it. Then the next thread comes along.

It loads it, increments it, stores it. You have the next result which is what you expect, right. But we live in a world where threads don't want to be our friend. They do this. One guy comes along, reads it, increments it. The other guy also reads the older value, increments it.

And both of them go and save the same value, right. So this is a classic case of lost update. I'm sure most of you have seen it in the database world. But this pretty much happens a lot in the multi-threading world, right. But why did it not happen with MRI? And why did you see the right result? That, I'm sure a lot of you know, but let's park that question and just move a little ahead.
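The two orderings described above can be replayed deterministically without real threads. The sketch below is an illustration under that assumption: the locals `a` and `b` stand in for each thread's private copy of the loaded value, and both method names are hypothetical.

```ruby
# The order we want: thread A finishes load/increment/store
# before thread B starts.
def sequential_increments
  count = 0

  a = count  # thread A: load  (reads 0)
  a += 1     # thread A: increment
  count = a  # thread A: store (count is now 1)

  b = count  # thread B: load  (reads A's stored value, 1)
  b += 1     # thread B: increment
  count = b  # thread B: store (count is now 2)

  count
end

# The lost update: both threads load before either one stores.
def interleaved_increments
  count = 0

  a = count  # thread A: load (reads 0)
  b = count  # thread B: load (also reads 0, the older value)
  a += 1     # thread A: increment
  b += 1     # thread B: increment
  count = a  # thread A: store (count is now 1)
  count = b  # thread B: store (count is still 1, A's update is lost)

  count
end

puts sequential_increments  # prints 2
puts interleaved_increments # prints 1
```

Both methods perform two increments, yet the interleaved version ends at 1 because the second store overwrites the first with the same value.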

So, as you observed earlier, there's a lot of reordering happening in instructions, right. Like, the threads were context-switching, and they were reordering statements. So where does this reordering happen? Reordering can happen at multiple levels. So start from the top.

[ ... ]

Note: the remaining 2,282 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.