DrupalCon Prague 2013

Don't Mistreat Your Servers

Marji Cermak

Presentation

Video

Transcript

Excerpt from the automatic transcript of the video generated by YouTube.

Okay, please take your seats, let's start; the bell is tolling. Hello guys, welcome to my presentation, "Have you been stalking your servers?" Two things I would like to say at the beginning. First, this is a beginner-level presentation, so if you know about monitoring

you would probably get bored, so I won't take any offense if you decide to leave during the presentation, because time is precious. The second one: my business partner and friend was presenting a session at DrupalCon Sydney [unclear], and he realized that there

was one guy sleeping in front of him during the presentation, and he got really nervous because he thought: oh my god, I'm being boring, you know, people are sleeping. Once again, I was born in the Czech Republic, beautiful; I know that Czech beer is beautiful, so if you

fall asleep, I will take it as you had a really good night last night, and I will take no offense. You need your beauty sleep. So, just to introduce myself: I have been a sysadmin for 13 years. I'm from the Czech Republic, lived in Prague for many years, moved to Sydney

10 years ago, and I think when I started working for the biggest telco in Australia, which was about four years ago, I became a DevOps engineer, because we were doing a lot of Puppet and continuous integration. Two years ago I set up a company called Morpht

together with my colleague, and we do Drupal services, Puppet, deployment to the cloud, beautiful stuff. So monitoring is not my passion number one, but by talking to our clients I realized that it's very much needed, and I don't like going to servers without

having any data to investigate. So this is just the bare minimum you need to know if you want to start with monitoring. This is my first big presentation like this, so when I was doing a little research on how to start, I realized that there is this one rule, the

rule of three, that everybody recommends. Apparently three is perfect: "omne trium perfectum", everything that comes in threes is perfect. So even if you are presenting, we are supposed to give you three things. So, two approaches: I will tell you

the three things I want you to remember when you are leaving this room, or: this is how it was done in the past, this is how it's done now, and this is the future. So I would like you guys to know what monitoring is and why you want to start monitoring

if you haven't yet. Then we will look at what tools are here, ready for you to use even if you haven't been exposed to them before. And third, I will show you how easy it is to start, and I believe that you will be able to do that yourself this

afternoon. So, part one: what's monitoring and why you want to monitor. A broken glass. I worked at a brokerage company here in Prague, I think it was in 2001, and they wanted to get access to the New York Stock Exchange to be able to offer Czech

customers trading on the New York Stock Exchange. The bosses went to the US trying to get an API. They didn't get any API, so they got a web interface which was meant to be used by a person, brought it back to Prague and said: guys, you have to parse

the HTML and basically hook our own website, which our clients trade through, to the system, to pass it to the New York Stock Exchange and then back. So we spent maybe two months developing this: SSL sockets, JavaScript parsing, it was a lot of fun. We started

trading, it's like 2 p.m. in the afternoon because of the time difference, the New York Stock Exchange was opening at 2 p.m. Prague time, and five minutes later I have these three sales guys behind my back saying: we are not trading, something's broken, we are losing

money. So we looked at the website and it had completely changed. Apparently the provider changed their web interface without telling us. Why would they, right? Because we were parsing their HTML. So I realized that we had completely underestimated the monitoring part. What

we could have done is, for example, have a little robot, you know, trying to sell one little ticker every five minutes, just doing a little operation every five minutes. Had we known in the morning that it was broken, we would have had much

more time to fix it. So I had three really bad hours of my life with these sales guys behind my back. I managed to fix it, but I don't want to do that again, so I started monitoring. I don't want to scare you with this definition. This is the

only definition I have, but I really like it because it's actually not from the IT area, it's from the nature conservation area, and it says exactly what monitoring is: it's a series of observations in time, carried out to show the extent of compliance

or degree of deviation from an expected norm. I really like that one. So I'm going to offer you a few reasons why you want to monitor. The first one is, you want to know the bad news before your customers do, or at least before your boss does, as in my story. You want to scale

up your servers in advance: if you are going to run out of disk space, or your CPU is getting hot, or you use swap a lot, you want to know that. You want to tune up your application: maybe there are extra modules enabled recently, maybe you have more

customers now and the application became slow and you don't know it, so you want to monitor the response time. You want to prove your uptime to your customers. Let me take just one slight detour: this is called five nines. It's actually an unusual

unit you can find on Wikipedia, together with the Sydney Harbour as a volume measurement, or a bus as a measure of how many people, used in London apparently. So five nines would be five minutes of downtime per year, which is like six seconds a week. These numbers are very often in SLAs

[unclear], and management wants to see them, and when we have our services in the cloud we see these numbers. But I think very often, I guess, they wish that we don't know what this really means. I read somewhere that this five nines, which is the six

seconds a week of downtime, is actually, if you have a power grid supporting a city, considered an uninterrupted service. But I think we remember Google being down for two minutes three weeks ago, or Amazon having some troubles, like 30

minutes long, as well, recently. So let's go back to the reasons I can see. You want to minimize your downtime; it's expensive when you are not up and running. Also, maybe you want to capture your customers' behavior: you know, maybe they trade or use your

application during the lunch break or on Sunday afternoon, and you want to know that to be ready. Maybe you had an ad running and you want to analyze the success of the ad. You want to have data to diagnose: when something happens, you want to be able to go back and

see what actually happened. So, having some data, I will show you a few examples I find interesting. You can watch out for trends: here we have tape usage, and you can see it's growing slowly over the two or three years, so I can kind of see the speed and

I know when I will need another tape. So that's a trend I can see from here. Then I can watch out for spikes. This is a load average monthly graph, and I can see there is a spike every week. It's probably Sunday night, when there is the weekly cron job running,

maybe backing up data, dumping a MySQL database, gzipping log files. I want to know that there are these CPU spikes, because maybe my application is hurt at that time, and if there were some clients they would have below-average performance. You can watch

[ ... ]

Note: the other 3,610 words of the full transcript have been omitted to comply with YouTube's "fair use" rules.