

Thursday, March 12, 2015

The real problem behind highly transactional applications

An architecture that must handle at least 10,000 concurrent connections is facing the classic C10K problem. The term may sound like last decade's news, but it still breaks servers, architectures, and configurations, and it gives sysadmins real headaches. The cause is not always legitimate traffic, either: a basic DDoS attack rests on the same concept, lots and lots of new connections to the same service.

Today, driven by the need to connect and share resources across infrastructures and to implement high availability, many companies have adopted SOA or multi-layer solutions. These architectures can be very handy, but they become a problem when implemented incorrectly: without a proper testing suite, teams often don't know whether the architecture will behave correctly, or even behave the way the developers planned. This affects not only badly configured architectures but also solutions that were never planned to grow.

The usual culprit is errors in coding and validation at every layer of the solution: proprietary code, web server, application server, DBMS, and so on. If applications were coded properly, the security and bug-hunting folks would be unemployed by now.

So what are you going to see on a highly transactional server with a misconfiguration problem? (A small diagnostic sketch follows the list.)

  • Lots of TIME_WAIT connections.
  • Lots of CLOSE_WAIT connections.
  • Possibly memory problems.
  • Possibly the system swapping.
  • A really slow server.
  • Many timeouts in the application log.
  • The application becoming unreachable.
  • No way to create new connections to the server, not even SSH ones.
  • ... and in the worst-case scenario, dead servers.
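
As a quick way to spot the first two symptoms, here is a minimal, Linux-only diagnostic sketch in C that counts TCP socket states by parsing /proc/net/tcp (roughly what ss -tan or netstat -tan would show). It assumes the classic procfs layout; a production version would also read /proc/net/tcp6.

    /* Count TCP socket states by parsing /proc/net/tcp (Linux only).
     * Hypothetical diagnostic sketch; the state codes come from the
     * kernel's include/net/tcp_states.h (0x01 ESTABLISHED,
     * 0x06 TIME_WAIT, 0x08 CLOSE_WAIT). */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/net/tcp", "r");
        if (!f) {
            perror("fopen /proc/net/tcp");
            return 1;
        }

        char line[512];
        unsigned counts[16] = {0};

        fgets(line, sizeof line, f);        /* skip the header line */
        while (fgets(line, sizeof line, f)) {
            unsigned st;
            /* fields: sl local_address rem_address st ... */
            if (sscanf(line, "%*s %*s %*s %x", &st) == 1 && st < 16)
                counts[st]++;
        }
        fclose(f);

        printf("ESTABLISHED: %u\n", counts[0x01]);
        printf("TIME_WAIT:   %u\n", counts[0x06]);
        printf("CLOSE_WAIT:  %u\n", counts[0x08]);
        return 0;
    }

If TIME_WAIT or CLOSE_WAIT dominates under load, you are probably looking at the pattern described above: TIME_WAIT points at connection churn, while a growing CLOSE_WAIT count usually means the application is not closing sockets the peer has already shut down.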

But restarting the service, rebooting, or killing processes will not solve all the problems, and neither will the operating system or the kernel on their own. The kernel's job is to handle the control plane, in a generic, multipurpose way. If you take only the kernel-tuning approach, the kernel becomes part of the problem, and you will be far, far away from solving it.

The kernel also works in a known way that scales badly here: for every incoming packet it walks the list of current processes (or, for connection pools, the list of sockets) to figure out which thread should handle it. One walk per packet, across every live connection, means the total work grows roughly as O(n^2).
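
This is exactly the problem event-notification interfaces were built to avoid. As a hedged illustration, here is a minimal epoll-based accept loop in C (the port number is hypothetical): epoll_wait() hands user space only the descriptors that actually have pending events, so nobody rescans the full socket list on every wakeup the way a select() loop does.

    /* Minimal epoll accept loop: the readiness list comes from the
     * kernel, so the cost per wakeup is proportional to the ready
     * sockets, not to all open sockets. Error handling is abridged. */
    #include <netinet/in.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);                 /* hypothetical port */
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
            listen(lfd, SOMAXCONN) < 0) {
            perror("bind/listen");
            return 1;
        }

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

        struct epoll_event ready[64];
        for (;;) {
            int n = epoll_wait(ep, ready, 64, -1);   /* only ready fds */
            for (int i = 0; i < n; i++) {
                if (ready[i].data.fd == lfd) {       /* new connection */
                    int c = accept(lfd, NULL, NULL);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = c };
                    epoll_ctl(ep, EPOLL_CTL_ADD, c, &cev);
                } else {                             /* data or hangup */
                    char buf[4096];
                    ssize_t r = read(ready[i].data.fd, buf, sizeof buf);
                    if (r <= 0)
                        close(ready[i].data.fd);     /* close drops it from epoll */
                    /* ... otherwise handle r bytes ... */
                }
            }
        }
    }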





High-level kernel diagram: layers and intercommunication (1).



Even if you take the complete tuning approach, the application may work, but not always: you only buy stability, not a real solution. The correct way to handle the C10K problem, and even more so C10M, is to let the kernel solve the control plane while applications handle the data plane, and/or to write software that bypasses the stack entirely, such as with DPDK (2). This is pretty much the exokernel idea (3), applied as an end-to-end principle.
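
To make the kernel-bypass idea concrete, here is a heavily abridged C sketch of a DPDK-style poll-mode receive loop, modeled loosely on DPDK's basic forwarding ("skeleton") example. Treat it as an assumption-laden outline, not a reference implementation: real programs need hugepages configured, a NIC bound to a DPDK-compatible driver, EAL arguments, and per-release API adjustments.

    /* Abridged DPDK-style poll-mode RX loop (assumptions noted above).
     * The kernel handles only the control plane (EAL, device setup);
     * the data plane below never enters the kernel network stack. */
    #include <stdlib.h>
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define RING_SIZE  1024
    #define BURST_SIZE 32

    int main(int argc, char **argv)
    {
        if (rte_eal_init(argc, argv) < 0)       /* control plane setup */
            return EXIT_FAILURE;

        struct rte_mempool *pool = rte_pktmbuf_pool_create(
            "MBUF_POOL", 8191, 256, 0, RTE_MBUF_DEFAULT_BUF_SIZE,
            rte_socket_id());

        uint16_t port = 0;                      /* first DPDK-bound port */
        struct rte_eth_conf conf = {0};
        rte_eth_dev_configure(port, 1, 1, &conf);
        rte_eth_rx_queue_setup(port, 0, RING_SIZE,
                               rte_eth_dev_socket_id(port), NULL, pool);
        rte_eth_tx_queue_setup(port, 0, RING_SIZE,
                               rte_eth_dev_socket_id(port), NULL);
        rte_eth_dev_start(port);

        /* Data plane: busy-poll the NIC from user space. */
        for (;;) {
            struct rte_mbuf *bufs[BURST_SIZE];
            uint16_t n = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
            for (uint16_t i = 0; i < n; i++) {
                /* ... parse/process bufs[i] in user space ... */
                rte_pktmbuf_free(bufs[i]);
            }
        }
    }

The point of the busy-poll loop is the end-to-end principle mentioned above: the application owns the packet path, and the kernel is reduced to setup and housekeeping.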





Common kernel vs. exokernel (3).



To build usable and scalable applications that support 10 million concurrent connections (C10M) and more, we first need to solve other kinds of problems:

  • Packet scalability.
  • Multi-core scalability (see the sketch after this list).
  • Memory scalability.
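
For the multi-core point, one widely used building block on Linux (kernel 3.9 and later) is SO_REUSEPORT: every worker owns its own listening socket on the same port, and the kernel load-balances incoming connections across them instead of letting all workers fight over one accept queue. A minimal sketch in C, with the port number and worker count purely hypothetical:

    /* SO_REUSEPORT sketch: one listening socket per worker process.
     * Pinning each worker to a core (sched_setaffinity) is a common
     * next step that is omitted here. */
    #define _GNU_SOURCE
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    static int make_worker_listener(uint16_t port)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one) < 0) {
            perror("SO_REUSEPORT");
            return -1;
        }
        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(port);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
            listen(fd, SOMAXCONN) < 0) {
            perror("bind/listen");
            close(fd);
            return -1;
        }
        return fd;
    }

    int main(void)
    {
        for (int i = 0; i < 4; i++) {           /* 4 workers, hypothetical */
            if (fork() == 0) {
                int fd = make_worker_listener(8080);  /* hypothetical port */
                for (;;) {
                    int c = accept(fd, NULL, NULL);
                    if (c >= 0) { /* ... serve the connection ... */ close(c); }
                }
            }
        }
        for (;;) pause();                       /* parent keeps workers alive */
    }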

So the real problem is... knowledge. Lots of developers know how to code client/server applications, but far fewer know how TCP/IP really works or how to use multiprocessing (MP) libraries. I understand this is not an easy task, but we really need to start working on it: with every performance problem we also need to look into the code and the software architecture, searching for scalability errors. We will not always have site reliability engineers around to keep our application super reliable and super fast all the time, and even when we do, the root cause may lie many iterations back, before the system started losing points off our precious 99.99...99.

And what if we can't correct the coding errors fast enough, or can't correct them at all (as with proprietary software)? Then tuning will always be the answer. But like I said, tune all the layers, not only the kernel (a socket-level example follows the list):

  • Tune for aggressive network throughput.
  • Tune timeouts.
  • Tune the socket parameters.
  • Tune shared filesystems.
  • Tune the schedulers.
  • Tune the complete architecture.
  • ….
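
As one concrete instance of "tune the socket parameters", here is a hedged C sketch of per-socket knobs an application can set for itself instead of relying only on system-wide sysctls. Every numeric value below is a placeholder to be derived from load testing, not a recommendation.

    /* Illustrative per-socket tuning; all values are placeholders. */
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    static void tune_socket(int fd)
    {
        int rcvbuf = 1 << 20;                 /* 1 MiB receive buffer */
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof rcvbuf);

        int nodelay = 1;                      /* disable Nagle for latency */
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &nodelay, sizeof nodelay);

        int keepalive = 1;                    /* detect dead peers */
        setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &keepalive, sizeof keepalive);

        struct timeval tmo = { .tv_sec = 5 }; /* bound blocking reads */
        setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof tmo);
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        tune_socket(fd);
        /* ... connect or bind, then use fd ... */
        return 0;
    }

The same spirit applies one layer up (web server worker counts, application server pools, DBMS connection limits) and one layer down (NIC ring sizes, scheduler settings).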

There are many layers to pass through before you even reach the kernel, and even if you do want to tune the kernel, you first need to understand how the application works, how it communicates, and how it uses internal and external applications, libraries, and utilities.





Common multi-layer software architecture (4).



In common transactional architectures, tuning works like a tourniquet on a bullet wound: it will probably save a life. In highly transactional applications, though, tuning only helps the system along; it does not solve the underlying problems, and your application will still die slowly and painfully.

References:

  1. https://en.wikipedia.org/wiki/Monolithic_kernel
  2. http://dpdk.org/
  3. https://en.wikipedia.org/wiki/Exokernel
  4. http://www.guidanceshare.com

Wednesday, October 15, 2014

Why companies should embrace OSS and the DevOps movement

It's no secret that the best and most competitive technologies in the world today are based on some Open Source component: maybe the Linux kernel, the GNU/Linux operating system, a version of BSD, modules, drivers, or a programming language that is completely free or has a free compiler or interpreter.

On the other hand, we have a complex and extensive range of solutions being born with almost every blink. We need options to integrate these solutions into existing technologies, and we have to interconnect new software with hardware in almost every possible combination. So basically, no matter what kind of hardware or software we want or have to work with, if we want to survive in the era of cloud solutions, building an interface to interconnect them will always be the fastest option. We will always have to be interconnected, and this is a core principle that cloud architectures must satisfy, where the hardware is defined by software and everything is "as a Service" (XaaS); everything has to be able to interconnect with something else. In short, this is the age of the Application Programming Interface (API).

Nowadays, technologies need APIs (REST, SOAP), communities (Reddit, IRC, ...), and accessible information (blogs, wikis); otherwise, we have to be able to build them ourselves, the fastest way possible. We need tools, languages, plugins, everything we can use to build these interconnections and better solutions, and the only platform that lets us do this at the speed required is Open Source. It's no mystery that Open Source based technologies move much faster than any kind of proprietary technology. So if we don't want to become technological dinosaurs from one day to the next, we have to know about agile development languages (Python, Ruby, Groovy, ...), collaborative work applications (GitLab, GitHub, Trac, Bugzilla, ...), and source code management and revision control (Git, SVN, ...): tools that move and help us with the speed required to build new products. Today, knowing about Open Source, licenses, programming languages, and communities is no longer optional.

Speed is not the only thing Open Source gives us. For any professional, having software freedom without limits, whether the software solves the problem 100% or simply delivers a solid foundation to modify and build what is required, and being able to run POCs without asking a company for a copy of the software, is priceless. It also multiplies the number of users downloading the same software, who can then modify it, test it, and add new features.

I don't want to paint a vision where nothing exists besides Open Source Software, but to compete technologically we have to know the ecosystem, and to innovate we must know the tools and work with the right people for the job: people who can integrate all kinds of solutions. But who are these people? They are like super-sysadmins + developers + Open Source gurus, all this and more; in other words, DevOps engineers (like me). Better check this post by the puppetlabs people; maybe in the future I'll write my own.

But you don't need to believe me. I challenge you to find a job offer from a company that wants to innovate (any real IT company), in any language or country, that is not looking for DevOps people or Open Source knowledge.

Let's cut to the chase: any company that wants to innovate technologically needs DevOps on its payroll, and any DevOps engineer who wants a decent job needs Open Source knowledge.

Hope you enjoyed the read, see you soon!!!
$ commit

Monday, March 15, 2010

Migration post

Recently Headup lost its database, but I'll try to be back as fast as I can... and now a Drupal upgrade broke my admin/theme/life.

Sorry. :(