03 February 2011

Virtualization for databases (bad idea)

Originally in response to this (excerpt of a) discussion on LinkedIn:
I think this is a LINUX issue! Because in Linux the I/O is buffered or delegated to a process. When you install Postgres or any DB, Postgres tells the OS that it can't wait to do the I/O, it must be done immediately. But what happens in a virtualized environment?
There's no such thing as telling the OS to do an I/O immediately as opposed to waiting. It's the other way around: non-buffered I/O means waiting for the write to actually complete. That is what makes it useful for data integrity: you know the data reached the platter (or, perhaps, in the case of SSDs, that the silicon was actually erased and written to).
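
To make the distinction concrete, here is a minimal sketch on a POSIX system (illustrative only; the file name and record are made up for the example, and this is not PostgreSQL's actual code): a plain write() usually just lands in the kernel's page cache and returns, while fsync() is the call that blocks until the data has reached stable storage.

    /* Buffered vs. durable writes on a POSIX system (illustrative sketch). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char *data = "commit record\n";   /* hypothetical journal-style record */
        int fd = open("journal.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) { perror("open"); return EXIT_FAILURE; }

        /* write() normally just copies into the kernel page cache and returns;
         * the data may not be anywhere near the disk yet. */
        if (write(fd, data, strlen(data)) < 0) { perror("write"); return EXIT_FAILURE; }

        /* fsync() is the "waiting" part: it blocks until the kernel reports the
         * data as flushed to stable storage. Only after it returns can the
         * application treat the record as durable. */
        if (fsync(fd) < 0) { perror("fsync"); return EXIT_FAILURE; }

        close(fd);
        return EXIT_SUCCESS;
    }

Whether that fsync() guarantee still holds under a hypervisor depends on how the host caches the guest's virtual disk, which is exactly the kind of hardware detail virtualization hides from you.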

The real problem is that virtualization is fundamentally flawed. What is an operating system for, in the first place? It's the interface between the hardware and the applications. Virtualization breaks this, without, IMO, adequate benefit.

Put another way, virtualization abstracts hardware down to a lowest common denominator, so it is unsurprising that the resulting performance settles at the lowest common denominator as well. "Commodity hardware" is a myth[1].

One of my greatest tools as a sysadmin is my knowledge of hardware, how it fits together, and how it interacts with the OS. Take that away from me by insisting on virtualization or ordering off a hosting provider's menu of servers, and I, too, suffer from the lowest common denominator syndrome.


[1] Really, it's that non-commodity "big iron" is extinct in my world, especially with the demise of Sun.
