Thin provisioning FAQ

Tony Lock answers common questions on thin provisioning. His answers are also available below as an MP3 download.

To hear this thin provisioning podcast, click here.

The questions he addresses are:

What is thin provisioning?

What are the benefits of thin provisioning?

In what scenarios does thin provisioning make sense?

What are the potential pitfalls of thin provisioning?

Q. What is thin provisioning?

A. Explaining thin provisioning as simply as possible has some merit, but it’s also quite difficult to do because the terminology is now used to describe a number of similar but slightly different solutions. Essentially, thin provisioning is all about allocating space as and when the systems actually need to consume it. In traditional systems, all of the space is allocated when the platform is set up at the beginning. So, you might start off with a platform that’s going to consume perhaps a few GB or a few hundred GB of storage, but because of the uncertainties of the future and the difficulties in the past of changing storage systems once you’d set them up, you might have to physically allocate a lot more storage to that system, perhaps two, three or even five times as much as you think at the beginning is going to be consumed.

Essentially, thin provisioning is just-in-time provisioning, a way of making sure there’s storage available when you need to consume it instead of saying here’s a big chunk of it now and for evermore.
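The allocate-on-write idea described above can be illustrated with a toy model. This is only a sketch: the class name ThinPool, its methods and the figures are invented for illustration and do not correspond to any vendor's product or API.

```python
class ThinPool:
    """Toy model of a storage pool that consumes physical space only on write."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb  # storage actually bought
        self.used_gb = 0                # storage actually written
        self.volumes = {}               # name -> [advertised GB, written GB]

    def create_volume(self, name, virtual_gb):
        # Creating a volume consumes no physical space; it only records
        # the size the application is told it has.
        self.volumes[name] = [virtual_gb, 0]

    def write(self, name, gb):
        virtual_gb, written = self.volumes[name]
        if written + gb > virtual_gb:
            raise ValueError("write exceeds the volume's advertised size")
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical space")
        self.volumes[name][1] = written + gb
        self.used_gb += gb


pool = ThinPool(physical_gb=100)
pool.create_volume("app", virtual_gb=500)  # the application sees 500 GB
pool.write("app", 20)                      # only 20 GB is physically consumed
print(pool.used_gb)
```

A traditional system would correspond to deducting the full 500 GB from the pool inside create_volume, which is exactly the up-front allocation thin provisioning tries to avoid.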

Q. What are the benefits of thin provisioning?

A. The perceived advantages of thin provisioning essentially centre on cost. The idea that you can buy storage when you need to use it is very appealing, particularly in the current economic climate. In the past, the traditional way of implementing storage systems was to buy the entire amount you thought a platform was ever going to need on day one and make it available on day one, with the result that for quite a lot of the time the storage sits empty. Potentially, it might sit empty for evermore because once it’s allocated to a particular application nothing else can get in and use that otherwise free space. So the idea of thin provisioning is to buy the storage as and when you need it. So buy less storage capacity on day one and if your system grows, then in month five, month 10, year three, buy more storage as you get close to using everything that’s available.

The other big thing around thin provisioning is the idea that you can couple multiple applications or servers and have them use a common pool of storage so they can share the disk space you have available — with use of that system only made when applications write data to it, rather than having everything eaten on day one.
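To put rough numbers on the shared-pool idea, the following sketch compares buying each application's full allocation on day one with buying a single pool sized to what has actually been written plus some headroom. The application names, the figures and the 25% headroom are all assumptions made up for the example.

```python
# Hypothetical figures per application: (GB allocated up front, GB actually written).
apps = {
    "mail": (500, 120),
    "crm": (300, 90),
    "files": (1000, 400),
}

# Traditional approach: every application gets its full allocation on day one.
thick_gb = sum(allocated for allocated, _ in apps.values())

# Shared thin pool: size the pool to what has been written, plus headroom.
written_gb = sum(written for _, written in apps.values())
headroom = 0.25  # assumed safety margin, not a recommendation
thin_gb = written_gb * (1 + headroom)

print(thick_gb, written_gb, thin_gb)
```

With these invented figures the traditional approach buys 1,800 GB while the shared pool needs 762.5 GB; the gap is the storage that would otherwise sit allocated but empty.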

It comes down to money, but there are some peripheral benefits to this, particularly green benefits. Clearly, if you have disk space that’s allocated but not being used, you’ve got physical disks in there that are spinning and consuming energy, which costs you money, potentially generating excess heat that needs to be cooled.

Beyond that there’s the whole issue of managing storage and that’s the real killer in terms of cost. The administration of traditional file systems, particularly when you want to add capacity to them after they’ve been installed, wasn’t easy in the past. It has got simpler in recent years but, again, one of the plus factors of thin provisioning is that it can make daily administration much simpler and hence more cost effective to carry out. A side benefit there is that having simpler administration limits any potential human error that might get into a system as changes are made. As we all know, changing things is what causes most service interruptions.

Q. In what scenarios does thin provisioning make sense?

A. This is a really interesting question. Thin provisioning certainly doesn’t fit everywhere. But the scenarios where it does make sense are the ones where you have many different applications or servers trying to share a physical pool where they can grow from a single free pool, rather than each having to have its space allocated to it and only it from day one.

Others are when you have applications where it is relatively straightforward to predict their growth, so you know that every n months you’ll need to buy more storage and can avoid the capital cost of buying everything they will need for the next one to five years in advance. Beyond that, I know some people who take the opposing view: that applications whose usage is difficult to predict are the scenario where thin provisioning makes sense, again because of that ability to grow quite quickly. The problem there is that you very much have to keep an eye on the systems to make sure you do have capacity available as and when the systems need it. Clearly, where it’s difficult to make forecasts in advance that could lead to problems, unless you’ve got enough spare capacity available to meet any sort of contingency. Also, where cash might be tight on day one is where this could have some applicability, but only if the application profile fits with thin provisioning.
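For the predictable-growth case, the "every n months" estimate is simple arithmetic. The sketch below assumes steady linear growth, which real workloads may not follow; the function name and the figures are hypothetical.

```python
import math


def months_until_full(capacity_gb, used_gb, growth_gb_per_month):
    """Whole months of growth the pool can absorb before it fills up.

    Assumes steady linear growth, which is a simplification.
    """
    if growth_gb_per_month <= 0:
        raise ValueError("growth must be positive")
    free_gb = capacity_gb - used_gb
    return max(0, math.floor(free_gb / growth_gb_per_month))


# Hypothetical pool: 1,000 GB bought, 610 GB written, growing by 50 GB a month.
print(months_until_full(1000, 610, 50))
```

For the unpredictable case described above, a single growth figure like this is not reliable, which is why closer monitoring and spare capacity are needed there.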

Q. What are the potential pitfalls of thin provisioning?

A. There are some areas where thin provisioning really doesn’t fit. If we start at the top, the most obvious technological consideration before using thin provisioning is whether the applications, or the file systems on which those applications sit, behave well in a thin-provisioning scenario. There are some file systems and applications where once you allocate space to that system it goes into that system and puts its metadata everywhere it thinks it’s going to have storage. So, if you say to it ‘We’re going to have 5 GB of storage,’ it wants to go out and physically claim all 5 GB immediately, rather than writing data to it as and when required, and you end up with a lot of dead space that you can’t use for other systems. That perhaps, technically, is the area that’s least suited to thin provisioning.

Beyond that it really comes down to a couple of areas. One which not many organisations think of is the way the storage budget is allocated. In most organisations today budgets are allocated on day one and it’s very difficult to get allowances made in budgets for ongoing costs. They’re assumed to be a part of the overall overhead of IT itself. So if you have a project today and on day one you think it’s only going to consume 3 GB or TB, then you only buy that much. That’s fine, but if you’re using a thin-provisioning system and you know the system is going to grow, then you need to make sure there are budgetary allocations in place to cater for the additional expenditure when you need to buy new storage to add to the thin-provisioning system. That’s something that most organisations today struggle with, frankly, because most IT capital budgets are focused around project initiation, and putting in budget capacity for growth over time is something most organisations have yet to get to grips with.

There are some other issues that need to be addressed. One is where you have a sudden demand for storage coming in at once from multiple applications. If you physically cannot add storage to the platform in time, you’re going to run out of space; and in scenarios where you’re using a thin-provisioning system with multiple applications sharing that pool of storage, potentially you could have all those applications becoming unavailable through lack of disk storage. That puts a large onus on the monitoring and management of these platforms. Really close monitoring has to take place, which in turn means the supplier of the disk systems that sit inside the thin-provisioned storage needs to be able to react quickly enough for you to be able to get hold of storage as and when you need it. Clearly, you need the budget processes in place to buy the stuff when you need it.
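One way to picture that monitoring requirement is a check that compares the free space in the pool with the demand that could arrive before the supplier can deliver more disk. This is a sketch under stated assumptions: the function name, the per-application growth figures, the lead times and the burst factor are all hypothetical and would need to come from real monitoring data.

```python
def must_order_now(free_gb, daily_growth_gb, lead_time_days, burst_factor=2.0):
    """Return True if free space may not last until new storage arrives.

    daily_growth_gb maps each application to its usual daily growth in GB.
    burst_factor is an assumed multiplier for every application surging at once.
    """
    worst_case_gb = sum(daily_growth_gb.values()) * burst_factor * lead_time_days
    return worst_case_gb >= free_gb


growth = {"mail": 2, "crm": 1, "files": 5}  # hypothetical GB per day

print(must_order_now(free_gb=390, daily_growth_gb=growth, lead_time_days=14))
print(must_order_now(free_gb=390, daily_growth_gb=growth, lead_time_days=30))
```

With these invented figures a 14-day lead time leaves enough room, while a 30-day lead time does not, which shows why the supplier's reaction time matters as much as the size of the pool.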

There are some other areas where thin provisioning may not be suitable, and here I’m thinking of very high-performance systems where millisecond response times are really vital.

Beyond that, it comes down to monitoring. Do you want to have all your eggs in one basket? Because, like a big server or mainframe, when you’ve got lots happening and lots of applications sharing this single pool of storage and it runs out of space, it’s not one application that you lose, but potentially all of them.

Originally published on SearchStorage.co.UK.