When we in IT talk about the data problem, we’re often talking about the vast amounts of processed and stored data moving among devices, systems, and data centers. But the data problem today isn’t exactly the problem of too much data, though that’s certainly connected. The data problem is really a problem of I/O capacity, which isn’t getting any faster, even as hard drive capacity grows.
Slower IOPS means that application performance suffers, and data-related tasks like backups and replications can’t get faster, no matter how much storage is available or how densely packed the drives are. It’s in the realm of performance, not overloaded storage cabinets, where IT really runs into problems.
Deduplication in a traditional IT stack doesn’t solve this problem the way it should, because the same block of data gets deduplicated multiple times by various appliances in the stack. A backup appliance deduplicates the data from backups. The same block of data gets deduped again by the WAN optimization appliance before it goes across the wire. Other data protection apps may back up the data separately and deduplicate it as well. Today’s IT needs a way to simplify deduplication, among other things, to solve the data problem.
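To make the mechanics concrete, here is a minimal sketch of fixed-block, hash-based deduplication, the basic technique these appliances rely on. This is an illustration only, not any vendor’s implementation: each block is identified by its content hash, and an identical block is stored just once. When a backup appliance, a WAN optimizer, and a data protection app each maintain their own index like this, the same blocks get hashed and tracked several times over, which is the redundancy described above.

```python
import hashlib

class BlockStore:
    """Toy content-addressed block store: identical blocks are kept once."""

    def __init__(self):
        # Maps SHA-256 digest -> block bytes; this is the dedup index.
        self.blocks = {}

    def write(self, data: bytes, block_size: int = 4096):
        """Split data into fixed-size blocks and store each unique block once.

        Returns the ordered list of digests needed to reassemble the data.
        """
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Dedup step: if this digest is already present, skip the store.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        return digests

    def read(self, digests):
        """Reassemble data from its list of block digests."""
        return b"".join(self.blocks[d] for d in digests)
```

Writing 8 KB of identical bytes through this store keeps only one 4 KB block, while the digest list still lets the full 8 KB be read back. Run the same data through three independent `BlockStore` instances and the hashing work, and the index, is tripled.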
We’re now entering the territory of the zettabyte, the unit of measurement beyond petabyte and exabyte. The amount of data worldwide is expected to hit 44 zettabytes (the equivalent of 44 trillion gigabytes) by 2020, according to IDC Research. That’s an enormous amount of data, much of it sitting in data centers around the globe and dragging down performance. Organizations are indeed scrambling to add more storage and optimize data the best they can, but they’re also fighting on the application performance front.