Dropbox started shifting workloads away from AWS to its own data centers several years ago because it needed more control over how files were stored and accessed. It developed a storage architecture called Magic Pocket to help, but over time it recognized that most people moved files to Dropbox for backup purposes, then rarely accessed them again.
Engineers realized it made little sense to store everything the same way when many files were rarely accessed after their first day on the service. The company decided to create two tiers of storage: warm storage (previously Magic Pocket) and a new, longer-term tier called Cold Storage, which lets Dropbox store these files less expensively yet still deliver them in a timely manner should a customer need to see one.
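To make the two-tier idea concrete, here is a minimal sketch of how a file might be assigned to a tier based on how long it has gone unread. It is an illustration only, not Dropbox's implementation: the `Tier` enum, the `choose_tier` function and the 30-day cutoff are all hypothetical, and the article does not say what rule or threshold the company actually uses.

```python
from datetime import datetime, timedelta
from enum import Enum


class Tier(Enum):
    WARM = "warm"
    COLD = "cold"


# Hypothetical cutoff, chosen only for illustration; the real threshold is not public here.
COLD_AFTER = timedelta(days=30)


def choose_tier(
    last_accessed: datetime,
    now: datetime,
    cold_after: timedelta = COLD_AFTER,
) -> Tier:
    """Return COLD when the file has gone unread for at least `cold_after`."""
    idle = now - last_accessed
    return Tier.COLD if idle >= cold_after else Tier.WARM
```

Under these assumptions, a file opened yesterday stays in the warm tier, while one untouched for three months would be a candidate for the cold tier.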
Dropbox customers obviously don’t care about the engineering challenges the company faces with such an approach. They only know that when they click a file, they expect it to open without a significant amount of latency, regardless of how old it is. But Dropbox saw an opportunity to store these files in a separate layer.
“When one is talking about cold storage, we are thinking of files that are accessed less often. And for those files, we can make some trade-offs between storage, performance and network bandwidth,” Preslav Le, a software engineer in charge of the cold storage project, told TechCrunch.
So it was up to the engineers to design a system that could retrieve files from the cold layer quickly enough that customers wouldn’t notice a delay. That meant walking a design tightrope and weighing all of the trade-offs such an approach would require.
“Our cold tier runs on the same hardware and network but saves costs through innovatively reducing disk usage by 25 percent, without compromising durability or availability. The end experience for users is almost indistinguishable between the two tiers,” Dropbox wrote in a blog post announcing the new feature.
The company needed to ensure durability and reliability while creating a new storage layer to reduce its overall costs. The project wasn’t easy, but Dropbox expects the dual-tier system to save it 10 to 15 percent in costs over time.