IBM and Sony cram up to 330TB into tiny tape cartridge (arstechnica.co.uk)
104 points by 076ae80a-3c97-4 on Aug 2, 2017 | hide | past | favorite | 47 comments



Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway. --Tanenbaum


When we were doing a colo migration for a Yahoo! property I worked for 12 years ago, it was cheaper and faster for us to back up to tape, FedEx the tapes, restore from tape, and then use the Internet to sync up the changes from in between, rather than using the Internet for the whole transfer.


Let's do some XKCD what-if math because I love doing stuff like this.

An average truck can hold anywhere between 20,000 L and 40,000 L [1]. Let's take an average of 30,000 L.

An average truck can also travel at anywhere between 56 kph and 137 kph [2]. Let's take an average of 100 kph.

The tape cartridge in the image looks like it's about 4 mm x 10 cm x 6 cm. And it can store 330 TB.

Let's say we're transferring from New York to San Francisco. That's 4,678 km according to Google.

Total data transferred in one truck = 330 TB * ((30000 L / (1000 L/m^3)) / ((4 mm * 0.001 m/mm) * (10 cm * 0.01 m/cm) * (6 cm * 0.01 m/cm))) = 412500000 TB

Total time taken = (4678 km * (1000 m/km)) / (100 kph * (0.27 (m/s)/kph)) = 173259 s

Bandwidth = 412500000 TB / 173259 s = 2380 TB/sec = 19046 Tbps

For reference, 255 Tbps was the fastest single fibre [3]. This is a good 75 times faster.

Of course, all this comes with a network latency of 2 days.

[1] https://en.wikipedia.org/wiki/Tank_truck#Size_and_volume

[2] https://en.wikipedia.org/wiki/Speed_limits_in_the_United_Sta...

[3] https://www.extremetech.com/extreme/192929-255tbps-worlds-fa...
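
For anyone who wants to rerun the numbers, here is a minimal Python sketch of the estimate above. All inputs are the guesses from this comment (30,000 L truck, 4 mm x 10 cm x 6 cm cartridge, 100 kph, 4,678 km), and the function name `truck_bandwidth` is just an illustrative choice. It keeps the rounded 0.27 (m/s)/kph conversion so the result lines up with the figures quoted; the exact factor is 1/3.6.

```python
# Back-of-envelope truck bandwidth, using the assumptions from the comment above.
TRUCK_VOLUME_M3 = 30_000 / 1000           # 30,000 L expressed in cubic metres
CARTRIDGE_VOLUME_M3 = 0.004 * 0.10 * 0.06  # 4 mm x 10 cm x 6 cm (eyeballed from the image)
CARTRIDGE_CAPACITY_TB = 330
DISTANCE_M = 4_678 * 1000                  # New York to San Francisco
KPH_TO_M_PER_S = 0.27                      # rounded factor used above; exact value is 1 / 3.6


def truck_bandwidth(speed_kph=100):
    """Return (total TB carried, trip time in seconds, Tbps) for one truck."""
    cartridges = TRUCK_VOLUME_M3 / CARTRIDGE_VOLUME_M3
    total_tb = cartridges * CARTRIDGE_CAPACITY_TB
    seconds = DISTANCE_M / (speed_kph * KPH_TO_M_PER_S)
    tbps = total_tb / seconds * 8          # terabytes/s -> terabits/s
    return total_tb, seconds, tbps


if __name__ == "__main__":
    total_tb, seconds, tbps = truck_bandwidth()
    print(f"{total_tb:,.0f} TB in {seconds:,.0f} s = {tbps:,.0f} Tbps")
```

With the rounded conversion this lands near 19,000 Tbps, matching the figure above. Swapping in the exact 1/3.6 factor shortens the trip slightly and raises the result by a little under 3%. Packing losses, cartridge cases, and load/unload time are ignored here, as they are in the original estimate.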


You need to factor in the time needed to transfer the data onto/off the drives.


It depends on what you're doing with the data. If you're moving it to some long term archive then you only need to count the write time of the disks at the source, packing time, and unpacking time.

If you're planning to actually use the data you need to add in the time needed to read the tapes and put them on a usable medium.

Once you add in all of the overhead, the network comes out looking much better in comparison.


>Once you add in all of the overhead, the network comes out looking much better in comparison.

That's a broad assumption. There's a reason why AWS is finding demand for Snowball and Snowmobile. Internet connections just aren't that fast.


In total bandwidth the truck is still likely to win, but it won't be a 75x advantage. Closer to 3-4x.

Also note that the Snowball/Snowmobile fit into the first use case. They are loaded up then just plugged in on arrival, and even then the total number of enterprises that have used the services is not very large. The multi-day latency on the first packet means you need to be transferring enough data that it would take more than those few days to transfer on an STM-4 or similar. It works out to a tremendous amount of data.

Of course if you are an arctic researcher who has collected hundreds of petabytes of sensor measurements and only have a satellite backhaul, then this makes tons of sense. But for anybody with a solid backbone connection the calculus is much less favorable.
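
To put a rough number on that break-even point, here is a small Python sketch. It assumes an STM-4 line rate of 622.08 Mbit/s with no protocol overhead and the 2-day truck latency quoted upthread, and it ignores the time to load and unload the media, which would push the threshold higher. The function name `break_even_tb` is hypothetical.

```python
def break_even_tb(link_mbps, latency_days):
    """Data volume (in TB) a link can move during the shipping latency.

    Below this volume the network finishes first; above it, shipping wins.
    Assumes the link runs at full line rate the whole time.
    """
    seconds = latency_days * 86_400
    bits = link_mbps * 1e6 * seconds
    return bits / 8 / 1e12  # bits -> bytes -> terabytes


if __name__ == "__main__":
    # STM-4 line rate, 2-day shipping latency (both assumptions, see above).
    print(f"{break_even_tb(622.08, 2):.1f} TB")
```

Under these assumptions the crossover sits in the low tens of terabytes. Real-world link utilisation and tape read/write time move it in opposite directions, so treat it as an order-of-magnitude figure only.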


Also the risk of the truck having an accident and having to "replay" everything.


It is significantly less than the time required to lay the appropriate fiber coast to coast.


Wouldn't the bandwidth be close to infinite? As soon as the first byte gets there, the last byte also gets there. You'd have to account for the length of the truck and separation between consecutive trucks, right? The latency is simpler to calculate for sure.


Lovely :'D


There was a story out there a while back of a backup company migrating data centers who did just that. Packed up a vehicle full of hard drives and took a road trip.


Amazon has an entire service devoted to it: https://aws.amazon.com/snowmobile/


I would love a low cost tape backup solution for home use. I would put things on it that I normally would delete (but may theoretically need at a later stage). This could range anything from written material (small size) to raw 4k video material that normally would require fairly expensive storage. What tape solutions like these are available right now at a decent cost? Thinking sizes up to 50TB, so conventional disks are just too expensive and error prone.


Define "low cost".

You won't get away with spending less than $1,500 or so on a recent generation tape drive. Don't forget you need a SCSI card. This is the point at which most enthusiasts just turn around, walk home, and order a big stack of hard disks. Sure, you get to save a bunch of money on media. An LTO-6 cartridge costs $25, and that's 2.5TB. But tape is a pain in the ass, and since it has separate capex for throughput and capacity, you need to do the calculations for both. Then you only have one drive, so if it breaks you are sitting without it for X weeks until repairs go through. You could save money by buying older generations like LTO-5 (now two generations old), maybe $500 for a used drive on eBay and now you have to buy almost twice as many tapes which is a pain.

By comparison, with HDD you can pay $100 for a 4TB drive, grab 13 of those and you have your 50TB, all for the $1300 which is less than you'd have paid just for a bare tape drive. When hard drives break you swap them out and keep operating at reduced capacity. (I mean, I've done that with tape drives, but back then I had lots of tape drives.)

"Conventional disks are just too expensive." For small amounts of data, like 50TB, tape could quite easily be more expensive due to the fixed costs of the drive. Add the fact that you have to babysit the tape drive and swap in a new tape every so often unless you get a robot to do it for you.

"Conventional disks are too error prone." Not sure what that's about, when you have 50TB of data you're going to need to add error correction no matter whether you write it to tape or write it to HDD. In reality, you're going to have to deal with a loss of a full tape or full drive either way, plus some read errors on the available media.

I don't have exact TCO figures for you, but despite the low cost of media, tape TCO isn't that much lower than HDD, and for small amounts of data, HDD is cheaper.

Maybe, maybe you could make this work. With old refurbished drives and media, and an endless appetite for swapping out tapes, and the right software to index and manage it all, maybe you could achieve your dream while spending less money. But a couple grand will get you the HDD you need.
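
A quick way to see where the crossover falls is to plug the prices from this comment into a few lines of Python. This is only a sketch: it uses $1,500 for the drive, $25 per 2.5 TB LTO-6 cartridge, and $100 per 4 TB hard drive, and it leaves out the SCSI card, enclosures, redundancy, and labour because no figures are given for them. The function names are made up for illustration.

```python
import math


def tape_cost(capacity_tb, drive_price=1500, cartridge_price=25, cartridge_tb=2.5):
    """Drive plus enough whole cartridges to hold capacity_tb."""
    return drive_price + math.ceil(capacity_tb / cartridge_tb) * cartridge_price


def hdd_cost(capacity_tb, disk_price=100, disk_tb=4):
    """Enough whole hard drives to hold capacity_tb."""
    return math.ceil(capacity_tb / disk_tb) * disk_price


if __name__ == "__main__":
    for tb in (50, 100, 200):
        print(f"{tb:>4} TB  tape ${tape_cost(tb):,}  hdd ${hdd_cost(tb):,}")
```

With these inputs, 50 TB costs $2,000 on tape against $1,300 on disk, and the two meet at 100 TB. Media alone is $10/TB for tape versus $25/TB for disk, so the fixed cost of the drive is what decides it at small sizes.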


You had me at the 2nd paragraph. I'm also one of those who'll turn around and order a stack of HDD's. Capex is just too high. I find it quite funny that you think 50TB is considered "small amounts of data", but I suppose all is relative.

For a business however (if I had huge amounts of data that needed archiving i.e. >100TB) then I would definitely go the tape route. Or Glacier if I decided to be sensible.


About twenty years ago, I bought a new QIC-40 tape drive from the bargain bin at a big box office supply store. I felt I had an analogous use case: backing up all my personal and business and hobby data in case I needed it one day. The best, most productive use I ever got out of it was moving a bunch of software and data from the office to my home computer. In those days the practical easy alternative was primarily 3.5" floppy disks, so a tape that could hold 60 MB was pretty damn attractive. The full picture is that I also copied all my personal data onto tape twice, just in case. A year or two later, I bought a Zip drive so I could exchange data with clients. I copied "just in case" personal data onto Zip disks because it was more convenient than the tape. A couple of years later, I got a laptop with a CDRW and it filled a similar role. Then a desktop with a DVDRW.

About a year ago, I was going through some boxes at my parents' house. There was a box of 5.25" DS disks holding backups of my important data. The only copy. Like all the QIC-40 tapes and Zip disks and most of the CDs and DVDs, they wound up in the trash.

For me, backing up personal stuff onto some technology has been mostly an exercise in spending time and money deleting it. I guess the practical heuristic is that if I can live with it being offline, I will live if I delete it. And the moral of the story is that deleting it sooner rather than later is cheaper.

Tape is a great technology when it makes money. It is a lousy consumer technology. Ebay shows me one "qic 40 drive." It's only twenty years later.


One of the main reasons tape is still used in scientific applications is its durability and robustness when transported; the price of the cartridges is less of a factor.

In exploration geophysics extremely large amounts of data are recorded onto tapes when acquiring in the field, and then transferred over sea/land/air in a packing case to a processing centre. I wonder if this new tech will prove just as durable as current tape tech?


Putting a few of their numbers together seems to imply a read bandwidth of 24 Tbps (330 terabytes = 330*8 terabits/tape, tape length = 1098 meters => 2.4 terabits/meter, tape speed = 10 m/s => 24 Tb/s). Which to me is way more interesting than the total storage size. Of course I have no idea what you could actually feed that into, it far exceeds CPU memory bandwidth, let alone SCSI or PCIE. Do these things not actually run at full speed for more than a fraction of a second?
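
The arithmetic above as a short Python sketch, using the same inputs (330 TB, 1098 m of tape, 10 m/s). The `passes` parameter is an added assumption, not something from the announcement: the default of 1 reproduces the single end-to-end pass implied above, while current linear tape formats write many parallel wraps and read the tape back and forth, which would divide the rate accordingly.

```python
def tape_read_tbps(capacity_tb=330, length_m=1098, speed_m_s=10, passes=1):
    """Implied read rate in terabits per second.

    passes is a hypothetical knob: how many end-to-end trips over the
    tape are needed to read the full capacity.
    """
    terabits_per_metre = capacity_tb * 8 / (length_m * passes)
    return terabits_per_metre * speed_m_s


if __name__ == "__main__":
    print(f"single pass: {tape_read_tbps():.1f} Tb/s")
    print(f"100 passes:  {tape_read_tbps(passes=100):.2f} Tb/s")
```

The single-pass case gives about 24 Tb/s, as computed above. The 100-pass figure is purely illustrative, to show how strongly the result depends on that one assumption.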


We're nowhere near 330TB tapes. This was an announcement that they were able to, in a lab, make a few inches of what would be a 330TB tape.

The idea is that when we're at a place where you can reliably manufacture 330TB tapes, fibre channel (or whatever we have then for SAN interconnects) will have caught up.


I realize this is a long way off from production. I guess what I was saying is that I find the notion that we may have a 24 Tbps tape drive in 10 years more interesting than the notion that we may have a 330 TB tape drive in 10 years. And I'm wondering if my logic for the bandwidth is faulty, or if that might actually be doable with this technology (eventually).


They should copy Soundcloud's 900TB as a demo


Note that commercial tape cartridges max out at 15TB - so, less than the theoretical amount enabled by the 2010 breakthrough.

It will be a long time before these hit the market, if they ever do, and they will be quite expensive when they arrive.

I would love to store backups on tape at home. Unfortunately hard drives are still the cheapest option in practice, and they are a bit too expensive and a bit impractical to comfortably use as offline storage (and transfer) for home use (my opinion / use case).


Well, the drives etc. we have for home use are orders of magnitude cheaper than commercial gear, which is really only equivalent in how much it stores.

I recently priced out 337 GB SSDs for our production and backup servers, and the price per terabyte came to about nine times that of the storage in a new iMac. When you get into the 60 to 120 TB range it adds up, but reliability is so much more important.

Now, I have never looked at tape backup for home use. What is the throughput of most of these? Is it merely limited by the interface?


People who are really serious about storage like Google and Backblaze tend to buy those cheap commercial hard drives instead of the expensive enterprise drives.

Tape drives for home use don't make much sense sadly. The tapes are too expensive compared to hard drives. It's easier to buy one of those SATA docks if you want to have cold backups.

https://www.newegg.com/Product/Product.aspx?Item=N82E1681715...


> The tapes are too expensive compared to hard drives.

That's not at all my experience, at least when it comes to LTO. Sure, TS and T10k are expensive, but they're the 'enterprise solutions'. Modern LTO tapes can be bought for ~$100 if you shop around. Less if you buy in bulk.


Recent generation LTO tapes (LTO-6) can be had for like $25 a pop easily, low volume, just check Amazon or wherever. The LTO-7 are more expensive but that's what you get for latest and greatest.


"Areal" density (bits per unit area), not "Aerial" density (bits in the atmosphere?)


Honest question: are tape drives fast enough for use in any asynchronous/random-access environment, or are they generally limited to backups, financial and scientific data? For example, could something like Youtube rely on tapes as the tail end of a LRU cache or similar?


Probably not for serving user requests for content.

But, if Youtube archived the original upload on tape before transcoding to a different format, then it would be useful. The tapes would really only be "read," in a batch manner, when they want to re-transcode to a newer codec.

Likewise, tape would be a great way to store the digital equivalent of the "negatives" once a movie, TV show, or music album is complete. No one needs instant, on-demand, random access to every shot, every take, and every camera angle. It can be pulled off the tape in 10 years when producing the Nth remaster for the new super-duper bluray version.


Tape provides linear access. The limits of AWS Glacier are probably a good place to start back of napkin latency calculations. https://aws.amazon.com/glacier/


According to https://en.wikipedia.org/wiki/Linear_Tape-Open#Positioning_t... , in the region of 50 seconds, which isn't really good enough.


I'm sure there's long-tail data on YouTube, terabytes of files that could be backed up to tape, that is accessed far less often than once every 50 seconds. I agree that this would not be a great user experience, but there's certainly a place for it, especially if the cost/terabyte is compelling.


The other problem is that Hard Drives are cheap as hell and tapes, especially ones loaded into automatic robots, are really not that cheap.

It's the perennial disappointment with tape drives that the media has "enterprise" pricing, making it barely better per TB than cheap consumer spinning hard drives.


The one thing that has puzzled me for years is why tape never caught on with consumers.

Too much risk that people confuse it with music cassettes?

Or perhaps the software was too fiddly to deal with? For me, btw, that is a more likely reason why USB drives overtook optical media than the media itself.


What kind of read/write speeds are these tapes capable of?


Commercial drives you can buy right now write at more than 150 MB/s, newer generations are always faster.
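
For a sense of scale, a rough Python sketch of how long a sustained 150 MB/s takes to fill a given capacity. The 150 MB/s is the lower bound quoted above; the 2.5 TB cartridge and 50 TB archive sizes are taken from elsewhere in this thread. It assumes the drive streams continuously with no tape changes or stalls, and `write_hours` is a hypothetical helper name.

```python
def write_hours(capacity_tb, rate_mb_s=150):
    """Hours to write capacity_tb at a sustained rate_mb_s (decimal units)."""
    seconds = capacity_tb * 1e12 / (rate_mb_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    print(f"2.5 TB cartridge: {write_hours(2.5):.1f} h")
    print(f"50 TB archive:    {write_hours(50):.1f} h")
```

That comes to between four and five hours per 2.5 TB cartridge, and several days of continuous writing for 50 TB on a single drive.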


Backing up a few PB of data is costly and bulky, since you really must use hard disks. I've been waiting for this tech for years.


In the PB range, current tape technology is very competitive with HDD depending on access patterns and other factors.


Can anyone speculate about the price for one unit?


Price per TB would be the more important measure for this kind of storage equipment.

For reference and to put things in context one has to look at LTO generations. From what I see their invention would fall beyond the LTO-10 standard (unless the market forces and manufacturing change dramatically by then).

https://en.wikipedia.org/wiki/Linear_Tape-Open#Generations


Hard to say. That type of capacity isn't even on the LTO road map yet, and IBM is one of the controlling companies there.

LTO-7 currently holds about 6TB uncompressed, and they're about $100 per cartridge. LTO-8 will double the storage, and will probably be available by the end of next year. I'd guess the cartridges will be more expensive initially, but it's hard to say what the premium will be.


> will probably be available by the end of next year.

I'd be surprised if it took that long.


I agree, but I'm going based on the road map, which plans for that version to become available this year or next year.


They can go cram it, for all I care.


It can now store 330 million books. We better get started writing!


Kim Jong Un should make sure one of these things filled with music and movies doesn't get into N Korea. It would destroy their propaganda war.



