1) Coders, not designers
2) You use the machine only for production, nothing "fun"

Do you ever feel the need to go above 256 GB?

I have three operating systems installed (Win10, Win7, Arch), about 20 GB of media, Office 2015, etc. on my primary production machine, and I'm still using less than 120 GB. I'm asking because I'm wondering whether any developer has a reason to skip a sub-$100 ~256 GB SSD and instead opt for 4 TB of HDD?
When I worked at Microsoft, you needed a large drive to hold multiple copies of the Windows source code and prebuilt object files (which made building modified components much faster). That was especially true when you needed to work out of multiple branches, for example to fix a bug in a shipped version of Windows for a patch as well as in the current in-development version. SSDs made building much faster, but having only 256 GB would have been limiting, depending on how many subdivisions and branches of the Windows source code you needed to work on.
I also had another machine dedicated solely to running VMs. I had a script scheduled every morning that would create a VM from the daily build of my team's branch, set up various tools on it, and take snapshots along the way. I liked to keep old VMs around so I could do a manual bisect and find out which build introduced a bug. Having a 1 TB hard drive was very nice because I could go longer without having to purge old VMs.
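For anyone curious what that sort of nightly job looks like, here's a rough sketch of the idea in Python against VirtualBox's VBoxManage CLI. The build-share path and VM naming are made up, and the original setup used internal Microsoft tooling rather than VirtualBox, so treat it as an analogue only.

    #!/usr/bin/env python3
    """Nightly sketch: import today's build as a VM and snapshot it at key points."""
    import datetime
    import subprocess

    BUILD_SHARE = r"\\builds\daily"   # hypothetical network share holding daily .ova images
    TODAY = datetime.date.today().isoformat()
    VM_NAME = f"teambranch-{TODAY}"

    def vbox(*args):
        """Run a VBoxManage subcommand and fail loudly if it errors."""
        subprocess.run(["VBoxManage", *args], check=True)

    # 1. Import the daily build image as a new VM.
    vbox("import", rf"{BUILD_SHARE}\{TODAY}\build.ova", "--vsys", "0", "--vmname", VM_NAME)

    # 2. Snapshot the clean state before any tools go on.
    vbox("snapshot", VM_NAME, "take", "clean-build")

    # 3. Boot headless, install test tools (details omitted), then snapshot again.
    vbox("startvm", VM_NAME, "--type", "headless")
    # ... provisioning of team tools would go here ...
    vbox("snapshot", VM_NAME, "take", "tools-installed")

Keeping one of these per day is exactly what eats the disk, since each imported build plus its snapshots runs to tens of GB.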
Another former (although more recent) Microsoftie here...
It's a lot better than it used to be, because the switch from Source Depot to Git means that you can use branches instead of having multiple copies of the repo if you want to work on multiple things at once (which to me was always the one huge PITA about Source Depot, which was otherwise pretty good as far as non-distributed VCS's go). Combined with the build caching, you could probably get away with building Windows on a 256 GB drive now, even if most devs have more.
I'm a coder, not a designer. I build corpora for natural language processing. In the last 24 hours I downloaded 62 GB of web pages for textual analysis. That represents about 10% of what I plan to download for this one project, but I just hit pause on the downloading to avoid hitting my home internet provider's monthly data cap. I have probably spent about a solid week of coding time on this, or a few weeks of tinkering time. I mention that to highlight that this is just one project; I do other things on top of it. This one is English only; there are maybe 50 or so other common languages that I've done at various scales. So yes, disk space is useful. Obviously I don't need all the data for all projects loaded locally at once, but working with data at this scale on a cloud drive isn't really a great option, so some off-machine local storage is better, plus on-machine local storage for whatever data is being processed right now.
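To put rough numbers on it (the per-language scale factor below is purely my assumption; everything else comes from the figures above):

    downloaded_gb = 62        # pulled in the last 24 hours
    fraction_done = 0.10      # "about 10% of what I plan to download"
    project_gb = downloaded_gb / fraction_done     # ~620 GB for this one English corpus

    other_languages = 50      # "maybe 50 or so" other common languages
    avg_scale = 0.25          # assumption: other languages done at ~1/4 the English scale
    total_tb = (project_gb + other_languages * project_gb * avg_scale) / 1000
    print(f"~{project_gb:.0f} GB for English alone, roughly {total_tb:.1f} TB across languages")

Even with aggressive pruning, that's not a workload you'd want to squeeze onto a single 256 GB SSD.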
You say nothing "fun" but this is fun for me... I think work is allowed to be fun. It's not a day job at the moment, more of a hobby, but it relates to some day jobs I've had and would like to have, and gives me data I can play with to keep my skills up. It's valuable (production quality) data when processed, so it's not mere entertainment.
I have about 3 TB of SSD on my workstation. It's about 3/4 full of builds. They're often staging branches of one kind or another, where the features I'm working on can be developed in isolation.
When I worked at Microsoft, I had several TB of spinning disk. Management was too bloody cheap to buy SSDs ("If we buy you one, we have to buy everyone an SSD..."). I calculated that my lost time was costing enough for Microsoft to buy me a complete new, insanely high-end workstation every three or four months, monitors included. Insanity.
I wound up buying my own hardware in that group, on several occasions. That was terrible.
Like the MS people, I'm a Red Hatter, and I picked up 15 TB last spring to keep up with my storage needs. I have dozens and dozens of VMs. I consult/work on OpenShift, and simulating an HA cloud environment takes a few hundred GB. And when you have customers and projects on multiple versions, you need more. I tend to run the N, N-1, and N-2 versions of OpenShift in HA in VMs (3 masters, 3 infra, 3 app, 1 bastion; 10 total), plus a storage VM, and some of our other relevant products, on a 32-core/128 GB HP Z820 workstation. It's the cheapest way I found to run a persistent lab environment. If I ever start doing OpenStack, I might need a small fleet of HP workstations.
The moral of the story is that you want your lab/dev box/environment to match what you're trying to test or simulate. I need to simulate "clouds". If you're working on a single web app/site, you could get away with less.
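A rough back-of-envelope of why that adds up (the per-VM disk figure is my assumption, but it lands in the "few hundred GB per cloud" ballpark mentioned above):

    releases = 3              # N, N-1, and N-2 versions of OpenShift
    vms_per_cloud = 10 + 1    # 3 masters + 3 infra + 3 app + 1 bastion, plus a storage VM
    gb_per_vm = 40            # assumption: average allocated disk per VM

    total_gb = releases * vms_per_cloud * gb_per_vm
    print(f"~{total_gb} GB just for the OpenShift labs")   # ~1.3 TB before anything else

And that's before the other products and installation media that accumulate alongside a lab like this.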
Windows XP, 7, and 10 VMs, a few Linux VMs, source checkouts for GCC and Clang plus a few build trees, some traces that end up being a few GB: it all adds up.
Now, whether I actually need 500+ gigs of VM images is another question. I could definitely clean up and pare them down to just what I need, which would no doubt be much less than 500 gigs. But time is money, and it's just cheaper to buy more disk space than spend my time being an overpaid garbage collector.
HAMMER in DragonFly BSD automatically maintains file versions, though it is still not mature, and it is starting to look like it never will be.
One thing someone like you could use is append-only storage for anything critical. The idea is that you never delete a file until you run out of room for it. There's a whole series of versions of each file, with your system pointing to the most recent one. So, if something bad happens, you can rewind an individual file, or the system as a whole, back to some earlier point.
It might take a lot of extra storage to do that. I could see never losing work being worth throwing terabyte drives at a system only using 120GB. Add one or two for RAID, one for local backup, and some remote storage like Backblaze or Glacier for best results. The backups would just keep the current versions or periodic snapshots to reduce cost.
Note: I don't know whether this feature is supported in the Windows ecosystem right now. A versioned file system was a feature of its predecessor, OpenVMS, which did much of what I described, plus clustered apps. The only difference was that storage was too limited back then for the versioning to go as far as I'm advocating. I'm mainly posting to give you an idea of how extra storage might help, should your OS support the use case.
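To make the append-only idea concrete, here's a minimal Python sketch of the pattern: never overwrite, copy every save into a version store, and keep a pointer to the newest copy. The layout and names are my own invention; a real versioning file system (or plain git) does this far more robustly.

    import shutil
    import time
    from pathlib import Path

    STORE = Path("versions")   # hypothetical version store; ideally on a big, RAIDed drive

    def save_version(path):
        """Copy `path` into the store under a timestamped name; nothing is ever deleted."""
        src = Path(path)
        dest_dir = STORE / src.name
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / f"{int(time.time())}-{src.name}"
        shutil.copy2(src, dest)
        (dest_dir / "CURRENT").write_text(dest.name)   # pointer to the newest version
        return dest

    def rewind(name, index=-2):
        """Return an older version's path (default: the one just before the latest)."""
        versions = sorted((STORE / name).glob(f"*-{name}"))
        return versions[index]

The cost is exactly what the parent describes: storage grows without bound until you prune, which is why cheap multi-terabyte drives make the scheme practical.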
Windows does have file versions; I can go into a file's Properties and restore them, but I think you have to set it up first. I like your idea, and I've been using git for the same thing (since I have very little binary data, I don't feel bad about using the free hosting sites with a git commit -a; git push).
This was another of my motivations behind the question (which I didn't mention): why buy one 4 TB drive, which will fail eventually, rather than four 1 TB drives? As people get more pro, I think Microsoft should lead with home RAID solutions. I know there was Windows Home Server, but that's no more. The OS should do integrated backup to various remote storage services.
I thank you both for your replies. As far as RAID goes, it would be nice for Microsoft to do more in that area for desktops. I do remember, back when I used Windows, there were RAID appliances that you could set right next to your desktop: you just plug in some hard drives, plug the appliance into the desktop, and it does the rest.
I'm sure they're still around for anyone that wants better RAID than what Microsoft has. Probably got a lot cheaper, too.
My "work" laptop has ~600 GB used out of a 1 TB SSD. Big chunks are eaten up by VMs and cloned source trees with build artifacts. That said, I also have another 1-2 TB on a build machine I connect to.
I have around 20 VMs sitting on disk, several of which have over 20 GB of disk allocated. I run these in VMware Workstation for testing on different Windows versions and on Linux and BSD distros. They eat a lot of space.
I also have virtual network devices which run in GNS3, and those take up a fair bit of space too.
I routinely have VMs, and images of VMs, taking up many hundreds of GB. I could get by with 256 GB, but it wouldn't be comfortable. I'd rather clean up old images and software on my own schedule than have the timing dictated to me because someone cheaped out on a drive.
If something like that delays me by just a day, it could easily cost more than a single high-speed 1 TB SSD. That isn't so unlikely, because disk-full errors always manifest as something else, particularly when the VM is filling the host disk while its own internal disk still has free space.
Pictures take up the majority here. I do not keep them on my primary (SSD) drive, though.
Then there's also video, which is getting more and more popular (GoPro, drones), with file sizes not really going down much thanks to things like Full HD and now 4K.
Edit: I did not register your "nothing fun" point properly. I guess you mean a purely work machine? Then no, I do not see a reason to have a drive above, say, 512 GB. Do keep in mind that codebases grow quickly, though; I have about 25 GB of source code on my main work machine.
Sure. I have a couple hundred TB in hard drive capacity and 1TB (4 x 256GB) SSD capacity.
I go for HDDs instead of SSDs because it would be cost prohibitive for me to achieve the same I/O bandwidth with SSDs that I can get on RAID 0 or RAID 10 HDDs and multiple 10GbE connections.
As it stands I'm already well into five figures between computing, storage and networking hardware. Replacing 24+ 8-10TB 7200RPM hard drives with equivalent SSD capacity would cost me nearly $100k in storage alone.
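For anyone who wants to sanity-check that tradeoff, the back-of-envelope looks roughly like this; the per-drive throughput and per-TB SSD price are my assumptions (enterprise pricing at the time), not quotes:

    hdd_count = 24
    hdd_tb = 10
    hdd_seq_mbps = 180          # assumed sequential throughput of one 7200 RPM drive
    ssd_price_per_tb = 400      # assumed enterprise SSD pricing of the era

    capacity_tb = hdd_count * hdd_tb                 # 240 TB raw
    raid0_mbps = hdd_count * hdd_seq_mbps            # ~4.3 GB/s aggregate in RAID 0
    ssd_cost = capacity_tb * ssd_price_per_tb        # ~$96,000 for the same raw capacity

    print(f"{capacity_tb} TB of HDD in RAID 0: ~{raid0_mbps / 1000:.1f} GB/s aggregate")
    print(f"Same capacity in SSD: ~${ssd_cost:,}")

At ~4 GB/s aggregate, the spinning-disk array already saturates several 10GbE links (each tops out around 1.25 GB/s), which is the point: the SSDs would cost far more without raising the ceiling past what the network can deliver.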
I generate traces of distributed applications I run (essentially logs of functions called, time taken, and other metadata). While the applications run on remote server(s), it's often nice to keep local copies of these logs for analysis. These can be on the order of tens of MB per node per run and I can easily generate many hundreds of GB if I'm not careful while debugging or tuning an application.
As a side note, has anyone noticed that UEFI (and GPT) were forced down our throats just around the time everyone would have a 128, 256, or 512 GB SSD boot/system disk?
All that hassle to access more than 2.2 TB on a boot disk, when no one would have had such a device in use (GPT on non-boot disks is of course perfectly accessible on most OSes without any need for UEFI firmware).
I'm guessing you mean a work machine, as an entertainment machine (e.g. for gaming) would easily use more than 256 GB. My dev machines have 120 GB SSDs, and that's enough. The real number crunching happens on a machine that has two 960 GB SSDs in RAID 0. I know, that's not exactly safe, but all the data on there is ephemeral anyway.
My VM folder is 408 GB: 13 VMs (Windows, OS X, Linux), multiple versions, some with one or two snapshots (e.g. just before my app is installed, so I can test the installer).
So yeah, I need more than 512 GB, and no, I don't want to run things from an external drive.
The impression I have is that all space will be filled one way or another, either with work files or simply as a dumping ground for media files. Shit just builds up over time...
Agreed... most space is used on media (including 3D designs, layouts, etc.). Code is just really compact, even when you have a huge local version tree.
The only counterexamples I can think of are large databases/VMs... which most wouldn't keep local anyway.
Really, with the speed (latency/bandwidth) of modern networks versus rotating media, there seems to be little reason for local storage that isn't flash-drive fast... other than security or minimizing bandwidth use.
Large databases for backend development on old projects that have been around for decades. I currently have a 1TB SSD installed in my work laptop, and it is about 75% full.