There are some neat tools here, but it also feels like a bit of NIH. It would be interesting to hear why these are better than, say, Gulp, Yeoman, Hubot, Casper.js, etc.
Absolutely. It's a fine line. We haven't always made the right call for sure.
I think Metalsmith and Khaos are pretty good examples of why we feel the need to build our own.
The crux of it is that the alternatives (Yeoman and Jekyll) aren't designed to be augmented without opinion. They are designed as "solve all your needs" tools that hand you their preferred workflow.
Metalsmith doesn't claim to solve all your needs. It actually doesn't solve any of your needs out of the box, unless you need a really complex way to copy files from one directory to another without doing anything else to them. It defines an extremely simple core (an API that extensibility can be built around) and then hands over all of the logic to plugins. Which means it's pretty much never going to get in your way.
Similarly with Khaos, which we haven't really perfected yet, we're looking for an extremely simple core API design that people can augment to solve their scaffolding needs. We want to use it ourselves to scaffold integrations for Segment and to scaffold new Node and Go projects; we're even thinking about scaffolding Segment-branded PDFs.
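To make the "tiny core, everything else is a plugin" idea concrete, here's a rough sketch. This is not Metalsmith's actual API: `createPipeline`, `use`, `run`, and `renameMarkdown` are hypothetical names made up for illustration, and the "files" are just an in-memory object instead of a real directory.

```javascript
// Hypothetical minimal core: it only holds a list of plugins and runs them in order.
function createPipeline() {
  const plugins = [];
  return {
    // Register a plugin: a function that receives the files map and may mutate it.
    use(plugin) {
      plugins.push(plugin);
      return this; // allow chaining
    },
    // Run every plugin over a shallow copy of the input files.
    run(files) {
      const out = Object.assign({}, files);
      for (const plugin of plugins) plugin(out);
      return out;
    },
  };
}

// With no plugins, the core just copies files through unchanged.
const input = { 'index.md': '# hello' };
const copied = createPipeline().run(input);

// Example plugin (hypothetical): rename every .md file to .txt.
function renameMarkdown(files) {
  for (const name of Object.keys(files)) {
    if (name.endsWith('.md')) {
      files[name.slice(0, -3) + '.txt'] = files[name];
      delete files[name];
    }
  }
}

const renamed = createPipeline().use(renameMarkdown).run(input);
```

The core never needs to know what a plugin does, which is roughly why a design like this stays out of your way.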
Yeoman doesn't think that way, evidenced by this paragraph from their landing page:
> Through our official Generators, we promote the "Yeoman workflow". This workflow is a robust and opinionated client-side stack, comprising tools and frameworks that can help developers quickly build beautiful web applications. We take care of providing everything needed to get started without any of the normal headaches associated with a manual setup.
Khaos doesn't prescribe any workflows or client-side stacks, it just lets you scaffold things. What you want to scaffold is up to you, as it should be. You can use it for eBooks for all we care :)
Khaos also happens to be built on Metalsmith. It's just a series of plugins that run before the files are written to an output directory. That wouldn't have been possible with Jekyll; we'd have had to roll our own again inside Khaos or do some incredibly hacky shit to get it to work.
---
All that said, some of our projects suffer from NIH. And when they do it's usually clear over time since we get sick of dealing with them and we use something that has more widespread support behind it. But for others, we're super glad we've rolled our own because we get tired of other people's opinionated software :)
There are definitely lots of existing pieces of software that we like to use though. I've been thinking recently it would be cool for companies to have a Chrome Extension that the engineering team could use to mark GitHub repositories as "liked/disliked" in terms of adhering to the company's design ethos. That way we could keep track a lot more easily of the stuff we like to default to, so that we don't have to roll our own all the time.
Do you guys face issues with the increased mental complexity this approach brings? I am a bit shocked to know there are over 1000 repos in the account. I can assume developers typically need to keep at most tens in their working set, but 1k does sound like a lot to manage.
Big monolithic repos actually increase the mental complexity more in our opinion. Mental complexity really comes with how much of the system you're holding in your head at once. And with smaller repos you might be dealing with 10-20 repos on any given day, but that's only 1-2% of our codebase/system. The other 98% is ignorable. So we end up holding less in our heads, assuming we've abstracted things correctly.
It did get annoying to deal with the mechanics of lots of repos. So we built tooling to make that easier. For example, CLI commands like "goto analytics.js" will clone and take us to the local copy of the repo. And "publish patch" handles all the mechanics of updating History.md from the git log, incrementing the version appropriately in package.json and component.json, tagging the commit and releasing to GitHub and npm. Khaos, also mentioned in Sperandio's article, helps us template out new repos quickly. With a few pieces of tooling like that you can move pretty fast across lots of repos.
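For a feel of what the version-incrementing step of something like "publish patch" involves, here's a minimal sketch. It is not the actual tool: `bumpVersion` and `bumpManifest` are hypothetical helpers, the manifest is an in-memory object standing in for package.json, and it assumes plain "major.minor.patch" versions with no pre-release tags.

```javascript
// Hypothetical helper: bump a "major.minor.patch" version string.
function bumpVersion(version, release) {
  const parts = version.split('.').map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error('expected a version like "1.2.3", got: ' + version);
  }
  const [major, minor, patch] = parts;
  switch (release) {
    case 'major': return `${major + 1}.0.0`;
    case 'minor': return `${major}.${minor + 1}.0`;
    case 'patch': return `${major}.${minor}.${patch + 1}`;
    default: throw new Error('unknown release type: ' + release);
  }
}

// Return a copy of the manifest (standing in for package.json) with the version bumped.
function bumpManifest(manifest, release) {
  return Object.assign({}, manifest, { version: bumpVersion(manifest.version, release) });
}

const next = bumpManifest({ name: 'example', version: '1.4.9' }, 'patch');
// next.version is now "1.4.10"
```

A real version would also write the file back, update the changelog, tag the commit and publish, but the point is that each of those steps is small and scriptable.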
There are definitely a few drawbacks. For one, GitHub is actually designed around a monolithic repository mindset, maybe because it's all built on Rails :) which means that it gets harder to find things. So we needed to build directories for finding stuff, since no good option exists.
There's also an interesting balance for things that technically should be their own repository, but shouldn't for workflow reasons, since they will actually be higher quality if they are grouped and updated at the same time. Splitting things out can actually discourage updating if the cost is too high.
There's also a big open question around what should be a separate repository, and what is separation just for separation's sake. It's a similar kind of question to the one that has to be answered for individual functions in a codebase. We've made the wrong call sometimes and split things out that should have really been together.
I tend to think that the drawbacks are worth it, since it means things are actually decoupled and generally more parts of the codebase are able to get to a "finished" state. Lots of the modules we've built haven't been touched in months or years, since they do exactly what they need to do without any bugs.
In general I think that in programming we talk a lot about building things that are decoupled and simple, but we haven't invested anywhere near enough in tooling to mitigate the drawbacks of things being more spread out. The pros are extremely valuable, and the cons are solvable; we just haven't done anything about them.
Ian and Peter really nailed it with their comments. For me, the mental complexity of trying to hold a huge API in my head all at once greatly outweighs the cost associated with taking some time to find a well maintained, tested repo that does the exact thing I need really well :) Another great example came to me today: syntax highlighting. The readme on our solution [here](https://github.com/segmentio/highlight) hints at why the decision to break things apart made sense for us. As for both mental complexity and ease of use, I think this is a lot simpler than something more traditional like, for example, highlightjs.
Rather than going to highlightjs's website to download a custom build of the script that doesn't have 22 languages (20 of which you probably don't need), adding it to your asset pipeline, then repeating all of that whenever you need to add a new language, you can easily add and remove each language from your build ad hoc with highlight's plugin approach plus a simple package manager and build process like duo.