This was something that surprised me after I did my PhD as well. I thought that employers would focus on my specialized skills and "someone else" would somehow pick up the pieces and make something out of what I did. Turns out this is completely wrong, and I now see how frustrating it is to work with people that have this kind of attitude.

Most of most jobs is a bunch of mundane stuff. I've seen it in software development, and I've seen it in management consulting. The best people, typically, are those that will happily do both, understanding that the fun stuff comes with a lot of baggage.

The "someone else is better at the stuff I don't want to do than me" argument rarely holds up either. The friction that comes from dividing the work along lines like modeling and production and trying to hand off is rarely worth it when one person can do both.

Anyway, I've been where the author is, but personally I think it's wishful thinking, unless maybe you want to start your own shop and structure it around yourself that way.




I dunno... I am a software/data engineer who partners with data scientists. I think that comparative advantage here is real and important. Don't get me wrong, I'm happy when my data scientist partners write good code or show interest in getting better, but I'm more than happy to take their janky code and make it production ready. It often needs to be optimized for scale or refactored for reusability, and a lot of that falls very solidly in the engineering discipline.

I'll often tell my DS partners, "Don't worry about the code; you get the math right then give it to me. You can go about doing more math, I'll do the engineering." In my experience, this is often a really pragmatic division of labor.

This is also important because often, the data scientists are never on-call, so if something breaks in production, engineering needs to know what is going on.
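To illustrate the kind of handoff I mean, a hypothetical before/after (invented code, not from a real project) where the engineering pass keeps the math identical while making it vectorized, validated and reusable:

    import pandas as pd

    # The "janky but mathematically right" version a DS might hand over:
    def score_v0(df):
        out = []
        for i in range(len(df)):
            out.append(df["a"][i] * 0.7 + df["b"][i] * 0.3)
        return out

    # The productionized version: same math, vectorized, with input checks.
    def score(df: pd.DataFrame, w_a: float = 0.7, w_b: float = 0.3) -> pd.Series:
        missing = {"a", "b"} - set(df.columns)
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return w_a * df["a"] + w_b * df["b"]

    df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
    assert score(df).tolist() == score_v0(df)  # refactor must not change results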


> You can go about doing more math, I'll do the engineering

I'd say most of the teams I supported as a backend/DevOps/infrastructure engineer followed that pattern.

Generally the evolution I saw was: start with a handful of sciency folks working in R or Python, get more grant money as the work becomes more important, then have some SEs thrown into the mix to rewrite in Python and/or C/C++.

A lot of those SEs would float between teams with one of us infra folks to performance-tune as they scaled.

> the data scientists are never on-call,

Yeah, but I'll keep my on-call rotation over their constant 11th hour shenanigans trying to get a paper out the door.


I mean, this is probably how things should be done.

Having the specialists of each field appropriately resourced, focused and aligned streamlines everyone's life. The problem is that most places get the scientists to pull double duty, and we end up in a scenario where scientists are pushing messy, subpar code out the door because they're unsupported and operating out of their depth, while devs don't want to touch it because they're not keen to fix someone else's hacky code, because they probably have other priorities, and because the science stuff likely isn't well integrated into the rest of the code-delivery and monitoring systems.


The big thing is prestige: it takes some pretty big brains to write good, clean production infrastructure that doesn't fall over every five minutes.

Those big brains are absolutely capable of learning the math and doing the creative work of modeling.

They want to do the creative work.

Which means that if you create a prestige gradient and don't let your engineers do interesting things they themselves might get a paper out of, you lose good engineers and get bad engineers instead.

This is also a constant problem. In DevOps and SRE, the very best people are absolutely mercurial and mercenary - because they've been fed total lies about building systems again and again, when the net job description is "hey ops person, this application I wrote is misbehaving, can you do advanced troubleshooting on this server for me so I can go do leetcode".


This. There is no clean application of comparative advantage here. The great individual who will rewrite the code and knows all the details to get the model scalable, robust and production ready can also do the data science "creative" work.

If you are not willing to grow to become that person, you are only damaging yourself long term. And if you don't credit that individual's work: a) it is unethical, and b) they will, and should, leave.

P.S. Seen this firsthand and second hand.


A good software dev should be able to refactor the subpar code.


Sure but it's less fun to be a refactor assistant full time


> often needs to be optimized for scale or refactored for reusability

It's so easy to make subtle assumptions when you redo their code that completely invalidate the work they've done. ML completely collapses on extremely small errors. Handing something off to someone else to refactor is a dangerous step in the process that risks everyone wasting their time.


> It's so easy to make subtle assumptions when you redo their code that completely invalidate the work they've done.

It's equally easy to test whether those assumptions actually break anything. Especially when minor errors can be catastrophic.
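For what it's worth, a minimal sketch of what that testing looks like in practice (toy stand-ins, not anyone's real pipeline): freeze representative inputs, then assert the rewrite matches the original within a tolerance tight enough to catch "small" drift.

    import numpy as np

    def pipeline_v1(x):  # original research code (toy stand-in)
        return np.log1p(x).sum(axis=1)

    def pipeline_v2(x):  # refactored version under test (toy stand-in)
        return np.sum(np.log1p(x), axis=1)

    rng = np.random.default_rng(42)
    x = rng.uniform(0, 10, size=(1000, 20))  # frozen, representative inputs

    # Near-zero tolerance: small numerical drift can silently degrade a model.
    np.testing.assert_allclose(pipeline_v2(x), pipeline_v1(x), rtol=0, atol=1e-12)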

I'm part of a team that was tasked with producing a web app from something that was originally a piece of Matlab code.

We considered just running the Matlab code in a container but ultimately IEEE 754 is IEEE 754 regardless of platform/language, so creating a 1:1 implementation in C++ proved possible.
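If it helps, the parity check can even be bit-for-bit rather than tolerance-based. A hypothetical harness (not our actual one), comparing reference vectors exported from the original implementation against outputs captured from the port:

    import numpy as np

    # Stand-in arrays; in practice 'reference' comes from the original
    # implementation and 'ported' from the new one. IEEE 754 fully specifies
    # +, -, *, / and sqrt, which is what makes 1:1 ports feasible.
    reference = np.array([0.1 + 0.2, 1.0 / 3.0, 2.0 ** 0.5])
    ported = np.array([0.1 + 0.2, 1.0 / 3.0, 2.0 ** 0.5])

    # Reinterpret the doubles as integers and demand exact bit equality.
    assert np.array_equal(reference.view(np.uint64), ported.view(np.uint64))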


The catastrophic errors aren't the problem; the problem is when things are degraded by 20%, which results in a small loss when it could have been a big win. You just assume it didn't work, when in reality it wasn't implemented correctly.


Personally, I'd like to do both the data science and the engineering. I'd also like to be in a team where each member does both. I at least "feel" like these two skills are close enough spiritually to one another that you can find a team composed entirely of people that do both.


> I think that comparative advantage here is real and important

It is.

But so are transaction costs, and all handoffs have them.


Interestingly, I'm a physical scientist in a business that makes hardware and software. I'm up front with people that I'm not an engineer, and nobody expects me to be one. I write code to support prototyping and testing, but don't expect it to go into production. Nobody wants my janky prototype. I produce a theory of operation that covers a proposed design, as well as an outline of the basic manufacturing and service processes needed to make it work. The prototype sometimes helps confirm that the technical requirements for the product can be satisfied at an early stage of a project.

When I do share code and things that I've designed, it's usually for tooling not product, e.g., a script that helps calculate non-obvious design parameters and tolerances. Often, my code is used to test hardware components before the official software is ready.

In my case, I'm on call, though not 24/7, because the business as a whole isn't. For instance I'm available to diagnose supplier and production problems, and deal with weird issues that emerge in the field.


Why do you want them to write code at all, then? Why not just task them with writing user stories around the parameterized functions they need, and just let you figure out how to implement it all?


You usually have to evaluate the model as you develop it; it's not something a story-driven development model will fit.


The idea that creating a model can be a “user story written in math” seems to me a variation on a common misunderstanding about what model creation is, and particularly about the role coding plays in it. Data scientists, statisticians and modellers don't go in knowing what the model is and then just code or specify it. They use code as an exploratory tool to test several hypotheses until they find one that seems to hold - the code, the data, the algorithm, the statistical test, the exploration and the reasoning are all tools in the process of discovering the right model. It simply can't be specified in advance; it has to be run and tested.

Having some spaghetti code is a natural consequence of this exploratory, iterative process. Skipping good SW engineering practices in this exploratory endeavour is just as natural a consequence, since you don't know whether your code will be of any use before you finish running and checking the test results. Why would you bother modularising and doing test coverage on something that is very likely to be thrown away after one or two runs?

I say this from 28 years of experience as both a data engineer and a data scientist. I am a good Python developer, and I can write production-grade code. But I won't refactor my code into that until I know it is the code that generates the right model. And I certainly can't specify this particular code before writing some dirty version of it, testing it and confirming that the model it trains passes some statistical tests, at least.

Basically, the code is not the product - the product is the result of applying some transformations to data and running that through some ML or statistical algorithm to generate a model. Whether the transformations and the algorithm are useful is unknown until tested, hence the specification is unknown until the code has been written, run and tested.
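To make the "discardable tool" point concrete, here is a minimal sketch (hypothetical, using scikit-learn on synthetic data). Every candidate below is a hypothesis, and its code only earns a proper rewrite if the metric clears the bar:

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)

    # Each candidate is a hypothesis: "this algorithm on this data has enough
    # predictive power to be useful." Most of this code exists to be discarded.
    candidates = {"ridge": Ridge(alpha=1.0),
                  "gbm": GradientBoostingRegressor(random_state=0)}
    for name, model in candidates.items():
        r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
        print(name, round(r2, 3))  # only a winner gets refactored for production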


Assuming you're writing actually detailed and complete user stories, and those user stories are for functionality that basically amounts to "just math" (i.e. no UI or API standards or whatever to worry about alongside the business logic), I imagine it's not much more difficult or time-consuming to just write the code than to describe it in a user story.

Having one person write user stories and a second person, who understands the domain less than the first, write the actual code would be twice as much work for a worse result.


What's the best way to write those user stories? Certainly not plain English; you'd probably want to use some kind of pseudocode. Ah, but it would be good if you could run that pseudocode and see what the results are on the happy path, without worrying too much about what happens when something goes wrong. We have that; it's called Python, so that tends to be how these user stories are expressed.
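In other words, something like this hypothetical "story" (happy path only, invented names):

    import numpy as np

    # "As an analyst, I want tomorrow's expected peak demand from the last
    # week of hourly readings." No error handling, no I/O contract - just
    # the math the story needs, but it runs and can be eyeballed.
    def forecast_peak(hourly_demand: np.ndarray) -> float:
        daily_peaks = hourly_demand.reshape(7, 24).max(axis=1)
        return float(daily_peaks.mean())

    week = np.random.default_rng(0).uniform(40.0, 90.0, size=7 * 24)
    print(forecast_peak(week))  # the "acceptance test" is eyeballing this number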


I expect that it makes sense to have the modelers do at least enough coding to test the common use cases/expected inputs for the model to validate it before handing it off to be "properly" implemented.


Right, but they still need to know some code and be able to talk in programming terms.

The "here is some shitty code that does the job, make it good and production-ready" requires far less communication and domain knowledge than them telling you how exactly it should work and you translating that into code


That's an unusual style of operation. It may be a really good one, but we don't really see a lot of teams where people have complementary skills, although really we should.


That’s what makes technical interviews even more frustrating. Most jobs day to day are like 5-10% solving deep technical puzzles and 90-95% fiddling with tooling and automating things that are incredibly frustrating or time consuming.

So ya, I'm not sure how to solve your silly brain teaser, but I have written custom test frameworks to automate the tedium away and save the team hundreds of man-hours. It took a deep understanding of VERY specific tools (shout out to ASIC EDA tools). But that doesn't matter on a technical interview. So you hire somebody who can Leetcode, but can't figure out how to fit all the pieces of the actual job together.


The way I think of it is that in a business you often need to get a certain fixed number of things right before you can even start to make money.

Indeed, 95% of those things are mundane and tractable (maybe you have to be fast and careful), but the remaining 5% are challenges that can only be solved by specialized knowledge or some innovation.

If you hire someone good at solving mundane problems, they can contribute to the 95%. But if 95% of the company is hired like that, the pool of people you can draw on to do the remaining 5% is under 5% (maybe more, if the company is lucky). There's a good chance the business is blocked by that remaining 5% because of the high uncertainty in delivery, which is fundamental to the nature of the problem.

Another approach is to hire, say, 50% to do mundane work and 50% to do both mundane work and more special work. This costs more, but done right it can make the company move faster and give it a better chance to survive than the former approach.

First of all, it's often (but not always) the case that people who can do special work can be trained to do the mundane things better. Second, having many people who can solve one-of-a-kind problems is especially important if, for each project, the 5% can depend on a different set of things; it's often also the case that solutions to the innovative work are closely related to experience dealing with the more mundane parts, so this structure basically recognizes that innovation can come from the rank and file rather than a specialized "ideas guy" who just flies by to solve problems. So what happens in certain small but profitable companies is that they try to find people who are happy to do dirty work, have a business mindset, but at the same time show signs of being innovative.

This is not to say I agree that leetcoding is a good test. I think Google and Facebook don't test enough whether someone is capable of, or interested in, identifying technical priorities with a business mind. Although Google has some data connecting leetcode performance to job performance, I'm skeptical of their performance evaluation methods.


Yep, I always tell people that the "plumbing" is by far the most important part of most software projects.

The actual business problems basically solve themselves if you can write all the glue between them in a way that doesn't add extra cognitive load.


Great way to phrase it. I've found all jobs to be plumbing. The actual task of writing the code to do xyz was always straightforward, or five minutes of googling. What no googling is going to solve for you, though, is how to fit A, B and C of your company's process together into a manageable solution. Manageable being another keyword here. It's easy to write a lot of code. It's hard to keep all the pieces of it organized and flexible for future use cases/needs.


> So you hire somebody who can Leetcode, but can’t figure out how to fit all the pieces of the actual job together.

Yep - 2021 was a bad time for me, mostly because I was stuck working with "smart" people on 2 different projects who had no idea how to add value. All 3 of us have PhDs, but while they had coasted from there in positions where being "the genius" was enough, I had taken a couple of career tangents. They were constantly mesmerized by my Bash skills and infrastructure knowledge, and in the end my parts of the project got done and theirs were written off due to lack of progress.


> It took a deep understanding of VERY specific tools (shout out to ASIC EDA tools). But that doesn’t matter on a technical interview. So you hire somebody who can Leetcode, but can’t figure out how to fit all the pieces of the actual job together.

What would you suggest testing? If you ask about the specifics of a given framework, you'll get someone who's memorized that framework but can't actually think. If you get someone who can code and has a decent level of general intelligence, they'll generally be able to learn a new API.


I totally agree. I work in the private sector, coming from a research position too. I was also focused on the "interesting" side of the problem: the modeling, integrating domain knowledge into the analysis, drawing all sorts of plots... But there were other unavoidable and "uninteresting" needs for the research project, like building a data-gathering system with its API and everything. This required my best software engineering abilities. Needless to say, my best weren't precisely THE best, so as the project got bigger, the not-so-temporary fixes piled up, as did the poor design choices (if any design was done at all). This finally led to a complete restructuring and an almost fresh start.

I feel some of it could have been avoided, so I learned the hard way that the whole modelling + software engineering process is a subtle craft. It is important to take care over the implications of your code and, especially, over how it's done, since it may fall back on you eventually. This reconciled me with the more technical stuff (my tools) and eventually let me put out good work in a more satisfying way.


I think there is value in both and it sort of depends on the organization.

With the commoditization of models, we are seeing the rise of MLEs over data scientists. Engineers that understand enough DS to make things work are wildly proficient in this space.

However, not all models have been commoditized, and there is still a need for new math in many places - that's where the division of labor makes sense. You can't be an all-star engineer and an all-star data scientist; it's just too much for one human.


I have a name for this sort of mundane-yet-employable programming task: "plumbing work". You're not doing the clever problem solving that once sucked you into programming; you're welding together pipes that other people made.


>The "someone else is better at the stuff I don't want to do than me" argument rarely holds up either. The friction that comes from dividing the work along lines like modeling and production and trying to hand off is rarely worth it when one person can do both.

This was always my attitude. Every time you split something you add coordination overhead. This overhead gets worse the more times you split.

Of course there is specialization that can make someone else sufficiently more effective that you shouldn't do everything.

But every gain in specialization has to be weighed against increased communication costs.
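To put a rough number on it: with n people who all need to stay in sync, the pairwise communication channels grow as n(n-1)/2 - 2 people share 1 channel, 5 people share 10, 10 people share 45 - so each additional split tends to cost more coordination than the last.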

Add to that the fact that a lot of problems don't need much specialization but touch on many different disciplines.

This is why I think, contrary to the trend of specialization, we need generalists that can cover most bases at once and decrease the communication overhead considerably.

It's often still good to have specialists, but you should mostly employ generalists and only a few specialists in key technologies that set your company apart from others.


Well, in purely software shops there are often people dreaming up the 'what to do' and a different group of people actually writing the code. Same with systems design: we have 'architects.'

This should be no different for statistical modelers or other disciplines. Employers are just cheap.


I've never worked on more painful codebases than when the "architects" don't have to bother writing actual code and so are ignorant of all the edge cases and special-case business rules that turn their pretty pictures into a horrific ball of mud.


The "architects" I've known also had a habit of giving you their grand perfect plan in a meeting that lasts no more than an hour or so, disappearing for the next 6 months without communicating with the people building the thing at all, and then being surprised when the final codebase looks nothing like the perfect system they'd designed in their head.

Architects have to be involved in building the thing they're architecting.

(It feels like the blueprint analogy fits here - actual building architects write blueprints, and then builders go off and build the house based on those blueprints. The code we write is effectively the blueprint, not the house. So what the hell are our architects making?

We need to treat software "architects" more like building site foremen than actual architects)


Having sometimes taken architect-like roles, I think the big problem is that "architect" is a very loaded term, like everything in IT.

You have the proper architects: those who design nice diagrams, have real technical knowledge, and also code, even if only small portions compared with the rest of the team. These I would call proper architects, and I tend to explicitly say Technical Architect to make the point.

Then you have the "architects" who do diagrams, spend their time in meetings with customers, plan features per sprint, delegate activities, and so forth. These I call managers, and I always double check whether the company is using architect as a synonym for manager.

At least some shops are more honest, using business analyst or solution architect as a synonym for high-level management work.


> Most of most jobs is a bunch of mundane stuff.

That’s why we call it “work.”


It is not that surprising, because software engineers are usually more expensive than scientists. That leads business people to ask serious questions about which staff they really need, how much of the work can be done by lower-paid specialists, and whether they really need professional code quality when the stuck-together Python that scientists tend to write usually works too.


To all the specialists out there: if you value job security, become a specialist who can produce. When the company I'm at did layoffs, the biggest cuts were in groups that did not actively produce the product we were selling.


I love boring stuff because often it allows me to do something really well. Hard problems are fun but hard to solve well by their nature. Programming is fun because I get to do both of these things.


Having done both sides of the work, I totally understand the author's perspective. Modelling and writing production code are certainly two different skillsets, and many people will prefer one or the other, but not both. Now, there is one thing I guess the author doesn't get (or gets but doesn't like), and another thing that most companies don't get.

The former is that companies don't care enough about what the individuals they hire prefer - especially not if it doesn't address their need. They don't need a beautifully crafted model that can't be run - they want actionable results that hit the bottom line, and in energy forecasting that means generating new forecasts every cycle (months, days, hours, minutes - whatever). A great model that can't be put in production and run efficiently has about the same value as no model, and an average model in production will have much more impact.

The latter is that companies don’t usually understand that ML is not software development. Putting ML in production is, but finding the right model is research work, and code is mostly a discardable tool for research. Its goal is not to go live, it is to validate a hypothesis (in this case, that algorithm X, when presented with data Y, generates a model with enough predicting power to be useful to the business). This validation requires code to gather Y, clean/join/analyze/reshape/featurize it into a more informative and clean representation (clean from the algorithm’s perspective, not necessarily a human’s), run X, run inference with the generated model and run some test of the results against additional data.

If this test is negative, some or all of the code written above is useless, and we go back to the drawing board. The process is also highly coupled: a new data source has to be joined to the rest - coupling; a new data transformation changing a feature changes the data schema downstream - coupling; a new algorithm needs a different data input - coupling, and so it goes. If you have experience like mine, you may actually be able to write it in a way where you can reuse some of it, but I have 28 years of experience with data; there are simply not enough people in the market with that level of background or the interest in learning all this. Companies must accept that they will not always get this perfect candidate with all the skills they want, and start thinking about pairing the right people in teams.
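A tiny, schematic illustration of that coupling (hypothetical names):

    import pandas as pd

    # Hypothetical two-step pipeline; the coupling is the shared schema.
    def featurize(raw: pd.DataFrame) -> pd.DataFrame:
        return pd.DataFrame({"demand_lag1": raw["demand"].shift(1),
                             "hour": raw["timestamp"].dt.hour}).dropna()

    FEATURES = ["demand_lag1", "hour"]  # training code depends on this exact list

    raw = pd.DataFrame({"demand": [50.0, 55.0, 60.0],
                        "timestamp": pd.to_datetime(["2024-01-01 00:00",
                                                     "2024-01-01 01:00",
                                                     "2024-01-01 02:00"])})
    print(featurize(raw)[FEATURES])

    # Add one feature in featurize() and everything downstream must move too:
    # FEATURES, the model's input shape, and the inference service's contract.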

Some who have been around for a while may remember the Venn diagram of the perfect Data Scientist - it was usually an intersection of business, math and programming skills (also often communication skills and a few others). My thoughts since I first saw this diagram were: “Even if there are people out there with all these skills, why would they want to work for others?”

This, more than anything else, is my guess at the core reason why so few companies are successful in putting ML in production.



