
I was hired for this exact reason at my last company and tasked with rewriting it while maintaining bit-for-bit identical output.

The company itself was in biotech (cancer diagnostics) and was relatively new, spun off from a rather well-known research lab. They quickly realized that their system was incapable of scaling (or being maintained properly...) to the needs of a business.

The code itself was written primarily by one man. He was quite bright, but not really a software engineer by trade. Usual stuff: no source control, a mishmash of technologies, spaghetti code, and homegrown algorithms and hardware to solve well-understood problems that already had available solutions. New management wanted me to rewrite everything in C# (against my protests; not because I dislike C#, but because the code dealt primarily with image analysis and hardware control/robotics).

I began by doing exactly as you proposed; I reverse engineered every bit of code. I took extensive notes (it was hard to follow) and walked through each step from sample prep to image acquisition to analysis to result. I started writing each sub-system only after I understood how everything pieced together.

The real bitch of it was that the original developer relied on automating ImageJ for nearly all of the image analysis. As my original requirement was to not alter the results in any way, I literally rewrote large swaths of ImageJ in C#. Bugs and all.

Well, turns out ImageJ (Java) is compiled with /strictfp. C#/.NET does not support this, so my floating point results were oh so close, but not identical. This was initially a problem for management... until the CEO was replaced, along with my boss, and the new team thought the entire project was a dumb waste of money and had me build a new system from the ground up.

That system was released (successfully) early this year. I began work on it nearly five years ago now, with many detours along the way. I now work elsewhere.
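
On the "oh so close, but not identical" floating point results above: here is a minimal sketch of how intermediate precision alone can change the final bits. It is not what ImageJ or .NET actually do internally. It uses float32 vs. double as a stand-in for the general case of strict vs. wider intermediates, and `to_f32`, `sum_strict`, `sum_wide`, and the input values are all made up for illustration.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (IEEE 754 double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_strict(values):
    """Round to float32 after every addition (no extra intermediate precision)."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def sum_wide(values):
    """Accumulate in double precision and round to float32 only once, at the end."""
    total = 0.0
    for v in values:
        total += to_f32(v)
    return to_f32(total)


def bits(x: float) -> str:
    """Hex of the float32 bit pattern, for an exact (bit-for-bit) comparison."""
    return struct.pack(">f", x).hex()


# Hypothetical input: 2**24 followed by two small increments.
values = [16777216.0, 1.0, 1.0]
strict = sum_strict(values)
wide = sum_wide(values)

print(strict, bits(strict))
print(wide, bits(wide))
print("close:", math.isclose(strict, wide, rel_tol=1e-6))
print("bit-identical:", bits(strict) == bits(wide))
```

Both sums agree within a loose tolerance, but the bit patterns differ, which is exactly the kind of mismatch that fails a bit-for-bit requirement while passing any reasonable "close enough" check.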




>maintaining bit-for-bit identical output

The worst projects are those where the company doesn't really want to upgrade, so the only requirement is "make it exactly like the old system."

In my first job out of college, I upgraded an approval workflow engine from VB3 to C#. It was written by someone who had never heard of state machines, so it had a weird ad-hoc design that would e.g. get confused if two documents were in the same state at the same time.

I demonstrated the bugs and suggested an alternative approach that would be simple and robust, but management wouldn't have it. They reminded me that the job was to make it exactly like the old system.
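
For anyone wondering what a state-machine version of an approval workflow might look like, here is a rough sketch. It is not the design proposed at that job (the thread doesn't describe it); the states, events, and the `Workflow` class are hypothetical. The point is only that each document carries its own state, so two documents sitting in the same state at the same time is a non-issue.

```python
from enum import Enum


class State(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"


# Allowed transitions: (current state, event) -> next state.
# These states and events are illustrative assumptions.
TRANSITIONS = {
    (State.DRAFT, "submit"): State.SUBMITTED,
    (State.SUBMITTED, "approve"): State.APPROVED,
    (State.SUBMITTED, "reject"): State.REJECTED,
    (State.REJECTED, "revise"): State.DRAFT,
}


class Workflow:
    """Tracks one state per document ID, so documents never share state."""

    def __init__(self):
        self._states = {}

    def add(self, doc_id):
        if doc_id in self._states:
            raise ValueError(f"{doc_id!r} is already tracked")
        self._states[doc_id] = State.DRAFT

    def state(self, doc_id):
        return self._states[doc_id]

    def fire(self, doc_id, event):
        """Apply an event to one document, rejecting undefined transitions."""
        current = self._states[doc_id]
        key = (current, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed from {current.name}")
        self._states[doc_id] = TRANSITIONS[key]
        return self._states[doc_id]


wf = Workflow()
wf.add("doc-1")
wf.add("doc-2")
wf.fire("doc-1", "submit")
wf.fire("doc-2", "submit")  # both documents are now SUBMITTED at once
wf.fire("doc-1", "approve")
print(wf.state("doc-1").name, wf.state("doc-2").name)
```

Keeping the transitions in one table also means an invalid move fails loudly instead of leaving the workflow in an undefined state.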


Yeah, it stinks, and it's rarely the right approach. In this case they wanted to avoid a complex re-validation, but it was short-sighted. The assay was not market ready yet anyway, we had no difficult-to-obtain clearances, and the original design was lacking in many ways. It was essentially image analysis techniques taken straight out of the '70s. The new system is much better.


If those calculations are business rules, then you absolutely do need to make it output the same as the old system in those ways, because it's the foundation upon which a bunch of assumptions are made.


They were primarily intermediate values used in the overall process of finding a certain type of cell. The algorithm needed improving anyway in order to be production ready, so in this case identical output was not necessary.


I think the research world has a lot of this kind of code. These projects are written by one or two people, and it is extremely hard to understand what is really under the hood.


It does, can confirm. I've rewritten more than a few MATLAB algorithms which, while novel, were of... questionable... quality.


>>> That system was released (successfully) early this year. I began work on it nearly five years ago now, with many detours along the way. I now work elsewhere.

At what point in the project did you leave? I can't imagine anyone staying at the same company doing the same rewrite for 5 years.


I stayed until I finished it. It wasn't a rewrite anymore; only the application stayed the same, and even then we added a ton of functionality (e.g. morphological analysis as well as identification). It's something I wanted to complete, as I literally built huge parts of it by myself and wanted it to succeed. I took what I learned at the previous company and had a chance to build something similar from the ground up under my own (technical) direction. It was a bit of a pride thing.

Honestly, having worked in medical devices for more than a decade, a five year dev cycle isn't crazy. All said and done it was about three years of dev due to the aforementioned detours (shiny object chasing by management.)

In the beginning there were two other devs on the team, but they were web guys and primarily concerned with CRUD stuff. They were laid off and ~1 year later we hired three more, but again, primarily concerned with the web side and third party integrations. I was all hardware, image processing, image viewing, and image analysis, only helping them when needed. These images are multiple gigapixel (~20GB uncompressed), so just managing and viewing them is a lot of work.


Can you recommend any good biotech companies in image/optics? I just got my degree in BioEng. in bio-optics and have been pretty unsuccessful in getting any traction on the job market.


I was involved in the digital pathology sector. Leica (formerly Aperio) has an office in Vista and I hear it's still a good place to work after the acquisition. You may also try Epic Sciences, Indica Labs, or BioImagine. There are far more, but you could also try going to local conferences. I used to attend Pathology Visions each year. Lots of industry representation and new tech.

You probably know this already, but look for assay dev labs with existing software groups. Best of both worlds. Otherwise you run the risk of developing software in a culture which understands nothing of software dev. Not fun.


> Otherwise you run the risk of developing software in a culture which understands nothing of software dev. Not fun.

Preaching to the choir. My thesis was a lot of that, haha.

Thanks a TON for the recommendations!



