Hacker News new | past | comments | ask | show | jobs | submit login
The Janitor Programmer (janitorprogrammer.com)
28 points by sutro on Jan 9, 2009 | hide | past | favorite | 8 comments



Here's a trick I discovered for working with bad code that used to help me a lot: treat it as an archaeological dig. Complicated bad programs usually evolve out of simpler bad programs, often in a semi-consistent way - inconsistent enough to be a maintenance nightmare, but consistent enough for a human reader to grok the patterns. If you can read the code backwards in this way and find the remains of the original simpler system, you then have a model for understanding the whole system: it's the crappy simple original one plus the crappy ways it was extended.

For example, complex bad systems are often extended by means of duplication (C-c-C-v-driven development). It's often possible, when you have a number of duplicated sections, to figure out which was the master and which the copies - e.g. if there were any comments in the original code, people will just leave them in the duplicated code. (The sort of programmer who reads and updates comments probably doesn't program by C-c-C-v in the first place.)

This brings up another point about bad code. Naive bad code is a lot easier to understand than sophisticated bad code. It's bloated and buggy, but there's always a sort of internal order to it. Bad programmers are able to maintain it by remembering the 10 places that X is done and the 19 places that Y is done. Sophisticated bad code, on the other hand - the kind full of impenetrable pseudo-abstraction - is a different can of worms. I don't think the above techniques apply to it. However, naive bad code is more common.


(C-c-C-v-driven development)

I like to refer to this kind of programming as the successor to spaghetti code: copypasta.


I want to cry. I found his example of code he needs to janitor to be positively blissful compared to what I often wade through. "When will the hurting stop?"


What problems do people see in this code? The hard-coding of the sql statements that differ by only by table name? The code repetition of the while loops? Something else? Just interested in seeing if my understanding and analysis of it agrees with what others think.

Edit: One more - the "select *" when only the "name" is used?


I see...

no variables defined before use (optional). Pet hate: VB fakehungarian names where objects are called oSomething.

In the oRS.open line, the "2, 2" are magic numbers - I have no idea what they do without looking them up. (Turns out they are CursorType adOpenDynamic and LockType adLockPessimistic). Since they aren't the defaults, presumably they are necessary, but from the snippet given the defaults might be OK.

The presence of database tables Members and Members2 with (in this snippet) no difference between them. If you don't need two, merge them. If you need two, name them more helpfully. If the data will only be in one or the other, could you use a union query? Why might the data be in one or the other and why not use that information directly to decide which to query instead of querying both? What if there are no names in either table?

As you say, selecting * but only using one field, and the repetition within the if branches.

Unhelpfully named array "myarr"

Redimensioning of the array every time through the loop.

But most importantly, all the body of the code does is take the names from the recordset and put them in an array. I expect (hope) there is a shorter, more efficient, more idiomatic way to copy a recordset into an array.

See other HN topic about the fun, easy and addictive nature of criticism. I shouldn't really - at least they have a product with members and money to employ people to code for them and enough success that their code stayed around long enough to become old and maintainable... and there's plenty of people who'll argue that make it work, ship it, then if necessary rewrite it is much prefereable to make it beautiful, perfect it, tweak it, file for bankruptcy.


Sorry the example isn't ugly enough for you guys, but I had to pick something reasonably generic. Sounds like we've all seen a lot worse, but most of the really bad stuff I've seen could be easily identified as proprietary. That particular type of snippet was taken from a system that had a sick amount of duplication and some really "creative" use of arrays for sorting that took forever, when the sort could have simply been done with SQL in a fraction of the time.



Bookmarked.

When you're not flying high on a startup, this kind of stuff pays the bills.




Consider applying for YC's W25 batch! Applications are open till Nov 12.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: