Something that's important to remember about Code Complete (which is great, I'm absolutely not knocking it at all) is that it's written with languages like C, C++, and Java in mind. In C and C++ particularly, 'shorter function' might be a euphemism for 'clever function', hence the higher defect rate and the greater difficulty in comprehension.
Functions in more expressive languages (by definition) shouldn't need the same length; if you're writing a 200 line Python method, you are most definitely Doing It Wrong.
Here are some of my rules of thumb for breaking things up (absolutely not independently arrived at, by any means):
- Loop bodies are good candidates for standalone functions.
- Everything in a function body stays at one tier of abstraction. If I find myself doing something at a lower level than the rest of the function, it's time to break things out (see the sketch after this list).
- If I find myself using the word 'and' to describe what a function does, and it isn't pretty high up the abstraction ladder and primarily just driving calls out to other functions, it probably needs to get broken up.
- I'm not formal about it, but my mental metric for my code is to minimize 'total loc + total indentation * some constant', meaning I'm willing to trade loc to reduce nesting, to a point.
- I like functions to be no longer than a screenful. That's a max; most of mine are shorter than that.
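A rough Python sketch of the 'one tier of abstraction' and 'extract the loop body' rules above (the function and field names are invented for the example):

    # Before: the loop body drops down to low-level parsing details,
    # mixing two tiers of abstraction in one function.
    def import_orders(lines):
        orders = []
        for line in lines:
            fields = line.rstrip("\n").split(",")
            orders.append({"id": fields[0], "qty": int(fields[1])})
        return orders

    # After: the loop body is its own function, and import_orders
    # now reads entirely at one level.
    def parse_order(line):
        fields = line.rstrip("\n").split(",")
        return {"id": fields[0], "qty": int(fields[1])}

    def import_orders(lines):
        return [parse_order(line) for line in lines]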
I've seen many routines (functions, methods, procedures .. whatever) that are needlessly long due to overzealous use of the copy/paste technique. But once you eliminate that kind of stuff, I do agree that you can get to a point where you've abstracted too much -- and the actual thing you are trying to accomplish with your code becomes obscured.
I have not read these studies, but to me it seems that the defect rate could rise because it's no longer apparent what's interacting with what, and in what order. If I have to hop around to numerous different 5 line methods (often in numerous different source code files) it becomes very hard to intuit whether the code is "correct" or even have any idea what it's doing.
It's hard to say - I learnt FORTH at one stage. The thing with FORTH is that it's almost impossible to write long functions. You can, but it's brutal. So you write a lot of very short methods. I actually hated working like this, but I must admit the code was better.
The reason I didn't like doing this is that it disturbed my flow. I wasn't even sure how the function was going to look, let alone put together 10 functions just to flesh it out. You needed a strong design and intent before you started.
Today, of course, people rely a lot on refactoring for the same effect. I tend to like this better, but I'm not sure whether that preference has any solid basis, or whether I'm just lazy and want to hack instead of pontificating on the problem.
Now I try and have a rule of thumb - I like functions that have a single verb and noun, with a single optional subject, adverb and adjective. (The verbs and nouns can be system concepts, naturally). So ideally anything can be described in sub ~5 words... If I'm really serious I put together the standard nomenclature first.
Once it becomes hard to describe in those terms, or you need n+ nouns, you've probably split the atom and are working at too high a level of granularity.
These findings might really be about module-size, not function-size.
For Java/C#, objects and packages are module units that weren't available in the languages those studies examined, and they would naturally share the role of modularity performed by functions.
My opinion: what belongs in a module, and its size, is determined by the problem and its domain. The crucial criterion is to make things clearer and uncluttered by irrelevant things. A module is a way to combine some things and separate other things. There can be several legitimate ways to divide up a task into modules, though it will make more sense if you choose just one organizing principle and stick to it. This makes your choice of modularity predictable.
You made me realize that there are so many different levels of expertise, that what is right for someone's situation might not be right for mine. It might not even make sense to me (if I don't happen to know their specific situation), even though it is right for them.
Generally, cut-and-paste isn't the issue, copy-and-paste is. Granted, you can cut and paste paste paste ..., but cut-and-paste usually means moving things to a more sensible place.
The biggest problem is when a code blob is duplicated all over the place and (worse still!) the copies slowly drift out of sync, making them hard to reunite. I've been working on a program off and on that tries to detect this in a language-agnostic manner, but I've hit a couple dead ends. (I half expect somebody with a more orthodox CS education to take one look at it and go, "Oh, that's NP-complete," but I haven't let it stop me - it's been a great way to learn about graph algorithms.)
As you have discovered, the problem is complex because of the explosion of comparisons if you try to do it in any general way. It would be useful to query "is there any other code like the section that I selected?" The copied code usually has changes in things like constants and arguments. If you abstracted away constants it might be easier to compare. Just a suggestion.
Right. Allowing preprocessing with a syntax-specific extension or (ideally) scanning on the AST would be more interesting - code can be surprisingly similar except for variable names, types, and other semantic annotations. Rather than calling them "patterns" and prizing them, sometimes abstracting them away would be a better approach.
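A minimal sketch of that normalization idea in Python, using the tokenizer rather than a full AST (the placeholder scheme is just one possible choice, not any particular tool's):

    import io
    import keyword
    import tokenize

    def fingerprint(source):
        """Replace identifiers, numbers and strings with placeholders so that
        snippets differing only in names and constants compare equal."""
        out = []
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.NAME:
                # keep keywords verbatim, collapse everything else to ID
                out.append(tok.string if keyword.iskeyword(tok.string) else "ID")
            elif tok.type == tokenize.NUMBER:
                out.append("NUM")
            elif tok.type == tokenize.STRING:
                out.append("STR")
            elif tok.type == tokenize.OP:
                out.append(tok.string)
        return out

    # two copies that differ only in variable names and constants
    assert fingerprint("total = price * 3 + tax\n") == fingerprint("amount = cost * 7 + fee\n")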
My main focus is on curing the heartbreak of copy-and-paste-programming, but I'm sure it would have more general applications (if I ever get it out of quadratic performance, etc.). Unsurprisingly, what is "similar enough" can be a very slippery concept.
Doing it on the AST should help performance. Even if you're still using an N^4 algorithm, the AST should be significantly smaller than the string that represents the code.
And, as it sounds like you realize, this will catch cases where variable names have been changed, but the structure is the same.
Right, and there would also be more semantic boundaries, (possibly greatly) reducing the size of the overall problem space. If (by sheer coincidence) there was a token overlap between the last two lines of one function and the first two of another, that's probably not relevant in the same way a copied-and-pasted version of the same function in two places would be.
The AST also makes it easier to use heuristics based on semantic information to reduce the number of subtrees you're concerned with (previously substrings).
For example, in a C-like language you could decide to only look at subtrees that are at least complete statements, or go to a coarser granularity and only look at compound statements. You could also eliminate common idioms, such as
for (i = 0; i < N; i++)
which will pop up all over the place and give false positives.
If you're literally just looking for code that appears in two places, then there is an obvious polynomial time algorithm, no? (Check every substring against every other substring...)
Yeah, it's polynomial, but it's still really slow and not practical.
In a string of length N, there are N substrings of length 1, N-1 substrings of length 2, N-2 substrings of length 3, ..., 1 substring of length N. This is the sum of the first N integers, which is N(N+1)/2. If we're doing big-O analysis, we can just say that's O(N^2).
Comparing every something to every other something is an N^2 operation. But, in this case, our something is already O(N^2), so the final algorithm is actually O(N^4).
Running an O(N^4) algorithm on nontrivial sizes of N (and nontrivial sizes of N are where you need it the most) will take a long time.
Disclaimer: it's been a long time since I've done any algorithm analysis, so please check my work.
My algorithms teacher does research in something similar. IIRC he has a program exactly for showing copy/pasted code. You may want to have a look at his papers on Clone Detection Tools, "A Novel Approach to Optimize Clone Refactoring Activity", etc.
> Comparing every something to every other something is an N^2 operation. But, in this case, our something is already O(N^2), so the final algorithm is actually O(N^4).
If you are just checking for equality, comparing each of N items to another group of N items is O(N). Use a hash table.
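A sketch of that hashing approach in Python, at line granularity (the fixed window size is an arbitrary choice for the example):

    from collections import defaultdict

    def duplicate_runs(lines, window=5):
        """Bucket every window-line run by its content. Building the table is a
        single pass, so equal runs are found without comparing every pair."""
        buckets = defaultdict(list)
        for i in range(len(lines) - window + 1):
            buckets[tuple(lines[i:i + window])].append(i)
        return {run: starts for run, starts in buckets.items() if len(starts) > 1}

Hashing the runs only finds exact duplicates, of course; the fuzzier "similar enough" matching discussed above is where it gets hard.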
It is not so much about the length of the method; rather, every method should do just one thing, and it should be clear what that thing is. That makes code more understandable and reusable. Probably as a consequence, methods end up shorter.
I have two objections: First, what is "one thing"? Render a Doom 3 frame? Or increment the "i" variable? The "Clean Code" author argued that a log statement or a try/catch statement deserved its own method.
Second: says who? Where's the evidence that doing "one thing" is better than doing more things?
In general I notice that there are many rules and best practices in programming, without a lot of evidence, or even discussion of the drawbacks.
The first of your two objections is perhaps a confusion about the term "best practice". It doesn't mean an ultimatum so much as something seriously worth considering.
What constitutes "one thing" is a judgment call, and it varies tremendously, as should be evident from all the other replies here trying to define it concretely. I imagine the definition changes a lot depending on how one's catalog of discrete things changes. The point is not to define this "one thing" for you, it is to get you thinking about it.
A lot of the best practices that have been put forth are the empirical outcome of repeated judgment calls by others. They notice a trend in their own practice and put it forth as being more of an emergent principle than a coincidence. They might evangelize some, but that's a sign they are excited about the idea. It doesn't mean they don't know there are drawbacks; they might want to float the idea first and see if their concerns are legitimate. If I floated all the possible drawbacks of the ideas I've had, I'd probably scare people away, even though I'd be overplaying a lot of them.
Oh, and for the second objection, there is data. I don't have it in front of me, but I believe that Steve McConnell cites data within Code Complete about the length and scope of methods relative to error rate. Granted, one citation does not constitute proof, but it's a sign Uncle Bob isn't alone on his soapbox. :-)
But then, the point of all of this evangelizing, really, is to get you off auto-pilot, and thinking about your work, not to give you the definitive answers to everything. That you asked this question shows you're thinking about it. Work hard to keep doing this, it's more important than any rule or best practice.
You're the author, dude! What you choose as the "one thing" of any given method gives me, the reader, information. You can emphasize an activity by breaking it out into a method, or de-emphasize it by leaving it inline.
There's no evidence that Poe was a great short story writer either. You have communicative power with your choices; use it consistently and wisely.
- Shower
  - Get wet
  - Clean
  - Rinse
- Get dressed
- Have breakfast
  - Prepare food
  - Eat
So different tasks have a different level of detail. The advantage is that you end up with more reusable methods, like "Shower" or "Prepare food", which can be used elsewhere.
At one time there was more emphasis on top-down design, but I feel that has been lost, and I do not know whether it is still taught.
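In code, the decomposition above might look roughly like this (the task names and stubs are just the example):

    def get_wet():       print("getting wet")
    def clean():         print("scrubbing")
    def rinse():         print("rinsing")
    def get_dressed():   print("getting dressed")
    def prepare_food():  return "toast"
    def eat(food):       print("eating", food)

    def shower():
        get_wet()
        clean()
        rinse()

    def have_breakfast():
        eat(prepare_food())

    def morning_routine():
        # every step here reads at the same level of detail;
        # the finer-grained steps live one level down
        shower()
        get_dressed()
        have_breakfast()

    morning_routine()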
> The advantage is that you end up with more reusable methods, like "Shower" or "Prepare food", which can be used elsewhere.
But that is exactly the author of this post's point. When you have something that is more reusable, like "Prepare food", you end up using it in multiple situations, increasing the complexity of relationships between components.
So for example, when you want to make sure you eat bacon with breakfast, you can't just add "Prepare bacon" to "Prepare food" b/c now the "Prepare food" task may be executed in "Have lunch", "Have dinner", and "Have midnight snack". And you don't want bacon with your midnight snack! Instead, you have to add a parameter to the task, some conditional, add parameter to each call site, etc... i.e. more complex.
Is "prepare bacon" necessarily part of "prepare food" itself, though? Rather than hardcoding preparing bacon as a stage of food prep, it's probably more maintainable to have the kinds of food passed in (as a list, a Menu object, whatever) - what you're cooking strongly influences the entire process.
washHands()
for _, f in ipairs(foods) do
    coroutine.resume(coroutine.create(f))  -- start each dish's prep as its own coroutine
    ...
end
waitUntilDone()
if (not aLazyGit) then doDishes() end
Really, cooking usually involves interwoven stages of things like setting water boiling, whisking a roux, kneading dough, chopping peppers, etc. Tacking on "...and make bacon" at the end of all cooking batches doesn't make sense. Instead, break prepareFood() down further into operations such as boilWater(vol), preheatOven(temp), etc.; Those should be reusable, barring unusual circumstances (e.g. boiling water at high altitudes). Without a lot of specialization via arguments, prepareFood() is too coarse an abstraction, about as vague as doTaxes().
It probably is okay to duplicate a calculation 4 times, all other things being equal. At the same time, I'd say it's probably a sign of other problems. If you need to reuse a calculation, that calculation is part of your application's domain (using the term loosely) and should be represented there, rather than being embedded inside scenario-appropriate parts of the system.
To answer your first question, the right size for a method is one that a typical maintainer of the code has no problem holding in his head.
The "one thing" is a one-statement description of the method, such that the maintainer does not need to look inside it the second time around to understand the place where it is being invoked.
As to your second question, I have no answer. It is a bunch of voodoo right now, and I have to wonder which body could, even in theory, produce a scientifically valid proof of efficiency (or the lack thereof)?
I find a great way to understand the complexity you are coding into a method/function is to try to write a unit test for it. Short, simple methods are easy to write tests for, while with long, complex ones it becomes exponentially harder to capture all the combinatorial state possibilities (e.g., does it work when foo is null, but bar is not, and baz is out of range? What about when bar is null but foo is not and baz is in range? ... and so on.)
Even if you are not intending to write a unit test for a particular function, it's useful to imagine how hard it would be to do so as a thought experiment.
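To make the thought experiment concrete (the names are invented), compare testing a function that mixes validation with the actual work against testing the two concerns separately:

    # Combined: each test must set up foo, bar and baz together,
    # and the interesting cases multiply combinatorially.
    def process(foo, bar, baz):
        if foo is None or bar is None:
            raise ValueError("missing input")
        if not 0 <= baz <= 100:
            raise ValueError("baz out of range")
        return (foo + bar) * baz

    # Split: the validation and the arithmetic get small, independent tests,
    # and the composed version barely needs any of its own.
    def validate(foo, bar, baz):
        if foo is None or bar is None:
            raise ValueError("missing input")
        if not 0 <= baz <= 100:
            raise ValueError("baz out of range")

    def combine(foo, bar):
        return foo + bar

    def process_split(foo, bar, baz):
        validate(foo, bar, baz)
        return combine(foo, bar) * baz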
As I posted elsewhere, I am seeing trivial functions spread across multiple files because of the overuse of interfaces, DAOs, DTOs, dependency injection, and the like.
You're right that it is good to have every method "do one thing".
The art of programming is dividing up a huge messy process into a series of actions that each are understandable as "one thing".
When you look at code that successfully does this, it seems easy and obvious. Those who are good at it sometimes think of it as easy and obvious in the way that mathematicians call some hard problems "trivial".
But in general, making everything do just one thing is hard and an art-form. But still good to do.
I wonder if the same stats apply to stateful versus stateless languages.
I'm working a lot with PHP these days (not my favorite language) and for historical reasons we even avoid the use of objects. I've noticed that even experienced practitioners in the PHP world write super-long functions. Forget 200 lines, more like 1000 or even 2000. When I try to refactor these, I usually find that I start causing worse problems, passing values around all over the place, using globals and so on.
It seems to me that if your language is relatively underpowered, a long method can help contain the messiness.
Sometimes there's a local minimum, where you really need to rethink the whole chunk you're working with to come up with a better design. If you're in this situation, which you can get into gradually, there might be no easy way to refactor a piece at a time. But this isn't (ever, in my experience at least) because a super-long function is really the best solution, but because the design wasn't great in the first place.
And, as I pointed out above, we don't even use classes. This might seem crazy, but this project was started before PHP had actually decent objects, and now all the special workarounds have become part of the house style. So I'm living in a world of pure procedural programming right now. It's the 70s again.
I dunno about 10 years ago, but for a long time now, you could just use a class to avoid globals and still keep some handy variables around. It's slightly painful (I came from Python, and I still end up typing $objectname.methodname() and then going back and putting in '->'), but not nearly so much as I used to think before I had to work with PHP every day. :)
In my experience, there's a "sweet spot" for how much functionality a method "should" have.
If a method is "too short", it:
* may result in methodA calling methodB calling etc., increasing complexity;
* may not be readily apparent how it fits in to the problem that's being solved.
If a method is "too long", its:
* complexity increases;
* ease of testing decreases;
* frequency of reuse decreases.
Obviously, some methods have little functionality, and thus require very few lines. For the majority of the other methods, though, I tend to keep them below 25 lines. If a method grows past 15 lines, I examine it to see if there's any functionality that feels like it should be broken out into a separate method.
I think focusing on method length is focusing on the wrong thing. Some of the worst long methods I've seen weren't bad because they were long, they were bad because they had about eight or nine levels of nesting. When a method ends with
}
}
}
}
}
}
}
}
you know you've got a problem. Breaking such a method up into smaller methods would make each shorter, but the big win would come from eliminating the deep nesting.
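A small Python sketch of that win (the names are invented); the guard-clause version says the same thing without the staircase:

    def dispatch(order):
        print("shipping", order)

    # deeply nested
    def ship_order_nested(order):
        if order is not None:
            if order["paid"]:
                if order["in_stock"]:
                    if not order["shipped"]:
                        dispatch(order)

    # flattened with guard clauses -- same logic, no staircase
    def ship_order_flat(order):
        if order is None:
            return
        if not order["paid"] or not order["in_stock"] or order["shipped"]:
            return
        dispatch(order)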
I have two perspectives. Back in the 80's, I and two others developed a word processor. We aggressively refactored the code so that the average function length was less than 10 lines. The code was smaller, faster, and supported more users than any of our competitors'. The key was there were only 3 of us, who knew the code thoroughly.
Fast forward 20 years and I am flying solo on a vast JBoss/Spring/Struts/Hibernate project. I just added a feature with 6 lines of functional code, along with another similar piece in a DAO to get the data out of a database. Essentially, I was mapping a key to an inventory item to a key to a document. The feature was about 80 lines because of Java baggage and Spring injections. In addition I had to modify another 10 files of interfaces, implementations, Struts actions, DAOs, a JSP, Struts XML, and Spring XML.
Another cause of needless proliferation of small routines is the Java method overloading model. A common case is a method with several parameters, most of which have common defaults. You often end up with 4 signatures, but can easily end up with more if you support, say, the name of an object as well as the object itself.
I haven't read the studies referenced here, but I'd be concerned about mistaking correlation for causation here. Long methods may or may not be bad, but the temptation to write them usually arises when the programming task is already complex. The error rate may reflect the complexity of the task, rather than the approach (long methods vs short ones).
That's a good thought, but it only explains why routines longer than 200 lines would be more error-prone. It doesn't explain why shorter routines also showed a higher defect rate.
If I am making a change to a long method, I can understand what the method is doing. I am more likely to take care in making changes because I can see the potential problems and I have a visual indication of the complexity of the code. Also, I might be less likely to make changes unless necessary, because of the complexity.
When working on a shorter method, changes are easier to make and the context and complexity are hidden. The side effects of the changes are harder to see, so I am more likely to make a change which can break things and not know that it will cause problems.
Unit testing should provide a measure of protection against this kind of problem (side effects of changes). The actual experience of writing, updating, and running the tests adds its own complexity, though, so I'd view them as a big help, but not an easy-to-apply solution.
I was also struck by seeing actual studies cited, in the usually fashionable, religious and cliquey schoolyard of software development.
I wonder why there aren't any studies more recent than 1991. That's 18 years ago! Before Java, before C#, before Ruby, before JavaScript, before PHP (C++, Perl and Python had already started).
Perhaps because everyone involved has a barrow to push (including academia), no one has an interest in undertaking impartial and objective research - after all, it might give the wrong answers.
One idea I have about why 200 was chosen is that back then the languages that were commonly used weren't as expressive.
You can often say the same thing in Python or Ruby in fewer lines than you can in Cobol, Fortran or C.
Or put another way, of those 200 lines of Cobol maybe 1/2 to 3/4 are setup code that the language makes you write in order to be able to do what you want, versus, say, Python, where a smaller percentage of the code is overhead rather than doing what you set out to do.
I'm not positive but I think that there have been studies that show that a programmer's output in LOC per day is fairly constant no matter what the language is. I could imagine this being similar -- maybe about 150-200 "operations" is the right number, no matter what level of abstraction those operations are at.
That's an interesting point about the setup code though. When you have to do explicit memory management, there may be a greater cost to dividing a long method into several smaller ones.
In general, splitting up a function into smaller ones might just have a greater cost in those languages. It's also possible the development tools are a factor -- it's a lot easier to jump between various methods in Eclipse than it is in vi (for me, anyways).
> I'm not positive but I think that there have been studies that show that a programmer's output in LOC per day is fairly constant no matter what the language is.
That may be true, I have no idea. But the point would be that you can do so much more with some languages than you can with others.
To make an extreme example compare writing a factorial function with Haskell and assembler.
I try hard to stick to these "rules" for determining method/function size.
1) Keep methods/functions shorter than one "screenfull".
2) If I code the same or substantially the same logic twice, it gets extracted to its own method/function. [I also do the opposite: if a method/function is used only once, I remove it and inline the logic.]
3) Logic extraneous to the problem at hand gets extracted into a method/function/language feature such as macros or decorators if it takes more than a couple of lines (logging, safely opening a file, wrapping in a try/catch, etc.); a sketch of the decorator variant follows below.
Less often:
1) Logic that needs its own unit test gets extracted.
2) If a comment precedes a block of code saying what it does, extract the code, name the method/function after the comment, and delete the comment.
# load the exception records from exception file
bla bla bla for 10 lines...

becomes

exceptions = load_exception_records_from_file(fh)
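And for rule 3 above, a minimal Python sketch of the decorator variant (the decorator and function names are invented):

    import functools
    import logging

    def logged(fn):
        """Hoist the logging/error-handling chatter out of the function body."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            logging.info("calling %s", fn.__name__)
            try:
                return fn(*args, **kwargs)
            except Exception:
                logging.exception("error in %s", fn.__name__)
                raise
        return wrapper

    @logged
    def reconcile(records):
        # only the problem-domain logic remains in the body
        return sum(r["amount"] for r in records)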
If the methods have a clear purpose with well-defined inputs and outputs and no side effects, then I think the trade-off is generally in favor of smaller methods. Once you introduce side effects, longer methods then have the advantage of showing you all the side effects in one place.
Great point. I've seen that developers that really understand OO tend towards smaller, more encapsulated classes/methods w/o side-effects whereas developers with a procedural background tend towards longer 'void' methods that lump side-effects into one place.
I think these results may be misleading -- most of the methods I know of that are >200 lines are implementing some kind of large case statement. Often, these are auto-generated files, not hand-coded ones. So it may not be a fair comparison.
Agreed; selection bias - presumably the code base analyzed was made by experienced programmers with instincts tuned to creating smaller functions.
When we do observe larger functions, there must have been a specific reason to do it, like a big case statement or text generation. The programmer is making an exception, and there's probably a structural reason for it, the logic is probably simpler, and extra attention will be paid.
Breaking a large function apart into several smaller pieces makes testing them easier, and (even more importantly) lets one name them. A well-named function is generally better than a block of code in a function with a one-line comment before it, IMHO. (Of course, there's nothing stopping you from giving all of your functions names like doThatStuff(), besides the eternal scorn of your peers.)
It can be annoying in languages that require a lot of type annotations just to declare a function and have poor support for nested functions, though.
I wonder why shorter methods would correlate with more errors. Maybe it's, to some extent, similar to why so many people get confused by pointers: they lose track of where they are in the program.
One aspect this piece doesn't mention is unit tests. I've found that writing short, to-the-point methods helps with testability. I just wrote some unit tests the other day for a 100+ LOC Java method: 300+ lines of test code to set up the massive number of dependencies and go through all of the if- and for- statements. It would've been a lot easier if it had been broken apart into smaller, discrete units.
The correct size for functions is as short as possible without adding unneeded complexity, and no shorter. A piece of code deserves to be a function if you can coherently explain what it does using only terminology from the problem domain, or if having it significantly simplifies other code. Otherwise, leave it inline.
Example: A program to withdraw money from a Bank Account.
balance = getBalance()              // Does One Thing
if (balance > withdrawAmount) {
    subMoney()                      // Does One Thing
} else {
    print("Insufficient Funds");
}
Another thing I would like to point out is that if you are making the code of your function visible, then it should be properly documented and readable. Compilers do a really great job of optimizing code, so there is no need to make the code look smart and have no one understand it. The only place I would put really smart code is when I am coding in assembly, and in that situation there should be really good documentation explaining the reason for your smartness.
Example:
A = 4
B = A * 2
[Smart code: left shift A (A << 1), but it's not always obvious that you are trying to multiply by 2 here, unless you have an explicit comment]
In more dynamic programming languages it is easy to create a method that deals with abstract objects, e.g. iterable ones, and does not have to hold too much knowledge.
To say nothing of how easy it is to test a short method that does only one thing.
I find long methods hard to read, especially if they have a lot of conditional logic. Recently, I encountered some long methods that had very generic names, such as "processData." One of the advantages of smaller functions/methods is that you can give them better, more accurate names since their scope is more limited. If one is unfamiliar with the code, this is a tremendous help. There are few things as daunting as 100 lines of code with 10-20 different variables all in the same scope. In such cases, I have to load it all into my head and break it into smaller pieces to understand it anyways.
Short answer? No.... each method should have a clearly defined concern separated from the concerns of other methods with an obvious, documented and tested contract
Seriously, I've had this in a formal coding standard; you never want to work in those places - so make sure you go straight from university to a {lisp|ruby|python|linux|YC} startup.
Forcing yourself to write short methods forces you to design your solution cleanly. I wasn't a fan of short methods, until I observed a master programmer's (who was also a co-worker) work for a while. After I adopted this style, I've found my own code improving. Ymmv.
I prefer small-function programming, but I think there are needed caveats.
Code reuse and easy testability are obvious upsides of small-function programming but, most importantly, interactive development is easier and more fun. You can test and play with the code while you write.
One downside on the human-readable end of things is that small-function code can easily devolve into spaghetti code if it's not done right. You actually have to think about where your functions are in the physical file, and have them logically organized, in order to have legible code.
Appropriate line length of a function varies wildly by programming language. In Lisp or Haskell, a 25-line function is usually considered a "monster" and should, if possible, be broken up, since a lot is happening in those 25 lines. Whereas a 25-line Java function is nothing unusual.