Code Reuse On Ice – The Reefer Model of Software
I mentioned in a previous post that I was irritated with unnecessary software complexity and bloat. Bloat is endemic in most human endeavors, whether it be software development, company growth, or my kid's closets. Still, it's something to fight against. For software, there are two major categories of bloat: 1) feature bloat and 2) code reuse bloat. You might be surprised to see code reuse on the suspect list. After all, code reuse was supposed to save time, labor, and space by eliminating redundancy, and to improve quality along the way.
Landon's Reefer Model of Software Bloat
Think of your refrigerator. Now imagine you're feasting at the Cheesecake Factory, where the single portions are enough to feed a whole family. You bring the voluminous leftovers home and they go into the fridge. The next day, you're invited to a potluck at a friend's house. You can't take your Cheesecake Factory leftovers, because that would be tacky, so you whip up some of your world famous Tuna Casserole with all the fixins'. Turns out it's famous for reasons other than what you thought, so you bring it home with one scoop taken out of it and it goes into your reefer – you comfort yourself thinking "that's ok, all the more for me." Your current Reefer graph looks like this:
For the rest of the week, you repeat the drama at every meal, and by the end of the week, your fridge is bursting at the seams with leftovers and it's starting to smell rank. Your tummy vs fridge consumption graph now looks like this:
Software developers do the same thing. We see a cool piece of functionality in a library somewhere. To incorporate it, we add a JAR or DLL (gasp) file to the package, write some code that uses it, and then go on to the next feature (our next meal). As every developer knows, there are dependencies to satisfy, so most of the time you're not just adding one JAR for the bits you want – it comes with some camp followers. In each case we use some of the functionality in the library we just added, but not all of it. In some cases we may use very little of it, but the little we used we needed badly. Perhaps the little we used saved us several weeks of effort, so that's nice. Then we sit back, pop a brewski, and soothe ourselves with words like "ahhh, that's what code reuse is all about…saving labor…let someone else maintain it…let them fix bugs….that's the ticket."
The thing about the ol' reefer is that eventually the food rots, you can't stand the stink, and you clean it out (at least you do unless you're a frat boy… in that case, the maid-fairy comes to clean it out for you). Software eventually rots too, but it takes a lot longer and the rot is harder to smell, so the bloat gets a lot worse than a week of leftovers…and no maid-fairy is around to clean up our software leftovers.
There's a cross-over point somewhere at which the labor it takes to haul all this rotten software around overtakes the labor savings you gained by adding it in the first place. All of a sudden, code reuse takes on a Machiavellian aura in which the "careless reading" concludes the end (saving labor, gaining features) justifies the means (code reuse, bloat). You have to live with the rot or remove it, just like you would with your reefer. The only question left is how long you can put off cleaning it out before you can't stand it anymore.
So, while we love and respect our illustrious CTO, Rod Cope, I have to take exception to his remark, "Yes, both of these solutions can lead to dramatically increased disk space usage, but hey, when's the last time your new computer came with a 30 megabyte hard drive? Laptops now ship with fast 160GB+ drives; it's time to move on." I don't care how much disk space you have, it won't solve this problem. In fact, the more disk and memory you have, the worse the bloat problem gets. As software engineers, we still need to create minimal, functional software and fight the bloat at every turn, regardless of how capable our machines get. Some things, like our ability to deal with software complexity, don't scale with Moore's Law.
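To make the cross-over point concrete, here is a rough back-of-the-envelope sketch, not a measured model. The function name, the idea of a fixed monthly "hauling" cost, and the rot_rate growth factor are all assumptions made up for illustration; real maintenance costs are much lumpier than this.

```python
from typing import Optional


def crossover_month(hours_saved: float,
                    monthly_cost: float,
                    rot_rate: float = 0.0,
                    horizon_months: int = 120) -> Optional[int]:
    """Return the first month in which the cumulative cost of hauling a
    dependency around exceeds the hours it saved up front.

    hours_saved   -- one-time labor saved by reusing the library (assumed)
    monthly_cost  -- hours per month spent on upgrades, conflicts, etc. (assumed)
    rot_rate      -- fractional growth of that cost each month (assumed)
    Returns None if the cost never overtakes the savings within the horizon.
    """
    cumulative = 0.0
    cost = monthly_cost
    for month in range(1, horizon_months + 1):
        cumulative += cost
        if cumulative > hours_saved:
            return month
        cost *= 1.0 + rot_rate  # the leftovers get a little riper every month
    return None


# Three weeks saved (120 hours) against 4 hours a month of upkeep.
print(crossover_month(120, 4))        # → 31
# Same library, but the upkeep grows 5% a month as the code rots.
print(crossover_month(120, 4, 0.05))
```

With a flat upkeep cost the savings last a couple of years. Once the cost is allowed to grow, the cross-over arrives noticeably sooner, which is the reefer problem in a nutshell.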



Minimizing bloat is certainly a good way to reduce dependency hell, but I’m sticking to my orthogonal concept that the way to package software is to include all the dependencies (however many they are) along with it. If software developers can avoid throwing in the kitchen sink when they really just need a drain stopper, so much the better.
I think Landon is saying that you should evaluate the need for a dependency (not just evaluate including the jars in a distribution).
It’s an interesting question. What is the threshold for including a new dependency? Does it save you 10 lines of code? Is that enough to include a new dependency in your project?
How do you evaluate the weight (bloat) of a dependency? The size of the library? The impact of transitive dependencies? On the flip side, if the dependency has a community, you’ll benefit from continued improvement and additional features.
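One crude way to put numbers on those questions is to add up the size of a library plus everything it drags in, and compare that with the code it saves you. The sketch below is only an illustration: the Dependency class, the library names, and every size and line count are hypothetical, and it ignores the community benefit entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Dependency:
    name: str
    size_kb: float
    lines_saved: int = 0  # lines of our own code this library replaces
    requires: List["Dependency"] = field(default_factory=list)


def total_size_kb(dep: Dependency, seen: Optional[Set[str]] = None) -> float:
    """Size of a dependency plus its transitive dependencies, counting
    each library only once even if several others require it."""
    seen = set() if seen is None else seen
    if dep.name in seen:
        return 0.0
    seen.add(dep.name)
    return dep.size_kb + sum(total_size_kb(r, seen) for r in dep.requires)


def lines_saved_per_kb(dep: Dependency) -> float:
    """Lines of code saved for every KB added to the distribution."""
    return dep.lines_saved / total_size_kb(dep)


# Hypothetical example: a 40 KB library that saves 10 lines of code
# but brings two camp followers along with it.
logging_lib = Dependency("logging-lib", 60)
util_lib = Dependency("util-lib", 300, requires=[logging_lib])
wanted = Dependency("wanted-lib", 40, lines_saved=10,
                    requires=[util_lib, logging_lib])

print(total_size_kb(wanted))       # 40 + 300 + 60 KB
print(lines_saved_per_kb(wanted))
```

A ratio like this is no substitute for judgment, but it does make the camp followers visible before the JAR goes into the fridge.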