A Fine Line Between Abstraction and Obfuscation

Posted on September 24, 2007 9:09 pm

Introduction to Part 1

By the time you have written your abstraction layer, you have essentially written your own framework. Chances are, you are not a good framework writer, and it’s going to suck, and you are going to realize that one or two versions down the road and re-write it. © RedMonk in Java’s Fear of Commitment.

While there is a lot written on how difficult it is to write a good abstraction layer, there is very little advice on how to avoid the worst evil of the abstraction layer: obfuscation. As I scoured the internet looking for any discussion on this topic, my search results were a lot more sparse than I was expecting. There is a very succinct blog entry by Rhett Maxwell that turned up in my results that summarizes some of what I’d like to say in a single sentence: Most of the books out there that teach OO design talk about Abstraction, but they do not warn about Obfuscation at all. Its a shame.

And to those of you wondering why this type of obfuscation is a problem, let me clear it up. Obfuscation at its most harmless simply confuses people. Obfuscation at its worst makes people stupid. When the most brilliant programmer can no longer figure out how to get from point A to B, even though they are right next to each other, all their genius is useless.

JavaScript is incredibly susceptible to becoming accidentally obfuscated. In the next month or so, I want to go over various common methods of abstraction that I have seen widely used in JavaScript and discuss how and why they can lead to obfuscation.

I will be dealing with these issues in several sections:

  • Decorating Objects
  • More on Naming
  • Dealing with Multiple Function Signatures
  • Comments Don’t Fix Code
  • Wrapping Parameters
  • Calling Function References
  • The Ideas of JavaScript

Decorating Objects

The Oxford Dictionary says that abstraction is “the quality of dealing with ideas rather than events.” For programming, we might say that abstraction is “the quality of dealing with ideas rather than their implementation”. Write this down: it is precisely what we need to remember when we’re manipulating objects in JavaScript.

One of the subjects that causes a lot of confusion about JavaScript among novice programmers is how to use it for object oriented programming. Though I personally find the mutability of JavaScript, along with its prototypal inheritance, to be relatively simple, flexible, powerful, and beautiful, many find it baffling.

The “solution” that many have come up with is to abstract these concepts using a decorator function. There’s nothing wrong with doing this, and it can significantly clean up your code; the problem is the direction of the abstraction. Many of the tools available to abstract OO use naming conventions and concepts more akin to rigid languages, such as Java, or more powerful languages, such as Python.

Using our definition of abstraction, these tools are using the ideas that represent Java’s implementations to try to represent the implementations of JavaScript. Before long, you end up adding the idea of multiple “superclasses”, an idea that simply doesn’t have an implementation in JavaScript. You’re then left with a programmer wondering why instanceof is evaluating to false even though their superclass is in the list.
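
To make the instanceof surprise concrete, here is a minimal sketch of the kind of helper I have in mind. The declare function and the Animal, Swimmer, and Duck names are hypothetical, invented for illustration, and not taken from any particular library. It assumes the helper puts the first “superclass” on the prototype chain and merely copies members from the rest, which is one common way to fake multiple inheritance.

```javascript
// Hypothetical Java-style "class" helper (not from any real library).
// Only one object can sit on the prototype chain, so only the first
// listed "superclass" is truly inherited; the others are copied in.
function declare(superclasses, members) {
  function Ctor() {}
  Ctor.prototype = Object.create(superclasses[0].prototype);
  Ctor.prototype.constructor = Ctor;

  // Remaining "superclasses" contribute members by plain property copy.
  for (var i = 1; i < superclasses.length; i++) {
    var source = superclasses[i].prototype;
    for (var name in source) {
      Ctor.prototype[name] = source[name];
    }
  }

  // Finally, add the members declared for the new type itself.
  for (var key in members) {
    Ctor.prototype[key] = members[key];
  }
  return Ctor;
}

function Animal() {}
Animal.prototype.eat = function () { return "eating"; };

function Swimmer() {}
Swimmer.prototype.swim = function () { return "swimming"; };

var Duck = declare([Animal, Swimmer], {
  quack: function () { return "quack"; }
});

var duck = new Duck();
var isAnimal = duck instanceof Animal;   // true: Animal.prototype is on the chain
var isSwimmer = duck instanceof Swimmer; // false: swim was copied, not inherited
```

The duck can swim, yet instanceof Swimmer is false, because instanceof only walks the single prototype chain. The word “superclass” promised something the language never implemented.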

It also means that we lose out on some of the ideas of JavaScript that are totally unique to the language. The call and apply methods immediately spring to mind as incredibly useful tools for doing OO in JavaScript.
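
As a rough illustration of what call and apply offer, the sketch below chains constructors, borrows a method for an unrelated object, and forwards a variable argument list. Person, Employee, robot, and largest are hypothetical names chosen for the example.

```javascript
function Person(name) {
  this.name = name;
}
Person.prototype.describe = function () {
  return "I am " + this.name;
};

function Employee(name, title) {
  // call runs Person's initializer against the object being constructed.
  Person.call(this, name);
  this.title = title;
}
Employee.prototype = Object.create(Person.prototype);
Employee.prototype.constructor = Employee;

var employee = new Employee("Ada", "engineer");
var employeeDescription = employee.describe();

// Method borrowing: any object with a name property can use describe,
// whether or not it has anything to do with Person.
var robot = { name: "R2" };
var robotDescription = Person.prototype.describe.call(robot);

// apply takes its arguments as an array-like object, which makes it
// easy to forward whatever arguments a function happened to receive.
function largest() {
  return Math.max.apply(null, arguments);
}
var biggest = largest(3, 9, 4);
```

None of this needs a class vocabulary: a function is simply run with whichever object you choose as this.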

The solution isn’t getting rid of the abstraction, it’s conveying the idea properly. We need to do this by unlearning a lot of the terminology that we’re used to. If the terminology of other languages was created to describe a specific idea of another language’s implementation, why do we think we can just apply it to the idea we’re trying to convey, and do it without any extra baggage?

More on Naming

A lot of these utility functions end up becoming grossly weird in the name of saving bits. Only with the worst possible logic could $ be an acceptably meaningful function name. In most JavaScript libraries that use the $ function, it is merely an abstraction for the document.getElementById function (which is admittedly clunky). They’re taking a descriptive name and replacing it with a non-descriptive one. Abstracting in this manner is obfuscation, by anyone’s definition.

If you are really worried about reducing the size of your files, do it any other way. Use acronyms or use a code shrinking tool. Naming things properly is one of the most important things you can do in the long term. And be wary of those who claim that a ridiculous name is okay to use so long as you force it into the programmer’s lexicon; it is a sure sign of a lack of priorities.
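
Here is a small sketch of a shortcut that stays short without giving up its meaning. The byId name and its optional doc parameter are my own assumptions for illustration; the stand-in document object exists only so the snippet can run outside a browser.

```javascript
// Hypothetical descriptive shortcut for document.getElementById.
// `doc` is optional: in a browser it falls back to the global document.
function byId(id, doc) {
  return (doc || document).getElementById(id);
}

// Stand-in for a real document, used only to exercise the function here.
var fakeDocument = {
  elements: { header: { id: "header" } },
  getElementById: function (id) {
    return this.elements[id] || null;
  }
};

var header = byId("header", fakeDocument);
var missing = byId("footer", fakeDocument);
```

A call like byId("header") still tells the reader what is being looked up and how, while costing only a few characters more than $("header").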

Comments

  • ttrenka

    Another reason given for the use of $ as a variable or function name is to “save typing”; the idea is that instead of having to type “document.getElementById” over and over again, you can just type the single character, which (in theory) saves a lot of headache and time, particularly when you’re prone to mistyping.

    I find this to be a terrible excuse. It’s one thing when you get programmers that believe that every method hasToBeReallyDescriptivelyNamedJustSoYouCanHaveAClearExplanationOfWhatItDoes, it’s entirely something different when you encourage lazy practices. At the least, you can use a macro or a regular expression search and replace to swap that method name back when deploying; it’s not so hard to do this:

    :%s/\$(/document.getElementById(/g

    and then use a tool to shrink your code.

    It doesn’t matter that libraries are making this kind of a standard; everyone’s implementation is or can be slightly different, and those differences kill any kind of utility the “standard” might have.

  • Well, there is obfuscation, and there is verbosity. Imagine something like this:

    doSmth($("a"), $("b"), $("c"), $("d"), $("e"), $("f"), $("g"));

    Now replace $ with document.getElementById(), and ask somebody to tell (without much deciphering) what it does. It is hard. That’s why math tends to be on the concise, yet cryptic, side: shortcuts make it easier to understand (I still remember the Reduce language and its verbosity :-( ).

    Verbosity can kill clarity. Math people understood this a long time ago and employ a lot of clutter-reducing tricks: unary and infix operators, vertical/spatial positioning, single-character symbols (delving into Greek, and even Hebrew, if they have to). I think it is an oversimplification to say that short (== non-descriptive) symbols are necessarily bad by definition; that is not true in some circumstances. I think it should be judged on a per-case basis.

    Of course, if you use some notation, the downside is obvious: people have to study it first. It helps if the notation is compact and regular. Now we have to decide what we want to minimize: the preliminary study of the notation, the effort of deciphering your intent in the code (using said notation), or both. If you have a project that is going to be supported for a long time by a small, stable group of developers, I say invent the notation first to minimize the support cost. If you have a great fluidity of developers, minimize the notation and increase the verbosity of the code: you cannot afford your developers spending time on “unproductive” notation studies, so you want to minimize both. The latter is pretty common for many open source projects.

    Coming back to $ — I don’t think there is something fundamentally wrong with it, but there are some problems:

    1) $ does not use namespaces — it keeps it short, but makes it vulnerable to redefinition. You have to make sure nobody else defines it. You can do it by committing to just one library and enforcing “no $ redefinition” rule.

    2) $ looks a lot like S, which can lead to readability problems.

    My personal take: libraries should not define $ or any other no-namespace global symbols. But if you want to, define it in your own code, e.g., for Dojo: $ = dojo.byId;

    BTW, the syntax-level compression cannot help you with long public names, which is a common case for libraries. A smart linker can optimize the web *application* by replacing long public names with namespaces with short private ones, but it is hard to do it in general for JavaScript code.

  • Eugene, I completely agree that verbosity is the evil cousin of a shortcut. What I am saying is not that shortcuts are bad, it’s that shortcuts should have some meaning. I look at dojo.byId as a good example of a shortcut that significantly cuts down on size and still retains its meaning.

  • Agreed. I just wanted to point out that there are two sides to everything. Good descriptive names can be too verbose, reducing clarity as much as cryptic shortcuts do. And I stressed clarity over reducing the size, though the two can be interdependent in some cases.

  • ttrenka

    Eugene–
    I also completely concur about verbosity; it’s one of the main reasons I tried to be clear about it in the Dojo Style Guide (i.e. getHandler as opposed to getEventHandler, etc.). I’d rather do symbols than something overly verbose any day of the week.

    And I also completely agree with you about library use vs. end-point (i.e. library-consuming) code; many times have I defined $ as an alias for doc.getElementById–but I only do it if I know that the code I’m writing isn’t going to be used by others. This, I think, is where that philosophy of very short variables fails.

    I also think that short names for variables in a local scope are perfectly fine in any code; for instance, I have no issue using “s” for a string, “b” for a bool, “n” for a number, and “a” for an array defined in a small scope and discarded almost immediately. These are only temporary variables, and while it can be unclear what they mean, if they are used consistently I think that’s just fine.

  • Well Eugene, you could write this function :

    doSmth($("a"), $("b"), $("c"), $("d"), $("e"), $("f"), $("g"));

    like this :

    doSmth(
        document.getElementById("a"),
        document.getElementById("b"),
        document.getElementById("c")
    );

    Which is not very hard to understand.

    But I think the main idea is larger than this discussion on redefining document.getElementById() or methods as such. Neil has a good point in this topic, and there is still a lot of water to run under the bridge before we get to see the things all of us only dream about for JavaScript. My hope is that all these discussions will lay the groundwork for a future, a really near future that is, of a reborn, improved JavaScript language from the ground up.

  • Filip, I am totally opposed to the idea that JavaScript is broken. I think developers are broken, and they view JavaScript as broken because they refuse to unlearn all the stuff they’ve been taught that has no place in the language and refuse to embrace the power gained through JavaScript’s flexibility. The fact that you can alias document.getElementById however you want, with a minimal amount of code, only reaffirms this point.