I hate using badly designed APIs. I hate it even more when someone beats me over the head with words they were handed in some rhetoric class masquerading as a computer science course. Words like “abstract”, “pattern”, and “object oriented” are used like a shield to protect the implementer from critical words like “crap”, “complicated”, “obtuse”, and “annoying”. It’s even worse when the implementer realizes that if he implements the most complicated piece of shit possible then he can go rogue consultant and make tons of mad cash helping poor unsuspecting companies implement his steaming pile of bullshit. Harsh words? You bet. But I’m fed up with people imposing their faulty definitions and ideas on me without any way for me to easily fight back with a reasonable explanation as to why their crap is steaming. I’ve decided to start fighting back by coming up with a set of essays about programming that highlight common design misconceptions. This essay is about my top pet peeve: an abstract interface and an indirect interface are entirely different things.
The point I’ll be trying to make throughout the essay is simple: Abstraction and indirection are very different yet cooperating concepts with two completely different purposes in software development. Abstraction is used to reduce complexity. Indirection is used to reduce coupling or dependence. The problem is that programmers frequently mix these up, using one for the other purpose and just generally screwing things up. By not knowing the difference between the two, and not knowing when to use one vs. the other or both, we end up with insanely convoluted systems with no real advantages.
NOTE: When I use the term “interface” here I mean what it meant originally before Java co-opted it: A means of interacting with something (not necessarily an object). Whether that’s a Graphical User Interface or an Application Programmer Interface. I also use the term “user” to refer to the programmer who has to work with an interface. Really it should be “consumer” as in the person buying your crap, but consumer always gets confused with queue operations. To keep things simple I’ll just say “user”.
I should point out some definitions to start things off. Here’s what google has to say about indirection, indirect, abstraction, and abstract. These definitions will demonstrate my primary argument (although indirectly).
First, notice the last definition in the list for indirection: “deceitful action that is not straightforward”. It’s interesting that indirection has a definition which is negative and associated with sneaky behavior. I definitely feel deceived when I use an API that is not straightforward. “Having to create some 10 different classes and sometimes just as much XML crap simply so I can create some basic EJB persistent beans really fits this definition of “indirection. The results for “indirect” are better and have definitions like:
- “having intervening factors or persons or influences”
- “not as a direct effect or consequence”
- “descended from a common ancestor but through different lines”
What’s funny is this definition is actually more descriptive of how the abstract keyword is used in Java.
Abstraction and abstract definitions tend to look like this:
- “a concept or idea not associated with any specific instance”
- “the process of formulating general concepts by abstracting common properties of instance”
- “the act of abstracting, the separating from the concrete. To derive. To epitomize; to abridge, summarize, concentrate. A brief statement of the chief points of a larger work”
- a condensed record or representation.”
All of these definitions seem to discuss simplifying complexity and generalizing the concrete. They mention words I commonly associate with the reduction of some complex topic or idea into something more manageable. They also continually refer to generalizing something and combining repeated commonality which is also a means of reducing complexity.
There’s one “definition” of abstraction that lays out the entire basis for my argument very clearly: “(v) The process of separating the interface to some functionality from the underlying implementation in such a way that the implementation can be changed without changing the way that piece of code is used. (n) The API (interface) for some piece of functionality that has been separated in this way.” Notice that this definition is entirely different from the others. It doesn’t mention words like “abridged, summarize, concentrate, concrete, general, or common”? Also notice that it is the only definition that is like this and it’s defined byyou guessed itthe programmers working on the Darwin project! Proof positive that programmers get this wrong right there from google.
Even more proof comes from the fact that Java uses the keyword “abstract” to create objects which actually support indirection. Think about it, the “abstract” keyword doesn’t reduce, summarize, or generalize a more concrete implementation, but rather creates an indirect path to the real implementation of that function. If this “abstract” class were to follow the previous definitions it would simply reduce the number of functions needed to use the actual concrete implementation. But, when you implement an “abstract” Java class you must implement the same number of functions including the abstract functions just to get it work. Ladies and gentlemen, this is indirection at its finest and abstraction left the building years ago.
This wrong-headed definition of “abstraction” is actually describing indirection. Go ahead and read the other definitions carefully. None of the other definitions mention being able to “change the implementation”. The others continually mention reducing complexity, hiding details, and the idea of deriving the general from the concrete or specific. None of them mention being able to remove the concrete implementation. This definition is actually describing the result of creating an indirect route to some implementation so that you can “re-route” the target implementation. Indirection doesn’t concern itself with the complexity or generality of this path, only that the path is not straightforward. Actually most of the definitions mention some form of additional complexity or annoyance from having to deal with indirection which implies that you pay for the flexibility with increased complexity.
Someday someone will pull a Dave Winer Asshole Plug Maneuver and flood google with convenient definitions for “indirection”. Until then, I’m going with the established definitions. Abstraction focuses on reducing complexity, generalizing the concrete, and abridge or summarize. Indirection is a necessary evil used to create pluggable systems. I would also add that an indirect interface should be “glued” or simplified with an abstract interface to wrap it.
Indirection Killed the EJB Star
Let’s pick on the EJB specification for a second here. Take a look at this classic example at the official Java website. This example describes a very simple data model with a Customer, Subscription, and Address in an even simpler set of relationships. To implement this incredibly simple data model you must implement a Bean, Local, and Local Home interface for each one, which makes 9 full and fairly complicated classes and Java interfaces. In addition to these 9 different files you must also keep track of JNDI configurations, XML files of various formats, and SQL database schemas. It’s a whopping huge mess for nothing more than three connected classes.
Indirection causes this complexity. More specifically, the EJB specification doesn’t hide the complexity required to configure and access the necessary components. Instead of creating a single entry point to access the key components you must traverse an morass of complicated functions. Instead of hiding how objects classes are related and found you must know things like a home and bean and interface how they are different. You’re forced to deal with this complexity even though Java has reflection and an EJB container could easily figure most of this out from some configuration files. Don’t believe me? Take a look at Hibernate or Spring which basically does just that.
Look at the code for CustomerBean.java and specifically the addSubscription (String subscriptionKey) method as a prime example. Just to add a subscription to a customer you must do the following:
- Grab an initial context.
- Use the initial context to look-up the LocationSubscriptionHome object.
- Use the LocationSubscriptionHome object to find the subscription.
- Then call CustomerBean.addSubscription (LocalSubscription subscription) to add the subscription. Oh, but we’re not done yet.
- Then use the CustomerBean.getSubscriptions () to get the Collection of subscriptions. Where does this method come from? It’s defined as abstract just like the class is. How does this method get created? If you read through the document you’ll find that this is actually a Container Managed Relationship (CMR), which meanstada! it’s probably defined in an XML file someplace.
- Finally, after all of that indirection we are able to call the Collection.add () method on the underlying collection which is also container managed and mostly defined externally which makes it one more level of indirection.
To summarize, just to get to where we could add an object to the collection, we had to follow this chain of calls: new InitialContext () > InitialContext.lookup> CustomerBean.addSubscription () -> CustomerBean.getSubscriptions () -> Collection.add (). An unbelievable 6 levels of indirection just to add a fucking object to a fucking collection. This is the kind of bullshit that chaps my ass purple and makes me want to eat babies.
I classify this type of interface as “indirection infested” and stare in amazement when I’m told this is an abstract interface because “you can change out the implementation of Subscription without changing your code.”
This is where we have the disconnect my friends, because in all of my research, the only discipline that thinks abstract==replaceable is computer science. All other disciplines think abstract requires a generalization, reduction, or simplification of the complex. What we have above (besides more of a mess than a bloody pillow case full of smashed puppies) is an indirect interface, which allows us to change out the implementation.
In order for this set of object interactions to be abstractaccording to the definitions shown previouslythey would need to reduce the information necessary to follow all of these indirect links, effectively hiding them from my perceptions. Hibernate has the right idea with finding new objects by using a query language. A query language lets you easily find any object you’re looking for, including it’s collection based fields. It hides all the nasty details from your sight putting the complexity in external configuration files. EJB desperately needed a facility like this, but the designers just didn’t get that indirection is evil and should be hidden from the developer.
I’ve found that developing a short list of desirable qualities helps discriminate between two aesthetics. Abstraction is difficult to pin down exactly, but it is possible to determine if a given interface has the qualities of an abstract interface. By creating a simple set of “rules” to evaluate an interface you can start to improve any existing or future code. These rules are always open to suggestion:
* An abstract interface should not require any configuration to use. Configuration should be done behind the scenes using an external resource like a property file, and should be done internally before anyone uses the interface.
* A user of an abstract interface should not have to go through more than one function call in a chain to “find” any component they need. Remember that indirection is evil and only ends up adding to the cognitive complexity necessary to use something. You can find examples of this in normal GUIs when you are forced to go through a huge list of complicated steps across multiple interface to complete one task. A wizard is an example of an abstraction over these complicated steps, and it basically provides one access point to your task. The same idea applies to abstract interfaces: If the user has to call more than one function just to access a component or complete a task then you haven’t helped them at all.
* If an abstract interface wraps another interface then it should reduce the number of entry points when compared to the interface being wrapped. This is probably the closest to a “measurement” of abstraction as you can get. I sometimes prove the point about abstraction by taking a wrapped interface (the target) and a wrapping supposedly abstract interface (the wrapper). I’ll count the number of functions in the target and the number in the wrapper. If both interfaces have the same number of functions (or worse, the same names) then you haven’t gain anything more than complexity and reduced performance. In this case the interface is not abstract, it is indirect and it should be configurable which implementation gets used.
* An abstract interface should not have the goal of being replaceable. If the interface needs to be replaceable then it it should wrap a more complex and configurable interface that is replaceable. This is necessary to avoid confounding the complexity of indirection with the simplicity of an abstraction. Another way to think about this is that abstraction’s simplicity goal competes with how indirection complicates things. It is too difficult balance these competing influences in the same interface so don’t.
* If an abstract interface stands alone then it should have a minimum number of entry points necessary to provide its services to others. I shouldn’t have to point this out, but keeping the number of entry points to a minimum makes things simpler. The problem with this is that what is “simpler” and “minimum” depends on the situation and the needs. Usually you’ll start off with a complex
interface that is flexible, and then later develop a simpler abstraction around it that encapsulates frequent usage patterns.
In comparison, an indirect interface generally is just the “normal” way things are done in object oriented programming. The mechanisms already in place for writing OOP style code (and actually any code) are already designed for doing indirect work. The only additional requirement I would place on an indirect interface is that if it is designed to be plug-and-play then it should be easy to switch which component is used. Maybe not on the fly, but a user shouldn’t need to change code to use a different implementation.
I think it would be good to find examples of good and bad use of the word “abstraction” in literature. I actually find many incorrect uses for it, but never thought about writing them all down. My next essay on this topic will be a comprehensive survey of where I’ve found this term used incorrectly. It may be anal, but it would make the point well that computer scientists and programmers just don’t get it.
Feel free to contact me with comments on this essay. It is still being worked on and is open for review. I’m pretty sure it makes no sense right now, but the major points should be there. I’m especially interested if you have any examples which you find relevant to the discussion, either for or against what I’m saying.
@zedshaw has a great essay about the difference between abstraction and indirection and how software developer often get them mixed up.
This is where we have the disconnect my friends, because in all of my research, the only discipline that thinks abstract==replaceable is computer science. All other disciplines think abstract requires a generalization, reduction, or simplification of the complex.