6.2 Good Class Interfaces
- Create a good abstraction for each class;
- If you think of the class’s public routines as an air lock that keeps water from getting into a submarine, inconsistent public routines are leaky panels in the class. The leaky panels might not let water in as quickly as an open air lock, but if you give them enough time, they’ll still sink the boat. In practice, this is what happens when you mix levels of abstraction. As the program is modified, the mixed levels of abstraction make the program harder and harder to understand, and it gradually degrades until it becomes unmaintainable;
- Each time you add a routine to a class interface, ask, “Is this routine consistent with the abstraction provided by the existing interface?” If not, find a different way to make the modification, and preserve the integrity of the abstraction;
- The ideas of abstraction and cohesion are closely related—a class interface that presents a good abstraction usually has strong cohesion. Classes with strong cohesion tend to present good abstractions, although that relationship is not as strong. Focusing on the abstraction presented by the class interface tends to provide more insight into class design than focusing on class cohesion. If you see that a class has weak cohesion and aren’t sure how to correct it, ask yourself whether the class presents a consistent abstraction instead;
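A minimal Java sketch of an interface that stays at one level of abstraction. EmployeeCensus, Employee, and their routines are hypothetical names chosen for illustration, and the list used for storage is an assumption of this sketch:

```java
// Hypothetical classes for illustration only.
class Employee {
    private final String name;

    Employee(String name) { this.name = name; }

    String name() { return name; }
}

// Every public routine speaks at one level: "a census of employees".
// The list used for storage is an implementation detail that never leaks out.
class EmployeeCensus {
    private final java.util.List<Employee> employees = new java.util.ArrayList<>();

    void addEmployee(Employee employee) { employees.add(employee); }

    boolean removeEmployee(Employee employee) { return employees.remove(employee); }

    int employeeCount() { return employees.size(); }

    boolean contains(String name) {
        for (Employee employee : employees) {
            if (employee.name().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // Deliberately absent: routines such as firstItem() or nextItem(), which
    // would mix a container-level abstraction into the employee-level one.
}
```

Asking whether a proposed routine such as nextItem() fits "a census of employees" is the consistency check described above.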
Good Encapsulation
- Minimizing accessibility is one of several rules that are designed to encourage encapsulation. If you’re wondering whether a specific routine should be public, private, or protected, one school of thought is that you should favor the strictest level of privacy that’s workable. I think that’s a fine guideline, but I think the more important guideline is, “What best preserves the integrity of the interface abstraction?” If exposing the routine is consistent with the abstraction, it’s probably fine to expose it. If you’re not sure, hiding more is generally better than hiding less;
- Don’t expose member data in the public interface;
- Don’t make assumptions about the class’s users. A class should be designed and implemented to adhere to the contract implied by the class interface. It shouldn’t make any assumptions about how that interface will or won’t be used, other than what’s documented in the interface. Comments like this are an indication that a class is more aware of its users than it should be:
- initialize x, y, and z to 1.0 because DerivedClass blows up if they’re initialized to 0.0.
- Don’t put a routine into the public interface just because it uses only public routines. The fact that a routine uses only public routines is not a very significant consideration. Instead, ask whether exposing the routine would be consistent with the abstraction presented by the interface;
- Favor read-time convenience over write-time convenience. Code is read far more times than it’s written, even during initial development. Favoring a technique that speeds write-time convenience at the expense of read-time convenience is a false economy. This is especially applicable to creation of class interfaces. Even if a routine doesn’t quite fit the interface’s abstraction, sometimes it’s tempting to add a routine to an interface that would be convenient for the particular client of a class that you’re working on at the time. But adding that routine is the first step down a slippery slope, and it’s better not to take even the first step;
- Be very, very wary of semantic violations of encapsulation. Semantic encapsulation is much harder to achieve than syntactic encapsulation. Syntactically, it’s relatively easy to avoid poking your nose into the internal workings of another class just by declaring the class’s internal routines and data private. Achieving semantic encapsulation is another matter entirely. Here are some examples of the ways that a user of a class can break encapsulation semantically:
- Not calling Class A’s Initialize() routine because you know that Class A’s PerformFirstOperation() routine calls it automatically;
- Not calling the database.Connect() routine before you call employee.Retrieve( database ) because you know that the employee.Retrieve() function will connect to the database if there isn’t already a connection;
- Not calling Class A’s Terminate() routine because you know that Class A’s PerformFinalOperation() routine has already called it;
- Using a pointer or reference to ObjectB created by ObjectA even after ObjectA has gone out of scope, because you know that ObjectA keeps ObjectB in static storage, and ObjectB will still be valid;
- Using ClassB’s MAXIMUM_ELEMENTS constant instead of using ClassA.MAXIMUM_ELEMENTS, because you know that they’re both equal to the same value.
- The problem with each of these examples is that they make the client code dependent not on the class’s public interface, but on its private implementation. Anytime you find yourself looking at a class’s implementation to figure out how to use the class, you’re not programming to the interface; you’re programming through the interface to the implementation. If you’re programming through the interface, encapsulation is broken, and once encapsulation starts to break down, abstraction won’t be far behind;
- If you can’t figure out how to use a class based solely on its interface documentation, the right response is not to pull up the source code and look at the implementation. That’s good initiative but bad judgment. The right response is to contact the author of the class and say, “I can’t figure out how to use this class”. The right response on the class-author’s part is not to answer your question face to face. The right response for the class author is to check out the class-interface file, modify the class-interface documentation, check the file back in, and then say, “See if you can understand how it works now”. You want this dialog to occur in the interface code itself so that it will be preserved for future programmers. You don’t want the dialog to occur solely in your own mind, which will bake subtle semantic dependencies into the client code that uses the class. And you don’t want the dialog to occur interpersonally so that it benefits only your code but no one else’s;
- Watch for coupling that’s too tight. Coupling refers to how tight the connection is between two classes. In general, the looser the connection, the better. Several general guidelines flow from this concept:
- Minimize accessibility of classes and members;
- Avoid friend classes, because they’re tightly coupled;
- Avoid making data protected in a base class because it allows derived classes to be more tightly coupled to the base class;
- Avoid exposing member data in a class’s public interface;
- Be wary of semantic violations of encapsulation;
- Observe the Law of Demeter;
- Coupling goes hand in glove with abstraction and encapsulation. Tight coupling occurs when an abstraction is leaky, or when encapsulation is broken. If a class offers an incomplete set of services, other routines might find they need to read or write its internal data directly. That opens up the class, making it a glass box instead of a black box, and virtually eliminates the class’s encapsulation.
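A minimal Java sketch of the database.Connect() example from the list of semantic violations above. Database, EmployeeRecord, and the documented precondition are assumptions made for this sketch. The client follows the documented contract instead of relying on what it happens to know about the implementation:

```java
// Hypothetical classes for illustration only.
class Database {
    private boolean connected = false;   // hidden state, reachable only through routines

    void connect() { connected = true; }

    boolean isConnected() { return connected; }
}

class EmployeeRecord {
    private String name = "";

    // Documented contract (assumed for this sketch): the database must
    // already be connected when retrieve() is called.
    void retrieve(Database database) {
        if (!database.isConnected()) {
            throw new IllegalStateException("connect() must be called before retrieve()");
        }
        name = "retrieved";   // stand-in for a real lookup
    }

    String name() { return name; }
}

class EncapsulationDemo {
    static String loadEmployee() {
        Database database = new Database();
        database.connect();   // call it explicitly, as the interface documents
        EmployeeRecord employee = new EmployeeRecord();
        employee.retrieve(database);
        return employee.name();
    }
}
```

Because the client depends only on the documented interface, the author of EmployeeRecord stays free to change how retrieve() works internally.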
6.3 Design and Implementation Issues
Containment (“has a” relationships)
- Containment is the simple idea that a class contains a primitive data element or object;
- A lot more is written about inheritance than about containment, but that’s because inheritance is trickier and more error-prone, not because it’s better;
- Implement “has a” through containment:
- One way of thinking of containment is as a “has a” relationship. For example, an employee “has a” name, “has a” phone number, “has a” tax ID, and so on. You can usually accomplish this by making the name, phone number, or tax ID member data of the Employee class.
- Implement “has a” through private inheritance as a last resort:
- In some instances you might find that you can’t achieve containment through making one object a member of another. In that case, some experts suggest privately inheriting from the contained object. The main reason you would do that is to set up the containing class to access protected member functions or data of the class that’s contained. In practice, this approach creates an overly cozy relationship with the ancestor class and violates encapsulation. It tends to point to design errors that should be resolved some way other than through private inheritance.
- Be critical of classes that contain more than about seven members:
- The number 7+/-2 has been found to be a number of discrete items a person can remember while performing other tasks. If a class contains more than about seven data members, consider whether the class should be decomposed into multiple smaller classes. You might err more toward the high end of 7+/-2 if the data members are primitive data types like integers and strings; more toward the lower end of 7+/-2 if the data members are complex objects.
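A minimal Java sketch of “has a” implemented through containment, following the employee example above. The class and routine names are hypothetical:

```java
// Hypothetical classes for illustration only.
class PhoneNumber {
    private final String digits;

    PhoneNumber(String digits) { this.digits = digits; }

    String digits() { return digits; }
}

class Employee {
    // Each "has a" relationship is private member data. Three members is
    // comfortably below the "about seven" guideline.
    private final String name;
    private final PhoneNumber phoneNumber;
    private final String taxId;

    Employee(String name, PhoneNumber phoneNumber, String taxId) {
        this.name = name;
        this.phoneNumber = phoneNumber;
        this.taxId = taxId;
    }

    String name() { return name; }

    PhoneNumber phoneNumber() { return phoneNumber; }

    String taxId() { return taxId; }
}
```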
Inheritance (“is a” relationships)
- Inheritance is the complex idea that one class is a specialization of another class;
- The purpose of inheritance is to create simpler code by defining a base class that specifies common elements of two or more derived classes;
- A great many of the problems in modern programming arise from overly enthusiastic use of inheritance;
- When you decide to use inheritance, you have to make several decisions:
- For each member routine, will the routine be visible to derived classes? Will it have a default implementation? Will the default implementation be overridable?
- For each data member (including variables, named constants, enumerations, and so on), will the data member be visible to derived classes?
Implement “is a” through public inheritance
- When a programmer decides to create a new class by inheriting from an existing class, that programmer is saying that the new class “is a” more specialized version of the older class. The base class sets expectations about how the derived class will operate;
- If the derived class isn’t going to adhere completely to the same interface contract defined by the base class, inheritance is not the right implementation technique. Consider containment or making a change further up the inheritance hierarchy.
Design and document for inheritance or prohibit it
- Inheritance adds complexity to a program, and, as such, it is a dangerous technique;
- If a class isn’t designed to be inherited from, prohibit inheritance explicitly (for example, declare it final in Java or sealed in C#) so that no one can derive from it.
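In Java, the keyword that prohibits inheritance is final. A minimal sketch, with Money as a hypothetical class that was not designed to be extended:

```java
// Hypothetical class for illustration only.
// It is not designed for inheritance, so inheritance is prohibited outright.
final class Money {
    private final long cents;

    Money(long cents) { this.cents = cents; }

    Money plus(Money other) { return new Money(cents + other.cents); }

    long cents() { return cents; }
}

// class DiscountedMoney extends Money { }   // would not compile: Money is final
```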
Adhere to the Liskov Substitution Principle
- You shouldn’t inherit from a base class unless the derived class truly “is a” more specific version of the base class;
- Subclasses must be usable through the base class interface without the need for the user to know the difference. In other words, all the routines defined in the base class should mean the same thing when they’re used in each of the derived classes;
- If you have a base class of Account, and derived classes of CheckingAccount, SavingsAccount, and AutoLoanAccount, a programmer should be able to invoke any of the routines derived from Account on any of Account’s subtypes without caring about which subtype a specific account object is;
- If a program has been written so that the Liskov Substitution Principle is true, inheritance is a powerful tool for reducing complexity because a programmer can focus on the generic attributes of an object without worrying about the details;
- If a programmer must be constantly thinking about semantic differences in subclass implementations, then inheritance is increasing complexity rather than reducing it. Suppose a programmer has to think, “If I call the InterestRate() routine on CheckingAccount or SavingsAccount, it returns the interest the bank pays, but if I call InterestRate() on AutoLoanAccount I have to change the sign because it returns the interest the consumer pays to the bank.” According to Liskov, the InterestRate() routine should not be inherited because its semantics aren’t the same for all derived classes.
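One possible way to restructure the account example so that it respects the Liskov Substitution Principle, sketched in Java. The interface name InterestEarning, the routine names, and the rates are all assumptions of this sketch. The point is that the base class exposes only routines that mean the same thing for every subtype:

```java
// Hypothetical classes for illustration only; rates are made-up values.
abstract class Account {
    private final String id;

    Account(String id) { this.id = id; }

    String id() { return id; }   // means the same thing for every subtype
}

// Only accounts on which the bank pays interest implement this interface.
interface InterestEarning {
    double interestRatePaidByBank();
}

class CheckingAccount extends Account implements InterestEarning {
    CheckingAccount(String id) { super(id); }

    public double interestRatePaidByBank() { return 0.001; }
}

class SavingsAccount extends Account implements InterestEarning {
    SavingsAccount(String id) { super(id); }

    public double interestRatePaidByBank() { return 0.02; }
}

class AutoLoanAccount extends Account {
    AutoLoanAccount(String id) { super(id); }

    // A differently named routine, so no caller has to remember to flip a sign.
    double interestRateChargedToCustomer() { return 0.07; }
}

class AccountReport {
    // Works for any Account subtype without checking which one it is.
    static String describe(Account account) { return "Account " + account.id(); }
}
```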
Be sure to inherit only what you want to inherit
- A derived class can inherit member routine interfaces, implementations, or both;
- Inherited routines come in three basic flavors:
- An abstract overridable routine means that the derived class inherits the routine’s interface but not its implementation;
- An overridable routine means that the derived class inherits the routine’s interface and a default implementation, and it is allowed to override the default implementation;
- A non-overridable routine means that the derived class inherits the routine’s interface and its default implementation, and it is not allowed to override the routine’s implementation;
- When you choose to implement a new class through inheritance, think through the kind of inheritance you want for each member routine. Beware of inheriting implementation just because you’re inheriting an interface, and beware of inheriting an interface just because you want to inherit an implementation.
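The three flavors of inherited routine map onto Java roughly as abstract, ordinary, and final methods. A minimal sketch, with the Report classes and their strings invented for illustration:

```java
// Hypothetical classes for illustration only.
abstract class Report {
    // Abstract overridable: the interface is inherited, with no implementation.
    abstract String body();

    // Overridable: the interface plus a default implementation that may be replaced.
    String title() { return "Untitled report"; }

    // Non-overridable: the interface plus an implementation that cannot be replaced.
    final String render() { return title() + "\n" + body(); }
}

class SalesReport extends Report {
    @Override
    String body() { return "Sales: 42 units"; }

    @Override
    String title() { return "Sales report"; }
}

class AuditReport extends Report {
    @Override
    String body() { return "No findings"; }

    // title() is inherited unchanged.
}
```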
Move common interfaces, data, and behavior as high as possible in the inheritance tree
- The higher you move interfaces, data, and behavior, the more easily derived classes can use them;
- How high is too high? Let abstraction be your guide. If you find that moving a routine higher would break the higher object’s abstraction, don’t do it.
Be suspicious of classes of which there is only one instance
- A single instance might indicate that the design confuses objects with classes. Consider whether you could just create an object instead of a new class. Can the variation of the derived class be represented in data rather than as a distinct class?
Be suspicious of base classes of which there is only one derived class
- When I see a base class that has only one derived class, I suspect that some programmer has been “designing ahead”—trying to anticipate future needs, usually without fully understanding what those future needs are;
- The best way to prepare for future work is not to design extra layers of base classes that “might be needed someday,” it’s to make current work as clear, straightforward, and simple as possible;
- That means not creating any more inheritance structure than is absolutely necessary.
Be suspicious of classes that override a routine and do nothing inside the derived routine
- This typically indicates an error in the design of the base class. For instance, suppose you have a class Cat and a routine Scratch() and suppose that you eventually find out that some cats are declawed and can’t scratch. You might be tempted to create a class derived from Cat named ScratchlessCat and override the Scratch() routine to do nothing. There are several problems with this approach:
- It violates the abstraction (interface contract) presented in the Cat class by changing the semantics of its interface;
- This approach quickly gets out of control when you extend it to other derived classes. What happens when you find a cat without a tail? Or a cat that doesn’t catch mice? Or a cat that doesn’t drink milk? Eventually you’ll end up with derived classes like ScratchlessTaillessMicelessMilklessCat;
- Over time, this approach gives rise to code that’s confusing to maintain because the interfaces and behavior of the ancestor classes imply little or nothing about the behavior of their descendants.
- The place to fix this problem is not in the derived class, but in the original Cat class. Create a Claws class and contain that within the Cat class, or build a constructor for the class that includes whether the cat scratches;
- The root problem was the assumption that all cats scratch, so fix that problem at the source, rather than just bandaging it at the destination.
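A minimal Java sketch that combines both suggested fixes: a contained Claws object and a constructor argument that says whether the cat has claws. The routine names and the use of Optional for the result are assumptions of this sketch:

```java
// Hypothetical classes for illustration only.
class Claws {
    String scratch() { return "scratch!"; }
}

class Cat {
    private final Claws claws;   // null when the cat has no claws

    Cat(boolean hasClaws) {
        this.claws = hasClaws ? new Claws() : null;
    }

    boolean canScratch() { return claws != null; }

    // The contract states outright that scratching might not happen, so no
    // derived class has to override scratch() with an empty body.
    java.util.Optional<String> scratch() {
        if (!canScratch()) {
            return java.util.Optional.empty();
        }
        return java.util.Optional.of(claws.scratch());
    }
}
```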
Avoid deep inheritance trees
- Object-oriented programming provides a large number of techniques for managing complexity;
- But every powerful tool has its hazards, and some object-oriented techniques have a tendency to increase complexity rather than reduce it;
- Most people have trouble juggling more than two or three levels of inheritance in their brains at once;
- Deep inheritance trees have been found to be significantly associated with increased fault rates;
- Deep inheritance trees increase complexity, which is exactly the opposite of what inheritance should be used to accomplish. Make sure you’re using inheritance to minimize complexity.
Prefer inheritance to extensive type checking
- Frequently repeated case statements sometimes suggest that inheritance might be a better design choice, although this is not always true;
// code example 1
switch ( shape.type ) {
    case Shape_Circle:
        shape.DrawCircle();
        break;
    case Shape_Square:
        shape.DrawSquare();
        break;
}
- In this example, the calls to shape.DrawCircle() and shape.DrawSquare() should be replaced by a single routine named shape.Draw(), which can be called regardless of whether the shape is a circle or a square;
// code example 2
switch ( ui.Command() ) {
    case Command_OpenFile:
        OpenFile();
        break;
    case Command_Print:
        Print();
        break;
    case Command_Save:
        Save();
        break;
    case Command_Exit:
        ShutDown();
        break;
}
- In this case, it would be possible to create a base class with derived classes and a polymorphic DoCommand() routine for each command. But the meaning of DoCommand() would be so diluted as to be meaningless, and the case statement is the more understandable solution.
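For the shape example, the replacement of type checking with a single polymorphic routine might look like this in Java. Here draw() returns a description rather than rendering anything, which is a simplification of this sketch, and Canvas is a hypothetical caller:

```java
// Hypothetical classes for illustration only.
abstract class Shape {
    abstract String draw();
}

class Circle extends Shape {
    @Override
    String draw() { return "circle"; }
}

class Square extends Shape {
    @Override
    String draw() { return "square"; }
}

class Canvas {
    // No switch on a type code: each shape knows how to draw itself.
    static String drawAll(Shape[] shapes) {
        StringBuilder out = new StringBuilder();
        for (Shape shape : shapes) {
            out.append(shape.draw()).append(' ');
        }
        return out.toString().trim();
    }
}
```

Adding a new shape means adding one class, with no change to Canvas.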
Avoid using a base class’s protected data in a derived class (or make that data private instead of protected in the first place)
- Inheritance breaks encapsulation;
- When you inherit from an object, you obtain privileged access to that object’s protected routines and data;
- If the derived class really needs access to the base class’s attributes, provide protected accessor functions instead.
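A minimal Java sketch of protected accessor functions guarding private base-class data. Vehicle, Truck, and the validation rule are hypothetical. Note that in Java, protected members are also visible to other classes in the same package:

```java
// Hypothetical classes for illustration only.
class Vehicle {
    private int speedKph;   // private, not protected

    protected int getSpeedKph() { return speedKph; }

    protected void setSpeedKph(int value) {
        // The base class keeps control of its own invariant.
        if (value < 0) {
            throw new IllegalArgumentException("speed must be non-negative");
        }
        speedKph = value;
    }
}

class Truck extends Vehicle {
    void accelerate(int deltaKph) { setSpeedKph(getSpeedKph() + deltaKph); }

    int currentSpeedKph() { return getSpeedKph(); }
}
```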
Why Are There So Many Rules for Inheritance?
- Inheritance tends to work against the primary technical imperative you have as a programmer, which is to manage complexity;
- For the sake of controlling complexity you should maintain a heavy bias against inheritance;
- Here’s a summary of when to use inheritance and when to use containment:
- If multiple classes share common data but not behavior, then create a common object that those classes can contain;
- If multiple classes share common behavior but not data, then derive them from a common base class that defines the common routines;
- If multiple classes share common data and behavior, then inherit from a common base class that defines the common data and routines;
- Inherit when you want the base class to control your interface; contain when you want to control your interface.
Member Functions and Data
Keep the number of routines in a class as small as possible
- One study found that a higher number of routines per class was associated with higher fault rates;
- Other competing factors were found to be more significant, including deep inheritance trees, large number of routines called by a routine, and strong coupling between classes;
- Evaluate the tradeoff between minimizing the number of routines and these other factors.
Minimize direct routine calls to other classes
- One study found that the number of faults in a class was statistically correlated with the total number of routines that were called from within a class;
- The same study found that the more classes a class used, the higher its fault rate tended to be.
Minimize indirect routine calls to other classes
- Direct connections are hazardous enough. Indirect connections, such as account.ContactPerson().DaytimeContactInfo().PhoneNumber(), tend to be even more hazardous;
- Law of Demeter:
- Object A can call any of its own routines;
- If Object A instantiates an Object B, it can call any of Object B’s routines. But it should avoid calling routines on objects provided by Object B;
- In the account example above, that means account.ContactPerson() is OK, but account.ContactPerson().DaytimeContactInfo() is not;
- Depending on how classes are arranged, it might be acceptable to see an expression like account.ContactPerson().DaytimeContactInfo().
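One way to keep the account example within the Law of Demeter is to have each class forward the request to its direct collaborator. A minimal Java sketch, in which the forwarding routine names are assumptions:

```java
// Hypothetical classes for illustration only.
class ContactInfo {
    private final String phoneNumber;

    ContactInfo(String phoneNumber) { this.phoneNumber = phoneNumber; }

    String phoneNumber() { return phoneNumber; }
}

class Person {
    private final ContactInfo daytimeContactInfo;

    Person(ContactInfo daytimeContactInfo) { this.daytimeContactInfo = daytimeContactInfo; }

    // Person talks only to its own member.
    String daytimePhoneNumber() { return daytimeContactInfo.phoneNumber(); }
}

class Account {
    private final Person contactPerson;

    Account(Person contactPerson) { this.contactPerson = contactPerson; }

    // Callers ask the account directly and never walk the object chain themselves.
    String contactDaytimePhoneNumber() { return contactPerson.daytimePhoneNumber(); }
}
```

The tradeoff is a few extra forwarding routines in exchange for callers that depend on Account alone.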
In general, minimize the extent to which a class collaborates with other classes
- Try to minimize all of the following:
- Number of kinds of objects instantiated;
- Number of different direct routine calls on instantiated objects;
- Number of routine calls on objects returned by other instantiated objects.
Constructors
Initialize all member data in all constructors, if possible
- Initializing all data members in all constructors is an inexpensive defensive programming practice.
Initialize data members in the order in which they’re declared
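- In C++, data members are initialized in the order in which they’re declared, regardless of the order written in the constructor’s initializer list, so a mismatched order can produce subtle errors. Using the same order in both places also provides consistency that makes the code easier to read.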
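A minimal Java sketch of both constructor guidelines. Rectangle is a hypothetical class. Every constructor assigns every member, in declaration order, and the shorter constructor delegates to the full one so that it cannot skip a member:

```java
// Hypothetical class for illustration only.
class Rectangle {
    // Declaration order: width, height, label.
    private final double width;
    private final double height;
    private final String label;

    // Members are assigned in the same order in which they are declared.
    Rectangle(double width, double height, String label) {
        this.width = width;
        this.height = height;
        this.label = label;
    }

    // Delegating to the full constructor means no member is left uninitialized.
    Rectangle(double side) {
        this(side, side, "square");
    }

    double area() { return width * height; }

    String label() { return label; }
}
```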
Prefer deep copies to shallow copies until proven otherwise
- One of the major decisions you’ll make about complex objects is whether to implement deep copies or shallow copies of the object;
- A deep copy of an object is a member-wise copy of the object’s member data:
// code example
A ob1 = new A();
ob1.a = 10;
A ob2 = new A();
ob2.a = ob1.a;   // copy the value into a separate object
ob1.a = 5;
// ob2.a is still 10 after this line.
- A shallow copy typically just points to or refers to a single reference copy:
// code example
A ob1 = new A();
ob1.a = 10;
A ob2 = new A();
ob2 = ob1;       // ob2 now refers to the same object as ob1
ob1.a = 5;
// ob2.a is 5 after this line.
- Deep copies are simpler to code and maintain than shallow copies;
- In addition to the code either kind of object would contain, shallow copies add code to count references, ensure safe object copies, safe comparisons, safe deletes, and so on;
- This code tends to be error prone, and it should be avoided unless there’s a compelling reason to create it;
- The motivation for creating shallow copies is typically to improve performance;
- Although creating multiple copies of large objects might be aesthetically offensive, it rarely causes any measurable performance impact;
- A small number of objects might cause performance issues, but programmers are notoriously poor at guessing which code really causes problems;
- Because it’s a poor tradeoff to add complexity for dubious performance gains, a good approach to deep vs. shallow copies is to prefer deep copies until proven otherwise.
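For objects that contain other objects, a deep copy also has to copy the contained objects. A minimal Java sketch using copy constructors, with Car and Engine as hypothetical classes:

```java
// Hypothetical classes for illustration only.
class Engine {
    private int horsepower;

    Engine(int horsepower) { this.horsepower = horsepower; }

    Engine(Engine other) { this.horsepower = other.horsepower; }   // copy constructor

    int horsepower() { return horsepower; }

    void setHorsepower(int value) { horsepower = value; }
}

class Car {
    private final Engine engine;

    Car(Engine engine) { this.engine = engine; }

    // Deep copy: the new Car gets its own Engine rather than sharing the original's.
    Car(Car other) { this.engine = new Engine(other.engine); }

    Engine engine() { return engine; }
}

class CopyDemo {
    // Returns the horsepower seen through the deep copy and through the alias.
    static int[] run() {
        Car original = new Car(new Engine(100));
        Car deepCopy = new Car(original);   // independent object
        Car alias = original;               // shallow: a second reference to the same object
        original.engine().setHorsepower(150);
        return new int[] { deepCopy.engine().horsepower(), alias.engine().horsepower() };
    }
}
```

The deep copy is unaffected by later changes to the original, while the alias sees every change.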
6.4 Reasons to Create a Class
- Model real-world objects;
- Model abstract objects;
- Reduce complexity;
- Isolate complexity;
- Hide implementation details;
- Limit effects of changes;
- Hide global data;
- Streamline parameter passing:
- If you’re passing a parameter among several routines, that might indicate a need to factor those routines into a class that shares the parameter as class data;
- Streamlining parameter passing isn’t a goal, per se, but passing lots of data around suggests that a different class organization might work better.
- Make central points of control:
- Using one class to read from and write to a database is a form of centralized control. If the database needs to be converted to a flat file or to in-memory data, the changes will affect only the one class.
- Facilitate reusable code;
- Plan for a family of programs:
- If you expect a program to be modified, it’s a good idea to isolate the parts that you expect to change by putting them into their own classes. You can then modify the classes without affecting the rest of the program, or you can put in completely new classes instead;
- Several years ago I managed a team that wrote a series of programs used by our clients to sell insurance. We had to tailor each program to the specific client’s insurance rates, quote-report format, and so on. But many parts of the programs were similar: the classes that input information about potential customers, that stored information in a customer database, that looked up rates, that computed total rates for a group, and so on. The team factored the program so that each part that varied from client to client was in its own class. The initial programming might have taken three months or so, but when we got a new client, we merely wrote a handful of new classes for the new client and dropped them into the rest of the code. A few days’ work, and voila! Custom software!
- Package related operations;
- Accomplish a specific refactoring.
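A small Java sketch of the “streamline parameter passing” item above. TextFormatter and its routines are hypothetical. The indent width, which would otherwise be passed to every routine, becomes class data:

```java
// Hypothetical class for illustration only.
// Before: header(indentWidth, title), body(indentWidth, text), and so on.
class TextFormatter {
    private final int indentWidth;   // formerly a parameter of every routine

    TextFormatter(int indentWidth) { this.indentWidth = indentWidth; }

    String header(String title) { return pad() + "# " + title; }

    String body(String text) { return pad() + text; }

    private String pad() {
        StringBuilder spaces = new StringBuilder();
        for (int i = 0; i < indentWidth; i++) {
            spaces.append(' ');
        }
        return spaces.toString();
    }
}
```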
Classes to avoid:
- God classes;
- Eliminate irrelevant classes:
- If a class consists only of data but no behavior, ask yourself whether it’s really a class and consider demoting it to become an attribute of another class.
- Avoid classes named after verbs:
- A class that has only behavior but no data is generally not really a class. Consider turning a class like DatabaseInitialization() or StringBuilder() into a routine on some other class.
Read the checklist and Key Points at the end of the chapter.