The Big Three in Java, Part Four: equals(Object) in Hierarchies

Introduction

We’re onto the fourth and final blog post in a four-part series about what I call the “big three” methods in Java. These methods are all inherited from the Object class, and they should generally be overridden in any class that we implement. The three methods are Object.toString(), Object.equals(Object), and Object.hashCode().

In the first blog post in this series, we covered the Object.toString() method. The second post, closely related to this one, introduced the Object.equals(Object) method. The third post discussed the Object.hashCode() method.

Today, we’ll end the series by returning to the Object.equals(Object) method. There are some subtleties we have to watch out for when we implement this method in class hierarchies. When a class that overrides equals(Object) has subclasses, we have to be careful to maintain certain required behaviours in this method, namely symmetry and transitivity.

API Requirements

All implementations of equals(Object) in Java need to adhere to five requirements. From the Java 17 API, implementations must have the following properties, for non-null objects x, y, and z:

  • Reflexive: x.equals(x) is always true
  • Symmetric: x.equals(y) returns the same thing as y.equals(x)
  • Transitive: If x.equals(y) and y.equals(z) are both true, then x.equals(z) returns true
  • Consistent (also called idempotent): x.equals(y) always returns the same result, no matter how many times it is run, provided no information that x or y uses in the comparison has changed
  • Not null: x.equals(null) is always false

As we’ll see, it’s the symmetry and transitivity requirements that are the most difficult to maintain when we introduce class hierarchies — for example, DairyCow extends Cow, or HappyDog extends Dog.
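
As a rough illustration of what these requirements mean in practice, the sketch below spot-checks them for specific objects. The EqualsContractChecks class and its method names are hypothetical helpers invented for this example (they are not part of the JDK), and passing these checks for a handful of objects doesn’t prove that an implementation is correct in general:

```java
// Hypothetical helper class for illustration; not part of the JDK
class EqualsContractChecks
{
    // Reflexive: x.equals(x) must be true
    static boolean isReflexive(Object x)
    {
        return x.equals(x);
    }

    // Symmetric: both directions must give the same answer
    static boolean isSymmetric(Object x, Object y)
    {
        return x.equals(y) == y.equals(x);
    }

    // Transitive: violated only if x equals y and y equals z,
    // but x does not equal z
    static boolean isTransitive(Object x, Object y, Object z)
    {
        if (x.equals(y) && y.equals(z)) {
            return x.equals(z);
        }
        return true;
    }

    // Not null: x.equals(null) must be false
    static boolean rejectsNull(Object x)
    {
        return !x.equals(null);
    }

    public static void main(String[] args)
    {
        String a = "Fido";
        String b = new String("Fido");
        String c = new String("Fido");

        System.out.println(isReflexive(a));         // true
        System.out.println(isSymmetric(a, b));      // true
        System.out.println(isTransitive(a, b, c));  // true
        System.out.println(rejectsNull(a));         // true
    }
}
```

Checks like isSymmetric and isTransitive are the ones that start to fail once subclasses enter the picture.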

A Subclass with Additional Functionality

Let’s dive right into an example where we’d like one object to potentially be equal to a different object of its subclass. Here, we define a Dog class, where Dogs have a name and age:

public class Dog
{
    protected String name;
    protected int age;

    public Dog(String name, int age)
    {
        this.name = name;
        this.age = age;
    }
}

We’ll also define a subclass of Dog, HappyDog, that has some additional functionality. Namely, a happy dog can wag its tail:

public class HappyDog extends Dog
{
    public HappyDog(String name, int age)
    {
        super(name, age);
    }

    public void wagTail()
    {
        System.out.println("Wag");
    }
}

Importantly, there are no new instance variables in the HappyDog class. That is, while HappyDog has additional functionality, there’s no new state a HappyDog has that a Dog doesn’t.

We would like for any two dogs to be equal if they have the same name and age, regardless of whether they’re Dog objects, HappyDog objects, or one of each.

Back in our earlier blog post about the Object.equals(Object) method, we briefly touched on two ways to safeguard the downcast in an implementation of equals(Object): an instanceof check, or a class-equality check. When we want it to be possible for an object to be equal to another object of a subclass, we use an instanceof check. Here, that means asking whether the parameter object is an instance of Dog, regardless of whether it’s a Dog or one of its subclasses. Based on the code from our earlier post:

public class Dog
{
    [...]

    // Will be improved later
    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof Dog)) {
            return false;
        }

        Dog other = (Dog)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }
}

Because we used an instanceof check to guard the downcast in the Dog class, we see that a Dog and a HappyDog can compare as equal if they have the same name and age:

public static void main(String[] args)
{
    Dog d = new Dog("Fido", 3);
    HappyDog hd1 = new HappyDog("Fido", 3);
    HappyDog hd2 = new HappyDog("Rover", 4);

    System.out.println(d.equals(hd1));  // true
    System.out.println(d.equals(hd2));  // false
}

We can also see that the relationship is symmetric — that is, x.equals(y) produces the same result as y.equals(x):

System.out.println(d.equals(hd1));  // true
System.out.println(hd1.equals(d));  // true

System.out.println(d.equals(hd2));  // false
System.out.println(hd2.equals(d));  // false

For students working along through this code, note that HappyDog inherits the Dog.equals(Object) method. So, a HappyDog object also checks that the parameter to its equals(Object) method is an instance of Dog.

We’ll return to this Dog.equals(Object) method later in the blog post with one improvement. But, for now, it implements the required behaviour: a Dog and a HappyDog can be equal if they have the same name and age.

A Subclass with Additional State: Broken Symmetry

Now, let’s consider two new classes — a class and a subclass — where the subclass has an additional instance variable. For this example, we have a Cow class and a DairyCow class. A Cow has a name and age:

public class Cow
{
    protected String name;
    protected int age;

    public Cow(String name, int age)
    {
        this.name = name;
        this.age = age;
    }
}

A DairyCow, beyond having a name and age, also has an instance variable that represents how much milk the dairy cow produces:

public class DairyCow extends Cow
{
    private double milkProduced;

    public DairyCow(String name, int age,
        double milkProduced)
    {
        super(name, age);
        this.milkProduced = milkProduced;
    }
}

Let’s say we want two DairyCows to compare as equal only if they have the same name and age, and produce the same amount of milk. One (incorrect) approach we might think to take is to first implement an equals(Object) method in the Cow class with an instanceof check guarding the downcast. Then, we might think to override that implementation of equals(Object) in the DairyCow class, so it also takes into account the amount of milk produced. This approach is shown below; but, as we’ll see, it’s wrong:

public class Cow
{
    [...]

    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        // WRONG
        if (!(obj instanceof Cow)) {
            return false;
        }

        Cow other = (Cow)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }
}

public class DairyCow extends Cow
{
    [...]

    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof DairyCow)) {
            return false;
        }

        DairyCow other = (DairyCow)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age &&
            this.milkProduced == other.milkProduced);
    }
}

The reason that this implementation is incorrect is that when you call Cow.equals(Object) with a DairyCow as a parameter, only the name and the age of the two cows are compared. A Cow and a DairyCow with the same name and age will compare as equal. However, when you call DairyCow.equals(Object) with a Cow as a parameter, the instanceof guard on the downcast to DairyCow fails, and the method returns false. For example:

public static void main(String[] args)
{
    Cow c = new Cow("Molly", 3);
    DairyCow dc = new DairyCow("Molly", 3, 10.0);

    System.out.println(c.equals(dc));  // true
    System.out.println(dc.equals(c));  // false
}

Because c.equals(dc) and dc.equals(c) return different values, the symmetry property of the equals(Object) method has been broken. In order to meet the symmetry requirement, we need to take a different approach.
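
Broken symmetry isn’t just a theoretical problem. As a hedged sketch, the code below reuses the incorrect Cow and DairyCow implementations (trimmed slightly) and shows that a list lookup gives a different answer depending on which object is stored and which is searched for. The BrokenSymmetryDemo wrapper and the listContains helper are hypothetical names invented for this example, and the behaviour shown relies on ArrayList.contains calling equals(Object) on its argument, as the OpenJDK implementation does:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class so the sketch fits in one file
class BrokenSymmetryDemo
{
    static class Cow
    {
        protected String name;
        protected int age;

        Cow(String name, int age)
        {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj)
        {
            // The problematic instanceof guard
            if (!(obj instanceof Cow)) {
                return false;
            }
            Cow other = (Cow)obj;
            return this.name.equals(other.name) && this.age == other.age;
        }

        @Override
        public int hashCode()
        {
            return 31 * this.name.hashCode() + this.age;
        }
    }

    static class DairyCow extends Cow
    {
        private double milkProduced;

        DairyCow(String name, int age, double milkProduced)
        {
            super(name, age);
            this.milkProduced = milkProduced;
        }

        // hashCode() is inherited from Cow, which stays consistent
        // here: equal DairyCows always share a name and age
        @Override
        public boolean equals(Object obj)
        {
            if (!(obj instanceof DairyCow)) {
                return false;
            }
            DairyCow other = (DairyCow)obj;
            return this.name.equals(other.name) && this.age == other.age &&
                this.milkProduced == other.milkProduced;
        }
    }

    // Puts element into a fresh list, then asks whether the list
    // contains target; the lookup evaluates target.equals(element)
    static boolean listContains(Object element, Object target)
    {
        List<Object> list = new ArrayList<>();
        list.add(element);
        return list.contains(target);
    }

    public static void main(String[] args)
    {
        Cow c = new Cow("Molly", 3);
        DairyCow dc = new DairyCow("Molly", 3, 10.0);

        System.out.println(listContains(dc, c));  // true:  c.equals(dc)
        System.out.println(listContains(c, dc));  // false: dc.equals(c)
    }
}
```

The same two objects are “in the list” or “not in the list” depending only on which one we ask about, which is the kind of surprise the symmetry requirement is meant to prevent.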

A Subclass with Additional State: Considering Transitivity

At this point, we might think the issue of broken symmetry stems just from the way we’ve written the guard on the downcast. Is it possible to rewrite our code so that a Cow and a Cow, or a Cow and a DairyCow, compare as equal if they have the same name and age; but, two DairyCows only compare as equal if they have the same name, age, and amount of milk produced?

Code that behaves like this would be a disaster rife with anti-patterns, but we could write it. However, I won’t write it in this blog post, nor would I ever want to leave it as an exercise for students.

That said, there’s a more fundamental reason not to design the Cow and DairyCow classes like this, beyond the code being a mess. Code that behaves the way we’ve described would violate the transitivity requirement of the equals(Object) method.

Let’s start by assuming that:

  1. We want two Cow objects to compare as equal if they have the same name and age; and,
  2. We want two DairyCow objects to be equal if they have the same name, age, and amount of milk produced.

As we’ll see, to meet just these two design requirements alone, a DairyCow must never compare as equal to a Cow, because of the transitivity requirement of the equals(Object) method.

Let’s consider the following three objects, to see what happens if it’s possible for a Cow and DairyCow to compare as equal:

Cow c = new Cow("Molly", 3);
DairyCow dc1 = new DairyCow("Molly", 3, 10.0);
DairyCow dc2 = new DairyCow("Molly", 3, 25.0);

Let’s assume, for the sake of argument, that the DairyCow dc1 and the Cow c compare as equal, because they have the same name and age:

boolean i = dc1.equals(c);  // Assume this is true

Let’s also assume that c and dc2 compare as equal, for the same reason — they have the same name and age:

boolean j = c.equals(dc2);  // Assume this is true

Then, by the transitivity requirement of the equals(Object) method, it would have to be the case that dc1 and dc2 compare as equal:

boolean k = dc1.equals(dc2);  // Then this must be true!?

But the DairyCows dc1 and dc2 produce different amounts of milk, so we don’t want them to compare as equal.

A Subclass with Additional State: Changing the Downcast Guard

We encountered this situation, where two DairyCows that produce different amounts of milk would have to compare as equal, because of two factors put together:

  1. An object of the superclass (Cow) can sometimes compare as equal to an object of the subclass (DairyCow); but,
  2. Objects of a subclass (DairyCow) use additional state that the superclass doesn’t have (milkProduced) in their equality comparison.

In order to avoid breaking the transitivity requirement of the equals(Object) method, we need to eliminate one of these two contributing factors. To do that, there’s a rule we have to follow. If objects of a subclass compute equality using additional state beyond what the superclass has, then objects of the superclass and objects of the subclass must never compare as equal to each other. That is, if contributing factor 2 is present, we have to eliminate contributing factor 1.

Because the DairyCow class uses additional state in its equality comparison, we have to change the downcast guard in the Cow class to ensure that a Cow never compares as equal to any of its subclasses. We do so by changing the instanceof guard on the downcast (which allows the Object parameter to be a Cow or any subclass of Cow, such as a DairyCow) to a class equality guard that uses the Object.getClass() method (which allows the Object parameter only to be a Cow):

public class Cow
{
    [...]

    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (obj == null ||
            this.getClass() != obj.getClass()) {
            return false;
        }

        Cow other = (Cow)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }
}

Note that, while the instanceof operator implicitly guards against the Object parameter being null, the class-equality check doesn’t. We have to ensure that obj isn’t null before we attempt to call obj.getClass().
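
The difference in null handling can be seen in isolation with a small sketch. NullGuardDemo and its method names are hypothetical, invented only for this example:

```java
// Hypothetical demo class for illustration
class NullGuardDemo
{
    // instanceof simply evaluates to false for a null reference
    static boolean instanceofIsFalseForNull()
    {
        Object obj = null;
        return !(obj instanceof String);
    }

    // Calling getClass() on a null reference throws an exception
    static boolean getClassThrowsForNull()
    {
        Object obj = null;
        try {
            obj.getClass();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(instanceofIsFalseForNull());  // true
        System.out.println(getClassThrowsForNull());     // true
    }
}
```

This is why the class-equality guard tests obj == null first: the || operator short-circuits, so obj.getClass() is only reached when obj is non-null.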

The Downcast Guard in the Subclass with Additional State

Because we want two DairyCows to compare as equal if they have the same name, age, and amount of milk produced — behaviour different from the equals(Object) method in the Cow superclass — we need to override the equals(Object) method in the DairyCow class.

Recall that, in order to maintain the transitive property of equals(Object), we had to change the instanceof guard in the Cow superclass to a class-equality guard. But, what about the downcast guard in the subclass, in the DairyCow.equals(Object) method — should it be an instanceof guard or a class-equality guard? That answer depends on whether any subclass of DairyCow uses additional state beyond what DairyCow uses to compute equality.

If DairyCow has no subclasses, or at least no subclasses that use additional state beyond name, age, and amount of milk produced to compute equality, then we can use either an instanceof guard or a class-equality guard (depending on whether we want a DairyCow to be able to compare as equal to objects of its subclasses). The fact that DairyCow has a superclass (Cow) that uses less information than DairyCow to compute equality is irrelevant to the downcast guard used in DairyCow.

So either of these implementations of DairyCow.equals(Object) would be correct:

public class DairyCow extends Cow
{
    [...]

    // Can still be improved, below
    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof DairyCow)) {
            return false;
        }

        // Alternatively:
        // if (obj == null ||
        //     this.getClass() != obj.getClass()) {
        //     return false;
        // }

        DairyCow other = (DairyCow)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age &&
            this.milkProduced == other.milkProduced);
    }
}
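
To see what the choice between the two guards means in practice, here is a small sketch that isolates just the guard expressions. JerseyCow is a hypothetical subclass of DairyCow that adds no new state; it and the GuardComparisonDemo wrapper are invented for this example, and the classes are stripped down to carry no fields at all:

```java
// Hypothetical wrapper class; the classes inside are stripped down
// to show only how the two guards treat a subclass
class GuardComparisonDemo
{
    static class DairyCow { }

    // Hypothetical subclass that adds no new state
    static class JerseyCow extends DairyCow { }

    // The instanceof guard accepts DairyCow and all of its subclasses
    static boolean passesInstanceofGuard(Object obj)
    {
        return obj instanceof DairyCow;
    }

    // The class-equality guard accepts only the exact same class
    static boolean passesClassEqualityGuard(DairyCow self, Object obj)
    {
        return obj != null && self.getClass() == obj.getClass();
    }

    public static void main(String[] args)
    {
        DairyCow dc = new DairyCow();
        JerseyCow jc = new JerseyCow();

        System.out.println(passesInstanceofGuard(jc));         // true
        System.out.println(passesClassEqualityGuard(dc, jc));  // false
        System.out.println(passesClassEqualityGuard(jc, dc));  // false
        System.out.println(
            passesClassEqualityGuard(dc, new DairyCow()));     // true
    }
}
```

With the instanceof guard, a JerseyCow gets as far as the field comparison and can compare as equal to a DairyCow; with the class-equality guard, it’s rejected before any fields are examined.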

Protecting Transitivity When Using instanceof

Central to our discussion of implementing, for example, Dog.equals(Object), is knowing whether Dog has a subclass that uses more state in its equality computation than Dog does. But, what if we don’t know what subclasses of Dog another programmer might eventually implement? In fact, often we won’t know.

To address this situation, let’s discuss one last improvement we should make to the equals(Object) method when we use an instanceof guard, like we did in the Dog class. We can ensure that no subclass ever uses additional state to compute equality, beyond what the superclass with an instanceof guard does, by explicitly preventing any subclass from overriding the superclass’ equals(Object) method.

Using the example of Dog and HappyDog, in which Dog used an instanceof guard in its equality computation, we can use the final keyword to provide this guarantee. If the Dog.equals(Object) method is declared as final, then neither HappyDog nor any other subclass of Dog can ever override Dog.equals(Object):

public class Dog
{
    [...]

    @Override
    public final boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof Dog)) {
            return false;
        }

        Dog other = (Dog)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }
}
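
Because an attempted override of a final method is rejected by the compiler, it can’t be shown in a program that runs. As a rough substitute, the sketch below uses reflection to confirm that the equals(Object) method a HappyDog ends up with is the final one declared in Dog. The FinalEqualsDemo wrapper and the equalsIsFinal helper are hypothetical names invented for this example, and the hashCode() method is included only to keep the class consistent:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical wrapper class so the sketch fits in one file
class FinalEqualsDemo
{
    static class Dog
    {
        protected String name;
        protected int age;

        Dog(String name, int age)
        {
            this.name = name;
            this.age = age;
        }

        @Override
        public final boolean equals(Object obj)
        {
            if (!(obj instanceof Dog)) {
                return false;
            }
            Dog other = (Dog)obj;
            return this.name.equals(other.name) && this.age == other.age;
        }

        @Override
        public final int hashCode()
        {
            return 31 * this.name.hashCode() + this.age;
        }
    }

    static class HappyDog extends Dog
    {
        HappyDog(String name, int age)
        {
            super(name, age);
        }

        // Uncommenting the override below causes a compile-time
        // error, because Dog.equals(Object) is final:
        //
        // @Override
        // public boolean equals(Object obj) { return false; }
    }

    // Reports whether the public equals(Object) method that the
    // given class exposes is declared final
    static boolean equalsIsFinal(Class<?> type)
    {
        try {
            Method m = type.getMethod("equals", Object.class);
            return Modifier.isFinal(m.getModifiers());
        } catch (NoSuchMethodException e) {
            // Not expected: every class has a public equals(Object)
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(equalsIsFinal(HappyDog.class));  // true
        System.out.println(equalsIsFinal(Object.class));    // false
    }
}
```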

Recall, from the blog post about the Object.hashCode() method, that any two objects that compare as equal must have the same hash code. So, if we prevent the Dog.equals(Object) method from ever being overridden, we should also add the same protection to the Dog.hashCode() method:

public class Dog
{
    [...]

    @Override
    public final int hashCode()
    {
        int ret = Objects.hashCode(this.name);
        ret = 31 * ret + this.age;
        return ret;
    }
}
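
One place where this consistency pays off is in hash-based collections. As a minimal sketch, the code below adds a Dog and a HappyDog with the same name and age to a HashSet and counts how many distinct elements the set keeps. The DogHashSetDemo wrapper and the distinctCount helper are hypothetical names invented for this example:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical wrapper class so the sketch fits in one file
class DogHashSetDemo
{
    static class Dog
    {
        protected String name;
        protected int age;

        Dog(String name, int age)
        {
            this.name = name;
            this.age = age;
        }

        @Override
        public final boolean equals(Object obj)
        {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Dog)) {
                return false;
            }
            Dog other = (Dog)obj;
            // Assumes this.name is non-null
            return this.name.equals(other.name) && this.age == other.age;
        }

        @Override
        public final int hashCode()
        {
            int ret = Objects.hashCode(this.name);
            ret = 31 * ret + this.age;
            return ret;
        }
    }

    static class HappyDog extends Dog
    {
        HappyDog(String name, int age)
        {
            super(name, age);
        }
    }

    // Adds every dog to a HashSet and reports how many the set kept
    static int distinctCount(Dog... dogs)
    {
        Set<Dog> set = new HashSet<>();
        for (Dog d : dogs) {
            set.add(d);
        }
        return set.size();
    }

    public static void main(String[] args)
    {
        Dog d = new Dog("Fido", 3);
        HappyDog hd1 = new HappyDog("Fido", 3);
        HappyDog hd2 = new HappyDog("Rover", 4);

        System.out.println(distinctCount(d, hd1));       // 1
        System.out.println(distinctCount(d, hd1, hd2));  // 2
    }
}
```

The set treats d and hd1 as the same element because they compare as equal and produce the same hash code.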

A Subclass with Additional Functionality: Finished Version

Putting all the pieces together, we can write a finished version of the Dog and HappyDog classes. Because HappyDog adds new functionality to its superclass, Dog, but no new state used in the equality comparison, we can use an instanceof guard in the Dog.equals(Object) method. That allows a Dog and a HappyDog to compare as equal. We also add the final keyword to both Dog.equals(Object) and Dog.hashCode(), so that any potential subclasses of Dog can’t accidentally override the Dog.equals(Object) or Dog.hashCode() methods to use new state.

import java.util.Objects;

public class Dog
{
    protected String name;
    protected int age;

    public Dog(String name, int age)
    {
        this.name = name;
        this.age = age;
    }

    @Override
    public final boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof Dog)) {
            return false;
        }

        Dog other = (Dog)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }

    @Override
    public final int hashCode()
    {
        int ret = Objects.hashCode(this.name);
        ret = 31 * ret + this.age;
        return ret;
    }
}

The HappyDog subclass adds only additional functionality to the Dog class, so it doesn’t need to override Dog.equals(Object) or Dog.hashCode() (and, because both are final, it couldn’t):

public class HappyDog extends Dog
{
    public HappyDog(String name, int age)
    {
        super(name, age);
    }

    public void wagTail()
    {
        System.out.println("Wag");
    }
}

A Subclass with Additional State: Finished Version

Let’s also write a finished version of the Cow and DairyCow classes. Because DairyCow adds new state beyond its superclass, Cow, that’ll be used in equality comparison, we have to use a class-equality guard in the Cow.equals(Object) method. That means that a Cow and a DairyCow can never be equal to each other.

We override the equals(Object) method in the DairyCow subclass, so that two dairy cows compare as equal to each other only when they have the same name and age, and produce the same amount of milk. As discussed in the blog post about Object.hashCode(), we have to override hashCode() whenever we override equals(Object). So, we also override hashCode() in the DairyCow class.

First, the Cow class with the class-equality guard:

import java.util.Objects;

public class Cow
{
    protected String name;
    protected int age;

    public Cow(String name, int age)
    {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (obj == null ||
            this.getClass() != obj.getClass()) {
            return false;
        }

        Cow other = (Cow)obj;

        // Assumes this.name is non-null
        return (this.name.equals(other.name) &&
            this.age == other.age);
    }

    @Override
    public int hashCode()
    {
        int ret = Objects.hashCode(this.name);
        ret = 31 * ret + this.age;
        return ret;
    }
}

There are two possible versions of the DairyCow class, depending on whether subclasses of DairyCow can add additional state used in equality comparisons. If no new state used in equality tests can be added to potential subclasses of DairyCow, we can use an instanceof guard in the DairyCow class, and make both DairyCow.equals(Object) and DairyCow.hashCode() final:

import java.util.Objects;

// If subclasses of DairyCow can't add additional state
public class DairyCow extends Cow
{
    private double milkProduced;

    public DairyCow(String name, int age,
        double milkProduced)
    {
        super(name, age);
        this.milkProduced = milkProduced;
    }

    @Override
    public final boolean equals(Object obj)
    {
        if (this == obj) {
            return true;
        }

        if (!(obj instanceof DairyCow)) {
            return false;
        }

        DairyCow other = (DairyCow)obj;

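        // Assumes this.name is non-null. Double.compare is used
        // instead of == so that equality agrees with hashCode()
        // for NaN, and for 0.0 versus -0.0
        return (this.name.equals(other.name) &&
            this.age == other.age &&
            Double.compare(this.milkProduced,
                other.milkProduced) == 0);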
    }

    @Override
    public final int hashCode()
    {
        int ret = Objects.hashCode(this.name);
        ret = 31 * ret + this.age;
        ret = 31 * ret + Objects.hashCode(this.milkProduced);
        return ret;
    }
}

Alternatively, if potential subclasses of DairyCow could add additional state used in equality comparisons, we would have to use a class-equality comparison in the DairyCow class (like the one used in the Cow class).

I’ve left it as an exercise for students to implement this alternate version of DairyCow. As a hint, remember that neither DairyCow.equals(Object) nor DairyCow.hashCode() should be final in this alternate version, because subclasses of DairyCow would need to override them.
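
As a quick check of the finished Cow class and the final-method version of DairyCow shown above (not the exercise version), the sketch below confirms that a Cow and a DairyCow never compare as equal in either direction. The FinishedCowDemo wrapper is a hypothetical name invented so the example fits in one file. Using Double.compare and Double.hashCode for milkProduced is a choice made for this sketch, so that equality and hash codes agree for NaN and for 0.0 versus -0.0:

```java
import java.util.Objects;

// Hypothetical wrapper class so the sketch fits in one file
class FinishedCowDemo
{
    static class Cow
    {
        protected String name;
        protected int age;

        Cow(String name, int age)
        {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj)
        {
            if (this == obj) {
                return true;
            }
            // Class-equality guard: only an exact Cow can be equal
            if (obj == null || this.getClass() != obj.getClass()) {
                return false;
            }
            Cow other = (Cow)obj;
            return this.name.equals(other.name) && this.age == other.age;
        }

        @Override
        public int hashCode()
        {
            return 31 * Objects.hashCode(this.name) + this.age;
        }
    }

    static class DairyCow extends Cow
    {
        private double milkProduced;

        DairyCow(String name, int age, double milkProduced)
        {
            super(name, age);
            this.milkProduced = milkProduced;
        }

        @Override
        public final boolean equals(Object obj)
        {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof DairyCow)) {
                return false;
            }
            DairyCow other = (DairyCow)obj;
            // Assumption of this sketch: Double.compare, not ==
            return this.name.equals(other.name) && this.age == other.age &&
                Double.compare(this.milkProduced, other.milkProduced) == 0;
        }

        @Override
        public final int hashCode()
        {
            int ret = Objects.hashCode(this.name);
            ret = 31 * ret + this.age;
            ret = 31 * ret + Double.hashCode(this.milkProduced);
            return ret;
        }
    }

    public static void main(String[] args)
    {
        Cow c = new Cow("Molly", 3);
        DairyCow dc1 = new DairyCow("Molly", 3, 10.0);
        DairyCow dc2 = new DairyCow("Molly", 3, 25.0);
        DairyCow dc3 = new DairyCow("Molly", 3, 10.0);

        System.out.println(c.equals(dc1));    // false
        System.out.println(dc1.equals(c));    // false
        System.out.println(dc1.equals(dc2));  // false
        System.out.println(dc1.equals(dc3));  // true
    }
}
```

Since c never compares as equal to dc1 or dc2, the chain of equalities that threatened transitivity earlier can’t get started.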

Conclusion

When we create our own classes in Java, we generally have to override the “big three” methods: Object.toString(), Object.equals(Object), and Object.hashCode(). In today’s blog post, we revisited the Object.equals(Object) method, specifically how to implement it in class hierarchies.

As briefly mentioned in the previous blog post about equals(Object), there are two types of guard that we can place before the downcast. The first type of guard uses the instanceof operator to check if the parameter object is an instance of the class we’re implementing or of any of its subclasses. The second type of guard, on the other hand, checks if the parameter object is exactly the same class as the class we’re implementing, using the Object.getClass() method. Broadly speaking, the distinction is whether objects of the class we’re implementing can be equal to objects of one of its subclasses.

That said, it can still be difficult to know when to use which type of guard on the downcast. One key rule to remember, though, is that if subclasses use additional state (not available in the superclass) to compute equality, the guard in the superclass has to be a class-equality check instead of an instanceof check. That’s the only way to maintain the symmetry and transitivity requirements of the Object.equals(Object) method. On the other hand, if subclasses introduce only new functionality, but no additional state used in equality comparison, we could choose to use an instanceof check and make the equals(Object) method final. In either case, don’t forget to maintain consistency between the hashCode() and equals(Object) methods.

For more tips, and to arrange for personalized tutoring for yourself or your study group, check out Vancouver Computer Science Tutoring.