Introduction
This blog post is the second in a series of four articles about what I call the “big three” methods in Java. All of them are methods inherited from the Object
class, and we should typically override all three of them when we design a new class. They are the Object.toString()
method, the Object.equals(Object)
method, and the Object.hashCode()
method.
The Object.toString()
method is discussed in the first blog post in this series, and the Object.hashCode()
method is covered in the third blog post.
Today, we’ll dive into the Object.equals(Object)
method. Because this method is quite complex, it’ll be divided into two blog posts. This post will introduce the equals(Object)
method, what it does, and how to use it. The fourth and final blog post in this series, yet to be published, will cover the subtleties of the equals(Object)
method in class hierarchies.
Types of Equality in Java
In order to discuss the Object.equals(Object)
method, we have to start by discussing the two types of equality in Java. One type of equality asks: “Do two variables reference the exact same object?” The other type of equality asks: “Do two variables reference objects with the same state, regardless of whether they’re the same object or not?”
The first type of equality, which tests whether two variables reference the exact same object, is typically called “reference equality” (though you may also see it called “identity equality”). The second type of equality, which tests whether two objects have the same internal state, is typically called “value equality” (you may also see it called “state equality” or “logical equality”).
Reference Equality
The first type of equality, which asks if two variables reference the same object, is tested with the ==
operator. Let’s return to our Cow
class from the previous blog post, so that we can see the ==
operator in action.
A Cow
has two instance variables: its name and its age.
public class Cow
{
private String name;
private int age;
public Cow(String name, int age)
{
this.name = name;
this.age = age;
}
}
Let’s write a program that creates two Cow
objects with the same name and age. We’ll use three different variables to reference them, and check the three variables for reference equality. When two of the variables compare as equal under the reference-equality check, ==
, we’ll output a message:
public static void main(String[] args)
{
Cow mollyOne = new Cow("Molly", 3);
Cow mollyTwo = new Cow("Molly", 3);
Cow alsoMollyOne = mollyOne;
// Won't print anything
if (mollyOne == mollyTwo) {
System.out.println("mollyOne == mollyTwo");
}
// Will print a message
if (mollyOne == alsoMollyOne) {
System.out.println("mollyOne == alsoMollyOne");
}
}
The mollyOne
and alsoMollyOne
variables compare as equal using the ==
operator, because they reference the exact same object. The alsoMollyOne
variable was set, in the third line of the main(String[])
method, to reference the same object as the mollyOne
variable.
However, the mollyOne
and mollyTwo
objects don’t compare as equal because they’re two independently constructed objects. So, when we use a reference-equality check, Java responds that they’re not the same.
Why Would There Be Two Different Objects?
At this point, you might be asking: why, in a real program, would we ever construct two distinct Cow
objects that describe the same cow (our three-year-old Molly)?
This could happen, for example, in a program that asks a user to input information about the same cow multiple times. Consider a method that asks a user to input a cow’s name and age, then returns a new Cow
object:
// Requires "import java.util.Scanner;" at the top of the file
private static Scanner scan = new Scanner(System.in);
public static Cow inputCow()
{
System.out.print("Cow name: ");
String name = scan.nextLine();
System.out.print("Cow age: ");
int age = scan.nextInt();
scan.nextLine();
return new Cow(name, age);
}
Each time the user is asked to input a cow’s information, this method will construct a new Cow
object to hold that data.
We might want the different Cow
objects to compare as equal, provided the name and age input are the same. But, if we use a reference-equality check, no two independently constructed Cow
objects will compare as equal:
public static void main(String[] args)
{
Cow firstCow = inputCow();
Cow secondCow = inputCow();
// Will never compare as equal
if (firstCow == secondCow) {
System.out.println("They're the same cow");
}
}
This program will never print the “They're the same cow” message. In order for those two Cow
objects to compare as equal, we need to run a value-equality check.
Value Equality Default Implementation
Let’s assume that we want the two Cow
objects, mollyOne
and mollyTwo
, to compare as equal if they have the same name
and age
. If we want to ask whether two Cow
objects have the same internal state (i.e., name
and age
), we need to use their class’ equals(Object)
method.
So, instead of writing,
if (mollyOne == mollyTwo) {
...
}
we need to write,
if (mollyOne.equals(mollyTwo)) {
...
}
Let’s return to our example program, with the test for reference equality replaced by a test for value equality. The output of this program, though, might be surprising:
public static void main(String[] args)
{
Cow mollyOne = new Cow("Molly", 3);
Cow mollyTwo = new Cow("Molly", 3);
Cow alsoMollyOne = mollyOne;
// Still won't print anything (yet)
if (mollyOne.equals(mollyTwo)) {
System.out.println("mollyOne.equals(mollyTwo)");
}
// Will print a message
if (mollyOne.equals(alsoMollyOne)) {
System.out.println("mollyOne.equals(alsoMollyOne)");
}
}
Even though we’re now using the equals(Object)
method, the program only considers mollyOne
and alsoMollyOne
to be equal. It still doesn’t consider mollyOne
and mollyTwo
to be equal, even though they have the same name
and age
.
The reason is that the Cow
class, by default, inherits the implementation of equals(Object)
from the Object
class. The implementation of that method in the Object
class is just a wrapper around a reference-equality check. That is, the method in the Object
class looks something like the following:
public class Object
{
[...]
public boolean equals(Object obj)
{
return this == obj;
}
}
So, by default, two Cow
objects will compare as equal with the Cow.equals(Object)
method if and only if the two objects compare as equal with the ==
operator.
If we want the value-equality check to behave as intended, we need to override the equals(Object)
method in the Cow
class with a more meaningful test.
Casting the Parameter from Object
The first thing to take note of, when we override the equals(Object)
method in the Cow
class, is that the parameter to the method is an Object
, not a Cow
. That means, we can’t write code like the following in the Cow
class:
public boolean equals(Object obj)
{
// Doesn't work
[...] this.age == obj.age [...]
}
That’s because an object of type Object
doesn’t have an instance variable named age
. In order treat the parameter obj
as a Cow
, we first have to downcast it. That is, we need to write something like the following:
public boolean equals(Object obj)
{
Cow other = (Cow)obj;
[...] this.age == other.age [...]
}
Having written that cast, we can perform the comparison between the age
instance variables of the two Cow
objects (this
and other
).
But, what if the object passed to the Cow.equals(Object)
method isn’t a Cow
? What if, for example, we had the following code:
public static void main(String[] args)
{
Cow molly = new Cow("Molly", 3);
// Assume we also have a Dog class
Dog fido = new Dog("Fido", 4);
if (molly.equals(fido)) {
[...]
}
}
Now, in the Cow.equals(Object)
method, the attempted downcast of the parameter (with actual type Dog
) to a Cow
will fail. It’ll throw a ClassCastException
, because a Dog
object isn’t a Cow
.
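To see the failure concretely, here is a minimal sketch of an equals(Object) method that casts without checking first. The UncheckedCastDemo wrapper, the cut-down Cow with only an age field, and the String standing in for the Dog object are all assumptions made to keep the example self-contained:

```java
class UncheckedCastDemo {
    // Cut-down Cow (age only), nested so the sketch fits in one file.
    static class Cow {
        private final int age;

        Cow(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            Cow other = (Cow) obj; // unchecked downcast: fails for non-Cow arguments
            return this.age == other.age;
        }
    }

    // Returns true if comparing a Cow with the argument throws ClassCastException.
    static boolean castFails(Object argument) {
        Cow molly = new Cow(3);
        try {
            molly.equals(argument);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object fido = "Fido"; // a String stands in for the Dog object
        System.out.println(castFails(fido));       // true
        System.out.println(castFails(new Cow(3))); // false
    }
}
```

An equals(Object) method is expected to answer false for an argument of the wrong type, not to throw, which is why the cast needs a guard.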
We need to have, near the top of the Cow.equals(Object)
method — prior to any cast — a check that the parameter is, in fact, a Cow
. To do this, we can use an instanceof
check. The instanceof
operator returns true
if and only if the object it’s provided is of type Cow
(or one of its subclasses). If we find that the Object
parameter isn’t an instance of a Cow
, we can immediately conclude that the parameter isn’t equal to this
Cow
and return false
:
public boolean equals(Object obj)
{
if (!(obj instanceof Cow)) {
return false;
}
Cow other = (Cow)obj;
[...] this.age == other.age [...]
}
The instanceof
check also verifies that the argument obj
isn’t null
. So, with the instanceof
check in place, the cast is guaranteed to succeed and the other
variable (created with the downcast) is guaranteed to be non-null
.
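As a quick check of those two guarantees, the following sketch passes a non-Cow object and null to a guarded equals(Object) method. The InstanceofGuardDemo wrapper and the cut-down Cow with only an age field are assumptions made for illustration:

```java
class InstanceofGuardDemo {
    // Cut-down Cow (age only), nested so the sketch fits in one file.
    static class Cow {
        private final int age;

        Cow(int age) {
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Cow)) {
                return false; // covers both null and non-Cow arguments
            }
            Cow other = (Cow) obj; // safe: obj is a non-null Cow here
            return this.age == other.age;
        }
    }

    public static void main(String[] args) {
        Cow molly = new Cow(3);
        Object notACow = "Fido";
        Cow nobody = null;

        System.out.println(nobody instanceof Cow);    // false
        System.out.println(molly.equals(nobody));     // false
        System.out.println(molly.equals(notACow));    // false
        System.out.println(molly.equals(new Cow(3))); // true
    }
}
```

On Java 16 and later, pattern matching for instanceof can combine the check and the cast in one step, but the explicit cast shown here works on every Java version.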
Common Error: Reference-Equality Checks Inside equals(Object)
With the downcast complete, we might think to write the following Cow.equals(Object)
method, which will return true
if and only if the member variables of this
Cow
and the parameter Cow
are equal. That is, we might want to write the following:
public boolean equals(Object obj)
{
if (!(obj instanceof Cow)) {
return false;
}
Cow other = (Cow)obj;
// INCORRECT
return (this.name == other.name &&
this.age == other.age);
}
However, the implementation of Cow.equals(Object)
above contains a very common error that I’ve seen students make many times.
Notice that the two Cow
s’ name
instance variables are compared using reference equality (that is, using ==
). So, the above Cow.equals(Object)
method can only ever return true
if the two Cow
s’ name
variables reference the same String
object.
This mistake is particularly insidious, because this Cow.equals(Object)
method will often appear to work correctly. Let’s go back to our example program with the two Cow
objects. Using the Cow.equals(Object)
method we wrote above, our sample program will produce two messages, like it should when a value-equality check is used:
public static void main(String[] args)
{
Cow mollyOne = new Cow("Molly", 3);
Cow mollyTwo = new Cow("Molly", 3);
Cow alsoMollyOne = mollyOne;
// Will print a message, by accident
if (mollyOne.equals(mollyTwo)) {
System.out.println("mollyOne.equals(mollyTwo)");
}
// Will print a message
if (mollyOne.equals(alsoMollyOne)) {
System.out.println("mollyOne.equals(alsoMollyOne)");
}
}
However, the only reason that mollyOne
and mollyTwo
compare as equal using the flawed Cow.equals(Object)
method is that their name
instance variables reference the exact same String
object. They reference the same String
object because of something called “string interning”. Java sees the same string literal “Molly
” in two locations, and it’s required to use a single shared String
object for both occurrences of that literal.
What would happen, instead, if a user had input the name Molly twice, when prompted two different times to input a cow’s name? Consider the Cow.inputCow()
method we saw earlier, which would have created a new String
object each time it processed the user’s input.
We can simulate that behaviour by making explicit calls to the String
constructor when we create our two Cow
s:
public static void main(String[] args)
{
Cow mollyOne = new Cow(new String("Molly"), 3);
Cow mollyTwo = new Cow(new String("Molly"), 3);
Cow alsoMollyOne = mollyOne;
// No longer prints a message
if (mollyOne.equals(mollyTwo)) {
System.out.println("mollyOne.equals(mollyTwo)");
}
// Will print a message
if (mollyOne.equals(alsoMollyOne)) {
System.out.println("mollyOne.equals(alsoMollyOne)");
}
}
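The difference between the two programs comes down to how the String objects were created. The following sketch isolates that, without any Cow objects; the InterningDemo class and its sameObject helper are names invented for this example:

```java
class InterningDemo {
    // Reference equality: true only if both variables reference one object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literalOne = "Molly";
        String literalTwo = "Molly";              // same literal, same interned object
        String constructed = new String("Molly"); // always a brand-new object

        System.out.println(sameObject(literalOne, literalTwo));           // true
        System.out.println(sameObject(literalOne, constructed));          // false
        System.out.println(literalOne.equals(constructed));               // true
        System.out.println(sameObject(literalOne, constructed.intern())); // true
    }
}
```

The only comparison that gives the intended answer in every case is the one that uses equals.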
To fix the bug in our Cow.equals(Object)
method, we have to use a value-equality check between the two name
instance variables, instead of a reference-equality check:
public boolean equals(Object obj)
{
if (!(obj instanceof Cow)) {
return false;
}
Cow other = (Cow)obj;
// Assumes this.name is non-null
return (this.name.equals(other.name) &&
this.age == other.age);
}
With the Cow.equals(Object)
method corrected, our sample program that uses the new String("Molly")
construction now produces both intended outputs.
In general, in an equals(Object)
method that’s performing a value-equality comparison, we probably want to compare instance variables using value-equality checks. The times you would want to perform a reference-equality check inside an equals(Object)
method are limited (for example, handling circular references or optimizing the comparison of two singleton instance variables) and beyond the scope of this blog post.
instanceof vs. Class Equality
In the example Cow.equals(Object)
method, we used an instanceof
check to guarantee that we can downcast the Object
parameter to a Cow
:
public boolean equals(Object obj)
{
if (!(obj instanceof Cow)) {
return false;
}
Cow other = (Cow)obj;
[...]
}
You may see a class-equality check used instead, prior to the downcast:
public boolean equals(Object obj)
{
if (obj == null ||
this.getClass() != obj.getClass()) {
return false;
}
Cow other = (Cow)obj;
[...]
}
Both of these safeguards prior to the downcast are correct. However, their behaviour with respect to class hierarchies is different. This difference will be the subject of the next blog post in this series about the Object.equals(Object)
method.
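As a small preview only, the sketch below shows one situation where the two checks disagree: comparing a parent object with a subclass object that has the same state. The class names (InstanceofCow, GetClassCow and their Calf subclasses) are hypothetical and exist purely for this illustration:

```java
class ClassCheckDemo {
    // Variant 1: guards the cast with instanceof.
    static class InstanceofCow {
        final String name;
        final int age;

        InstanceofCow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof InstanceofCow)) {
                return false;
            }
            InstanceofCow other = (InstanceofCow) obj;
            return this.name.equals(other.name) && this.age == other.age;
        }
    }

    static class InstanceofCalf extends InstanceofCow {
        InstanceofCalf(String name, int age) {
            super(name, age);
        }
    }

    // Variant 2: guards the cast with a class-equality check.
    static class GetClassCow {
        final String name;
        final int age;

        GetClassCow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == null || this.getClass() != obj.getClass()) {
                return false;
            }
            GetClassCow other = (GetClassCow) obj;
            return this.name.equals(other.name) && this.age == other.age;
        }
    }

    static class GetClassCalf extends GetClassCow {
        GetClassCalf(String name, int age) {
            super(name, age);
        }
    }

    public static void main(String[] args) {
        // A subclass object passes the instanceof check...
        System.out.println(new InstanceofCow("Molly", 3).equals(new InstanceofCalf("Molly", 3))); // true
        // ...but fails the class-equality check.
        System.out.println(new GetClassCow("Molly", 3).equals(new GetClassCalf("Molly", 3)));     // false
    }
}
```

Which answer is the right one depends on how the class hierarchy is meant to behave.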
For now, your course notes probably use one or the other of these checks. If you’re uncertain which one you should use, ask your instructor, or be consistent with your course notes.
Why Override Instead of Overload?
You might wonder, at this point, why we can’t just overload the equals
method in the Cow
class, instead of overriding the Cow.equals(Object)
method. That is, why couldn’t we just write a Cow.equals(Cow)
method to deal with Cow
objects as arguments, and let Cow.equals(Object)
deal with all non-Cow
arguments:
public class Cow
{
[...]
// INCORRECT
public boolean equals(Cow other)
{
// Assumes this.name is non-null
return this.name.equals(other.name) &&
this.age == other.age;
}
// Do not override equals(Object obj)
}
The problem with this approach is that the equals(Object)
method can still wind up being called, instead of the equals(Cow)
method, even when the actual type of the argument is Cow
. Consider the following example:
public static void main(String[] args)
{
Cow mollyOne = new Cow("Molly", 3);
Object mollyTwo = new Cow("Molly", 3);
// No longer prints a message
if (mollyOne.equals(mollyTwo)) {
System.out.println("mollyOne.equals(mollyTwo)");
}
}
In this case, Java decides which equals
method gets called at compile-time, using the declared type of the argument. Because the declared type of mollyTwo
is Object
, mollyOne
’s equals(Object)
instance method gets called. Runtime dispatch considers only the actual type of the object whose method is called, never the actual type of the argument (which is Cow
).
Because we didn’t override Object.equals(Object)
in the Cow
class, the default Object.equals(Object)
implementation gets called. The default implementation is just a wrapper around a reference-equality check — so, it’ll return false
, because the mollyOne
and mollyTwo
variables don’t reference the same object.
To avoid this issue, instead of overloading equals
with a Cow.equals(Cow)
method, we should override Object.equals(Object)
with a Cow.equals(Object)
method.
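The same problem shows up, less visibly, whenever library code does the comparing, because library code only knows about equals(Object). Here is a minimal sketch using a list; the OverloadPitfallDemo wrapper and its herdContainsMolly helper are names made up for this example:

```java
import java.util.ArrayList;
import java.util.List;

class OverloadPitfallDemo {
    // Nested so the sketch fits in one file.
    static class Cow {
        private final String name;
        private final int age;

        Cow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // INCORRECT: an overload, not an override of equals(Object)
        public boolean equals(Cow other) {
            return this.name.equals(other.name) && this.age == other.age;
        }
    }

    // The list searches with equals(Object), which Cow did not override.
    static boolean herdContainsMolly() {
        List<Cow> herd = new ArrayList<>();
        herd.add(new Cow("Molly", 3));
        return herd.contains(new Cow("Molly", 3));
    }

    public static void main(String[] args) {
        Cow mollyOne = new Cow("Molly", 3);
        Cow mollyTwo = new Cow("Molly", 3);

        // Declared type of the argument is Cow, so equals(Cow) is chosen.
        System.out.println(mollyOne.equals(mollyTwo)); // true
        // Inside contains(), the default reference-equality check runs.
        System.out.println(herdContainsMolly());       // false
    }
}
```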
Improving the Cow.equals(Object) Method
There are two minor improvements we can make to the Cow.equals(Object)
method, though neither affects its correctness.
The first change is a small optimization. Namely, we can check if this
object references the exact same object as the parameter object by running a fast reference-equality check at the top of the method. There’s nothing incorrect about leaving this reference-equality check out. However, because reference-equality checks can be executed so quickly, we can fast-track the Cow.equals(Object)
method when a Cow
object receives itself as an argument. That can happen frequently, for example, with hash lookups or search algorithms.
Additionally, similar to how we discussed in the previous blog post about toString()
, we should add an @Override
annotation above the Cow.equals(Object)
method, so that the compiler catches an accidental typo in the method name or an accidental overload such as equals(Cow).
Making these two improvements, the final version of our Cow.equals(Object)
method looks like the following:
public class Cow
{
[...]
@Override
public boolean equals(Object obj)
{
if (this == obj) {
return true;
}
if (!(obj instanceof Cow)) {
return false;
}
Cow other = (Cow)obj;
// Assumes this.name is non-null
return (this.name.equals(other.name) &&
this.age == other.age);
}
}
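To try the final version out, it can be dropped into a small runnable program such as the sketch below. The FinalCowDemo wrapper and the extra test objects (olderMolly, notACow) are assumptions added for illustration, and hashCode() is deliberately left out here:

```java
class FinalCowDemo {
    // Nested so the sketch fits in one file.
    static class Cow {
        private final String name;
        private final int age;

        Cow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true; // fast path: same object
            }
            if (!(obj instanceof Cow)) {
                return false; // null or not a Cow
            }
            Cow other = (Cow) obj;
            // Assumes this.name is non-null
            return this.name.equals(other.name) && this.age == other.age;
        }
    }

    public static void main(String[] args) {
        // new String(...) simulates names read from user input
        Cow mollyOne = new Cow(new String("Molly"), 3);
        Cow mollyTwo = new Cow(new String("Molly"), 3);
        Cow olderMolly = new Cow("Molly", 4);
        Object notACow = "Molly";

        System.out.println(mollyOne == mollyTwo);        // false
        System.out.println(mollyOne.equals(mollyTwo));   // true
        System.out.println(mollyOne.equals(mollyOne));   // true
        System.out.println(mollyOne.equals(olderMolly)); // false
        System.out.println(mollyOne.equals(notACow));    // false
        System.out.println(mollyOne.equals(null));       // false
    }
}
```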
Exercise
The final version of the Cow.equals(Object)
method we wrote above assumes this.name
is non-null
. I encourage students to think about how they could modify this method, if our program might allow this.name
to be null
. Keep in mind that two Cow
objects could be equal if both of their names are null
, but never if only one of them is.
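If you’d like to check your answer afterwards, one possible approach (certainly not the only one) is the java.util.Objects.equals helper from the standard library, which treats two null references as equal and a null and a non-null reference as unequal. The NullableNameDemo wrapper below is an assumption made for this sketch:

```java
import java.util.Objects;

class NullableNameDemo {
    // Nested so the sketch fits in one file.
    static class Cow {
        private final String name; // may be null in this sketch
        private final int age;

        Cow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Cow)) {
                return false;
            }
            Cow other = (Cow) obj;
            // Objects.equals is null-safe, unlike this.name.equals(...)
            return Objects.equals(this.name, other.name) && this.age == other.age;
        }
    }

    public static void main(String[] args) {
        Cow unnamedOne = new Cow(null, 3);
        Cow unnamedTwo = new Cow(null, 3);
        Cow molly = new Cow("Molly", 3);

        System.out.println(unnamedOne.equals(unnamedTwo)); // true
        System.out.println(unnamedOne.equals(molly));      // false
        System.out.println(molly.equals(unnamedOne));      // false
    }
}
```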
Overriding hashCode()
There’s one final point to mention about overriding the Object.equals(Object)
method. Whenever we override the Object.equals(Object)
method, we almost certainly want to override the Object.hashCode()
method as well. Overriding Object.hashCode()
is the subject of the next blog post in this series.
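As a brief preview, here is a minimal sketch of a hashCode() that is consistent with our equals(Object) method, meaning that equal Cow objects always produce the same hash code. Using java.util.Objects.hash is just one convenient option, and the HashCodeDemo wrapper and herdContains helper are names invented for this example:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashCodeDemo {
    // Nested so the sketch fits in one file.
    static class Cow {
        private final String name;
        private final int age;

        Cow(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Cow)) {
                return false;
            }
            Cow other = (Cow) obj;
            // Assumes this.name is non-null
            return this.name.equals(other.name) && this.age == other.age;
        }

        @Override
        public int hashCode() {
            // Built from the same fields that equals(Object) compares
            return Objects.hash(name, age);
        }
    }

    // Stores one Cow in a hash-based set, then looks up another.
    static boolean herdContains(Cow stored, Cow probe) {
        Set<Cow> herd = new HashSet<>();
        herd.add(stored);
        return herd.contains(probe);
    }

    public static void main(String[] args) {
        System.out.println(herdContains(new Cow("Molly", 3), new Cow("Molly", 3))); // true
        System.out.println(herdContains(new Cow("Molly", 3), new Cow("Molly", 4))); // false
    }
}
```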
Conclusion
When we write a class in Java, we should typically override the “big three” methods: Object.toString()
, Object.equals(Object)
, and Object.hashCode()
. In this second blog post about the “big three”, we explored the Object.equals(Object)
method.
By overriding Object.equals(Object)
in any class we write, we can test two objects of that type for value equality. That is, it becomes possible for two variables of a class to compare as equal, even when they don’t reference the exact same object. A value-equality check, unlike a reference-equality check, is based on the internal state of the objects.
One common mistake students make is to use reference-equality checks inside equals(Object)
methods they write, when comparing the instance variables of the two objects. Typically, unless there’s a specific reason not to do so, we should use value-equality checks between instance variables inside equals(Object)
methods.