Per the Java documentation, the hash code for a String
object is computed as:
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int
arithmetic, where s[i]
is the
ith character of the string, n
is the length of
the string, and ^
indicates exponentiation.
Why is 31 used as a multiplier?
I understand that the multiplier should be a relatively large prime number. So why not 29, or 37, or even 97?
According to Joshua Bloch’s Effective Java (a book that can’t be recommended enough, and which I bought thanks to continual mentions on stackoverflow):
The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i
. Modern VMs do this sort of optimization automatically.
(from Chapter 3, Item 9: Always override hashcode when you override equals, page 48)