Java HashMap取模为什么是一个合数
来源:14-2 哈希函数的设计
慕瓜5585045
2018-08-02
在看JDK8 HashMap的源码中,数组长度为2^n,在初始的情况下数组默认长度为16是一个合数,然后对散列值的取模操作是这样的(length - 1) & hash 这其实等价于 hash% length ,老师不是说过M如果是合数的话,分布会不均匀吗? 但是为什么Java中HashMap取模的是一个合数呢?
写回答
1回答
-
是的。HashMap的实现中使用长度为2的整数次幂的数组长度。这样做的好处是:计算一个元素的位置更快速,使用位运算就可以了。相比取模,尤其是对素数取模的运算,要快很多。但是缺点是:容易哈希冲突。
为了解决这个问题,HashMap在哈希函数上做文章。HashMap不是直接使用用户传来的数据的hashCode值直接 % length,而是对用户传来的数据的hashCode再进行一次哈希运算,以求得到的哈希值分布尽量平均。为此,HashMap中有一个hash(key)的方法。可以仔细阅读一下其中的注释,描述了这个设计:
/** * Computes key.hashCode() and spreads (XORs) higher bits of hash * to lower. Because the table uses power-of-two masking, sets of * hashes that vary only in bits above the current mask will * always collide. (Among known examples are sets of Float keys * holding consecutive whole numbers in small tables.) So we * apply a transform that spreads the impact of higher bits * downward. There is a tradeoff between speed, utility, and * quality of bit-spreading. Because many common sets of hashes * are already reasonably distributed (so don't benefit from * spreading), and because we use trees to handle large sets of * collisions in bins, we just XOR some shifted bits in the * cheapest possible way to reduce systematic lossage, as well as * to incorporate impact of the highest bits that would otherwise * never be used in index calculations because of table bounds. */ static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }
p.s.:Java中的HashTable的设计,初始的容量为11:)
412018-08-03
相似问题