Java HashMap取模为什么是一个合数

来源：14-2 哈希函数的设计

慕瓜5585045

2018-08-02

在看JDK8 HashMap的源码中，数组长度为2^n，在初始的情况下数组默认长度为16是一个合数，然后对散列值的取模操作是这样的(length - 1) & hash 这其实等价于 hash% length ，老师不是说过M如果是合数的话，分布会不均匀吗？但是为什么Java中HashMap取模的是一个合数呢？

写回答

1回答

liuyubobobo

2018-08-03

已采纳

是的。HashMap的实现中使用长度为2的整数次幂的数组长度。这样做的好处是：计算一个元素的位置更快速，使用位运算就可以了。相比取模，尤其是对素数取模的运算，要快很多。但是缺点是：容易哈希冲突。

为了解决这个问题，HashMap在哈希函数上做文章。HashMap不是直接使用用户传来的数据的hashCode值直接 % length，而是对用户传来的数据的hashCode再进行一次哈希运算，以求得到的哈希值分布尽量平均。为此，HashMap中有一个hash(key)的方法。可以仔细阅读一下其中的注释，描述了这个设计：

/**
     * Computes key.hashCode() and spreads (XORs) higher bits of hash
     * to lower.  Because the table uses power-of-two masking, sets of
     * hashes that vary only in bits above the current mask will
     * always collide. (Among known examples are sets of Float keys
     * holding consecutive whole numbers in small tables.)  So we
     * apply a transform that spreads the impact of higher bits
     * downward. There is a tradeoff between speed, utility, and
     * quality of bit-spreading. Because many common sets of hashes
     * are already reasonably distributed (so don't benefit from
     * spreading), and because we use trees to handle large sets of
     * collisions in bins, we just XOR some shifted bits in the
     * cheapest possible way to reduce systematic lossage, as well as
     * to incorporate impact of the highest bits that would otherwise
     * never be used in index calculations because of table bounds.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

p.s.：Java中的HashTable的设计，初始的容量为11：）

慕瓜5585045

非常感谢！

2018-08-03

共1条回复

玩转数据结构

动态数组/栈/队列/链表/BST/堆/线段树/Trie/并查集/AVL/红黑树…

6221 学习 · 1704 问题

查看课程

相似问题

4-2

回答 1