波波老师,在词频统计中 为什么TrieMap 比 BSTMap 效率要低呢?

来源:10-6 Trie字典树和字符串映射

爱西瓜同志

2019-05-06

package middle;

import java.util.TreeMap;

public class TrieMap<String extends Comparable<String>, V> implements Map<String , V> {

    private class Node{

        V value ;
        boolean isWord ;
        TreeMap<Character , Node> next ;

        public Node(V value , boolean isWord){
            this.isWord = isWord ;
            this.value = value ;
            next = new TreeMap<>() ;
        }
        public Node(boolean isWord){
            this(null , isWord) ;
        }
        public Node(){
            this(null , false) ;
        }
    }

    private Node root ;
    private int size ;

    public TrieMap(){
        root = new Node() ;
        size = 0 ;
    }

    public int getSize(){
        return size ;
    }

    public boolean isEmpty(){
        return size == 0 ;
    }

    public void add(String word , V value){

        Node cur = root ;
        for (int i = 0 ; i < word.toString().length() ; i ++){
            char c = word.toString().charAt(i) ;
            if (cur.next.get(c) == null)
                cur.next.put(c , new Node()) ;
            cur = cur.next.get(c) ;
        }
        if (!cur.isWord){
            size ++ ;
            cur.value = value ;
            cur.isWord = true ;
        }
    }

    public boolean contains(String word){

        Node cur = root ;
        for (int i = 0 ; i < word.toString().length() ; i ++){
            char c = word.toString().charAt(i) ;
            if (cur.next.get(c) == null)
                return false ;
            cur = cur.next.get(c) ;
        }
        return cur.isWord ;
    }

    public V get(String word){

        Node cur = root ;
        for (int i = 0 ; i < word.toString().length() ; i ++){
            char c = word.toString().charAt(i) ;
            cur = cur.next.get(c) ;
        }
        return cur.value ;
    }

    public void set(String word , V value){

        Node cur = root ;
        for (int i = 0 ; i < word.toString().length() ; i ++){
            char c = word.toString().charAt(i) ;
            if (cur.next.get(c) == null)
                cur.next.put(c , new Node()) ;
            cur = cur.next.get(c) ;
        }
        cur.value = value ;
    }

    public V remove(String word){

        Node cur = root ;
        for (int i = 0 ; i <word.toString().length() ; i ++){
            char c = word.toString().charAt(i) ;
            if (cur.next.get(c) == null)
                return null ;
            cur = cur.next.get(c) ;
        }
        V res = null ;
        if (cur.isWord){
            res = cur.value ;
            cur.isWord = false ;
            cur.value = null ;
        }
        return res ;
    }

}

图片描述

写回答

1回答

liuyubobobo

2019-05-06

因为Trie每个字母一个节点。无论是开辟这些节点的内存过程,还是演着这些节点查找的过程,都是耗时的。


Trie的性能效果,只有在大规模情况下(单词个数远远高于每个单词的平均长度),才能看出来:)


继续加油!:)

0
1
爱西瓜同志
非常感谢!
2019-05-06
共1条回复

玩转数据结构

动态数组/栈/队列/链表/BST/堆/线段树/Trie/并查集/AVL/红黑树…

6221 学习 · 1704 问题

查看课程