org.foray.common
Class TernaryTreeMap

java.lang.Object
  extended by org.foray.common.TernaryTreeMap
All Implemented Interfaces:
Serializable, Cloneable

public class TernaryTreeMap
extends Object
implements Cloneable, Serializable

A map whose key is always a CharSequence (usually a String), and whose value is a char, usually used to store an index into some other data structure. This class is intended to serve as a base class or helper class for natural-language dictionaries and similar data structures.

The values Character.MIN_VALUE (0x0000) and Character.MAX_VALUE (0xFFFF) have special significance to the internal structures in this class. Mapping strings that include these values will have unpredictable and probably horrible results.

Ternary means "grouped in threes". Each node in the tree has up to three child nodes: a "high" branch for values greater than that in the node, a "low" branch for values less than that in the node, and an "equal" branch for the value that is equal to that in the node. This provides efficiency in both speed and memory for processing textual data. The memory benefits come from the fact that similar Strings are actually stored in the same location. A properly balanced tree is fast because it does not need to traverse a large number of nodes to find a value.

A ternary search tree is a hybrid between a binary tree and a digital search tree (trie). Ternary trees have some nice properties, including the following:

In this implementation the value is a char (think "unsigned short"), and is stored in the leaf nodes of the tree. This limits the number of nodes to 65,536. Branches that contain only one key are compressed to one node by storing a pointer to the trailer substring of the key. A compressed branch needs only one node per key plus the size of the string key. Compressed branches are decompressed as needed when another key with same prefix is inserted. This saves a lot of space, especially for long keys.

This class could be extended to or embedded in a general map that would allow arbitrary objects to be values in the map. One scheme for doing so would be to create an array of the value Objects, and use this class to store the indexes to that array.

See Also:
Serialized Form

Nested Class Summary
 class TernaryTreeMap.Iterator
          Inner class Iterator implementation.
 
Constructor Summary
TernaryTreeMap()
          Constructor.
 
Method Summary
 TernaryTreeMap clone()
           
 int get(char[] key)
          Returns the mapped "value" for a given char array.
 int get(char[] key, int start)
          Returns the mapped "value" for a given char array.
 int get(char[] key, int start, int end)
          Returns the mapped "value" for a given char array.
 int get(CharSequence key)
          Returns the mapped "value" for a given String or other CharSequence key.
 int get(CharSequence key, int start)
          Returns the mapped "value" for a given String or other CharSequence key.
 int get(CharSequence key, int start, int end)
          Returns the mapped "value" for a given String or other CharSequence key.
 int getNodeCount()
          Returns the number of nodes in the tree.
 void optimize()
          Allows the map to internally optimize itself for faster performance and possible memory reduction.
 void put(char[] key, char value)
          Put part or all of one char[] in the map.
 void put(char[] key, int start, char value)
          Put part or all of one char[] in the map.
 void put(char[] key, int start, int end, char value)
          Put part or all of one char[] in the map.
 void put(CharSequence key, char value)
          Put one String or other CharSequence in the map.
 void put(CharSequence key, int start, char value)
          Put one String or other CharSequence in the map.
 void put(CharSequence key, int start, int end, char value)
          Put one String or other CharSequence in the map.
 int size()
          Returns the number of key-value pairs in this map.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TernaryTreeMap

public TernaryTreeMap()
Constructor.

Method Detail

put

public void put(char[] key,
                char value)
Put part or all of one char[] in the map.

Parameters:
key - The key to be added.
value - The value.

put

public void put(char[] key,
                int start,
                char value)
Put part or all of one char[] in the map.

Parameters:
key - The key to be added.
start - The index to the first char in key to be added.
value - The value.

put

public void put(char[] key,
                int start,
                int end,
                char value)
Put part or all of one char[] in the map.

Parameters:
key - The key to be added.
start - The index to the first char in key to be added.
end - The index to the first char in key, after start that should NOT be added.
value - The value.

put

public void put(CharSequence key,
                char value)
Put one String or other CharSequence in the map.

Parameters:
key - The key to be added.
value - The value.

put

public void put(CharSequence key,
                int start,
                char value)
Put one String or other CharSequence in the map.

Parameters:
key - The key to be added.
start - The index to the first char in key to be added.
value - The value.

put

public void put(CharSequence key,
                int start,
                int end,
                char value)
Put one String or other CharSequence in the map.

Parameters:
key - The key to be added.
start - The index to the first char in key to be added.
end - The index to the first char in key, after start that should NOT be added.
value - The value.

get

public int get(char[] key)
Returns the mapped "value" for a given char array.

Parameters:
key - The key whose value is needed.
Returns:
The mapped value.

get

public int get(char[] key,
               int start)
Returns the mapped "value" for a given char array.

Parameters:
key - The key whose value is needed.
start - The index to the first element in key which should be considered part of the key.
Returns:
The mapped value.

get

public int get(char[] key,
               int start,
               int end)
Returns the mapped "value" for a given char array.

Parameters:
key - The key whose value is needed.
start - The index to the first element in key which should be considered part of the key.
end - The index to the first element in key, after start, which should NOT be considered part of the key.
Returns:
The mapped value, if key is in the map. Otherwise, returns -1;

get

public int get(CharSequence key)
Returns the mapped "value" for a given String or other CharSequence key.

Parameters:
key - The key whose value is needed.
Returns:
The mapped value.

get

public int get(CharSequence key,
               int start)
Returns the mapped "value" for a given String or other CharSequence key.

Parameters:
key - The key whose value is needed.
start - The index to the first element in key which should be considered part of the key.
Returns:
The mapped value.

get

public int get(CharSequence key,
               int start,
               int end)
Returns the mapped "value" for a given String or other CharSequence key.

Parameters:
key - The key whose value is needed.
start - The index to the first element in key which should be considered part of the key.
end - The index to the first element in key, after start, which should NOT be considered part of the key.
Returns:
The mapped value, if key is in the map. Otherwise, returns -1;

size

public int size()
Returns the number of key-value pairs in this map. This can be different from the number of nodes in the tree because of the compressed branches.

Returns:
The number of key-value pairs in this map.

getNodeCount

public int getNodeCount()
Returns the number of nodes in the tree.

Returns:
The number of nodes in the tree.

clone

public TernaryTreeMap clone()
Overrides:
clone in class Object

optimize

public void optimize()
Allows the map to internally optimize itself for faster performance and possible memory reduction. This method should be run only once, immediately after the tree is completely built.



Copyright © 2017. All rights reserved.