java.lang.Object
  org.foray.hyphen.PatternTree
public class PatternTree
An implementation of the Knuth/Liang hyphenation scheme that is part of TeX. This scheme has three data components: classes, exceptions, and patterns. More information about these components can be found in Appendix H of "The TeXbook", which is also volume A of "Computers & Typesetting".
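To make the role of patterns more concrete, here is a minimal sketch of how Liang-style patterns can be split into letters and inter-letter values and then applied to a word. It is not the PatternTree implementation: the class name LiangSketch, its methods, the HashMap storage, and the two toy patterns are all assumptions made for illustration. The sketch assumes the usual Liang rule that the highest value found for each slot wins and that an odd value permits a break.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of org.foray.hyphen. */
class LiangSketch {
    /** Pattern letters mapped to their inter-letter values. */
    private final Map<String, int[]> patterns = new HashMap<>();

    /** Returns the pattern with its digits removed, e.g. ".ab4i" gives ".abi". */
    static String letters(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (!Character.isDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns one value per slot: before each letter and after the last one. */
    static int[] values(String raw) {
        int[] out = new int[letters(raw).length() + 1];
        int slot = 0;
        for (char c : raw.toCharArray()) {
            if (Character.isDigit(c)) {
                out[slot] = c - '0';
            } else {
                slot++;
            }
        }
        return out;
    }

    void addPattern(String raw) {
        patterns.put(letters(raw), values(raw));
    }

    /**
     * Returns the indexes in word before which a break is allowed.
     * minLeading and minTrailing are the fewest characters kept on
     * either side of a break (assumed semantics for this sketch).
     */
    List<Integer> breakPoints(String word, int minLeading, int minTrailing) {
        String w = "." + word.toLowerCase() + ".";
        int[] scores = new int[w.length() + 1]; // scores[i]: slot before w[i]
        for (int start = 0; start < w.length(); start++) {
            for (int end = start + 1; end <= w.length(); end++) {
                int[] v = patterns.get(w.substring(start, end));
                if (v == null) {
                    continue;
                }
                for (int k = 0; k < v.length; k++) {
                    scores[start + k] = Math.max(scores[start + k], v[k]);
                }
            }
        }
        List<Integer> result = new ArrayList<>();
        for (int i = minLeading; i <= word.length() - minTrailing; i++) {
            if (scores[i + 1] % 2 == 1) { // odd value: break allowed
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LiangSketch s = new LiangSketch();
        s.addPattern("a1b");  // toy pattern: allow a break between a and b
        System.out.println(s.breakPoints("abab", 1, 1)); // [1, 3]
        s.addPattern("2ba");  // toy pattern: inhibit a break before "ba"
        System.out.println(s.breakPoints("abab", 1, 1)); // [3]
    }
}
```

The second call shows why even values matter: the toy pattern "2ba" outranks the 1 from "a1b" in the same slot, so the first break disappears.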
The pattern strings are usually short (from 2 to 5 characters), but each char in the tree is stored in its own node, so keeping a low memory footprint is important. Hyphenation is also a frequently performed task, which makes speed an important design consideration as well.
This implementation uses a TernaryTreeMap to map the pattern strings
to an index into the inter-letter pattern values.
The ternary tree offers a good combination of low memory footprint and
speed, which makes it well suited to storing hyphenation information.
The current ternary tree implementation is limited to 65,536 nodes, but this
has not been a practical limitation yet.
A natural language typically requires from 5000 to 15000 hyphenation
patterns.
The original author of this class wrote: "In my tests the english patterns
took 7694 nodes and the german patterns 10055 nodes, so we are well within
the 65,000 node limitation."
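The node counts quoted above are easier to picture with a small example. The following is a minimal sketch of a ternary search tree that maps strings to int indexes and counts its nodes. It is not the TernaryTreeMap used by this class: the class name TernaryTreeSketch, its methods, and the object-per-node layout are assumptions chosen for readability, and no node limit is enforced.

```java
/** Hypothetical illustration only; not the TernaryTreeMap implementation. */
class TernaryTreeSketch {
    private static final class Node {
        final char ch;
        Node lo;        // keys whose current char is smaller
        Node eq;        // keys that continue past this char
        Node hi;        // keys whose current char is larger
        int value = -1; // -1 means no key ends at this node

        Node(char ch) {
            this.ch = ch;
        }
    }

    private Node root;
    private int nodeCount;

    /** Stores a non-negative value under a non-empty key. */
    void put(String key, int value) {
        if (key.isEmpty() || value < 0) {
            throw new IllegalArgumentException("need non-empty key, value >= 0");
        }
        root = put(root, key, 0, value);
    }

    private Node put(Node n, String key, int i, int value) {
        char c = key.charAt(i);
        if (n == null) {
            n = new Node(c);
            nodeCount++;
        }
        if (c < n.ch) {
            n.lo = put(n.lo, key, i, value);
        } else if (c > n.ch) {
            n.hi = put(n.hi, key, i, value);
        } else if (i < key.length() - 1) {
            n.eq = put(n.eq, key, i + 1, value);
        } else {
            n.value = value;
        }
        return n;
    }

    /** Returns the stored value, or -1 if the key is absent. */
    int get(String key) {
        if (key.isEmpty()) {
            return -1;
        }
        Node n = root;
        int i = 0;
        while (n != null) {
            char c = key.charAt(i);
            if (c < n.ch) {
                n = n.lo;
            } else if (c > n.ch) {
                n = n.hi;
            } else if (i < key.length() - 1) {
                n = n.eq;
                i++;
            } else {
                return n.value;
            }
        }
        return -1;
    }

    int nodeCount() {
        return nodeCount;
    }

    public static void main(String[] args) {
        TernaryTreeSketch t = new TernaryTreeSketch();
        t.put(".abi", 0);
        t.put(".abe", 1); // shares the ".ab" prefix nodes
        t.put("ba", 2);
        System.out.println(t.get(".abe"));  // 1
        System.out.println(t.nodeCount()); // 7
    }
}
```

Keys that share a prefix share its nodes, so the three keys above hold 10 characters in 7 nodes. That sharing is what keeps the node count of a full pattern set well below its total character count.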
| Nested Class Summary | |
|---|---|
| static class | PatternTree.Source: Enumeration of the possible sources of this tree. |
| Constructor Summary | |
|---|---|
| PatternTree() | Constructor. |
| Method Summary | |
|---|---|
| void | addClass(String chargroup): Add a Liang-style character class. |
| void | addException(String hyphenatedWord, int qtyMorphExceptions): Add a Liang-style hyphenation exception. |
| void | addMorphException(String exceptionWord, String pre, String post, String no): Add a morphing hyphenation break to an exception word. |
| void | addPattern(String rawPattern): Add a Liang-style hyphenation pattern. |
| static String | getInterletterValues(String pattern): Extract the inter-letter values from a given pattern, returning them as a String. |
| static String | getPatternChars(String pattern): Extract the character sequence from a Liang-style hyphenation pattern. |
| PatternTree.Source | getSource(): Returns the source of this tree. |
| void | setHyphenChar(char hyphenChar): Sets the character that should be interpreted as the hyphenation character in exceptions. |
| void | setMinAfter(byte minAfter): Sets the minimum number of characters that should be left on a line before a hyphenation break. |
| void | setMinBefore(byte minBefore): Sets the minimum number of characters that should be at the beginning of a line after a hyphenation break. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public PatternTree()
| Method Detail |
|---|
public void addClass(String chargroup)
Description copied from interface: PatternConsumer
Add a Liang-style character class.
Specified by:
addClass in interface PatternConsumer
Parameters:
chargroup - The character class to add.
public void addException(String hyphenatedWord, int qtyMorphExceptions)
Description copied from interface: PatternConsumer
Add a Liang-style hyphenation exception.
Specified by:
addException in interface PatternConsumer
Parameters:
hyphenatedWord - The raw word for which the exception is being created. For example, the English pattern dictionary distributed with TeX includes the exception "oblig-a-tory", which is the text expected here.
qtyMorphExceptions - The number of morph exceptions that will be added to this exception word.
See Also:
PatternConsumer.addMorphException(String, String, String, String)
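As an illustration of the exception format described above, the following minimal sketch splits a hyphenated exception such as "oblig-a-tory" into the bare word and its allowed break positions. It does not show how PatternTree stores exceptions internally: the class name ExceptionSketch and its fields are assumptions made for this example, and morph exceptions are not covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of org.foray.hyphen. */
class ExceptionSketch {
    /** The exception word with hyphenation characters removed. */
    final String word;
    /** Indexes in word before which a break is allowed. */
    final List<Integer> breaks = new ArrayList<>();

    ExceptionSketch(String hyphenatedWord, char hyphenChar) {
        StringBuilder bare = new StringBuilder();
        for (char c : hyphenatedWord.toCharArray()) {
            if (c == hyphenChar) {
                // A break falls before the next character of the bare word.
                breaks.add(bare.length());
            } else {
                bare.append(c);
            }
        }
        word = bare.toString();
    }

    public static void main(String[] args) {
        ExceptionSketch e = new ExceptionSketch("oblig-a-tory", '-');
        System.out.println(e.word);   // obligatory
        System.out.println(e.breaks); // [5, 6]
    }
}
```

Passing the hyphenation character as a parameter mirrors the idea behind setHyphenChar: the character that marks break points in exceptions is configurable rather than fixed.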
public void addMorphException(String exceptionWord, String pre, String post, String no) throws org.axsl.hyphen.HyphenationException
Description copied from interface: PatternConsumer
Add a morphing hyphenation break to an exception word.
Specified by:
addMorphException in interface PatternConsumer
Parameters:
exceptionWord - The raw word for which the exception is being created. This must be the same word that was used in PatternConsumer.addException(String, int).
pre - The "pre" portion of the special exception.
post - The "post" portion of the special exception.
no - The "no" portion of the special exception.
Throws:
org.axsl.hyphen.HyphenationException - If exceptionWord is not found in the exception words.
public void addPattern(String rawPattern)
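Description copied from interface: PatternConsumer
Add a Liang-style hyphenation pattern.
Specified by:
addPattern in interface PatternConsumer
Parameters:
rawPattern - The raw Liang-style pattern to be added, for example ".ab4i".
public static String getPatternChars(String pattern)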
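Extract the character sequence from a Liang-style hyphenation pattern.
Parameters:
pattern - The raw pattern to be parsed.
Returns:
The character sequence of pattern, with the inter-letter values removed. For example, if pattern is the Liang pattern ".ab4i", the return value should be ".abi".
public static String getInterletterValues(String pattern)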
Extract the inter-letter values from a given pattern, returning them as a String.
Parameters:
pattern - The pattern whose inter-letter values should be extracted.
Returns:
The inter-letter values of pattern, as a String. Its length is one greater than the length of the value returned by getPatternChars(String) for the same pattern. This is because values are included for the slot immediately before the first character in pattern, and for the slot immediately after the last character in pattern. For example, for a pattern of ".ab4i", which could be translated into a fully-expanded pattern of "0.0a0b4i0", the return value should be "00040".
public void setHyphenChar(char hyphenChar)
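Description copied from interface: PatternConsumer
Sets the character that should be interpreted as the hyphenation character in exceptions.
Specified by:
setHyphenChar in interface PatternConsumer
Parameters:
hyphenChar - The hyphenation character to set.
See Also:
PatternConsumer.addException(String, int)
public void setMinAfter(byte minAfter)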
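Description copied from interface: PatternConsumer
Sets the minimum number of characters that should be left on a line before a hyphenation break.
Specified by:
setMinAfter in interface PatternConsumer
Parameters:
minAfter - The minimum number of characters that should be left on a line before a hyphenation break.
public void setMinBefore(byte minBefore)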
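Description copied from interface: PatternConsumer
Sets the minimum number of characters that should be at the beginning of a line after a hyphenation break.
Specified by:
setMinBefore in interface PatternConsumer
Parameters:
minBefore - The minimum number of characters that should be at the beginning of a line after a hyphenation break.
public PatternTree.Source getSource()
Returns the source of this tree.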