java.lang.Object
  org.foray.hyphen.PatternTree
public class PatternTree
An implementation of the Knuth/Liang hyphenation scheme that is part of TeX. This scheme has three data components: classes, exceptions, and patterns. More information about these components can be found in Appendix H of "The TeXbook", which is also Volume A of "Computers & Typesetting".
The pattern strings are usually small (from 2 to 5 characters), but each char in the tree is stored in a node, so keeping a low memory footprint is important. Hyphenation is also a frequently used task, which makes speed an important design consideration as well.
This implementation uses a TernaryTreeMap to map the pattern strings
to an index into the inter-letter pattern values.
A ternary tree provides a good combination of low memory footprint and
speed, which makes it suitable for storing hyphenation information.
The current ternary tree implementation is limited to 65,536 nodes, but this
has not been a practical limitation yet.
A natural language typically requires from 5000 to 15000 hyphenation
patterns.
The original author of this class wrote: "In my tests the english patterns
took 7694 nodes and the german patterns 10055 nodes, so we are well within
the 65,000 node limitation."
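The idea of storing one char per node and mapping each pattern string to an index can be illustrated with a minimal ternary search tree. This is a sketch only, not the TernaryTreeMap used by this class: the class name `TernaryTreeSketch`, its methods, and the choice of 16-bit `char` links (one plausible way a tree ends up capped at 65,536 nodes) are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical ternary search tree; not the actual TernaryTreeMap. */
final class TernaryTreeSketch {
    private static final char NONE = 0;  // slot 0 is reserved to mean "no node"
    private final char[] split;          // the character stored at each node
    private final char[] lo;             // link followed for smaller characters
    private final char[] eq;             // link followed for the next character of the key
    private final char[] hi;             // link followed for larger characters
    private final int[] value;           // index mapped to a key ending here, or -1
    private int size = 1;                // next free slot
    private char root = NONE;

    TernaryTreeSketch(int capacity) {
        // Links are 16-bit chars, so no more than 65,536 slots can be addressed.
        if (capacity < 2 || capacity > 65536) {
            throw new IllegalArgumentException("capacity must be in 2..65536");
        }
        split = new char[capacity];
        lo = new char[capacity];
        eq = new char[capacity];
        hi = new char[capacity];
        value = new int[capacity];
        Arrays.fill(value, -1);
    }

    void put(String key, int index) {
        if (key.isEmpty()) {
            throw new IllegalArgumentException("empty key");
        }
        root = insert(root, key, 0, index);
    }

    private char insert(char node, String key, int pos, int index) {
        char c = key.charAt(pos);
        if (node == NONE) {
            if (size >= split.length) {
                throw new IllegalStateException("tree is full");
            }
            node = (char) size++;
            split[node] = c;
        }
        if (c < split[node]) {
            lo[node] = insert(lo[node], key, pos, index);
        } else if (c > split[node]) {
            hi[node] = insert(hi[node], key, pos, index);
        } else if (pos + 1 < key.length()) {
            eq[node] = insert(eq[node], key, pos + 1, index);
        } else {
            value[node] = index;
        }
        return node;
    }

    /** Returns the index stored for key, or -1 if key is absent. */
    int get(String key) {
        char node = root;
        int pos = 0;
        while (node != NONE && pos < key.length()) {
            char c = key.charAt(pos);
            if (c < split[node]) {
                node = lo[node];
            } else if (c > split[node]) {
                node = hi[node];
            } else if (++pos == key.length()) {
                return value[node];
            } else {
                node = eq[node];
            }
        }
        return -1;
    }

    int nodeCount() {
        return size - 1;
    }

    public static void main(String[] args) {
        TernaryTreeSketch tree = new TernaryTreeSketch(64);
        tree.put(".abi", 0);
        tree.put(".ach", 1);
        tree.put("abl", 2);
        System.out.println(tree.get(".abi") + " " + tree.get(".ab") + " " + tree.nodeCount());
    }
}
```

Keys that share a prefix, such as ".abi" and ".ach" here, share the nodes for that prefix, which is where the memory saving comes from in this kind of structure.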
Nested Class Summary | |
---|---|
static class | PatternTree.Source: Enumeration of the possible sources of this tree. |
Constructor Summary | |
---|---|
PatternTree() | Constructor. |
Method Summary | |
---|---|
void | addClass(String chargroup): Add a Liang-style character class. |
void | addException(String hyphenatedWord, int qtyMorphExceptions): Add a Liang-style hyphenation exception. |
void | addMorphException(String exceptionWord, String pre, String post, String no): Add a morphing hyphenation break to an exception word. |
void | addPattern(String rawPattern): Add a Liang-style hyphenation pattern. |
static String | getInterletterValues(String pattern): Extract the inter-letter values from a given pattern, returning them as a String. |
static String | getPatternChars(String pattern): Extract the character sequence from a Liang-style hyphenation pattern. |
PatternTree.Source | getSource(): Returns the source of this tree. |
void | setHyphenChar(char hyphenChar): Sets the character that should be interpreted as the hyphenation character in exceptions. |
void | setMinAfter(byte minAfter): Sets the minimum number of characters that should be left on a line before a hyphenation break. |
void | setMinBefore(byte minBefore): Sets the minimum number of characters that should be at the beginning of a line after a hyphenation break. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public PatternTree()
Method Detail |
---|
public void addClass(String chargroup)
Description copied from interface: PatternConsumer
Add a Liang-style character class.
Specified by: addClass in interface PatternConsumer
Parameters:
chargroup - The character class to add.
public void addException(String hyphenatedWord, int qtyMorphExceptions)
Description copied from interface: PatternConsumer
Add a Liang-style hyphenation exception.
Specified by: addException in interface PatternConsumer
Parameters:
hyphenatedWord - The raw word for which the exception is being created. For example, the English pattern dictionary distributed with TeX includes the exception "oblig-a-tory", which is the text expected here.
qtyMorphExceptions - The number of morph exceptions that will be added to this exception word.
See Also: PatternConsumer.addMorphException(String, String, String, String)
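As an illustration of what a hyphenated exception word such as "oblig-a-tory" encodes, the following sketch separates the plain word from the permitted break offsets. It is not taken from this class: `ExceptionWordSketch`, `plainWord`, and `breakOffsets` are hypothetical names, and the representation of breaks as offsets into the plain word is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of org.foray.hyphen. */
final class ExceptionWordSketch {

    /** Returns the word with every occurrence of hyphenChar removed. */
    static String plainWord(String hyphenatedWord, char hyphenChar) {
        StringBuilder plain = new StringBuilder();
        for (int i = 0; i < hyphenatedWord.length(); i++) {
            char c = hyphenatedWord.charAt(i);
            if (c != hyphenChar) {
                plain.append(c);
            }
        }
        return plain.toString();
    }

    /** Returns, for each hyphenChar, the count of plain characters preceding it. */
    static List<Integer> breakOffsets(String hyphenatedWord, char hyphenChar) {
        List<Integer> offsets = new ArrayList<>();
        int plainLength = 0;
        for (int i = 0; i < hyphenatedWord.length(); i++) {
            if (hyphenatedWord.charAt(i) == hyphenChar) {
                offsets.add(plainLength);
            } else {
                plainLength++;
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(plainWord("oblig-a-tory", '-'));    // prints "obligatory"
        System.out.println(breakOffsets("oblig-a-tory", '-')); // prints "[5, 6]"
    }
}
```

The `hyphenChar` parameter plays the role of the character configured through setHyphenChar(char), so the same logic would apply to an exception list that marks breaks with a different character.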
public void addMorphException(String exceptionWord, String pre, String post, String no) throws org.axsl.hyphen.HyphenationException
Description copied from interface: PatternConsumer
Add a morphing hyphenation break to an exception word.
Specified by: addMorphException in interface PatternConsumer
Parameters:
exceptionWord - The raw word for which the exception is being created. This must be the same word that was used in PatternConsumer.addException(String, int).
pre - The "pre" portion of the special exception.
post - The "post" portion of the special exception.
no - The "no" portion of the special exception.
Throws:
org.axsl.hyphen.HyphenationException - If exceptionWord is not found in the exception words.
public void addPattern(String rawPattern)
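Description copied from interface: PatternConsumer
Add a Liang-style hyphenation pattern.
Specified by: addPattern in interface PatternConsumer
Parameters:
rawPattern - The raw Liang-style pattern to be added, for example ".ab4i".
public static String getPatternChars(String pattern)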
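Extract the character sequence from a Liang-style hyphenation pattern.
Parameters:
pattern - The raw pattern to be parsed.
Returns:
The character sequence of the pattern. For example, if pattern is the Liang pattern ".ab4i", the return value should be ".abi".
public static String getInterletterValues(String pattern)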
Extract the inter-letter values from a given pattern, returning them as a String.
Parameters:
pattern - The pattern whose inter-letter values should be extracted.
Returns:
The inter-letter values. The returned String is one character longer than the value returned by getPatternChars(String) for the same pattern. This is because values are included for the slot immediately before the first character in pattern, and for the slot immediately after the last character in pattern. For example, for a pattern of ".ab4i", which could be translated into a fully-expanded pattern of "0.0a0b4i0", the return value should be "00040".
public void setHyphenChar(char hyphenChar)
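Description copied from interface: PatternConsumer
Sets the character that should be interpreted as the hyphenation character in exceptions.
Specified by: setHyphenChar in interface PatternConsumer
Parameters:
hyphenChar - The hyphenation character to set.
See Also: PatternConsumer.addException(String, int)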
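public void setMinAfter(byte minAfter)
Description copied from interface: PatternConsumer
Sets the minimum number of characters that should be left on a line before a hyphenation break.
Specified by: setMinAfter in interface PatternConsumer
Parameters:
minAfter - The minimum number of characters that should be left on a line before a hyphenation break.
public void setMinBefore(byte minBefore)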
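Description copied from interface: PatternConsumer
Sets the minimum number of characters that should be at the beginning of a line after a hyphenation break.
Specified by: setMinBefore in interface PatternConsumer
Parameters:
minBefore - The minimum number of characters that should be at the beginning of a line after a hyphenation break.
public PatternTree.Source getSource()
Returns the source of this tree.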
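The behavior documented for getPatternChars(String) and getInterletterValues(String) can be reproduced with a short standalone sketch. This is not the code of this class: `LiangPatternSketch` and its method names are hypothetical, and the sketch assumes that every inter-letter value is a single ASCII digit and that an unmarked slot has the value 0.

```java
/** Hypothetical re-implementation for illustration; not the PatternTree code. */
final class LiangPatternSketch {

    private static boolean isValue(char c) {
        return c >= '0' && c <= '9';
    }

    /** Drops the digits from a raw pattern, leaving only its characters. */
    static String patternChars(String pattern) {
        StringBuilder chars = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (!isValue(c)) {
                chars.append(c);
            }
        }
        return chars.toString();
    }

    /** Emits one value per slot: before each character and after the last one. */
    static String interletterValues(String pattern) {
        StringBuilder values = new StringBuilder();
        boolean slotFilled = false; // true once a digit has claimed the current slot
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (isValue(c)) {
                values.append(c);
                slotFilled = true;
            } else {
                if (!slotFilled) {
                    values.append('0'); // no digit preceded this character
                }
                slotFilled = false;
            }
        }
        if (!slotFilled) {
            values.append('0'); // slot after the last character
        }
        return values.toString();
    }

    public static void main(String[] args) {
        System.out.println(patternChars(".ab4i"));      // prints ".abi"
        System.out.println(interletterValues(".ab4i")); // prints "00040"
    }
}
```

For ".ab4i" the character sequence has four characters and the values string has five, matching the one-extra-slot relationship described for the two methods.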