Java Hyphenator using ColdFusion


I was hoping someone could help with implementing TeXHyphenator-J in CFML.

I am using JavaLoader.cfc to create a ColdFusion object of TeXHyphenator-J (as in the code below). The code runs without errors and returns a string, but the string doesn't appear to be hyphenated.

<!--- Load Javaloader --->
<cfset paths    = arrayNew(1)>
<cfset paths[1] = expandPath("assets/Hyphenator/texhyphj.jar")>
<cfset loader   = createObject("component", "assets.javaloader.JavaLoader").init(paths)>

<!--- Create buffered stream to TeX file --->
<cfset FileInputStream = createobject("java", "java.io.FileInputStream").init(expandPath("assets/Hyphenator/hyphen.tex"))>
<cfset BufferedInputStream = createobject("java","java.io.BufferedInputStream").init(FileInputStream)>

<!--- Initiate Hyphenator --->
<cfset h = loader.create('net.davidashen.text.Hyphenator').init()>
<!--- Load the TeX table into Hyphenator, then close the stream --->
<cfset h.loadTable(BufferedInputStream)>
<cfset BufferedInputStream.close()>
<!--- Get hyphenated string, Hyphenator should return as-so-ci-ate --->
<cfset retStr = h.hyphenate('associate')>

<cfdump var="#retStr#">

There is 1 answer

Answered by Leigh

"However it isn't hyphenated"

While not immediately obvious, the returned value actually is hyphenated. The hyphen character used, i.e. the soft hyphen \u00ad, is simply not visible in the dump. Printing each character in the string shows the soft hyphens have a character code of 173 (strictly a Unicode code point, since it lies outside the ASCII range):

97  : a
115 : s
173 : ­    <== soft hyphen
115 : s
111 : o
173 : ­    <== soft hyphen
99  : c
105 : i
97  : a
116 : t
101 : e
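
For anyone checking this from plain Java rather than CFML, a per-character dump like the one above could be produced with something along these lines. This is only a sketch: the class name SoftHyphenDump and the method describe are made up for illustration, and the input string is a hard-coded stand-in for the hyphenator's return value, not a call into TeXHyphenator-J.

```java
import java.util.ArrayList;
import java.util.List;

class SoftHyphenDump {
    // Builds one "code : char" line per UTF-16 char and flags U+00AD.
    static List<String> describe(String s) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Left-justify the numeric code in 3 columns, as in the dump above
            String line = String.format("%-3d : %c", (int) c, c);
            if (c == '\u00AD') {
                line += "    <== soft hyphen";
            }
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the value returned by h.hyphenate('associate')
        String retStr = "as\u00ADso\u00ADciate";
        describe(retStr).forEach(System.out::println);
    }
}
```

The soft hyphen still prints as an apparently empty character, which is why the explicit marker is appended to those lines.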

So one simple solution would be to replace that character with a standard hyphen:

newString = replace(retStr, chr(173), "-", "all")
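
If the library is called from plain Java instead, the same replacement might look like the sketch below. The class name VisibleHyphens and the method toVisible are hypothetical, and the input is again a hard-coded stand-in for the hyphenator's output rather than an actual call to the library.

```java
class VisibleHyphens {
    // Swaps every soft hyphen (U+00AD) for an ordinary hyphen-minus.
    static String toVisible(String hyphenated) {
        return hyphenated.replace('\u00AD', '-');
    }

    public static void main(String[] args) {
        // Assumed stand-in for the value returned by the hyphenator
        String retStr = "as\u00ADso\u00ADciate";
        System.out.println(toVisible(retStr));
    }
}
```

String.replace(char, char) replaces all occurrences, so it corresponds to the "all" scope in the CFML call.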

Side note: running the same example directly in Java also yielded "as-so-ciate", not "as-so-ci-ate", which matches the two soft hyphens in the dump above.