I am writing code that searches a set of XML files for a target word, then finds each word that directly follows it (the successor) and calculates the probability of the two words appearing together across all of the documents. Even though I use normalize-space(), the $successor values in the output still show a trailing space after the word. Below are my code and the output I get.
Code:
<html>
<body>
<table border='1'>
<tr><td>Target</td><td>Successor</td><td>Probability</td></tr>
{
let $targetword := "has"
let $t_word_occ := collection("./?select=*xml")//s//w[lower-case(normalize-space()) = $targetword] (::)
let $totalwords := collection("./?select=*xml")//s//w[lower-case(normalize-space())]
for $successor in distinct-values($t_word_occ/following-sibling::w[1])
let $freq := count($t_word_occ/following-sibling::w[1][. = $successor])
let $dwtw := count($totalwords[. = $successor])
let $prob := $freq div $dwtw
order by ($prob) descending
return <tr><td>{$targetword}</td><td>{$successor}</td><td>{$prob}</td>
</tr>
}
</table>
</body>
</html>
Sample output:
<tr>
<td>Target</td>
<td>Successor</td>
<td>Probability</td>
</tr>
<tr>
<td>has</td>
<td>intentions </td>
<td>1</td>
</tr>
<tr>
<td>has</td>
<td>drifted </td>
<td>1</td>
</tr>
<tr>
<td>has</td>
<td>eluded </td>
<td>1</td>
</tr>
<tr>
<td>has</td>
<td>won</td>
<td>1</td>
</tr>
In the output, some successors, for example "drifted " and "eluded ", have a trailing space, while others, such as "won", do not.
How would I go about fixing this?
I am using XQuery 1.0.
You can try the following technique:
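The trailing space most likely survives because distinct-values() atomizes the raw w elements. The normalize-space() calls in the predicates only affect which nodes are selected, not the string that is returned. A minimal sketch of one way around this is to call normalize-space(.) as the last step of the path, which XQuery 1.0 allows. The inline $s element is hypothetical sample data, and its s/w layout is assumed from the question.

```xquery
xquery version "1.0";

(: Hypothetical sample sentence; the <s>/<w> layout is assumed from the question. :)
let $s := <s><w>has </w><w>drifted </w><w>has</w><w>drifted</w></s>

(: Select every occurrence of the target word, ignoring case and stray whitespace. :)
let $t := $s/w[lower-case(normalize-space(.)) = "has"]

(: Normalize each successor BEFORE de-duplicating, so "drifted " and
   "drifted" collapse into the same string. :)
return distinct-values($t/following-sibling::w[1]/normalize-space(.))
```

With this sample, both successors normalize to the same value, so only one distinct string comes back, with no trailing space.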
Or even as follows
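An equivalent formulation, sketched here under the same assumptions (hypothetical inline data, s/w layout taken from the question), uses a nested FLWOR expression inside distinct-values(). This form makes it easy to add further clean-up per word. The lower-case() call is an assumption about the desired behaviour and can be dropped if case should be preserved.

```xquery
xquery version "1.0";

(: Hypothetical sample sentence; the <s>/<w> layout is assumed from the question. :)
let $s := <s><w>has </w><w>Drifted </w><w>has</w><w>drifted</w></s>
let $t := $s/w[lower-case(normalize-space(.)) = "has"]

return distinct-values(
  (: Clean each successor individually: trim whitespace, then fold case. :)
  for $w in $t/following-sibling::w[1]
  return lower-case(normalize-space($w))
)
```

Whichever form is used, any later comparison against $successor has to apply the same clean-up to the node being compared. A plain `. = $successor` compares against the raw text, trailing space included.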
A full repro
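The following is a self-contained sketch rather than a definitive implementation. It replaces the collection() call from the question with a hypothetical inline $doc element so that it runs without any files. The sample sentences are invented, and the s/w layout is assumed from the question. The successor is normalized in three places: when the distinct values are built, when the pairs are counted, and when the total occurrences are counted.

```xquery
xquery version "1.0";

(: Hypothetical stand-in for collection("./?select=*xml"); sample data is invented. :)
let $doc :=
  <doc>
    <s><w>He </w><w>has </w><w>drifted </w><w>away</w></s>
    <s><w>She </w><w>has </w><w>won</w></s>
    <s><w>It </w><w>has </w><w>drifted</w></s>
  </doc>

let $targetword := "has"

(: Occurrences of the target word, compared on the cleaned-up text. :)
let $t_word_occ := $doc//s//w[lower-case(normalize-space(.)) = $targetword]

(: All non-empty words. :)
let $totalwords := $doc//s//w[normalize-space(.) != ""]

(: Normalize before de-duplicating so no trailing spaces reach the output. :)
for $successor in distinct-values($t_word_occ/following-sibling::w[1]/normalize-space(.))

(: Compare normalized text on both sides; "." alone would still carry the space. :)
let $freq := count($t_word_occ/following-sibling::w[1][normalize-space(.) = $successor])
let $dwtw := count($totalwords[normalize-space(.) = $successor])
let $prob := $freq div $dwtw
order by $prob descending
return <tr><td>{$targetword}</td><td>{$successor}</td><td>{$prob}</td></tr>
```

To apply this to the real data, the inline $doc would be swapped back for the collection() call from the question. The rest of the query should not need to change, provided the documents follow the same s/w structure.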