Hyphenation for Ancient Greek in R

Question


Asked by LocusClassicus on 07 January 2023 at 19:03 · 60 views

Is there a way to divide an Ancient Greek text (UTF-8) into syllables in R? I need to count the number of unique syllables in a corpus.

I cannot find an existing algorithm for this, and the rules are too complicated to implement from scratch.


There is 1 answer

**LocusClassicus** · Answer 1 · 2023-01-07T19:23:39+00:00

Based on the sylly vignette (https://cran.r-project.org/web/packages/sylly/vignettes/sylly_vignette.html#fn2), here is a solution:

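library(sylly)  # read.hyph.pat(), hyphen() and hyphen_df() come from sylly itself

sample.text <- "Μουσάων Ἑλικωνιάδων ἀρχώμεθ' ἀείδειν"
# hyphen() expects one word per element, so split the line on whitespace first
sample.words <- strsplit(sample.text, "\\s+")[[1]]

# Load the Ancient Greek TeX hyphenation patterns from CTAN
url.grc.pattern <- url("http://tug.ctan.org/tex-archive/language/hyph-utf8/tex/generic/hyph-utf8/patterns/txt/hyph-grc.pat.txt")
hyph.grc <- read.hyph.pat(url.grc.pattern, lang="grc")
close(url.grc.pattern)

# Either keep the full hyphenation object ...
hyph.obj.grc <- hyphen(sample.words, hyph.pattern=hyph.grc)
# ... or get a plain data frame with the hyphenated words
hyph.txt.grc <- hyphen_df(sample.words, hyph.pattern=hyph.grc)
class(hyph.txt.grc$word) # character vector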

Some words are not hyphenated correctly, though.
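To get from the hyphenated words to the count of unique syllables asked for in the question, something like the following base R sketch might work. It assumes the hyphenated words are a character vector with syllables separated by "-", which is how the `word` column of the `hyphen_df()` result is expected to look. The `hyphenated` vector here is made-up illustration data, not actual sylly output, and the variable names are hypothetical.

```r
# Illustrative input only: hand-written hyphenations, NOT real sylly output.
# In practice this would be something like hyph.txt.grc$word.
hyphenated <- c("Μου-σά-ων", "Ἑ-λι-κω-νι-ά-δων", "ἀ-εί-δειν", "Μου-σά-ων")

# Split every word at the hyphens and flatten into one vector of syllables
syllables <- unlist(strsplit(hyphenated, "-", fixed = TRUE))

# Distinct syllables and how many there are
unique.syllables <- unique(syllables)
n.unique <- length(unique.syllables)

# Optional: frequency of each syllable, most frequent first
syllable.freq <- sort(table(syllables), decreasing = TRUE)

print(n.unique)
print(head(syllable.freq))
```

Note that syllables differing only in accent or breathing (for example ά and ἀ) are counted as distinct here. If that is not wanted, the strings would need to be normalised before counting. Any words that the patterns hyphenate incorrectly will also affect the count.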
