Hopefully I can find some answers here. I'm trying to write HTML with Python 3. I have tried the yattag and dominate modules, but with both I have the same problem: when I write the content of my code to an HTML file, the generated document doesn't display letters with accent marks; instead, a question mark inside a little black diamond is displayed (see image at the bottom).
My code looks like this.
Using dominate:
import dominate
import dominate.tags as tg

# an example doc
_html = tg.html(lang='es')
_head = _html.add(tg.head())
_body = _html.add(tg.body())
with _head:
    tg.meta(charset="UTF-8")  # this line seems to be the problem
with _body:
    tg.p("Benjamín")
print(_html)
# when I print to console, the accent mark in the letter 'í' is there but...
# when I write the file, the weird character is displayed
with open("document.html", 'w') as file:
    file.write(_html.render())
Same thing using yattag:
from yattag import Doc

# another example doc
doc, tag, text = Doc().tagtext()
with tag("html", "lang='es'"):
    with tag("head"):
        doc.stag("meta", charset="UTF-8")  # this line seems to be the problem
    with tag("body"):
        text("Benjamín")
# when I print to console, the accent mark in the letter 'í' is there but...
# when I write the file, the weird character is displayed
with open("document2.html", 'w') as file:
    file.write(doc.getvalue())
So when I change or remove the charset in both cases, the problem seems to go away. I normally use the last two lines to write simple documents, as everyone does I guess, and have had no problems with accent marks. The problem seems to be how the imported modules handle the charset when displaying the content of the page, but I'm not sure. Do you know any way to get around this? Hope you're doing fine. Thank you.
You can use the encoding parameter when you call open() on the file.
Pro tip: You can define the behavior in case of errors using the errors parameter.
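The original snippets for this answer did not survive, so here is a minimal sketch of what both suggestions could look like. When no encoding is given, open() falls back to a platform-dependent default, which on Windows is often cp1252 rather than UTF-8; the browser then trusts the meta charset="UTF-8" tag, finds bytes that are not valid UTF-8, and shows the replacement character. The html string below is only a stand-in for _html.render() or doc.getvalue(), and the temporary directory and file names are assumptions made so the example runs anywhere.

```python
import tempfile
from pathlib import Path

# Stand-in for _html.render() / doc.getvalue() from the question (assumption).
html = (
    '<html lang="es"><head><meta charset="UTF-8"></head>'
    "<body><p>Benjamín</p></body></html>"
)

# Hypothetical output location; the question writes to the current directory.
out_dir = Path(tempfile.mkdtemp())
path = out_dir / "document.html"

# Write the file as UTF-8 so the bytes on disk match the declared charset.
with open(path, "w", encoding="utf-8") as file:
    file.write(html)

# Read the raw bytes back to inspect how 'í' was stored.
raw = path.read_bytes()

# errors= decides what happens when a character cannot be encoded.
# Here the target is ASCII, so 'í' is replaced by a numeric character reference.
ascii_path = out_dir / "document_ascii.html"
with open(ascii_path, "w", encoding="ascii", errors="xmlcharrefreplace") as file:
    file.write(html)

ascii_text = ascii_path.read_text(encoding="ascii")
```

With encoding="utf-8" the accented letter is stored as a valid UTF-8 sequence, so the page should render correctly with the meta tag left in place. Other standard values for errors include "strict" (the default, which raises), "ignore", and "replace"; "xmlcharrefreplace" is handy for HTML because the browser still displays the intended character.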