Lex file
import ply.lex as lex

# List of token names.
tokens = (
    "SYMBOL",
    "COUNT"
)

t_SYMBOL = (r"Cl|Ca|Co|Os|C|H|O")

def t_COUNT(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_error(t):
    raise TypeError("Unknown text '%s'" % (t.value,))

atomLexer = lex.lex()

data1 = "CH3Cl"
data = "OClOsOH3C"

def testItOut():
    # Give the lexer some input
    atomLexer.input(data1)
    # Tokenize
    tok = atomLexer.token()
    while tok:
        print(tok)
        tok = atomLexer.token()
Parse file
import ply.yacc as yacc

# Get the token map from the lexer.
from atomLex import tokens

def p_expression_symbol(p):
    'molecule : SYMBOL'
    p[0] = p[1]

def p_error(p):
    raise TypeError("unknown text at %r" % (p.value,))

atomParser = yacc.yacc()

def testItOut():
    # Give the parser some input
    s = input('Type a chemical name > ')
    # Parse it
    result = atomParser.parse(s)
    print('The atom is: ' + result)

while True:
    testItOut()
I would like to be able to enter CH3Cl, but I am not sure how to express the grammar definitions I have been given in my parse file:
chemical : chemical molecule
chemical : molecule
molecule : SYMBOL COUNT
molecule : SYMBOL
What would the grammar definitions for these be within the parse file? Thank you.
There is a nice set of documentation for PLY with examples, which can be used to answer this question: http://www.dabeaz.com/ply/ply.html
Section 6.2 is particularly helpful. I suggest you change the p_expression_symbol function in your parse file to include the new rules. The name p_expression_symbol is also inappropriate; I guess you copied it from one of the examples, and a name that reflects the rule it implements would be clearer.
There are also other useful examples in the documentation that can be applied to your exercise.
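As a rough sketch of how the four rules from the question might be written as PLY grammar functions, here is one possible version that combines the alternatives of each rule with `|`. The function names `p_chemical` and `p_molecule`, the variable names `atom_lexer` and `atom_parser`, the choice to return a list of (symbol, count) tuples, and the default count of 1 are all assumptions made for illustration; they are not taken from the question. The lexer is repeated inline so the example runs as a single file, and the `p is None` check in `p_error` is added because PLY passes `None` to it when the input ends unexpectedly.

```python
import ply.lex as lex
import ply.yacc as yacc

# Token definitions, copied from the lex file in the question.
tokens = ("SYMBOL", "COUNT")

t_SYMBOL = r"Cl|Ca|Co|Os|C|H|O"

def t_COUNT(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_error(t):
    raise TypeError("Unknown text '%s'" % (t.value,))

# chemical : chemical molecule
# chemical : molecule
def p_chemical(p):
    """chemical : chemical molecule
                | molecule"""
    if len(p) == 3:
        # Left-recursive case: extend the list built so far.
        p[0] = p[1] + [p[2]]
    else:
        p[0] = [p[1]]

# molecule : SYMBOL COUNT
# molecule : SYMBOL
def p_molecule(p):
    """molecule : SYMBOL COUNT
                | SYMBOL"""
    if len(p) == 3:
        p[0] = (p[1], p[2])
    else:
        # Assumption: a symbol with no count stands for one atom.
        p[0] = (p[1], 1)

def p_error(p):
    if p is None:
        raise TypeError("unexpected end of input")
    raise TypeError("unknown text at %r" % (p.value,))

atom_lexer = lex.lex()
# start is given explicitly so the top-level rule does not depend on
# the order in which the grammar functions are defined.
atom_parser = yacc.yacc(start="chemical", debug=False)

if __name__ == "__main__":
    print(atom_parser.parse("CH3Cl", lexer=atom_lexer))
```

Because the result is now a list rather than a string, the `print('The atom is: ' + result)` line in the question would need to format it differently, for example by printing the list directly. Depending on the PLY version, `yacc.yacc()` may also write a cached table file next to the script.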