How can I create the grammar definition to correctly parse an input

Lex file

import ply.lex as lex

# List of token names.   
tokens = (
    "SYMBOL",
    "COUNT"
)

t_SYMBOL = (r"Cl|Ca|Co|Os|C|H|O")

def t_COUNT(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_error(t):
    raise TypeError("Unknown text '%s'" % (t.value,))

atomLexer = lex.lex()

data1 = "CH3Cl"
data = "OClOsOH3C"

def testItOut():
    # Give the lexer some input
    atomLexer.input(data1)
    # Tokenize
    tok = atomLexer.token()
    while tok:
        print (tok)
        tok = atomLexer.token()

Parse file

import ply.yacc as yacc

# Get the token map from the lexer.  
from atomLex import tokens

def p_expression_symbol(p):
    'molecule : SYMBOL'
    p[0] = p[1]


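def p_error(p):
    # PLY passes None when the input ends unexpectedly
    where = "end of input" if p is None else repr(p.value)
    raise TypeError("unknown text at %s" % where)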

atomParser = yacc.yacc()


def testItOut():
    # Give the parser some input
    s = input('Type a chemical name > ')
    # Parse it
    result = atomParser.parse(s)
    print ('The atom is: ' + result)

while(True):
    testItOut()

Currently I would like to be able to enter CH3Cl, but I am not sure how to write the grammar rules I have been given as definitions in my parse file:

chemical : chemical molecule
chemical : molecule
molecule : SYMBOL COUNT
molecule : SYMBOL

What would the grammar definitions for these be within the parse file? Thank you.

There is 1 answer

Brian Tompsett - 汤莱恩 On BEST ANSWER

There is a nice set of documentation for PLY with examples, which can be used to answer this question: http://www.dabeaz.com/ply/ply.html

Section 6.2 is particularly helpful. I suggest you change this code:

def p_expression_symbol(p):
    'molecule : SYMBOL'
    p[0] = p[1]

so that it includes the new rules. The name p_expression_symbol is also misleading, since the function no longer handles an expression; I guess you copied it from one of the examples. We now have:

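# The first rule listed sets the start symbol, so chemical comes first.
def p_chemical_formula(p):
    '''chemical : chemical molecule
                | molecule
       molecule : SYMBOL COUNT
                | SYMBOL'''
    # Placeholder action: keeps only the first component of each rule.
    p[0] = p[1]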
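If you want the parse result to carry the counts as well, one option is to give each of the four rules its own function. The following is only a sketch under some assumptions: the function names (p_chemical_append, parse_formula and so on) are illustrative, returning a list of (symbol, count) tuples is a guess at what the exercise wants, and the debug=False, write_tables=False arguments assume PLY 3.x. The lexer is repeated from the question so the example runs on its own.

```python
import ply.lex as lex
import ply.yacc as yacc

# Lexer, as in the question.
tokens = ("SYMBOL", "COUNT")

t_SYMBOL = r"Cl|Ca|Co|Os|C|H|O"

def t_COUNT(t):
    r"\d+"
    t.value = int(t.value)
    return t

def t_error(t):
    raise TypeError("Unknown text '%s'" % (t.value,))

# Parser: one function per grammar rule (names are illustrative).
# The first rule defines the start symbol, here "chemical".
def p_chemical_append(p):
    "chemical : chemical molecule"
    p[0] = p[1] + [p[2]]

def p_chemical_single(p):
    "chemical : molecule"
    p[0] = [p[1]]

def p_molecule_counted(p):
    "molecule : SYMBOL COUNT"
    p[0] = (p[1], p[2])

def p_molecule_single(p):
    "molecule : SYMBOL"
    # Assumption: a symbol without a count means exactly one atom.
    p[0] = (p[1], 1)

def p_error(p):
    if p is None:
        raise TypeError("unexpected end of input")
    raise TypeError("unknown text at %r" % (p.value,))

lexer = lex.lex()
# Assumes PLY 3.x: skip writing parser.out and parsetab.py to disk.
parser = yacc.yacc(debug=False, write_tables=False)

def parse_formula(text):
    """Return a list of (symbol, count) tuples for a formula string."""
    return parser.parse(text, lexer=lexer)

print(parse_formula("CH3Cl"))
```

With this layout each action only has to handle one shape of p, so there is no need to check len(p) inside a shared function.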

There are also other useful examples in the documentation that can be applied to your exercise.