Convert an apache.commons.configuration file into a file Python's ConfigParser can read

The aim is to convert an org.apache.commons.configuration file into a file that Python's ConfigParser can parse.

I have a Java Apache Commons configuration file like this (the full file is at http://pastebin.com/Wz2T2KV9):

##############################
# BABELNET-RELATED PROPERTIES
##############################

include = babelnet.var.properties

babelnet.fullFile = ${babelnet.dir}/babel-synsets-full.txt
babelnet.dictFile = ${babelnet.dir}/babel-synsets-lexicon.txt
babelnet.glossFile = ${babelnet.dir}/babel-synsets-gloss.txt
babelnet.relFile = ${babelnet.dir}/babel-synsets-relations.txt
babelnet.mapFile = ${babelnet.dir}/babel-synsets-mapping.txt


#################
# DB BABELCO
#################

babelco.windowRadius=20

babelco.db.user=root
babelco.db.password=

I would like to convert it into a file that Python's ConfigParser (https://docs.python.org/2/library/configparser.html) can parse, i.e.:

[BABELNET-RELATED PROPERTIES]

include = babelnet.var.properties

babelnet.fullFile = ${babelnet.dir}/babel-synsets-full.txt
babelnet.dictFile = ${babelnet.dir}/babel-synsets-lexicon.txt
babelnet.glossFile = ${babelnet.dir}/babel-synsets-gloss.txt
babelnet.relFile = ${babelnet.dir}/babel-synsets-relations.txt
babelnet.mapFile = ${babelnet.dir}/babel-synsets-mapping.txt


[DB BABELCO]
babelco.windowRadius=20

babelco.db.user=root
babelco.db.password=
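
For reference, here is a minimal sketch of how a file in this target format could be read back. It assumes Python 3, where the module is named configparser (in Python 2 it is ConfigParser), and it uses a shortened copy of the example above as inline text rather than a real file. Passing interpolation=None is a choice made here so that "${babelnet.dir}" stays literal text; it is not something the Apache file itself requires.

```python
import configparser

# Shortened copy of the target format shown above (illustrative sample only).
TARGET = """\
[BABELNET-RELATED PROPERTIES]
include = babelnet.var.properties
babelnet.fullFile = ${babelnet.dir}/babel-synsets-full.txt

[DB BABELCO]
babelco.windowRadius=20
babelco.db.user=root
babelco.db.password=
"""

# interpolation=None keeps "${babelnet.dir}" as plain text.
parser = configparser.ConfigParser(interpolation=None)
# Keep the original key case; by default option names are lowercased.
parser.optionxform = str
parser.read_string(TARGET)

print(parser.sections())
print(parser["BABELNET-RELATED PROPERTIES"]["babelnet.fullFile"])
print(parser["DB BABELCO"].getint("babelco.windowRadius"))
```

Note that ConfigParser does not follow the "include" line or expand "${babelnet.dir}" the way Apache Commons Configuration does; both are simply stored as ordinary keys and values.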

I tried this, but it gives the wrong output:

fin = open('configfile', 'r')

sections = {}
headers = []
start_header = False
prev = ""
this_header = ""
this_section = []


for line in fin:
    line = line.strip()
    if line.startswith('#') and line.endswith('#') and start_header == False:
        start_header = True
        sections[this_header] = this_section
        headers.append(this_header)
        this_header = ""
        this_section = []
        prev = ""
        continue
    if line.startswith('#') and not line.endswith('#') and start_header == True:
        this_header = line[2:].strip()
        continue
    if line.startswith('#') and line.endswith('#') and this_header:
        start_header = False
        continue
    this_section.append(line.strip())

for h in headers:
    print '[' + h + ']'
    for line in sections[h]:
        print line

Is there a simpler way to convert the Java Apache Commons configuration file format into Python's config file format?

Answer by rchang (best answer):

Try this (I used Python 2.7). It assumes a reasonably well-formed input file, and was edited to handle empty comment lines. You will need to add error handling if you want to tolerate malformed configurations.

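import re

# Runs on Python 2.7 and Python 3: next(fin) replaces the Python 2-only fin.next()
with open('configfile', 'r') as fin:
    for line in fin:
        line = line.strip()
        # A line made up only of '#' characters opens a section banner
        if re.search(r'^\s*##+\s*$', line):
            # The line after the banner holds the title, e.g. "# DB BABELCO"
            match = re.search(r'^\s*#\s*(.*?)\s*$', next(fin).strip())
            print("[%s]" % match.group(1))
            # Absorb the next line, which should just be #s
            next(fin)
            continue
        # Everything else is passed through unchanged
        print(line)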
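
If the input may not be well formed, one possible way to add the tolerance mentioned above is to look ahead before committing to a header. The following is only a sketch under assumptions: the function name convert and the sample text are made up for illustration, and the rule "a header is a banner line, one titled comment line, then another banner line" is inferred from the example file rather than from any Apache specification. Anything that does not match that pattern is passed through untouched.

```python
import re

BANNER = re.compile(r'^\s*##+\s*$')       # a line made only of '#' characters
TITLE = re.compile(r'^\s*#\s*(.*?)\s*$')  # a comment line such as "# DB BABELCO"


def convert(lines):
    """Return the lines with '#'-banner blocks rewritten as [section] headers."""
    lines = [line.rstrip('\n') for line in lines]
    out = []
    i = 0
    while i < len(lines):
        # Assumed header shape: banner, one titled comment line, banner.
        if (i + 2 < len(lines)
                and BANNER.match(lines[i])
                and not BANNER.match(lines[i + 1])
                and TITLE.match(lines[i + 1])
                and BANNER.match(lines[i + 2])):
            title = TITLE.match(lines[i + 1]).group(1)
            if title:
                out.append('[%s]' % title)
                i += 3
                continue
        # Anything else, including a stray banner, is passed through.
        out.append(lines[i].strip())
        i += 1
    return out


# Hypothetical sample, shortened from the file in the question.
sample = """\
##############################
# BABELNET-RELATED PROPERTIES
##############################

include = babelnet.var.properties

#################
# DB BABELCO
#################

babelco.windowRadius=20
""".splitlines()

print('\n'.join(convert(sample)))
```

A stray banner line that is not followed by a title and a closing banner stays in the output as an ordinary "#" comment, which ConfigParser ignores, instead of raising an error or swallowing the following lines.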