This is my first time posting here so I'm sorry if the formatting is a little wrong.
Basically, my work has asked me to read through an XML file (it has invalid tags, so using a library may be out of the question, and I have no control over the XML files), take specific strings from the tags, and output them to a CSV.
So far I've been able to parse part of the file, but I run into problems when the desired tags occur more than once in a line.
This is the general format of the XML:
<my:LineItem>
<my:LineNumber>1</my:LineNumber>
<my:PartNumber></my:PartNumber>
<my:Quantity>1</my:Quantity>
<my:UOM>EA</my:UOM>
<my:UnitCost>1</my:UnitCost>
<my:ExtendedCost>1</my:ExtendedCost>
<my:CostCentre>801090 - CG Collab - Feretti -Core 1</my:CostCentre>
<my:ExpenseCode>86130 - Lab Equipment Rental</my:ExpenseCode>
<my:CostExpenseMerge>801090.86130</my:CostExpenseMerge>
<my:Description>123</my:Description>
<my:Comments>1</my:Comments>
</my:LineItem><my:LineItem><my:LineNumber>2</my:LineNumber><my:PartNumber></my:PartNumber><my:Quantity>2</my:Quantity><my:UOM>BX</my:UOM><my:UnitCost>2</my:UnitCost><my:ExtendedCost xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">4</my:ExtendedCost><my:CostCentre>800186 University of Maastricht - Bou</my:CostCentre><my:ExpenseCode>86110 - Glass/Plastic Washing S</my:ExpenseCode><my:CostExpenseMerge>800186.86110</my:CostExpenseMerge><my:Description></my:Description><my:Comments>2</my:Comments></my:LineItem><my:LineItem><my:LineNumber>3</my:LineNumber><my:PartNumber></my:PartNumber><my:Quantity>3</my:Quantity><my:UOM>CA</my:UOM><my:UnitCost>3</my:UnitCost><my:ExtendedCost xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">9</my:ExtendedCost><my:CostCentre>800180 J Bartlett PCC – PRONTO Team Grant</my:CostCentre><my:ExpenseCode>81920 - Mechanical Supplies</my:ExpenseCode><my:CostExpenseMerge>800180.81920</my:CostExpenseMerge><my:Description></my:Description><my:Comments>3</my:Comments></my:LineItem>
And I only need to save specific values such as Quantity, CostExpenseMerge, and Description.
So far I am able to read the first two occurrences of <my:Quantity> because they occur on separate lines. My problem now is: how do I save multiple occurrences of my desired tags in one line?
The XML seems to randomly force more than one entry into a line (see items 2 and 3 in my input file).
This is what I have for reading:
char buffer[1024];
const char * startTag = "<my:Quantity>";
const char * endTag = "</my:Quantity>";
char * start, * end;
char * tempString, * target=NULL;
while(fgets(buffer, sizeof(buffer), entry_file)){
    if((start=strstr(buffer,startTag))){
        start+=strlen(startTag);
        if((end=strstr(start,endTag))){
            target = (char*)malloc(end-start+1);
            memcpy(target, start, end-start);
            target[end-start]='\0';
            if(target)printf("%s\n", target);
        }
    }
}
And my output is:
1
2
Which means that the third occurrence of <my:Quantity> didn't get read (it's supposed to be "3").
Help please!
Rather than if(), use a while(). Here's a test program that works on your data:
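A minimal sketch of that while() idea in C, not a definitive implementation: instead of stopping after the first match, keep searching the same line from just past the previous closing tag. The helper name extract_all and its out/maxOut parameters are hypothetical, made up for illustration. It assumes the whole line is already in one buffer; with fgets() and a 1024-byte buffer, a tag that straddles two reads would still be missed. It also assumes the opening tag matches exactly, so an element written with attributes (like the second and third my:ExtendedCost entries in the sample) would not be found with a plain "<my:ExtendedCost>" start tag.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies every value found between startTag and endTag
 * in `line` into out[], up to maxOut entries, and returns how many it found.
 * Each out[i] is allocated with malloc(); the caller must free() it. */
size_t extract_all(const char *line, const char *startTag, const char *endTag,
                   char **out, size_t maxOut)
{
    size_t count = 0;
    const char *cursor = line;   /* where the next search begins */
    const char *start, *end;

    /* while() instead of if(): loop until no further opening tag is found */
    while (count < maxOut && (start = strstr(cursor, startTag)) != NULL) {
        start += strlen(startTag);
        end = strstr(start, endTag);
        if (end == NULL)
            break;               /* opening tag with no closing tag on this line */

        size_t len = (size_t)(end - start);
        char *value = malloc(len + 1);
        if (value == NULL)
            break;               /* out of memory: return what we have so far */
        memcpy(value, start, len);
        value[len] = '\0';
        out[count++] = value;

        cursor = end + strlen(endTag);   /* resume after the closing tag */
    }
    return count;
}
```

Inside the existing fgets() loop, this would be called as extract_all(buffer, startTag, endTag, values, 16) with a char *values[16] array, followed by a printf("%s\n", ...) and free() for each returned entry. Checking the malloc() result before memcpy(), and freeing each string after printing, also avoids the leak in the original loop.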