The problem with xml parsing using MSXML2: one first node is repeated when ten different nodes should be listed

81 views Asked by At

The problem mentioned in the title occurs in using selectSingleNodes. I believe it has to do with xPath of the method. I denoted the problematic lines in the code below with "<===== HERE"

In the line HERE A, I tried xnode.SelectSingleNode("corp_name"). This gave no nodes at all. With "/corp_name", no nodes at all, either. But with the current code where "//corp_name" is used, 10 nodes are found.

With the A line being "//corp_name", if in the line HERE B "corp_name" and "/corp_name" are used, the run time error occurred. So I was forced to use "//corp_name" in the line B, too. But, then the problem of the title occurred. Among 10 nodes, only first node is repeated. This seems logical since the method itself returns the first node. But then, how can I extract all the different nodes? I was hoping that the loop would take care of each node. But apperently it does not, or else... This is something I do not understand in the code which I copied from Internet.

My test option seems depleted to nothing. In fact, this is my first try to use API, XML, AND MSXML2, meaning that there should be something I am using without completely understanding. So, it would be a great help if any of you could point out what seems to be misused.

Please help!

===================

Sub extract_corpname_receptno()

Dim xml_obj As MSXML2.XMLHTTP60 Set xml_obj = New MSXML2.XMLHTTP60

base_url = "https://opendart.fss.or.kr/api/list.xml?"
param_api = "&crtfc_key="
param_api_value = CStr(Worksheets("sheet2").Range("k1").Value)
    
api_url = base_url + _
          param_api + param_api_value

xml_obj.Open bstrMethod:="GET", bstrURL:=api_url
xml_obj.send
Debug.Print "The Request was " + xml_obj.statusText

Debug.Print xml_obj.responseText
    
Dim xDoc As MSXML2.DOMDocument60
Dim xNodes As MSXML2.IXMLDOMNodeList
Dim xnode As MSXML2.IXMLDOMNode

Set xDoc = New MSXML2.DOMDocument60

    xDoc.LoadXML (xml_obj.responseText)
    If Not xDoc.LoadXML(xml_obj.responseText) Then
           MsgBox "load error"
    End If
    
Set xNodes = xDoc.SelectNodes("//corp_name") '<===== HERE A

Debug.Print xNodes.Length

Dim xChlNode As IXMLDOMNodeList
Set xChlNode = xDoc.ChildNodes.Item(1).ChildNodes

For Each xChl In xChlNode
    Debug.Print xChl.BaseName
    Debug.Print xChl.NodeType 
Next

For Each xnode In xNodes
    Debug.Print "----------------------------------"
    'Debug.Print xnode.Text
    Debug.Print xnode.SelectSingleNode("//corp_name").Text '<===== HERE B
    Debug.Print xnode.SelectSingleNode("//rcept_no").Text '<===== HERE C
Next
    
Dim wrksht As Worksheet
Set wrksht = ThisWorkbook.Worksheets("Sheet2")
    
Count = 1
For Each xnode In xNodes

    wrksht.Cells(Count, 1).Value = xnode.SelectSingleNode("//corp_name").Text '<=== HERE B
    wrksht.Cells(Count, 2).Value = xnode.SelectSingleNode("//rcept_no").Text '<=== HERE C

    Count = Count + 1

Next
 

End Sub

==================

The XML I am trying to parse has the following structure:

 <list>
     <corp_name>AAA</corp_name>
     <rcept_no>111</rcept_no>
 </list>

 <list>
     <corp_name>BBB</corp_name>
     <rcept_no>222</rcept_no>
 </list>

 .....

 <list>
     <corp_name>JJJ</corp_name>
     <rcept_no>ten</rcept_no>
 </list>

===============================

The code above gives only

AAA 111

AAA 111

AAA 111 .... ....

AAA 111

when should give

AAA 111

BBB 222

CCC 333 .... ....

JJJ ten

===================

0

There are 0 answers