Discussion:
MSXML6 Memory Leaks - Recursive implementation for XML parsing
shishir
2011-06-24 07:16:36 UTC

Hi All,

I am developing a C++ application that uses MSXML6 to parse an XML file.
The XML contains a collection of <Record> nodes, e.g.

<Root>
<Record id="1">...</Record>
<Record id="2">...</Record>
<Record id="3">...</Record>
<Record id="4">...</Record>
<Record id="5">...</Record>
..
...
</Root>

My application is a dialog box that has a ListView, a "Next" button and
a "Previous" button. The application:
1. Reads a fixed number of these <Record> nodes and displays
them in the ListView.
2. When the end user clicks the "Next" button, the application
clears the ListView, reads the next set of <Record> nodes and
populates the ListView with the new set of records.
3. This way, my application only ever displays a fixed-size set of
<Record> nodes.
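
The paging described above comes down to mapping a page number to a range of record indices. A minimal sketch, assuming a hypothetical helper named PageRange (it is not part of the application or of MSXML) and zero-based page numbers:

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper: returns the half-open index range [first, last)
// of the records shown on the given zero-based page.
std::pair<std::size_t, std::size_t>
PageRange(std::size_t totalRecords, std::size_t pageSize, std::size_t page)
{
    if (pageSize == 0)
        return {0, 0};  // nothing can be shown with an empty page

    // Clamp both ends so a page past the end yields an empty range.
    std::size_t first = std::min(page * pageSize, totalRecords);
    std::size_t last = std::min(first + pageSize, totalRecords);
    return {first, last};
}
```

"Next" would then increment the page while the returned range is non-empty, and "Previous" would decrement it while the page is greater than zero.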

The schema of the <Record> node however is unknown so to parse this
XML the C++ application will have to -

1. Retrieve a collection of <Record> nodes
2. Iterate the collection of <Record> nodes
3. Retrieve a collection of the child nodes in each <Record> node
4. Iterate each child node
5. Check if the child node has any children
6.
a. If the child has no child nodes, display the "name" and
"value" of the node.
b. If the child has child nodes, repeat steps 3, 4, 5 and 6
(recursive function implementation).
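
The steps above can be sketched independently of MSXML. This is only an illustration of the recursion: Node and CollectLeaves are hypothetical names, and Node is a plain in-memory stand-in for an XML element, not an MSXML type.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an XML element; not an MSXML type.
struct Node {
    std::string name;
    std::string value;           // text content, meaningful for leaf nodes only
    std::vector<Node> children;
};

// Appends a (name, value) pair for every leaf node below `node`.
void CollectLeaves(const Node& node,
                   std::vector<std::pair<std::string, std::string>>& out)
{
    for (const Node& child : node.children) {      // steps 3 and 4
        if (child.children.empty())                // step 5
            out.emplace_back(child.name, child.value);  // step 6a
        else
            CollectLeaves(child, out);             // step 6b
    }
}
```

Everything the recursion allocates on the stack is released when each call returns; only what is appended to `out` survives the call.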


Code
=============

void CMyDlg::PopulateLogFileData(std::vector<struct stctLogDetails>& f_vctAllDetailValue,
                                 MSXML2::IXMLDOMNodePtr f_pChild,
                                 int f_ExtractData,
                                 int f_nReadCount)
{
    int i_Cnt = 0;

    // Walk f_pChild and each of its following siblings.
    for (; NULL != f_pChild; f_pChild = f_pChild->nextSibling)
    {
        MSXML2::IXMLDOMNodePtr pChild = f_pChild->firstChild;

        CComBSTR l_bstr;
        f_pChild->get_nodeName(&l_bstr);

        MSXML2::IXMLDOMNodePtr l_pNodeBody = NULL;
        l_pNodeBody = f_pChild->GetparentNode();
        CComBSTR bs;
        l_pNodeBody->get_nodeName(&bs);
        m_pXMLDOMList = l_pNodeBody->GetchildNodes();

        ...

        ...

        ...

        // Recurse into the children of the current node.
        PopulateLogFileData(f_vctAllDetailValue, pChild, f_ExtractData, f_nReadCount);
    }
}

My code uses the smart pointers for MSXML (available from #import of
the "MSXML6.dll") and BSTRs.
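
One general property of reference-counted smart pointers matters here: an object is released only when the last pointer to it goes away, so a pointer stored in a class member keeps its object alive after the function returns. A minimal sketch, using std::shared_ptr as a stand-in (it is not the _com_ptr_t type that #import generates) and hypothetical NodeList and Dialog types:

```cpp
#include <memory>

struct NodeList {};  // hypothetical stand-in for a reference-counted object

struct Dialog {
    std::shared_ptr<NodeList> m_list;  // member: outlives each call

    void Populate(const std::shared_ptr<NodeList>& list)
    {
        // Local reference: dropped automatically when Populate returns.
        std::shared_ptr<NodeList> local = list;

        // Member reference: held until it is reassigned, reset,
        // or the Dialog itself is destroyed.
        m_list = list;
    }
};
```

Whether this applies to the code above depends on how m_pXMLDOMList is declared and used elsewhere, which the post does not show.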


Memory Leak Problem
======================

With every click of the "Next" button, my memory consumption increases
and it never comes down. I have used smart pointers for the MSXML
interfaces and for the BSTRs, so I am not leaking any interfaces or
memory, and I am unable to understand the reason behind the memory
consumption. Even if memory consumption goes up because of the
recursive function calls, that memory should still be released when
the call completes.

Does anybody have any idea about this behavior of MSXML?


Regards,
Shishir Srivastav
Joseph M. Newcomer
2011-06-24 15:03:42 UTC
See below...
Post by shishir
Memory Leak Problem
======================
I observe that with every click of "Next" button my memory consumption
keeps increasing and it never comes down. I have used smart pointers
for "MSXML" and BSTR so I am not leaking any interfaces or memory so I
am unable to understand the reason behind the memory consumption. Even
if we assume that the memory consumption may go up due to the
Recursive function call implementation still the memory should be
released when the call completes.
****
I think it is important to understand that there is no POSSIBLE way to simply "notice"
that memory usage is changing. There IS, however, a way to determine "memory usage"; you
have to exercise it and report WHAT you were doing that caused you to think that memory
usage was increasing, for example, "I was looking at the number of pages reported by Task
Manager". Then I would point out that Task Manager is just about the most USELESS tool
that exists for determining memory usage, because it says nothing about how the memory is
USED, only that the memory is PRESENT. In general, there are no good tools to tell you
how memory is USED, although it is worth looking at the numerous options of the
Application Verifier, which might give a somewhat more realistic picture of how the memory
is actually USED.

Then, you state "I am not leaking any...memory" and you have no evidence of this. You
only believe that the smart pointers are handling deallocation and that the absence of any
memory leak notifications means no memory is leaking. I might trust the smart pointers,
but the absence of memory leak messages means only that (a) you are not leaking memory in
the C/MFC heap and (b) any memory you ARE leaking is not being managed by a heap that
tracks memory leakage. So unless you use a tool which takes over the memory for the entire
process (such as Application Verifier), you really have no idea if you are leaking memory.

Read my essays about how storage allocators work
http://www.flounder.com/memory_allocation.htm
http://www.flounder.com/inside_storage_allocation.htm

Note that one of the side effects of reality is that the memory footprint as reported by
Task Manager could very well grow while the actual memory utilization remains constant.
There is a fictitious belief (based on I-have-no-idea-what) that Task Manager is reporting
the memory USED, when it is only reporting the address space footprint!
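
The same distinction shows up in miniature with std::vector, where storage that is reserved is not the same as storage that is in use. This is only an analogy, not a description of how MSXML or the Windows heap behaves; Usage and Measure are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Usage {
    std::size_t inUse;     // elements currently stored
    std::size_t reserved;  // elements the vector holds storage for
};

// Reports how much of a vector's storage is used versus merely held.
Usage Measure(const std::vector<int>& v)
{
    return {v.size(), v.capacity()};
}
```

After `v.clear()`, `inUse` drops to zero while `reserved` stays where it was, so a tool that only sees the reserved storage would report no change.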

Now it is true that MSXML might well be leaking memory; this would require suitable
investigation. But using Task Manager or any other similar technique to attempt to
determine memory usage is not a valid technique. Application Verifier should be your
first choice.
joe
****
Post by shishir
Does anybody have any idea on this behavior of MSXML.
Regards,
Shishir Srivastav
Joseph M. Newcomer [MVP]
email: ***@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
David Webber
2011-06-24 15:43:02 UTC
MSXML6 Memory Leaks - Recursive implementation for XML parsing...
With the risk of being the Irishman who, when asked for directions, said

"If I were going there, I wouldn't start from here."

let me say that I spent some time, a few years ago, trying to use MSXML4.
Eventually I found out that what I was trying to do (verify against a DTD
using a local copy) was impossible, and I switched to using Xerces library,
which I found quite generally easier to use. Since then I haven't looked
back. IF MSXML is being a pain, then I'd recommend switching.

Dave

-- David Webber
Mozart Music Software
http://www.mozart.co.uk
For discussion and support see
http://www.mozart.co.uk/mozartists/mailinglist.htm
Joseph M. Newcomer
2011-06-24 23:30:47 UTC
Actually, I have never used MSXML; it is just too complex (when possible, I try to avoid
anything with a GUID needed to access it...). I've used ExPat, and for one client I had
to write an XML parser (wow--it took me nearly a whole afternoon!) because they "didn't
want none o' that-there open-source software" (I wrote it on my own time, and someday I
will publish it...but it isn't covered by a GNU license, which is what they REALLY cared
about, but couldn't say coherently).

There are a lot of free XML readers out there, and if you don't find one you like, write
your own. Starting cold, it can't take more than a couple days...less time than you will
spend fighting MSXML.
joe
Joseph M. Newcomer [MVP]
email: ***@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
