Main BLOGGER
Google
WWW THIS BLOG
Monday, February 13, 2006
 
On erasing items from std::vector and std::map
If we try to conditionally erase items from std::vector or std::map during a loop,

since we use iterator to control the loop and as the same time the iterator may be changed by erase(),

It is very subtle to write the correct code.

Based on my testing, for std::vector, we should always use the
β€œit = my_vector.erase(it)” in the loop

For std::map, we should always use the trick
β€œ tmp = it++; my_map.erase(tmp); β€œ

The approach used in std::map can not be applied to std::vector since the result is not defined.

The approach used in std::vector can not work for std::map in Linux, it is only good in windows.


----------------Here is the right way to do the stuff ----------------------



int testEraseMap()
{
std::map hostList;
for (int i=0; i<5; i++)
hostList[i]="host";
std::map::iterator it;
it=hostList.begin();
while (it != hostList.end())
{
std::cout << "testing hostname='" << it->second <<"' i=" << it->first << std::endl;
if ( it->second == string("host"))
{
std::map::iterator tmpIt = it++;
hostList.erase(tmpIt);
}else
{
it++;
}
}
return 0;
}

int testEraseVector()
{
std::vector hostList;
for (int i=0; i<5; i++)
hostList.push_back("host");
std::vector::iterator it;
it=hostList.begin();
int j=1;
while (it != hostList.end())
{
std::cout << "testing hostname='" << (*it) <<"' j=" << j << std::endl;
if ( (*it) == string("host"))
{
//std::vector::iterator tmpIt = it;
//it++; cause core dump in Linux eventually
it = hostList.erase(it);
}else
{
it++;
}
j++;
}
return 0;
}


Hao Zhou figured out why the following seemingly okay code does not work for std::vector. Very insightful!

std::vector::iterator tmpIt = it++; (1)
hostList.erase(tmpIt); (2)

The reason is because std::vector internally uses ARRAY to store elements, when we need to erase one element, the following elements will be copied one by one forward. We can image the situation is like

+0 +1 +2 +3 +4 +0 +1 +2 +3
a1, a2, a3, a4, a5 ---> erase a3 --> a1, a2, a4, a5

in this case, after (1) tmpIt = +2, it = +3
after (2), a4, a5 will be copied forward to fill out the empty place left by a3, such that right now iterator "it" is pointing to a5 now!

It is not the end of the whole story. There is another P1 bug 64178 is related to this subtle implementation of vector::erase().

since every time after vector::erase() is called, the end() iterator is changed as well, we can not assign a "static" value to catch the end() of vector. We have to update end() at each iteration.

If in the loop, we need to remove the item from the vector, we have to update the end of vector in the loop condition as well.
For example

{
std::vector::iterator itB, itE;
itB=hostList.begin();
itE=hostList.end();
int j=1;
while (itB != itE) <--- we use the "static" itE, it will cause problem
{
std::cout << "testing hostname='" << (*itB).c_str() <<"' j=" << j << std::endl;
if ( (*itB) == ACE_TString("host"))
{
//std::vector::iterator tmpIt = it++;
//hostList.erase(tmpIt);
itB = hostList.erase(itB);
}else
{
++itB;
}
j++;
}
return 0;
}


The testing result
testing hostname='host' j=1
testing hostname='host' j=2
testing hostname='host' j=3
testing hostname='host' j=4
testing hostname='host' j=5
testing hostname='' j=6
testing hostname='' j=7
testing hostname='' j=8
testing hostname='' j=9
testing hostname='' j=10
testing hostname='Segmentation fault (core dumped)


it should be

{
std::vector::iterator itB, itE;
itB=hostList.begin();
int j=1;
while (itB != hostList.end()) <--- we need to use updated hostList.end() in each iteration.
{
std::cout << "testing hostname='" << (*itB).c_str() <<"' j=" << j << std::endl;
if ( (*itB) == ACE_TString("host"))
{
//std::vector::iterator tmpIt = it++;
//hostList.erase(tmpIt);
itB = hostList.erase(itB);
}else
{
++itB;
}
j++;
}
return 0;
}

The testing result
testing hostname='host' j=1
testing hostname='host' j=2
testing hostname='host' j=3
testing hostname='host' j=4
testing hostname='host' j=5



<< Home

Powered by Blogger

Google
WWW THIS BLOG