Main BLOGGER
Google
WWW THIS BLOG
Monday, February 13, 2006
 
On erasing items from std::vector and std::map
If we try to conditionally erase items from std::vector or std::map during a loop,

since we use iterator to control the loop and as the same time the iterator may be changed by erase(),

It is very subtle to write the correct code.

Based on my testing, for std::vector, we should always use the
β€œit = my_vector.erase(it)” in the loop

For std::map, we should always use the trick
β€œ tmp = it++; my_map.erase(tmp); β€œ

The approach used in std::map can not be applied to std::vector since the result is not defined.

The approach used in std::vector can not work for std::map in Linux, it is only good in windows.


----------------Here is the right way to do the stuff ----------------------



int testEraseMap()
{
std::map hostList;
for (int i=0; i<5; i++)
hostList[i]="host";
std::map::iterator it;
it=hostList.begin();
while (it != hostList.end())
{
std::cout << "testing hostname='" << it->second <<"' i=" << it->first << std::endl;
if ( it->second == string("host"))
{
std::map::iterator tmpIt = it++;
hostList.erase(tmpIt);
}else
{
it++;
}
}
return 0;
}

int testEraseVector()
{
std::vector hostList;
for (int i=0; i<5; i++)
hostList.push_back("host");
std::vector::iterator it;
it=hostList.begin();
int j=1;
while (it != hostList.end())
{
std::cout << "testing hostname='" << (*it) <<"' j=" << j << std::endl;
if ( (*it) == string("host"))
{
//std::vector::iterator tmpIt = it;
//it++; cause core dump in Linux eventually
it = hostList.erase(it);
}else
{
it++;
}
j++;
}
return 0;
}


Hao Zhou figured out why the following seemingly okay code does not work for std::vector. Very insightful!

std::vector::iterator tmpIt = it++; (1)
hostList.erase(tmpIt); (2)

The reason is because std::vector internally uses ARRAY to store elements, when we need to erase one element, the following elements will be copied one by one forward. We can image the situation is like

+0 +1 +2 +3 +4 +0 +1 +2 +3
a1, a2, a3, a4, a5 ---> erase a3 --> a1, a2, a4, a5

in this case, after (1) tmpIt = +2, it = +3
after (2), a4, a5 will be copied forward to fill out the empty place left by a3, such that right now iterator "it" is pointing to a5 now!

It is not the end of the whole story. There is another P1 bug 64178 is related to this subtle implementation of vector::erase().

since every time after vector::erase() is called, the end() iterator is changed as well, we can not assign a "static" value to catch the end() of vector. We have to update end() at each iteration.

If in the loop, we need to remove the item from the vector, we have to update the end of vector in the loop condition as well.
For example

{
std::vector::iterator itB, itE;
itB=hostList.begin();
itE=hostList.end();
int j=1;
while (itB != itE) <--- we use the "static" itE, it will cause problem
{
std::cout << "testing hostname='" << (*itB).c_str() <<"' j=" << j << std::endl;
if ( (*itB) == ACE_TString("host"))
{
//std::vector::iterator tmpIt = it++;
//hostList.erase(tmpIt);
itB = hostList.erase(itB);
}else
{
++itB;
}
j++;
}
return 0;
}


The testing result
testing hostname='host' j=1
testing hostname='host' j=2
testing hostname='host' j=3
testing hostname='host' j=4
testing hostname='host' j=5
testing hostname='' j=6
testing hostname='' j=7
testing hostname='' j=8
testing hostname='' j=9
testing hostname='' j=10
testing hostname='Segmentation fault (core dumped)


it should be

{
std::vector::iterator itB, itE;
itB=hostList.begin();
int j=1;
while (itB != hostList.end()) <--- we need to use updated hostList.end() in each iteration.
{
std::cout << "testing hostname='" << (*itB).c_str() <<"' j=" << j << std::endl;
if ( (*itB) == ACE_TString("host"))
{
//std::vector::iterator tmpIt = it++;
//hostList.erase(tmpIt);
itB = hostList.erase(itB);
}else
{
++itB;
}
j++;
}
return 0;
}

The testing result
testing hostname='host' j=1
testing hostname='host' j=2
testing hostname='host' j=3
testing hostname='host' j=4
testing hostname='host' j=5

Friday, February 10, 2006
 
Understand Copy Constructor
compile in windows: cl /EHsc main.cpp

Output:

C:\c++Test>cat HowMany.out
after construction of h: object count = '1'
x argument inside f(): object count = '1'
~HowMany(): object count = '0'
after call to f(h): object count = '0'
~HowMany(): object count = '-1'
~HowMany(): object count = '-2'

Source Code

#include
#include
using namespace std;

ofstream out("HowMany.out");

class HowMany {
static int count;
public:
HowMany()
{
count++ ;
}
static void print(const string &msg="")
{
if (msg.size() != 0)
{
out << msg << ": " ;
}
out << "object count = '" << count << "'" << endl;
}
~HowMany()
{
count--;
print("~HowMany()");
}
};

int HowMany::count = 0;

HowMany f(HowMany x) {
x.print("x argument inside f()");
return x;
}

void main()
{
HowMany h;
HowMany::print("after construction of h");
HowMany h2 = f(h);
HowMany::print("after call to f(h)");
}

Friday, February 03, 2006
 
Native STL container performace testing
Be careful of the STL containers, they are NOT standard in terms of performance.

Some testing result (Thanks to Ajith)

Testing Environment
Hardware: IBM blades Dual CPU Xeon 2.8G, 2.5G memory
1. Windows Server 2003
2. Linux 2.6 glibc 2.3

Conclusions. Eventhough Windows STL is slower than Linux, we can still push 1 million elements into a list in 1/2 a second. This doesn't seem like a significant issue.

Linux list.size() is really slow. 1 million size calculations on a 1 million element list would take 2 hours. On Windows it would be 3ms.

Linux Release mode is two times faster than debug mode.
Benchmarks: 1 Million element list
Windows list is 4-5 times slower than Linux list in release mode, 2-3 times slower in debug mode.

Linux size() is completely broken.
Release Mode Windows
Push Back : 455,708us
Size : 2,895us
Pop Back : 393,674us
Push Front : 456,519us
Pop Front : 384,902us
Release Mode UNIX
Push Back : 100,464us
Size : 8,000,000,000us (est. over 2 hours)
Pop Back : 54,065us
Push Front : 61,269us
Pop Front : 53,361us
Debug Mode UNIX
Push Back : 172847us
Size : 11,200,000,000us (est. over 2 hours)
Pop Back : 80,081us
Push Front : 118,254us
Pop Front : 82,967us

Here are the results for MAP - 1 million items. Map is a bit slower on Windows.
Windows Release Mode
Insert : 2,314,187us
Size : 2,609us
Erase : 573,573us
Linux Debug Mode
Insert : 1,202,102us
Size : 9,206us
Erase : 182,477us




Powered by Blogger

Google
WWW THIS BLOG