c++ standard library list usage?

206 views Asked by At

Porting a program from linux to solaris, building it with solarisstudio 12.3.

It has these definitions:

typedef std::list<ISocketMultiplexerJob *> CSocketJobs;
typedef CSocketJobs::iterator CJobCursor;

CSocketJobs   m_socketJobs;

and this code:

CSocketMultiplexer::CJobCursor
CSocketMultiplexer::nextCursor(CJobCursor cursor)
{
    CLock lock(m_mutex);

        CJobCursor j = m_socketJobs.end();
        CJobCursor i = cursor;
        while (++i != m_socketJobs.end()) {
            if (*i != m_cursorMark) {   // CRASHES HERE!!
                j = i;

                // move our cursor just past the job
                m_socketJobs.splice(++i, m_socketJobs, cursor);
                break;                                                  
             }
        }
        return j;
  }

It crashes at the line indicated above because:

(dbx) print i
i = {
    _C_node = (nil)
}

It looks like the "++i" iteration traipsed off the list but the test against m_socketJobs.end() failed to see it and let it go through. Need help debugging, eg how can I analyze *i in dbx in a more c++-aware fashion?

The caller of nextCursor() is in this snippet, <>:

                // collect poll entries
                if (m_update) {
                       m_update = false;
                       pfds.clear();
                       pfds.reserve(m_socketJobMap.size());

                       CJobCursor cursor    = newCursor();
                       CJobCursor jobCursor = nextCursor(cursor);
                       while (jobCursor != m_socketJobs.end()) {
                               ISocketMultiplexerJob* job = *jobCursor;
                               if (job != NULL) {
                                    pfd.m_socket = job->getSocket();
                                    pfd.m_events = 0;
                                    if (job->isReadable()) {
                                            pfd.m_events |= IArchNetwork::kPOLLIN;
                                    }
                                    if (job->isWritable()) {
                                            pfd.m_events |= IArchNetwork::kPOLLOUT;
                                    }
                                    pfds.push_back(pfd);
                               }                               
                               jobCursor = nextCursor(cursor);  //FATAL CALL
                       }

And here's the function newCursor():

CSocketMultiplexer::CJobCursor
   CSocketMultiplexer::newCursor() {  
     CLock lock(m_mutex);
     return m_socketJobs.insert(m_socketJobs.begin(), m_cursorMark);
}

I did some munging around and found that newCursor()/nextCursor() sorta works and doesn't work.... maybe another thread is hurting the context. In the example below (embedded in my program) the first init of "CJobCursor c = newCursor();" is robust, I can insert the line "c= nextCursor(c);" anywhere in my program and it doesn't crash. But the next one with the comment "BAD" is flawed and crashes on the second nexCursor() call. I find this interesting but no explanation yet. I think I need to continue testing inside the whole program because the context is killing things. What do you think?

void
CSocketMultiplexer::serviceThread(void*)
{
    std::vector<IArchNetwork::CPollEntry> pfds;
    IArchNetwork::CPollEntry pfd;

                    CJobCursor c    = newCursor();
                    CJobCursor j = nextCursor(c);
                    c = nextCursor(c);
                    c = nextCursor(c);

    // service the connections
    for (;;) {
            CThread::testCancel();

            // wait until there are jobs to handle
            {
                    CLock lock(m_mutex);
                    while (!(bool)*m_jobsReady) {
                            m_jobsReady->wait();
                    }
            }

            // lock the job list
            lockJobListLock();
            lockJobList();

            // collect poll entries
            if (m_update) {
                    m_update = false;
                    pfds.clear();
                    pfds.reserve(m_socketJobMap.size());

                    CJobCursor cursor    = newCursor();  //BAD, Ill-fated object
                    CJobCursor jobCursor = nextCursor(cursor);
                    c = nextCursor(c);
                    cursor = nextCursor(cursor);        // SEGV's here
                    while (jobCursor != m_socketJobs.end()) {
                            ISocketMultiplexerJob* job = *jobCursor;
                            if (job != NULL) {
                                    pfd.m_socket = job->getSocket();
                                    pfd.m_events = 0;
                                    if (job->isReadable()) {
                                            pfd.m_events |= IArchNetwork::kPOLLIN;
                                    }
                                    if (job->isWritable()) {
                                            pfd.m_events |= IArchNetwork::kPOLLOUT;
                                    }
                                    pfds.push_back(pfd);
                            }                               
                            c = nextCursor(c);
                            jobCursor = nextCursor(cursor);
                    }
                    c = nextCursor(c);
                    deleteCursor(cursor);
            }
1

There are 1 answers

12
Jonathan Wakely On

It looks like the "++i" iteration traipsed off the list

No, I think it looks like an invalid iterator (maybe default constructed), not a past-the-end iterator.

A std::list<T>::iterator needs to be able to decrement backwards through the list again, so it can't point to null, otherwise once you reached the end and it became null you couldn't go backwards again. Typically a past-the-end std::list iterator has to point to a sentinel node.

So your i does not point to any element of m_socketJobs which means that m_socketJobs.end() is not reachable from i no matter how many times you increment it, so i != m_socketJobs.end() is always going to be true.

Try adding assert( i != CJobCursor() ) to the top of the function and I bet it aborts there, because I think you've called the function with an invalid iterator.

You should not assume that just because (++i != m_socketJobs.end()) is true the iterator is dereferenceable.