upper_bound() gives WRONG output


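My program below reads a text file (data.txt) into a vector and calls upper_bound() to find the first element greater than a given value, but the result is wrong: the returned iterator is always end(), so the computed index is the size of the vector instead of the expected position. I can't figure out why.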

#include <iostream>
#include <vector>
#include <algorithm>

int main(int argc, char* argv[])
{   
    FILE* textFile;
    std::string filename;

    std::vector <unsigned> dataVector;
    unsigned d;

    // Open file
    filename = "data.txt";
    if( (textFile = fopen(filename.c_str(), "r")) == NULL ) {
        std::cout << "Cannot open file "<< filename.c_str() << std::endl; 
        exit(1);
    }

    // Read data
    while ( fscanf(textFile, "%d", &d) > 0 ){
        dataVector.push_back(d);
    }
    fseek(textFile, 1, SEEK_CUR);
    std::cout<<"dataVector size: "<< dataVector.size() << std::endl;

    // ********************* iterator ******************************** //
    unsigned val;
    unsigned x;
    std::vector<unsigned>::iterator it;
    std::vector<unsigned>::iterator begin;
    std::vector<unsigned>::iterator end;

    val = 41;
    begin = dataVector.begin();
    end = dataVector.end();
    it = upper_bound (begin, end, val);   // Problem here: it = end 
    x = int(it - begin);                  // output: 424 is wrong, supposed to be 64!!

    return 0; 
}

data.txt (This doesn't work)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 8 14 22 32 44 58 75 93 113 134 157 183 211 240 271 304 339 376 415 455 497 539 581 623 665 708 752 797 843 890 938 988 1040 1093 1148 1204 1262 1321 1381 1442 1504 1568 1634 1701 1770 1840 1911 1983 2056 2132 2209 2287 2365 2433 2498 2558 2611 2658 2705 2753 2801 2849 2898 2947 2996 3046 3095 3145 3195 3246 3296 3348 3399 3451 3504 3556 3607 3658 3708 3758 3808 3859 3911 3960 4011 4060 4106 4150 4194 4237 4277 4319 4362 4403 4441 4480 4520 4557 4595 4629 4657 4687 4715 4742 4769 4795 4819 4844 4867 4888 4906 4921 4936 4950 4958 4963 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

data_2.txt (This works)

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 14 31 51 73 102 141 182 225 279 335 393 453 516 582 651 724 801 880 967 1054 1140 1228 1318 1409 1501 1597 1693 1791 1890 2001 2117 2251 2393 2535 2680 2827 2973 3121 3272 3425 3581 3740 3896 4054 4212 4371 4528 4688 4850 5011 5176 5341 5504 5662 5813 5961 6111 6264 6416 6569 6723 6881 7041 7200 7356 7509 7654 7796 7937 8073 8207 8340 8470 8598 8724 8850 8975 9098 9219 9339 9458 9574 9688 9802 9915 10027 10139 10251 10363 10474 10583 10693 10804 10914 11022 11129 11233 11336 11439 11541 11643 11748 11855 11965 12076 12188 12313 12440 12572 12706 12860 13024 13190 13359 13530 13702 13876 14052 14229 14429 14631 14839 15051 15268 15487 15707 15929 16155 16385 16617 16852 17091 17336 17584 17835 18090 18350 18615 18882 19154 19426 19702 19982 20264 20555 20847 21140 21436 21733 22031 22332 22634 22938 23243 23550 23858 24173 24489 24807 25128 25450 25773 26096 26419 26742 27065 27388 27711 28034 28357 28679 29001 29323 29644 29965 30286 30607 30925 31236 31544 31851 32149 32442 32725 33000 33259 33508 33732 33956 34182 34407 34632 34857 35081 35301 35507 35706 35894 36076 36254 36429 36602 36772 36940 37100 37257 37400 37532 37665 37797 37928 38059 38189 38318 38444 38571 38699 38827 38955 39083 39212 39342 39472 39602 39732 39862 39993 40124 40255 40386 40516 40646 40775 40904 41033 41163 41293 41423 41553 41682 41812 41942 42072 42202 42333 42464 42595 42726 42856 42988 43123 43259 43395 43531 43667 43803 43939 44075 44211 44346 44483 44621 44760 44900 45042 45184 45328 45472 45616 45759 45902 46045 46188 46332 46476 46621 46766 46911 47058 47205 47353 47502 47652 47803 47954 48106 48259 48413 48567 48721 48875 49029 49183 49337 49489 49641 49792 49942 50092 50243 50397 50553 50711 50869 51027 51186 51345 51504 51664 51824 51984 52144 52305 52465 52626 52787 
52948 53109 53271 53433 53595 53757 53921 54085 54249 54413 54578 54742 54907 55072 55237 55402 55567 55732 55898 56065 56232 56400 56569 56738 56907 57074 57242 57410 57579 57749 57919 58088 58257 58426 58593 58759 58925 59090 59255 59420 59585 59749 59913 60076 60240 60403 60566 60728 60890 61051 61212 61373 61534 61695 0 0 0 0

There are 2 answers

5
T.C. On BEST ANSWER

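std::upper_bound requires the range to be sorted (more precisely, partitioned with respect to the value). The trailing zeros in data.txt break that precondition, so the result is meaningless. A linear search for the first element greater than val needs no sorted input and can be done in one line with std::find_if: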

auto val = 41u;
auto it = std::find_if(dataVector.begin(), dataVector.end(),
                       [val](unsigned i) { return i > val; });
std::cout << (it - dataVector.begin()) << std::endl;
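To see the behaviour on a small scale, here is a minimal sketch. The helper names `firstGreaterIndex` and `sampleData`, and the shortened sample values, are assumptions made for illustration and are not part of the original program. The sample has the same shape as data.txt: leading zeros, an ascending run, then trailing zeros.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first element strictly greater than val,
// or data.size() if no such element exists. A linear scan, so the input
// does not have to be sorted.
std::size_t firstGreaterIndex(const std::vector<unsigned>& data, unsigned val)
{
    auto it = std::find_if(data.begin(), data.end(),
                           [val](unsigned i) { return i > val; });
    return static_cast<std::size_t>(it - data.begin());
}

// Shortened, made-up stand-in for data.txt (not the real file contents).
std::vector<unsigned> sampleData()
{
    return {0, 0, 3, 8, 14, 22, 32, 44, 58, 0, 0};
}
```

With this sample, `firstGreaterIndex(sampleData(), 41u)` is 7, the position of 44. `std::is_sorted` on the same data returns false, which is why `upper_bound` cannot be relied on here.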

If you search the same range often enough that the lower complexity of a binary search actually matters, and the range is sorted up to the trailing zeros, you can first find where the sorted sub-range ends (that is, where the trailing zeros start) and then use upper_bound on that initial sorted sub-range only.
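One possible way to sketch that idea uses std::is_sorted_until to locate the end of the sorted prefix. The helper names `sortedPrefixLength` and `upperBoundInPrefix` are hypothetical. The sketch assumes that everything after the sorted prefix (the trailing zeros) is never greater than the value searched for. Finding the prefix is itself a linear pass, so it only pays off if the length is computed once and reused across many searches.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: length of the leading run that is in ascending order.
// Linear time; compute once and reuse the result for repeated searches.
std::size_t sortedPrefixLength(const std::vector<unsigned>& data)
{
    auto sortedEnd = std::is_sorted_until(data.begin(), data.end());
    return static_cast<std::size_t>(sortedEnd - data.begin());
}

// Hypothetical helper: binary search for the first element > val, restricted
// to the first prefixLen elements. Returns prefixLen if none is greater.
std::size_t upperBoundInPrefix(const std::vector<unsigned>& data,
                               std::size_t prefixLen, unsigned val)
{
    auto first = data.begin();
    auto last = first + static_cast<std::ptrdiff_t>(prefixLen);
    return static_cast<std::size_t>(std::upper_bound(first, last, val) - first);
}
```

A caller would store `sortedPrefixLength(dataVector)` once and pass it to every `upperBoundInPrefix` call. If the data changes, the prefix length has to be recomputed.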

Also, the fscanf call causes undefined behavior because you use %d with an unsigned. You need %u. Or, get rid of the C-style mess and use a stream iterator to read into the vector:

std::ifstream ifs(filename);

if( !ifs ) {
    std::cout << "Cannot open file "<< filename << std::endl; 
    return 1;
}

std::vector<unsigned> dataVector(std::istream_iterator<unsigned>(ifs), {});

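The stream-iterator read can be tried without a file on disk. The sketch below wraps it in a hypothetical helper, `readUnsigned`, that accepts any std::istream, so a std::istringstream can stand in for data.txt. The helper name and the sample text are assumptions for illustration.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: read whitespace-separated unsigned values from a
// stream until extraction fails or the input ends.
std::vector<unsigned> readUnsigned(std::istream& in)
{
    std::istream_iterator<unsigned> first(in);
    std::istream_iterator<unsigned> last;   // default-constructed = end of stream
    return std::vector<unsigned>(first, last);
}
```

For example, passing a `std::istringstream` constructed from `"0 0 3 8 14\n22 32 44"` gives a vector of eight elements. Passing a `std::ifstream` opened on the real file works the same way.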

3
Spanky On

Got your problem! You want the distance between the upper bound and the first element, but the way you compute it is wrong: upper_bound() does not work on unsorted data. If you sort the vector, though, the order of the elements changes, and the result you get (312) is the element's position in the sorted vector, not in the original one. So you need to preserve the original vector and compute the distance on it. Steps:

  1. Copy the original vector to another vector.
  2. Sort the original vector.
  3. Apply upper_bound() to the sorted vector.
  4. Take the value that upper_bound() points to and find() it in the unsorted copy.
  5. Apply std::distance from the beginning of the copy to the iterator returned by find().

And voilà, you get the correct answer, 64. The code is below:

// Needs <algorithm>, <cstdio>, <cstdlib>, <iostream>, <iterator>, <string>, <vector>.
// fopen_s and fscanf_s come from the optional C11 Annex K (available in MSVC).
FILE* textFile;
std::string filename;

std::vector<unsigned> dataVector;
unsigned d;

// Open file
filename = "data.txt";
if (fopen_s(&textFile, filename.c_str(), "r") != 0) {
    std::cout << "Cannot open file " << filename << std::endl;
    exit(1);
}

// Read data (%u, not %d, because d is unsigned)
while (fscanf_s(textFile, "%u", &d) > 0) {
    dataVector.push_back(d);
}
fclose(textFile);
std::cout << "dataVector size: " << dataVector.size() << std::endl;

// ********************* iterator ******************************** //
unsigned val = 41;

// Keep an unsorted copy so the original positions are preserved
std::vector<unsigned> vec(dataVector);
// Sort the working vector so that upper_bound's precondition holds
std::sort(dataVector.begin(), dataVector.end());

// Smallest value strictly greater than val (end() if there is none)
std::vector<unsigned>::iterator it =
    std::upper_bound(dataVector.begin(), dataVector.end(), val);
if (it == dataVector.end()) {
    std::cout << "No element greater than " << val << std::endl;
} else {
    // Locate the first occurrence of that value in the original order
    std::vector<unsigned>::iterator itr = std::find(vec.begin(), vec.end(), *it);
    std::cout << std::distance(vec.begin(), itr) << std::endl;   // expected: 64 for data.txt
}