I'm having issues finding a binary substring in my BMP file


I have a small issue. I am working on a school assignment and I just can't figure out how to get the solution. The assignment needs me to read a file in binary form and then find certain bit sequences in that file. I am then to print out the index of the first bit of the found sequence, for all instances of that sequence.

So if I were to write the binary sequence 10011001001011101 and search for the sequence 1001, I would get the following indexes: 0, 4. All was well when I had written code to do just that for a string of 1s and 0s, but when I started implementing the file reading and searching I ran into an issue. The file is a .bmp, and reading it in binary supposedly gives me a string of the pixel values (I'm not quite sure how to check whether the string is actually what I think it is). When I then search for a binary substring, it doesn't find any instances of it, even though there are plenty. Instead, find() returns std::string::npos and the program never enters the loop over the string.
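For reference, the string-only search described above can be sketched roughly as follows. The function name `findAll` and the `allowOverlap` flag are hypothetical and not part of the assignment. The sketch assumes the input is already a `std::string` of '0' and '1' characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the start index of every match of `pattern` in `bits`.
// With allowOverlap == false the search resumes just past the previous match,
// so matches that share bits with an earlier match are not reported.
std::vector<std::size_t> findAll(const std::string& bits,
                                 const std::string& pattern,
                                 bool allowOverlap = false) {
    std::vector<std::size_t> positions;
    if (pattern.empty()) {
        return positions;  // an empty pattern would match at every position
    }
    const std::size_t step = allowOverlap ? 1 : pattern.size();
    std::size_t index = bits.find(pattern);
    while (index != std::string::npos) {
        positions.push_back(index);
        index = bits.find(pattern, index + step);
    }
    return positions;
}
```

Called as `findAll("10011001001011101", "1001")`, this yields 0 and 4. With `allowOverlap` set to `true` it also reports 7, because the match starting at index 7 shares its first bit with the one starting at index 4.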

Here is the code i have currently:


#include <iostream>
#include <vector>
#include <fstream>
#include <sstream>
#include <filesystem>
using namespace std;

void f(vector<size_t>& poz, string pattern, string filename) {

    std::ifstream file(filename, std::ios::binary);
    if (file)
    {

        file.seekg(0, std::ios::end);
        std::streampos length = file.tellg();
        file.seekg(0, std::ios::beg);

        std::vector<char> buffer(length);
        file.read(&buffer[0], length);

        std::stringstream localStream;
        localStream.rdbuf()->pubsetbuf(&buffer[0], length);

        string solution = localStream.str();

        //after i get the full string i should be able to find the instances of my pattern
        //this file contains only white pixels and red (000000000000000011111111) pixels

        cout << "Pattern: " << pattern <<endl;
        size_t index = solution.find(pattern);
        cout << "First found pixel: " << index << endl; // this value is the exact same as npos
        cout << "string::npos: " << string::npos;
        while (index != string::npos) {
            poz.push_back(index);
            index = solution.find(pattern, index + pattern.size());
        }
    }
    
}

int main() {
    vector<size_t> poz;
    string file = "test.bmp";
    string pattern = "000000000000000011111111"; //looking for red pixels only

    f(poz, pattern, file);

    for (auto i = poz.begin(); i != poz.end(); i++) {
        if (i == (poz.end() - 1))
            cout << *i << endl;
        else
            cout << *i << ", ";
        
    }

}

I don't know if I'm searching for the sequences incorrectly, or if my file isn't actually converted into a binary string and that's why the search isn't turning anything up. Any help is appreciated.
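One way to check what the string really contains is to print its first few bytes as hexadecimal. The sketch below is only an illustration. `hexPreview` is a hypothetical helper name, and it assumes the file contents have already been read into a `std::string`.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats the first `count` bytes of `data` as
// space-separated two-digit lowercase hex values.
std::string hexPreview(const std::string& data, std::size_t count) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    const std::size_t n = std::min(count, data.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (i > 0) {
            out << ' ';
        }
        // Go through unsigned char so bytes >= 0x80 are not sign-extended
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(data[i]));
    }
    return out.str();
}
```

A BMP file starts with the two signature bytes "BM", so a correctly read file should preview as `42 4d ...`. The characters '0' and '1' are the bytes 0x30 and 0x31, so a pattern written as a string of those characters can only match raw file data that happens to contain those byte values.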

There is 1 answer

simsam On BEST ANSWER

As per @user4581301's observation, the file I had read was stored as raw bytes, while my pattern was a string of '0' and '1' characters, so I couldn't find the matches I was looking for. Changing how I process the file solved my issue. Instead of reading the file in binary mode and searching the resulting string directly, I read the characters and convert each one to an 8-bit bitset. I then concatenate the bitsets' string representations into a single string and use the .find() function to find my occurrences. This process is slow, yes, but it technically works as per my question. My code is changed to reflect this.

#include <bitset>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

void f(std::vector<std::size_t>& poz, const std::string& pattern, const std::string& filename) {
    // Open read-only in binary mode so the bytes arrive untranslated
    std::ifstream file(filename, std::ios::binary);

    if (file) {
        // Read the whole file; each char of fileString holds one raw byte
        std::string fileString((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
        file.close();

        // Expand every byte into its 8-character '0'/'1' representation
        std::string convertedFile;
        convertedFile.reserve(fileString.size() * 8);
        for (std::size_t i = 0; i < fileString.size(); i++) {
            // Cast through unsigned char so bytes >= 0x80 are not sign-extended
            convertedFile += std::bitset<8>(static_cast<unsigned char>(fileString[i])).to_string();
        }

        // use string function .find() to search for red pixels
        if (pattern.empty()) return; // an empty pattern would loop forever
        std::size_t index = convertedFile.find(pattern);
        while (index != std::string::npos) {
            poz.push_back(index);
            // Resume after the match, so overlapping matches are skipped
            index = convertedFile.find(pattern, index + pattern.size());
        }
    }
}
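Since the stated goal is to find red pixels, it may also help to keep only matches that begin on a byte boundary, because a bit pattern can otherwise match across two neighbouring bytes. The following is a minimal sketch of that idea, not the assignment's required method. `bytesToBits` and `byteAlignedMatches` are hypothetical names, and the sketch makes no attempt to skip the BMP header or row padding.

```cpp
#include <bitset>
#include <cstddef>
#include <string>
#include <vector>

// Expand each byte into eight '0'/'1' characters, most significant bit first.
std::string bytesToBits(const std::string& bytes) {
    std::string bits;
    bits.reserve(bytes.size() * 8);
    for (unsigned char byte : bytes) {
        bits += std::bitset<8>(byte).to_string();
    }
    return bits;
}

// Hypothetical helper: report only matches whose first bit starts a byte.
std::vector<std::size_t> byteAlignedMatches(const std::string& bits,
                                            const std::string& pattern) {
    std::vector<std::size_t> positions;
    if (pattern.empty()) {
        return positions;  // an empty pattern would match at every position
    }
    // Advance by one bit each time so no aligned match is skipped
    for (std::size_t index = bits.find(pattern);
         index != std::string::npos;
         index = bits.find(pattern, index + 1)) {
        if (index % 8 == 0) {
            positions.push_back(index);
        }
    }
    return positions;
}
```

A 24-bit BMP stores each pixel as blue, green, red, so a pure red pixel is the byte sequence 00 00 FF, which is why the pattern is sixteen 0s followed by eight 1s. Dividing a reported bit index by 8 gives the byte offset of that match within the file.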