Memory copy speed comparison CPU<->GPU


I am learning Boost.Compute, a C++ wrapper library for OpenCL, and I am finding that copies are very slow.

If CPU-to-CPU copy speed is normalized to 1, how fast are GPU-to-CPU, GPU-to-GPU, and CPU-to-GPU copies?

I don't need precise numbers; a general idea would be a great help. For example: "CPU-to-CPU is at least 10 times faster than GPU-to-GPU."

Answer by Zeta (best answer)

No one has answered my question, so I wrote a program to measure the copy speed.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <vector>
#include <boost/compute.hpp>

namespace compute = boost::compute;
using namespace std::chrono;
using namespace std;

int main()
{
    const int sz = 10000000;                  // 10 million floats, about 40 MB
    std::vector<float> v1(sz, 2.3f), v2(sz);  // host (CPU) buffers
    compute::vector<float> v3(sz), v4(sz);    // buffers on the default OpenCL device

    // Each copy is timed once; count() prints raw system_clock ticks.

    // host -> host
    auto s = system_clock::now();
    std::copy(v1.begin(), v1.end(), v2.begin());
    auto e = system_clock::now();
    cout << "cpu2cpu cp " << (e - s).count() << endl;

    // host -> device
    s = system_clock::now();
    compute::copy(v1.begin(), v1.end(), v3.begin());
    e = system_clock::now();
    cout << "cpu2gpu cp " << (e - s).count() << endl;

    // device -> device
    s = system_clock::now();
    compute::copy(v3.begin(), v3.end(), v4.begin());
    e = system_clock::now();
    cout << "gpu2gpu cp " << (e - s).count() << endl;

    // device -> host
    s = system_clock::now();
    compute::copy(v3.begin(), v3.end(), v1.begin());
    e = system_clock::now();
    cout << "gpu2cpu cp " << (e - s).count() << endl;
    return 0;
}
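
The program above times each copy exactly once, so the figures include any first-use cost (for example, the first write into freshly allocated memory), and system_clock is not guaranteed to be monotonic. A common way to get steadier numbers is one untimed warm-up run followed by the best of several timed runs on steady_clock. Here is a minimal sketch of that idea using only the standard library; `best_of_ms` and `time_host_copy_ms` are hypothetical helper names, not part of Boost.Compute. For the device copies, the callable would presumably also have to wait on the command queue (for example with its `finish()` member) so that the transfer itself is timed; that part is an assumption and is not tested here.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of Boost.Compute): runs `op` once as an
// untimed warm-up, then returns the fastest of `repeats` timed runs,
// in milliseconds.
template <class Op>
double best_of_ms(Op op, int repeats = 5)
{
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    op();                                     // warm-up: page faults, lazy setup
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; ++i) {
        auto start = clock::now();
        op();
        auto stop = clock::now();
        std::chrono::duration<double, std::milli> elapsed = stop - start;
        best = std::min(best, elapsed.count());
    }
    return best;
}

// The host-to-host case from the test program, timed with the helper.
inline double time_host_copy_ms(std::size_t n)
{
    std::vector<float> src(n, 2.3f), dst(n);
    return best_of_ms([&] { std::copy(src.begin(), src.end(), dst.begin()); });
}
```

Calling `time_host_copy_ms(10000000)` would time the same 10-million-float host copy as the cpu2cpu case, with the result in milliseconds rather than raw clock ticks.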

I expected the gpu2gpu copy to be fast. On the contrary, cpu2cpu was the fastest and gpu2gpu was very slow on my system (an Intel i3 with Intel(R) HD Graphics Skylake ULT GT2). Perhaps parallel processing speed is one thing and copy speed is another. The output, in raw system_clock ticks, was:

cpu2cpu cp 7549776
cpu2gpu cp 18707268
gpu2gpu cp 65841100
gpu2cpu cp 65803119
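
To put these figures on the scale asked about in the question (cpu2cpu = 1), the ratios can be computed directly. This is a minimal sketch; the names `Timing`, `kRun`, `relative_to_cpu`, and `print_relative` are made up for illustration, and the numbers are the ones from the single run above, so they describe this machine only.

```cpp
#include <cstdio>

struct Timing {
    const char* label;
    double ticks;  // raw clock ticks as printed by the test program
};

// Figures copied from the run above; the tick unit cancels out in the ratio.
constexpr Timing kRun[] = {
    {"cpu2cpu", 7549776.0},
    {"cpu2gpu", 18707268.0},
    {"gpu2gpu", 65841100.0},
    {"gpu2cpu", 65803119.0},
};

// Elapsed time relative to the host-to-host copy (cpu2cpu == 1.0).
constexpr double relative_to_cpu(const Timing& t)
{
    return t.ticks / kRun[0].ticks;
}

// Prints one "label ratio" line per measurement.
void print_relative()
{
    for (const Timing& t : kRun)
        std::printf("%s %.2f\n", t.label, relative_to_cpu(t));
}
```

On this run, cpu2gpu took about 2.5 times as long as cpu2cpu, and both gpu2gpu and gpu2cpu took about 8.7 times as long.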

I hope someone can benefit from this test program.