I am trying to run a hello-world DPC++ sample of oneAPI which adds two 1-D Arrays on both CPU and GPU, and verifies the results. Code is shown below:
/*
  DataParallel Addition of two Vectors
*/
#include <CL/sycl.hpp>
#include <array>
#include <iostream>
using namespace sycl;

constexpr size_t array_size = 100000;
typedef std::array<int, array_size> IntArray;

// Initialize array with the same value as its index
void InitializeArray(IntArray& a) {
    for (size_t i = 0; i < a.size(); i++) a[i] = i;
}

/*
  Create an asynchronous Exception Handler for sycl
*/
static auto exception_handler = [](cl::sycl::exception_list eList) {
    for (std::exception_ptr const& e : eList) {
        try {
            std::rethrow_exception(e);
        }
        catch (std::exception const& e) {
            std::cout << "Failure" << std::endl;
            std::terminate();
        }
    }
};

void VectorAddParallel(queue &q, const IntArray& x, const IntArray& y, IntArray& parallel_sum) {
    range<1> num_items{ x.size() };
    buffer x_buf(x);
    buffer y_buf(y);
    buffer sum_buf(parallel_sum.data(), num_items);
    /*
      Submit a command group to the queue by a lambda
      which contains data access permissions and device computation
    */
    q.submit([&](handler& h) {
        auto xa = x_buf.get_access<access::mode::read>(h);
        auto ya = y_buf.get_access<access::mode::read>(h);
        auto sa = sum_buf.get_access<access::mode::write>(h);
        std::cout << "Adding on GPU (Parallel)\n";
        h.parallel_for(num_items, [=](id<1> i) { sa[i] = xa[i] + ya[i]; });
        std::cout << "Done on GPU (Parallel)\n";
    });
    /*
      queue runs the kernel asynchronously. Once beyond the scope,
      buffers' data is copied back to the host.
    */
}

int main() {
    default_selector d_selector;
    IntArray a, b, sequential, parallel;
    InitializeArray(a);
    InitializeArray(b);
    try {
        // Queue needs: Device and Exception handler
        queue q(d_selector, exception_handler);
        std::cout << "Accelerator: "
                  << q.get_device().get_info<info::device::name>() << "\n";
        std::cout << "Vector size: " << a.size() << "\n";
        VectorAddParallel(q, a, b, parallel);
    }
    catch (std::exception const& e) {
        std::cout << "Exception while creating Queue. Terminating...\n";
        std::terminate();
    }
    /*
      Do the sequential, which is supposed to be slow
    */
    std::cout << "Adding on CPU (Scalar)\n";
    for (size_t i = 0; i < sequential.size(); i++) {
        sequential[i] = a[i] + b[i];
    }
    std::cout << "Done on CPU (Scalar)\n";
    /*
      Verify results, the old-school way
    */
    for (size_t i = 0; i < parallel.size(); i++) {
        if (parallel[i] != sequential[i]) {
            std::cout << "Fail: " << parallel[i] << " != " << sequential[i] << std::endl;
            std::cout << "Failed. Results do not match.\n";
            return -1;
        }
    }
    std::cout << "Success!\n";
    return 0;
}
With a relatively small array_size (I tested 100 to 50k elements), the computation works fine.
Sample output:
Accelerator: Intel(R) Gen9
Vector size: 50000
Adding on GPU (Parallel)
Done on GPU (Parallel)
Adding on CPU (Scalar)
Done on CPU (Scalar)
Success!
Note that the computation finishes in barely a second on both CPU and GPU.
But when I increase array_size to, say, 100000, I get this seemingly cryptic error:
C:\Users\myuser\source\repos\dpcpp-iotas\x64\Debug\dpcpp-iotas.exe (process 24472) exited with code -1073741571.
I am not sure at exactly what value the error starts occurring, but it seems to happen above roughly 70000. I have no idea why this is happening. Any insights into what could be wrong?
Turns out, this is due to the stack size limit enforced by VS. A contiguous array with too many elements resulted in a stack overflow.
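A rough check of the arithmetic, as a sketch rather than an exact accounting: it assumes a 4-byte int (true for MSVC on x64) and the 1 MB default stack reserve, and it counts only the four IntArray locals in main() (a, b, sequential, parallel). The names kDefaultStackBytes, kArraysOnStack, StackBytesNeeded and FitsInDefaultStack are made up for this illustration.

#include <array>
#include <cstddef>

// Default stack reserve assumed here: 1 MB.
constexpr std::size_t kDefaultStackBytes = 1024 * 1024;

// main() in the question keeps four arrays on the stack:
// a, b, sequential, and parallel.
constexpr std::size_t kArraysOnStack = 4;

// Bytes occupied by the four std::array<int, N> locals.
template <std::size_t N>
constexpr std::size_t StackBytesNeeded() {
    return kArraysOnStack * sizeof(std::array<int, N>);
}

// True if the four arrays alone are smaller than the default reserve.
// Other locals and call frames need stack space too, so the real
// limit is somewhat lower than this check suggests.
template <std::size_t N>
constexpr bool FitsInDefaultStack() {
    return StackBytesNeeded<N>() < kDefaultStackBytes;
}

// 4 arrays * 50000 ints * 4 bytes = 800000 bytes: fits.
static_assert(FitsInDefaultStack<50000>(), "50k elements fit");
// 4 arrays * 100000 ints * 4 bytes = 1600000 bytes: does not fit.
static_assert(!FitsInDefaultStack<100000>(), "100k elements overflow");

Under these assumptions the four arrays alone reach 1 MB at 65536 elements each, which is consistent with the failures the question reports somewhere below 70000.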
As mentioned by @user4581301, the error code -1073741571 is C00000FD in hex. The decimal value is the signed representation of that code, which Visual Studio reports for 'stack exhaustion/overflow'.
Ways to fix this:
- Increase the /STACK reserve to something higher than 1MB (this is the default) in Project Properties > Linker > System > Stack Reserve/Commit values.
- Set /STACK:reserve directly.
- Use std::vector instead, which allows dynamic allocation (suggested by @Retired Ninja).
I couldn't find an option to change /STACK in oneAPI the normal way, in the Linker properties, so I decided to go with dynamic allocation.
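A minimal sketch of the dynamic-allocation route, with the SYCL parts left out so it compiles with any C++ compiler. IntVector, MakeIndexVector and AddSequential are hypothetical names standing in for the question's IntArray, InitializeArray and the CPU loop; how the SYCL buffers are built from the vectors is not shown here.

#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for the question's IntArray typedef.
// The elements live on the heap; only a small handle sits on the stack.
using IntVector = std::vector<int>;

// Create a vector of `size` elements, each set to its own index.
IntVector MakeIndexVector(std::size_t size) {
    IntVector v(size);
    for (std::size_t i = 0; i < v.size(); i++) v[i] = static_cast<int>(i);
    return v;
}

// Sequential element-wise sum, mirroring the CPU loop in the question.
IntVector AddSequential(const IntVector& x, const IntVector& y) {
    assert(x.size() == y.size());  // both inputs must have the same length
    IntVector sum(x.size());
    for (std::size_t i = 0; i < sum.size(); i++) sum[i] = x[i] + y[i];
    return sum;
}

In main(), the four arrays would then be created as, for example, IntVector a = MakeIndexVector(array_size); so the element count is limited by available heap memory rather than by the stack reserve.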
Related: https://stackoverflow.com/a/26311584/9230398