Problem statement
Methodology
Implementing the vector addition code given on the lecture slides produces the following results.
To calculate the time taken by the memory copy, I measured the duration of the blocking read call below, which copies the result buffer from the device back to the host:
clEnqueueReadBuffer (queue, c_buffer, CL_TRUE, 0, N * sizeof (cl_float), c, 0, NULL,
NULL);
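Because the third argument of the call above is CL_TRUE, the read is blocking, so wall-clock timing around the call covers the whole transfer. A minimal sketch of that host-side measurement follows. It assumes no OpenCL runtime is available, so a plain host memory copy stands in for the read; the helper names `time_blocking_call_ms` and `time_host_copy_ms` are hypothetical and do not come from the assignment code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: returns the wall-clock time, in milliseconds,
// that a blocking callable takes to return.
template <typename Callable>
double time_blocking_call_ms(Callable&& call) {
    const auto t0 = std::chrono::steady_clock::now();
    call();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Stand-in for the device-to-host read (an assumption for illustration):
// copies n floats from one host vector to another and times the copy.
inline double time_host_copy_ms(std::size_t n) {
    assert(n > 0);
    std::vector<float> src(n, 1.0f);
    std::vector<float> dst(n, 0.0f);
    return time_blocking_call_ms([&] {
        std::memcpy(dst.data(), src.data(), n * sizeof(float));
    });
}
```

In the real program, the lambda body would be the clEnqueueReadBuffer call itself. std::chrono::steady_clock is used here rather than system_clock because it is monotonic, which is the usual choice for interval measurements.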
The following result is the average over multiple runs of the code.
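The averaging step can be sketched as below. This is an illustration under stated assumptions rather than the code used for the reported result: the helper name `average_run_ms` is hypothetical, and the workload is passed in as a generic callable so that the sketch does not depend on an OpenCL runtime.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` exactly `runs` times and returns the
// mean wall-clock duration of a single run, in milliseconds.
inline double average_run_ms(int runs, const std::function<void()>& work) {
    assert(runs > 0);
    double total_ms = 0.0;
    for (int r = 0; r < runs; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        total_ms += std::chrono::duration<double, std::milli>(t1 - t0).count();
    }
    return total_ms / runs;
}
```

Averaging several runs reduces the influence of one-off effects such as driver initialization on the first transfer.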
Appendix A
// Assignment4.cpp : This file contains the 'main' function. Program execution begins and ends there.
#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>
#include <tchar.h>
#include <memory.h>
#include <windows.h>
#include <CL/cl_ext.h>
#include "utils.h"
#include <assert.h>
#include<iostream>
#include<chrono>
#include<ctime>
using namespace std::chrono;
using namespace std;
//====
//=====
int main() {
chrono::time_point<std::chrono::system_clock> start, end;
start = chrono::system_clock::now(); // start of the wall-clock measurement reported at the end
int i;
for (i = 0; i < N; i++) {
a[i] = i;
b[i] = N - i;
}
// Input buffer, read-only for the kernel; its contents are copied from the host array a.
cl_mem a_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
N * sizeof(cl_float), a, NULL);
cl_mem b_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
N * sizeof(cl_float), b, NULL);
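// Output buffer, write-only for the kernel. No host pointer is supplied,
// so CL_MEM_COPY_HOST_PTR must not be set (it would fail with CL_INVALID_HOST_PTR).
cl_mem c_buffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY,
N * sizeof(cl_float), NULL, NULL);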
size_t global_work_size = N;
cl_ulong time_start;
cl_ulong time_end;
clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(time_start),
&time_start, NULL);
clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(time_end),
&time_end, NULL);
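// Profiling timestamps are reported in nanoseconds; print them with labels.
cout << "start (ns): " << time_start << "\n";
cout << "end (ns): " << time_end << "\n";
double nanoSeconds = static_cast<double>(time_end - time_start);
cout << "event time: " << nanoSeconds / 1000000.0 << " ms\n";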
free(a);
free(b);
free(c);
clReleaseMemObject(a_buffer);
clReleaseMemObject(b_buffer);
clReleaseMemObject(c_buffer);
clReleaseKernel(kernel);
clReleaseProgram(program);
clReleaseContext(context);
clReleaseCommandQueue(queue);
end = chrono::system_clock::now();
time_t end_time = std::chrono::system_clock::to_time_t(end);
chrono::duration<double> elapsed_seconds = end - start;
cout << "elapsed time: " << elapsed_seconds.count() << " sec\n";
system("pause");
}
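The conversion applied to the two profiling timestamps in the listing above can be isolated as a small helper. OpenCL reports CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END as 64-bit nanosecond counters, and the values are only available when the command queue was created with CL_QUEUE_PROFILING_ENABLE. The sketch below assumes the timestamps have already been read; the name `profiling_interval_ms` is hypothetical, and std::uint64_t stands in for cl_ulong so that the block does not need the OpenCL headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a pair of nanosecond timestamps into an
// interval in milliseconds. std::uint64_t stands in for cl_ulong here.
inline double profiling_interval_ms(std::uint64_t start_ns, std::uint64_t end_ns) {
    // The end timestamp is expected to be no earlier than the start;
    // an unsigned subtraction would otherwise wrap around.
    assert(end_ns >= start_ns);
    return static_cast<double>(end_ns - start_ns) / 1.0e6;
}
```

For example, timestamps of 1000000 ns and 3500000 ns correspond to an interval of 2.5 ms.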