I've read in the CUDA Programming Guide that global memory on a CUDA device is accessed in transactions of 32, 64 or 128 bits. Knowing that, is there any advantage to, say, having a set of float4 values (128 bits each) close together in memory? As I understand it, whether the float4 values are scattered in memory or stored in sequence, the number of transactions will be the same. Or will all accesses be coalesced into one gigantic transaction?
If each piece of data takes 128 bits or more, is there any advantage to grouping them in memory?
Asked by subb (128 views)
1 answer
Coalescing refers to combining memory requests from individual threads in a warp into a single memory transaction.
A single memory transaction is typically a 128-byte cache line, so it holds eight 128-bit (e.g. float4) quantities. Note that the transaction sizes given in the Programming Guide are in bytes, not bits. So, yes, there is a benefit to having multiple threads request adjacent 128-bit quantities, because these requests can still be coalesced into a single (128-byte) cache line request to memory.