According to the SPIR-V specification, OpVectorShuffle takes two input vectors
and a list of literal component indices, so it looks versatile enough to select components from both vectors at once.
When I try to extract an unaligned subsequence that spans two vectors, e.g.
#version 450
layout (local_size_x = 2, local_size_y = 8, local_size_z = 1) in;
layout(set = 0, binding = 0) buffer InputBuffer { vec4 input_buffer[3]; };
layout(set = 0, binding = 1) buffer OutputBuffer { vec4 result_buffer[1]; };
void main()
{
    vec4 a = input_buffer[0];
    vec4 b = input_buffer[1];
    vec4 w = input_buffer[2];
    result_buffer[0] = vec4(a.w, b.x, b.y, b.z) * w.xxxx;
}
I get the following SPIR-V:
%4 = OpFunction %void None %3
%5 = OpLabel
%19 = OpAccessChain %_ptr_StorageBuffer_v4float %15 %int_0 %int_0
%20 = OpLoad %v4float %19
%23 = OpAccessChain %_ptr_StorageBuffer_v4float %15 %int_0 %int_1
%24 = OpLoad %v4float %23
%27 = OpAccessChain %_ptr_StorageBuffer_v4float %15 %int_0 %int_2
%28 = OpLoad %v4float %27
%36 = OpCompositeExtract %float %20 3
%39 = OpCompositeExtract %float %24 0
%41 = OpCompositeExtract %float %24 1
%44 = OpCompositeExtract %float %24 2
%45 = OpCompositeConstruct %v4float %36 %39 %41 %44
%47 = OpVectorShuffle %v4float %28 %28 0 0 0 0
%48 = OpFMul %v4float %45 %47
%49 = OpAccessChain %_ptr_StorageBuffer_v4float %33 %int_0 %int_0
OpStore %49 %48
OpReturn
OpFunctionEnd
IMO this seems like the perfect candidate for a single two-operand OpVectorShuffle in place of the four extracts and the construct:
%45 = OpVectorShuffle %v4float %20 %24 3 4 5 6
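To make the equivalence I have in mind concrete, here is a minimal Python model of how I read the spec: the component literals index into the two input vectors as if they were concatenated (first vector first), and 0xFFFFFFFF marks an undefined result component. The function names `op_vector_shuffle` and `extract_and_construct` are made up for illustration and are not part of any SPIR-V tooling; this only sketches the semantics and says nothing about what the optimizer actually does.

```python
UNDEFINED = 0xFFFFFFFF  # SPIR-V literal meaning "this result component is undefined"


def op_vector_shuffle(vec1, vec2, components):
    """Model of OpVectorShuffle: indices select from vec1 followed by vec2."""
    combined = list(vec1) + list(vec2)
    # None stands in for an undefined component in this model.
    return [None if c == UNDEFINED else combined[c] for c in components]


def extract_and_construct(a, b):
    """Model of the four OpCompositeExtract plus OpCompositeConstruct emitted above."""
    return [a[3], b[0], b[1], b[2]]


# Hypothetical input values standing in for input_buffer[0] and input_buffer[1].
a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

shuffled = op_vector_shuffle(a, b, [3, 4, 5, 6])
constructed = extract_and_construct(a, b)

print(shuffled)     # [4.0, 5.0, 6.0, 7.0]
print(constructed)  # [4.0, 5.0, 6.0, 7.0]
```

Under this reading, indices 3 4 5 6 pick a.w, b.x, b.y, b.z, which is the same vector the extract/construct sequence builds.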
The command I use to compile is
>/usr/local/bin/glslc -S -O -fshader-stage=compute ./conv.glsl --target-env=vulkan1.1 -o asm.txt
Is there a way to convince the compiler (spirv-tools v2021.4, glslang 11.1.0-316-g600c5037) to optimize this further, or a better way to express the idea in GLSL?