Saturday, August 6, 2022

AVX512 vbroadcast instructions

The AVX512 vpbroadcast/vbroadcast instructions can broadcast 8-bit, 16-bit, 32-bit, 64-bit, 128-bit and 256-bit data across a 512-bit register. The integer forms, and the AVX512 feature each one requires, are:

  • vpbroadcastb zmm1, xmm2/m8 is AVX512BW.
  • vpbroadcastw zmm1, xmm2/m16 is AVX512BW.
  • vpbroadcastd zmm1, xmm2/m32 is AVX512F.
  • vpbroadcastq zmm1, xmm2/m64 is AVX512F.
  • vbroadcasti32x2 zmm1, xmm2/m64 is AVX512DQ.
  • vbroadcasti32x4 zmm1, m128 is AVX512F.
  • vbroadcasti64x2 zmm1, m128 is AVX512DQ.
  • vbroadcasti32x8 zmm1, m256 is AVX512DQ.
  • vbroadcasti64x4 zmm1, m256 is AVX512F.
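
As a rough illustration of what every instruction in the list above does when no write mask is used, here is a scalar model in plain C. It is only a sketch of the semantics, not how the hardware or any compiler implements them, and the names broadcast_model and broadcast_copies are hypothetical helpers invented for this example, not intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZMM_BYTES 64 /* a 512-bit register holds 64 bytes */

/* Scalar model of an unmasked broadcast (hypothetical helper):
 * copy the first elem_bytes bytes of src into every elem_bytes-wide
 * slot of the 64-byte destination.
 * elem_bytes is 1, 2, 4, 8, 16 or 32 for the instructions listed above. */
static void broadcast_model(uint8_t dst[ZMM_BYTES], const void *src,
                            size_t elem_bytes)
{
    assert(elem_bytes != 0 && ZMM_BYTES % elem_bytes == 0);
    for (size_t off = 0; off < ZMM_BYTES; off += elem_bytes)
        memcpy(dst + off, src, elem_bytes);
}

/* Number of copies of the source element that fit in the register. */
static size_t broadcast_copies(size_t elem_bytes)
{
    return ZMM_BYTES / elem_bytes;
}
```

With this model, an 8-bit broadcast (vpbroadcastb) writes 64 copies of the source and a 256-bit broadcast (vbroadcasti64x4 or vbroadcasti32x8) writes 2.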

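For the larger broadcasts, two different instructions produce the same bit pattern in the register. If the goal is simply to broadcast data, without a write mask,

  • vpbroadcastq is preferable to vbroadcasti32x2
  • vbroadcasti32x4 is preferable to vbroadcasti64x2
  • vbroadcasti64x4 is preferable to vbroadcasti32x8

because the former needs only the baseline AVX512F feature, while the latter additionally requires AVX512DQ. The two instructions in each pair differ only in the element width (32 or 64 bits) that a write mask applies to.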
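
To make the comparison concrete, the sketch below models vpbroadcastq and vbroadcasti32x2 in plain C, including a merge-style write mask. It is an assumption-laden scalar model rather than real intrinsics code: masked_broadcast_model, same_without_mask and differ_with_mask are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZMM_BYTES 64 /* a 512-bit register holds 64 bytes */

/* Scalar model of a merge-masked broadcast (hypothetical helper):
 * the group_bytes-wide source pattern is repeated across the register,
 * but a destination lane of lane_bytes is overwritten only when its
 * bit in mask is set. */
static void masked_broadcast_model(uint8_t dst[ZMM_BYTES], const void *src,
                                   size_t group_bytes, size_t lane_bytes,
                                   uint64_t mask)
{
    uint8_t full[ZMM_BYTES];
    assert(group_bytes != 0 && ZMM_BYTES % group_bytes == 0);
    assert(lane_bytes != 0 && ZMM_BYTES % lane_bytes == 0);
    for (size_t off = 0; off < ZMM_BYTES; off += group_bytes)
        memcpy(full + off, src, group_bytes);
    for (size_t lane = 0; lane < ZMM_BYTES / lane_bytes; lane++)
        if ((mask >> lane) & 1)
            memcpy(dst + lane * lane_bytes, full + lane * lane_bytes,
                   lane_bytes);
}

/* Returns 1 when the modelled vpbroadcastq (8-byte lanes) and
 * vbroadcasti32x2 (4-byte lanes) agree with every lane enabled. */
static int same_without_mask(uint64_t value)
{
    uint8_t a[ZMM_BYTES] = {0}, b[ZMM_BYTES] = {0};
    masked_broadcast_model(a, &value, 8, 8, 0xFFu);   /* 8 qword lanes  */
    masked_broadcast_model(b, &value, 8, 4, 0xFFFFu); /* 16 dword lanes */
    return memcmp(a, b, ZMM_BYTES) == 0;
}

/* Returns 1 when the two forms differ with only mask bit 0 set:
 * the qword form writes bytes 0..7, the dword form only bytes 0..3. */
static int differ_with_mask(uint64_t value)
{
    uint8_t a[ZMM_BYTES] = {0}, b[ZMM_BYTES] = {0};
    masked_broadcast_model(a, &value, 8, 8, 0x1u);
    masked_broadcast_model(b, &value, 8, 4, 0x1u);
    return memcmp(a, b, ZMM_BYTES) != 0;
}
```

In this model the two instructions are interchangeable when every mask bit is set, which is the "simply broadcast" case, so the AVX512F form loses nothing. They stop being interchangeable only when a partial mask is applied.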

 

References

  • https://www.officedaytime.com/simd512e/simdimg/si.php?f=vbroadcastf128
  • Intel® 64 and IA-32 Architectures Software Developer’s Manual