However, this starts to bottleneck when the code involves branching. Most SIMD engines support branching by running instructions conditionally: every unit steps through each branch, and the units that are not on the currently executing branch do nothing. So with one level of an if/else statement, where the units split evenly between the two sides, only 50% of the units are active at any time. Two levels? 25%. That is why SIMD is usually not used for branch-intensive tasks such as decision trees.
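To make the 50%/25% arithmetic concrete, here is a minimal sketch that simulates one SIMD unit stepping through a two-level if/else with lane masks. It is a toy model rather than a description of any real hardware: the function name `run_masked`, the four branch conditions, and the eight-lane input are all assumptions chosen purely for illustration.

```python
def run_masked(values):
    """Simulate one SIMD unit executing a two-level if/else using lane masks.

    Returns (results, utilization), where utilization is the fraction of
    lane-slots that did useful work across all executed branch bodies.
    """
    lanes = len(values)
    results = [None] * lanes
    active_slots = 0
    total_slots = 0

    # Each leaf of the nested if/else is (predicate, operation).
    # The conditions and operations are arbitrary, for illustration only.
    leaves = [
        (lambda x: x % 2 == 0 and x % 4 == 0, lambda x: x * 2),  # if even / if multiple of 4
        (lambda x: x % 2 == 0 and x % 4 != 0, lambda x: x + 1),  # if even / else
        (lambda x: x % 2 != 0 and x % 3 == 0, lambda x: x - 1),  # else / if multiple of 3
        (lambda x: x % 2 != 0 and x % 3 != 0, lambda x: -x),     # else / else
    ]

    for predicate, operation in leaves:
        mask = [predicate(v) for v in values]
        if not any(mask):
            continue  # assumption: a path that no lane takes is skipped entirely
        total_slots += lanes  # every lane spends a slot on this path...
        for i in range(lanes):
            if mask[i]:  # ...but only the unmasked lanes do real work
                results[i] = operation(values[i])
                active_slots += 1

    utilization = active_slots / total_slots if total_slots else 1.0
    return results, utilization


if __name__ == "__main__":
    print(run_masked(list(range(8))))  # lanes diverge across all four paths
    print(run_masked([4] * 8))         # every lane takes the same path
```

With the inputs 0 through 7 the lanes scatter across all four paths, so the unit walks all four branch bodies and the model reports 25% utilization. When every lane holds the same value, only one path runs and utilization is 100%: in this model the penalty comes from lanes disagreeing, not from the presence of a branch in the code.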
One good example of SIMD is a GPU. Almost all current GPUs are groups of SIMD cores. Most CPUs have SIMD units too: Intel has dual 128-bit, AMD Bulldozer has 256-bit/dual 128-bit, and AMD Zen has 512-bit/dual 256-bit SIMD. Most mobile processors do as well, although first-generation Raspberry Pis don't.