One of the major goals of Vc is to ease development of portable code while still reaching the highest possible performance, which requires instructions specific to the target architecture. Vc makes this possible by letting a single type use different implementations of the same API depending on the target architecture. Many details of the target architecture depend on the compiler flags that were used. There can also be subtle differences between the implementations that could lead to problems. This page aims to document all issues you might need to know about.
With GCC, call the compiler with the -march=\<target\> flag. Take a look at the GCC manpage to find all possibilities for \<target\>. If no SIMD instructions are enabled via compiler flags, Vc must fall back to the scalar implementation.
MSVC supports the /arch:AVX, /arch:AVX2 and /arch:AVX512 flags. Without such a flag, at least SSE2 is enabled.
Be aware that a binary built for a given SIMD instruction set may not run on a processor that lacks those instructions. The executable works fine as long as no such instruction is actually executed, and crashes only at the point where one is used. It is therefore better to check at application start whether the compiled-in SIMD instruction set is really supported by the executing CPU. This can be determined with the currentImplementationSupported function.
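
A start-up check of this kind could be sketched as follows. The helper run_guarded and both of its parameters are hypothetical names introduced for illustration and are not part of Vc. The sketch takes the support check as a callable, so it does not depend on how the check is obtained.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper, not part of Vc.
// Runs `app` only if `simd_supported()` reports that the CPU can execute the
// instruction set the binary was compiled for. Otherwise it prints a message
// and returns EXIT_FAILURE before any SIMD code is reached.
template <typename SupportCheck, typename App>
int run_guarded(SupportCheck simd_supported, App app)
{
    if (!simd_supported()) {
        std::fputs("This CPU lacks the SIMD instructions this binary was built for.\n",
                   stderr);
        return EXIT_FAILURE;
    }
    return app();
}
```

With Vc available, main could reduce to `return run_guarded([] { return Vc::currentImplementationSupported(); }, real_main);`, assuming the function is reachable under that name and returns bool, and with real_main standing in for the actual program.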
If you want to distribute a binary that runs correctly on many different systems, you must either restrict it to the least common denominator (which often is SSE2) or compile the code several times with different target architecture compiler options. A simple way to combine the resulting executables is a wrapper script or executable that determines the correct executable to use. A more sophisticated option is the ifunc attribute that GCC provides. Other compilers might provide similar functionality.
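
The wrapper approach might look roughly like the sketch below. It is not taken from Vc. It swaps in the GCC/Clang x86 builtin __builtin_cpu_supports to probe the CPU, and the function names, the variant suffixes (avx2, avx, sse2, scalar) and the "name-suffix" file layout are all assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper: returns the suffix of the most capable build variant
// that the executing CPU can run. On compilers or architectures without the
// x86 builtins it conservatively reports "scalar".
inline const char *best_variant()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // make sure the CPU feature data is initialized
    if (__builtin_cpu_supports("avx2")) return "avx2";
    if (__builtin_cpu_supports("avx"))  return "avx";
    if (__builtin_cpu_supports("sse2")) return "sse2";
#endif
    return "scalar";
}

// Hypothetical naming scheme: "myapp" becomes e.g. "myapp-avx2".
inline std::string variant_path(const std::string &base)
{
    return base + "-" + best_variant();
}
```

A wrapper executable would then launch the program at variant_path(...), for example via execv on POSIX systems, after the code has been built once per target with the matching compiler flags.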
It is guaranteed that:
Since SIMD is not part of the C/C++ language standards, Vc abstracts more or less standardized compiler extensions. Sadly, not every issue can be abstracted transparently. This is therefore the place where the differences are documented:
Vc defines Vc_PASSING_VECTOR_BY_VALUE_IS_BROKEN for such cases. The Vc vector types also contain a typedef AsArg, which resolves to either const-ref or const-by-value. Thus, you can always use AsArg as the parameter type when passing a vector to a function.
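
To illustrate the AsArg idea without depending on Vc, the following sketch uses a toy vector type. ToyVector and sum are hypothetical stand-ins and do not reflect how Vc implements its types. Only the macro name and the AsArg typedef name come from the text above.

```cpp
#include <cstddef>

// Toy stand-in for a SIMD vector type; NOT Vc's implementation.
struct alignas(16) ToyVector {
    float data[4];
#ifdef Vc_PASSING_VECTOR_BY_VALUE_IS_BROKEN
    typedef const ToyVector &AsArg;  // passing by value is broken: use const-ref
#else
    typedef const ToyVector AsArg;   // passing by value works: use const-by-value
#endif
};

// Written once against AsArg, so the signature is valid under either
// definition of the typedef.
inline float sum(ToyVector::AsArg v)
{
    float s = 0.f;
    for (std::size_t i = 0; i < 4; ++i) {
        s += v.data[i];
    }
    return s;
}
```

With Vc itself, the corresponding signature would presumably take the form `void f(Vc::float_v::AsArg x)`, assuming float_v exposes the typedef as described.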