Getting to Know ARM64EC: #Defines and Intrinsic Functions

This post has been republished via RSS; it originally appeared at Microsoft Tech Community - Latest Blogs.

Earlier this year, we announced ARM64EC, a new ABI that will make it easier than ever to build native apps for Windows on ARM. With the Windows 11 SDK and Visual Studio Preview, you can start using the preview ARM64EC tools to add ARM64EC code to your own apps or to build new ARM64EC projects. For developers looking to dive in and get started, we'll be sharing more details and things to know in this and upcoming blog posts.

 

Today, we'll be diving into one key detail of the environment to know: when compiling ARM64EC, the _M_AMD64 preprocessor macro is defined and _M_ARM64 is not.  There is also a new preprocessor macro, _M_ARM64EC, that is set only when building ARM64EC. 

 

Preprocessor macros defined for each target by MSVC:

x64           ARM64EC        ARM64
_M_X64        _M_X64         _M_ARM64
_M_AMD64      _M_AMD64
              _M_ARM64EC

 

If you include windows.h in your project, you’ll also see that _AMD64_ and _ARM64EC_ are both defined when building ARM64EC code. 
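As a quick way to see this for yourself, the following minimal sketch reports which of the MSVC architecture macros are visible to a translation unit. The function names `defined_arch_macros` and `target_name` are made up for this illustration; only the `_M_*` macros come from the compiler. Note that `_M_ARM64EC` is tested first, because an ARM64EC build also defines `_M_AMD64`.

```cpp
#include <string>

// Returns a space-separated list of the MSVC architecture macros defined for
// this build. On compilers that define none of them, the result is empty.
std::string defined_arch_macros() {
    std::string out;
#if defined(_M_X64)
    out += "_M_X64 ";
#endif
#if defined(_M_AMD64)
    out += "_M_AMD64 ";
#endif
#if defined(_M_ARM64EC)
    out += "_M_ARM64EC ";
#endif
#if defined(_M_ARM64)
    out += "_M_ARM64 ";
#endif
    return out;
}

// Names the target. _M_ARM64EC must be checked before _M_AMD64, since an
// ARM64EC build defines both.
const char* target_name() {
#if defined(_M_ARM64EC)
    return "ARM64EC";
#elif defined(_M_AMD64)
    return "x64";
#elif defined(_M_ARM64)
    return "ARM64";
#else
    return "other";  // a different MSVC target or a non-MSVC compiler
#endif
}
```

Printing `target_name()` and `defined_arch_macros()` from a small test program built once per configuration should reproduce the table above.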

 

This combination may seem counterintuitive at first, but it's key to the fundamental promise of ARM64EC: being interoperable with x64 code, even within the same binary. Windows 11 takes care of seamlessly transitioning between code running natively on the CPU and code running under emulation. To do so, it makes sure that data flows transparently between ARM64EC and x64, including data pointers and function pointers (i.e. callbacks). For this to work, datatype definitions must be the same when compiling ARM64EC code as when compiling x64.

 

The defined preprocessor macros for ARM64EC mean that your project compiling as ARM64EC will use definitions from x64, not ones from ARM64. This ensures that datatype definitions are the same when compiling for x64 and ARM64EC and that passing parameters, either by value or by reference, will not generate a mismatch. 
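To make this concrete, here is a small sketch of an architecture-dependent type. `SavedRegisters` and `kRegisterCount` are hypothetical names invented for this example and do not come from any Windows header. They only illustrate how a header that switches on `_M_AMD64` and `_M_ARM64` ends up choosing the x64 layout in an ARM64EC build.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical header that picks a layout per architecture.
#if defined(_M_AMD64)
// Taken by x64 builds and by ARM64EC builds, since both define _M_AMD64.
constexpr std::size_t kRegisterCount = 16;
#elif defined(_M_ARM64)
// Taken only by native ARM64 builds; ARM64EC does not define _M_ARM64.
constexpr std::size_t kRegisterCount = 31;
#else
// Assumption: fallback so the sketch also builds with non-MSVC compilers.
constexpr std::size_t kRegisterCount = 16;
#endif

struct SavedRegisters {
    std::uint64_t gpr[kRegisterCount];
};

#if defined(_M_ARM64EC)
// In an ARM64EC build the structure must match what x64 code expects.
static_assert(kRegisterCount == 16, "ARM64EC should see the x64 layout");
#endif
static_assert(sizeof(SavedRegisters) == kRegisterCount * sizeof(std::uint64_t),
              "unexpected padding in SavedRegisters");
```

Because both sides agree on the size and field offsets, a pointer to such a structure can cross between ARM64EC and emulated x64 code without conversion.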

 

Another common use of architecture #defines is to guard platform-specific instructions, usually exposed to C/C++ code in the form of intrinsic functions. Intrinsic functions are functions defined internally by the compiler, which allow C/C++ code to tap into architecture-specific instructions and get the best possible performance without resorting to assembly. Knowing that ARM64EC projects will follow x64 code paths, you may ask: what about intrinsic functions?

 

When compiling ARM64EC, x64 intrinsic functions are supported and will be translated to ARM64EC code automatically.  As a result, taking an x64 project and building for ARM64EC, even one that uses intrinsic functions for performance, can easily yield an ARM64EC app with good performance without source changes. 
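As an illustration, the sketch below is the kind of x64-oriented code that would be expected to build for ARM64EC unchanged, because `_M_AMD64` is defined there and the SSE2 path is selected. The function `add4` and the `DEMO_HAVE_SSE2` macro are invented for this example, and the `__SSE2__` branch and scalar fallback are assumptions added only so the sketch also builds with other compilers.

```cpp
#include <cstdint>

#if defined(_M_AMD64)
// MSVC targeting x64 or ARM64EC: both define _M_AMD64.
#include <intrin.h>
#define DEMO_HAVE_SSE2 1
#elif defined(__SSE2__)
// Assumption: GCC or Clang on x86, included so the sketch builds elsewhere.
#include <emmintrin.h>
#define DEMO_HAVE_SSE2 1
#endif

// Adds two 4-element arrays of 32-bit integers and writes the sums to out.
void add4(const std::int32_t* a, const std::int32_t* b, std::int32_t* out) {
#if defined(DEMO_HAVE_SSE2)
    // Unaligned loads and stores, so callers need no special alignment.
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
#else
    // Scalar fallback for targets without the SSE2 path.
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Calling `add4` with `{1, 2, 3, 4}` and `{10, 20, 30, 40}` fills the output with `{11, 22, 33, 44}` on every path.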

 

You also have the option to further optimize the processor-specific code in your project by using ARM64 intrinsic functions in your ARM64EC project.  The _M_ARM64EC preprocessor macro allows you to differentiate ARM64EC from x64 and take ARM-specific code paths rather than x64.  For example, if you have code that already handles choosing the best intrinsic functions for x64 and ARM64, you can key off _M_ARM64EC or _M_ARM64 to use the ARM intrinsic functions, as below: 

 

Before:

#include <intrin.h>

void func() {
#if defined(_M_AMD64)
    __m128i vec;
    vec = _mm_setzero_si128();
#elif defined(_M_ARM64)
    __n128 vec;
    vec = vdupq_n_u32(0);
#endif
}

After:

#include <intrin.h>

void func() {
#if defined(_M_AMD64) && !defined(_M_ARM64EC)
    __m128i vec;
    vec = _mm_setzero_si128();
#elif defined(_M_ARM64) || defined(_M_ARM64EC)
    __n128 vec;
    vec = vdupq_n_u32(0);
#endif
}

 

The architecture #defines set by the compiler when building ARM64EC may be somewhat surprising at first, but they make more sense when you consider that ARM64EC and x64 are interoperable. These settings, and the automatic translation of intrinsics, enable code to be ported to ARM64EC with the least amount of effort, while still enabling ARM64EC-specific fine-tuning and optimization.

 

Marc Sweetgall, Pedro Justo

 
