I am trying to write a library in Ada (targeting GNAT specifically) for the x86 ISA extensions. (This question is specific to AVX/AVX2.)

Here is some example code:

```
--  256-bit Vector of Single Precision Floating Point Numbers
type Vector_256_Float_32 is array (0 .. 7) of IEEE_Float_32 with
  Alignment => 32, Size => 256, Object_Size => 256;
pragma Machine_Attribute (Vector_256_Float_32, "vector_type");
pragma Machine_Attribute (Vector_256_Float_32, "may_alias");

--  256-bit Vector of 32-bit Signed Integers
type Vector_256_Integer_32 is array (0 .. 7) of Integer_32 with
  Alignment => 32, Size => 256, Object_Size => 256;
pragma Machine_Attribute (Vector_256_Integer_32, "vector_type");
pragma Machine_Attribute (Vector_256_Integer_32, "may_alias");

--  256-bit Vector of 32-bit Unsigned Integers
type Vector_256_Unsigned_32 is array (0 .. 7) of Unsigned_32 with
  Alignment => 32, Size => 256, Object_Size => 256;
pragma Machine_Attribute (Vector_256_Unsigned_32, "vector_type");
pragma Machine_Attribute (Vector_256_Unsigned_32, "may_alias");

function vblendvps
  (Left, Right, Mask : Vector_256_Float_32)
   return Vector_256_Float_32 with
  Inline_Always => True, Convention => Intrinsic, Import => True,
  External_Name => "__builtin_ia32_blendvps256";
```

For the sake of education, I want to know how to do this in assembly.

I have tried to define the `vblendvps` function using the `Asm` procedure from `System.Machine_Code`. However, as I am not knowledgeable about assembly programming, I am struggling to find resources on how to do this.

This is what I have so far:

```
with System.Machine_Code; use System.Machine_Code;

function vblendvps
  (Left, Right, Mask : Vector_256_Float_32)
   return Vector_256_Float_32
is
   result : Vector_256_Float_32;
begin
   Asm
     (Template => "vblendvps %3, %0, %1, %2",
      Outputs  => Vector_256_Float_32'Asm_Output ("=g", result),
      Inputs   =>
        (Vector_256_Float_32'Asm_Input ("g", Left),
         Vector_256_Float_32'Asm_Input ("g", Right),
         Vector_256_Float_32'Asm_Input ("g", Mask)));
   return result;
end vblendvps;
```

When compiling the complete code, I get

```
Error: too many memory references for `vblendvps'
```

I believe this means that I need to move the arguments from memory into registers, but I am not sure. If there are helpful references that explain every instruction, I would greatly appreciate them. (I had quite a bit of trouble looking up the arguments to `vblendvps`.)

My understanding is that the instruction takes the following form (with ymm registers in my case):

```
vblendvps RESULT, LEFT, RIGHT, MASK
```

Please let me know how I would do this. Even if it is not in Ada, I'm sure I can figure out how to translate it.
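
For reference, here is a minimal sketch of how the attempt above might be adjusted. It rests on a few assumptions that are not stated in the question: that GNAT emits AT&T-syntax assembly (its default), so the operands appear in the reverse of the Intel order shown above; that the GCC x86 constraint `"x"` (any SSE/AVX register) is a suitable replacement for `"g"`, which also permits memory operands and so lets more than one operand end up in memory; and that the unit is compiled with AVX enabled. The unit name `Blend_Sketch` and the test values are hypothetical and only there to make the example self-contained.

```ada
--  Sketch only: assumes an x86-64 GNAT toolchain with AVX enabled,
--  e.g. built with something like: gnatmake blend_sketch.adb -cargs -mavx
with Ada.Text_IO;
with Interfaces;          use Interfaces;
with System.Machine_Code; use System.Machine_Code;

procedure Blend_Sketch is  --  hypothetical unit name

   --  Same vector type as declared in the question.
   type Vector_256_Float_32 is array (0 .. 7) of IEEE_Float_32 with
     Alignment => 32, Size => 256, Object_Size => 256;
   pragma Machine_Attribute (Vector_256_Float_32, "vector_type");
   pragma Machine_Attribute (Vector_256_Float_32, "may_alias");

   function vblendvps
     (Left, Right, Mask : Vector_256_Float_32)
      return Vector_256_Float_32
   is
      Result : Vector_256_Float_32;
   begin
      --  Operand numbering: %0 = Result, %1 = Left, %2 = Right, %3 = Mask.
      --  Intel order is  dest, src1, src2, mask;
      --  AT&T order is   mask, src2, src1, dest.
      Asm
        (Template => "vblendvps %3, %2, %1, %0",
         Outputs  => Vector_256_Float_32'Asm_Output ("=x", Result),
         Inputs   =>
           (Vector_256_Float_32'Asm_Input ("x", Left),
            Vector_256_Float_32'Asm_Input ("x", Right),
            Vector_256_Float_32'Asm_Input ("x", Mask)));
      return Result;
   end vblendvps;

   --  Hypothetical test data: lanes 0 .. 3 of M have the sign bit set.
   A : constant Vector_256_Float_32 := (others => 1.0);
   B : constant Vector_256_Float_32 := (others => 2.0);
   M : constant Vector_256_Float_32 := (0 .. 3 => -1.0, 4 .. 7 => 1.0);
   R : Vector_256_Float_32;

begin
   R := vblendvps (A, B, M);

   --  Each lane takes Right where the mask's sign bit is set and Left
   --  otherwise, so lanes 0 .. 3 should come from B and 4 .. 7 from A.
   for I in R'Range loop
      Ada.Text_IO.Put_Line (IEEE_Float_32'Image (R (I)));
   end loop;
end Blend_Sketch;
```

The instruction accepts a memory operand only in the second-source position (`Right` here), which is why letting the compiler pick memory for every operand via `"g"` leads to the "too many memory references" error. Constraining all operands to registers is the simplest choice for a sketch like this.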