Skip to content

AES-GCM-SIV: Add implementation in C and assembly#10807

Open
SparkiDev wants to merge 1 commit into
wolfSSL:masterfrom
SparkiDev:aes_gcm_siv_asm
Open

AES-GCM-SIV: Add implementation in C and assembly#10807
SparkiDev wants to merge 1 commit into
wolfSSL:masterfrom
SparkiDev:aes_gcm_siv_asm

Conversation

@SparkiDev

Copy link
Copy Markdown
Contributor

Description

Added assembly for Intel x64, ARM64, ARM32, Thumb2.

Testing

Tested each platform with AES-GCM-SIV.

@SparkiDev SparkiDev self-assigned this Jun 29, 2026
@SparkiDev SparkiDev force-pushed the aes_gcm_siv_asm branch 3 times, most recently from 79ae040 to 8bf1491 Compare June 30, 2026 04:51
@dgarske dgarske self-requested a review June 30, 2026 15:58

@dgarske dgarske left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skoll Multi-Scan Review

Modes: review-securityOverall recommendation: COMMENT
Findings: 4 total — 4 posted, 0 skipped
2 finding(s) posted as inline comments (see file-level comments below)
2 finding(s) not tied to a diff line (full detail below)

One or more scans did not complete — the results below are partial.

Posted findings

  • [Medium] [review-security] SGX cpuid hard-set now forces CPUID_VAES | CPUID_AVX512, can crash AES-GCM on SGX CPUs lacking those extensionswolfcrypt/src/cpuid.c:56-57
  • [Medium] [review-security] New Thumb-2 AES-GCM-SIV asm and C wrappers are neither built nor run in the added CI.github/workflows/aesgcm-siv.yml:60-148

Findings not tied to a diff line

POLYVAL hash-key stack temporaries not zeroized (t, hrev)

File: wolfcrypt/src/aes.c (AesGcmSivPolyvalInit asm branch; AesGcmSivPolyvalInitSw)
Function: AesGcmSivPolyvalInit / AesGcmSivPolyvalInitSw
Severity: Low
Category: Zeroization

The per-message POLYVAL key is derived from the secret message-authentication-key. The persistent copies in AesGcmSivPolyval (poly->h / poly->m / poly->hHw) are correctly wiped by AesGcmSivPolyvalFinal via ForceZero(poly, sizeof(*poly)). However the intermediate stack buffers that hold transformed copies of that key are not wiped before going out of scope: byte t[WC_AES_BLOCK_SIZE] in the asm key-prep branch of AesGcmSivPolyvalInit (holds ByteReverse(mulX(ByteReverse(h)))), and byte hrev[WC_AES_BLOCK_SIZE] in both AesGcmSivPolyvalInitSw variants (holds ByteReverse(h)). These leave a permutation of the POLYVAL key on the stack after the function returns. This is defense-in-depth only (recovering it would let an attacker forge for that specific key+nonce), and the buffers are typically overwritten by later stack frames.

Recommendation: Add ForceZero(t, sizeof(t)); before the asm key-prep block returns and ForceZero(hrev, sizeof(hrev)); at the end of each AesGcmSivPolyvalInitSw variant, consistent with the zeroization used elsewhere in the SIV code.

Referenced code: wolfcrypt/src/aes.c (AesGcmSivPolyvalInit asm branch; AesGcmSivPolyvalInitSw) (9 lines)


ARM64 GHASH length conversion changed from ubfiz (#3,#32) to lsl #3, dropping upper-32-bit masking

File: wolfcrypt/src/port/arm/armv8-aes-asm.S (multiple *_partial_done labels); wolfcrypt/src/port/arm/armv8-aes-asm_c.c (AES_GCM_encrypt_final/decrypt_final/_AARCH64 paths)
Function: AES_GCM_*_AARCH64 / aes_gcm_*_arm64_crypto_*_partial_done
Severity: Info
Category: Logic

In the existing (non-SIV) AES-GCM AArch64 code regenerated by this PR, byte-length-to-bit-length conversions changed from ubfiz xN, xN, #3, #32 to lsl xN, xN, #3, and standalone ubfiz xN, xN, #0, #32 truncations (e.g. on tagSz in the *_part_tag paths) were removed. ubfiz #3,#32 masks the operand to its low 32 bits before shifting; lsl #3 shifts the full 64-bit register. The two are identical for any operand whose upper 32 bits are zero, which is always the case for the word32 length/tag-size arguments as materialized by wolfSSL callers (32-bit values are zero-extended). So no wrong tag results for valid wolfSSL usage, and these GCM paths are covered by the broader GCM test suite. The note is only that the new sequence is marginally less defensive against a caller leaving non-zero garbage in the upper 32 bits of the argument register.

Recommendation: No action required if these are deliberate asm-generator output; the behavior is equivalent for all in-tree callers. If the generator can emit it cheaply, retaining a 32-bit mask (or documenting that callers must pass zero-extended lengths) preserves the prior defensiveness.

Referenced code: wolfcrypt/src/port/arm/armv8-aes-asm.S (multiple *_partial_done labels); wolfcrypt/src/port/arm/armv8-aes-asm_c.c (AES_GCM_encrypt_final/decrypt_final/_AARCH64 paths) (4 lines)


Review generated by Skoll

Comment thread wolfcrypt/src/cpuid.c
new_cpuid_flags |= CPUID_ADX;
new_cpuid_flags |= CPUID_MOVBE;
new_cpuid_flags |= CPUID_BMI1;
new_cpuid_flags |= CPUID_VAES;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 [Medium] SGX cpuid hard-set now forces CPUID_VAES | CPUID_AVX512, can crash AES-GCM on SGX CPUs lacking those extensions · Regression

The SGX-specific cpuid_set_flags() does not call cpuid (it would SIGILL inside an enclave, per the file comment) and instead hard-codes the feature set. BEFORE this PR the hard-set list stopped at CPUID_BMI1 (so Intel ASM in SGX used the AES-NI/PCLMUL paths). AFTER this PR it also sets new_cpuid_flags |= CPUID_VAES; and new_cpuid_flags |= CPUID_AVX512;. These flags gate the runtime dispatchers for ALL Intel AES-GCM ASM (not only the new GCM-SIV AesGcmSivPolyvalAsm/AesGcmSivCtrAsm, but the existing AES_GCM_* VAES/AVX512 routines). Many SGX-capable parts (Skylake/Coffee Lake/Comet Lake client, and 12th-gen+ client which dropped AVX512) do NOT implement AVX512 or VAES. On such a CPU, an enclave built after this change will dispatch to vpclmulqdq/vaesenc/zmm code and take a #UD (SIGILL) in AES-GCM/GCM-SIV. This narrows the set of CPUs an SGX build will run on, a behavioral regression introduced by a SIV feature PR. The path is explicitly a benchmarking hard-set (@TODO), which limits real-world exposure.

Fix: Do not unconditionally force CPUID_VAES/CPUID_AVX512 in the SGX hard-set path. Gate them behind a build macro (e.g. only when the integrator asserts the target CPU has them), or omit them so SGX continues to use the universally-available AES-NI/PCLMUL paths. If the intent was only to benchmark the new SIV asm, scope the addition behind WOLFSSL_AESGCM_SIV plus an explicit opt-in.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put #ifdef protection around two new flags being ORed in.

Comment thread .github/workflows/aesgcm-siv.yml

@dgarske dgarske left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

macOS/clang build break:

--enable-intelasm --enable-aesgcm-siv fails to compile on an Intel MacBookPro:

aes.c:885: error: unused function 'AesEcbEncryptBlocks' [-Werror,-Wunused-function]
aes.c:912: error: unused function 'AesEcbDecryptBlocks'
aes.c:997: error: unused function 'AesCtrEncryptBlocks'

These three new static helpers' definitions are compiled in the AES-NI config, but their call sites (lines 7796/15153/15252) are behind narrower #if guards, so without AVX512/AES-ECB they're defined-but-unused. gcc tolerates unused static inline; clang -Werror doesn't, which is why all the Linux x86 hosts passed and only the Mac caught it.

Fix: align the guards (only define each helper when a caller is compiled) or mark them unused.

Added assembly for Intel x64, ARM64, ARM32, Thumb2.
@SparkiDev

Copy link
Copy Markdown
Contributor Author

Add #ifdefs for the three functions.

@SparkiDev

SparkiDev commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Jenkins: retest this please

FIPS aborted
FIPS-read failed x2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants