M487: Support ECP H/W accelerator #5812

ccli8 · 2018-01-09T09:23:20Z

Description

This PR includes support for cryptography ECP H/W accelerator.

ciarmcom · 2018-01-09T16:50:23Z

ARM Internal Ref: IOTSSL-2000

hanno-becker · 2018-01-18T12:48:56Z

@0xc0170 @ccli8 I can't find the specification of the ECP H/W accelerator - can you help?

hanno-becker · 2018-01-18T14:01:03Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+
+        /* R = m*P = (multiple of order)*G = 0 */
+        /* NOTE: If grp->N (order) is a prime, we could detect R = 0 for all m*P cases
+         *       by just checking if m is a multiple of grp->N. Otherwise, sigh. */


The order grp->N is not the order of the entire elliptic curve, but the order of the subgroup <G> spanned by ecp->G, which is assumed to be prime. Therefore the check is fine and exhausts R=0 completely, provided one makes the assumption that P lies in <G> -- whereas without this assumption, the check wouldn't be accurate in the first place. I think P belonging to <G> is a reasonable yet undocumented assumption the default S/W ECC code also makes at times, but I don't remember details - ping @mpg? Continuing on this assumption, the case of order 2, i.e. P = (-P) but P != 0, cannot occur.

What happens to the H/W accelerator if R=0 is not covered beforehand? It seems that it's not capable of returning non-affine points, or is it?

For the short Weierstrass case, for all curves we currently support, <G> is the whole curve, and mbedtls_ecp_check_pubkey() (called at the beginning of mbedtls_ecp_mul() and hopefully all other public functions that receive both secret input and some untrusted input) ensures that the point lies on the curve, hence that it belongs to <G>. If we ever added support for short Weierstrass curves with non-trivial cofactor, we would need to adapt mbedtls_ecp_check_pubkey() so that it keeps checking that the point is a multiple of G.

Also, mbedtls_ecp_check_privkey() ensures that m is in the range [1, N) so together with the previous check on P, this ensures that m * P != 0. Note however that this protection only applies to the scalar multiplication part: when doing the final addition in R = m * P + n * Q you still need to be prepared for the R == 0 case.
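The argument above can be checked on a toy example. The sketch below is not Mbed TLS code: it assumes the textbook curve y^2 = x^3 + 2x + 2 over F_17 with base point G = (5, 1), whose group has prime order 19 and cofactor 1, and every `toy_` name is hypothetical. It shows that m*G is the point at infinity exactly when m is a multiple of the order, and that a final addition can still produce zero even when both scalars are in [1, N).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy parameters (assumed for illustration only, far too small for real use). */
#define TOY_P 17   /* field prime */
#define TOY_A 2    /* curve coefficient a in y^2 = x^3 + a*x + b */
#define TOY_N 19   /* prime order of the group generated by G = (5, 1) */

typedef struct { int64_t x, y; bool inf; } toy_point;

/* Reduce v into [0, TOY_P). */
static int64_t toy_mod(int64_t v) { v %= TOY_P; return v < 0 ? v + TOY_P : v; }

/* Modular inverse via Fermat's little theorem: v^(p-2) mod p. */
static int64_t toy_inv(int64_t v)
{
    int64_t r = 1, b = toy_mod(v);
    for (int e = TOY_P - 2; e > 0; e >>= 1) {
        if (e & 1) r = toy_mod(r * b);
        b = toy_mod(b * b);
    }
    return r;
}

/* Affine addition with an explicit flag for the point at infinity. */
static toy_point toy_add(toy_point p, toy_point q)
{
    toy_point r = { 0, 0, true };
    int64_t l;
    if (p.inf) return q;
    if (q.inf) return p;
    if (p.x == q.x && toy_mod(p.y + q.y) == 0) return r;   /* P + (-P) = 0 */
    if (p.x == q.x)
        l = toy_mod((3 * p.x * p.x + TOY_A) * toy_inv(2 * p.y));  /* doubling */
    else
        l = toy_mod((q.y - p.y) * toy_inv(q.x - p.x));            /* addition */
    r.x = toy_mod(l * l - p.x - q.x);
    r.y = toy_mod(l * (p.x - r.x) - p.y);
    r.inf = false;
    return r;
}

/* Plain double-and-add scalar multiplication. */
static toy_point toy_mul(uint32_t m, toy_point p)
{
    toy_point r = { 0, 0, true };
    while (m) {
        if (m & 1) r = toy_add(r, p);
        p = toy_add(p, p);
        m >>= 1;
    }
    return r;
}

/* The filter discussed in the review: with a prime-order group and P in <G>,
 * m*P is zero exactly when m is a multiple of the order. */
static bool toy_scalar_hits_zero(uint32_t m) { return m % TOY_N == 0; }
```

In this toy setting, 7*G + 12*G is the point at infinity although 7 and 12 both lie in [1, 19), which is the R = m*P + n*Q caveat mentioned above.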

What happens to the H/W accelerator if R=0 is not covered beforehand? It seems that it's not capable of returning non-affine points, or is it?

@hanno-arm If R = 0, the H/W accelerator will hang and its finish/error flag won't be set. That's why I must filter out all R = 0 cases.

hanno-becker · 2018-01-18T14:13:12Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+                            mbedtls_ecp_point *R,
+                            const mbedtls_mpi *m,
+                            const mbedtls_ecp_point *P,
+                            const mbedtls_mpi *n,


The parameter n is unused.

@hanno-arm The parameter n is kept to be consistent with R = m*P + n*Q. I added a comment for it in 1728037.

ccli8 · 2018-01-19T02:00:48Z

I can't find the specification of the ECP H/W accelerator

@hanno-arm Nuvoton's M480 series is not released yet (about 3/E). Do you have a strong need for ECP H/W spec for the review?
@0xc0170

1. Add comment for unnecessary parameter 'n' in mbedtls_internal_run_eccop
2. Fix warning message with goto which causes `bypass initialization`
3. Fix comment

hanno-becker · 2018-02-02T14:13:23Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+ *                  n is kept with unused modifier.
+ *                  
+ */
+int mbedtls_internal_run_eccop(const mbedtls_ecp_group *grp,


If it's meant to be an internal function, please make it static and remove the mbedtls_ prefix.

@hanno-arm I removed the mbedtls_ prefix in df76e29 for internal functions. For Nuvoton's self-test on the ECP alter. module, I added a macro NU_CRYPTO_SELF_TEST to conditionally remove the static modifier in 2525352.
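One way such a conditional `static` could look is sketched below. Only NU_CRYPTO_SELF_TEST comes from the reply above; the NU_STATIC helper macro and the function are hypothetical and are not taken from commit 2525352.

```c
#include <stdint.h>

/* NU_CRYPTO_SELF_TEST is the macro named in the thread.
 * NU_STATIC is a hypothetical helper: it expands to nothing when the
 * self-test build needs external linkage, and to `static` otherwise. */
#if defined(NU_CRYPTO_SELF_TEST)
#define NU_STATIC
#else
#define NU_STATIC static
#endif

/* Hypothetical internal function: no mbedtls_ prefix, file-local by default. */
NU_STATIC int internal_run_sketch(uint32_t op)
{
    return (op == 0) ? -1 : 0;   /* reject an invalid (zero) operation code */
}
```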

hanno-becker · 2018-02-02T14:31:08Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+         MBEDTLS_INTERNAL_MPI_NORM(&Np, *o2, N_, *p);
+        MBEDTLS_MPI_CHK(mbedtls_internal_mpi_write_eccreg(Np, (uint32_t *) CRPT->ECC_Y1, NU_ECC_BIGNUM_MAXWORD));
+    } else if (modop == MODOP_DIV) {
+        MBEDTLS_INTERNAL_MPI_NORM(&Np, *o2, N_, *p);


What happens if the divisor is 0? Do we need to check for it or will we get a graceful failure?

@hanno-arm I added a zero divisor check in c9cc357.
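A minimal sketch of such a guard, using machine words instead of MPIs: the divisor is normalized first, so a divisor that is a non-zero multiple of the modulus is rejected as well. The function name and error code are hypothetical, the modulus is assumed to be prime, and the actual check in c9cc357 is not shown in this thread.

```c
#include <stdint.h>

#define TOY_ERR_BAD_INPUT (-1)   /* hypothetical error code */

/* Computes a / b mod p for a prime p, refusing a divisor congruent to 0. */
static int toy_mod_div(uint32_t a, uint32_t b, uint32_t p, uint32_t *out)
{
    uint64_t inv = 1, base;
    uint32_t e;
    uint32_t bn = b % p;                 /* normalize the divisor first */

    if (bn == 0)
        return TOY_ERR_BAD_INPUT;        /* never start the operation */

    /* Inverse via Fermat's little theorem: bn^(p-2) mod p. */
    base = bn;
    for (e = p - 2; e > 0; e >>= 1) {
        if (e & 1) inv = inv * base % p;
        base = base * base % p;
    }
    *out = (uint32_t) ((uint64_t) (a % p) * inv % p);
    return 0;
}
```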

hanno-becker · 2018-02-02T14:37:31Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+     *
+     * N_: Holds normalized MPI if the passed-in MPI N1 is not
+     * Np: Pointer to normalized MPI, which could be N1 or N_
+     */


I don't see the benefit of this macro compared to performing a modular reduction on the respective variable in-place. On the functional side it's the same, on the performance side it has the chance to be less costly because it is not enforced that the bignum module will need to make a copy, and it would be more readable (for my taste). Could you elaborate why you chose this path, or change MBEDTLS_INTERNAL_MPI_NORM(a,b,c,d) to MBEDTLS_INTERNAL_MPI_NORM(x,N) performing a modular reduction on x mod N in-place if necessary?

Looking at it, I think you can even reduce this to a call to mbedtls_mpi_mod_mpi( x, x, N ) everywhere, as the latter function checks if already x < N before performing any expensive operations.

Oh, I see, many MPIs are unfortunately declared const in the fixed API, and we are not allowed to perform modular reduction in-place. That makes the necessity and motivation for the macro clearer.
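The pattern the macro implements can be sketched with plain integers: the input is const, so an out-of-range value is reduced into caller-provided scratch storage, and a pointer to whichever holds the normalized value is returned. This is an illustration of the idea behind Np and N_ only; `toy_norm` is a hypothetical name and not the macro's actual expansion.

```c
#include <stdint.h>

/* Returns a pointer to a value in [0, n): the caller's const input when it is
 * already in range (no copy), otherwise the reduced value stored in scratch. */
static const uint32_t *toy_norm(const uint32_t *x, uint32_t n, uint32_t *scratch)
{
    if (*x < n)
        return x;            /* plays the role of Np pointing at N1 */
    *scratch = *x % n;       /* plays the role of N_ holding the reduced value */
    return scratch;          /* plays the role of Np pointing at N_ */
}
```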

hanno-becker · 2018-02-02T14:39:17Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+    CRPT->ECC_CTL = (pbits << CRPT_ECC_CTL_CURVEM_Pos) | (ECCOP_MODULE | modop) | CRPT_ECC_CTL_FSEL_Msk | CRPT_ECC_CTL_START_Msk;
+    ecc_done = crypto_ecc_wait();
+
+    /* FIXME: Better error code for ECC accelerator error */


An error code for HW acceleration has recently been introduced in Mbed-TLS/mbedtls#1303, but it's not yet in Mbed OS.

@hanno-arm OK. I leave it unchanged currently.
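For illustration, a wait routine that maps both an error flag and a timeout to one dedicated error code might look as follows. The flag layout, the bounded poll count, and the error code are all assumptions made for this sketch; the real crypto_ecc_wait() and the error code introduced in Mbed-TLS/mbedtls#1303 are not shown in this thread.

```c
#include <stdint.h>

#define TOY_ERR_HW_ACCEL_FAILED (-2)   /* hypothetical error code */
#define TOY_DONE_FLAG   0x1u           /* hypothetical "finished" bit */
#define TOY_ERROR_FLAG  0x2u           /* hypothetical "error" bit */

/* Polls a status word a bounded number of times, so that an accelerator
 * that never raises a flag cannot stall the caller forever. */
static int toy_ecc_wait(const volatile uint32_t *status, uint32_t max_polls)
{
    while (max_polls--) {
        uint32_t s = *status;
        if (s & TOY_ERROR_FLAG)
            return TOY_ERR_HW_ACCEL_FAILED;
        if (s & TOY_DONE_FLAG)
            return 0;
    }
    return TOY_ERR_HW_ACCEL_FAILED;    /* timed out without any flag */
}
```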

hanno-becker

The code in this PR looks good to me. However, it is not entirely clear at which level the PR seeks to provide the alternative ECP implementation. I will go into detail now, recalling first the two ways of providing alternative ECP implementations:

1. Provide a full re-implementation of the entire ECP module.
This is enabled with MBEDTLS_ECP_ALT and mandates re-implementing the entire ECP API. This option is usually suitable if the accelerator provides high-level ECP functionality like point multiplication.
2. Provide a per-function re-implementation of parts of the ECP module.
This option is what the PR currently uses: a set of functions is picked in the config for re-implementation, and only those need to be provided. These functions are the low-level building blocks at the heart of the ECP comb multiplication method: mixed addition, point doubling, modular arithmetic. This kind of replacement can be used to speed up the building blocks while keeping the overall multiplication strategy.
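As a rough configuration sketch, the two paths correspond to different sets of Mbed TLS config macros. Which per-function macros this PR actually enables is not shown in the thread, so the selection below is only an assumed example.

```c
/* Path 1: full-module replacement. The port supplies the whole ECP API.
 * Left commented out here because this sketch selects path 2. */
/* #define MBEDTLS_ECP_ALT */

/* Path 2: per-function replacement of low-level building blocks. */
#define MBEDTLS_ECP_INTERNAL_ALT
#define MBEDTLS_ECP_ADD_MIXED_ALT       /* mixed addition */
#define MBEDTLS_ECP_DOUBLE_JAC_ALT      /* point doubling in Jacobian coordinates */
#define MBEDTLS_ECP_NORMALIZE_JAC_ALT   /* Jacobian-to-affine normalization */
```

With path 2, the default S/W implementation still drives the overall multiplication and only calls into the replaced building blocks.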

This PR in a sense does both: It provides the main high-level entry point to the ECP module, namely, mbedtls_ecp_mul_jac, but after all this function is never called or used! It's because for overwriting this function, you would need a full replacement of the ECP module, i.e. use alternative (1). But the PR does also provide the low-level replacements for doubling, mixed addition etc., which as mentioned looks good and seems to be working.

I am neither approving nor disapproving at this stage, but only want to emphasize that currently it is not clear which path this PR wants to take: The accelerator is capable of accelerating the high-level point multiplication, and code for this is provided, but it is not used.

@ccli8 Could you clarify the path you want to take, and adapt the PR accordingly? This means that if you want full replacement, the fine-grained rewriting is not necessary, and if you want fine-grained replacement, then the high-level mbedtls_internal_ecp_mul_jac should be removed.

MbedTLS doesn't support point multiplication for MBEDTLS_ECP_INTERNAL_ALT acceleration configuration.

ccli8 · 2018-02-06T03:45:55Z

@hanno-arm Actually, I would like to go with alternative (2) with point multiplication accelerated. ecp.c has been thoroughly tested, so I can rely on it and focus on the accelerator logic. In my tests with alternative (2) on Nuvoton's ECP accelerator, point multiplication speeds up by about 30x to 40x, whereas point addition and point doubling each speed up by only about 2x. I would much appreciate it if you could also open a point multiplication acceleration interface in alternative (2). For now, I have removed mbedtls_internal_ecp_mul in 95d4110.

hanno-becker · 2018-02-07T10:43:36Z

@ccli8 Please be aware that if your accelerator provides the entire ECP multiplication routine, there's no need for writing acceleration code for the low-level subroutines - these are only used by our default S/W implementation of ECP multiplication. So if high-level acceleration is what you want in the end, you should be able to provide a rather small full-module replacement of ECP without too much work: apart from numerous formatting functions like mbedtls_ecp_tls_read_group that you can copy over, the only public functions related to EC arithmetic are mbedtls_ecp_mul and mbedtls_ecp_muladd, and you have direct acceleration for these primitives it seems.
It is unlikely that there will be functionality for replacing mbedtls_internal_ecp_mul only, as you requested - instead, full-module replacement is what should be used in this case.

ccli8 · 2018-02-08T01:58:57Z

@hanno-arm OK. Please let the PR keep going. I'll consider full-module replacement. If it comes out, I'll send another PR to replace current one.

hanno-becker · 2018-02-09T16:03:17Z

targets/TARGET_NUVOTON/TARGET_M480/crypto/crypto-misc.c

+/* Implementation that should never be optimized out by the compiler */
+void crypto_zeroize32(uint32_t *v, size_t n)
+{
+    volatile uint32_t *p = (uint32_t*) v;


This should probably be (uint32_t volatile *) v.

@hanno-arm I fixed it in cfdc72d.
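A minimal sketch of the suggested form, in which the cast itself carries the volatile qualifier so that every store goes through a volatile-qualified lvalue. The function name is hypothetical and the code in cfdc72d may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Zeroes n 32-bit words through a volatile-qualified pointer, a common way
 * to discourage the compiler from eliding the stores as dead writes. */
void zeroize32_sketch(uint32_t *v, size_t n)
{
    volatile uint32_t *p = (volatile uint32_t *) v;
    while (n--) {
        *p++ = 0;
    }
}
```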

hanno-becker · 2018-02-09T17:12:52Z

features/mbedtls/targets/TARGET_NUVOTON/TARGET_M480/ecp/ecp_internal_alt.c

+     *
+     * crypto_zeroize32() has excluded optimization doubt, so we can safely set H/W registers to 0 via it.
+     */
+    crypto_zeroize32((uint32_t *) eccreg + i, eccreg_num - i);


Could you change this to n instead of i, please?

@hanno-arm I fixed it in 03f0ea1.
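The request can be illustrated with a simplified register write: once n words have been copied, the tail to clear starts at n, and saying so directly does not rely on the loop counter's final value. The function below is a hypothetical stand-in, not the PR's internal_mpi_write_eccreg.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies n source words into a register bank of reg_words words and clears
 * the remaining words, so stale data from an earlier operation cannot leak
 * into the next one. */
static void toy_write_reg(volatile uint32_t *reg, size_t reg_words,
                          const uint32_t *src, size_t n)
{
    size_t i;
    if (n > reg_words)
        n = reg_words;                 /* clamp; real code would report an error */
    for (i = 0; i < n; i++)
        reg[i] = src[i];
    for (i = n; i < reg_words; i++)    /* tail indexed from n, not from the counter */
        reg[i] = 0;
}
```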

hanno-becker

Only two minor suggestions left - overall the PR looks good to me. Usual caveat: I haven't done any on-target testing.

hanno-becker · 2018-02-09T17:56:45Z

@mazimkhan Could you give this a final look, too?

hanno-becker · 2018-02-19T16:58:12Z

@cmonr @0xc0170 Ready to go.

0xc0170 · 2018-02-20T15:30:21Z

/morph build

mbed-ci · 2018-02-20T16:16:58Z

Build : SUCCESS

Build number : 1184
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/5812/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

mbed-ci · 2018-02-20T17:38:05Z

Test : SUCCESS

Build number : 986
Test logs : http://mbed-os-logs.s3-website-us-west-1.amazonaws.com/?prefix=logs/5812/986

mbed-ci · 2018-02-20T17:50:24Z

Exporter Build : SUCCESS

Build number : 861
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/exporter/5812/

[M487] Support ECP H/W accelerator

a687504

0xc0170 requested review from mazimkhan and hanno-becker January 9, 2018 13:25

0xc0170 added tracking component: security labels Jan 9, 2018

ciarmcom added the mirrored label Jan 9, 2018

hanno-becker reviewed Jan 18, 2018


ccli8 added 2 commits January 22, 2018 10:51

[NUC472/M487] Fix warning in crypto

160f75d

[M487] Refine code in ECP alter.

1728037

1. Add comment for unnecessary parameter 'n' in mbedtls_internal_run_eccop
2. Fix warning message with goto which causes `bypass initialization`
3. Fix comment

cmonr added the needs: work label Feb 1, 2018

hanno-becker reviewed Feb 2, 2018


hanno-becker reviewed Feb 5, 2018


ccli8 added 4 commits February 6, 2018 09:30

[M487] Check divisor is not zero in MODOP_DIV operation in ECP alter.

c9cc357

[M487] Remove mbedtls prefix for internal functions in ECP alter.

df76e29

[M487] Remove mbedtls_internal_ecp_mul in ECP alter.

95d4110

MbedTLS doesn't support point multiplication for MBEDTLS_ECP_INTERNAL_ALT acceleration configuration.

[M487] Support internal self-test for ECP alter.

2525352

hanno-becker reviewed Feb 9, 2018


hanno-becker approved these changes Feb 9, 2018


ccli8 added 2 commits February 12, 2018 14:04

[NUC472/M487] Refine crypto_zeroize/crypto_zeroize32

cfdc72d

[M487] Refine internal_mpi_write_eccreg in ECP alter.

03f0ea1

cmonr added needs: review and removed needs: work labels Feb 12, 2018

mazimkhan approved these changes Feb 19, 2018


0xc0170 added needs: CI and removed needs: review labels Feb 20, 2018

cmonr merged commit 817f9a5 into ARMmbed:master Feb 20, 2018

cmonr added release-version: 5.7.6 and removed needs: CI labels Feb 20, 2018

ciarmcom added the closed_in_jira label Feb 20, 2018

ccli8 deleted the nuvoton_crypto branch February 21, 2018 01:45
