[PATCH] Annotate non-zero-terminated character arrays

nettle@gms.tf

3 Apr 2026

4:57 p.m.

From: Georg Sauthoff mail@gms.tf

diff --git a/aclocal.m4 b/aclocal.m4
index 73bf0cfb..9f943b45 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -73,6 +73,13 @@ AH_BOTTOM(
 # define PRINTF_STYLE(f, a)
 # define UNUSED
 #endif
+#ifdef __has_attribute
+# if __has_attribute(nonstring)
+# define NONSTRING __attribute__((__nonstring__))
+# else
+# define NONSTRING
+# endif
+#endif
 ])])

 # Check for alloca, and include the standard blurb in config.h
diff --git a/base16-encode.c b/base16-encode.c
index 9c7f0b1e..c6a4c6b9 100644
--- a/base16-encode.c
+++ b/base16-encode.c
@@ -39,7 +39,7 @@

 static const uint8_t
-hex_digits[16] = "0123456789abcdef";
+hex_digits[16] NONSTRING = "0123456789abcdef";

 #define DIGIT(x) (hex_digits[(x) & 0xf])

diff --git a/base64-encode.c b/base64-encode.c
index ee1ec149..6bb55782 100644
--- a/base64-encode.c
+++ b/base64-encode.c
@@ -83,7 +83,7 @@ encode_raw(const char *alphabet,
   assert(out == dst);
 }

-static const char base64_encode_table[64] =
+static const char base64_encode_table[64] NONSTRING =
   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   "abcdefghijklmnopqrstuvwxyz"
   "0123456789+/";
diff --git a/base64url-encode.c b/base64url-encode.c
index d30044ea..2b78a800 100644
--- a/base64url-encode.c
+++ b/base64url-encode.c
@@ -38,7 +38,7 @@
 void
 base64url_encode_init(struct base64_encode_ctx *ctx)
 {
-  static const char base64url_encode_table[64] =
+  static const char base64url_encode_table[64] NONSTRING =
     "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
     "abcdefghijklmnopqrstuvwxyz"
     "0123456789-_";
diff --git a/blowfish-bcrypt.c b/blowfish-bcrypt.c
index 385503ac..0326f1ff 100644
--- a/blowfish-bcrypt.c
+++ b/blowfish-bcrypt.c
@@ -70,7 +70,7 @@ static const signed char radix64_decode_table[0x100] = {
   -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
 };

-static const char radix64_encode_table[64] =
+static const char radix64_encode_table[64] NONSTRING =
   "./ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   "abcdefghijklmnopqrstuvwxyz"
   "0123456789";
diff --git a/tools/pkcs1-conv.c b/tools/pkcs1-conv.c
index f6b044d2..68bb67d1 100644
--- a/tools/pkcs1-conv.c
+++ b/tools/pkcs1-conv.c
@@ -117,13 +117,13 @@ read_file(struct nettle_buffer *buffer, FILE *f)
 }

 static const uint8_t
-pem_start_pattern[11] = "-----BEGIN ";
+pem_start_pattern[11] NONSTRING = "-----BEGIN ";

 static const uint8_t
-pem_end_pattern[9] = "-----END ";
+pem_end_pattern[9] NONSTRING = "-----END ";

 static const uint8_t
-pem_trailer_pattern[5] = "-----";
+pem_trailer_pattern[5] NONSTRING = "-----";

 static const char
 pem_ws[33] = {

-- 2.53.0
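
For readers who have not met the attribute before, here is a minimal sketch of how an annotated table is used. The extra #ifndef fallback and the hex_encode helper are assumptions added for illustration; neither is part of the patch, and hex_encode is not Nettle's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the macro in the patch, plus an explicit fallback
   (an assumption of this sketch) so NONSTRING is always defined. */
#ifdef __has_attribute
# if __has_attribute(__nonstring__)
#  define NONSTRING __attribute__((__nonstring__))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

/* Exactly 16 bytes: the literal's terminating NUL is dropped on purpose. */
static const uint8_t
hex_digits[16] NONSTRING = "0123456789abcdef";

static_assert(sizeof hex_digits == 16, "table holds digits only");

/* Hypothetical helper: writes 2 * length hex characters to dst,
   without a terminating NUL. */
static void
hex_encode(char *dst, const uint8_t *src, size_t length)
{
  size_t i;
  for (i = 0; i < length; i++)
    {
      dst[2*i] = (char) hex_digits[src[i] >> 4];
      dst[2*i + 1] = (char) hex_digits[src[i] & 0xf];
    }
}
```

The table is only ever indexed, never passed to a string function, which is the property the annotation documents.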

Niels Möller

12 Apr

8:54 p.m.

nettle@gms.tf writes:

...

This fixes -Wunterminated-string-initialization warnings with gcc 15.2.1.

Hmm, I'm not sure it's worth the effort to add annotations to this. I see two alternatives:

1. Simply add the trailing NUL byte, e.g., instead of

...

 static const uint8_t
-hex_digits[16] = "0123456789abcdef";
+hex_digits[16] NONSTRING = "0123456789abcdef";

change it to

hex_digits[17] = "0123456789abcdef";

or just

hex_digits = "0123456789abcdef";

(will need code adjustments if sizeof is applied to affected string constants).

2. Disable this warning.
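
The sizeof caveat under alternative 1 can be made concrete with a small sketch. The table and function names below are hypothetical, and NONSTRING is assumed to be defined as in the patch (with an empty fallback) so that the exact-size table compiles quietly.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: NONSTRING as in the patch, with an empty fallback. */
#ifdef __has_attribute
# if __has_attribute(__nonstring__)
#  define NONSTRING __attribute__((__nonstring__))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

/* The three spellings: exact size, one extra byte, size left to the
   compiler. Only the first has no room for the terminating NUL. */
static const char hex_exact[16] NONSTRING = "0123456789abcdef";
static const char hex_padded[17] = "0123456789abcdef";
static const char hex_unsized[] = "0123456789abcdef";

static_assert(sizeof hex_exact == 16, "no terminator stored");
static_assert(sizeof hex_padded == 17, "terminator stored");
static_assert(sizeof hex_unsized == 17, "terminator stored");

/* Number of usable digits in each table: code that used plain sizeof
   has to subtract one after switching to the padded or unsized form. */
static size_t exact_count(void)   { return sizeof hex_exact; }
static size_t padded_count(void)  { return sizeof hex_padded - 1; }
static size_t unsized_count(void) { return sizeof hex_unsized - 1; }
```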

Opinions?

Regards, /Niels

-- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance.

Simon Josefsson

13 Apr

7:24 a.m.

New subject: Annotate non-zero-terminated character arrays

Niels Möller nisse@lysator.liu.se writes:

...

hex_digits[17] = "0123456789abcdef";

That looks ugly but +1, IMHO.

IIRC, I've seen it used like this too:

hex_digits[17] = "0123456789abcdef\0";

A less idiomatic but more tidy approach would be

hex_digits[16] = { '0', '1', '2', ... 'e', 'f' };

I'm hoping no compiler complains about missing ASCII NUL in a "string" defined that way.

Several base64 implementations use the last approach, but mostly for EBCDIC compatibility rather than to pacify false positive compiler warnings. Does Nettle care about EBCDIC targets? Gnulib's base64 code has the snippet below, but it assumes 'char' is 8-bit.

/Simon

/* With this approach this file works independent of the charset used
   (think EBCDIC). However, it does assume that the characters in the
   Base64 alphabet (A-Za-z0-9+/) are encoded in 0..255. POSIX
   1003.1-2001 require that char and unsigned char are 8-bit
   quantities, though, taking care of that problem. But this may be a
   potential problem on non-POSIX C99 platforms.

   IBM C V6 for AIX mishandles "#define B64(x) ...'x'...", so use "_"
   as the formal parameter rather than "x". */
#define B64(_) \
  ((_) == 'A' ? 0 \
   : (_) == 'B' ? 1 \
   : (_) == 'C' ? 2 \
   : (_) == 'D' ? 3 \
   ...
   : (_) == '8' ? 60 \
   : (_) == '9' ? 61 \
   : (_) == '+' ? 62 \
   : (_) == '/' ? 63 \
   : -1)

signed char const base64_to_int[256] = {
  B64 (0), B64 (1), B64 (2), B64 (3),
  ...
  B64 (252), B64 (253), B64 (254), B64 (255)
};
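
For comparison, a charset-independent decode table can also be filled in at run time from the alphabet itself. This is a different technique from the gnulib macro above: it gives up the const, compile-time table in exchange for an init call. The names below are hypothetical, and like the gnulib code the sketch assumes an 8-bit char.

```c
#include <assert.h>
#include <stddef.h>

/* Written with character literals only, so the values follow whatever
   execution character set the compiler targets. */
static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  "abcdefghijklmnopqrstuvwxyz"
  "0123456789+/";

static_assert(sizeof b64_alphabet - 1 == 64, "alphabet has 64 symbols");

/* Filled in by b64_table_init; -1 marks bytes outside the alphabet.
   Assumption: unsigned char values fit in 0..255. */
static signed char b64_to_int[256];

static void
b64_table_init(void)
{
  size_t i;
  for (i = 0; i < 256; i++)
    b64_to_int[i] = -1;
  /* Index by the character's own code, not by a hard-coded ASCII value. */
  for (i = 0; i < sizeof b64_alphabet - 1; i++)
    b64_to_int[(unsigned char) b64_alphabet[i]] = (signed char) i;
}
```

The cost is that the table is writable and must be initialized before first use, which is presumably why a library would prefer the constant form.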

Georg Sauthoff

18 Apr

11:28 a.m.

Hello,

On Mon, Apr 13, 2026 at 09:24:17AM +0200, Simon Josefsson wrote: [..]

...

hex_digits[16] = { '0', '1', '2', ... 'e', 'f' };

I'm hoping no compiler complains about missing ASCII NUL in a "string" defined that way.

The compilers available on godbolt don't complain:

https://godbolt.org/z/Whb9Eh36q

...

Several base64 implementations use the last approach, but mostly for EBCDIC compatibility rather than to pacify false positive compiler warnings. Does Nettle care about EBCDIC targets? Gnulib's base64 code

FWIW, I would only worry about EBCDIC after somebody shows up who has a plausible use-case.

...

has the snippet below, but it assumes 'char' is 8-bit.

I'm sorry, but this gnulib code is horrible.

Why are they over-using the C preprocessor like this ... I mean since gnulib is configure-time checking the world anyway they could simply detect EBCDIC at configure time and conditionally guard/include full ASCII/EBCDIC arrays based on that result. Or even generating the array in M4 or some ultra-portable form of shell at configure time would be a more sensible approach than this.

...

#define B64(_) \
  ((_) == 'A' ? 0 \
   : (_) == 'B' ? 1 \
   : (_) == 'C' ? 2 \
   : (_) == 'D' ? 3 \
   ...
   : (_) == '8' ? 60 \
   : (_) == '9' ? 61 \
   : (_) == '+' ? 62 \
   : (_) == '/' ? 63 \
   : -1)

signed char const base64_to_int[256] = {
  B64 (0), B64 (1), B64 (2), B64 (3),
  ...
  B64 (252), B64 (253), B64 (254), B64 (255)
};

Best regards, Georg

-- "'=' is for the weak" (Orpie, http://pessimization.com/software/orpie/)

Simon Josefsson

11:45 a.m.

Georg Sauthoff nettle@gms.tf writes:

...

Hello,

On Mon, Apr 13, 2026 at 09:24:17AM +0200, Simon Josefsson wrote: [..]

...
hex_digits[16] = { '0', '1', '2', ... 'e', 'f' };

I'm hoping no compiler complains about missing ASCII NUL in a "string" defined that way.

The compilers available on godbolt don't complain:

https://godbolt.org/z/Whb9Eh36q

Great! I think that is a reasonable idiom, then.

...

...
Several base64 implementations use the last approach, but mostly for EBCDIC compatibility rather than to pacify false positive compiler warnings. Does Nettle care about EBCDIC targets? Gnulib's base64 code

FWIW, I would only worry about EBCDIC after somebody shows up who has a plausible use-case.

IIRC that is what happened for gnulib base64 code, though, but it was a couple of years ago this change was made. Gnulib often goes to effort to be written in a portable way, because its applicability is beyond GNU/Linux. Maybe this isn't applicable to Nettle, but Nettle is a fairly low-level library so I wouldn't be surprised if it is running in EBCDIC environments.

...

...
has the snippet below, but it assumes 'char' is 8-bit.

I'm sorry, but this gnulib code is horrible.

Why are they over-using the C preprocessor like this ... I mean since gnulib is configure-time checking the world anyway they could simply detect EBCDIC at configure time and conditionally guard/include full ASCII/EBCDIC arrays based on that result.

That would be unreliable for cross-compilation.

...

Or even generating the array in M4 or some ultra-portable form of shell at configure time would be a more sensible approach than this.

I'd say the current C code is relatively straight-forward, solves the problem, and compilers end up optimizing it properly anyway. Introducing a pre-processing step seems complicated to me.

I'm not disagreeing that the code is horrible, but I'm not aware of any reliable idiom that solves the requirements in a better way.

/Simon

...

...
#define B64(_) \
  ((_) == 'A' ? 0 \
   : (_) == 'B' ? 1 \
   : (_) == 'C' ? 2 \
   : (_) == 'D' ? 3 \
   ...
   : (_) == '8' ? 60 \
   : (_) == '9' ? 61 \
   : (_) == '+' ? 62 \
   : (_) == '/' ? 63 \
   : -1)

signed char const base64_to_int[256] = {
  B64 (0), B64 (1), B64 (2), B64 (3),
  ...
  B64 (252), B64 (253), B64 (254), B64 (255)
};

Best regards, Georg

Georg Sauthoff

11:04 a.m.

Hello,

On Sun, Apr 12, 2026 at 10:54:20PM +0200, Niels Möller wrote:

...

nettle@gms.tf writes:

...
This fixes -Wunterminated-string-initialization warnings with gcc 15.2.1.

Hmm, I'm not sure it's worth the effort to add annotations to this. I see two alternatives:

it didn't feel like much effort to me.

...

Simply add the trailing NUL byte, e.g., instead of

...
 static const uint8_t
-hex_digits[16] = "0123456789abcdef";
+hex_digits[16] NONSTRING = "0123456789abcdef";

change it to

hex_digits[17] = "0123456789abcdef";

or just

hex_digits = "0123456789abcdef";

(will need code adjustments if sizeof is applied to affected string constants).

I can see how the sizeof change could confuse and trick people, i.e. sounds like a foot-gun one would want to avoid.

Also, even wasting one byte per such array just to silence warnings feels very wrong to me. Slippery slope, embedded systems and all that.

...

Disable this warning.

Also similar effort necessary without the benefits of the annotation. The annotation clearly communicates intent, what's going on, and can even be taken into account by static analyzers.
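
To illustrate why disabling the warning is a similar amount of work, here is a sketch of a narrowly scoped suppression around one table. The version guard is an assumption of this sketch: it restricts the pragma to gcc 15 and later, where the option is known to exist, so that other compilers do not see an unknown option name. The hex_digit accessor is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: only gcc >= 15 is known to accept this option name. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 15
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Wunterminated-string-initialization"
#endif

/* No annotation here: the intent is visible only through the pragma. */
static const uint8_t
hex_digits[16] = "0123456789abcdef";

#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 15
# pragma GCC diagnostic pop
#endif

/* Hypothetical accessor, for illustration only. */
static uint8_t
hex_digit(unsigned value)
{
  assert(value < sizeof hex_digits);
  return hex_digits[value];
}
```

Each affected table needs its own push/pop pair (or the warning goes away for the whole file), and nothing on the declaration itself tells a reader or an analyzer that the array is not a string.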

...

Opinions?

My order of preference is:

a) annotation
b) array syntax (cf. Simon's reply)

The disadvantage of the array syntax ( `{ '0', '1', ... }` ) is that it somewhat increases compiler parsing time, though that is hardly measurable. It also increases the source file size slightly.

Best regards, Georg

Simon Josefsson

11:53 a.m.

Georg Sauthoff nettle@gms.tf writes:

...

...

Simply add the trailing NUL byte, e.g., instead of

...
 static const uint8_t
-hex_digits[16] = "0123456789abcdef";
+hex_digits[16] NONSTRING = "0123456789abcdef";

change it to

hex_digits[17] = "0123456789abcdef";

or just

hex_digits = "0123456789abcdef";

(will need code adjustments if sizeof is applied to affected string constants).

I can see how the sizeof change could confuse and trick people, i.e. sounds like a foot-gun one would want to avoid.

Indeed -- and I think using sizeof on C string arrays declared with hard-coded magical values can be hard to read. One way around that would be:

#define HEX_ALPHABET_LEN 16
const uint8_t hex_digits[HEX_ALPHABET_LEN] = { '0', ... 'f' };

and to replace all uses of 'sizeof hex_digits' with HEX_ALPHABET_LEN.
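
Spelled out in full, the idea might look like the sketch below. The hex_value helper is hypothetical and only there to show the named constant replacing sizeof in a loop bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEX_ALPHABET_LEN 16

/* Brace-initialized, so there is no string literal and no terminator
   for the compiler to warn about. */
static const uint8_t hex_digits[HEX_ALPHABET_LEN] = {
  '0', '1', '2', '3', '4', '5', '6', '7',
  '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'
};

static_assert(sizeof hex_digits == HEX_ALPHABET_LEN, "table size");

/* Hypothetical helper: value of a lowercase hex digit, or -1 if c is
   not in the table. */
static int
hex_value(uint8_t c)
{
  size_t i;
  for (i = 0; i < HEX_ALPHABET_LEN; i++)
    if (hex_digits[i] == c)
      return (int) i;
  return -1;
}
```

Because the table has no seventeenth element, a NUL byte is correctly reported as not being a hex digit.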

/Simon

Niels Möller

8 May

2:02 p.m.

Georg Sauthoff nettle@gms.tf writes:

...

My order of preference is:

a) annotation
b) array syntax (cf. Simon's reply)

I'm looking at this again. array syntax is reasonable for the hex_digits case, but is a bit unwieldy for the other cases.

When was __has_attribute introduced? If we use that for __attribute__ ((nonstring)), I guess we can use that for other attributes too, and delete the configure test that sets HAVE_GCC_ATTRIBUTE?

Documented here: https://gcc.gnu.org/onlinedocs/cpp/_005f_005fhas_005fattribute.html.

Regards, /Niels

-- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance.

Georg Sauthoff

11 May

9:46 a.m.

Hello,

On Fri, May 08, 2026 at 04:02:59PM +0200, Niels Möller wrote:

...

Georg Sauthoff nettle@gms.tf writes:

...
My order of preference is:

a) annotation
b) array syntax (cf. Simon's reply)

I'm looking at this again. array syntax is reasonable for the hex_digits case, but is a bit unwieldy for the other cases.

When was __has_attribute introduced? If we use that for __attribute__

it was introduced with GCC 5.1 (released 2015) and Clang supports it since at least version 3.1 (released 2012).

Demo: https://godbolt.org/z/WWh34Mer7
Mention in the gcc release notes: https://gcc.gnu.org/gcc-5/changes.html
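
As a sketch of the pattern for other attributes: the test is nested inside #ifdef __has_attribute because, per the GCC documentation, the operator may only be used once that first test has succeeded. The empty fallbacks via #ifndef and the two helper functions are assumptions of this sketch, not code from Nettle.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Nested on purpose: a preprocessor without __has_attribute never
   evaluates the inner #if lines. */
#ifdef __has_attribute
# if __has_attribute(__noreturn__)
#  define NORETURN __attribute__((__noreturn__))
# endif
# if __has_attribute(__unused__)
#  define UNUSED __attribute__((__unused__))
# endif
#endif
/* Fallbacks: the macros expand to nothing when the attribute, or the
   operator itself, is unavailable. */
#ifndef NORETURN
# define NORETURN
#endif
#ifndef UNUSED
# define UNUSED
#endif

/* Hypothetical helper, for illustration only. */
static void NORETURN
die(const char *msg)
{
  assert(msg != NULL);
  fprintf(stderr, "%s\n", msg);
  exit(EXIT_FAILURE);
}

/* The flags parameter is deliberately ignored; UNUSED says so. */
static int
checked_div(int a, int b, int flags UNUSED)
{
  if (b == 0)
    die("division by zero");
  return a / b;
}
```

No configure-time compile test is involved: the decision is made by the preprocessor of whichever compiler actually builds the file.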

...

((nonstring)), I guess we can use that for other attributes too, and delete the configure test that sets HAVE_GCC_ATTRIBUTE?

Yes, definitely.

Best regards, Georg

Niels Möller

12 May

6:33 p.m.

nettle@gms.tf writes:

...

This fixes -Wunterminated-string-initialization warnings with gcc 15.2.1.

I've merged more or less these changes to the use-has_attribute branch. I replaced the old configure-time attribute check with a macro that just inserts __has_attribute conditionals in config.h.in.

I'm about to merge this to master, maybe you or someone else wants to have a look at the changes? In patch form below.

Regards, /Niels

diff --git a/ChangeLog b/ChangeLog
index 274c5392..831788e2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,31 @@
+2026-05-12 Niels Möller nisse@lysator.liu.se
+
+ Avoid gcc-15 warnings on missing NUL terminators. Based on patch
+ by Georg Sauthoff:
+ * aclocal.m4 (NETTLE_C_ATTRIBUTES): Define NONSTRING attribute.
+ * base16-encode.c (hex_digits): Declare as NONSTRING.
+ * base64-encode.c (base64_encode_table): Likewise.
+ * blowfish-bcrypt.c (radix64_encode_table): Likewise.
+ * tools/pkcs1-conv.c (pem_start_pattern, pem_end_pattern, pem_trailer_pattern): Likewise.
+
+ * nettle-types.h (_NETTLE_ATTRIBUTE_PURE): Change preprocessor
+ conditionals to use __has_attribute.
+ (_NETTLE_ATTRIBUTE_DEPRECATED): Deleted, no longer used.
+
+2026-05-11 Niels Möller nisse@lysator.liu.se
+
+ * aclocal.m4 (LSH_GCC_ATTRIBUTES): Delete macro, based on
+ AC_COMPILE_IF_ELSE, and config.h define HAVE_GCC_ATTRIBUTE.
+ Replaced by...
+ (NETTLE_C_ATTRIBUTES): Simpler macro, just adding code to config.h
+ to define CONSTRUCTOR, NORETURN, PRINTF_STYLE and UNUSED, when
+ corresponding attributes are supported according to
+ __has_attribute. The __has_attribute test was introduced in clang
+ and gcc a decade ago.
+ * configure.ac: Use NETTLE_C_ATTRIBUTES.
+ * fat-setup.h: Remove local definition of CONSTRUCTOR. Also drop
+ support for using #pragma init(...) with Sun compilers.
+
 2026-05-07 Niels Möller nisse@lysator.liu.se

 Add support for sntrup761.
diff --git a/aclocal.m4 b/aclocal.m4
index 73bf0cfb..602c5bad 100644
--- a/aclocal.m4
+++ b/aclocal.m4
@@ -38,37 +38,40 @@ AC_CACHE_VAL(lsh_cv_sys_ccpic,[
 CCPIC="$lsh_cv_sys_ccpic"
 AC_MSG_RESULT($CCPIC)])

-dnl LSH_GCC_ATTRIBUTES
-dnl Check for gcc's __attribute__ construction
-
-AC_DEFUN([LSH_GCC_ATTRIBUTES],
-[AC_CACHE_CHECK(for __attribute__,
- lsh_cv_c_attribute,
-[ AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[
-#include <stdlib.h>
-
-static void foo(void) __attribute__ ((noreturn));
-
-static void __attribute__ ((noreturn))
-foo(void)
-{
- exit(1);
-}
-]], [[]])],
- [lsh_cv_c_attribute=yes],
- [lsh_cv_c_attribute=no])])
-
-AH_TEMPLATE([HAVE_GCC_ATTRIBUTE], [Define if the compiler understands __attribute__])
-if test "x$lsh_cv_c_attribute" = "xyes"; then
- AC_DEFINE(HAVE_GCC_ATTRIBUTE)
-fi
-
-AH_BOTTOM(
-[#if __GNUC__ && HAVE_GCC_ATTRIBUTE
-# define NORETURN __attribute__ ((__noreturn__))
-# define PRINTF_STYLE(f, a) __attribute__ ((__format__ (__printf__, f, a)))
-# define UNUSED __attribute__ ((__unused__))
-#else
+dnl NETTLE_C_ATTRIBUTES
+dnl Add code to config.h checking __has_attribute for the attributes
+dnl we use.
+AC_DEFUN([NETTLE_C_ATTRIBUTES],
+[AH_BOTTOM([
+#ifdef __has_attribute
+# if __has_attribute (__constructor__)
+# define CONSTRUCTOR __attribute__ ((__constructor__))
+# else
+# define CONSTRUCTOR
+# endif
+# if __has_attribute (__nonstring__)
+# define NONSTRING __attribute__ ((__nonstring__))
+# else
+# define NONSTRING
+# endif
+# if __has_attribute (__noreturn__)
+# define NORETURN __attribute__ ((__noreturn__))
+# else
+# define NORETURN
+# endif
+# if __has_attribute (__format__)
+# define PRINTF_STYLE(f, a) __attribute__ ((__format__ (__printf__, f, a)))
+# else
+# define PRINTF_STYLE(f, a)
+# endif
+# if __has_attribute (__unused__)
+# define UNUSED __attribute__ ((__unused__))
+# else
+# define UNUSED
+# endif
+#else /* !_has_attribute */
+# define CONSTRUCTOR
+# define NONSTRING
 # define NORETURN
 # define PRINTF_STYLE(f, a)
 # define UNUSED
diff --git a/base16-encode.c b/base16-encode.c
index 9c7f0b1e..c6a4c6b9 100644
--- a/base16-encode.c
+++ b/base16-encode.c
@@ -39,7 +39,7 @@

 static const uint8_t
-hex_digits[16] = "0123456789abcdef";
+hex_digits[16] NONSTRING = "0123456789abcdef";

 #define DIGIT(x) (hex_digits[(x) & 0xf])

diff --git a/base64-encode.c b/base64-encode.c
index ee1ec149..6bb55782 100644
--- a/base64-encode.c
+++ b/base64-encode.c
@@ -83,7 +83,7 @@ encode_raw(const char *alphabet,
   assert(out == dst);
 }

-static const char radix64_encode_table[64] =
+static const char radix64_encode_table[64] NONSTRING =
   "./ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   "abcdefghijklmnopqrstuvwxyz"
   "0123456789";
diff --git a/configure.ac b/configure.ac
index af017d81..c49462cb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -215,7 +215,7 @@ if test "x$nettle_cv_c_builtin_bswap64" = "xyes" ; then
   AC_DEFINE(HAVE_BUILTIN_BSWAP64)
 fi

-LSH_GCC_ATTRIBUTES
+NETTLE_C_ATTRIBUTES

 # Check for file locking. We (AC_PROG_CC?) have already checked for
 # sys/types.h and unistd.h.
diff --git a/fat-setup.h b/fat-setup.h
index 37a600a8..3c53db41 100644
--- a/fat-setup.h
+++ b/fat-setup.h
@@ -68,15 +68,6 @@
    time. */

-#if HAVE_GCC_ATTRIBUTE
-# define CONSTRUCTOR __attribute__ ((constructor))
-#else
-# define CONSTRUCTOR
-# if defined (__sun)
-# pragma init(fat_init)
-# endif
-#endif
-
 /* Disable use of ifunc for now. Problem is, there's no guarantee that
    one can call any libc functions from the ifunc resolver. On x86 and
    x86_64, the corresponding IRELATIVE relocs are supposed to be
diff --git a/nettle-types.h b/nettle-types.h
index b4f1c1a0..08621079 100644
--- a/nettle-types.h
+++ b/nettle-types.h
@@ -39,21 +39,16 @@

 /* Attributes we want to use in installed header files, and hence can't
    rely on config.h. */
-#ifdef __GNUC__
-
-#define _NETTLE_ATTRIBUTE_PURE __attribute__((pure))
-#ifndef _NETTLE_ATTRIBUTE_DEPRECATED
-/* Variant without message is supported since gcc-3.1 or so. */
-#define _NETTLE_ATTRIBUTE_DEPRECATED __attribute__((deprecated))
+#ifdef __has_attribute
+# if __has_attribute (__pure__)
+# define _NETTLE_ATTRIBUTE_PURE __attribute__((__pure__))
+# else
+# define _NETTLE_ATTRIBUTE_PURE
+# endif
+#else
+# define _NETTLE_ATTRIBUTE_PURE
 #endif

-#else /* !__GNUC__ */
-
-#define _NETTLE_ATTRIBUTE_PURE
-#define _NETTLE_ATTRIBUTE_DEPRECATED
-
-#endif /* !__GNUC__ */
-
 #ifdef __cplusplus
 extern "C" {
 #endif
diff --git a/tools/pkcs1-conv.c b/tools/pkcs1-conv.c
index f6b044d2..68bb67d1 100644
--- a/tools/pkcs1-conv.c
+++ b/tools/pkcs1-conv.c
@@ -117,13 +117,13 @@ read_file(struct nettle_buffer *buffer, FILE *f)
 }

 static const uint8_t
-pem_start_pattern[11] = "-----BEGIN ";
+pem_start_pattern[11] NONSTRING = "-----BEGIN ";

 static const uint8_t
-pem_end_pattern[9] = "-----END ";
+pem_end_pattern[9] NONSTRING = "-----END ";

 static const uint8_t
-pem_trailer_pattern[5] = "-----";
+pem_trailer_pattern[5] NONSTRING = "-----";

 static const char
 pem_ws[33] = {

-- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance.
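
A sketch of why the PEM patterns are sized exactly: with no terminator stored, sizeof is the pattern length and can be passed straight to memcmp. The helper below is hypothetical and is not the code in tools/pkcs1-conv.c; NONSTRING is assumed to be defined as in the merged config.h, reduced here to the one attribute needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: same effect as the NONSTRING definition in config.h. */
#ifdef __has_attribute
# if __has_attribute(__nonstring__)
#  define NONSTRING __attribute__((__nonstring__))
# endif
#endif
#ifndef NONSTRING
# define NONSTRING
#endif

static const uint8_t
pem_start_pattern[11] NONSTRING = "-----BEGIN ";

/* Hypothetical helper: returns 1 if the buffer starts with the PEM
   start pattern, 0 otherwise. */
static int
starts_with_pem_begin(const uint8_t *data, size_t length)
{
  static_assert(sizeof pem_start_pattern == 11, "no room for a NUL");
  return length >= sizeof pem_start_pattern
    && memcmp(data, pem_start_pattern, sizeof pem_start_pattern) == 0;
}
```

With a 12-byte array the same comparison would need sizeof minus one everywhere, which is the adjustment discussed earlier in the thread.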

Georg Sauthoff

13 May

8:03 p.m.

Hello,

On Tue, May 12, 2026 at 08:33:31PM +0200, Niels Möller wrote:

...

I've merged more or less these changes to the use-has_attribute branch. I replaced the old configure-time attribute check with a macro that just inserts __has_attribute conditionals in config.h.in.

...

I'm about to merge this to master, maybe you or someone else wants to have a look at the changes? In patch form below.

I've reviewed the patch and it looks good to me.

Best regards, Georg

-- 'FROM:mail@example.org 221 2.7.0 Error: I can break rules, too. Goodbye.' (Postfix 2.10, 2014)

Niels Möller

17 May

7:40 p.m.

Georg Sauthoff nettle@gms.tf writes:

...

I've reviewed the patch and it looks good to me.

Thanks! Merged to the master branch now.

Regards, /Niels

-- Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677. Internet email is subject to wholesale government surveillance.

nettle-bugs@lists.lysator.liu.se

11 comments

4 participants

tags (0)

participants (4)

Georg Sauthoff
nettle＠gms.tf
Niels Möller
Simon Josefsson