Forum: War Ensemble BBS

Early History Of C

From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 05:43:10 2025

From Newsgroup: comp.lang.c

I was recently reading the old “Bell System Technical Journal” special issue on Unix <https://bitsavers.trailing-edge.com/magazines/Bell_System_Technical_Journal/BSTJ_V57N06_197807_Part_2.pdf>,
from 1978. After the better part of a decade of existence, Unix is
running on two major processor architectures: the DEC PDP-11 family,
and one or two Interdata machines. The Interdata ports seem to be
mainly research projects -- one at Bell Labs, the other at Wollongong.
As far as I can make out, just about all of the “production” uses of
Unix (inside and outside Bell Labs) are on PDP-11s.

The articles on the development of C include some interesting
historical detail. One point that stood out for me was the handling of
global variables.

When I first came across C (back in K&R days), the semantics of
duplicated global variable declarations -- overlay the allocated
storage for each allocation of a variable with the same name, so the
variable ends up being the largest size of all the declarations --
immediately reminded me of Fortran COMMON blocks. And one article
makes it clear that was a conscious decision, to try to ease
implementation of the language on non-Unix systems.

But they reckoned without the sheer human capacity to screw things up.
From Johnson and Ritchie, “Portability of C Programs and the UNIX
System”, page 2025:

Additional problems in the compilers arose from the decision to
use the local assemblers, loaders, and library editors on the host
operating systems. Surprisingly often, they were unable to handle the
code most naturally produced by the C compilers. For example, the
semantics of possibly initialized external variables in C was quite
consciously designed to be implementable in a way identical to
Fortran's COMMON blocks to guarantee its portability. It was an
unpleasant surprise to discover that the Honeywell assembler would
allow at most 61 such blocks (and hence external variables) and that
the IBM link-editor preferred to start external variables on even
4096-byte boundaries. Software limitations in the target systems
complicated the compilers and, in one case, the problems with external
variables just mentioned, forced changes in the C language itself.

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?
--- Synchronet 3.21a-Linux NewsLink 1.2

From rbowman@bowman@montana.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 07:03:08 2025

From Newsgroup: comp.lang.c

On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?

It was never a good idea but a joy of legacy code is variables were
sometimes defined in header files. I think it was gcc 10, or whatever
shipped with Debian Bullseye when gcc put its foot down and threw errors
about multiply defined variables.

-fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.
--- Synchronet 3.21a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to alt.folklore.computers,comp.lang.c on Sat Jul 26 12:44:17 2025

From Newsgroup: comp.lang.c

On 26/07/2025 09:03, rbowman wrote:

On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?

It was never a good idea but a joy of legacy code is variables were
sometimes defined in header files. I think it was gcc 10, or whatever
shipped with Debian Bullseye when gcc put its foot down and threw errors about multiply defined variables.

It was indeed gcc 10. I was one of the people campaigning for it (see <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678>).

-fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.

It is always difficult dealing with old code that was incorrect C, but
happens to work as the developer intended due to lax compilers, luck,
limits to compiler optimisation, etc. Correcting the old code comes
with its own risks too. Fortunately, many compilers - like gcc -
provide options to get the "old-style" behaviour to ease transitions.

(Code that can be built and does what was intended when built with
"-fcommon", but not with "-fno-common", breaks the requirement that
there must be "exactly one external definition for the identifier" that
has been in all C standard versions.)

--- Synchronet 3.21a-Linux NewsLink 1.2

From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 07:27:34 2025

From Newsgroup: comp.lang.c

On 7/26/25 00:03, rbowman wrote:

On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?

It was never a good idea but a joy of legacy code is variables were
sometimes defined in header files. I think it was gcc 10, or whatever
shipped with Debian Bullseye when gcc put its foot down and threw errors about multiply defined variables.

-fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.

Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
this to work. Not a common block, each global (PL/I EXTERNAL) variable is
its own section.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Lars Poulsen@lars@cleo.beagle-ears.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 20:42:09 2025

From Newsgroup: comp.lang.c

On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:

On 7/26/25 00:03, rbowman wrote:

On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?

It was never a good idea but a joy of legacy code is variables were
sometimes defined in header files. I think it was gcc 10, or whatever
shipped with Debian Bullseye when gcc put its foot down and threw errors
about multiply defined variables.

-fno-common was made the default flag. Luckily -fcommon restored the lax
behavior. That was easier than going down the rabbit hole of putting a
variable definition in multiple applications.

Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
this to work. Not a common block, each global (PL/I EXTERNAL) variable is
its own section.

Isn't that exactly the same: A named CSECT ? (C for COMMON)
It's been decades, but that is what I think I remember.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 22:22:41 2025

From Newsgroup: comp.lang.c

On Sat, 26 Jul 2025 20:42:09 -0000 (UTC), Lars Poulsen wrote:

On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:

Not a common block, each global (PL/I EXTERNAL) variable is its own
section.

Isn't that exactly the same: A named CSECT ? (C for COMMON) It's
been decades, but that is what I think I remember.

The DEC terminology was, in one form, ASECT/CSECT, then I think in a
later, generalized form, PSECT.

PSECTs had various attribute settings, one of which was “concatenated” versus “overlaid”. In the former case, multiple definitions of the
same PSECT name in different object modules had their allocations
added together by the linker to make the total size, while in the
latter, all allocations were made to start at the same address, so the
total size was that of the largest definition of that PSECT.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 15:46:10 2025

From Newsgroup: comp.lang.c

On 7/26/25 13:42, Lars Poulsen wrote:

On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:

On 7/26/25 00:03, rbowman wrote:

On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

Was the “forced change” the abandonment of Fortran-COMMON-block
semantics altogether for C globals?

It was never a good idea but a joy of legacy code is variables were
sometimes defined in header files. I think it was gcc 10, or whatever
shipped with Debian Bullseye when gcc put its foot down and threw errors about multiply defined variables.

-fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a
variable definition in multiple applications.

Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
this to work. Not a common block, each global (PL/I EXTERNAL) variable is
its own section.

Isn't that exactly the same: A named CSECT ? (C for COMMON)
It's been decades, but that is what I think I remember.

Possibly I misunderstood. FORTRAN Blank Common is one big glob, and its interpretation depends on how the programs declare it. Named COMMON
should be the same, with one linker section per common section. I
thought the first was what everyone was talking about in the context of
C, and the second is what I'm doing.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 23:07:14 2025

From Newsgroup: comp.lang.c

On Sat, 26 Jul 2025 15:46:10 -0700, Peter Flass wrote:

FORTRAN Blank Common is one big glob, and its interpretation
depends on how the programs declare it. Named COMMON should be the
same, with one linker section per common section. I thought the
first was what everyone was talking about in the context of C, and
the second is what I'm doing.

“Blank COMMON” is just a COMMON block with an implementation-defined name that is distinct from every possible name that the programmer may give to
a named COMMON block.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Andrey Tarasevich@noone@noone.net to alt.folklore.computers,comp.lang.c on Sat Jul 26 17:11:21 2025

From Newsgroup: comp.lang.c

On Fri 7/25/2025 10:43 PM, Lawrence D'Oliveiro wrote:

The articles on the development of C include some interesting
historical detail. One point that stood out for me was the handling of
global variables.

When I first came across C (back in K&R days), the semantics of
duplicated global variable declarations -- overlay the allocated
storage for each allocation of a variable with the same name, so the
variable ends up being the largest size of all the declarations -- immediately reminded me of Fortran COMMON blocks. And one article
makes it clear that was a conscious decision, to try to ease
implementation of the language on non-Unix systems.

Both C89/90 rationale and C99 rationale have entire sections on the
ref/def models, which were taken into consideration. They provide the
same reasoning for the decision made by the committee: not burdening the weaker platforms with the task of merging/cleaning-out repetitive
definitions. The responsibility to ensure that there is at most one
definition for entities with external linkage lies on the user.

On a related note, it is interesting to point out that the language
continues to staunchly stick to the same approach in its later
iterations: C99 introduced inline functions, and the definition model
for inline functions with external linkage is also strikingly different
from, say, C++. The user is required to manually choose the definition
site and provide only one `extern inline` definition for the function
(i.e. regular non-inlined body, in case the compiler decides to use one).
--
Best regards,
Andrey
--- Synchronet 3.21a-Linux NewsLink 1.2

From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 20:58:51 2025

From Newsgroup: comp.lang.c

On 7/26/25 16:07, Lawrence D'Oliveiro wrote:

On Sat, 26 Jul 2025 15:46:10 -0700, Peter Flass wrote:

FORTRAN Blank Common is one big glob, and its interpretation
depends on how the programs declare it. Named COMMON should be the
same, with one linker section per common section. I thought the
first was what everyone was talking about in the context of C, and
the second is what I'm doing.

“Blank COMMON” is just a COMMON block with an implementation-defined name that is distinct from every possible name that the programmer may give to
a named COMMON block.

I've been programming long enough to remember when it was the only kind
of common, and I seem to recall that different programs invoked by, I
think "CALL LINK" might each define it differently.
--- Synchronet 3.21a-Linux NewsLink 1.2

From rbowman@bowman@montana.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 04:49:46 2025

From Newsgroup: comp.lang.c

On Sat, 26 Jul 2025 12:44:17 +0200, David Brown wrote:

(Code that can be built and does what was intended when built with "-fcommon", but not with "-fno-common", breaks the requirement that
there must be "exactly one external definition for the identifier" that
has been in all C standard versions.)

When you're dealing with a legacy product that was being phased out, a codebase that goes back 30 years, and Gods know how many programmers of varying skills, you set the flag and take the win.

Not so easily dealt with was the idea that a signed short could handle all
the objects that would ever be in the system. We changed that to unsigned
and kept our fingers crossed no site ever hit 64k objects before the
software was retired. We got lucky.

Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
and moved on. Linux had no sense of humor.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 06:02:26 2025

From Newsgroup: comp.lang.c

On 2025-07-27, rbowman <bowman@montana.com> wrote:

Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
and moved on. Linux had no sense of humor.

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

Of course, you still have GCC to contend with, in situations when
behavior is undefined when a pointer is null, and things are optimized accordingly.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to alt.folklore.computers,comp.lang.c on Sun Jul 27 09:53:02 2025

From Newsgroup: comp.lang.c

On 27/07/2025 08:02, Kaz Kylheku wrote:

On 2025-07-27, rbowman <bowman@montana.com> wrote:

Moving from AIX to Linux was fun. AIX apparently had a special little bit
bucket to handle attempts to do something with a null address, shrugged,
and moved on. Linux had no sense of humor.

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

Of course, you still have GCC to contend with, in situations when
behavior is undefined when a pointer is null, and things are optimized accordingly.

"-fno-delete-null-pointer-checks" is your friend there.

When dealing with code like this, "-fno-strict-aliasing" and "-fwrapv"
are probably also helpful, and you may want to disable warnings on
missing prototypes, and so on.

Not all old code needs this kind of hand-holding, of course - it was
perfectly possible to write good code 30+ years ago. (I have never
written code that needs "-fcommon", or "-fwrapv".) But some code needs
it, and often it is impractical or impossible to check through the old
code base.

My preference for old code is to keep the old compiler and old makefiles
(with flag settings) used for it - that way you are unlikely to be
caught out. But that's not so easy if you are mixing code from
different projects, or old and new code.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 01:16:24 2025

From Newsgroup: comp.lang.c

Kaz Kylheku <643-408-1753@kylheku.com> writes:

On 2025-07-27, rbowman <bowman@montana.com> wrote:

Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
and moved on. Linux had no sense of humor.

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

I don't think you can.

mmap()'s first argument is an address. If the address is non-null,
it's a hint about where to place the mapping (typically at a nearby
page boundary). If it's null, the kernel chooses the address.
Since address zero is a null pointer, I don't see any way to request
a mapping at address zero.

Of course, you still have GCC to contend with, in situations when
behavior is undefined when a pointer is null, and things are optimized accordingly.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.21a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Sun Jul 27 14:32:15 2025

From Newsgroup: comp.lang.c

Kaz Kylheku <643-408-1753@kylheku.com> writes:

On 2025-07-27, rbowman <bowman@montana.com> wrote:

Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
and moved on. Linux had no sense of humor.

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

I believe Bowman was referring to the BSD behavior of mapping a
read-only page of zeros at address zero. Dereferencing a null
pointer would return zero (which for the string functions, would
indicate end of string) instead of SIGSEGV/SIGBUS.

When BSD utilities were ported to System V, chaos ensued.

--- Synchronet 3.21a-Linux NewsLink 1.2

From sean@sean@conman.org to alt.folklore.computers,comp.lang.c on Mon Jul 28 02:33:21 2025

From Newsgroup: comp.lang.c

In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Kaz Kylheku <643-408-1753@kylheku.com> writes:

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

I don't think you can.

You can, on some Linux systems. I was able to do it on a 32-bit x86 Linux
system to run an old MS-DOS executable via the vm86() system call (and I had to implement enough MS-DOS calls to run just this one executable). Why?
I wanted to pipe stdin/stdout to some other Unix program and hacking DOSBox seemed like the harder option.

-spc

--- Synchronet 3.21a-Linux NewsLink 1.2

From James Kuyper@jameskuyper@alumni.caltech.edu to alt.folklore.computers,comp.lang.c on Mon Jul 28 08:02:37 2025

From Newsgroup: comp.lang.c

On 2025-07-27 22:33, sean@conman.org wrote:

In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Kaz Kylheku <643-408-1753@kylheku.com> writes:

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

I don't think you can.

You can, on some Linux systems. I was able to do it on a 32-bit x86 Linux system to run an old MS-DOS executable via the vm86() system call (and I had to implement enough MS-DOS calls to run just this one executable). Why? I wanted to pipe stdin/stdout to some other Unix program and hacking DOSBox seemed like the harder option.

-spc

As Keith pointed out, passing it a null pointer leaves the
implementation free to choose whatever location it wants, even if
MAP_FIXED is chosen. An implementation could choose to return a null
pointer value, but there's nothing you can do to instruct it to do so.

With regards to

pa=mmap(addr, len, prot, flags, fildes, off);

The Single Unix standard says:

"When the implementation selects a value for pa, it never places a
mapping at address 0, nor does it replace any extant mapping."

However, that occurs in a paragraph which starts with "When MAP_FIXED is
not set ...", which implies that restriction does not apply when
MAP_FIXED is set.

The change history indicates that this behavior has not been changed. Interestingly, what the standard now says about MAP_FAILED, it used to
say about (void*)-1. The fact that it didn't use a null value to
indicate failure may imply that a null value was intended to be allowed
as a successful return.
--- Synchronet 3.21a-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 12:23:29 2025

From Newsgroup: comp.lang.c

On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

The change history indicates that this behavior has not been changed. Interestingly, what the standard now says about MAP_FAILED, it used to
say about (void*)-1. The fact that it didn't use a null value to
indicate failure may imply that a null value was intended to be allowed
as a successful return.

On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser. Perhaps there is some CAP_* capability for finer-grained access to this:

$ cat mmap-null.c
#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
    int res = munmap(0, 4096);
    void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
    return 0;
}
$ make CFLAGS='-W -Wall -O2' mmap-null
cc -W -Wall -O2 mmap-null.c -o mmap-null
$ ./mmap-null
res = 0, ptr = 0xffffffff, errno = 1
$ sudo ./mmap-null
res = 0, ptr = (nil), errno = 0

If you have some legacy app which needs writable memory at the null pointer, it's doable but not without some inconveniences, like not being able to run as a regular user on a system without sudo access.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
--- Synchronet 3.21a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Mon Jul 28 13:57:11 2025

From Newsgroup: comp.lang.c

James Kuyper <jameskuyper@alumni.caltech.edu> writes:

On 2025-07-27 22:33, sean@conman.org wrote:

In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Kaz Kylheku <643-408-1753@kylheku.com> writes:

I believe you can use mmap to instruct Linux to provide a mapped,
writable page at address zero. Or multiple pages, in proportion
to your need.

As Keith pointed out, passing it a null pointer leaves the
implementation free to choose whatever location it wants, even if
MAP_FIXED is chosen. An implementation could choose to return a null
pointer value, but there's nothing you can do to instruct it to do so.

With regards to

pa=mmap(addr, len, prot, flags, fildes, off);

The Single Unix standard says:

"When the implementation selects a value for pa, it never places a
mapping at address 0, nor does it replace any extant mapping."

However, that occurs in a paragraph which starts with "When MAP_FIXED is
not set ...", which implies that restriction does not apply when
MAP_FIXED is set.

The change history indicates that this behavior has not been changed. Interestingly, what the standard now says about MAP_FAILED, it used to
say about (void*)-1. The fact that it didn't use a null value to
indicate failure may imply that a null value was intended to be allowed
as a successful return.

Indeed, that is the case. If you need a page mapped at address zero, MAP_FIXED is the way to go. It's also discouraged, for various reasons.

--- Synchronet 3.21a-Linux NewsLink 1.2

From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Mon Jul 28 22:19:34 2025

From Newsgroup: comp.lang.c

On Mon, 28 Jul 2025 08:02:37 -0400, James Kuyper wrote:

As Keith pointed out, passing it a null pointer leaves the
implementation free to choose whatever location it wants, even if
MAP_FIXED is chosen. An implementation could choose to return a null
pointer value, but there's nothing you can do to instruct it to do so.

But C does not specify that the NULL address is actually address 0 (even
if it is denotable by an integer literal equal to 0).

This leaves the door open for MAP_FIXED (or better still,
MAP_FIXED_NOREPLACE) to create a mapping at address 0 if you specify it
-- if you can find some way to specify address 0 from C ...
--- Synchronet 3.21a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 15:37:06 2025

From Newsgroup: comp.lang.c

Kaz Kylheku <643-408-1753@kylheku.com> writes:

On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

The change history indicates that this behavior has not been changed.
Interestingly, what the standard now says about MAP_FAILED, it used to
say about (void*)-1. The fact that it didn't use a null value to
indicate failure may imply that a null value was intended to be allowed
as a successful return.

On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser. Perhaps there is some CAP_* capability for finer-grained access to this:

$ cat mmap-null.c
#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
    int res = munmap(0, 4096);
    void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
    return 0;
}
$ make CFLAGS='-W -Wall -O2' mmap-null
cc -W -Wall -O2 mmap-null.c -o mmap-null
$ ./mmap-null
res = 0, ptr = 0xffffffff, errno = 1
$ sudo ./mmap-null
res = 0, ptr = (nil), errno = 0

If you have some legacy app which needs writable memory at the null pointer, it's doable but not without some inconveniences, like not being able to run as
a regular user on a system without sudo access.

I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

I'm able to read from address 0 without a segfault.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.21a-Linux NewsLink 1.2

From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 18:05:43 2025

From Newsgroup: comp.lang.c

Lawrence D'Oliveiro <ldo@nz.invalid> writes:

On Mon, 28 Jul 2025 08:02:37 -0400, James Kuyper wrote:

As Keith pointed out, passing it a null pointer leaves the
implementation free to choose whatever location it wants, even if
MAP_FIXED is chosen. An implementation could choose to return a null
pointer value, but there's nothing you can do to instruct it to do so.

But C does not specify that the NULL address is actually address 0 (even
if it is denotable by an integer literal equal to 0).

Right, C doesn't -- but POSIX does, and mmap() is specified by POSIX.

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/stddef.h.html

"""
Additionally, any pointer object whose representation has all bits set
to zero, perhaps by memset() to 0 or by calloc(), shall be treated as a
null pointer.
"""

So all-bits-zero may or may not be the *only* representation for
a null pointer, but it's guaranteed (again, by POSIX, not by C)
to be *a* representation for a null pointer.

If you have the mmap() function, you can safely assume that
all-bits-zero is a null pointer (unless you're using some other
mmap() that doesn't conform to POSIX).

This leaves the door open for MAP_FIXED (or better still, MAP_FIXED_NOREPLACE) to create a mapping at address 0 if you specify it
-- if you can find some way to specify address 0 from C ...

Experiments (documented in this thread) show that passing a null
pointer as the first argument to mmap() will (apparently) succeed in
creating a new mapping at address 0, but only if you're running with
root privileges. Neither the man page nor the POSIX specification
says anything about requiring root privileges (unless I've missed
something).

There are very few good reasons to want to create a mapping at
address 0. If you do have a good reason, you'll need to be careful
to avoid compiler optimizations based on the undefinedness of
dereferencing a null pointer. (Judicious use of the "volatile"
keyword might be appropriate, but I don't think even that is
guaranteed.)
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
--- Synchronet 3.21a-Linux NewsLink 1.2

From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Tue Jul 29 14:24:32 2025

From Newsgroup: comp.lang.c

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

Kaz Kylheku <643-408-1753@kylheku.com> writes:

On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser.
Perhaps there is some CAP_* capability for finer-grained access to this:

$ cat mmap-null.c
#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
    int res = munmap(0, 4096);
    void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
    return 0;
}
$ make CFLAGS='-W -Wall -O2' mmap-null
cc -W -Wall -O2 mmap-null.c -o mmap-null
$ ./mmap-null
res = 0, ptr = 0xffffffff, errno = 1
$ sudo ./mmap-null
res = 0, ptr = (nil), errno = 0

If you have some legacy app which needs writable memory at the null pointer, it's doable but not without some inconveniences, like not being able to run as
a regular user on a system without sudo access.

I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

I'm able to read from address 0 without a segfault.

The requirement that one must be root to map address zero with
MAP_FIXED is not a POSIX requirement, but rather a Linux implementation
choice.

Unixware, for example, had no permission checks on MAP_FIXED.

    if (sfs_vfsp->vfs_flags & SFS_FSINVALID)
        return EIO;

    if (vp->v_flag & VNOMAP)
        return (ENOSYS);

    if (vp->v_type != VREG)
        return (ENODEV);

    if ((int)off < 0 || (int)(off + len) < 0)
        return (EINVAL);

    /*
     * If file is being locked, disallow mapping.
     */
    if (vp->v_filocks != NULL && MANDLOCK(vp, ip->i_mode))
        return EAGAIN;

    SFS_IRWLOCK_WRLOCK(ip);

    as_wrlock(as);

    if ((flags & MAP_FIXED) == 0) {
        map_addr(addrp, len, (off_t)off, 0);
        if (*addrp == NULL) {
            as_unlock(as);
            SFS_IRWLOCK_UNLOCK(ip);
            return (ENOMEM);
        }
    } else {
        /*
         * User specified address - blow away any previous mappings
         */
        (void) as_unmap(as, *addrp, len);
    }
--- Synchronet 3.21a-Linux NewsLink 1.2

From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Wed Jul 30 03:50:51 2025

From Newsgroup: comp.lang.c

On 2025-07-28, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:

Kaz Kylheku <643-408-1753@kylheku.com> writes:

On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:

The change history indicates that this behavior has not been changed.
Interestingly, what the standard now says about MAP_FAILED, it used to
say about (void*)-1. The fact that it didn't use a null value to
indicate failure may imply that a null value was intended to be allowed
as a successful return.

On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser.
Perhaps there is some CAP_* capability for finer-grained access to this:

$ cat mmap-null.c
#include <sys/mman.h>
#include <stdio.h>
#include <errno.h>

int main(void)
{
    int res = munmap(0, 4096);
    void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
    return 0;
}
$ make CFLAGS='-W -Wall -O2' mmap-null
cc -W -Wall -O2 mmap-null.c -o mmap-null
$ ./mmap-null
res = 0, ptr = 0xffffffff, errno = 1
$ sudo ./mmap-null
res = 0, ptr = (nil), errno = 0

If you have some legacy app which needs writable memory at the null pointer,
it's doable but not without some inconveniences, like not being able to run as
a regular user on a system without sudo access.

I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

I'm able to read from address 0 without a segfault.

I just had an idle thought: is this controlled by sysctl somehow?

Grepping the output of "sysctl -a" for "mmap", I found:

vm.mmap_min_addr = 65536

I think this is it? As in below this value, you need privilege?

The kernel documentation says:

This file indicates the amount of address space which a user process
will be restricted from mmapping. Since kernel null dereference bugs
could accidentally operate based on the information in the first couple
of pages of memory userspace processes should not be allowed to write to
them. By default this value is set to 0 and no protections will be
enforced by the security module. Setting this value to something like
64k will allow the vast majority of applications to work correctly and
provide defense in depth against future potential kernel bugs.

OK, so if the kernel has a null pointer dereference bug and a malicious
user space process can map something there, that process can then
influence subsequent behavior of kernel code, possibly leading to a
privilege escalation.

Searching around, it seems to date back to the 2.6 kernel (or even
earlier) and used to have a lower value, like 4096. Obviously 4K might
not be enough if the kernel is using a large dynamic array or large
structure.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

From Jakob Bohm@egenagwemdimtapsar@jbohm.dk to alt.folklore.computers,comp.lang.c on Sun Aug 17 20:41:18 2025

From Newsgroup: comp.lang.c

On 2025-07-27 02:11, Andrey Tarasevich wrote:

On Fri 7/25/2025 10:43 PM, Lawrence D'Oliveiro wrote:

The articles on the development of C include some interesting
historical detail. One point that stood out for me was the handling of
global variables.

When I first came across C (back in K&R days), the semantics of
duplicated global variable declarations -- overlay the allocated
storage for each allocation of a variable with the same name, so the
variable ends up being the largest size of all the declarations --
immediately reminded me of Fortran COMMON blocks. And one article
makes it clear that was a conscious decision, to try to ease
implementation of the language on non-Unix systems.

Both the C89/90 rationale and the C99 rationale have entire sections on
the ref/def models, which were taken into consideration. They provide the
same reasoning for the decision made by the committee: not burdening the
weaker platforms with the task of merging/cleaning-out repetitive
definitions. The responsibility to ensure that there is at most one
definition for entities with external linkage lies with the user.

On a related note, it is interesting to point out that the language continues to staunchly stick to the same approach in its later
iterations: C99 introduced inline functions, and the definition model
for inline functions with external linkage is also strikingly different from, say, C++. The user is required to manually choose the definition
site and provide only one `extern inline` definition for the function
(i.e. regular non-inlined body, in case the compiler decides to use one).

In stark contrast to this approach, some major compilers require a
linker with support for many global common blocks, and use this for
features such as merging identical string literals across compilation
units, such that 1000 compilation units, each having logic to print the
string "Error", will consume only a single 6 bytes of linked program
size. This is of course because those compilers enjoy the luxury of
requiring specific linker implementations.

As for global vars, I maintain an internal library that relies on a
compiler-initialized opaque global being initialized using the internal
implementation types, while most users will see only a same-size opaque
type (appropriate static asserts check the size identity).

Enjoy

Jakob
--
Jakob Bohm, MSc.Eng., I speak only for myself, not my company
This public discussion message is non-binding and may contain errors
All trademarks and other things belong to their owners, if any.
