• Early History Of C

    From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 05:43:10 2025
    From Newsgroup: comp.lang.c

    I was recently reading the old “Bell System Technical Journal” special issue on Unix <https://bitsavers.trailing-edge.com/magazines/Bell_System_Technical_Journal/BSTJ_V57N06_197807_Part_2.pdf>,
    from 1978. After the better part of a decade of existence, Unix is
    running on two major processor architectures: the DEC PDP-11 family,
    and one or two Interdata machines. The Interdata ports seem to be
    mainly research projects -- one at Bell Labs, the other at Wollongong.
    As far as I can make out, just about all of the “production” uses of
    Unix (inside and outside Bell Labs) are on PDP-11s.

    The articles on the development of C include some interesting
    historical detail. One point that stood out for me was the handling of
    global variables.

    When I first came across C (back in K&R days), the semantics of
    duplicated global variable declarations -- overlay the allocated
    storage for each allocation of a variable with the same name, so the
    variable ends up being the largest size of all the declarations --
    immediately reminded me of Fortran COMMON blocks. And one article
    makes it clear that was a conscious decision, to try to ease
    implementation of the language on non-Unix systems.

    But they reckoned without the sheer human capacity to screw things up.
    From Johnson and Ritchie, “Portability of C Programs and the UNIX
    System”, page 2025:

    Additional problems in the compilers arose from the decision to
    use the local assemblers, loaders, and library editors on the host
    operating systems. Surprisingly often, they were unable to handle the
    code most naturally produced by the C compilers. For example, the
    semantics of possibly initialized external variables in C was quite
    consciously designed to be implementable in a way identical to
    Fortran's COMMON blocks to guarantee its portability. It was an
    unpleasant surprise to discover that the Honeywell assembler would
    allow at most 61 such blocks (and hence external variables) and that
    the IBM link-editor preferred to start external variables on even
    4096-byte boundaries. Software limitations in the target systems
    complicated the compilers and, in one case, the problems with external
    variables just mentioned, forced changes in the C language itself.

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From rbowman@bowman@montana.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 07:03:08 2025
    From Newsgroup: comp.lang.c

    On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?

    It was never a good idea but a joy of legacy code is variables were
    sometimes defined in header files. I think it was gcc 10, or whatever
    shipped with Debian Bullseye when gcc put its foot down and threw errors
    about multiply defined variables.

    -fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.
  • From David Brown@david.brown@hesbynett.no to alt.folklore.computers,comp.lang.c on Sat Jul 26 12:44:17 2025
    From Newsgroup: comp.lang.c

    On 26/07/2025 09:03, rbowman wrote:
    On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?

    It was never a good idea but a joy of legacy code is variables were
    sometimes defined in header files. I think it was gcc 10, or whatever
    shipped with Debian Bullseye when gcc put its foot down and threw errors about multiply defined variables.


    It was indeed gcc 10. I was one of the people campaigning for it (see <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85678>).

    -fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.

    It is always difficult dealing with old code that was incorrect C, but
    happens to work as the developer intended due to lax compilers, luck,
    limits to compiler optimisation, etc. Correcting the old code comes
    with its own risks too. Fortunately, many compilers - like gcc -
    provide options to get the "old-style" behaviour to ease transitions.

    (Code that can be built and does what was intended when built with
    "-fcommon", but not with "-fno-common", breaks the requirement that
    there must be "exactly one external definition for the identifier" that
    has been in all C standard versions.)

  • From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 07:27:34 2025
    From Newsgroup: comp.lang.c

    On 7/26/25 00:03, rbowman wrote:
    On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?

    It was never a good idea but a joy of legacy code is variables were
    sometimes defined in header files. I think it was gcc 10, or whatever
    shipped with Debian Bullseye when gcc put its foot down and threw errors about multiply defined variables.

    -fno-common was made the default flag. Luckily -fcommon restored the lax behavior. That was easier than going down the rabbit hole of putting a variable definition in multiple applications.

    Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
    this to work. Not a common block, each global (PL/I EXTERNAL) variable is
    its own section.
  • From Lars Poulsen@lars@cleo.beagle-ears.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 20:42:09 2025
    From Newsgroup: comp.lang.c

    On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:
    On 7/26/25 00:03, rbowman wrote:
    On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?

    It was never a good idea but a joy of legacy code is variables were
    sometimes defined in header files. I think it was gcc 10, or whatever
    shipped with Debian Bullseye when gcc put its foot down and threw errors
    about multiply defined variables.

    -fno-common was made the default flag. Luckily -fcommon restored the lax
    behavior. That was easier than going down the rabbit hole of putting a
    variable definition in multiple applications.

    Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
    this to work. Not a common block, each global (PL/I EXTERNAL) variable is
    its own section.

    Isn't that exactly the same: A named CSECT ? (C for COMMON)
    It's been decades, but that is what I think I remember.
  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 22:22:41 2025
    From Newsgroup: comp.lang.c

    On Sat, 26 Jul 2025 20:42:09 -0000 (UTC), Lars Poulsen wrote:

    On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:

    Not a common block, each global (PL/I EXTERNAL) variable is its own
    section.

    Isn't that exactly the same: A named CSECT ? (C for COMMON) It's
    been decades, but that is what I think I remember.

    The DEC terminology was, in one form, ASECT/CSECT, then I think in a
    later, generalized form, PSECT.

    PSECTs had various attribute settings, one of which was “concatenated” versus “overlaid”. In the former case, multiple definitions of the
    same PSECT name in different object modules had their allocations
    added together by the linker to make the total size, while in the
    latter, all allocations were made to start at the same address, so the
    total size was that of the largest definition of that PSECT.
  • From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 15:46:10 2025
    From Newsgroup: comp.lang.c

    On 7/26/25 13:42, Lars Poulsen wrote:
    On 2025-07-26, Peter Flass <Peter@Iron-Spring.com> wrote:
    On 7/26/25 00:03, rbowman wrote:
    On Sat, 26 Jul 2025 05:43:10 -0000 (UTC), Lawrence D'Oliveiro wrote:

    Was the “forced change” the abandonment of Fortran-COMMON-block
    semantics altogether for C globals?

    It was never a good idea but a joy of legacy code is variables were
    sometimes defined in header files. I think it was gcc 10, or whatever
    shipped with Debian Bullseye when gcc put its foot down and threw errors
    about multiply defined variables.

    -fno-common was made the default flag. Luckily -fcommon restored the lax
    behavior. That was easier than going down the rabbit hole of putting a
    variable definition in multiple applications.

    Sort of. For PL/I I have to specify the ld option "-z muldefs" to get
    this to work. Not a common block, each global (PL/I EXTERNAL) variable is
    its own section.

    Isn't that exactly the same: A named CSECT ? (C for COMMON)
    It's been decades, but that is what I think I remember.

    Possibly I misunderstood. FORTRAN Blank Common is one big glob, and its interpretation depends on how the programs declare it. Named COMMON
    should be the same, with one linker section per common section. I
    thought the first was what everyone was talking about in the context of
    C, and the second is what I'm doing.

  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Sat Jul 26 23:07:14 2025
    From Newsgroup: comp.lang.c

    On Sat, 26 Jul 2025 15:46:10 -0700, Peter Flass wrote:

    FORTRAN Blank Common is one big glob, and its interpretation
    depends on how the programs declare it. Named COMMON should be the
    same, with one linker section per common section. I thought the
    first was what everyone was talking about in the context of C, and
    the second is what I'm doing.

    “Blank COMMON” is just a COMMON block with an implementation-defined name that is distinct from every possible name that the programmer may give to
    a named COMMON block.
  • From Andrey Tarasevich@noone@noone.net to alt.folklore.computers,comp.lang.c on Sat Jul 26 17:11:21 2025
    From Newsgroup: comp.lang.c

    On Fri 7/25/2025 10:43 PM, Lawrence D'Oliveiro wrote:
    The articles on the development of C include some interesting
    historical detail. One point that stood out for me was the handling of
    global variables.

    When I first came across C (back in K&R days), the semantics of
    duplicated global variable declarations -- overlay the allocated
    storage for each allocation of a variable with the same name, so the
    variable ends up being the largest size of all the declarations -- immediately reminded me of Fortran COMMON blocks. And one article
    makes it clear that was a conscious decision, to try to ease
    implementation of the language on non-Unix systems.


    Both C89/90 rationale and C99 rationale have entire sections on the
    ref/def models, which were taken into consideration. They provide the
    same reasoning for the decision made by the committee: not burdening the
    weaker platforms with the task of merging/cleaning-out repetitive
    definitions. The responsibility to ensure that there is at most one
    definition for entities with external linkage lies with the user.

    On a related note, it is interesting to point out that the language
    continues to staunchly stick to the same approach in its later
    iterations: C99 introduced inline functions, and the definition model
    for inline functions with external linkage is also strikingly different
    from, say, C++. The user is required to manually choose the definition
    site and provide only one `extern inline` definition for the function
    (i.e. regular non-inlined body, in case the compiler decides to use one).
    --
    Best regards,
    Andrey
  • From Peter Flass@Peter@Iron-Spring.com to alt.folklore.computers,comp.lang.c on Sat Jul 26 20:58:51 2025
    From Newsgroup: comp.lang.c

    On 7/26/25 16:07, Lawrence D'Oliveiro wrote:
    On Sat, 26 Jul 2025 15:46:10 -0700, Peter Flass wrote:

    FORTRAN Blank Common is one big glob, and its interpretation
    depends on how the programs declare it. Named COMMON should be the
    same, with one linker section per common section. I thought the
    first was what everyone was talking about in the context of C, and
    the second is what I'm doing.

    “Blank COMMON” is just a COMMON block with an implementation-defined name that is distinct from every possible name that the programmer may give to
    a named COMMON block.

    I've been programming long enough to remember when it was the only kind
    of common, and I seem to recall that different programs invoked by, I
    think "CALL LINK" might each define it differently.
  • From rbowman@bowman@montana.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 04:49:46 2025
    From Newsgroup: comp.lang.c

    On Sat, 26 Jul 2025 12:44:17 +0200, David Brown wrote:

    (Code that can be built and does what was intended when built with "-fcommon", but not with "-fno-common", breaks the requirement that
    there must be "exactly one external definition for the identifier" that
    has been in all C standard versions.)

    When you're dealing with a legacy product that was being phased out, a codebase that goes back 30 years, and Gods know how many programmers of varying skills, you set the flag and take the win.

    Not so easily dealt with was the idea that a signed short could handle all
    the objects that would ever be in the system. We changed that to unsigned
    and kept our fingers crossed no site ever hit 64k objects before the
    software was retired. We got lucky.

    Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
    and moved on. Linux had no sense of humor.
  • From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 06:02:26 2025
    From Newsgroup: comp.lang.c

    On 2025-07-27, rbowman <bowman@montana.com> wrote:
    Moving from AIX to Linux was fun. AIX apparently had a special little bit bucket to handle attempts to do something with a null address, shrugged,
    and moved on. Linux had no sense of humor.

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    Of course, you still have GCC to contend with, in situations when
    behavior is undefined when a pointer is null, and things are optimized accordingly.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From David Brown@david.brown@hesbynett.no to alt.folklore.computers,comp.lang.c on Sun Jul 27 09:53:02 2025
    From Newsgroup: comp.lang.c

    On 27/07/2025 08:02, Kaz Kylheku wrote:
    On 2025-07-27, rbowman <bowman@montana.com> wrote:
    Moving from AIX to Linux was fun. AIX apparently had a special little bit
    bucket to handle attempts to do something with a null address, shrugged,
    and moved on. Linux had no sense of humor.

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    Of course, you still have GCC to contend with, in situations when
    behavior is undefined when a pointer is null, and things are optimized accordingly.


    "-fno-delete-null-pointer-checks" is your friend there.

    When dealing with code like this, "-fno-strict-aliasing" and "-fwrapv"
    are probably also helpful, and you may want to disable warnings on
    missing prototypes, and so on.

    Not all old code needs this kind of hand-holding, of course - it was
    perfectly possible to write good code 30+ years ago. (I have never
    written code that needs "-fcommon", or "-fwrapv".) But some code needs
    it, and often it is impractical or impossible to check through the old
    code base.

    My preference for old code is to keep the old compiler and old makefiles
    (with flag settings) used for it - that way you are unlikely to be
    caught out. But that's not so easy if you are mixing code from
    different projects, or old and new code.

  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Sun Jul 27 01:16:24 2025
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2025-07-27, rbowman <bowman@montana.com> wrote:
    Moving from AIX to Linux was fun. AIX apparently had a special little bit
    bucket to handle attempts to do something with a null address, shrugged,
    and moved on. Linux had no sense of humor.

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    I don't think you can.

    mmap()'s first argument is an address. If the address is non-null,
    it's a hint about where to place the mapping (typically at a nearby
    page boundary). If it's null, the kernel chooses the address.
    Since address zero is a null pointer, I don't see any way to request
    a mapping at address zero.

    Of course, you still have GCC to contend with, in situations when
    behavior is undefined when a pointer is null, and things are optimized accordingly.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Sun Jul 27 14:32:15 2025
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2025-07-27, rbowman <bowman@montana.com> wrote:
    Moving from AIX to Linux was fun. AIX apparently had a special little bit
    bucket to handle attempts to do something with a null address, shrugged,
    and moved on. Linux had no sense of humor.

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.


    I believe Bowman was referring to the BSD behavior of mapping a
    read-only page of zeros at address zero. Dereferencing a null
    pointer would return zero (which for the string functions, would
    indicate end of string) instead of SIGSEGV/SIGBUS.

    When BSD utilities were ported to System V, chaos ensued.

  • From sean@sean@conman.org to alt.folklore.computers,comp.lang.c on Mon Jul 28 02:33:21 2025
    From Newsgroup: comp.lang.c

    In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    I don't think you can.

    You can, on some Linux systems. I was able to do it on a Linux x86-32
    bit system to run an old MS-DOS executable via the vm86() system call (and I had to implement enough MS-DOS calls to run just this one executable). Why?
    I wanted to pipe stdin/stdout to some other Unix program and hacking DosBOX seemed like the harder option.

    -spc

  • From James Kuyper@jameskuyper@alumni.caltech.edu to alt.folklore.computers,comp.lang.c on Mon Jul 28 08:02:37 2025
    From Newsgroup: comp.lang.c

    On 2025-07-27 22:33, sean@conman.org wrote:
    In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    I don't think you can.

    You can, on some Linux systems. I was able to do it on a Linux x86-32 bit system to run an old MS-DOS executable via the vm86() system call (and I had to implement enough MS-DOS calls to run just this one executable). Why? I wanted to pipe stdin/stdout to some other Unix program and hacking DosBOX seemed like the harder option.

    -spc

    As Keith pointed out, passing it a null pointer leaves the
    implementation free to choose whatever location it wants, even if
    MAP_FIXED is chosen. An implementation could choose to return a null
    pointer value, but there's nothing you can do to instruct it to do so.

    With regards to

    pa=mmap(addr, len, prot, flags, fildes, off);

    The Single Unix standard says:

    "When the implementation selects a value for pa, it never places a
    mapping at address 0, nor does it replace any extant mapping."

    However, that occurs in a paragraph which starts with "When MAP_FIXED is
    not set ...", which implies that restriction does not apply when
    MAP_FIXED is set.

    The change history indicates that this behavior has not been changed. Interestingly, what the standard now says about MAP_FAILED, it used to
    say about (void*)-1. The fact that it didn't use a null value to
    indicate failure may imply that a null value was intended to be allowed
    as a successful return.
  • From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 12:23:29 2025
    From Newsgroup: comp.lang.c

    On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The change history indicates that this behavior has not been changed. Interestingly, what the standard now says about MAP_FAILED, it used to
    say about (void*)-1. The fact that it didn't use a null value to
    indicate failure may imply that a null value was intended to be allowed
    as a successful return.

    On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser. Perhaps there is some CAP_* capability for finer-grained access to this:

    $ cat mmap-null.c
    #include <sys/mman.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int res = munmap(0, 4096);
        void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
        return 0;
    }
    $ make CFLAGS='-W -Wall -O2' mmap-null
    cc -W -Wall -O2 mmap-null.c -o mmap-null
    $ ./mmap-null
    res = 0, ptr = 0xffffffff, errno = 1
    $ sudo ./mmap-null
    res = 0, ptr = (nil), errno = 0

    If you have some legacy app which needs writable memory at the null pointer, it's doable but not without some inconveniences, like not being able to run as a regular user on a system without sudo access.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Mon Jul 28 13:57:11 2025
    From Newsgroup: comp.lang.c

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    On 2025-07-27 22:33, sean@conman.org wrote:
    In alt.folklore.computers Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    I believe you can use mmap to instruct Linux to provide a mapped,
    writable page at address zero. Or multiple pages, in proportion
    to your need.

    As Keith pointed out, passing it a null pointer leaves the
    implementation free to choose whatever location it wants, even if
    MAP_FIXED is chosen. An implementation could choose to return a null
    pointer value, but there's nothing you can do to instruct it to do so.

    With regards to

    pa=mmap(addr, len, prot, flags, fildes, off);

    The Single Unix standard says:

    "When the implementation selects a value for pa, it never places a
    mapping at address 0, nor does it replace any extant mapping."

    However, that occurs in a paragraph which starts with "When MAP_FIXED is
    not set ...", which implies that restriction does not apply when
    MAP_FIXED is set.

    The change history indicates that this behavior has not been changed.
    Interestingly, what the standard now says about MAP_FAILED, it used to
    say about (void*)-1. The fact that it didn't use a null value to
    indicate failure may imply that a null value was intended to be allowed
    as a successful return.

    Indeed, that is the case. If you need a page mapped at address zero, MAP_FIXED is the way to go. It's also discouraged, for various reasons.

  • From Lawrence D'Oliveiro@ldo@nz.invalid to alt.folklore.computers,comp.lang.c on Mon Jul 28 22:19:34 2025
    From Newsgroup: comp.lang.c

    On Mon, 28 Jul 2025 08:02:37 -0400, James Kuyper wrote:

    As Keith pointed out, passing it a null pointer leaves the
    implementation free to choose whatever location it wants, even if
    MAP_FIXED is chosen. An implementation could choose to return a null
    pointer value, but there's nothing you can do to instruct it to do so.

    But C does not specify that the NULL address is actually address 0 (even
    if it is denotable by an integer literal equal to 0).

    This leaves the door open to MAP_FIXED (or better still,
    MAP_FIXED_NOREPLACE) to create a mapping at address 0 if you specify it
    -- if you can find some way to specify address 0 from C ...
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 15:37:06 2025
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The change history indicates that this behavior has not been changed.
    Interestingly, what the standard now says about MAP_FAILED, it used to
    say about (void*)-1. The fact that it didn't use a null value to
    indicate failure may imply that a null value was intended to be allowed
    as a successful return.

    On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser. Perhaps there is some CAP_* capability for finer-grained access to this:

    $ cat mmap-null.c
    #include <sys/mman.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int res = munmap(0, 4096);
        void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
        return 0;
    }
    $ make CFLAGS='-W -Wall -O2' mmap-null
    cc -W -Wall -O2 mmap-null.c -o mmap-null
    $ ./mmap-null
    res = 0, ptr = 0xffffffff, errno = 1
    $ sudo ./mmap-null
    res = 0, ptr = (nil), errno = 0

    If you have some legacy app which needs writable memory at the null pointer, it's doable but not without some inconveniences, like not being able to run as
    a regular user on a system without sudo access.

    I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

    I'm able to read from address 0 without a segfault.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to alt.folklore.computers,comp.lang.c on Mon Jul 28 18:05:43 2025
    From Newsgroup: comp.lang.c

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Mon, 28 Jul 2025 08:02:37 -0400, James Kuyper wrote:
    As Keith pointed out, passing it a null pointer leaves the
    implementation free to choose whatever location it wants, even if
    MAP_FIXED is chosen. An implementation could choose to return a null
    pointer value, but there's nothing you can do to instruct it to do so.

    But C does not specify that the NULL address is actually address 0 (even
    if it is denotable by an integer literal equal to 0).

    Right, C doesn't -- but POSIX does, and mmap() is specified by POSIX.

    https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/stddef.h.html

    """
    Additionally, any pointer object whose representation has all bits set
    to zero, perhaps by memset() to 0 or by calloc(), shall be treated as a
    null pointer.
    """

    So all-bits-zero may or may not be the *only* representation for
    a null pointer, but it's guaranteed (again, by POSIX, not by C)
    to be *a* representation for a null pointer.

    If you have the mmap() function, you can safely assume that
    all-bits-zero is a null pointer (unless you're using some other
    mmap() that doesn't conform to POSIX).

    This leaves the door open to MAP_FIXED (or better still, MAP_FIXED_NOREPLACE) to create a mapping at address 0 if you specify it
    -- if you can find some way to specify address 0 from C ...

    Experiments (documented in this thread) show that passing a null
    pointer as the first argument to mmap() will (apparently) succeed in
    creating a new mapping at address 0, but only if you're running with
    root privileges. Neither the man page nor the POSIX specification
    says anything about requiring root privileges (unless I've missed
    something).

    There are very few good reasons to want to create a mapping at
    address 0. If you do have a good reason, you'll need to be careful
    to avoid compiler optimizations based on the undefinedness of
    dereferencing a null pointer. (Judicious use of the "volatile"
    keyword might be appropriate, but I don't think even that is
    guaranteed.)
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From scott@scott@slp53.sl.home (Scott Lurndal) to alt.folklore.computers,comp.lang.c on Tue Jul 29 14:24:32 2025
    From Newsgroup: comp.lang.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser.
    Perhaps there is some CAP_* capability for finer-grained access to this:

    $ cat mmap-null.c
    #include <sys/mman.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int res = munmap(0, 4096);
        void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
        return 0;
    }
    $ make CFLAGS='-W -Wall -O2' mmap-null
    cc -W -Wall -O2 mmap-null.c -o mmap-null
    $ ./mmap-null
    res = 0, ptr = 0xffffffff, errno = 1
    $ sudo ./mmap-null
    res = 0, ptr = (nil), errno = 0

    If you have some legacy app which needs writable memory at the null pointer,
    it's doable but not without some inconveniences, like not being able to run as
    a regular user on a system without sudo access.

    I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

    I'm able to read from address 0 without a segfault.

    The requirement that one must be root to map address zero with
    MAP_FIXED is not a POSIX requirement, but rather a Linux implementation
    choice.

    Unixware, for example, had no permission checks on MAP_FIXED.

    if (sfs_vfsp->vfs_flags & SFS_FSINVALID)
        return EIO;

    if (vp->v_flag & VNOMAP)
        return (ENOSYS);

    if (vp->v_type != VREG)
        return (ENODEV);

    if ((int)off < 0 || (int)(off + len) < 0)
        return (EINVAL);

    /*
     * If file is being locked, disallow mapping.
     */
    if (vp->v_filocks != NULL && MANDLOCK(vp, ip->i_mode))
        return EAGAIN;

    SFS_IRWLOCK_WRLOCK(ip);

    as_wrlock(as);

    if ((flags & MAP_FIXED) == 0) {
        map_addr(addrp, len, (off_t)off, 0);
        if (*addrp == NULL) {
            as_unlock(as);
            SFS_IRWLOCK_UNLOCK(ip);
            return (ENOMEM);
        }
    } else {
        /*
         * User specified address - blow away any previous mappings
         */
        (void) as_unmap(as, *addrp, len);
    }
  • From Kaz Kylheku@643-408-1753@kylheku.com to alt.folklore.computers,comp.lang.c on Wed Jul 30 03:50:51 2025
    From Newsgroup: comp.lang.c

    On 2025-07-28, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> writes:
    On 2025-07-28, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
    The change history indicates that this behavior has not been changed.
    Interestingly, what the standard now says about MAP_FAILED, it used to
    say about (void*)-1. The fact that it didn't use a null value to
    indicate failure may imply that a null value was intended to be allowed
    as a successful return.

    On a Linux 4.15 system (older Ubuntu) it succeeds if the caller is superuser.
    Perhaps there is some CAP_* capability for finer-grained access to this:

    $ cat mmap-null.c
    #include <sys/mman.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int res = munmap(0, 4096);
        void *ptr = mmap(0, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        printf("res = %d, ptr = %p, errno = %d\n", res, ptr, errno);
        return 0;
    }
    $ make CFLAGS='-W -Wall -O2' mmap-null
    cc -W -Wall -O2 mmap-null.c -o mmap-null
    $ ./mmap-null
    res = 0, ptr = 0xffffffff, errno = 1
    $ sudo ./mmap-null
    res = 0, ptr = (nil), errno = 0

    If you have some legacy app which needs writable memory at the null
    pointer, it's doable but not without some inconveniences, like not
    being able to run as a regular user on a system without sudo access.

    I get similar results on Ubuntu 24.04.2, Linux 6.14.0-24-generic.

    I'm able to read from address 0 without a segfault.

    I just had an idle thought: is this controlled by sysctl somehow?

    Grepping the output of "sysctl -a" for "mmap", I found:

    vm.mmap_min_addr = 65536

    I think this is it? As in below this value, you need privilege?

    The kernel documentation says:

    This file indicates the amount of address space which a user process
    will be restricted from mmapping. Since kernel null dereference bugs
    could accidentally operate based on the information in the first couple
    of pages of memory userspace processes should not be allowed to write to
    them. By default this value is set to 0 and no protections will be
    enforced by the security module. Setting this value to something like
    64k will allow the vast majority of applications to work correctly and
    provide defense in depth against future potential kernel bugs.

    OK, so if the kernel has a null pointer dereference bug and a malicious
    user space process can map something at that address, the process can
    then influence subsequent behavior of kernel code, possibly leading to
    a privilege escalation.

    Searching around, it seems to date back to the 2.6 kernel (or even
    earlier) and used to have a lower value, like 4096. Obviously 4K might
    not be enough if the kernel is using a large dynamic array or large
    structure.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Jakob Bohm@egenagwemdimtapsar@jbohm.dk to alt.folklore.computers,comp.lang.c on Sun Aug 17 20:41:18 2025
    From Newsgroup: comp.lang.c

    On 2025-07-27 02:11, Andrey Tarasevich wrote:
    On Fri 7/25/2025 10:43 PM, Lawrence D'Oliveiro wrote:
    The articles on the development of C include some interesting
    historical detail. One point that stood out for me was the handling of
    global variables.

    When I first came across C (back in K&R days), the semantics of
    duplicated global variable declarations -- overlay the allocated
    storage for each allocation of a variable with the same name, so the
    variable ends up being the largest size of all the declarations --
    immediately reminded me of Fortran COMMON blocks. And one article
    makes it clear that was a conscious decision, to try to ease
    implementation of the language on non-Unix systems.


    Both C89/90 rationale and C99 rationale have entire sections on the
    ref/def models, which were taken into consideration. They provide the
    same reasoning for the decision made by the committee: not burdening
    the weaker platforms with the task of merging/cleaning-out repetitive
    definitions. The responsibility to ensure that there is at most one
    definition for entities with external linkage lies on the user.

    On a related note, it is interesting to point out that the language
    continues to staunchly stick to the same approach in its later
    iterations: C99 introduced inline functions, and the definition model
    for inline functions with external linkage is also strikingly different
    from, say, C++. The user is required to manually choose the definition
    site and provide only one `extern inline` definition for the function
    (i.e. regular non-inlined body, in case the compiler decides to use one).
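
    A small sketch of the C99 model described above, squeezed into one
    file for brevity. The function names twice and quadruple are
    hypothetical; in a real project the inline definition would sit in a
    header and the `extern inline` line in exactly one .c file.

```c
/* Would normally live in a shared header: an inline definition.
 * On its own it does not provide an external, callable body. */
inline int twice(int x)
{
    return 2 * x;
}

/* Would live in exactly one .c file: this declaration turns the
 * definition above into the single external definition, which is the
 * body used whenever the compiler chooses not to inline a call. */
extern inline int twice(int x);

/* An ordinary caller; the compiler may inline twice() here or call
 * the external definition. */
int quadruple(int x)
{
    return twice(twice(x));
}
```

    Leaving out the `extern inline` line and building without
    optimization is a common way to run into an undefined reference to
    twice at link time, which is the manual bookkeeping the paragraph
    above refers to.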


    In stark contrast to this approach, some major compilers require a
    linker with support for many global common blocks, and use this for
    features such as merging identical string literals across compilation
    units, such that 1000 compilation units, each having logic to print the
    string "Error", will consume only a single 6 bytes of linked program
    size. This is of course because those compilers enjoy the luxury of
    requiring specific linker implementations.

    As for global vars, I maintain an internal library that relies on a
    compiler-initialized opaque global being initialized using the internal
    implementation types, while most users will see only a same-size opaque
    type (appropriate static asserts check the size identity).
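
    A rough single-file sketch of that kind of arrangement, not the actual
    library. All names here (lib_state, lib_state_impl, lib_global,
    lib_add, lib_total) are invented for illustration, and the sketch
    hands out a pointer to the opaque type rather than declaring the same
    global with two different types in separate translation units.

```c
#include <assert.h>   /* static_assert (C11) */

/* Public view: users see only storage of the right size and alignment. */
typedef struct {
    long long opaque[2];
} lib_state;

/* Internal view: hypothetical implementation fields. */
struct lib_state_impl {
    long long total;
    int count;
    int flags;
};

/* The size identity check mentioned in the post, plus alignment. */
static_assert(sizeof(lib_state) == sizeof(struct lib_state_impl),
              "opaque and internal types must have the same size");
static_assert(_Alignof(lib_state) == _Alignof(struct lib_state_impl),
              "opaque and internal types must have the same alignment");

/* The single global, statically initialized using the internal type. */
static struct lib_state_impl lib_global_impl = {
    .total = 0, .count = 0, .flags = 1
};

/* Users receive the global only through the opaque type. */
lib_state *lib_global(void)
{
    return (lib_state *)&lib_global_impl;
}

/* Library code converts back to the internal type before touching it. */
void lib_add(lib_state *s, int value)
{
    struct lib_state_impl *impl = (struct lib_state_impl *)s;

    impl->total += value;
    impl->count++;
}

long long lib_total(const lib_state *s)
{
    return ((const struct lib_state_impl *)s)->total;
}
```

    A caller would do something like lib_add(lib_global(), 5) and never
    look inside the opaque array; if the internal struct grows, the
    static asserts fail at compile time instead of the mismatch showing
    up at run time.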


    Enjoy

    Jakob
    --
    Jakob Bohm, MSc.Eng., I speak only for myself, not my company
    This public discussion message is non-binding and may contain errors
    All trademarks and other things belong to their owners, if any.