• Memory protection between compilation units?

    From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Wed Jun 11 15:32:39 2025
    From Newsgroup: comp.lang.c

    This might not be a strictly C question, but it definitely concerns all
    C programmers.
    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:
    static int *socks[0xffff];
    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }
    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the
    moment of the out-of-bounds write but much later - somewhere entirely
    different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.
    This raises a question: how can such corruptions be detected sooner?
    Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.
    How would you approach this?
    Mateusz
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Josef Möllers@josef@invalid.invalid to comp.lang.c on Wed Jun 11 16:06:10 2025
    From Newsgroup: comp.lang.c

    On 11.06.25 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    The pointer was allocated immediately after the "socks" array, i.e. as
    the 0x10000th element of the array (I analyzed a similar problem
    for our son a couple of years ago, where the problem appeared and
    vanished when he added some debug statements ;-) ).
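
    To make the arithmetic concrete, here is a minimal sketch that models
    this layout with a struct. The names layout, list_pos, socks_offset()
    and hits_list_pos() are hypothetical, and the idea that the linker
    placed the pointer directly after the array is an assumption; the code
    only compares offsets and never writes out of bounds.

```c
#include <stddef.h>

/* Hypothetical model of the layout described above: the other module's
 * static pointer happens to sit directly after the array. */
struct layout {
    int  *socks[0xffff];   /* valid indices: 0 .. 0xfffe */
    void *list_pos;        /* stand-in for the linked-list position pointer */
};

/* Byte offset that socks[idx] would have inside the struct. */
size_t socks_offset(size_t idx)
{
    return offsetof(struct layout, socks) + idx * sizeof(int *);
}

/* Nonzero when index idx would land exactly on list_pos. */
int hits_list_pos(size_t idx)
{
    return socks_offset(idx) == offsetof(struct layout, list_pos);
}
```

    With this model, hits_list_pos(0xffff) is true: the one index that the
    mask allows but the array does not have is exactly the neighbour.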

    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.

    I guess it can't because modules can access variables from other
    modules, so either you forbid module B to modify a variable from module
    A, which would break almost every moderately complex program, or you
    fall into this trap.
    That said ... this is not a problem of memory protection but a problem
    of an out-of-bounds programming error. And ... no, you can't forbid this
    either, as there are quite a number of programs that define a
    variable-length array (usually in a structure) as having a size of 1 and
    happily write to index 1234.

    How would you approach this?

    Difficult, but, as I said, it's a programming error.

    Josef

  • From Lew Pitcher@lew.pitcher@digitalfreehold.ca to comp.lang.c on Wed Jun 11 14:30:30 2025
    From Newsgroup: comp.lang.c

    On Wed, 11 Jun 2025 15:32:39 +0200, Mateusz Viste wrote:

    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.


    Your questions, below, are all quite valid, and (AFAICT) all relate to
    how your operating environment (OS, linker, libraries, etc) works.

    In general, you prevent or detect such issues by understanding
    1) the environment in which your code runs,
    2) the operation and implications of each component linked into your
    process, and
    3) the operation and implications of each compilation unit compiled
    in each component of your process.

    You will not be able to completely understand some components, as
    you will probably, at best, only have documentation, and not source code
    for them. Others will be too complex to properly understand.

    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself.

    For components for which you have source code, bench-checking, peer
    review, unit-testing, integration testing, compliance testing, and
    performance testing should catch most flaws. Use the appropriate
    tools: a language linter to catch language usage errors, a profiling
    program to find where your code spends its time, and a memory-use
    tracking program (like valgrind, for instance) to catch out-of-bounds conditions.

    Is there a way to enforce memory protection between module files of
    the same program? After all, static objects shouldn't be accessible
    outside their compilation unit.

    This all depends on your linker/binder and your operating environment
    (OS, etc). The linker or binder arranges your compilation units into
    a cohesive whole, combining and arranging static memory areas, code
    blocks, etc, to suit the requirements of your operating environment.
    Your operating environment arranges all that into memory in order
    to execute the code, which means moving those static memory blocks,
    dynamic blocks, and code blocks around. The end result is that,
    when executing, the placement and boundaries of each compilation-unit's "static" memory depends entirely on where the linker and OS decide
    they should be. Objects that live "in isolation" in the source code
    may occupy contiguous sequential memory locations in execution. The
    OS may or may not provide some sort of "fencing" around objects or
    blocks.

    How would you approach this?

    Carefully. Very carefully.
    I could tell stories..... :-)



    HTH
    --
    Lew Pitcher
    "In Skills We Trust"
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Wed Jun 11 14:32:39 2025
    From Newsgroup: comp.lang.c

    Josef Möllers <josef@invalid.invalid> writes:
    On 11.06.25 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the
    moment of the out-of-bounds write but much later - somewhere entirely
    different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    valgrind.

  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Wed Jun 11 17:14:44 2025
    From Newsgroup: comp.lang.c

    On 11/06/2025 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.

    How would you approach this?


    Your key tools for catching such errors early are static error checking
    and then run-time checkers. Then when you get strange symptoms, a debugger.

    Static error checking (like gcc -O2 -Wall -Wextra) will not catch
    everything, but it will catch /some/ out-of-bounds errors and other
    bugs. The more you catch there with your compiler, the better. There
    are also more advanced static error checking tools for special
    situations, or special prices.

    Run-time checks like valgrind or gcc / clang sanitizers can catch quite
    a lot of out-of-bounds accesses and other run-time errors. They can
    take some practice to use well, and can have a significant impact on the run-time characteristics of the code (such as timing or memory usage)
    which may then affect the way the code is run. And of course they won't
    catch bugs unless the buggy parts of the code are actually run in a way
    that triggers the problem.
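
    As an illustration of the sanitizer route (a sketch, not anyone's
    actual build setup): compiling the original snippet with
    -fsanitize=address -g, which both gcc and clang accept, should make
    AddressSanitizer stop at the faulting store with a
    global-buffer-overflow report instead of letting it silently clobber a
    neighbour. As far as I know, Valgrind's Memcheck does not bounds-check
    static arrays, so it may well miss this particular overrun. The helper
    slot_in_bounds() is a hypothetical addition so that the overrun
    condition can be inspected without triggering it.

```c
#include <stddef.h>

/* Original declaration from the thread: 0xffff slots, indices 0..0xfffe. */
static int *socks[0xffff];

/* Original (buggy) update: val & 0xffff can be 0xffff, one past the end.
 * Build with:  cc -g -fsanitize=address file.c
 * so that the bad store is reported at the point where it happens. */
void update_my_socks(int *sock, int val)
{
    socks[val & 0xffff] = sock;
}

/* Hypothetical helper: does this val map to a slot inside the array? */
int slot_in_bounds(int val)
{
    size_t idx = (size_t)(val & 0xffff);
    return idx < sizeof socks / sizeof socks[0];
}
```

    The instrumented build is slower and uses more memory, so it is usually
    run in test or staging rather than in production.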

    For debugging problems like this with gdb, you can put a data breakpoint
    on the pointer that is your known symptom. Set it to stop when
    something writes 0 to it - then you can see where you are in code when
    that happens. Of course, that will be a real pain if it only happens
    once a week.

    If you suspect a buffer overflow, then you can also look in your map
    file for the pointer, and then look at what is next to it in memory.
    This is inconvenient with static data - you have to combine it with
    listing files, as the details of static data don't make it through to
    the linker-generated map files. You might see more information using a debugger.

    As you can see, there is no simple solution to this!


  • From Opus@ifonly@youknew.org to comp.lang.c on Wed Jun 11 17:19:36 2025
    From Newsgroup: comp.lang.c

    On 11/06/2025 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.
    (...)
    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.

    This is an interesting question, indeed not specific to C.

    This would require fine-grained memory protection, something that would require hardware support. Most OSs that implement some kind of
    "processes" use memory protection to isolate processes, but that's not
    more fine-grained than that.

    So the short answer is: you have no means of doing this with current
    OSs, hardware and languages.

    Language-wise, the options to make memory corruption less likely are to
    implement bounds checking and other mechanisms like that.

    In C, to avoid out-of-bounds access of arrays, you could check all your
    array accesses dynamically (by checking indices). But that would require
    using the right array length for checking, which you may also get wrong,
    as this would be "manual".

    There is a proposed extension for the RISC-V ISA called CHERI that
    offers the kind of fine-grained memory protection that could fit your
    purpose here. This is a topic that is certainly being investigated. But nothing available outside of research for now.

    To answer your question in a more practical way, I would rewrite your
    code snippet as something like the following, making it safer and
    clearer to maintain:

    #define SOCKS_LEN 65536 // or (1U << 16), whatever better expresses the intent.

    static int *socks[SOCKS_LEN];

    void update_my_socks(int *sock, int val) {
        /* Cast first: with a signed val, % could yield a negative index. */
        socks[(unsigned)val % SOCKS_LEN] = sock;
    }

    Note that the modulo (% SOCKS_LEN) will be compiled as a mask by the
    compiler if SOCKS_LEN is a power of two. So no need to bother with
    trying to hand-optimize it. But the code above also works if SOCKS_LEN
    is not a power of two. That's robust.

    Second note: you chose to wrap indices around to handle possible
    out-of-bounds accesses. That may or may not be a good idea depending on
    the exact context. You may alternatively want to do nothing if val is
    out of bounds:

    void update_my_socks(int *sock, int val) {
        /* Reject negative values too: val is a signed int. */
        if (val < 0 || val >= SOCKS_LEN)
            return;

        socks[val] = sock;
    }

    Of course, if you want to be able to handle the case where there is an
    error, you may also want to return an error from update_my_socks()
    instead of having a function returning nothing. Or call some specific
    error function. Your pick.
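
    A minimal sketch of the error-returning variant, assuming the same
    SOCKS_LEN as above; the 0 / -1 return convention is only one possible
    choice, not something the original code defines.

```c
#include <stddef.h>

#define SOCKS_LEN 65536

static int *socks[SOCKS_LEN];

/* Returns 0 on success, -1 if val is not a valid index. */
int update_my_socks(int *sock, int val)
{
    if (val < 0 || val >= SOCKS_LEN)
        return -1;          /* reject instead of wrapping or ignoring */

    socks[val] = sock;
    return 0;
}
```

    The caller can then log or abort on -1, which surfaces a bad index at
    the call site rather than days later.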

  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Wed Jun 11 15:36:47 2025
    From Newsgroup: comp.lang.c

    On 2025-06-11, Mateusz Viste <mateusz@x.invalid> wrote:
    How would you approach this?

    Custom linker script which aligns the static area address of each module
    to a page size, and introduces a dummy page-sized object.

    Then at program load time, we iterate over these, and unmap the dummy
    pages.
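
    The linker-script part is toolchain specific, so the sketch below swaps
    in a different technique that shows the same guard-page effect at run
    time: it maps a buffer with mmap() and makes the page right after it
    inaccessible with mprotect(). It assumes Linux; alloc_with_guard() and
    signal_from_write() are hypothetical helper names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map len bytes (rounded up to whole pages) followed by one PROT_NONE
 * guard page. Returns the buffer and stores the guard address in *guard,
 * or returns NULL on failure. Overruns that stay inside the rounding
 * slack of the last data page are NOT caught. */
void *alloc_with_guard(size_t len, void **guard)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    size_t pg = (size_t)page;
    size_t data = (len + pg - 1) / pg * pg;

    char *base = mmap(NULL, data + pg, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    if (mprotect(base + data, pg, PROT_NONE) != 0) {
        munmap(base, data + pg);
        return NULL;
    }
    *guard = base + data;
    return base;
}

/* Write one byte at addr in a forked child. Returns the signal that
 * killed the child, 0 if it exited normally, or -1 on error. */
int signal_from_write(volatile char *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *addr = 1;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

    A write into the buffer succeeds, while a write to the guard address
    kills the child with SIGSEGV at the faulting instruction.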

    We might also have to think about perhaps coalescing the
    non-zero-initialized and zero-initialized ("BSS") data. Or, rather than
    saying coalescing, perhaps not separating the two. Or else separate implementing the strategy for the two areas: have unmapped pages in the
    "BSS" area as well as the initialized data.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From wij@wyniijj5@gmail.com to comp.lang.c on Wed Jun 11 23:38:06 2025
    From Newsgroup: comp.lang.c

    On Wed, 2025-06-11 at 17:19 +0200, Opus wrote:
    On 11/06/2025 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.
    (...)
    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t safeguard a program from corrupting itself. Is there a way to enforce memory protection between module files of the same program? After all, static objects shouldn't be accessible outside their compilation unit.

    This is an interesting question, indeed not specific to C.

    This would require fine-grained memory protection, something that would require hardware support. Most OSs that implement some kind of
    "processes" use memory protection to isolate processes, but that's not
    more fine-grained than that.

    So the short answer is: you have no means of doing this with current
    OSs, hardware and languages.

    Language-wise, the options to make memory corruption less likely is to implement bounds checking and other mechanisms like that.

    In C, to avoid out-of-bounds access of arrays, you could check all your array accesses dynamically (by checking indices). But that would require using the right array length for checking, which you may also get wrong,
    as this would be "manual".

    There is a proposed extension for the RISC-V ISA called CHERI that
    offers the kind of fine-grained memory protection that could fit your purpose here. This is a topic that is certainly being investigated. But nothing available outside of research for now.

    To answer your question in a more practical way, I would rewrite your
    code snippet as something like the following, making it safer and
    clearer to maintain:

    #define SOCKS_LEN 65536 // or (1U << 16), whatever better expresses the intent.

    static int *socks[SOCKS_LEN];

       void update_my_socks(int *sock, int val) {
         socks[val % SOCKS_LEN] = sock;
       }

    Note that the modulo (% SOCKS_LEN) will be compiled as a mask by the compiler if SOCKS_LEN is a power of two. So no need to bother with
    trying to hand-optimize it. But the code above also works if SOCKS_LEN
    is not a power of two. That's robust.

    Second note: you chose to wrap indices around to handle possible out-of-bounds accesses. That may or may not be a good idea depending on
    the exact context. You may alternatively want to do nothing if val is
    out of bounds:

       void update_my_socks(int *sock, int val) {
         if (val >= SOCKS_LEN)
             return;

         socks[val] = sock;
       }

    Of course, if you want to be able to handle the case where there is an error, you may also want to return an error from update_my_socks()
    instead of having a function returning nothing. Or call some specific
    error function. Your pick.
    I would suggest the error checking solution.
    But it looked to me like a prototype problem: "int val" should probably be "unsigned val".
    Or manually write a range check:
    void update_my_socks(int *sock, int val) {
        unsigned int idx = val & 0xffff;
        if (idx >= 0xffff) {
            // report out of range error
            return;
        }
        socks[idx] = sock;
    }
    Error-checking code is often avoided (a lot of effort goes into such a seemingly
    'useless' purpose). But from experience I feel it is useful and should generally be
    considered the 'final optimization'. IOW, always check for errors if possible.
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Thu Jun 12 10:28:57 2025
    From Newsgroup: comp.lang.c

    Thank you all for your thoughtful responses. You rightly identified
    that the problem is essentially an out-of-bounds access - a symptom of
    deeper code quality issues. The bug in question managed to pass unit
    tests, peer review, functional tests, and it didn’t trigger any
    warnings from GCC or clang, even with the strict -Weverything flag I
    enforce across my teams. This underscores a fundamental truth: all
    software has bugs, and some, like this one, are notoriously difficult
    to locate. The bug caused a segfault about once every 10 days,
    manifesting in an unrelated part of the code and sometimes days after
    the out-of-bounds write occurred.
    This led me to wonder how I could accelerate such crashes to simplify
    debugging. In large programs, unnoticed memory corruption becomes more
    probable. One strategy is to break the program into modular parts that
    communicate via IPC so programs would be protected from each other
    thanks to the wonders of protected mode. However, this approach
    sacrifices the efficiency and simplicity of function calls. A more
    elegant solution would be to leverage the MMU to isolate the memory of
    each compilation unit, triggering a segfault when a unit accesses
    memory outside its scope. Unfortunately, such technology does not seem
    to exist yet - at least not in the Linux world (which is my target
    platform).
    Mateusz
  • From Mikko@mikko.levanto@iki.fi to comp.lang.c on Thu Jun 12 11:40:20 2025
    From Newsgroup: comp.lang.c

    On 2025-06-11 13:32:39 +0000, Mateusz Viste said:

    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    This raises a question: how can such corruptions be detected sooner? Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.

    How would you approach this?

    The traditional method to ensure that a program or a part of a program
    does not do what it must not do is testing. In this case the tester
    must modify the code so that the array socks is a part of a larger
    data structure and call update_my_socks with different values for
    val, including the critical values -1, 0, 0xfffe, and 0xffff.
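
    One way to write such a test without relying on undefined behaviour is
    sketched below. Instead of a struct it uses a single larger array with
    a sentinel slot on each side, so that even the stray store stays inside
    one object. The names arena, arena_reset() and guards_intact() are
    hypothetical, and the indexing is the original val & 0xffff shifted by
    one slot.

```c
#include <stddef.h>

#define SOCKS_LEN 0xffff                  /* length from the original post */

static int guard_lo, guard_hi;            /* their addresses act as sentinels */
static int *arena[1 + SOCKS_LEN + 1];     /* [guard][socks ...][guard] */

/* Put the sentinels back in place before each test run. */
void arena_reset(void)
{
    arena[0] = &guard_lo;
    arena[SOCKS_LEN + 1] = &guard_hi;
}

/* Same index computation as the original, offset by the leading guard. */
void update_my_socks(int *sock, int val)
{
    arena[1 + (val & 0xffff)] = sock;
}

/* Nonzero while neither sentinel has been overwritten. */
int guards_intact(void)
{
    return arena[0] == &guard_lo && arena[SOCKS_LEN + 1] == &guard_hi;
}
```

    Calling update_my_socks() with 0 and 0xfffe leaves the sentinels
    alone; 0xffff and -1 both overwrite the trailing one.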
    --
    Mikko

  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Thu Jun 12 11:05:02 2025
    From Newsgroup: comp.lang.c

    On Thu, 12 Jun 2025 11:40 Mikko wrote:
    The traditional method to ensure that a program or a part of a program
    does not do what it must not do is testing. In this case the tester
    must modify the code so that the array socks is a part of a larger
    data structure and call update_my_socks with different values for
    val, including the critical values -1, 0, 0xfffe, and 0xffff.
    Essentially checking for out-of-bounds writes using safeguard markers:
    struct {
        int low;
        int array[0xffff];
        int high;
    } x;
    x.low = -1;
    x.high = -1;
    do_some_job(&x);
    assert((x.low == -1) && (x.high == -1));
    This approach might be a valid strategy, but is it practical?
    Uncertain. Foolproof? Definitely not: an out-of-bounds write could
    easily occur 4 KiB past the array and be undetected.
    While various testing methods exist, my original question wasn’t about
    testing scenarios, but rather about potential methods to isolate and
    protect compilation units from one another.
    It appears this is not a novel idea and there are some solutions, for
    example CHERI:
    https://en.wikipedia.org/wiki/Capability_Hardware_Enhanced_RISC_Instructions
    But this requires special hardware, while I am looking for something
    that would be usable on Linux with commodity x86_64 hardware.
    Mateusz
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Thu Jun 12 14:31:17 2025
    From Newsgroup: comp.lang.c

    On Wed, 11 Jun 2025 17:14 David Brown wrote:

    For debugging problems like this with gdb, you can put a data
    breakpoint on the pointer that is your known symptom. Set it to stop
    when something writes 0 to it - then you can see where you are in
    code when that happens. Of course, that will be a real pain if it
    only happens once a week.

    The idea is good, but as you observed it is hard to apply in a
    production situation when the issue happens like three times a month.

    In fact, a breakpoint would even be overkill - I'd be perfectly happy
    for the program to crash when said variable changes. Like a
    runtime-setup assertion that constantly checks the state of the
    variable. Sadly, I'm not aware of such a mechanism either. :)
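
    One possible approximation on Linux, offered as a sketch under stated
    assumptions rather than a known solution: keep the pointer alone in its
    own page and leave that page read-only except during legitimate
    updates, so that any other write faults at the offending instruction.
    The names guarded_init(), guarded_set() and guarded_get() are
    hypothetical; the approach costs two mprotect() calls per update and is
    not thread-safe as written.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static void **slot;        /* one pointer living alone in its own page */
static size_t slot_bytes;  /* size of that page */

/* Map a zero-filled, read-only page to hold the pointer. 0 on success. */
int guarded_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    void *p = mmap(NULL, (size_t)page, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    slot = p;
    slot_bytes = (size_t)page;
    return 0;
}

/* The only legitimate way to change the pointer: unlock, store, relock. */
int guarded_set(void *value)
{
    if (mprotect(slot, slot_bytes, PROT_READ | PROT_WRITE) != 0)
        return -1;
    *slot = value;
    return mprotect(slot, slot_bytes, PROT_READ);
}

/* Reads are always allowed. */
void *guarded_get(void)
{
    return *slot;
}
```

    Note the limit of this trick: it protects one known victim. An
    overrun past the end of an unrelated array would then simply land on
    whatever else the linker placed there.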

    Mateusz

  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Thu Jun 12 14:41:03 2025
    From Newsgroup: comp.lang.c

    On Wed, 11 Jun 2025 17:19 Opus wrote:
    There is a proposed extension for the RISC-V ISA called CHERI that
    offers the kind of fine-grained memory protection that could fit your purpose here.

    CHERI was indeed one of the first links that google offered when I
    tried looking for an existing solution. But as you noted, it's not
    available on "normal" hardware, and sadly google wasn't able to propose
    any more "real-world" alternatives.

    Second note: you chose to wrap indices around to handle possible out-of-bounds accesses. That may or may not be a good idea depending
    on the exact context. You may alternatively want to do nothing if val
    is out of bounds

    This was about a primitive 64K hash map, so out of bounds situations
    were expected to be impossible... if the programmer hadn't
    sized his array 1 entry too short.

    Mateusz

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Thu Jun 12 06:05:09 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@x.invalid> writes:

    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    This raises a question: how can such corruptions be detected sooner?
    Protected mode prevents interference between programs but doesn't
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all, static objects shouldn't be accessible outside their compilation unit.

    How would you approach this?

    The code in question shows several classic error patterns. In no
    particular order:

    * buffer overflow
    * off-by-one error
    * hard-coded constants (rather than symbolic)
    * bitwise operator with signed operand
    * using & to effect what is really a modulo operation
    * two of the above combine to impose a constraint on a
    hard-coded value, and the constraint is never checked

    Of course some of these, notably buffer overflow, are hard to find.
    But some of them are easy. The hard-coded constants stand out like a
    neon sign, especially because one is duplicated. Check for any
    constant written in open code above the value of, say, 10. Once the
    offending example is found, it can be rewritten, as for example

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    const unsigned N = sizeof socks / sizeof socks[0];
    socks[val % N] = sock;
    }

    This revision doesn't fix the program but it does eliminate the bug. (Presumably fixing the program will happen later.) Of course the
    code should be further revised so that the temptation to use the
    hard-coded value elsewhere is reduced, but this revision at least is
    a step in the right direction.
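
    The "constraint is never checked" item can also be turned into a
    compile-time check. The sketch below assumes the intended length was
    0x10000 and uses C11 _Static_assert; SOCKS_LEN, SOCKS_MASK and
    sock_slot() are hypothetical names, not part of the original code.

```c
#include <stddef.h>

#define SOCKS_LEN  0x10000u             /* assumed intended length */
#define SOCKS_MASK (SOCKS_LEN - 1u)

/* Masking only equals a modulo when the length is a power of two. */
_Static_assert((SOCKS_LEN & SOCKS_MASK) == 0,
               "SOCKS_LEN must be a power of two");

static int *socks[SOCKS_LEN];

/* Tie the mask to the real array size so the two cannot drift apart. */
_Static_assert(sizeof socks / sizeof socks[0] == SOCKS_MASK + 1u,
               "mask does not match array length");

/* Convert to unsigned before masking to avoid a signed bitwise operand. */
size_t sock_slot(int val)
{
    return (unsigned)val & SOCKS_MASK;
}

void update_my_socks(int *sock, int val)
{
    socks[sock_slot(val)] = sock;
}
```

    Had the array been declared with 0xffff elements, the second assertion
    would have stopped the build.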

    Also, whenever a cockroach is seen, you can be sure there are other
    cockroaches around. Each of the types of errors evidenced by the
    original code (at least three of the list of six types) represent
    bugs waiting to be found; go through the code and check for all
    of them, at least for the ones that can be located easily. Add
    these error classes to the list of potential problems checked
    during code review.

    I acknowledge that this response isn't exactly an answer to the
    original question. It does illustrate though a kind of thinking
    that can be useful when trying to track down hard-to-find bugs.
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu Jun 12 13:18:04 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@x.invalid> writes:

    <snip>

    This led me to wonder how I could accelerate such crashes to simplify
    debugging. In large programs, unnoticed memory corruption becomes more
    probable. One strategy is to break the program into modular parts that
    communicate via IPC so programs would be protected from each other
    thanks to the wonders of protected mode. However, this approach
    sacrifices the efficiency and simplicity of function calls. A more
    elegant solution would be to leverage the MMU to isolate the memory of
    each compilation unit, triggering a segfault when a unit accesses
    memory outside its scope. Unfortunately, such technology does not seem
    to exist yet - at least not in the Linux world (which is my target
    platform).

    CHERI is designed to address those issues. And there is a Linux-based
    PoC.

    https://cheri-alliance.org/
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu Jun 12 13:21:31 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@x.invalid> writes:
    On Wed, 11 Jun 2025 17:19 Opus wrote:
    There is a proposed extension for the RISC-V ISA called CHERI that
    offers the kind of fine-grained memory protection that could fit your
    purpose here.

    CHERI was indeed one of the first links that google offered when I
    tried looking for an existing solution. But as you noted, it's not
    available on "normal" hardware, and sadly google wasn't able to propose
    any more "real-world" alternatives.

    A real-world alternative is the Unisys Clearpath Libra system,
    with fine grained capability-based object security (descended
    from the B6500).

  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Thu Jun 12 15:29:39 2025
    From Newsgroup: comp.lang.c

    On 12/06/2025 14:31, Mateusz Viste wrote:
    On Wed, 11 Jun 2025 17:14 David Brown wrote:

    For debugging problems like this with gdb, you can put a data
    breakpoint on the pointer that is your known symptom. Set it to stop
    when something writes 0 to it - then you can see where you are in
    code when that happens. Of course, that will be a real pain if it
    only happens once a week.

    The idea is good, but as you observed it is hard to apply in a
    production situation when the issue happens like three times a month.

    In fact, a breakpoint would be even overkill - I'd be perfectly happy
    for the program crashing when said variable changes. Like a
    runtime-setup assertion that constantly checks the state of the
    variable. Sadly, I'm not aware of such mechanism either. :)


    Run-time assertions or other specific run-time checks will be triggered
    when they see the given condition. For example, if you had compiled
    with "gcc -fsanitize=null", then you'd get a run-time error and "crash"
    when the null pointer was dereferenced.

    But that only tells you when you look at the corrupted data - it tells
    you nothing about when the data was corrupted.

    A data breakpoint is triggered when the data item is written (or read, depending on the settings). I have only used these on embedded systems,
    and don't know about their support in x86 hardware (assuming that is
    your target). But the point is that the breakpoint would be hit in the
    buggy code with the buffer overrun, rather than in the correct code that
    used the pointer that got stomped on.

    Data breakpoints are not perfect either - you will also get a hit when legitimate code changes the same address, and have to have filtering to
    skip such false positives. They obviously do not directly help make the unwanted situation occur often enough for convenient debugging, but they
    might nonetheless be useful. (Perhaps you have a bug that regularly
    stomps on the pointer, and other code that regularly writes to the
    pointer with valid data. The failure might only happen once a week by coincidence in timing, while the incorrect write to the pointer might
    occur far more often.)

    So data breakpoints are not always helpful, but I have used them in
    similar circumstances and they are often a tool people don't know much
    about.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Thu Jun 12 14:27:47 2025
    From Newsgroup: comp.lang.c

    David Brown <david.brown@hesbynett.no> writes:
    On 12/06/2025 14:31, Mateusz Viste wrote:
    On Wed, 11 Jun 2025 17:14 David Brown wrote:

    For debugging problems like this with gdb, you can put a data
    breakpoint on the pointer that is your known symptom. Set it to stop
    when something writes 0 to it - then you can see where you are in
    code when that happens. Of course, that will be a real pain if it
    only happens once a week.

    The idea is good, but as you observed it is hard to apply in a
    production situation when the issue happens like three times a month.

    In fact, a breakpoint would be even overkill - I'd be perfectly happy
    for the program crashing when said variable changes. Like a
    runtime-setup assertion that constantly checks the state of the
    variable. Sadly, I'm not aware of such mechanism either. :)


    <snip>
    A data breakpoint is triggered when the data item is written (or read,
    depending on the settings). I have only used these on embedded systems,
    and don't know about their support in x86 hardware (assuming that is
    your target). But the point is that the breakpoint would be hit in the
    buggy code with the buffer overrun, rather than in the correct code that
    used the pointer that got stomped on.

    I use data breakpoints on x86_64 systems routinely. The hardware
    supports a small number of data breakpoints. The gdb 'watch' command
    will set a hardware data breakpoint. They're very useful, if
    you know the address of the data that is being corrupted.

    ARM64 also supports data breakpoints.


    Data breakpoints are not perfect either - you will also get a hit when
    legitimate code changes the same address, and have to have filtering to

    GDB also has some filtering capability (the 'condition' command).

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Thu Jun 12 20:01:59 2025
    From Newsgroup: comp.lang.c

    On Wed, 11 Jun 2025 14:32:39 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    =?UTF-8?Q?Josef_M=C3=B6llers?= <josef@invalid.invalid> writes:
    On 11.06.25 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely
    concerns all C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious
    issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar
    with C, *locating* the bug was challenging. The program did not
    crash at the moment of the out-of-bounds write but much later -
    somewhere entirely different, in a different object file that
    maintained a static pointer for tracking a position in a linked
    list. To my surprise, the pointer was randomly reset to NULL about
    once a week, causing a segfault. Tracing this back to an unrelated
    out-of-bounds write elsewhere in the code was tedious, to say the
    least.

    valgrind.


    Probably too slow. If I were in Mateusz's situation, I would try
    AddressSanitizer.
    Never tried it myself, but it looks like a better fit for this particular
    relatively simple case of buffer overrun.
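
For reference, a minimal sketch of the overrun from the original post and one possible fix. The masked index ranges over 0..0xffff, which is 0x10000 distinct values, one more than the array declared with 0xffff elements can hold. The names SOCKS_SLOTS and sock_slot() are made up for this illustration, and val is taken as unsigned here, which is an assumption. Building the original version with gcc or clang and -fsanitize=address should report the write to socks[0xffff] as a global buffer overflow at the moment it happens, rather than days later.

```c
#include <assert.h>
#include <stddef.h>

/* 0x10000 slots: one for every value that (val & 0xffff) can produce.
 * The original declared 0xffff slots, so index 0xffff was out of bounds. */
#define SOCKS_SLOTS 0x10000u

static int *socks[SOCKS_SLOTS];

/* Hypothetical helper: map an arbitrary value to a valid slot index. */
static size_t sock_slot(unsigned val)
{
    size_t slot = val & 0xffffu;
    assert(slot < SOCKS_SLOTS);   /* cheap run-time check for test builds */
    return slot;
}

void update_my_socks(int *sock, unsigned val)
{
    socks[sock_slot(val)] = sock;
}
```

The assert() disappears when NDEBUG is defined, so it costs nothing in release builds while still catching a mismatch between mask and array size during testing.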


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Richard Heathfield@rjh@cpax.org.uk to comp.lang.c on Thu Jun 12 19:15:26 2025
    From Newsgroup: comp.lang.c

    On 11/06/2025 15:32, Scott Lurndal wrote:
    =?UTF-8?Q?Josef_M=C3=B6llers?= <josef@invalid.invalid> writes:
    On 11.06.25 15:32, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the
    moment of the out-of-bounds write but much later - somewhere entirely
    different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    valgrind.


    Sure. Or some people prefer to single-step with a debugger. Such
    people can make their lives a little easier by surrounding the
    buffer with sentinel soldiers, setting the sentinel soldiers to a
    magic number, and putting a watch on them both - the buffer high
    soldier and the buffer low soldier.
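
A minimal sketch of the sentinel idea, assuming the buffer can be wrapped in a struct so that the two sentinels sit directly below and above it. The names guarded_buf, SENTINEL_MAGIC and the helper functions are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SENTINEL_MAGIC 0x5EA7BEEF   /* arbitrary magic number */
#define BUF_LEN 16

struct guarded_buf {
    int low;             /* sentinel just below the buffer */
    int data[BUF_LEN];   /* the buffer being protected */
    int high;            /* sentinel just above the buffer */
};

struct guarded_buf gbuf;

/* Clear the buffer and post both sentinels. */
void guarded_init(struct guarded_buf *g)
{
    memset(g->data, 0, sizeof g->data);
    g->low = SENTINEL_MAGIC;
    g->high = SENTINEL_MAGIC;
}

/* True while neither sentinel has been overwritten. */
bool guarded_intact(const struct guarded_buf *g)
{
    return g->low == SENTINEL_MAGIC && g->high == SENTINEL_MAGIC;
}

/* Call at module entry points to fail fast in test builds. */
void guarded_check(const struct guarded_buf *g)
{
    assert(guarded_intact(g));
}
```

In gdb, "watch gbuf.low" and "watch gbuf.high" would then stop the program as soon as either sentinel is written. Without a debugger, calling guarded_check() at regular points narrows down the time window in which the corruption happened.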
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Thu Jun 12 18:59:10 2025
    From Newsgroup: comp.lang.c

    On 2025-06-12, Mateusz Viste <mateusz@x.invalid> wrote:
    Thank you all for your thoughtful responses. You rightly identified
    that the problem is essentially an out-of-bounds access - a symptom of
    deeper code quality issues. The bug in question managed to pass unit
    tests, peer review, functional tests, and it didn’t trigger any
    warnings from GCC or clang, even with the strict -Weverything flag I
    enforce across my teams. This underscores a fundamental truth: every
    software has bugs, and some, like this one, are notoriously difficult
    to locate. The bug caused a segfault about once every 10 days,
    manifesting in an unrelated part of the code and sometimes days after
    the out-of-bounds write occurred.

    This led me to wonder how I could accelerate such crashes to simplify debugging.

    Below is a proof-of-concept program that works in GNU/Linux. For
    rapidity of prototyping, I have assumed a page size of 4096; this is not
    right for all systems.

    The my_array[] array is declared between two page-sized and page-aligned
    guard arrays, guard_0 and guard_1.

    The program write-protects the two arrays with mprotect.

    The output demonstrates that the egregious overrun of my_array[],
    namely a write to my_array[5000], triggers a segfault:

    $ ./prog
    Address of guard_0: 0x4c4000
    Address of my_array: 0x4c5000
    Address of guard_1: 0x4c6000
    guard_1 is now write-protected (read-only).
    writing my_array[0] succeeded
    Segmentation fault (core dumped)

    With a little additional effort, we can manipulate the declarations
    such that the high element of my_array[] will be placed just before
    the guard_1 page. Then we will have byte-accurate overrun detection,
    at the loss of accurate underrun detection.

    A bunch of decades ago, hacker Bruce Perens developed a malloc
    debugging library called Electric Fence which implemented exactly
    this technique, but for malloced objects. We can think of this
    as "Electric Fence, but for static".

    Note that all static arrays have initializers. This is so that they
    are part of the same category of non-zero-initialized data.

    I suspect that this will work fine if all three arrays are
    zero-initialized or all three are non-zero-initialized, but not
    for mixtures. The reason is that zero-initialized and
    non-zero-initialized statics are separated and put into different
    sections.

    Try it, verify that my_array[-1] = 0 segfaults, showing that
    there is accurate underrun protection. Try manipulating the
    declarations to get my_array to butt up against guard_1.

    Code follows ...

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>

    #define PAGE_SIZE 4096

    static char __attribute__((aligned(PAGE_SIZE))) guard_0[PAGE_SIZE] = { 1 };
    static char my_array[42] = { 1 };
    static char __attribute__((aligned(PAGE_SIZE))) guard_1[PAGE_SIZE] = { 1 };

    int main(void) {
        printf("Address of guard_0: %p\n", (void*)guard_0);
        printf("Address of my_array: %p\n", (void*)my_array);
        printf("Address of guard_1: %p\n", (void*)guard_1);

        if (mprotect(guard_0, PAGE_SIZE, PROT_READ) == -1) {
            perror("mprotect guard_0 failed");
            return EXIT_FAILURE;
        }

        if (mprotect(guard_1, PAGE_SIZE, PROT_READ) == -1) {
            perror("mprotect guard_1 failed");
            return EXIT_FAILURE;
        }

        printf("guard_1 is now write-protected (read-only).\n");

        my_array[0] = 2;

        printf("writing my_array[0] succeeded\n");

        my_array[5000] = 2;

        printf("writing my_array[5000] should not have succeeded\n");

        return EXIT_SUCCESS;
    }
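
One possible way to make the array butt up against the guard page, sketched under the same assumptions as the program above (4096-byte pages, GCC or clang attributes). Instead of three separate statics, the padding, the array and the guard are members of one struct, so the layout no longer depends on how the linker orders objects. The names fenced_array, FENCE_PAGE, fence_arm() and fenced_data() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>    /* offsetof */
#include <stdint.h>    /* uintptr_t */
#include <sys/mman.h>  /* mprotect */

#define FENCE_PAGE 4096   /* assumed page size, not right for all systems */
#define ARRAY_LEN  42

/* pad pushes my_array to the very end of the first page, so that
 * my_array[ARRAY_LEN] would be the first byte of the guard page. */
struct fenced_array {
    char pad[FENCE_PAGE - ARRAY_LEN];
    char my_array[ARRAY_LEN];
    char guard[FENCE_PAGE];
};

_Static_assert(offsetof(struct fenced_array, guard) == FENCE_PAGE,
               "guard must start exactly one page into the struct");

/* Non-zero initializer keeps the object in initialized data. */
static struct fenced_array fenced
    __attribute__((aligned(FENCE_PAGE))) = { .pad = { 1 } };

/* Write-protect the guard page; returns 0 on success, -1 on error. */
int fence_arm(void)
{
    assert((uintptr_t)fenced.guard % FENCE_PAGE == 0);
    return mprotect(fenced.guard, FENCE_PAGE, PROT_READ);
}

/* Accessor for the protected array. */
char *fenced_data(void)
{
    return fenced.my_array;
}
```

After fence_arm() succeeds, a write one byte past the end of the array lands on the read-only page. The price is the one mentioned above: underruns now fall into the writable padding and go unnoticed.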

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Fri Jun 13 08:00:58 2025
    From Newsgroup: comp.lang.c

    Am 11.06.2025 um 15:32 schrieb Mateusz Viste:

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the moment of the out-of-bounds write but much later - somewhere entirely different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    That is why I love the bounds-checking C++ containers of MSVC (in debug
    builds) and of the libstdc++ runtime (enabled via a macro). The bug
    still remains in release builds, but anyone who has access to the source
    can run the code, feed it suspicious input, and determine whether there
    is a bounds violation without knowing how the code works.
    But sometimes you have a plain memory range, usually from a C API.
    For that I use a C++20 span, which internally is usually a pointer and
    a size_t. If you apply, e.g., an indexed access to it, the [] operator
    checks the bounds.
    Debug builds are usually much slower, and with C++ they are slower
    still, since simple things like a container access via the [] operator
    go through a separate function call while debugging. With iterator
    debugging it is slower again. But that price is worth the advantage
    of easily finding bounds problems with C++.


    This raises a question: how can such corruptions be detected sooner?

    Use C++.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Fri Jun 13 08:03:23 2025
    From Newsgroup: comp.lang.c

    Am 12.06.2025 um 15:05 schrieb Tim Rentsch:

    void update_my_socks(int *sock, int val) {
    const unsigned N = sizeof socks / sizeof socks[0];
    socks[val % N] = sock;
    }

    For someone who uses bounds-checked containers in C++ every day
    this really looks archaic.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Fri Jun 13 08:42:05 2025
    From Newsgroup: comp.lang.c

    On Thu, 12 Jun 2025 18:59 Kaz Kylheku wrote:
    Below is a proof-of-concept program that works in GNU/Linux. For
    rapidity of prototyping, I have assumed a page size of 4096; this is
    not right for all systems.

    This is very cool! A variation of the classic "sentinel-guarded
    memory" concept, where sentinels are write-protected rather than
    requiring runtime checks against some magic signature.

    Another potential strategy would be to safeguard the static array
    itself, or any other data storage for that matter, immediately after the legitimate code has finished using it. Then unprotect it only when
    needed again. While this might not be a good performer for
    high-frequency operations, it could be an interesting practice
    for memory regions that are rarely modified.

    man mprotect() suggests that it should be used only on mmap-ed memory,
    but apparently under Linux it works with everything.
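
The "protect it when not in use" idea could look roughly like the sketch below. To stay on documented ground it uses an mmap-ed page rather than a static array, and it asks sysconf() for the page size instead of assuming 4096. The names vault_init(), vault_store() and vault_load() are hypothetical, and the sketch ignores threads entirely: another thread could write while the page is briefly unprotected.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static int *vault;          /* page-aligned storage, normally read-only */
static size_t vault_bytes;  /* size of the mapping: one page */

/* Allocate one zero-filled page and leave it write-protected. */
int vault_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    vault = p;
    vault_bytes = (size_t)page;
    return mprotect(vault, vault_bytes, PROT_READ);
}

/* The only sanctioned writer: unprotect, store, protect again. */
int vault_store(size_t idx, int value)
{
    assert(idx < vault_bytes / sizeof *vault);
    if (mprotect(vault, vault_bytes, PROT_READ | PROT_WRITE) == -1)
        return -1;
    vault[idx] = value;
    return mprotect(vault, vault_bytes, PROT_READ);
}

/* Reads stay cheap because the page remains readable. */
int vault_load(size_t idx)
{
    assert(idx < vault_bytes / sizeof *vault);
    return vault[idx];
}
```

Any stray write that does not go through vault_store() then faults immediately. Each store costs two system calls, which fits the remark that this suits rarely modified data better than hot paths.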

    Mateusz

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Fri Jun 13 08:47:17 2025
    From Newsgroup: comp.lang.c

    On Fri, 13 Jun 2025 08:00 Bonita Montero wrote:
    Therefore I love bounds-checking C++ containers with MSVC (debug
    builds) and with the libstdc++ runtime (enabled via macro). (...)
    Debug builds are usually much slower, and with C++ they are slower
    still, since simple things like a container access via the [] operator
    go through a separate function call while debugging. With iterator
    debugging it is slower again. But that price is worth the advantage
    of easily finding bounds problems with C++.

    Sounds similar to Pixar's "Electric Fence" that Kaz mentioned earlier: https://linux.die.net/man/3/efence

    Depending on the performance impact this may or may not be a viable
    solution to debug a rare production issue, but still nice to know it
    exists.

    Mateusz

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Fri Jun 13 08:59:27 2025
    From Newsgroup: comp.lang.c

    On Thu, 12 Jun 2025 06:05 Tim Rentsch wrote:
    The code in question shows several classic error patterns. In no
    particular order:

    * buffer overflow
    * off-by-one error

    I'd consider that one item, since one leads to another.

    * bitwise operator with signed operand

    My mistake. The real code acts on something other than an int; I wasn't
    paying enough attention when writing the illustrative example.

    * using & to effect what is really a modulo operation

    You think of it as modulo, I think of it as "bits trimming".
    Essentially same operation, but different viewpoints I guess.

    I acknowledge that this response isn't exactly an answer to the
    original question. It does illustrate though a kind of thinking
    that can be useful when trying to track down hard-to-find bugs.

    Thank you for your insightful remarks. I completely agree - the best
    way to debug a program is to avoid the need for debugging in the first
    place. :-) But working with a large, 15-year-old codebase that has
    seen contributions from dozens of programmers makes things a bit
    non-ideal sometimes.

    Mateusz

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Fri Jun 13 09:13:13 2025
    From Newsgroup: comp.lang.c

    On Thu, 12 Jun 2025 20:01:59 Michael S wrote:

    Probably too slow. If I were in Mateusz's situation, I would try AddressSanitizer.

    Still slow - albeit maybe at an acceptable level. But if not suitable
    for production code, this is actually an awesome addition for testing
    builds. Thanks for the hint!

    Never tried it myself, but it looks like better fit for this
    particular relatively simple case of buffer overrun.

    Part of the problem was that I had no clue this was a stupid buffer
    overrun before I actually found the issue. My leading hypothesis was
    involving mischievous gremlins tampering with bits in my variables.

    In hindsight, enabling -fsanitize=address in testing builds could have highlighted the problem sooner, potentially sparing me a few hours of
    hunting.

    Mateusz

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From pozz@pozzugno@gmail.com to comp.lang.c on Fri Jun 13 09:21:47 2025
    From Newsgroup: comp.lang.c

    Il 12/06/2025 20:59, Kaz Kylheku ha scritto:
    On 2025-06-12, Mateusz Viste <mateusz@x.invalid> wrote:
    Thank you all for your thoughtful responses. You rightly identified
    that the problem is essentially an out-of-bounds access - a symptom of
    deeper code quality issues. The bug in question managed to pass unit
    tests, peer review, functional tests, and it didn’t trigger any
    warnings from GCC or clang, even with the strict -Weverything flag I
    enforce across my teams. This underscores a fundamental truth: every
    software has bugs, and some, like this one, are notoriously difficult
    to locate. The bug caused a segfault about once every 10 days,
    manifesting in an unrelated part of the code and sometimes days after
    the out-of-bounds write occurred.

    This led me to wonder how I could accelerate such crashes to simplify
    debugging.

    Below is a proof-of-concept program that works in GNU/Linux. For
    rapidity of prototyping, I have assumed a page size of 4096; this is not right for all systems.
    [...]

    However, this strategy assumes you already know that some instruction
    writes to the array at an out-of-bounds position.

    I think the situation of the original post is different. His program
    crashed infrequently, very infrequently, and he didn't know anything
    about the cause. I think it was a very big effort to link the crash to
    the array (in another source module) and to the out-of-bounds access of
    the array.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Fri Jun 13 14:14:20 2025
    From Newsgroup: comp.lang.c

    On Fri, 13 Jun 2025 09:21 pozz wrote:

    However this strategy assumes you already know there's some
    instruction that write to the array at an out-of-bound position.

    Yes, though I see Kaz's idea is to proactively protect all memory used
    by the program. It's an interesting concept, though not particularly
    practical.

    I think the situation of the original post is different. His program
    crashed infrequently, very infrequently, and he didn't know anything
    about the cause. I think it was a very big effort to link the crash
    to the array (in another source module) and to the out-of-bound
    access of the array.

    You are spot on indeed. Huge program with lots of modules, processing
    millions of data entries every minute. Realizing that the issue was an
    out of bounds situation was challenging because the symptoms were in a
    totally different part of the program. Very confusing.

    Hence why I was wondering if there is any way to make invalid memory
    accesses *within the same program* generate a segfault, so next time I
    have to deal with such self-sabotaging program I know at least which
    module (compilation unit) to look at. Since then I learned that:
    - There is no readily available mechanism for this today on x86
    - CHERI shows great promise, possibly in the coming years
    - mprotect() can offer some degree of protection but must be used
    carefully, as it primarily safeguards against writes in general rather
    than restricting which parts of the code can access memory


    Mateusz

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Fri Jun 13 16:56:23 2025
    From Newsgroup: comp.lang.c

    On Fri, 13 Jun 2025 14:14:20 +0200
    Mateusz Viste <mateusz@x.invalid> wrote:

    - There is no readily available mechanism for this today on x86

    A significant part of the x86 installed base (all Intel Core CPUs from
    gen 6 up to gen 9 and their Xeon contemporaries) has an extension
    named Intel MPX that was invented exactly for that purpose. But it didn't
    work particularly well. Compiler people never liked it, but despite
    that it was supported by several generations of gcc and probably by
    clang as well.

    The proper solution to your problem is to stop using a memory-unsafe
    language for complex application programming. It's not that successful
    use of unsafe languages for complex application programming is
    impossible. Practice has proved many times that it can be done. But
    only by a very good team. Your team is not good enough.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Richard Heathfield@rjh@cpax.org.uk to comp.lang.c on Fri Jun 13 15:43:09 2025
    From Newsgroup: comp.lang.c

    On 13/06/2025 14:56, Michael S wrote:
    The proper solution to your problem is to stop using a memory-unsafe
    language for complex application programming.

    Not if you know what you're doing.

    It's not that successful
    use of unsafe languages for complex application programming is
    impossible.

    It isn't.

    Practice has proved many times that it can be done. But
    only by a very good team. Your team is not good enough.

    Sound advice. If you can't stand the heat, get out of the
    kitchen. Go and drive a cab or something, and leave programming
    to the grown-ups.
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Fri Jun 13 16:17:28 2025
    From Newsgroup: comp.lang.c

    On 2025-06-13, Mateusz Viste <mateusz@x.invalid> wrote:
    On Thu, 12 Jun 2025 18:59 Kaz Kylheku wrote:
    Below is a proof-of-concept program that works in GNU/Linux. For
    rapidity of prototyping, I have assumed a page size of 4096; this is
    not right for all systems.

    This is very cool! A variation of the classic "sentinel-guarded
    memory" concept, where sentinels are write-protected rather than
    requiring runtime checks against some magic signature.

    Another potential strategy would be to safeguard the static array
    itself, or any other data storage for that matter, immediately after the legitimate code has finished using it. Then unprotect it only when
    needed again. While this might not be a good performer for
    high-frequency operations, it could be an interesting practice
    for memory regions that are rarely modified.

    I have taken such an approach in the integration between Valgrind
    and the TXR Lisp garbage collector. Free objects are inaccessible.
    During conservative parts of the scan, we could encounter, in the
    run-time stack, a pointer to a freed object. So the Valgrind
    API has to be used to make the object accessible before examining it.
    If it is a free object, it is marked inaccessible again and ignored.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Fri Jun 13 16:19:03 2025
    From Newsgroup: comp.lang.c

    On 2025-06-13, Mateusz Viste <mateusz@x.invalid> wrote:
    On Fri, 13 Jun 2025 08:00 Bonita Montero wrote:
    Therefore I love bounds-checking C++ containers with MSVC (debug
    builds) and with the libstdc++ runtime (enabled via macro). (...)
    Debug builds are usually much slower, and with C++ they are slower
    still, since simple things like a container access via the [] operator
    go through a separate function call while debugging. With iterator
    debugging it is slower again. But that price is worth the advantage
    of easily finding bounds problems with C++.

    Sounds similar to Pixar's "Electric Fence" that Kaz mentioned earlier: https://linux.die.net/man/3/efence

    Depending on the performance impact this may or may not be a viable
    solution to debug a rare production issue, but still nice to know it
    exists.

    Saved my ass back in 1994. I cranked out an event-driven windowing
    UI over ncurses and had a crash somewhere. The ncurses guys pointed
    me to efence.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Fri Jun 13 16:23:28 2025
    From Newsgroup: comp.lang.c

    On 2025-06-13, Mateusz Viste <mateusz@x.invalid> wrote:
    On Fri, 13 Jun 2025 09:21 pozz wrote:

    However this strategy assumes you already know there's some
    instruction that write to the array at an out-of-bound position.

    Yes, though I see Kaz's idea is to proactively protect all memory used
    by the program. It's an interesting concept, though not particularly practical.

    The question you posed at the root of the thread, in the middle of the
    article was: "Is there a way to enforce memory protection between module
    files of the same program?".

    Well, that is one way. Put guard pages around their statics, and have
    a little framework whereby the init routines of all the modules can
    register these pages. You can make it so that it all disappears based
    on some #define.
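
A rough sketch of what such a framework might look like. All names here (GUARD_PAGES, guard_register(), guard_arm_all(), guard_count()) are invented for the illustration. Each module's init routine registers its guard pages, and one call after initialization write-protects them all. Building with -DGUARD_PAGES=0 turns the calls into constants.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef GUARD_PAGES
#define GUARD_PAGES 1   /* build with -DGUARD_PAGES=0 to compile it away */
#endif

#define GUARD_MAX 64    /* arbitrary limit for the sketch */

#if GUARD_PAGES
static struct { void *addr; size_t len; } guard_table[GUARD_MAX];
static size_t guard_used;

/* Called from each module's init routine. Returns 0, or -1 if full. */
int guard_register(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    if (guard_used == GUARD_MAX)
        return -1;
    guard_table[guard_used].addr = addr;
    guard_table[guard_used].len = len;
    guard_used++;
    return 0;
}

/* Called once after all modules have registered.
 * Returns the number of regions that could not be protected. */
size_t guard_arm_all(void)
{
    size_t failures = 0;
    for (size_t i = 0; i < guard_used; i++)
        if (mprotect(guard_table[i].addr, guard_table[i].len, PROT_READ) == -1)
            failures++;
    return failures;
}

/* Number of regions registered so far. */
size_t guard_count(void)
{
    return guard_used;
}
#else
#define guard_register(addr, len) 0
#define guard_arm_all()           ((size_t)0)
#define guard_count()             ((size_t)0)
#endif
```

A module would declare its page-aligned guard arrays around its statics, as in the proof-of-concept program, and call guard_register(guard_0, sizeof guard_0) from its init routine. The registered addresses must be page-aligned for mprotect() to accept them.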

    It can be entirely practical, depending on the program. Even
    in some programs of moderate complexity.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Fri Jun 13 17:14:14 2025
    From Newsgroup: comp.lang.c

    On 2025-06-13, Michael S <already5chosen@yahoo.com> wrote:
    The proper solution to your problem is to stop using a memory-unsafe
    language for complex application programming. It's not that successful
    use of unsafe languages for complex application programming is
    impossible. Practice has proved many times that it can be done. But
    only by a very good team. Your team is not good enough.

    There are disadvantages to it even if the team is good and the work
    product is free of defects.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From wij@wyniijj5@gmail.com to comp.lang.c on Sat Jun 14 02:10:13 2025
    From Newsgroup: comp.lang.c

    On Fri, 2025-06-13 at 14:14 +0200, Mateusz Viste wrote:
    On Fri, 13 Jun 2025 09:21 pozz wrote:

    However this strategy assumes you already know there's some
    instruction that write to the array at an out-of-bound position.

    Yes, though I see Kaz's idea is to proactively protect all memory used
    by the program. It's an interesting concept, though not particularly practical.

    I think the situation of the original post is different. His program crashed infrequently, very infrequently, and he didn't know anything
    about the cause. I think it was a very big effort to link the crash
    to the array (in another source module) and to the out-of-bound
    access of the array.

    You are spot on indeed. Huge program with lots of modules, processing millions of data entries every minute. Realizing that the issue was an
    out of bounds situation was challenging because the symptoms were in a totally different part of the program. Very confusing.

    Hence why I was wondering if there is any way to make invalid memory
    accesses *within the same program* generate a segfault, so next time I
    have to deal with such self-sabotaging program I know at least which
    module (compilation unit) to look at. Since then I learned that:
    - There is no readily available mechanism for this today on x86
    There will never be a cure of the 'auto range check' kind that you are
    looking for. Manually coding out-of-range checks is the way to go.
    - CHERI shows great promise, possibly in the coming years
    - mprotect() can offer some degree of protection but must be used
      carefully, as it primarily safeguards against writes in general rather
      than restricting which parts of the code can access memory

    Mateusz
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From wij@wyniijj5@gmail.com to comp.lang.c on Sat Jun 14 02:16:59 2025
    From Newsgroup: comp.lang.c

    On Fri, 2025-06-13 at 08:03 +0200, Bonita Montero wrote:
    Am 12.06.2025 um 15:05 schrieb Tim Rentsch:

        void update_my_socks(int *sock, int val) {
           const unsigned N = sizeof socks / sizeof socks[0];
           socks[val % N] = sock;
        }

    For someone who uses bounds-checked containers in C++ every day
    this really looks archaic.
    Really? What are they?
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Fri Jun 13 20:43:28 2025
    From Newsgroup: comp.lang.c

    Am 13.06.2025 um 20:16 schrieb wij:
    On Fri, 2025-06-13 at 08:03 +0200, Bonita Montero wrote:
    Am 12.06.2025 um 15:05 schrieb Tim Rentsch:

        void update_my_socks(int *sock, int val) {
           const unsigned N = sizeof socks / sizeof socks[0];
           socks[val % N] = sock;
        }

    For someone who uses bounds-checked containers in C++ every day
    this really looks archaic.

    Really? What are they?

    All containers with MSVC are bounds-checked while debugging.
    With libstdc++ (g++ / clang) you have to define a macro that
    enables bounds-checking.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri Jun 13 12:32:29 2025
    From Newsgroup: comp.lang.c

    wij <wyniijj5@gmail.com> writes:
    On Fri, 2025-06-13 at 08:03 +0200, Bonita Montero wrote:
    Am 12.06.2025 um 15:05 schrieb Tim Rentsch:

        void update_my_socks(int *sock, int val) {
           const unsigned N = sizeof socks / sizeof socks[0];
           socks[val % N] = sock;
        }

    For someone who uses bounds-checked containers in C++ every day
    this really looks archaic.

    Really? What are they?

    Feel free to discuss that in comp.lang.c++.
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri Jun 13 15:48:37 2025
    From Newsgroup: comp.lang.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    wij <wyniijj5@gmail.com> writes:

    On Fri, 2025-06-13 at 08:03 +0200, Bonita Montero wrote:

    Am 12.06.2025 um 15:05 schrieb Tim Rentsch:

    void update_my_socks(int *sock, int val) {
    const unsigned N = sizeof socks / sizeof socks[0];
    socks[val % N] = sock;
    }

    For someone who uses bounds-checked containers in C++ every day
    this really looks archaic.

    Really? What are they?

    Feel free to discuss that in comp.lang.c++.

    As a point of information, I have given up reading posts from
    Bonita Montero.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri Jun 13 16:31:26 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@x.invalid> writes:

    On Thu, 12 Jun 2025 06:05 Tim Rentsch wrote:

    The code in question shows several classic error patterns. In no
    particular order:

    * buffer overflow
    * off-by-one error

    I'd consider that one item, since one leads to another.

    You shouldn't. Even if they seem to be related in this instance,
    they are distinct kinds of errors. The code I posted to eliminate
    the buffer overflow does avoid that problem but it still had an
    off-by-one error.

    * using & to effect what is really a modulo operation

    You think of it as modulo, I think of it as "bits trimming".
    Essentially same operation, but different viewpoints I guess.

    It isn't wrong to think of bitwise-and as masking-in (or possibly
    masking-out) of certain bits, but it still isn't a modulo. A modulo
    operation is what is desired; in some cases that can be effected by
    a bitwise-and, but in this case bitwise-and does the wrong thing.
    The whole point is that it is NOT essentially the same operation.
    It's a different operation, and in this case the wrong one.
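
    A minimal sketch of the difference, assuming the 0xffff-element array
    from the original post. The helper names index_by_mask and
    index_by_modulo are made up for illustration. The mask can produce
    0xffff, which is one past the last valid index; the modulo on the
    element count cannot.

```c
#include <stddef.h>

/* Stand-in for the array from the original post: 0xffff elements,
   so the valid indices are 0 .. 0xfffe. */
static int *socks[0xffff];

#define SOCKS_COUNT (sizeof socks / sizeof socks[0])

/* Index as computed by the original code: range 0 .. 0xffff. */
static size_t index_by_mask(int val)
{
    return (size_t)(val & 0xffff);
}

/* Index computed by modulo on the element count: range 0 .. 0xfffe. */
static size_t index_by_modulo(int val)
{
    return (size_t)((unsigned)val % SOCKS_COUNT);
}
```

    The modulo form keeps the store inside the array. It does not by
    itself settle whether the array should have had 0x10000 elements.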

    I acknowledge that this response isn't exactly an answer to the
    original question. It does illustrate though a kind of thinking
    that can be useful when trying to track down hard-to-find bugs.

    Thank you for your insightful remarks. I completely agree - the best
    way to debug a program is to avoid the need for debugging in the first
    place. :-) But working with a large, 15-year-old codebase that has
    seen contributions from dozens of programmers makes things a bit
    non-ideal sometimes.

    I think you have misunderstood the point of my comments. In some
    cases one is confronted with a symptom that defies one's best
    efforts to diagnose what is causing the symptom. Looking for known
    classes of errors is another arrow in the quiver of techniques for
    discovering what is causing the observed behavior. My point is that
    there are several types of errors that could have been used, after
    the fact, to uncover what was causing your problem here. Taking
    this approach might end up using a fair bit of time, but that time
    is not wasted if it finds other potential lurking bugs, and there is
    a good chance it will.
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Sat Jun 14 22:07:02 2025
    From Newsgroup: comp.lang.c

    On Fri, 13 Jun 2025 15:43:09 +0100
    Richard Heathfield <rjh@cpax.org.uk> wrote:

    On 13/06/2025 14:56, Michael S wrote:

    The practice proved many times that it can be done. But
    only by a very good team. Your team is not good enough.

    Sound advice. If you can't stand the heat, get out of the
    kitchen. Go and drive a cab or something, and leave programming
    to the grown-ups.


    That does not sound right.
    There are plenty of people who can be successful programmers despite
    lacking the abilities to be successful programmers in unsafe languages.
    More so, it's not uncommon for people that can successfully program in
    unsafe languages to be less productive application programmers than
    people that, as you put it, "can't stand the heat".

  • From Mateusz Viste@mateusz@not.gonna.tell to comp.lang.c on Sat Jun 14 21:37:51 2025
    From Newsgroup: comp.lang.c

    On 13.06.2025 15:56, Michael S wrote:

    A significant part of the x86 installed base (all Intel Core CPUs starting
    from gen 6 up to gen 9 and their Xeon contemporaries) has an extension
    named Intel MPX that was invented exactly for that purpose. But it didn't
    work particularly well. Compiler people never liked it, but despite
    that it was supported by several generations of gcc and probably by
    clang as well.

    This does not really sound like something "readily available", unless you
    are suggesting that I migrate to a Linux kernel from 10 years ago, switch
    to gcc 5.0 and use outdated hardware.

    The proper solution to your problem is to stop using a memory-unsafe
    language for complex application programming. It's not that successful
    use of unsafe languages for complex application programming is
    impossible. The practice proved many times that it can be done. But
    only by a very good team. Your team is not good enough.

    Just to clarify: I didn’t post here seeking help with a simple out-of-bounds
    issue, nor was I here to vent. I’ve been wrangling C code in complex,
    high-performance systems for over a decade - I’m managing just fine. Code
    improvement is a continual, non-negotiable process in our line of work, but
    fires happen occasionally nonetheless. While fixing the issue, I started
    wondering about how faults like this could be located faster, that is
    assuming they do slip into production - because in spite of the testing
    process, some faults will inevitably get to customers.

    A crash that happens closer to the source of the problem (same compilation
    unit) would significantly ease the debugging effort. I figured it was a
    topic worth sharing, in the spirit of sparking some constructive
    discussions.

    Mateusz
  • From Mateusz Viste@mateusz@not.gonna.tell to comp.lang.c on Sat Jun 14 22:22:12 2025
    From Newsgroup: comp.lang.c

    On 14.06.2025 01:31, Tim Rentsch wrote:
    It isn't wrong to think of bitwise-and as masking-in (or possibly
    masking-out) of certain bits, but it still isn't a modulo. A modulo
    operation is what is desired;

    By "different viewpoints," I meant that while you approach the problem by
    applying a modulo operation to the index so it fits the array size, I tend
    to think in terms of ensuring the index correctly maps to a location within
    an n-bit address space. Naturally, the array should accommodate the maximum
    possible index for the given address space, and that’s where the original
    code fell short. And you're absolutely right that hardcoded values are
    problematic; the size of the array should have been linked with the n-bit
    address space expectation.
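
    One way to tie the array size to the index width is to derive both the
    element count and the mask from a single constant. This is only a
    sketch: SOCK_INDEX_BITS, SOCK_COUNT, SOCK_INDEX_MASK and lookup_sock
    are names invented here, and a 16-bit index is assumed.

```c
#include <stddef.h>

/* Assumption for illustration: indices are SOCK_INDEX_BITS wide. */
#define SOCK_INDEX_BITS 16u
#define SOCK_COUNT      ((size_t)1 << SOCK_INDEX_BITS)  /* 0x10000 slots */
#define SOCK_INDEX_MASK (SOCK_COUNT - 1)                /* 0xffff */

static int *socks[SOCK_COUNT];

void update_my_socks(int *sock, int val)
{
    /* The mask and the array size cannot drift apart: both come
       from SOCK_INDEX_BITS. */
    socks[(size_t)val & SOCK_INDEX_MASK] = sock;
}

int *lookup_sock(int val)
{
    return socks[(size_t)val & SOCK_INDEX_MASK];
}
```

    Changing the index width then means editing one line, and the array
    always has a slot for every value the mask can produce.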

    I think you have misunderstood the point of my comments. In some
    cases one is confronted with a symptom that defies one's best
    efforts to diagnose what is causing the symptom. Looking for known
    classes of errors is another arrow in the quiver of techniques for
    discovering what is causing the observed behavior.

    My remark was tongue-in-cheek, but we’re clearly on the same wavelength, no
    worries. Digging into “known classes of errors” when facing bit-fiddling
    gremlins is precisely how I pinpointed the root cause, and proactively
    tracking other similar mistakes is on my todo. But this is an obvious,
    mechanical and uninteresting subject. As I mentioned to Michael earlier,
    improving code quality is a long-term, essential aspect of our work,
    there’s no question about that. But alongside this continuous effort, I’m
    always exploring strategies to be more defensive towards the current,
    non-ideal code.

    In this case, my initial thought was to split the program into smaller
    components that communicate via IPC. This approach would allow a faulty
    component to crash with a segfault without compromising the memory of other
    parts, greatly easing the debugging process. IPC is much more
    limiting and slower than a function call, so it made me wonder if it is
    possible to achieve a similar level of isolation within a single program.
    That question led me to post here.

    While there is no magic solution yet, Kaz suggested a clever workaround
    using mprotect(), a compromise I’m considering applying in a few places.


    Mateusz
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun Jun 15 13:57:59 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@not.gonna.tell> wrote:
    On 13.06.2025 15:56, Michael S wrote:

    A significant part of the x86 installed base (all Intel Core CPUs starting
    from gen 6 up to gen 9 and their Xeon contemporaries) has an extension
    named Intel MPX that was invented exactly for that purpose. But it didn't
    work particularly well. Compiler people never liked it, but despite
    that it was supported by several generations of gcc and probably by
    clang as well.

    This does not really sound like something "readily available", unless you
    are suggesting that I migrate to a Linux kernel from 10 years ago, switch
    to gcc 5.0 and use outdated hardware.

    The proper solution to your problem is to stop using a memory-unsafe
    language for complex application programming. It's not that successful
    use of unsafe languages for complex application programming is
    impossible. The practice proved many times that it can be done. But
    only by a very good team. Your team is not good enough.

    Just to clarify: I didn’t post here seeking help with a simple out-of-bounds
    issue, nor was I here to vent. I’ve been wrangling C code in complex,
    high-performance systems for over a decade - I’m managing just fine. Code
    improvement is a continual, non-negotiable process in our line of work, but
    fires happen occasionally nonetheless. While fixing the issue, I started
    wondering about how faults like this could be located faster, that is
    assuming they do slip into production - because in spite of the testing
    process, some faults will inevitably get to customers.

    A crash that happens closer to the source of the problem (same compilation
    unit) would significantly ease the debugging effort. I figured it was a
    topic worth sharing, in the spirit of sparking some constructive
    discussions.

    You should understand that C array indexing and pointer
    operations are defined in a specific way. This has several
    advantages. But it also has a significant cost: checking the validity
    of array indexing in C is much harder than in other languages.
    Namely, in most languages the implementation knows the size/bounds of
    an array and can automatically generate checks on each access.
    This has some cost, but modern experience is that this cost
    is quite acceptable (on average about 5-10% increase in runtime
    and similar increase in size). In C the compiler sometimes knows
    the size of the array, but in general it does not. So in C you
    either use half measures, like hoping that paging hardware
    will catch out-of-bounds access (possibly arranging data layout to
    increase the chance of a fault), or very expensive approaches,
    which essentially bundle bounds with the pointer (Intel
    tried to add hardware support for this, but even with
    hardware support it is still much more expensive than checking
    in some other languages).

    IIUC in your example the array was global, so the compiler knew its
    bound and in principle could generate bounds checks. But
    I am not aware of a C compiler which actually generates such
    checks. AFAIK gcc sanitize options are doing a somewhat different
    thing. Tiny C has an option to generate bounds checks, but
    it is not clear to me in which cases it is effective (and you
    probably would not use Tiny C for performance-critical code).
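
    As a point of comparison, GCC and Clang both document a
    -fsanitize=bounds option (part of UBSan) that instruments indexing of
    arrays whose size is known at compile time. Whether it suits a
    production build is a separate question. Independent of any compiler
    option, the check can be written by hand when the array is in scope.
    The function name socks_store is invented for this sketch.

```c
#include <stddef.h>

static int *socks[0xffff];

/* Store with an explicit bounds check. Returns 0 on success and -1
   if idx does not name an element of socks. */
int socks_store(size_t idx, int *sock)
{
    if (idx >= sizeof socks / sizeof socks[0])
        return -1;
    socks[idx] = sock;
    return 0;
}
```

    A caller that passes 0xffff gets -1 back instead of silently
    overwriting whatever object the linker placed after the array.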

    Note that in C++ when you use C arrays, you have the same
    situation as in C. But you can instead use array classes which
    check accesses.
    --
    Waldek Hebisch
  • From Mateusz Viste@mateusz@not.gonna.tell to comp.lang.c on Sun Jun 15 20:27:17 2025
    From Newsgroup: comp.lang.c

    On 15.06.2025 15:57, antispam@fricas.org wrote:
    IIUC in your example the array was global, so the compiler knew its
    bound and in principle could generate bounds checks. But
    I am not aware of a C compiler which actually generates such
    checks.

    There was one apparently as early as 1983 :)

    https://www.doc.ic.ac.uk/~afd/rarepapers/KendallBccRuntimeCheckingsforC.pdf

    Granted, it wasn’t a full-fledged C compiler, more of a bounds-checking code
    generator. Still, the paper is a fascinating read and highlights that this
    topic has been explored for quite some time. A more recent variation on the
    theme can be seen here (based on GCC BP, abandoned a couple years ago):

    https://www.cs.purdue.edu/homes/xyzhang/fall07/Papers/TR181.pdf

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic. Hence why I was looking in the direction
    of the MMU. All compilation units of a program share the same set of TLBs.
    I figured there might perhaps be a way to isolate a given compilation unit
    in different TLBs, effectively sandboxing its memory, then make this unit
    communicate with the rest of the program via shm when shared memory
    accesses are needed.

    Of course, even if such a solution were possible, it would not be very
    practical. Besides, one could easily achieve the same isolation by turning
    that compilation unit into a standalone, service-providing daemon.

    Mateusz
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun Jun 15 23:50:15 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@not.gonna.tell> wrote:

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic.

    AFAICS there is no reason for explicit pointer arithmetic in well
    written C programs. Implicit pointer arithmetic (coming from array
    indexing) is done by compiler so should be no problem. Like in
    case of bounds checking using other languages can help in avoiding
    dangling pointers.

    Hence why I was looking in the direction
    of the MMU. All compilation units of a program share the same set of TLBs.
    I figured there might perhaps be a way to isolate a given compilation unit
    in different TLBs, effectively sandboxing its memory, then make this unit
    communicate with the rest of the program via shm when shared memory
    accesses are needed.

    Changing TLB contents is rather expensive. Also, what is "its memory"
    supposed to mean? Normally functions in a C program pass pointers
    to other functions, so several functions can legally access rather
    large and time-varying parts of memory. The best approximation to
    your idea available in PC hardware is 286/386 segmentation. But
    it proved to be quite inconvenient, so "everybody" is now using flat
    mode. One could try to emulate segmentation using paging hardware,
    and your idea clearly goes in that direction, but it is unlikely
    to work well.
    --
    Waldek Hebisch
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Mon Jun 16 01:01:35 2025
    From Newsgroup: comp.lang.c

    On 2025-06-15, Waldek Hebisch <antispam@fricas.org> wrote:
    Mateusz Viste <mateusz@not.gonna.tell> wrote:

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic.

    AFAICS there is no reason for explicit pointer arithmetic in well
    written C programs.

    LOL, you heard it here.

    Implicit pointer arithmetic (coming from array
    indexing) is done by compiler so should be no problem. Like in

    Array indexing *is* pointer arithmetic.

    Are you not aware of this equivalence?

    (E1)[(E2)] <---> *((E1) + (E2))

    In fact, let's draw the commutative diagram

    (E1)[(E2)] <---> *((E1) + (E2))
    ^ ^
    | |
    | |
    v v
    (E2)[(E1)] <---> *((E2) + (E1))

    You're not saying anything here other than that you like the p[i]
    /notation/ better than *(p + i), and &p[i] better than p + i.
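
    The equivalence can be checked mechanically. In this small sketch the
    function name same_element is made up; it compares the four spellings
    from the diagram for one pointer and one index.

```c
#include <stddef.h>

/* The C standard defines E1[E2] as *((E1) + (E2)), and + commutes,
   so all four spellings denote the same element.
   Returns 1 if they agree for the given pointer and index. */
int same_element(const int *p, size_t i)
{
    return &p[i] == p + i
        && &i[p] == i + p
        && p[i] == *(p + i)
        && i[p] == *(i + p);
}
```

    For int a[4] = {10, 20, 30, 40}, the expression 2[a] evaluates to 30,
    exactly like a[2].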

    Great, thanks for sharing!

    You're not doing yourself any favor by confusing
    "not styled in my taste" with "not well written".
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon Jun 16 10:00:34 2025
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    On 2025-06-15, Waldek Hebisch <antispam@fricas.org> wrote:
    Mateusz Viste <mateusz@not.gonna.tell> wrote:

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic.

    AFAICS there is no reason for explicit pointer arithmetic in well
    written C programs.

    LOL, you heard it here.

    Implicit pointer arithmetic (coming from array
    indexing) is done by compiler so should be no problem. Like in

    Array indexing *is* pointer arithmetic.

    Are you not aware of this equivalence?

    (E1)[(E2)] <---> *((E1) + (E2))


    Learn to read.

    In fact, let's draw the commutative diagram

    (E1)[(E2)] <---> *((E1) + (E2))
    ^ ^
    | |
    | |
    v v
    (E2)[(E1)] <---> *((E2) + (E1))

    You're not saying anything here other than that you like the p[i]
    /notation/ better than *(p + i), and &p[i] better than p + i.

    The indexing notation at least has a chance of being automatically
    checked (in cases when the compiler/checker knows the array size). With
    arbitrary user-written pointer arithmetic there is no hope of automatic
    checking.
    --
    Waldek Hebisch
  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Mon Jun 16 06:12:01 2025
    From Newsgroup: comp.lang.c

    On 2025-06-16 06:00, Waldek Hebisch wrote:
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    ...
    You're not saying anything here other than that you like the p[i]
    /notation/ better than *(p + i), and &p[i] better than p + i.

    The indexing notation at least has a chance of being automatically
    checked (in cases when the compiler/checker knows the array size). With
    arbitrary user-written pointer arithmetic there is no hope of automatic
    checking.

    Since they are, by definition, equivalent, *(p+i) can be
    automatically checked under precisely the same situations where p[i] can
    be checked. It makes NO difference.


  • From Louis Krupp@lkrupp@invalid.pssw.com.invalid to comp.lang.c on Mon Jun 16 06:29:30 2025
    From Newsgroup: comp.lang.c

    On 6/11/2025 7:32 AM, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    <snip>

    Imagine an alternate universe in which array declarations took the
    form (borrowed from Unisys ALGOL):

    array_name[lower_bound : upper_bound]

    The array in question would have been declared

    static int *socks[0 : 0xffff]

    The mask 0xffff and the upper bound would have been the same, and the
    code would have been obviously right instead of subtly wrong.

    Louis


  • From Mateusz Viste@mateusz@x.invalid to comp.lang.c on Mon Jun 16 15:01:28 2025
    From Newsgroup: comp.lang.c

    On Mon, 16 Jun 2025 06:29:30 Louis Krupp wrote:
    Imagine an alternate universe in which array declarations took the
    form (borrowed from Unisys ALGOL):

    array_name[lower_bound : upper_bound]

    This alternate C universe you describe looks appealing, but I strongly
    suspect it is currently tormented by violent conflicts between the
    noble 0-based traditionalists, the idealistic 1-based reformists, and
    the rogue "random-based" anarchists. Our C is not perfect, but we could
    have ended up with much worse.

    Mateusz

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Jun 16 06:10:46 2025
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Mateusz Viste <mateusz@not.gonna.tell> wrote:

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic.

    AFAICS there is no reason for explicit pointer arithmetic in well
    written C programs.

    This assertion is in effect a No True Scotsman statement.

    Implicit pointer arithmetic (coming from array
    indexing) is done by compiler so should be no problem.

    Even if there is no direct manipulation ("pointer arithmetic") of
    pointer variables, access can be checked only if array bounds
    information is available, and in many cases it isn't. The reason is
    (among other things) C doesn't have array parameters; what it does
    have instead is pointer parameters. At the point in the code when
    an "array" access is to be done, the information needed to check
    that an index value is in bounds just isn't available. The culprit
    here is not explicit pointer arithmetic, but lacking the information
    needed to do a bounds check. That lack is inherent in how the C
    language works with respect to arrays and pointer conversion.
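
    A short sketch of that loss of information, with made-up function names.
    Inside the callee only a pointer is visible, so sizeof reports the
    pointer size. A check becomes possible only when the caller passes the
    element count alongside the pointer.

```c
#include <stddef.h>

/* The callee sees a pointer, not an array: this returns
   sizeof(int **), whatever the size of the caller's array. */
size_t size_seen_by_callee(int **slots)
{
    return sizeof slots;
}

/* A bounds check is possible only because the count travels with
   the pointer. Returns 0 on success, -1 if idx is out of range. */
int checked_store(int **slots, size_t count, size_t idx, int *value)
{
    if (idx >= count)
        return -1;
    slots[idx] = value;
    return 0;
}
```

    In the caller, sizeof arr / sizeof arr[0] still yields the element
    count, which is why that expression has to be evaluated there and
    handed down.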
  • From Rosario19@Ros@invalid.invalid to comp.lang.c on Mon Jun 16 18:14:05 2025
    From Newsgroup: comp.lang.c

    On Thu, 12 Jun 2025 19:15:26 +0100, Richard Heathfield wrote:
    Sure. Or some people prefer to single-step with a debugger. Such
    people can make their lives a little easier by surrounding the
    buffer with sentinel soldiers, setting the sentinel soldiers to a
    magic number, and putting a watch on them both - the buffer high
    soldier and the buffer low soldier.

    I think an out-of-bounds write often lands on the memory right at
    one of the two ends of the array... but there are cases where those
    boundaries are untouched and memory outside the array is overwritten
    all the same, in some other place.
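
    The sentinel idea quoted above can be sketched as a buffer bracketed by
    two magic values. The struct layout, the names and the magic constant
    are all invented for illustration.

```c
#include <stdint.h>

#define SENTINEL_MAGIC 0xDEADBEEFu  /* arbitrary value for illustration */

/* Buffer bracketed by two sentinels. */
struct guarded_buf {
    uint32_t low;        /* the "buffer low soldier" */
    int      data[64];
    uint32_t high;       /* the "buffer high soldier" */
};

void guarded_init(struct guarded_buf *g)
{
    g->low = SENTINEL_MAGIC;
    g->high = SENTINEL_MAGIC;
}

/* Returns 1 while both sentinels are intact, 0 once either one
   has been overwritten. */
int guarded_ok(const struct guarded_buf *g)
{
    return g->low == SENTINEL_MAGIC && g->high == SENTINEL_MAGIC;
}
```

    In a debugger, a watchpoint on each sentinel stops the program at the
    offending store. The caveat raised in the reply still applies: a write
    that skips past the sentinels and lands elsewhere is not detected.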
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon Jun 16 16:47:26 2025
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    antispam@fricas.org (Waldek Hebisch) writes:

    Mateusz Viste <mateusz@not.gonna.tell> wrote:

    That said, detecting out-of-bounds array access is no panacea. Memory
    corruption can arise from various sources, such as dangling pointers or
    poorly managed pointer arithmetic.

    AFAICS there is no reason for explicit pointer arithmetic in well
    written C programs.

    This assertion is in effect a No True Scotsman statement.

    Implicit pointer arithmetic (coming from array
    indexing) is done by compiler so should be no problem.

    Even if there is no direct manipulation ("pointer arithmetic") of
    pointer variables, access can be checked only if array bounds
    information is available, and in many cases it isn't. The reason is
    (among other things) C doesn't have array parameters; what it does
    have instead is pointer parameters. At the point in the code when
    an "array" access is to be done, the information needed to check
    that an index value is in bounds just isn't available. The culprit
    here is not explicit pointer arithmetic, but lacking the information
    needed to do a bounds check. That lack is inherent in how the C
    language works with respect to arrays and pointer conversion.

    Yes, I wrote this in an earlier message. Here OP concern was
    specifically "poorly managed pointer arithmetic".
    --
    Waldek Hebisch
  • From Richard Heathfield@rjh@cpax.org.uk to comp.lang.c on Mon Jun 16 17:53:31 2025
    From Newsgroup: comp.lang.c

    On 16/06/2025 17:14, Rosario19 wrote:
    On Thu, 12 Jun 2025 19:15:26 +0100, Richard Heathfield wrote:

    Sure. Or some people prefer to single-step with a debugger. Such
    people can make their lives a little easier by surrounding the
    buffer with sentinel soldiers, setting the sentinel soldiers to a
    magic number, and putting a watch on them both - the buffer high
    soldier and the buffer low soldier.

    I think an out-of-bounds write often lands on the memory right at
    one of the two ends of the array... but there are cases where those
    boundaries are untouched and memory outside the array is overwritten
    all the same, in some other place.

    <whoosh>
    --
    Richard Heathfield
    Email: rjh at cpax dot org dot uk
    "Usenet is a strange place" - dmr 29 July 1999
    Sig line 4 vacant - apply within

  • From olcott@polcott333@gmail.com to comp.lang.c on Sat Jun 21 15:49:10 2025
    From Newsgroup: comp.lang.c

    On 6/11/2025 8:32 AM, Mateusz Viste wrote:
    This might not be a strictly C question, but it definitely concerns all
    C programmers.

    Earlier today, I fixed an out-of-bounds write bug. An obvious issue:

    static int *socks[0xffff];

    void update_my_socks(int *sock, int val) {
    socks[val & 0xffff] = sock;
    }

    While the presented issue is common knowledge for anyone familiar with
    C, *locating* the bug was challenging. The program did not crash at the
    moment of the out-of-bounds write but much later - somewhere entirely
    different, in a different object file that maintained a static pointer
    for tracking a position in a linked list. To my surprise, the pointer
    was randomly reset to NULL about once a week, causing a segfault.
    Tracing this back to an unrelated out-of-bounds write elsewhere in the
    code was tedious, to say the least.

    This raises a question: how can such corruptions be detected sooner?
    Protected mode prevents interference between programs but doesn’t
    safeguard a program from corrupting itself. Is there a way to enforce
    memory protection between module files of the same program? After all,
    static objects shouldn't be accessible outside their compilation unit.

    How would you approach this?

    Mateusz


    https://en.cppreference.com/w/c/types/integer.html
    One way to fix the problem in the above specific
    case is to define: void update_my_socks(int *sock, uint16_t val)
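
    Note that narrowing the parameter to uint16_t only helps if the array
    has a slot for every uint16_t value. With the original 0xffff-element
    array, val == 0xffff would still index one past the end. A sketch with
    the array resized accordingly (get_sock is a name invented here):

```c
#include <stddef.h>
#include <stdint.h>

/* One slot per possible uint16_t value: 0 .. UINT16_MAX inclusive. */
static int *socks[(size_t)UINT16_MAX + 1];

void update_my_socks(int *sock, uint16_t val)
{
    socks[val] = sock;   /* every uint16_t value is a valid index */
}

int *get_sock(uint16_t val)
{
    return socks[val];
}
```

    Callers that pass a wider integer are converted to uint16_t implicitly,
    which amounts to the same truncation the mask performed.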
    --
    Copyright 2025 Olcott "Talent hits a target no one else can hit; Genius
    hits a target no one else can see." Arthur Schopenhauer
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue Jul 1 09:54:36 2025
    From Newsgroup: comp.lang.c

    Mateusz Viste <mateusz@not.gonna.tell> writes:

    On 14.06.2025 01:31, Tim Rentsch wrote:

    It isn't wrong to think of bitwise-and as masking-in (or possibly
    masking-out) of certain bits, but it still isn't a modulo. A
    modulo operation is what is desired;

    By "different viewpoints," I meant that while you approach the
    problem by applying a modulo operation to the index so it fits the
    array size, I tend to think in terms of ensuring the index
    correctly maps to a location within an n-bit address space.
    Naturally, the array should accommodate the maximum possible index
    for the given address space, and that's where the original code
    fell short. And you're absolutely right that hardcoded values are
    problematic, the size of the array should have been linked with
    the n-bits address space expectation.

    I understand what you're doing. However one thinks of it, what is
    needed is a way to ensure the produced index value is in the range
    of array index values, and that the mapping covers the full range of
    array index values. Using bitwise-and is a way of solving a less
    general problem. Unfortunately: one, although it is known that
    using bitwise-and works only for certain array sizes, there was no
    check or assertion in the code to verify that requirement; two,
    it's a holdover from earlier times when the performance difference
    might matter, but now it's a premature optimization (and in most
    cases does not result in any improvement); and three, in this case
    using bitwise-and contributed to the bug, which wouldn't have
    happened if modulo had been used instead.
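
    The missing check mentioned in point one can be written as a
    compile-time assertion. This is a sketch under the assumption that the
    array is resized to 0x10000 elements; SOCKS_COUNT and socks_index are
    names invented here. With the original count of 0xffff the assertion
    would reject the build.

```c
#include <stddef.h>

#define SOCKS_COUNT 0x10000u   /* hypothetical corrected element count */

/* Masking with SOCKS_COUNT - 1 equals val % SOCKS_COUNT only when
   the count is a power of two, so enforce that at compile time. */
_Static_assert(SOCKS_COUNT != 0 && (SOCKS_COUNT & (SOCKS_COUNT - 1)) == 0,
               "SOCKS_COUNT must be a power of two to index with a mask");

static int *socks[SOCKS_COUNT];

size_t socks_index(unsigned val)
{
    return val & (SOCKS_COUNT - 1u);
}

void update_my_socks(int *sock, int val)
{
    socks[socks_index((unsigned)val)] = sock;
}
```

    Using val % SOCKS_COUNT instead would need no such assertion, at the
    cost of relying on the compiler to turn the modulo into a mask when the
    count happens to be a power of two.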