• POINT OF VIEW OF AN ALGORITHM (Re: Algorithm introduced in Hogwild!SGD (Niu et al., 2011)) (Re: parallel random-access machine)

    From Mild Shock@janburse@fastmail.fm to sci.physics.relativity,sci.math,comp.lang.prolog on Mon Dec 1 23:12:14 2025
    From Newsgroup: comp.lang.prolog

    Hi,

    I am not saying anything. That's the definition of PRAM.
    What's wrong with you, are you a 5 year old moron?
    I am only citing a theoretical computer science model:

    - Concurrent read concurrent write (CRCW)—multiple
    processors can read and write. A CRCW PRAM is sometimes
    called a concurrent random-access machine. https://en.wikipedia.org/wiki/Parallel_RAM

    Technically, with multi-channel memory nowadays, it
    doesn't need locks at the hardware level, only a tiny
    bit of serialization, which could even happen outside the CPU.

    So if you drop some barrier requirements, you could
    really have the chaos of a PRAM, for worse or
    for better. I think you need to accept that,

    even if it's too big to fit in your tiny squirrel brain.

    Bye

    P.S.: "effectively CREW, since only one write per address at
    a time", so it will just block the other cores? Short answer:
    yes, if two cores try to write the same address, one of
    them is forced to stall (block) until the other completes.
    In real hardware, the effect can mimic CRCW behavior over
    a short time window, even though it’s not truly simultaneous.

    This blocking usually happens in the cache-coherence
    system, not at DRAM. Modern CPUs use MESI/MOESI, so it
    happens over a small interval [t₁, t₂] dictated by cache
    coherence.

    From the POINT OF VIEW OF AN ALGORITHM, it’s “CRCW enough.”
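The "CRCW enough" point can be sketched in Java (the code base the thread keeps referring to). In this illustrative sketch (the class and method names `CrcwDemo` and `race` are mine, not from the thread), several threads write the same plain int cell with no lock; the cache-coherence protocol serializes the conflicting writes, so the cell always ends up holding exactly one writer's value, never a mixture of bits:

```java
import java.util.concurrent.CountDownLatch;

// Arbitrary-CRCW sketch: several threads write the same plain int cell
// with no lock. The cache-coherence protocol (MESI/MOESI) serializes
// the conflicting writes, so the cell always ends up holding exactly
// one writer's value, never a mixture of bits.
public class CrcwDemo {
    static final int[] cell = new int[1];   // shared memory, no lock

    public static int race(int writers, int rounds) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] ts = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            final int id = i + 1;           // each writer stores its own id
            ts[i] = new Thread(() -> {
                try {
                    start.await();          // fire all writers at once
                } catch (InterruptedException e) {
                    return;
                }
                for (int r = 0; r < rounds; r++) {
                    cell[0] = id;           // unsynchronized concurrent write
                }
            });
            ts[i].start();
        }
        start.countDown();
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return cell[0];                     // one serialized winner
    }

    public static void main(String[] args) {
        System.out.println("last write wins: " + race(4, 100_000));
    }
}
```

This is the "arbitrary" CRCW write rule from the PRAM literature: some one writer wins per address. Read-modify-write is a different story: a plain `cell[0]++` from many threads loses updates, which is exactly the relaxation that Hogwild!-style SGD (from the subject line) deliberately exploits.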


    Bosephis Otlesnov schrieb:
    > Mild Shock wrote:
    >> What are you, a 5 year old moron?
    >>
    >> There are millions of algorithm that use volatile variables. Just look
    >> at the Java code base.
    >>
    >> But I was not refering to multi-threading, I was refering to PRAM for
    >> matrix operations.
    >
    > i thought you said you wanna read and write parallel to RAM, aka PRAM, let
    > me see.. zum zum zum, yeah, you said that. Take a lock at timing
    > requirements for a read/write cycle, deadlines etc, shared memory or not,
    > fucking idiot.


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Mild Shock@janburse@fastmail.fm to sci.physics.relativity,sci.math,comp.lang.prolog on Mon Dec 1 23:37:23 2025
    From Newsgroup: comp.lang.prolog

    Hi,

    Come on, squirrel brain: that we practically have
    PRAM on multi-core CPUs is old hat. ARM kept
    up with MESI/MOESI in 2011:

    https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/CacheCoherencyWhitepaper_6June2011.pdf

    What are you, squirrel brain, some Russian developer
    controlling a drone from within EMACS? Meanwhile
    ARM and Intel and Snapdragon etc. have developed

    many more marvels than only this simple PRAM.
    The excitement on the side of AMD is quite big,
    now that they got into the boat with OpenAI:

    OpenAI co-founder on new deal with AMD https://www.youtube.com/watch?v=WuXCNpbO9hI

    Bye

    P.S.: Because of contention, you should of course
    use volatile variables only with care. They might
    not scale well to 1000 cores.

    There are also algorithms around to relieve the
    pressure when there is a large number of cores.
    Doug Lea has already put a few utilities in

    java.util.concurrent.* for certain problems with a large
    number of cores, kind of easter eggs in java.util.concurrent.*.
    But I am not sure whether Doug Lea is involved in
    additions for AI accelerators. He is, however, on the
    Program Committee of:

    Parallel programming for emerging hardware, including
    AI accelerators, processor-in-memory, programmable logic,
    non-volatile memory technologies, and quantum computers https://ppopp26.sigplan.org/track/PPoPP-2026-papers

    It could be that data flow compilers, the kind of thing
    sketched by OpenXLA, already work well enough.
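One concrete example of those easter eggs is java.util.concurrent.atomic.LongAdder (a Doug Lea utility, in the JDK since Java 8): it stripes a counter across several internal cells, so heavily contended increments do not all fight over one cache line, and only the read folds the cells back together. A minimal sketch (the class and method names `AdderDemo` and `countInParallel` are illustrative):

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

// Contention-relief sketch: LongAdder spreads increments over several
// internal cells (striping) and sums them on read. Under heavy
// multi-core contention this scales better than a single AtomicLong,
// whose CAS loop forces every core through one cache line.
public class AdderDemo {
    public static long countInParallel(int threads, int perThread) {
        LongAdder total = new LongAdder();
        IntStream.range(0, threads).parallel().forEach(t -> {
            for (int i = 0; i < perThread; i++) {
                total.increment();   // cheap, mostly uncontended cell update
            }
        });
        return total.sum();          // folds the striped cells together
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(8, 100_000)); // prints 800000
    }
}
```

Under light contention a plain AtomicLong is fine; LongAdder pays off when many cores increment at once, at the cost of sum() being only weakly consistent while writers are still running.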

    Mild Shock schrieb:
    > [full text quoted above, snipped]



  • From Mild Shock@janburse@fastmail.fm to sci.physics.relativity,sci.math,comp.lang.prolog on Mon Dec 1 23:53:21 2025
    From Newsgroup: comp.lang.prolog

    Hi,

    Looking at how they phrase it:

    "symposium focuses on improving the programming
    productivity and performance engineering of all
    concurrent and parallel systems—multicore, multi-
    threaded, heterogeneous, clustered, and distributed
    systems, grids, accelerators such as ASICs, GPUs,
    FPGAs, data centers, clouds, large scale machines,
    and quantum computers. PPoPP is also interested in
    new and emerging parallel workloads and applications,
    such as artificial intelligence and large-scale
    scientific/enterprise workloads." https://ppopp26.sigplan.org/track/PPoPP-2026-papers

    It could also be that academia was overrun by the AI boom
    and is lost in the middle of nowhere. That the techno lords
    have created realities turning academia into savages.

    No wonder there is a call for automated AI researchers
    and automated AI engineers, by the AI industry itself.
    And that might be the outcome of the current Manhattan

    Project, also known as the Genesis Mission: so that the AI
    can be programmed by AI, AI which is more knowledgeable
    than tiny academics. We are maybe heading towards a

    first Ultraintelligence, which will then shape subsequent
    Ultraintelligences. As described by I. J. Good:

    "Let an ultraintelligent machine be defined as a machine
    that can far surpass all the intellectual activities of
    any man however clever. Since the design of machines is
    one of these intellectual activities, an ultraintelligent
    machine could design even better machines; there would
    then unquestionably be an 'intelligence explosion,' and
    the intelligence of man would be left far behind...
    Thus the first ultraintelligent machine is the last
    invention that man need ever make, provided that the
    machine is docile enough to tell us how to keep it under
    control. It is curious that this point is made so
    seldom outside of science fiction. It is sometimes
    worthwhile to take science fiction seriously." https://exhibits.stanford.edu/feigenbaum/catalog/gz727rg3869

    Bye

    Mild Shock schrieb:
    > [full text quoted above, snipped]



