[LAPACK] is certainly in use by very many people, if indirectly, for
example by Python or R.
I don't use either of the two for numerics (I use python for other
tasks). But I use Matlab and Octave. I know for sure that Octave
uses relatively new implementations, and pretty sure that the same
goes for Matlab.
MitchAlsup wrote:
On Fri, 17 Oct 2025 22:20:49 -0000 (UTC), Lawrence D’Oliveiro wrote:
Short-vector SIMD was introduced along an entirely separate
evolutionary path, namely that of bringing DSP-style operations
into general-purpose CPUs.
MMX was designed to kill off plug-in modems.
MMX was quite obviously (also) intended for short vectors of
typically 8 and 16-bit elements, it was the enabler for sw DVD
decoding. ZoranDVD was the first to properly handle 30 frames/second
with zero skips, it needed a PentiumMMX-200 to do so.
In many cases one can enlarge data structures to a multiple of the SIMD
vector size (and align them properly). This requires some extra code,
but not too much, and all of it is outside the inner loop. So there is
some waste due to unused elements, but it is rather small.
Of course, there is still trouble due to different SIMD vector size
and/or different SIMD instructions sets.
On 18/10/2025 03:05, Lawrence D’Oliveiro wrote:
On Sat, 18 Oct 2025 00:42:27 GMT, MitchAlsup wrote:
On Fri, 17 Oct 2025 22:20:49 -0000 (UTC), Lawrence D’Oliveiro wrote:
First of all, we have some “HDR” monitors around now that can output a
much greater gradation of brightness levels. These can be used to
produce apparent brightnesses greater than 100%.
It is unlikely that monitors will ever get much beyond 11-bits of pixel
depth per color.
I think bragging rights alone will see it grow beyond that. Look at
tandem OLEDs.
Like many things, human perception of brightness is not linear; it is
somewhat logarithmic. So even though we might not be able to
distinguish anywhere close to 2000 different nuances of one primary
colour, we /can/ perceive a very wide dynamic range. Having a large
number of bits on a linear scale can be more convenient in practice
than trying to get accurate non-linear scaling.
On Sat, 18 Oct 2025 19:24:21 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
Michael S <already5chosen@yahoo.com> schrieb:
It is possible that the LAPACK API was not updated in decades,
The API of existing LAPACK routines was not changed (AFAIK),
but there were certainly additions. It is also possible to choose
64-bit integers at build time.
although I'd
expect that even at the API level there were at least small additions,
if not changes. But if you are right that the LAPACK implementation was
not updated in decades, then you can be sure that it is either not
used by anybody or used by very few people.
It is certainly in use by very many people, if indirectly, for example
by Python or R.
Are Python (numpy and scipy, I suppose) or R linked against an
implementation of LAPACK from 30 or 40 years ago, as suggested by Mitch?
Somehow, I don't believe it.
I don't use either of the two for numerics (I use python for other
tasks). But I use Matlab and Octave. I know for sure that Octave uses relatively new implementations, and pretty sure that the same goes
for Matlab.
Personally, when I need LAPACK-like functionality then I tend to use
BLAS routines either from Intel MKL or from OpenBLAS.
Different level of application. You use LAPACK when you want to do
things like calculating eigenvalues or singular value decomposition,
see https://www.netlib.org/lapack/lug/node19.html . If you use
BLAS directly, you might want to check if there is a routine
in LAPACK which does what you need to do.
Higher-level algos I am interested in are mostly our own inventions.
I can look, of course, but the chances that they are present in LAPACK
are very low.
In fact, even BLAS L3 I don't use all that often (and lower levels
of BLAS never).
Not because the APIs do not match my needs; they typically do. But
because standard implementations are optimized for big or huge matrices.
My needs are medium matrices. A lot of medium matrices.
My own implementations of standard algorithms for medium-sized
matrices, most importantly of Cholesky decomposition, tend to be much
faster than those in OTS BLAS libraries. And preparation of my own
didn't take a lot of time. After all, those are simple algorithms.
Speaking of Cray, the US Mint are issuing some new $1 coins featuring
various famous persons/things, and one of them has a depiction of the
Cray-1 on it.
From the photo I’ve seen, it’s an overhead view, looking like a
stylized letter C. So I wonder, even with the accompanying legend
“CRAY-1 SUPERCOMPUTER”, how many people will realize that’s actually a
picture of the computer?
<https://www.tomshardware.com/tech-industry/new-us-usd1-coins-to-feature-steve-jobs-and-cray-1-supercomputer-us-mints-2026-american-innovation-program-to-memorialize-computing-history>
My guess: Well below 0.1% unless they get told what it is.
It is unlikely that monitors will ever get much beyond 11-bits of pixel
depth per color.
I do not understand why a monitor would go beyond 9 bits. Most people
can't see beyond 7 or 8 bits of color component depth. Keeping the
component depth at 10 bits or less allows colors to fit into 32 bits.
Would bits beyond 8 be for some sea creatures, or viewable with special
glasses?
MitchAlsup <user5857@newsgrouper.org.invalid> schrieb:
LAPACK has not been updated in decades, yet is as relevant today as
the first day it was available.
Lapack's basics have not changed, but it is still actively maintained,
with errors being fixed and new features added.
If you look at the most recent major release, you will see that a lot
is going on: https://www.netlib.org/lapack/lapack-3.12.0.html
One important thing seems to be the changes for 64-bit integer support.
And I love changes like
- B = BB*CS + DD*SN
- C = -AA*SN + CC*CS
+ B = ( BB*CS ) + ( DD*SN )
+ C = -( AA*SN ) + ( CC*CS )
which make sure that compilers don't emit FMA instructions and
change rounding (which, apparently, reduced accuracy enormously
for one routine).
(According to the Fortran standard, the compiler has to honor
parentheses.)
On Fri, 17 Oct 2025 20:54:23 GMT
MitchAlsup <user5857@newsgrouper.org.invalid> wrote:
No, old hammer does not work well. Unless you consider delivering
5-10% of possible performance as "working well".
Michael S <already5chosen@yahoo.com> posted:
On Fri, 17 Oct 2025 20:54:23 GMT
MitchAlsup <user5857@newsgrouper.org.invalid> wrote:
No, old hammer does not work well. Unless you consider delivering
5-10% of possible performance as "working well".
Are you suggesting that a brand new #3 ball peen hammer is usefully
better than a 30 YO #3 ball peen hammer ???
On Sat, 18 Oct 2025 10:21:32 +0200, Terje Mathisen wrote:
MitchAlsup wrote:
On Fri, 17 Oct 2025 22:20:49 -0000 (UTC), Lawrence D’Oliveiro wrote:
Short-vector SIMD was introduced along an entirely separate
evolutionary path, namely that of bringing DSP-style operations
into general-purpose CPUs.
MMX was designed to kill off plug-in modems.
MMX was quite obviously (also) intended for short vectors of
typically 8 and 16-bit elements, it was the enabler for sw DVD
decoding. ZoranDVD was the first to properly handle 30 frames/second
with zero skips, it needed a PentiumMMX-200 to do so.
I think the initial “killer app” for short-vector SIMD was very much video encoding/decoding, not audio encoding/decoding. Audio was
already easy enough to manage with general-purpose CPUs of the 1990s.
On 19/10/2025 03:17, Lawrence D’Oliveiro wrote:
On Sat, 18 Oct 2025 10:21:32 +0200, Terje Mathisen wrote:
MitchAlsup wrote:
On Fri, 17 Oct 2025 22:20:49 -0000 (UTC), Lawrence D’Oliveiro wrote:
Short-vector SIMD was introduced along an entirely separate
evolutionary path, namely that of bringing DSP-style operations
into general-purpose CPUs.
MMX was designed to kill off plug-in modems.
MMX was quite obviously (also) intended for short vectors of
typically 8 and 16-bit elements, it was the enabler for sw DVD
decoding. ZoranDVD was the first to properly handle 30 frames/second
with zero skips, it needed a PentiumMMX-200 to do so.
I think the initial “killer app” for short-vector SIMD was very much
video encoding/decoding, not audio encoding/decoding. Audio was
already easy enough to manage with general-purpose CPUs of the 1990s.
Agreed. But having SIMD made audio processing more efficient, which was
a nice bonus - especially if you wanted more than CD quality audio.
David Brown wrote:
On 19/10/2025 03:17, Lawrence D’Oliveiro wrote:
On Sat, 18 Oct 2025 10:21:32 +0200, Terje Mathisen wrote:
MitchAlsup wrote:
On Fri, 17 Oct 2025 22:20:49 -0000 (UTC), Lawrence D’Oliveiro wrote:
Short-vector SIMD was introduced along an entirely separate
evolutionary path, namely that of bringing DSP-style operations
into general-purpose CPUs.
MMX was designed to kill off plug-in modems.
MMX was quite obviously (also) intended for short vectors of
typically 8 and 16-bit elements, it was the enabler for sw DVD
decoding. ZoranDVD was the first to properly handle 30 frames/second
with zero skips, it needed a PentiumMMX-200 to do so.
I think the initial “killer app” for short-vector SIMD was very much
video encoding/decoding, not audio encoding/decoding. Audio was
already easy enough to manage with general-purpose CPUs of the 1990s.
Agreed. But having SIMD made audio processing more efficient, which
was a nice bonus - especially if you wanted more than CD quality audio.
Having SIMD available was a key part of making the open source Ogg
Vorbis decoder 3x faster.
It worked on MMX/SSE/SSE2/Altivec.
On Fri, 17 Oct 2025 20:54:23 GMT
MitchAlsup <user5857@newsgrouper.org.invalid> wrote:
George Neuner <gneuner2@comcast.net> posted:
Hope the attributions are correct.
On Wed, 15 Oct 2025 22:31:32 GMT, MitchAlsup
<user5857@newsgrouper.org.invalid> wrote:
:
Lawrence D’Oliveiro <ldo@nz.invalid> posted:
On Wed, 15 Oct 2025 05:55:40 GMT, Anton Ertl wrote:
In any case, even with these languages there are still
software projects that fail, miss their deadlines and have
overrun their budget ...
A lot of these projects were unnecessary. Once someone figured out
how to make the (17 kinds of) hammers one needs, there is little
need to make a new hammer architecture.
Windows could have stopped at W7, and many MANY people would have
been happier... The mouse was more precise in W7 than in W8 ...
With a little upgrade for new PCIe architecture along the way
rather than redesigning whole kit and caboodle for tablets and
phones which did not work BTW...
Office application work COULD have STOPPED in 2003, Excel in 1998,
... and few people would have cared. Many SW projects are driven
not by demand for the product, but pushed by companies to make
already satisfied users have to upgrade.
Those programmers could have transitioned to new SW projects
rather than redesigning the same old thing 8 more times. Presto,
there are now enough well-trained SW engineers to tackle the undone
SW backlog.
The problem is that decades of "New & Improved" consumer products
have conditioned the public to expect innovation (at minimum new
packaging and/or advertising) every so often.
Bringing it back to computers: consider that a FOSS library which
hasn't seen an update for 2 years likely would be passed over by
many current developers due to concern that the project has been
abandoned. That perception likely would not change even if the
author(s) responded to inquiries, the library was suitable "as is"
for the intended use, and the lack of recent updates can be
explained entirely by a lack of new bug reports.
LAPACK has not been updated in decades, yet is as relevant today as
the first day it was available.
It is possible that the LAPACK API was not updated in decades, although
I'd expect that even at the API level there were at least small
additions, if not changes. But if you are right that the LAPACK
implementation was not updated in decades, then you can be sure that it
is either not used by anybody or used by very few people.
AFAICS, at the logical level the interface stays the same. There is one
significant change: in old times you were on your own trying to
interface LAPACK from C; now you can get a C interface.