<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
https://youtu.be/VYhAGnsnO7w
OTP, no SPI, UART or I²C, but still...
Clifford Heath
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
Interesting. They have some very off-brand FPGA type devices as well at very low prices, but they still don't do me any favors with the packages.
OTP, no SPI, UART or I²C, but still...
Clifford Heath
On Tuesday, October 9, 2018 at 9:05:27 PM UTC-4, Clifford Heath wrote:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
Interesting. They have some very off-brand FPGA type devices as well at very low prices, but they still don't do me any favors with the packages.
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath <no.spam@please.net> writes:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
That is impressive! It seems to be an 8-bit RISC with no registers, just
an accumulator; a cute concept. 1K of program OTP and 64 bytes of RAM is
enough for plenty of MCU things. Didn't check whether it has an ADC or PWM.
I like that it's in a 6-pin SOT23 package, since there aren't many other
MCUs that small.
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
On 10/10/2018 02:05, Clifford Heath wrote:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
Has anyone actually used them - or worked out where to get the ICE and
how much it costs ?
MK
There are a lot of operations that update memory locations directly, so
why would you need a lot of CPU registers?
1 KiB = 0.5 KiW is quite a lot; it is about 10-15 pages of commented
assembly program listing.
At least the 8-pin version has both a PWM and a comparator, so
making an ADC wouldn't be too hard.
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
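Paul's point about accumulator traffic can be made concrete with a toy model. This is only an illustrative sketch, not Padauk assembly: every operand (registers included) lives in one dict, and the traces show that `z = x + y` costs three instructions on an accumulator machine but one on a two-operand register machine.

```python
# Hypothetical instruction traces for z = x + y (mnemonics invented here).
acc_machine = [
    ("mov", "a", "x"),   # load x into the accumulator
    ("add", "a", "y"),   # a += y
    ("mov", "z", "a"),   # store the result back to memory
]
reg_machine = [
    ("add", "r1", "r2"), # r1 += r2, operands already in registers
]

def run(trace, mem):
    """Tiny evaluator: registers and memory cells share one namespace."""
    for op, dst, src in trace:
        if op == "mov":
            mem[dst] = mem[src]
        elif op == "add":
            mem[dst] = (mem[dst] + mem[src]) & 0xFF  # 8-bit wraparound
    return mem

m = run(acc_machine, {"x": 5, "y": 7, "a": 0, "z": 0})
print(m["z"], len(acc_machine))   # 12 3
r = run(reg_machine, {"r1": 5, "r2": 7})
print(r["r1"], len(reg_machine))  # 12 1
```

Three fetches versus one is exactly the "traffic through the accumulator" being described; a compiler targeting such a machine pays it on nearly every expression.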
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
Reentrant functions will be inefficient: no registers, and no sp-relative addressing mode. One would want to reserve a few memory locations as pseudo-registers to help with that, but that only goes so far.
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
Reentrant functions will be inefficient: no registers, and no sp-relative addressing mode. One would want to reserve a few memory locations as pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be programmed in Forth. It's a great tool for small MCUs and many times can be hosted on the target, although not likely in this case. Still, you can bring enough functionality onto the MCU to allow direct downloads and many debugging features without an ICE.
On 12/10/18 08:50, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
It looks like the lowest 16 memory addresses could be considered
pseudo-registers - they are the ones that can be used for direct memory
access rather than needing indirect access.
And I don't think inefficient reentrant functions would be much of a
worry on a device with so little code space!
Some of the examples in the datasheet were given in C - that implies
that there already is a C compiler for the device. Has anyone tried the
IDE?
Am 10.10.2018 um 03:05 schrieb Clifford Heath:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
They even make dual-core variants (the part where the first digit in the
part number is '2'). It seems program counter, stack pointer, flag
register and accumulator are per-core, while the rest, including the ALU
is shared. In particular, the I/O registers are also shared, which means
some multiplier registers would also be - but currently all variants
with integrated multiplier are single-core.
Use of the ALU is shared by the two cores, alternating by clock cycle.
Philipp
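Philipp's description of the shared ALU can be sketched as a toy scheduler; everything structural here (program shape, op names, cycle count) is invented purely for illustration of the alternating-cycle idea.

```python
# Two cores with private state; the shared ALU serves them on alternate cycles.
cores = [{"acc": 0}, {"acc": 0}]              # per-core accumulators
programs = [[("add", 1)] * 4, [("add", 10)] * 4]
pcs = [0, 0]                                   # per-core program counters

trace = []
for cycle in range(8):
    c = cycle % 2                              # ALU alternates by clock cycle
    op, operand = programs[c][pcs[c]]
    if op == "add":
        cores[c]["acc"] = (cores[c]["acc"] + operand) & 0xFF
    pcs[c] += 1
    trace.append((cycle, c, cores[c]["acc"]))

print(cores[0]["acc"], cores[1]["acc"])        # 4 40
```

Each core gets half the effective clock, but never stalls waiting for the other, which is presumably the point of the fixed alternation.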
The real issue would be the small RAM size.
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
The real issue would be the small RAM size.
Devices with this architecture go up to 256 B of RAM (but they then cost
a few cents more).
Philipp
On 11/10/18 15:04, Michael Kellett wrote:
On 10/10/2018 02:05, Clifford Heath wrote:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
Has anyone actually used them - or worked out where to get the ICE and
how much it costs ?
MK
The cost of the ICE is not going to be significant for most people - you usually use a chip like this when you want huge quantities (even though
it is available in small numbers).
What turns me off here is the programming procedure for the OTP devices.
There is no information on it - just a simple one-at-a-time programmer device. That is useless for production - you need an automated system,
or support from existing automated programmers, or at the very least the programming information so that you can build your own specialist
programmer. There is no point in buying a microcontroller for $0.03 if
the time taken to manually take a device out of a tube, manually program
it, and manually put it back in another tube for the pick-and-place
costs you $1 of production time.
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many debugging features without an ICE.
Rick C.
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details
that can make a huge difference in how efficient it is. To make Forth
work well on a small chip you need a Forth-specific instruction set to target the stack processing. For example, adding two numbers in this
chip is two instructions - load accumulator from memory X, add
accumulator to memory Y. In a Forth cpu, you'd have a single
instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard
to get the same results from a chip that doesn't have this kind of stack-based instruction set.
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many
debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details
that can make a huge difference in how efficient it is. To make Forth
work well on a small chip you need a Forth-specific instruction set to
target the stack processing. For example, adding two numbers in this
chip is two instructions - load accumulator from memory X, add
accumulator to memory Y. In a Forth cpu, you'd have a single
instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard
to get the same results from a chip that doesn't have this kind of
stack-based instruction set.
Your point is what exactly? You are comparing running Forth on some other chip to running Forth on this chip. How is that useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented with no registers.
I think that means in general the CPU will be slow compared to a register based design.
That actually means it is easier to have a fast Forth implementation compared to other compilers since there won't be a significant penalty for using a stack.
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be programmed
in Forth.
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown
wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus
Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory
locations, so why would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic
through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of
commented assembly program listing.
It would be nice to have a C compiler, and registers help
with that.
Looking at the instruction set, it should be possible to make a
backend for this in SDCC; the architecture looks more
C-friendly than the existing pic14 and pic16 backends. But it
surely isn't as nice as stm8 or z80. reentrant functions will
be inefficent: No registers, and no sp-relative adressing mode.
On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many
times can be hosted on the target although not likely in this
case. Still, you can bring enough functionality onto the MCU to
allow direct downloads and many debugging features without an
ICE.
Rick C.
Forth is a good language for very small devices, but there are
details that can make a huge difference in how efficient it is. To
make Forth work well on a small chip you need a Forth-specific
instruction set to target the stack processing. For example,
adding two numbers in this chip is two instructions - load
accumulator from memory X, add accumulator to memory Y. In a Forth
cpu, you'd have a single instruction that does "pop two numbers,
add them, push the result". That gives a very efficient and compact
instruction set. But it is hard to get the same results from a
chip that doesn't have this kind of stack-based instruction set.
Your point is what exactly? You are comparing running forth on some
other chip to running forth on this chip. How is that useful? There
are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented
with no registers. I think that means in general the CPU will be
slow compared to a register based design. That actually means it is
easier to have a fast Forth implementation compared to other
compilers since there won't be a significant penalty for using a
stack.
On Sat, 13 Oct 2018 05:06:23 -0700 (PDT),
gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many
debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details
that can make a huge difference in how efficient it is. To make Forth
work well on a small chip you need a Forth-specific instruction set to
target the stack processing. For example, adding two numbers in this
chip is two instructions - load accumulator from memory X, add
accumulator to memory Y. In a Forth cpu, you'd have a single
instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard
to get the same results from a chip that doesn't have this kind of
stack-based instruction set.
Your point is what exactly? You are comparing running forth on some other chip to running forth on this chip. How is that useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented with no registers.
Depending how you look at it, you could claim that it has 64 registers
and no RAM. It is quite an orthogonal single-address architecture. You
can do practically all single-operand instructions (like inc/dec,
shift/rotate etc.) either in the accumulator or equally well in any
of the 64 "registers". For two-operand instructions (such as add/sub,
and/or etc.), either the source or destination can be in the memory
"register".
Both Acc = Acc Op Memory or alternatively Memory = Acc Op Memory are
valid.
Thus the accumulator is needed only for two-operand instructions, but
not for single-operand instructions.
I think that means in general the CPU will be slow compared to a register based design.
What is the difference, you have 64 on chip RAM bytes or 64 single
byte on chip registers. The situation would have been different with
on-chip registers and off chip RAM, with the memory bottleneck.
Of course, there were odd architectures like the TI 9900 with a set of
sixteen 16-bit general-purpose registers in RAM. The set could be
switched fast in interrupts, but slowed down any general-purpose
register access.
That actually means it is easier to have a fast Forth implementation compared to other compilers since there won't be a significant penalty for using a stack.
For a stack computer you need a pointer register, preferably with
autoincrement/decrement support. This processor has indirect access
and single-instruction increment or decrement support without
disturbing the accumulator. Thus not so bad after all for stack
computing.
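The stack-computing scheme just described can be sketched in miniature: one pointer cell in RAM, indirect access through it, and inc/dec of that cell without touching the accumulator. This is a toy model, not Padauk code; the cell layout and names are invented.

```python
# A minimal Forth-style data stack built from the described primitives.
ram = [0] * 64
DSP = 0          # RAM cell 0 holds the data-stack pointer
ram[DSP] = 1     # stack area grows upward from cell 1

def push(acc):
    ram[ram[DSP]] = acc      # store accumulator indirectly via the pointer
    ram[DSP] += 1            # single-instruction increment of the pointer cell

def pop():
    ram[DSP] -= 1            # decrement the pointer cell first
    return ram[ram[DSP]]     # load accumulator indirectly via the pointer

def forth_add():
    # "+" : pop two cells, push their 8-bit sum
    a = pop()
    push((a + pop()) & 0xFF)

push(5); push(7); forth_add()
print(pop())  # 12
```

Every primitive here maps to the operations the posters say exist (indirect load/store, inc/dec of a memory cell), which is why the architecture is "not so bad after all" for a stack model, even without a hardware stack for data.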
I don't think that an interpreted Forth is feasible for this particular
MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed address".
On 18-10-13 18:31 , Niklas Holsti wrote:
I don't think that an interpreted Forth is feasible for this particular
MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed
address".
Ok, before anyone else notices, I admit I forgot about implementing an indirect jump by pushing the target address on the stack and executing a return instruction. That would work for this machine.
On 13/10/18 17:00, upsidedown@downunder.com wrote:
On Sat, 13 Oct 2018 05:06:23 -0700 (PDT),
gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many
debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details
that can make a huge difference in how efficient it is. To make Forth
work well on a small chip you need a Forth-specific instruction set to
target the stack processing. For example, adding two numbers in this
chip is two instructions - load accumulator from memory X, add
accumulator to memory Y. In a Forth cpu, you'd have a single
instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard
to get the same results from a chip that doesn't have this kind of
stack-based instruction set.
Your point is what exactly? You are comparing running forth on some other chip to running forth on this chip. How is that useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented with no registers.
Depending how you look at it, you could claim that it has 64 registers
and no RAM. It is a quite orthogonal single address architecture. You
can do practically all single operand instructions (like inc/dec,
shift/rotate etc.) either in the accumulator but equally well in any
of the 64 "registers". For two operand instructions (such as add/sub,
and/or etc,), either the source or destination can be in the memory
"register".
Not quite, no. Only the first 16 memory addresses are directly
accessible for most instructions, with the first 32 addresses being
available for word-based instructions. So you could liken it to a
device with 16 registers and indirect memory access to the rest of RAM.
M.n Only addressed in 0~0xF (0~15) is allowed
On 18-10-12 19:11 , gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote: >>> Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
The datasheet describes the OTP program memory as "1KW", probably
meaning 1024 instructions. The length of an instruction is not defined,
as far as I could see.
That actually means it is easier to have a fast Forth implementation
compared to other compilers since there won't be a significant penalty
for using a stack.
And one more iteration (sorry...)
On 18-10-13 19:46 , Niklas Holsti wrote:
On 18-10-13 18:31 , Niklas Holsti wrote:
I don't think that an interpreted Forth is feasible for this particular
MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed
address".
Ok, before anyone else notices, I admit I forgot about implementing an
indirect jump by pushing the target address on the stack and executing a
return instruction. That would work for this machine.
Except that one can only "push" the accumulator and flag registers,
combined, and the flag register cannot be set directly, and has only 4
working bits.
What would work, as an indirect jump, is to set the Stack Pointer (sp)
to point at a RAM word that contains the target address, and then
execute a return. But then one has lost the actual Stack Pointer value.
On Sat, 13 Oct 2018 19:59:06 +0300, Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:
And one more iteration (sorry...)
On 18-10-13 19:46 , Niklas Holsti wrote:
On 18-10-13 18:31 , Niklas Holsti wrote:
I don't think that an interpreted Forth is feasible for this particular
MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed
address".
Ok, before anyone else notices, I admit I forgot about implementing an
indirect jump by pushing the target address on the stack and executing a
return instruction. That would work for this machine.
Except that one can only "push" the accumulator and flag registers,
combined, and the flag register cannot be set directly, and has only 4
working bits.
What would work, as an indirect jump, is to set the Stack Pointer (sp)
to point at a RAM word that contains the target address, and then
execute a return. But then one has lost the actual Stack Pointer value.
Just call a "Jumper" routine, the call pushes the return address on
stack. In "Jumper" read SP from IO address space, indirectly modify
the return address on stack as needed and perform a ret instruction,
causing a jump to the modified return address and it also restores the
SP to the value before the call.
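The "Jumper" trick can be sketched as a toy VM. The call/ret mechanics are reduced to list operations and the "addresses" are just strings; this only illustrates the control flow of the trick, not any real instruction semantics.

```python
# Toy model: call pushes a return address; the jumper routine overwrites
# that stack slot with the real target and "returns" to it, which also
# leaves the stack pointer exactly where it was before the call.
stack = []

def call(routine, *args):
    stack.append("return-address")   # what a real call instruction pushes
    return routine(*args)

def jumper(target):
    stack[-1] = target               # rewrite the pushed return address
    return stack.pop()               # "ret": pop it and jump there

where = call(jumper, "label_42")
print(where, stack)  # label_42 []
```

The nice property, as noted above, is that the ret both performs the computed jump and restores SP, so no stack-pointer bookkeeping is lost.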
On Fri, 12 Oct 2018 22:06:02 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
The real issue would be the small RAM size.
Devices with this architecture go up to 256 B of RAM (but they then cost
a few cent more).
Philipp
Did you find the binary encoding of the various instruction formats, i.e.
how many bits are allocated to the operation code and how many to the
address field?
My initial guess was that the instruction word is simply an 8-bit opcode
+ 8-bit address, but the bit and word address limits for the smaller
models would suggest that for some op-codes, the op-code field might
be wider than 8 bits and the address fields narrower than 8 bits (e.g. bit
and word addressing).
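That guess about variable field widths can be made concrete with a toy decoder. The 16-bit word and both field widths here are invented for illustration; the real encoding is not given in the datasheet.

```python
# Split an instruction word into opcode and address fields: a narrower
# address field automatically leaves more bits for the opcode, which is
# how bit/word addressing could coexist with wider opcodes.
def decode(word, addr_bits):
    opcode = word >> addr_bits
    address = word & ((1 << addr_bits) - 1)
    return opcode, address

# The same 16-bit word with an 8-bit address field vs a 4-bit one:
print(decode(0x1A2B, 8))  # (26, 43)
print(decode(0x1A2B, 4))  # (418, 11)
```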
Am 13.10.2018 um 18:59 schrieb Niklas Holsti:
Except that one can only "push" the accumulator and flag registers,
combined, and the flag register cannot be set directly, and has only 4
working bits.
It seems unclear to me which of acc and sp is pushed first.
But if acc is pushed first, one could do
pushaf;
mov a, sp;
inc a;
mov sp, a;
to push any desired byte onto the stack.
If you want a hardware minimal processor the Maxim 32660 looks like fun
3mm square, 24 pin Cortex M4, 96MHz, 256k flash, 96k RAM, £1.16 (10 off).
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
On Sat, 13 Oct 2018 05:06:23 -0700 (PDT),
How fast are instructions that access memory? Most MCUs will perform register operations in a single cycle. Even though RAM may be on chip, it typically is not as fast as registers because it is usually not multiported. DSP chips are an exception with dual and even triple ported on chip RAM.
gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
reentrant functions will be inefficent: No registers, and no sp-relative
adressing mode. On would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted
on the target although not likely in this case. Still, you can bring
enough functionality onto the MCU to allow direct downloads and many
debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details
that can make a huge difference in how efficient it is. To make Forth
work well on a small chip you need a Forth-specific instruction set to
target the stack processing. For example, adding two numbers in this
chip is two instructions - load accumulator from memory X, add
accumulator to memory Y. In a Forth cpu, you'd have a single
instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard to get the same results from a chip that doesn't have this kind of stack-based instruction set.
Your point is what exactly? You are comparing running forth on some other chip to running forth on this chip. How is that useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented with no registers.
Depending how you look at it, you could claim that it has 64 registers
and no RAM. It is a quite orthogonal single-address architecture. You
can do practically all single-operand instructions (like inc/dec,
shift/rotate etc.) either in the accumulator or equally well in any
of the 64 "registers". For two-operand instructions (such as add/sub,
and/or etc.), either the source or destination can be in the memory "register".
Both Acc = Acc Op Memory and alternatively Memory = Acc Op Memory are
valid.
Thus the accumulator is needed only for two operand instructions, but
not for single operand instructions.
Yeah, I'm familiar with the 9900. In the 990 it worked well because the CPU was TTL and not so fast. Once the CPU was on a single chip, the external RAM was not fast enough to keep up, and instruction timings were dominated by the memory. I think that means in general the CPU will be slow compared to a register-based design.
What is the difference, you have 64 on chip RAM bytes or 64 single
byte on chip registers. The situation would have been different with
on-chip registers and off chip RAM, with the memory bottleneck.
Of course, there were odd architectures like the TI 9900, with a set of
sixteen 16-bit general purpose registers in RAM. The set could be
switched fast in interrupts, but that slowed down any general purpose
register access.
The stack in memory is usually a bottleneck because memory is typically slow, so optimizations would be done to keep operands in registers. In this chip no such optimizations are possible, but likely it wouldn't be too bad as long as the stack operations are flexible enough. But then I don't think you said this CPU has the sort of addressing that allows an operand in memory to be used and popped off the stack in one opcode, as many higher-level CPUs do. So adding the two numbers on the stack would involve keeping the top of stack in the accumulator, adding the next item on the stack from memory to the accumulator, then another instruction to adjust the stack pointer, which is also in memory. So two instructions? How many clock cycles?
That actually means it is easier to have a fast Forth implementation compared to other compilers, since there won't be a significant penalty for using a stack.
For a stack computer you need a pointer register, preferably with
autoincrement/decrement support. This processor has indirect access
and single-instruction increment or decrement support without
disturbing the accumulator. Thus not so bad after all for stack
computing.
On 18-10-12 19:11 , gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented assembly program listing.
The data-sheet describes the OTP program memory as "1KW", probably
meaning 1024 instructions. The length of an instruction is not defined,
as far as I could see.
It would be nice to have a C compiler, and registers help with that.
The data-sheet mentions something they call "Mini-C".
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
Reentrant functions will be inefficient: no registers, and no sp-relative addressing mode. One would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be programmed
in Forth.
I don't think that an interpreted Forth is feasible for this particular
MCU. Where would the Forth program (= list of pointers to "words") be stored? I found no instructions for reading data from the OTP program memory, and the 64-byte RAM will not hold a non-trivial program together with the data for that program.
Moreover, there is no indirect jump instruction -- "jump to a computed address". The closest is "pcadd a", which can be used to implement a 256-entry case statement. You would be limited to a total of 256 words.
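The "pcadd a" dispatch mentioned above can be sketched as a toy interpreter: the accumulator is added to the program counter, so a table of one-word "goto" entries placed right after the pcadd acts as a case statement (at most 256 entries, since the accumulator is 8 bits). The exact point at which the add is applied relative to the normal pc increment, and the handler names, are assumptions for illustration, not taken from the datasheet.

```python
# Minimal sketch of a "pcadd a" jump table, assuming pcadd adds the
# accumulator to the already-incremented pc, landing on entry `acc`
# of the goto table that follows it.
def run(program, acc):
    pc = 0
    while True:
        op = program[pc]
        pc += 1
        if op == "pcadd":
            pc += acc             # land inside the goto table
        elif op.startswith("goto "):
            return op.split()[1]  # report where the indirect jump went

prog = ["pcadd", "goto case0", "goto case1", "goto case2"]
assert run(prog, 0) == "case0"
assert run(prog, 2) == "case2"
```

With 1024 words of program memory total, such a table is expensive, which matches the point that a threaded Forth inner interpreter has nowhere comfortable to live on this chip.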
Moreover, each RAM-resident pointer to RAM uses 2 octets of RAM, giving
a 16-bit RAM address, although for this MCU a 6-bit address would be
enough. Apparently the same architecture has implementations with more
RAM and 16-bit RAM addresses.
That said, one could perhaps implement a compiled Forth for this machine.
On 13/10/18 14:06, gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown
wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus
Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory
locations, so why would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic
through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of
commented assembly program listing.
It would be nice to have a C compiler, and registers help
with that.
Looking at the instruction set, it should be possible to make a
backend for this in SDCC; the architecture looks more
C-friendly than the existing pic14 and pic16 backends. But it
surely isn't as nice as stm8 or z80. Reentrant functions will
be inefficient: no registers, and no sp-relative addressing mode.
One would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many
times can be hosted on the target although not likely in this
case. Still, you can bring enough functionality onto the MCU to
allow direct downloads and many debugging features without an
ICE.
Rick C.
Forth is a good language for very small devices, but there are
details that can make a huge difference in how efficient it is. To
make Forth work well on a small chip you need a Forth-specific
instruction set to target the stack processing. For example,
adding two numbers in this chip is two instructions - load
accumulator from memory X, add accumulator to memory Y. In a Forth
cpu, you'd have a single instruction that does "pop two numbers,
add them, push the result". That gives a very efficient and compact
instruction set. But it is hard to get the same results from a
chip that doesn't have this kind of stack-based instruction set.
Your point is what exactly? You are comparing running forth on some
other chip to running forth on this chip. How is that useful? There
are many other chips that run very fast. So?
My point is that /this/ CPU is not a good match for Forth, though many
other very cheap CPUs are. Whether or not you think that matches "CPUs
like this should be programmed in Forth" depends on what you mean by
"CPUs like this", and what you think the benefits of Forth are.
I believe others have said the instruction set is memory oriented
with no registers. I think that means in general the CPU will be
slow compared to a register based design. That actually means it is
easier to have a fast Forth implementation compared to other
compilers since there won't be a significant penalty for using a
stack.
It has a single register, not unlike the "W" register in small PIC
devices. Yes, I expect it is going to be slower than you would get from having a few more registers. But it is missing (AFAICS) auto-increment
and decrement modes, and has only load/store operations with indirect access.
So if you have two 8-bit bytes x and y, then adding them as "x += y;" is:
mov a, y; // 1 clock
add x, a; // 1 clock
If you have a data stack pointer "dsp", and want a standard Forth "+" operation, you have:
idxm a, dsp; // 2 clock
mov temp, a; // 1 clock
dec dsp; // 1 clock
idxm a, dsp; // 2 clock
add a, temp; // 1 clock
idxm dsp, a; // 2 clock
That is 9 clocks instead of 2, and 6 instructions instead of 2.
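As a sanity check, the six-instruction sequence above can be modeled in a few lines of Python, assuming "idxm a, dsp" transfers a byte between the accumulator and the RAM cell whose address is held at dsp, and that the data stack grows upward. The RAM addresses (dsp at 0, temp at 2, stack near 10) are invented for the sketch.

```python
# Model of the Forth "+" sequence on a 64-byte RAM, one Python line
# per instruction in the sequence above.
ram = [0] * 64
DSP, TEMP = 0, 2            # RAM cells used as stack pointer and scratch
ram[DSP] = 11               # dsp points at the top-of-stack cell
ram[10], ram[11] = 30, 12   # two operands on the data stack

a = ram[ram[DSP]]           # idxm a, dsp   (load TOS indirectly)
ram[TEMP] = a               # mov temp, a
ram[DSP] -= 1               # dec dsp
a = ram[ram[DSP]]           # idxm a, dsp   (load next-on-stack)
a = (a + ram[TEMP]) & 0xFF  # add a, temp   (8-bit wrap)
ram[ram[DSP]] = a           # idxm dsp, a   (store result)

assert ram[10] == 42 and ram[DSP] == 10   # stack popped by one, sum on top
```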
Of course you could make a Forth compiler for the device - but you would have to make an optimising Forth compiler that avoids needing a data
stack, just as you do on many other small microcontrollers (and just as a
C compiler would do). This is /not/ a processor that fits well with
Forth or that would give a clear translation from Forth to assembly, as
is the case on some very small microcontrollers.
Keep the TOS in the accumulator
What does idxm do? Looks like an indirect load?
Can this address mode be combined with any operations?
How fast are instructions that access memory? Most MCUs will perform register operations in a single cycle. Even though RAM may be on
chip, it typically is not as fast as registers because it is usually
not multiported. DSP chips are an exception with dual and even
triple ported on chip RAM.
Am 14.10.2018 um 03:20 schrieb gnuarm.deletethisbit@gmail.com:
How fast are instructions that access memory? Most MCUs will perform
register operations in a single cycle. Even though RAM may be on
chip, it typically is not as fast as registers because it is usually
not multiported. DSP chips are an exception with dual and even
triple ported on chip RAM.
All instructions except for jumps are 1 cycle. Jumps if taken are 2
cycles, 1 otherwise.
Philipp
Am 12.10.2018 um 22:45 schrieb upsidedown@downunder.com:
On Fri, 12 Oct 2018 22:06:02 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
The real issue would be the small RAM size.
Devices with this architecture go up to 256 B of RAM (but they then cost >>> a few cent more).
Philipp
Did you find the binary encoding of various instruction formats, i.e
how many bits allocated to the operation code and how many for the
address field ?
My initial guess was that the instruction word is simple 8 bit opcode
+ 8 bit address, but the bit and word address limits for the smaller
models would suggest that for some op-codes, the op-code field might
be wider than 8 bits and address fields narrower than 8 bits (e.g. bit
and word addressing).
People have tried before (https://www.mikrocontroller.net/topic/449689, https://stackoverflow.com/questions/49842256/reverse-engineer-assembler-which-probably-encrypts-code).
Apparently, even with access to the tools it is not obvious.
However, a Chinese manual contains these examples:
5E0A MOV A BB1
1B21 COMP A #0x21
2040 T0SN CF
5C0B MOV BB2 A
C028 GOTO 0x28
0030 WDRESET
1F00 MOV A #0x0
0082 MOV SP A
Philipp
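The example words above are four hex digits each, i.e. 16-bit instructions. As a speculative cross-check (an inference from these five examples only, not a documented encoding), the operand appears verbatim in the low bits of each word:

```python
# GOTO target in the low 10 bits, immediates in the low 8 bits --
# consistent with the "2 x 1024 call/goto" and "7 x 256 immediate"
# code-point groups discussed elsewhere in the thread.
examples = {
    0xC028: 0x28,  # GOTO 0x28
    0x1B21: 0x21,  # COMP A #0x21
    0x1F00: 0x00,  # MOV A #0x0
}
for word, operand in examples.items():
    assert word < 2 ** 16                  # fits a 16-bit instruction word
    assert (word & 0x3FF == operand) or (word & 0xFF == operand)
```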
This is quite curious. I wonder
- Has anyone actually received the devices they ordered? The cheaper variants seem to be sold out.
Michael Kellett <mk@mkesc.co.uk> writes:
If you want a hardware minimal processor the Maxim 32660 looks like fun
3mm square, 24 pin Cortex M4, 96MHz, 256k flash, 96k RAM, £1.16 (10 off).
That's not minimal ;). More practically, the 3mm square package sounds
like a WLCSP which I think requires specialized ($$$) board fab
facilities (it can't be hand soldered or done with normal reflow
processes). Part of the Padauk part's attraction is the 6-pin SOT23
package.
Here's a complete STM8 board for 0.77 USD shipped:
https://www.aliexpress.com/item//32527571163.html
It has 8k of program flash and 1k of ram and can run a resident Forth interpreter. I think they also make a SOIC-8 version of the cpu. I
bought a few of those boards for around 0.50 each last year so I guess
they have gotten a bit more expensive since then.
On Sat, 13 Oct 2018 18:27:13 +0200, David Brown
<david.brown@hesbynett.no> wrote:
On 13/10/18 17:00, upsidedown@downunder.com wrote:
On Sat, 13 Oct 2018 05:06:23 -0700 (PDT),
gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory locations, so why
would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of commented assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend for this in SDCC; the architecture looks more C-friendly than the existing pic14 and pic16 backends. But it surely isn't as nice as stm8 or z80.
Reentrant functions will be inefficient: no registers, and no sp-relative addressing mode. One would want to reserve a few memory locations as pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many times can be hosted on the target, although not likely in this case. Still, you can bring enough functionality onto the MCU to allow direct downloads and many debugging features without an ICE.
Rick C.
Forth is a good language for very small devices, but there are details that can make a huge difference in how efficient it is. To make Forth work well on a small chip you need a Forth-specific instruction set to target the stack processing. For example, adding two numbers in this chip is two instructions - load accumulator from memory X, add accumulator to memory Y. In a Forth cpu, you'd have a single instruction that does "pop two numbers, add them, push the result".
That gives a very efficient and compact instruction set. But it is hard to get the same results from a chip that doesn't have this kind of
stack-based instruction set.
Your point is what exactly? You are comparing running forth on some other chip to running forth on this chip. How is that useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented with no registers.
Depending how you look at it, you could claim that it has 64 registers
and no RAM. It is a quite orthogonal single address architecture. You
can do practically all single operand instructions (like inc/dec,
shift/rotate etc.) either in the accumulator but equally well in any
of the 64 "registers". For two operand instructions (such as add/sub,
and/or etc,), either the source or destination can be in the memory
"register".
Not quite, no. Only the first 16 memory addresses are directly
accessible for most instructions, with the first 32 addresses being
available for word-based instructions. So you could liken it to a
device with 16 registers and indirect memory access to the rest of ram.
Really?
In the manual
M.n Only addressed in 0~0xF (0~15) is allowed
The M.n notation is for bit operations, in which M is the byte address
and n is the bit number in byte. Restricting M to 4 bits makes sense,
since n requires 3 bits, thus the total address size for bit
operations would be 7 bits.
I couldn't find a reference that the restriction on M also applies to
byte access. Where is it?
On Saturday, October 13, 2018 at 11:00:30 AM UTC-4,
upsid...@downunder.com wrote:
On Sat, 13 Oct 2018 05:06:23 -0700 (PDT),
gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown
wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp
Klaus Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory
locations, so why would you need a lot of CPU
registers.
Being able to (say) add register to register saves
traffic through the accumulator and therefore
instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages
of commented assembly program listing.
It would be nice to have a C compiler, and registers help
with that.
Looking at the instruction set, it should be possible to
make a backend for this in SDCC; the architecture looks
more C-friendly than the existing pic14 and pic16 backends.
But it surely isn't as nice as stm8 or z80. Reentrant
functions will be inefficient: no registers, and no
sp-relative addressing mode. One would want to reserve a few
memory locations as pseudo-registers to help with that, but
that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and
many times can be hosted on the target although not likely in
this case. Still, you can bring enough functionality onto the
MCU to allow direct downloads and many debugging features
without an ICE.
Rick C.
Forth is a good language for very small devices, but there are
details that can make a huge difference in how efficient it is.
To make Forth work well on a small chip you need a
Forth-specific instruction set to target the stack processing.
For example, adding two numbers in this chip is two
instructions - load accumulator from memory X, add accumulator
to memory Y. In a Forth cpu, you'd have a single instruction
that does "pop two numbers, add them, push the result". That
gives a very efficient and compact instruction set. But it is
hard to get the same results from a chip that doesn't have this
kind of stack-based instruction set.
Your point is what exactly? You are comparing running forth on
some other chip to running forth on this chip. How is that
useful? There are many other chips that run very fast. So?
I believe others have said the instruction set is memory oriented
with no registers.
Depending how you look at it, you could claim that it has 64
registers and no RAM. It is a quite orthogonal single address
architecture. You can do practically all single operand
instructions (like inc/dec, shift/rotate etc.) either in the
accumulator but equally well in any of the 64 "registers". For two
operand instructions (such as add/sub, and/or etc,), either the
source or destination can be in the memory "register".
Both Acc = Acc Op Memory or alternatively Memory = Acc Op Memory
are valid.
Thus the accumulator is needed only for two operand instructions,
but not for single operand instructions.
How fast are instructions that access memory? Most MCUs will perform register operations in a single cycle. Even though RAM may be on
chip, it typically is not as fast as registers because it is usually
not multiported. DSP chips are an exception with dual and even
triple ported on chip RAM.
On Saturday, October 13, 2018 at 12:21:51 PM UTC-4, David Brown wrote:
On 13/10/18 14:06, gnuarm.deletethisbit@gmail.com wrote:
On Saturday, October 13, 2018 at 6:46:20 AM UTC-4, David Brown
wrote:
On 12/10/18 18:11, gnuarm.deletethisbit@gmail.com wrote:
On Friday, October 12, 2018 at 2:50:53 AM UTC-4, Philipp Klaus
Krause wrote:
Am 12.10.2018 um 01:08 schrieb Paul Rubin:
upsidedown@downunder.com writes:
There is a lot of operations that will update memory
locations, so why would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic
through the accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot, it is about 10-15 pages of
commented assembly program listing.
It would be nice to have a C compiler, and registers help
with that.
Looking at the instruction set, it should be possible to make a
backend for this in SDCC; the architecture looks more
C-friendly than the existing pic14 and pic16 backends. But it
surely isn't as nice as stm8 or z80. Reentrant functions will
be inefficient: no registers, and no sp-relative addressing mode.
One would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
CPUs like this (and others that aren't like this) should be
programmed in Forth. It's a great tool for small MCUs and many
times can be hosted on the target although not likely in this
case. Still, you can bring enough functionality onto the MCU to
allow direct downloads and many debugging features without an
ICE.
Rick C.
Forth is a good language for very small devices, but there are
details that can make a huge difference in how efficient it is. To
make Forth work well on a small chip you need a Forth-specific
instruction set to target the stack processing. For example,
adding two numbers in this chip is two instructions - load
accumulator from memory X, add accumulator to memory Y. In a Forth
cpu, you'd have a single instruction that does "pop two numbers,
add them, push the result". That gives a very efficient and compact
instruction set. But it is hard to get the same results from a
chip that doesn't have this kind of stack-based instruction set.
Your point is what exactly? You are comparing running forth on some
other chip to running forth on this chip. How is that useful? There
are many other chips that run very fast. So?
My point is that /this/ CPU is not a good match for Forth, though many
other very cheap CPUs are. Whether or not you think that matches "CPUs
like this should be programmed in Forth" depends on what you mean by
"CPUs like this", and what you think the benefits of Forth are.
I believe others have said the instruction set is memory oriented
with no registers. I think that means in general the CPU will be
slow compared to a register based design. That actually means it is
easier to have a fast Forth implementation compared to other
compilers since there won't be a significant penalty for using a
stack.
It has a single register, not unlike the "W" register in small PIC
devices. Yes, I expect it is going to be slower than you would get from
having a few more registers. But it is missing (AFAICS) auto-increment
and decrement modes, and has only load/store operations with indirect
access.
So if you have two 8-bit bytes x and y, then adding them as "x += y;" is:
mov a, y; // 1 clock
add x, a; // 1 clock
Keep the TOS in the accumulator and I think you end up with
add a, x; // 1 clock
inc DSTKPTR; // adjust stack pointer - 1 clock?
Does that work? Reading below, I guess not.
If you have a data stack pointer "dsp", and want a standard Forth "+"
operation, you have:
idxm a, dsp; // 2 clock
mov temp, a; // 1 clock
dec dsp; // 1 clock
idxm a, dsp; // 2 clock
add a, temp; // 1 clock
idxm dsp, a; // 2 clock
That is 9 clocks instead of 2, and 6 instructions instead of 2.
What does idxm do? Looks like an indirect load? Can this address
mode be combined with any operations? Are operations limited in the addressing modes? This seems like a very, very simple CPU, but for the
money, I guess I get it.
Of course you could make a Forth compiler for the device - but you would
have to make an optimising Forth compiler that avoids needing a data
stack, just as you do on many other small microcontrollers (and just as a
C compiler would do). This is /not/ a processor that fits well with
Forth or that would give a clear translation from Forth to assembly, as
is the case on some very small microcontrollers.
OK
And one more iteration (sorry...)
On 18-10-13 19:46 , Niklas Holsti wrote:
On 18-10-13 18:31 , Niklas Holsti wrote:
I don't think that an interpreted Forth is feasible for this particular
MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed
address".
Ok, before anyone else notices, I admit I forgot about implementing an
indirect jump by pushing the target address on the stack and executing a
return instruction. That would work for this machine.
Except that one can only "push" the accumulator and flag registers, combined, and the flag register cannot be set directly, and has only 4 working bits.
What would work, as an indirect jump, is to set the Stack Pointer (sp)
to point at a RAM word that contains the target address, and then
execute a return. But then one has lost the actual Stack Pointer value.
A stack-based system is often a good choice for very small cpus - it is certainly popular for 4-bit microcontrollers. But it seems that the designers of this device simply haven't considered support for
Forth-style coding to be important.
Interesting, this at least confirms that the instruction word is 16
bits. In a Harvard architecture, the word length could have been
13-17 bits, with some dirty encodings in the 13-bit case, but a cleaner
encoding with 14-17 bit instruction words.
Assuming one would like to make an encoding for exactly 1024 code
words and 64 bytes of data memory, a tighter encoding would be possible.
Of course, for a manufacturer with both small and larger processors, it
would make sense to use the same encoding for all of them, which is
slightly inefficient for the smaller models.
Anyway, in the 1 kW / 64 byte case, the following code points would be
required:
2048 = 2 x 1024 call, goto
1792 = 7 x 256 immediate data (8 bit)
2304 = 36 x 64 M-reference (6 bit)
1024 = 8 x 128 bit ref (M and IO, 3+4 bits)
plus others.
This might barely fit into 13 bits, with some nasty encoding. Even
limiting the M-reference to 4 bits (0-15), you still can't fit into a
12-bit instruction length.
So with a 16-bit word length, I do not understand why the word reference
is limited to 4-5 bits. The bit address limit makes more sense, so that
it would not consume 4096 code points.
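The code-point tally above checks out arithmetically: the listed groups total 7168 points, which squeezes under 2^13 = 8192 but is already over 2^12 = 4096 even before the unlisted "others".

```python
# Tally of the instruction-encoding code points listed in the post.
code_points = {
    "call/goto (2 x 1024)":       2 * 1024,
    "8-bit immediates (7 x 256)":  7 * 256,
    "M-reference (36 x 64)":      36 * 64,
    "bit refs (8 x 128)":          8 * 128,
}
total = sum(code_points.values())
assert total == 7168
assert total <= 2 ** 13   # fits 13 bits, barely
assert total > 2 ** 12    # cannot fit 12 bits
# Even shrinking the M-reference to 4 bits (36 x 16) stays over 12 bits:
assert 2048 + 1792 + 36 * 16 + 1024 > 2 ** 12
```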
With such small ROM/RAM sizes, who needs reentrant functions ?
On 13/10/18 18:59, Niklas Holsti wrote:
And one more iteration (sorry...)
On 18-10-13 19:46 , Niklas Holsti wrote:
On 18-10-13 18:31 , Niklas Holsti wrote:
I don't think that an interpreted Forth is feasible for this particular MCU. ...
Moreover, there is no indirect jump instruction -- "jump to a computed address".
Ok, before anyone else notices, I admit I forgot about implementing an indirect jump by pushing the target address on the stack and executing a return instruction. That would work for this machine.
Except that one can only "push" the accumulator and flag registers, combined, and the flag register cannot be set directly, and has only 4 working bits.
What would work, as an indirect jump, is to set the Stack Pointer (sp)
to point at a RAM word that contains the target address, and then
execute a return. But then one has lost the actual Stack Pointer value.
Or you could read the SP, put that address into a different word memory location, and use that for indirect access to write to the stack.
Efficiency has to be relative on such a limited machine. If there are no registers, nearly everything is going to be clumsy and slow. I'm not sure using this CPU with Forth would be at all bad, even if the CPU is not intended for Forth.
It is all possible, but not particularly efficient.
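The "indirect jump via ret" trick described above can be sketched in a few lines: point sp at a RAM word holding the computed target, then execute ret, which loads the program counter from that word. The stack layout (little-endian word, sp popped upward by two) and the RAM addresses are assumptions for the sketch, not datasheet facts.

```python
# Model of an indirect jump built from sp manipulation plus ret.
ram = [0] * 64

def ret(sp):
    """Model of ret: load pc from the word at sp, then pop it."""
    pc = ram[sp] | (ram[sp + 1] << 8)
    return pc, sp + 2

target = 0x0123
ram[40] = target & 0xFF   # store the computed target in a RAM word
ram[41] = target >> 8
sp = 40                   # clobber sp to point at it (the real sp is lost)
pc, sp = ret(sp)
assert pc == 0x0123       # "jumped" to the computed address
```

As noted in the thread, the real stack pointer value is destroyed in the process, so the caller has to restore or rebuild it afterwards.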
They even make dual-core variants […]
Am 10.10.2018 um 03:05 schrieb Clifford Heath:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
If you are willing to pay 0.04$, you can get twice the RAM and program
memory (not OTP for this one):
https://detail.1688.com/offer/562502806054.html
Philipp
luni, 15 octombrie 2018, 15:35:22 UTC+3, Philipp Klaus Krause a scris:
Am 10.10.2018 um 03:05 schrieb Clifford Heath:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html> <http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
Too much you say? How about THIS deal???
OTP, no SPI, UART or I²C, but still...
Clifford Heath
If you are willing to pay 0.04$, you can get twice the RAM and program memory (not OTP for this one):
https://detail.1688.com/offer/562502806054.html
Philipp
Nah... not sure. 4c is too much... :-D
http://www.youboy.com/s504250937.html
Three for a penny! But wait, there's MORE!!! It also has more memory
and an ADC.
Compilers can sometimes overlay local variables of non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-time optimization, which is not that common
in compilers for small µCs.
gnuarm.deletethisbit@gmail.com writes:
http://www.youboy.com/s504250937.html
Three for a penny! But wait, there's MORE!!! It also has more memory
and an ADC.
That's 0.35 Chinese Yuan (not Japanese Yen, which uses a similar-looking currency symbol) so about 0.05 USD.
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone. With an efficient stack-pointer-relative addressing mode, you
put all local variables on the stack and only need as much RAM as the
local variables along the longest path in the call tree.
If your local variables are all static, the local variables of two
functions that never get called at the same time still both take space in
RAM at the same time.
Compilers can sometimes overlay local variables of non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-time optimization, which is not that common
in compilers for small µCs.
Example: main() calls f() and g(); both f() and g() call h(). All four functions are in different translation units; f() and g() both use a lot of local variables, while main() and h() use little. Without link-time optimization, the compiler will use about as much RAM as f() and g() together when the local variables are static. When they are put on the stack, it will only need as much RAM as either f() or g().
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone.
With an efficient stack-pointer-relative addressing mode, you
put all local variables on the stack and only need as much RAM as the
local variables along the longest path in the call tree.
If your local variables are all static, the local variables of two
functions that never get called at the same time still both take space in
RAM at the same time.
Compilers can sometimes overlay local variables of non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-time optimization, which is not that common
in compilers for small µCs.
Philipp Klaus Krause <pkk@spth.de> writes:
Compilers can sometimes overly local variables on non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-timeoptimization, which is not that common
in compilers for small µCs.
Normally you'd use whole-program optimization, I thought. I don't know
if SDCC supports that, but GCC does, as do the more serious commercial embedded compilers.
On Mon, 15 Oct 2018 10:44:07 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
On 12.10.2018 at 20:30, upsidedown@downunder.com wrote:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone. With an efficient stack-pointer-relative addressing mode, you
put all local variables on the stack and only need as much RAM as the
local variables along the longest path in the call tree.
If you do not have efficient stack pointer relative addressing modes,
why would you put local variables on stack ?
If your local variables are all static, the local variables of two
functions that never get called at the same time still both take space in
RAM at the same time.
Just create global variables Tmp1, Tmp2, Tmp3 ... and use these as
function local variables. As long as two functions do not call each
other directly or indirectly, you can safely use these global
variables as function local variables.
On 15.10.2018 at 10:44, Philipp Klaus Krause wrote:
On 12.10.2018 at 20:30, upsidedown@downunder.com wrote:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone.
Absolutely not. Reentrant functions are a massive nuisance on fully
embedded systems, if only because they routinely make it impossible to
determine the actual stack size usage.
With an efficient stack-pointer-relative addressing mode, you
put all local variables on the stack and only need as much RAM as the
local variables along the longest path in the call tree.
And without such an addressing mode, you don't, because you'll suffer
badly in every conceivable aspect.
Compilers can sometimes overlay local variables of non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-time optimization, which is not that common
in compilers for small µCs.
On the contrary: it's precisely the compilers for such stack-starved architectures (e.g. the 8051) that have been coupling behind-the-scenes static allocation of automatic variables with whole-program overlay
analysis since effectively forever. They really had to, because the alternative would be painful to the point of being unusable.
On 15.10.2018 at 16:29, Paul Rubin wrote:
Philipp Klaus Krause <pkk@spth.de> writes:
Compilers can sometimes overlay local variables of non-reentrant
functions as an optimization, but that will only work for some cases;
often it would require link-time optimization, which is not that common
in compilers for small µCs.
Normally you'd use whole-program optimization, I thought. I don't know
if SDCC supports that, but GCC does, as do the more serious commercial
embedded compilers.
Does GCC support any of these very simple µC architectures? I thought
anything supported by GCC tends to have rather powerful instruction sets
and plenty of registers anyway, so functions could be made reentrant by
default without any problems resulting.
While some link-time optimizations are commonly requested features for
SDCC, currently none are supported. In SDCC, even inter-procedural optimizations within the same translation unit are not as powerful as
they should be.
Well, there always is a lot of work to do on SDCC, and there are only a
few volunteers with time to work on it. So SDCC developers prioritize
(usually by personal preferences).
Still, when looking at the big picture, SDCC is doing quite well
compared to other compilers for the same architectures (see e.g. http://www.colecovision.eu/stm8/compilers.shtml - comparison from early
2018, around the time of the SDCC 3.7.0 release - current SDCC is 3.8.0).
Philipp
On 15.10.2018 at 21:22, Hans-Bernhard Bröker wrote:
On 15.10.2018 at 10:44, Philipp Klaus Krause wrote:
On 12.10.2018 at 20:30, upsidedown@downunder.com wrote:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone.
Absolutely not. Reentrant functions are a massive nuisance on fully
embedded systems, if only because they routinely make it impossible to
determine the actual stack size usage.
What is the problem?
On the contrary: it's precisely the compilers for such stack-starved
architectures (e.g. the 8051) that have been coupling behind-the-scenes
static allocation of automatic variables with whole-program overlay
analysis since effectively forever. They really had to, because the
alternative would be painful to the point of being unusable.
Well, SDCC when targeting MCS-51 or HC08 would be the combination that I
know a bit about.
SDCC doesn't really have link-time optimization yet; compilation
units are handled independently.
On 16.10.2018 at 10:00, Philipp Klaus Krause wrote:
On 15.10.2018 at 21:22, Hans-Bernhard Bröker wrote:
On 15.10.2018 at 10:44, Philipp Klaus Krause wrote:
On 12.10.2018 at 20:30, upsidedown@downunder.com wrote:
With such small ROM/RAM sizes, who needs reentrant functions ?
Everyone.
Absolutely not. Reentrant functions are a massive nuisance on fully
embedded systems, if only because they routinely make it impossible to
determine the actual stack size usage.
What is the problem?
The major part of it is that I mixed up Reentrance with Recursion there
... sorry for that.
OTOH, one does tend to influence the other. Without recursion, one
would only really need reentrance to be able to call the same function
from separate threads of execution. On controllers this small, that
would only happen if you're calling the same function from inside an interrupt handler and the main loop.
On Tuesday, October 16, 2018 at 4:52:44 PM UTC-4, Hans-Bernhard
Bröker wrote:
OTOH, one does tend to influence the other. Without recursion,
one would only really need reentrance to be able to call the same
function from separate threads of execution. On controllers this
small, that would only happen if you're calling the same function
from inside an interrupt handler and the main loop.
I don't believe this is correct. Reentrance is a problem any time a
routine is entered again before it is exited from a prior call. This
can happen without multiple threads when a routine is called from a
routine that was ultimately called from within the routine. I
suppose you might consider this to be recursion,
but my point is this
can happen without the intent of using recursion.
On 16.10.2018 at 23:01, gnuarm.deletethisbit@gmail.com wrote:
Reentrance is a problem any time a
routine is entered again before it is exited from a prior call. This
can happen without multiple threads when a routine is called from a
routine that was ultimately called from within the routine. I
suppose you might consider this to be recursion,
We call it mutual recursion.
Oh, there's no doubt about it: that's recursion all right.
Some might prefer to qualify it as indirect recursion, a.k.a. a loop in
the call graph, but it's still recursion.
On 16.10.2018 at 23:01, gnuarm.deletethisbit@gmail.com wrote:
On Tuesday, October 16, 2018 at 4:52:44 PM UTC-4, Hans-Bernhard
Bröker wrote:
OTOH, one does tend to influence the other. Without recursion,
one would only really need reentrance to be able to call the same
function from separate threads of execution. On controllers this
small, that would only happen if you're calling the same function
from inside an interrupt handler and the main loop.
I don't believe this is correct. Reentrance is a problem any time a
routine is entered again before it is exited from a prior call. This
can happen without multiple threads when a routine is called from a
routine that was ultimately called from within the routine. I
suppose you might consider this to be recursion,
Oh, there's no doubt about it: that's recursion all right.
Some might prefer to qualify it as indirect recursion, a.k.a. a loop in
the call graph, but it's still recursion.
but my point is this
can happen without the intent of using recursion.
I'll assume we agree on this: unintended recursion is clearly a bug in
the code, every time.
Clearly there would be a bug, but it is just as much that the routine
wasn't designed for recursion, and that would be the most likely fix.
That could arguably be classified as an actual benefit of using such a
stack-starved CPU architecture: any competent C compiler for it will
have to perform call tree analysis anyway, so it finds that particular
bug "en passant".
Are you swearing at me in French? ;)
More typical C toolchains relying on stack-centric calling conventions
might not bother with such analysis, and thus won't see the bug. Until
you use the accompanying stack size calculation tool, that is, which
will barf.
Yeah, I'm not much of a C programmer, so I wouldn't know about such
tools. What made me think of this is a problem often encountered by
novices in Forth. Some system words use globally static data and can be
called twice from different code before the first call has ended use of
the data structure. Not quite the same thing as recursion, but the same
result.
I'll assume we agree on this: unintended recursion is clearly a bug in
the code, every time.
Without recursion, one
would only really need reentrance to be able to call the same function
from separate threads of execution. On controllers this small, that
would only happen if you're calling the same function from inside an interrupt handler and the main loop. And frankly: you really don't want
to do that. If an ISR on this kind of hardware becomes big enough you
feel the need to split it into sub-functions, that almost certainly
means you've picked entirely the wrong tool for the job.
In other words: for this kind of system (very small, with rotten
stack-based addressing), not only doesn't everyone need re-entrant
functions, it's more like nobody does.
I don't think anyone has ever seriously claimed SDCC to be anywhere near
the pinnacle of compiler design for the 8051. ;-P
Frankly, just looking at statements in this thread has me thinking that
the usual suspects among commercial offerings from 20 years ago might
still run circles around it today.
When I am faced with someone else's code to examine or maintain, I often
run it through Doxygen with "generate documentation for /everything/ -
caller graphs, callee graphs, cross-linked source, etc." It can make it quick to jump around in the code. And recursive (or re-entrant,
whichever you prefer) code stands out like a sore thumb, as long as the
code is single-threaded - you get loops in the call graphs.
On 18-10-17 01:46 , David Brown wrote:
...
When I am faced with someone else's code to examine or maintain, I often
run it through Doxygen with "generate documentation for /everything/ -
caller graphs, callee graphs, cross-linked source, etc." It can make it
quick to jump around in the code. And recursive (or re-entrant,
whichever you prefer) code stands out like a sore thumb, as long as the
code is single-threaded - you get loops in the call graphs.
Anecdote: some years ago, when I was applying a WCET analysis tool to
someone else's program, the tool found recursion. This surprised the
people I was working with, because they had generated call graphs for
the program, analysed them visually, and found no recursive, looping paths.
Turned out that they had asked the call-graph tool to optimize the size
of the window used to display the call-graphs. The tool did as it was
told, with the result that the line segments on the path for the
recursive call went down to the bottom edge of the diagram, then
*merged* with the lower border line of the diagram, followed that lower border, went up one side of the diagram -- still merged with the border
line -- and then reentered the diagram to point at the source of the recursive call, effectively making the loop very hard to see...
(It turned out that this recursion was intentional. At this point, the program was sending an alarm message, but the alarm buffer was full, so
the alarm routine called itself to send an alarm about the full buffer
-- and that worked, because one buffer slot was reserved, by design, for
this "buffer full" alarm.)
On Wednesday, October 17, 2018 at 2:35:46 AM UTC-4, Niklas Holsti
wrote:
On 18-10-17 01:46 , David Brown wrote: ...
When I am faced with someone else's code to examine or maintain,
I often run it through Doxygen with "generate documentation for
/everything/ - caller graphs, callee graphs, cross-linked source,
etc." It can make it quick to jump around in the code. And
recursive (or re-entrant, whichever you prefer) code stands out
like a sore thumb, as long as the code is single-threaded - you
get loops in the call graphs.
Anecdote: some years ago, when I was applying a WCET analysis tool
to someone else's program, the tool found recursion. This surprised
the people I was working with, because they had generated call
graphs for the program, analysed them visually, and found no
recursive, looping paths.
Turned out that they had asked the call-graph tool to optimize the
size of the window used to display the call-graphs. The tool did as
it was told, with the result that the line segments on the path for
the recursive call went down to the bottom edge of the diagram,
then *merged* with the lower border line of the diagram, followed
that lower border, went up one side of the diagram -- still merged
with the border line -- and then reentered the diagram to point at
the source of the recursive call, effectively making the loop very
hard to see...
(It turned out that this recursion was intentional. At this point,
the program was sending an alarm message, but the alarm buffer was
full, so the alarm routine called itself to send an alarm about the
full buffer -- and that worked, because one buffer slot was
reserved, by design, for this "buffer full" alarm.)
Seems to me what actually failed was that they knew they had
recursion in the design but didn't realize the fact that they didn't
see the recursion in the call graphs was an error that should have
been caught.
On 16.10.2018 at 22:52, Hans-Bernhard Bröker wrote:
Without recursion, one
would only really need reentrance to be able to call the same function
from separate threads of execution. On controllers this small, that
would only happen if you're calling the same function from inside an
interrupt handler and the main loop. And frankly: you really don't want
to do that. If an ISR on this kind of hardware becomes big enough you
feel the need to split it into sub-functions, that almost certainly
means you've picked entirely the wrong tool for the job.
In other words: for this kind of system (very small, with rotten
stack-based addressing), not only doesn't everyone need re-entrant
functions, it's more like nobody does.
Multithreading matters here. It is not common on such small devices, but
this one is an exception: Padauk sells multiple dual-core variants of
this controller and one 8-core variant.
On 18-10-17 17:08 , gnuarm.deletethisbit@gmail.com wrote:
On Wednesday, October 17, 2018 at 2:35:46 AM UTC-4, Niklas Holsti
wrote:
On 18-10-17 01:46 , David Brown wrote: ...
When I am faced with someone else's code to examine or maintain,
I often run it through Doxygen with "generate documentation for
/everything/ - caller graphs, callee graphs, cross-linked source,
etc." It can make it quick to jump around in the code. And
recursive (or re-entrant, whichever you prefer) code stands out
like a sore thumb, as long as the code is single-threaded - you
get loops in the call graphs.
Anecdote: some years ago, when I was applying a WCET analysis tool
to someone else's program, the tool found recursion. This surprised
the people I was working with, because they had generated call
graphs for the program, analysed them visually, and found no
recursive, looping paths.
Turned out that they had asked the call-graph tool to optimize the
size of the window used to display the call-graphs. The tool did as
it was told, with the result that the line segments on the path for
the recursive call went down to the bottom edge of the diagram,
then *merged* with the lower border line of the diagram, followed
that lower border, went up one side of the diagram -- still merged
with the border line -- and then reentered the diagram to point at
the source of the recursive call, effectively making the loop very
hard to see...
(It turned out that this recursion was intentional. At this point,
the program was sending an alarm message, but the alarm buffer was
full, so the alarm routine called itself to send an alarm about the
full buffer -- and that worked, because one buffer slot was
reserved, by design, for this "buffer full" alarm.)
Seems to me what actually failed was that they knew they had
recursion in the design but didn't realize the fact that they didn't
see the recursion in the call graphs was an error that should have
been caught.
The guys creating and viewing the call-graphs were not the designers of
the program, either, so they didn't know, but for sure it was something
they should have discovered and remarked on as part of their work.
While I have been playing around with the idea of making some RTOS for
such a 1 kword / 64 byte machine (realistically supporting 2-3 tasks,
such as a foreground/background monitor), having 2 or 8 threads is not
very realistic, even if the hardware supports it.
The 8-core version might be usable for xCore-style "pseudo-interrupts",
running a single DSP sample or PLC loop at a time. This would require
8 input pins, each starting its own thread.
But of course, the same rules should apply to pseudo-interrupts as
real interrupts regarding re-entrancy etc.
On Wednesday, October 17, 2018 at 11:37:14 AM UTC-4, Niklas Holsti
wrote:
On 18-10-17 17:08 , gnuarm.deletethisbit@gmail.com wrote:
On Wednesday, October 17, 2018 at 2:35:46 AM UTC-4, Niklas
Holsti wrote:
On 18-10-17 01:46 , David Brown wrote: ...
When I am faced with someone else's code to examine or
maintain, I often run it through Doxygen with "generate
documentation for /everything/ - caller graphs, callee
graphs, cross-linked source, etc." It can make it quick to
jump around in the code. And recursive (or re-entrant,
whichever you prefer) code stands out like a sore thumb, as
long as the code is single-threaded - you get loops in the
call graphs.
Anecdote: some years ago, when I was applying a WCET analysis
tool to someone else's program, the tool found recursion. This
surprised the people I was working with, because they had
generated call graphs for the program, analysed them visually,
and found no recursive, looping paths.
Turned out that they had asked the call-graph tool to optimize
the size of the window used to display the call-graphs. The
tool did as it was told, with the result that the line segments
on the path for the recursive call went down to the bottom edge
of the diagram, then *merged* with the lower border line of the
diagram, followed that lower border, went up one side of the
diagram -- still merged with the border line -- and then
reentered the diagram to point at the source of the recursive
call, effectively making the loop very hard to see...
(It turned out that this recursion was intentional. At this
point, the program was sending an alarm message, but the alarm
buffer was full, so the alarm routine called itself to send an
alarm about the full buffer -- and that worked, because one
buffer slot was reserved, by design, for this "buffer full"
alarm.)
Seems to me what actually failed was that they knew they had
recursion in the design but didn't realize the fact that they
didn't see the recursion in the call graphs was an error that
should have been caught.
The guys creating and viewing the call-graphs were not the
designers of the program, either, so they didn't know, but for sure
it was something they should have discovered and remarked on as
part of their work.
Do you know the intended purpose of the call graphs? It seems to me
that it would be to match expectations to what was coded. It shouldn't
matter who was doing the evaluation, there should have been an
accounting of expectations regarding the presence and/or absence of
recursion.
Clifford Heath <no.spam@please.net> writes:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
That is impressive! Seems to be an 8-bit RISC with no registers, just
an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
I like that it's in a 6-pin SOT23 package since there aren't many other
MCUs that small.
On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:
Clifford Heath <no.spam@please.net> writes:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
That is impressive! Seems to be an 8-bit RISC with no registers, just
an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
I like that it's in a 6-pin SOT23 package since there aren't many other
MCUs that small.
Slightly OT, but I have often wondered how primitive a computer
architecture can be and still do some useful work. In the
tube/discrete/SSI times, there were quite a lot of 1-bit processors.
There were at least two types: the PLC (Programmable Logic Controller)
type replacing relay logic, which typically had at least AND, OR, NOT,
(XOR) instructions. The other group was used as truly serial computers,
with the same instructions as the PLC but also at least 1-bit SUB
(and ADD) instructions to implement all mathematical functions.
However, in the LSI era, there don't seem to be many implementations.
One that immediately comes to mind is the MC14500B PLC building block
from the 1970's, which requires quite a lot of support chips (code
memory, PC, I/O chips) to do some useful work.
After much searching, I found the GI (General Instrument) SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with 1024 words of (8-bit) instruction ROM, four
banks of 30 _bits_ of data memory, and 30 I/O pins in a 40-pin package.
For the re-entrance enthusiasts, it contains stack pointer relative
addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
Darlington buffers may be needed to drive loads typically found in a
PLC environment.
Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?
On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:
Clifford Heath <no.spam@please.net> writes:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
That is impressive! Seems to be an 8-bit RISC with no registers, just
an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram,
enough for plenty of MCU things. Didn't check if it has an ADC or PWM.
I like that it's in a 6-pin SOT23 package since there aren't many other
MCUs that small.
Slightly OT, but I have often wondered how primitive a computer
architecture can be and still do some useful work. In the
tube/discrete/SSI times, there were quite a lot of 1-bit processors.
There were at least two types: the PLC (Programmable Logic Controller)
type replacing relay logic, which typically had at least AND, OR, NOT,
(XOR) instructions. The other group was used as truly serial computers,
with the same instructions as the PLC but also at least 1-bit SUB
(and ADD) instructions to implement all mathematical functions.
However, in the LSI era, there don't seem to be many implementations.
One that immediately comes to mind is the MC14500B PLC building block
from the 1970's, which requires quite a lot of support chips (code
memory, PC, I/O chips) to do some useful work.
After much searching, I found the GI (General Instrument) SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with 1024 words of (8-bit) instruction ROM, four
banks of 30 _bits_ of data memory, and 30 I/O pins in a 40-pin package.
For the re-entrance enthusiasts, it contains stack pointer relative
addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803
Darlington buffers may be needed to drive loads typically found in a
PLC environment.
Anyone seen more modern 1-bit chips either for relay replacement or
for truly serial computers ?
LEM1_9 and LEM4_9 are FPGA soft cores that are intended for that purpose (Logic Emulation Machine) https://opencores.org/project/lem1_9min
Jim Brakefield
On Sunday, October 21, 2018 at 10:47:26 AM UTC-4, jim.bra...@ieee.org wrote:
There are advantages to using several soft core processors, each sized
and customized to the need.
On Sunday, October 21, 2018 at 8:27:35 AM UTC-5, upsid...@downunder.com wrote:
On Wed, 10 Oct 2018 19:29:13 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:
Clifford Heath <no.spam@please.net> writes:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or Iæ¶Ž, but still...
That is impressive! Seems to be an 8-bit RISC with no registers, just >an accumulator, a cute concept. 1K of program OTP and 64 bytes of ram, >enough for plenty of MCU things. Didn't check if it has an ADC or PWM. >I like that it's in a 6-pin SOT23 package since there aren't many other >MCUs that small.
Slightly OT, but I have often wonder how primitive a computer architecture can be and still do some useful work. In the tube/discrete/SSI times, there were quite a lot 1 bit processors.
There were at least two types, the PLC (programmable Logic Controller) type replacing relay logic. These had typically at least AND, OR, NOT, (XOR) instructions.The other group was used as truly serial computers with the same instructions as the PLC but also at least a 1 bit SUB
(and ADD) instructions to implement all mathematical functions.
However, in the LSI era, there down't seem to be many implement ions.
One that immediately comes in mind is the MC14500B PLC building block, from the 1970's, which requires quite lot of support chips (code
memory, PC, /O chips) to do some useful work.
After much searching, I found the General Instrument (GI) SBA
(Serial Boolean Analyser)
http://www.wass.net/othermanuals/GI%20SBA.pdf
from the same era, with 1024 instruction words (8 bits each) of ROM, four
banks of 30 _bits_ of data memory, and 30 I/O pins in a 40 pin package.
For the re-entrance enthusiasts, it contains stack pointer relative addressing :-). The I/O pins are 5 V TTL compatible, so a few ULN2803 Darlington buffers may be needed to drive the loads typically found in a PLC environment.
Anyone seen more modern 1 bit chips either for relay replacement or
for truly serial computers ?
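As an aside, the "truly serial" arithmetic mentioned above, building an N-bit add out of a 1-bit full adder plus a carry flip-flop at one bit per cycle, can be sketched like this (an illustration of the technique, not any particular chip's code):

```python
# Bit-serial addition: an N-bit add performed one bit per "cycle",
# the way a serial ALU with a single carry flip-flop would do it.
def serial_add(a, b, width=8):
    carry = 0
    result = 0
    for i in range(width):           # LSB first, one bit per cycle
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        s = abit ^ bbit ^ carry      # 1-bit full-adder sum
        carry = (abit & bbit) | (carry & (abit ^ bbit))
        result |= s << i
    return result & ((1 << width) - 1)

assert serial_add(100, 55) == 155
assert serial_add(200, 100) == 44    # wraps mod 256, like an 8-bit machine
```

Subtraction, comparison and multiplication all fall out of the same loop with minor changes, which is why a 1-bit ALU was "enough" for early serial machines.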
It is hard for me to imagine applications where a 1 bit processor would be useful. A useful N bit processor can be built in a small number of LUTs. I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.
I discussed this with someone once and he imagined apps where the processing speed requirement was quite low and you can save LUTs with a bit serial processor. I just don't know how many or why it would matter. Even the smallest FPGAs have thousands of LUTs. It's hard to picture an application where you couldn't spare a few hundred LUTs.
Rick C.
It's hard to picture an application where you couldn't spare a few hundred LUTs.
I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.
There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
On Sunday, October 21, 2018 at 10:08:06 AM UTC-5, gnuarm.del...@gmail.com wrote:
I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.
There are advantages to using several soft core processors, each sized and customized to the need.
I've built a 16 bit processor in just 600 LUTs and I've seen processors in a bit less.
There are many under 600 LUTs, including 32-bit. Had hoped the full featured LEM design would be under 100 LUTs.
Have done some rough research of what's available for under 600 LUTs: https://opencores.org/project/up_core_list/downloads
select: "By Performance Metric"
A big rationale for small soft core processors is that they replace LUTs (slow-speed logic) with block RAM (instructions). And they are completely deterministic, as opposed to doing the same by time-slicing an ASIC (ARM) processor.
On Sunday, October 21, 2018 at 12:31:34 PM UTC-4, jim.bra...@ieee.org wrote:
There are a small number of examples:
I won't argue a bit that softcores and especially *customizable* softcore CPUs aren't useful. I was talking about there being at best a very tiny region of utility for 1-bit processors.
My 600 LUT processor didn't trade off much for performance. It would run pretty fast and was pretty capable. In addition the word size was independent of the instruction set. That said, there are apps where a much less powerful processor would do fine and saving a few more LUTs would be useful.
Rick C.
there being at best a very tiny region of utility for 1-bit processors
On Sunday, October 21, 2018 at 12:51:34 PM UTC-5, gnuarm.del...@gmail.com wrote:
there being at best a very tiny region of utility for 1-bit processors
There are a small number of examples:
Bit serial processors such as the DEC PDP8L, early vacuum tube & drum
machines, for example the Bendix G-15.
Bit serial CORDIC.
Also telling is that the 4-bit processors for calculators have been replaced
by 8-bit processors.
My inspiration was EDIF, which was/is the output from VHDL & Verilog
compilers - e.g. use EDIF as a machine language. In the context of logic simulation, greater FPGA capacity is possible for slow logic.
This effort also led to a theoretical insight for brain modelling: there
is greater information content in the wiring than in the logic. The
human brain has 2^36+ neurons, requiring 36 bits of information for each connection but only 16 or so bits for the state/configuration of each synapse. Likewise, an FPGA requires 60+ bits to route each LUT input (assuming
all LUT inputs are in use), whereas each possible input could be specified in 20 bits or less (1M LUT FPGA).
Of course, optimizing simulators convert the EDIF to an existing machine language. Likewise for industrial automation (ladder logic, ...).
Jim Brakefield
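The wiring-versus-logic claim can be checked with quick arithmetic; the numbers below simply restate the post's assumptions (2^36 neurons, ~16 bits per synapse, a 1M-LUT FPGA with 60+ routing bits per LUT input), they are not independent measurements:

```python
import math

# ~2^36 neurons -> naming the target of each connection takes ~36 bits,
# versus ~16 bits assumed per synapse state: wiring carries more info.
assert math.log2(2**36) == 36

# A 1M-LUT FPGA: naming any one LUT as the source of an input
# needs only log2(2^20) = 20 bits in a minimal-pointer encoding...
sources = 2**20
min_bits = math.ceil(math.log2(sources))
assert min_bits == 20

# ...but if the routing fabric spends 60+ configuration bits per input,
# the wiring again dominates the minimal encoding by ~3x.
assert 60 / min_bits == 3.0
```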
Tim <cpldcpu+usenet@gmail.com> wrote:
This is quite curious. I wonder
- Has anyone actually received the devices they ordered? The cheaper
variants seem to be sold out.
I think they've sold out since they went viral. EEVblog did a video showing 550 in stock - that's only $16 worth of parts, not hard to imagine they've been bought up.
The other option is they're some kind of EOL part and 3c is the 'reduced to clear' price - which they have done, very successfully.
Theo
On 12/10/18 08:50, Philipp Klaus Krause wrote:
On 12.10.2018 at 01:08, Paul Rubin wrote:
upsidedown@downunder.com writes:
There are a lot of operations that will update memory locations, so why would you need a lot of CPU registers.
Being able to (say) add register to register saves traffic through the
accumulator and therefore instructions.
1 KiB = 0.5 KiW is quite a lot; it is about 10-15 pages of commented
assembly program listing.
It would be nice to have a C compiler, and registers help with that.
Looking at the instruction set, it should be possible to make a backend
for this in SDCC; the architecture looks more C-friendly than the
existing pic14 and pic16 backends. But it surely isn't as nice as stm8
or z80.
Reentrant functions will be inefficient: no registers, and no SP-relative
addressing mode. One would want to reserve a few memory locations as
pseudo-registers to help with that, but that only goes so far.
It looks like the lowest 16 memory addresses could be considered pseudo-registers - they are the ones that can be used for direct memory access rather than needing indirect access.
On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
On 10.10.2018 at 03:05, Clifford Heath wrote:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
They even make dual-core variants (the part where the first digit in the
part number is '2'). It seems program counter, stack pointer, flag
register and accumulator are per-core, while the rest, including the ALU
is shared. In particular, the I/O registers are also shared, which means
some multiplier registers would also be - but currently all variants
with integrated multiplier are single-core.
Use of the ALU is shared by the two cores, alternating by clock cycle.
Philipp
Interesting, that would make it easy to run a multitasking RTOS (foreground/background monitor), which might justify the use of some reentrant library routines :-). But in reality, the available memory (ROM/RAM) is so small that you could easily manage this with static
memory allocations.
On 12.10.18 at 20:39, upsidedown@downunder.com wrote:
On Fri, 12 Oct 2018 10:18:56 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
On 10.10.2018 at 03:05, Clifford Heath wrote:
<https://lcsc.com/product-detail/PADAUK_PADAUK-Tech-PMS150C_C129127.html>
<http://www.padauk.com.tw/upload/doc/PMS150C%20datasheet%20V004_EN_20180124.pdf>
OTP, no SPI, UART or I²C, but still...
Clifford Heath
They even make dual-core variants (the part where the first digit in the part number is '2'). It seems program counter, stack pointer, flag
register and accumulator are per-core, while the rest, including the ALU, is shared. In particular, the I/O registers are also shared, which means some multiplier registers would also be - but currently all variants
with integrated multiplier are single-core.
Use of the ALU is shared by the two cores, alternating by clock cycle.
Philipp
Interesting, that would make it easy to run a multitasking RTOS
(foreground/background) monitor, which might justify the use of some
reentrant library routines :-). But in reality, the available memory
(ROM/RAM) is so small so that you could easily manage this with static
memory allocations.
But static memory allocation would require one copy of each function per thread. And the linker would have to analyze the call graph to always
call the correct function for each thread. Function pointers get
complicated.
Unfortunately, reentrancy becomes even harder with
hardware-multithreading: To access the stack, one has to construct a
pointer to the stack location in a memory location. That memory location
(as any pseudo-registers) is then shared among all running instances of
the function. So it needs to be protected (e.g. with a spinlock), making access even more inefficient. And that spinlock will cause issues with interrupts (a solution might be to heavily restrict interrupt routines, essentially allowing not much more than setting some global variables).
Then there is the trade-off of using one such memory location per
function vs. per program (the latter reducing memory usage, but
resulting in less parallelism).
The pseudo-registers one would want to use are not so much a problem for interrupt routines (they would just need saving and thus increase
interrupt overhead a bit), but for hardware parallelism. Essentially all access to them would again have to be protected by a spinlock.
All these problems could have relatively easily been avoided by
providing an efficient stack-pointer-relative addressing mode. Having a
few general-purpose or index registers would have somewhat helped as well.
Philipp
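The shared-pseudo-register hazard described above is easy to demonstrate in simulation. In this toy Python model (not Padauk code), two "cores" funnel their stack-pointer construction through a single shared memory cell, and a lock standing in for the spinlock is what keeps each core from seeing the other's pointer:

```python
import threading

memory = {"ptr": 0}        # the single shared "pseudo-register"
lock = threading.Lock()    # stands in for the proposed spinlock
results = []

def core(thread_id, stack_base):
    for _ in range(1000):
        with lock:  # without this, the cores overwrite ptr mid-use
            memory["ptr"] = stack_base          # build pointer to own stack
            results.append((thread_id, memory["ptr"]))

t1 = threading.Thread(target=core, args=(1, 0x10))
t2 = threading.Thread(target=core, args=(2, 0x50))
t1.start(); t2.start(); t1.join(); t2.join()

# Under the lock, each core only ever saw its own pointer value.
assert all(ptr == (0x10 if tid == 1 else 0x50) for tid, ptr in results)
```

Delete the `with lock:` and the model can interleave the write and the use, which is exactly the corruption the spinlock (and its interrupt complications) exists to prevent.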
And you'll end up with a low-end Cortex ...
But static memory allocation would require one copy of each function per
thread.
For a foreground/background monitor, the worst case would be two
copies of static data, if both threads use the same subroutine.
And the linker would have to analyze the call graph to always
call the correct function for each thread.
A linker for such a small target ?
With such a small processor, just track any dependencies manually.
Function pointers get complicated.
Do you really insist on using function pointers with such small
targets?
Unfortunately, reentrancy becomes even harder with
hardware-multithreading:
With two hardware threads, you would need at most two copies of static
data.
TO access the stack, one has to construct a
pointer to the stack location in a memory location.
Why would you want to access the stack ?
The stack is usable for handling return addresses, but I guess that a hardware thread must have its own return address stack pointer.
That memory location
(as any pseudo-registers) is then shared among all running instances of
the function. So it needs to be protected (e.g. with a spinlock), making
access even more inefficient. And that spinlock will cause issues with
interrupts (a solution might be to heavily restrict interrupt routines,
essentially allowing not much more than setting some global variables).
Disabling all interrupts for the duration of some critical operations
is often enough, but of course, the number of instructions executed
with interrupts disabled should be minimized.
On 08.11.18 at 20:52, upsidedown@downunder.com wrote:
But static memory allocation would require one copy of each function per thread.
For a foreground/background monitor, the worst case would be two
copies of static data, if both threads use the same rubroutine.
And the linker would have to analyze the call graph to always
call the correct function for each thread.
Linker for such small target ?
Of course. The support routines the compiler uses reside in some
library, the linker links them in if necessary. Also, the larger
variants are not that small, with up to 256 B of RAM and 8 KB of ROM.
One might want to e.g. have one .c file for handling I²C, one for the
soft UART, etc.
A linker is required if the libraries are (for copyright reasons)
delivered as binary object code only.
However, if the libraries are delivered as source files and the compiler/assembler has even a rudimentary #include mechanism, just
include those library files you need. With an include or macro
processor with parameter passing, just invoke the same include file or
macro twice with different parameters for different static variable instances.
Of course, linkers are also needed, if very primitive compilation
machines are used, such as floppy based Intellecs or Exorcisers. It
could take a day to compile a large program all the way from sources,
with multiple floppy changes to get the final absolute file to a
single floppy, ready to be burnt into EPROMS for an additional hour or
two. In such an environment, compiling, linking and burning only the
source files that changed would speed up program development a lot.
When using a modern PC for compilation, there are no such issues.
On Fri, 12 Oct 2018 22:06:02 +0200, Philipp Klaus Krause <pkk@spth.de>
wrote:
Am 12.10.2018 um 20:30 schrieb upsidedown@downunder.com:
The real issue would be the small RAM size.
Devices with this architecture go up to 256 B of RAM (but they then cost
a few cent more).
Philipp
Did you find the binary encoding of the various instruction formats, i.e.
how many bits are allocated to the operation code and how many to the
address field ?
My initial guess was that the instruction word is simply an 8-bit opcode
+ 8-bit address, but the bit and word address limits for the smaller
models would suggest that for some opcodes the opcode field might
be wider than 8 bits and the address field narrower than 8 bits (e.g. bit
and word addressing).
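For what it's worth, the PMS150C's program words appear to be 13 bits wide (SDCC's later port calls the ISA "pdk13"), so a plain 8+8 split cannot be right and the opcode/address boundary must vary by instruction class, much as guessed above. A hypothetical decoder for such a variable-split word might look like this; the field boundaries here are invented for illustration, not taken from the datasheet:

```python
# Illustrative variable-field decode of a 13-bit instruction word:
# bit operations trade address width for a 3-bit bit-number field,
# while memory operations keep a full 8-bit address. (Hypothetical
# boundaries, for illustration only.)
def decode(word):
    assert 0 <= word < (1 << 13)
    if word >> 10 == 0b101:                  # "bit operation" class
        return {"class": "bit",
                "bit": (word >> 7) & 0x7,    # which bit, 0..7
                "addr": word & 0x7F}         # 7-bit address
    return {"class": "mem",
            "opcode": word >> 8,             # 5-bit opcode
            "addr": word & 0xFF}             # 8-bit address

d = decode(0b101_011_0000101)
assert d == {"class": "bit", "bit": 3, "addr": 5}
```

The point is only that in a sub-16-bit word, bit-addressed instructions necessarily steal bits from the address field, which is consistent with the smaller models' tighter bit/word address limits.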