I can do a *HUNDRED* of your benches with my single word version!
... and still you wouldn't beat me ..
<mike drop>
Hans Bezemer
From the other optimizers team :o)
Extra simple and easy to maintain
(for well-formed hh:mm:ss):
# : hms ':' parse evaluate ':' parse evaluate bl parse evaluate ;  ok
# hms 12:59:21  ok
12 59 21 #
--
OK, let's move that simple optimization
process a little bit further; what if we
do the same even to these two variable calls
at the very beginning, that are outside the loop?
VARIABLE C6  ok
VARIABLE C1  ok
: TIMESTRSCAN2
 1 [ C6 ] LITERAL !  1 [ C1 ] LITERAL !
 >R >R 0 0 R> R> OVER + 1-
 DO
  I C@ DUP 58 =
  IF
    DROP
    [ C6 ] LITERAL @ 60 * [ C6 ] LITERAL !
    1 [ C1 ] LITERAL !
  ELSE
    48 - [ C1 ] LITERAL @ * [ C6 ] LITERAL @ M* D+
    10 [ C1 ] LITERAL !
  THEN
 -1 +LOOP
;  ok
: TLOOP4 TICKS PAD 8 30000 0 DO 2DUP TIMESTRSCAN2 2DROP LOOP
2DROP TICKS 2SWAP D- D. ;  ok
S" 12:34:56" PAD SWAP CMOVE  ok
TLOOP4 9  ok
That's enough for today. :)
Let's find out the difference between my
and "Mark Twain" 's approach to the solution
of "parsing 'time string' task". For simplicity
I'll do everything using DX Forth.
My solution is:
VARIABLE C6
VARIABLE C1
: TIMESTRSCAN ( addr count -- d )
 1 C6 ! 1 C1 !
 >R >R 0 0 R> R> OVER + 1-
 DO
  I C@ DUP 58 =
  IF
    DROP
    C6 @ 60 * C6 !
    1 C1 !
  ELSE
    48 - C1 @ * C6 @ M* D+
    10 C1 !
  THEN
 -1 +LOOP
;
On 11/06/2025 11:54 pm, LIT wrote:
Let's find out the difference between my
and "Mark Twain" 's approach to the solution
of "parsing 'time string' task". For simplicity
I'll do everything using DX Forth.
My solution is:
VARIABLE C6
VARIABLE C1
: TIMESTRSCAN ( addr count -- d )
 1 C6 ! 1 C1 !
 >R >R 0 0 R> R> OVER + 1-
 DO
  I C@ DUP 58 =
  IF
    DROP
    C6 @ 60 * C6 !
    1 C1 !
  ELSE
    48 - C1 @ * C6 @ M* D+
    10 C1 !
  THEN
 -1 +LOOP
;
Same using stack ops:
: TIMESTRSCAN2 ( addr count -- d )
>R >R 0 0 1 1 ( C6 C1) R> R>
OVER + 1- DO
I C@ DUP 58 = IF DROP
DROP 60 * ( C6) 1 ( C1)
ELSE
48 - * OVER ( C6) >R UM* D+ R> 10 ( C1)
THEN
-1 +LOOP 2DROP ;
12% faster and 20% smaller by my measurement.
On 11-06-2025 20:00, dxf wrote:
On 11/06/2025 11:54 pm, LIT wrote:
Let's find out the difference between my
and "Mark Twain" 's approach to the solution
of "parsing 'time string' task". For simplicity
I'll do everything using DX Forth.
My solution is:
VARIABLE C6
VARIABLE C1
: TIMESTRSCAN ( addr count -- d )
  1 C6 ! 1 C1 !
  >R >R 0 0 R> R>
  OVER + 1-
  DO
   I C@ DUP 58 =
   IF
     DROP
     C6 @ 60 * C6 !
     1 C1 !
   ELSE
     48 - C1 @ * C6 @ M* D+
     10 C1 !
   THEN
  -1 +LOOP
;
Same using stack ops:
: TIMESTRSCAN2 ( addr count -- d )
  >R >R 0 0 1 1 ( C6 C1) R> R>
  OVER + 1- DO
    I C@ DUP 58 = IF DROP
      DROP 60 * ( C6) 1 ( C1)
    ELSE
      48 - * OVER ( C6) >R UM* D+ R> 10 ( C1)
    THEN
  -1 +LOOP 2DROP ;
12% faster and 20% smaller by my measurement.
It's slightly faster than the original on 4tH:
real   0m5,527s
user   0m5,522s
sys   0m0,004s
Opcodes: his: 39, yours: 37
On 11-06-2025 18:29, minforth wrote:
 From the other optimizers team :o)
Extra simple and easy to maintain
(for well-formed hh:mm:ss):
# : hms ':' parse evaluate ':' parse evaluate bl parse evaluate ;  ok
# hms 12:59:21  ok
12 59 21 #
--
Can't replicate it in 4tH, but I'll take it! :)
I also ran minforth's version, and timed this line:
HMS 12:34:56 .S BYE
It was not fast:
minforth cycles=834951
Make of this what you will.
From the other optimizers team :o)
Extra simple and easy to maintain
(for well-formed hh:mm:ss):
# : hms ':' parse evaluate ':' parse evaluate bl parse evaluate ;  ok
# hms 12:59:21  ok
12 59 21 #
--
This version with the string in memory, no error checking, and using a variable, seems simplest to me.
variable p
: advance ( -- ) 1 p +! ;
: digit ( -- n ) p @ c@ '0' - advance ;
: 2digit ( -- n ) digit 10 * digit + ;
: hms ( a u -- h m s ) drop p !
2digit advance 2digit advance 2digit advance ;
: test clearstack s" 12:34:56" hms ;
test .s
: hms ( a u -- h m s ) drop p !
2digit advance 2digit advance 2digit advance ;
^^^^^^^
This serves no purpose.
It doesn't do the same thing. It isn't faster, which points out
the '60 *' and 'M* D+' are not the limiting factors.
mhx@iae.nl (mhx) writes:
It doesn't do the same thing. It isn't faster, which points out
the '60 *' and 'M* D+' are not the limiting factors.
Oh I copied the interface from another post, didn't realize it was
supposed to convert to seconds. I didn't care about the speed since any slowness in any of these versions can be blamed on the compiler ;).
Here is another version:
variable p
: advance 1 p +! ;
: digit ( -- n ) p @ c@ '0' - advance ;
: 2digit ( -- n ) digit 10 * digit + advance ; \ skips trailing colon
: hms ( a u -- n ) drop p ! 2digit 60 * 2digit + 60 * 2digit + ;
: test clearstack s" 12:34:56" hms ;
test .s
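As a sanity check on the seconds interface: 12:34:56 should come out as 12*3600 + 34*60 + 56 = 45296, which is the value mhx's timing runs report for all three words. A minimal sketch of the same Horner-style arithmetic on three numbers already on the stack; `hms>seconds` is a made-up name for illustration, not a word from any of the posts:

```forth
\ ((h * 60) + m) * 60 + s, the same shape as the hms definition above.
\ hms>seconds is a hypothetical helper, used only to check the arithmetic.
: hms>seconds ( h m s -- n )
  >r            \ park seconds on the return stack
  swap 60 * +   \ h*60 + m
  60 *          \ (h*60 + m) * 60
  r> + ;        \ ... + s

12 34 56 hms>seconds .   \ prints 45296
```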
iSPICE> TEST TEST TEST TEST
\ dtimescan : 2515 clock ticks elapsed, 45296
\ timestrscan : 419 clock ticks elapsed, 45296
\ HMS : 419 clock ticks elapsed, 45296
It makes no difference. Whatever is holding it down must
be quite severe.
: keep { lo hi adr len } lo adr len ;
: get { adr len } 0. adr len >number keep 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
mhx@iae.nl (mhx) writes:
iSPICE> TEST TEST TEST TEST
\ dtimescan : 2515 clock ticks elapsed, 45296
\ timestrscan : 419 clock ticks elapsed, 45296
\ HMS : 419 clock ticks elapsed, 45296
It makes no difference. Whatever is holding it down must
be quite severe.
Is dtimescan the one that uses double word arithmetic for 16 bit
processors? It's doing more stuff, I would think. Weird that it
catches up after a few tries. Cache warming?
On 13/06/2025 8:24 am, B. Pym wrote:
: keep { lo hi adr len } lo adr len ;
: get { adr len } 0. adr len >number keep 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
Not worth the locals IMO. OTOH I guarantee (number) will be re-used.
: (number) ( adr len -- ud adr' len' ) 0. 2swap >number ;
: get ( adr len -- u adr' len' ) (number) rot drop 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
On Thu, 12 Jun 2025 02:59:56 -0700
Paul Rubin <no.email@nospam.invalid> wrote:
This version with the string in memory, no error checking, and using a
variable, seems simplest to me.
variable p
: advance ( -- ) 1 p +! ;
: digit ( -- n ) p @ c@ '0' - advance ;
: 2digit ( -- n ) digit 10 * digit + ;
: hms ( a u -- h m s ) drop p !
2digit advance 2digit advance 2digit advance ;
: test clearstack s" 12:34:56" hms ;
test .s
This inspired me to write the following
Requiring a well formed time string and a 64 bit cell size
: hms ( a u -- h m s )
drop @ $30303A30303A3030 -
dup $FF0000FF0000FF00 and 8 rshift
swap $00FF0000FF0000FF and dup 3 lshift swap 2* +
+
dup $ff and
swap dup 24 rshift $ff and
swap 48 rshift $ff and ;
: test1 100000000 0 do "12:34:56" hms 2drop drop loop ;
timer-reset test1 .elapsed 107 ms elapsed ok
If I inline hms, the time goes down to 71 ms.
lxf64 now produces native code and works well!
seea hms
0x41FA70 488B5D00 mov rbx, qword [rbp]
0x41FA74 488B1B mov rbx, qword [rbx]
0x41FA77 48B830303A30303A3030 mov rax, 0x30303A30303A3030
0x41FA81 4829C3 sub rbx, rax
0x41FA84 48B800FF0000FF0000FF mov rax, 0xFF0000FF0000FF00
0x41FA8E 4889D9 mov rcx, rbx
0x41FA91 4821C1 and rcx, rax
0x41FA94 48C1E908 shr rcx, 0x8
0x41FA98 48B8FF0000FF0000FF00 mov rax, 0xFF0000FF0000FF
0x41FAA2 4821C3 and rbx, rax
0x41FAA5 4889D8 mov rax, rbx
0x41FAA8 48C1E003 shl rax, 0x3
0x41FAAC 48D1E3 shl rbx, 0x1
0x41FAAF 4801C3 add rbx, rax
0x41FAB2 4801CB add rbx, rcx
0x41FAB5 4889D8 mov rax, rbx
0x41FAB8 4825FF000000 and rax, 0xFF
0x41FABE 4889D9 mov rcx, rbx
0x41FAC1 48C1E918 shr rcx, 0x18
0x41FAC5 4881E1FF000000 and rcx, 0xFF
0x41FACC 48C1EB30 shr rbx, 0x30
0x41FAD0 4881E3FF000000 and rbx, 0xFF
0x41FAD7 48894DF8 mov qword [rbp-0x8], rcx
0x41FADB 48894500 mov qword [rbp], rax
0x41FADF 488D6DF8 lea rbp, [rbp-0x8]
0x41FAE3 C3 ret
116 bytes, 26 instructions
Best Regards
Peter Fälth
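For readers tracing the byte arithmetic, here is the same word applied to a literal cell instead of a fetch, with the intermediate values for "12:34:56" written out. This is only a sketch under the post's assumptions (64-bit cells, well-formed input) plus little-endian byte order, which the `@` trick relies on. `hms-cell` is a hypothetical name, and `10 *` stands in for the shift-and-add of the original:

```forth
\ Sketch: assumes 64-bit cells; the literal below is "12:34:56" as a
\ little-endian machine would fetch it with @ ('1' in the lowest byte).
: hms-cell ( x -- h m s )
  $30303A30303A3030 -                  \ strip '0' and ':'  -> $0605000403000201
  dup  $FF0000FF0000FF00 and 8 rshift  \ units digits       -> $0006000004000002
  swap $00FF0000FF0000FF and 10 *      \ tens digits * 10   -> $003200001E00000A
  +                                    \ one byte per field -> $003800002200000C
  dup $FF and                          \ hours   = $0C = 12
  swap dup 24 rshift $FF and           \ minutes = $22 = 34
  swap 48 rshift $FF and ;             \ seconds = $38 = 56

$36353A34333A3231 hms-cell .s
```

No carry can cross a byte boundary here, since each field is at most 9*10 + 9 = 99.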
Paul Rubin wrote:
This version with the string in memory, no error checking, and using a
variable, seems simplest to me.
variable p
: advance ( -- ) 1 p +! ;
: digit ( -- n ) p @ c@ '0' - advance ;
: 2digit ( -- n ) digit 10 * digit + ;
: hms ( a u -- h m s ) drop p !
2digit advance 2digit advance 2digit advance ;
: test clearstack s" 12:34:56" hms ;
test .s
: keep { lo hi adr len } lo adr len ;
: get { adr len } 0. adr len >number keep 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
test .s
That's why I consider the use of global variables in library functions
worse than locals. You're polluting the namespace.
HMS is three times faster than dtimescan and timestrscan, while
HMS2 is 3.5 times slower (as expected).
mhx@iae.nl (mhx) writes:
HMS is three times faster than dtimescan and timestrscan, while
HMS2 is 3.5 times slower (as expected).
Is HMS the one that I posted, and HMS2 the version that's almost the
same? Why the speed difference: just because of the extra memory
traffic of leaving extra things on the stack?
Hans Bezemer <the.beez.speaks@gmail.com> writes:
That's why I consider the use of global variables in library functions
worse than locals. You're polluting the namespace.
Obviously there are ways around that with wordlists, but it would be
nice if doing that was more convenient. You could have program sections
with their own encapsulated variables.
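One way this could look with the standard search-order words, as a rough sketch rather than anyone's posted code: the helpers and the variable are compiled into a private wordlist, and only `hms` lands in the public one. `hms-private` is a hypothetical name, and a system with the Search-Order word set is assumed:

```forth
wordlist constant hms-private            \ hypothetical name for the hidden wordlist
get-current                              \ remember the public compilation wordlist
hms-private set-current                  \ helpers get compiled into the private one
get-order hms-private swap 1+ set-order  \ and it is searched first

variable p
: advance ( -- )    1 p +! ;
: digit   ( -- n )  p @ c@ '0' - advance ;
: 2digit  ( -- n )  digit 10 * digit + advance ;  \ also skips the separator

set-current                              \ back to the public wordlist
: hms ( a u -- n )  drop p !  2digit 60 * 2digit + 60 * 2digit + ;

get-order nip 1- set-order               \ drop hms-private from the search order
```

After the last line, `hms` still works, but `p`, `advance`, `digit` and `2digit` are no longer visible to later code. It does the job, though it is clearly more ceremony than a pair of locals.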
On 13-06-2025 06:46, dxf wrote:
On 13/06/2025 8:24 am, B. Pym wrote:
: keep { lo hi adr len } lo adr len ;
: get { adr len } 0. adr len >number keep 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
Not worth the locals IMO. OTOH I guarantee (number) will be re-used.
: (number) ( adr len -- ud adr' len' ) 0. 2swap >number ;
: get ( adr len -- u adr' len' ) (number) rot drop 1 /string ;
: hms ( adr len -- h m s) get get get 2drop ;
: test clearstack s" 12:34:56" hms ;
Frankly, this is the first time I see how >NUMBER can be used as a dedicated parsing tool.
Fun part, though -- in 4tH, single numbers are preferred (for reasons listed in the manual) and hence: double numbers are expensive.
I do have both a single number version of >NUMBER as well as a double number version.
...
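To illustrate the single-number idea, here is a rough sketch of what a single-cell, decimal-only analogue of >NUMBER might look like, dropped into dxf's `get`. This is not 4tH's actual word: `>unumber` is a hypothetical name, there is no overflow check, and the input is assumed to be well formed.

```forth
\ Hypothetical single-cell counterpart of >NUMBER (decimal digits only).
: >unumber ( u1 adr1 len1 -- u2 adr2 len2 )
  begin dup while                  \ any characters left?
    over c@ '0' - dup 10 u< 0= if  \ not a decimal digit: stop here
      drop exit
    then                           ( u adr len digit )
    >r rot 10 * r> + rot rot       \ u = u*10 + digit
    1 /string                      \ step past the digit
  repeat ;

: get ( adr len -- u adr' len' )  0 rot rot >unumber 1 /string ;
: hms ( adr len -- h m s )  get get get 2drop ;

s" 12:34:56" hms .s
```

As in dxf's version, the final `1 /string` steps one character past the end of the string, which is harmless only because the leftover address and length are dropped.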