• Event loop and http::geturl

    From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 06:55:45 2025

    This is related to my other post, "too many nested evaluations".

    I have taken a snapshot of my Apache logs that caused the "... too many
    nested evaluations" error and stripped down my program a bit so I could
    try to track down what was going on.

    So, it looks like ::http::geturl is operating asynchronously, despite my program NOT using -command.

    In the log I'm seeing:
    ----------------------
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    do_get_abuse 132.226.122.74
    get_abuse 132.226.122.74
    get_abuse 132.226.122.74 before geturl
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    ----------------------

    The "A before ..." line is coming from the proc checkForError that is
    attached to the fileevent

    set ::accessLog [open "|cat access_test.log" r]
    fconfigure $::accessLog -blocking 0 -buffering line
    fileevent $::accessLog readable [list checkForError $::accessLog]

    checkForError eventually calls do_get_abuse

    proc do_get_abuse {ip} {
        log "do_get_abuse $ip"
        if { [catch {get_abuse $ip} result] } {
            puts "get_abuse failed ... $result"
            exit
        }
    }

    and in get_abuse I have

    log " get_abuse $ip before geturl"
    set token [::http::geturl ${url}?${query} -method GET -headers $headers]
    log " get_abuse $ip after geturl"

    It never gets to the log " get_abuse $ip after geturl" line, which it
    should reach BEFORE the next fileevent readable event is processed.

    This is at least contrary to the man page, which says:

    The ::http::geturl command blocks until the operation completes, unless
    the -command option specifies a callback that is invoked when the HTTP transaction completes.

    I'm going to start building a slow-ish webpage on one of our servers to investigate further with a minimal program.

    Hopefully, someone on the guru team could check the http::geturl code?

    Thanks
    Jonathan.
  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 07:51:25 2025

    Here's my simple test to confirm that ::http::geturl is running
    asynchronously - unless I completely misunderstand what "block" means.
    slow.htm pauses for 2 seconds and then outputs the date and time.
    -----------
    #!/usr/bin/tclsh

    package require http
    package require tls

    http::register https 443 [list ::tls::socket -autoservername true]

    proc test1 {} {
        incr ::cnt
        set url "https://congresstravel.com.au/events.cgi/xx/slow.htm"
        puts " test1 $::cnt before geturl "
        set token [::http::geturl ${url} -method GET]
        puts " $::cnt data is [::http::data $token]"
        ::http::cleanup $token
    }

    proc check {chan} {
        if {[gets $chan line] >= 0} {
            test1
        }
    }
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc doClose {} {
        set ::close "close"
    }

    set fd [open "test.txt" "w"]
    for {set x 1} {$x < 10} {incr x} {
        puts $fd $x
    }
    close $fd
    queue
    after 20000 doClose
    vwait close
    file delete "test.txt"
    -----------
    output
    ===========
    test1 1 before geturl
    test1 2 before geturl
    test1 3 before geturl
    test1 4 before geturl
    test1 5 before geturl
    test1 6 before geturl
    test1 7 before geturl
    test1 8 before geturl
    test1 9 before geturl
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    ===========
  • From Rich@rich@example.invalid to comp.lang.tcl on Tue Jun 24 04:21:01 2025

    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my program NOT using -command.

    It does. It is documented as such:

    man n http:

    Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is
    running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

        # geturl does EVERYTHING asynchronously, so if the user
        # calls it synchronously, we just do a wait here.
        http::wait $token

    And the implementation of http::wait is:

    proc http::wait {token} {
        variable $token
        upvar 0 $token state

        if {![info exists state(status)] || $state(status) eq ""} {
            # We must wait on the original variable name, not the upvar alias
            vwait ${token}(status)
        }

        return [status $token]
    }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.
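
    (A minimal sketch, not from the http package, showing the same effect in
    miniature: each handler waits on its own variable with a nested vwait, so
    a later event can start - and finish - while an earlier handler is still
    waiting.)

    proc handler {n} {
        puts "handler $n: start"
        after 200 [list set ::done($n) 1]   ;# stands in for the HTTP reply arriving
        vwait ::done($n)                    ;# like http::wait - re-enters the event loop
        puts "handler $n: finished"
    }
    after 0   [list handler 1]
    after 10  [list handler 2]
    after 600 [list set ::stop 1]
    vwait ::stop
    # Prints: handler 1 start, handler 2 start, handler 2 finished,
    # handler 1 finished - the nested waits unwind in LIFO order.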
  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 18:01:08 2025

    On 24/6/25 14:21, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my
    program NOT using -command.

    It does. It is documented as such:

    man n http:

    Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is
    running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

        # geturl does EVERYTHING asynchronously, so if the user
        # calls it synchronously, we just do a wait here.
        http::wait $token

    And the implementation of http::wait is:

    proc http::wait {token} {
        variable $token
        upvar 0 $token state

        if {![info exists state(status)] || $state(status) eq ""} {
            # We must wait on the original variable name, not the upvar alias
            vwait ${token}(status)
        }

        return [status $token]
    }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I
    need the geturl to finish before anything else happens.
  • From et99@et99@rocketship1.me to comp.lang.tcl on Tue Jun 24 17:19:23 2025

    On 6/24/2025 1:01 AM, Jonathan Kelly wrote:
    On 24/6/25 14:21, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my
    program NOT using -command.

    It does.  It is documented as such:

    man n http:

       Note: The event queue is even used without the -command option. As a
       side effect, arbitrary commands may be processed while http::geturl is
       running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

             # geturl does EVERYTHING asynchronously, so if the user
             # calls it synchronously, we just do a wait here.
             http::wait $token

    And the implementation of http::wait is:

         proc http::wait {token} {
             variable $token
             upvar 0 $token state

             if {![info exists state(status)] || $state(status) eq ""} {
                 # We must wait on the original variable name, not the upvar alias
                 vwait ${token}(status)
             }

             return [status $token]
         }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I need the geturl to finish before anything else happens.

    I would think you can use this option on the geturl call:

    -command callback



  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 00:03:52 2025

    On 6/24/2025 5:19 PM, et99 wrote:
    On 6/24/2025 1:01 AM, Jonathan Kelly wrote:

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I need the geturl to finish before anything else happens.

    I would think you can use this option on the geturl call:

    -command callback




    I see you already know about -command, so perhaps what you really want is an example of how to use it.


    proc httpCallback {token} {   ;# the -command callback - called when the transaction completes
        upvar 0 $token state      ;# use this to get the results
        set ::urldone 1           ;# the caller waits on the setting of this variable
        return
    }

    unset -nocomplain ::urldone   ;# rules out a race condition
    http::geturl <your url> -command httpCallback
    if {![info exists ::urldone]} {vwait ::urldone}   ;# no need to block if the variable already exists

    The above code is being cautious. There may not be any possibility of a race condition, but this way we rule it out, even if they change the code in the future. It never hurts to unset a variable first that you're going to set later anyway.
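
    (A minimal sketch of how the pattern above might be called; the URL is a
    placeholder and the result handling is only illustrative.)

    unset -nocomplain ::urldone
    set token [http::geturl http://example.com/slow.htm -command httpCallback]
    # ... anything placed here runs while the transfer is in flight ...
    if {![info exists ::urldone]} {vwait ::urldone}
    puts [http::data $token]
    http::cleanup $token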



  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 01:57:28 2025

    On 6/25/2025 12:03 AM, et99 wrote:
    On 6/24/2025 5:19 PM, et99 wrote:



    ... snip ....


    It has just now occurred to me that you are running your [test1] proc as a fileevent script. Read the vwait manual under the section:

    "NESTED VWAITS BY EXAMPLE"

    I use geturl synchronously with no issues. But I do a single url request and wait for it, in the main line code - NOT inside an event.

    The code I presented in the prior posting is how you could use -command and get a synchronous result. It is only really useful if you were going to do something between the geturl and the wait for it to be done. Otherwise, you could just call it synchronously - but NOT inside an event, if another fileevent might trigger before the first one is done.

    As you will see from the example in the manual, things have to unwind, so
    if your fileevents occur fast enough, they may trigger before earlier
    geturl calls have had time to unwind. The nested waits work like a stack.

    That's why the timestamps are output in reverse order of when the geturl was called.

    I'm not sure exactly what you want to accomplish. But it sounds to me like
    you need to do some queuing or coroutines. I have code I wrote that does a
    single queue with 1 or more servers using threads. I sometimes use it for
    just a single server to get my own queuing of events.

    Unfortunately, I can't use it with tcl 9.0 because of a race condition bug with respect to package requires inside threads that has been ticketed but not yet looked into.
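
    (A minimal coroutine sketch of that kind of serialization, assuming a
    plain test.txt and a placeholder URL: the reader suspends while each
    transfer is in flight, and geturl's -command resumes it.)

    package require http

    proc reader {chan} {
        while {[gets $chan line] >= 0} {
            ::http::geturl http://example.com/slow.htm \
                -command [list [info coroutine]]   ;# resumes this coroutine with the token
            set token [yield]                      ;# suspend until the transfer completes
            puts "line $line -> [::http::data $token]"
            ::http::cleanup $token
        }
        close $chan
        set ::readerDone 1
    }

    coroutine readerCo reader [open test.txt r]
    if {![info exists ::readerDone]} {vwait ::readerDone}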

    (sorry for so many postings :)

    -e



  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Thu Jun 26 04:20:51 2025

    On 25/6/25 18:57, et99 wrote:
    On 6/25/2025 12:03 AM, et99 wrote:
    On 6/24/2025 5:19 PM, et99 wrote:



    ... snip ....


    It has just now occurred to me that you are running your [test1] proc as
    a fileevent script.  Read the vwait manual under the section:

    "NESTED VWAITS BY EXAMPLE"

    I use geturl synchronously with no issues. But I do a single url request
    and wait for it, in the main line code - NOT inside an event.

    The code I presented in the prior posting is how you could use -command
    and get a synchronous result. It is only really useful if you were going
    to do something between the geturl and the wait for it to be done. Otherwise, you could just call it synchronously - but NOT inside an
    event, if another fileevent might trigger before the first one is done.

    As you will see with the example in the manual, things have to unwind,
    so if your fileevents occur fast enough, they may have triggered before earlier geturl calls will have had time to unwind. The event loop works
    like a stack.

    That's why the timestamps are output in reverse order of when the geturl
    was called.

    I'm not sure exactly what you want to accomplish. But is sounds to me
    like you need to do some queuing or co-routines. I have code I wrote
    that does single queue with 1 or more servers using threads. I sometimes
    use it for just a single server to get my own queuing of events.

    Unfortunately, I can't use it with tcl 9.0 because of a race condition
    bug with respect to package requires inside threads that has been
    ticketed but not yet looked into.

    (sorry for so many postings :)

    -e



    Thanks for looking at it. Yes, I had to do a queue - my case is exactly
    like the test1 code I posted ... the events that end up triggering the
    geturl come in quicker than the geturl can process them, and the geturl
    re-enters the event loop under the hood, allowing more geturl-triggering
    events to queue up in the event queue(?) - eventually enough to crash
    something. Anyway, I did this ...
    ----------------
    set ::test_busy 0
    set ::test_queue {}

    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc check {chan} {
        if {[gets $chan line] >= 0} {
            queue_test $line
        }
    }
    proc queue_test {n} {
        set task [list test1 $n]
        lappend ::test_queue $task
        maybe_run_test
    }
    proc maybe_run_test {} {
        if {$::test_busy} return
        if {[llength $::test_queue] == 0} return

        set ::test_busy 1

        # get first in queue
        set next [lindex $::test_queue 0]

        # remove first from queue
        set ::test_queue [lrange $::test_queue 1 end]

        # run first
        uplevel #0 $next

        set ::test_busy 0
        maybe_run_test
    }
    ----------------

  • From Rich@rich@example.invalid to comp.lang.tcl on Wed Jun 25 21:32:42 2025

    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open test.txt
    directly:

    set ::input [open test.txt r]

    And achieve the same result.

  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 18:52:33 2025

    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.



    I was also curious about this. But I'm also wondering why this is even event driven at all? Why not simply, in pseudo code:

    while 1 {
        read ... a line
        if end of file, break
        geturl
        do something with the url results
    }

    If there's also a gui that the OP wants to keep alive, it should not be starved, since the synchronous form of geturl is calling vwait, and that would allow gui events to get processed while waiting for the url request to complete.
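
    (A literal Tcl rendering of that pseudo code, as a sketch; the URL and the
    "do something" step are placeholders.)

    package require http

    set fd [open test.txt r]
    while {[gets $fd line] >= 0} {
        # synchronous geturl: returns once the transfer has completed
        set token [http::geturl http://example.com/slow.htm]
        puts "line $line -> [http::data $token]"   ;# "do something" with the result
        http::cleanup $token
    }
    close $fd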

    -e

  • From Alan Grunwald@nospam.nurdglaw@gmail.com to comp.lang.tcl on Thu Jun 26 09:30:03 2025

    On 26/06/2025 02:52, et99 wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
       set ::input [open "|cat test.txt" r]
       fconfigure $::input -blocking 0 -buffering line
       fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open text.txt
    directly:

    set ::input [open test.txt r]

    And achieve the same result.



    I was also curious about this. But I'm also wondering why this is even
    event driven at all? Why not simply, in pseudo code:

    while 1 {
       read...a line
       if end of file, break
       geturl
       do something with the url results
    }

    If there's also a gui that the OP wants to keep alive, it should not be starved, since the synchronous form of geturl is calling vwait, and that would allow gui events to get processed while waiting for the url
    request to complete.

    -e

    My pseudo code is generally

    while !eof {
        read a line
        if line is not empty
            do stuff
        endif
    endwhile

    I used to be puzzled why I needed the test for non-emptiness. I never
    worked out why; nowadays I simply accept that's the way of things and
    do it.

    It might be that I come from a VMS background and assume that there's a newline at the end of each "record" in the file (which I find convenient because without one, the shell prompt gets appended to the last line
    when using cat at the command line). Have I missed something? Is there something inherently daft about my preconceptions?
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Thu Jun 26 12:14:24 2025

    * Alan Grunwald <nospam.nurdglaw@gmail.com>
    | My pseudo code is generally

    | while !eof {
    | read a line
    | if line is not empty
    | do stuff
    | endif
    | endwhile

    | I used to be puzzled why I needed the test for non-emptiness. I never
    | worked out why, nowadays I simply accept that's the way of things and
    | do it.

    If the read hits EOF, an empty line is returned which is not actually in
    the file. Depending on your data, this may or may not be a problem
    (if you're not interested in empty lines in the data, then no problem).
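
    (For comparison, a sketch of the eof-style loop that produces that extra
    empty "line"; do_stuff is just a placeholder.)

    while {![eof $fd]} {
        gets $fd line     ;# at end of file this returns -1 and sets line to ""
        do_stuff $line    ;# ... so the body runs once with an empty line
    }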

    Usually a better pattern for line-oriented data
    on a channel in blocking mode is

    while {[gets $fd line] >= 0} {
        # line has been read, possibly empty
        ...
    }
    close $fd

    HTH
    R'
  • From Rich@rich@example.invalid to comp.lang.tcl on Thu Jun 26 17:08:29 2025

    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this. But I'm also wondering why this is
    even event driven at all? Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code. The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that scenario does make some sense.

  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Fri Jun 27 05:09:07 2025

    On 27/6/25 03:08, Rich wrote:
    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this. But I'm also wondering why this is
    even event driven at all? Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code. The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that senario does make some sense.

    What Rich said. Before I realised geturl is *always* asynchronous, I had
    read the man page for geturl where it said geturl "blocked". I needed to
    simplify my program into a test case to prove something was broken. It
    turned out the problem was my understanding, though I still think the
    manual page is misleading. The relevant note

    "Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is running."

    is in the general description at the top, and I had just been reading
    the geturl function description.
  • From et99@et99@rocketship1.me to comp.lang.tcl on Thu Jun 26 14:02:34 2025

    On 6/26/2025 12:09 PM, Jonathan Kelly wrote:
    On 27/6/25 03:08, Rich wrote:
    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this.  But I'm also wondering why this is
    even event driven at all?  Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code.  The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that senario does make some sense.

    What Rich said. Before I realised geturl is *always* asynchronous, I had read the man for geturl where it said geturl "blocked". I needed to simplify my program as a test case to prove something was broken. Turned out, the problem was my understanding, though I still think the manual page is mis-leading. The relevant

    "Note: The event queue is even used without the -command option. As a side effect, arbitrary commands may be processed while http::geturl is running."

    is in the general description at the top, and I had just been reading the geturl function description.


    I wonder: if you are reading a file that is being written by another process, sort of like a "tail" program, doesn't Tcl's [fileevent <channel> readable <script>] trigger constantly? Isn't this in effect a tight polling loop?

    The manual says:

    "A channel is also considered to be readable if an end of file or error condition is present on the underlying file or device. It is important for script to check for these conditions and handle them appropriately; for example, if there is no special check for end of file, an infinite loop may occur where script reads no data, returns, and is immediately invoked again."

    To avoid this problem, one is normally supposed to close the file or remove the read handler. I've never written a log handler like this one, so I'm not sure what the correct approach would be.


  • From Christian Gollwitzer@auriocus@gmx.de to comp.lang.tcl on Fri Jun 27 16:35:08 2025

    Am 26.06.25 um 23:02 schrieb et99:
    I wonder, if you are reading a file that is being written from another process, sort of like a "tail" program, doesn't tcl's [fileevent
    <channel> readable <script>] trigger constantly? Isn't this in effect a tight polling loop?

    The underlying mechanism is select() or poll(). To my knowledge, this
    only works for pipes/sockets, not for files. The "tail -f" program runs
    stat() in a loop to see if the file date or size has changed.

    On Linux, you could also use inotify (there is a Tcl package) to get
    callbacks when the file is changed.
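
    (A sketch of that stat-in-a-loop idea in Tcl terms; pollSize is an assumed
    name and the log path is a placeholder.)

    proc pollSize {path last} {
        set size [file size $path]
        if {$size != $last} {
            puts "$path changed: $last -> $size bytes"   ;# react to the change here
        }
        after 1000 [list pollSize $path $size]
    }

    pollSize access_test.log [file size access_test.log]
    vwait forever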

    Christian
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Fri Jun 27 18:18:32 2025

    * Christian Gollwitzer <auriocus@gmx.de>
    | Am 26.06.25 um 23:02 schrieb et99:
    | > I wonder, if you are reading a file that is being written from
    | > another process, sort of like a "tail" program, doesn't tcl's
    | > [fileevent
    | > <channel> readable <script>] trigger constantly? Isn't this in
    | > effect a tight polling loop?

    | The underlying mechanism is select() or poll(). To my knowledge, this
    | only works for pipes/sockets, not for files.

    select() also works for files; the effect is that it indeed triggers
    immediately on each call.

    man select(2)
    [...]
    readfds
        The file descriptors in this set are watched to see if they are
        ready for reading.
        ** A file descriptor is ready for reading if a read operation will
        not block; in particular, a file descriptor is also ready on
        end-of-file.

    (** emphasis by me).

    The effect in Tcl of setting a readable fileevent on a regular disk file
    is indeed that the event fires repeatedly until EOF, blocking any GUI
    updates that would run "after idle".
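
    (A common workaround for tailing a plain file, sketched here with
    handleLine as a hypothetical per-line handler: poll with [after] instead
    of setting a readable fileevent.)

    proc pollLog {fd} {
        while {[gets $fd line] >= 0} {
            handleLine $line   ;# hypothetical per-line handler
        }
        # No complete line is available; clear the sticky EOF condition so
        # data appended later can be read, then look again shortly. (A
        # half-written last line would need extra care in real code.)
        if {[eof $fd]} {seek $fd 0 current}
        after 500 [list pollLog $fd]
    }

    pollLog [open access_test.log r]
    vwait forever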

    R'