• Event loop and http::geturl

    From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 06:55:45 2025

    This is related to my other post, "too many nested evaluations".

    I have taken a snapshot of my Apache logs that caused the "... too many
    nested evaluations" error and stripped down my program a bit so I could
    try to track down what was going on.

    So, it looks like ::http::geturl is operating asynchronously, despite my program NOT using -command.

    In the log I'm seeing:
    ----------------------
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    do_get_abuse 132.226.122.74
    get_abuse 132.226.122.74
    get_abuse 132.226.122.74 before geturl
    A before do_get_abuse 132.226.122.74 and ::last_ip is ##
    ----------------------

    The "A before ..." line is coming from the proc checkForError that is
    attached to the fileevent

    set ::accessLog [open "|cat access_test.log" r]
    fconfigure $::accessLog -blocking 0 -buffering line
    fileevent $::accessLog readable [list checkForError $::accessLog]

    checkForError eventually calls do_get_abuse

    proc do_get_abuse {ip} {
        log "do_get_abuse $ip"
        if { [catch {get_abuse $ip} result] } {
            puts "get_abuse failed ... $result"
            exit
        }
    }

    and in get_abuse I have

    log " get_abuse $ip before geturl"
    set token [::http::geturl ${url}?${query} -method GET -headers $headers]
    log " get_abuse $ip after geturl"

    It never gets to the log " get_abuse $ip after geturl" line, which it
    should reach BEFORE the next fileevent readable event is processed.

    This is at least contrary to the man page, which says:

    The ::http::geturl command blocks until the operation completes, unless
    the -command option specifies a callback that is invoked when the HTTP transaction completes.

    I'm going to start building a slow-ish webpage on one of our servers to investigate further with a minimal program.

    Hopefully, someone on the guru team could check the http::geturl code?

    Thanks
    Jonathan.
  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 07:51:25 2025

    Here's my simple test to confirm that ::http::geturl is running
    asynchronously - unless I completely misunderstand what "block" means.
    slow.htm pauses for 2 seconds and then outputs the date and time.
    -----------
    #!/usr/bin/tclsh

    package require http
    package require tls

    http::register https 443 [list ::tls::socket -autoservername true]

    proc test1 {} {
        incr ::cnt
        set url "https://congresstravel.com.au/events.cgi/xx/slow.htm"
        puts " test1 $::cnt before geturl "
        set token [::http::geturl ${url} -method GET]
        puts " $::cnt data is [::http::data $token]"
        ::http::cleanup $token
    }

    proc check {chan} {
        if {[gets $chan line] >= 0} {
            test1
        }
    }
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc doClose {} {
        set ::close "close"
    }

    set fd [open "test.txt" "w"]
    for {set x 1} {$x < 10} {incr x} {
        puts $fd $x
    }
    close $fd
    queue
    after 20000 doClose
    vwait close
    file delete "test.txt"
    -----------
    output
    ===========
    test1 1 before geturl
    test1 2 before geturl
    test1 3 before geturl
    test1 4 before geturl
    test1 5 before geturl
    test1 6 before geturl
    test1 7 before geturl
    test1 8 before geturl
    test1 9 before geturl
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:03
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:02
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    9 data is 24/06/25 07:47:01
    ===========
  • From Rich@rich@example.invalid to comp.lang.tcl on Tue Jun 24 04:21:01 2025

    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my program NOT using -command.

    It does. It is documented as such:

    man n http:

    Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is
    running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

        # geturl does EVERYTHING asynchronously, so if the user
        # calls it synchronously, we just do a wait here.
        http::wait $token

    And the implementation of http::wait is:

    proc http::wait {token} {
        variable $token
        upvar 0 $token state

        if {![info exists state(status)] || $state(status) eq ""} {
            # We must wait on the original variable name, not the upvar alias
            vwait ${token}(status)
        }

        return [status $token]
    }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.
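
    (A minimal sketch, not from the http package, showing the same effect in
    miniature: each handler waits on its own variable with a nested vwait, so
    a later event can start - and finish - while an earlier handler is still
    waiting.)

    proc handler {n} {
        puts "handler $n: start"
        after 200 [list set ::done($n) 1]   ;# stands in for the HTTP reply arriving
        vwait ::done($n)                    ;# like http::wait - re-enters the event loop
        puts "handler $n: finished"
    }
    after 0   [list handler 1]
    after 10  [list handler 2]
    after 600 [list set ::stop 1]
    vwait ::stop
    # Prints: handler 1 start, handler 2 start, handler 2 finished,
    # handler 1 finished - the nested waits unwind in LIFO order.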
  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Tue Jun 24 18:01:08 2025

    On 24/6/25 14:21, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my
    program NOT using -command.

    It does. It is documented as such:

    man n http:

    Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is
    running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

        # geturl does EVERYTHING asynchronously, so if the user
        # calls it synchronously, we just do a wait here.
        http::wait $token

    And the implementation of http::wait is:

    proc http::wait {token} {
        variable $token
        upvar 0 $token state

        if {![info exists state(status)] || $state(status) eq ""} {
            # We must wait on the original variable name, not the upvar alias
            vwait ${token}(status)
        }

        return [status $token]
    }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I
    need the geturl to finish before anything else happens.
  • From et99@et99@rocketship1.me to comp.lang.tcl on Tue Jun 24 17:19:23 2025

    On 6/24/2025 1:01 AM, Jonathan Kelly wrote:
    On 24/6/25 14:21, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    So, it looks like ::http::geturl is operating asynchronously, despite my
    program NOT using -command.

    It does.  It is documented as such:

    man n http:

       Note: The event queue is even used without the -command option. As a
       side effect, arbitrary commands may be processed while http::geturl is
       running.


    The code snippets below are from http-2.9.5.tm which was distributed
    (at least) with 8.6.12:

    Buried deep in http::geturl:

             # geturl does EVERYTHING asynchronously, so if the user
             # calls it synchronously, we just do a wait here.
             http::wait $token

    And the implementation of http::wait is:

         proc http::wait {token} {
             variable $token
             upvar 0 $token state

             if {![info exists state(status)] || $state(status) eq ""} {
                 # We must wait on the original variable name, not the upvar alias
                 vwait ${token}(status)
             }

             return [status $token]
         }

    And the 'vwait' there reenters the event loop and allows other events
    to be processed.

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I need the geturl to finish before anything else happens.

    I would think you can use this option on the geturl call:

    -command callback



  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 00:03:52 2025

    On 6/24/2025 5:19 PM, et99 wrote:
    On 6/24/2025 1:01 AM, Jonathan Kelly wrote:

    OK. Is there a way to ACTUALLY get geturl to block, or equivalent? I need the geturl to finish before anything else happens.

    I would think you can use this option on the geturl call:

    -command callback




    I see you already know about -command, so perhaps what you really want is an example of how to use it.


    proc httpCallback {token} {   ;# the -command callback - called when the transaction completes
        upvar 0 $token state      ;# use this to get the results
        set ::urldone 1           ;# the caller waits on the setting of this variable
        return
    }

    unset -nocomplain ::urldone   ;# rules out a race condition
    http::geturl <your url> -command httpCallback
    if {![info exists ::urldone]} {vwait ::urldone}   ;# no need to block if the variable already exists

    The above code is being cautious. There may not be any possibility of a race condition, but this way we rule it out, even if they change the code in the future. It never hurts to unset a variable first that you're going to set later anyway.
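
    (A minimal sketch of how the pattern above might be called; the URL is a
    placeholder and the result handling is only illustrative.)

    unset -nocomplain ::urldone
    set token [http::geturl http://example.com/slow.htm -command httpCallback]
    # ... anything placed here runs while the transfer is in flight ...
    if {![info exists ::urldone]} {vwait ::urldone}
    puts [http::data $token]
    http::cleanup $token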



  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 01:57:28 2025

    On 6/25/2025 12:03 AM, et99 wrote:
    On 6/24/2025 5:19 PM, et99 wrote:



    ... snip ....


    It has just now occurred to me that you are running your [test1] proc as a fileevent script. Read the vwait manual under the section:

    "NESTED VWAITS BY EXAMPLE"

    I use geturl synchronously with no issues. But I do a single url request and wait for it, in the main line code - NOT inside an event.

    The code I presented in the prior posting is how you could use -command and get a synchronous result. It is only really useful if you were going to do something between the geturl and the wait for it to be done. Otherwise, you could just call it synchronously - but NOT inside an event, if another fileevent might trigger before the first one is done.

    As you will see from the example in the manual, things have to unwind, so
    if your fileevents occur fast enough, they may trigger before earlier
    geturl calls have had time to unwind. The nested waits work like a stack.

    That's why the timestamps are output in reverse order of when the geturl was called.

    I'm not sure exactly what you want to accomplish. But it sounds to me like
    you need to do some queuing or coroutines. I have code I wrote that does a
    single queue with 1 or more servers using threads. I sometimes use it for
    just a single server to get my own queuing of events.

    Unfortunately, I can't use it with tcl 9.0 because of a race condition bug with respect to package requires inside threads that has been ticketed but not yet looked into.
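
    (A minimal coroutine sketch of that kind of serialization, assuming a
    plain test.txt and a placeholder URL: the reader suspends while each
    transfer is in flight, and geturl's -command resumes it.)

    package require http

    proc reader {chan} {
        while {[gets $chan line] >= 0} {
            ::http::geturl http://example.com/slow.htm \
                -command [list [info coroutine]]   ;# resumes this coroutine with the token
            set token [yield]                      ;# suspend until the transfer completes
            puts "line $line -> [::http::data $token]"
            ::http::cleanup $token
        }
        close $chan
        set ::readerDone 1
    }

    coroutine readerCo reader [open test.txt r]
    if {![info exists ::readerDone]} {vwait ::readerDone}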

    (sorry for so many postings :)

    -e



  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Thu Jun 26 04:20:51 2025

    On 25/6/25 18:57, et99 wrote:
    On 6/25/2025 12:03 AM, et99 wrote:
    On 6/24/2025 5:19 PM, et99 wrote:



    ... snip ....


    It has just now occurred to me that you are running your [test1] proc as
    a fileevent script.  Read the vwait manual under the section:

    "NESTED VWAITS BY EXAMPLE"

    I use geturl synchronously with no issues. But I do a single url request
    and wait for it, in the main line code - NOT inside an event.

    The code I presented in the prior posting is how you could use -command
    and get a synchronous result. It is only really useful if you were going
    to do something between the geturl and the wait for it to be done. Otherwise, you could just call it synchronously - but NOT inside an
    event, if another fileevent might trigger before the first one is done.

    As you will see with the example in the manual, things have to unwind,
    so if your fileevents occur fast enough, they may have triggered before earlier geturl calls will have had time to unwind. The event loop works
    like a stack.

    That's why the timestamps are output in reverse order of when the geturl
    was called.

    I'm not sure exactly what you want to accomplish. But is sounds to me
    like you need to do some queuing or co-routines. I have code I wrote
    that does single queue with 1 or more servers using threads. I sometimes
    use it for just a single server to get my own queuing of events.

    Unfortunately, I can't use it with tcl 9.0 because of a race condition
    bug with respect to package requires inside threads that has been
    ticketed but not yet looked into.

    (sorry for so many postings :)

    -e



    Thanks for looking at it. Yes, I had to do a queue - my case is exactly
    like the test1 code I posted ... the events that end up triggering the
    geturl come in quicker than the geturl can process them, and the geturl
    re-enters the event loop under the hood, allowing more geturl-triggering
    events to queue up in the event queue(?) - eventually enough to crash
    something. Anyway, I did this ...
    ----------------
    set ::test_busy 0
    set ::test_queue {}

    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }
    proc check {chan} {
        if {[gets $chan line] >= 0} {
            queue_test $line
        }
    }
    proc queue_test {n} {
        set task [list test1 $n]
        lappend ::test_queue $task
        maybe_run_test
    }
    proc maybe_run_test {} {
        if {$::test_busy} return
        if {[llength $::test_queue] == 0} return

        set ::test_busy 1

        # get first in queue
        set next [lindex $::test_queue 0]

        # remove first from queue
        set ::test_queue [lrange $::test_queue 1 end]

        # run first
        uplevel #0 $next

        set ::test_busy 0
        maybe_run_test
    }
    ----------------

  • From Rich@rich@example.invalid to comp.lang.tcl on Wed Jun 25 21:32:42 2025

    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open test.txt
    directly:

    set ::input [open test.txt r]

    And achieve the same result.

  • From et99@et99@rocketship1.me to comp.lang.tcl on Wed Jun 25 18:52:33 2025

    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.



    I was also curious about this. But I'm also wondering why this is even event driven at all? Why not simply, in pseudo code:

    while 1 {
        read ... a line
        if end of file, break
        geturl
        do something with the url results
    }

    If there's also a gui that the OP wants to keep alive, it should not be starved, since the synchronous form of geturl is calling vwait, and that would allow gui events to get processed while waiting for the url request to complete.
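
    (A literal Tcl rendering of that pseudo code, as a sketch; the URL and the
    "do something" step are placeholders.)

    package require http

    set fd [open test.txt r]
    while {[gets $fd line] >= 0} {
        # synchronous geturl: returns once the transfer has completed
        set token [http::geturl http://example.com/slow.htm]
        puts "line $line -> [http::data $token]"   ;# "do something" with the result
        http::cleanup $token
    }
    close $fd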

    -e

  • From Alan Grunwald@nospam.nurdglaw@gmail.com to comp.lang.tcl on Thu Jun 26 09:30:03 2025

    On 26/06/2025 02:52, et99 wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
       set ::input [open "|cat test.txt" r]
       fconfigure $::input -blocking 0 -buffering line
       fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open text.txt
    directly:

    set ::input [open test.txt r]

    And achieve the same result.



    I was also curious about this. But I'm also wondering why this is even
    event driven at all? Why not simply, in pseudo code:

    while 1 {
       read...a line
       if end of file, break
       geturl
       do something with the url results
    }

    If there's also a gui that the OP wants to keep alive, it should not be starved, since the synchronous form of geturl is calling vwait, and that would allow gui events to get processed while waiting for the url
    request to complete.

    -e

    My pseudo code is generally

    while !eof {
        read a line
        if line is not empty
            do stuff
        endif
    endwhile

    I used to be puzzled why I needed the test for non-emptiness. I never
    worked out why; nowadays I simply accept that's the way of things and
    do it.

    It might be that I come from a VMS background and assume that there's a newline at the end of each "record" in the file (which I find convenient because without one, the shell prompt gets appended to the last line
    when using cat at the command line). Have I missed something? Is there something inherently daft about my preconceptions?
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Thu Jun 26 12:14:24 2025

    * Alan Grunwald <nospam.nurdglaw@gmail.com>
    | My pseudo code is generally

    | while !eof {
    | read a line
    | if line is not empty
    | do stuff
    | endif
    | endwhile

    | I used to be puzzled why I needed the test for non-emptiness. I never
    | worked out why, nowadays I simply accept that's the way of things and
    | do it.

    If the read hits EOF, an empty line is returned which is not actually in
    the file. Depending on your data, this may or may not be a problem
    (if you're not interested in empty lines in the data, then no problem).
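
    (For comparison, a sketch of the eof-style loop that produces that extra
    empty "line"; do_stuff is just a placeholder.)

    while {![eof $fd]} {
        gets $fd line     ;# at end of file this returns -1 and sets line to ""
        do_stuff $line    ;# ... so the body runs once with an empty line
    }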

    Usually a better pattern for line-oriented data
    on a channel in blocking mode is

    while {[gets $fd line] >= 0} {
        # line has been read, possibly empty
        ...
    }
    close $fd

    HTH
    R'
  • From Rich@rich@example.invalid to comp.lang.tcl on Thu Jun 26 17:08:29 2025

    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this. But I'm also wondering why this is
    even event driven at all? Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code. The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that scenario does make some sense.

  • From Jonathan Kelly@jonkelly@fastmail.fm to comp.lang.tcl on Fri Jun 27 05:09:07 2025

    On 27/6/25 03:08, Rich wrote:
    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this. But I'm also wondering why this is
    even event driven at all? Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code. The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that senario does make some sense.

    What Rich said. Before I realised geturl is *always* asynchronous, I had
    read the man page for geturl where it said geturl "blocked". I needed to
    simplify my program into a test case to prove something was broken. It
    turned out the problem was my understanding, though I still think the
    manual page is misleading. The relevant note

    "Note: The event queue is even used without the -command option. As a
    side effect, arbitrary commands may be processed while http::geturl is running."

    is in the general description at the top, and I had just been reading
    the geturl function description.
  • From et99@et99@rocketship1.me to comp.lang.tcl on Thu Jun 26 14:02:34 2025

    On 6/26/2025 12:09 PM, Jonathan Kelly wrote:
    On 27/6/25 03:08, Rich wrote:
    et99 <et99@rocketship1.me> wrote:
    On 6/25/2025 2:32 PM, Rich wrote:
    Jonathan Kelly <jonkelly@fastmail.fm> wrote:
    proc queue {} {
        set ::input [open "|cat test.txt" r]
        fconfigure $::input -blocking 0 -buffering line
        fileevent $::input readable [list check $::input]
    }

    Curious why you are opening a pipe to cat, having cat read and print
    the contents, and then consuming that, when you can just open
    text.txt directly:

    set ::input [open test.txt r]

    And achieve the same result.

    I was also curious about this.  But I'm also wondering why this is
    even event driven at all?  Why not simply, in pseudo code:

    My guess: the above was OP's "test case" code.  The real code is
    reading an Apache log file as Apache logs to the file, so 'event
    driven' in that senario does make some sense.

    What Rich said. Before I realised geturl is *always* asynchronous, I had read the man for geturl where it said geturl "blocked". I needed to simplify my program as a test case to prove something was broken. Turned out, the problem was my understanding, though I still think the manual page is mis-leading. The relevant

    "Note: The event queue is even used without the -command option. As a side effect, arbitrary commands may be processed while http::geturl is running."

    is in the general description at the top, and I had just been reading the geturl function description.


    I wonder: if you are reading a file that is being written by another process, sort of like a "tail" program, doesn't Tcl's [fileevent <channel> readable <script>] trigger constantly? Isn't this in effect a tight polling loop?

    The manual says:

    "A channel is also considered to be readable if an end of file or error condition is present on the underlying file or device. It is important for script to check for these conditions and handle them appropriately; for example, if there is no special check for end of file, an infinite loop may occur where script reads no data, returns, and is immediately invoked again."

    To avoid this problem, one is normally supposed to close the file or remove the read handler. I've never written a log handler like this one, so I'm not sure what the correct approach would be.


  • From Christian Gollwitzer@auriocus@gmx.de to comp.lang.tcl on Fri Jun 27 16:35:08 2025

    Am 26.06.25 um 23:02 schrieb et99:
    I wonder, if you are reading a file that is being written from another process, sort of like a "tail" program, doesn't tcl's [fileevent
    <channel> readable <script>] trigger constantly? Isn't this in effect a tight polling loop?

    The underlying mechanism is select() or poll(). To my knowledge, this
    only works for pipes/sockets, not for files. The "tail -f" program runs
    stat() in a loop to see if the file date or size has changed.

    On Linux, you could also use inotify (there is a Tcl package) to get
    callbacks when the file is changed.
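
    (A sketch of that stat-in-a-loop idea in Tcl terms; pollSize is an assumed
    name and the log path is a placeholder.)

    proc pollSize {path last} {
        set size [file size $path]
        if {$size != $last} {
            puts "$path changed: $last -> $size bytes"   ;# react to the change here
        }
        after 1000 [list pollSize $path $size]
    }

    pollSize access_test.log [file size access_test.log]
    vwait forever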

    Christian
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Fri Jun 27 18:18:32 2025

    * Christian Gollwitzer <auriocus@gmx.de>
    | Am 26.06.25 um 23:02 schrieb et99:
    | > I wonder, if you are reading a file that is being written from
    | > another process, sort of like a "tail" program, doesn't tcl's
    | > [fileevent
    | > <channel> readable <script>] trigger constantly? Isn't this in
    | > effect a tight polling loop?

    | The underlying mechanism is select() or poll(). To my knowledge, this
    | only works for pipes/sockets, not for files.

    select() also works for files; the effect is that it indeed triggers
    immediately on each call.

    man select(2)
    [...]
    readfds
        The file descriptors in this set are watched to see if they are
        ready for reading.
        ** A file descriptor is ready for reading if a read operation will
        not block; in particular, a file descriptor is also ready on
        end-of-file.

    (** emphasis by me).

    The effect in Tcl of setting a readable fileevent on a regular disk file
    is indeed that the event fires repeatedly until EOF, blocking any GUI
    updates that would run "after idle".
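
    (A common workaround for tailing a plain file, sketched here with
    handleLine as a hypothetical per-line handler: poll with [after] instead
    of setting a readable fileevent.)

    proc pollLog {fd} {
        while {[gets $fd line] >= 0} {
            handleLine $line   ;# hypothetical per-line handler
        }
        # No complete line is available; clear the sticky EOF condition so
        # data appended later can be read, then look again shortly. (A
        # half-written last line would need extra care in real code.)
        if {[eof $fd]} {seek $fd 0 current}
        after 500 [list pollLog $fd]
    }

    pollLog [open access_test.log r]
    vwait forever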

    R'