• How to get rid of client disconnected network sockets

    From Marcel Mueller@news.5.maazl@spamgourmet.org to comp.os.linux.networking on Thu Jun 12 12:54:16 2025
    From Newsgroup: comp.os.linux.networking

    Hello!

    From time to time, network sockets stay stuck for a very long time
    after the client has gone offline:

    netstat -4np
    tcp   0  0  192.168.121.129:40830  192.168.121.1:49000    ESTABLISHED  1449463/vdr
    tcp   0  0  192.168.121.129:2049   192.168.121.137:939    ESTABLISHED  -
    tcp   0  0  192.168.121.129:2049   192.168.121.139:912    ESTABLISHED  -
    tcp  45  0  192.168.121.129:5143   192.168.121.24:41072   ESTABLISHED  -
    tcp   0  0  192.168.121.129:22     192.168.121.137:55826  ESTABLISHED  1475830/sshd-sessio
    tcp   0  0  192.168.121.129:2049   192.168.121.143:857    ESTABLISHED  -
    tcp   0  0  192.168.121.129:22     192.168.121.137:42194  ESTABLISHED  1479707/sshd-sessio
    tcp  45  0  192.168.121.129:5143   192.168.121.24:43014   ESTABLISHED  -
    tcp  45  0  192.168.121.129:5143   192.168.121.33:56454   ESTABLISHED  -
    tcp   0  0  192.168.121.129:5143   192.168.121.33:52444   ESTABLISHED  1460557/VBoxHeadles

    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.
    At some point I can no longer connect to the service at :5143. The
    client software (xfreerdp) just hangs.

    The only way to recover from this state is to end the listening process.
    Since that process is a virtual machine, this is not a practical option.

    The lost connections are often related to client network disconnects,
    e.g. by a laptop entering suspend or WLAN interference. But this should
    result in a socket with state CLOSE_WAIT which is cleaned up after a
    short time. Of course, it is not reproducible. It happens only from time
    to time.

    How can I force the sockets to disconnect?

    System is Debian 13 running VBox headless VMs. (I do not use VBox NAT.)


    Marcel
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Andrzej Adam Filip@anfi@onet.eu to comp.os.linux.networking on Thu Jun 12 13:24:04 2025

    Marcel Mueller <news.5.maazl@spamgourmet.org> wrote:
    from time to time network sockets are stuck for a very long time when
    the client is no longer online:

    netstat -4np
    tcp   0  0  192.168.121.129:40830  192.168.121.1:49000    ESTABLISHED  1449463/vdr
    tcp   0  0  192.168.121.129:2049   192.168.121.137:939    ESTABLISHED  -
    tcp   0  0  192.168.121.129:2049   192.168.121.139:912    ESTABLISHED  -
    tcp  45  0  192.168.121.129:5143   192.168.121.24:41072   ESTABLISHED  -
    tcp   0  0  192.168.121.129:22     192.168.121.137:55826  ESTABLISHED  1475830/sshd-sessio
    tcp   0  0  192.168.121.129:2049   192.168.121.143:857    ESTABLISHED  -
    tcp   0  0  192.168.121.129:22     192.168.121.137:42194  ESTABLISHED  1479707/sshd-sessio
    tcp  45  0  192.168.121.129:5143   192.168.121.24:43014   ESTABLISHED  -
    tcp  45  0  192.168.121.129:5143   192.168.121.33:56454   ESTABLISHED  -
    tcp   0  0  192.168.121.129:5143   192.168.121.33:52444   ESTABLISHED  1460557/VBoxHeadles

    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.
    At some point I can no longer connect to the service at :5143. The
    client software (xfreerdp) just hangs.

    The only way to recover from this state is to end the listening
    process. Since this is a virtual machine this option is not very
    applicable.

    The lost connections are often related to client network disconnects,
    e.g. by a laptop entering suspend or WLAN interference. But this
    should result in a socket with state CLOSE_WAIT which is cleaned up
    after a short time. Of course, it is not reproducible. It happens only
    from time to time.

    How can I force the sockets to disconnect?

    System is Debian 13 running VBox headless VMs. (I do not use VBox NAT.)

    AFAIR the classic recommendation is to lower the TCP keepalive timeout.
    It controls when the kernel starts checking an inactive TCP connection
    by sending probe packets. The default is 2 hours (7200 seconds).
    AFAIR some people reported changing it to 10 or 15 minutes.

    Keywords for searches: linux tcp_keepalive_time
    https://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
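
    A minimal Python sketch of the per-socket variant, assuming Linux
    (TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux socket options;
    the helper names and the 600/60/5 values are only illustrative, not
    taken from this thread). Note that the kernel sends keepalive probes
    only on sockets where the application has set SO_KEEPALIVE; the
    tcp_keepalive_* sysctls merely supply the default timers.

```python
import socket


def keepalive_detection_seconds(idle: int, interval: int, probes: int) -> int:
    """Approximate time until the kernel gives up on a silent peer.

    The first probe goes out after `idle` seconds without traffic, then up
    to `probes` unanswered probes are sent `interval` seconds apart.
    """
    return idle + interval * probes


def enable_keepalive(sock: socket.socket, idle: int = 600,
                     interval: int = 60, probes: int = 5) -> None:
    """Turn on TCP keepalive for one socket with per-socket timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The three per-socket timer options below are Linux-specific; on
    # platforms without them the system-wide defaults stay in effect.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    # Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
    print(keepalive_detection_seconds(7200, 75, 9))
    # Illustrative shorter settings.
    print(keepalive_detection_seconds(600, 60, 5))
```

    With the Linux defaults (7200 s idle, 75 s interval, 9 probes) a dead
    peer is only detected after roughly 7875 seconds; with the illustrative
    values above it takes about 900 seconds.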
    --
    [Andrew] Andrzej A. Filip
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.os.linux.networking on Thu Jun 12 21:53:56 2025

    On Thu, 12 Jun 2025 12:54:16 +0200, Marcel Mueller wrote:

    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.

    This to me points to a defect in the protocol, that it does not
    periodically exchange “are you there?” packets (even, say, once every 1
    minute or 5 minutes), just to be sure the other end is still up.
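
    As a rough illustration of such an application-level check, here is a
    small Python sketch. The class name PeerLiveness and the 60 s / 300 s
    thresholds are hypothetical; the protocol discussed in this thread is
    not known to work this way. The idea: remember when the peer was last
    heard from, send an "are you there?" probe once it has been quiet for
    a while, and drop the connection once it has been quiet for too long.

```python
import time


class PeerLiveness:
    """Tracks when the peer was last heard from (hypothetical helper)."""

    def __init__(self, probe_after=60.0, dead_after=300.0,
                 clock=time.monotonic):
        if dead_after <= probe_after:
            raise ValueError("dead_after must be larger than probe_after")
        self.probe_after = probe_after
        self.dead_after = dead_after
        self._clock = clock          # injectable so the logic can be tested
        self._last_heard = clock()

    def heard_from_peer(self):
        """Call for every message received, whether data or probe reply."""
        self._last_heard = self._clock()

    def idle_seconds(self):
        """Seconds since the peer last sent anything."""
        return self._clock() - self._last_heard

    def should_probe(self):
        """True while the peer is quiet enough to probe but not yet dead."""
        return self.probe_after <= self.idle_seconds() < self.dead_after

    def is_dead(self):
        """True once the peer has been silent for dead_after seconds."""
        return self.idle_seconds() >= self.dead_after
```

    A server loop would call heard_from_peer() on every received message,
    send a probe when should_probe() is true, and close the socket when
    is_dead() is true.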
  • From Computer Nerd Kev@not@telling.you.invalid to comp.os.linux.networking on Fri Jun 13 22:38:34 2025

    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 12 Jun 2025 12:54:16 +0200, Marcel Mueller wrote:
    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.

    This to me points to a defect in the protocol, that it does not
    periodically exchange "are you there?" packets (even, say, once every 1 minute or 5 minutes), just to be sure the other end is still up.

    But maybe the network connection was just interrupted and it'll be
    back in a few minutes? Since my home internet via mobile broadband
    is unreliable that often happens during SSH sessions. I can come
    back in 15min and my SSH terminals are all working again after the
    signal came back (well, not always, but sometimes). Mosh, which
    uses UDP, handles that better though.
    --
    __ __
    #_ < |\| |< _#
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.os.linux.networking on Fri Jun 13 23:40:14 2025

    On 13 Jun 2025 22:38:34 +1000, Computer Nerd Kev wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Thu, 12 Jun 2025 12:54:16 +0200, Marcel Mueller wrote:

    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.

    This to me points to a defect in the protocol, that it does not
    periodically exchange "are you there?" packets (even, say, once every 1
    minute or 5 minutes), just to be sure the other end is still up.

    But maybe the network connection was just interrupted and it'll be back
    in a few minutes?

    Set the timeout accordingly.
  • From Marcel Mueller@news.5.maazl@spamgourmet.org to comp.os.linux.networking on Sat Jun 14 11:56:11 2025

    Am 13.06.25 um 14:38 schrieb Computer Nerd Kev:
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 12 Jun 2025 12:54:16 +0200, Marcel Mueller wrote:
    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.

    This to me points to a defect in the protocol, that it does not
    periodically exchange "are you there?" packets (even, say, once every 1
    minute or 5 minutes), just to be sure the other end is still up.

    But maybe the network connection was just interrupted and it'll be
    back in a few minutes? Since my home internet via mobile broadband
    is unreliable that often happens during SSH sessions. I can come
    back in 15min and my SSH terminals are all working again after the
    signal came back (well, not always, but sometimes).

    For this purpose I would recommend "screen". It keeps your ssh session
    as long as you like.

    Mosh, which
    uses UDP, handles that better though.

    UDP sadly fails if a NAT router is in between. Since the router cannot
    know when a UDP "connection" is closed, it just discards the port
    mapping after a timeout without traffic, typically only a few minutes.

  • From Marcel Mueller@news.5.maazl@spamgourmet.org to comp.os.linux.networking on Sat Jun 14 14:35:27 2025

    Am 12.06.25 um 13:24 schrieb Andrzej Adam Filip:
    AFAIR Classic recommendation is to change tcp keep_alive timeout.
    It changes when kernel checks (sends probe packets) over inactive
    tcp connection. The default is after 2 hours (7200 seconds).
    AFAIR some people reported changing it to 10 or 15 minutes.

    I played a bit around with different settings, but with only limited
    success.

    The problem is primarily excessive retransmissions. Ping latency rises
    to as much as 20 seconds (!) while packet loss stays quite low.
    I got some improvement by significantly reducing the TX queue depth of
    the WLAN device (50 instead of 1000). It makes no sense to send a packet
    20 seconds late.

    The basic problem seems to be that the transmission power is restricted
    to 15 dBm although channel 100 allows a significantly higher level (26
    dBm). The router seems to be aware of that, but the Intel Centrino N
    6205 device is not. So link quality is highly asymmetric.
    No idea whether the power limit is due to hardware restrictions or a
    driver bug. I guess the former.

    However, I found a workaround for the orphaned connections. Using
    systemd-socket-proxyd I can control almost all socket options in a very
    fine-grained way. It also starts the VM if it is not yet running.


    Marcel
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.os.linux.networking on Sat Jun 14 23:37:45 2025

    On Sat, 14 Jun 2025 11:56:11 +0200, Marcel Mueller wrote:

    Am 13.06.25 um 14:38 schrieb Computer Nerd Kev:

    Mosh, which uses UDP, handles that better though.

    UDP sadly fails if a NAT router is in between. Since it cannot know when
    the connection is closed it just discards the port after a timeout w/o traffic, typically only a few minutes.

    Again, that depends on the application-level protocol. If that has
    provision for sending “noop” requests and responses, then there is a
    way to keep the NAT context alive, even when there is no actual data to
    be passed.
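
    A minimal Python sketch of that idea, under stated assumptions: the
    NOOP marker, the helper names and the 0.5 safety factor are all
    hypothetical and do not describe Mosh or any other real protocol. The
    sender emits a tiny datagram well before the NAT mapping would expire,
    and the receiver recognises it and ignores it.

```python
import socket

# Hypothetical wire marker for "no data, just keeping the NAT entry alive".
NOOP = b"\x00NOOP"


def keepalive_interval(nat_timeout_s: float, safety_factor: float = 0.5) -> float:
    """Seconds to wait between noops, safely below the NAT idle timeout."""
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1")
    return nat_timeout_s * safety_factor


def send_noop(sock: socket.socket, peer) -> None:
    """Send one noop datagram so the NAT router sees traffic on the mapping."""
    sock.sendto(NOOP, peer)


def is_noop(datagram: bytes) -> bool:
    """Receiver side: noops carry no payload and can simply be dropped."""
    return datagram == NOOP
```

    The sender would call send_noop() every keepalive_interval(t) seconds
    of silence, where t is the (assumed) idle timeout of the NAT router.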
  • From candycanearter07@candycanearter07@candycanearter07.nomail.afraid to comp.os.linux.networking on Wed Jun 18 05:30:09 2025

    Marcel Mueller <news.5.maazl@spamgourmet.org> wrote at 09:56 this Saturday (GMT):
    Am 13.06.25 um 14:38 schrieb Computer Nerd Kev:
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
    On Thu, 12 Jun 2025 12:54:16 +0200, Marcel Mueller wrote:
    The client 192.168.121.24 has been disconnected for at least an hour,
    but its sockets on the server seem to stay around indefinitely.

    This to me points to a defect in the protocol, that it does not
    periodically exchange "are you there?" packets (even, say, once every 1
    minute or 5 minutes), just to be sure the other end is still up.

    But maybe the network connection was just interrupted and it'll be
    back in a few minutes? Since my home internet via mobile broadband
    is unreliable that often happens during SSH sessions. I can come
    back in 15min and my SSH terminals are all working again after the
    signal came back (well, not always, but sometimes).

    For this purpose I would recommend "screen". It keeps your ssh session
    as long as you like.

    Or tmux.

    Mosh, which
    uses UDP, handles that better though.

    UDP sadly fails if a NAT router is in between. Since it cannot know when
    the connection is closed it just discards the port after a timeout w/o traffic, typically only a few minutes.


    Is there any way to change the timeout period?
    --
    user <candycane> is generated from /dev/urandom
  • From Marcel Mueller@news.5.maazl@spamgourmet.org to comp.os.linux.networking on Wed Jun 18 10:00:02 2025

    Am 18.06.25 um 07:30 schrieb candycanearter07:
    UDP sadly fails if a NAT router is in between. Since it cannot know when
    the connection is closed it just discards the port after a timeout w/o
    traffic, typically only a few minutes.

    Is there any way to change the timeout period?

    With OpenWRT: likely. But I have never seen a ready-to-use or provider
    router that allows this kind of setting.


    Marcel