From: Michal Privoznik
To: libvir-list@redhat.com
Subject: [PATCH] rpc: Temporarily stop accept()-ing new clients on EMFILE
Date: Tue, 12 Oct 2021 14:41:39 +0200
Message-Id: <2857a75e9f234eeff9d535af0f8ba7a76e7adefe.1634042491.git.mprivozn@redhat.com>
List-Id: Development discussions about the libvirt library & tools

This commit is related to 5de203f879 which I pushed a few days
ago. While that commit prioritized closing client sockets over
the rest of I/O processing, this one goes one step further and
temporarily suspends processing new connection requests.

A brief recapitulation of the problem:

1) assume that libvirt is at the top of RLIMIT_NOFILE (that is,
   no new FDs can be opened).

2) we have a client trying to connect to a UNIX/TCP socket

Because of 2) our event loop sees POLLIN on the socket and thus
calls virNetServerServiceAccept(). But since no new FDs can be
opened (because of 1)) the request is not handled and we will get
the same event on the next iteration. The poll() will exit
immediately because there is an event on the socket. Thus we end
up in an endless loop.

To break the loop and stop burning CPU cycles we can stop
listening for events on the socket and set up a timer to enable
listening again after some time (I chose 5 seconds for no
particular reason).

There's another area where we play with temporarily suspending
accept() of new clients: when a client disconnects and we check
max_clients against the number of current clients. The problem
here is that max_clients can be orders of magnitude larger than
RLIMIT_NOFILE but, more importantly, what this code considers a
client disconnect is not equal to closing the client's FD. A
client disconnecting means that the corresponding client structure
is removed from the internal list of clients. Closing of the
client's FD is done from the event loop, asynchronously. To avoid
this part stepping on the toes of my fix, let's make the code a
NOP if the socket timer (as described above) is active.

Signed-off-by: Michal Privoznik
Reviewed-by: Daniel P. Berrangé
---
 src/libvirt_remote.syms       |  1 +
 src/rpc/virnetserver.c        |  9 ++++++
 src/rpc/virnetserverservice.c | 53 ++++++++++++++++++++++++++++++++++-
 src/rpc/virnetserverservice.h |  2 ++
 src/rpc/virnetsocket.c        | 15 ++++++++++
 5 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/src/libvirt_remote.syms b/src/libvirt_remote.syms
index b4265adf2e..942e1013a6 100644
--- a/src/libvirt_remote.syms
+++ b/src/libvirt_remote.syms
@@ -221,6 +221,7 @@ virNetServerServiceNewTCP;
 virNetServerServiceNewUNIX;
 virNetServerServicePreExecRestart;
 virNetServerServiceSetDispatcher;
+virNetServerServiceTimerActive;
 virNetServerServiceToggle;
 
 
diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
index b3214883ee..dc8f32b095 100644
--- a/src/rpc/virnetserver.c
+++ b/src/rpc/virnetserver.c
@@ -252,6 +252,15 @@ virNetServerDispatchNewMessage(virNetServerClient *client,
 static void
 virNetServerCheckLimits(virNetServer *srv)
 {
+    size_t i;
+
+    for (i = 0; i < srv->nservices; i++) {
+        if (virNetServerServiceTimerActive(srv->services[i])) {
+            VIR_DEBUG("Skipping client-related limits evaluation");
+            return;
+        }
+    }
+
     VIR_DEBUG("Checking client-related limits to re-enable or temporarily "
               "suspend services: nclients=%zu nclients_max=%zu "
               "nclients_unauth=%zu nclients_unauth_max=%zu",
diff --git a/src/rpc/virnetserverservice.c b/src/rpc/virnetserverservice.c
index 0c4c437a49..214eae1acb 100644
--- a/src/rpc/virnetserverservice.c
+++ b/src/rpc/virnetserverservice.c
@@ -43,6 +43,8 @@ struct _virNetServerService {
     int auth;
     bool readonly;
     size_t nrequests_client_max;
+    int timer;
+    bool timerActive;
 
     virNetTLSContext *tls;
 
@@ -71,9 +73,25 @@ static void virNetServerServiceAccept(virNetSocket *sock,
 {
     virNetServerService *svc = opaque;
     virNetSocket *clientsock = NULL;
+    int rc;
 
-    if (virNetSocketAccept(sock, &clientsock) < 0)
+    rc = virNetSocketAccept(sock, &clientsock);
+    if (rc < 0) {
+        if (rc == -2) {
+            /* Could not accept new client due to EMFILE. Suspend listening on
+             * the socket and set up a timer to enable it later. Hopefully,
+             * some FDs will be closed meanwhile. */
+            VIR_DEBUG("Temporarily suspending listening on svc=%p because accept() on sock=%p failed (errno=%d)",
+                      svc, sock, errno);
+
+            virNetServerServiceToggle(svc, false);
+
+            svc->timerActive = true;
+            /* Retry in 5 seconds. */
+            virEventUpdateTimeout(svc->timer, 5 * 1000);
+        }
         goto cleanup;
+    }
 
     if (!clientsock) /* Connection already went away */
         goto cleanup;
@@ -88,6 +106,21 @@ static void virNetServerServiceAccept(virNetSocket *sock,
 }
 
 
+static void
+virNetServerServiceTimerFunc(int timer,
+                             void *opaque)
+{
+    virNetServerService *svc = opaque;
+
+    VIR_DEBUG("Resuming listening on service svc=%p after previous suspend", svc);
+
+    virNetServerServiceToggle(svc, true);
+
+    virEventUpdateTimeout(timer, -1);
+    svc->timerActive = false;
+}
+
+
 static virNetServerService *
 virNetServerServiceNewSocket(virNetSocket **socks,
                              size_t nsocks,
@@ -117,6 +150,14 @@ virNetServerServiceNewSocket(virNetSocket **socks,
     svc->nrequests_client_max = nrequests_client_max;
     svc->tls = virObjectRef(tls);
 
+    virObjectRef(svc);
+    svc->timer = virEventAddTimeout(-1, virNetServerServiceTimerFunc,
+                                    svc, virObjectFreeCallback);
+    if (svc->timer < 0) {
+        virObjectUnref(svc);
+        goto error;
+    }
+
     for (i = 0; i < svc->nsocks; i++) {
         if (virNetSocketListen(svc->socks[i], max_queued_clients) < 0)
             goto error;
@@ -407,6 +448,9 @@ void virNetServerServiceDispose(void *obj)
     virNetServerService *svc = obj;
     size_t i;
 
+    if (svc->timer >= 0)
+        virEventRemoveTimeout(svc->timer);
+
     for (i = 0; i < svc->nsocks; i++)
         virObjectUnref(svc->socks[i]);
     g_free(svc->socks);
@@ -438,3 +482,10 @@ void virNetServerServiceClose(virNetServerService *svc)
         virNetSocketClose(svc->socks[i]);
     }
 }
+
+
+bool
+virNetServerServiceTimerActive(virNetServerService *svc)
+{
+    return svc->timerActive;
+}
diff --git a/src/rpc/virnetserverservice.h b/src/rpc/virnetserverservice.h
index ab5798938e..f3d55a9cc0 100644
--- a/src/rpc/virnetserverservice.h
+++ b/src/rpc/virnetserverservice.h
@@ -78,3 +78,5 @@ void virNetServerServiceToggle(virNetServerService *svc,
                                bool enabled);
 
 void virNetServerServiceClose(virNetServerService *svc);
+
+bool virNetServerServiceTimerActive(virNetServerService *svc);
diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
index 212089520d..60dff71718 100644
--- a/src/rpc/virnetsocket.c
+++ b/src/rpc/virnetsocket.c
@@ -2067,6 +2067,19 @@ int virNetSocketListen(virNetSocket *sock, int backlog)
     return 0;
 }
 
+
+/**
+ * virNetSocketAccept:
+ * @sock: socket to accept connection on
+ * @clientsock: returned client socket
+ *
+ * For given socket @sock accept incoming connection and create
+ * @clientsock representation of the new accepted connection.
+ *
+ * Returns: 0 on success,
+ *         -2 if accepting failed due to EMFILE error,
+ *         -1 otherwise.
+ */
 int virNetSocketAccept(virNetSocket *sock, virNetSocket **clientsock)
 {
     int fd = -1;
@@ -2087,6 +2100,8 @@ int virNetSocketAccept(virNetSocket *sock, virNetSocket **clientsock)
             errno == EAGAIN) {
             ret = 0;
             goto cleanup;
+        } else if (errno == EMFILE) {
+            ret = -2;
         }
 
         virReportSystemError(errno, "%s",
-- 
2.32.0
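
For readers who want to reason about the suspend/resume behaviour
without sockets or an event loop, here is a minimal sketch of the
state machine. It is not libvirt code: service, classify_accept(),
on_incoming(), on_tick() and may_check_limits() are hypothetical
names, and the current time is passed in explicitly (in
milliseconds) as an assumption standing in for the timer registered
with virEventAddTimeout().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none of them are libvirt APIs. */

enum { ACCEPT_OK = 0, ACCEPT_ERR = -1, ACCEPT_EMFILE = -2 };

#define RETRY_MS 5000L  /* same 5 second back-off as the patch */

typedef struct {
    bool listening;     /* listening socket is watched for POLLIN */
    bool timer_active;  /* a re-enable timer is pending */
    long timer_due_ms;  /* absolute time at which the timer fires */
} service;

/* Three-way result in the spirit of virNetSocketAccept():
 * fd is what accept() returned, err is errno at that point. */
static int classify_accept(int fd, int err)
{
    if (fd >= 0)
        return ACCEPT_OK;
    if (err == ECONNABORTED || err == EAGAIN)
        return ACCEPT_OK;       /* connection already went away */
    if (err == EMFILE)
        return ACCEPT_EMFILE;   /* out of file descriptors */
    return ACCEPT_ERR;
}

/* Handler for POLLIN on the listening socket. */
static void on_incoming(service *svc, int fd, int err, long now_ms)
{
    assert(svc != NULL);
    if (classify_accept(fd, err) == ACCEPT_EMFILE) {
        svc->listening = false;             /* stop the busy loop */
        svc->timer_active = true;
        svc->timer_due_ms = now_ms + RETRY_MS;
    }
}

/* Timer check, called once per event loop iteration. */
static void on_tick(service *svc, long now_ms)
{
    assert(svc != NULL);
    if (svc->timer_active && now_ms >= svc->timer_due_ms) {
        svc->listening = true;              /* resume accepting */
        svc->timer_active = false;
    }
}

/* Returns true if the max_clients style limit check may run,
 * false if it must be skipped because some timer is pending. */
static bool may_check_limits(const service *svcs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (svcs[i].timer_active)
            return false;
    }
    return true;
}
```

In this model a failed accept() with EMFILE at time 1000 leaves the
service not listening until on_tick() is called with a time of 6000
or later, and may_check_limits() returns false for that whole
interval, which mirrors the early return added to
virNetServerCheckLimits().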