From nobody Sun Feb 8 22:34:45 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1518787350077619.1776914170867; Fri, 16 Feb 2018 05:22:30 -0800 (PST) Received: from localhost ([::1]:48190 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1emfy9-0005W0-59 for importer@patchew.org; Fri, 16 Feb 2018 08:22:29 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51782) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1emfsa-0001JA-4M for qemu-devel@nongnu.org; Fri, 16 Feb 2018 08:16:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1emfsY-0005kP-5s for qemu-devel@nongnu.org; Fri, 16 Feb 2018 08:16:44 -0500 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:34850 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1emfsX-0005jz-Vv for qemu-devel@nongnu.org; Fri, 16 Feb 2018 08:16:42 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A3385407519B for ; Fri, 16 Feb 2018 13:16:41 +0000 (UTC) Received: from dgilbert-t530.redhat.com (unknown [10.36.118.23]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3F29B213AEE2; Fri, 16 Feb 2018 13:16:40 +0000 (UTC) From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org, maxime.coquelin@redhat.com, marcandre.lureau@redhat.com, peterx@redhat.com, imammedo@redhat.com, mst@redhat.com Date: Fri, 16 Feb 2018 13:16:05 +0000 Message-Id: <20180216131625.9639-10-dgilbert@redhat.com> In-Reply-To: <20180216131625.9639-1-dgilbert@redhat.com> References: <20180216131625.9639-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 16 Feb 2018 13:16:41 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 16 Feb 2018 13:16:41 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'dgilbert@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 09/29] postcopy: Allow registering of fd handler X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: aarcange@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" Allow other userfaultfd's to be registered into the fault thread so that handlers for shared memory can get responses. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- migration/migration.c | 6 ++ migration/migration.h | 2 + migration/postcopy-ram.c | 209 +++++++++++++++++++++++++++++++++++--------= ---- migration/postcopy-ram.h | 21 +++++ migration/trace-events | 2 + 5 files changed, 187 insertions(+), 53 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 86d69120a6..d49eed1c5b 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -155,6 +155,8 @@ MigrationIncomingState *migration_incoming_get_current(= void) if (!once) { mis_current.state =3D MIGRATION_STATUS_NONE; memset(&mis_current, 0, sizeof(MigrationIncomingState)); + mis_current.postcopy_remote_fds =3D g_array_new(FALSE, TRUE, + sizeof(struct PostCopyF= D)); qemu_mutex_init(&mis_current.rp_mutex); qemu_event_init(&mis_current.main_thread_load_event, false); once =3D true; @@ -177,6 +179,10 @@ void migration_incoming_state_destroy(void) qemu_fclose(mis->from_src_file); mis->from_src_file =3D NULL; } + if (mis->postcopy_remote_fds) { + g_array_free(mis->postcopy_remote_fds, TRUE); + mis->postcopy_remote_fds =3D NULL; + } =20 qemu_event_reset(&mis->main_thread_load_event); } diff --git a/migration/migration.h b/migration/migration.h index 848f638a20..d158e62cf2 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -48,6 +48,8 @@ struct MigrationIncomingState { QemuMutex rp_mutex; /* We send replies from multiple threads */ void *postcopy_tmp_page; void *postcopy_tmp_zero_page; + /* PostCopyFD's for external userfaultfds & handlers of shared memory = */ + GArray *postcopy_remote_fds; =20 QEMUBH *bh; =20 diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index fa98cf353b..d118b78bf5 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -542,29 +542,43 @@ static void *postcopy_ram_fault_thread(void *opaque) MigrationIncomingState *mis =3D opaque; struct uffd_msg msg; int ret; + size_t index; RAMBlock *rb =3D NULL; RAMBlock *last_rb =3D NULL; /* last RAMBlock we sent part of */ =20 trace_postcopy_ram_fault_thread_entry(); qemu_sem_post(&mis->fault_thread_sem); =20 + struct pollfd *pfd; + size_t pfd_len =3D 2 + mis->postcopy_remote_fds->len; + + pfd =3D g_new0(struct pollfd, pfd_len); + + pfd[0].fd =3D mis->userfault_fd; + pfd[0].events =3D POLLIN; + pfd[1].fd =3D mis->userfault_quit_fd; + pfd[1].events =3D POLLIN; /* Waiting for eventfd to go positive */ + trace_postcopy_ram_fault_thread_fds_core(pfd[0].fd, pfd[1].fd); + for (index =3D 0; index < mis->postcopy_remote_fds->len; index++) { + struct PostCopyFD *pcfd =3D &g_array_index(mis->postcopy_remote_fd= s, + struct PostCopyFD, index); + pfd[2 + index].fd =3D pcfd->fd; + pfd[2 + index].events =3D POLLIN; + trace_postcopy_ram_fault_thread_fds_extra(2 + index, pcfd->idstr, + pcfd->fd); + } + while (true) { ram_addr_t rb_offset; - struct pollfd pfd[2]; + int poll_result; =20 /* * We're mainly waiting for the kernel to give us a faulting HVA, * however we can be told to quit via userfault_quit_fd which is * an eventfd */ - pfd[0].fd =3D mis->userfault_fd; - pfd[0].events =3D POLLIN; - pfd[0].revents =3D 0; - pfd[1].fd =3D mis->userfault_quit_fd; - pfd[1].events =3D POLLIN; /* Waiting for eventfd to go positive */ - pfd[1].revents =3D 0; - - if (poll(pfd, 2, -1 /* Wait forever */) =3D=3D -1) { + poll_result =3D poll(pfd, pfd_len, -1 /* Wait forever */); + if (poll_result =3D=3D -1) { error_report("%s: userfault poll: %s", __func__, strerror(errn= o)); break; } @@ -574,57 +588,117 @@ static void *postcopy_ram_fault_thread(void *opaque) break; } =20 - ret =3D read(mis->userfault_fd, &msg, sizeof(msg)); - if (ret !=3D sizeof(msg)) { - if (errno =3D=3D EAGAIN) { - /* - * if a wake up happens on the other thread just after - * the poll, there is nothing to read. - */ - continue; + if (pfd[0].revents) { + poll_result--; + ret =3D read(mis->userfault_fd, &msg, sizeof(msg)); + if (ret !=3D sizeof(msg)) { + if (errno =3D=3D EAGAIN) { + /* + * if a wake up happens on the other thread just after + * the poll, there is nothing to read. + */ + continue; + } + if (ret < 0) { + error_report("%s: Failed to read full userfault " + "message: %s", + __func__, strerror(errno)); + break; + } else { + error_report("%s: Read %d bytes from userfaultfd " + "expected %zd", + __func__, ret, sizeof(msg)); + break; /* Lost alignment, don't know what we'd read ne= xt */ + } } - if (ret < 0) { - error_report("%s: Failed to read full userfault message: %= s", - __func__, strerror(errno)); - break; - } else { - error_report("%s: Read %d bytes from userfaultfd expected = %zd", - __func__, ret, sizeof(msg)); - break; /* Lost alignment, don't know what we'd read next */ + if (msg.event !=3D UFFD_EVENT_PAGEFAULT) { + error_report("%s: Read unexpected event %ud from userfault= fd", + __func__, msg.event); + continue; /* It's not a page fault, shouldn't happen */ } - } - if (msg.event !=3D UFFD_EVENT_PAGEFAULT) { - error_report("%s: Read unexpected event %ud from userfaultfd", - __func__, msg.event); - continue; /* It's not a page fault, shouldn't happen */ - } =20 - rb =3D qemu_ram_block_from_host( - (void *)(uintptr_t)msg.arg.pagefault.address, - true, &rb_offset); - if (!rb) { - error_report("postcopy_ram_fault_thread: Fault outside guest: = %" - PRIx64, (uint64_t)msg.arg.pagefault.address); - break; - } + rb =3D qemu_ram_block_from_host( + (void *)(uintptr_t)msg.arg.pagefault.address, + true, &rb_offset); + if (!rb) { + error_report("postcopy_ram_fault_thread: Fault outside gue= st: %" + PRIx64, (uint64_t)msg.arg.pagefault.address); + break; + } =20 - rb_offset &=3D ~(qemu_ram_pagesize(rb) - 1); - trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address, + rb_offset &=3D ~(qemu_ram_pagesize(rb) - 1); + trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.addr= ess, qemu_ram_get_idstr(rb), rb_offset); + /* + * Send the request to the source - we want to request one + * of our host page sizes (which is >=3D TPS) + */ + if (rb !=3D last_rb) { + last_rb =3D rb; + migrate_send_rp_req_pages(mis, qemu_ram_get_idstr(rb), + rb_offset, qemu_ram_pagesize(rb)); + } else { + /* Save some space */ + migrate_send_rp_req_pages(mis, NULL, + rb_offset, qemu_ram_pagesize(rb)); + } + } =20 - /* - * Send the request to the source - we want to request one - * of our host page sizes (which is >=3D TPS) - */ - if (rb !=3D last_rb) { - last_rb =3D rb; - migrate_send_rp_req_pages(mis, qemu_ram_get_idstr(rb), - rb_offset, qemu_ram_pagesize(rb)); - } else { - /* Save some space */ - migrate_send_rp_req_pages(mis, NULL, - rb_offset, qemu_ram_pagesize(rb)); + /* Now handle any requests from external processes on shared memor= y */ + /* TODO: May need to handle devices deregistering during postcopy = */ + for (index =3D 2; index < pfd_len && poll_result; index++) { + if (pfd[index].revents) { + struct PostCopyFD *pcfd =3D + &g_array_index(mis->postcopy_remote_fds, + struct PostCopyFD, index - 2); + + poll_result--; + if (pfd[index].revents & POLLERR) { + error_report("%s: POLLERR on poll %zd fd=3D%d", + __func__, index, pcfd->fd); + pfd[index].events =3D 0; + continue; + } + + ret =3D read(pcfd->fd, &msg, sizeof(msg)); + if (ret !=3D sizeof(msg)) { + if (errno =3D=3D EAGAIN) { + /* + * if a wake up happens on the other thread just a= fter + * the poll, there is nothing to read. + */ + continue; + } + if (ret < 0) { + error_report("%s: Failed to read full userfault " + "message: %s (shared) revents=3D%d", + __func__, strerror(errno), + pfd[index].revents); + /*TODO: Could just disable this sharer */ + break; + } else { + error_report("%s: Read %d bytes from userfaultfd " + "expected %zd (shared)", + __func__, ret, sizeof(msg)); + /*TODO: Could just disable this sharer */ + break; /*Lost alignment,don't know what we'd read = next*/ + } + } + if (msg.event !=3D UFFD_EVENT_PAGEFAULT) { + error_report("%s: Read unexpected event %ud " + "from userfaultfd (shared)", + __func__, msg.event); + continue; /* It's not a page fault, shouldn't happen */ + } + /* Call the device handler registered with us */ + ret =3D pcfd->handler(pcfd, &msg); + if (ret) { + error_report("%s: Failed to resolve shared fault on %z= d/%s", + __func__, index, pcfd->idstr); + /* TODO: Fail? Disable this sharer? */ + } + } } } trace_postcopy_ram_fault_thread_exit(); @@ -954,3 +1028,32 @@ PostcopyState postcopy_state_set(PostcopyState new_st= ate) { return atomic_xchg(&incoming_postcopy_state, new_state); } + +/* Register a handler for external shared memory postcopy + * called on the destination. + */ +void postcopy_register_shared_ufd(struct PostCopyFD *pcfd) +{ + MigrationIncomingState *mis =3D migration_incoming_get_current(); + + mis->postcopy_remote_fds =3D g_array_append_val(mis->postcopy_remote_f= ds, + *pcfd); +} + +/* Unregister a handler for external shared memory postcopy + */ +void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd) +{ + guint i; + MigrationIncomingState *mis =3D migration_incoming_get_current(); + GArray *pcrfds =3D mis->postcopy_remote_fds; + + for (i =3D 0; i < pcrfds->len; i++) { + struct PostCopyFD *cur =3D &g_array_index(pcrfds, struct PostCopyF= D, i); + if (cur->fd =3D=3D pcfd->fd) { + mis->postcopy_remote_fds =3D g_array_remove_index(pcrfds, i); + return; + } + } +} + diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index bee21d4401..4bda5aa509 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -141,4 +141,25 @@ void postcopy_remove_notifier(NotifierWithReturn *n); /* Call the notifier list set by postcopy_add_start_notifier */ int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp); =20 +struct PostCopyFD; + +/* ufd is a pointer to the struct uffd_msg *TODO: more Portable! */ +typedef int (*pcfdhandler)(struct PostCopyFD *pcfd, void *ufd); + +struct PostCopyFD { + int fd; + /* Data to pass to handler */ + void *data; + /* Handler to be called whenever we get a poll event */ + pcfdhandler handler; + /* A string to use in error messages */ + char *idstr; +}; + +/* Register a userfaultfd owned by an external process for + * shared memory. + */ +void postcopy_register_shared_ufd(struct PostCopyFD *pcfd); +void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd); + #endif diff --git a/migration/trace-events b/migration/trace-events index 93961dea16..1e617ad7a6 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -190,6 +190,8 @@ postcopy_place_page_zero(void *host_addr) "host=3D%p" postcopy_ram_enable_notify(void) "" postcopy_ram_fault_thread_entry(void) "" postcopy_ram_fault_thread_exit(void) "" +postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitf= d: %d" +postcopy_ram_fault_thread_fds_extra(size_t index, const char *name, int fd= ) "%zd/%s: %d" postcopy_ram_fault_thread_quit(void) "" postcopy_ram_fault_thread_request(uint64_t hostaddr, const char *ramblock,= size_t offset) "Request for HVA=3D0x%" PRIx64 " rb=3D%s offset=3D0x%zx" postcopy_ram_incoming_cleanup_closeuf(void) "" --=20 2.14.3