From: George Dunlap
To: xen-devel@lists.xenproject.org
Cc: George Dunlap, Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu
Subject: [PATCH] docs: Document a policy for when to deviate from specifications
Date: Mon, 18 Sep 2023 13:28:16 +0100
Message-ID: <20230918122817.6577-1-george.dunlap@cloud.com>
X-Mailer: git-send-email 2.42.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

There is an ongoing disagreement among maintainers about how Xen should
handle deviations from specifications such as ACPI or EFI.

Write up an explicit policy, and include two worked-out examples from
recent discussions.

Signed-off-by: George Dunlap
Acked-by: Julien Grall
Reviewed-by: Marek Marczykowski-Górecki
---
NB that the technical descriptions of the costs of the accommodations
or lack thereof I've just gathered from reading the discussions; I'm
not familiar enough with the details to assert things about them. So
please correct any technical issues.
---
 docs/policy/FollowingSpecifications.md | 219 +++++++++++++++++++++++++
 1 file changed, 219 insertions(+)
 create mode 100644 docs/policy/FollowingSpecifications.md

diff --git a/docs/policy/FollowingSpecifications.md b/docs/policy/FollowingSpecifications.md
new file mode 100644
index 0000000000..a197f01f65
--- /dev/null
+++ b/docs/policy/FollowingSpecifications.md
@@ -0,0 +1,219 @@
+# Guidelines for following specifications
+
+## In general, follow specifications
+
+In general, specifications such as ACPI and EFI should be followed.
+
+## Accommodate non-compliant systems if it doesn't affect compliant systems
+
+Sometimes, however, there occur situations where real systems "in the
+wild" violate these specifications, or at least our interpretation of
+them (henceforth called "non-compliant"). If we can accommodate
+non-compliant systems without affecting any compliant systems, then we
+should do so.
+
+## If accommodation would affect theoretical compliant systems that are
+   not known to exist, and Linux and/or Windows takes the
+   accommodation, take the accommodation unless there's a
+   reason not to.
+
+Sometimes, however, there occur situations where real, non-compliant
+systems "in the wild" cannot be accommodated without affecting
+theoretical compliant systems; but no such theoretical
+compliant systems are known to exist. If Linux and/or Windows take the
+accommodation, then from a cost-benefit perspective it's probably best
+for us to take the accommodation as well.
+
+This is really a special case of the next principle; the "reason not
+to" would be in the form of a cost-benefit analysis as described in
+the next section showing why the "special case" doesn't apply to the
+accommodation in question.
+
+## If things aren't clear, do a cost-benefit analysis
+
+Sometimes, however, things are more complicated or less clear. In
+that case, we should do a cost-benefit analysis for a particular
+accommodation. Things which should be factored into the analysis:
+
+N-1: The number of non-compliant systems that require the accommodation
+  N-1a: The number of known current systems
+  N-1b: The probable number of unknown current systems
+  N-1c: The probable number of unknown future systems
+
+N-2: The severity of the effect of non-accommodation on these systems
+
+C-1: The number of compliant systems that would be affected by the accommodation
+  C-1a: The number of known current systems
+  C-1b: The probable number of unknown current systems
+  C-1c: The probable number of unknown future systems
+
+C-2: The severity of the effect of accommodation on these systems
+
+Intuitively, N-1 * N-2 gives us N, the cost of not making the
+accommodation, and C-1 * C-2 gives us C, the cost of taking the
+accommodation. If N > C, then we should take the accommodation; if C >
+N, then we shouldn't.
+
+The idea isn't to come up with actual numbers to plug in here
+(although that's certainly an option if someone wants to), but to
+explain the general idea we're trying to get at.
+
+A couple of other principles to factor in:
+
+Vendors tend to copy themselves and other vendors. If one or two
+major vendors are known to create compliant or non-compliant systems
+in a particular way, then there are likely to be more unknown and
+future systems which will be affected by / need a similar accommodation
+respectively; that is, we should raise our estimates of N-1{b,c} and
+C-1{b,c}.
+
+Some downstreams already implement accommodations, and test on a
+variety of hardware. If downstreams such as QubesOS or XenServer /
+XCP-ng implement an accommodation, then N-1 * N-2 is likely to be
+non-negligible, and C-1 * C-2 is likely to be negligible.
+
+Windows and Linux are widely tested. If Windows and/or Linux make a
+particular accommodation, and that accommodation has remained stable
+without being reverted, then it's likely that the number of unknown
+current systems that are affected by the accommodation is negligible;
+that is, we should lower the C-1b estimate.
+
+Vendors tend to test server hardware on Windows and Linux. If Windows
+and/or Linux make a particular accommodation, then it's unlikely that
+future systems will be affected by the accommodation; that is, we
+should lower the C-1c estimate.
+
+# Example applications
+
+Here are some examples of how these principles can be applied.
+
+## ACPI MADT tables containing ~0
+
+Xen disables certain kinds of features on CPU hotplug systems; for
+example, it will avoid using the TSC (which is faster and more power
+efficient) since it won't be reliable on a hot-pluggable system, and
+instead fall back to other timer sources which are slower and less
+power efficient.
+
+Some hardware vendors have (it seems) begun making a single ACPI table
+image for a range of similar systems, with MADT entries for the number
+of CPUs based on the system with the most CPUs, and then for the
+systems with fewer CPUs, replacing the APIC IDs in the MADT table with
+~0, to indicate that those entries aren't valid. These systems are
+not hotplug capable. Sometimes the invalid slots are on a separate
+socket.
+
+One interpretation of the spec is that a system with such MADT entries
+could actually have an extra socket, and that later the system could
+update the MADT table, populating the APIC IDs with real values.
+
+If Xen finds an MADT where all slots are either populated or filled
+with APIC ID ~0, should it consider it a multi-socket hotplug system, and
+disable features available on single-socket systems? Or should it
+accommodate the systems above, treating them as systems
+incapable of hotplug?
+
+N-1a: People have clearly found a number of systems in the wild, from
+different vendors, that exhibit this property; it's a non-negligible
+number of systems.
+
+N-1b,c: Since these systems are from different vendors, and there seem to
+be a fair number of them, there are likely to be many more that we
+don't know about; and likely to be many more produced in the future.
+
+N-2: Xen will use more expensive (both time- and power-wise) clock
+sources unless the user manually modifies the Xen command-line.
+
+C-1a,b: There are no known systems that implement physical CPU hotplug
+whatsoever, much less a system that uses ~0 for APIC IDs.
+
+There are hypervisors that implement *virtual* CPU hotplug; but they
+don't use ~0 for APIC IDs.
+
+C-1c: It seems that physical CPU hotplug is an unsolved problem: it was
+worked on for quite a while and then abandoned. So it seems fairly
+unlikely that any physical CPU hotplug systems will come to exist any
+time in the near future.
+
+If any hotplug systems were created, they would only be affected if
+they happened to use ~0 as the APIC ID of the empty slots in the MADT
+table. This by itself seems unlikely, given the number of vendors who
+are now using that to mean "invalid slot", and the fact that virtual
+hotplug systems don't do this.
+
+Furthermore, Linux has been treating such entries as permanently
+invalid since 2016. If any system were to implement physical CPU
+hotplug in the future, and use ~0 as a placeholder APIC ID, it's very
+likely they would test it on Linux, discover that it doesn't work, and
+modify the system to enable it to work (perhaps copying QEMU's
+behavior). It seems likely that Windows will do the same thing,
+further reducing the probability that any system like this will make
+it into production.
+
+So the potential number of future systems affected by this before we
+can implement a fix seems very small indeed.
+
+C-2: If such a system did exist, everything would work fine at boot;
+the only issue would be that when an extra CPU was plugged in, nothing
+would happen. This could be overridden by a command-line argument.
+
+Adding these all together, there's a widespread, moderate cost to not
+accommodating these systems, and an uncertain and small cost to
+accommodating them. So it makes sense to apply the accommodation.
+
+## Calling EFI Reboot method
+
+One interpretation of the EFI spec is that operating systems should
+call the EFI ResetSystem method in preference to the ACPI reboot
+method.
+
+However, although the ResetSystem method is required by the EFI spec,
+on a large number of different systems it doesn't actually work, at least
+when called by Xen: a large number of systems don't cleanly reboot
+after calling the EFI ResetSystem method, but rather crash or fail in some
+other random way.
+
+(One reason for this is that the Windows EFI test doesn't call the EFI
+ResetSystem method, but calls the ACPI reboot method. One possible
+explanation for the repeated pattern is that vendors smoke-test the
+ResetSystem method from the EFI shell, which has its own memory map;
+but fail to test it when running on the OS memory map.)
+
+Should Xen follow our interpretation of the EFI spec, and call the
+ResetSystem method in preference to the ACPI reboot method? Or should
+Xen accommodate systems with broken ResetSystem methods, and call the
+ACPI reboot method by default?
+
+N-1a: There are clearly a large number of systems which exhibit this
+property.
+
+N-1b,c: Given the large number of diverse vendors who make this
+mistake, it seems likely that there are even more that we don't know
+about, and this will continue into the future.
+
+N-2: Systems are incapable of rebooting cleanly unless the right runes
+are put into the Xen command line to make it prefer using the ACPI
+reboot method.
+
+C-1a: A system would only be negatively affected if 1) an ACPI reboot
+method exists, 2) an EFI method exists, and 3) calling the ACPI method
+in preference to the EFI method causes some sort of issue. So far
+nobody has run into such a system.
+
+C-1b,c: The Windows EFI test explicitly tests the ACPI reboot method
+on EFI systems. Linux also prefers calling the ACPI reboot method
+even when an EFI method is available. The chance of someone shipping
+a system that had a problem while that was the case is tiny: it
+basically wouldn't run either of the two most important operating
+systems.
+
+C-2: It seems likely that the worst that could happen is what's
+happening now when calling the EFI method: that the ACPI method would
+cause a weird crash, which then would reboot or hang.
+
+XenServer has shipped this accommodation for several years now.
+
+Adding these all together, the cost of non-accommodation is widespread
+and moderate; that is to say, non-negligible; and the cost of
+accommodation is theoretical and tiny. So it makes sense to apply the
+accommodation.
\ No newline at end of file
-- 
2.42.0