arch/arm64/kernel/hibernate.c | 13 +++---- arch/ia64/kernel/process.c | 8 +--- arch/parisc/kernel/processor.c | 4 +- arch/powerpc/kernel/machine_kexec_64.c | 4 +- arch/sparc/kernel/ds.c | 8 +++- arch/x86/kernel/topology.c | 4 +- arch/x86/mm/mmio-mod.c | 8 +++- arch/x86/xen/smp.c | 4 +- drivers/base/core.c | 4 ++ drivers/base/cpu.c | 4 +- drivers/firmware/psci/psci_checker.c | 6 ++- drivers/xen/cpu_hotplug.c | 2 +- include/linux/cpu.h | 6 ++- kernel/cpu.c | 53 ++++++++++++++++++++++++-- kernel/smp.c | 9 +---- kernel/torture.c | 15 ++++++-- 16 files changed, 106 insertions(+), 46 deletions(-)
Using cpu_up/down directly to bring cpus online/offline loses synchronization with sysfs and could suffer from a race similar to what is described in commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization during LPM"). cpu_up/down seem to be more of a internal implementation detail for the cpu subsystem to use to boot up cpus, perform suspend/resume and low level hotplug operations. Users outside of the cpu subsystem would be better using the device core API to bring a cpu online/offline which is the interface used to hotplug memory and other system devices. Several users have already migrated to use the device core API, this series converts the remaining users and hides cpu_up/down from internal users at the end. I still need to update the documentation to remove references to cpu_up/down and advocate for device_online/offline instead if this series will make its way through. I noticed this problem while working on a hack to disable offlining a particular CPU but noticed that setting the offline_disabled attribute in the device struct isn't enough because users can easily bypass the device core. While my hack isn't a valid use case but it did highlight the inconsistency in the way cpus are being onlined/offlined and this attempt hopefully improves on this. The first 6 patches fixes arch users. The next 5 patches fixes generic code users. Particularly creating a new special exported API for the device core to use instead of cpu_up/down. Maybe we can do something more restrictive than that. The last patch removes cpu_up/down from cpu.h and unexport the functions. In some cases where the use of cpu_up/down seemed legitimate, I encapsulated the logic in a higher level - special purposed function; and converted the code to use that instead. I did run the rcu torture, lock torture and psci checker tests and no problem was noticed. I did perform build tests on all arch affected except for parisc. Hopefully I got the CC list right for all the patches. Apologies in advance if some people were omitted from some patches but they should have been CCed. CC: Armijn Hemel <armijn@tjaldur.nl> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Bjorn Helgaas <bhelgaas@google.com> CC: Borislav Petkov <bp@alien8.de> CC: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: Catalin Marinas <catalin.marinas@arm.com> CC: Christophe Leroy <christophe.leroy@c-s.fr> CC: Daniel Lezcano <daniel.lezcano@linaro.org> CC: Davidlohr Bueso <dave@stgolabs.net> CC: "David S. Miller" <davem@davemloft.net> CC: Eiichi Tsukata <devel@etsukata.com> CC: Enrico Weigelt <info@metux.net> CC: Fenghua Yu <fenghua.yu@intel.com> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Helge Deller <deller@gmx.de> CC: "H. Peter Anvin" <hpa@zytor.com> CC: Ingo Molnar <mingo@kernel.org> CC: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> CC: James Morse <james.morse@arm.com> CC: Jiri Kosina <jkosina@suse.cz> CC: Josh Poimboeuf <jpoimboe@redhat.com> CC: Josh Triplett <josh@joshtriplett.org> CC: Juergen Gross <jgross@suse.com> CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> CC: Mark Rutland <mark.rutland@arm.com> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Nadav Amit <namit@vmware.com> CC: Nicholas Piggin <npiggin@gmail.com> CC: "Paul E. McKenney" <paulmck@kernel.org> CC: Paul Mackerras <paulus@samba.org> CC: Pavankumar Kondeti <pkondeti@codeaurora.org> CC: "Peter Zijlstra (Intel)" <peterz@infradead.org> CC: "Rafael J. Wysocki" <rafael@kernel.org> CC: Ram Pai <linuxram@us.ibm.com> CC: Richard Fontana <rfontana@redhat.com> CC: Sakari Ailus <sakari.ailus@linux.intel.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Steve Capper <steve.capper@arm.com> CC: Thiago Jung Bauermann <bauerman@linux.ibm.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Tony Luck <tony.luck@intel.com> CC: Will Deacon <will@kernel.org> CC: Zhenzhong Duan <zhenzhong.duan@oracle.com> CC: linux-arm-kernel@lists.infradead.org CC: linux-ia64@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: linux-parisc@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: sparclinux@vger.kernel.org CC: x86@kernel.org CC: xen-devel@lists.xenproject.org Qais Yousef (12): arm64: hibernate.c: create a new function to handle cpu_up(sleep_cpu) x86: Replace cpu_up/down with devcie_online/offline powerpc: Replace cpu_up/down with device_online/offline ia64: Replace cpu_down with freeze_secondary_cpus sparc: Replace cpu_up/down with device_online/offline parisc: Replace cpu_up/down with device_online/offline driver: base: cpu: export device_online/offline driver: xen: Replace cpu_up/down with device_online/offline firmware: psci: Replace cpu_up/down with device_online/offline torture: Replace cpu_up/down with device_online/offline smp: Create a new function to bringup nonboot cpus online cpu: Hide cpu_up/down arch/arm64/kernel/hibernate.c | 13 +++---- arch/ia64/kernel/process.c | 8 +--- arch/parisc/kernel/processor.c | 4 +- arch/powerpc/kernel/machine_kexec_64.c | 4 +- arch/sparc/kernel/ds.c | 8 +++- arch/x86/kernel/topology.c | 4 +- arch/x86/mm/mmio-mod.c | 8 +++- arch/x86/xen/smp.c | 4 +- drivers/base/core.c | 4 ++ drivers/base/cpu.c | 4 +- drivers/firmware/psci/psci_checker.c | 6 ++- drivers/xen/cpu_hotplug.c | 2 +- include/linux/cpu.h | 6 ++- kernel/cpu.c | 53 ++++++++++++++++++++++++-- kernel/smp.c | 9 +---- kernel/torture.c | 15 ++++++-- 16 files changed, 106 insertions(+), 46 deletions(-) -- 2.17.1 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Hi Thomas On 10/30/19 15:38, Qais Yousef wrote: > Using cpu_up/down directly to bring cpus online/offline loses synchronization > with sysfs and could suffer from a race similar to what is described in > commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization > during LPM"). > > cpu_up/down seem to be more of a internal implementation detail for the cpu > subsystem to use to boot up cpus, perform suspend/resume and low level hotplug > operations. Users outside of the cpu subsystem would be better using the device > core API to bring a cpu online/offline which is the interface used to hotplug > memory and other system devices. > > Several users have already migrated to use the device core API, this series > converts the remaining users and hides cpu_up/down from internal users at the > end. > > I still need to update the documentation to remove references to cpu_up/down > and advocate for device_online/offline instead if this series will make its way > through. > > I noticed this problem while working on a hack to disable offlining > a particular CPU but noticed that setting the offline_disabled attribute in the > device struct isn't enough because users can easily bypass the device core. > While my hack isn't a valid use case but it did highlight the inconsistency in > the way cpus are being onlined/offlined and this attempt hopefully improves on > this. > > The first 6 patches fixes arch users. > > The next 5 patches fixes generic code users. Particularly creating a new > special exported API for the device core to use instead of cpu_up/down. > Maybe we can do something more restrictive than that. > > The last patch removes cpu_up/down from cpu.h and unexport the functions. > > In some cases where the use of cpu_up/down seemed legitimate, I encapsulated > the logic in a higher level - special purposed function; and converted the code > to use that instead. > > I did run the rcu torture, lock torture and psci checker tests and no problem > was noticed. I did perform build tests on all arch affected except for parisc. > > Hopefully I got the CC list right for all the patches. Apologies in advance if > some people were omitted from some patches but they should have been CCed. I had to make an educated guess that you're probably the 'maintainer' of cpu hotplug - but there's no explicit entry that says that. Please let me know if I need to bring the attention of others too. The series do have few rough edges to address, but it's relatively straightforward and I think does offer a nice improvement in the form of consolidating the API for bringing up/down cpus from external subsystems/drivers. Beside fix the inconsistency of device's core view of the state of the cpu which can happen when cpu_{up/down} are called directly. The downside I see is that the external API to bring cpus up/down for suspend/resume and at boot seem to have grown a bit organically (I've added a couple in this series to address 2 direct users of cpu_{up,down}). We might need to rethink this API, but I think this is outside the scope of this series. Any thoughts/feedback would be appreciated. Thanks -- Qais Yousef _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On Mon, 18 Nov 2019, Qais Yousef wrote: > I had to make an educated guess that you're probably the 'maintainer' of cpu > hotplug - but there's no explicit entry that says that. Please let me know if > I need to bring the attention of others too. :) > The series do have few rough edges to address, but it's relatively > straightforward and I think does offer a nice improvement in the form of > consolidating the API for bringing up/down cpus from external > subsystems/drivers. Beside fix the inconsistency of device's core view of the > state of the cpu which can happen when cpu_{up/down} are called directly. > > The downside I see is that the external API to bring cpus up/down for > suspend/resume and at boot seem to have grown a bit organically (I've added > a couple in this series to address 2 direct users of cpu_{up,down}). We might > need to rethink this API, but I think this is outside the scope of this series. > > Any thoughts/feedback would be appreciated. Let me have a look. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
© 2016 - 2024 Red Hat, Inc.