blk_mq_pci_map_queues and blk_mq_virtio_map_queues create a CPU to
hardware queue mapping based on affinity information. These two functions
share common code and differ only in how the affinity information is
retrieved. Also, those functions are located in the block subsystem,
where they don't really fit. They are virtio and pci subsystem
specific.
Thus introduce a generic mapping function which uses the
irq_get_affinity callback from bus_type.
Original idea from Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
---
block/blk-mq-cpumap.c | 37 +++++++++++++++++++++++++++++++++++++
include/linux/blk-mq.h | 2 ++
2 files changed, 39 insertions(+)
diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
index 9638b25fd52124f0173e968ebdca5f1fe0b42ad9..db22a7d523a2762b76398fdd768f55efd1d6d669 100644
--- a/block/blk-mq-cpumap.c
+++ b/block/blk-mq-cpumap.c
@@ -11,6 +11,7 @@
#include <linux/smp.h>
#include <linux/cpu.h>
#include <linux/group_cpus.h>
+#include <linux/device/bus.h>
#include "blk.h"
#include "blk-mq.h"
@@ -54,3 +55,39 @@ int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int index)
return NUMA_NO_NODE;
}
+
+/**
+ * blk_mq_hctx_map_queues - Create CPU to hardware queue mapping
+ * @qmap: CPU to hardware queue map.
+ * @dev: The device to map queues.
+ * @offset: Queue offset to use for the device.
+ *
+ * Create a CPU to hardware queue mapping in @qmap. The struct bus_type
+ * irq_get_affinity callback will be used to retrieve the affinity.
+ */
+void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
+			    struct device *dev, unsigned int offset)
+
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	if (!dev->bus->irq_get_affinity)
+		goto fallback;
+
+	for (queue = 0; queue < qmap->nr_queues; queue++) {
+		mask = dev->bus->irq_get_affinity(dev, queue + offset);
+		if (!mask)
+			goto fallback;
+
+		for_each_cpu(cpu, mask)
+			qmap->mq_map[cpu] = qmap->queue_offset + queue;
+	}
+
+	return;
+
+fallback:
+	WARN_ON_ONCE(qmap->nr_queues > 1);
+	blk_mq_clear_mq_map(qmap);
+}
+EXPORT_SYMBOL_GPL(blk_mq_hctx_map_queues);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 2035fad3131fb60781957095ce8a3a941dd104be..1a85fdcb443c154390cd29f2b1f2a807bf10bfe3 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -923,6 +923,8 @@ void blk_mq_unfreeze_queue_non_owner(struct request_queue *q);
void blk_freeze_queue_start_non_owner(struct request_queue *q);
void blk_mq_map_queues(struct blk_mq_queue_map *qmap);
+void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
+			    struct device *dev, unsigned int offset);
void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
void blk_mq_quiesce_queue_nowait(struct request_queue *q);
--
2.47.0
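
For readers without a kernel tree at hand, the mapping loop in the patch can be modelled in user space. The following is only a minimal sketch under stated assumptions: `model_qmap`, `model_affinity_fn`, `model_two_queue_affinity`, `model_clear_map` and `model_map_queues` are hypothetical names, a plain bitmask stands in for `struct cpumask`, the fallback simply points every CPU at queue 0, and the model returns a status code where the function in the patch returns void.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 8u

/* Hypothetical stand-in for struct blk_mq_queue_map (not the kernel type). */
struct model_qmap {
	unsigned int mq_map[MODEL_NR_CPUS];	/* CPU -> hardware queue index */
	unsigned int nr_queues;
	unsigned int queue_offset;
};

/*
 * Hypothetical stand-in for the bus_type irq_get_affinity callback.
 * Returns a bitmask of CPUs for the given vector, or 0 for "no mask".
 */
typedef unsigned long (*model_affinity_fn)(unsigned int vec);

/* Example affinity: vector 0 -> CPUs 0-3, vector 1 -> CPUs 4-7. */
unsigned long model_two_queue_affinity(unsigned int vec)
{
	if (vec == 0)
		return 0x0fUL;
	if (vec == 1)
		return 0xf0UL;
	return 0;
}

/* Fallback assumed for this model: point every CPU at queue 0. */
void model_clear_map(struct model_qmap *qmap)
{
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		qmap->mq_map[cpu] = 0;
}

/*
 * Returns 0 when the affinity-based mapping was used and -1 when the
 * model fell back. (The function in the patch returns void.)
 */
int model_map_queues(struct model_qmap *qmap,
		     model_affinity_fn get_affinity,
		     unsigned int offset)
{
	assert(qmap != NULL);

	if (!get_affinity)
		goto fallback;

	for (unsigned int queue = 0; queue < qmap->nr_queues; queue++) {
		unsigned long mask = get_affinity(queue + offset);

		if (!mask)
			goto fallback;

		/* Every CPU in the vector's mask is served by this queue. */
		for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			if (mask & (1UL << cpu))
				qmap->mq_map[cpu] = qmap->queue_offset + queue;
	}

	return 0;

fallback:
	model_clear_map(qmap);
	return -1;
}
```

With `nr_queues = 2` and `offset = 0`, the example callback leaves CPUs 0-3 on queue 0 and CPUs 4-7 on queue 1. A missing callback or an empty mask for any vector takes the fallback path instead.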
On 11/12/24 14:26, Daniel Wagner wrote:
> blk_mq_pci_map_queues and blk_mq_virtio_map_queues create a CPU to
> hardware queue mapping based on affinity information. These two functions
> share common code and differ only in how the affinity information is
> retrieved. Also, those functions are located in the block subsystem,
> where they don't really fit. They are virtio and pci subsystem
> specific.
>
> Thus introduce a generic mapping function which uses the
> irq_get_affinity callback from bus_type.
>
> Original idea from Ming Lei <ming.lei@redhat.com>
>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
>  block/blk-mq-cpumap.c | 37 +++++++++++++++++++++++++++++++++++++
>  include/linux/blk-mq.h | 2 ++
>  2 files changed, 39 insertions(+)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
The subject prefix obviously has a typo; it should start with 'blk-mq:'.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Nov 12, 2024 at 02:26:19PM +0100, Daniel Wagner wrote:
> blk_mq_pci_map_queues and blk_mq_virtio_map_queues create a CPU to
> hardware queue mapping based on affinity information. These two functions
> share common code and differ only in how the affinity information is
> retrieved. Also, those functions are located in the block subsystem,
> where they don't really fit. They are virtio and pci subsystem
> specific.
>
> Thus introduce a generic mapping function which uses the
> irq_get_affinity callback from bus_type.
>
> Original idea from Ming Lei <ming.lei@redhat.com>
>
> Signed-off-by: Daniel Wagner <wagi@kernel.org>
> ---
>  block/blk-mq-cpumap.c | 37 +++++++++++++++++++++++++++++++++++++
>  include/linux/blk-mq.h | 2 ++
>  2 files changed, 39 insertions(+)
>
> diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c
> index 9638b25fd52124f0173e968ebdca5f1fe0b42ad9..db22a7d523a2762b76398fdd768f55efd1d6d669 100644
> --- a/block/blk-mq-cpumap.c
> +++ b/block/blk-mq-cpumap.c
> @@ -11,6 +11,7 @@
>  #include <linux/smp.h>
>  #include <linux/cpu.h>
>  #include <linux/group_cpus.h>
> +#include <linux/device/bus.h>
>
>  #include "blk.h"
>  #include "blk-mq.h"
> @@ -54,3 +55,39 @@ int blk_mq_hw_queue_to_node(struct blk_mq_queue_map *qmap, unsigned int index)
>
>  	return NUMA_NO_NODE;
>  }
> +
> +/**
> + * blk_mq_hctx_map_queues - Create CPU to hardware queue mapping
> + * @qmap:	CPU to hardware queue map.
> + * @dev:	The device to map queues.
> + * @offset:	Queue offset to use for the device.
> + *
> + * Create a CPU to hardware queue mapping in @qmap. The struct bus_type
> + * irq_get_affinity callback will be used to retrieve the affinity.
> + */
> +void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
> +			    struct device *dev, unsigned int offset)
> +
> +{
> +	const struct cpumask *mask;
> +	unsigned int queue, cpu;
> +
> +	if (!dev->bus->irq_get_affinity)
> +		goto fallback;

I think this is better than hard-coding it, but are you sure that the
bus will always be bound to the device here so that you have a valid
bus-> pointer?

thanks,

greg k-h
On Tue, Nov 12, 2024 at 02:58:43PM +0100, Greg Kroah-Hartman wrote:
> > +void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
> > +			    struct device *dev, unsigned int offset)
> > +
> > +{
> > +	const struct cpumask *mask;
> > +	unsigned int queue, cpu;
> > +
> > +	if (!dev->bus->irq_get_affinity)
> > +		goto fallback;
>
> I think this is better than hard-coding it, but are you sure that the
> bus will always be bound to the device here so that you have a valid
> bus-> pointer?

No, I just assumed the bus pointer is always valid. If it is possible to
have a device without a bus, then I'd better extend the condition to

	if (!dev->bus || !dev->bus->irq_get_affinity)
		goto fallback;
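
The effect of checking the bus pointer before the callback can be illustrated outside the kernel. This is a rough sketch only: `model_bus`, `model_device`, `model_dummy_affinity` and `model_has_affinity_cb` are hypothetical user-space stand-ins, not kernel types or functions. It relies on `&&` short-circuiting, so the callback field is never read when the bus pointer is NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bus_type. */
struct model_bus {
	const void *(*irq_get_affinity)(void *dev, unsigned int vec);
};

/* Hypothetical stand-in for struct device. */
struct model_device {
	const struct model_bus *bus;	/* may be NULL in this model */
};

/* Placeholder callback so a fully populated bus can be constructed. */
const void *model_dummy_affinity(void *dev, unsigned int vec)
{
	(void)dev;
	(void)vec;
	return NULL;
}

/*
 * True only when both the bus pointer and its callback are set.
 * Because && short-circuits, dev->bus is not dereferenced when it is NULL.
 */
bool model_has_affinity_cb(const struct model_device *dev)
{
	return dev->bus && dev->bus->irq_get_affinity;
}
```

A caller modelled on the patch would take the fallback path whenever `model_has_affinity_cb()` returns false, which covers both a device without a bus and a bus without the callback.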
On Tue, Nov 12, 2024 at 04:33:09PM +0100, Daniel Wagner wrote:
> On Tue, Nov 12, 2024 at 02:58:43PM +0100, Greg Kroah-Hartman wrote:
> > > +void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
> > > +			    struct device *dev, unsigned int offset)
> > > +
> > > +{
> > > +	const struct cpumask *mask;
> > > +	unsigned int queue, cpu;
> > > +
> > > +	if (!dev->bus->irq_get_affinity)
> > > +		goto fallback;
> >
> > I think this is better than hard-coding it, but are you sure that the
> > bus will always be bound to the device here so that you have a valid
> > bus-> pointer?
>
> No, I just assumed the bus pointer is always valid. If it is possible to
> have a device without a bus, then I'd better extend the condition to
>
> 	if (!dev->bus || !dev->bus->irq_get_affinity)
> 		goto fallback;

I don't know if it's possible as I don't know what codepaths are calling
this, it was hard to unwind. But you should check "just to be safe" :)

thanks,

greg k-h
On Tue, Nov 12, 2024 at 04:42:40PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Nov 12, 2024 at 04:33:09PM +0100, Daniel Wagner wrote:
> > On Tue, Nov 12, 2024 at 02:58:43PM +0100, Greg Kroah-Hartman wrote:
> > > > +void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
> > > > +			    struct device *dev, unsigned int offset)
> > > > +
> > > > +{
> > > > +	const struct cpumask *mask;
> > > > +	unsigned int queue, cpu;
> > > > +
> > > > +	if (!dev->bus->irq_get_affinity)
> > > > +		goto fallback;
> > >
> > > I think this is better than hard-coding it, but are you sure that the
> > > bus will always be bound to the device here so that you have a valid
> > > bus-> pointer?
> >
> > No, I just assumed the bus pointer is always valid. If it is possible to
> > have a device without a bus, then I'd better extend the condition to
> >
> > 	if (!dev->bus || !dev->bus->irq_get_affinity)
> > 		goto fallback;
>
> I don't know if it's possible as I don't know what codepaths are calling
> this, it was hard to unwind. But you should check "just to be safe" :)

The main path to map_queues is via the probe functions. There are some
more paths, like when updating a tagset after the number of queues
changes, but that is all after the probe function.

nvme_probe
  nvme_alloc_admin_tag_set
    blk_mq_alloc_tag_set
      blk_mq_update_queue_map
        set->ops->map_queues
          blk_mq_hctx_map_queues
  nvme_alloc_io_tag_set
    blk_mq_alloc_tag_set
      blk_mq_update_queue_map
        set->ops->map_queues
          blk_mq_hctx_map_queues

virtscsi_probe, hisi_sas_v3_probe, ...
  scsi_add_host
    scsi_add_host_with_dma
      scsi_mq_setup_tags
        blk_mq_alloc_tag_set
          blk_mq_update_queue_map
            set->ops->map_queues
              blk_mq_hctx_map_queues

virtblk_probe
  blk_mq_alloc_tag_set
    blk_mq_update_queue_map
      set->ops->map_queues
        blk_mq_hctx_map_queues

Does this help?
On Tue, Nov 12, 2024 at 05:15:31PM +0100, Daniel Wagner wrote:
> On Tue, Nov 12, 2024 at 04:42:40PM +0100, Greg Kroah-Hartman wrote:
> > On Tue, Nov 12, 2024 at 04:33:09PM +0100, Daniel Wagner wrote:
> > > On Tue, Nov 12, 2024 at 02:58:43PM +0100, Greg Kroah-Hartman wrote:
> > > > > +void blk_mq_hctx_map_queues(struct blk_mq_queue_map *qmap,
> > > > > +			    struct device *dev, unsigned int offset)
> > > > > +
> > > > > +{
> > > > > +	const struct cpumask *mask;
> > > > > +	unsigned int queue, cpu;
> > > > > +
> > > > > +	if (!dev->bus->irq_get_affinity)
> > > > > +		goto fallback;
> > > >
> > > > I think this is better than hard-coding it, but are you sure that the
> > > > bus will always be bound to the device here so that you have a valid
> > > > bus-> pointer?
> > >
> > > No, I just assumed the bus pointer is always valid. If it is possible to
> > > have a device without a bus, then I'd better extend the condition to
> > >
> > > 	if (!dev->bus || !dev->bus->irq_get_affinity)
> > > 		goto fallback;
> >
> > I don't know if it's possible as I don't know what codepaths are calling
> > this, it was hard to unwind. But you should check "just to be safe" :)
>
> The main path to map_queues is via the probe functions. There are some
> more paths, like when updating a tagset after the number of queues
> changes, but that is all after the probe function.
>
> nvme_probe
>   nvme_alloc_admin_tag_set
>     blk_mq_alloc_tag_set
>       blk_mq_update_queue_map
>         set->ops->map_queues
>           blk_mq_hctx_map_queues
>   nvme_alloc_io_tag_set
>     blk_mq_alloc_tag_set
>       blk_mq_update_queue_map
>         set->ops->map_queues
>           blk_mq_hctx_map_queues
>
> virtscsi_probe, hisi_sas_v3_probe, ...
>   scsi_add_host
>     scsi_add_host_with_dma
>       scsi_mq_setup_tags
>         blk_mq_alloc_tag_set
>           blk_mq_update_queue_map
>             set->ops->map_queues
>               blk_mq_hctx_map_queues
>
> virtblk_probe
>   blk_mq_alloc_tag_set
>     blk_mq_update_queue_map
>       set->ops->map_queues
>         blk_mq_hctx_map_queues
>
> Does this help?

Ok, that seems fine.

Worst case, you crash and it's obvious that it needs to be checked in
the future :)

thanks,

greg k-h