Group some variables based on their sizes to reduce holes.
On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
bytes.
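
As a minimal illustration of the technique (a generic sketch, not the
kernel struct): on LP64 targets, ordering members by decreasing
alignment removes the padding the compiler would otherwise insert.

struct unpacked {               /* sizeof == 24 on LP64 */
        int     a;              /*  0: 4 bytes + 4-byte hole (p needs 8-byte alignment) */
        void    *p;             /*  8: 8 bytes */
        int     b;              /* 16: 4 bytes + 4 bytes tail padding (struct align is 8) */
};

struct packed_by_size {         /* sizeof == 16 on LP64 */
        void    *p;             /*  0: 8 bytes */
        int     a;              /*  8: 4 bytes */
        int     b;              /* 12: 4 bytes, no padding left */
};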
When such a structure is allocated in nvmet_ns_alloc(), kmalloc()
rounds the request up to the next power-of-two size class, so a
520-byte request was served from the 1024-byte slab cache. Repacked to
512 bytes, the structure now fits in the 512-byte cache.
So, on x86_64, this change saves 512 bytes per allocation.
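
A user-space sketch of the size-class rounding involved (simplified:
kmalloc's generic caches above 256 bytes are powers of two; the 96- and
192-byte caches are ignored here):

#include <stdio.h>

/* Simplified model of kmalloc() picking a power-of-two size class.
 * The real allocator also has 96- and 192-byte caches, skipped here. */
static size_t size_class(size_t request)
{
        size_t class = 8;

        while (class < request)
                class <<= 1;
        return class;
}

int main(void)
{
        printf("520 -> %zu\n", size_class(520)); /* 1024: old layout */
        printf("512 -> %zu\n", size_class(512)); /*  512: new layout */
        return 0;
}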
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
More aggressive grouping could be done to be more future-proof, but the
way struct nvmet_ns is written suggests that some fields should be
kept together. So keep the grouping as-is.
Using pahole
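
For reference, layouts like the ones below can be regenerated with
something along the lines of (an assumed invocation; any object built
with debug info works, e.g. with CONFIG_DEBUG_INFO enabled):

        pahole -C nvmet_ns vmlinux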
Before:
======
struct nvmet_ns {
        struct percpu_ref          ref;                  /*     0    16 */
        struct block_device *      bdev;                 /*    16     8 */
        struct file *              file;                 /*    24     8 */
        bool                       readonly;             /*    32     1 */

        /* XXX 3 bytes hole, try to pack */

        u32                        nsid;                 /*    36     4 */
        u32                        blksize_shift;        /*    40     4 */

        /* XXX 4 bytes hole, try to pack */

        loff_t                     size;                 /*    48     8 */
        u8                         nguid[16];            /*    56    16 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        uuid_t                     uuid;                 /*    72    16 */
        u32                        anagrpid;             /*    88     4 */
        bool                       buffered_io;          /*    92     1 */
        bool                       enabled;              /*    93     1 */

        /* XXX 2 bytes hole, try to pack */

        struct nvmet_subsys *      subsys;               /*    96     8 */
        const char *               device_path;          /*   104     8 */
        struct config_group        device_group;         /*   112   136 */
        /* --- cacheline 3 boundary (192 bytes) was 56 bytes ago --- */
        struct config_group        group;                /*   248   136 */
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct completion          disable_done;         /*   384    96 */
        /* --- cacheline 7 boundary (448 bytes) was 32 bytes ago --- */
        mempool_t *                bvec_pool;            /*   480     8 */
        int                        use_p2pmem;           /*   488     4 */

        /* XXX 4 bytes hole, try to pack */

        struct pci_dev *           p2p_dev;              /*   496     8 */
        int                        pi_type;              /*   504     4 */
        int                        metadata_size;        /*   508     4 */
        /* --- cacheline 8 boundary (512 bytes) --- */
        u8                         csi;                  /*   512     1 */

        /* size: 520, cachelines: 9, members: 23 */
        /* sum members: 500, holes: 4, sum holes: 13 */
        /* padding: 7 */
        /* last cacheline: 8 bytes */
};
After:
=====
struct nvmet_ns {
        struct percpu_ref          ref;                  /*     0    16 */
        struct block_device *      bdev;                 /*    16     8 */
        struct file *              file;                 /*    24     8 */
        bool                       readonly;             /*    32     1 */

        /* XXX 3 bytes hole, try to pack */

        u32                        nsid;                 /*    36     4 */
        u32                        blksize_shift;        /*    40     4 */

        /* XXX 4 bytes hole, try to pack */

        loff_t                     size;                 /*    48     8 */
        u8                         nguid[16];            /*    56    16 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        uuid_t                     uuid;                 /*    72    16 */
        u32                        anagrpid;             /*    88     4 */
        bool                       buffered_io;          /*    92     1 */
        bool                       enabled;              /*    93     1 */

        /* XXX 2 bytes hole, try to pack */

        struct nvmet_subsys *      subsys;               /*    96     8 */
        const char *               device_path;          /*   104     8 */
        struct config_group        device_group;         /*   112   136 */
        /* --- cacheline 3 boundary (192 bytes) was 56 bytes ago --- */
        struct config_group        group;                /*   248   136 */
        /* --- cacheline 6 boundary (384 bytes) --- */
        struct completion          disable_done;         /*   384    96 */
        /* --- cacheline 7 boundary (448 bytes) was 32 bytes ago --- */
        mempool_t *                bvec_pool;            /*   480     8 */
        struct pci_dev *           p2p_dev;              /*   488     8 */
        int                        use_p2pmem;           /*   496     4 */
        int                        pi_type;              /*   500     4 */
        int                        metadata_size;        /*   504     4 */
        u8                         csi;                  /*   508     1 */

        /* size: 512, cachelines: 8, members: 23 */
        /* sum members: 500, holes: 3, sum holes: 9 */
        /* padding: 3 */
};
---
drivers/nvme/target/nvmet.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index dc60a22646f7..c50146085fb5 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -79,8 +79,8 @@ struct nvmet_ns {
 	struct completion	disable_done;
 	mempool_t		*bvec_pool;
-	int			use_p2pmem;
 	struct pci_dev		*p2p_dev;
+	int			use_p2pmem;
 	int			pi_type;
 	int			metadata_size;
 	u8			csi;
--
2.34.1
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
(+Keith, for host side)

> ---
> More aggressive grouping could be done to be more future-proof, but the
> way struct nvmet_ns is written suggests that some fields should be
> kept together. So keep the grouping as-is.

you can send an RFC and I'll be happy to take a look if it's going to
have any benefit, it will take some time though..

for host side :-

while you are at it, it might be useful to take a look at the structures
that are accessed in the fast path on the host side ? unless there is
some reason for not doing that ...

-ck
On 28/04/2023 01:12, Chaitanya Kulkarni wrote:
> (+Keith, for host side)
>> ---
>> More aggressive grouping could be done to be more future-proof, but the
>> way struct nvmet_ns is written suggests that some fields should be
>> kept together. So keep the grouping as-is.
>
> you can send an RFC and I'll be happy to take a look if it's going to
> have any benefit, it will take some time though..
>
> for host side :-
>
> while you are at it, it might be useful to take a look at the structures
> that are accessed in the fast path on the host side ?

Ok, why not, but can you help identifying these structures or places
considered as fast path?

CJ

> unless there is some reason for not doing that ...
>
> -ck
On 4/28/23 00:33, Christophe JAILLET wrote:
> On 28/04/2023 01:12, Chaitanya Kulkarni wrote:
>> (+Keith, for host side)
>>> ---
>>> More aggressive grouping could be done to be more future-proof, but the
>>> way struct nvmet_ns is written suggests that some fields should be
>>> kept together. So keep the grouping as-is.
>>
>> you can send an RFC and I'll be happy to take a look if it's going to
>> have any benefit, it will take some time though..
>>
>> for host side :-
>>
>> while you are at it, it might be useful to take a look at the structures
>> that are accessed in the fast path on the host side ?
>
> Ok, why not, but can you help identifying these structures or places
> considered as fast path?
>
> CJ

you can start with nvme_ns/nvme_ctrl as I remember nvme_ns is used in
nvme_setup_rw_cmd() on the host and nvme_ctrl is used in
nvmet_passthru_execute_cmd().

In general nvme_queue_rq()/nvme_rdma_queue_rq()/
nvmet_bdev_execute_rw()/nvmet_file_execute_rw()/
nvmet_passthru_execute_cmd() are the functions to start with, then you
can look at structs starting with the nvme_ prefix (mainly related to
ns and ctrl) which are worth taking a look at for potential benefits,
but be careful what you are moving ...

Ohh and nvmet_req, that one also ...

-ck
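
(For instance, a first pass at the host-side structures mentioned above
could be a plain pahole dump of their layouts; an assumed invocation,
requiring an object or vmlinux built with debug info:

        pahole -C nvme_ns vmlinux
        pahole -C nvme_ctrl vmlinux
)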
On 4/27/23 12:47, Christophe JAILLET wrote:
> Group some variables based on their sizes to reduce holes.
> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
> bytes.

Although this looks good, we at least need to document what happens on
other arch(s) that are not mentioned in the commit log ? is there a
possibility that someone might come up with contradictory data in the
future for the arch(s) that are not mentioned here ?

-ck
On 4/27/23 4:59 PM, Chaitanya Kulkarni wrote:
> On 4/27/23 12:47, Christophe JAILLET wrote:
>> Group some variables based on their sizes to reduce holes.
>> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
>> bytes.
>
> Although this looks good, we at least need to document what happens on
> other arch(s) that are not mentioned in the commit log ? is there a
> possibility that someone might come up with contradictory data in the
> future for the arch(s) that are not mentioned here ?

The change is certainly not going to make things _worse_ for any arch,
so I think that's somewhat of a pointless exercise and an unreasonable
ask for something that makes sense on 64-bit arm/x86 and saves half the
space.

-- 
Jens Axboe
On 4/27/23 16:01, Jens Axboe wrote:
> On 4/27/23 4:59 PM, Chaitanya Kulkarni wrote:
>> On 4/27/23 12:47, Christophe JAILLET wrote:
>>> Group some variables based on their sizes to reduce holes.
>>> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
>>> bytes.
>>
>> Although this looks good, we at least need to document what happens on
>> other arch(s) that are not mentioned in the commit log ? is there a
>> possibility that someone might come up with contradictory data in the
>> future for the arch(s) that are not mentioned here ?
>
> The change is certainly not going to make things _worse_ for any arch,
> so I think that's somewhat of a pointless exercise and an unreasonable
> ask for something that makes sense on 64-bit arm/x86 and saves half the
> space.

disregard my comment, looks good...

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>

-ck
On 28/04/2023 01:07, Chaitanya Kulkarni wrote:
> On 4/27/23 16:01, Jens Axboe wrote:
>> On 4/27/23 4:59 PM, Chaitanya Kulkarni wrote:
>>> On 4/27/23 12:47, Christophe JAILLET wrote:
>>>> Group some variables based on their sizes to reduce holes.
>>>> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
>>>> bytes.
>>>
>>> Although this looks good, we at least need to document what happens on
>>> other arch(s) that are not mentioned in the commit log ? is there a
>>> possibility that someone might come up with contradictory data in the
>>> future for the arch(s) that are not mentioned here ?
>> The change is certainly not going to make things _worse_ for any arch,
>> so I think that's somewhat of a pointless exercise and an unreasonable
>> ask for something that makes sense on 64-bit arm/x86 and saves half the
>> space.
>
> disregard my comment, looks good...
>
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>
> -ck

Hi,

All my other nvmet patches have reached -next today, but this one seems
to be missing.

So this is a gentle reminder, in case it got lost somewhere.

CJ
On Mon, Jun 19, 2023 at 08:08:38PM +0200, Christophe JAILLET wrote:
> All my other nvmet patches have reached -next today, but this one seems
> to be missing.
>
> So this is a gentle reminder, in case it got lost somewhere.

Oops, thanks for the catch. I'll queue this up for the next 6.5 pull
request.
Queued up now for nvme-6.5.
On 6/19/2023 11:08 AM, Christophe JAILLET wrote:
> On 28/04/2023 01:07, Chaitanya Kulkarni wrote:
>> On 4/27/23 16:01, Jens Axboe wrote:
>>> On 4/27/23 4:59 PM, Chaitanya Kulkarni wrote:
>>>> On 4/27/23 12:47, Christophe JAILLET wrote:
>>>>> Group some variables based on their sizes to reduce holes.
>>>>> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
>>>>> bytes.
>>>>
>>>> Although this looks good, we at least need to document what happens on
>>>> other arch(s) that are not mentioned in the commit log ? is there a
>>>> possibility that someone might come up with contradictory data in the
>>>> future for the arch(s) that are not mentioned here ?
>>> The change is certainly not going to make things _worse_ for any arch,
>>> so I think that's somewhat of a pointless exercise and an unreasonable
>>> ask for something that makes sense on 64-bit arm/x86 and saves half the
>>> space.
>>
>> disregard my comment, looks good...
>>
>> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>>
>> -ck
>
> Hi,
>
> All my other nvmet patches have reached -next today, but this one seems
> to be missing.
>
> So this is a gentle reminder, in case it got lost somewhere.
>
> CJ

I believe this patch can still be applied as-is on top of nvme-6.5..

-ck
On 4/27/23 1:47 PM, Christophe JAILLET wrote:
> Group some variables based on their sizes to reduce holes.
> On x86_64, this shrinks the size of 'struct nvmet_ns' from 520 to 512
> bytes.
>
> When such a structure is allocated in nvmet_ns_alloc(), kmalloc()
> rounds the request up to the next power-of-two size class, so a
> 520-byte request was served from the 1024-byte slab cache. Repacked to
> 512 bytes, the structure now fits in the 512-byte cache.
>
> So, on x86_64, this change saves 512 bytes per allocation.
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> More aggressive grouping could be done to be more future-proof, but the
> way struct nvmet_ns is written suggests that some fields should be
> kept together. So keep the grouping as-is.

I think you did the right thing, that move doesn't matter and it brings
it to pow-of-2 or less and that is really what matters. So looks fine
to me:

Reviewed-by: Jens Axboe <axboe@kernel.dk>

-- 
Jens Axboe