drivers/scsi/scsi_debug.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
The tape setup path writes partition metadata one element past the
allocated tape_blocks array when a one-partition configuration is
selected.
That corrupts adjacent state during device initialization before any
command is issued.
Reject a declared multi-partition layout that has no space for partition
1, and initialize partition 1's marker only when partition 1 exists.
Assisted-by: Codex:gpt-5.5-cyber-preview
Signed-off-by: Samuel Moelius <sam.moelius@trailofbits.com>
---
Changes in v2
- Fixed handling of part_1_size == 0 case
drivers/scsi/scsi_debug.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 1515495fd9ea..edcc2f5f6977 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -3661,12 +3661,18 @@ static int partition_tape(struct sdebug_dev_info *devip, int nbr_partitions,
if (part_0_size + part_1_size > TAPE_UNITS)
return -1;
+ if (nbr_partitions > 1 && part_1_size <= 0)
+ return -1;
devip->tape_eop[0] = part_0_size;
devip->tape_blocks[0]->fl_size = TAPE_BLOCK_EOD_FLAG;
devip->tape_eop[1] = part_1_size;
- devip->tape_blocks[1] = devip->tape_blocks[0] +
- devip->tape_eop[0];
- devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
+ if (nbr_partitions > 1) {
+ devip->tape_blocks[1] = devip->tape_blocks[0] +
+ devip->tape_eop[0];
+ devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
+ } else {
+ devip->tape_blocks[1] = NULL;
+ }
for (i = 0 ; i < TAPE_MAX_PARTITIONS; i++)
devip->tape_location[i] = 0;
--
2.43.0
On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote: > The tape setup path writes partition metadata one element past the > allocated tape_blocks array when a one-partition configuration is > selected. > > That corrupts adjacent state during device initialization before any > command is issued. I still don't get what the actual problem is. For a single partition tape I can't see where scsi_debug would actually do anything with tape_blocks[1]. What is it that you're seeing when using scsi_debug that motivates this? Regards, James
On Thu, Jun 4, 2026 at 9:38 AM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote:
> > The tape setup path writes partition metadata one element past the
> > allocated tape_blocks array when a one-partition configuration is
> > selected.
> >
> > That corrupts adjacent state during device initialization before any
> > command is issued.
>
> I still don't get what the actual problem is. For a single partition
> tape I can't see where scsi_debug would actually do anything with
> tape_blocks[1]. What is it that you're seeing when using scsi_debug
> that motivates this?
The bug is a kernel OOB write. I can share a PoC if desired. The PoC
sends this SCSI command through /dev/sgN:
unsigned char format_medium[6] = { 0x04, 0, 0, 0, 0, 0 };
rc = sg_cmd(fd, "format_medium_one_partition",
format_medium, sizeof(format_medium));
Inside sg_cmd(), that becomes an SG_IO ioctl:
hdr.interface_id = 'S';
hdr.dxfer_direction = SG_DXFER_NONE;
hdr.cmd_len = cdb_len;
hdr.cmdp = cdb;
hdr.timeout = 10000;
ioctl(fd, SG_IO, &hdr);
For scsi_debug tape devices, opcode 0x04 dispatches here:
{ 0, 0x4, 0, DS_SSC, 0, resp_format_medium, NULL,
/* FORMAT MEDIUM (6) */ }
resp_format_medium() sees cmd[2] == 0, meaning the default
one-partition format path:
if (cmd[2] != 0) {
...
} else {
res = partition_tape(devip, 1, TAPE_UNITS, 0);
}
So partition_tape() is called with:
nbr_partitions = 1
part_0_size = TAPE_UNITS = 10000
part_1_size = 0
The unpatched code only checked total size:
if (part_0_size + part_1_size > TAPE_UNITS)
return -1;
That passes:
10000 + 0 == 10000
Then it initializes partition 0:
devip->tape_eop[0] = part_0_size;
devip->tape_blocks[0]->fl_size = TAPE_BLOCK_EOD_FLAG;
Then the bug: it initializes partition 1 even though there is only one
partition:
devip->tape_eop[1] = part_1_size;
devip->tape_blocks[1] = devip->tape_blocks[0] +
devip->tape_eop[0];
devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
Because devip->tape_eop[0] == 10000, this computes:
devip->tape_blocks[1] = devip->tape_blocks[0] + 10000
But the allocation has only 10000 elements. So this write is one
element past the allocation.
> On 4. Jun 2026, at 21.33, Samuel Moelius <sam.moelius@trailofbits.com> wrote: > > On Thu, Jun 4, 2026 at 9:38 AM James Bottomley > <James.Bottomley@hansenpartnership.com> wrote: >> >> On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote: >>> The tape setup path writes partition metadata one element past the >>> allocated tape_blocks array when a one-partition configuration is >>> selected. >>> >>> That corrupts adjacent state during device initialization before any >>> command is issued. >> >> I still don't get what the actual problem is. For a single partition >> tape I can't see where scsi_debug would actually do anything with >> tape_blocks[1]. What is it that you're seeing when using scsi_debug >> that motivates this? > > The bug is a kernel OOB write. I can share a PoC if desired. The PoC > sends this SCSI command through /dev/sgN: > > ... > Then the bug: it initializes partition 1 even though there is only one > partition: > > devip->tape_eop[1] = part_1_size; > devip->tape_blocks[1] = devip->tape_blocks[0] + > devip->tape_eop[0]; > devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG; > > Because devip->tape_eop[0] == 10000, this computes: > > devip->tape_blocks[1] = devip->tape_blocks[0] + 10000 > > But the allocation has only 10000 elements. So this write is one > element past the allocation. OK. The bug is not initialization of the pointer but writing the fl_size using the pointer. Good catch! But the patch is not quite correct. If nbr_partitions == 2 and partition_1_size == 0, it sets tape_nbr_partitions = 2 but does not initialize the second partition. This will cause problems. I think partition_tape() should return -1 if nbr_partitions > 0 && part_1_size == 0. All call sites of partition_page() check for error, but the error case has never happened. The code should be checked so that an error return does not cause problems later. Thanks, Kai
On Thu, 2026-06-04 at 22:14 +0300, Kai Mäkisara (Kolumbus) wrote:
>
> > On 4. Jun 2026, at 21.33, Samuel Moelius
> > <sam.moelius@trailofbits.com> wrote:
> >
> > On Thu, Jun 4, 2026 at 9:38 AM James Bottomley
> > <James.Bottomley@hansenpartnership.com> wrote:
> > >
> > > On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote:
> > > > The tape setup path writes partition metadata one element past
> > > > the
> > > > allocated tape_blocks array when a one-partition configuration
> > > > is
> > > > selected.
> > > >
> > > > That corrupts adjacent state during device initialization
> > > > before any
> > > > command is issued.
> > >
> > > I still don't get what the actual problem is. For a single
> > > partition
> > > tape I can't see where scsi_debug would actually do anything with
> > > tape_blocks[1]. What is it that you're seeing when using
> > > scsi_debug
> > > that motivates this?
> >
> > The bug is a kernel OOB write. I can share a PoC if desired. The
> > PoC
> > sends this SCSI command through /dev/sgN:
> >
> > ...
>
> > Then the bug: it initializes partition 1 even though there is only
> > one
> > partition:
> >
> > devip->tape_eop[1] = part_1_size;
> > devip->tape_blocks[1] = devip->tape_blocks[0] +
> > devip->tape_eop[0];
> > devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
> >
> > Because devip->tape_eop[0] == 10000, this computes:
> >
> > devip->tape_blocks[1] = devip->tape_blocks[0] + 10000
> >
> > But the allocation has only 10000 elements. So this write is one
> > element past the allocation.
>
> OK. The bug is not initialization of the pointer but writing the
> fl_size using the pointer. Good catch!
Isn't the fix actually to allocate an extra block for the EOF:
@@ -6648,7 +6648,7 @@ static int scsi_debug_sdev_configure(struct scsi_device *sdp,
if (sdebug_ptype == TYPE_TAPE) {
if (!devip->tape_blocks[0]) {
devip->tape_blocks[0] =
- kzalloc_objs(struct tape_block, TAPE_UNITS);
+ kzalloc_objs(struct tape_block, TAPE_UNITS + 1);
if (!devip->tape_blocks[0])
return 1;
?
Regards,
James
On Thu, Jun 4, 2026 at 3:29 PM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Thu, 2026-06-04 at 22:14 +0300, Kai Mäkisara (Kolumbus) wrote:
> >
> > > On 4. Jun 2026, at 21.33, Samuel Moelius
> > > <sam.moelius@trailofbits.com> wrote:
> > >
> > > On Thu, Jun 4, 2026 at 9:38 AM James Bottomley
> > > <James.Bottomley@hansenpartnership.com> wrote:
> > > >
> > > > On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote:
> > > > > The tape setup path writes partition metadata one element past
> > > > > the
> > > > > allocated tape_blocks array when a one-partition configuration
> > > > > is
> > > > > selected.
> > > > >
> > > > > That corrupts adjacent state during device initialization
> > > > > before any
> > > > > command is issued.
> > > >
> > > > I still don't get what the actual problem is. For a single
> > > > partition
> > > > tape I can't see where scsi_debug would actually do anything with
> > > > tape_blocks[1]. What is it that you're seeing when using
> > > > scsi_debug
> > > > that motivates this?
> > >
> > > The bug is a kernel OOB write. I can share a PoC if desired. The
> > > PoC
> > > sends this SCSI command through /dev/sgN:
> > >
> > > ...
> >
> > > Then the bug: it initializes partition 1 even though there is only
> > > one
> > > partition:
> > >
> > > devip->tape_eop[1] = part_1_size;
> > > devip->tape_blocks[1] = devip->tape_blocks[0] +
> > > devip->tape_eop[0];
> > > devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
> > >
> > > Because devip->tape_eop[0] == 10000, this computes:
> > >
> > > devip->tape_blocks[1] = devip->tape_blocks[0] + 10000
> > >
> > > But the allocation has only 10000 elements. So this write is one
> > > element past the allocation.
> >
> > OK. The bug is not initialization of the pointer but writing the
> > fl_size using the pointer. Good catch!
>
> Isn't the fix actually to allocate an extra block for the EOF:
>
> @@ -6648,7 +6648,7 @@ static int scsi_debug_sdev_configure(struct scsi_device *sdp,
> if (sdebug_ptype == TYPE_TAPE) {
> if (!devip->tape_blocks[0]) {
> devip->tape_blocks[0] =
> - kzalloc_objs(struct tape_block, TAPE_UNITS);
> + kzalloc_objs(struct tape_block, TAPE_UNITS + 1);
> if (!devip->tape_blocks[0])
> return 1;
I'll send a v3 that uses that approach.
> On 4. Jun 2026, at 22.29, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
>
> On Thu, 2026-06-04 at 22:14 +0300, Kai Mäkisara (Kolumbus) wrote:
>>
>>> On 4. Jun 2026, at 21.33, Samuel Moelius
>>> <sam.moelius@trailofbits.com> wrote:
>>>
>>> On Thu, Jun 4, 2026 at 9:38 AM James Bottomley
>>> <James.Bottomley@hansenpartnership.com> wrote:
>>>>
>>>> On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote:
>>>>> The tape setup path writes partition metadata one element past
>>>>> the
>>>>> allocated tape_blocks array when a one-partition configuration
>>>>> is
>>>>> selected.
>>>>>
>>>>> That corrupts adjacent state during device initialization
>>>>> before any
>>>>> command is issued.
>>>>
>>>> I still don't get what the actual problem is. For a single
>>>> partition
>>>> tape I can't see where scsi_debug would actually do anything with
>>>> tape_blocks[1]. What is it that you're seeing when using
>>>> scsi_debug
>>>> that motivates this?
>>>
>>> The bug is a kernel OOB write. I can share a PoC if desired. The
>>> PoC
>>> sends this SCSI command through /dev/sgN:
>>>
>>> ...
>>
>>> Then the bug: it initializes partition 1 even though there is only
>>> one
>>> partition:
>>>
>>> devip->tape_eop[1] = part_1_size;
>>> devip->tape_blocks[1] = devip->tape_blocks[0] +
>>> devip->tape_eop[0];
>>> devip->tape_blocks[1]->fl_size = TAPE_BLOCK_EOD_FLAG;
>>>
>>> Because devip->tape_eop[0] == 10000, this computes:
>>>
>>> devip->tape_blocks[1] = devip->tape_blocks[0] + 10000
>>>
>>> But the allocation has only 10000 elements. So this write is one
>>> element past the allocation.
>>
>> OK. The bug is not initialization of the pointer but writing the
>> fl_size using the pointer. Good catch!
>
> Isn't the fix actually to allocate an extra block for the EOF:
>
> @@ -6648,7 +6648,7 @@ static int scsi_debug_sdev_configure(struct scsi_device *sdp,
> if (sdebug_ptype == TYPE_TAPE) {
> if (!devip->tape_blocks[0]) {
> devip->tape_blocks[0] =
> - kzalloc_objs(struct tape_block, TAPE_UNITS);
> + kzalloc_objs(struct tape_block, TAPE_UNITS + 1);
> if (!devip->tape_blocks[0])
> return 1;
>
> ?
That came into my mind, too. But I considered it a workaround, not a real fix.
But after considering the other alternatives and possible problems that should be checked,
I think your suggestion is the best way. It solves tha OOB write, but does not change the code
paths.
Thanks,
Kai
> On 4. Jun 2026, at 16.38, James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote: >> The tape setup path writes partition metadata one element past the >> allocated tape_blocks array when a one-partition configuration is >> selected. >> >> That corrupts adjacent state during device initialization before any >> command is issued. > > I still don't get what the actual problem is. For a single partition > tape I can't see where scsi_debug would actually do anything with > tape_blocks[1]. What is it that you're seeing when using scsi_debug > that motivates this? > The code marks partition 1 as EOD and does not corrupt anything. The tape has partitions until either EOD partition is encountered or TAPE_MAX_PARTITIONS is reached. I would be very hesitant to remove this initialization. Thanks, Kai
> On 4. Jun 2026, at 17.52, Kai Mäkisara (Kolumbus) <kai.makisara@kolumbus.fi> wrote: > > > >> On 4. Jun 2026, at 16.38, James Bottomley <James.Bottomley@HansenPartnership.com> wrote: >> >> On Wed, 2026-06-03 at 23:55 +0000, Samuel Moelius wrote: >>> The tape setup path writes partition metadata one element past the >>> allocated tape_blocks array when a one-partition configuration is >>> selected. >>> >>> That corrupts adjacent state during device initialization before any >>> command is issued. >> >> I still don't get what the actual problem is. For a single partition >> tape I can't see where scsi_debug would actually do anything with >> tape_blocks[1]. What is it that you're seeing when using scsi_debug >> that motivates this? >> > The code marks partition 1 as EOD and does not corrupt anything. The tape has partitions > until either EOD partition is encountered or TAPE_MAX_PARTITIONS is reached. I would > be very hesitant to remove this initialization. > I looked more carefully at the code:-) Each partition ends at EOD block. But I would still not like to remove this initialization because, even if it is not necessary, it is harmless. There has been a reason to do the initialization like that at some time. And the code is simpler as it is now. Thanks, Kai
© 2016 - 2026 Red Hat, Inc.