There are 2 cases I have spotted so far: 1) IDE ATAPI read processing. Actually this was reported from field 2) QEMU IO hmp command (found during evaluation of (1)) SCSI code checks during access that blk_is_available(). These patches add same checks on different code paths. Pls decide whether these patches should go through sub-system trees or via block tree. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> CC: John Snow <jsnow@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Stefan Hajnoczi <stefanha@redhat.com>
Am 28.11.2017 um 13:10 hat Denis V. Lunev geschrieben: > There are 2 cases I have spotted so far: > 1) IDE ATAPI read processing. Actually this was reported from field > 2) QEMU IO hmp command (found during evaluation of (1)) > > SCSI code checks during access that blk_is_available(). These patches add > same checks on different code paths. > > Pls decide whether these patches should go through sub-system trees or via > block tree. Why does this always come up in the last minute? :-( The real fix shouldn't be in the devices, but blk_aio_*() needs to handle empty drives gracefully. We already made a similar fix before the last release in commit 4da97120d, and I already didn't like it back then. Of course, we should have used the time to fix this properly, but as you can see, nothing has happened and we're running out of time again. Kevin
On 11/28/2017 07:08 PM, Kevin Wolf wrote: > Am 28.11.2017 um 13:10 hat Denis V. Lunev geschrieben: >> There are 2 cases I have spotted so far: >> 1) IDE ATAPI read processing. Actually this was reported from field >> 2) QEMU IO hmp command (found during evaluation of (1)) >> >> SCSI code checks during access that blk_is_available(). These patches add >> same checks on different code paths. >> >> Pls decide whether these patches should go through sub-system trees or via >> block tree. > Why does this always come up in the last minute? :-( > > The real fix shouldn't be in the devices, but blk_aio_*() needs to > handle empty drives gracefully. We already made a similar fix before the > last release in commit 4da97120d, and I already didn't like it back > then. > > Of course, we should have used the time to fix this properly, but as you > can see, nothing has happened and we're running out of time again. > > Kevin I see. We have started ABRT report collection program a week ago. This is the first result for QEMU from the real customer running his own workload :( What else can I say? Den
On 11/28/2017 11:29 AM, Denis V. Lunev wrote: > On 11/28/2017 07:08 PM, Kevin Wolf wrote: >> Am 28.11.2017 um 13:10 hat Denis V. Lunev geschrieben: >>> There are 2 cases I have spotted so far: >>> 1) IDE ATAPI read processing. Actually this was reported from field >>> 2) QEMU IO hmp command (found during evaluation of (1)) >>> >>> SCSI code checks during access that blk_is_available(). These patches add >>> same checks on different code paths. >>> >>> Pls decide whether these patches should go through sub-system trees or via >>> block tree. >> Why does this always come up in the last minute? :-( >> >> The real fix shouldn't be in the devices, but blk_aio_*() needs to >> handle empty drives gracefully. We already made a similar fix before the >> last release in commit 4da97120d, and I already didn't like it back >> then. >> >> Of course, we should have used the time to fix this properly, but as you >> can see, nothing has happened and we're running out of time again. >> >> Kevin > I see. > > We have started ABRT report collection program a week ago. This > is the first result for QEMU from the real customer running > his own workload :( > > What else can I say? > > Den > It's not your fault, it's mine for letting this go for a release. It's just unfortunate timing. I'm looking at (at least) the IDE portion of this and the underlying cause in block-backend today.
On 28 November 2017 at 17:01, John Snow <jsnow@redhat.com> wrote: > It's not your fault, it's mine for letting this go for a release. It's > just unfortunate timing. I'm looking at (at least) the IDE portion of > this and the underlying cause in block-backend today. If it's not a regression, and nobody reported it for the whole release cycle, we might choose to not fix it in 2.11... thanks -- PMM
On 11/28/2017 08:28 PM, Peter Maydell wrote: > On 28 November 2017 at 17:01, John Snow <jsnow@redhat.com> wrote: >> It's not your fault, it's mine for letting this go for a release. It's >> just unfortunate timing. I'm looking at (at least) the IDE portion of >> this and the underlying cause in block-backend today. > If it's not a regression, and nobody reported it for the whole release > cycle, we might choose to not fix it in 2.11... > > thanks > -- PMM may be this is related report https://patchwork.kernel.org/patch/9779795/ Anyway, I will be fine with any decision. Den
On 11/28/2017 03:10 PM, Denis V. Lunev wrote: > There are 2 cases I have spotted so far: > 1) IDE ATAPI read processing. Actually this was reported from field > 2) QEMU IO hmp command (found during evaluation of (1)) > > SCSI code checks during access that blk_is_available(). These patches add > same checks on different code paths. > > Pls decide whether these patches should go through sub-system trees or via > block tree. > > Signed-off-by: Denis V. Lunev <den@openvz.org> > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > CC: John Snow <jsnow@redhat.com> > CC: Kevin Wolf <kwolf@redhat.com> > CC: Stefan Hajnoczi <stefanha@redhat.com> > any decision on this?
On 12/11/2017 05:24 AM, Denis V. Lunev wrote: > On 11/28/2017 03:10 PM, Denis V. Lunev wrote: >> There are 2 cases I have spotted so far: >> 1) IDE ATAPI read processing. Actually this was reported from field >> 2) QEMU IO hmp command (found during evaluation of (1)) >> >> SCSI code checks during access that blk_is_available(). These patches add >> same checks on different code paths. >> >> Pls decide whether these patches should go through sub-system trees or via >> block tree. >> >> Signed-off-by: Denis V. Lunev <den@openvz.org> >> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> >> CC: John Snow <jsnow@redhat.com> >> CC: Kevin Wolf <kwolf@redhat.com> >> CC: Stefan Hajnoczi <stefanha@redhat.com> >> > any decision on this? > Hi Den, I was investigating it and had a long reply typed out but I didn't want to bore you to tears with my badly written novel. I'll send it shortly. I couldn't reproduce the problem and I can't see how the behavior you are seeing is possible, so I am concerned about the root cause of what's wrong. I will accept the hotfix before 2.12-rc0 or 2.11.1, but I want to try and see what's going wrong in the meantime.
On 11/28/2017 07:10 AM, Denis V. Lunev wrote: > There are 2 cases I have spotted so far: > 1) IDE ATAPI read processing. Actually this was reported from field > 2) QEMU IO hmp command (found during evaluation of (1)) > > SCSI code checks during access that blk_is_available(). These patches add > same checks on different code paths. > > Pls decide whether these patches should go through sub-system trees or via > block tree. > > Signed-off-by: Denis V. Lunev <den@openvz.org> > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > CC: John Snow <jsnow@redhat.com> > CC: Kevin Wolf <kwolf@redhat.com> > CC: Stefan Hajnoczi <stefanha@redhat.com> > > Denis, I was never able to figure this out, but Stefan added a more rigorous block-wide fix that should prevent these crashes. If you see any anomalies in the ATAPI device again please let me know. Sorry I wasn't able to help any faster.
On 03/13/2018 08:18 PM, John Snow wrote: > > On 11/28/2017 07:10 AM, Denis V. Lunev wrote: >> There are 2 cases I have spotted so far: >> 1) IDE ATAPI read processing. Actually this was reported from field >> 2) QEMU IO hmp command (found during evaluation of (1)) >> >> SCSI code checks during access that blk_is_available(). These patches add >> same checks on different code paths. >> >> Pls decide whether these patches should go through sub-system trees or via >> block tree. >> >> Signed-off-by: Denis V. Lunev <den@openvz.org> >> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com> >> CC: John Snow <jsnow@redhat.com> >> CC: Kevin Wolf <kwolf@redhat.com> >> CC: Stefan Hajnoczi <stefanha@redhat.com> >> >> > Denis, I was never able to figure this out, but Stefan added a more > rigorous block-wide fix that should prevent these crashes. If you see > any anomalies in the ATAPI device again please let me know. > > Sorry I wasn't able to help any faster. ok, no prob. Thank you for a feedback Den
© 2016 - 2024 Red Hat, Inc.