In the current implementation, the packet queue flushing logic seems to suffer
from a deadlock-like scenario if a packet is received by the interface
before the Rx ring is initialized by the guest's driver. Consider the
following sequence of events:
1. A QEMU instance is started against a TAP device on a Linux
host, running a Linux guest, e.g. something to the effect
of:
qemu-system-arm \
-net nic,model=imx.fec,netdev=lan0 \
-netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
... rest of the arguments ...
2. Once QEMU starts, but before the guest reaches the point where
the FEC driver is done initializing the HW, the guest, via the TAP
interface, receives a number of multicast mDNS packets from the
host (not necessarily true for every OS, but it happens at
least on Fedora 25)
3. Receiving a packet in such a state results in
imx_eth_can_receive() returning '0', which in turn causes
tap_send() to disable the corresponding event (tap.c:203)
4. Once the guest's driver reaches the point where it is ready to
receive packets, it prepares the Rx ring descriptors and writes
ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
more descriptors are ready. At this point the emulation
layer does this:
s->regs[index] = ENET_RDAR_RDAR;
imx_eth_enable_rx(s);
which, combined with:
if (!s->regs[ENET_RDAR]) {
qemu_flush_queued_packets(qemu_get_queue(s->nic));
}
results in the Rx queue never being flushed and the corresponding
I/O event remaining disabled.
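Annotated, the combination above reduces to (same code as quoted, with
comments added to make the failure explicit):

    /* ENET_RDAR write handler: the register is set first ...        */
    s->regs[index] = ENET_RDAR_RDAR;
    imx_eth_enable_rx(s);

    /* ... and inside imx_eth_enable_rx() the flush is gated on the
     * register still being zero, which it no longer is, so the
     * packets queued by the net layer are never flushed:            */
    if (!s->regs[ENET_RDAR]) {
        qemu_flush_queued_packets(qemu_get_queue(s->nic));
    }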
Change the code to remember the fact that the can_receive callback was
called before the Rx ring was ready and use that to decide whether the
receive queue needs to be flushed.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
---
hw/net/imx_fec.c | 6 ++++--
include/hw/net/imx_fec.h | 1 +
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 84085afe09..767402909d 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -544,8 +544,9 @@ static void imx_eth_enable_rx(IMXFECState *s)
if (rx_ring_full) {
FEC_PRINTF("RX buffer full\n");
- } else if (!s->regs[ENET_RDAR]) {
+ } else if (s->needs_flush) {
qemu_flush_queued_packets(qemu_get_queue(s->nic));
+ s->needs_flush = false;
}
s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
@@ -930,7 +931,8 @@ static int imx_eth_can_receive(NetClientState *nc)
FEC_PRINTF("\n");
- return s->regs[ENET_RDAR] ? 1 : 0;
+ s->needs_flush = !s->regs[ENET_RDAR];
+ return !!s->regs[ENET_RDAR];
}
static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index 62ad473b05..4bc8f03ec2 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -252,6 +252,7 @@ typedef struct IMXFECState {
uint32_t phy_int_mask;
bool is_fec;
+ bool needs_flush;
} IMXFECState;
#endif
--
2.13.5
On 2017-09-19 03:50, Andrey Smirnov wrote:
> In the current implementation, the packet queue flushing logic seems to suffer
> from a deadlock-like scenario if a packet is received by the interface
> before the Rx ring is initialized by the guest's driver. Consider the
> following sequence of events:
>
> 1. A QEMU instance is started against a TAP device on a Linux
> host, running a Linux guest, e.g. something to the effect
> of:
>
> qemu-system-arm \
> -net nic,model=imx.fec,netdev=lan0 \
> -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
> ... rest of the arguments ...
>
> 2. Once QEMU starts, but before the guest reaches the point where
> the FEC driver is done initializing the HW, the guest, via the TAP
> interface, receives a number of multicast mDNS packets from the
> host (not necessarily true for every OS, but it happens at
> least on Fedora 25)
>
> 3. Receiving a packet in such a state results in
> imx_eth_can_receive() returning '0', which in turn causes
> tap_send() to disable the corresponding event (tap.c:203)
>
> 4. Once the guest's driver reaches the point where it is ready to
> receive packets, it prepares the Rx ring descriptors and writes
> ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
> more descriptors are ready. At this point the emulation
> layer does this:
>
> s->regs[index] = ENET_RDAR_RDAR;
> imx_eth_enable_rx(s);
>
> which, combined with:
>
> if (!s->regs[ENET_RDAR]) {
> qemu_flush_queued_packets(qemu_get_queue(s->nic));
> }
Not familiar with FEC, but if you are tracking the 0->1 transition, why not
simply introduce a parameter to imx_eth_enable_rx() to force the flushing?
Thanks
>
> results in the Rx queue never being flushed and the corresponding
> I/O event remaining disabled.
>
> Change the code to remember the fact that the can_receive callback was
> called before the Rx ring was ready and use that to decide whether the
> receive queue needs to be flushed.
>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: qemu-devel@nongnu.org
> Cc: qemu-arm@nongnu.org
> Cc: yurovsky@gmail.com
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> ---
> hw/net/imx_fec.c | 6 ++++--
> include/hw/net/imx_fec.h | 1 +
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
> index 84085afe09..767402909d 100644
> --- a/hw/net/imx_fec.c
> +++ b/hw/net/imx_fec.c
> @@ -544,8 +544,9 @@ static void imx_eth_enable_rx(IMXFECState *s)
>
> if (rx_ring_full) {
> FEC_PRINTF("RX buffer full\n");
> - } else if (!s->regs[ENET_RDAR]) {
> + } else if (s->needs_flush) {
> qemu_flush_queued_packets(qemu_get_queue(s->nic));
> + s->needs_flush = false;
> }
>
> s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
> @@ -930,7 +931,8 @@ static int imx_eth_can_receive(NetClientState *nc)
>
> FEC_PRINTF("\n");
>
> - return s->regs[ENET_RDAR] ? 1 : 0;
> + s->needs_flush = !s->regs[ENET_RDAR];
> + return !!s->regs[ENET_RDAR];
> }
>
> static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
> index 62ad473b05..4bc8f03ec2 100644
> --- a/include/hw/net/imx_fec.h
> +++ b/include/hw/net/imx_fec.h
> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
> uint32_t phy_int_mask;
>
> bool is_fec;
> + bool needs_flush;
> } IMXFECState;
>
> #endif
On Fri, Sep 22, 2017 at 12:27 AM, Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2017-09-19 03:50, Andrey Smirnov wrote:
>>
>> In the current implementation, the packet queue flushing logic seems to suffer
>> from a deadlock-like scenario if a packet is received by the interface
>> before the Rx ring is initialized by the guest's driver. Consider the
>> following sequence of events:
>>
>> 1. A QEMU instance is started against a TAP device on a Linux
>> host, running a Linux guest, e.g. something to the effect
>> of:
>>
>> qemu-system-arm \
>> -net nic,model=imx.fec,netdev=lan0 \
>> -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
>> ... rest of the arguments ...
>>
>> 2. Once QEMU starts, but before the guest reaches the point where
>> the FEC driver is done initializing the HW, the guest, via the TAP
>> interface, receives a number of multicast mDNS packets from the
>> host (not necessarily true for every OS, but it happens at
>> least on Fedora 25)
>>
>> 3. Receiving a packet in such a state results in
>> imx_eth_can_receive() returning '0', which in turn causes
>> tap_send() to disable the corresponding event (tap.c:203)
>>
>> 4. Once the guest's driver reaches the point where it is ready to
>> receive packets, it prepares the Rx ring descriptors and writes
>> ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
>> more descriptors are ready. At this point the emulation
>> layer does this:
>>
>> s->regs[index] = ENET_RDAR_RDAR;
>> imx_eth_enable_rx(s);
>>
>> which, combined with:
>>
>> if (!s->regs[ENET_RDAR]) {
>> qemu_flush_queued_packets(qemu_get_queue(s->nic));
>> }
>
>
> Not familiar with FEC, but if you are tracking the 0->1 transition, why not
> simply introduce a parameter to imx_eth_enable_rx() to force the flushing?
>
I'm not sure I fully understand: are you proposing that I get rid of the
"needs_flush" field in the device state, convert it into a parameter to
imx_eth_enable_rx(), and then force flushing every time
imx_eth_enable_rx() is called from imx_eth_write()?
That should work, but it will end up making the emulator code flush the
corresponding NIC queue every time the driver is done processing the Rx
ring. If that is not a big problem, I am more than happy to make that
change.
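For concreteness, something along these lines (untested sketch;
imx_eth_rx_ring_full() is an invented helper standing in for the
existing ring-full check):

    static void imx_eth_enable_rx(IMXFECState *s, bool flush)
    {
        bool rx_ring_full = imx_eth_rx_ring_full(s);

        if (rx_ring_full) {
            FEC_PRINTF("RX buffer full\n");
        } else if (flush) {
            /* Flush whenever the caller says Rx just became possible. */
            qemu_flush_queued_packets(qemu_get_queue(s->nic));
        }

        s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
    }

    /* ... with the ENET_RDAR write handler requesting the flush: */
    s->regs[index] = ENET_RDAR_RDAR;
    imx_eth_enable_rx(s, true);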
Thanks,
Andrey Smirnov
On 18 September 2017 at 20:50, Andrey Smirnov <andrew.smirnov@gmail.com> wrote:
> In the current implementation, the packet queue flushing logic seems to suffer
> from a deadlock-like scenario if a packet is received by the interface
> before the Rx ring is initialized by the guest's driver. Consider the
> following sequence of events:
>
> 1. A QEMU instance is started against a TAP device on a Linux
> host, running a Linux guest, e.g. something to the effect
> of:
>
> qemu-system-arm \
> -net nic,model=imx.fec,netdev=lan0 \
> -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
> ... rest of the arguments ...
>
> 2. Once QEMU starts, but before the guest reaches the point where
> the FEC driver is done initializing the HW, the guest, via the TAP
> interface, receives a number of multicast mDNS packets from the
> host (not necessarily true for every OS, but it happens at
> least on Fedora 25)
>
> 3. Receiving a packet in such a state results in
> imx_eth_can_receive() returning '0', which in turn causes
> tap_send() to disable the corresponding event (tap.c:203)
>
> 4. Once the guest's driver reaches the point where it is ready to
> receive packets, it prepares the Rx ring descriptors and writes
> ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
> more descriptors are ready. At this point the emulation
> layer does this:
>
> s->regs[index] = ENET_RDAR_RDAR;
> imx_eth_enable_rx(s);
>
> which, combined with:
>
> if (!s->regs[ENET_RDAR]) {
> qemu_flush_queued_packets(qemu_get_queue(s->nic));
> }
>
> results in the Rx queue never being flushed and the corresponding
> I/O event remaining disabled.
>
> Change the code to remember the fact that the can_receive callback was
> called before the Rx ring was ready and use that to decide whether the
> receive queue needs to be flushed.
>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: qemu-devel@nongnu.org
> Cc: qemu-arm@nongnu.org
> Cc: yurovsky@gmail.com
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
> ---
> hw/net/imx_fec.c | 6 ++++--
> include/hw/net/imx_fec.h | 1 +
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
> index 84085afe09..767402909d 100644
> --- a/hw/net/imx_fec.c
> +++ b/hw/net/imx_fec.c
> @@ -544,8 +544,9 @@ static void imx_eth_enable_rx(IMXFECState *s)
>
> if (rx_ring_full) {
> FEC_PRINTF("RX buffer full\n");
> - } else if (!s->regs[ENET_RDAR]) {
> + } else if (s->needs_flush) {
> qemu_flush_queued_packets(qemu_get_queue(s->nic));
> + s->needs_flush = false;
> }
>
> s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
> @@ -930,7 +931,8 @@ static int imx_eth_can_receive(NetClientState *nc)
>
> FEC_PRINTF("\n");
>
> - return s->regs[ENET_RDAR] ? 1 : 0;
> + s->needs_flush = !s->regs[ENET_RDAR];
> + return !!s->regs[ENET_RDAR];
> }
>
> static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
> index 62ad473b05..4bc8f03ec2 100644
> --- a/include/hw/net/imx_fec.h
> +++ b/include/hw/net/imx_fec.h
> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
> uint32_t phy_int_mask;
>
> bool is_fec;
> + bool needs_flush;
> } IMXFECState;
This looks odd -- I don't think you should need extra
state here. Conceptually what you want is:
* in the can_receive callback, test some function of
various bits of device state to decide whether you can
take data
* in the rest of the device, whenever the device state
changes such that you were previously not able to take
data but now you can, call qemu_flush_queued_packets().
You shouldn't need any extra state to do this, you just
need to fix the bug where you have a code path that
flips ENET_RDAR from 0 to 1 without calling flush
(you might for instance have a helper function for
"set ENET_RDAR" that encapsulates setting the state
and arranging that flush is called).
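Something like this untested sketch, perhaps (imx_eth_set_rdar() is an
invented name, just to show the shape of the idea):

    static void imx_eth_set_rdar(IMXFECState *s, uint32_t value)
    {
        bool was_zero = !s->regs[ENET_RDAR];

        s->regs[ENET_RDAR] = value;

        if (was_zero && value) {
            /* Rx just became possible: drain anything the net layer
             * queued while can_receive was returning 0. */
            qemu_flush_queued_packets(qemu_get_queue(s->nic));
        }
    }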
thanks
-- PMM
On Fri, Oct 6, 2017 at 6:56 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 18 September 2017 at 20:50, Andrey Smirnov <andrew.smirnov@gmail.com> wrote:
>> In the current implementation, the packet queue flushing logic seems to suffer
>> from a deadlock-like scenario if a packet is received by the interface
>> before the Rx ring is initialized by the guest's driver. Consider the
>> following sequence of events:
>>
>> 1. A QEMU instance is started against a TAP device on a Linux
>> host, running a Linux guest, e.g. something to the effect
>> of:
>>
>> qemu-system-arm \
>> -net nic,model=imx.fec,netdev=lan0 \
>> -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
>> ... rest of the arguments ...
>>
>> 2. Once QEMU starts, but before the guest reaches the point where
>> the FEC driver is done initializing the HW, the guest, via the TAP
>> interface, receives a number of multicast mDNS packets from the
>> host (not necessarily true for every OS, but it happens at
>> least on Fedora 25)
>>
>> 3. Receiving a packet in such a state results in
>> imx_eth_can_receive() returning '0', which in turn causes
>> tap_send() to disable the corresponding event (tap.c:203)
>>
>> 4. Once the guest's driver reaches the point where it is ready to
>> receive packets, it prepares the Rx ring descriptors and writes
>> ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
>> more descriptors are ready. At this point the emulation
>> layer does this:
>>
>> s->regs[index] = ENET_RDAR_RDAR;
>> imx_eth_enable_rx(s);
>>
>> which, combined with:
>>
>> if (!s->regs[ENET_RDAR]) {
>> qemu_flush_queued_packets(qemu_get_queue(s->nic));
>> }
>>
>> results in the Rx queue never being flushed and the corresponding
>> I/O event remaining disabled.
>>
>> Change the code to remember the fact that the can_receive callback was
>> called before the Rx ring was ready and use that to decide whether the
>> receive queue needs to be flushed.
>>
>> Cc: Peter Maydell <peter.maydell@linaro.org>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-arm@nongnu.org
>> Cc: yurovsky@gmail.com
>> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
>> ---
>> hw/net/imx_fec.c | 6 ++++--
>> include/hw/net/imx_fec.h | 1 +
>> 2 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
>> index 84085afe09..767402909d 100644
>> --- a/hw/net/imx_fec.c
>> +++ b/hw/net/imx_fec.c
>> @@ -544,8 +544,9 @@ static void imx_eth_enable_rx(IMXFECState *s)
>>
>> if (rx_ring_full) {
>> FEC_PRINTF("RX buffer full\n");
>> - } else if (!s->regs[ENET_RDAR]) {
>> + } else if (s->needs_flush) {
>> qemu_flush_queued_packets(qemu_get_queue(s->nic));
>> + s->needs_flush = false;
>> }
>>
>> s->regs[ENET_RDAR] = rx_ring_full ? 0 : ENET_RDAR_RDAR;
>> @@ -930,7 +931,8 @@ static int imx_eth_can_receive(NetClientState *nc)
>>
>> FEC_PRINTF("\n");
>>
>> - return s->regs[ENET_RDAR] ? 1 : 0;
>> + s->needs_flush = !s->regs[ENET_RDAR];
>> + return !!s->regs[ENET_RDAR];
>> }
>>
>> static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
>> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
>> index 62ad473b05..4bc8f03ec2 100644
>> --- a/include/hw/net/imx_fec.h
>> +++ b/include/hw/net/imx_fec.h
>> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
>> uint32_t phy_int_mask;
>>
>> bool is_fec;
>> + bool needs_flush;
>> } IMXFECState;
>
> This looks odd -- I don't think you should need extra
> state here. Conceptually what you want is:
>
> * in the can_receive callback, test some function of
> various bits of device state to decide whether you can
> take data
> * in the rest of the device, whenever the device state
> changes such that you were previously not able to take
> data but now you can, call qemu_flush_queued_packets().
>
> You shouldn't need any extra state to do this, you just
> need to fix the bug where you have a code path that
> flips ENET_RDAR from 0 to 1 without calling flush
> (you might for instance have a helper function for
> "set ENET_RDAR" that encapsulates setting the state
> and arranging that flush is called).
>
I don't know if you've seen my response to Jason Wang, but I think he
was proposing something similar, and, as I said, that should work fine.
The only reason I didn't do it that way was to avoid doing a flush
every time the host driver drains the full Rx ring and gives it back to
the IP block.
I'll give this a try in v2.
Thanks,
Andrey Smirnov