net/ncsi/internal.h | 21 +++++----- net/ncsi/ncsi-pkt.h | 95 +++++++++++++++++++++++++-------------------- net/ncsi/ncsi-rsp.c | 31 +++++++++------ 3 files changed, 82 insertions(+), 65 deletions(-)
From: Hari Kalavakunta <kalavakunta.hari.prasad@gmail.com> Make Get Controller Packet Statistics (GCPS) 64-bit member variables spec-compliant, as per DSP0222 v1.0.0 and forward specs Hari Kalavakunta (2): net: ncsi: Format structure for longer names net: ncsi: Fix GCPS 64-bit member variables net/ncsi/internal.h | 21 +++++----- net/ncsi/ncsi-pkt.h | 95 +++++++++++++++++++++++++-------------------- net/ncsi/ncsi-rsp.c | 31 +++++++++------ 3 files changed, 82 insertions(+), 65 deletions(-) -- 2.47.1
On 8/04/2025 4:19 am, kalavakunta.hari.prasad@gmail.com wrote: > From: Hari Kalavakunta <kalavakunta.hari.prasad@gmail.com> > > Make Get Controller Packet Statistics (GCPS) 64-bit member variables > spec-compliant, as per DSP0222 v1.0.0 and forward specs > > Hari Kalavakunta (2): > net: ncsi: Format structure for longer names > net: ncsi: Fix GCPS 64-bit member variables > > net/ncsi/internal.h | 21 +++++----- > net/ncsi/ncsi-pkt.h | 95 +++++++++++++++++++++++++-------------------- > net/ncsi/ncsi-rsp.c | 31 +++++++++------ > 3 files changed, 82 insertions(+), 65 deletions(-) Looking at e.g. DSP0222 1.2.0a, you're right about the field widths, but it's not particularly explicit about whether the full 64 bits is used. I'd assume so, but do you see the upper bits of e.g. the packet counters return expected data? Otherwise looks good. Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
On 4/7/2025 2:44 PM, Sam Mendoza-Jonas wrote: > On 8/04/2025 4:19 am, kalavakunta.hari.prasad@gmail.com wrote: > > Looking at e.g. DSP0222 1.2.0a, you're right about the field widths, but > it's not particularly explicit about whether the full 64 bits is used. > I'd assume so, but do you see the upper bits of e.g. the packet counters > return expected data? Otherwise looks good. > It is possible that these statistics have not been previously explored or utilized, which may explain why they went unnoticed. As you pointed out, the checksum offset within the struct is not currently being checked, and similarly, the returned packet sizes are also not being verified.
Hello Hari, On Tue, Apr 08, 2025 at 10:01:03AM -0700, Hari Kalavakunta wrote: > On 4/7/2025 2:44 PM, Sam Mendoza-Jonas wrote: > > On 8/04/2025 4:19 am, kalavakunta.hari.prasad@gmail.com wrote: > > > > Looking at e.g. DSP0222 1.2.0a, you're right about the field widths, but > > it's not particularly explicit about whether the full 64 bits is used. > > I'd assume so, but do you see the upper bits of e.g. the packet counters > > return expected data? Otherwise looks good. > > > It is possible that these statistics have not been previously explored or > utilized, which may explain why they went unnoticed. As you pointed out, the > checksum offset within the struct is not currently being checked, and > similarly, the returned packet sizes are also not being verified. Can you please add the checks so that we are sure that hardware, software and the specification all match after your fixes? Also, please do provide the example counter values read from real hardware (even if they're not yet exposed properly you can still obtain them with some hack; don't forget to mention what network card they were read from).
On 4/8/2025 11:26 AM, Paul Fertser wrote: > Can you please add the checks so that we are sure that hardware, > software and the specification all match after your fixes? Sure, I can give a try. Could you please provide an example or guideline that I can use as a reference for proper alignment? > > Also, please do provide the example counter values read from real > hardware (even if they're not yet exposed properly you can still > obtain them with some hack; don't forget to mention what network card > they were read from). Our verification process is centered on confirming that the counter values are accurately populated within the ncsi_channel_stats struct, specifically in the ncsi_rsp_handler_gcps function. This verification is conducted using synthesized statistics, rather than actual data from a network card. Sure, I will provide NCSI packet capture showing the synthesized data used for testing by end of the day.
On Tue, Apr 08, 2025 at 11:53:47AM -0700, Hari Kalavakunta wrote: > On 4/8/2025 11:26 AM, Paul Fertser wrote: > > Can you please add the checks so that we are sure that hardware, > > software and the specification all match after your fixes? > > Sure, I can give a try. Could you please provide an example or guideline > that I can use as a reference for proper alignment? Well, there's ncsi_validate_rsp_pkt() and also some handlers use netdev_warn() or netdev_err() as appropriate and in any case they do not try to use the returned data if it didn't pass validation and return an error instead. > > Also, please do provide the example counter values read from real > > hardware (even if they're not yet exposed properly you can still > > obtain them with some hack; don't forget to mention what network card > > they were read from). > > Our verification process is centered on confirming that the counter values > are accurately populated within the ncsi_channel_stats struct, specifically > in the ncsi_rsp_handler_gcps function. This verification is conducted using > synthesized statistics, rather than actual data from a network card. Sure, I > will provide NCSI packet capture showing the synthesized data used for > testing by end of the day. In other words, you're testing your code only with simulated data so there's no way to guarantee it's going to work on any real life hardware (as we know hardware doesn't always exactly match the specs)? That's unsettling. Please do mention it in the commit log, it's an essential point. Better yet, consider going a bit off-centre after the regular verification and do a control run on real hardware. After all, that's what the code is for so if it all possible it's better to know if it does the actual job before merging (to avoid noise from follow-up patches like yours which fix something that never worked because it was never tested).
On 4/8/2025 12:19 PM, Paul Fertser wrote: > In other words, you're testing your code only with simulated data so > there's no way to guarantee it's going to work on any real life > hardware (as we know hardware doesn't always exactly match the specs)? > That's unsettling. Please do mention it in the commit log, it's an > essential point. Better yet, consider going a bit off-centre after the > regular verification and do a control run on real hardware. > > After all, that's what the code is for so if it all possible it's > better to know if it does the actual job before merging (to avoid > noise from follow-up patches like yours which fix something that never > worked because it was never tested). I would like to request a week's time to integrate a real hardware interface, which will enable me to test and demonstrate end-to-end results. This will also allow me to identify and address any additional issues that may arise during the testing process. Thank you for the feedback.
On Tue, Apr 08, 2025 at 03:02:14PM -0700, Hari Kalavakunta wrote: > On 4/8/2025 12:19 PM, Paul Fertser wrote: > > > In other words, you're testing your code only with simulated data so > > there's no way to guarantee it's going to work on any real life > > hardware (as we know hardware doesn't always exactly match the specs)? > > That's unsettling. Please do mention it in the commit log, it's an > > essential point. Better yet, consider going a bit off-centre after the > > regular verification and do a control run on real hardware. > > > > After all, that's what the code is for so if it all possible it's > > better to know if it does the actual job before merging (to avoid > > noise from follow-up patches like yours which fix something that never > > worked because it was never tested). > > I would like to request a week's time to integrate a real hardware > interface, which will enable me to test and demonstrate end-to-end results. > This will also allow me to identify and address any additional issues that > may arise during the testing process. Thank you for the feedback. Thank you for doing the right thing! Looking forward to your updated patch (please do not forget to consider __be64 for the fields).
On 4/8/2025 3:35 PM, Paul Fertser wrote: > On Tue, Apr 08, 2025 at 03:02:14PM -0700, Hari Kalavakunta wrote: >> On 4/8/2025 12:19 PM, Paul Fertser wrote: >> >>> In other words, you're testing your code only with simulated data so >>> there's no way to guarantee it's going to work on any real life >>> hardware (as we know hardware doesn't always exactly match the specs)? >>> That's unsettling. Please do mention it in the commit log, it's an >>> essential point. Better yet, consider going a bit off-centre after the >>> regular verification and do a control run on real hardware. >>> >>> After all, that's what the code is for so if it all possible it's >>> better to know if it does the actual job before merging (to avoid >>> noise from follow-up patches like yours which fix something that never >>> worked because it was never tested). >> >> I would like to request a week's time to integrate a real hardware >> interface, which will enable me to test and demonstrate end-to-end results. >> This will also allow me to identify and address any additional issues that >> may arise during the testing process. Thank you for the feedback. > > Thank you for doing the right thing! Looking forward to your updated > patch (please do not forget to consider __be64 for the fields). I had not previously considered using __be64 for the struct ncsi_rsp_gcps_pkt, as it is an interface structure. I would like to seek your input on whether it is a good idea to use __be64 for interface messages. In my experience, I haven't come across implementations that utilize __be64. I am unsure about the portability of this approach, particularly with regards to the Management Controller (MC).
On Tue, Apr 08, 2025 at 04:23:43PM -0700, Hari Kalavakunta wrote: > On 4/8/2025 3:35 PM, Paul Fertser wrote: > > Thank you for doing the right thing! Looking forward to your updated > > patch (please do not forget to consider __be64 for the fields). > > I had not previously considered using __be64 for the struct > ncsi_rsp_gcps_pkt, as it is an interface structure. I would like to seek > your input on whether it is a good idea to use __be64 for interface > messages. In my experience, I haven't come across implementations that > utilize __be64. I am unsure about the portability of this approach, > particularly with regards to the Management Controller (MC). I do not see why not[0][1]. What makes MC special, do you imply it doesn't have be64_to_cpu() (be64_to_cpup() for unaligned data) or what? If the values you get from hardware are indeed 64-bit BE clearly open-coding conversions from them is suboptimal. [0] https://elixir.bootlin.com/linux/v6.13.7/A/ident/__be64 [1] https://elixir.bootlin.com/linux/v6.13.7/source/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h#L155
On 4/9/2025 2:08 AM, Paul Fertser wrote: > On Tue, Apr 08, 2025 at 04:23:43PM -0700, Hari Kalavakunta wrote: >> On 4/8/2025 3:35 PM, Paul Fertser wrote: >>> Thank you for doing the right thing! Looking forward to your updated >>> patch (please do not forget to consider __be64 for the fields). >> >> I had not previously considered using __be64 for the struct >> ncsi_rsp_gcps_pkt, as it is an interface structure. I would like to seek >> your input on whether it is a good idea to use __be64 for interface >> messages. In my experience, I haven't come across implementations that >> utilize __be64. I am unsure about the portability of this approach, >> particularly with regards to the Management Controller (MC). > > I do not see why not[0][1]. What makes MC special, do you imply it > doesn't have be64_to_cpu() (be64_to_cpup() for unaligned data) or > what? If the values you get from hardware are indeed 64-bit BE clearly > open-coding conversions from them is suboptimal. > > [0] https://elixir.bootlin.com/linux/v6.13.7/A/ident/__be64 > [1] https://elixir.bootlin.com/linux/v6.13.7/source/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h#L155 Thank you for providing the references. I will proceed with running a test on real hardware with be64_to_cpu() to verify that stat mediation works correctly all the way into the local driver structure. If everything looks correct, I will submit a revised patch.
On 4/8/2025 4:23 PM, Hari Kalavakunta wrote:
> On 4/8/2025 3:35 PM, Paul Fertser wrote:
>> On Tue, Apr 08, 2025 at 03:02:14PM -0700, Hari Kalavakunta wrote:
>>> On 4/8/2025 12:19 PM, Paul Fertser wrote:
>>>
>>>> In other words, you're testing your code only with simulated data so
>>>> there's no way to guarantee it's going to work on any real life
>>>> hardware (as we know hardware doesn't always exactly match the specs)?
>>>> That's unsettling. Please do mention it in the commit log, it's an
>>>> essential point. Better yet, consider going a bit off-centre after the
>>>> regular verification and do a control run on real hardware.
>>>>
>>>> After all, that's what the code is for so if it all possible it's
>>>> better to know if it does the actual job before merging (to avoid
>>>> noise from follow-up patches like yours which fix something that never
>>>> worked because it was never tested).
>>>
>>> I would like to request a week's time to integrate a real hardware
>>> interface, which will enable me to test and demonstrate end-to-end
>>> results.
>>> This will also allow me to identify and address any additional issues
>>> that
>>> may arise during the testing process. Thank you for the feedback.
>>
>> Thank you for doing the right thing! Looking forward to your updated
>> patch (please do not forget to consider __be64 for the fields).
>
> I had not previously considered using __be64 for the struct
> ncsi_rsp_gcps_pkt, as it is an interface structure. I would like to seek
> your input on whether it is a good idea to use __be64 for interface
> messages. In my experience, I haven't come across implementations that
> utilize __be64. I am unsure about the portability of this approach,
> particularly with regards to the Management Controller (MC).
Here are the results from a real hardware test, which validates the patch.
a. Initiate GCPS command to the NIC-2
root@bmc:~# ./ncsi-cmd -i 3 -p 0 -c 0 raw 0x18
<7> Command: type 0x18, payload 0 bytes:
<7> Send Command, CHANNEL : 0x0 , PACKAGE : 0x0, INTERFACE: 0x008cd598
<7> Response 228 bytes: 00 01 00 3b 98 00 00 cc 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2b e5 fc e0 c0 00 00 00 00
65 70 d5 54 00 00 00 00 09 50 66 56 00 00 00 00 03 08 a8 79 00 00 00 00
00 3d 5c 44 00 00 00 00 00 79 38 23 00 00 00 00 00 02 fe 06 00 00 00 00
00 00 02 be 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 b8 21 5f 03 8f da af 00 d8 6d 36
00 73 a3 e7 00 17 20 70 00 00 00 00 00 00 d8 8a 00 00 0e 9c 00 56 4f 83
00 10 17 e2 00 01 5b 76 00 13 6e a3 00 00 f8 cd 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 2c 20 55 f5
b. tcpdump capture of the GCPS
GCPS Command Capture:
0x0000: 0001 003b 1800 0000 0000 0000 0000 0000
0x0010: ffff e7c4 0000 0000 0000 0000 0000 0000
0x0020: 0000 0000 0000 0000 0000 0000 0000
GCPS Response
0x0000: 0001 003b 9800 00cc 0000 0000 0000 0000
0x0010: 0000 0000 0000 0000 0000 0000 0000 002b
0x0020: e5fc e0c0 0000 0000 6570 d554 0000 0000
0x0030: 0950 6656 0000 0000 0308 a879 0000 0000
0x0040: 003d 5c44 0000 0000 0079 3823 0000 0000
0x0050: 0002 fe06 0000 0000 0000 02be 0000 0000
0x0060: 0000 0000 0000 0000 0000 0000 0000 0000
0x0070: 0000 0000 0000 0000 0000 0000 0000 0000
0x0080: 0000 0000 0000 0000 0000 0000 0000 0000
0x0090: 0000 0000 00b8 215f 038f daaf 00d8 6d36
0x00a0: 0073 a3e7 0017 2070 0000 0000 0000 d88a
0x00b0: 0000 0e9c 0056 4f83 0010 17e2 0001 5b76
0x00c0: 0013 6ea3 0000 f8cd 0000 0000 0000 0000
0x00d0: 0000 0000 0000 0000 0000 0000 0000 0000
0x00e0: 2c20 55f5
c. dmesg log (Debug purpose only) to demonstrate correct value read.
[ 1350.369741] ncsi_rsp_handler_gcps ENTRY
[ 1350.369768] ncs->hnc_rx_bytes = 188542148800 (0x2be5fce0c0)
Let us examine the "Total Bytes Received" statistic. Specifically, we'll
look at offset 28..35 (0x1c..0x24), bytes read - '0000 002b e5fc e0c0'.
NCSI now correctly collects this value into it's internal structure.
I just noticed a warning regarding line length in the patch submission,
I will address this issue in version 2 of the patch. Let me know if we
need additional information.
© 2016 - 2026 Red Hat, Inc.