[PATCH] RFC: CODING_STYLE: describe "self" variable convention

Marc-André Lureau posted 1 patch 4 years, 5 months ago
Test checkpatch passed
Test FreeBSD passed
Test docker-mingw@fedora passed
Test docker-clang@ubuntu passed
Test docker-quick@centos7 passed
Test asan passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20191120125444.31365-1-marcandre.lureau@redhat.com
CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
1 file changed, 32 insertions(+), 6 deletions(-)
[PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Marc-André Lureau 4 years, 5 months ago
Following the discussion in thread "[PATCH v3 13/33] serial: start
making SerialMM a sysbus device", I'd like to recommend the usage of
"self" variable to reference to the OOP-style method instance, as
commonly done in various languages and in GObject world.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
index 427699e0e4..cb6635af71 100644
--- a/CODING_STYLE.rst
+++ b/CODING_STYLE.rst
@@ -102,12 +102,38 @@ Rationale:
 Naming
 ======
 
-Variables are lower_case_with_underscores; easy to type and read.  Structured
-type names are in CamelCase; harder to type but standing out.  Enum type
-names and function type names should also be in CamelCase.  Scalar type
-names are lower_case_with_underscores_ending_with_a_t, like the POSIX
-uint64_t and family.  Note that this last convention contradicts POSIX
-and is therefore likely to be changed.
+Variables are lower_case_with_underscores; easy to type and read.
+
+The most common naming for a variable is an abbreviation of the type
+name.  Some common examples:
+
+.. code-block:: c
+
+    Object *obj;
+    QVirtioSCSI *scsi;
+    SerialMM *smm;
+
+When writing QOM/OOP-style function, a "self" variable allows to refer
+without ambiguity to the instance of the method that is being
+implemented (this is not very common in QEMU code base, but it is
+often a good option to increase the readability and consistency,
+making further refactoring easier as well).  Example:
+
+.. code-block:: c
+
+    serial_mm_flush(SerialMM *self);
+
+    serial_mm_instance_init(Object *o) {
+        SerialMM *self = SERIAL_MM(o);
+        ..
+    }
+
+Structured type names are in CamelCase; harder to type but standing
+out.  Enum type names and function type names should also be in
+CamelCase.  Scalar type names are
+lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
+and family.  Note that this last convention contradicts POSIX and is
+therefore likely to be changed.
 
 When wrapping standard library functions, use the prefix ``qemu_`` to alert
 readers that they are seeing a wrapped version; otherwise avoid this prefix.
-- 
2.24.0


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Philippe Mathieu-Daudé 4 years, 5 months ago
On 11/20/19 1:54 PM, Marc-André Lureau wrote:
> Following the discussion in thread "[PATCH v3 13/33] serial: start
> making SerialMM a sysbus device", I'd like to recommend the usage of
> "self" variable to reference to the OOP-style method instance, as
> commonly done in various languages and in GObject world.
> 
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>   CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
>   1 file changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
> index 427699e0e4..cb6635af71 100644
> --- a/CODING_STYLE.rst
> +++ b/CODING_STYLE.rst
> @@ -102,12 +102,38 @@ Rationale:
>   Naming
>   ======
>   
> -Variables are lower_case_with_underscores; easy to type and read.  Structured
> -type names are in CamelCase; harder to type but standing out.  Enum type
> -names and function type names should also be in CamelCase.  Scalar type
> -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> -uint64_t and family.  Note that this last convention contradicts POSIX
> -and is therefore likely to be changed.
> +Variables are lower_case_with_underscores; easy to type and read.
> +
> +The most common naming for a variable is an abbreviation of the type
> +name.  Some common examples:
> +
> +.. code-block:: c
> +
> +    Object *obj;
> +    QVirtioSCSI *scsi;
> +    SerialMM *smm;
> +
> +When writing QOM/OOP-style function, a "self" variable allows to refer
> +without ambiguity to the instance of the method that is being
> +implemented (this is not very common in QEMU code base, but it is
> +often a good option to increase the readability and consistency,
> +making further refactoring easier as well).  Example:
> +
> +.. code-block:: c
> +
> +    serial_mm_flush(SerialMM *self);
> +
> +    serial_mm_instance_init(Object *o) {
> +        SerialMM *self = SERIAL_MM(o);
> +        ..
> +    }
> +
> +Structured type names are in CamelCase; harder to type but standing
> +out.  Enum type names and function type names should also be in
> +CamelCase.  Scalar type names are
> +lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
> +and family.  Note that this last convention contradicts POSIX and is
> +therefore likely to be changed.
>   
>   When wrapping standard library functions, use the prefix ``qemu_`` to alert
>   readers that they are seeing a wrapped version; otherwise avoid this prefix.
> 

So in this example:

static void pci_unin_agp_init(Object *obj)
{
     UNINHostState *s = UNI_NORTH_AGP_HOST_BRIDGE(obj);
     SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
     PCIHostState *h = PCI_HOST_BRIDGE(obj);

     /* Uninorth AGP bus */
     memory_region_init_io(&h->conf_mem, OBJECT(h),
                           &pci_host_conf_le_ops,
                           obj, "unin-agp-conf-idx", 0x1000);
     memory_region_init_io(&h->data_mem, OBJECT(h),
                           &pci_host_data_le_ops,
                           obj, "unin-agp-conf-data", 0x1000);

     object_property_add_link(obj, "pic", TYPE_OPENPIC,
                              (Object **) &s->pic,
                              qdev_prop_allow_set_link_before_realize,
                              0, NULL);

     sysbus_init_mmio(sbd, &h->conf_mem);
     sysbus_init_mmio(sbd, &h->data_mem);
}

You would change 'Object *obj' -> 'Object *self'?

But here we want to keep 'klass', right?

static void gpex_host_class_init(ObjectClass *klass, void *data)
{
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);

     hc->root_bus_path = gpex_host_root_bus_path;
     dc->realize = gpex_host_realize;
     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
     dc->fw_name = "pci";
}

Maybe we should restrict 'self' as name of Object type only?
But your example is with SerialMM, so no?


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Alex Bennée 4 years, 5 months ago
Philippe Mathieu-Daudé <philmd@redhat.com> writes:

> On 11/20/19 1:54 PM, Marc-André Lureau wrote:
>> Following the discussion in thread "[PATCH v3 13/33] serial: start
>> making SerialMM a sysbus device", I'd like to recommend the usage of
>> "self" variable to reference to the OOP-style method instance, as
>> commonly done in various languages and in GObject world.
>> Cc: Peter Maydell <peter.maydell@linaro.org>
>> Cc: Daniel P. Berrangé <berrange@redhat.com>
>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> ---
>>   CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
>>   1 file changed, 32 insertions(+), 6 deletions(-)
>> diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
>> index 427699e0e4..cb6635af71 100644
>> --- a/CODING_STYLE.rst
>> +++ b/CODING_STYLE.rst
>> @@ -102,12 +102,38 @@ Rationale:
>>   Naming
>>   ======
>>   -Variables are lower_case_with_underscores; easy to type and read.
>> Structured
>> -type names are in CamelCase; harder to type but standing out.  Enum type
>> -names and function type names should also be in CamelCase.  Scalar type
>> -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
>> -uint64_t and family.  Note that this last convention contradicts POSIX
>> -and is therefore likely to be changed.
>> +Variables are lower_case_with_underscores; easy to type and read.
>> +
>> +The most common naming for a variable is an abbreviation of the type
>> +name.  Some common examples:
>> +
>> +.. code-block:: c
>> +
>> +    Object *obj;
>> +    QVirtioSCSI *scsi;
>> +    SerialMM *smm;
>> +
>> +When writing QOM/OOP-style function, a "self" variable allows to refer
>> +without ambiguity to the instance of the method that is being
>> +implemented (this is not very common in QEMU code base, but it is
>> +often a good option to increase the readability and consistency,
>> +making further refactoring easier as well).  Example:
>> +
>> +.. code-block:: c
>> +
>> +    serial_mm_flush(SerialMM *self);
>> +
>> +    serial_mm_instance_init(Object *o) {
>> +        SerialMM *self = SERIAL_MM(o);
>> +        ..
>> +    }
>> +
>> +Structured type names are in CamelCase; harder to type but standing
>> +out.  Enum type names and function type names should also be in
>> +CamelCase.  Scalar type names are
>> +lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
>> +and family.  Note that this last convention contradicts POSIX and is
>> +therefore likely to be changed.
>>     When wrapping standard library functions, use the prefix
>> ``qemu_`` to alert
>>   readers that they are seeing a wrapped version; otherwise avoid this prefix.
>>
>
> So in this example:
>
> static void pci_unin_agp_init(Object *obj)
> {
>     UNINHostState *s = UNI_NORTH_AGP_HOST_BRIDGE(obj);

Using *s for the contextually appropriate state holding structure is
certainly common enough in the code base. Maybe we should should
document that too?

>     SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
>     PCIHostState *h = PCI_HOST_BRIDGE(obj);
>
>     /* Uninorth AGP bus */
>     memory_region_init_io(&h->conf_mem, OBJECT(h),
>                           &pci_host_conf_le_ops,
>                           obj, "unin-agp-conf-idx", 0x1000);
>     memory_region_init_io(&h->data_mem, OBJECT(h),
>                           &pci_host_data_le_ops,
>                           obj, "unin-agp-conf-data", 0x1000);
>
>     object_property_add_link(obj, "pic", TYPE_OPENPIC,
>                              (Object **) &s->pic,
>                              qdev_prop_allow_set_link_before_realize,
>                              0, NULL);
>
>     sysbus_init_mmio(sbd, &h->conf_mem);
>     sysbus_init_mmio(sbd, &h->data_mem);
> }
>
> You would change 'Object *obj' -> 'Object *self'?

I would have read it as:

  SysBusDevice *self = SYS_BUS_DEVICE(obj);

as the only device object in the example. But perhaps this is a complex
example?

>
> But here we want to keep 'klass', right?
>
> static void gpex_host_class_init(ObjectClass *klass, void *data)
> {
>     DeviceClass *dc = DEVICE_CLASS(klass);
>     PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
>
>     hc->root_bus_path = gpex_host_root_bus_path;
>     dc->realize = gpex_host_realize;
>     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>     dc->fw_name = "pci";
> }
>
> Maybe we should restrict 'self' as name of Object type only?
> But your example is with SerialMM, so no?


--
Alex Bennée

Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Marc-André Lureau 4 years, 5 months ago
Hi

On Wed, Nov 20, 2019 at 8:25 PM Alex Bennée <alex.bennee@linaro.org> wrote:
>
>
> Philippe Mathieu-Daudé <philmd@redhat.com> writes:
>
> > On 11/20/19 1:54 PM, Marc-André Lureau wrote:
> >> Following the discussion in thread "[PATCH v3 13/33] serial: start
> >> making SerialMM a sysbus device", I'd like to recommend the usage of
> >> "self" variable to reference to the OOP-style method instance, as
> >> commonly done in various languages and in GObject world.
> >> Cc: Peter Maydell <peter.maydell@linaro.org>
> >> Cc: Daniel P. Berrangé <berrange@redhat.com>
> >> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> ---
> >>   CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
> >>   1 file changed, 32 insertions(+), 6 deletions(-)
> >> diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
> >> index 427699e0e4..cb6635af71 100644
> >> --- a/CODING_STYLE.rst
> >> +++ b/CODING_STYLE.rst
> >> @@ -102,12 +102,38 @@ Rationale:
> >>   Naming
> >>   ======
> >>   -Variables are lower_case_with_underscores; easy to type and read.
> >> Structured
> >> -type names are in CamelCase; harder to type but standing out.  Enum type
> >> -names and function type names should also be in CamelCase.  Scalar type
> >> -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> >> -uint64_t and family.  Note that this last convention contradicts POSIX
> >> -and is therefore likely to be changed.
> >> +Variables are lower_case_with_underscores; easy to type and read.
> >> +
> >> +The most common naming for a variable is an abbreviation of the type
> >> +name.  Some common examples:
> >> +
> >> +.. code-block:: c
> >> +
> >> +    Object *obj;
> >> +    QVirtioSCSI *scsi;
> >> +    SerialMM *smm;
> >> +
> >> +When writing QOM/OOP-style function, a "self" variable allows to refer
> >> +without ambiguity to the instance of the method that is being
> >> +implemented (this is not very common in QEMU code base, but it is
> >> +often a good option to increase the readability and consistency,
> >> +making further refactoring easier as well).  Example:
> >> +
> >> +.. code-block:: c
> >> +
> >> +    serial_mm_flush(SerialMM *self);
> >> +
> >> +    serial_mm_instance_init(Object *o) {
> >> +        SerialMM *self = SERIAL_MM(o);
> >> +        ..
> >> +    }
> >> +
> >> +Structured type names are in CamelCase; harder to type but standing
> >> +out.  Enum type names and function type names should also be in
> >> +CamelCase.  Scalar type names are
> >> +lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
> >> +and family.  Note that this last convention contradicts POSIX and is
> >> +therefore likely to be changed.
> >>     When wrapping standard library functions, use the prefix
> >> ``qemu_`` to alert
> >>   readers that they are seeing a wrapped version; otherwise avoid this prefix.
> >>
> >
> > So in this example:
> >
> > static void pci_unin_agp_init(Object *obj)
> > {
> >     UNINHostState *s = UNI_NORTH_AGP_HOST_BRIDGE(obj);
>
> Using *s for the contextually appropriate state holding structure is
> certainly common enough in the code base. Maybe we should should
> document that too?

Yes, "s" is very common in qemu. Yet, it's ambiguous to me, as it is
used in other cases, while "self" is explicit for OOP-style, to refer
to the instance type being implemented.

>
> >     SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
> >     PCIHostState *h = PCI_HOST_BRIDGE(obj);
> >
> >     /* Uninorth AGP bus */
> >     memory_region_init_io(&h->conf_mem, OBJECT(h),
> >                           &pci_host_conf_le_ops,
> >                           obj, "unin-agp-conf-idx", 0x1000);
> >     memory_region_init_io(&h->data_mem, OBJECT(h),
> >                           &pci_host_data_le_ops,
> >                           obj, "unin-agp-conf-data", 0x1000);
> >
> >     object_property_add_link(obj, "pic", TYPE_OPENPIC,
> >                              (Object **) &s->pic,
> >                              qdev_prop_allow_set_link_before_realize,
> >                              0, NULL);
> >
> >     sysbus_init_mmio(sbd, &h->conf_mem);
> >     sysbus_init_mmio(sbd, &h->data_mem);
> > }
> >
> > You would change 'Object *obj' -> 'Object *self'?
>
> I would have read it as:
>
>   SysBusDevice *self = SYS_BUS_DEVICE(obj);
>
> as the only device object in the example. But perhaps this is a complex
> example?
>
> >
> > But here we want to keep 'klass', right?
> >
> > static void gpex_host_class_init(ObjectClass *klass, void *data)
> > {
> >     DeviceClass *dc = DEVICE_CLASS(klass);
> >     PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
> >
> >     hc->root_bus_path = gpex_host_root_bus_path;
> >     dc->realize = gpex_host_realize;
> >     set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> >     dc->fw_name = "pci";
> > }
> >
> > Maybe we should restrict 'self' as name of Object type only?
> > But your example is with SerialMM, so no?
>
>
> --
> Alex Bennée
>

thanks


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Marc-André Lureau 4 years, 5 months ago
Hi

On Wed, Nov 20, 2019 at 8:05 PM Philippe Mathieu-Daudé
<philmd@redhat.com> wrote:
>
> On 11/20/19 1:54 PM, Marc-André Lureau wrote:
> > Following the discussion in thread "[PATCH v3 13/33] serial: start
> > making SerialMM a sysbus device", I'd like to recommend the usage of
> > "self" variable to reference to the OOP-style method instance, as
> > commonly done in various languages and in GObject world.
> >
> > Cc: Peter Maydell <peter.maydell@linaro.org>
> > Cc: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >   CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
> >   1 file changed, 32 insertions(+), 6 deletions(-)
> >
> > diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
> > index 427699e0e4..cb6635af71 100644
> > --- a/CODING_STYLE.rst
> > +++ b/CODING_STYLE.rst
> > @@ -102,12 +102,38 @@ Rationale:
> >   Naming
> >   ======
> >
> > -Variables are lower_case_with_underscores; easy to type and read.  Structured
> > -type names are in CamelCase; harder to type but standing out.  Enum type
> > -names and function type names should also be in CamelCase.  Scalar type
> > -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> > -uint64_t and family.  Note that this last convention contradicts POSIX
> > -and is therefore likely to be changed.
> > +Variables are lower_case_with_underscores; easy to type and read.
> > +
> > +The most common naming for a variable is an abbreviation of the type
> > +name.  Some common examples:
> > +
> > +.. code-block:: c
> > +
> > +    Object *obj;
> > +    QVirtioSCSI *scsi;
> > +    SerialMM *smm;
> > +
> > +When writing QOM/OOP-style function, a "self" variable allows to refer
> > +without ambiguity to the instance of the method that is being
> > +implemented (this is not very common in QEMU code base, but it is
> > +often a good option to increase the readability and consistency,
> > +making further refactoring easier as well).  Example:
> > +
> > +.. code-block:: c
> > +
> > +    serial_mm_flush(SerialMM *self);
> > +
> > +    serial_mm_instance_init(Object *o) {
> > +        SerialMM *self = SERIAL_MM(o);
> > +        ..
> > +    }
> > +
> > +Structured type names are in CamelCase; harder to type but standing
> > +out.  Enum type names and function type names should also be in
> > +CamelCase.  Scalar type names are
> > +lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
> > +and family.  Note that this last convention contradicts POSIX and is
> > +therefore likely to be changed.
> >
> >   When wrapping standard library functions, use the prefix ``qemu_`` to alert
> >   readers that they are seeing a wrapped version; otherwise avoid this prefix.
> >
>
> So in this example:
>
> static void pci_unin_agp_init(Object *obj)
> {
>      UNINHostState *s = UNI_NORTH_AGP_HOST_BRIDGE(obj);
>      SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
>
>      /* Uninorth AGP bus */
>      memory_region_init_io(&h->conf_mem, OBJECT(h),
>                            &pci_host_conf_le_ops,
>                            obj, "unin-agp-conf-idx", 0x1000);
>      memory_region_init_io(&h->data_mem, OBJECT(h),
>                            &pci_host_data_le_ops,
>                            obj, "unin-agp-conf-data", 0x1000);
>
>      object_property_add_link(obj, "pic", TYPE_OPENPIC,
>                               (Object **) &s->pic,
>                               qdev_prop_allow_set_link_before_realize,
>                               0, NULL);
>
>      sysbus_init_mmio(sbd, &h->conf_mem);
>      sysbus_init_mmio(sbd, &h->data_mem);
> }
>
> You would change 'Object *obj' -> 'Object *self'?

static void pci_unin_agp_init(Object *obj)
{
  UNINHostState *self = UNI_NORTH_AGP_HOST_BRIDGE(obj);
  ..


When you override a parent method, you can create aliases for the
parent types with the abbreviation rule if necessary. But often, there
are less references to the parent types than the actual type you
implement, so in many cases, PARENT(self) can be less confusing. Your
example is a perhaps a bit more complicated than usual. Yet, having
"self" for the type we are implementing is still more readable to me.

>
> But here we want to keep 'klass', right?
>
> static void gpex_host_class_init(ObjectClass *klass, void *data)
> {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
>
>      hc->root_bus_path = gpex_host_root_bus_path;
>      dc->realize = gpex_host_realize;
>      set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>      dc->fw_name = "pci";
> }


If we follow a similar rule for class methods, I would suggest:

static void gpex_host_class_init(ObjectClass *oc, void *data)
{
  GPEXClass *klass = GPEX_CLASS(oc);

But in general, class_init has more code dealing with various parent
types, to override parent methods.

>
> Maybe we should restrict 'self' as name of Object type only?
> But your example is with SerialMM, so no?
>

"self" for the instance type, "klass" for the the class type.


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Philippe Mathieu-Daudé 4 years, 4 months ago
On 11/20/19 5:35 PM, Marc-André Lureau wrote:
> Hi
> 
> On Wed, Nov 20, 2019 at 8:05 PM Philippe Mathieu-Daudé
> <philmd@redhat.com> wrote:
>>
>> On 11/20/19 1:54 PM, Marc-André Lureau wrote:
>>> Following the discussion in thread "[PATCH v3 13/33] serial: start
>>> making SerialMM a sysbus device", I'd like to recommend the usage of
>>> "self" variable to reference to the OOP-style method instance, as
>>> commonly done in various languages and in GObject world.
>>>
>>> Cc: Peter Maydell <peter.maydell@linaro.org>
>>> Cc: Daniel P. Berrangé <berrange@redhat.com>
>>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>> ---
>>>    CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
>>>    1 file changed, 32 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
>>> index 427699e0e4..cb6635af71 100644
>>> --- a/CODING_STYLE.rst
>>> +++ b/CODING_STYLE.rst
>>> @@ -102,12 +102,38 @@ Rationale:
>>>    Naming
>>>    ======
>>>
>>> -Variables are lower_case_with_underscores; easy to type and read.  Structured
>>> -type names are in CamelCase; harder to type but standing out.  Enum type
>>> -names and function type names should also be in CamelCase.  Scalar type
>>> -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
>>> -uint64_t and family.  Note that this last convention contradicts POSIX
>>> -and is therefore likely to be changed.
>>> +Variables are lower_case_with_underscores; easy to type and read.
>>> +
>>> +The most common naming for a variable is an abbreviation of the type
>>> +name.  Some common examples:
>>> +
>>> +.. code-block:: c
>>> +
>>> +    Object *obj;
>>> +    QVirtioSCSI *scsi;
>>> +    SerialMM *smm;
>>> +
>>> +When writing QOM/OOP-style function, a "self" variable allows to refer
>>> +without ambiguity to the instance of the method that is being
>>> +implemented (this is not very common in QEMU code base, but it is
>>> +often a good option to increase the readability and consistency,
>>> +making further refactoring easier as well).  Example:
>>> +
>>> +.. code-block:: c
>>> +
>>> +    serial_mm_flush(SerialMM *self);
>>> +
>>> +    serial_mm_instance_init(Object *o) {
>>> +        SerialMM *self = SERIAL_MM(o);
>>> +        ..
>>> +    }
>>> +
>>> +Structured type names are in CamelCase; harder to type but standing
>>> +out.  Enum type names and function type names should also be in
>>> +CamelCase.  Scalar type names are
>>> +lower_case_with_underscores_ending_with_a_t, like the POSIX uint64_t
>>> +and family.  Note that this last convention contradicts POSIX and is
>>> +therefore likely to be changed.
>>>
>>>    When wrapping standard library functions, use the prefix ``qemu_`` to alert
>>>    readers that they are seeing a wrapped version; otherwise avoid this prefix.
>>>
>>
>> So in this example:
>>
>> static void pci_unin_agp_init(Object *obj)
>> {
>>       UNINHostState *s = UNI_NORTH_AGP_HOST_BRIDGE(obj);
>>       SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
>>       PCIHostState *h = PCI_HOST_BRIDGE(obj);
>>
>>       /* Uninorth AGP bus */
>>       memory_region_init_io(&h->conf_mem, OBJECT(h),
>>                             &pci_host_conf_le_ops,
>>                             obj, "unin-agp-conf-idx", 0x1000);
>>       memory_region_init_io(&h->data_mem, OBJECT(h),
>>                             &pci_host_data_le_ops,
>>                             obj, "unin-agp-conf-data", 0x1000);
>>
>>       object_property_add_link(obj, "pic", TYPE_OPENPIC,
>>                                (Object **) &s->pic,
>>                                qdev_prop_allow_set_link_before_realize,
>>                                0, NULL);
>>
>>       sysbus_init_mmio(sbd, &h->conf_mem);
>>       sysbus_init_mmio(sbd, &h->data_mem);
>> }
>>
>> You would change 'Object *obj' -> 'Object *self'?
> 
> static void pci_unin_agp_init(Object *obj)
> {
>    UNINHostState *self = UNI_NORTH_AGP_HOST_BRIDGE(obj);
>    ..
> 
> 
> When you override a parent method, you can create aliases for the
> parent types with the abbreviation rule if necessary. But often, there
> are less references to the parent types than the actual type you
> implement, so in many cases, PARENT(self) can be less confusing. Your
> example is a perhaps a bit more complicated than usual. Yet, having
> "self" for the type we are implementing is still more readable to me.

With your example I understand better "'self' for the instance type" you 
say below, which is "'self' variable allows to refer without ambiguity 
to the instance of the method that is being implemented" from the doc. OK.

> 
>>
>> But here we want to keep 'klass', right?
>>
>> static void gpex_host_class_init(ObjectClass *klass, void *data)
>> {
>>       DeviceClass *dc = DEVICE_CLASS(klass);
>>       PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass);
>>
>>       hc->root_bus_path = gpex_host_root_bus_path;
>>       dc->realize = gpex_host_realize;
>>       set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>>       dc->fw_name = "pci";
>> }
> 
> 
> If we follow a similar rule for class methods, I would suggest:
> 
> static void gpex_host_class_init(ObjectClass *oc, void *data)
> {
>    GPEXClass *klass = GPEX_CLASS(oc);

Clean.

> 
> But in general, class_init has more code dealing with various parent
> types, to override parent methods.

Yes, you are right.

> 
>>
>> Maybe we should restrict 'self' as name of Object type only?
>> But your example is with SerialMM, so no?
>>
> 
> "self" for the instance type, "klass" for the the class type.
> 


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Daniel P. Berrangé 4 years, 4 months ago
On Wed, Nov 20, 2019 at 04:54:44PM +0400, Marc-André Lureau wrote:
> Following the discussion in thread "[PATCH v3 13/33] serial: start
> making SerialMM a sysbus device", I'd like to recommend the usage of
> "self" variable to reference to the OOP-style method instance, as
> commonly done in various languages and in GObject world.

Looking at glib codebase, I don't see 'self' used all that
widely or consistently - much of gio/ uses an abbreviation
of the object type as the variable name.

> 
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
>  1 file changed, 32 insertions(+), 6 deletions(-)
> 
> diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
> index 427699e0e4..cb6635af71 100644
> --- a/CODING_STYLE.rst
> +++ b/CODING_STYLE.rst
> @@ -102,12 +102,38 @@ Rationale:
>  Naming
>  ======
>  
> -Variables are lower_case_with_underscores; easy to type and read.  Structured
> -type names are in CamelCase; harder to type but standing out.  Enum type
> -names and function type names should also be in CamelCase.  Scalar type
> -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> -uint64_t and family.  Note that this last convention contradicts POSIX
> -and is therefore likely to be changed.
> +Variables are lower_case_with_underscores; easy to type and read.
> +
> +The most common naming for a variable is an abbreviation of the type
> +name.  Some common examples:
> +
> +.. code-block:: c
> +
> +    Object *obj;
> +    QVirtioSCSI *scsi;
> +    SerialMM *smm;
> +
> +When writing QOM/OOP-style function, a "self" variable allows to refer
> +without ambiguity to the instance of the method that is being
> +implemented (this is not very common in QEMU code base, but it is
> +often a good option to increase the readability and consistency,
> +making further refactoring easier as well).  Example:

For me the first "sniff test" for a new coding style guideline is
whether QEMU actually follows the rule to any significant extent
already. If not, then I think the benefit would have to be very
significant to justify defining it as a rule. We've historically
be quite reluctant to do bulk updates of existing code to apply
new coding styles. Without planning a bulk update, you end up
with a coding style that is followed by 1% of the code and ignored
by the other 99%.

As noted above, the common case in QEMU is for the variable to be an
abbreviation of the type name. The number of places using "*self" is
almost single digits. So I think the idea of standardizing on "self"
is already questionable for QEMU.


I think the reason for the current pattern of abbreviated type name
is that when we're inventing OOP features in C the impl of inheritance
is always sub-optimal, such that you frequently find a need to cast to
parent types.  So in any single method it is common to have multiple
variables all refering to the object "self", each cast to a different
type. To pick one simple example

    QIOChannelFile fioc = qio_channel_file_new(...)
    QIOChannel *ioc = QIO_CHANNEL(fioc)
    Object *obj = OBJECT(fioc)

    qio_channel_read_all(ioc, buf, len, erro);
    object_unref(obj);


I think that using the object type abbreviation for the variable name
is a good thing.  Arbitrarily picking "self" for one of those variables
is unhelpful, as you have to then look back to the declaration of "self"
to remind yourself whether "self" is an QOIChannelFile or a QIOChannel
or an Object.


Constrast with C++ / Java, where I think the use of "self" is a good
thing, because the language has built-in  OOP concepts, such that
you can call a method on any parent class without having to first
cast "self" to the parent type. IOW, in those languages you don't
have to care about the particular types in the class hierarchy when
operating on "self".  This isn't true of C / QEMU's QOM.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] RFC: CODING_STYLE: describe "self" variable convention
Posted by Marc-André Lureau 4 years, 4 months ago
Hi

On Wed, Nov 20, 2019 at 9:05 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Wed, Nov 20, 2019 at 04:54:44PM +0400, Marc-André Lureau wrote:
> > Following the discussion in thread "[PATCH v3 13/33] serial: start
> > making SerialMM a sysbus device", I'd like to recommend the usage of
> > "self" variable to reference to the OOP-style method instance, as
> > commonly done in various languages and in GObject world.
>
> Looking at glib codebase, I don't see 'self' used all that

Only gio in glib actually uses gobject. You would have to look at
other GNOME C projects to realize this is the most common pattern.

> widely or consistently - much of gio/ uses an abbreviation
> of the object type as the variable name.
>
> >
> > Cc: Peter Maydell <peter.maydell@linaro.org>
> > Cc: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  CODING_STYLE.rst | 38 ++++++++++++++++++++++++++++++++------
> >  1 file changed, 32 insertions(+), 6 deletions(-)
> >
> > diff --git a/CODING_STYLE.rst b/CODING_STYLE.rst
> > index 427699e0e4..cb6635af71 100644
> > --- a/CODING_STYLE.rst
> > +++ b/CODING_STYLE.rst
> > @@ -102,12 +102,38 @@ Rationale:
> >  Naming
> >  ======
> >
> > -Variables are lower_case_with_underscores; easy to type and read.  Structured
> > -type names are in CamelCase; harder to type but standing out.  Enum type
> > -names and function type names should also be in CamelCase.  Scalar type
> > -names are lower_case_with_underscores_ending_with_a_t, like the POSIX
> > -uint64_t and family.  Note that this last convention contradicts POSIX
> > -and is therefore likely to be changed.
> > +Variables are lower_case_with_underscores; easy to type and read.
> > +
> > +The most common naming for a variable is an abbreviation of the type
> > +name.  Some common examples:
> > +
> > +.. code-block:: c
> > +
> > +    Object *obj;
> > +    QVirtioSCSI *scsi;
> > +    SerialMM *smm;
> > +
> > +When writing QOM/OOP-style function, a "self" variable allows to refer
> > +without ambiguity to the instance of the method that is being
> > +implemented (this is not very common in QEMU code base, but it is
> > +often a good option to increase the readability and consistency,
> > +making further refactoring easier as well).  Example:
>
> For me the first "sniff test" for a new coding style guideline is
> whether QEMU actually follows the rule to any significant extent
> already. If not, then I think the benefit would have to be very
> significant to justify defining it as a rule. We've historically

It's not a strict rule.

> be quite reluctant to do bulk updates of existing code to apply
> new coding styles. Without planning a bulk update, you end up
> with a coding style that is followed by 1% of the code and ignored
> by the other 99%.

We won't do a bulk update (that was never my intention, that would be
ridiculous).

Adding "self" to the zoo of variable names to refer to the
implentation instance isn't going to make a revolution, but is a good
pattern. I didn't think we would need to argue about it or modify
CODING_STYLE. But I should have known better ;)

>
> As noted above, the common case in QEMU is for the variable to be an
> abbreviation of the type name. The number of places using "*self" is
> almost single digits. So I think the idea of standardizing on "self"
> is already questionable for QEMU.
>
>
> I think the reason for the current pattern of abbreviated type name
> is that when we're inventing OOP features in C the impl of inheritance
> is always sub-optimal, such that you frequently find a need to cast to
> parent types.  So in any single method it is common to have multiple
> variables all refering to the object "self", each cast to a different
> type. To pick one simple example
>
>     QIOChannelFile fioc = qio_channel_file_new(...)
>     QIOChannel *ioc = QIO_CHANNEL(fioc)
>     Object *obj = OBJECT(fioc)
>
>     qio_channel_read_all(ioc, buf, len, erro);
>     object_unref(obj);

This code is probably not a OOP-style "method" (a method associated to
an instnace). I can't imagine what would be "self" here.

>
>
> I think that using the object type abbreviation for the variable name
> is a good thing.  Arbitrarily picking "self" for one of those variables
> is unhelpful, as you have to then look back to the declaration of "self"
> to remind yourself whether "self" is an QOIChannelFile or a QIOChannel
> or an Object.

Is "s" or "ss" or "ioc", "fioc" more readable?

self is of the type being implemented. Usually it is inside a
my_foo_method() in my-foo.c, so you know that self is of MyFoo type
without effort.

>
> Constrast with C++ / Java, where I think the use of "self" is a good
> thing, because the language has built-in  OOP concepts, such that
> you can call a method on any parent class without having to first
> cast "self" to the parent type. IOW, in those languages you don't
> have to care about the particular types in the class hierarchy when
> operating on "self".  This isn't true of C / QEMU's QOM.
>
>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
>


-- 
Marc-André Lureau