To: qemu-devel@nongnu.org
Cc: qemu-stable@nongnu.org, Fabiano Rosas, Peter Xu, Michael Tokarev
Subject: [Stable-9.1.3 47/58] migration: Fix arrays of pointers in JSON writer
Date: Mon, 27 Jan 2025 23:25:33 +0300
Message-Id: <20250127202547.3723716-47-mjt@tls.msk.ru>
From: Michael Tokarev

Currently, if an array of pointers contains a NULL pointer, that
pointer will be encoded as '0' in the stream. Since the JSON writer
doesn't define a "pointer" type, that '0' will now be a uint8, which
is different from the original type being pointed to, e.g. struct.

(We further label that uint8 "nullptr", but that's irrelevant to the
issue.)

That mixed-type array shouldn't be compressed, otherwise data is lost,
as the code currently makes the whole array have the type of the first
element:

css = {NULL, NULL, ..., 0x5555568a7940, NULL};

{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
 "version": 1, "fields": [
     ...,
     {"name": "css", "array_len": 256, "type": "nullptr", "size": 1},
     ...,
]}

In the above, the valid pointer at position 254 got lost among the
compressed array of nullptr.

While we could disable the array compression when a NULL pointer is
found, the JSON part of the stream is still part of the downtime, so
we should avoid writing unnecessary bytes to it.
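
The data loss described above can be reproduced with a small model. This is only an illustrative sketch, not QEMU code: NULL pointers are modelled as None, and the helper names element_type and compress_by_first_element are hypothetical.

```python
# Illustrative model only; this is not QEMU code.
# NULL pointers are modelled as None, valid pointers as any other object.


def element_type(elem):
    """Type name the JSON description would record for one array element."""
    return "nullptr" if elem is None else "struct"


def compress_by_first_element(name, elems):
    """Describe the whole array with one entry typed after element 0.

    This mimics the behaviour the commit message calls out as lossy.
    """
    return {
        "name": name,
        "array_len": len(elems),
        "type": element_type(elems[0]),
    }


# 254 NULLs, one valid pointer at index 254, then a final NULL.
css = [None] * 254 + [object()] + [None]

entry = compress_by_first_element("css", css)
print(entry)
```

The single entry claims 256 elements of type "nullptr", so nothing in the description records that element 254 is a struct.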

Keep the array compression in place, but if NULL and non-NULL pointers
are mixed, break the array into several type-contiguous pieces:

css = {NULL, NULL, ..., 0x5555568a7940, NULL};

{"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css",
 "version": 1, "fields": [
     ...,
     {"name": "css", "array_len": 254, "type": "nullptr", "size": 1},
     {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768},
     {"name": "css", "type": "nullptr", "size": 1},
     ...,
]}

Now each type-discontiguous region will become a new JSON entry. The
reader should interpret this as a concatenation of values, all part of
the same field.

Parsing the JSON with analyze-migration.py now shows the proper data
being pointed to at the places where the pointer is valid and
"nullptr" where there's NULL:

"s390_css (14)": {
    ...
    "css": [
        "nullptr",
        "nullptr",
        ...
        "nullptr",
        {
            "chpids": [
            {
                "in_use": "0x00",
                "type": "0x00",
                "is_virtual": "0x00"
            },
            ...
            ]
        },
        "nullptr"
    ]
}

Reviewed-by: Peter Xu
Message-Id: <20250109185249.23952-7-farosas@suse.de>
Signed-off-by: Fabiano Rosas
(cherry picked from commit 35049eb0d2fc72bb8c563196ec75b4d6c13fce02)
Signed-off-by: Michael Tokarev

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 52704c822c..82bd005a83 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
             int size = vmstate_size(opaque, field);
             uint64_t old_offset, written_bytes;
             JSONWriter *vmdesc_loop = vmdesc;
+            bool is_prev_null = false;
 
             trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems);
             if (field->flags & VMS_POINTER) {
                 first_elem = *(void **)first_elem;
                 assert(first_elem || !n_elems || !size);
             }
+
             for (i = 0; i < n_elems; i++) {
                 void *curr_elem = first_elem + size * i;
                 const VMStateField *inner_field;
+                bool is_null;
+                int max_elems = n_elems - i;
 
                 old_offset = qemu_file_transferred(f);
                 if (field->flags & VMS_ARRAY_OF_POINTER) {
@@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd,
                      * not follow.
                      */
                     inner_field = vmsd_create_fake_nullptr_field(field);
+                    is_null = true;
                 } else {
                     inner_field = field;
+                    is_null = false;
+                }
+
+                /*
+                 * Due to the fake nullptr handling above, if there's mixed
+                 * null/non-null data, it doesn't make sense to emit a
+                 * compressed array representation spanning the entire array
+                 * because the field types will be different (e.g. struct
+                 * vs. nullptr). Search ahead for the next null/non-null element
+                 * and start a new compressed array if found.
+                 */
+                if (field->flags & VMS_ARRAY_OF_POINTER &&
+                    is_null != is_prev_null) {
+
+                    is_prev_null = is_null;
+                    vmdesc_loop = vmdesc;
+
+                    for (int j = i + 1; j < n_elems; j++) {
+                        void *elem = *(void **)(first_elem + size * j);
+                        bool elem_is_null = !elem && size;
+
+                        if (is_null != elem_is_null) {
+                            max_elems = j - i;
+                            break;
+                        }
+                    }
                 }
 
                 vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field,
-                                      i, n_elems);
+                                      i, max_elems);
 
                 if (inner_field->flags & VMS_STRUCT) {
                     ret = vmstate_save_state(f, inner_field->vmsd,
diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py
index 923f174f1b..8e1fbf4c9d 100755
--- a/scripts/analyze-migration.py
+++ b/scripts/analyze-migration.py
@@ -502,15 +502,25 @@ def read(self):
             field['data'] = reader(field, self.file)
             field['data'].read()
 
-            if 'index' in field:
-                if field['name'] not in self.data:
-                    self.data[field['name']] = []
-                a = self.data[field['name']]
-                if len(a) != int(field['index']):
-                    raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index'])))
-                a.append(field['data'])
+            fname = field['name']
+            fdata = field['data']
+
+            # The field could be:
+            # i) a single data entry, e.g. uint64
+            # ii) an array, indicated by it containing the 'index' key
+            #
+            # However, the overall data after parsing the whole
+            # stream, could be a mix of arrays and single data fields,
+            # all sharing the same field name due to how QEMU breaks
+            # up arrays with NULL pointers into multiple compressed
+            # array segments.
+            if fname not in self.data:
+                self.data[fname] = fdata
+            elif type(self.data[fname]) == list:
+                self.data[fname].append(fdata)
             else:
-                self.data[field['name']] = field['data']
+                tmp = self.data[fname]
+                self.data[fname] = [tmp, fdata]
 
         if 'subsections' in self.desc['struct']:
             for subsection in self.desc['struct']['subsections']:
-- 
2.39.5
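
For readers who want to experiment outside QEMU, the two halves of this patch can be sketched together in a few lines of Python. This is a simplified model under stated assumptions, not the code in migration/vmstate.c: NULL pointers are modelled as None, the helpers split_runs, describe and merge_field are hypothetical names, and the rule that "array_len" is omitted for a one-element run is inferred from the JSON example in the commit message. Only the body of merge_field follows the patched read() directly.

```python
from itertools import groupby

# Simplified model of the patch; not QEMU code.
# NULL pointers are modelled as None, valid pointers as any other object.


def split_runs(elems):
    """Yield (start_index, run_length, type_name) per type-contiguous run."""
    index = 0
    for is_null, run in groupby(elems, key=lambda e: e is None):
        length = len(list(run))
        yield index, length, "nullptr" if is_null else "struct"
        index += length


def describe(name, elems):
    """Emit one JSON-like entry per run instead of one for the whole array."""
    entries = []
    for _start, length, type_name in split_runs(elems):
        entry = {"name": name, "type": type_name}
        # Assumption taken from the commit message example: single-element
        # runs carry no "array_len" key.
        if length > 1:
            entry["array_len"] = length
        entries.append(entry)
    return entries


def merge_field(data, fname, fdata):
    """Reader side: same merge rule as the patched read() in the script."""
    if fname not in data:
        data[fname] = fdata              # first value seen under this name
    elif type(data[fname]) == list:
        data[fname].append(fdata)        # already a list, keep appending
    else:
        data[fname] = [data[fname], fdata]  # second value: promote to list


css = [None] * 254 + [object()] + [None]
entries = describe("css", css)

# Feed a shortened stream of parsed values through the reader-side merge.
data = {}
for value in ["nullptr", "nullptr", {"chpids": []}, "nullptr"]:
    merge_field(data, "css", value)
```

Here describe("css", css) yields three entries (254 nullptr, one struct, one nullptr), and the merge leaves data["css"] as a single list holding all four values in stream order, which is the concatenation the commit message asks readers to perform.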