migration: optimize cpr fd lookup using GHashTable

[RFC V2 0/2] migration: optimize cpr fd lookup using GHashTable

Posted by hongmianquan 3 days ago

Currently, the CPR subsystem in QEMU uses a QLIST to store fds.
In scenarios where a large number of
fds are involved (such as a VM with many vfio-pci devices), looking up an fd
via `cpr_find_fd` becomes a performance bottleneck due to the O(N) linear search.
This patch series optimizes the cpr fd storage by replacing the QLIST
with a GHashTable. The time complexity for `cpr_find_fd` is reduced
from O(N) to O(1).

To demonstrate the performance improvement, we tested the total time
consumed by `cpr_find_fd` (called N times for N fds) under our real-world
business scenarios with different numbers of file descriptors. The results
are measured in nanoseconds:

| Number of FDs | Total time with QLIST (ns) | Total time with GHashTable (ns) |
|---------------|----------------------------|---------------------------------|
| 540           | 936,753                    | 393,358                         |
| 2,870         | 24,102,342                 | 2,212,113                       |
| 7,530         | 152,715,916                | 5,474,310                       |

As shown in the data, the lookup time grows exponentially with the QLIST
as the number of fds increases. With the GHashTable, the time consumption
remains linear (O(1) per lookup), significantly reducing the downtime during
the CPR process.

We choose to migrate the GHashTable directly rather than migrating a
QLIST and rebuilding the hash on the destination side. Since `cpr_find_fd`
is used during normal runtime (where fds are frequently added or deleted),
reconstructing the hash would require us to maintain and synchronize both
a QLIST and a temporary hashtable simultaneously on both the source and
destination sides.
Migrating the GHashTable natively keeps `cpr_state.fds` as the single
source of truth, eliminating synchronization overhead and potential bugs.
Additionally, the new `VMSTATE_GHASH_V` macro provides a reusable API
for other QEMU subsystems to migrate key-value mappings.

The series is structured as follows:
1. The first patch introduces migration support for `GHashTable` by
   adding the `VMSTATE_GHASH_V` macro and related save/load functions.
   This is inspired by previous implementations (e.g., commit 9a85e4b)
   and handles the serialization of hash table keys and values.
2. The second patch refactors `cpr_state.fds` from a QLIST to a
   GHashTable. It defines `CprFdKey` and `CprFdVal`, sets up the hash
   and equal functions, and updates all fd operations (save, delete,
   find, resave, walk) to use the new hash table API.

hongmianquan (2):
  migration: Support ghash migration
  cpr: use hashtable for cpr fds

 include/migration/cpr.h     |   4 +-
 include/migration/vmstate.h |  22 ++++++
 migration/cpr.c             | 129 +++++++++++++++++++++----------
 migration/trace-events      |   5 ++
 migration/vmstate-types.c   | 147 ++++++++++++++++++++++++++++++++++++
 system/vl.c                 |   2 +
 6 files changed, 269 insertions(+), 40 deletions(-)

-- 
2.32.1 (Apple Git-133)

Re: [RFC V2 0/2] migration: optimize cpr fd lookup using GHashTable

Posted by Peter Xu 2 days, 22 hours ago

On Thu, Mar 19, 2026 at 07:38:28PM +0800, hongmianquan wrote:
> Currently, the CPR subsystem in QEMU uses a QLIST to store fds.
> In scenarios where a large number of
> fds are involved (such as a VM with many vfio-pci devices), looking up an fd
> via `cpr_find_fd` becomes a performance bottleneck due to the O(N) linear search.
> This patch series optimizes the cpr fd storage by replacing the QLIST
> with a GHashTable. The time complexity for `cpr_find_fd` is reduced
> from O(N) to O(1).

Some unanswered comments here for v1:

https://lore.kernel.org/all/aacuTpoMucQ1wrUN@x1.local/#t

-- 
Peter Xu

Re: [RFC V2 0/2] migration: optimize cpr fd lookup using GHashTable

Posted by hongmainquan 2 days, 21 hours ago

Thanks for the reminder. I apologize if I missed any specific points 
from your v1 review. Could you please point out which specific comments 
I left unanswered? I will address them immediately.
Regarding your main questions in v1 (performance data and why migrate 
hash), I have updated them in the cover letter of the v2 series, but 
I'll summarize the answers here directly for your convenience:

1. Why the lookup is slow, and the performance improvement data: 
Currently, `find_fd` uses a linear search (O(N)) on a QLIST.
The latency increases with a rising number of fds, and the supporting 
test data is provided in the cover letter.

2. Why migrate a hash instead of migrating a list and reconstructing it:
My primary goal was to accelerate all `find_fd` lookups. To achieve 
this, the underlying data source needs to be replaced with a hash table. 
My initial thought was to simply change the fds structure entirely to a 
GHashTable and migrate it directly. This approach seemed much more 
straightforward to implement than maintaining a conversion between QLIST 
and hash, though, as you rightly pointed out, it does inevitably involve 
changing the CPR ABI.
Secondly, I was concerned about the timing of converting a QLIST to a 
GHashTable. The conversion must happen when the fds list is in its final 
state with no further modifications. However, during normal VM 
operation, there are ongoing operations that save or modify fds. If the 
GHashTable were merely an auxiliary structure, we need to keep the QLIST 
and the GHashTable synchronized.
That said, your suggestion makes a lot of sense. If we treat the 
GHashTable as the primary structure and only convert it to a QLIST in 
the pre_save hook right before migration—then reconstruct the hash table 
from the QLIST on the destination, it would completely work. This 
preserves the original QLIST-based migration ABI.
In short, I believe both approaches are technically feasible, but I 
agree that my current implementation has the downside of breaking the 
CPR ABI. I look forward to discussing this further with you.

Thanks.

在 2026/3/19 21:51, Peter Xu 写道:
> On Thu, Mar 19, 2026 at 07:38:28PM +0800, hongmianquan wrote:
>> Currently, the CPR subsystem in QEMU uses a QLIST to store fds.
>> In scenarios where a large number of
>> fds are involved (such as a VM with many vfio-pci devices), looking up an fd
>> via `cpr_find_fd` becomes a performance bottleneck due to the O(N) linear search.
>> This patch series optimizes the cpr fd storage by replacing the QLIST
>> with a GHashTable. The time complexity for `cpr_find_fd` is reduced
>> from O(N) to O(1).
> 
> Some unanswered comments here for v1:
> 
> https://lore.kernel.org/all/aacuTpoMucQ1wrUN@x1.local/#t
>