In cases when the hash function for a name collides with other entry
already in the hash we prepend to the bucket. This creates a 'stack
effect' on the buckets if we then iterate through the hash. Normally
this is not a problem, but in tests we want deterministic results.
Since it does not matter where we add the entry and it's usually more
probable that a different entry will be accessed next change it to
append to the end of the bucket. Luckily we already iterate throught the
bucket once thus we can easily find the last entry and just connect the
new entry after it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
src/util/virhash.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/src/util/virhash.c b/src/util/virhash.c
index 4907c1124f..0e30106041 100644
--- a/src/util/virhash.c
+++ b/src/util/virhash.c
@@ -316,6 +316,7 @@ virHashAddOrUpdateEntry(virHashTablePtr table, const void *name,
{
size_t key, len = 0;
virHashEntryPtr entry;
+ virHashEntryPtr last = NULL;
void *new_name;
if ((table == NULL) || (name == NULL))
@@ -337,6 +338,7 @@ virHashAddOrUpdateEntry(virHashTablePtr table, const void *name,
return -1;
}
}
+ last = entry;
len++;
}
@@ -347,8 +349,11 @@ virHashAddOrUpdateEntry(virHashTablePtr table, const void *name,
entry->name = new_name;
entry->payload = userdata;
- entry->next = table->table[key];
- table->table[key] = entry;
+
+ if (last)
+ last->next = entry;
+ else
+ table->table[key] = entry;
table->nbElems++;
--
2.20.1
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: > In cases when the hash function for a name collides with other entry > already in the hash we prepend to the bucket. This creates a 'stack > effect' on the buckets if we then iterate through the hash. Normally > this is not a problem, but in tests we want deterministic results. > > Since it does not matter where we add the entry and it's usually more > probable that a different entry will be accessed next change it to > append to the end of the bucket. Luckily we already iterate throught the > bucket once thus we can easily find the last entry and just connect the > new entry after it. I'm not understanding how this change makes the tests any more deterministic than before. The determinism for hash bucket placement is already ensured by having a non-random impl forvirHashCodeGen in tests/virdeterministichashmock.c This change simply reverses the order in which we iterate over values when many keys hash to the same bucket. If this mattered for tests then surely, this change would have to update the test suite too to take account of the new reversed ordering ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 15:17:31 +0100, Daniel Berrange wrote: > On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: > > In cases when the hash function for a name collides with other entry > > already in the hash we prepend to the bucket. This creates a 'stack > > effect' on the buckets if we then iterate through the hash. Normally > > this is not a problem, but in tests we want deterministic results. > > > > Since it does not matter where we add the entry and it's usually more > > probable that a different entry will be accessed next change it to > > append to the end of the bucket. Luckily we already iterate throught the > > bucket once thus we can easily find the last entry and just connect the > > new entry after it. > > I'm not understanding how this change makes the tests any more > deterministic than before. > > The determinism for hash bucket placement is already ensured by > having a non-random impl forvirHashCodeGen in > tests/virdeterministichashmock.c > > This change simply reverses the order in which we iterate over values > when many keys hash to the same bucket. If this mattered for tests > then surely, this change would have to update the test suite too to > take account of the new reversed ordering ? If you have a hash collision when adding into the hash and then use virHashForeach the items are formatted in opposite order. Thus if you use this to e.g. store things parsed from XML and then format it back, every single iteration reverses the order. This is not a problem per-se since we can use a different output file, but it's inconvenient since we can't use the same output file. Currently there aren't any collisions which would cause problems, but I have one on my private branch. Given that this change is rather simple I figured we could just stop inverting them. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 04:26:49PM +0200, Peter Krempa wrote: > On Tue, Apr 16, 2019 at 15:17:31 +0100, Daniel Berrange wrote: > > On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: > > > In cases when the hash function for a name collides with other entry > > > already in the hash we prepend to the bucket. This creates a 'stack > > > effect' on the buckets if we then iterate through the hash. Normally > > > this is not a problem, but in tests we want deterministic results. > > > > > > Since it does not matter where we add the entry and it's usually more > > > probable that a different entry will be accessed next change it to > > > append to the end of the bucket. Luckily we already iterate throught the > > > bucket once thus we can easily find the last entry and just connect the > > > new entry after it. > > > > I'm not understanding how this change makes the tests any more > > deterministic than before. > > > > The determinism for hash bucket placement is already ensured by > > having a non-random impl forvirHashCodeGen in > > tests/virdeterministichashmock.c > > > > This change simply reverses the order in which we iterate over values > > when many keys hash to the same bucket. If this mattered for tests > > then surely, this change would have to update the test suite too to > > take account of the new reversed ordering ? > > If you have a hash collision when adding into the hash and then use > virHashForeach the items are formatted in opposite order. > > Thus if you use this to e.g. store things parsed from XML and then > format it back, every single iteration reverses the order. This is not a > problem per-se since we can use a different output file, but it's > inconvenient since we can't use the same output file. > > Currently there aren't any collisions which would cause problems, but I > have one on my private branch. > > Given that this change is rather simple I figured we could just stop > inverting them. Ok, that does make a little more sense, though generally my view would be that if we're doing something that is order sensitive like XML parsing & formatting, then we shouldn't be using a hash table to represent the data - at least not as the primary record. Which bit of code is using the hash in this case ? Is it something purely inside the test suite, or in the real XML handling code ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 15:32:31 +0100, Daniel Berrange wrote: > On Tue, Apr 16, 2019 at 04:26:49PM +0200, Peter Krempa wrote: > > On Tue, Apr 16, 2019 at 15:17:31 +0100, Daniel Berrange wrote: > > > On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: [...] > > Currently there aren't any collisions which would cause problems, but I > > have one on my private branch. > > > > Given that this change is rather simple I figured we could just stop > > inverting them. > > Ok, that does make a little more sense, though generally my view would > be that if we're doing something that is order sensitive like XML > parsing & formatting, then we shouldn't be using a hash table to > represent the data - at least not as the primary record. > > Which bit of code is using the hash in this case ? Is it something > purely inside the test suite, or in the real XML handling code ? I'm adding some private data for blockjobs which are formatted into the status XML. I'm storing them in a hash table as the order isn't really important here and the user should never see this. I was just bothered by the fact that I'd need to have an output file (or change the name of the blockjob) and figured that the insertion into the hash table is very simple and does not really matter how we do it. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 04:41:57PM +0200, Peter Krempa wrote: > On Tue, Apr 16, 2019 at 15:32:31 +0100, Daniel Berrange wrote: > > On Tue, Apr 16, 2019 at 04:26:49PM +0200, Peter Krempa wrote: > > > On Tue, Apr 16, 2019 at 15:17:31 +0100, Daniel Berrange wrote: > > > > On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: > > [...] > > > > Currently there aren't any collisions which would cause problems, but I > > > have one on my private branch. > > > > > > Given that this change is rather simple I figured we could just stop > > > inverting them. > > > > Ok, that does make a little more sense, though generally my view would > > be that if we're doing something that is order sensitive like XML > > parsing & formatting, then we shouldn't be using a hash table to > > represent the data - at least not as the primary record. > > > > Which bit of code is using the hash in this case ? Is it something > > purely inside the test suite, or in the real XML handling code ? > > I'm adding some private data for blockjobs which are formatted into the > status XML. I'm storing them in a hash table as the order isn't really > important here and the user should never see this. > > I was just bothered by the fact that I'd need to have an output file (or > change the name of the blockjob) and figured that the insertion into the > hash table is very simple and does not really matter how we do it. ok, that makes more sense now, thanks for explaining. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Apr 16, 2019 at 04:09:54PM +0200, Peter Krempa wrote: > In cases when the hash function for a name collides with other entry > already in the hash we prepend to the bucket. This creates a 'stack > effect' on the buckets if we then iterate through the hash. Normally > this is not a problem, but in tests we want deterministic results. > > Since it does not matter where we add the entry and it's usually more > probable that a different entry will be accessed next change it to > append to the end of the bucket. Luckily we already iterate throught the > bucket once thus we can easily find the last entry and just connect the > new entry after it. > > Signed-off-by: Peter Krempa <pkrempa@redhat.com> > --- > src/util/virhash.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2024 Red Hat, Inc.