We treat most HTTP errors as non-fatal when fetching assets,
but forgot to handle network-level errors. This adds catching
of URLError so that we retry on failure, which will ultimately
trigger graceful skipping in the pre-cache task.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
tests/functional/qemu_test/asset.py | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
index ae2bec3ea5..36f64fe2f6 100644
--- a/tests/functional/qemu_test/asset.py
+++ b/tests/functional/qemu_test/asset.py
@@ -15,7 +15,7 @@
 from time import sleep
 from pathlib import Path
 from shutil import copyfileobj
-from urllib.error import HTTPError
+from urllib.error import HTTPError, URLError
 
 class AssetError(Exception):
     def __init__(self, asset, msg, transient=False):
@@ -171,6 +171,13 @@ def fetch(self):
                     raise AssetError(self, "Unable to download %s: "
                                      "HTTP error %d" % (self.url, e.code))
                 continue
+            except URLError as e:
+                # This is typically a network/service level error,
+                # e.g. <urlopen error [Errno 110] Connection timed out>
+                tmp_cache_file.unlink()
+                self.log.error("Unable to download %s: URL error %s",
+                               self.url, e)
+                continue
             except Exception as e:
                 tmp_cache_file.unlink()
                 raise AssetError(self, "Unable to download %s: %s" % (
--
2.50.1
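For context, the `continue` statements in the hunk above sit inside fetch()'s download retry loop. A minimal sketch of that retry pattern follows; the helper name, retry count, and delay are illustrative assumptions, not the actual values in asset.py:

```python
from time import sleep
from urllib.error import HTTPError, URLError


def fetch_with_retries(download, retries=3, delay=0):
    """Retry download() when it fails with an HTTP or network error.

    Illustrative sketch only: the real fetch() in asset.py also
    manages a temporary cache file and treats some HTTP codes as
    permanent failures rather than retrying.
    """
    last_err = None
    for _ in range(retries):
        try:
            return download()
        except (HTTPError, URLError) as e:
            last_err = e
            sleep(delay)  # back off before the next attempt
            continue
    raise RuntimeError("download failed after %d attempts: %s"
                       % (retries, last_err))
```

With this shape, a download that raises URLError twice before succeeding completes on the third attempt instead of failing the caller.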
On 29/08/2025 10.31, Daniel P. Berrangé wrote:
> We treat most HTTP errors as non-fatal when fetching assets,
> but forgot to handle network level errors. This adds catching
> of URLError so that we retry on failure, and will ultimately
> trigger graceful skipping in the pre-cache task.
[...]
> +            except URLError as e:
> +                # This is typically a network/service level error
> +                # eg urlopen error [Errno 110] Connection timed out>
> +                tmp_cache_file.unlink()
> +                self.log.error("Unable to download %s: URL error %s",
> +                               self.url, e)
> +                continue

Hmm, I don't think we should retry on each and every URLError. For example
if you have a typo in the server name, you get a "Name or service not known"
URLError, and it does not make sense to retry in that case.

Also, in case of the server being down, it takes a minute or two 'til the
urllib gives up, so if you retry multiple times in that case, you can easily
extend the test time by > 5 minutes (as you can see by the timestamps in the
example in your cover letter), which is unfortunate, too (considering that
there is also a timeout setting for the gitlab CI jobs).
I think I'd maybe rather do something like this instead, without retrying
the download, just marking certain errors as transient:

            except URLError as e:
                tmp_cache_file.unlink()
                msg = str(e.reason)
                self.log.error("Unable to download %s: URL error %s",
                               self.url, msg)
                raise AssetError(self, msg,
                                 transient=("Network is unreachable" in msg))

WDYT?

 Thomas
On Fri, Aug 29, 2025 at 11:00:12AM +0200, Thomas Huth wrote:
> Hmm, I don't think we should retry on each and every URLError. For example
> if you have a typo in the server name, you get a "Name or service not known"
> URLError, and it does not make sense to retry in that case.
>
> Also, in case of the server being down, it takes a minute or two 'til the
> urllib gives up, so if you retry multiple times in that case, you can easily
> extend the test time by > 5 minutes (as you can see by the timestamps in the
> example in your cover letter), which is unfortunate, too (considering that
> there is also a timeout setting for the gitlab CI jobs).
>
> I think I'd maybe rather do something like this instead, without retrying
> the download, just marking certain errors as transient:
>
>     except URLError as e:
>         tmp_cache_file.unlink()
>         msg = str(e.reason)
>         self.log.error("Unable to download %s: URL error %s",
>                        self.url, msg)
>         raise AssetError(self, msg,
>                          transient=("Network is unreachable" in msg))
>
> WDYT?

Ok, yes, we can do that instead.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org          -o-          https://fstop138.berrange.com :|
|: https://entangle-photo.org   -o-   https://www.instagram.com/dberrange :|
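To illustrate the agreed-on approach: a hypothetical sketch of how a pre-cache task could honour the AssetError "transient" flag. AssetError's constructor signature matches the one shown in the patch context; the Asset stub and the precache() loop are illustrative assumptions, not the actual QEMU implementation:

```python
class AssetError(Exception):
    # Signature matches the class shown in the patch context above.
    def __init__(self, asset, msg, transient=False):
        self.asset = asset
        self.msg = msg
        self.transient = transient


class Asset:
    # Hypothetical stub standing in for qemu_test.asset.Asset.
    def __init__(self, url):
        self.url = url

    def fetch(self):
        # Stand-in for the real download: simulate a network-level
        # failure that fetch() classified as transient.
        raise AssetError(self, "Network is unreachable", transient=True)


def precache(assets):
    """Fetch each asset; skip transient failures, collect fatal ones."""
    fatal = []
    for asset in assets:
        try:
            asset.fetch()
        except AssetError as e:
            if e.transient:
                print("skipping %s: %s" % (e.asset.url, e.msg))
                continue
            fatal.append(asset)
    return fatal
```

A transient failure is thus reported and skipped, so the pre-cache run continues rather than aborting on a flaky network.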