rust/kernel/alloc/kvec.rs | 111 ++++++++++++++++++++++++++++++++-------------- 1 file changed, 78 insertions(+), 33 deletions(-)
KVec currently has `extend_with` and `extend_from_slice` methods, but no
way extend a vector from a regular iterator as provided by the `Extend`
trait.
Due to the need to provide the GFP flags, `Extend` cannot be implemented
directly, so simply define a homonymous method that takes an extra
`flags` argument.
The aforementioned `extend_with` and `extend_from_slice` can then be
reimplemented as direct invocations of this new method - maybe they can
eventually be removed.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
I was a bit surprised to find no equivalent of the `Extend` trait for
KVec, and while I anticipate to be told the reason for this, I also
didn't hit any hard wall trying to come with my own implementation so
here it is.
I expect the new `extend_with` and `extend_from_slice` to be optimized
into something close to their previous implementations, but am not sure
how I can simply verify that this is the case - any hint would be
appreciated!
---
Changes in v3:
- Use repeat_n() instead of repeat() for extend_with() in order to avoid
an extra clone of the value.
- Be more precise about the cases where extend() will perform optimally
or not in its documentation, and how the vector might be modified even
in case of an error.
- Replace into_iter() with iter() and iter_mut() where suggested by the
kernel test robot.
- Link to v2: https://lore.kernel.org/r/20250405-vec_extend-v2-1-e4a85af43cb3@nvidia.com
Changes in v2:
- Changed the diff algorithm to histogram for a more readable patch.
---
rust/kernel/alloc/kvec.rs | 111 ++++++++++++++++++++++++++++++++--------------
1 file changed, 78 insertions(+), 33 deletions(-)
diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
index ae9d072741cedbb34bed0be0c20cc75472aa53be..b3c4227e232cca3b5c17930abc63810240b9c4da 100644
--- a/rust/kernel/alloc/kvec.rs
+++ b/rust/kernel/alloc/kvec.rs
@@ -454,30 +454,86 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
}
}
+impl<T, A: Allocator> Vec<T, A> {
+ /// Extends the vector by the elements of `iter`.
+ ///
+ /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
+ /// imprecise implementations - albeit in a non-optimal way.
+ ///
+ /// This method returns an error if a memory reallocation required to accommodate the new items
+ /// failed. In this case, callers must assume that some (but not all) elements of `iter` might
+ /// have been added to the vector.
+ ///
+ /// # Note on optimal behavior and correctness
+ ///
+ /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`]
+ /// implementation of the `iter` is.
+ ///
+ /// It performs optimally with at most a single memory reallocation if the lower bound of
+ /// `size_hint` is the exact number of items actually yielded.
+ ///
+ /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to
+ /// cover the whole iterator from the successive lower bounds returned by `size_hint`.
+ ///
+ /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory
+ /// might be reserved.
+ ///
+ /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items
+ /// are yielded by the iterator and returns. This may result in some items not being added if
+ /// there were still some remaining.
+ ///
+ /// In the kernel most iterators are expected to have a precise and correct `size_hint`
+ /// implementation, so this should nicely optimize out for these cases.
+ pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
+ where
+ I: IntoIterator<Item = T>,
+ {
+ let mut iter = iter.into_iter();
+
+ loop {
+ let low_bound = match iter.size_hint() {
+ // No more items expected, we can return.
+ (0, Some(0)) => break,
+ // Possibly more items but not certain, tentatively add one.
+ (0, _) => 1,
+ // More items pending, reserve space for the lower bound.
+ (low_bound, _) => low_bound,
+ };
+
+ self.reserve(low_bound, flags)?;
+
+ // Number of items we actually added.
+ let added_items = self
+ .spare_capacity_mut()
+ .iter_mut()
+ // Take a mutable reference to the iterator so we can reuse it in the next
+ // iteration of the loop if needed.
+ .zip(&mut iter)
+ .fold(0, |count, (dst, src)| {
+ dst.write(src);
+
+ count + 1
+ });
+
+ // SAFETY:
+ // - `self.len() + added_items <= self.capacity()` due to the call to `reserve` above,
+ // - items `[self.len()..self.len() + added_items - 1]` are initialized.
+ unsafe { self.set_len(self.len() + added_items) };
+
+ // `size_hint` was incorrect and our iterator ended before its advertized lower bound.
+ if added_items < low_bound {
+ break;
+ }
+ }
+
+ Ok(())
+ }
+}
+
impl<T: Clone, A: Allocator> Vec<T, A> {
/// Extend the vector by `n` clones of `value`.
pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), AllocError> {
- if n == 0 {
- return Ok(());
- }
-
- self.reserve(n, flags)?;
-
- let spare = self.spare_capacity_mut();
-
- for item in spare.iter_mut().take(n - 1) {
- item.write(value.clone());
- }
-
- // We can write the last element directly without cloning needlessly.
- spare[n - 1].write(value);
-
- // SAFETY:
- // - `self.len() + n < self.capacity()` due to the call to reserve above,
- // - the loop and the line above initialized the next `n` elements.
- unsafe { self.set_len(self.len() + n) };
-
- Ok(())
+ self.extend(core::iter::repeat_n(value, n), flags)
}
/// Pushes clones of the elements of slice into the [`Vec`] instance.
@@ -496,18 +552,7 @@ pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), Al
/// # Ok::<(), Error>(())
/// ```
pub fn extend_from_slice(&mut self, other: &[T], flags: Flags) -> Result<(), AllocError> {
- self.reserve(other.len(), flags)?;
- for (slot, item) in core::iter::zip(self.spare_capacity_mut(), other) {
- slot.write(item.clone());
- }
-
- // SAFETY:
- // - `other.len()` spare entries have just been initialized, so it is safe to increase
- // the length by the same number.
- // - `self.len() + other.len() <= self.capacity()` is guaranteed by the preceding `reserve`
- // call.
- unsafe { self.set_len(self.len() + other.len()) };
- Ok(())
+ self.extend(other.iter().cloned(), flags)
}
/// Create a new `Vec<T, A>` and extend it by `n` clones of `value`.
---
base-commit: a2cc6ff5ec8f91bc463fd3b0c26b61166a07eb11
change-id: 20250405-vec_extend-4321251acc21
Best regards,
--
Alexandre Courbot <acourbot@nvidia.com>
On Sun, Apr 06, 2025 at 10:01:55PM +0900, Alexandre Courbot wrote:
> KVec currently has `extend_with` and `extend_from_slice` methods, but no
> way extend a vector from a regular iterator as provided by the `Extend`
> trait.
>
> Due to the need to provide the GFP flags, `Extend` cannot be implemented
> directly, so simply define a homonymous method that takes an extra
> `flags` argument.
This is the same approach I took with Vec::collect(); I think this is fine for
now. One we attempt to implement more collections, we should implement our own
Extend and FromIterator traits.
> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
> index ae9d072741cedbb34bed0be0c20cc75472aa53be..b3c4227e232cca3b5c17930abc63810240b9c4da 100644
> --- a/rust/kernel/alloc/kvec.rs
> +++ b/rust/kernel/alloc/kvec.rs
> @@ -454,30 +454,86 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
> }
> }
>
> +impl<T, A: Allocator> Vec<T, A> {
Please re-use the existing impl block above, i.e.
diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
index b3c4227e232c..72659b017553 100644
--- a/rust/kernel/alloc/kvec.rs
+++ b/rust/kernel/alloc/kvec.rs
@@ -452,9 +452,7 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
Ok(())
}
-}
-impl<T, A: Allocator> Vec<T, A> {
/// Extends the vector by the elements of `iter`.
///
/// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
> + /// Extends the vector by the elements of `iter`.
> + ///
> + /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
> + /// imprecise implementations - albeit in a non-optimal way.
> + ///
> + /// This method returns an error if a memory reallocation required to accommodate the new items
> + /// failed. In this case, callers must assume that some (but not all) elements of `iter` might
> + /// have been added to the vector.
> + ///
> + /// # Note on optimal behavior and correctness
> + ///
> + /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`]
> + /// implementation of the `iter` is.
> + ///
> + /// It performs optimally with at most a single memory reallocation if the lower bound of
> + /// `size_hint` is the exact number of items actually yielded.
> + ///
> + /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to
> + /// cover the whole iterator from the successive lower bounds returned by `size_hint`.
> + ///
> + /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory
> + /// might be reserved.
> + ///
> + /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items
> + /// are yielded by the iterator and returns. This may result in some items not being added if
> + /// there were still some remaining.
> + ///
> + /// In the kernel most iterators are expected to have a precise and correct `size_hint`
> + /// implementation, so this should nicely optimize out for these cases.
I agree, hence I think we should enforce to be provided with a guaranteed
correct size hint and simplify the code. I think we should extend the signature.
pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
where
I: IntoIterator<Item = T>,
I::IntoIter: ExactSizeIterator,
And implement ExactSizeIterator for IntoIter.
The only thing that bothers me a bit is that the documentation [1] of
ExactSizeIterator sounds a bit ambiguous.
It says: "When implementing an ExactSizeIterator, you must also implement
Iterator. When doing so, the implementation of Iterator::size_hint *must*
return the exact size of the iterator."
But it also says: "Note that this trait is a safe trait and as such does not and
cannot guarantee that the returned length is correct. This means that unsafe
code must not rely on the correctness of Iterator::size_hint. The unstable and
unsafe TrustedLen trait gives this additional guarantee."
Acknowledging the latter, I think we should implement our own trait for this
instead. Our own version of TrustedLen seems reasonable to me.
[1] https://doc.rust-lang.org/std/iter/trait.ExactSizeIterator.html
On Mon Apr 7, 2025 at 8:01 PM JST, Danilo Krummrich wrote:
> On Sun, Apr 06, 2025 at 10:01:55PM +0900, Alexandre Courbot wrote:
>> KVec currently has `extend_with` and `extend_from_slice` methods, but no
>> way extend a vector from a regular iterator as provided by the `Extend`
>> trait.
>>
>> Due to the need to provide the GFP flags, `Extend` cannot be implemented
>> directly, so simply define a homonymous method that takes an extra
>> `flags` argument.
>
> This is the same approach I took with Vec::collect(); I think this is fine for
> now. One we attempt to implement more collections, we should implement our own
> Extend and FromIterator traits.
>
>> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
>> index ae9d072741cedbb34bed0be0c20cc75472aa53be..b3c4227e232cca3b5c17930abc63810240b9c4da 100644
>> --- a/rust/kernel/alloc/kvec.rs
>> +++ b/rust/kernel/alloc/kvec.rs
>> @@ -454,30 +454,86 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
>> }
>> }
>>
>> +impl<T, A: Allocator> Vec<T, A> {
>
> Please re-use the existing impl block above, i.e.
>
> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
> index b3c4227e232c..72659b017553 100644
> --- a/rust/kernel/alloc/kvec.rs
> +++ b/rust/kernel/alloc/kvec.rs
> @@ -452,9 +452,7 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
>
> Ok(())
> }
> -}
>
> -impl<T, A: Allocator> Vec<T, A> {
> /// Extends the vector by the elements of `iter`.
> ///
> /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
>
>> + /// Extends the vector by the elements of `iter`.
>> + ///
>> + /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with
>> + /// imprecise implementations - albeit in a non-optimal way.
>> + ///
>> + /// This method returns an error if a memory reallocation required to accommodate the new items
>> + /// failed. In this case, callers must assume that some (but not all) elements of `iter` might
>> + /// have been added to the vector.
>> + ///
>> + /// # Note on optimal behavior and correctness
>> + ///
>> + /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`]
>> + /// implementation of the `iter` is.
>> + ///
>> + /// It performs optimally with at most a single memory reallocation if the lower bound of
>> + /// `size_hint` is the exact number of items actually yielded.
>> + ///
>> + /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to
>> + /// cover the whole iterator from the successive lower bounds returned by `size_hint`.
>> + ///
>> + /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory
>> + /// might be reserved.
>> + ///
>> + /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items
>> + /// are yielded by the iterator and returns. This may result in some items not being added if
>> + /// there were still some remaining.
>> + ///
>> + /// In the kernel most iterators are expected to have a precise and correct `size_hint`
>> + /// implementation, so this should nicely optimize out for these cases.
>
> I agree, hence I think we should enforce to be provided with a guaranteed
> correct size hint and simplify the code. I think we should extend the signature.
>
> pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
> where
> I: IntoIterator<Item = T>,
> I::IntoIter: ExactSizeIterator,
>
> And implement ExactSizeIterator for IntoIter.
>
> The only thing that bothers me a bit is that the documentation [1] of
> ExactSizeIterator sounds a bit ambiguous.
>
> It says: "When implementing an ExactSizeIterator, you must also implement
> Iterator. When doing so, the implementation of Iterator::size_hint *must*
> return the exact size of the iterator."
>
> But it also says: "Note that this trait is a safe trait and as such does not and
> cannot guarantee that the returned length is correct. This means that unsafe
> code must not rely on the correctness of Iterator::size_hint. The unstable and
> unsafe TrustedLen trait gives this additional guarantee."
Yeah ExactSizeIterator is not the solution to this, since it can be
implemented without an unsafe block and the implementer is perfectly
free to provide an incorrect value - so we cannot trust its result.
>
> Acknowledging the latter, I think we should implement our own trait for this
> instead. Our own version of TrustedLen seems reasonable to me.
That sounds reasonable and would greatly simplify the code (and remove
most of my fears about its optimization). Let me explore that direction.
Thanks,
Alex.
On Tue Apr 8, 2025 at 10:34 PM JST, Alexandre Courbot wrote: > On Mon Apr 7, 2025 at 8:01 PM JST, Danilo Krummrich wrote: >>> + /// Extends the vector by the elements of `iter`. >>> + /// >>> + /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with >>> + /// imprecise implementations - albeit in a non-optimal way. >>> + /// >>> + /// This method returns an error if a memory reallocation required to accommodate the new items >>> + /// failed. In this case, callers must assume that some (but not all) elements of `iter` might >>> + /// have been added to the vector. >>> + /// >>> + /// # Note on optimal behavior and correctness >>> + /// >>> + /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`] >>> + /// implementation of the `iter` is. >>> + /// >>> + /// It performs optimally with at most a single memory reallocation if the lower bound of >>> + /// `size_hint` is the exact number of items actually yielded. >>> + /// >>> + /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to >>> + /// cover the whole iterator from the successive lower bounds returned by `size_hint`. >>> + /// >>> + /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory >>> + /// might be reserved. >>> + /// >>> + /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items >>> + /// are yielded by the iterator and returns. This may result in some items not being added if >>> + /// there were still some remaining. >>> + /// >>> + /// In the kernel most iterators are expected to have a precise and correct `size_hint` >>> + /// implementation, so this should nicely optimize out for these cases. >> >> I agree, hence I think we should enforce to be provided with a guaranteed >> correct size hint and simplify the code. I think we should extend the signature. >> >> pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError> >> where >> I: IntoIterator<Item = T>, >> I::IntoIter: ExactSizeIterator, >> >> And implement ExactSizeIterator for IntoIter. >> >> The only thing that bothers me a bit is that the documentation [1] of >> ExactSizeIterator sounds a bit ambiguous. >> >> It says: "When implementing an ExactSizeIterator, you must also implement >> Iterator. When doing so, the implementation of Iterator::size_hint *must* >> return the exact size of the iterator." >> >> But it also says: "Note that this trait is a safe trait and as such does not and >> cannot guarantee that the returned length is correct. This means that unsafe >> code must not rely on the correctness of Iterator::size_hint. The unstable and >> unsafe TrustedLen trait gives this additional guarantee." > > Yeah ExactSizeIterator is not the solution to this, since it can be > implemented without an unsafe block and the implementer is perfectly > free to provide an incorrect value - so we cannot trust its result. > >> >> Acknowledging the latter, I think we should implement our own trait for this >> instead. Our own version of TrustedLen seems reasonable to me. > > That sounds reasonable and would greatly simplify the code (and remove > most of my fears about its optimization). Let me explore that direction. Well, that turned out to be an interesting rabbit hole. Leveraging the existing traits seems a bit difficult: - `ExactSizeIterator` cannot be implemented for adapters that increase the length of their iterators, because if one of them is already `usize::MAX` long then the size wouldn't be exact anymore. [1] - And `TrustedLen` cannot be implemented for adapters that make an iterator shorter, because if the iterator returns more than `usize::MAX` items (i.e. has an upper bound set to `None`) then the adapter can't predict the actual length. [2] So in both cases, the model breaks at the limit. OTOH, in our case we want to gather items into some collection, meaning that we are quite unlikely to ever reach that limit, as doing so would likely trigger an OOM anyway. Which means that we need to come with our own unsafe trait (`ExactSizeCollectible`?), which will have its own limits. It shall only be used to collect things (because we are unlikely to reach a size of `usize::MAX` in that context), and will take the lower bound of `size_hint` at face value, meaning it might collect less than the whole collection if the lower bound of the iterator or one of its adapters ever reaches `usize::MAX`. Again in the context of a collection this should never happen, but it's still a limitation. If we can live with this, then with a bit of code (because the new trait would need to be implemented for every iterator and adapter we want to collect out there) we should be able to provide an efficient, one-pass `collect()` method. Thoughts? [1] https://doc.rust-lang.org/std/iter/trait.ExactSizeIterator.html#when-shouldnt-an-adapter-be-exactsizeiterator [2] https://doc.rust-lang.org/core/iter/trait.TrustedLen.html#when-shouldnt-an-adapter-be-trustedlen
On Mon, Apr 21, 2025 at 05:15:29PM +0900, Alexandre Courbot wrote: > On Tue Apr 8, 2025 at 10:34 PM JST, Alexandre Courbot wrote: > > On Mon Apr 7, 2025 at 8:01 PM JST, Danilo Krummrich wrote: > >>> + /// Extends the vector by the elements of `iter`. > >>> + /// > >>> + /// This uses [`Iterator::size_hint`] to optimize memory reallocations, but will work even with > >>> + /// imprecise implementations - albeit in a non-optimal way. > >>> + /// > >>> + /// This method returns an error if a memory reallocation required to accommodate the new items > >>> + /// failed. In this case, callers must assume that some (but not all) elements of `iter` might > >>> + /// have been added to the vector. > >>> + /// > >>> + /// # Note on optimal behavior and correctness > >>> + /// > >>> + /// The efficiency of this method depends on how reliable the [`Iterator::size_hint`] > >>> + /// implementation of the `iter` is. > >>> + /// > >>> + /// It performs optimally with at most a single memory reallocation if the lower bound of > >>> + /// `size_hint` is the exact number of items actually yielded. > >>> + /// > >>> + /// If `size_hint` is more vague, there may be as many memory reallocations as necessary to > >>> + /// cover the whole iterator from the successive lower bounds returned by `size_hint`. > >>> + /// > >>> + /// If `size_hint` signals more items than actually yielded by the iterator, some unused memory > >>> + /// might be reserved. > >>> + /// > >>> + /// Finally, whenever `size_hint` returns `(0, Some(0))`, the method assumes that no more items > >>> + /// are yielded by the iterator and returns. This may result in some items not being added if > >>> + /// there were still some remaining. > >>> + /// > >>> + /// In the kernel most iterators are expected to have a precise and correct `size_hint` > >>> + /// implementation, so this should nicely optimize out for these cases. > >> > >> I agree, hence I think we should enforce to be provided with a guaranteed > >> correct size hint and simplify the code. I think we should extend the signature. > >> > >> pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError> > >> where > >> I: IntoIterator<Item = T>, > >> I::IntoIter: ExactSizeIterator, > >> > >> And implement ExactSizeIterator for IntoIter. > >> > >> The only thing that bothers me a bit is that the documentation [1] of > >> ExactSizeIterator sounds a bit ambiguous. > >> > >> It says: "When implementing an ExactSizeIterator, you must also implement > >> Iterator. When doing so, the implementation of Iterator::size_hint *must* > >> return the exact size of the iterator." > >> > >> But it also says: "Note that this trait is a safe trait and as such does not and > >> cannot guarantee that the returned length is correct. This means that unsafe > >> code must not rely on the correctness of Iterator::size_hint. The unstable and > >> unsafe TrustedLen trait gives this additional guarantee." > > > > Yeah ExactSizeIterator is not the solution to this, since it can be > > implemented without an unsafe block and the implementer is perfectly > > free to provide an incorrect value - so we cannot trust its result. > > > >> > >> Acknowledging the latter, I think we should implement our own trait for this > >> instead. Our own version of TrustedLen seems reasonable to me. > > > > That sounds reasonable and would greatly simplify the code (and remove > > most of my fears about its optimization). Let me explore that direction. > > Well, that turned out to be an interesting rabbit hole. > > Leveraging the existing traits seems a bit difficult: > > - `ExactSizeIterator` cannot be implemented for adapters that increase the > length of their iterators, because if one of them is already `usize::MAX` long > then the size wouldn't be exact anymore. [1] > > - And `TrustedLen` cannot be implemented for adapters that make an iterator > shorter, because if the iterator returns more than `usize::MAX` items (i.e. > has an upper bound set to `None`) then the adapter can't predict the actual > length. [2] Why is this a problem for the above implementation of Vec::extend()? I just looked it up and it seems that std [1] does the same thing. Do I miss anything? [1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_extend.rs#L25 > > So in both cases, the model breaks at the limit. OTOH, in our case we want to > gather items into some collection, meaning that we are quite unlikely to ever > reach that limit, as doing so would likely trigger an OOM anyway. > > Which means that we need to come with our own unsafe trait > (`ExactSizeCollectible`?), which will have its own limits. It shall only be > used to collect things (because we are unlikely to reach a size of `usize::MAX` > in that context), and will take the lower bound of `size_hint` at face value, > meaning it might collect less than the whole collection if the lower bound of > the iterator or one of its adapters ever reaches `usize::MAX`. Again in the > context of a collection this should never happen, but it's still a limitation. > > If we can live with this, then with a bit of code (because the new trait would > need to be implemented for every iterator and adapter we want to collect out > there) we should be able to provide an efficient, one-pass `collect()` method. > > Thoughts? > > [1] https://doc.rust-lang.org/std/iter/trait.ExactSizeIterator.html#when-shouldnt-an-adapter-be-exactsizeiterator > [2] https://doc.rust-lang.org/core/iter/trait.TrustedLen.html#when-shouldnt-an-adapter-be-trustedlen >
On Wed Apr 23, 2025 at 2:03 AM JST, Danilo Krummrich wrote: >> Well, that turned out to be an interesting rabbit hole. >> >> Leveraging the existing traits seems a bit difficult: >> >> - `ExactSizeIterator` cannot be implemented for adapters that increase the >> length of their iterators, because if one of them is already `usize::MAX` long >> then the size wouldn't be exact anymore. [1] >> >> - And `TrustedLen` cannot be implemented for adapters that make an iterator >> shorter, because if the iterator returns more than `usize::MAX` items (i.e. >> has an upper bound set to `None`) then the adapter can't predict the actual >> length. [2] > > Why is this a problem for the above implementation of Vec::extend()? > > I just looked it up and it seems that std [1] does the same thing. Do I miss > anything? > > [1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_extend.rs#L25 The problem I see is that if you try and do something like: vec.extend((0..10).into_iter().skip(2)); with the standard library, then the use of `skip` will remove the `TrustedLen` implementation from the resulting iterator and `extend_desugared` will be called instead of `extend_trusted`, which could add some unwanted (and unexpected) overhead. If we want an implementation of `extend` as simple as "confidently increase the length of the vector and copy the new items into it, once", then we need a trait that can be implemented on both shrinking and extending adapters. Anything else and we might trick the caller into a code path less efficient than expected (i.e. my original version, which generates more core even for the obvious cases that are `extend_with` and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the kernel, then `extend` could not be called at all with this particular iterator. There is also the fact that `TrustedLen` is behind a nightly feature, which I guess is another obstacle for using it.
On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: > On Wed Apr 23, 2025 at 2:03 AM JST, Danilo Krummrich wrote: > >> Well, that turned out to be an interesting rabbit hole. > >> > >> Leveraging the existing traits seems a bit difficult: > >> > >> - `ExactSizeIterator` cannot be implemented for adapters that increase the > >> length of their iterators, because if one of them is already `usize::MAX` long > >> then the size wouldn't be exact anymore. [1] > >> > >> - And `TrustedLen` cannot be implemented for adapters that make an iterator > >> shorter, because if the iterator returns more than `usize::MAX` items (i.e. > >> has an upper bound set to `None`) then the adapter can't predict the actual > >> length. [2] > > > > Why is this a problem for the above implementation of Vec::extend()? > > > > I just looked it up and it seems that std [1] does the same thing. Do I miss > > anything? > > > > [1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_extend.rs#L25 > > The problem I see is that if you try and do something like: > > vec.extend((0..10).into_iter().skip(2)); > > with the standard library, then the use of `skip` will remove the > `TrustedLen` implementation from the resulting iterator Skip implements TrustedLen, no? > and > `extend_desugared` will be called instead of `extend_trusted`, which > could add some unwanted (and unexpected) overhead. > > If we want an implementation of `extend` as simple as "confidently > increase the length of the vector and copy the new items into it, once", > then we need a trait that can be implemented on both shrinking and > extending adapters. Anything else and we might trick the caller into a > code path less efficient than expected (i.e. my original version, which > generates more core even for the obvious cases that are `extend_with` > and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the > kernel, then `extend` could not be called at all with this particular > iterator. I think you can't solve all problems within this single function, since other than std we don't have spcialization. So, if we need both the fastpath and the slowpath it needs to be separate methods unfortunately. I'd rather stick to the fastpath for now. Unless you have a specific use-case for something else? > There is also the fact that `TrustedLen` is behind a nightly feature, > which I guess is another obstacle for using it. Yeah, I think we have to implement our own TrustedLen trait for this reason and add the corresponding impls. However, if we can come up with a marker trait that deviates in requirements and guarantees from std, but is a better fit for the kernel, I'm fine exploring this direction too.
On Wed Apr 23, 2025 at 6:47 PM JST, Danilo Krummrich wrote: > On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: >> On Wed Apr 23, 2025 at 2:03 AM JST, Danilo Krummrich wrote: >> >> Well, that turned out to be an interesting rabbit hole. >> >> >> >> Leveraging the existing traits seems a bit difficult: >> >> >> >> - `ExactSizeIterator` cannot be implemented for adapters that increase the >> >> length of their iterators, because if one of them is already `usize::MAX` long >> >> then the size wouldn't be exact anymore. [1] >> >> >> >> - And `TrustedLen` cannot be implemented for adapters that make an iterator >> >> shorter, because if the iterator returns more than `usize::MAX` items (i.e. >> >> has an upper bound set to `None`) then the adapter can't predict the actual >> >> length. [2] >> > >> > Why is this a problem for the above implementation of Vec::extend()? >> > >> > I just looked it up and it seems that std [1] does the same thing. Do I miss >> > anything? >> > >> > [1] https://github.com/rust-lang/rust/blob/master/library/alloc/src/vec/spec_extend.rs#L25 >> >> The problem I see is that if you try and do something like: >> >> vec.extend((0..10).into_iter().skip(2)); >> >> with the standard library, then the use of `skip` will remove the >> `TrustedLen` implementation from the resulting iterator > > Skip implements TrustedLen, no? According to TrustedLen's documentation it doesn't: > This is why Skip<I> isn’t TrustedLen, even when I implements TrustedLen. (https://doc.rust-lang.org/std/iter/trait.TrustedLen.html#when-shouldnt-an-adapter-be-trustedlen) But if I look at the code for `Skip`, well sure enough it's there: https://doc.rust-lang.org/src/core/iter/adapters/skip.rs.html#289 ... with the caveat that `TrustedRandomAccess` must also be implemented. Now that makes sense! > >> and >> `extend_desugared` will be called instead of `extend_trusted`, which >> could add some unwanted (and unexpected) overhead. >> >> If we want an implementation of `extend` as simple as "confidently >> increase the length of the vector and copy the new items into it, once", >> then we need a trait that can be implemented on both shrinking and >> extending adapters. Anything else and we might trick the caller into a >> code path less efficient than expected (i.e. my original version, which >> generates more core even for the obvious cases that are `extend_with` >> and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the >> kernel, then `extend` could not be called at all with this particular >> iterator. > > I think you can't solve all problems within this single function, since other > than std we don't have spcialization. > > So, if we need both the fastpath and the slowpath it needs to be separate > methods unfortunately. I'd rather stick to the fastpath for now. Unless you have > a specific use-case for something else? No, and that's also basically what I was suggesting with the `ExactSizeCollectible` trait, something that works only for the fast path. I don't think there is much use (if any) for the slow path in the kernel anyway. So I think the idea will be to introduce that `ExactSizeCollectible` unsafe trait that would be a mix of `TrustedLen` and `TrustedRandomAccess`. Then we can hand-pick all the iterator types and adapters that should implement it. Thanks for the fruitful discussion - I didn't know about the use of specialization in the standard library. Cheers, Alex.
On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: > The problem I see is that if you try and do something like: > > vec.extend((0..10).into_iter().skip(2)); > > with the standard library, then the use of `skip` will remove the > `TrustedLen` implementation from the resulting iterator and > `extend_desugared` will be called instead of `extend_trusted`, which > could add some unwanted (and unexpected) overhead. > > If we want an implementation of `extend` as simple as "confidently > increase the length of the vector and copy the new items into it, once", > then we need a trait that can be implemented on both shrinking and > extending adapters. Anything else and we might trick the caller into a > code path less efficient than expected (i.e. my original version, which > generates more core even for the obvious cases that are `extend_with` > and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the > kernel, then `extend` could not be called at all with this particular > iterator. > > There is also the fact that `TrustedLen` is behind a nightly feature, > which I guess is another obstacle for using it. The stdlib alloc crate relies on specialization to speed up methods related to iterators. We can't use specialization, so losing these optimizations is simply a cost of not using the upstream alloc library that we have to accept. Alice
On Wed Apr 23, 2025 at 5:51 PM JST, Alice Ryhl wrote:
> On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote:
>> The problem I see is that if you try and do something like:
>>
>> vec.extend((0..10).into_iter().skip(2));
>>
>> with the standard library, then the use of `skip` will remove the
>> `TrustedLen` implementation from the resulting iterator and
>> `extend_desugared` will be called instead of `extend_trusted`, which
>> could add some unwanted (and unexpected) overhead.
>>
>> If we want an implementation of `extend` as simple as "confidently
>> increase the length of the vector and copy the new items into it, once",
>> then we need a trait that can be implemented on both shrinking and
>> extending adapters. Anything else and we might trick the caller into a
>> code path less efficient than expected (i.e. my original version, which
>> generates more core even for the obvious cases that are `extend_with`
>> and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the
>> kernel, then `extend` could not be called at all with this particular
>> iterator.
>>
>> There is also the fact that `TrustedLen` is behind a nightly feature,
>> which I guess is another obstacle for using it.
>
> The stdlib alloc crate relies on specialization to speed up methods
> related to iterators. We can't use specialization, so losing these
> optimizations is simply a cost of not using the upstream alloc library
> that we have to accept.
Yeah I was surprised to see
impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A>
where
I: Iterator<Item = T>
and
impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A>
where
I: TrustedLen<Item = T>
in the standard library, which clearly looks like an overlap. Didn't
know it was relying on a non-standard feature.
That's going to limit what we can do in the kernel, but nonetheless if
we can support only the cases that can be optimized I think we would
have our bases covered.
On Wed, Apr 23, 2025 at 06:40:07PM +0900, Alexandre Courbot wrote: > On Wed Apr 23, 2025 at 5:51 PM JST, Alice Ryhl wrote: > > On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: > >> The problem I see is that if you try and do something like: > >> > >> vec.extend((0..10).into_iter().skip(2)); > >> > >> with the standard library, then the use of `skip` will remove the > >> `TrustedLen` implementation from the resulting iterator and > >> `extend_desugared` will be called instead of `extend_trusted`, which > >> could add some unwanted (and unexpected) overhead. > >> > >> If we want an implementation of `extend` as simple as "confidently > >> increase the length of the vector and copy the new items into it, once", > >> then we need a trait that can be implemented on both shrinking and > >> extending adapters. Anything else and we might trick the caller into a > >> code path less efficient than expected (i.e. my original version, which > >> generates more core even for the obvious cases that are `extend_with` > >> and `extend_from_slice`). Or if we rely on `TrustedLen` solely in the > >> kernel, then `extend` could not be called at all with this particular > >> iterator. > >> > >> There is also the fact that `TrustedLen` is behind a nightly feature, > >> which I guess is another obstacle for using it. > > > > The stdlib alloc crate relies on specialization to speed up methods > > related to iterators. We can't use specialization, so losing these > > optimizations is simply a cost of not using the upstream alloc library > > that we have to accept. > > Yeah I was surprised to see > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > where > I: Iterator<Item = T> > > and > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > where > I: TrustedLen<Item = T> > > in the standard library, which clearly looks like an overlap. Didn't > know it was relying on a non-standard feature. > > That's going to limit what we can do in the kernel, but nonetheless if > we can support only the cases that can be optimized I think we would > have our bases covered. I think if it's a critical path and we really need the performance, we can use a non-standard/non-stable feature or get that stabilized. Regards, Boqun
On Wed, Apr 23, 2025 at 09:03:40AM -0700, Boqun Feng wrote: > On Wed, Apr 23, 2025 at 06:40:07PM +0900, Alexandre Courbot wrote: > > On Wed Apr 23, 2025 at 5:51 PM JST, Alice Ryhl wrote: > > > On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: > > > The stdlib alloc crate relies on specialization to speed up methods > > > related to iterators. We can't use specialization, so losing these > > > optimizations is simply a cost of not using the upstream alloc library > > > that we have to accept. > > > > Yeah I was surprised to see > > > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > > where > > I: Iterator<Item = T> > > > > and > > > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > > where > > I: TrustedLen<Item = T> > > > > in the standard library, which clearly looks like an overlap. Didn't > > know it was relying on a non-standard feature. > > > > That's going to limit what we can do in the kernel, but nonetheless if > > we can support only the cases that can be optimized I think we would > > have our bases covered. > > I think if it's a critical path and we really need the performance, we > can use a non-standard/non-stable feature or get that stabilized. We should not expect that we can just stabilize even a minimum form of specialization. It's a very non-trivial feature. Alice
On Thu, Apr 24, 2025 at 11:50:45AM +0000, Alice Ryhl wrote: > On Wed, Apr 23, 2025 at 09:03:40AM -0700, Boqun Feng wrote: > > On Wed, Apr 23, 2025 at 06:40:07PM +0900, Alexandre Courbot wrote: > > > On Wed Apr 23, 2025 at 5:51 PM JST, Alice Ryhl wrote: > > > > On Wed, Apr 23, 2025 at 10:02:58AM +0900, Alexandre Courbot wrote: > > > > The stdlib alloc crate relies on specialization to speed up methods > > > > related to iterators. We can't use specialization, so losing these > > > > optimizations is simply a cost of not using the upstream alloc library > > > > that we have to accept. > > > > > > Yeah I was surprised to see > > > > > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > > > where > > > I: Iterator<Item = T> > > > > > > and > > > > > > impl<T, I, A: Allocator> SpecExtend<T, I> for Vec<T, A> > > > where > > > I: TrustedLen<Item = T> > > > > > > in the standard library, which clearly looks like an overlap. Didn't > > > know it was relying on a non-standard feature. > > > > > > That's going to limit what we can do in the kernel, but nonetheless if > > > we can support only the cases that can be optimized I think we would > > > have our bases covered. > > > > I think if it's a critical path and we really need the performance, we > > can use a non-standard/non-stable feature or get that stabilized. > > We should not expect that we can just stabilize even a minimum form of > specialization. It's a very non-trivial feature. > Maybe I should have worded differently, the point is we should not limit ourselves to "a cost of not using the upstream alloc library that we have to accept", so yeah specialization is a non-trivial feature, but what we can do is presenting the problem to the Rust community and seeking for a solution for our problem. Similar to `CoercePointee`, unsize coerce in general is a feature that is hard to be stabilized, but for a specific problem that we care, we found a solution. The Linux kernel needs should be one of the sources that push new features in Rust language, don't you agree? Regards, Boqun > Alice
© 2016 - 2026 Red Hat, Inc.