Rust - Removing values

2023-01-10

Previously published

This article was previously published on len-learns-rust.com. A full index of these articles can be found here.

Now that we can insert intervals into our collection we need to be able to remove them. There are three ways to remove values from our collection:

remove the first interval in the collection
remove the first value in the collection
remove an arbitrary value from the collection

The first is the easiest and the following tests show how it should work:

    #[test]
    #[should_panic(expected = "Empty!")]
    fn test_remove_first_interval_when_empty() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.is_empty(), true);

        intervals.remove_first_interval();
    }

    #[test]
    fn test_remove_first_inverval() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.insert_interval(4, 10), true);
        assert_eq!(intervals.insert_interval(12, 20), true);

        assert_eq!(intervals.dump(), "[4,10], [12,20]");

        let first = intervals.remove_first_interval();

        assert_eq!(first.lower(), 4);
        assert_eq!(first.upper(), 10);

        assert_eq!(intervals.dump(), "[12,20]");
    }

The actual code ends up a bit like this:

    pub fn remove_first_interval(&mut self) -> Interval {
        let first_it = self.intervals.iter().next();

        if let Some(first_interval) = first_it {
            let ret = first_interval.clone();

            self.intervals.remove(&ret);

            return ret;
        }

        panic!("Empty!");
    }

Unstable features

There’s actually a better way to do this, I think, which is to use the, currently unstable, `std::collections::BTreeSet<>::pop_first() method. I haven’t yet worked out how to use these features and, ideally, I’d use them in a conditional manner in the code.

Next is the removal of the first value in the collection. This is, effectively, lower from the first interval. There are a few more tests for this one:

    #[test]
    #[should_panic(expected = "Empty!")]
    fn test_remove_first_value_when_empty() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.is_empty(), true);

        intervals.remove_first_value();
    }

    #[test]
    fn test_remove_first_value() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.insert_interval(4, 10), true);
        assert_eq!(intervals.insert_interval(12, 20), true);

        assert_eq!(intervals.dump(), "[4,10], [12,20]");

        assert_eq!(intervals.remove_first_value(), 4);

        assert_eq!(intervals.dump(), "[5,10], [12,20]");
    }

    #[test]
    fn test_remove_first_value_consumes_first_interval() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.insert_interval(4, 5), true);
        assert_eq!(intervals.insert_interval(12, 20), true);

        assert_eq!(intervals.dump(), "[4,5], [12,20]");

        assert_eq!(intervals.remove_first_value(), 4);

        assert_eq!(intervals.dump(), "[5], [12,20]");

        assert_eq!(intervals.remove_first_value(), 5);

        assert_eq!(intervals.dump(), "[12,20]");
    }

    #[test]
    fn test_remove_first_value_to_remove_all_values() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.insert_interval(u8::MIN, u8::MAX), true);

        assert_eq!(intervals.dump(), "[0,255]");

        for i in u8::MIN..u8::MAX {
            assert_eq!(intervals.remove_first_value(), i);
        }
        assert_eq!(intervals.remove_first_value(), u8::MAX);

        assert_eq!(intervals.dump(), "");
    }

The actual code to do the work is pretty simple and builds on removing the first interval.

    pub fn remove_first_value(&mut self) -> u8 {
        let first_interval = self.remove_first_interval();

        let first_value = first_interval.lower();

        if first_interval.lower() != first_interval.upper()
        {
            self.intervals
                .insert(Interval::new(first_value + 1, first_interval.upper()));
        }

        first_value
    }

Finally we can remove an arbitrary value. This will be needed once we start to build our “id manager” if we want to be able to reuse the ids only after using all of them once. For this use case we need to step along the ids and allocate them in order, since ids may be released back to the collection before we get to the end of the id sequence we need to remove arbitrary (ascending) values.

The tests look like this:

    #[test]
    fn test_remove_value_from_lowest_value_of_interval() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.remove_value(4), false);

        assert_eq!(intervals.insert_interval(4, 10), true);
        assert_eq!(intervals.dump(), "[4,10]");

        assert_eq!(intervals.remove_value(4), true);
        assert_eq!(intervals.dump(), "[5,10]");
    }

    #[test]
    fn test_remove_value_from_inside_interval() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.remove_value(6), false);

        assert_eq!(intervals.insert_interval(4, 10), true);
        assert_eq!(intervals.dump(), "[4,10]");

        assert_eq!(intervals.remove_value(6), true);
        assert_eq!(intervals.dump(), "[4,5], [7,10]");
    }

    #[test]
    fn test_remove_value_from_highest_value_of_interval() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.remove_value(10), false);

        assert_eq!(intervals.insert_interval(4, 10), true);
        assert_eq!(intervals.dump(), "[4,10]");

        assert_eq!(intervals.remove_value(10), true);
        assert_eq!(intervals.dump(), "[4,9]");
    }

The code required for these to pass might look something like this:

    pub fn remove_value(&mut self, value: u8) -> bool {
        if let Some(interval) = self.find(&Interval::new_single_value_interval(value)) {
            if interval.lower() < value {
                self.intervals
                    .insert(Interval::new(interval.lower(), value - 1));
            }

            if value + 1 <= interval.upper() {
                self.intervals
                    .insert(Interval::new(value + 1, interval.upper()));
            }

            self.intervals.remove(&interval);

            return true;
        }

        false
    }

with find() being implemented in terms of std::std::collections::BTreeSet<>::get()

    fn find(&self, interval: &Interval) -> Option<Interval> {

        let item = self.intervals.get(interval);

        if let Some(interval) = item
        {
            Some(interval.clone());
        }

        None
    }

but our comparison and equality for the Interval don’t allow that. So, for now at least, we end up with this, which needs to do two lookups in the worst case:

    fn find(&self, interval: &Interval) -> Option<Interval> {
        let before = self.intervals.range((Unbounded, Included(interval)));
        
        let prev = before.max();
        
        if let Some(prev) = prev {
            if prev.overlaps(interval) {
                return Some(prev.clone());
            }
        }
        
        let after = self.intervals.range((Included(interval), Unbounded));
        
        let next = after.min();
        
        if let Some(next) = next {
            if next.overlaps(interval) {
                return Some(next.clone());
            }
        }
        
        None
    }

Finally we can add a test that removes all the values using our new remove_value() function.

    #[test]
    fn test_remove_value_to_remove_all_values() {
        let mut intervals = Intervals::new();

        assert_eq!(intervals.insert_interval(u8::MIN, u8::MAX), true);

        assert_eq!(intervals.dump(), "[0,255]");

        for i in u8::MIN..u8::MAX {
            assert_eq!(intervals.remove_value(i), true);
        }
        assert_eq!(intervals.remove_value(u8::MAX), true);

        assert_eq!(intervals.dump(), "");
    }

Compiler says "no!"

Unfortunately, this shows up a bug in our implementation of remove_value() which causes an overflow but the compiler tells us where it is:

attempt to add with overflow
thread 'intervals::tests::test_remove_value_to_remove_all_values' panicked at 'attempt to add with overflow', src\intervals.rs:73:16
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library\std\src\panicking.rs:584
   1: core::panicking::panic_fmt
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library\core\src\panicking.rs:142
   2: core::panicking::panic
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library\core\src\panicking.rs:48
   3: idmanager::intervals::Intervals::remove_value
             at .\src\intervals.rs:73
   4: idmanager::intervals::tests::test_remove_value_to_remove_all_values
             at .\src\intervals.rs:738
   5: idmanager::intervals::tests::test_remove_value_to_remove_all_values::closure$0
             at .\src\intervals.rs:728
   6: core::ops::function::FnOnce::call_once<idmanager::intervals::tests::test_remove_value_to_remove_all_values::closure_env$0,tuple$<> >
             at /rustc/897e37553bba8b42751c67658967889d11ecd120\library\core\src\ops\function.rs:248
   7: core::ops::function::FnOnce::call_once
             at /rustc/897e37553bba8b42751c67658967889d11ecd120/library\core\src\ops\function.rs:248

Be aware

Although our test here catches the overflow issue we can’t rely on that happening with release code. As noted in “the book” these overflow/underflow checks are a feature of the debug build and are elided in the release build for performance reasons. Overflow/underflow is allowed in release builds and you get the “normal” two’s complement wrapping as you would in C++.

Join in

The code to this point can be found here and there is a second version, which fixes the bug, we’ll take about that now…

The bug, as indicated by the compiler, is here, in remove_value():

            if value + 1 <= interval.upper() {
                self.intervals
                    .insert(Interval::new(value + 1, interval.upper()));
            }

We take a u8 value of 255 and add 1 to it to make sure that the ’next’ value is still part of the interval that we’re working on. The fix is simply to do that differently and avoid the +1 until after we know that we’re still part of the interval (and therefore cannot be the maximum value for a u8). The original code was just sloppy thinking.

            if value < interval.upper() {
                self.intervals
                    .insert(Interval::new(value + 1, interval.upper()));
            }

Once the compiler made me aware of this issue I decided to look at the rest of the code to see if the same sloppy thinking has added any other latent bugs that the existing tests don’t cover.

The problem is also present in Interval::extends_upper() and Interval::extends_lower() once again the fix is easy, but first we’ll add some failing tests:

    #[test]
    fn test_extends_lower_new_interval_is_max() {
        let interval1 = Interval {
            lower: 10,
            upper: 13,
        };

        let new_interval = Interval {
            lower: 14,
            upper: u8::MAX,
        };

        assert_eq!(interval1.extends_lower(&new_interval), false);
    }

    #[test]
    fn test_extends_lower_new_interval_is_min() {
        let interval1 = Interval {
            lower: 10,
            upper: 13,
        };

        let new_interval = Interval {
            lower: u8::MIN,
            upper: 17,
        };

        assert_eq!(interval1.extends_lower(&new_interval), false);
    }

    #[test]
    fn test_extends_upper_new_interval_is_max() {
        let interval1 = Interval {
            lower: 10,
            upper: 13,
        };

        let new_interval = Interval {
            lower: 14,
            upper: u8::MAX,
        };

        assert_eq!(interval1.extends_upper(&new_interval), true);
    }

    #[test]
    fn test_extends_upper_new_interval_is_min() {
        let interval1 = Interval {
            lower: 10,
            upper: 13,
        };

        let new_interval = Interval {
            lower: u8::MIN,
            upper: 17,
        };

        assert_eq!(interval1.extends_upper(&new_interval), false);
    }

The test that calls extends_lower() when the new interval’s upper is u8::MAX fails because we try and add one to it and the test that calls extends_upper() when the new interval’s lower is u8::MIN fails because we try and subtract one from it. Now we have failing tests the fixes are easy and obvious, changing:

pub fn extends_lower(&self, value: &Self) -> bool {
        let next_value = value.upper + 1;

        next_value == self.lower
    }

    pub fn extends_upper(&self, value: &Self) -> bool {
        let next_value = value.lower - 1;

        next_value == self.upper
    }

pub fn extends_lower(&self, value: &Self) -> bool {
        if value.upper == u8::MAX {
            return false;
        }

        let next_value = value.upper + 1;

        next_value == self.lower
    }

    pub fn extends_upper(&self, value: &Self) -> bool {
        if value.lower == u8::MIN {
            return false;
        }

        let next_value = value.lower - 1;

        next_value == self.upper
    }

and our tests pass and we’re done. Next up we’ll use the code that we’ve developed this far to build the “id manager” that we set out to build and then we’ll make it all generic so that we can deal with different id ranges.

Join in

The code can be found here on GitHub each step on the journey will have one or more separate directories of code, so this article’s code is here and here this allows for easy comparison of changes at each stage.

Of course, there may be a better way; leave comments if you’d like to help me learn.