Rust – how to efficiently remove characters from beggining of string?

Rust – how to efficiently remove characters from beggining of string?

Problem Description:

I have a long string stored in a variable in Rust. I often remove some characters from its front with a drain method and use the value returned from it:

my_str.drain(0..i).collect::<String>();

The problem is, that draining from this string is done really often in the program and it’s slowing it down a lot (it takes ~99.6% of runtime). This is a very expensive operation, since every time, the entire string has to be moved left in the memory.

I do not drain from the end of the string at all (which should be much faster), just from the front.

How can I make this more efficient? Is there some alternative to String, that uses a different memory layout, which would be better for this use case?

Solution – 1

As proposed by @SUTerliakov, using VecDeque<char> in this case is much more effective than String either with the pop_front method or the drain method (when draining from the front of course)

Solution – 2

As stated by @Jmb, keeping the original string intact and working with slices is certainly a big win.

I don’t know, from the question, the context and usage of these strings, but this quick and dirty benchmark shows a substantial difference in performances.

This benchmark is flawed because there is a useless clone() at each repetition, there is no warm-up, there is no black-box for the result, there are no statistics… but it just gives an idea.

use std::time::Instant;

fn with_drain(mut my_str: String) -> usize {
    let mut total = 0;
    'work: loop {
        for &i in [1, 2, 3, 4, 5].iter().cycle() {
            if my_str.len() < i {
                break 'work;
            }
            let s = my_str.drain(0..i).collect::<String>();
            total += s.len();
        }
    }
    total
}

fn with_slice(my_str: String) -> usize {
    let mut total = 0;
    let mut pos = 0;
    'work: loop {
        for &i in [1, 2, 3, 4, 5].iter().cycle() {
            let next_pos = pos + i;
            if my_str.len() <= next_pos {
                break 'work;
            }
            let s = &my_str[pos..next_pos];
            pos = next_pos;
            total += s.len();
        }
    }
    total
}

fn main() {
    let my_str="I have a long string stored in a variable in Rust.
I often remove some characters from its front with a drain method and use the value returned from it:
my_str.drain(0..i).collect::<String>();
The problem is, that draining from this string is done really often in the program and it's slowing it down a lot (it takes ~99.6% of runtime). This is a very expensive operation, since every time, the entire string has to be moved left in the memory.
I do not drain from the end of the string at all (which should be much faster), just from the front.
How can I make this more efficient? Is there some alternative to String, that uses a different memory layout, which would be better for this use case?
".to_owned();
    let repeat = 1_000_000;
    let instant = Instant::now();
    for _ in 0..repeat {
        let _ = with_drain(my_str.clone());
    }
    let drain_duration = instant.elapsed();
    let instant = Instant::now();
    for _ in 0..repeat {
        let _ = with_slice(my_str.clone());
    }
    let slice_duration = instant.elapsed();
    println!("{:?} {:?}", drain_duration, slice_duration);
}
/*
$ cargo run --release
    Finished release [optimized] target(s) in 0.00s
     Running `target/release/prog`
5.017018957s 310.466253ms
*/

Solution – 3

If you can’t use slices because of the lifetimes, you could use a type that provides shared-ownership like SharedString from the shared-string crate or Str from the bytes-utils crate. The former looks more fully-featured but both provide methods that can take the prefix from a string in O(1) because the original data is never moved.

Rate this post
We use cookies in order to give you the best possible experience on our website. By continuing to use this site, you agree to our use of cookies.
Accept
Reject