Monday, May 8, 2017

Rust Memory Management

In light of my latest fascination with the Rust programming language, I've started giving small presentations about Rust at my office, since I'm not the only one at our company who is interested in it. My first presentation, in February, was a very general introduction to the language, but at that time I had not yet used the language for anything real, so I was a complete novice and didn't have a very good idea of how memory management really works. While working on my gps-share project in my limited spare time, I came across quite a few issues related to memory management, but I overcame all of them with help from the kind folks on the #rust-beginners IRC channel and the small but awesome Rust-GNOME community.

Having learnt some essentials of memory management, I thought I'd share my knowledge/experience with folks at the office. The talk was not well attended due to conflicts with other meetings at the office, but the few folks who attended were very interested and asked some interesting and difficult questions (i.e. the perfect audience). One of the questions was whether I could put this up as a blog post, so here I am. :)

Basics


Let's start with some basics: In Rust,

  1. Stack allocation is preferred over heap allocation, and the stack is where everything is allocated by default.
  2. Strict ownership semantics apply, so each value has one and only one owner at any particular time.
  3. When you pass a value to a function, you move the ownership of that value to the function argument. Similarly, when you return a value from a function, you pass the ownership of the return value to the caller.
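The third rule can be sketched in a few lines. This is only an illustration: the function names `consume` and `push_and_return` are made up for this example.

```rust
// Takes ownership of `v`; the caller's variable can't be used afterwards.
fn consume(v: Vec<i32>) -> usize {
    v.len()
} // `v` goes out of scope here and its memory is freed

// Takes ownership of `v` and passes it back to the caller on return.
fn push_and_return(mut v: Vec<i32>, x: i32) -> Vec<i32> {
    v.push(x);
    v
}

fn main() {
    let a = vec![1, 2, 3];
    // `a` is moved into the function and the returned vector is moved back out.
    let a = push_and_return(a, 4);
    println!("{:?}", a);

    // `a` is moved for good this time; using `a` below this line won't compile.
    let n = consume(a);
    println!("length was {}", n);
}
```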

Now these rules make Rust very safe, but if you had no way to allocate on the heap or to share data between different parts of your code and/or threads, you couldn't get very far with Rust. So we're provided with mechanisms to (kinda) work around these very strict rules without compromising the safety they provide. Let's start with some simple code that would work fine in many other languages:

fn add_first_element(v1: Vec<i32>, v2: Vec<i32>) -> i32 {
    return v1[0] + v2[0];
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![1, 2, 3];

    let answer = add_first_element(v1, v2);

    // We'd like to use `v1` and `v2` here, but they were moved above.
    println!("{} + {} = {}", v1[0], v2[0], answer);
}

This gives us an error from rustc:

error[E0382]: use of moved value: `v1`
  --> sample1.rs:13:30
   |
10 |     let answer = add_first_element(v1, v2);
   |                                    -- value moved here
...
13 |     println!("{} + {} = {}", v1[0], v2[0], answer);
   |                              ^^ value used here after move
   |
   = note: move occurs because `v1` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait

error[E0382]: use of moved value: `v2`
  --> sample1.rs:13:37
   |
10 |     let answer = add_first_element(v1, v2);
   |                                        -- value moved here
...
13 |     println!("{} + {} = {}", v1[0], v2[0], answer);
   |                                     ^^ value used here after move
   |
   = note: move occurs because `v2` has type `std::vec::Vec<i32>`, which does not implement the `Copy` trait

What's happening is that we passed 'v1' and 'v2' to add_first_element(), and hence passed their ownership to it as well, so we can't use them afterwards. If Vec were a Copy type (like all primitive types), we wouldn't get this error because Rust would copy the values and pass those copies to add_first_element(). In this particular case the solution is easy:

Borrowing


fn add_first_element(v1: &Vec<i32>, v2: &Vec<i32>) -> i32 {
    return v1[0] + v2[0]; 
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![1, 2, 3];

    let answer = add_first_element(&v1, &v2);

    // We can use `v1` and `v2` here!
    println!("{} + {} = {}", v1[0], v2[0], answer);
}                 

This one compiles and runs as expected. What we did was convert the arguments into reference types. References are Rust's way of borrowing a value without taking ownership of it. So while add_first_element() is running, it borrows 'v1' and 'v2', but ownership stays with main() and the borrow ends when the function returns. Hence this code works.
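Borrowing also comes in a mutable flavour, which the example above doesn't use. Here is a minimal sketch, with `double_all` and `first` being names invented for illustration. It also takes a slice (`&[i32]`) instead of `&Vec<i32>`, which is the more common way to borrow a vector's contents.

```rust
// Borrows the vector mutably: the function may modify it but never owns it.
fn double_all(v: &mut Vec<i32>) {
    for x in v.iter_mut() {
        *x *= 2;
    }
}

// Borrows immutably; a `&Vec<i32>` is accepted wherever `&[i32]` is expected.
fn first(v: &[i32]) -> i32 {
    v[0]
}

fn main() {
    let mut v = vec![1, 2, 3];
    // The mutable borrow ends as soon as the call returns.
    double_all(&mut v);
    // `main` still owns `v`, so it can keep using it.
    println!("first = {}, all = {:?}", first(&v), v);
}
```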

While borrowing is very nice and very helpful, in the end it's temporary. The following code won't build:

struct Heli {
    reg: String
}

impl Heli {
    fn new(reg: String) -> Heli {
        Heli { reg: reg }
    }
    
    fn hover(& self) {
        println!("{} is hovering", self.reg);
    }
}

fn main() {
    let reg = "G-HONI".to_string();
    let heli = Heli::new(reg);

    println!("Registration {}", reg);
    heli.hover();
}

rustc says:

error[E0382]: use of moved value: `reg`
  --> sample3.rs:20:33
   |
18 |     let heli = Heli::new(reg);
   |                          --- value moved here
19 | 
20 |     println!("Registration {}", reg);
   |                                 ^^^ value used here after move
   |
   = note: move occurs because `reg` has type `std::string::String`, which does not implement the `Copy` trait

If String had the Copy trait implemented for it, this code would have compiled. But if efficiency is a concern at all for you (it is for Rust), you wouldn't want most values to be copied around all the time. We can't use a reference here, as Heli::new() above needs to keep the passed 'reg'. Also note that the issue here is not that 'reg' was passed to Heli::new() and later used by Heli::hover(), but that we tried to use 'reg' after giving its ownership to the Heli instance through Heli::new().

I realize that the above code doesn't make use of borrowing, but if we were to use it, we'd have to declare lifetimes for the 'reg' field, and the struct would then only borrow 'reg' rather than keep it, which is not what we want. There is a better solution here:

Rc


use std::rc::Rc;                                                                                         

struct Heli {
    reg: Rc<String>
}

impl Heli {
    fn new(reg: Rc<String>) -> Heli {
        Heli { reg: reg }
    }

    fn hover(& self) {
        println!("{} is hovering", self.reg);
    }
}

fn main() { 
    let reg = Rc::new("G-HONI".to_string());
    let heli = Heli::new(reg.clone());

    println!("Registration {}", reg);
    heli.hover();
}

This code builds and runs successfully. Rc stands for "Reference Counted", so putting data into this generic container adds reference counting to the data in question. Note that while you have to explicitly call the clone() method of Rc to increment its refcount, you don't need to do anything to decrease it. Each time an Rc reference goes out of scope, the refcount is decremented automatically, and when it reaches 0, the Rc container and its contained data are freed.
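You can watch the refcount go up and down with Rc::strong_count(). The following is just a small sketch to make the counting visible; the `counts` helper is made up for this purpose.

```rust
use std::rc::Rc;

// Returns the refcount before a clone exists, while it is alive,
// and after it has gone out of scope.
fn counts() -> (usize, usize, usize) {
    let reg = Rc::new("G-HONI".to_string());
    let before = Rc::strong_count(&reg);
    let during = {
        // clone() only bumps the refcount; the String itself is not copied.
        let _shared = reg.clone();
        Rc::strong_count(&reg)
    }; // `_shared` goes out of scope here, decrementing the refcount
    let after = Rc::strong_count(&reg);
    (before, during, after)
}

fn main() {
    let (before, during, after) = counts();
    println!("before: {}, during: {}, after: {}", before, during, after);
}
```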

Cool, Rc is super easy to use so we can just use it in all situations where we need shared ownership? Not quite! You can't use Rc to share data between threads. So this code won't compile:

use std::rc::Rc;                                                                                         
use std::thread;

struct Heli {
    reg: Rc<String>
}

impl Heli {
    fn new(reg: Rc<String>) -> Heli {
        Heli { reg: reg }
    }

    fn hover(& self) {
        println!("{} is hovering", self.reg);
    }
}

fn main() { 
    let reg = Rc::new("G-HONI".to_string());
    let heli = Heli::new(reg.clone());
    
    let t = thread::spawn(move || {
        heli.hover();
    });
    println!("Registration {}", reg);

    t.join().unwrap();
}

It results in:

error[E0277]: the trait bound `std::rc::Rc<std::string::String>: std::marker::Send` is not satisfied in `[closure@sample5.rs:22:27: 24:6 heli:Heli]`
  --> sample5.rs:22:13
   |
22 |     let t = thread::spawn(move || {
   |             ^^^^^^^^^^^^^ within `[closure@sample5.rs:22:27: 24:6 heli:Heli]`, the trait `std::marker::Send` is not implemented for `std::rc::Rc<std::string::String>`
   |
   = note: `std::rc::Rc<std::string::String>` cannot be sent between threads safely
   = note: required because it appears within the type `Heli`
   = note: required because it appears within the type `[closure@sample5.rs:22:27: 24:6 heli:Heli]`
   = note: required by `std::thread::spawn`

The issue here is that to be able to move data to another thread, the data must be of a type that implements the Send trait. Rc doesn't implement Send because it updates its reference count non-atomically, which is not safe across threads. Making it thread-safe would carry a performance penalty that code running in a single thread shouldn't have to pay.
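For contrast, a plain String does implement Send, so moving one into a thread is fine. Here is a rough sketch of that; `assert_send` and `hover_in_thread` are hypothetical helpers written for this example, not something from the standard library.

```rust
use std::thread;

// Only compiles when called with a type that implements `Send`.
fn assert_send<T: Send>() {}

// Moves an owned String into a new thread and returns what the thread produced.
fn hover_in_thread(reg: String) -> String {
    let t = thread::spawn(move || format!("{} is hovering", reg));
    t.join().unwrap()
}

fn main() {
    assert_send::<String>();
    // Uncommenting the next line fails to compile, because Rc is not Send:
    // assert_send::<std::rc::Rc<String>>();
    println!("{}", hover_in_thread("G-HONI".to_string()));
}
```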

Introducing Arc


Arc stands for "Atomically Reference Counted" and it's the thread-safe sibling of Rc.

use std::sync::Arc;                                                                                      
use std::thread;

struct Heli {
    reg: Arc<String>
}

impl Heli {
    fn new(reg: Arc<String>) -> Heli {
        Heli { reg: reg }
    }

    fn hover(& self) {
        println!("{} is hovering", self.reg);
    }
}

fn main() {
    let reg = Arc::new("G-HONI".to_string());
    let heli = Heli::new(reg.clone());

    let t = thread::spawn(move || {
        heli.hover();
    });
    println!("Registration {}", reg);

    t.join().unwrap();
}

This one works, and the only difference is that we used Arc instead of Rc. Cool, so now we have a very efficient but thread-unsafe way to share data between different parts of the code, and a thread-safe mechanism as well. We're done then? Not quite! This code won't work:

use std::sync::Arc;                                                                                      
use std::thread;

struct Heli {
    reg: Arc<String>,
    status: Arc<String>
}

impl Heli {
    fn new(reg: Arc<String>, status: Arc<String>) -> Heli {
        Heli { reg: reg,
               status: status }
    }

    fn hover(& self) {
        self.status.clear();
        self.status.push_str("hovering");
        println!("{} is {}", self.reg, self.status);
    }
}   

fn main() { 
    let reg = Arc::new("G-HONI".to_string());
    let status = Arc::new("".to_string());
    let mut heli = Heli::new(reg.clone(), status.clone());

    let t = thread::spawn(move || {
        heli.hover();
    });
    println!("main: {} is {}", reg, status);

    t.join().unwrap();
}

This gives us two errors:

error: cannot borrow immutable borrowed content as mutable
  --> sample7.rs:16:9
   |
16 |         self.status.clear();
   |         ^^^^^^^^^^^ cannot borrow as mutable

error: cannot borrow immutable borrowed content as mutable
  --> sample7.rs:17:9
   |
17 |         self.status.push_str("hovering");
   |         ^^^^^^^^^^^ cannot borrow as mutable

The issue is that Arc by itself can't guard against data being mutated from different threads at the same time, and hence it only gives you shared (immutable) references to the contained data.

Mutex


For sharing mutable data between threads, you need another type in combination with Arc: Mutex. Let's make the above code work:

use std::sync::Arc;                                                                                      
use std::sync::Mutex;
use std::thread;

struct Heli {
    reg: Arc<String>,
    status: Arc<Mutex<String>>
}

impl Heli {
    fn new(reg: Arc<String>, status: Arc<Mutex<String>>) -> Heli {
        Heli { reg: reg,
               status: status }
    }

    fn hover(& self) {
        let mut status = self.status.lock().unwrap();
        status.clear();
        status.push_str("hovering");
        println!("thread: {} is {}", self.reg, status.as_str());
    }
}
    
fn main() {
    let reg = Arc::new("G-HONI".to_string());
    let status = Arc::new(Mutex::new("".to_string()));
    let heli = Heli::new(reg.clone(), status.clone());

    let t = thread::spawn(move || {
        heli.hover();
    });

    println!("main: {} is {}", reg, status.lock().unwrap().as_str());

    t.join().unwrap();
}

This code will work. Notice how you don't have to explicitly unlock the mutex after using it. Rust is all about scopes: lock() returns a guard, and when that guard goes out of scope, the mutex is automatically unlocked.
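To make the scope-based unlocking a bit more visible, here is a small sketch with several threads bumping a shared counter. The `count_with_threads` function is a made-up example and not part of the Heli code above.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `n_threads` threads that each increment a shared counter once.
fn count_with_threads(n_threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..n_threads {
        let counter = counter.clone();
        handles.push(thread::spawn(move || {
            let mut guard = counter.lock().unwrap();
            *guard += 1;
            // `guard` goes out of scope here, which unlocks the mutex.
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    let status = Mutex::new(String::new());
    {
        let mut guard = status.lock().unwrap();
        guard.push_str("hovering");
    } // the inner block ends, so the mutex is unlocked before we lock it again

    println!("status: {}", status.lock().unwrap().as_str());
    println!("count: {}", count_with_threads(4));
}
```

If the guard in main() were not confined to its own block, the second lock() call would block forever (or panic), since the same thread would still be holding the lock.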

Other container types


Mutexes are rather expensive, and sometimes you have data shared between threads but not all threads are mutating it (all the time). That's where RwLock becomes useful. I won't go into details here, but it's almost identical to Mutex, except that threads can take read-only locks. Since it's possible to safely share non-mutable state between threads, this can be a lot more efficient than threads locking each other out each time they access the data.
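A minimal sketch of how that might look, assuming one writer that sets the status up front and a few reader threads (`read_from_threads` is a name invented for this example):

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Writes a status once, then lets `n_readers` threads read it.
fn read_from_threads(n_readers: usize) -> Vec<String> {
    let status = Arc::new(RwLock::new(String::new()));

    {
        // Only one thread at a time can hold the write lock.
        let mut writer = status.write().unwrap();
        writer.push_str("hovering");
    } // write lock released here

    let mut handles = Vec::new();
    for _ in 0..n_readers {
        let status = status.clone();
        handles.push(thread::spawn(move || {
            // Any number of threads can hold a read lock at the same time.
            let reader = status.read().unwrap();
            reader.clone()
        }));
    }

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    for status in read_from_threads(3) {
        println!("reader saw: {}", status);
    }
}
```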

Another container type I didn't mention above is Box. The basic use of Box is as a very generic and simple way of allocating data on the heap. It's typically used to turn an unsized type into a sized type. The module documentation has a simple example of that.
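One common case of an unsized type is a trait object. The sketch below is only an illustration: the `Aircraft` trait, the `Plane` struct and its registration are made up, and it uses the `dyn` keyword, which newer Rust versions expect for trait objects.

```rust
trait Aircraft {
    fn describe(&self) -> String;
}

struct Heli {
    reg: String,
}

struct Plane {
    reg: String,
}

impl Aircraft for Heli {
    fn describe(&self) -> String {
        format!("helicopter {}", self.reg)
    }
}

impl Aircraft for Plane {
    fn describe(&self) -> String {
        format!("plane {}", self.reg)
    }
}

// `dyn Aircraft` has no size known at compile time, but `Box<dyn Aircraft>`
// does, so different aircraft types can live in the same Vec.
fn fleet() -> Vec<Box<dyn Aircraft>> {
    let mut fleet: Vec<Box<dyn Aircraft>> = Vec::new();
    fleet.push(Box::new(Heli { reg: "G-HONI".to_string() }));
    fleet.push(Box::new(Plane { reg: "SE-ABC".to_string() }));
    fleet
}

fn descriptions() -> Vec<String> {
    fleet().iter().map(|aircraft| aircraft.describe()).collect()
}

fn main() {
    for line in descriptions() {
        println!("{}", line);
    }
}
```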

What about lifetimes


One of my colleagues who had had some experience with Rust was surprised that I didn't cover lifetimes in my talk. Firstly, I think the topic deserves a separate talk of its own. Secondly, if you make clever use of the container types available to you and described above, most often you don't have to deal with lifetimes. Thirdly, lifetimes in Rust are something that I still struggle with each time I have to deal with them, so I feel a bit unqualified to teach others about how they work.

The end


I hope you find some of the information above useful. If you are looking for other resources on learning Rust, the Rust book is currently your best bet. I am still a newbie at Rust so if you see some mistakes in this post, please do let me know in the comments section.

Happy safe hacking!

Thursday, April 6, 2017

GNOME ❤ Rust Hackfest in Mexico

While I'm known as a Vala fanboy in GNOME, I've tried to stress time and again that I see Vala as more a practical solution than an ideal one. "Safe programming" has always been something that intrigued me, having dealt with numerous crashes and other hard-to-debug runtime issues in the past. So when I first heard of Rust some years back, it got me super excited, but it was not exactly stable and there was no integration with GNOME libraries or D-Bus, and hence it was not at all a viable option for developing desktop code. Lately (in the past 2 years) things have significantly changed. Not only do we have Rust 1.0, but we also have crates that provide integration with GNOME libraries and D-Bus. On top of that, some of us took steps to start converting some C code into Rust, and many of us started seriously talking with Rust hackers to make Rust a first-class programming language for GNOME.

To make things really go forward, we decided to arrange a hackfest, which took place last week at the Red Hat offices in Mexico City. The event was a big success in my opinion. The actual work done and started during the hackfest aside, it brought two communities much closer together, and we learnt quite a lot from each other in a very short amount of time. The main topics at the hackfest were:
  • GObject-introspection consumption by Rust.
  • GObject creation from Rust.
  • Better out of the box Rust support in GNOME Builder
  • GMainLoop and Tokio integration
  • D-Bus bindings
While most folks were focused on the first three, and I did participate in discussions on all these topics (except for Builder, of which I don't know anything), I spent most of my time looking into the last one. D-Bus is widely used in the automotive industry these days, and I serve that industry, so it made sense, aside from my personal interest in D-Bus. We established (some of it before the hackfest) that to make Rust attractive to C and Vala developers, we need to provide:
  1. Syntactic sugar for making D-Bus coding simple

    Very similar to what Vala offers. Antoni already had a project, dbus-macros, that targets this goal through the use of Rust's (powerful) macro system. So I spent a lot of time fixing and improving the dbus-macros crate. Having Antoni and other Rust experts in the same room greatly helped me get around some very hard-to-decipher compiler issues. I found out (the hard way) that while rustc is very good at spotting errors, it fails miserably to give you the proper context when it comes to macros. I complained enough about this to the Mozilla folks that I'm sure they'll be looking into fixing that experience at some point in the near future. :)

    We also contacted the author of the dbus crate, David Henningsson, over e-mail about a few D-Bus related subjects (more below), including this one. (I was surprised to find out that he also lives in Sweden.) His opinion was that we probably should be using procedural macros for this. I agree with him, except that procedural macros are not yet in stable Rust. So for now, I decided to continue with the current approach of the project.

    During the hackfest, I became the maintainer of the dbus-macros crate since the first thing I did was to reduce the very small amount of code by 70%. Next, I created a backlog for myself and worked my way through it one issue at a time. I'm going to continue with that.

  2. Asynchronous D-Bus methods

    While the ability to make D-Bus method calls asynchronously from clients is very important (you don't want to block the UI for your IPC), it would also be very nice for services to be able to handle method calls asynchronously. Brian Anderson from Mozilla was working on this during the hackfest. His approach was to hack the dbus crate to add an async API through the use of the tokio crate. I spent most of the second day of the hackfest sitting next to Brian for some pair programming. The author of tokio, Alex Crichton, sitting next to us, helped us a lot in understanding the tokio API. In the end, Brian submitted a working proof of concept for client-side async calls, which will hopefully provide a very good basis for David's actual implementation.

  3. Code generation from D-Bus introspection XML

    With both GLib and Qt having provided utilities to generate code for handling D-Bus for a decade now, most projects doing D-Bus make use of this. My intention was to look into this during the hackfest, but just before, I found out that David had not only already started this work in the dbus crate, but also that his approach is exactly what I'd have gone for. So while I decided not to work on this, I did have lengthy (electronic) conversations with David about how to consolidate code generation with dbus-macros.

    Ideally, the API of the generated code should be very similar to the one you'd manually create using dbus-macros, to make it easy for developers to switch from one approach to the other. But since David and I didn't agree with the current dbus-macros approach, I kind of gave up on this goal, at least for now. Once procedural macros stabilize, there is a good chance we will change dbus-macros (though it'll be a completely new version or maybe even a different crate) to make use of them, and we can revisit consolidation of code generation and dbus-macros.
A few weeks prior to the event, I decided to create a new project, gps-share. The aim is to provide the ability to share your (standalone) GPS device from your laptop/desktop with other devices on the network, and at the same time add standalone GPS device support to Geoclue (without any new feature code in Geoclue). I decided to write it in Rust for a few reasons, one of them being my desire to learn enough about the language before the event (I hadn't written any serious/complicated code in Rust before), and another was to have an actual test case for D-Bus adventures (it's supposed to talk to Avahi over D-Bus). I'm glad that I did that, since I encountered a few issues with dbus-macros when using them in gps-share, and the awesome Mozilla folks were able to help me figure them out very quickly. Otherwise it would have taken me a very long time to figure out the issues.




On the last day of the hackfest, after a delicious lunch, we decided to go for a long stroll around Mexico City and hang out in the park, where we had more interesting conversations about life, the universe and everything (including Rust and GNOME).

After the hackfest, I stayed around for 3 more days. On Saturday, I mostly hung out with Federico, Christian, Antoni and Joaquín. We walked around the city center and watched Federico and Joaquín being interviewed by the Rancho Electronico folks. I was really excited to see that they use GNOME for their desktop and GStreamer for streaming. The guy handling the streaming was very glad to meet someone with GStreamer experience.

On Sunday, I rented a car and went on a hike at Tepoztlán with Felipe. Driving in Mexico wasn't easy, so having a Mexican with me helped a lot.


And on Monday, we drove to the Sun pyramid.


I would like to thank both the GNOME Foundation and my employer, Pelagicore, for sponsoring my participation in this event.


Tuesday, March 7, 2017

GDP meets GSoC

Are you a student? Passionate about Open Source? Want your code to run on the next generation of automobiles? You're in luck! Genivi Development Platform will be participating in Google Summer of Code this summer and you are welcome to participate. We have collected a bunch of ideas for what would be a good 3-month project for a student, but you're more than welcome to suggest your own project. The ideas page also has instructions on how to get started with GDP.

We look forward to your participation!