Welcome to Comprehensive Rust 🦀

This is a four-day Rust course developed by the Android team at Google. The course covers the full spectrum of Rust, from basic syntax to advanced topics such as generics and error handling. Android-specific content is covered in a separate deep dive.

The latest version of the course can be found at https://google.github.io/comprehensive-rust/. If you are reading somewhere else, please check there for updates.

The goal of the course is to teach you Rust. We assume you don’t know anything about Rust and hope to:

  • Give you a comprehensive understanding of the Rust syntax and language.
  • Enable you to modify existing programs and write new programs in Rust.
  • Show you common Rust idioms.

We call the first four course days Rust Fundamentals.

Building on this, you’re invited to dive into one or more specialized topics:

  • Android: a half-day course on using Rust for Android platform development (AOSP). This includes interoperability with C, C++, and Java.
  • Chromium: a half-day course on using Rust within Chromium-based browsers. This includes interoperability with C++ and how to include third-party crates in Chromium.
  • Bare Metal: a whole-day class on using Rust for bare-metal (embedded) development. Both microcontrollers and application processors are covered.
  • Concurrency: a whole-day class on concurrency in Rust. We cover both classical concurrency (preemptive scheduling using threads and mutexes) and async/await concurrency (cooperative multitasking using futures).

Non-Goals

Rust is a large language and we won’t be able to cover all of it in a few days. Some non-goals of this course are:

Assumptions

The course assumes that you already know how to program. Rust is a statically typed language and we will sometimes make comparisons with C and C++ to better explain or contrast the Rust approach.

If you know how to program in a dynamically typed language such as Python or JavaScript, then you will be able to follow along just fine too.

This is an example of a speaker note. We will use these to add additional information to the slides. This could be key points which the instructor should cover as well as answers to typical questions which come up in class.

Running the Course

This page is for the course instructor.

Here is a bit of background information about how we’ve been running the course internally at Google.

We typically run classes from 9:00 am to 4:00 pm, with a 1 hour lunch break in the middle. This leaves 3 hours for the morning class and 3 hours for the afternoon class. Both sessions contain multiple breaks and time for students to work on exercises.

Before you run the course, you will want to:

  1. Make yourself familiar with the course material. We’ve included speaker notes to help highlight the key points (please help us by contributing more speaker notes!). When presenting, make sure to open the speaker notes in a popup (click the link with a little arrow next to “Speaker Notes”). This way you have a clean screen to present to the class.

  2. Decide on the dates. Since the course takes four days, we recommend that you schedule the days over two weeks. Course participants have said that they find it helpful to have a gap in the course since it helps them process all the information we give them.

  3. Find a room large enough for your in-person participants. We recommend a class size of 15-25 people. That’s small enough that people are comfortable asking questions and that one instructor has time to answer them. Make sure the room has desks for yourself and for the students: you will all need to be able to sit and work with your laptops. In particular, you will be doing a lot of live-coding as an instructor, so a lectern won’t be very helpful for you.

  4. On the day of your course, show up to the room a little early to set things up. We recommend presenting directly using mdbook serve running on your laptop (see the installation instructions). This ensures optimal performance with no lag as you change pages. Using your laptop will also allow you to fix typos as you or the course participants spot them.

  5. Let people solve the exercises by themselves or in small groups. We typically spend 30-45 minutes on exercises in the morning and in the afternoon (including time to review the solutions). Make sure to ask people if they’re stuck or if there is anything you can help with. When you see that several people have the same problem, call it out to the class and offer a solution, e.g., by showing people where to find the relevant information in the standard library.

That is all, good luck running the course! We hope it will be as much fun for you as it has been for us!

Please provide feedback afterwards so that we can keep improving the course. We would love to hear what worked well for you and what can be made better. Your students are also very welcome to send us feedback!

Course Structure

This page is for the course instructor.

Rust Fundamentals

The first four days make up Rust Fundamentals. The days are fast paced and we cover a lot of ground!

Course schedule:

Deep Dives

In addition to the 4-day class on Rust Fundamentals, we cover some more specialized topics:

Rust in Android

The Rust in Android deep dive is a half-day course on using Rust for Android platform development. This includes interoperability with C, C++, and Java.

You will want an AOSP checkout. Make a checkout of the course repository on the same machine and move the src/android/ directory into the root of your AOSP checkout. This will ensure that the Android build system sees the Android.bp files in src/android/.

Ensure that adb sync works with your emulator or real device and pre-build all Android examples using src/android/build_all.sh. Read the script to see the commands it runs and make sure they work when you run them by hand.

Rust in Chromium

The Rust in Chromium deep dive is a half-day course on using Rust as part of the Chromium browser. It includes using Rust in Chromium’s gn build system, bringing in third-party libraries (“crates”) and C++ interoperability.

You will need to be able to build Chromium — a debug, component build is recommended for speed but any build will work. Ensure that you can run the Chromium browser that you’ve built.

Bare-Metal Rust

The Bare-Metal Rust deep dive is a full-day class on using Rust for bare-metal (embedded) development. Both microcontrollers and application processors are covered.

For the microcontroller part, you will need to buy the BBC micro:bit v2 development board ahead of time. Everybody will need to install a number of packages as described on the welcome page.

Concurrency in Rust

The Concurrency in Rust deep dive is a full-day class on classical as well as async/await concurrency.

You will need a fresh crate set up and the dependencies downloaded and ready to go. You can then copy/paste the examples into src/main.rs to experiment with them:

cargo init concurrency
cd concurrency
cargo add tokio --features full
cargo run

Format

The course is meant to be very interactive and we recommend letting the questions drive the exploration of Rust!

Keyboard Shortcuts

There are several useful keyboard shortcuts in mdBook:

  • Arrow-Left: navigate to the previous page.
  • Arrow-Right: navigate to the next page.
  • Ctrl + Enter: execute the code sample that has focus.
  • s: activate the search bar.

Translations

The course has been translated into other languages by a set of wonderful volunteers:

Use the language picker in the top-right corner to switch between languages.

Incomplete Translations

There is a large number of in-progress translations. We link to the most recently updated translations:

If you want to help with this effort, please see our instructions for how to get going. Translations are coordinated on the issue tracker.

Using Cargo

When you start reading about Rust, you will soon meet Cargo, the standard tool used in the Rust ecosystem to build and run Rust applications. Here we want to give a brief overview of what Cargo is, how it fits into the wider ecosystem, and how it fits into this training.

Installation

Please follow the instructions on https://rustup.rs/.

This will give you the Cargo build tool (cargo) and the Rust compiler (rustc). You will also get rustup, a command line utility that you can use to install different compiler versions.

After installing Rust, you should configure your editor or IDE to work with Rust. Most editors do this by talking to rust-analyzer, which provides auto-completion and jump-to-definition functionality for VS Code, Emacs, Vim/Neovim, and many others. There is also a different IDE available called RustRover.

  • On Debian/Ubuntu, you can also install Cargo, the Rust source and the Rust formatter via apt. However, this gets you an outdated Rust version and may lead to unexpected behavior. The command would be:

    sudo apt install cargo rust-src rustfmt
    

The Rust Ecosystem

The Rust ecosystem consists of a number of tools, of which the main ones are:

  • rustc: the Rust compiler which turns .rs files into binaries and other intermediate formats.

  • cargo: the Rust dependency manager and build tool. Cargo knows how to download dependencies, usually hosted on https://crates.io, and it will pass them to rustc when building your project. Cargo also comes with a built-in test runner which is used to execute unit tests.

  • rustup: the Rust toolchain installer and updater. This tool is used to install and update rustc and cargo when new versions of Rust are released. In addition, rustup can also download documentation for the standard library. You can have multiple versions of Rust installed at once and rustup will let you switch between them as needed.

Key points:

  • Rust has a rapid release schedule with a new release coming out every six weeks. New releases maintain backwards compatibility with old releases, plus enable new functionality.

  • There are three release channels: “stable”, “beta”, and “nightly”.

  • New features are tested on “nightly”, and “beta” is what becomes “stable” every six weeks.

  • Dependencies can also be resolved from alternative registries, git, folders, and more.

  • Rust also has editions: at the time of writing, the current edition is Rust 2021. Previous editions were Rust 2015 and Rust 2018.

    • The editions are allowed to make backwards incompatible changes to the language.

    • To prevent breaking code, editions are opt-in: you select the edition for your crate via the Cargo.toml file.

    • To avoid splitting the ecosystem, Rust compilers can mix code written for different editions.

  • Mention that it is quite rare to ever use the compiler directly instead of through cargo (most users never do).

  • It might be worth mentioning that Cargo itself is an extremely powerful and comprehensive tool. It is capable of many advanced features including but not limited to:

  • Read more in the official Cargo Book.

Code Samples in This Training

For this training, we will mostly explore the Rust language through examples which can be executed in your browser. This makes the setup much easier and ensures a consistent experience for everyone.

Installing Cargo is still encouraged: it will make it easier for you to do the exercises. On the last day, we will do a larger exercise which shows you how to work with dependencies, and for that you need Cargo.

The code blocks in this course are fully interactive:

fn main() {
    println!("Edit me!");
}

You can use Ctrl + Enter to execute the code when focus is in the text box.

Most code samples are editable, as shown above. A few code samples are not editable for various reasons:

  • The embedded playgrounds cannot execute unit tests. Copy-paste the code and open it in the real Playground to demonstrate unit tests.

  • The embedded playgrounds lose their state the moment you navigate away from the page. This is the reason that the students should solve the exercises using a local Rust installation or via the Playground.

Running Code Locally with Cargo

If you want to experiment with the code on your own system, then you will need to first install Rust. Do this by following the instructions in the Rust Book. This should give you a working rustc and cargo. At the time of writing, the latest stable Rust release has these version numbers:

% rustc --version
rustc 1.69.0 (84c898d65 2023-04-16)
% cargo --version
cargo 1.69.0 (6e9a83356 2023-04-12)

You can use any later version too since Rust maintains backwards compatibility.

With this in place, follow these steps to build a Rust binary from one of the examples in this training:

  1. Click the “Copy to clipboard” button on the example you want to copy.

  2. Use cargo new exercise to create a new exercise/ directory for your code:

    $ cargo new exercise
         Created binary (application) `exercise` package
    
  3. Navigate into exercise/ and use cargo run to build and run your binary:

    $ cd exercise
    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.75s
         Running `target/debug/exercise`
    Hello, world!
    
  4. Replace the boilerplate code in src/main.rs with your own code. For example, using the example on the previous page, make src/main.rs look like this:

    fn main() {
        println!("Edit me!");
    }

  5. Use cargo run to build and run your updated binary:

    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.24s
         Running `target/debug/exercise`
    Edit me!
    
  6. Use cargo check to quickly check your project for errors, and cargo build to compile it without running it. You will find the output in target/debug/ for a normal debug build. Use cargo build --release to produce an optimized release build in target/release/.

  7. You can add dependencies to your project by editing Cargo.toml. When you run cargo commands, Cargo will automatically download and compile missing dependencies for you.

Try to encourage the class participants to install Cargo and use a local editor. It will make their life easier since they will have a normal development environment.

Welcome to Day 1

This is the first day of Comprehensive Rust. We will cover a lot of ground today:

  • Basic Rust syntax: variables, scalar and compound types, enums, structs, references, functions, and methods.
  • Types and type inference.
  • Control flow constructs: loops, conditionals, and so on.
  • User-defined types: structs and enums.
  • Pattern matching: destructuring enums, structs, and arrays.

Schedule

In this session:

Including 10 minute breaks, this session should take about 3 hours

This slide should take about 5 minutes.

Please remind the students that:

  • They should ask questions as they come up rather than saving them for the end.
  • The course is meant to be very interactive and we recommend letting the questions drive the exploration of Rust.
    • As an instructor, you should try to keep the discussions relevant, i.e., keep the discussions related to how Rust does things vs some other language. It can be hard to find the right balance, but err on the side of allowing discussions since they engage people much more than one-way communication.
  • The questions will likely mean that we talk about things ahead of the slides.
    • This is perfectly okay! Repetition is an important part of learning. Remember that the slides are just a support and you are free to skip them as you like.

The idea for the first day is to show the “basic” things in Rust that should have immediate parallels in other languages. The more advanced parts of Rust come on the subsequent days.

If you’re teaching this in a classroom, this is a good place to go over the schedule. Note that there is an exercise at the end of each segment, followed by a break. Plan to cover the exercise solution after the break. The times listed here are a suggestion in order to keep the course on schedule. Feel free to be flexible and adjust as necessary!

Hello, World

In this segment:

This segment should take about 20 minutes

What is Rust?

Rust is a new programming language which had its 1.0 release in 2015:

  • Rust is a statically compiled language in a similar role as C++
    • rustc uses LLVM as its backend.
  • Rust supports many platforms and architectures:
    • x86, ARM, WebAssembly, …
    • Linux, Mac, Windows, …
  • Rust is used for a wide range of devices:
    • firmware and boot loaders,
    • smart displays,
    • mobile phones,
    • desktops,
    • servers.
This slide should take about 10 minutes.

Rust fits in the same area as C++:

  • High flexibility.
  • High level of control.
  • Can be scaled down to very constrained devices such as microcontrollers.
  • Has no runtime or garbage collection.
  • Focuses on reliability and safety without sacrificing performance.

Hello, World

Let us jump into the simplest possible Rust program, a classic Hello World program:

fn main() {
    println!("Hello 🌍!");
}

What you see:

  • Functions are introduced with fn.
  • Blocks are delimited by curly braces like in C and C++.
  • The main function is the entry point of the program.
  • Rust has hygienic macros; println! is an example of this.
  • Rust strings are UTF-8 encoded and can contain any Unicode character.
This slide should take about 5 minutes.

This slide tries to make the students comfortable with Rust code. They will see a ton of it over the next four days so we start small with something familiar.

Key points:

  • Rust is very much like other languages in the C/C++/Java tradition. It is imperative and it doesn’t try to reinvent things unless absolutely necessary.

  • Rust is modern with full support for things like Unicode.

  • Rust uses macros for situations where you want to have a variable number of arguments (no function overloading).

  • Macros being ‘hygienic’ means they don’t accidentally capture identifiers from the scope they are used in. Rust macros are actually only partially hygienic.

  • Rust is multi-paradigm. For example, it has object-oriented programming features, and, while it is not a functional language, it includes a range of functional concepts.
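
The two macro-related points above can be illustrated with a small sketch. The macros sum_all! and add_ten! are hypothetical names invented for this illustration; they are not part of the course or the standard library.

```rust
// Hypothetical macro: accepts one or more comma-separated expressions,
// something a plain function cannot do without overloading.
macro_rules! sum_all {
    // Base case: a single expression.
    ($x:expr) => { $x };
    // Recursive case: the first expression plus the sum of the rest.
    ($x:expr, $($rest:expr),+) => { $x + sum_all!($($rest),+) };
}

// Hypothetical macro: declares its own local `x`. Thanks to hygiene, this `x`
// is distinct from any `x` that appears in the caller's expression.
macro_rules! add_ten {
    ($e:expr) => {{
        let x = 10;
        $e + x
    }};
}

fn main() {
    // The same macro handles different argument counts.
    println!("{}", sum_all!(1));
    println!("{}", sum_all!(1, 2));
    println!("{}", sum_all!(1, 2, 3, 4));

    // The caller's `x` (1) is not captured by the macro's internal `x` (10).
    let x = 1;
    println!("{}", add_ten!(x)); // prints 11, not 20
}
```

Without hygiene, the expansion of add_ten!(x) would read the macro’s own x twice and produce 20.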

Benefits of Rust

Some unique selling points of Rust:

  • Compile time memory safety - whole classes of memory bugs are prevented at compile time

    • No uninitialized variables.
    • No double-frees.
    • No use-after-free.
    • No NULL pointers.
    • No forgotten locked mutexes.
    • No data races between threads.
    • No iterator invalidation.
  • No undefined runtime behavior - what a Rust statement does is never left unspecified

    • Array access is bounds checked.
    • Integer overflow is defined (panic or wrap-around).
  • Modern language features - as expressive and ergonomic as higher-level languages

    • Enums and pattern matching.
    • Generics.
    • No overhead FFI.
    • Zero-cost abstractions.
    • Great compiler errors.
    • Built-in dependency manager.
    • Built-in support for testing.
    • Excellent Language Server Protocol support.
This slide should take about 3 minutes.

Do not spend much time here. All of these points will be covered in more depth later.

Make sure to ask the class which languages they have experience with. Depending on the answer you can highlight different features of Rust:

  • Experience with C or C++: Rust eliminates a whole class of runtime errors via the borrow checker. You get performance like in C and C++, but you don’t have the memory unsafety issues. In addition, you get a modern language with constructs like pattern matching and built-in dependency management.

  • Experience with Java, Go, Python, JavaScript, etc.: You get the same memory safety as in those languages, plus a similar high-level language feeling. In addition you get fast and predictable performance like C and C++ (no garbage collector) as well as access to low-level hardware (should you need it).
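
If the class asks what “bounds checked” and “no NULL pointers” look like in practice, a minimal sketch such as the following may help. The helper element_or_default is a hypothetical name used only for this illustration.

```rust
// Hypothetical helper: looks up an element without risking an
// out-of-bounds read.
fn element_or_default(values: &[i32], index: usize, default: i32) -> i32 {
    // `get` returns Option<&i32>: Some(&value) when the index is in bounds,
    // None otherwise. There is no null pointer to dereference by mistake.
    match values.get(index) {
        Some(value) => *value,
        None => default,
    }
}

fn main() {
    let data = [10, 20, 30];
    println!("{}", element_or_default(&data, 1, -1)); // in bounds
    println!("{}", element_or_default(&data, 7, -1)); // out of bounds, falls back

    // Plain indexing such as `data[i]` is also checked: an out-of-range `i`
    // panics instead of reading arbitrary memory.
}
```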

Playground

The Rust Playground provides an easy way to run short Rust programs, and is the basis for the examples and exercises in this course. Try running the “hello-world” program it starts with. It comes with a few handy features:

  • Under “Tools”, use the rustfmt option to format your code in the “standard” way.

  • Rust has two main “profiles” for generating code: Debug (extra runtime checks, less optimization) and Release (fewer runtime checks, lots of optimization). These are accessible under “Debug” at the top.

  • If you’re interested, use “ASM” under “…” to see the generated assembly code.

This slide should take about 2 minutes.

As students head into the break, encourage them to open up the playground and experiment a little. Encourage them to keep the tab open and try things out during the rest of the course. This is particularly helpful for advanced students who want to know more about Rust’s optimizations or generated assembly.

Types and Values

In this segment:

This segment should take about 1 hour and 5 minutes

Variables

Rust provides type safety via static typing. Variable bindings are made with let:

fn main() {
    let x: i32 = 10;
    println!("x: {x}");
    // x = 20;
    // println!("x: {x}");
}
This slide should take about 5 minutes.
  • Uncomment the x = 20 to demonstrate that variables are immutable by default. Add the mut keyword to allow changes.

  • The i32 here is the type of the variable. This must be known at compile time, but type inference (covered later) allows the programmer to omit it in many cases.
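
A minimal sketch of the immutable-by-default behavior described above. The function bump is a hypothetical helper added only so the mutation can be shown in isolation.

```rust
// Hypothetical helper: copies `start` into a mutable binding and changes it.
fn bump(start: i32) -> i32 {
    let mut value = start; // `mut` opts in to reassignment
    value += 10;
    value
}

fn main() {
    let x: i32 = 10;
    println!("x: {x}");
    // x = 20; // does not compile: `x` was not declared with `mut`

    let mut y: i32 = 10;
    println!("y: {y}");
    y = 20; // allowed because `y` is declared `mut`
    println!("y: {y}");

    println!("bump(10): {}", bump(10));
}
```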

Values

Here are some basic built-in types, and the syntax for literal values of each type.

                        Types                           Literals
Signed integers         i8, i16, i32, i64, i128, isize  -10, 0, 1_000, 123_i64
Unsigned integers       u8, u16, u32, u64, u128, usize  0, 123, 10_u16
Floating-point numbers  f32, f64                        3.14, -10.0e20, 2_f32
Unicode scalar values   char                            'a', 'α', '∞'
Booleans                bool                            true, false

The types have widths as follows:

  • iN, uN, and fN are N bits wide,
  • isize and usize are the width of a pointer,
  • char is 32 bits wide,
  • bool is 8 bits wide.
This slide should take about 10 minutes.

There are a few syntaxes which are not shown above:

  • All underscores in numbers can be left out; they are for legibility only. So 1_000 can be written as 1000 (or 10_00), and 123_i64 can be written as 123i64.
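
The widths and literal forms listed above can be checked directly. This is a small sketch using std::mem::size_of, which reports sizes in bytes.

```rust
use std::mem::size_of;

fn main() {
    // Underscores are purely visual separators.
    assert_eq!(1_000, 1000);
    assert_eq!(10_00, 1000);
    // A type suffix may be written with or without the underscore.
    assert_eq!(123_i64, 123i64);

    // Widths in bytes (8 bits each).
    assert_eq!(size_of::<i8>(), 1);
    assert_eq!(size_of::<u64>(), 8);
    assert_eq!(size_of::<f32>(), 4);
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(size_of::<bool>(), 1);

    // isize and usize match the pointer width of the target platform.
    assert_eq!(size_of::<isize>(), size_of::<usize>());
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());

    println!("all checks passed");
}
```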

Arithmetic

fn interproduct(a: i32, b: i32, c: i32) -> i32 {
    return a * b + b * c + c * a;
}

fn main() {
    println!("result: {}", interproduct(120, 100, 248));
}
This slide should take about 5 minutes.

This is the first time we’ve seen a function other than main, but the meaning should be clear: it takes three integers, and returns an integer. Functions will be covered in more detail later.

Arithmetic is very similar to other languages, with similar precedence.

What about integer overflow? In C and C++ overflow of signed integers is actually undefined, and might do different things on different platforms or compilers. In Rust, it’s defined.

Change the i32’s to i16 to see an integer overflow, which panics (checked) in a debug build and wraps in a release build. There are other options, such as overflowing, saturating, and carrying. These are accessed with method syntax, e.g., (a * b).saturating_add(b * c).saturating_add(c * a).

In fact, the compiler will detect overflow of constant expressions, which is why the example requires a separate function.
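
As a sketch of the explicit overflow options mentioned above, the following variants of interproduct use i16 so that the slide’s inputs overflow. The function names interproduct_checked and interproduct_saturating are hypothetical and exist only for this illustration.

```rust
// Hypothetical variant: returns None as soon as any step overflows i16.
fn interproduct_checked(a: i16, b: i16, c: i16) -> Option<i16> {
    let ab = a.checked_mul(b)?;
    let bc = b.checked_mul(c)?;
    let ca = c.checked_mul(a)?;
    ab.checked_add(bc)?.checked_add(ca)
}

// Hypothetical variant: clamps every step to the i16 range.
fn interproduct_saturating(a: i16, b: i16, c: i16) -> i16 {
    a.saturating_mul(b)
        .saturating_add(b.saturating_mul(c))
        .saturating_add(c.saturating_mul(a))
}

fn main() {
    // 120 * 100 + 100 * 248 already exceeds i16::MAX (32767).
    println!("{:?}", interproduct_checked(120, 100, 248)); // None
    println!("{:?}", interproduct_checked(2, 3, 4)); // Some(26)
    println!("{}", interproduct_saturating(120, 100, 248)); // 32767

    // Wrapping and overflowing arithmetic on a small unsigned type.
    println!("{}", 250_u8.wrapping_add(10)); // 4
    println!("{:?}", 250_u8.overflowing_add(10)); // (4, true)
}
```

Unlike the plain operators, these methods behave the same way in debug and release builds.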

Strings

Rust has two types to represent strings, both of which will be covered in more depth later. Both always store UTF-8 encoded strings.

  • String - a modifiable, owned string.
  • &str - a read-only string. String literals have this type.
fn main() {
    let greeting: &str = "Greetings";
    let planet: &str = "🪐";
    let mut sentence = String::new();
    sentence.push_str(greeting);
    sentence.push_str(", ");
    sentence.push_str(planet);
    println!("final sentence: {}", sentence);
    println!("{:?}", &sentence[0..5]);
    //println!("{:?}", &sentence[12..13]);
}
This slide should take about 10 minutes.

This slide introduces strings. Everything here will be covered in more depth later, but this is enough for subsequent slides and exercises to use strings.

  • Invalid UTF-8 in a string is UB, and this is not allowed in safe Rust.

  • String is a user-defined type with a constructor (::new()) and methods like s.push_str(..).

  • The & in &str indicates that this is a reference. We will cover references later, so for now just think of &str as a unit meaning “a read-only string”.

  • The commented-out line is indexing into the string by byte position. 12..13 does not end on a character boundary, so the program panics. Adjust it to a range that does, based on the error message.

  • Raw strings allow you to create a &str value with escapes disabled: r"\n" == "\\n". You can embed double-quotes by using an equal amount of # on either side of the quotes:

    fn main() {
        println!(r#"<a href="link.html">link</a>"#);
        println!("<a href=\"link.html\">link</a>");
    }
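
To find a byte range that does fall on character boundaries in the slide’s sentence, a sketch like the following can be used. The helper build_sentence is a hypothetical name; it rebuilds the same string as the slide.

```rust
// Hypothetical helper: rebuilds the sentence from the slide.
fn build_sentence() -> String {
    let mut sentence = String::new();
    sentence.push_str("Greetings");
    sentence.push_str(", ");
    sentence.push_str("🪐");
    sentence
}

fn main() {
    let sentence = build_sentence();

    // "Greetings, " is 11 bytes; the planet emoji takes 4 more bytes in UTF-8.
    println!("bytes: {}, chars: {}", sentence.len(), sentence.chars().count());

    // char_indices yields the byte offset at which each character starts.
    for (offset, ch) in sentence.char_indices().skip(9) {
        println!("byte {offset}: {ch:?}");
    }

    // `get` returns None instead of panicking when a range splits a character.
    println!("{:?}", sentence.get(12..13)); // None
    println!("{:?}", sentence.get(11..15)); // Some("🪐")
    println!("{}", sentence.is_char_boundary(12)); // false
}
```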

Type Inference

Rust will look at how the variable is used to determine the type:

fn takes_u32(x: u32) {
    println!("u32: {x}");
}

fn takes_i8(y: i8) {
    println!("i8: {y}");
}

fn main() {
    let x = 10;
    let y = 20;

    takes_u32(x);
    takes_i8(y);
    // takes_u32(y);
}
This slide should take about 5 minutes.

This slide demonstrates how the Rust compiler infers types based on constraints given by variable declarations and usages.

It is very important to emphasize that variables declared like this are not of some sort of dynamic “any type” that can hold any data. The machine code generated by such a declaration is identical to the explicit declaration of a type. The compiler does the job for us and helps us write more concise code.

When nothing constrains the type of an integer literal, Rust defaults to i32. This sometimes appears as {integer} in error messages. Similarly, floating-point literals default to f64.

fn main() {
    let x = 3.14;
    let y = 20;
    assert_eq!(x, y);
    // ERROR: no implementation for `{float} == {integer}`
}
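
The inferred types can be made visible with a small sketch. The helper type_of is a hypothetical function built on std::any::type_name, whose exact output is intended for diagnostics and is not formally guaranteed, so treat the printed names as illustrative.

```rust
use std::any::type_name;

// Hypothetical helper: returns the compiler's name for the type of `_value`.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn takes_u8(_x: u8) {}

fn main() {
    let a = 10; // nothing constrains this literal: defaults to i32
    let b = 2.5; // nothing constrains this literal: defaults to f64
    let c = 20; // constrained by the call below: inferred as u8
    takes_u8(c);

    println!("a: {}", type_of(&a));
    println!("b: {}", type_of(&b));
    println!("c: {}", type_of(&c));
}
```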

Exercise: Fibonacci

The first and second Fibonacci numbers are both 1. For n>2, the n’th Fibonacci number is calculated recursively as the sum of the n-1’th and n-2’th Fibonacci numbers.

Write a function fib(n) that calculates the n’th Fibonacci number. When will this function panic?

fn fib(n: u32) -> u32 {
    if n <= 2 {
        // The base case.
        todo!("Implement this")
    } else {
        // The recursive case.
        todo!("Implement this")
    }
}

fn main() {
    let n = 20;
    println!("fib(n) = {}", fib(n));
}

Solution

fn fib(n: u32) -> u32 {
    if n <= 2 {
        return 1;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

fn main() {
    let n = 20;
    println!("fib(n) = {}", fib(n));
}
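
On the question of when fib panics: in a debug build, the addition panics once the result no longer fits in a u32, which first happens at n = 48 (release builds wrap by default instead). The recursive version is far too slow to reach that point in practice, so the sketch below uses an iterative variant with checked arithmetic. The name fib_checked is hypothetical and not part of the exercise.

```rust
// Hypothetical iterative variant: returns None instead of overflowing.
// Like the solution above, it treats every n <= 2 as the base case.
fn fib_checked(n: u32) -> Option<u32> {
    let (mut prev, mut curr) = (1u32, 1u32);
    for _ in 2..n {
        // checked_add yields None when the sum does not fit in a u32.
        let next = prev.checked_add(curr)?;
        prev = curr;
        curr = next;
    }
    Some(curr)
}

fn main() {
    println!("{:?}", fib_checked(20)); // Some(6765)
    println!("{:?}", fib_checked(47)); // Some(2971215073), the last value that fits
    println!("{:?}", fib_checked(48)); // None: 4807526976 exceeds u32::MAX
}
```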

Control Flow

In this segment:

This segment should take about 1 hour

Conditionals

Much of the Rust syntax will be familiar to you from C, C++ or Java:

  • Blocks are delimited by curly braces.
  • Line comments are started with //, block comments are delimited by /* ... */.
  • Keywords like if and while work the same.
  • Variable assignment is done with =, comparison is done with ==.

if expressions

You use if expressions exactly like if statements in other languages:

fn main() {
    let x = 10;
    if x < 20 {
        println!("small");
    } else if x < 100 {
        println!("biggish");
    } else {
        println!("huge");
    }
}

In addition, you can use if as an expression. The last expression of each block becomes the value of the if expression:

fn main() {
    let x = 10;
    let size = if x < 20 { "small" } else { "large" };
    println!("number size: {}", size);
}
This slide should take about 5 minutes.

Because if is an expression and must have a particular type, both of its branch blocks must have the same type. Show what happens if you add ; after "small" in the second example.

When if is used in an expression, the expression must have a ; to separate it from the next statement. Remove the ; before println! to see the compiler error.
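
Since if is an expression, it can also serve as the final expression of a function body. A minimal sketch, with size_label as a hypothetical helper name:

```rust
// Hypothetical helper: the whole if / else if / else chain is the function's
// return value, so every branch must produce a &'static str.
fn size_label(x: i32) -> &'static str {
    if x < 20 {
        "small"
    } else if x < 100 {
        "biggish"
    } else {
        "huge"
    }
}

fn main() {
    for x in [10, 50, 500] {
        println!("{x}: {}", size_label(x));
    }
}
```

Adding a ; after any of the string literals would turn that branch’s value into (), and the function would no longer type-check.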

Loops

There are three looping keywords in Rust: while, loop, and for:

while

The while keyword works much like in other languages, executing the loop body as long as the condition is true.

fn main() {
    let mut x = 200;
    while x >= 10 {
        x = x / 2;
    }
    println!("Final x: {x}");
}

for

The for loop iterates over ranges of values:

fn main() {
    for x in 1..5 {
        println!("x: {x}");
    }
}

loop

The loop statement just loops forever, until a break.

fn main() {
    let mut i = 0;
    loop {
        i += 1;
        println!("{i}");
        if i > 100 {
            break;
        }
    }
}
This slide should take about 5 minutes.
  • We will discuss iteration later; for now, just stick to range expressions.
  • Note that the for loop only iterates to 4. Show the 1..=5 syntax for an inclusive range.
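
The difference between the exclusive and inclusive range forms can be sketched as follows. The function names sum_exclusive and sum_inclusive are hypothetical.

```rust
// Hypothetical helper: sums 1, 2, ..., end - 1 (the upper bound is excluded).
fn sum_exclusive(end: u32) -> u32 {
    let mut total = 0;
    for x in 1..end {
        total += x;
    }
    total
}

// Hypothetical helper: sums 1, 2, ..., end (the upper bound is included).
fn sum_inclusive(end: u32) -> u32 {
    let mut total = 0;
    for x in 1..=end {
        total += x;
    }
    total
}

fn main() {
    println!("1..5  sums to {}", sum_exclusive(5)); // 1 + 2 + 3 + 4 = 10
    println!("1..=5 sums to {}", sum_inclusive(5)); // 1 + 2 + 3 + 4 + 5 = 15
}
```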

break and continue

If you want to exit any kind of loop early, use break. In a loop expression, break can take an optional expression that becomes the value of the loop.

If you want to immediately start the next iteration use continue.

fn main() {
    let (mut a, mut b) = (100, 52);
    let result = loop {
        if a == b {
            break a;
        }
        if a < b {
            b -= a;
        } else {
            a -= b;
        }
    };
    println!("{result}");
}

Both continue and break can optionally take a label argument which is used to break out of nested loops:

fn main() {
    'outer: for x in 1..5 {
        println!("x: {x}");
        let mut i = 0;
        while i < x {
            println!("x: {x}, i: {i}");
            i += 1;
            if i == 3 {
                break 'outer;
            }
        }
    }
}

In this case we break the outer loop after 3 iterations of the inner loop.

This slide should take about 5 minutes.
  • Note that loop is the only looping construct which returns a non-trivial value. This is because it is guaranteed to be entered at least once (unlike while and for loops).
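The examples on this slide only exercise break. A minimal sketch of continue, both plain and with a label, might look like the following. The function count_pairs and its counting rule are invented purely for illustration.

```rust
/// Counts pairs (x, y) with x and y in 1..=4 where y is odd and y <= x.
/// (Hypothetical helper, for illustration only.)
fn count_pairs() -> u32 {
    let mut count = 0;
    'outer: for x in 1..=4 {
        for y in 1..=4 {
            if y > x {
                // Abandon the inner loop and start the next iteration
                // of the loop labelled 'outer.
                continue 'outer;
            }
            if y % 2 == 0 {
                // Skip the rest of this inner iteration only.
                continue;
            }
            count += 1;
        }
    }
    count
}

fn main() {
    println!("pairs counted: {}", count_pairs());
}
```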

Blocks and Scopes

Blocks

A block in Rust contains a sequence of expressions, enclosed by braces {}. Each block has a value and a type, which are those of the last expression of the block:

fn main() {
    let z = 13;
    let x = {
        let y = 10;
        println!("y: {y}");
        z - y
    };
    println!("x: {x}");
}

If the last expression ends with ;, then the resulting value and type is ().
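A short sketch of this rule, assuming a made-up helper named block_values: the first block ends in an expression, the second in a statement.

```rust
/// Returns the values of two blocks that differ only by a semicolon.
/// (Hypothetical helper, for illustration only.)
#[allow(unused_must_use)] // the discarded `y + 1;` below is intentional
fn block_values() -> (i32, ()) {
    let with_value = {
        let y = 10;
        y + 1 // no semicolon: the block evaluates to 11
    };
    let unit = {
        let y = 10;
        y + 1; // trailing semicolon: the result is dropped, the block is ()
    };
    (with_value, unit)
}

fn main() {
    let (with_value, unit) = block_values();
    println!("with_value: {with_value}, unit: {unit:?}");
}
```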

Scopes and Shadowing

A variable’s scope is limited to the enclosing block.

You can shadow variables, both those from outer scopes and variables from the same scope:

fn main() {
    let a = 10;
    println!("before: {a}");
    {
        let a = "hello";
        println!("inner scope: {a}");

        let a = true;
        println!("shadowed in inner scope: {a}");
    }

    println!("after: {a}");
}
This slide should take about 10 minutes.
  • You can show how the value of the block changes by changing the last line in the block. For instance, add or remove a semicolon, or use a return.
  • Show that a variable’s scope is limited by adding a b in the inner block in the last example, and then trying to access it outside that block.
  • Shadowing is different from mutation, because after shadowing both variables’ memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.
  • A shadowing variable can have a different type.
  • Shadowing looks obscure at first, but is convenient for holding on to values after .unwrap().
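To illustrate the last point, here is a minimal sketch in which one name is reused as a value is refined step by step. The function parse_port is hypothetical, and parse and unwrap are covered later in the course.

```rust
/// Parses a port number from text; panics if the text is not a valid u16.
/// (Hypothetical helper, for illustration only.)
fn parse_port(port: &str) -> u16 {
    // `port` is a &str here; shadow it with the trimmed text.
    let port = port.trim();
    // Shadow it again with the parsed number: same name, different type.
    let port = port.parse::<u16>().unwrap();
    port
}

fn main() {
    println!("port: {}", parse_port(" 8080 "));
}
```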

Functions

fn gcd(a: u32, b: u32) -> u32 {
    if b > 0 {
        gcd(b, a % b)
    } else {
        a
    }
}

fn main() {
    println!("gcd: {}", gcd(143, 52));
}
This slide should take about 3 minutes.
  • Declaration parameters are followed by a type (the reverse of some programming languages), then a return type.
  • The last expression in a function body (or any block) becomes the return value. Simply omit the ; at the end of the expression. The return keyword can be used for early return, but the “bare value” form is idiomatic at the end of a function (refactor gcd to use a return).
  • Some functions have no return value, and return the “unit type”, (). The compiler will infer this if the -> () return type is omitted.
  • Overloading is not supported – each function has a single implementation.
    • Always takes a fixed number of parameters. Default arguments are not supported. Macros can be used to support variadic functions.
    • Always takes a single set of parameter types. These types can be generic, which will be covered later.
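One possible way to carry out the suggested refactoring is sketched below: gcd uses return for the early exit and a bare final expression otherwise. The print_gcd helper is invented here to show a function whose return type is omitted.

```rust
fn gcd(a: u32, b: u32) -> u32 {
    if b == 0 {
        // Early return with the `return` keyword.
        return a;
    }
    // The final expression (no semicolon) is the return value.
    gcd(b, a % b)
}

/// No `->` clause, so this function returns the unit type ().
/// (Hypothetical helper, for illustration only.)
fn print_gcd(a: u32, b: u32) {
    println!("gcd({a}, {b}) = {}", gcd(a, b));
}

fn main() {
    print_gcd(143, 52);
}
```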

Macros

Macros are expanded into Rust code during compilation, and can take a variable number of arguments. They are distinguished by a ! at the end. The Rust standard library includes an assortment of useful macros.

  • println!(format, ..) prints a line to standard output, applying formatting described in std::fmt.
  • format!(format, ..) works just like println! but returns the result as a string.
  • dbg!(expression) logs the value of the expression and returns it.
  • todo!() marks a bit of code as not-yet-implemented. If executed, it will panic.
  • unreachable!() marks a bit of code as unreachable. If executed, it will panic.
fn factorial(n: u32) -> u32 {
    let mut product = 1;
    for i in 1..=n {
        product *= dbg!(i);
    }
    product
}

fn fizzbuzz(n: u32) -> u32 {
    todo!()
}

fn main() {
    let n = 4;
    println!("{n}! = {}", factorial(n));
}
This slide should take about 2 minutes.

The takeaway from this section is that these common conveniences exist, and how to use them. Why they are defined as macros, and what they expand to, is not especially critical.

The course does not cover defining macros, but a later section will describe use of derive macros.
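The example above does not exercise format! or unreachable!. A small sketch of how they might be used together with dbg! follows; the function describe_parity is made up for this illustration.

```rust
/// Describes whether `n` is even or odd.
/// (Hypothetical helper, for illustration only.)
fn describe_parity(n: u32) -> String {
    // dbg! prints the expression and its value to stderr, then returns the value.
    let remainder = dbg!(n % 2);
    if remainder == 0 {
        // format! builds a String instead of printing it.
        format!("{n} is even")
    } else if remainder == 1 {
        format!("{n} is odd")
    } else {
        // The compiler cannot prove this branch is dead, so we mark it.
        unreachable!("n % 2 is always 0 or 1")
    }
}

fn main() {
    println!("{}", describe_parity(4));
    println!("{}", describe_parity(7));
}
```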

Exercise: Collatz Sequence

The Collatz Sequence is defined as follows, for an arbitrary $n_1$ greater than zero:

  • If $n_i$ is 1, then the sequence terminates at $n_i$.
  • If $n_i$ is even, then $n_{i+1} = n_i / 2$.
  • If $n_i$ is odd, then $n_{i+1} = 3 * n_i + 1$.

For example, beginning with $n_1 = 3$:

  • 3 is odd, so $n_2 = 3 * 3 + 1 = 10$;
  • 10 is even, so $n_3 = 10 / 2 = 5$;
  • 5 is odd, so $n_4 = 3 * 5 + 1 = 16$;
  • 16 is even, so $n_5 = 16 / 2 = 8$;
  • 8 is even, so $n_6 = 8 / 2 = 4$;
  • 4 is even, so $n_7 = 4 / 2 = 2$;
  • 2 is even, so $n_8 = 1$; and
  • the sequence terminates.

Write a function to calculate the length of the collatz sequence for a given initial n.

/// Determine the length of the collatz sequence beginning at `n`.
fn collatz_length(mut n: i32) -> u32 {
  todo!("Implement this")
}

fn main() {
  todo!("Implement this")
}

Solution

/// Determine the length of the collatz sequence beginning at `n`.
fn collatz_length(mut n: i32) -> u32 {
    let mut len = 1;
    while n > 1 {
        n = if n % 2 == 0 { n / 2 } else { 3 * n + 1 };
        len += 1;
    }
    len
}

#[test]
fn test_collatz_length() {
    assert_eq!(collatz_length(11), 15);
}

fn main() {
    println!("Length: {}", collatz_length(11));
}

Welcome Back

In this session:

Including 10 minute breaks, this session should take about 2 hours and 55 minutes

Tuples and Arrays

In this segment:

This segment should take about 1 hour

Tuples and Arrays

Tuples and arrays are the first “compound” types we have seen. All elements of an array have the same type, while tuples can accommodate different types. Both types have a size fixed at compile time.

          Types                        Literals
Arrays    [T; N]                       [20, 30, 40], [0; 3]
Tuples    (), (T,), (T1, T2), …        (), ('x',), ('x', 1.2), …

Array assignment and access:

fn main() {
    let mut a: [i8; 10] = [42; 10];
    a[5] = 0;
    println!("a: {a:?}");
}

Tuple assignment and access:

fn main() {
    let t: (i8, bool) = (7, true);
    println!("t.0: {}", t.0);
    println!("t.1: {}", t.1);
}
This slide should take about 10 minutes.

Key points:

Arrays:

  • A value of the array type [T; N] holds N (a compile-time constant) elements of the same type T. Note that the length of the array is part of its type, which means that [u8; 3] and [u8; 4] are considered two different types. Slices, which have a size determined at runtime, are covered later.

  • Try accessing an out-of-bounds array element. Array accesses are checked at runtime. Rust can usually optimize these checks away, and they can be avoided using unsafe Rust.

  • We can use literals to assign values to arrays.

  • The println! macro asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. Types such as integers and strings implement the default output, but arrays only implement the debug output. This means that we must use debug output here.

  • Adding #, e.g. {a:#?}, invokes a “pretty printing” format, which can be easier to read.

Tuples:

  • Like arrays, tuples have a fixed length.

  • Tuples group together values of different types into a compound type.

  • Fields of a tuple can be accessed by the period and the index of the value, e.g. t.0, t.1.

  • The empty tuple () is also known as the “unit type”. It is both a type, and the only valid value of that type — that is to say both the type and its value are expressed as (). It is used to indicate, for example, that a function or expression has no return value, as we’ll see in a future slide.

    • You can think of it as the void that may be familiar to you from other programming languages.
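As a quick sketch of the formatting notes and the unit type, the following compares {:?} with {:#?} and captures the () returned by a function without a return type. The helpers debug_strings and log are hypothetical names.

```rust
/// Returns the compact and the pretty-printed debug output of an array.
/// (Hypothetical helper, for illustration only.)
fn debug_strings() -> (String, String) {
    let a: [i8; 3] = [1, 2, 3];
    let compact = format!("{a:?}"); // everything on one line
    let pretty = format!("{a:#?}"); // one element per line
    (compact, pretty)
}

/// A function without a declared return type returns the unit value ().
fn log(message: &str) {
    println!("log: {message}");
}

fn main() {
    let (compact, pretty) = debug_strings();
    println!("{compact}");
    println!("{pretty}");

    // The unit value can be bound like any other value.
    let unit: () = log("done");
    println!("{unit:?}");
}
```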

Array Iteration

The for statement supports iterating over arrays (but not tuples).

fn main() {
    let primes = [2, 3, 5, 7, 11, 13, 17, 19];
    for prime in primes {
        for i in 2..prime {
            assert_ne!(prime % i, 0);
        }
    }
}
This slide should take about 3 minutes.

This functionality uses the IntoIterator trait, but we haven’t covered that yet.

The assert_ne! macro is new here. There are also assert_eq! and assert! macros. These are always checked, while debug-only variants like debug_assert! compile to nothing in release builds.
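A brief sketch of the assertion macros side by side; the is_prime helper is a deliberately naive function written only for this example.

```rust
/// Naive primality test by trial division.
/// (Hypothetical helper, for illustration only.)
fn is_prime(n: u32) -> bool {
    if n < 2 {
        return false;
    }
    for i in 2..n {
        if n % i == 0 {
            return false;
        }
    }
    true
}

fn main() {
    let primes = [2, 3, 5, 7, 11, 13, 17, 19];

    // assert! checks a boolean condition in every build profile.
    assert!(is_prime(13));
    // assert_eq! and assert_ne! compare two values and print both on failure.
    assert_eq!(primes.len(), 8);
    assert_ne!(13 % 5, 0);
    // debug_assert! is only checked when debug assertions are enabled.
    debug_assert!(is_prime(primes[0]), "first entry should be prime");

    println!("all assertions passed");
}
```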

Pattern Matching

The match keyword lets you match a value against one or more patterns. The comparisons are done from top to bottom and the first match wins.

The patterns can be simple values, similarly to switch in C and C++:

#[rustfmt::skip]
fn main() {
    let input = 'x';
    match input {
        'q'                       => println!("Quitting"),
        'a' | 's' | 'w' | 'd'     => println!("Moving around"),
        '0'..='9'                 => println!("Number input"),
        key if key.is_lowercase() => println!("Lowercase: {key}"),
        _                         => println!("Something else"),
    }
}

The _ pattern is a wildcard pattern which matches any value. The match must be exhaustive, meaning that its patterns cover every possibility, so _ is often used as the final catch-all case.

match can be used as an expression. Just like if, each match arm must have the same type. The type of the match is that of the last expression in each arm's block, if any. In the example above, the type is ().

A variable in the pattern (key in this example) will create a binding that can be used within the match arm.

A match guard causes the arm to match only if the condition is true.

This slide should take about 10 minutes.

Key Points:

  • You might point out how some specific characters are used in a pattern:

    • | as an or,
    • .. can expand as much as it needs to,
    • 1..=5 represents an inclusive range,
    • _ is a wildcard.
  • Match guards, as a separate syntax feature, are important and necessary when we wish to concisely express more complex ideas than patterns alone would allow.

  • They are not the same as a separate if expression inside the match arm. An if expression inside the branch block (after =>) happens after the match arm is selected. Failing the if condition inside that block will not result in other arms of the original match expression being considered.

  • The condition defined in the guard applies to every expression in a pattern with an |.
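The difference between a guard and an if inside the arm can be sketched as follows. Both functions are hypothetical; the &'static str return type is just a convenient way to return string literals.

```rust
/// The guard applies to all three alternatives; if it fails, matching
/// continues with the next arm.
fn with_guard(n: i32) -> &'static str {
    match n {
        1 | 2 | 3 if n % 2 == 1 => "small and odd",
        _ => "something else",
    }
}

/// The arm is selected first; the inner `if` cannot fall through to
/// other arms.
fn with_inner_if(n: i32) -> &'static str {
    match n {
        1 | 2 | 3 => {
            if n % 2 == 1 {
                "small and odd"
            } else {
                "small and even"
            }
        }
        _ => "something else",
    }
}

fn main() {
    println!("guard, n = 2: {}", with_guard(2));
    println!("inner if, n = 2: {}", with_inner_if(2));
}
```

With n = 2 the guard fails, so with_guard falls through to the wildcard arm, whereas with_inner_if stays inside the first arm.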

Destructuring

Destructuring is a way of extracting data from a data structure by writing a pattern that is matched up to the data structure, binding variables to subcomponents of the data structure.

You can destructure tuples and arrays by matching on their elements:

Tuples

fn main() {
    describe_point((1, 0));
}

fn describe_point(point: (i32, i32)) {
    match point {
        (0, _) => println!("on Y axis"),
        (_, 0) => println!("on X axis"),
        (x, _) if x < 0 => println!("left of Y axis"),
        (_, y) if y < 0 => println!("below X axis"),
        _ => println!("first quadrant"),
    }
}

Arrays

#[rustfmt::skip]
fn main() {
    let triple = [0, -2, 3];
    println!("Tell me about {triple:?}");
    match triple {
        [0, y, z] => println!("First is 0, y = {y}, and z = {z}"),
        [1, ..]   => println!("First is 1 and the rest were ignored"),
        _         => println!("All elements were ignored"),
    }
}
This slide should take about 5 minutes.
  • Create a new array pattern using _ to represent an element.
  • Add more values to the array.
  • Point out how .. will expand to account for a different number of elements.
  • Show matching against the tail with the patterns [.., b] and [a@..,b].
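A possible demonstration of the tail patterns mentioned in the last note is sketched below, using a made-up describe function over a four-element array.

```rust
/// Describes a four-element array using prefix and tail patterns.
/// (Hypothetical helper, for illustration only.)
fn describe(values: [i32; 4]) -> String {
    match values {
        // `..` skips the three remaining elements.
        [0, ..] => String::from("starts with 0"),
        // `..` skips the three leading elements.
        [.., 0] => String::from("ends with 0"),
        // `first @ ..` binds the skipped elements as a sub-array.
        [first @ .., last] => format!("first = {first:?}, last = {last}"),
    }
}

fn main() {
    println!("{}", describe([0, 6, 7, 8]));
    println!("{}", describe([5, 6, 7, 0]));
    println!("{}", describe([1, 2, 3, 4]));
}
```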

Exercise: Nested Arrays

Arrays can contain other arrays:

#![allow(unused)]
fn main() {
    let array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
}

What is the type of this variable?

Use an array such as the above to write a function transpose which will transpose a matrix (turn rows into columns):

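$$
\text{transpose}\left(\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}\right) = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}
$$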

Hard-code the function to operate on 3 × 3 matrices.

Copy the code below to https://play.rust-lang.org/ and implement the function:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    unimplemented!()
}

fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix: {:#?}", matrix);
    let transposed = transpose(matrix);
    println!("transposed: {:#?}", transposed);
}

Solution

fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    let mut result = [[0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            result[j][i] = matrix[i][j];
        }
    }
    result
}

#[test]
fn test_transpose() {
    let matrix = [
        [101, 102, 103], //
        [201, 202, 203],
        [301, 302, 303],
    ];
    let transposed = transpose(matrix);
    assert_eq!(
        transposed,
        [
            [101, 201, 301], //
            [102, 202, 302],
            [103, 203, 303],
        ]
    );
}

fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix: {:#?}", matrix);
    let transposed = transpose(matrix);
    println!("transposed: {:#?}", transposed);
}

References

In this segment:

This segment should take about 50 minutes

Shared References

A reference provides a way to access another value without taking responsibility for the value, and is also called “borrowing”. Shared references are read-only, and the referenced data cannot change.

fn main() {
    let a = 'A';
    let b = 'B';
    let mut r: &char = &a;
    println!("r: {}", *r);
    r = &b;
    println!("r: {}", *r);
}

A shared reference to a type T has type &T. A reference value is made with the & operator. The * operator “dereferences” a reference, yielding its value.

Rust will statically forbid dangling references:

fn x_axis(x: i32) -> &(i32, i32) {
    let point = (x, 0);
    return &point;
}
This slide should take about 10 minutes.
  • A reference is said to “borrow” the value it refers to, and this is a good model for students not familiar with pointers: code can use the reference to access the value, but the value is still “owned” by the original variable. The course will get into more detail on ownership in day 3.

  • References are implemented as pointers, and a key advantage is that they can be much smaller than the thing they point to. Students familiar with C or C++ will recognize references as pointers. Later parts of the course will cover how Rust prevents the memory-safety bugs that come from using raw pointers.

  • Rust does not automatically create references for you - the & is always required.

  • Rust will auto-dereference in some cases, in particular when invoking methods (try r.is_ascii()). There is no need for an -> operator like in C++.

  • In this example, r is mutable so that it can be reassigned (r = &b). Note that this re-binds r, so that it refers to something else. This is different from C++, where assignment to a reference changes the referenced value.

  • A shared reference does not allow modifying the value it refers to, even if that value was mutable. Try *r = 'X'.

  • Rust is tracking the lifetimes of all references to ensure they live long enough. Dangling references cannot occur in safe Rust. x_axis would return a reference to point, but point will be deallocated when the function returns, so this will not compile.

  • We will talk more about borrowing when we get to ownership.
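A minimal sketch of explicit referencing and automatic dereferencing on method calls. The function is_ascii_via_reference is a hypothetical wrapper around the standard char::is_ascii method.

```rust
/// Calls a method through a shared reference.
/// (Hypothetical helper, for illustration only.)
fn is_ascii_via_reference(c: &char) -> bool {
    // Method calls auto-dereference: this is the same as (*c).is_ascii().
    c.is_ascii()
}

fn main() {
    let a = 'A';
    // Creating the reference always requires an explicit `&`.
    let r: &char = &a;

    println!("auto-deref: {}", is_ascii_via_reference(r));
    println!("explicit deref: {}", (*r).is_ascii());
}
```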

Exclusive References

Exclusive references, also known as mutable references, allow changing the value they refer to. They have type &mut T.

fn main() {
    let mut point = (1, 2);
    let x_coord = &mut point.0;
    *x_coord = 20;
    println!("point: {point:?}");
}
This slide should take about 10 minutes.

Key points:

  • “Exclusive” means that only this reference can be used to access the value. No other references (shared or exclusive) can exist at the same time, and the referenced value cannot be accessed while the exclusive reference exists. Try making an &point.0 or changing point.0 while x_coord is alive.

  • Be sure to note the difference between let mut x_coord: &i32 and let x_coord: &mut i32. The first one represents a shared reference which can be bound to different values, while the second represents an exclusive reference to a mutable value.
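The distinction in the last point can be sketched like this; the function name rebinding_vs_mutation and its return tuple are invented for the example.

```rust
/// Contrasts a mutable binding to a shared reference with an immutable
/// binding to an exclusive reference.
/// (Hypothetical helper, for illustration only.)
fn rebinding_vs_mutation() -> (i32, i32, i32) {
    let a = 1;
    let b = 2;
    let mut c = 3;

    // `let mut shared: &i32`: the binding can be re-pointed,
    // but the value behind it cannot be modified.
    let mut shared: &i32 = &a;
    let first = *shared;
    shared = &b;
    let second = *shared;

    // `let exclusive: &mut i32`: the binding cannot be re-pointed,
    // but the value behind it can be modified.
    let exclusive: &mut i32 = &mut c;
    *exclusive += 10;

    (first, second, c)
}

fn main() {
    println!("{:?}", rebinding_vs_mutation());
}
```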

Exercise: Geometry

We will create a few utility functions for 3-dimensional geometry, representing a point as [f64;3]. It is up to you to determine the function signatures.

// Calculate the magnitude of a vector by summing the squares of its coordinates
// and taking the square root. Use the `sqrt()` method to calculate the square
// root, like `v.sqrt()`.


fn magnitude(...) -> f64 {
    todo!()
}

// Normalize a vector by calculating its magnitude and dividing all of its
// coordinates by that magnitude.


fn normalize(...) {
    todo!()
}

// Use the following `main` to test your work.

fn main() {
    println!("Magnitude of a unit vector: {}", magnitude(&[0.0, 1.0, 0.0]));

    let mut v = [1.0, 2.0, 9.0];
    println!("Magnitude of {v:?}: {}", magnitude(&v));
    normalize(&mut v);
    println!("Magnitude of {v:?} after normalization: {}", magnitude(&v));
}

Solution

/// Calculate the magnitude of the given vector.
fn magnitude(vector: &[f64; 3]) -> f64 {
    let mut mag_squared = 0.0;
    for coord in vector {
        mag_squared += coord * coord;
    }
    mag_squared.sqrt()
}

/// Change the magnitude of the vector to 1.0 without changing its direction.
fn normalize(vector: &mut [f64; 3]) {
    let mag = magnitude(vector);
    vector[0] /= mag;
    vector[1] /= mag;
    vector[2] /= mag;
}

fn main() {
    println!("Magnitude of a unit vector: {}", magnitude(&[0.0, 1.0, 0.0]));

    let mut v = [1.0, 2.0, 9.0];
    println!("Magnitude of {v:?}: {}", magnitude(&v));
    normalize(&mut v);
    println!("Magnitude of {v:?} after normalization: {}", magnitude(&v));
}

User-Defined Types

In this segment:

This segment should take about 50 minutes

Structs

Like C and C++, Rust has support for custom structs:

struct Person {
    name: String,
    age: u8,
}

fn describe(person: &Person) {
    println!("{} is {} years old", person.name, person.age);
}

fn main() {
    let mut peter = Person { name: String::from("Peter"), age: 27 };
    describe(&peter);

    peter.age = 28;
    describe(&peter);

    let name = String::from("Avery");
    let age = 39;
    let avery = Person { name, age };
    describe(&avery);

    let jackie = Person { name: String::from("Jackie"), ..avery };
    describe(&jackie);
}
This slide should take about 10 minutes.

Key Points:

  • Structs work like in C or C++.
    • Like in C++, and unlike in C, no typedef is needed to define a type.
    • Unlike in C++, there is no inheritance between structs.
  • This may be a good time to let people know there are different types of structs.
    • Zero-sized structs (e.g. struct Foo;) might be used when implementing a trait on some type that has no data you want to store in the value itself.
    • The next slide will introduce tuple structs, used when the field names are not important.
  • If you already have variables with the right names, then you can create the struct using a shorthand.
  • The syntax ..avery allows us to copy the majority of the fields from the old struct without having to explicitly type it all out. It must always be the last element.

Tuple Structs

If the field names are unimportant, you can use a tuple struct:

struct Point(i32, i32);

fn main() {
    let p = Point(17, 23);
    println!("({}, {})", p.0, p.1);
}

This is often used for single-field wrappers (called newtypes):

struct PoundsOfForce(f64);
struct Newtons(f64);

fn compute_thruster_force() -> PoundsOfForce {
    todo!("Ask a rocket scientist at NASA")
}

fn set_thruster_force(force: Newtons) {
    // ...
}

fn main() {
    let force = compute_thruster_force();
    set_thruster_force(force);
}
This slide should take about 10 minutes.
  • Newtypes are a great way to encode additional information about the value in a primitive type, for example:
    • The number is measured in some units: Newtons in the example above.
    • The value passed some validation when it was created, so you no longer have to validate it again at every use: PhoneNumber(String) or OddNumber(u32).
  • Demonstrate how to add an f64 value to a Newtons type by accessing the single field in the newtype.
    • Rust generally doesn’t like inexplicit things, like automatic unwrapping or, for instance, using booleans as integers.
    • Operator overloading is discussed on Day 3 (generics).
  • The example is a subtle reference to the Mars Climate Orbiter failure.
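One way to run the suggested demonstration is sketched below. The add_force function is a hypothetical helper; the point is that the wrapped f64 has to be reached explicitly through the .0 field.

```rust
struct Newtons(f64);

/// Adds a raw f64 to a force by unwrapping the newtype's single field.
/// (Hypothetical helper, for illustration only.)
fn add_force(force: Newtons, extra: f64) -> Newtons {
    // `force + extra` would not compile: Newtons is not an f64.
    Newtons(force.0 + extra)
}

fn main() {
    let total = add_force(Newtons(1.5), 2.0);
    println!("total force: {} N", total.0);
}
```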

Enums

The enum keyword allows the creation of a type which has a few different variants:

#[derive(Debug)]
enum Direction {
    Left,
    Right,
}

#[derive(Debug)]
enum PlayerMove {
    Pass,                        // Simple variant
    Run(Direction),              // Tuple variant
    Teleport { x: u32, y: u32 }, // Struct variant
}

fn main() {
    let m: PlayerMove = PlayerMove::Run(Direction::Left);
    println!("On this turn: {:?}", m);
}
This slide should take about 5 minutes.

Key Points:

  • Enumerations allow you to collect a set of values under one type.
  • Direction is a type with variants. There are two values of Direction: Direction::Left and Direction::Right.
  • PlayerMove is a type with three variants. In addition to the payloads, Rust will store a discriminant so that it knows at runtime which variant is in a PlayerMove value.
  • This might be a good time to compare structs and enums:
    • In both, you can have a simple version without fields (unit struct) or one with different types of fields (variant payloads).
    • You could even implement the different variants of an enum with separate structs but then they wouldn’t be the same type as they would if they were all defined in an enum.
  • Rust uses minimal space to store the discriminant.
    • If necessary, it stores an integer of the smallest required size

    • If the allowed variant values do not cover all bit patterns, it will use invalid bit patterns to encode the discriminant (the “niche optimization”). For example, Option<&u8> stores either a pointer to an integer or NULL for the None variant.

    • You can control the discriminant if needed (e.g., for compatibility with C):

      #[repr(u32)]
      enum Bar {
          A, // 0
          B = 10000,
          C, // 10001
      }
      
      fn main() {
          println!("A: {}", Bar::A as u32);
          println!("B: {}", Bar::B as u32);
          println!("C: {}", Bar::C as u32);
      }

      Without repr, the discriminant type takes 2 bytes, because 10001 fits in 2 bytes.

More to Explore

Rust has several optimizations it can employ to make enums take up less space.

  • Null pointer optimization: for some types, Rust guarantees that size_of::<T>() equals size_of::<Option<T>>().

    Example code if you want to show what the bitwise representation may look like in practice. It is important to note that the compiler provides no guarantees regarding this representation, therefore this is totally unsafe.

    use std::mem::transmute;
    
    macro_rules! dbg_bits {
        ($e:expr, $bit_type:ty) => {
            println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
        };
    }
    
    fn main() {
        unsafe {
            println!("bool:");
            dbg_bits!(false, u8);
            dbg_bits!(true, u8);
    
            println!("Option<bool>:");
            dbg_bits!(None::<bool>, u8);
            dbg_bits!(Some(false), u8);
            dbg_bits!(Some(true), u8);
    
            println!("Option<Option<bool>>:");
            dbg_bits!(Some(Some(false)), u8);
            dbg_bits!(Some(Some(true)), u8);
            dbg_bits!(Some(None::<bool>), u8);
            dbg_bits!(None::<Option<bool>>, u8);
    
            println!("Option<&i32>:");
            dbg_bits!(None::<&i32>, usize);
            dbg_bits!(Some(&0i32), usize);
        }
    }
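A safe way to observe these size guarantees, without transmute, is to compare sizes directly. This is only a sketch; the sizes helper is made up, and Bar mirrors the #[repr(u32)] enum from the previous slide.

```rust
use std::mem::size_of;

#[allow(dead_code)] // the variants are never constructed in this sketch
#[repr(u32)]
enum Bar {
    A,
    B = 10000,
    C,
}

/// Sizes in bytes of a reference, an optional reference, and `Bar`.
/// (Hypothetical helper, for illustration only.)
fn sizes() -> (usize, usize, usize) {
    (size_of::<&u8>(), size_of::<Option<&u8>>(), size_of::<Bar>())
}

fn main() {
    let (reference, optional_reference, bar) = sizes();
    // None is encoded as the null pointer, so no extra space is needed.
    println!("&u8: {reference} bytes, Option<&u8>: {optional_reference} bytes");
    // With #[repr(u32)] the enum is stored as a u32.
    println!("Bar: {bar} bytes");
}
```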

static and const

Static and constant variables are two different ways to create globally-scoped values that cannot be moved or reallocated during the execution of the program.

const

Constant variables are evaluated at compile time and their values are inlined wherever they are used:

const DIGEST_SIZE: usize = 3;
const ZERO: Option<u8> = Some(42);

fn compute_digest(text: &str) -> [u8; DIGEST_SIZE] {
    let mut digest = [ZERO.unwrap_or(0); DIGEST_SIZE];
    for (idx, &b) in text.as_bytes().iter().enumerate() {
        digest[idx % DIGEST_SIZE] = digest[idx % DIGEST_SIZE].wrapping_add(b);
    }
    digest
}

fn main() {
    let digest = compute_digest("Hello");
    println!("digest: {digest:?}");
}

According to the Rust RFC Book these are inlined upon use.

Only functions marked const can be called at compile time to generate const values. const functions can, however, also be called at runtime.
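A minimal sketch of a const function used in both ways; the names square, BUFFER_SIZE and buffer are assumptions made for this example.

```rust
/// A `const fn` can be evaluated at compile time or called at runtime.
const fn square(n: usize) -> usize {
    n * n
}

// Evaluated at compile time to initialise a constant.
const BUFFER_SIZE: usize = square(4);

/// The constant can be used wherever a compile-time value is required,
/// such as an array length.
fn buffer() -> [u8; BUFFER_SIZE] {
    [0; BUFFER_SIZE]
}

fn main() {
    println!("buffer length: {}", buffer().len());

    // The same function called at runtime with a runtime value.
    let n = std::env::args().count();
    println!("square({n}) = {}", square(n));
}
```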

static

Static variables will live during the whole execution of the program, and therefore will not move:

static BANNER: &str = "Welcome to RustOS 3.14";

fn main() {
    println!("{BANNER}");
}

As noted in the Rust RFC Book, these are not inlined upon use and have an actual associated memory location. This is useful for unsafe and embedded code, and the variable lives through the entirety of the program execution. When a globally-scoped value does not have a reason to need object identity, const is generally preferred.

This slide should take about 5 minutes.
  • Mention that const behaves semantically similar to C++’s constexpr.
  • static, on the other hand, is much more similar to a const or mutable global variable in C++.
  • static provides object identity: an address in memory and state as required by types with interior mutability such as Mutex<T>.
  • It isn’t very common that one would need a runtime-evaluated constant, but it is helpful and safer than using a static.

Properties table:

Property                                        Static                                 Constant
Has an address in memory                        Yes                                    No (inlined)
Lives for the entire duration of the program    Yes                                    No
Can be mutable                                  Yes (unsafe)                           No
Evaluated at compile time                       Yes (initialised at compile time)      Yes
Inlined wherever it is used                     No                                     Yes

More to Explore

Because static variables are accessible from any thread, they must be Sync. Interior mutability is possible through a Mutex, atomic or similar.

Thread-local data can be created with the macro std::thread_local.
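As a rough sketch of both ideas, the following uses an atomic for a shared static and thread_local! for per-thread state. The names REQUESTS, DEPTH, record_request and bump_depth are invented for illustration.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicU32, Ordering};

// A static shared by all threads; AtomicU32 is Sync and provides
// interior mutability without `unsafe`.
static REQUESTS: AtomicU32 = AtomicU32::new(0);

thread_local! {
    // Each thread gets its own copy, so the value need not be Sync.
    static DEPTH: Cell<u32> = Cell::new(0);
}

/// Increments the global counter and returns the new value.
fn record_request() -> u32 {
    // fetch_add returns the previous value.
    REQUESTS.fetch_add(1, Ordering::SeqCst) + 1
}

/// Increments the calling thread's counter and returns the new value.
fn bump_depth() -> u32 {
    DEPTH.with(|depth| {
        depth.set(depth.get() + 1);
        depth.get()
    })
}

fn main() {
    println!("requests so far: {}", record_request());
    println!("depth on this thread: {}", bump_depth());
}
```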

Type Aliases

A type alias creates a name for another type. The two types can be used interchangeably.

enum CarryableConcreteItem {
    Left,
    Right,
}

type Item = CarryableConcreteItem;

// Aliases are more useful with long, complex types:
use std::cell::RefCell;
use std::sync::{Arc, RwLock};
type PlayerInventory = RwLock<Vec<Arc<RefCell<Item>>>>;
This slide should take about 2 minutes.

C programmers will recognize this as similar to a typedef.

Exercise: Elevator Events

We will create a data structure to represent an event in an elevator control system. It is up to you to define the types and functions to construct various events. Use #[derive(Debug)] to allow the types to be formatted with {:?}.

This exercise only requires creating and populating data structures so that main runs without errors. The next part of the course will cover getting data out of these structures.

#[derive(Debug)]
/// An event in the elevator system that the controller must react to.
enum Event {
    // TODO: add required variants
}

/// A direction of travel.
#[derive(Debug)]
enum Direction {
    Up,
    Down,
}

/// The car has arrived on the given floor.
fn car_arrived(floor: i32) -> Event {
    todo!()
}

/// The car doors have opened.
fn car_door_opened() -> Event {
    todo!()
}

/// The car doors have closed.
fn car_door_closed() -> Event {
    todo!()
}

/// A directional button was pressed in an elevator lobby on the given floor.
fn lobby_call_button_pressed(floor: i32, dir: Direction) -> Event {
    todo!()
}

/// A floor button was pressed in the elevator car.
fn car_floor_button_pressed(floor: i32) -> Event {
    todo!()
}

fn main() {
    println!(
        "A ground floor passenger has pressed the up button: {:?}",
        lobby_call_button_pressed(0, Direction::Up)
    );
    println!("The car has arrived on the ground floor: {:?}", car_arrived(0));
    println!("The car door opened: {:?}", car_door_opened());
    println!(
        "A passenger has pressed the 3rd floor button: {:?}",
        car_floor_button_pressed(3)
    );
    println!("The car door closed: {:?}", car_door_closed());
    println!("The car has arrived on the 3rd floor: {:?}", car_arrived(3));
}

Solution

#[derive(Debug)]
/// An event in the elevator system that the controller must react to.
enum Event {
    /// A button was pressed.
    ButtonPressed(Button),

    /// The car has arrived at the given floor.
    CarArrived(Floor),

    /// The car's doors have opened.
    CarDoorOpened,

    /// The car's doors have closed.
    CarDoorClosed,
}

/// A floor is represented as an integer.
type Floor = i32;

/// A direction of travel.
#[derive(Debug)]
enum Direction {
    Up,
    Down,
}

/// A user-accessible button.
#[derive(Debug)]
enum Button {
    /// A button in the elevator lobby on the given floor.
    LobbyCall(Direction, Floor),

    /// A floor button within the car.
    CarFloor(Floor),
}

/// The car has arrived on the given floor.
fn car_arrived(floor: i32) -> Event {
    Event::CarArrived(floor)
}

/// The car doors have opened.
fn car_door_opened() -> Event {
    Event::CarDoorOpened
}

/// The car doors have closed.
fn car_door_closed() -> Event {
    Event::CarDoorClosed
}

/// A directional button was pressed in an elevator lobby on the given floor.
fn lobby_call_button_pressed(floor: i32, dir: Direction) -> Event {
    Event::ButtonPressed(Button::LobbyCall(dir, floor))
}

/// A floor button was pressed in the elevator car.
fn car_floor_button_pressed(floor: i32) -> Event {
    Event::ButtonPressed(Button::CarFloor(floor))
}

fn main() {
    println!(
        "A ground floor passenger has pressed the up button: {:?}",
        lobby_call_button_pressed(0, Direction::Up)
    );
    println!("The car has arrived on the ground floor: {:?}", car_arrived(0));
    println!("The car door opened: {:?}", car_door_opened());
    println!(
        "A passenger has pressed the 3rd floor button: {:?}",
        car_floor_button_pressed(3)
    );
    println!("The car door closed: {:?}", car_door_closed());
    println!("The car has arrived on the 3rd floor: {:?}", car_arrived(3));
}

Welcome to Day 2

Now that we have seen a fair amount of Rust, today will focus on Rust’s type system:

  • Pattern matching: extracting data from structures.
  • Methods: associating functions with types.
  • Traits: behaviors shared by multiple types.
  • Generics: parameterizing types on other types.
  • Standard library types and traits: a tour of Rust’s rich standard library.

Schedule

In this session:

Including 10 minute breaks, this session should take about 3 hours and 5 minutes

Pattern Matching

In this segment:

This segment should take about 50 minutes

Destructuring

Like tuples, structs and enums can also be destructured by matching:

Structs

struct Foo {
    x: (u32, u32),
    y: u32,
}

#[rustfmt::skip]
fn main() {
    let foo = Foo { x: (1, 2), y: 3 };
    match foo {
        Foo { x: (1, b), y } => println!("x.0 = 1, b = {b}, y = {y}"),
        Foo { y: 2, x: i }   => println!("y = 2, x = {i:?}"),
        Foo { y, .. }        => println!("y = {y}, other fields were ignored"),
    }
}

Enums

Patterns can also be used to bind variables to parts of your values. This is how you inspect the structure of your types. Let us start with a simple enum type:

enum Result {
    Ok(i32),
    Err(String),
}

fn divide_in_two(n: i32) -> Result {
    if n % 2 == 0 {
        Result::Ok(n / 2)
    } else {
        Result::Err(format!("cannot divide {n} into two equal parts"))
    }
}

fn main() {
    let n = 100;
    match divide_in_two(n) {
        Result::Ok(half) => println!("{n} divided in two is {half}"),
        Result::Err(msg) => println!("sorry, an error happened: {msg}"),
    }
}

Here we have used the arms to destructure the Result value. In the first arm, half is bound to the value inside the Ok variant. In the second arm, msg is bound to the error message.

This slide should take about 10 minutes.

Structs

  • Change the literal values in foo to match the other patterns.
  • Add a new field to Foo and make changes to the patterns as needed.
  • The distinction between a capture and a constant expression can be hard to spot. Try changing the 2 in the second arm to a variable, and see that it no longer works as intended. Change it to a const and see it working again.
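The capture-versus-constant point can be tried out with a small sketch. It reuses Foo from the slide; the constant TWO and the helper describe are names made up for this illustration, and the helper returns the message instead of printing it so the result is easy to inspect.

```rust
struct Foo {
    x: (u32, u32),
    y: u32,
}

// Hypothetical constant for this sketch: a `const` used in a pattern is
// compared against the value instead of introducing a new binding.
const TWO: u32 = 2;

// Hypothetical helper that returns the message instead of printing it.
fn describe(foo: Foo) -> String {
    match foo {
        // Matches only when `foo.y == 2`; `x` captures the whole tuple.
        Foo { y: TWO, x } => format!("y = 2, x = {x:?}"),
        // A lowercase identifier is a capture: it matches any value of `y`.
        Foo { y, .. } => format!("y = {y}, other fields were ignored"),
    }
}

fn main() {
    println!("{}", describe(Foo { x: (1, 2), y: 2 }));
    println!("{}", describe(Foo { x: (1, 2), y: 3 }));
}
```

If TWO were replaced by an ordinary lowercase variable name, the first arm would become a capture that matches every Foo, and the compiler would warn that the second arm is unreachable.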

Enums

Key points:

  • The if/else expression returns an enum that is later unpacked with a match.
  • You can try adding a third variant to the enum definition and displaying the errors when running the code. Point out the places where your code is now non-exhaustive and how the compiler tries to give you hints.
  • The values in the enum variants can only be accessed after being pattern matched.
  • Demonstrate what happens when the match is not exhaustive. Note the advantage the Rust compiler provides by confirming when all cases are handled.
  • Save the result of divide_in_two in the result variable and match it in a loop. That won’t compile because msg is consumed when matched. To fix it, match &result instead of result. That will make msg a reference so it won’t be consumed. This “match ergonomics” appeared in Rust 2018. If you want to support older Rust, replace msg with ref msg in the pattern.
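The note about matching &result can be sketched as follows. As an assumption for brevity, this version uses the standard library Result<i32, String> rather than the enum defined on the slide, and the helper describe is a made-up name.

```rust
fn divide_in_two(n: i32) -> Result<i32, String> {
    if n % 2 == 0 {
        Ok(n / 2)
    } else {
        Err(format!("cannot divide {n} into two equal parts"))
    }
}

// Hypothetical helper: matching on a reference binds `half` and `msg` as
// references, so the `String` inside `Err` is not moved out of `result`.
fn describe(result: &Result<i32, String>) -> String {
    match result {
        Ok(half) => format!("half is {half}"),
        Err(msg) => format!("sorry, an error happened: {msg}"),
    }
}

fn main() {
    let result = divide_in_two(7);
    // `result` stays usable on every iteration because it is only borrowed.
    for attempt in 1..=2 {
        println!("attempt {attempt}: {}", describe(&result));
    }
}
```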

Control Flow

Rust has a few control flow constructs which differ from other languages. They are used for pattern matching:

  • if let expressions
  • while let expressions
  • match expressions

if let expressions

The if let expression (https://doc.rust-lang.org/reference/expressions/if-expr.html#if-let-expressions) lets you execute different code depending on whether a value matches a pattern:

fn sleep_for(secs: f32) {
    let dur = if let Ok(dur) = std::time::Duration::try_from_secs_f32(secs) {
        dur
    } else {
        std::time::Duration::from_millis(500)
    };
    std::thread::sleep(dur);
    println!("slept for {:?}", dur);
}

fn main() {
    sleep_for(-10.0);
    sleep_for(0.8);
}

let else expressions

For the common case of matching a pattern and returning from the function, use let else. The “else” case must diverge (return, break, or panic - anything but falling off the end of the block).

fn hex_or_die_trying(maybe_string: Option<String>) -> Result<u32, String> {
    let s = if let Some(s) = maybe_string {
        s
    } else {
        return Err(String::from("got None"));
    };

    let first_byte_char = if let Some(first_byte_char) = s.chars().next() {
        first_byte_char
    } else {
        return Err(String::from("got empty string"));
    };

    if let Some(digit) = first_byte_char.to_digit(16) {
        Ok(digit)
    } else {
        Err(String::from("not a hex digit"))
    }
}

fn main() {
    println!("result: {:?}", hex_or_die_trying(Some(String::from("foo"))));
}

Like with if let, there is a while let variant which repeatedly tests a value against a pattern:

fn main() {
    let mut name = String::from("Comprehensive Rust 🦀");
    while let Some(c) = name.pop() {
        println!("character: {c}");
    }
    // (There are more efficient ways to reverse a string!)
}

Here String::pop returns Some(c) until the string is empty, after which it will return None. The while let lets us keep iterating through all items.

This slide should take about 10 minutes.

if-let

  • Unlike match, if let does not have to cover all branches. This can make it more concise than match.
  • A common usage is handling Some values when working with Option.
  • Unlike match, if let does not support guard clauses for pattern matching.

let-else

if-lets can pile up, as shown. The let-else construct supports flattening this nested code. Rewrite the awkward version for students, so they can see the transformation.

The rewritten version is:

fn hex_or_die_trying(maybe_string: Option<String>) -> Result<u32, String> {
    let Some(s) = maybe_string else {
        return Err(String::from("got None"));
    };

    let Some(first_byte_char) = s.chars().next() else {
        return Err(String::from("got empty string"));
    };

    let Some(digit) = first_byte_char.to_digit(16) else {
        return Err(String::from("not a hex digit"));
    };

    Ok(digit)
}

while-let

  • Point out that the while let loop will keep going as long as the value matches the pattern.
  • You could rewrite the while let loop as an infinite loop with an if statement that breaks when there is no value to unwrap for name.pop(). The while let provides syntactic sugar for the above scenario.
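The desugaring described in the last note might look like the sketch below. The function pop_all is a hypothetical helper that collects the popped characters so the behavior can be checked.

```rust
// Hypothetical helper: pops every character off `name`, last one first.
fn pop_all(mut name: String) -> Vec<char> {
    let mut popped = Vec::new();
    // Explicit form of `while let Some(c) = name.pop() { .. }`.
    loop {
        if let Some(c) = name.pop() {
            popped.push(c);
        } else {
            // `pop` returned `None`: the string is empty, so stop looping.
            break;
        }
    }
    popped
}

fn main() {
    for c in pop_all(String::from("Comprehensive Rust 🦀")) {
        println!("character: {c}");
    }
}
```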

Exercise: Expression Evaluation

Let’s write a simple recursive evaluator for arithmetic expressions.

The Box type here is a smart pointer, and will be covered in detail later in the course. An expression can be “boxed” with Box::new as seen in the tests. To evaluate a boxed expression, use the deref operator (*) to “unbox” it: eval(*boxed_expr).

Some expressions cannot be evaluated and will return an error. Here eval returns Result<i64, String>, an instance of the standard Result enum that represents either a successful value (Ok(i64)) or an error (Err(String)). We will cover this type in detail later.

Copy and paste the code into the Rust playground, and begin implementing eval. The final product should pass the tests. It may be helpful to use todo!() and get the tests to pass one-by-one. You can also skip a test temporarily with #[ignore]:

#[test]
#[ignore]
fn test_value() { .. }

If you finish early, try writing a test that results in division by zero or integer overflow. How could you handle this with Result instead of a panic?

/// An operation to perform on two subexpressions.
#[derive(Debug)]
enum Operation {
    Add,
    Sub,
    Mul,
    Div,
}

/// An expression, in tree form.
#[derive(Debug)]
enum Expression {
    /// An operation on two subexpressions.
    Op { op: Operation, left: Box<Expression>, right: Box<Expression> },

    /// A literal value
    Value(i64),
}

fn eval(e: Expression) -> Result<i64, String> {
    todo!()
}

#[test]
fn test_value() {
    assert_eq!(eval(Expression::Value(19)), Ok(19));
}

#[test]
fn test_sum() {
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Add,
            left: Box::new(Expression::Value(10)),
            right: Box::new(Expression::Value(20)),
        }),
        Ok(30)
    );
}

#[test]
fn test_recursion() {
    let term1 = Expression::Op {
        op: Operation::Mul,
        left: Box::new(Expression::Value(10)),
        right: Box::new(Expression::Value(9)),
    };
    let term2 = Expression::Op {
        op: Operation::Mul,
        left: Box::new(Expression::Op {
            op: Operation::Sub,
            left: Box::new(Expression::Value(3)),
            right: Box::new(Expression::Value(4)),
        }),
        right: Box::new(Expression::Value(5)),
    };
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Add,
            left: Box::new(term1),
            right: Box::new(term2),
        }),
        Ok(85)
    );
}

#[test]
fn test_error() {
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Div,
            left: Box::new(Expression::Value(99)),
            right: Box::new(Expression::Value(0)),
        }),
        Err(String::from("division by zero"))
    );
}

Solution

/// An operation to perform on two subexpressions.
#[derive(Debug)]
enum Operation {
    Add,
    Sub,
    Mul,
    Div,
}

/// An expression, in tree form.
#[derive(Debug)]
enum Expression {
    /// An operation on two subexpressions.
    Op { op: Operation, left: Box<Expression>, right: Box<Expression> },

    /// A literal value
    Value(i64),
}

fn eval(e: Expression) -> Result<i64, String> {
    match e {
        Expression::Op { op, left, right } => {
            let left = match eval(*left) {
                Ok(v) => v,
                e @ Err(_) => return e,
            };
            let right = match eval(*right) {
                Ok(v) => v,
                e @ Err(_) => return e,
            };
            Ok(match op {
                Operation::Add => left + right,
                Operation::Sub => left - right,
                Operation::Mul => left * right,
                Operation::Div => {
                    if right == 0 {
                        return Err(String::from("division by zero"));
                    } else {
                        left / right
                    }
                }
            })
        }
        Expression::Value(v) => Ok(v),
    }
}

#[test]
fn test_value() {
    assert_eq!(eval(Expression::Value(19)), Ok(19));
}

#[test]
fn test_sum() {
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Add,
            left: Box::new(Expression::Value(10)),
            right: Box::new(Expression::Value(20)),
        }),
        Ok(30)
    );
}

#[test]
fn test_recursion() {
    let term1 = Expression::Op {
        op: Operation::Mul,
        left: Box::new(Expression::Value(10)),
        right: Box::new(Expression::Value(9)),
    };
    let term2 = Expression::Op {
        op: Operation::Mul,
        left: Box::new(Expression::Op {
            op: Operation::Sub,
            left: Box::new(Expression::Value(3)),
            right: Box::new(Expression::Value(4)),
        }),
        right: Box::new(Expression::Value(5)),
    };
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Add,
            left: Box::new(term1),
            right: Box::new(term2),
        }),
        Ok(85)
    );
}

#[test]
fn test_error() {
    assert_eq!(
        eval(Expression::Op {
            op: Operation::Div,
            left: Box::new(Expression::Value(99)),
            right: Box::new(Expression::Value(0)),
        }),
        Err(String::from("division by zero"))
    );
}

fn main() {
    let expr = Expression::Op {
        op: Operation::Sub,
        left: Box::new(Expression::Value(20)),
        right: Box::new(Expression::Value(10)),
    };
    println!("expr: {:?}", expr);
    println!("result: {:?}", eval(expr));
}
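For the follow-up question about overflow, one possible direction is to replace the plain arithmetic operators with the checked_* methods on i64, which return None instead of panicking. The sketch below is not part of the reference solution; the helper apply and the "integer overflow" message are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy)]
enum Operation {
    Add,
    Sub,
    Mul,
    Div,
}

// Hypothetical helper: applies `op` to two already-evaluated operands and
// reports division by zero and overflow as errors instead of panicking.
fn apply(op: Operation, left: i64, right: i64) -> Result<i64, String> {
    if matches!(op, Operation::Div) && right == 0 {
        return Err(String::from("division by zero"));
    }
    // Each `checked_*` method returns `None` when the result does not fit.
    let result = match op {
        Operation::Add => left.checked_add(right),
        Operation::Sub => left.checked_sub(right),
        Operation::Mul => left.checked_mul(right),
        Operation::Div => left.checked_div(right),
    };
    result.ok_or_else(|| String::from("integer overflow"))
}

fn main() {
    println!("{:?}", apply(Operation::Mul, 6, 7));
    println!("{:?}", apply(Operation::Add, i64::MAX, 1));
    println!("{:?}", apply(Operation::Div, 99, 0));
}
```

Inside eval, the final match on op could then be replaced by a call to such a helper once both operands have been evaluated.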

Methods and Traits

In this segment:

This segment should take about 55 minutes

Methods

Rust allows you to associate functions with your new types. You do this with an impl block:

#[derive(Debug)]
struct Race {
    name: String,
    laps: Vec<i32>,
}

impl Race {
    // No receiver, a static method
    fn new(name: &str) -> Self {
        Self { name: String::from(name), laps: Vec::new() }
    }

    // Exclusive borrowed read-write access to self
    fn add_lap(&mut self, lap: i32) {
        self.laps.push(lap);
    }

    // Shared and read-only borrowed access to self
    fn print_laps(&self) {
        println!("Recorded {} laps for {}:", self.laps.len(), self.name);
        for (idx, lap) in self.laps.iter().enumerate() {
            println!("Lap {idx}: {lap} sec");
        }
    }

    // Exclusive ownership of self
    fn finish(self) {
        let total: i32 = self.laps.iter().sum();
        println!("Race {} is finished, total lap time: {}", self.name, total);
    }
}

fn main() {
    let mut race = Race::new("Monaco Grand Prix");
    race.add_lap(70);
    race.add_lap(68);
    race.print_laps();
    race.add_lap(71);
    race.print_laps();
    race.finish();
    // race.add_lap(42);
}

The self arguments specify the “receiver” - the object the method acts on. There are several common receivers for a method:

  • &self: borrows the object from the caller using a shared and immutable reference. The object can be used again afterwards.
  • &mut self: borrows the object from the caller using a unique and mutable reference. The object can be used again afterwards.
  • self: takes ownership of the object and moves it away from the caller. The method becomes the owner of the object. The object will be dropped (deallocated) when the method returns, unless its ownership is explicitly transmitted. Complete ownership does not automatically mean mutability.
  • mut self: same as above, but the method can mutate the object.
  • No receiver: this becomes a static method on the struct. Typically used to create constructors, which are called new by convention.
This slide should take about 10 minutes.

Key Points:

  • It can be helpful to introduce methods by comparing them to functions.
    • Methods are called on an instance of a type (such as a struct or enum), and the first parameter represents the instance as self.
    • Developers may choose to use methods to take advantage of method receiver syntax and to help keep them more organized. By using methods we can keep all the implementation code in one predictable place.
  • Point out the use of the keyword self, a method receiver.
    • Show that self is shorthand for self: Self (likewise, &self is shorthand for self: &Self), and perhaps show how the struct name could also be used.
    • Explain that Self is a type alias for the type the impl block is in and can be used elsewhere in the block.
    • Note how self is used like other structs and dot notation can be used to refer to individual fields.
    • This might be a good time to demonstrate how the &self differs from self by trying to run finish twice.
    • Beyond variants on self, there are also special wrapper types allowed to be receiver types, such as Box<Self>.
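The receiver forms mentioned in these notes can be compared side by side in a small sketch. The Counter type and its methods are hypothetical names chosen for this illustration; the longhand self: &Self spelling is written out only to show what the shorthand stands for.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    // Longhand for `&self`: shared, read-only borrow.
    fn get(self: &Self) -> u32 {
        self.count
    }

    // Longhand for `&mut self`: exclusive, read-write borrow.
    fn increment(self: &mut Self) {
        self.count += 1;
    }

    // Takes ownership; `mut self` lets the method modify the value it owns.
    fn doubled(mut self) -> Self {
        self.count *= 2;
        self
    }

    // Wrapper receiver: this method is called on a `Box<Counter>`.
    fn into_count(self: Box<Self>) -> u32 {
        self.count
    }
}

fn main() {
    let mut counter = Counter { count: 1 };
    counter.increment();
    // `doubled` consumes the old value, so the name is rebound to the result.
    let counter = counter.doubled();
    println!("count = {}", counter.get());
    let boxed = Box::new(counter);
    println!("boxed count = {}", boxed.into_count());
}
```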

Traits

Rust lets you abstract over types with traits. They’re similar to interfaces:

struct Dog {
    name: String,
    age: i8,
}
struct Cat {
    lives: i8,
}

trait Pet {
    fn talk(&self) -> String;

    fn greet(&self) {
        println!("Oh you're a cutie! What's your name? {}", self.talk());
    }
}

impl Pet for Dog {
    fn talk(&self) -> String {
        format!("Woof, my name is {}!", self.name)
    }
}

impl Pet for Cat {
    fn talk(&self) -> String {
        String::from("Miau!")
    }
}

fn main() {
    let captain_floof = Cat { lives: 9 };
    let fido = Dog { name: String::from("Fido"), age: 5 };

    captain_floof.greet();
    fido.greet();
}
This slide should take about 10 minutes.
  • A trait defines a number of methods that types must have in order to implement the trait.

  • Traits are implemented in an impl <trait> for <type> { .. } block.

  • Traits may specify pre-implemented (provided) methods and methods that users are required to implement themselves. Provided methods can rely on required methods. In this case, greet is provided, and relies on talk.

Deriving Traits

Supported traits can be automatically implemented for your custom types, as follows:

#[derive(Debug, Clone, Default)]
struct Player {
    name: String,
    strength: u8,
    hit_points: u8,
}

fn main() {
    let p1 = Player::default(); // Default trait adds `default` constructor.
    let mut p2 = p1.clone(); // Clone trait adds `clone` method.
    p2.name = String::from("EldurScrollz");
    // Debug trait adds support for printing with `{:?}`.
    println!("{:?} vs. {:?}", p1, p2);
}
This slide should take about 5 minutes.

Derivation is implemented with macros, and many crates provide useful derive macros to add useful functionality. For example, serde can derive serialization support for a struct using #[derive(Serialize)].
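To give a feel for what a derive macro saves you from writing, here is a hand-written sketch that behaves like #[derive(Clone, Default)] would for Player. It is an approximation for illustration, not the literal macro expansion; PartialEq is derived here only so the values can be compared.

```rust
#[derive(Debug, PartialEq)]
struct Player {
    name: String,
    strength: u8,
    hit_points: u8,
}

// Hand-written stand-in for `#[derive(Default)]`: every field gets the
// default value of its own type.
impl Default for Player {
    fn default() -> Self {
        Player {
            name: String::default(),
            strength: u8::default(),
            hit_points: u8::default(),
        }
    }
}

// Hand-written stand-in for `#[derive(Clone)]`: every field is duplicated.
impl Clone for Player {
    fn clone(&self) -> Self {
        Player {
            name: self.name.clone(),
            strength: self.strength,
            hit_points: self.hit_points,
        }
    }
}

fn main() {
    let p1 = Player::default();
    let mut p2 = p1.clone();
    p2.name = String::from("EldurScrollz");
    println!("{:?} vs. {:?}", p1, p2);
}
```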

Trait Objects

Trait objects allow for values of different types, for instance in a collection:

struct Dog {
    name: String,
    age: i8,
}
struct Cat {
    lives: i8,
}

trait Pet {
    fn talk(&self) -> String;
}

impl Pet for Dog {
    fn talk(&self) -> String {
        format!("Woof, my name is {}!", self.name)
    }
}

impl Pet for Cat {
    fn talk(&self) -> String {
        String::from("Miau!")
    }
}

fn main() {
    let pets: Vec<Box<dyn Pet>> = vec![
        Box::new(Cat { lives: 9 }),
        Box::new(Dog { name: String::from("Fido"), age: 5 }),
    ];
    for pet in pets {
        println!("Hello, who are you? {}", pet.talk());
    }
}

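Memory layout after allocating pets:

[Diagram: on the stack, pets holds ptr, len (2) and capacity (2). On the heap, the vector data holds two fat pointers, each made of a data pointer and a vtable pointer. One refers to the Dog (name “Fido” with length 4 and capacity 4, age 5) and to the vtable containing <Dog as Pet>::talk; the other refers to the Cat (lives 9) and to the vtable containing <Cat as Pet>::talk.]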
This slide should take about 10 minutes.
  • Types that implement a given trait may be of different sizes. This makes it impossible to have things like Vec<dyn Pet> in the example above.
  • dyn Pet is a way to tell the compiler about a dynamically sized type that implements Pet.
  • In the example, pets is allocated on the stack and the vector data is on the heap. The two vector elements are fat pointers:
    • A fat pointer is a double-width pointer. It has two components: a pointer to the actual object and a pointer to the virtual method table (vtable) for the Pet implementation of that particular object.
    • The data for the Dog named Fido is the name and age fields. The Cat has a lives field.
  • Compare these outputs in the above example:
    println!("{} {}", std::mem::size_of::<Dog>(), std::mem::size_of::<Cat>());
    println!("{} {}", std::mem::size_of::<&Dog>(), std::mem::size_of::<&Cat>());
    println!("{}", std::mem::size_of::<&dyn Pet>());
    println!("{}", std::mem::size_of::<Box<dyn Pet>>());
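The size comparison from the last note can be run as a standalone program along these lines. The talk implementations are slightly changed from the slide so that every field is used; the exact struct sizes depend on the target, so no output is listed here.

```rust
use std::mem::size_of;

struct Dog {
    name: String,
    age: i8,
}
struct Cat {
    lives: i8,
}

trait Pet {
    fn talk(&self) -> String;
}

impl Pet for Dog {
    fn talk(&self) -> String {
        format!("Woof, my name is {} and I am {}!", self.name, self.age)
    }
}

impl Pet for Cat {
    fn talk(&self) -> String {
        format!("Miau! I have {} lives.", self.lives)
    }
}

fn main() {
    let pets: Vec<Box<dyn Pet>> = vec![
        Box::new(Cat { lives: 9 }),
        Box::new(Dog { name: String::from("Fido"), age: 5 }),
    ];
    for pet in &pets {
        println!("{}", pet.talk());
    }

    // The concrete types have different sizes...
    println!("Dog: {}, Cat: {}", size_of::<Dog>(), size_of::<Cat>());
    // ...references to them are one pointer wide...
    println!("&Dog: {}, &Cat: {}", size_of::<&Dog>(), size_of::<&Cat>());
    // ...and trait object pointers carry an extra vtable pointer.
    println!("&dyn Pet: {}", size_of::<&dyn Pet>());
    println!("Box<dyn Pet>: {}", size_of::<Box<dyn Pet>>());
}
```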

Exercise: Generic Logger

Let’s design a simple logging utility, using a trait Logger with a log method. Code which might log its progress can then take an &impl Logger. In testing, this might put messages in the test logfile, while in a production build it would send messages to a log server.

However, the StderrLogger given below logs all messages, regardless of verbosity. Your task is to write a VerbosityFilter type that will ignore messages above a maximum verbosity.

This is a common pattern: a struct wrapping a trait implementation and implementing that same trait, adding behavior in the process. What other kinds of wrappers might be useful in a logging utility?

use std::fmt::Display;

pub trait Logger {
    /// Log a message at the given verbosity level.
    fn log(&self, verbosity: u8, message: impl Display);
}

struct StderrLogger;

impl Logger for StderrLogger {
    fn log(&self, verbosity: u8, message: impl Display) {
        eprintln!("verbosity={verbosity}: {message}");
    }
}

fn do_things(logger: &impl Logger) {
    logger.log(5, "FYI");
    logger.log(2, "Uhoh");
}

// TODO: Define and implement `VerbosityFilter`.

fn main() {
    let l = VerbosityFilter { max_verbosity: 3, inner: StderrLogger };
    do_things(&l);
}

Solution

use std::fmt::Display;

pub trait Logger {
    /// Log a message at the given verbosity level.
    fn log(&self, verbosity: u8, message: impl Display);
}

struct StderrLogger;

impl Logger for StderrLogger {
    fn log(&self, verbosity: u8, message: impl Display) {
        eprintln!("verbosity={verbosity}: {message}");
    }
}

fn do_things(logger: &impl Logger) {
    logger.log(5, "FYI");
    logger.log(2, "Uhoh");
}

/// Only log messages up to the given verbosity level.
struct VerbosityFilter<L: Logger> {
    max_verbosity: u8,
    inner: L,
}

impl<L: Logger> Logger for VerbosityFilter<L> {
    fn log(&self, verbosity: u8, message: impl Display) {
        if verbosity <= self.max_verbosity {
            self.inner.log(verbosity, message);
        }
    }
}

fn main() {
    let l = VerbosityFilter { max_verbosity: 3, inner: StderrLogger };
    do_things(&l);
}
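As one possible answer to the question about other useful wrappers, the same pattern can add a prefix to each message. Both PrefixLogger and MemoryLogger below are hypothetical types invented for this sketch; MemoryLogger keeps messages in memory so the wrapper's effect can be inspected.

```rust
use std::cell::RefCell;
use std::fmt::Display;

pub trait Logger {
    /// Log a message at the given verbosity level.
    fn log(&self, verbosity: u8, message: impl Display);
}

/// Hypothetical logger that stores formatted messages in memory.
struct MemoryLogger {
    messages: RefCell<Vec<String>>,
}

impl Logger for MemoryLogger {
    fn log(&self, verbosity: u8, message: impl Display) {
        // `RefCell` allows pushing through `&self`.
        self.messages.borrow_mut().push(format!("verbosity={verbosity}: {message}"));
    }
}

/// Hypothetical wrapper that prepends a fixed prefix to every message.
struct PrefixLogger<L: Logger> {
    prefix: String,
    inner: L,
}

impl<L: Logger> Logger for PrefixLogger<L> {
    fn log(&self, verbosity: u8, message: impl Display) {
        // Delegate to the wrapped logger with the decorated message.
        self.inner.log(verbosity, format!("[{}] {}", self.prefix, message));
    }
}

fn main() {
    let logger = PrefixLogger {
        prefix: String::from("db"),
        inner: MemoryLogger { messages: RefCell::new(Vec::new()) },
    };
    logger.log(2, "Uhoh");
    for line in logger.inner.messages.borrow().iter() {
        println!("{line}");
    }
}
```

Because each wrapper implements Logger itself, wrappers of this kind can be nested, for example a VerbosityFilter around a PrefixLogger.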

Generics

In this segment:

This segment should take about 45 minutes

Generic Functions

Rust supports generics, which lets you abstract algorithms or data structures (such as sorting or a binary tree) over the types used or stored.

/// Pick `even` or `odd` depending on the value of `n`.
fn pick<T>(n: i32, even: T, odd: T) -> T {
    if n % 2 == 0 {
        even
    } else {
        odd
    }
}

fn main() {
    println!("picked a number: {:?}", pick(97, 222, 333));
    println!("picked a tuple: {:?}", pick(28, ("dog", 1), ("cat", 2)));
}
This slide should take about 5 minutes.
  • Rust infers a type for T based on the types of the arguments and return value.

  • This is similar to C++ templates, but Rust partially compiles the generic function immediately, so that function must be valid for all types matching the constraints. For example, try modifying pick to return even + odd if n == 0. Even if only the pick instantiation with integers is used, Rust still considers it invalid. C++ would let you do this.

  • Generic code is turned into non-generic code based on the call sites. This is a zero-cost abstraction: you get exactly the same result as if you had hand-coded the data structures without the abstraction.
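The even + odd experiment from the notes only compiles once the function states that T supports addition. A minimal sketch, assuming a new function name pick_or_sum so it does not clash with pick:

```rust
use std::ops::Add;

/// Hypothetical variant of `pick`: returns `even + odd` when `n == 0`.
/// The `Add<Output = T>` bound is what makes `even + odd` valid for every
/// `T` the function accepts.
fn pick_or_sum<T: Add<Output = T>>(n: i32, even: T, odd: T) -> T {
    if n == 0 {
        even + odd
    } else if n % 2 == 0 {
        even
    } else {
        odd
    }
}

fn main() {
    println!("summed: {:?}", pick_or_sum(0, 222, 333));
    println!("picked: {:?}", pick_or_sum(97, 1.5, 2.5));
}
```

The price of the bound is that calls with types lacking Add, such as the ("dog", 1) tuples from the slide, are now rejected at the call site.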

Generic Data Types

You can use generics to abstract over the concrete field type:

#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn coords(&self) -> (&T, &T) {
        (&self.x, &self.y)
    }

    // fn set_x(&mut self, x: T)
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
    println!("{integer:?} and {float:?}");
    println!("coords: {:?}", integer.coords());
}
This slide should take about 15 minutes.
  • Q: Why is T specified twice in impl<T> Point<T> {}? Isn’t that redundant?

    • This is because it is a generic implementation section for a generic type. They are independently generic.
    • It means these methods are defined for any T.
    • It is possible to write impl Point<u32> { .. }.
      • Point is still generic and you can use Point<f64>, but methods in this block will only be available for Point<u32>.
  • Try declaring a new variable let p = Point { x: 5, y: 10.0 };. Update the code to allow points that have elements of different types, by using two type variables, e.g., T and U.
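One way the two suggestions in these notes might turn out is sketched below: Point gains a second type parameter U, and a separate impl block targets one concrete instantiation. The method distance_from_origin is a hypothetical addition for illustration.

```rust
#[derive(Debug)]
struct Point<T, U> {
    x: T,
    y: U,
}

// Defined for every combination of `T` and `U`.
impl<T, U> Point<T, U> {
    fn coords(&self) -> (&T, &U) {
        (&self.x, &self.y)
    }
}

// Defined only for points whose coordinates are both `f64`.
impl Point<f64, f64> {
    fn distance_from_origin(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
}

fn main() {
    let mixed = Point { x: 5, y: 10.0 };
    println!("{mixed:?} has coords {:?}", mixed.coords());

    let float = Point { x: 3.0_f64, y: 4.0_f64 };
    println!("distance: {}", float.distance_from_origin());
    // `mixed.distance_from_origin()` would not compile: `x` is an integer.
}
```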

Trait Bounds

When working with generics, you often want to require the types to implement some trait, so that you can call this trait’s methods.

You can do this with T: Trait or impl Trait:

fn duplicate<T: Clone>(a: T) -> (T, T) {
    (a.clone(), a.clone())
}

// struct NotClonable;

fn main() {
    let foo = String::from("foo");
    let pair = duplicate(foo);
    println!("{pair:?}");
}
This slide should take about 10 minutes.
  • Try making a NotClonable and passing it to duplicate.

  • When multiple traits are necessary, use + to join them.

  • Show a where clause; students will encounter it when reading code.

    fn duplicate<T>(a: T) -> (T, T)
    where
        T: Clone,
    {
        (a.clone(), a.clone())
    }
    • It declutters the function signature if you have many parameters.
    • It has additional features making it more powerful.
      • If someone asks, the extra feature is that the type on the left of “:” can be arbitrary, like Option<T>.
  • Note that Rust does not (yet) support specialization. For example, given the original duplicate, it is invalid to add a specialized duplicate(a: u32).
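Two of the points in these notes, joining bounds with + and putting an arbitrary type on the left of ":" in a where clause, can be sketched like this. The function names are hypothetical and both functions return a String so the result is easy to check.

```rust
use std::fmt::Debug;

// Two bounds joined with `+`: `T` must be cloneable and debug-printable.
fn duplicate_and_show<T: Clone + Debug>(a: T) -> String {
    let pair = (a.clone(), a);
    format!("{pair:?}")
}

// The bound is placed on `Option<T>` rather than on `T` itself, which is
// only expressible with a `where` clause.
fn show_wrapped<T>(a: T) -> String
where
    Option<T>: Debug,
{
    format!("{:?}", Some(a))
}

fn main() {
    println!("{}", duplicate_and_show(String::from("foo")));
    println!("{}", show_wrapped(3));
}
```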

impl Trait

Similar to trait bounds, an impl Trait syntax can be used in function arguments and return values:

// Syntactic sugar for:
//   fn add_42_millions<T: Into<i32>>(x: T) -> i32 {
fn add_42_millions(x: impl Into<i32>) -> i32 {
    x.into() + 42_000_000
}

fn pair_of(x: u32) -> impl std::fmt::Debug {
    (x + 1, x - 1)
}

fn main() {
    let many = add_42_millions(42_i8);
    println!("{many}");
    let many_more = add_42_millions(10_000_000);
    println!("{many_more}");
    let debuggable = pair_of(27);
    println!("debuggable: {debuggable:?}");
}
This slide should take about 5 minutes.

impl Trait allows you to work with types which you cannot name. The meaning of impl Trait is a bit different in the different positions.

  • For a parameter, impl Trait is like an anonymous generic parameter with a trait bound.

  • For a return type, it means that the return type is some concrete type that implements the trait, without naming the type. This can be useful when you don’t want to expose the concrete type in a public API.

    Inference is hard in return position. A function returning impl Foo picks the concrete type it returns, without writing it out in the source. A function returning a generic type like collect<B>() -> B can return any type satisfying B, and the caller may need to choose one, such as with let x: Vec<_> = foo.collect() or with the turbofish, foo.collect::<Vec<_>>().

What is the type of debuggable? Try let debuggable: () = .. to see what the error message shows.
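The contrast between the two return-position cases can be seen in a short sketch. The function evens_up_to is a hypothetical example: it chooses its own concrete iterator type, which cannot be named because it contains a closure, while the caller chooses the collection type that collect produces.

```rust
// The callee picks the concrete type behind `impl Iterator`.
fn evens_up_to(limit: u32) -> impl Iterator<Item = u32> {
    (0..=limit).filter(|n| n % 2 == 0)
}

fn main() {
    // The caller picks the output type of `collect`, either with a type
    // annotation on the binding...
    let annotated: Vec<_> = evens_up_to(6).collect();
    // ...or with the turbofish syntax.
    let turbofish = evens_up_to(6).collect::<Vec<_>>();
    println!("{annotated:?} {turbofish:?}");
}
```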

Exercise: Generic min

In this short exercise, you will implement a generic min function that determines the minimum of two values, using a LessThan trait.

trait LessThan {
    /// Return true if self is less than other.
    fn less_than(&self, other: &Self) -> bool;
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Citation {
    author: &'static str,
    year: u32,
}

impl LessThan for Citation {
    fn less_than(&self, other: &Self) -> bool {
        if self.author < other.author {
            true
        } else if self.author > other.author {
            false
        } else {
            self.year < other.year
        }
    }
}

// TODO: implement the `min` function used in `main`.

fn main() {
    let cit1 = Citation { author: "Shapiro", year: 2011 };
    let cit2 = Citation { author: "Baumann", year: 2010 };
    let cit3 = Citation { author: "Baumann", year: 2019 };
    debug_assert_eq!(min(cit1, cit2), cit2);
    debug_assert_eq!(min(cit2, cit3), cit2);
    debug_assert_eq!(min(cit1, cit3), cit3);
}

Solution

trait LessThan {
    /// Return true if self is less than other.
    fn less_than(&self, other: &Self) -> bool;
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Citation {
    author: &'static str,
    year: u32,
}

impl LessThan for Citation {
    fn less_than(&self, other: &Self) -> bool {
        if self.author < other.author {
            true
        } else if self.author > other.author {
            false
        } else {
            self.year < other.year
        }
    }
}

fn min<T: LessThan>(l: T, r: T) -> T {
    if l.less_than(&r) {
        l
    } else {
        r
    }
}

fn main() {
    let cit1 = Citation { author: "Shapiro", year: 2011 };
    let cit2 = Citation { author: "Baumann", year: 2010 };
    let cit3 = Citation { author: "Baumann", year: 2019 };
    debug_assert_eq!(min(cit1, cit2), cit2);
    debug_assert_eq!(min(cit2, cit3), cit2);
    debug_assert_eq!(min(cit1, cit3), cit3);
}

Welcome Back

In this session:

Including 10 minute breaks, this session should take about 3 hours

Standard Library Types

In this segment:

This segment should take about 1 hour and 10 minutes

For each of the slides in this section, spend some time reviewing the documentation pages, highlighting some of the more common methods.

Standard Library

Rust comes with a standard library which helps establish a set of common types used by Rust libraries and programs. This way, two libraries can work together smoothly because they both use the same String type.

In fact, Rust contains several layers of the Standard Library: core, alloc and std.

  • core includes the most basic types and functions that don’t depend on libc, allocator or even the presence of an operating system.
  • alloc includes types which require a global heap allocator, such as Vec, Box and Arc.
  • Embedded Rust applications often only use core, and sometimes alloc.

Documentation

Rust comes with extensive documentation. For example:

In fact, you can document your own code:

/// Determine whether the first argument is divisible by the second argument.
///
/// If the second argument is zero, the result is false.
fn is_divisible_by(lhs: u32, rhs: u32) -> bool {
    if rhs == 0 {
        return false;
    }
    lhs % rhs == 0
}

The contents are treated as Markdown. All published Rust library crates are automatically documented at docs.rs using the rustdoc tool. It is idiomatic to document all public items in an API using this pattern.

To document an item from inside the item (such as inside a module), use //! or /*! .. */, called “inner doc comments”:

//! This module contains functionality relating to divisibility of integers.
This slide should take about 5 minutes.
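Both comment styles can appear together, as in the sketch below. The module name divisibility and the function is_even are assumptions made for this illustration; the inner doc comment documents the module it sits in, while the outer doc comment documents the item that follows it.

```rust
mod divisibility {
    //! This module contains functionality relating to divisibility of integers.

    /// Determine whether the argument is even.
    ///
    /// Zero counts as even.
    pub fn is_even(n: u32) -> bool {
        n % 2 == 0
    }
}

fn main() {
    println!("10 is even: {}", divisibility::is_even(10));
}
```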

Option

We have already seen some use of Option<T>. It stores either a value of type T or nothing. For example, String::find returns an Option<usize>.

fn main() {
    let name = "Löwe 老虎 Léopard Gepardi";
    let mut position: Option<usize> = name.find('é');
    println!("find returned {position:?}");
    assert_eq!(position.unwrap(), 14);
    position = name.find('Z');
    println!("find returned {position:?}");
    assert_eq!(position.expect("Character not found"), 0);
}
This slide should take about 10 minutes.
  • Option is widely used, not just in the standard library.
  • unwrap will return the value in an Option, or panic. expect is similar but takes an error message.
    • You can panic on None, but you can’t “accidentally” forget to check for None.
    • It’s common to unwrap/expect all over the place when hacking something together, but production code typically handles None in a nicer fashion.
  • The niche optimization means that Option<T> often has the same size in memory as T.
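A sketch of handling None without panicking, together with a check of the niche optimization for references, might look as follows. The helper describe_position is a hypothetical name.

```rust
use std::mem::size_of;

// Hypothetical helper: handles both variants explicitly instead of calling
// `unwrap` or `expect`.
fn describe_position(haystack: &str, needle: char) -> String {
    match haystack.find(needle) {
        Some(index) => format!("found at byte {index}"),
        None => String::from("not found"),
    }
}

fn main() {
    println!("{}", describe_position("Löwe", 'w'));
    println!("{}", describe_position("Löwe", 'Z'));

    // `unwrap_or` supplies a fallback value instead of panicking on `None`.
    let fallback = "Löwe".find('Z').unwrap_or(0);
    println!("fallback position: {fallback}");

    // A reference is never null, so `None` can reuse the null bit pattern.
    println!("{} {}", size_of::<&u8>(), size_of::<Option<&u8>>());
}
```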

Result

Result is similar to Option, but indicates the success or failure of an operation, each with a different type. This is similar to the Result enum defined in the destructuring example, but generic: Result<T, E> where T is used in the Ok variant and E appears in the Err variant.

use std::fs::File;
use std::io::Read;

fn main() {
    let file: Result<File, std::io::Error> = File::open("diary.txt");
    match file {
        Ok(mut file) => {
            let mut contents = String::new();
            if let Ok(bytes) = file.read_to_string(&mut contents) {
                println!("Dear diary: {contents} ({bytes} bytes)");
            } else {
                println!("Could not read file content");
            }
        }
        Err(err) => {
            println!("The diary could not be opened: {err}");
        }
    }
}
This slide should take about 10 minutes.
  • As with Option, the successful value sits inside of Result, forcing the developer to extract it explicitly. This encourages error checking. When an error should never happen, unwrap() or expect() can be called, and this also signals the developer's intent.
  • The Result documentation is recommended reading. Not during the course, but it is worth mentioning. It contains many convenience methods and functions that support functional-style programming.
  • Result is the standard type for implementing error handling, as we will see on Day 3.
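
As one illustration of the convenience methods on Result, the sketch below uses map and unwrap_or. The function parse_and_double is a hypothetical example and does not come from the course.

```rust
use std::num::ParseIntError;

/// Hypothetical example: parses a decimal number and doubles it,
/// passing any parse error through unchanged.
fn parse_and_double(text: &str) -> Result<i32, ParseIntError> {
    // map transforms only the Ok value; an Err is returned as-is.
    text.trim().parse::<i32>().map(|n| n * 2)
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    println!("{:?}", parse_and_double("twenty-one"));
    // unwrap_or supplies a fallback instead of panicking on Err.
    println!("{}", parse_and_double("oops").unwrap_or(0));
}
```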

String

String is the standard heap-allocated, growable UTF-8 string buffer:

fn main() {
    let mut s1 = String::new();
    s1.push_str("Hello");
    println!("s1: len = {}, capacity = {}", s1.len(), s1.capacity());

    let mut s2 = String::with_capacity(s1.len() + 1);
    s2.push_str(&s1);
    s2.push('!');
    println!("s2: len = {}, capacity = {}", s2.len(), s2.capacity());

    let s3 = String::from("🇨🇭");
    println!("s3: len = {}, number of chars = {}", s3.len(), s3.chars().count());
}

String implements Deref<Target = str>, which means that you can call all str methods on a String.

This slide should take about 10 minutes.
  • String::new returns a new empty string. Use String::with_capacity when you know how much data you want to push to the string.
  • String::len returns the size of the String in bytes (which can be different from its length in characters).
  • String::chars returns an iterator over the actual characters. Note that a char can be different from what a human would consider a “character”, due to grapheme clusters.
  • When people refer to strings, they could be talking about either &str or String.
  • When a type implements Deref<Target = T>, the compiler lets you transparently call methods from T.
    • We haven’t discussed the Deref trait yet, so at this point this mostly explains the structure of the sidebar in the documentation.
    • String implements Deref<Target = str>, which transparently gives it access to str’s methods.
    • Write and compare let s3 = s1.deref(); and let s3 = &*s1;.
  • String is implemented as a wrapper around a vector of bytes. Many of the operations you see supported on vectors are also supported on String, but with some extra guarantees.
  • Compare the different ways to index a String:
    • To a character by using s3.chars().nth(i).unwrap(), where i is in bounds or out of bounds.
    • To a substring by using s3[0..4], where that slice is on character boundaries or not.
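
The indexing comparison can be tried without panicking by using the Option-returning variants. This is a sketch only; nth_char and substring are hypothetical helper names, and str::get is used in place of the s3[0..4] slicing syntax, which would panic on a bad range.

```rust
/// Hypothetical helper: the i-th `char` of `s`, or None when `i` is out of bounds.
fn nth_char(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i)
}

/// Hypothetical helper: the byte range as a substring, or None when the range
/// is out of bounds or does not fall on `char` boundaries.
fn substring(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // The flag consists of two chars, each encoded in four bytes.
    let s3 = String::from("🇨🇭");
    println!("{:?}", nth_char(&s3, 0)); // in bounds
    println!("{:?}", nth_char(&s3, 5)); // out of bounds
    println!("{:?}", substring(&s3, 0, 4)); // on a char boundary
    println!("{:?}", substring(&s3, 0, 2)); // splits a char
}
```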

Vec

Vec is the standard resizable heap-allocated buffer:

fn main() {
    let mut v1 = Vec::new();
    v1.push(42);
    println!("v1: len = {}, capacity = {}", v1.len(), v1.capacity());

    let mut v2 = Vec::with_capacity(v1.len() + 1);
    v2.extend(v1.iter());
    v2.push(9999);
    println!("v2: len = {}, capacity = {}", v2.len(), v2.capacity());

    // Canonical macro to initialize a vector with elements.
    let mut v3 = vec![0, 0, 1, 2, 3, 4];

    // Retain only the even elements.
    v3.retain(|x| x % 2 == 0);
    println!("{v3:?}");

    // Remove consecutive duplicates.
    v3.dedup();
    println!("{v3:?}");
}

Vec implements Deref<Target = [T]>, which means that you can call slice methods on a Vec.

This slide should take about 10 minutes.
  • Vec is a type of collection, along with String and HashMap. The data it contains is stored on the heap. This means the amount of data doesn’t need to be known at compile time. It can grow or shrink at runtime.
  • Notice that Vec<T> is a generic type too, but you don’t have to specify T explicitly. As always with Rust type inference, T was established during the first push call.
  • vec![...] is a canonical macro to use instead of Vec::new(), and it supports adding initial elements to the vector.
  • To index the vector you use [ ], but it will panic if the index is out of bounds. Alternatively, get returns an Option. The pop function removes the last element.
  • Slices are covered on day 3. For now, students only need to know that a value of type Vec gives access to all of the documented slice methods, too.
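
The difference between [ ], get, and pop can be sketched as follows. The helper element_or is hypothetical; it takes a slice, which a &Vec<i32> converts to automatically.

```rust
/// Hypothetical helper: looks up `index` without panicking and returns
/// `fallback` when the index is out of bounds.
fn element_or(values: &[i32], index: usize, fallback: i32) -> i32 {
    values.get(index).copied().unwrap_or(fallback)
}

fn main() {
    let mut v = vec![10, 20, 30];
    // Indexing with [ ] panics when out of bounds, so only use a valid index here.
    println!("v[1] = {}", v[1]);
    println!("index 7 = {}", element_or(&v, 7, -1));
    // pop removes and returns the last element, or None when the vector is empty.
    println!("popped {:?}, {} left", v.pop(), v.len());
}
```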

HashMap

Standard hash map with protection against HashDoS attacks:

use std::collections::HashMap;

fn main() {
    let mut page_counts = HashMap::new();
    page_counts.insert("Adventures of Huckleberry Finn".to_string(), 207);
    page_counts.insert("Grimms' Fairy Tales".to_string(), 751);
    page_counts.insert("Pride and Prejudice".to_string(), 303);

    if !page_counts.contains_key("Les Misérables") {
        println!(
            "We know about {} books, but not Les Misérables.",
            page_counts.len()
        );
    }

    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        match page_counts.get(book) {
            Some(count) => println!("{book}: {count} pages"),
            None => println!("{book} is unknown."),
        }
    }

    // Use the .entry() method to insert a value if nothing is found.
    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        let page_count: &mut i32 = page_counts.entry(book.to_string()).or_insert(0);
        *page_count += 1;
    }

    println!("{page_counts:#?}");
}
This slide should take about 10 minutes.
  • HashMap is not defined in the prelude and needs to be brought into scope.

  • Try the following lines of code. The first line checks whether a book is in the hashmap and, if not, returns an alternative value. The second line inserts the alternative value into the hashmap if the book is not found.

    let pc1 = page_counts
        .get("Harry Potter and the Sorcerer's Stone")
        .unwrap_or(&336);
    let pc2 = page_counts
        .entry("The Hunger Games".to_string())
        .or_insert(374);
  • Unlike vec!, there is unfortunately no standard hashmap! macro.

    • However, since Rust 1.56, HashMap implements From<[(K, V); N]>, which lets us easily initialize a hash map from a literal array:

      let page_counts = HashMap::from([
        ("Harry Potter and the Sorcerer's Stone".to_string(), 336),
        ("The Hunger Games".to_string(), 374),
      ]);
  • Alternatively, HashMap can be built from any Iterator that yields key-value tuples.

  • We are showing HashMap<String, i32> and avoid using &str as the key to keep the examples simpler. References can of course be used in collections, but they can lead to complications with the borrow checker.

    • Try removing to_string() from the example above and see if it still compiles. Where do you think we might run into issues?
  • This type has several “method-specific” return types, such as std::collections::hash_map::Keys. These types often appear in searches of the Rust docs. Show students the docs for this type, and the helpful link back to the keys method.
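
Building a HashMap from an iterator of key-value tuples, as mentioned above, might look like the following sketch. The function word_lengths is a hypothetical example.

```rust
use std::collections::HashMap;

/// Hypothetical example: builds a map from each word to its length in bytes.
fn word_lengths(words: &[&str]) -> HashMap<String, usize> {
    // collect() can build a HashMap from any iterator of (key, value) tuples.
    words.iter().map(|w| (w.to_string(), w.len())).collect()
}

fn main() {
    let lengths = word_lengths(&["stack", "heap", "borrow"]);
    // keys() returns the method-specific type std::collections::hash_map::Keys.
    let mut keys: Vec<&String> = lengths.keys().collect();
    keys.sort(); // HashMap iteration order is unspecified, so sort for display
    println!("{keys:?}");
    println!("{:?}", lengths.get("heap"));
}
```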

Exercise

In this exercise you will take a very simple data structure and make it generic. It uses a std::collections::HashMap to keep track of which values have been seen and how many times each one has appeared.

The initial version of Counter is hard coded to only work for u32 values. Make the struct and its methods generic over the type of value being tracked, that way Counter can track any type of value.

If you finish early, try using the entry method to halve the number of hash lookups required to implement the count method.

use std::collections::HashMap;

/// Counter counts the number of times each value of type T has been seen.
struct Counter {
    values: HashMap<u32, u64>,
}

impl Counter {
    /// Create a new Counter.
    fn new() -> Self {
        Counter {
            values: HashMap::new(),
        }
    }

    /// Count an occurrence of the given value.
    fn count(&mut self, value: u32) {
        if self.values.contains_key(&value) {
            *self.values.get_mut(&value).unwrap() += 1;
        } else {
            self.values.insert(value, 1);
        }
    }

    /// Return the number of times the given value has been seen.
    fn times_seen(&self, value: u32) -> u64 {
        self.values.get(&value).copied().unwrap_or_default()
    }
}

fn main() {
    let mut ctr = Counter::new();
    ctr.count(13);
    ctr.count(14);
    ctr.count(16);
    ctr.count(14);
    ctr.count(14);
    ctr.count(11);

    for i in 10..20 {
        println!("saw {} values equal to {}", ctr.times_seen(i), i);
    }

    let mut strctr = Counter::new();
    strctr.count("apple");
    strctr.count("orange");
    strctr.count("apple");
    println!("got {} apples", strctr.times_seen("apple"));
}

Solution

use std::collections::HashMap;
use std::hash::Hash;

/// Counter counts the number of times each value of type T has been seen.
struct Counter<T: Eq + Hash> {
    values: HashMap<T, u64>,
}

impl<T: Eq + Hash> Counter<T> {
    /// Create a new Counter.
    fn new() -> Self {
        Counter { values: HashMap::new() }
    }

    /// Count an occurrence of the given value.
    fn count(&mut self, value: T) {
        *self.values.entry(value).or_default() += 1;
    }

    /// Return the number of times the given value has been seen.
    fn times_seen(&self, value: T) -> u64 {
        self.values.get(&value).copied().unwrap_or_default()
    }
}

fn main() {
    let mut ctr = Counter::new();
    ctr.count(13);
    ctr.count(14);
    ctr.count(16);
    ctr.count(14);
    ctr.count(14);
    ctr.count(11);

    for i in 10..20 {
        println!("saw {} values equal to {}", ctr.times_seen(i), i);
    }

    let mut strctr = Counter::new();
    strctr.count("apple");
    strctr.count("orange");
    strctr.count("apple");
    println!("got {} apples", strctr.times_seen("apple"));
}

Standard Library Traits

In this segment:

This segment should take about 1 hour and 40 minutes

As with the standard-library types, spend time reviewing the documentation for each trait.

This section is long. Take a break midway through.

Comparisons

These traits support comparisons between values. All traits can be derived for types containing fields that implement these traits.

PartialEq and Eq

PartialEq is a partial equivalence relation, with required method eq and provided method ne. The == and != operators will call these methods.

struct Key {
    id: u32,
    metadata: Option<String>,
}
impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

Eq is a full equivalence relation (reflexive, symmetric, and transitive) and implies PartialEq. Functions that require full equivalence will use Eq as a trait bound.

PartialOrd and Ord

PartialOrd defines a partial ordering, with a partial_cmp method. It is used to implement the <, <=, >=, and > operators.

use std::cmp::Ordering;
#[derive(Eq, PartialEq)]
struct Citation {
    author: String,
    year: u32,
}
impl PartialOrd for Citation {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        match self.author.partial_cmp(&other.author) {
            Some(Ordering::Equal) => self.year.partial_cmp(&other.year),
            author_ord => author_ord,
        }
    }
}

Ord is a total ordering, with cmp returning Ordering.

This slide should take about 10 minutes.

PartialEq can be implemented between different types, but Eq cannot, because it is reflexive:

struct Key {
    id: u32,
    metadata: Option<String>,
}
impl PartialEq<u32> for Key {
    fn eq(&self, other: &u32) -> bool {
        self.id == *other
    }
}

In practice, it’s common to derive these traits, but uncommon to implement them.
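
As a sketch of the derived case, the struct below reuses the Citation shape from the slide and derives all four comparison traits. The cite constructor is a hypothetical helper added for brevity.

```rust
// Derived ordering compares fields in declaration order: author first, then year.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Citation {
    author: String,
    year: u32,
}

/// Hypothetical helper to keep the example short.
fn cite(author: &str, year: u32) -> Citation {
    Citation { author: author.to_string(), year }
}

fn main() {
    let mut citations =
        vec![cite("Knuth", 1997), cite("Hoare", 1969), cite("Knuth", 1968)];
    // sort requires Ord, which the derive provides.
    citations.sort();
    println!("{citations:#?}");
}
```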

Operators

Operator overloading is implemented via traits in std::ops:

#[derive(Debug, Copy, Clone)]
struct Point {
    x: i32,
    y: i32,
}

impl std::ops::Add for Point {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self { x: self.x + other.x, y: self.y + other.y }
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    println!("{:?} + {:?} = {:?}", p1, p2, p1 + p2);
}
This slide should take about 10 minutes.

Discussion points:

  • You could implement Add for &Point. In which situations is that useful?
    • Answer: Add::add consumes self. If the type T for which you are overloading the operator is not Copy, you should consider overloading the operator for &T as well. This avoids unnecessary cloning at the call site.
  • Why is Output an associated type? Could it be made a type parameter of the method?
    • Short answer: Function type parameters are controlled by the caller, but associated types (like Output) are controlled by the implementer of a trait.
  • You could implement Add for two different types, e.g. impl Add<(i32, i32)> for Point would add a tuple to a Point.
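
Two of the discussion points can be sketched in code. This variant of Point deliberately does not derive Copy, which is an assumption made here to show why the &Point implementation is useful.

```rust
use std::ops::Add;

// Unlike the slide's Point, this one is deliberately not Copy.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Adding two borrowed points leaves both operands usable afterwards.
impl Add for &Point {
    type Output = Point;

    fn add(self, other: Self) -> Point {
        Point { x: self.x + other.x, y: self.y + other.y }
    }
}

// A different right-hand side type: Point + (i32, i32).
impl Add<(i32, i32)> for Point {
    type Output = Point;

    fn add(self, (dx, dy): (i32, i32)) -> Point {
        Point { x: self.x + dx, y: self.y + dy }
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    let sum = &p1 + &p2;
    // p1 and p2 were only borrowed, so they can still be printed.
    println!("{p1:?} + {p2:?} = {sum:?}");
    println!("{:?}", sum + (1, 1));
}
```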

From and Into

Types implement From and Into to facilitate type conversions:

fn main() {
    let s = String::from("hello");
    let addr = std::net::Ipv4Addr::from([127, 0, 0, 1]);
    let one = i16::from(true);
    let bigger = i32::from(123_i16);
    println!("{s}, {addr}, {one}, {bigger}");
}

Into is automatically implemented when From is implemented:

fn main() {
    let s: String = "hello".into();
    let addr: std::net::Ipv4Addr = [127, 0, 0, 1].into();
    let one: i16 = true.into();
    let bigger: i32 = 123_i16.into();
    println!("{s}, {addr}, {one}, {bigger}");
}
This slide should take about 10 minutes.
  • That’s why it is common to implement only From, as your type will get an Into implementation too.
  • When declaring a function argument type like “anything that can be converted into a String”, the rule is the opposite: you should use Into. Your function will then accept types that implement From as well as those that only implement Into.
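
A function argument of the “anything that can be converted into a String” kind might be sketched like this. The function greet is a hypothetical example.

```rust
/// Hypothetical example: accepts anything that can be converted into a String.
fn greet(name: impl Into<String>) -> String {
    let name: String = name.into();
    format!("Hello, {name}!")
}

fn main() {
    println!("{}", greet("Alice")); // &str
    println!("{}", greet(String::from("Bob"))); // String
    println!("{}", greet('C')); // char
}
```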

Casting

Rust has no implicit type conversions, but does support explicit casts with as. These generally follow C semantics where those are defined.

fn main() {
    let value: i64 = 1000;
    println!("as u16: {}", value as u16);
    println!("as i16: {}", value as i16);
    println!("as u8: {}", value as u8);
}

The results of as are always defined in Rust and consistent across platforms. This might not match your intuition for changing sign or casting to a smaller type – check the docs, and comment for clarity.

Casting with as is a relatively sharp tool that is easy to use incorrectly, and can be a source of subtle bugs as future maintenance work changes the types that are used or the ranges of values in types. Casts are best used only when the intent is to indicate unconditional truncation (e.g. selecting the bottom 32 bits of a u64 with as u32, regardless of what was in the high bits).

For infallible casts (e.g. u32 to u64), prefer using From or Into over as to confirm that the cast is in fact infallible. For fallible casts, TryFrom and TryInto are available when you want to handle casts that fit differently from those that don’t.

This slide should take about 5 minutes.

Consider taking a break after this slide.

as is similar to a C++ static cast. Use of as in cases where data might be lost is generally discouraged, or at least deserves an explanatory comment.

This is common in casting integers to usize for use as an index.
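
The contrast between as, From, and TryFrom can be sketched as follows. The helper to_u8_saturating is hypothetical and shows only one possible way to handle values that do not fit.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts to u8, clamping at the bounds instead of
/// wrapping around like `as` does.
fn to_u8_saturating(value: i64) -> u8 {
    match u8::try_from(value) {
        Ok(v) => v,
        Err(_) if value < 0 => u8::MIN,
        Err(_) => u8::MAX,
    }
}

fn main() {
    let value: i64 = 1000;
    println!("as u8:      {}", value as u8); // truncates to the low 8 bits: 232
    println!("try_from:   {}", u8::try_from(value).is_ok());
    println!("saturating: {}", to_u8_saturating(value));
    // u32 always fits in i64, so the infallible From conversion exists.
    println!("widening:   {}", i64::from(7_u32));
}
```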

Read and Write

Using Read and BufRead, you can abstract over u8 sources:

use std::io::{BufRead, BufReader, Read, Result};

fn count_lines<R: Read>(reader: R) -> usize {
    let buf_reader = BufReader::new(reader);
    buf_reader.lines().count()
}

fn main() -> Result<()> {
    let slice: &[u8] = b"foo\nbar\nbaz\n";
    println!("lines in slice: {}", count_lines(slice));

    let file = std::fs::File::open(std::env::current_exe()?)?;
    println!("lines in file: {}", count_lines(file));
    Ok(())
}

Similarly, Write lets you abstract over u8 sinks:

use std::io::{Result, Write};

fn log<W: Write>(writer: &mut W, msg: &str) -> Result<()> {
    writer.write_all(msg.as_bytes())?;
    writer.write_all("\n".as_bytes())
}

fn main() -> Result<()> {
    let mut buffer = Vec::new();
    log(&mut buffer, "Hello")?;
    log(&mut buffer, "World")?;
    println!("Logged: {:?}", buffer);
    Ok(())
}

The Default Trait

The Default trait produces a default value for a type.

#[derive(Debug, Default)]
struct Derived {
    x: u32,
    y: String,
    z: Implemented,
}

#[derive(Debug)]
struct Implemented(String);

impl Default for Implemented {
    fn default() -> Self {
        Self("John Smith".into())
    }
}

fn main() {
    let default_struct = Derived::default();
    println!("{default_struct:#?}");

    let almost_default_struct =
        Derived { y: "Y is set!".into(), ..Derived::default() };
    println!("{almost_default_struct:#?}");

    let nothing: Option<Derived> = None;
    println!("{:#?}", nothing.unwrap_or_default());
}
This slide should take about 5 minutes.
  • It can be implemented directly or it can be derived via #[derive(Default)].
  • A derived implementation will produce a value where all fields are set to their default values.
    • This means all types in the struct must implement Default too.
  • Standard Rust types often implement Default with reasonable values (e.g. 0, "", etc.).
  • The partial struct initialization works nicely with default.
  • The Rust standard library is aware that types can implement Default and provides convenience methods that use it.
  • The .. syntax is called struct update syntax.

Closures

Closures or lambda expressions have types that cannot be named. However, they implement the special Fn, FnMut, and FnOnce traits:

fn apply_with_log(func: impl FnOnce(i32) -> i32, input: i32) -> i32 {
    println!("Calling function on {input}");
    func(input)
}

fn main() {
    let add_3 = |x| x + 3;
    println!("add_3: {}", apply_with_log(add_3, 10));
    println!("add_3: {}", apply_with_log(add_3, 20));

    let mut v = Vec::new();
    let mut accumulate = |x: i32| {
        v.push(x);
        v.iter().sum::<i32>()
    };
    println!("accumulate: {}", apply_with_log(&mut accumulate, 4));
    println!("accumulate: {}", apply_with_log(&mut accumulate, 5));

    let multiply_sum = |x| x * v.into_iter().sum::<i32>();
    println!("multiply_sum: {}", apply_with_log(multiply_sum, 3));
}
This slide should take about 20 minutes.

An Fn (e.g. add_3) neither consumes nor mutates captured values, or perhaps captures nothing at all. It can be called multiple times concurrently.

An FnMut (e.g. accumulate) might mutate captured values. You can call it multiple times, but not concurrently.

If you have an FnOnce (e.g. multiply_sum), you may only call it once. It might consume captured values.

FnMut is a subtype of FnOnce, and Fn is a subtype of both FnMut and FnOnce. That is, you can use an FnMut wherever an FnOnce is called for, and you can use an Fn wherever an FnMut or FnOnce is called for.

When you define a function that takes a closure, you should take FnOnce if you can (i.e. you call it once), or FnMut else, and last Fn. This allows the most flexibility for the caller.

In contrast, when you have a closure, the most flexible you can have is Fn (it can be passed everywhere), then FnMut, and lastly FnOnce.

The compiler also infers Copy (e.g. for add_3) and Clone (e.g. multiply_sum), depending on what the closure captures.

By default, closures capture by reference if they can. The move keyword makes them capture by value.

fn make_greeter(prefix: String) -> impl Fn(&str) {
    return move |name| println!("{} {}", prefix, name);
}

fn main() {
    let hi = make_greeter("Hi".to_string());
    hi("there");
}
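
The guidance on choosing the least restrictive bound for a closure parameter can be sketched as follows. The function names call_once, call_n_times, and count_to are hypothetical.

```rust
/// Needs only a single call, so FnOnce is the least restrictive bound.
fn call_once(f: impl FnOnce() -> String) -> String {
    f()
}

/// Calls the closure repeatedly, so it must be at least FnMut.
fn call_n_times(mut f: impl FnMut() -> u32, n: usize) -> u32 {
    let mut last = 0;
    for _ in 0..n {
        last = f();
    }
    last
}

/// Passes a closure that mutates a captured counter, which makes it FnMut.
fn count_to(n: usize) -> u32 {
    let mut counter = 0;
    call_n_times(
        || {
            counter += 1;
            counter
        },
        n,
    )
}

fn main() {
    let name = String::from("Ferris");
    // This closure gives away `name` when called, so it is only FnOnce.
    println!("{}", call_once(move || name));
    println!("{}", count_to(3));
}
```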

Exercise

In this example, you will implement the classic “ROT13” cipher. Copy this code to the playground, and implement the missing bits. Only rotate ASCII alphabetic characters, to ensure the result is still valid UTF-8.

use std::io::Read;

struct RotDecoder<R: Read> {
    input: R,
    rot: u8,
}

// Implement the `Read` trait for `RotDecoder`.

fn main() {
    let mut rot =
        RotDecoder { input: "Gb trg gb gur bgure fvqr!".as_bytes(), rot: 13 };
    let mut result = String::new();
    rot.read_to_string(&mut result).unwrap();
    println!("{}", result);
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn joke() {
        let mut rot =
            RotDecoder { input: "Gb trg gb gur bgure fvqr!".as_bytes(), rot: 13 };
        let mut result = String::new();
        rot.read_to_string(&mut result).unwrap();
        assert_eq!(&result, "To get to the other side!");
    }

    #[test]
    fn binary() {
        let input: Vec<u8> = (0..=255u8).collect();
        let mut rot = RotDecoder::<&[u8]> { input: input.as_ref(), rot: 13 };
        let mut buf = [0u8; 256];
        assert_eq!(rot.read(&mut buf).unwrap(), 256);
        for i in 0..=255 {
            if input[i] != buf[i] {
                assert!(input[i].is_ascii_alphabetic());
                assert!(buf[i].is_ascii_alphabetic());
            }
        }
    }
}

What happens if you chain two RotDecoder instances together, each rotating by 13 characters?

Solution

use std::io::Read;

struct RotDecoder<R: Read> {
    input: R,
    rot: u8,
}

impl<R: Read> Read for RotDecoder<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let size = self.input.read(buf)?;
        for b in &mut buf[..size] {
            if b.is_ascii_alphabetic() {
                let base = if b.is_ascii_uppercase() { 'A' } else { 'a' } as u8;
                *b = (*b - base + self.rot) % 26 + base;
            }
        }
        Ok(size)
    }
}

fn main() {
    let mut rot =
        RotDecoder { input: "Gb trg gb gur bgure fvqr!".as_bytes(), rot: 13 };
    let mut result = String::new();
    rot.read_to_string(&mut result).unwrap();
    println!("{}", result);
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn joke() {
        let mut rot =
            RotDecoder { input: "Gb trg gb gur bgure fvqr!".as_bytes(), rot: 13 };
        let mut result = String::new();
        rot.read_to_string(&mut result).unwrap();
        assert_eq!(&result, "To get to the other side!");
    }

    #[test]
    fn binary() {
        let input: Vec<u8> = (0..=255u8).collect();
        let mut rot = RotDecoder::<&[u8]> { input: input.as_ref(), rot: 13 };
        let mut buf = [0u8; 256];
        assert_eq!(rot.read(&mut buf).unwrap(), 256);
        for i in 0..=255 {
            if input[i] != buf[i] {
                assert!(input[i].is_ascii_alphabetic());
                assert!(buf[i].is_ascii_alphabetic());
            }
        }
    }
}
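
The exercise's question about chaining two decoders can be explored with a sketch like the one below. It repeats the RotDecoder from the solution so that it is self-contained; the function rot13_twice is a hypothetical helper.

```rust
use std::io::Read;

struct RotDecoder<R: Read> {
    input: R,
    rot: u8,
}

impl<R: Read> Read for RotDecoder<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let size = self.input.read(buf)?;
        for b in &mut buf[..size] {
            if b.is_ascii_alphabetic() {
                let base = if b.is_ascii_uppercase() { b'A' } else { b'a' };
                *b = (*b - base + self.rot) % 26 + base;
            }
        }
        Ok(size)
    }
}

/// Hypothetical helper: runs `text` through two chained decoders,
/// each rotating by 13.
fn rot13_twice(text: &str) -> String {
    // RotDecoder implements Read itself, so it can wrap another RotDecoder.
    let inner = RotDecoder { input: text.as_bytes(), rot: 13 };
    let mut outer = RotDecoder { input: inner, rot: 13 };
    let mut result = String::new();
    outer.read_to_string(&mut result).unwrap();
    result
}

fn main() {
    println!("{}", rot13_twice("To get to the other side!"));
}
```

Because 13 + 13 = 26 and the rotation is taken modulo 26, the two decoders cancel out and the original text comes back.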

Welcome to Day 3

Today, we will cover:

  • Memory management, lifetimes, and the borrow checker: how Rust ensures memory safety.
  • Smart pointers: standard library pointer types.

Schedule

In this session:

Including 10 minute breaks, this session should take about 2 hours and 15 minutes

Memory Management

In this segment:

This segment should take about 1 hour and 10 minutes

Review of Program Memory

Programs allocate memory in two ways:

  • Stack: Continuous area of memory for local variables.

    • Values have fixed sizes known at compile time.
    • Extremely fast: just move a stack pointer.
    • Easy to manage: follows function calls.
    • Great memory locality.
  • Heap: Storage of values outside of function calls.

    • Values have dynamic sizes determined at runtime.
    • Slightly slower than the stack: some bookkeeping is needed.
    • No guarantee of memory locality.

Example

Creating a String puts fixed-size metadata on the stack and dynamically sized data (the actual string) on the heap:

fn main() {
    let s1 = String::from("Hello");
}
[Diagram: on the stack, s1 holds ptr, len = 5, and capacity = 5; ptr points to the bytes "Hello" on the heap.]
This slide should take about 5 minutes.
  • Mention that a String is backed by a Vec, so it has a capacity and a length and, if mutable, can grow via reallocation on the heap.

  • If students ask about it, you can mention that the underlying memory is heap allocated using the System Allocator and that custom allocators can be implemented using the Allocator API.
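
Growth via reallocation can be observed indirectly by watching the capacity. The sketch below uses a hypothetical helper, count_capacity_changes; the exact growth strategy is an implementation detail of the standard library, so the printed counts may differ between versions.

```rust
/// Hypothetical helper: appends `n` one-byte characters to a String created
/// with `initial_capacity` and counts how often the capacity changed.
fn count_capacity_changes(initial_capacity: usize, n: usize) -> usize {
    let mut s = String::with_capacity(initial_capacity);
    let mut capacity = s.capacity();
    let mut changes = 0;
    for _ in 0..n {
        s.push('x');
        if s.capacity() != capacity {
            // The buffer grew, which generally means a new heap allocation.
            capacity = s.capacity();
            changes += 1;
        }
    }
    changes
}

fn main() {
    println!("starting empty:  {} changes", count_capacity_changes(0, 1000));
    println!("preallocated:    {} changes", count_capacity_changes(1000, 1000));
}
```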

More to Explore

We can inspect the memory layout with unsafe Rust. However, you should point out that this is rightfully unsafe!

fn main() {
    let mut s1 = String::from("Hello");
    s1.push(' ');
    s1.push_str("world");
    // DON'T DO THIS AT HOME! For educational purposes only.
    // String provides no guarantees about its layout, so this could lead to
    // undefined behavior.
    unsafe {
        let (ptr, capacity, len): (usize, usize, usize) = std::mem::transmute(s1);
        println!("ptr = {ptr:#x}, len = {len}, capacity = {capacity}");
    }
}

Approaches to Memory Management

Traditionally, languages have fallen into two broad categories:

  • Full control via manual memory management: C, C++, Pascal, etc.
    • The programmer decides when to allocate or free heap memory.
    • The programmer must determine whether a pointer still points to valid memory.
    • Studies show that programmers make mistakes.
  • Full safety via automatic memory management at runtime: Java, Python, Go, Haskell, etc.
    • A runtime system ensures that memory is not freed until it can no longer be referenced.
    • Typically implemented with reference counting, garbage collection, or RAII.

Rust offers a mix of both:

Full control and safety, because the compiler enforces correct memory management.

It does this with an explicit ownership concept.

This slide should take about 10 minutes.

This slide is intended to help students coming from other languages to put Rust in context.

  • C must manage heap manually with malloc and free. Common errors include forgetting to call free, calling it multiple times for the same pointer, or dereferencing a pointer after the memory it points to has been freed.

  • C++ has tools like smart pointers (unique_ptr, shared_ptr) that take advantage of language guarantees about calling destructors to ensure memory is freed when a function returns. It is still quite easy to mis-use these tools and create similar bugs to C.

  • Java, Go, and Python rely on the garbage collector to identify memory that is no longer reachable and discard it. This guarantees that any pointer can be dereferenced, eliminating use-after-free and other classes of bugs. But, GC has a runtime cost and is difficult to tune properly.

Rust’s ownership and borrowing model can, in many cases, get the performance of C, with alloc and free operations precisely where they are required – zero cost. It also provides tools similar to C++’s smart pointers. When required, other options such as reference counting are available, and there are even third-party crates available to support runtime garbage collection (not covered in this class).

Ownership

All variable bindings have a scope where they are valid, and it is an error to use a variable outside of its scope:

struct Point(i32, i32);

fn main() {
    {
        let p = Point(3, 4);
        println!("x: {}", p.0);
    }
    println!("y: {}", p.1);
}

We say that the variable owns the value. Every Rust value has precisely one owner at all times.

At the end of the scope, the variable is dropped and the data is freed. A destructor can run here to free up resources.

This slide should take about 5 minutes.

Students familiar with garbage-collection implementations will know that a garbage collector starts with a set of “roots” to find all reachable memory. Rust’s “single owner” principle is a similar idea.

Move Semantics

An assignment will transfer ownership between variables:

fn main() {
    let s1: String = String::from("Hello!");
    let s2: String = s1;
    println!("s2: {s2}");
    // println!("s1: {s1}");
}
  • The assignment of s1 to s2 transfers ownership.
  • When s1 goes out of scope, nothing happens: it does not own anything.
  • When s2 goes out of scope, the string data is freed.

Before the move to s2:

[Diagram: on the stack, s1 holds ptr, len = 4, and capacity = 4; ptr points to the bytes "Rust" on the heap.]

After the move to s2:

[Diagram: s1 (ptr, len = 4, capacity = 4) is marked inaccessible; s2 holds ptr, len = 4, and capacity = 4, and its ptr points to the bytes "Rust" on the heap.]

When you pass a value to a function, the value is assigned to the function parameter. This transfers ownership:

fn say_hello(name: String) {
    println!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    say_hello(name);
    // say_hello(name);
}
This slide should take about 10 minutes.
  • Mention that this is the opposite of the defaults in C++, which copies by value unless you use std::move (and the move constructor is defined).

  • It is only the ownership that moves. Whether any machine code is generated to manipulate the data itself is a matter of optimization, and such copies are aggressively optimized away.

  • Simple values (such as integers) can be marked Copy (see later slides).

  • In Rust, clones are explicit (by using clone).

In the say_hello example:

  • With the first call to say_hello, main gives up ownership of name. Afterwards, name can no longer be used within main.
  • The heap memory allocated for name will be freed at the end of the say_hello function.
  • main can retain ownership if it passes name as a reference (&name) and if say_hello accepts a reference as a parameter.
  • Alternatively, main can pass a clone of name in the first call (name.clone()).
  • Rust makes it harder than C++ to create copies inadvertently by making move semantics the default and by forcing programmers to make clones explicit.
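
The three options described for the say_hello example (borrow, clone, move) can be sketched side by side. The variant say_hello_ref is a hypothetical addition, and both functions return the greeting instead of printing it so that the result can be inspected.

```rust
/// Takes ownership: the caller can no longer use the argument afterwards.
fn say_hello(name: String) -> String {
    format!("Hello {name}")
}

/// Hypothetical variant that borrows: the caller keeps ownership.
fn say_hello_ref(name: &str) -> String {
    format!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    println!("{}", say_hello_ref(&name)); // borrow: main still owns name
    println!("{}", say_hello(name.clone())); // explicit copy of the heap data
    println!("{}", say_hello(name)); // move: name is unusable from here on
}
```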

More to Explore

Defensive Copies in Modern C++

Modern C++ solves this differently:

std::string s1 = "Cpp";
std::string s2 = s1;  // Duplicate the data in s1.
  • The heap data from s1 is duplicated and s2 gets its own independent copy.
  • When s1 and s2 go out of scope, they each free their own memory.

Before the copy-assignment:

[Diagram: on the stack, s1 holds ptr, len = 3, and capacity = 3; ptr points to the bytes "Cpp" on the heap.]

After the copy-assignment:

[Diagram: s1 and s2 each hold ptr, len = 3, and capacity = 3; each ptr points to its own copy of the bytes "Cpp" on the heap.]

Key points:

  • C++ has made a slightly different choice than Rust. Because = copies data, the string data has to be cloned. Otherwise we would get a double-free when either string goes out of scope.

  • C++ also has std::move, which is used to indicate when a value may be moved from. If the example had been s2 = std::move(s1), no heap allocation would take place. After the move, s1 would be in a valid but unspecified state. Unlike Rust, the programmer is allowed to keep using s1.

  • Unlike Rust, = in C++ can run arbitrary code as determined by the type which is being copied or moved.

Clone

Sometimes you want to make a copy of a value. The Clone trait accomplishes this.

#[derive(Default)]
struct Backends {
    hostnames: Vec<String>,
    weights: Vec<f64>,
}

impl Backends {
    fn set_hostnames(&mut self, hostnames: &Vec<String>) {
        self.hostnames = hostnames.clone();
        self.weights = hostnames.iter().map(|_| 1.0).collect();
    }
}
This slide should take about 2 minutes.

The idea of Clone is to make it easy to spot where heap allocations are occurring. Look for .clone() and a few others like Vec::new or Box::new.

It’s common to “clone your way out” of problems with the borrow checker, and return later to try to optimize those clones away.

Copy Types

While move semantics are the default, certain types are copied by default:

fn main() {
    let x = 42;
    let y = x;
    println!("x: {x}"); // would not be accessible if not Copy
    println!("y: {y}");
}

These types implement the Copy trait.

You can opt in your own types to use copy semantics:

#[derive(Copy, Clone, Debug)]
struct Point(i32, i32);

fn main() {
    let p1 = Point(3, 4);
    let p2 = p1;
    println!("p1: {p1:?}");
    println!("p2: {p2:?}");
}
  • After the assignment, both p1 and p2 own their own data.
  • We can also use p1.clone() to copy the data explicitly.
This slide should take about 5 minutes.

Copying and cloning are not the same thing:

  • Copying refers to bitwise copies of memory regions and does not work on arbitrary objects.
  • Copying does not allow for custom logic (unlike copy constructors in C++).
  • Cloning is a more general operation and allows for custom behavior by implementing the Clone trait.
  • Copying does not work on types that implement the Drop trait.

In the above example, try the following:

  • Add a String field to struct Point. It will not compile because String is not a Copy type.
  • Remove Copy from the derive attribute. The compiler error is now in the println! for p1.
  • Show that it works if you clone p1 instead.
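
The difference between the two traits can be sketched as follows. `Tracked` is a hypothetical type invented for this illustration: its hand-written `Clone` runs custom logic, which a bitwise `Copy` could never do.

```rust
/// Hypothetical type that records how many clones separate it from the original.
#[derive(Debug)]
struct Tracked {
    label: String,
    generation: u32,
}

impl Clone for Tracked {
    fn clone(&self) -> Self {
        // Custom logic: the clone is not a bitwise duplicate of `self`.
        Tracked { label: self.label.clone(), generation: self.generation + 1 }
    }
}

#[derive(Copy, Clone, Debug, PartialEq)]
struct Point(i32, i32);

fn main() {
    let p1 = Point(3, 4);
    let p2 = p1; // Bitwise copy: p1 stays usable.
    println!("{p1:?} {p2:?}");

    let t1 = Tracked { label: String::from("original"), generation: 0 };
    let t2 = t1.clone(); // Explicit call that runs the code above.
    println!("{t1:?} {t2:?}");
}
```

`Tracked` cannot derive `Copy` because it contains a `String`, so `let t2 = t1;` would move the value instead of duplicating it.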

The Drop Trait

Values which implement Drop can specify code to run when they go out of scope:

struct Droppable {
    name: &'static str,
}

impl Drop for Droppable {
    fn drop(&mut self) {
        println!("Dropping {}", self.name);
    }
}

fn main() {
    let a = Droppable { name: "a" };
    {
        let b = Droppable { name: "b" };
        {
            let c = Droppable { name: "c" };
            let d = Droppable { name: "d" };
            println!("Exiting block B");
        }
        println!("Exiting block A");
    }
    drop(a);
    println!("Exiting main");
}
This slide should take about 10 minutes.
  • Note that std::mem::drop is not the same as std::ops::Drop::drop.
  • Values are automatically dropped when they go out of scope.
  • When a value is dropped, if it implements std::ops::Drop then its Drop::drop implementation will be called.
  • All its fields will then be dropped too, whether or not it implements Drop.
  • std::mem::drop is just an empty function that takes any value. The significance is that it takes ownership of the value, so at the end of its scope it gets dropped. This makes it a convenient way to explicitly drop values earlier than they would otherwise go out of scope.
    • This can be useful for objects that do some work on drop: releasing locks, closing files, etc.

Discussion points:

  • Why doesn’t Drop::drop take self?
    • Short answer: if it did, std::mem::drop would be called at the end of the block, resulting in another call to Drop::drop and a stack overflow.
  • Try replacing drop(a) with a.drop().
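
One way to see why dropping early matters is a lock guard. The sketch below uses `std::sync::Mutex`, which this part of the course has not introduced yet, and `lock_states` is a hypothetical helper name. It probes the lock with `try_lock` while the guard is alive and again after the guard has been passed to `drop`.

```rust
use std::sync::Mutex;

/// Returns whether the lock could be acquired (a) while a guard was still
/// alive and (b) after that guard was passed to `drop`.
fn lock_states() -> (bool, bool) {
    let m = Mutex::new(0);
    let guard = m.lock().unwrap();
    // The guard still holds the lock, so a second attempt fails.
    let while_held = m.try_lock().is_ok();
    // `drop` takes ownership of the guard; the lock is released here,
    // not at the end of the function.
    drop(guard);
    let after_drop = m.try_lock().is_ok();
    (while_held, after_drop)
}

fn main() {
    let (while_held, after_drop) = lock_states();
    println!("acquired while held: {while_held}");
    println!("acquired after drop: {after_drop}");
}
```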

Exercise: Builder Type

In this example, we will implement a complex data type that owns all of its data. We will use the “builder pattern” to support building a new value piece-by-piece, using convenience functions.

Fill in the missing pieces.

#[derive(Debug)]
enum Language {
    Rust,
    Java,
    Perl,
}

#[derive(Clone, Debug)]
struct Dependency {
    name: String,
    version_expression: String,
}

/// A representation of a software package.
#[derive(Debug)]
struct Package {
    name: String,
    version: String,
    authors: Vec<String>,
    dependencies: Vec<Dependency>,
    language: Option<Language>,
}

impl Package {
    /// Return a representation of this package as a dependency, for use in
    /// building other packages.
    fn as_dependency(&self) -> Dependency {
        todo!("1")
    }
}

/// A builder for a Package. Use `build()` to create the `Package` itself.
struct PackageBuilder(Package);

impl PackageBuilder {
    fn new(name: impl Into<String>) -> Self {
        todo!("2")
    }

    /// Set the package version.
    fn version(mut self, version: impl Into<String>) -> Self {
        self.0.version = version.into();
        self
    }

    /// Set the package authors.
    fn authors(mut self, authors: Vec<String>) -> Self {
        todo!("3")
    }

    /// Add an additional dependency.
    fn dependency(mut self, dependency: Dependency) -> Self {
        todo!("4")
    }

    /// Set the language. If not set, language defaults to None.
    fn language(mut self, language: Language) -> Self {
        todo!("5")
    }

    fn build(self) -> Package {
        self.0
    }
}

fn main() {
    let base64 = PackageBuilder::new("base64").version("0.13").build();
    println!("base64: {base64:?}");
    let log =
        PackageBuilder::new("log").version("0.4").language(Language::Rust).build();
    println!("log: {log:?}");
    let serde = PackageBuilder::new("serde")
        .authors(vec!["djmitche".into()])
        .version(String::from("4.0"))
        .dependency(base64.as_dependency())
        .dependency(log.as_dependency())
        .build();
    println!("serde: {serde:?}");
}

Solution

#[derive(Debug)]
enum Language {
    Rust,
    Java,
    Perl,
}

#[derive(Clone, Debug)]
struct Dependency {
    name: String,
    version_expression: String,
}

/// A representation of a software package.
#[derive(Debug)]
struct Package {
    name: String,
    version: String,
    authors: Vec<String>,
    dependencies: Vec<Dependency>,
    language: Option<Language>,
}

impl Package {
    /// Return a representation of this package as a dependency, for use in
    /// building other packages.
    fn as_dependency(&self) -> Dependency {
        Dependency {
            name: self.name.clone(),
            version_expression: self.version.clone(),
        }
    }
}

/// A builder for a Package. Use `build()` to create the `Package` itself.
struct PackageBuilder(Package);

impl PackageBuilder {
    fn new(name: impl Into<String>) -> Self {
        Self(Package {
            name: name.into(),
            version: "0.1".into(),
            authors: vec![],
            dependencies: vec![],
            language: None,
        })
    }

    /// Set the package version.
    fn version(mut self, version: impl Into<String>) -> Self {
        self.0.version = version.into();
        self
    }

    /// Set the package authors.
    fn authors(mut self, authors: Vec<String>) -> Self {
        self.0.authors = authors;
        self
    }

    /// Add an additional dependency.
    fn dependency(mut self, dependency: Dependency) -> Self {
        self.0.dependencies.push(dependency);
        self
    }

    /// Set the language. If not set, language defaults to None.
    fn language(mut self, language: Language) -> Self {
        self.0.language = Some(language);
        self
    }

    fn build(self) -> Package {
        self.0
    }
}

fn main() {
    let base64 = PackageBuilder::new("base64").version("0.13").build();
    println!("base64: {base64:?}");
    let log =
        PackageBuilder::new("log").version("0.4").language(Language::Rust).build();
    println!("log: {log:?}");
    let serde = PackageBuilder::new("serde")
        .authors(vec!["djmitche".into()])
        .version(String::from("4.0"))
        .dependency(base64.as_dependency())
        .dependency(log.as_dependency())
        .build();
    println!("serde: {serde:?}");
}

Smart Pointers

In this segment:

This segment should take about 45 minutes

Box<T>

Box is an owned pointer to data on the heap:

fn main() {
    let five = Box::new(5);
    println!("five: {}", *five);
}
[Diagram: the stack variable five points to the value 5 on the heap.]

Box<T> implements Deref<Target = T>, which means that you can call methods from T directly on a Box<T>.

Recursive data types or data types with dynamic sizes need to use a Box:

#[derive(Debug)]
enum List<T> {
    /// A non-empty list: first element and the rest of the list.
    Element(T, Box<List<T>>),
    /// An empty list.
    Nil,
}

fn main() {
    let list: List<i32> =
        List::Element(1, Box::new(List::Element(2, Box::new(List::Nil))));
    println!("{list:?}");
}
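[Diagram: on the stack, list holds Element 1 and a pointer; the pointer leads to Element 2 on the heap, which in turn points to Nil on the heap.]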
This slide should take about 10 minutes.
  • Box is like std::unique_ptr in C++, except that it’s guaranteed to be not null.

  • A Box can be useful when you:

    • have a type whose size cannot be known at compile time, but the Rust compiler wants to know an exact size.
    • want to transfer ownership of a large amount of data. To avoid copying large amounts of data on the stack, store the data on the heap in a Box so that only the pointer is moved.
  • If Box were not used and we tried to embed a List directly into the List, the compiler could not compute a fixed size for the struct in memory (the List would be of infinite size).

  • Box solves this problem because it has the same size as a regular pointer and just points at the next element of the List on the heap.

  • Remove the Box in the List definition and show the compiler error. The “recursive without indirection” note is a hint that you might want to use a Box or a reference of some kind instead of storing the value directly.
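
The points about size and indirection can be checked with a short sketch. The `sum` helper is an assumption added for illustration. It walks the list by following each Box, and `main` prints the size of the Box next to the size of `usize`.

```rust
use std::mem::size_of;

#[derive(Debug)]
enum List<T> {
    /// A non-empty list: first element and the rest of the list.
    Element(T, Box<List<T>>),
    /// An empty list.
    Nil,
}

/// Hypothetical helper: adds up the elements by following the boxes.
fn sum(list: &List<i32>) -> i32 {
    match list {
        // `rest` is a `&Box<List<i32>>`; deref coercion turns it into `&List<i32>`.
        List::Element(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Element(1, Box::new(List::Element(2, Box::new(List::Nil))));
    println!("{list:?} sums to {}", sum(&list));
    // A Box of a sized type is one pointer wide, whatever it points to.
    println!(
        "Box<List<i32>>: {} bytes, usize: {} bytes",
        size_of::<Box<List<i32>>>(),
        size_of::<usize>()
    );
}
```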

More to Explore

Niche Optimization

#[derive(Debug)]
enum List<T> {
    Element(T, Box<List<T>>),
    Nil,
}

fn main() {
    let list: List<i32> =
        List::Element(1, Box::new(List::Element(2, Box::new(List::Nil))));
    println!("{list:?}");
}

A Box cannot be empty, so the pointer is always valid and non-null. This allows the compiler to optimize the memory layout:

[Diagram: on the stack, list holds Element 1 and a pointer to Element 2 on the heap; no separate Nil node appears.]
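
The effect is easiest to observe on `Option<Box<T>>`, where the unused null pointer value stands for `None`. The following is a small sketch of that check; the `sizes` helper is a name made up for this example.

```rust
use std::mem::size_of;

/// Hypothetical helper: sizes of `Box<i32>` and `Option<Box<i32>>` in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Box<i32>>(), size_of::<Option<Box<i32>>>())
}

fn main() {
    let (plain, optional) = sizes();
    // Because a Box is never null, `None` can reuse the null bit pattern
    // and the Option needs no extra tag.
    println!("Box<i32>:         {plain} bytes");
    println!("Option<Box<i32>>: {optional} bytes");
}
```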

Rc

Rc is a reference-counted shared pointer. Use it when you need to refer to the same data from multiple places:

use std::rc::Rc;

fn main() {
    let a = Rc::new(10);
    let b = Rc::clone(&a);

    println!("a: {a}");
    println!("b: {b}");
}
  • See Arc and Mutex if you are in a multi-threaded context.
  • You can downgrade a shared pointer into a Weak pointer to create cycles that will get dropped.
This slide should take about 5 minutes.
  • Rc’s count ensures that its contained value is valid for as long as there are references.
  • Rc in Rust is like std::shared_ptr in C++.
  • Rc::clone is cheap: it creates a pointer to the same allocation and increases the reference count. It does not make a deep clone and can generally be ignored when looking for performance issues in code.
  • make_mut clones the inner value if necessary (“clone-on-write”) and returns a mutable reference.
  • Use Rc::strong_count to check the reference count.
  • Rc::downgrade gives you a weakly reference-counted object to create cycles that will be dropped properly (likely in combination with RefCell).
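
The notes on make_mut and Weak can be tried out with a sketch like the one below. The helper names `copy_on_write` and `weak_upgrade` are assumptions chosen for this illustration.

```rust
use std::rc::Rc;

/// Mutates one of two handles to the same value and returns both values.
fn copy_on_write() -> (i32, i32) {
    let mut a = Rc::new(10);
    let b = Rc::clone(&a);
    assert_eq!(Rc::strong_count(&a), 2);
    // The value is shared, so make_mut clones it before handing out `&mut`.
    *Rc::make_mut(&mut a) += 1;
    (*a, *b)
}

/// Reports whether a Weak pointer can be upgraded before and after the
/// last strong pointer is dropped.
fn weak_upgrade() -> (bool, bool) {
    let strong = Rc::new(String::from("shared"));
    let weak = Rc::downgrade(&strong);
    let before = weak.upgrade().is_some();
    drop(strong);
    let after = weak.upgrade().is_some();
    (before, after)
}

fn main() {
    println!("after make_mut: {:?}", copy_on_write());
    println!("upgrade before/after drop: {:?}", weak_upgrade());
}
```

A Weak pointer does not keep the value alive, which is why it is the usual tool for back-references in structures that would otherwise form a cycle.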

Exercise: Binary Tree

A binary tree is a tree-type data structure where every node has up to two children (left and right). We will create a tree where each node stores a value. For a given node N, all nodes in N’s left subtree contain smaller values, and all nodes in N’s right subtree contain larger values.

Implement the following types, so that the given tests pass.

Extra Credit: implement an iterator over a binary tree that returns the values in order.

/// A node in the binary tree.
#[derive(Debug)]
struct Node<T: Ord> {
    value: T,
    left: Subtree<T>,
    right: Subtree<T>,
}

/// A possibly-empty subtree.
#[derive(Debug)]
struct Subtree<T: Ord>(Option<Box<Node<T>>>);

/// A container storing a set of values, using a binary tree.
///
/// If the same value is added multiple times, it is only stored once.
#[derive(Debug)]
pub struct BinaryTree<T: Ord> {
    root: Subtree<T>,
}

// Implement `new`, `insert`, `len`, and `has`.

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn len() {
        let mut tree = BinaryTree::new();
        assert_eq!(tree.len(), 0);
        tree.insert(2);
        assert_eq!(tree.len(), 1);
        tree.insert(1);
        assert_eq!(tree.len(), 2);
        tree.insert(2); // not a unique item
        assert_eq!(tree.len(), 2);
    }

    #[test]
    fn has() {
        let mut tree = BinaryTree::new();
        fn check_has(tree: &BinaryTree<i32>, exp: &[bool]) {
            let got: Vec<bool> =
                (0..exp.len()).map(|i| tree.has(&(i as i32))).collect();
            assert_eq!(&got, exp);
        }

        check_has(&tree, &[false, false, false, false, false]);
        tree.insert(0);
        check_has(&tree, &[true, false, false, false, false]);
        tree.insert(4);
        check_has(&tree, &[true, false, false, false, true]);
        tree.insert(4);
        check_has(&tree, &[true, false, false, false, true]);
        tree.insert(3);
        check_has(&tree, &[true, false, false, true, true]);
    }

    #[test]
    fn unbalanced() {
        let mut tree = BinaryTree::new();
        for i in 0..100 {
            tree.insert(i);
        }
        assert_eq!(tree.len(), 100);
        assert!(tree.has(&50));
    }
}

Solution

use std::cmp::Ordering;

/// A node in the binary tree.
#[derive(Debug)]
struct Node<T: Ord> {
    value: T,
    left: Subtree<T>,
    right: Subtree<T>,
}

/// A possibly-empty subtree.
#[derive(Debug)]
struct Subtree<T: Ord>(Option<Box<Node<T>>>);

/// A container storing a set of values, using a binary tree.
///
/// If the same value is added multiple times, it is only stored once.
#[derive(Debug)]
pub struct BinaryTree<T: Ord> {
    root: Subtree<T>,
}

impl<T: Ord> BinaryTree<T> {
    fn new() -> Self {
        Self { root: Subtree::new() }
    }

    fn insert(&mut self, value: T) {
        self.root.insert(value);
    }

    fn has(&self, value: &T) -> bool {
        self.root.has(value)
    }

    fn len(&self) -> usize {
        self.root.len()
    }
}

impl<T: Ord> Subtree<T> {
    fn new() -> Self {
        Self(None)
    }

    fn insert(&mut self, value: T) {
        match &mut self.0 {
            None => self.0 = Some(Box::new(Node::new(value))),
            Some(n) => match value.cmp(&n.value) {
                Ordering::Less => n.left.insert(value),
                Ordering::Equal => {}
                Ordering::Greater => n.right.insert(value),
            },
        }
    }

    fn has(&self, value: &T) -> bool {
        match &self.0 {
            None => false,
            Some(n) => match value.cmp(&n.value) {
                Ordering::Less => n.left.has(value),
                Ordering::Equal => true,
                Ordering::Greater => n.right.has(value),
            },
        }
    }

    fn len(&self) -> usize {
        match &self.0 {
            None => 0,
            Some(n) => 1 + n.left.len() + n.right.len(),
        }
    }
}

impl<T: Ord> Node<T> {
    fn new(value: T) -> Self {
        Self { value, left: Subtree::new(), right: Subtree::new() }
    }
}

fn main() {
    let mut tree = BinaryTree::new();
    tree.insert("foo");
    assert_eq!(tree.len(), 1);
    tree.insert("bar");
    assert!(tree.has(&"foo"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn len() {
        let mut tree = BinaryTree::new();
        assert_eq!(tree.len(), 0);
        tree.insert(2);
        assert_eq!(tree.len(), 1);
        tree.insert(1);
        assert_eq!(tree.len(), 2);
        tree.insert(2); // not a unique item
        assert_eq!(tree.len(), 2);
    }

    #[test]
    fn has() {
        let mut tree = BinaryTree::new();
        fn check_has(tree: &BinaryTree<i32>, exp: &[bool]) {
            let got: Vec<bool> =
                (0..exp.len()).map(|i| tree.has(&(i as i32))).collect();
            assert_eq!(&got, exp);
        }

        check_has(&tree, &[false, false, false, false, false]);
        tree.insert(0);
        check_has(&tree, &[true, false, false, false, false]);
        tree.insert(4);
        check_has(&tree, &[true, false, false, false, true]);
        tree.insert(4);
        check_has(&tree, &[true, false, false, false, true]);
        tree.insert(3);
        check_has(&tree, &[true, false, false, true, true]);
    }

    #[test]
    fn unbalanced() {
        let mut tree = BinaryTree::new();
        for i in 0..100 {
            tree.insert(i);
        }
        assert_eq!(tree.len(), 100);
        assert!(tree.has(&50));
    }
}

Welcome Back

In this session:

Including 10 minute breaks, this session should take about 2 hours and 20 minutes

Borrowing

In this segment:

This segment should take about 1 hour

Borrowing a Value

As we saw before, instead of transferring ownership when calling a function, you can let a function borrow the value:

#[derive(Debug)]
struct Point(i32, i32);

fn add(p1: &Point, p2: &Point) -> Point {
    Point(p1.0 + p2.0, p1.1 + p2.1)
}

fn main() {
    let p1 = Point(3, 4);
    let p2 = Point(10, 20);
    let p3 = add(&p1, &p2);
    println!("{p1:?} + {p2:?} = {p3:?}");
}
  • The add function borrows two points and returns a new point.
  • The caller retains ownership of the inputs.
This slide should take about 10 minutes.

This slide is a review of the material on references from day 1, expanding slightly to include function arguments and return values.

More to Explore

Notes on stack returns:

  • Demonstrate that the return from add is cheap because the compiler can eliminate the copy operation. Change the above code to print stack addresses and run it on the Playground or look at the assembly in Godbolt. In the “DEBUG” optimization level, the addresses should change, while they stay the same when changing to the “RELEASE” setting:

    #[derive(Debug)]
    struct Point(i32, i32);
    
    fn add(p1: &Point, p2: &Point) -> Point {
        let p = Point(p1.0 + p2.0, p1.1 + p2.1);
        println!("&p.0: {:p}", &p.0);
        p
    }
    
    pub fn main() {
        let p1 = Point(3, 4);
        let p2 = Point(10, 20);
        let p3 = add(&p1, &p2);
        println!("&p3.0: {:p}", &p3.0);
        println!("{p1:?} + {p2:?} = {p3:?}");
    }
  • The Rust compiler can do return value optimization (RVO).

  • In C++, copy elision has to be defined in the language specification because constructors can have side effects. In Rust, this is not an issue. If RVO does not happen, Rust always performs a simple and efficient memcpy copy.

Borrow Checking

Rust’s borrow checker puts constraints on the ways you can borrow values. For a given value, at any time:

  • You can have one or more shared references to the value, or
  • You can have exactly one exclusive reference to the value.
fn main() {
    let mut a: i32 = 10;
    let b: &i32 = &a;

    {
        let c: &mut i32 = &mut a;
        *c = 20;
    }

    println!("a: {a}");
    println!("b: {b}");
}
This slide should take about 10 minutes.
  • Note that the requirement is that conflicting references not exist at the same point. It does not matter where the reference is dereferenced.
  • The above code does not compile because a is borrowed as mutable (through c) and as immutable (through b) at the same time.
  • Move the println! statement for b before the scope that introduces c to make the code compile.
  • After that change, the compiler realizes that b is only ever used before the new mutable borrow of a through c. This is a feature of the borrow checker called “non-lexical lifetimes”.
  • The exclusive reference constraint is quite strong. Rust uses it to ensure that data races do not occur. Rust also relies on this constraint to optimize code. For example, a value behind a shared reference can be safely cached in a register for the lifetime of that reference.
  • The borrow checker is designed to accommodate many common patterns, such as taking exclusive references to different fields in a struct at the same time. But, there are some situations where it doesn’t quite “get it” and this often results in “fighting with the borrow checker.”
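
The two situations described above can be sketched in a few lines. `read_then_write` is the slide’s example with the last use of b moved before the exclusive borrow, and `swap_fields` takes exclusive references to two different fields at once. Both helper names and the `Pair` struct are assumptions for illustration.

```rust
struct Pair {
    left: i32,
    right: i32,
}

/// Exclusive borrows of two different fields may overlap.
fn swap_fields(pair: &mut Pair) {
    let l = &mut pair.left;
    let r = &mut pair.right;
    std::mem::swap(l, r);
}

/// The shared borrow `b` is last used before the exclusive borrow `c`
/// begins, so the borrow checker accepts this.
fn read_then_write() -> (i32, i32) {
    let mut a: i32 = 10;
    let b: &i32 = &a;
    let seen = *b; // Last use of `b`.
    {
        let c: &mut i32 = &mut a;
        *c = 20;
    }
    (seen, a)
}

fn main() {
    let mut pair = Pair { left: 1, right: 2 };
    swap_fields(&mut pair);
    println!("left: {}, right: {}", pair.left, pair.right);
    println!("{:?}", read_then_write());
}
```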

Interior Mutability

In some situations, it’s necessary to modify data behind a shared (read-only) reference. For example, a shared data structure might have an internal cache, and wish to update that cache from read-only methods.

The “interior mutability” pattern allows exclusive (mutable) access behind a shared reference. The standard library provides several ways to do this, all while still ensuring safety, typically by performing a runtime check.

RefCell

use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Default)]
struct Node {
    value: i64,
    children: Vec<Rc<RefCell<Node>>>,
}

impl Node {
    fn new(value: i64) -> Rc<RefCell<Node>> {
        Rc::new(RefCell::new(Node { value, ..Node::default() }))
    }

    fn sum(&self) -> i64 {
        self.value + self.children.iter().map(|c| c.borrow().sum()).sum::<i64>()
    }
}

fn main() {
    let root = Node::new(1);
    root.borrow_mut().children.push(Node::new(5));
    let subtree = Node::new(10);
    subtree.borrow_mut().children.push(Node::new(11));
    subtree.borrow_mut().children.push(Node::new(12));
    root.borrow_mut().children.push(subtree);

    println!("graph: {root:#?}");
    println!("graph sum: {}", root.borrow().sum());
}

Cell

Cell wraps a value and allows getting or setting the value, even with a shared reference to the Cell. However, it does not allow any references to the value. Since there are no references, borrowing rules cannot be broken.
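
A minimal sketch of Cell, assuming a hypothetical `Counter` type that wants to record how often it is read even though `get` only takes `&self`:

```rust
use std::cell::Cell;

/// Hypothetical type that counts how often its value is read.
struct Counter {
    value: i32,
    reads: Cell<u32>,
}

impl Counter {
    fn get(&self) -> i32 {
        // `Cell::get` copies the value out and `Cell::set` moves a new one
        // in; no reference to the contents is ever created.
        self.reads.set(self.reads.get() + 1);
        self.value
    }
}

fn main() {
    let counter = Counter { value: 7, reads: Cell::new(0) };
    println!("value: {}", counter.get());
    println!("value: {}", counter.get());
    println!("reads: {}", counter.reads.get());
}
```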

This slide should take about 10 minutes.

The main thing to take away from this slide is that Rust provides safe ways to modify data behind a shared reference. There are a variety of ways to ensure that safety, and RefCell and Cell are two of them.

  • RefCell enforces Rust’s usual borrowing rules (either multiple shared references or a single exclusive reference) with a runtime check. In this case, all borrows are very short and never overlap, so the checks always succeed.

  • Rc only allows shared (read-only) access to its contents, since its purpose is to allow (and count) many references. But we want to modify the value, so we need interior mutability.

  • Cell is a simpler means to ensure safety: it has a set method that takes &self. This needs no runtime check, but requires moving values, which can have its own cost.

  • Demonstrate that reference loops can be created by adding root to subtree.children.

  • To demonstrate a runtime panic, add a fn inc(&mut self) that increments self.value and calls the same method on its children. This will panic in the presence of the reference loop, with thread 'main' panicked at 'already borrowed: BorrowMutError'.
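
The runtime check can also be observed without a panic by using `RefCell::try_borrow_mut`, which returns an error instead of panicking. This is a sketch with a hypothetical helper name, not the demonstration described in the notes above.

```rust
use std::cell::RefCell;

/// Returns whether a second exclusive borrow succeeds (a) while the first
/// one is still alive and (b) after the first one has been released.
fn overlapping_borrows() -> (bool, bool) {
    let cell = RefCell::new(5);
    let first = cell.borrow_mut();
    // `borrow_mut` would panic here; `try_borrow_mut` reports the conflict.
    let while_held = cell.try_borrow_mut().is_ok();
    drop(first);
    let after_release = cell.try_borrow_mut().is_ok();
    (while_held, after_release)
}

fn main() {
    let (while_held, after_release) = overlapping_borrows();
    println!("second borrow while first is alive: {while_held}");
    println!("second borrow after release: {after_release}");
}
```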

Exercise: Health Statistics

You’re working on implementing a health-monitoring system. As part of that, you need to keep track of users’ health statistics.

You’ll start with a stubbed function in an impl block as well as a User struct definition. Your goal is to implement the stubbed out method on the User struct defined in the impl block.

Copy the code below to https://play.rust-lang.org/ and fill in the missing method:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]


pub struct User {
    name: String,
    age: u32,
    height: f32,
    visit_count: usize,
    last_blood_pressure: Option<(u32, u32)>,
}

pub struct Measurements {
    height: f32,
    blood_pressure: (u32, u32),
}

pub struct HealthReport<'a> {
    patient_name: &'a str,
    visit_count: u32,
    height_change: f32,
    blood_pressure_change: Option<(i32, i32)>,
}

impl User {
    pub fn new(name: String, age: u32, height: f32) -> Self {
        Self {
            name,
            age,
            height,
            visit_count: 0,
            last_blood_pressure: None,
        }
    }

    pub fn visit_doctor(&mut self, measurements: Measurements) -> HealthReport {
        todo!("Update a user's statistics based on measurements from a visit to the doctor")
    }
}

fn main() {
    let bob = User::new(String::from("Bob"), 32, 155.2);
    println!("I'm {} and my age is {}", bob.name, bob.age);
}

#[test]
fn test_visit() {
    let mut bob = User::new(String::from("Bob"), 32, 155.2);
    assert_eq!(bob.visit_count, 0);
    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (120, 80),
    });
    assert_eq!(report.patient_name, "Bob");
    assert_eq!(report.visit_count, 1);
    assert_eq!(report.blood_pressure_change, None);

    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (115, 76),
    });

    assert_eq!(report.visit_count, 2);
    assert_eq!(report.blood_pressure_change, Some((-5, -4)));
}

Solution


#![allow(dead_code)]
pub struct User {
    name: String,
    age: u32,
    height: f32,
    visit_count: usize,
    last_blood_pressure: Option<(u32, u32)>,
}

pub struct Measurements {
    height: f32,
    blood_pressure: (u32, u32),
}

pub struct HealthReport<'a> {
    patient_name: &'a str,
    visit_count: u32,
    height_change: f32,
    blood_pressure_change: Option<(i32, i32)>,
}

impl User {
    pub fn new(name: String, age: u32, height: f32) -> Self {
        Self {
            name,
            age,
            height,
            visit_count: 0,
            last_blood_pressure: None,
        }
    }

    pub fn visit_doctor(&mut self, measurements: Measurements) -> HealthReport {
        self.visit_count += 1;
        let bp = measurements.blood_pressure;
        let report = HealthReport {
            patient_name: &self.name,
            visit_count: self.visit_count as u32,
            height_change: measurements.height - self.height,
            blood_pressure_change: match self.last_blood_pressure {
                Some(lbp) => {
                    Some((bp.0 as i32 - lbp.0 as i32, bp.1 as i32 - lbp.1 as i32))
                }
                None => None,
            },
        };
        self.height = measurements.height;
        self.last_blood_pressure = Some(bp);
        report
    }
}

fn main() {
    let bob = User::new(String::from("Bob"), 32, 155.2);
    println!("I'm {} and my age is {}", bob.name, bob.age);
}

#[test]
fn test_visit() {
    let mut bob = User::new(String::from("Bob"), 32, 155.2);
    assert_eq!(bob.visit_count, 0);
    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (120, 80),
    });
    assert_eq!(report.patient_name, "Bob");
    assert_eq!(report.visit_count, 1);
    assert_eq!(report.blood_pressure_change, None);

    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (115, 76),
    });

    assert_eq!(report.visit_count, 2);
    assert_eq!(report.blood_pressure_change, Some((-5, -4)));
}

Lifetimes

In this segment:

This segment should take about 1 hour and 10 minutes

Slices

A slice gives you a view into a larger collection:

fn main() {
    let mut a: [i32; 6] = [10, 20, 30, 40, 50, 60];
    println!("a: {a:?}");

    let s: &[i32] = &a[2..4];

    println!("s: {s:?}");
}
  • Slices borrow data from the sliced type.
  • Question: What happens if you modify a[3] right before printing s?
This slide should take about 10 minutes.
  • We create a slice by borrowing a and specifying the starting and ending indexes in brackets.

  • If the slice starts at index 0, Rust’s range syntax allows us to drop the starting index, meaning that &a[0..a.len()] and &a[..a.len()] are identical.

  • The same is true for the last index, so &a[2..a.len()] and &a[2..] are identical.

  • To easily create a slice of the full array, we can use &a[..].

  • s is a reference to a slice of i32s. Notice that the type of s (&[i32]) no longer mentions the array length. This allows us to perform computation on slices of different sizes.

  • Slices always borrow from another object. In this example, a has to remain ‘alive’ (in scope) for at least as long as our slice.

  • The question about modifying a[3] can spark an interesting discussion, but the answer is that for memory safety reasons you cannot do it through a at this point in the execution, but you can read the data from both a and s safely. It works before you created the slice, and again after the println, when the slice is no longer used.
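
The range shorthands listed above can be tried with a small sketch. `total` is a hypothetical helper, included to show that one function accepts slices of any length.

```rust
/// Hypothetical helper that works for slices of any length.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let mut a: [i32; 6] = [10, 20, 30, 40, 50, 60];
    println!("a[2..4]: {}", total(&a[2..4]));
    println!("a[..2]:  {}", total(&a[..2])); // Same as &a[0..2].
    println!("a[4..]:  {}", total(&a[4..])); // Same as &a[4..a.len()].
    println!("a[..]:   {}", total(&a[..])); // The whole array.

    // No slice is in use at this point, so modifying `a` is allowed.
    a[3] = 0;
    println!("a[2..4]: {}", total(&a[2..4]));
}
```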

String References

We can now understand the two string types in Rust: &str is almost like &[char], but with its data stored in a variable-length encoding (UTF-8).

fn main() {
    let s1: &str = "World";
    println!("s1: {s1}");

    let mut s2: String = String::from("Hello ");
    println!("s2: {s2}");
    s2.push_str(s1);
    println!("s2: {s2}");

    let s3: &str = &s2[6..];
    println!("s3: {s3}");
}

Rust terminology:

  • &str is an immutable reference to a string slice.
  • String is a mutable string buffer.
This slide should take about 10 minutes.
  • &str introduces a string slice, which is an immutable reference to UTF-8 encoded string data stored in a block of memory. String literals ("Hello") are stored in the program’s binary.

  • Rust’s String type is a wrapper around a vector of bytes. As with a Vec<T>, it is owned.

  • As with many other types String::from() creates a string from a string literal; String::new() creates a new empty string, to which string data can be added using the push() and push_str() methods.

  • The format!() macro is a convenient way to generate an owned string from dynamic values. It accepts the same format specification as println!().

  • You can borrow &str slices from String via & and optionally range selection. If you select a byte range that is not aligned to character boundaries, the expression will panic. The chars iterator iterates over characters and is preferred over trying to get character boundaries right.

  • For C++ programmers: think of &str as std::string_view from C++, but the one that always points to a valid string in memory. Rust String is a rough equivalent of std::string from C++ (main difference: it can only contain UTF-8 encoded bytes and will never use a small-string optimization).

  • Byte string literals allow you to create a &[u8] value directly:

    fn main() {
        println!("{:?}", b"abc");
        println!("{:?}", &[97, 98, 99]);
    }
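
The note on character boundaries can be illustrated with a sketch. The `first_chars` helper and the sample text are assumptions for this example. The string is written with an escape so that the accented letter is unambiguously the two-byte code point U+00E9.

```rust
/// Hypothetical helper: returns the first `n` characters (not bytes) of `s`.
fn first_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}

fn main() {
    // "héllo": the second character occupies bytes 1 and 2.
    let s = String::from("h\u{e9}llo");
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());

    // Byte index 2 falls inside that character, so `&s[..2]` would panic.
    println!("boundary at 2? {}", s.is_char_boundary(2));
    // `get` reports the problem as `None` instead of panicking.
    println!("{:?}", s.get(..2));
    println!("{:?}", s.get(..3));

    // Iterating over chars avoids byte arithmetic altogether.
    println!("{}", first_chars(&s, 2));

    // `format!` builds an owned String from dynamic values.
    let summary = format!("{} has {} bytes", s, s.len());
    println!("{summary}");
}
```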

Lifetime Annotations

A reference has a lifetime, which must not “outlive” the value it refers to. This is verified by the borrow checker.

The lifetime can be implicit - this is what we have seen so far. Lifetimes can also be explicit: &'a Point, &'document str. Lifetimes start with ' and 'a is a typical default name. Read &'a Point as “a borrowed Point which is valid for at least the lifetime a”.

Lifetimes are always inferred by the compiler: you cannot assign a lifetime yourself. Explicit lifetime annotations create constraints where there is ambiguity; the compiler verifies that there is a valid solution.

Lifetimes become more complicated when considering passing values to and returning values from functions.

#[derive(Debug)]
struct Point(i32, i32);

fn left_most(p1: &Point, p2: &Point) -> &Point {
    if p1.0 < p2.0 {
        p1
    } else {
        p2
    }
}

fn main() {
    let p1: Point = Point(10, 10);
    let p2: Point = Point(20, 20);
    let p3 = left_most(&p1, &p2); // What is the lifetime of p3?
    println!("p3: {p3:?}");
}
This slide should take about 10 minutes.

In this example, the compiler does not know what lifetime to infer for p3. Looking inside the function body shows that it can only safely assume that p3’s lifetime is the shorter of p1 and p2. But just like types, Rust requires explicit annotations of lifetimes on function arguments and return values.

Add 'a appropriately to left_most:

fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {

This says, “given p1 and p2 which both outlive 'a, the return value lives for at least 'a”.

In common cases, lifetimes can be elided, as described on the next slide.
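
With the annotation in place, the example compiles. The sketch below places p2 in an inner scope, an arrangement chosen for illustration, to show that the result may only be used while both inputs are alive.

```rust
#[derive(Debug, PartialEq)]
struct Point(i32, i32);

fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {
    if p1.0 < p2.0 {
        p1
    } else {
        p2
    }
}

fn main() {
    let p1 = Point(10, 10);
    let p3;
    {
        let p2 = Point(20, 20);
        // 'a is limited to the region where both p1 and p2 are alive.
        p3 = left_most(&p1, &p2);
        println!("p3: {p3:?}");
    }
    // Using p3 here would be rejected: it might refer to p2, which is gone.
}
```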

Lifetimes in Function Calls

Lifetimes for function arguments and return values must be fully specified, but Rust allows lifetimes to be elided in most cases with a few simple rules. This is not inference – it is just a syntactic shorthand.

  • Each argument which does not have a lifetime annotation is given one.
  • If there is only one argument lifetime, it is given to all un-annotated return values.
  • If there are multiple argument lifetimes, but the first one is for self, that lifetime is given to all un-annotated return values.
#[derive(Debug)]
struct Point(i32, i32);

fn cab_distance(p1: &Point, p2: &Point) -> i32 {
    (p1.0 - p2.0).abs() + (p1.1 - p2.1).abs()
}

fn nearest<'a>(points: &'a [Point], query: &Point) -> Option<&'a Point> {
    let mut nearest = None;
    for p in points {
        if let Some((_, nearest_dist)) = nearest {
            let dist = cab_distance(p, query);
            if dist < nearest_dist {
                nearest = Some((p, dist));
            }
        } else {
            nearest = Some((p, cab_distance(p, query)));
        };
    }
    nearest.map(|(p, _)| p)
}

fn main() {
    println!(
        "{:?}",
        nearest(
            &[Point(1, 0), Point(1, 0), Point(-1, 0), Point(0, -1),],
            &Point(0, 2)
        )
    );
}
This slide should take about 5 minutes.

In this example, cab_distance is trivially elided.

The nearest function provides another example of a function with multiple references in its arguments that requires explicit annotation.

Try adjusting the signature to “lie” about the lifetimes returned:

fn nearest<'a, 'q>(points: &'a [Point], query: &'q Point) -> Option<&'q Point> {

This won’t compile, demonstrating that the annotations are checked for validity by the compiler. Note that this is not the case for raw pointers (unsafe), and this is a common source of errors with unsafe Rust.

Students may ask when to use lifetimes. Rust borrows always have lifetimes. Most of the time, elision and type inference mean these don’t need to be written out. In more complicated cases, lifetime annotations can help resolve ambiguity. Often, especially when prototyping, it’s easier to just work with owned data by cloning values where necessary.
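
The example above does not exercise the third elision rule (the one for self). The sketch below uses a hypothetical Document type, which is not part of the course, to show a method in its elided form next to the signature that the elision rules expand it to.

```rust
// Hypothetical type used only for this illustration.
struct Document {
    text: String,
}

impl Document {
    // Elided form: with `&self` present, the returned reference is tied to
    // `self`, not to `needle`.
    fn first_line_containing(&self, needle: &str) -> Option<&str> {
        self.text.lines().find(|line| line.contains(needle))
    }

    // The same signature with every lifetime written out.
    fn first_line_containing_explicit<'a, 'b>(
        &'a self,
        needle: &'b str,
    ) -> Option<&'a str> {
        self.text.lines().find(|line| line.contains(needle))
    }
}

fn main() {
    let doc = Document { text: String::from("alpha\nbeta\ngamma") };
    let hit;
    {
        let needle = String::from("et");
        hit = doc.first_line_containing(&needle);
    } // `needle` is dropped here, but `hit` only borrows from `doc`.
    println!("{hit:?}");
    println!("{:?}", doc.first_line_containing_explicit("mm"));
}
```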

Lifetimes in Data Structures

If a data type stores borrowed data, it must be annotated with a lifetime:

#[derive(Debug)]
struct Highlight<'doc>(&'doc str);

fn erase(text: String) {
    println!("Bye {text}!");
}

fn main() {
    let text = String::from("The quick brown fox jumps over the lazy dog.");
    let fox = Highlight(&text[4..19]);
    let dog = Highlight(&text[35..43]);
    // erase(text);
    println!("{fox:?}");
    println!("{dog:?}");
}
This slide should take about 5 minutes.
  • In the example above, the annotation on Highlight enforces that the data underlying the contained &str lives at least as long as any instance of Highlight that uses that data.
  • If text is consumed before the end of the lifetime of fox (or dog), the borrow checker reports an error.
  • Types with borrowed data force users to hold on to the original data. This can be useful for creating lightweight views, but it generally makes them somewhat harder to use.
  • When possible, make data structures own their data directly.
  • Some structs with multiple references inside can have more than one lifetime annotation. This can be necessary if there is a need to describe lifetime relationships between the references themselves, in addition to the lifetime of the struct itself. Those are very advanced use cases.
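
As a rough illustration of the last point, the sketch below gives a hypothetical Search struct (not part of the course) two lifetime parameters. Keeping 'doc and 'q separate lets the result borrow from the document alone, so it can outlive the query string.

```rust
#[derive(Debug)]
struct Highlight<'doc>(&'doc str);

// Hypothetical type: holds two references with independent lifetimes.
struct Search<'doc, 'q> {
    document: &'doc str,
    query: &'q str,
}

impl<'doc, 'q> Search<'doc, 'q> {
    // The result borrows only from the document ('doc), not from the query.
    fn first_match(&self) -> Option<Highlight<'doc>> {
        let document: &'doc str = self.document;
        let start = document.find(self.query)?;
        Some(Highlight(&document[start..start + self.query.len()]))
    }
}

fn main() {
    let text = String::from("The quick brown fox jumps over the lazy dog.");
    let fox;
    {
        let query = String::from("fox");
        let search = Search { document: &text, query: &query };
        fox = search.first_match();
    } // `query` and `search` end here; `fox` still borrows from `text`.
    println!("{fox:?}");
}
```

With a single shared lifetime on both fields, the assignment to fox would be rejected, because the result could not be proven to outlive query.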

Exercise: Protobuf Parsing

In this exercise, you will build a parser for the protobuf binary encoding. Don’t worry, it’s simpler than it seems! This illustrates a common parsing pattern, passing slices of data. The underlying data itself is never copied.

Fully parsing a protobuf message requires knowing the types of the fields, indexed by their field numbers. That is typically provided in a proto file. In this exercise, we’ll encode that information into match statements in functions that get called for each field.

We’ll use the following proto:

message PhoneNumber {
  optional string number = 1;
  optional string type = 2;
}

message Person {
  optional string name = 1;
  optional int32 id = 2;
  repeated PhoneNumber phones = 3;
}

A proto message is encoded as a series of fields, one after the next. Each is implemented as a “tag” followed by the value. The tag contains a field number (e.g., 2 for the id field of a Person message) and a wire type defining how the payload should be determined from the byte stream.
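
To make the tag layout concrete, here is a small sketch that splits a few of the tag bytes from the sample message used later in this exercise. The helper name split_tag is made up for this illustration; the exercise code provides unpack_tag, which does the same split and also validates the wire type.

```rust
/// Split a tag into (field number, wire type id): the low three bits hold
/// the wire type and the remaining bits hold the field number.
fn split_tag(tag: u64) -> (u64, u64) {
    (tag >> 3, tag & 0x7)
}

fn main() {
    // 0x0a = 0b0000_1010: field 1 with wire type 2 (Len), e.g. `name`.
    println!("{:?}", split_tag(0x0a));
    // 0x10 = 0b0001_0000: field 2 with wire type 0 (Varint), e.g. `id`.
    println!("{:?}", split_tag(0x10));
    // 0x1a = 0b0001_1010: field 3 with wire type 2 (Len), e.g. `phones`.
    println!("{:?}", split_tag(0x1a));
}
```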

Integers, including the tag, are represented with a variable-length encoding called VARINT. Luckily, parse_varint is defined for you below. The given code also defines callbacks to handle Person and PhoneNumber fields, and to parse a message into a series of calls to those callbacks.
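
As a worked example of the VARINT encoding, the sketch below decodes a value by hand: each byte contributes its low seven bits, least-significant group first, and a set high bit means another byte follows. The function decode_varint_sketch is hypothetical and only illustrates the format; the exercise should keep using the provided parse_varint, which applies its own length limit and error type.

```rust
/// Decode one VARINT from the front of `data`. Returns the value and the
/// number of bytes consumed, or None if the input ends mid-value.
fn decode_varint_sketch(data: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A u64 needs at most ten 7-bit groups.
    for (i, &byte) in data.iter().enumerate().take(10) {
        // Place the low seven bits of this byte at bit position 7 * i.
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // High bit clear: this was the last byte of the VARINT.
            return Some((value, i + 1));
        }
    }
    None
}

fn main() {
    // 0x96 = 1_0010110 (continue, low bits 22), 0x01 = 0_0000001 (stop).
    // Value: 22 + (1 << 7) = 150.
    println!("{:?}", decode_varint_sketch(&[0x96, 0x01]));
    // A single byte below 0x80 encodes itself.
    println!("{:?}", decode_varint_sketch(&[0x2a]));
}
```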

What remains for you is to implement the parse_field function and the ProtoMessage trait for Person and PhoneNumber.

use std::convert::TryFrom;
use thiserror::Error;

#[derive(Debug, Error)]
enum Error {
    #[error("Invalid varint")]
    InvalidVarint,
    #[error("Invalid wire-type")]
    InvalidWireType,
    #[error("Unexpected EOF")]
    UnexpectedEOF,
    #[error("Invalid length")]
    InvalidSize(#[from] std::num::TryFromIntError),
    #[error("Unexpected wire-type")]
    UnexpectedWireType,
    #[error("Invalid string (not UTF-8)")]
    InvalidString,
}

/// A wire type as seen on the wire.
enum WireType {
    /// The Varint WireType indicates the value is a single VARINT.
    Varint,
    //I64,  -- not needed for this exercise
    /// The Len WireType indicates that the value is a length represented as a
    /// VARINT followed by exactly that number of bytes.
    Len,
    /// The I32 WireType indicates that the value is precisely 4 bytes in
    /// little-endian order containing a 32-bit signed integer.
    I32,
}

#[derive(Debug)]
/// A field's value, typed based on the wire type.
enum FieldValue<'a> {
    Varint(u64),
    //I64(i64),  -- not needed for this exercise
    Len(&'a [u8]),
    I32(i32),
}

#[derive(Debug)]
/// A field, containing the field number and its value.
struct Field<'a> {
    field_num: u64,
    value: FieldValue<'a>,
}

trait ProtoMessage<'a>: Default + 'a {
    fn add_field(&mut self, field: Field<'a>) -> Result<(), Error>;
}

impl TryFrom<u64> for WireType {
    type Error = Error;

    fn try_from(value: u64) -> Result<WireType, Error> {
        Ok(match value {
            0 => WireType::Varint,
            //1 => WireType::I64,  -- not needed for this exercise
            2 => WireType::Len,
            5 => WireType::I32,
            _ => return Err(Error::InvalidWireType),
        })
    }
}

impl<'a> FieldValue<'a> {
    fn as_string(&self) -> Result<&'a str, Error> {
        let FieldValue::Len(data) = self else {
            return Err(Error::UnexpectedWireType);
        };
        std::str::from_utf8(data).map_err(|_| Error::InvalidString)
    }

    fn as_bytes(&self) -> Result<&'a [u8], Error> {
        let FieldValue::Len(data) = self else {
            return Err(Error::UnexpectedWireType);
        };
        Ok(data)
    }

    fn as_u64(&self) -> Result<u64, Error> {
        let FieldValue::Varint(value) = self else {
            return Err(Error::UnexpectedWireType);
        };
        Ok(*value)
    }
}

/// Parse a VARINT, returning the parsed value and the remaining bytes.
fn parse_varint(data: &[u8]) -> Result<(u64, &[u8]), Error> {
    for i in 0..7 {
        let Some(b) = data.get(i) else {
            return Err(Error::InvalidVarint);
        };
        if b & 0x80 == 0 {
            // This is the last byte of the VARINT, so convert it to
            // a u64 and return it.
            let mut value = 0u64;
            for b in data[..=i].iter().rev() {
                value = (value << 7) | (b & 0x7f) as u64;
            }
            return Ok((value, &data[i + 1..]));
        }
    }

    // More than 7 bytes is invalid.
    Err(Error::InvalidVarint)
}

/// Convert a tag into a field number and a WireType.
fn unpack_tag(tag: u64) -> Result<(u64, WireType), Error> {
    let field_num = tag >> 3;
    let wire_type = WireType::try_from(tag & 0x7)?;
    Ok((field_num, wire_type))
}


/// Parse a field, returning the remaining bytes
fn parse_field(data: &[u8]) -> Result<(Field, &[u8]), Error> {
    let (tag, remainder) = parse_varint(data)?;
    let (field_num, wire_type) = unpack_tag(tag)?;
    let (fieldvalue, remainder) = match wire_type {
        _ => todo!("Based on the wire type, build a Field, consuming as many bytes as necessary.")
    };
    todo!("Return the field, and any un-consumed bytes.")
}

/// Parse a message in the given data, calling `T::add_field` for each field in
/// the message.
///
/// The entire input is consumed.
fn parse_message<'a, T: ProtoMessage<'a>>(mut data: &'a [u8]) -> Result<T, Error> {
    let mut result = T::default();
    while !data.is_empty() {
        let parsed = parse_field(data)?;
        result.add_field(parsed.0)?;
        data = parsed.1;
    }
    Ok(result)
}

#[derive(Debug, Default)]
struct PhoneNumber<'a> {
    number: &'a str,
    type_: &'a str,
}

#[derive(Debug, Default)]
struct Person<'a> {
    name: &'a str,
    id: u64,
    phone: Vec<PhoneNumber<'a>>,
}

// TODO: Implement ProtoMessage for Person and PhoneNumber.

fn main() {
    let person: Person = parse_message(&[
        0x0a, 0x07, 0x6d, 0x61, 0x78, 0x77, 0x65, 0x6c, 0x6c, 0x10, 0x2a, 0x1a,
        0x16, 0x0a, 0x0e, 0x2b, 0x31, 0x32, 0x30, 0x32, 0x2d, 0x35, 0x35, 0x35,
        0x2d, 0x31, 0x32, 0x31, 0x32, 0x12, 0x04, 0x68, 0x6f, 0x6d, 0x65, 0x1a,
        0x18, 0x0a, 0x0e, 0x2b, 0x31, 0x38, 0x30, 0x30, 0x2d, 0x38, 0x36, 0x37,
        0x2d, 0x35, 0x33, 0x30, 0x38, 0x12, 0x06, 0x6d, 0x6f, 0x62, 0x69, 0x6c,
        0x65,
    ])
    .unwrap();
    println!("{:#?}", person);
}

Solution

use std::convert::TryFrom;
use thiserror::Error;

#[derive(Debug, Error)]
enum Error {
    #[error("Invalid varint")]
    InvalidVarint,
    #[error("Invalid wire-type")]
    InvalidWireType,
    #[error("Unexpected EOF")]
    UnexpectedEOF,
    #[error("Invalid length")]
    InvalidSize(#[from] std::num::TryFromIntError),
    #[error("Unexpected wire-type")]
    UnexpectedWireType,
    #[error("Invalid string (not UTF-8)")]
    InvalidString,
}

/// A wire type as seen on the wire.
enum WireType {
    /// The Varint WireType indicates the value is a single VARINT.
    Varint,
    //I64,  -- not needed for this exercise
    /// The Len WireType indicates that the value is a length represented as a
    /// VARINT followed by exactly that number of bytes.
    Len,
    /// The I32 WireType indicates that the value is precisely 4 bytes in
    /// little-endian order containing a 32-bit signed integer.
    I32,
}

#[derive(Debug)]
/// A field's value, typed based on the wire type.
enum FieldValue<'a> {
    Varint(u64),
    //I64(i64),  -- not needed for this exercise
    Len(&'a [u8]),
    I32(i32),
}

#[derive(Debug)]
/// A field, containing the field number and its value.
struct Field<'a> {
    field_num: u64,
    value: FieldValue<'a>,
}

trait ProtoMessage<'a>: Default + 'a {
    fn add_field(&mut self, field: Field<'a>) -> Result<(), Error>;
}

impl TryFrom<u64> for WireType {
    type Error = Error;

    fn try_from(value: u64) -> Result<WireType, Error> {
        Ok(match value {
            0 => WireType::Varint,
            //1 => WireType::I64,  -- not needed for this exercise
            2 => WireType::Len,
            5 => WireType::I32,
            _ => return Err(Error::InvalidWireType),
        })
    }
}

impl<'a> FieldValue<'a> {
    fn as_string(&self) -> Result<&'a str, Error> {
        let FieldValue::Len(data) = self else {
            return Err(Error::UnexpectedWireType);
        };
        std::str::from_utf8(data).map_err(|_| Error::InvalidString)
    }

    fn as_bytes(&self) -> Result<&'a [u8], Error> {
        let FieldValue::Len(data) = self else {
            return Err(Error::UnexpectedWireType);
        };
        Ok(data)
    }

    fn as_u64(&self) -> Result<u64, Error> {
        let FieldValue::Varint(value) = self else {
            return Err(Error::UnexpectedWireType);
        };
        Ok(*value)
    }
}

/// Parse a VARINT, returning the parsed value and the remaining bytes.
fn parse_varint(data: &[u8]) -> Result<(u64, &[u8]), Error> {
    for i in 0..7 {
        let Some(b) = data.get(i) else {
            return Err(Error::InvalidVarint);
        };
        if b & 0x80 == 0 {
            // This is the last byte of the VARINT, so convert it to
            // a u64 and return it.
            let mut value = 0u64;
            for b in data[..=i].iter().rev() {
                value = (value << 7) | (b & 0x7f) as u64;
            }
            return Ok((value, &data[i + 1..]));
        }
    }

    // More than 7 bytes is invalid.
    Err(Error::InvalidVarint)
}

/// Convert a tag into a field number and a WireType.
fn unpack_tag(tag: u64) -> Result<(u64, WireType), Error> {
    let field_num = tag >> 3;
    let wire_type = WireType::try_from(tag & 0x7)?;
    Ok((field_num, wire_type))
}

/// Parse a field, returning the remaining bytes
fn parse_field(data: &[u8]) -> Result<(Field, &[u8]), Error> {
    let (tag, remainder) = parse_varint(data)?;
    let (field_num, wire_type) = unpack_tag(tag)?;
    let (fieldvalue, remainder) = match wire_type {
        WireType::Varint => {
            let (value, remainder) = parse_varint(remainder)?;
            (FieldValue::Varint(value), remainder)
        }
        WireType::Len => {
            let (len, remainder) = parse_varint(remainder)?;
            let len: usize = len.try_into()?;
            if remainder.len() < len {
                return Err(Error::UnexpectedEOF);
            }
            let (value, remainder) = remainder.split_at(len);
            (FieldValue::Len(value), remainder)
        }
        WireType::I32 => {
            if remainder.len() < 4 {
                return Err(Error::UnexpectedEOF);
            }
            let (value, remainder) = remainder.split_at(4);
            // Unwrap error because `value` is definitely 4 bytes long.
            let value = i32::from_le_bytes(value.try_into().unwrap());
            (FieldValue::I32(value), remainder)
        }
    };
    Ok((Field { field_num, value: fieldvalue }, remainder))
}

/// Parse a message in the given data, calling `T::add_field` for each field in
/// the message.
///
/// The entire input is consumed.
fn parse_message<'a, T: ProtoMessage<'a>>(mut data: &'a [u8]) -> Result<T, Error> {
    let mut result = T::default();
    while !data.is_empty() {
        let parsed = parse_field(data)?;
        result.add_field(parsed.0)?;
        data = parsed.1;
    }
    Ok(result)
}

#[derive(Debug, Default)]
struct PhoneNumber<'a> {
    number: &'a str,
    type_: &'a str,
}

#[derive(Debug, Default)]
struct Person<'a> {
    name: &'a str,
    id: u64,
    phone: Vec<PhoneNumber<'a>>,
}

impl<'a> ProtoMessage<'a> for Person<'a> {
    fn add_field(&mut self, field: Field<'a>) -> Result<(), Error> {
        match field.field_num {
            1 => self.name = field.value.as_string()?,
            2 => self.id = field.value.as_u64()?,
            3 => self.phone.push(parse_message(field.value.as_bytes()?)?),
            _ => {} // skip everything else
        }
        Ok(())
    }
}

impl<'a> ProtoMessage<'a> for PhoneNumber<'a> {
    fn add_field(&mut self, field: Field<'a>) -> Result<(), Error> {
        match field.field_num {
            1 => self.number = field.value.as_string()?,
            2 => self.type_ = field.value.as_string()?,
            _ => {} // skip everything else
        }
        Ok(())
    }
}

fn main() {
    let person: Person = parse_message(&[
        0x0a, 0x07, 0x6d, 0x61, 0x78, 0x77, 0x65, 0x6c, 0x6c, 0x10, 0x2a, 0x1a,
        0x16, 0x0a, 0x0e, 0x2b, 0x31, 0x32, 0x30, 0x32, 0x2d, 0x35, 0x35, 0x35,
        0x2d, 0x31, 0x32, 0x31, 0x32, 0x12, 0x04, 0x68, 0x6f, 0x6d, 0x65, 0x1a,
        0x18, 0x0a, 0x0e, 0x2b, 0x31, 0x38, 0x30, 0x30, 0x2d, 0x38, 0x36, 0x37,
        0x2d, 0x35, 0x33, 0x30, 0x38, 0x12, 0x06, 0x6d, 0x6f, 0x62, 0x69, 0x6c,
        0x65,
    ])
    .unwrap();
    println!("{:#?}", person);
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn as_string() {
        assert!(FieldValue::Varint(10).as_string().is_err());
        assert!(FieldValue::I32(10).as_string().is_err());
        assert_eq!(FieldValue::Len(b"hello").as_string().unwrap(), "hello");
    }

    #[test]
    fn as_bytes() {
        assert!(FieldValue::Varint(10).as_bytes().is_err());
        assert!(FieldValue::I32(10).as_bytes().is_err());
        assert_eq!(FieldValue::Len(b"hello").as_bytes().unwrap(), b"hello");
    }

    #[test]
    fn as_u64() {
        assert_eq!(FieldValue::Varint(10).as_u64().unwrap(), 10u64);
        assert!(FieldValue::I32(10).as_u64().is_err());
        assert!(FieldValue::Len(b"hello").as_u64().is_err());
    }
}

Welcome to Day 4

Today we will cover topics relating to building large-scale software in Rust:

  • Iterators: a deep dive on the Iterator trait.
  • Modules and visibility.
  • Testing.
  • Error handling: panics, Result, and the try operator ?.
  • Unsafe Rust: the escape hatch when you can’t express yourself in safe Rust.

Schedule

In this session:

Including 10 minute breaks, this session should take about 3 hours and 5 minutes

Iterators

In this segment:

This segment should take about 45 minutes

Iterator

The Iterator trait supports iterating over values in a collection. It requires a next method and provides lots of methods. Many standard library types implement Iterator, and you can implement it yourself, too:

struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        let new_next = self.curr + self.next;
        self.curr = self.next;
        self.next = new_next;
        Some(self.curr)
    }
}

fn main() {
    let fib = Fibonacci { curr: 0, next: 1 };
    for (i, n) in fib.enumerate().take(5) {
        println!("fib({i}): {n}");
    }
}
This slide should take about 5 minutes.
  • The Iterator trait implements many common functional programming operations over collections (e.g. map, filter, reduce, etc). This is the trait where you can find all the documentation about them. In Rust these functions should produce code as efficient as equivalent imperative implementations.

  • IntoIterator is the trait that makes for loops work. It is implemented by collection types such as Vec<T> and by references to them such as &Vec<T> and &[T]. Ranges also implement it. This is why you can iterate over a vector with for i in some_vec { .. } but some_vec.next() doesn’t exist.
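
The two notes above can be illustrated with a small sketch. The function names are invented for this example; it compares an imperative loop with the equivalent chain of Iterator methods, and shows that a Vec has to be asked for an iterator before next() can be called.

```rust
// Imperative version: loop, test, accumulate.
fn sum_of_even_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    for v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

// The same computation expressed with Iterator methods.
fn sum_of_even_squares_iter(values: &[u32]) -> u32 {
    values.iter().filter(|v| *v % 2 == 0).map(|v| v * v).sum()
}

fn main() {
    let values = vec![1, 2, 3, 4];
    println!("{}", sum_of_even_squares_loop(&values));
    println!("{}", sum_of_even_squares_iter(&values));

    // A Vec is not itself an iterator: create one explicitly to call next().
    let mut it = values.iter();
    println!("{:?}", it.next());
}
```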

IntoIterator

The Iterator trait tells you how to iterate once you have created an iterator. The related trait IntoIterator defines how to create an iterator for a type. It is used automatically by the for loop.

struct Grid {
    x_coords: Vec<u32>,
    y_coords: Vec<u32>,
}

impl IntoIterator for Grid {
    type Item = (u32, u32);
    type IntoIter = GridIter;
    fn into_iter(self) -> GridIter {
        GridIter { grid: self, i: 0, j: 0 }
    }
}

struct GridIter {
    grid: Grid,
    i: usize,
    j: usize,
}

impl Iterator for GridIter {
    type Item = (u32, u32);

    fn next(&mut self) -> Option<(u32, u32)> {
        if self.i >= self.grid.x_coords.len() {
            self.i = 0;
            self.j += 1;
            if self.j >= self.grid.y_coords.len() {
                return None;
            }
        }
        let res = Some((self.grid.x_coords[self.i], self.grid.y_coords[self.j]));
        self.i += 1;
        res
    }
}

fn main() {
    let grid = Grid { x_coords: vec![3, 5, 7, 9], y_coords: vec![10, 20, 30, 40] };
    for (x, y) in grid {
        println!("point = {x}, {y}");
    }
}
This slide should take about 5 minutes.

Click through to the docs for IntoIterator. Every implementation of IntoIterator must declare two types:

  • Item: the type to iterate over, such as i8,
  • IntoIter: the Iterator type returned by the into_iter method.

Note that IntoIter and Item are linked: the iterator must have the same Item type, which means that it returns Option<Item>.

The example iterates over all combinations of x and y coordinates.

Try iterating over the grid twice in main. Why does this fail? Note that IntoIterator::into_iter takes ownership of self.

Fix this issue by implementing IntoIterator for &Grid and storing a reference to the Grid in GridIter.

The same problem can occur for standard library types: for e in some_vector will take ownership of some_vector and iterate over owned elements from that vector. Use for e in &some_vector instead, to iterate over references to elements of some_vector.
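
One possible shape for the suggested fix is sketched below. The name GridRefIter is made up here, and the body of next is written slightly differently from the slide so that empty coordinate lists are handled without indexing out of bounds; treat it as one option rather than the reference answer.

```rust
struct Grid {
    x_coords: Vec<u32>,
    y_coords: Vec<u32>,
}

// Borrows the grid instead of owning it, so the grid survives iteration.
struct GridRefIter<'a> {
    grid: &'a Grid,
    i: usize,
    j: usize,
}

impl<'a> IntoIterator for &'a Grid {
    type Item = (u32, u32);
    type IntoIter = GridRefIter<'a>;
    fn into_iter(self) -> GridRefIter<'a> {
        GridRefIter { grid: self, i: 0, j: 0 }
    }
}

impl Iterator for GridRefIter<'_> {
    type Item = (u32, u32);

    fn next(&mut self) -> Option<(u32, u32)> {
        // Move to the next row once the current one is exhausted.
        if self.i >= self.grid.x_coords.len() {
            self.i = 0;
            self.j += 1;
        }
        // `get` returns None past the end, which ends the iteration.
        let x = *self.grid.x_coords.get(self.i)?;
        let y = *self.grid.y_coords.get(self.j)?;
        self.i += 1;
        Some((x, y))
    }
}

fn main() {
    let grid = Grid { x_coords: vec![3, 5], y_coords: vec![10, 20] };
    // Iterating over `&grid` twice works because nothing is moved.
    for (x, y) in &grid {
        println!("first pass: {x}, {y}");
    }
    for (x, y) in &grid {
        println!("second pass: {x}, {y}");
    }
}
```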

FromIterator

FromIterator lets you build a collection from an Iterator.

fn main() {
    let primes = vec![2, 3, 5, 7];
    let prime_squares = primes.into_iter().map(|p| p * p).collect::<Vec<_>>();
    println!("prime_squares: {prime_squares:?}");
}
This slide should take about 5 minutes.

Iterator implements

fn collect<B>(self) -> B
where
    B: FromIterator<Self::Item>,
    Self: Sized

There are two ways to specify B for this method:

  • With the “turbofish”: some_iterator.collect::<COLLECTION_TYPE>(), as shown. The _ shorthand used here lets Rust infer the type of the Vec elements.
  • With type inference: let prime_squares: Vec<_> = some_iterator.collect(). Rewrite the example to use this form.

There are basic implementations of FromIterator for Vec, HashMap, etc. There are also more specialized implementations which let you do cool things like convert an Iterator<Item = Result<V, E>> into a Result<Vec<V>, E>.
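
The sketch below shows two of the implementations mentioned above: collecting an iterator of Result values into a single Result, and collecting pairs into a HashMap. The helper names parse_all and word_lengths are invented for this illustration.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

// Collecting Result<i32, E> items yields Ok(Vec<i32>) if every item is Ok,
// or the first Err encountered.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

// Collecting (key, value) pairs builds a HashMap.
fn word_lengths<'a>(words: &[&'a str]) -> HashMap<&'a str, usize> {
    words.iter().map(|w| (*w, w.len())).collect()
}

fn main() {
    println!("{:?}", parse_all(&["1", "2", "3"]));
    println!("{:?}", parse_all(&["1", "two", "3"]));
    println!("{:?}", word_lengths(&["rust"]));
}
```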

Exercise: Iterator Method Chaining

In this exercise, you will need to find and use some of the provided methods in the Iterator trait to implement a complex calculation.

Copy the following code to https://play.rust-lang.org/ and make the tests pass. Use an iterator expression and collect the result to construct the return value.

/// Calculate the differences between elements of `values` offset by `offset`,
/// wrapping around from the end of `values` to the beginning.
///
/// Element `n` of the result is `values[(n+offset)%len] - values[n]`.
fn offset_differences<N>(offset: usize, values: Vec<N>) -> Vec<N>
where
    N: Copy + std::ops::Sub<Output = N>,
{
    unimplemented!()
}

#[test]
fn test_offset_one() {
    assert_eq!(offset_differences(1, vec![1, 3, 5, 7]), vec![2, 2, 2, -6]);
    assert_eq!(offset_differences(1, vec![1, 3, 5]), vec![2, 2, -4]);
    assert_eq!(offset_differences(1, vec![1, 3]), vec![2, -2]);
}

#[test]
fn test_larger_offsets() {
    assert_eq!(offset_differences(2, vec![1, 3, 5, 7]), vec![4, 4, -4, -4]);
    assert_eq!(offset_differences(3, vec![1, 3, 5, 7]), vec![6, -2, -2, -2]);
    assert_eq!(offset_differences(4, vec![1, 3, 5, 7]), vec![0, 0, 0, 0]);
    assert_eq!(offset_differences(5, vec![1, 3, 5, 7]), vec![2, 2, 2, -6]);
}

#[test]
fn test_custom_type() {
    assert_eq!(
        offset_differences(1, vec![1.0, 11.0, 5.0, 0.0]),
        vec![10.0, -6.0, -5.0, 1.0]
    );
}

#[test]
fn test_degenerate_cases() {
    assert_eq!(offset_differences(1, vec![0]), vec![0]);
    assert_eq!(offset_differences(1, vec![1]), vec![0]);
    let empty: Vec<i32> = vec![];
    assert_eq!(offset_differences(1, empty), vec![]);
}

Solution

/// Calculate the differences between elements of `values` offset by `offset`,
/// wrapping around from the end of `values` to the beginning.
///
/// Element `n` of the result is `values[(n+offset)%len] - values[n]`.
fn offset_differences<N>(offset: usize, values: Vec<N>) -> Vec<N>
where
    N: Copy + std::ops::Sub<Output = N>,
{
    let a = (&values).into_iter();
    let b = (&values).into_iter().cycle().skip(offset);
    a.zip(b).map(|(a, b)| *b - *a).take(values.len()).collect()
}

#[test]
fn test_offset_one() {
    assert_eq!(offset_differences(1, vec![1, 3, 5, 7]), vec![2, 2, 2, -6]);
    assert_eq!(offset_differences(1, vec![1, 3, 5]), vec![2, 2, -4]);
    assert_eq!(offset_differences(1, vec![1, 3]), vec![2, -2]);
}

#[test]
fn test_larger_offsets() {
    assert_eq!(offset_differences(2, vec![1, 3, 5, 7]), vec![4, 4, -4, -4]);
    assert_eq!(offset_differences(3, vec![1, 3, 5, 7]), vec![6, -2, -2, -2]);
    assert_eq!(offset_differences(4, vec![1, 3, 5, 7]), vec![0, 0, 0, 0]);
    assert_eq!(offset_differences(5, vec![1, 3, 5, 7]), vec![2, 2, 2, -6]);
}

#[test]
fn test_custom_type() {
    assert_eq!(
        offset_differences(1, vec![1.0, 11.0, 5.0, 0.0]),
        vec![10.0, -6.0, -5.0, 1.0]
    );
}

#[test]
fn test_degenerate_cases() {
    assert_eq!(offset_differences(1, vec![0]), vec![0]);
    assert_eq!(offset_differences(1, vec![1]), vec![0]);
    let empty: Vec<i32> = vec![];
    assert_eq!(offset_differences(1, empty), vec![]);
}

fn main() {}

Modules

In this segment:

This segment should take about 40 minutes

Modules

We have seen how impl blocks let us namespace functions to a type.

Similarly, mod lets us namespace types and functions:

mod foo {
    pub fn do_something() {
        println!("In the foo module");
    }
}

mod bar {
    pub fn do_something() {
        println!("In the bar module");
    }
}

fn main() {
    foo::do_something();
    bar::do_something();
}
This slide should take about 5 minutes.
  • Packages provide functionality and include a Cargo.toml file that describes how to build a bundle of one or more crates.
  • Crates are a tree of modules, where a binary crate creates an executable and a library crate compiles to a library.
  • Modules define organization and scope, and are the focus of this section.

Filesystem Hierarchy

Omitting the module content tells Rust to look for it in another file:

mod garden;

This tells Rust that the content of the garden module is found at src/garden.rs. Similarly, the garden::vegetables module is found at src/garden/vegetables.rs.

The crate root is in:

  • src/lib.rs (for a library crate)
  • src/main.rs (for a binary crate)

Modules defined in files can also be documented using “inner doc comments”. These document the item that contains them, in this case a module.

//! This module implements the garden, including a highly performant germination
//! implementation.

// Re-export types from this module.
pub use garden::Garden;
pub use seeds::SeedPacket;

/// Sow the given seed packets.
pub fn sow(seeds: Vec<SeedPacket>) {
    todo!()
}

/// Harvest the produce in the garden that is ready.
pub fn harvest(garden: &mut Garden) {
    todo!()
}
This slide should take about 5 minutes.
  • Before Rust 2018, modules needed to be located at module/mod.rs instead of module.rs. This alternative still works in editions after 2018.

  • The main reason for introducing filename.rs as an alternative to filename/mod.rs was that many files named mod.rs can be hard to tell apart in IDEs.

  • Deeper nesting can use folders, even if the main module is a file:

    src/
    ├── main.rs
    ├── top_module.rs
    └── top_module/
        └── sub_module.rs
    
  • The place where Rust looks for modules can be changed with a compiler directive:

    #[path = "some/path.rs"]
    mod some_module;

    This is useful, for example, if you would like to place tests for a module in a file named some_module_test.rs, similar to the convention in Go.

Visibility

Modules are a privacy boundary:

  • Module items are private by default (this hides implementation details).
  • Parent and sibling items are always visible.
  • In other words, if an item is visible in module foo, it is visible in all the descendants of foo.
mod outer {
    fn private() {
        println!("outer::private");
    }

    pub fn public() {
        println!("outer::public");
    }

    mod inner {
        fn private() {
            println!("outer::inner::private");
        }

        pub fn public() {
            println!("outer::inner::public");
            super::private();
        }
    }
}

fn main() {
    outer::public();
}
This slide should take about 5 minutes.
  • Use the pub keyword to make modules public.

Additionally, there are advanced pub(...) specifiers to restrict the scope of public visibility.

  • See the Rust Reference.
  • Configuring pub(crate) visibility is a common pattern.
  • Less commonly, you can give visibility to a specific path.
  • In any case, visibility must be granted to an ancestor module (and all of its descendants).
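
The specifiers listed above can be tried out in a single file. The sketch below uses made-up module and function names; the path form is written as pub(in super::super) here, and a crate-rooted path such as pub(in crate::outer) is the other common spelling.

```rust
mod outer {
    pub mod middle {
        pub mod inner {
            // Visible anywhere in this crate, but not to other crates.
            pub(crate) fn crate_visible() -> &'static str {
                "crate"
            }

            // Visible in the parent module `middle` and its descendants.
            pub(super) fn parent_visible() -> &'static str {
                "middle"
            }

            // Visible in the ancestor two levels up (`outer`) and below it.
            pub(in super::super) fn path_visible() -> &'static str {
                "outer"
            }
        }

        pub fn from_middle() -> [&'static str; 2] {
            [inner::parent_visible(), inner::path_visible()]
        }
    }

    pub fn from_outer() -> &'static str {
        // `middle::inner::parent_visible()` would not compile here:
        // pub(super) stops at `middle`.
        middle::inner::path_visible()
    }
}

fn main() {
    println!("{}", outer::middle::inner::crate_visible());
    println!("{:?}", outer::middle::from_middle());
    println!("{}", outer::from_outer());
}
```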

use, super, self

A module can bring symbols from another module into scope with use. You will typically see something like this at the top of each module:

use std::collections::HashSet;
use std::process::abort;

Paths

Paths are resolved as follows:

  1. As a relative path:

    • foo or self::foo refers to foo in the current module.
    • super::foo refers to foo in the parent module.
  2. As an absolute path:

    • crate::foo refers to foo in the root of the current crate.
    • bar::foo refers to foo in the bar crate.
This slide should take about 10 minutes.
  • It is common to “re-export” symbols at a shorter path. For example, the top-level lib.rs in a crate might have

    mod storage;
    
    pub use storage::disk::DiskStorage;
    pub use storage::network::NetworkStorage;

    making DiskStorage and NetworkStorage available to other crates with a convenient, short path.

  • For the most part, only items that appear in a module need to be use’d. However, a trait must be in scope to call any methods on that trait, even if a type implementing that trait is already in scope. For example, to use the read_to_string method on a type implementing the Read trait, you need to use std::io::Read.

  • The use statement can have a wildcard: use std::io::*. This is discouraged because it is not clear which items are imported, and those might change over time.
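
A single-file sketch of the first two notes follows. The storage module is declared inline so that the example is self-contained, and the name method and read_bytes helper are invented for illustration. Removing the use std::io::Read line makes the read_to_string call fail to compile, even though &[u8] implements Read.

```rust
use std::io::Read; // Needed to call `read_to_string` below.

mod storage {
    pub mod disk {
        pub struct DiskStorage;

        impl DiskStorage {
            pub fn name(&self) -> &'static str {
                "disk"
            }
        }
    }
}

// Re-export at a shorter path: users can now write `DiskStorage` instead of
// `storage::disk::DiskStorage`.
pub use storage::disk::DiskStorage;

// `&[u8]` implements `Read`, but the method is only callable because the
// trait itself is in scope.
fn read_bytes(mut source: &[u8]) -> std::io::Result<String> {
    let mut text = String::new();
    source.read_to_string(&mut text)?;
    Ok(text)
}

fn main() {
    let storage = DiskStorage;
    println!("{}", storage.name());
    println!("{}", read_bytes(b"hello").unwrap());
}
```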

Exercise: Modules for a GUI Library

In this exercise, you will reorganize a small GUI Library implementation. This library defines a Widget trait and a few implementations of that trait, as well as a main function.

It is typical to put each type or set of closely-related types into its own module, so each widget type should get its own module.

Cargo Setup

The Rust playground only supports one file, so you will need to make a Cargo project on your local filesystem:

cargo init gui-modules
cd gui-modules
cargo run

Edit the resulting src/main.rs to add mod statements, and add additional files in the src directory.

Source

Here’s the single-module implementation of the GUI library:

pub trait Widget {
    /// Natural width of `self`.
    fn width(&self) -> usize;

    /// Draw the widget into a buffer.
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write);

    /// Draw the widget on standard output.
    fn draw(&self) {
        let mut buffer = String::new();
        self.draw_into(&mut buffer);
        println!("{buffer}");
    }
}

pub struct Label {
    label: String,
}

impl Label {
    fn new(label: &str) -> Label {
        Label { label: label.to_owned() }
    }
}

pub struct Button {
    label: Label,
}

impl Button {
    fn new(label: &str) -> Button {
        Button { label: Label::new(label) }
    }
}

pub struct Window {
    title: String,
    widgets: Vec<Box<dyn Widget>>,
}

impl Window {
    fn new(title: &str) -> Window {
        Window { title: title.to_owned(), widgets: Vec::new() }
    }

    fn add_widget(&mut self, widget: Box<dyn Widget>) {
        self.widgets.push(widget);
    }

    fn inner_width(&self) -> usize {
        std::cmp::max(
            self.title.chars().count(),
            self.widgets.iter().map(|w| w.width()).max().unwrap_or(0),
        )
    }
}

impl Widget for Window {
    fn width(&self) -> usize {
        // Add 4 paddings for borders
        self.inner_width() + 4
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        let mut inner = String::new();
        for widget in &self.widgets {
            widget.draw_into(&mut inner);
        }

        let inner_width = self.inner_width();

        // TODO: Change draw_into to return Result<(), std::fmt::Error>. Then use the
        // ?-operator here instead of .unwrap().
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
        writeln!(buffer, "| {:^inner_width$} |", &self.title).unwrap();
        writeln!(buffer, "+={:=<inner_width$}=+", "").unwrap();
        for line in inner.lines() {
            writeln!(buffer, "| {:inner_width$} |", line).unwrap();
        }
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
    }
}

impl Widget for Button {
    fn width(&self) -> usize {
        self.label.width() + 8 // add a bit of padding
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        let width = self.width();
        let mut label = String::new();
        self.label.draw_into(&mut label);

        writeln!(buffer, "+{:-<width$}+", "").unwrap();
        for line in label.lines() {
            writeln!(buffer, "|{:^width$}|", &line).unwrap();
        }
        writeln!(buffer, "+{:-<width$}+", "").unwrap();
    }
}

impl Widget for Label {
    fn width(&self) -> usize {
        self.label.lines().map(|line| line.chars().count()).max().unwrap_or(0)
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        writeln!(buffer, "{}", &self.label).unwrap();
    }
}

fn main() {
    let mut window = Window::new("Rust GUI Demo 1.23");
    window.add_widget(Box::new(Label::new("This is a small text GUI demo.")));
    window.add_widget(Box::new(Button::new("Click me!")));
    window.draw();
}
This slide and its sub-slides should take about 15 minutes.

Encourage students to divide the code in a way that feels natural for them, and get accustomed to the required mod, use, and pub declarations. Afterward, discuss what organizations are most idiomatic.

Solution

src
├── main.rs
├── widgets
│   ├── button.rs
│   ├── label.rs
│   └── window.rs
└── widgets.rs
// ---- src/widgets.rs ----
mod button;
mod label;
mod window;

pub trait Widget {
    /// Natural width of `self`.
    fn width(&self) -> usize;

    /// Draw the widget into a buffer.
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write);

    /// Draw the widget on standard output.
    fn draw(&self) {
        let mut buffer = String::new();
        self.draw_into(&mut buffer);
        println!("{buffer}");
    }
}

pub use button::Button;
pub use label::Label;
pub use window::Window;
// ---- src/widgets/label.rs ----
use super::Widget;

pub struct Label {
    label: String,
}

impl Label {
    pub fn new(label: &str) -> Label {
        Label { label: label.to_owned() }
    }
}

impl Widget for Label {
    fn width(&self) -> usize {
        self.label.lines().map(|line| line.chars().count()).max().unwrap_or(0)
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        writeln!(buffer, "{}", &self.label).unwrap();
    }
}
// ---- src/widgets/button.rs ----
use super::{Label, Widget};

pub struct Button {
    label: Label,
}

impl Button {
    pub fn new(label: &str) -> Button {
        Button { label: Label::new(label) }
    }
}

impl Widget for Button {
    fn width(&self) -> usize {
        self.label.width() + 8 // add a bit of padding
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        let width = self.width();
        let mut label = String::new();
        self.label.draw_into(&mut label);

        writeln!(buffer, "+{:-<width$}+", "").unwrap();
        for line in label.lines() {
            writeln!(buffer, "|{:^width$}|", &line).unwrap();
        }
        writeln!(buffer, "+{:-<width$}+", "").unwrap();
    }
}
// ---- src/widgets/window.rs ----
use super::Widget;

pub struct Window {
    title: String,
    widgets: Vec<Box<dyn Widget>>,
}

impl Window {
    pub fn new(title: &str) -> Window {
        Window { title: title.to_owned(), widgets: Vec::new() }
    }

    pub fn add_widget(&mut self, widget: Box<dyn Widget>) {
        self.widgets.push(widget);
    }

    fn inner_width(&self) -> usize {
        std::cmp::max(
            self.title.chars().count(),
            self.widgets.iter().map(|w| w.width()).max().unwrap_or(0),
        )
    }
}

impl Widget for Window {
    fn width(&self) -> usize {
        // Add 4 paddings for borders
        self.inner_width() + 4
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        let mut inner = String::new();
        for widget in &self.widgets {
            widget.draw_into(&mut inner);
        }

        let inner_width = self.inner_width();

        // TODO: after learning about error handling, you can change
        // draw_into to return Result<(), std::fmt::Error>. Then use
        // the ?-operator here instead of .unwrap().
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
        writeln!(buffer, "| {:^inner_width$} |", &self.title).unwrap();
        writeln!(buffer, "+={:=<inner_width$}=+", "").unwrap();
        for line in inner.lines() {
            writeln!(buffer, "| {:inner_width$} |", line).unwrap();
        }
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
    }
}
// ---- src/main.rs ----
mod widgets;

use widgets::Widget;

fn main() {
    let mut window = widgets::Window::new("Rust GUI Demo 1.23");
    window
        .add_widget(Box::new(widgets::Label::new("This is a small text GUI demo.")));
    window.add_widget(Box::new(widgets::Button::new("Click me!")));
    window.draw();
}
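
The TODO in `Window::draw_into` suggests returning `Result<(), std::fmt::Error>` and using the `?` operator instead of `.unwrap()`. A minimal sketch of that change, reduced to a standalone `Label` type so that it runs on its own (the border drawn here and the inherent-method layout are assumptions for illustration, not part of the exercise solution):

```rust
use std::fmt::{self, Write};

struct Label {
    label: String,
}

impl Label {
    fn width(&self) -> usize {
        self.label.lines().map(|line| line.chars().count()).max().unwrap_or(0)
    }

    /// Returns `fmt::Result` so that write failures propagate to the caller.
    fn draw_into(&self, buffer: &mut dyn Write) -> fmt::Result {
        let width = self.width();
        writeln!(buffer, "+{:-<width$}+", "")?;
        for line in self.label.lines() {
            writeln!(buffer, "|{:<width$}|", line)?;
        }
        // The final write's result becomes the function's result.
        writeln!(buffer, "+{:-<width$}+", "")
    }
}

fn main() -> fmt::Result {
    let mut out = String::new();
    Label { label: "hi".to_string() }.draw_into(&mut out)?;
    print!("{out}");
    Ok(())
}
```

In the full exercise, the same signature change would apply to the `Widget` trait method, and every implementation would then forward errors with `?`.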

Testing

In this segment:

This segment should take about 1 hour and 5 minutes

Unit Tests

Rust and Cargo come with a simple unit test framework:

  • Unit tests are supported throughout your code.

  • Integration tests are supported via the tests/ directory.

Tests are marked with #[test]. Unit tests are often put in a nested tests module, using #[cfg(test)] to conditionally compile them only when building tests.

fn first_word(text: &str) -> &str {
    match text.find(' ') {
        Some(idx) => &text[..idx],
        None => &text,
    }
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_empty() {
        assert_eq!(first_word(""), "");
    }

    #[test]
    fn test_single_word() {
        assert_eq!(first_word("Hello"), "Hello");
    }

    #[test]
    fn test_multiple_words() {
        assert_eq!(first_word("Hello World"), "Hello");
    }
}
  • This lets you unit test private helpers.
  • The #[cfg(test)] attribute is only active when you run cargo test.
This slide should take about 5 minutes.

Run the tests in the playground in order to show their results.

Other Types of Tests

Integration Tests

If you want to test your library as a client, use an integration test.

Create a .rs file under tests/:

// tests/my_library.rs
use my_library::init;

#[test]
fn test_init() {
    assert!(init().is_ok());
}

These tests only have access to the public API of your crate.

Documentation Tests

Rust has built-in support for documentation tests:

/// Shortens a string to the given length.
///
/// ```
/// # use playground::shorten_string;
/// assert_eq!(shorten_string("Hello World", 5), "Hello");
/// assert_eq!(shorten_string("Hello World", 20), "Hello World");
/// ```
pub fn shorten_string(s: &str, length: usize) -> &str {
    &s[..std::cmp::min(length, s.len())]
}
  • Code blocks in /// comments are automatically treated as Rust code.
  • The code will be compiled and executed as part of cargo test.
  • Adding # at the start of a line in the code hides that line from the docs, but it is still compiled and run.
  • Test the above code on the Rust Playground.

Useful Crates

Rust comes with only basic support for writing tests.

Here are some additional crates which we recommend for writing tests:

  • googletest: comprehensive test assertion library in the tradition of GoogleTest for C++.
  • proptest: property-based testing for Rust.
  • rstest: support for fixtures and parameterised tests.
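
To give a feel for what property-based and parameterised tests check, here is a hand-rolled sketch that uses only the standard library. It is not the proptest or rstest API: the `holds_for` helper and the input list are assumptions made for this illustration, whereas the crates above generate or inject the inputs for you.

```rust
fn first_word(text: &str) -> &str {
    match text.find(' ') {
        Some(idx) => &text[..idx],
        None => text,
    }
}

/// Property: the result is always a prefix of the input and never contains a space.
fn holds_for(text: &str) -> bool {
    let word = first_word(text);
    text.starts_with(word) && !word.contains(' ')
}

fn main() {
    // Hand-picked inputs; a property-testing crate would generate many of these.
    let inputs = ["", "Hello", "Hello World", " leading", "trailing ", "a  b"];
    for input in inputs {
        assert!(holds_for(input), "property failed for {input:?}");
    }
    println!("property held for all inputs");
}
```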

GoogleTest

The GoogleTest crate allows for flexible test assertions using matchers:

use googletest::prelude::*;

#[googletest::test]
fn test_elements_are() {
    let value = vec!["foo", "bar", "baz"];
    expect_that!(value, elements_are!(eq("foo"), lt("xyz"), starts_with("b")));
}

If we change the last element to "!", the test fails with a structured error message pin-pointing the error:

---- test_elements_are stdout ----
Value of: value
Expected: has elements:
  0. is equal to "foo"
  1. is less than "xyz"
  2. starts with prefix "!"
Actual: ["foo", "bar", "baz"],
  where element #2 is "baz", which does not start with "!"
  at src/testing/googletest.rs:6:5
Error: See failure output above
This slide should take about 5 minutes.
  • GoogleTest is not part of the Rust Playground, so you need to run this example in a local environment. Use cargo add googletest to quickly add it to an existing Cargo project.

  • The use googletest::prelude::*; line imports a number of commonly used macros and types.

  • This just scratches the surface; there are many built-in matchers.

  • A particularly nice feature is that mismatches in multi-line strings are shown as a diff:

#[test]
fn test_multiline_string_diff() {
    let haiku = "Memory safety found,\n\
                 Rust's strong typing guides the way,\n\
                 Secure code you'll write.";
    assert_that!(
        haiku,
        eq("Memory safety found,\n\
            Rust's silly humor guides the way,\n\
            Secure code you'll write.")
    );
}

shows a color-coded diff (colors not shown here):

Value of: haiku
Expected: is equal to "Memory safety found,\nRust's silly humor guides the way,\nSecure code you'll write."
Actual: "Memory safety found,\nRust's strong typing guides the way,\nSecure code you'll write.",
  which isn't equal to "Memory safety found,\nRust's silly humor guides the way,\nSecure code you'll write."
Difference(-actual / +expected):
 Memory safety found,
-Rust's strong typing guides the way,
+Rust's silly humor guides the way,
 Secure code you'll write.
  at src/testing/googletest.rs:17:5
  • The crate is a Rust port of GoogleTest for C++.

  • GoogleTest is available for use in AOSP.

Mocking

For mocking, Mockall is a widely used library. You need to refactor your code to use traits, which you can then quickly mock:

use std::time::Duration;

#[mockall::automock]
pub trait Pet {
    fn is_hungry(&self, since_last_meal: Duration) -> bool;
}

#[test]
fn test_robot_dog() {
    let mut mock_dog = MockPet::new();
    mock_dog.expect_is_hungry().return_const(true);
    assert_eq!(mock_dog.is_hungry(Duration::from_secs(10)), true);
}
This slide should take about 5 minutes.
  • The advice here is for Android (AOSP) where Mockall is the recommended mocking library. There are other mocking libraries available on crates.io, in particular in the area of mocking HTTP services. The other mocking libraries work in a similar fashion as Mockall, meaning that they make it easy to get a mock implementation of a given trait.

  • Note that mocking is somewhat controversial: mocks allow you to completely isolate a test from its dependencies. The immediate result is faster and more stable test execution. On the other hand, the mocks can be configured wrongly and return output different from what the real dependencies would do.

    If at all possible, it is recommended that you use the real dependencies. As an example, many databases allow you to configure an in-memory backend. This means that you get the correct behavior in your tests, plus they are fast and will automatically clean up after themselves.

    Similarly, many web frameworks allow you to start an in-process server which binds to a random port on localhost. Always prefer this over mocking away the framework since it helps you test your code in the real environment.

  • Mockall is not part of the Rust Playground, so you need to run this example in a local environment. Use cargo add mockall to quickly add Mockall to an existing Cargo project.

  • Mockall has a lot more functionality. In particular, you can set up expectations which depend on the arguments passed. Here we use this to mock a cat which becomes hungry 3 hours after the last time it was fed:

#[test]
fn test_robot_cat() {
    let mut mock_cat = MockPet::new();
    mock_cat
        .expect_is_hungry()
        .with(mockall::predicate::gt(Duration::from_secs(3 * 3600)))
        .return_const(true);
    mock_cat.expect_is_hungry().return_const(false);
    assert_eq!(mock_cat.is_hungry(Duration::from_secs(1 * 3600)), false);
    assert_eq!(mock_cat.is_hungry(Duration::from_secs(5 * 3600)), true);
}
  • You can use .times(n) to limit the number of times a mock method can be called to n — the mock will automatically panic when dropped if this isn’t satisfied.
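
For comparison, a test double can also be written by hand once the code depends on a trait. The sketch below uses only the standard library; `FakeCat` and `feeding_advice` are hypothetical names for this illustration and are not part of Mockall.

```rust
use std::time::Duration;

pub trait Pet {
    fn is_hungry(&self, since_last_meal: Duration) -> bool;
}

/// Hypothetical hand-written test double: hungry once `threshold` has passed.
struct FakeCat {
    threshold: Duration,
}

impl Pet for FakeCat {
    fn is_hungry(&self, since_last_meal: Duration) -> bool {
        since_last_meal > self.threshold
    }
}

/// Code under test: depends only on the trait, not on a concrete pet.
fn feeding_advice(pet: &dyn Pet, since_last_meal: Duration) -> &'static str {
    if pet.is_hungry(since_last_meal) { "feed now" } else { "wait" }
}

fn main() {
    let cat = FakeCat { threshold: Duration::from_secs(3 * 3600) };
    println!("{}", feeding_advice(&cat, Duration::from_secs(5 * 3600)));
}
```

A hand-written fake like this takes more typing than `automock`, but its behavior is ordinary code that can be read and reused across tests.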

Compiler Lints and Clippy

The Rust compiler produces fantastic error messages, as well as helpful built-in lints. Clippy provides even more lints, organized into groups that can be enabled per-project.

#[deny(clippy::cast_possible_truncation)]
fn main() {
    let x = 3;
    while (x < 70000) {
        x *= 2;
    }
    println!("X probably fits in a u16, right? {}", x as u16);
}
This slide should take about 5 minutes.

Run the code sample and examine the error message. There are also lints visible here, but those will not be shown once the code compiles. Switch to the Playground site to show those lints.

After resolving the lints, run clippy on the playground site to show clippy warnings. Clippy has extensive documentation of its lints, and adds new lints (including default-deny lints) all the time.

Note that errors or warnings with help: ... can be fixed with cargo fix or via your editor.
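
One possible cleaned-up version of the sample, shown as a sketch rather than as the only fix: the variable becomes `mut`, the parentheses around the loop condition are dropped, and the truncating `as u16` cast is replaced with `u16::try_from`. The helper name `double_until` is an assumption for this illustration.

```rust
/// Doubles `start` until it reaches at least `limit`.
/// Assumes `start > 0`; otherwise the loop would never terminate.
fn double_until(start: u32, limit: u32) -> u32 {
    let mut x = start; // `mut` is required because `x` is reassigned
    while x < limit {
        x *= 2;
    }
    x
}

fn main() {
    let x = double_until(3, 70000);
    // `try_from` reports overflow instead of silently truncating like `as u16`.
    match u16::try_from(x) {
        Ok(small) => println!("{x} fits in a u16: {small}"),
        Err(_) => println!("{x} does not fit in a u16"),
    }
}
```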

Luhn Algorithm

The Luhn algorithm is used to validate credit card numbers. The algorithm takes a string as input and does the following to validate the credit card number:

  • Ignore all spaces. Reject numbers with fewer than two digits.

  • Moving from right to left, double every second digit: for the number 1234, we double 3 and 1. For the number 98765, we double 6 and 8.

  • After doubling a digit, sum the digits if the result is greater than 9. So doubling 7 becomes 14, which becomes 1 + 4 = 5.

  • Sum all the undoubled and doubled digits.

  • The credit card number is valid if the sum ends with 0.
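
The doubling step can be sketched in isolation. The helper name `double_and_fold` is hypothetical; it covers only the single-digit rule, not the whole algorithm.

```rust
/// Doubles a single digit and, if the result exceeds 9, sums its two digits.
fn double_and_fold(digit: u32) -> u32 {
    let doubled = digit * 2;
    if doubled > 9 { doubled / 10 + doubled % 10 } else { doubled }
}

fn main() {
    for digit in 0..=9 {
        println!("{digit} -> {}", double_and_fold(digit));
    }
}
```

For doubled values between 10 and 18, summing the two digits gives the same result as subtracting 9, which is the shortcut many implementations use.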

The provided code contains a buggy implementation of the Luhn algorithm, along with two basic unit tests that confirm that most of the algorithm is implemented correctly.

Copy the code below to https://play.rust-lang.org/ and write additional tests to uncover bugs in the provided implementation, fixing any bugs you find.

pub fn luhn(cc_number: &str) -> bool {
    let mut sum = 0;
    let mut double = false;

    for c in cc_number.chars().rev() {
        if let Some(digit) = c.to_digit(10) {
            if double {
                let double_digit = digit * 2;
                sum +=
                    if double_digit > 9 { double_digit - 9 } else { double_digit };
            } else {
                sum += digit;
            }
            double = !double;
        } else {
            continue;
        }
    }

    sum % 10 == 0
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_valid_cc_number() {
        assert!(luhn("4263 9826 4026 9299"));
        assert!(luhn("4539 3195 0343 6467"));
        assert!(luhn("7992 7398 713"));
    }

    #[test]
    fn test_invalid_cc_number() {
        assert!(!luhn("4223 9826 4026 9299"));
        assert!(!luhn("4539 3195 0343 6476"));
        assert!(!luhn("8273 1232 7352 0569"));
    }
}

Solution

// This is the buggy version that appears in the problem.
#[cfg(never)]
pub fn luhn(cc_number: &str) -> bool {
    let mut sum = 0;
    let mut double = false;

    for c in cc_number.chars().rev() {
        if let Some(digit) = c.to_digit(10) {
            if double {
                let double_digit = digit * 2;
                sum +=
                    if double_digit > 9 { double_digit - 9 } else { double_digit };
            } else {
                sum += digit;
            }
            double = !double;
        } else {
            continue;
        }
    }

    sum % 10 == 0
}

// This is the solution and passes all of the tests below.
pub fn luhn(cc_number: &str) -> bool {
    let mut sum = 0;
    let mut double = false;
    let mut digits = 0;

    for c in cc_number.chars().rev() {
        if let Some(digit) = c.to_digit(10) {
            digits += 1;
            if double {
                let double_digit = digit * 2;
                sum +=
                    if double_digit > 9 { double_digit - 9 } else { double_digit };
            } else {
                sum += digit;
            }
            double = !double;
        } else if c.is_whitespace() {
            continue;
        } else {
            return false;
        }
    }

    digits >= 2 && sum % 10 == 0
}

fn main() {
    let cc_number = "1234 5678 1234 5670";
    println!(
        "Is {cc_number} a valid credit card number? {}",
        if luhn(cc_number) { "yes" } else { "no" }
    );
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_valid_cc_number() {
        assert!(luhn("4263 9826 4026 9299"));
        assert!(luhn("4539 3195 0343 6467"));
        assert!(luhn("7992 7398 713"));
    }

    #[test]
    fn test_invalid_cc_number() {
        assert!(!luhn("4223 9826 4026 9299"));
        assert!(!luhn("4539 3195 0343 6476"));
        assert!(!luhn("8273 1232 7352 0569"));
    }

    #[test]
    fn test_non_digit_cc_number() {
        assert!(!luhn("foo"));
        assert!(!luhn("foo 0 0"));
    }

    #[test]
    fn test_empty_cc_number() {
        assert!(!luhn(""));
        assert!(!luhn(" "));
        assert!(!luhn("  "));
        assert!(!luhn("    "));
    }

    #[test]
    fn test_single_digit_cc_number() {
        assert!(!luhn("0"));
    }

    #[test]
    fn test_two_digit_cc_number() {
        assert!(luhn(" 0 0 "));
    }
}

Welcome Back

In this session:

Including 10 minute breaks, this session should take about 2 hours

Error Handling

In this segment:

This segment should take about 45 minutes

Panics

Rust handles fatal errors with a “panic”.

Rust will trigger a panic if a fatal error happens at runtime:

fn main() {
    let v = vec![10, 20, 30];
    println!("v[100]: {}", v[100]);
}
  • Panics are for unrecoverable and unexpected errors.
    • Panics are symptoms of bugs in the program.
    • Runtime failures like failed bounds checks can panic.
    • Assertions (such as assert!) panic on failure.
    • Purpose-specific panics can use the panic! macro.
  • A panic will “unwind” the stack, dropping values just as if the functions had returned.
  • Use non-panicking APIs (such as Vec::get) if crashing is not acceptable.
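
As a short sketch of the last point, `Vec::get` (available on slices as well) returns an `Option` instead of panicking on an out-of-range index. The helper name `describe` is hypothetical.

```rust
/// Looks up `index` without risking a panic.
fn describe(v: &[i32], index: usize) -> String {
    match v.get(index) {
        Some(value) => format!("v[{index}]: {value}"),
        None => format!("v[{index}]: out of bounds"),
    }
}

fn main() {
    let v = vec![10, 20, 30];
    println!("{}", describe(&v, 1));
    println!("{}", describe(&v, 100));
}
```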
This slide should take about 3 minutes.

By default, a panic will cause the stack to unwind. The unwinding can be caught:

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| "No problem here!");
    println!("{result:?}");

    let result = panic::catch_unwind(|| {
        panic!("oh no!");
    });
    println!("{result:?}");
}
  • Catching is unusual; do not attempt to implement exceptions with catch_unwind!
  • This can be useful in servers which should keep running even if a single request crashes.
  • This does not work if panic = 'abort' is set in your Cargo.toml.

Try Operator

Runtime errors like connection-refused or file-not-found are handled with the Result type, but matching this type on every call can be cumbersome. The try-operator ? is used to return errors to the caller. It lets you turn the common

match some_expression {
    Ok(value) => value,
    Err(err) => return Err(err),
}

into the much simpler

some_expression?

We can use this to simplify our error handling code:

use std::io::Read;
use std::{fs, io};

fn read_username(path: &str) -> Result<String, io::Error> {
    let username_file_result = fs::File::open(path);
    let mut username_file = match username_file_result {
        Ok(file) => file,
        Err(err) => return Err(err),
    };

    let mut username = String::new();
    match username_file.read_to_string(&mut username) {
        Ok(_) => Ok(username),
        Err(err) => Err(err),
    }
}

fn main() {
    //fs::write("config.dat", "alice").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}
This slide should take about 5 minutes.

Simplify the read_username function to use ?.
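
One way the simplified function might look, with both `match` expressions replaced by `?` (a sketch; other arrangements, such as chaining the two calls, work equally well):

```rust
use std::io::Read;
use std::{fs, io};

fn read_username(path: &str) -> Result<String, io::Error> {
    // `?` returns early with the error if the file cannot be opened.
    let mut username_file = fs::File::open(path)?;

    let mut username = String::new();
    // `?` again: a read failure is returned to the caller unchanged.
    username_file.read_to_string(&mut username)?;
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "alice").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}
```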

Key points:

  • The username variable can be either Ok(string) or Err(error).
  • Use the fs::write call to test out the different scenarios: no file, empty file, file with username.
  • Note that main can return a Result<(), E> as long as it implements std::process::Termination. In practice, this means that E implements Debug. The executable will print the Err variant and return a nonzero exit status on error.
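
A minimal sketch of a `main` that returns a `Result`. The helper `parse_port` is a hypothetical stand-in for any fallible operation; `ParseIntError` implements `Debug`, so it is accepted as the error type here.

```rust
use std::num::ParseIntError;

/// Hypothetical helper used only for this illustration.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn main() -> Result<(), ParseIntError> {
    // On error, `?` returns from main; the runtime prints the error using
    // its Debug representation and exits with a nonzero status.
    let port = parse_port("8080")?;
    println!("port: {port}");
    Ok(())
}
```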

Implicit Conversions

The effective expansion of ? is a little more complicated than previously indicated:

expression?

works the same as

match expression {
    Ok(value) => value,
    Err(err)  => return Err(From::from(err)),
}

The From::from call here means we attempt to convert the error type to the type returned by the function. This makes it easy to encapsulate errors into higher-level errors.

Example

use std::error::Error;
use std::fmt::{self, Display, Formatter};
use std::fs::File;
use std::io::{self, Read};

#[derive(Debug)]
enum ReadUsernameError {
    IoError(io::Error),
    EmptyUsername(String),
}

impl Error for ReadUsernameError {}

impl Display for ReadUsernameError {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Self::IoError(e) => write!(f, "IO error: {e}"),
            Self::EmptyUsername(path) => write!(f, "Found no username in {path}"),
        }
    }
}

impl From<io::Error> for ReadUsernameError {
    fn from(err: io::Error) -> Self {
        Self::IoError(err)
    }
}

fn read_username(path: &str) -> Result<String, ReadUsernameError> {
    let mut username = String::with_capacity(100);
    File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(ReadUsernameError::EmptyUsername(String::from(path)));
    }
    Ok(username)
}

fn main() {
    //std::fs::write("config.dat", "").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}
This slide should take about 5 minutes.

The ? operator must return a value compatible with the return type of the function. For Result, it means that the error types have to be compatible. A function that returns Result<T, ErrorOuter> can only use ? on a value of type Result<U, ErrorInner> if ErrorOuter and ErrorInner are the same type or if ErrorOuter implements From<ErrorInner>.

A common alternative to a From implementation is Result::map_err, especially when the conversion only happens in one place.
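
A short sketch of the `map_err` alternative: the conversion is written inline at the one call site instead of in a `From` implementation. `FormError` and `parse_age` are hypothetical names for this illustration.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum FormError {
    InvalidAge(String),
}

fn parse_age(text: &str) -> Result<u8, FormError> {
    let age = text
        .trim()
        .parse::<u8>()
        // Convert the error right here; no `From<ParseIntError>` impl is needed.
        .map_err(|err: ParseIntError| FormError::InvalidAge(format!("{text:?}: {err}")))?;
    Ok(age)
}

fn main() {
    println!("{:?}", parse_age("42"));
    println!("{:?}", parse_age("forty"));
}
```

Because the closure has access to local variables such as `text`, `map_err` can also attach context that a blanket `From` implementation could not.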

There is no compatibility requirement for Option. A function returning Option<T> can use the ? operator on Option<U> for arbitrary T and U types.

A function that returns Result cannot use ? on Option and vice versa. However, Option::ok_or converts Option to Result whereas Result::ok turns Result into Option.
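
The three rules above can be sketched with standard-library calls only. The function names are hypothetical.

```rust
/// `?` on an Option<char> inside a function returning Option<u32>:
/// the contained types do not need to match.
fn first_char_code(text: &str) -> Option<u32> {
    let c: char = text.chars().next()?;
    Some(c as u32)
}

/// Option -> Result with `ok_or`, so that `?` works in a Result-returning function.
fn first_number(items: &[&str]) -> Result<i32, &'static str> {
    let first = items.first().ok_or("empty list")?;
    first.parse::<i32>().map_err(|_| "not a number")
}

/// Result -> Option with `ok`, discarding the error details.
fn parse_lenient(text: &str) -> Option<i32> {
    text.parse::<i32>().ok()
}

fn main() {
    println!("{:?}", first_char_code("A"));
    println!("{:?}", first_number(&["7", "x"]));
    println!("{:?}", parse_lenient("1i3"));
}
```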

Dynamic Error Types

Sometimes we want to allow any type of error to be returned without writing our own enum covering all the different possibilities. The std::error::Error trait makes it easy to create a trait object that can contain any error.

use std::error::Error;
use std::fs;
use std::io::Read;

fn read_count(path: &str) -> Result<i32, Box<dyn Error>> {
    let mut count_str = String::new();
    fs::File::open(path)?.read_to_string(&mut count_str)?;
    let count: i32 = count_str.parse()?;
    Ok(count)
}

fn main() {
    fs::write("count.dat", "1i3").unwrap();
    match read_count("count.dat") {
        Ok(count) => println!("Count: {count}"),
        Err(err) => println!("Error: {err}"),
    }
}
This slide should take about 5 minutes.

The read_count function can return std::io::Error (from file operations) or std::num::ParseIntError (from String::parse).

Boxing errors saves on code, but gives up the ability to cleanly handle different error cases differently in the program. As such it’s generally not a good idea to use Box<dyn Error> in the public API of a library, but it can be a good option in a program where you just want to display the error message somewhere.
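
To illustrate the trade-off, the sketch below shows what handling specific cases looks like once errors are boxed: the caller has to guess at concrete types with `downcast_ref` rather than match on an enum. `parse_count` and `describe` are hypothetical helpers.

```rust
use std::error::Error;
use std::num::ParseIntError;

fn parse_count(text: &str) -> Result<i32, Box<dyn Error>> {
    let count: i32 = text.trim().parse()?;
    if count < 0 {
        // A plain string can be converted into a boxed error as well.
        return Err("count must not be negative".into());
    }
    Ok(count)
}

/// Recovers the concrete error type at runtime, if it is the one we expect.
fn describe(err: &(dyn Error + 'static)) -> &'static str {
    if err.downcast_ref::<ParseIntError>().is_some() {
        "not a valid integer"
    } else {
        "some other error"
    }
}

fn main() {
    for input in ["1i3", "-5", "42"] {
        match parse_count(input) {
            Ok(count) => println!("{input}: count = {count}"),
            Err(err) => println!("{input}: {} ({err})", describe(&*err)),
        }
    }
}
```

The compiler cannot check that every possible error type has been considered, which is why an enum is usually preferable in a library's public API.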

Make sure to implement the std::error::Error trait when defining a custom error type so it can be boxed. If you need to support the no_std attribute, keep in mind that the trait is available without std (as core::error::Error) on stable Rust only from version 1.81 onward; older toolchains require nightly.

thiserror and anyhow

The thiserror and anyhow crates are widely used to simplify error handling.

  • thiserror is often used in libraries to create custom error types that implement From<T>.
  • anyhow is often used by applications to help with error handling in functions, including adding contextual information to your errors.
use anyhow::{bail, Context, Result};
use std::fs;
use std::io::Read;
use thiserror::Error;

#[derive(Clone, Debug, Eq, Error, PartialEq)]
#[error("Found no username in {0}")]
struct EmptyUsernameError(String);

fn read_username(path: &str) -> Result<String> {
    let mut username = String::with_capacity(100);
    fs::File::open(path)
        .with_context(|| format!("Failed to open {path}"))?
        .read_to_string(&mut username)
        .context("Failed to read")?;
    if username.is_empty() {
        bail!(EmptyUsernameError(path.to_string()));
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err) => println!("Error: {err:?}"),
    }
}
This slide should take about 5 minutes.

thiserror

  • The Error derive macro is provided by thiserror, and has lots of useful attributes to help define error types in a compact way.
  • The std::error::Error trait is derived automatically.
  • The message from #[error] is used to derive the Display trait.
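
For comparison, here is a hand-written equivalent of `EmptyUsernameError` using only the standard library. It is a sketch of roughly what the derive provides for this type, not the macro's actual expansion.

```rust
use std::error::Error;
use std::fmt::{self, Display, Formatter};

#[derive(Clone, Debug, Eq, PartialEq)]
struct EmptyUsernameError(String);

// Roughly what #[error("Found no username in {0}")] provides.
impl Display for EmptyUsernameError {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "Found no username in {}", self.0)
    }
}

// Roughly what #[derive(Error)] provides.
impl Error for EmptyUsernameError {}

fn main() {
    let err: Box<dyn Error> = Box::new(EmptyUsernameError("config.dat".to_string()));
    println!("Error: {err}");
}
```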

anyhow

  • anyhow::Error is essentially a wrapper around Box<dyn Error>. As such, it is generally not a good choice for the public API of a library, but it is widely used in applications.
  • anyhow::Result<V> is a type alias for Result<V, anyhow::Error>.
  • The actual error type inside of it can be extracted for examination if necessary.
  • Functionality provided by anyhow::Result<T> may be familiar to Go developers, as it provides similar usage patterns and ergonomics to (T, error) from Go.
  • anyhow::Context is a trait implemented for the standard Result and Option types. use anyhow::Context is necessary to enable .context() and .with_context() on those types.

Structured Error Handling with Result

The following implements a very simple parser for an expression language. However, it handles errors by panicking. Rewrite it to instead use idiomatic error handling and propagate errors to a return from main. Feel free to use thiserror and anyhow.

HINT: start by fixing error handling in the parse function. Once that is working correctly, update Tokenizer to implement Iterator<Item=Result<Token, TokenizerError>> and handle that in the parser.

use std::iter::Peekable;
use std::str::Chars;

/// An arithmetic operator.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Op {
    Add,
    Sub,
}

/// A token in the expression language.
#[derive(Debug, PartialEq)]
enum Token {
    Number(String),
    Identifier(String),
    Operator(Op),
}

/// An expression in the expression language.
#[derive(Debug, PartialEq)]
enum Expression {
    /// A reference to a variable.
    Var(String),
    /// A literal number.
    Number(u32),
    /// A binary operation.
    Operation(Box<Expression>, Op, Box<Expression>),
}

fn tokenize(input: &str) -> Tokenizer {
    return Tokenizer(input.chars().peekable());
}

struct Tokenizer<'a>(Peekable<Chars<'a>>);

impl<'a> Iterator for Tokenizer<'a> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        let c = self.0.next()?;
        match c {
            '0'..='9' => {
                let mut num = String::from(c);
                while let Some(c @ '0'..='9') = self.0.peek() {
                    num.push(*c);
                    self.0.next();
                }
                Some(Token::Number(num))
            }
            'a'..='z' => {
                let mut ident = String::from(c);
                while let Some(c @ ('a'..='z' | '_' | '0'..='9')) = self.0.peek() {
                    ident.push(*c);
                    self.0.next();
                }
                Some(Token::Identifier(ident))
            }
            '+' => Some(Token::Operator(Op::Add)),
            '-' => Some(Token::Operator(Op::Sub)),
            _ => panic!("Unexpected character {c}"),
        }
    }
}

fn parse(input: &str) -> Expression {
    let mut tokens = tokenize(input);

    fn parse_expr<'a>(tokens: &mut Tokenizer<'a>) -> Expression {
        let Some(tok) = tokens.next() else {
            panic!("Unexpected end of input");
        };
        let expr = match tok {
            Token::Number(num) => {
                let v = num.parse().expect("Invalid 32-bit integer");
                Expression::Number(v)
            }
            Token::Identifier(ident) => Expression::Var(ident),
            Token::Operator(_) => panic!("Unexpected token {tok:?}"),
        };
        // Look ahead to parse a binary operation if present.
        match tokens.next() {
            None => expr,
            Some(Token::Operator(op)) => Expression::Operation(
                Box::new(expr),
                op,
                Box::new(parse_expr(tokens)),
            ),
            Some(tok) => panic!("Unexpected token {tok:?}"),
        }
    }

    parse_expr(&mut tokens)
}

fn main() {
    let expr = parse("10+foo+20-30");
    println!("{expr:?}");
}

Solution

use thiserror::Error;
use std::iter::Peekable;
use std::str::Chars;

/// An arithmetic operator.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Op {
    Add,
    Sub,
}

/// A token in the expression language.
#[derive(Debug, PartialEq)]
enum Token {
    Number(String),
    Identifier(String),
    Operator(Op),
}

/// An expression in the expression language.
#[derive(Debug, PartialEq)]
enum Expression {
    /// A reference to a variable.
    Var(String),
    /// A literal number.
    Number(u32),
    /// A binary operation.
    Operation(Box<Expression>, Op, Box<Expression>),
}

fn tokenize(input: &str) -> Tokenizer {
    return Tokenizer(input.chars().peekable());
}

#[derive(Debug, Error)]
enum TokenizerError {
    #[error("Unexpected character '{0}' in input")]
    UnexpectedCharacter(char),
}

struct Tokenizer<'a>(Peekable<Chars<'a>>);

impl<'a> Iterator for Tokenizer<'a> {
    type Item = Result<Token, TokenizerError>;

    fn next(&mut self) -> Option<Result<Token, TokenizerError>> {
        let c = self.0.next()?;
        match c {
            '0'..='9' => {
                let mut num = String::from(c);
                while let Some(c @ '0'..='9') = self.0.peek() {
                    num.push(*c);
                    self.0.next();
                }
                Some(Ok(Token::Number(num)))
            }
            'a'..='z' => {
                let mut ident = String::from(c);
                while let Some(c @ ('a'..='z' | '_' | '0'..='9')) = self.0.peek() {
                    ident.push(*c);
                    self.0.next();
                }
                Some(Ok(Token::Identifier(ident)))
            }
            '+' => Some(Ok(Token::Operator(Op::Add))),
            '-' => Some(Ok(Token::Operator(Op::Sub))),
            _ => Some(Err(TokenizerError::UnexpectedCharacter(c))),
        }
    }
}

#[derive(Debug, Error)]
enum ParserError {
    #[error("Tokenizer error: {0}")]
    TokenizerError(#[from] TokenizerError),
    #[error("Unexpected end of input")]
    UnexpectedEOF,
    #[error("Unexpected token {0:?}")]
    UnexpectedToken(Token),
    #[error("Invalid number")]
    InvalidNumber(#[from] std::num::ParseIntError),
}

fn parse(input: &str) -> Result<Expression, ParserError> {
    let mut tokens = tokenize(input);

    fn parse_expr<'a>(
        tokens: &mut Tokenizer<'a>,
    ) -> Result<Expression, ParserError> {
        let tok = tokens.next().ok_or(ParserError::UnexpectedEOF)??;
        let expr = match tok {
            Token::Number(num) => {
                let v = num.parse()?;
                Expression::Number(v)
            }
            Token::Identifier(ident) => Expression::Var(ident),
            Token::Operator(_) => return Err(ParserError::UnexpectedToken(tok)),
        };
        // Look ahead to parse a binary operation if present.
        Ok(match tokens.next() {
            None => expr,
            Some(Ok(Token::Operator(op))) => Expression::Operation(
                Box::new(expr),
                op,
                Box::new(parse_expr(tokens)?),
            ),
            Some(Err(e)) => return Err(e.into()),
            Some(Ok(tok)) => return Err(ParserError::UnexpectedToken(tok)),
        })
    }

    parse_expr(&mut tokens)
}

fn main() -> anyhow::Result<()> {
    let expr = parse("10+foo+20-30")?;
    println!("{expr:?}");
    Ok(())
}

Unsafe Rust

This segment should take about 1 hour and 5 minutes

Unsafe Rust

The Rust language has two parts:

  • Safe Rust: memory safe, no undefined behavior possible.
  • Unsafe Rust: can trigger undefined behavior if preconditions are violated.

We saw mostly safe Rust in this course, but it’s important to know what Unsafe Rust is.

Unsafe code is usually small and isolated, and its correctness should be carefully documented. It is usually wrapped in a safe abstraction layer.

Unsafe Rust gives you access to five new capabilities:

  • Dereference raw pointers.
  • Access or modify mutable static variables.
  • Access union fields.
  • Call unsafe functions, including extern functions.
  • Implement unsafe traits.

We will briefly cover these unsafe capabilities next. For full details, please see Chapter 19.1 in the Rust Book and the Rustonomicon.

This slide should take about 5 minutes.

Unsafe Rust does not mean the code is incorrect. It means that developers have turned off some compiler safety features and have to write correct code by themselves. It means the compiler no longer enforces Rust’s memory-safety rules.
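
As a minimal sketch of the "small, isolated, wrapped in a safe abstraction" pattern described above, consider the helper below. The name get_two_mut is hypothetical (it is not a standard library function). Its single unsafe block sits behind checks that establish the preconditions, so callers only ever see a safe API:

```rust
/// Returns mutable references to two distinct elements of `slice`, or `None`
/// if the indices are equal or out of bounds.
///
/// Hypothetical helper for illustration only.
fn get_two_mut(slice: &mut [i32], i: usize, j: usize) -> Option<(&mut i32, &mut i32)> {
    if i == j || i >= slice.len() || j >= slice.len() {
        return None;
    }
    let ptr = slice.as_mut_ptr();
    // SAFETY: both indices are in bounds and distinct, so the two pointers
    // refer to different live elements of the same slice, and the exclusive
    // borrow of `slice` guarantees that nothing else accesses them.
    unsafe { Some((&mut *ptr.add(i), &mut *ptr.add(j))) }
}

fn main() {
    let mut values = [1, 2, 3];
    if let Some((a, b)) = get_two_mut(&mut values, 0, 2) {
        std::mem::swap(a, b);
    }
    println!("{values:?}"); // prints [3, 2, 1]
    assert!(get_two_mut(&mut values, 1, 1).is_none());
}
```

The borrow checker cannot prove that two indices into one slice are disjoint, which is why the implementation needs raw pointers. The safety comment records the reasoning the compiler cannot check.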

Dereferencing Raw Pointers

Creating pointers is safe, but dereferencing them requires unsafe:

fn main() {
    let mut s = String::from("careful!");

    let r1 = &mut s as *mut String;
    let r2 = r1 as *const String;

    // Safe because r1 and r2 were obtained from references and so are
    // guaranteed to be non-null and properly aligned, the objects underlying
    // the references from which they were obtained are live throughout the
    // whole unsafe block, and they are not accessed either through the
    // references or concurrently through any other pointers.
    unsafe {
        println!("r1 is: {}", *r1);
        *r1 = String::from("uhoh");
        println!("r2 is: {}", *r2);
    }

    // NOT SAFE. DO NOT DO THIS.
    /*
    let r3: &String = unsafe { &*r1 };
    drop(s);
    println!("r3 is: {}", *r3);
    */
}
This slide should take about 10 minutes.

It is good practice (and required by the Android Rust style guide) to write a comment for each unsafe block explaining how the code inside it satisfies the safety requirements of the unsafe operations it performs.

In the case of pointer dereferences, this means that the pointers must be valid, i.e.:

  • The pointer must be non-null.
  • The pointer must be dereferenceable (within the bounds of a single allocated object).
  • The object must not have been deallocated.
  • There must not be concurrent accesses to the same location.
  • If the pointer was obtained by casting a reference, the underlying object must be live and no reference may be used to access the memory.

In most cases the pointer must also be properly aligned.

The “NOT SAFE” section gives an example of a common kind of UB bug: *r1 has the 'static lifetime, so r3 has type &'static String, and thus outlives s. Creating a reference from a pointer requires great care.
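
The following sketch shows how the validity requirements can be split between a runtime check and a documented contract. The function read_or_none is a hypothetical name. Only the null check can be verified at runtime; alignment, liveness, and the absence of concurrent writes remain the caller's responsibility and are stated in the # Safety section:

```rust
/// Reads the `i32` behind `ptr`, or returns `None` if `ptr` is null.
///
/// # Safety
///
/// A non-null `ptr` must be properly aligned and point to a live, initialized
/// `i32` that is not being written to concurrently.
unsafe fn read_or_none(ptr: *const i32) -> Option<i32> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: `ptr` is non-null (checked above), and the caller promises the
    // remaining validity requirements.
    Some(unsafe { *ptr })
}

fn main() {
    let x = 7;
    let valid: *const i32 = &x;
    let null: *const i32 = std::ptr::null();

    // SAFETY: `valid` was obtained from a reference to `x`, which is live for
    // the whole block; `null` is handled by the null check.
    let (a, b) = unsafe { (read_or_none(valid), read_or_none(null)) };
    println!("{a:?} {b:?}"); // prints Some(7) None
}
```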

Mutable Static Variables

It is safe to read an immutable static variable:

static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("HELLO_WORLD: {HELLO_WORLD}");
}

However, since data races can occur, it is unsafe to read and write mutable static variables:

static mut COUNTER: u32 = 0;

fn add_to_counter(inc: u32) {
    unsafe {
        COUNTER += inc;
    }
}

fn main() {
    add_to_counter(42);

    unsafe {
        println!("COUNTER: {COUNTER}");
    }
}
This slide should take about 5 minutes.
  • The program here is safe because it is single-threaded. However, the Rust compiler is conservative and will assume the worst. Try removing the unsafe and see how the compiler explains that it is undefined behavior to mutate a static from multiple threads.

  • Using a mutable static is generally a bad idea, but there are some cases where it might make sense in low-level no_std code, such as implementing a heap allocator or working with some C APIs.
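
When a global counter is all that is needed, one common way to avoid a mutable static is an atomic type. This is a sketch of that alternative, not something the slide prescribes. It assumes Relaxed ordering is enough because the counter does not guard any other data:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// An immutable static with interior mutability: no `unsafe` is needed.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_counter(inc: u32) {
    // Atomic read-modify-write, free of data races even across threads.
    COUNTER.fetch_add(inc, Ordering::Relaxed);
}

fn main() {
    add_to_counter(42);
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed));
}
```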

Unions

Unions are like enums, but you need to track the active field yourself:

#[repr(C)]
union MyUnion {
    i: u8,
    b: bool,
}

fn main() {
    let u = MyUnion { i: 42 };
    println!("int: {}", unsafe { u.i });
    println!("bool: {}", unsafe { u.b }); // Undefined behavior!
}
This slide should take about 5 minutes.

Unions are very rarely needed in Rust as you can usually use an enum. They are occasionally needed for interacting with C library APIs.

If you just want to reinterpret bytes as a different type, you probably want std::mem::transmute or a safe wrapper such as the zerocopy crate.
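
To make "tracking the active field yourself" concrete, here is a small sketch that pairs a union with a hand-written tag, which is essentially what an enum does for you. The names IntOrBool, Tag, and Tagged are hypothetical:

```rust
#[repr(C)]
union IntOrBool {
    i: u8,
    b: bool,
}

/// Records which union field was written last.
enum Tag {
    Int,
    Bool,
}

struct Tagged {
    tag: Tag,
    value: IntOrBool,
}

impl Tagged {
    // The constructors are the only way to build a `Tagged`, so the tag
    // always matches the field that was initialized.
    fn from_int(i: u8) -> Self {
        Tagged { tag: Tag::Int, value: IntOrBool { i } }
    }

    fn from_bool(b: bool) -> Self {
        Tagged { tag: Tag::Bool, value: IntOrBool { b } }
    }

    fn describe(&self) -> String {
        match self.tag {
            // SAFETY: the tag says `i` is the field that was written.
            Tag::Int => format!("int: {}", unsafe { self.value.i }),
            // SAFETY: the tag says `b` is the field that was written.
            Tag::Bool => format!("bool: {}", unsafe { self.value.b }),
        }
    }
}

fn main() {
    println!("{}", Tagged::from_int(42).describe());
    println!("{}", Tagged::from_bool(true).describe());
}
```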

Unsafe Functions

Calling Unsafe Functions

A function or method can be marked unsafe if it has extra preconditions you must uphold to avoid undefined behaviour:

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    let emojis = "🗻∈🌏";

    // Safe because the indices are in the correct order, within the bounds of
    // the string slice, and lie on UTF-8 sequence boundaries.
    unsafe {
        println!("emoji: {}", emojis.get_unchecked(0..4));
        println!("emoji: {}", emojis.get_unchecked(4..7));
        println!("emoji: {}", emojis.get_unchecked(7..11));
    }

    println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..7) }));

    unsafe {
        // Undefined behavior if abs misbehaves.
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }

    // Not upholding the UTF-8 encoding requirement breaks memory safety!
    // println!("emoji: {}", unsafe { emojis.get_unchecked(0..3) });
    // println!("char count: {}", count_chars(unsafe {
    // emojis.get_unchecked(0..3) }));
}

fn count_chars(s: &str) -> usize {
    s.chars().count()
}

Writing Unsafe Functions

You can mark your own functions as unsafe if they require particular conditions to avoid undefined behaviour.

/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid and properly aligned.
unsafe fn swap(a: *mut u8, b: *mut u8) {
    let temp = *a;
    *a = *b;
    *b = temp;
}

fn main() {
    let mut a = 42;
    let mut b = 66;

    // Safe because ...
    unsafe {
        swap(&mut a, &mut b);
    }

    println!("a = {}, b = {}", a, b);
}
This slide should take about 5 minutes.

Calling Unsafe Functions

get_unchecked, like most _unchecked functions, is unsafe, because it can create UB if the range is incorrect. abs is unsafe for a different reason: it is an external function (FFI). Calling external functions is usually only a problem when those functions do things with pointers which might violate Rust’s memory model, but in general any C function might have undefined behaviour under any arbitrary circumstances.

The "C" in this example is the ABI; other ABIs are available too.
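
The preconditions of get_unchecked on a string slice can be spelled out as explicit checks. The sketch below uses a hypothetical helper, checked_slice, which performs the checks before making the unchecked call. The standard library's safe str::get already does this, so the helper is for illustration only:

```rust
use std::ops::Range;

/// Returns the sub-slice for `range`, or `None` if the range violates any of
/// the preconditions of `get_unchecked`.
fn checked_slice(s: &str, range: Range<usize>) -> Option<&str> {
    let ok = range.start <= range.end
        && range.end <= s.len()
        && s.is_char_boundary(range.start)
        && s.is_char_boundary(range.end);
    if !ok {
        return None;
    }
    // SAFETY: the indices are in order, within bounds, and lie on UTF-8
    // sequence boundaries, as checked above.
    Some(unsafe { s.get_unchecked(range) })
}

fn main() {
    let emojis = "🗻∈🌏";
    println!("{:?}", checked_slice(emojis, 4..7));
    // 0..3 ends in the middle of the first character, so this is rejected.
    println!("{:?}", checked_slice(emojis, 0..3));
}
```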

Writing Unsafe Functions

We wouldn’t actually use pointers for a swap function - it can be done safely with references.

Note that unsafe code is allowed within an unsafe function without an unsafe block. We can prohibit this with #[deny(unsafe_op_in_unsafe_fn)]. Try adding it and see what happens. This will likely change in a future Rust edition.
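
Both points can be sketched side by side. The names swap_refs and swap_ptrs are hypothetical. The first is the safe, reference-based version. The second keeps raw pointers but opts into the lint, so each unsafe operation needs its own unsafe block and safety comment:

```rust
/// Safe swap: the references guarantee validity, alignment, and exclusivity.
fn swap_refs(a: &mut u8, b: &mut u8) {
    let temp = *a;
    *a = *b;
    *b = temp;
}

/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid and properly aligned.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn swap_ptrs(a: *mut u8, b: *mut u8) {
    // SAFETY: the caller guarantees that both pointers are valid and aligned.
    unsafe {
        let temp = *a;
        *a = *b;
        *b = temp;
    }
}

fn main() {
    let mut a = 42;
    let mut b = 66;

    swap_refs(&mut a, &mut b);
    // SAFETY: both pointers come from exclusive references to distinct, live
    // local variables.
    unsafe { swap_ptrs(&mut a, &mut b) };

    // Swapped twice, so the values are back where they started.
    println!("a = {a}, b = {b}"); // prints a = 42, b = 66
}
```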

Implementing Unsafe Traits

Like with functions, you can mark a trait as unsafe if the implementation must guarantee particular conditions to avoid undefined behaviour.

For example, the zerocopy crate has an unsafe trait that looks something like this:

use std::mem::size_of_val;
use std::slice;

/// ...
/// # Safety
/// The type must have a defined representation and no padding.
pub unsafe trait AsBytes {
    fn as_bytes(&self) -> &[u8] {
        unsafe {
            slice::from_raw_parts(
                self as *const Self as *const u8,
                size_of_val(self),
            )
        }
    }
}

// Safe because u32 has a defined representation and no padding.
unsafe impl AsBytes for u32 {}
This slide should take about 5 minutes.

There should be a # Safety section on the Rustdoc for the trait explaining the requirements for the trait to be safely implemented.

The actual safety section for AsBytes is rather longer and more complicated.

The built-in Send and Sync traits are unsafe.
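
As a further sketch of the same pattern, the hypothetical trait ZeroValid below (our own name, not part of zerocopy or the standard library) puts the obligation on the implementor: each unsafe impl is a promise that the trait's # Safety contract holds for that type, which is what lets the default method use unsafe code soundly:

```rust
/// Types for which the all-zero bit pattern is a valid value.
///
/// # Safety
///
/// Implementors must guarantee that a value whose bytes are all zero is a
/// valid instance of the type.
unsafe trait ZeroValid: Sized {
    fn zeroed() -> Self {
        // SAFETY: implementors promise that all-zero bytes are a valid `Self`.
        unsafe { std::mem::zeroed() }
    }
}

// Safe because 0 is a valid u32.
unsafe impl ZeroValid for u32 {}
// Safe because every byte pattern is a valid [u8; 4].
unsafe impl ZeroValid for [u8; 4] {}

fn main() {
    let n = u32::zeroed();
    let bytes = <[u8; 4]>::zeroed();
    println!("{n} {bytes:?}"); // prints 0 [0, 0, 0, 0]
}
```

Implementing this trait for a reference type such as &u8 would compile but break the contract, since references must never be null.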

Safe FFI Wrapper

Rust has great support for calling functions through a foreign function interface (FFI). We will use this to build a safe wrapper for the libc functions you would use from C to read the names of files in a directory.

You will want to consult the manual pages for opendir(3), readdir(3), and closedir(3).

You will also want to browse the std::ffi module. There you find a number of string types which you need for the exercise:

Types                 Encoding          Use
str and String        UTF-8             Text processing in Rust
CStr and CString      NUL-terminated    Communicating with C functions
OsStr and OsString    OS-specific       Communicating with the OS

You will convert between all these types:

  • &str to CString: you need to allocate space for a trailing \0 character,
  • CString to *const i8: you need a pointer to call C functions,
  • *const i8 to &CStr: you need something which can find the trailing \0 character,
  • &CStr to &[u8]: a slice of bytes is the universal interface for “some unknown data”,
  • &[u8] to &OsStr: &OsStr is a step towards OsString, use OsStrExt to create it,
  • &OsStr to OsString: you need to clone the data in &OsStr to be able to return it and call readdir again.

The Nomicon also has a very useful chapter about FFI.
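
The conversion chain above can be sketched in one function. The name round_trip is hypothetical, and the sketch assumes a Unix target because OsStrExt lives in std::os::unix. It only walks a string through the types and does not call any libc function:

```rust
use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::raw::c_char;
use std::os::unix::ffi::OsStrExt;

/// Walks `name` through every string type in the list and returns the
/// resulting `OsString`, or `None` if `name` contains an interior NUL byte.
fn round_trip(name: &str) -> Option<OsString> {
    // &str -> CString: allocates and appends the trailing \0.
    let c_string = CString::new(name).ok()?;
    // CString -> *const c_char: what a C function would receive.
    let ptr: *const c_char = c_string.as_ptr();
    // *const c_char -> &CStr: scans for the trailing \0.
    // SAFETY: `ptr` comes from `c_string`, which is NUL-terminated and stays
    // alive until the end of this function.
    let c_str: &CStr = unsafe { CStr::from_ptr(ptr) };
    // &CStr -> &[u8]: the bytes without the trailing \0.
    let bytes: &[u8] = c_str.to_bytes();
    // &[u8] -> &OsStr, via OsStrExt.
    let os_str: &OsStr = OsStr::from_bytes(bytes);
    // &OsStr -> OsString: an owned copy that outlives `c_string`.
    Some(os_str.to_owned())
}

fn main() {
    println!("{:?}", round_trip("foo.txt"));
    println!("{:?}", round_trip("bad\0name"));
}
```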

Copy the code below to https://play.rust-lang.org/ and fill in the missing functions and methods:

// TODO: remove this when you're done with your implementation.
#![allow(unused_imports, unused_variables, dead_code)]

mod ffi {
    use std::os::raw::{c_char, c_int};
    #[cfg(not(target_os = "macos"))]
    use std::os::raw::{c_long, c_uchar, c_ulong, c_ushort};

    // Opaque type. See https://doc.rust-lang.org/nomicon/ffi.html.
    #[repr(C)]
    pub struct DIR {
        _data: [u8; 0],
        _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
    }

    // Layout according to the Linux man page for readdir(3), where ino_t and
    // off_t are resolved according to the definitions in
    // /usr/include/x86_64-linux-gnu/{sys/types.h, bits/typesizes.h}.
    #[cfg(not(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_ino: c_ulong,
        pub d_off: c_long,
        pub d_reclen: c_ushort,
        pub d_type: c_uchar,
        pub d_name: [c_char; 256],
    }

    // Layout according to the macOS man page for dir(5).
    #[cfg(all(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_fileno: u64,
        pub d_seekoff: u64,
        pub d_reclen: u16,
        pub d_namlen: u16,
        pub d_type: u8,
        pub d_name: [c_char; 1024],
    }

    extern "C" {
        pub fn opendir(s: *const c_char) -> *mut DIR;

        #[cfg(not(all(target_os = "macos", target_arch = "x86_64")))]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        // See https://github.com/rust-lang/libc/issues/414 and the section on
        // _DARWIN_FEATURE_64_BIT_INODE in the macOS man page for stat(2).
        //
        // "Platforms that existed before these updates were available" refers
        // to macOS (as opposed to iOS / wearOS / etc.) on Intel and PowerPC.
        #[cfg(all(target_os = "macos", target_arch = "x86_64"))]
        #[link_name = "readdir$INODE64"]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        pub fn closedir(s: *mut DIR) -> c_int;
    }
}

use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::unix::ffi::OsStrExt;

#[derive(Debug)]
struct DirectoryIterator {
    path: CString,
    dir: *mut ffi::DIR,
}

impl DirectoryIterator {
    fn new(path: &str) -> Result<DirectoryIterator, String> {
        // Call opendir and return a Ok value if that worked,
        // otherwise return Err with a message.
        unimplemented!()
    }
}

impl Iterator for DirectoryIterator {
    type Item = OsString;
    fn next(&mut self) -> Option<OsString> {
        // Keep calling readdir until we get a NULL pointer back.
        unimplemented!()
    }
}

impl Drop for DirectoryIterator {
    fn drop(&mut self) {
        // Call closedir as needed.
        unimplemented!()
    }
}

fn main() -> Result<(), String> {
    let iter = DirectoryIterator::new(".")?;
    println!("files: {:#?}", iter.collect::<Vec<_>>());
    Ok(())
}

Solution

mod ffi {
    use std::os::raw::{c_char, c_int};
    #[cfg(not(target_os = "macos"))]
    use std::os::raw::{c_long, c_uchar, c_ulong, c_ushort};

    // Opaque type. See https://doc.rust-lang.org/nomicon/ffi.html.
    #[repr(C)]
    pub struct DIR {
        _data: [u8; 0],
        _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
    }

    // Layout according to the Linux man page for readdir(3), where ino_t and
    // off_t are resolved according to the definitions in
    // /usr/include/x86_64-linux-gnu/{sys/types.h, bits/typesizes.h}.
    #[cfg(not(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_ino: c_ulong,
        pub d_off: c_long,
        pub d_reclen: c_ushort,
        pub d_type: c_uchar,
        pub d_name: [c_char; 256],
    }

    // Layout according to the macOS man page for dir(5).
    #[cfg(all(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_fileno: u64,
        pub d_seekoff: u64,
        pub d_reclen: u16,
        pub d_namlen: u16,
        pub d_type: u8,
        pub d_name: [c_char; 1024],
    }

    extern "C" {
        pub fn opendir(s: *const c_char) -> *mut DIR;

        #[cfg(not(all(target_os = "macos", target_arch = "x86_64")))]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        // See https://github.com/rust-lang/libc/issues/414 and the section on
        // _DARWIN_FEATURE_64_BIT_INODE in the macOS man page for stat(2).
        //
        // "Platforms that existed before these updates were available" refers
        // to macOS (as opposed to iOS / wearOS / etc.) on Intel and PowerPC.
        #[cfg(all(target_os = "macos", target_arch = "x86_64"))]
        #[link_name = "readdir$INODE64"]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        pub fn closedir(s: *mut DIR) -> c_int;
    }
}

use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::unix::ffi::OsStrExt;

#[derive(Debug)]
struct DirectoryIterator {
    path: CString,
    dir: *mut ffi::DIR,
}

impl DirectoryIterator {
    fn new(path: &str) -> Result<DirectoryIterator, String> {
        // Call opendir and return a Ok value if that worked,
        // otherwise return Err with a message.
        let path =
            CString::new(path).map_err(|err| format!("Invalid path: {err}"))?;
        // SAFETY: path.as_ptr() cannot be NULL.
        let dir = unsafe { ffi::opendir(path.as_ptr()) };
        if dir.is_null() {
            Err(format!("Could not open {:?}", path))
        } else {
            Ok(DirectoryIterator { path, dir })
        }
    }
}

impl Iterator for DirectoryIterator {
    type Item = OsString;
    fn next(&mut self) -> Option<OsString> {
        // Keep calling readdir until we get a NULL pointer back.
        // SAFETY: self.dir is never NULL.
        let dirent = unsafe { ffi::readdir(self.dir) };
        if dirent.is_null() {
            // We have reached the end of the directory.
            return None;
        }
        // SAFETY: dirent is not NULL and dirent.d_name is NUL
        // terminated.
        let d_name = unsafe { CStr::from_ptr((*dirent).d_name.as_ptr()) };
        let os_str = OsStr::from_bytes(d_name.to_bytes());
        Some(os_str.to_owned())
    }
}

impl Drop for DirectoryIterator {
    fn drop(&mut self) {
        // Call closedir as needed.
        if !self.dir.is_null() {
            // SAFETY: self.dir is not NULL.
            if unsafe { ffi::closedir(self.dir) } != 0 {
                panic!("Could not close {:?}", self.path);
            }
        }
    }
}

fn main() -> Result<(), String> {
    let iter = DirectoryIterator::new(".")?;
    println!("files: {:#?}", iter.collect::<Vec<_>>());
    Ok(())
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::error::Error;

    #[test]
    fn test_nonexisting_directory() {
        let iter = DirectoryIterator::new("no-such-directory");
        assert!(iter.is_err());
    }

    #[test]
    fn test_empty_directory() -> Result<(), Box<dyn Error>> {
        let tmp = tempfile::TempDir::new()?;
        let iter = DirectoryIterator::new(
            tmp.path().to_str().ok_or("Non UTF-8 character in path")?,
        )?;
        let mut entries = iter.collect::<Vec<_>>();
        entries.sort();
        assert_eq!(entries, &[".", ".."]);
        Ok(())
    }

    #[test]
    fn test_nonempty_directory() -> Result<(), Box<dyn Error>> {
        let tmp = tempfile::TempDir::new()?;
        std::fs::write(tmp.path().join("foo.txt"), "The Foo Diaries\n")?;
        std::fs::write(tmp.path().join("bar.png"), "<PNG>\n")?;
        std::fs::write(tmp.path().join("crab.rs"), "//! Crab\n")?;
        let iter = DirectoryIterator::new(
            tmp.path().to_str().ok_or("Non UTF-8 character in path")?,
        )?;
        let mut entries = iter.collect::<Vec<_>>();
        entries.sort();
        assert_eq!(entries, &[".", "..", "bar.png", "crab.rs", "foo.txt"]);
        Ok(())
    }
}

Welcome to Rust in Android

Rust is supported for system software on Android. This means that you can write new services, libraries, drivers or even firmware in Rust (or improve existing code as needed).

We will attempt to call Rust from one of your own projects today. So try to find a little corner of your code base where we can move some lines of code to Rust. The fewer dependencies and “exotic” types the better. Something that parses some raw bytes would be ideal.

The speaker may mention any of the following given the increased use of Rust in Android:

Setup

We will be using a Cuttlefish Android Virtual Device to test our code. Make sure you have access to one or create a new one with:

source build/envsetup.sh
lunch aosp_cf_x86_64_phone-trunk_staging-userdebug
acloud create

Please see the Android Developer Codelab for details.

Key points:

  • Cuttlefish is a reference Android device designed to work on generic Linux desktops. MacOS support is also planned.

  • The Cuttlefish system image maintains high fidelity to real devices, and is the ideal emulator to run many Rust use cases.

Build Rules

The Android build system (Soong) supports Rust via a number of modules:

Module Type        Description
rust_binary        Produces a Rust binary.
rust_library       Produces a Rust library, and provides both rlib and dylib variants.
rust_ffi           Produces a Rust C library usable by cc modules, and provides both static and shared variants.
rust_proc_macro    Produces a proc-macro Rust library. These are analogous to compiler plugins.
rust_test          Produces a Rust test binary that uses the standard Rust test harness.
rust_fuzz          Produces a Rust fuzz binary leveraging libfuzzer.
rust_protobuf      Generates source and produces a Rust library that provides an interface for a particular protobuf.
rust_bindgen       Generates source and produces a Rust library containing Rust bindings to C libraries.

We will look at rust_binary and rust_library next.

Additional items speaker may mention:

  • Cargo is not optimized for multi-language repos, and also downloads packages from the internet.

  • For compliance and performance, Android must have crates in-tree. It must also interop with C/C++/Java code. Soong fills that gap.

  • Soong has many similarities to Bazel, which is the open-source variant of Blaze (used in google3).

  • There is a plan to transition Android, ChromeOS, and Fuchsia to Bazel.

  • Learning Bazel-like build rules is useful for all Rust OS developers.

  • Fun fact: Data from Star Trek is a Soong-type Android.

Rust Binaries

Let us start with a simple application. At the root of an AOSP checkout, create the following files:

hello_rust/Android.bp:

rust_binary {
    name: "hello_rust",
    crate_name: "hello_rust",
    srcs: ["src/main.rs"],
}

hello_rust/src/main.rs:

//! Rust demo.

/// Prints a greeting to standard output.
fn main() {
    println!("Hello from Rust!");
}

You can now build, push, and run the binary:

m hello_rust
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust" /data/local/tmp
adb shell /data/local/tmp/hello_rust
Hello from Rust!

Rust Libraries

You use rust_library to create a new Rust library for Android.

Here we declare a dependency on two libraries:

  • libgreetings, which we define below.
  • libtextwrap, which is a crate already vendored in external/rust/crates/.

hello_rust/Android.bp:

rust_binary {
    name: "hello_rust_with_dep",
    crate_name: "hello_rust_with_dep",
    srcs: ["src/main.rs"],
    rustlibs: [
        "libgreetings",
        "libtextwrap",
    ],
    prefer_rlib: true, // Need this to avoid dynamic link error.
}

rust_library {
    name: "libgreetings",
    crate_name: "greetings",
    srcs: ["src/lib.rs"],
}

hello_rust/src/main.rs:

//! Rust demo.

use greetings::greeting;
use textwrap::fill;

/// Prints a greeting to standard output.
fn main() {
    println!("{}", fill(&greeting("Bob"), 24));
}

hello_rust/src/lib.rs:

//! Greeting library.

/// Greet `name`.
pub fn greeting(name: &str) -> String {
    format!("Hello {name}, it is very nice to meet you!")
}

You build, push, and run the binary like before:

m hello_rust_with_dep
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust_with_dep" /data/local/tmp
adb shell /data/local/tmp/hello_rust_with_dep
Hello Bob, it is very
nice to meet you!

AIDL

The Android Interface Definition Language (AIDL) is supported in Rust:

  • Rust code can call existing AIDL servers.
  • You can create new AIDL servers in Rust.

AIDL Interfaces

You declare the API of your service using an AIDL interface:

birthday_service/aidl/com/example/birthdayservice/IBirthdayService.aidl:

package com.example.birthdayservice;

/** Birthday service interface. */
interface IBirthdayService {
    /** Generate a Happy Birthday message. */
    String wishHappyBirthday(String name, int years);
}

birthday_service/aidl/Android.bp:

aidl_interface {
    name: "com.example.birthdayservice",
    srcs: ["com/example/birthdayservice/*.aidl"],
    unstable: true,
    backend: {
        rust: { // Rust is not enabled by default
            enabled: true,
        },
    },
}

Add vendor_available: true if your AIDL file is used by a binary in the vendor partition.

Service Implementation

We can now implement the AIDL service:

birthday_service/src/lib.rs:

//! Implementation of the `IBirthdayService` AIDL interface.
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::IBirthdayService;
use com_example_birthdayservice::binder;

/// The `IBirthdayService` implementation.
pub struct BirthdayService;

impl binder::Interface for BirthdayService {}

impl IBirthdayService for BirthdayService {
    fn wishHappyBirthday(&self, name: &str, years: i32) -> binder::Result<String> {
        Ok(format!("Happy Birthday {name}, congratulations with the {years} years!"))
    }
}

birthday_service/Android.bp:

rust_library {
    name: "libbirthdayservice",
    srcs: ["src/lib.rs"],
    crate_name: "birthdayservice",
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
    ],
}

AIDL Server

Finally, we can create a server which exposes the service:

birthday_service/src/server.rs:

//! Birthday service.
use birthdayservice::BirthdayService;
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::BnBirthdayService;
use com_example_birthdayservice::binder;

const SERVICE_IDENTIFIER: &str = "birthdayservice";

/// Entry point for birthday service.
fn main() {
    let birthday_service = BirthdayService;
    let birthday_service_binder = BnBirthdayService::new_binder(
        birthday_service,
        binder::BinderFeatures::default(),
    );
    binder::add_service(SERVICE_IDENTIFIER, birthday_service_binder.as_binder())
        .expect("Failed to register service");
    binder::ProcessState::join_thread_pool()
}

birthday_service/Android.bp:

rust_binary {
    name: "birthday_server",
    crate_name: "birthday_server",
    srcs: ["src/server.rs"],
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
        "libbirthdayservice",
    ],
    prefer_rlib: true, // To avoid dynamic link error.
}

Deploy

We can now build, push, and start the service:

m birthday_server
adb push "$ANDROID_PRODUCT_OUT/system/bin/birthday_server" /data/local/tmp
adb root
adb shell /data/local/tmp/birthday_server

In another terminal, check that the service runs:

adb shell service check birthdayservice
Service birthdayservice: found

You can also call the service with service call:

adb shell service call birthdayservice 1 s16 Bob i32 24
Result: Parcel(
  0x00000000: 00000000 00000036 00610048 00700070 '....6...H.a.p.p.'
  0x00000010: 00200079 00690042 00740072 00640068 'y. .B.i.r.t.h.d.'
  0x00000020: 00790061 00420020 0062006f 0020002c 'a.y. .B.o.b.,. .'
  0x00000030: 006f0063 0067006e 00610072 00750074 'c.o.n.g.r.a.t.u.'
  0x00000040: 0061006c 00690074 006e006f 00200073 'l.a.t.i.o.n.s. .'
  0x00000050: 00690077 00680074 00740020 00650068 'w.i.t.h. .t.h.e.'
  0x00000060: 00320020 00200034 00650079 00720061 ' .2.4. .y.e.a.r.'
  0x00000070: 00210073 00000000                   's.!.....        ')

AIDL Client

Finally, we can create a Rust client for our new service.

birthday_service/src/client.rs:

//! Birthday service.
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::IBirthdayService;
use com_example_birthdayservice::binder;

const SERVICE_IDENTIFIER: &str = "birthdayservice";

/// Connect to the BirthdayService.
pub fn connect() -> Result<binder::Strong<dyn IBirthdayService>, binder::StatusCode>
{
    binder::get_interface(SERVICE_IDENTIFIER)
}

/// Call the birthday service.
fn main() -> Result<(), binder::Status> {
    let name = std::env::args().nth(1).unwrap_or_else(|| String::from("Bob"));
    let years = std::env::args()
        .nth(2)
        .and_then(|arg| arg.parse::<i32>().ok())
        .unwrap_or(42);

    binder::ProcessState::start_thread_pool();
    let service = connect().expect("Failed to connect to BirthdayService");
    let msg = service.wishHappyBirthday(&name, years)?;
    println!("{msg}");
    Ok(())
}

birthday_service/Android.bp:

rust_binary {
    name: "birthday_client",
    crate_name: "birthday_client",
    srcs: ["src/client.rs"],
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
    ],
    prefer_rlib: true, // To avoid dynamic link error.
}

Note that the client does not depend on libbirthdayservice.

Build, push, and run the client on your device:

m birthday_client
adb push "$ANDROID_PRODUCT_OUT/system/bin/birthday_client" /data/local/tmp
adb shell /data/local/tmp/birthday_client Charlie 60
Happy Birthday Charlie, congratulations with the 60 years!

Changing API

Let us extend the API with more functionality: we want to let clients specify a list of lines for the birthday card:

package com.example.birthdayservice;

/** Birthday service interface. */
interface IBirthdayService {
    /** Generate a Happy Birthday message. */
    String wishHappyBirthday(String name, int years, in String[] text);
}
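
On the Rust side, the service implementation would need a matching extra parameter. The sketch below is a standalone stand-in so that it runs without the binder crate: wish_happy_birthday is a hypothetical free function, and it assumes that an AIDL in String[] parameter maps to &[String] in the generated Rust trait. In the real service this logic would live in the wishHappyBirthday method and return binder::Result<String>:

```rust
/// Stand-in for the updated service method (assumption: the AIDL parameter
/// `in String[] text` arrives as `&[String]`).
fn wish_happy_birthday(name: &str, years: i32, text: &[String]) -> String {
    let mut msg =
        format!("Happy Birthday {name}, congratulations with the {years} years!");
    // Append each extra line of the card on its own line.
    for line in text {
        msg.push('\n');
        msg.push_str(line);
    }
    msg
}

fn main() {
    let lines = vec![
        String::from("Have a great day!"),
        String::from("Best wishes from the team."),
    ];
    println!("{}", wish_happy_birthday("Bob", 24, &lines));
}
```

The client would also need updating to pass the list of lines, since the method signature changes for every user of the interface.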

Logging

You should use the log crate to automatically log to logcat (on-device) or stdout (on-host):

hello_rust_logs/Android.bp:

rust_binary {
    name: "hello_rust_logs",
    crate_name: "hello_rust_logs",
    srcs: ["src/main.rs"],
    rustlibs: [
        "liblog_rust",
        "liblogger",
    ],
    host_supported: true,
}

hello_rust_logs/src/main.rs:

//! Rust logging demo.

use log::{debug, error, info};

/// Logs a greeting.
fn main() {
    logger::init(
        logger::Config::default()
            .with_tag_on_device("rust")
            .with_min_level(log::Level::Trace),
    );
    debug!("Starting program.");
    info!("Things are going fine.");
    error!("Something went wrong!");
}

Build, push, and run the binary on your device:

m hello_rust_logs
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust_logs" /data/local/tmp
adb shell /data/local/tmp/hello_rust_logs

The logs show up in adb logcat:

adb logcat -s rust
09-08 08:38:32.454  2420  2420 D rust: hello_rust_logs: Starting program.
09-08 08:38:32.454  2420  2420 I rust: hello_rust_logs: Things are going fine.
09-08 08:38:32.454  2420  2420 E rust: hello_rust_logs: Something went wrong!
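
The configuration above selects log::Level::Trace, the most verbose level, which is why the debug, info, and error records all reach logcat. The filtering idea can be sketched in plain Rust. The Level enum, enabled, and logcat_letter below are hypothetical stand-ins written for this illustration, not the log or logger crate APIs, and the letter mapping assumes the usual Android priority letters.

```rust
/// Severity levels, declared from most to least severe so that the derived
/// ordering makes `Error` the smallest and `Trace` the largest value.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Returns true when a record at `level` passes a logger whose most verbose
/// accepted level is `threshold`.
fn enabled(level: Level, threshold: Level) -> bool {
    level <= threshold
}

/// One-letter priority tag, as in the logcat lines above (D, I, E).
/// Assumption: `Trace` is shown as Android's verbose priority.
fn logcat_letter(level: Level) -> char {
    match level {
        Level::Error => 'E',
        Level::Warn => 'W',
        Level::Info => 'I',
        Level::Debug => 'D',
        Level::Trace => 'V',
    }
}

fn main() {
    for level in [Level::Debug, Level::Info, Level::Error] {
        println!(
            "{} passes Trace: {}, passes Info: {}",
            logcat_letter(level),
            enabled(level, Level::Trace),
            enabled(level, Level::Info),
        );
    }
}
```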

Interoperability

Rust has good support for interoperability with other languages. This means that you can:

  • Call Rust functions from other languages.
  • Call functions written in other languages from Rust.

When you call functions in a foreign language, you are using a foreign function interface, also known as FFI.

Interoperability with C

Rust has full support for linking object files with a C calling convention. Similarly, you can export Rust functions and call them from C.

You can do it by hand if you want:

extern "C" {
    fn abs(x: i32) -> i32;
}

fn main() {
    let x = -42;
    let abs_x = unsafe { abs(x) };
    println!("{x}, {abs_x}");
}

We already saw this in the Safe FFI Wrapper exercise.

This assumes full knowledge of the target platform. Not recommended for production.
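
The hand-written declaration above hard-codes i32, which happens to match C's int on common platforms; that is exactly the kind of platform knowledge the manual approach relies on. A slightly more careful sketch uses std::ffi::c_int and hides the unsafe call behind a safe function. The name checked_c_abs is hypothetical, and the block is written as unsafe extern, the spelling accepted since Rust 1.82 and required by the 2024 edition.

```rust
use std::ffi::c_int;

// Declared by hand: nothing checks this signature against the C headers.
unsafe extern "C" {
    fn abs(x: c_int) -> c_int;
}

/// Safe wrapper around C's `abs`.
/// Returns `None` for `c_int::MIN`, whose absolute value is not representable
/// (calling C's `abs` on it is undefined behavior).
fn checked_c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        return None;
    }
    // SAFETY: the only precondition of `abs` is that the result is
    // representable, which the check above guarantees.
    Some(unsafe { abs(x) })
}

fn main() {
    println!("{:?}", checked_c_abs(-42));
    println!("{:?}", checked_c_abs(c_int::MIN));
}
```

Even with the wrapper, the signature itself is still trusted rather than verified, which is the gap that generated bindings close.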

We will look at better options next.

Using Bindgen

The bindgen tool can auto-generate bindings from a C header file.

First create a small C library:

interoperability/bindgen/libbirthday.h:

typedef struct card {
  const char* name;
  int years;
} card;

void print_card(const card* card);

interoperability/bindgen/libbirthday.c:

#include <stdio.h>
#include "libbirthday.h"

void print_card(const card* card) {
  printf("+--------------\n");
  printf("| Happy Birthday %s!\n", card->name);
  printf("| Congratulations with the %i years!\n", card->years);
  printf("+--------------\n");
}

Add this to your Android.bp file:

interoperability/bindgen/Android.bp:

cc_library {
    name: "libbirthday",
    srcs: ["libbirthday.c"],
}

Create a wrapper header file for the library (not strictly needed for this example):

interoperability/bindgen/libbirthday_wrapper.h:

#include "libbirthday.h"

You can now auto-generate the bindings:

interoperability/bindgen/Android.bp:

rust_bindgen {
    name: "libbirthday_bindgen",
    crate_name: "birthday_bindgen",
    wrapper_src: "libbirthday_wrapper.h",
    source_stem: "bindings",
    static_libs: ["libbirthday"],
}

Finally, we can use the bindings in our Rust program:

interoperability/bindgen/Android.bp:

rust_binary {
    name: "print_birthday_card",
    srcs: ["main.rs"],
    rustlibs: ["libbirthday_bindgen"],
}

interoperability/bindgen/main.rs:

//! Bindgen demo.

use birthday_bindgen::{card, print_card};

fn main() {
    let name = std::ffi::CString::new("Peter").unwrap();
    let card = card { name: name.as_ptr(), years: 42 };
    // SAFETY: `print_card` is safe to call with a valid `card` pointer.
    unsafe {
        print_card(&card as *const card);
    }
}
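
For orientation, the declarations bindgen produces for this header look roughly like the hand-written card struct below, and a small safe wrapper can keep the CString alive for as long as the raw pointer is in use. This is a sketch in plain Rust under stated assumptions: the struct is typed out manually rather than generated, BirthdayCard and describe are hypothetical names, and describe stands in for the C function so the example runs without the C library.

```rust
use std::ffi::{c_char, c_int, CStr, CString};

/// Hand-written approximation of the struct bindgen would generate
/// from `libbirthday.h`.
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct card {
    pub name: *const c_char,
    pub years: c_int,
}

/// Safe owner of the data a `card` points to (hypothetical wrapper).
pub struct BirthdayCard {
    name: CString,
    years: c_int,
}

impl BirthdayCard {
    /// Returns `None` if `name` contains an interior NUL byte.
    pub fn new(name: &str, years: c_int) -> Option<Self> {
        Some(Self { name: CString::new(name).ok()?, years })
    }

    /// Builds a raw `card`; its pointer is valid only while `self` lives.
    pub fn as_raw(&self) -> card {
        card { name: self.name.as_ptr(), years: self.years }
    }
}

/// Rust stand-in for the C side: reads the struct through the raw pointer.
///
/// # Safety
/// `c` must point to a valid `card` whose `name` is a NUL-terminated string.
pub unsafe fn describe(c: *const card) -> String {
    let c = unsafe { &*c };
    let name = unsafe { CStr::from_ptr(c.name) }.to_string_lossy();
    format!("Happy Birthday {name}! Congratulations with the {} years!", c.years)
}

fn main() {
    let owned = BirthdayCard::new("Peter", 42).expect("no interior NUL");
    let c_card = owned.as_raw();
    // SAFETY: `c_card` points into `owned`, which is still alive here.
    println!("{}", unsafe { describe(&c_card) });
}
```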

Build, push, and run the binary on your device:

m print_birthday_card
adb push "$ANDROID_PRODUCT_OUT/system/bin/print_birthday_card" /data/local/tmp
adb shell /data/local/tmp/print_birthday_card

Finally, we can run auto-generated tests to ensure the bindings work:

interoperability/bindgen/Android.bp:

rust_test {
    name: "libbirthday_bindgen_test",
    srcs: [":libbirthday_bindgen"],
    crate_name: "libbirthday_bindgen_test",
    test_suites: ["general-tests"],
    auto_gen_config: true,
    clippy_lints: "none", // Generated file, skip linting
    lints: "none",
}
atest libbirthday_bindgen_test

Calling Rust

Exporting Rust functions and types to C is easy:

interoperability/rust/libanalyze/analyze.rs

//! Rust FFI demo.
#![deny(improper_ctypes_definitions)]

use std::os::raw::c_int;

/// Analyze the numbers.
#[no_mangle]
pub extern "C" fn analyze_numbers(x: c_int, y: c_int) {
    if x < y {
        println!("x ({x}) is smallest!");
    } else {
        println!("y ({y}) is smaller than or equal to x ({x})");
    }
}

interoperability/rust/libanalyze/analyze.h

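#ifndef ANALYSE_H
#define ANALYSE_H

// `extern "C"` is C++-only syntax, so guard it: main.c is compiled as C.
#ifdef __cplusplus
extern "C" {
#endif

void analyze_numbers(int x, int y);

#ifdef __cplusplus
}
#endif

#endif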

interoperability/rust/libanalyze/Android.bp

rust_ffi {
    name: "libanalyze_ffi",
    crate_name: "analyze_ffi",
    srcs: ["analyze.rs"],
    include_dirs: ["."],
}

We can now call this from a C binary:

interoperability/rust/analyze/main.c

#include "analyze.h"

int main() {
  analyze_numbers(10, 20);
  analyze_numbers(123, 123);
  return 0;
}

interoperability/rust/analyze/Android.bp

cc_binary {
    name: "analyze_numbers",
    srcs: ["main.c"],
    static_libs: ["libanalyze_ffi"],
}

Build, push, and run the binary on your device:

m analyze_numbers
adb push "$ANDROID_PRODUCT_OUT/system/bin/analyze_numbers" /data/local/tmp
adb shell /data/local/tmp/analyze_numbers

#[no_mangle] disables Rust’s usual name mangling, so the exported symbol will just be the name of the function. You can also use #[export_name = "some_name"] to specify whatever name you want.
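
To see that the exported symbol name is independent of the Rust function name, the sketch below exports a function under a chosen name and then calls it back through an extern declaration, the way a C caller would. The names smallest and analyze_smallest are hypothetical. The attribute and the block are written as #[unsafe(...)] and unsafe extern, the spellings accepted since Rust 1.82 and required by the 2024 edition; earlier editions also accept the plain forms used in the listings above.

```rust
use std::ffi::c_int;

/// Exported under the fixed symbol name `analyze_smallest`, regardless of
/// the Rust function name.
#[unsafe(export_name = "analyze_smallest")]
pub extern "C" fn smallest(x: c_int, y: c_int) -> c_int {
    if x < y { x } else { y }
}

// Declare the symbol as a foreign caller would see it. The linker resolves
// it by the exported name, not by the Rust name.
unsafe extern "C" {
    fn analyze_smallest(x: c_int, y: c_int) -> c_int;
}

fn main() {
    // SAFETY: `analyze_smallest` is the function defined above and has
    // exactly the declared signature.
    let via_symbol = unsafe { analyze_smallest(10, 20) };
    println!("{via_symbol}");
}
```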

With C++

The CXX crate makes it possible to do safe interoperability between Rust and C++.

The overall approach looks like this:

The Bridge Module

CXX relies on a description of the function signatures that will be exposed from each language to the other. You provide this description using extern blocks in a Rust module annotated with the #[cxx::bridge] attribute macro.

#[allow(unsafe_op_in_unsafe_fn)]
#[cxx::bridge(namespace = "org::blobstore")]
mod ffi {
    // Shared structs with fields visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    // Rust types and signatures exposed to C++.
    extern "Rust" {
        type MultiBuf;

        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    // C++ types and signatures exposed to Rust.
    unsafe extern "C++" {
        include!("include/blobstore.h");

        type BlobstoreClient;

        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(self: Pin<&mut BlobstoreClient>, parts: &mut MultiBuf) -> u64;
        fn tag(self: Pin<&mut BlobstoreClient>, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
  • The bridge is generally declared in an ffi module within your crate.
  • From the declarations made in the bridge module, CXX will generate matching Rust and C++ type/function definitions in order to expose those items to both languages.
  • To view the generated Rust code, use cargo-expand to view the expanded proc macro. For most of the examples you would use cargo expand ::ffi to expand just the ffi module (though this doesn’t apply for Android projects).
  • To view the generated C++ code, look in target/cxxbridge.

Rust Bridge Declarations

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type MyType; // Opaque type
        fn foo(&self); // Method on `MyType`
        fn bar() -> Box<MyType>; // Free function
    }
}

struct MyType(i32);

impl MyType {
    fn foo(&self) {
        println!("{}", self.0);
    }
}

fn bar() -> Box<MyType> {
    Box::new(MyType(123))
}
  • Items declared in the extern "Rust" reference items that are in scope in the parent module.
  • The CXX code generator uses your extern "Rust" section(s) to produce a C++ header file containing the corresponding C++ declarations. The generated header has the same path as the Rust source file containing the bridge, except with a .rs.h file extension.

Generated C++

#[cxx::bridge]
mod ffi {
    // Rust types and signatures exposed to C++.
    extern "Rust" {
        type MultiBuf;

        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }
}

Results in (roughly) the following C++:

struct MultiBuf final : public ::rust::Opaque {
  ~MultiBuf() = delete;

private:
  friend ::rust::layout;
  struct layout {
    static ::std::size_t size() noexcept;
    static ::std::size_t align() noexcept;
  };
};

::rust::Slice<::std::uint8_t const> next_chunk(::org::blobstore::MultiBuf &buf) noexcept;

C++ Bridge Declarations

#[cxx::bridge]
mod ffi {
    // C++ types and signatures exposed to Rust.
    unsafe extern "C++" {
        include!("include/blobstore.h");

        type BlobstoreClient;

        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(self: Pin<&mut BlobstoreClient>, parts: &mut MultiBuf) -> u64;
        fn tag(self: Pin<&mut BlobstoreClient>, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}

Results in (roughly) the following Rust:

#[repr(C)]
pub struct BlobstoreClient {
    _private: ::cxx::private::Opaque,
}

pub fn new_blobstore_client() -> ::cxx::UniquePtr<BlobstoreClient> {
    extern "C" {
        #[link_name = "org$blobstore$cxxbridge1$new_blobstore_client"]
        fn __new_blobstore_client() -> *mut BlobstoreClient;
    }
    unsafe { ::cxx::UniquePtr::from_raw(__new_blobstore_client()) }
}

impl BlobstoreClient {
    pub fn put(&self, parts: &mut MultiBuf) -> u64 {
        extern "C" {
            #[link_name = "org$blobstore$cxxbridge1$BlobstoreClient$put"]
            fn __put(
                _: &BlobstoreClient,
                parts: *mut ::cxx::core::ffi::c_void,
            ) -> u64;
        }
        unsafe {
            __put(self, parts as *mut MultiBuf as *mut ::cxx::core::ffi::c_void)
        }
    }
}

// ...
  • The programmer does not need to promise that the signatures they have typed in are accurate. CXX performs static assertions that the signatures exactly correspond with what is declared in C++.
  • unsafe extern blocks allow you to declare C++ functions that are safe to call from Rust.

Shared Types

#[cxx::bridge]
mod ffi {
    #[derive(Clone, Debug, Hash)]
    struct PlayingCard {
        suit: Suit,
        value: u8,  // A=1, J=11, Q=12, K=13
    }

    enum Suit {
        Clubs,
        Diamonds,
        Hearts,
        Spades,
    }
}
  • Only C-like (unit) enums are supported.
  • A limited number of traits are supported for #[derive()] on shared types. Corresponding functionality is also generated for the C++ code, e.g. if you derive Hash, CXX also generates an implementation of std::hash for the corresponding C++ type.

Shared Enums

#[cxx::bridge]
mod ffi {
    enum Suit {
        Clubs,
        Diamonds,
        Hearts,
        Spades,
    }
}

Generated Rust:

#[derive(Copy, Clone, PartialEq, Eq)]
#[repr(transparent)]
pub struct Suit {
    pub repr: u8,
}

#[allow(non_upper_case_globals)]
impl Suit {
    pub const Clubs: Self = Suit { repr: 0 };
    pub const Diamonds: Self = Suit { repr: 1 };
    pub const Hearts: Self = Suit { repr: 2 };
    pub const Spades: Self = Suit { repr: 3 };
}

Generated C++:

enum class Suit : uint8_t {
  Clubs = 0,
  Diamonds = 1,
  Hearts = 2,
  Spades = 3,
};
  • On the Rust side, the code generated for shared enums is actually a struct wrapping a numeric value. This is because it is not UB in C++ for an enum class to hold a value different from all of the listed variants, and our Rust representation needs to have the same behavior.
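
The practical consequence is that Rust code matching on a shared enum has to handle values that correspond to no listed variant. The sketch below reuses the generated shape shown above (with Debug added for printing); suit_name is a hypothetical helper written for this illustration.

```rust
/// Same shape as the generated code above: a transparent wrapper over `u8`.
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
#[repr(transparent)]
pub struct Suit {
    pub repr: u8,
}

#[allow(non_upper_case_globals)]
impl Suit {
    pub const Clubs: Self = Suit { repr: 0 };
    pub const Diamonds: Self = Suit { repr: 1 };
    pub const Hearts: Self = Suit { repr: 2 };
    pub const Spades: Self = Suit { repr: 3 };
}

/// Names a suit, returning `None` for a value that matches no variant.
fn suit_name(suit: Suit) -> Option<&'static str> {
    match suit {
        Suit::Clubs => Some("Clubs"),
        Suit::Diamonds => Some("Diamonds"),
        Suit::Hearts => Some("Hearts"),
        Suit::Spades => Some("Spades"),
        // Required: the compiler cannot assume the four constants are
        // exhaustive, because any `u8` is a valid `Suit`.
        _ => None,
    }
}

fn main() {
    println!("{:?}", suit_name(Suit::Hearts));
    println!("{:?}", suit_name(Suit { repr: 42 }));
}
```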

Rust Error Handling

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        fn fallible(depth: usize) -> Result<String>;
    }
}

fn fallible(depth: usize) -> anyhow::Result<String> {
    if depth == 0 {
        return Err(anyhow::Error::msg("fallible requires depth > 0"));
    }

    Ok("Success!".into())
}
  • Rust functions that return Result are translated to exceptions on the C++ side.
  • The exception thrown will always be of type rust::Error, which primarily exposes a way to get the error message string. The error message will come from the error type’s Display impl.
  • A panic unwinding from Rust to C++ will always cause the process to immediately terminate.
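
Since the message seen on the C++ side comes from the error type's Display impl, it is worth writing that impl with the C++ reader in mind. The sketch below uses only the standard library: DepthError is a hypothetical error type, and exception_message is a stand-in for what the bridge extracts, not a CXX API.

```rust
use std::fmt;

/// Hypothetical error type; its `Display` output is the text that would be
/// carried by the exception on the C++ side.
#[derive(Debug)]
struct DepthError {
    depth: usize,
}

impl fmt::Display for DepthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "fallible requires depth > 0, got {}", self.depth)
    }
}

impl std::error::Error for DepthError {}

fn fallible(depth: usize) -> Result<String, DepthError> {
    if depth == 0 {
        return Err(DepthError { depth });
    }
    Ok("Success!".into())
}

/// Stand-in for the bridge: turns the `Err` case into its message string.
fn exception_message<T, E: fmt::Display>(result: &Result<T, E>) -> Option<String> {
    result.as_ref().err().map(|e| e.to_string())
}

fn main() {
    println!("{:?}", exception_message(&fallible(0)));
    println!("{:?}", exception_message(&fallible(3)));
}
```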

C++ Error Handling

#[cxx::bridge]
mod ffi {
    unsafe extern "C++" {
        include!("example/include/example.h");
        fn fallible(depth: usize) -> Result<String>;
    }
}

fn main() {
    if let Err(err) = ffi::fallible(99) {
        eprintln!("Error: {}", err);
        std::process::exit(1);
    }
}
  • C++ functions declared to return a Result will catch any thrown exception on the C++ side and return it as an Err value to the calling Rust function.
  • If an exception is thrown from an extern “C++” function that is not declared by the CXX bridge to return Result, the program calls C++’s std::terminate. The behavior is equivalent to the same exception being thrown through a noexcept C++ function.

Additional Types

Rust Type        C++ Type
String           rust::String
&str             rust::Str
CxxString        std::string
&[T]/&mut [T]    rust::Slice
Box<T>           rust::Box<T>
UniquePtr<T>     std::unique_ptr<T>
Vec<T>           rust::Vec<T>
CxxVector<T>     std::vector<T>
  • These types can be used in the fields of shared structs and the arguments and returns of extern functions.
  • Note that Rust’s String does not map directly to std::string. There are a few reasons for this:
    • std::string does not uphold the UTF-8 invariant that String requires.
    • The two types have different layouts in memory and so can’t be passed directly between languages.
    • std::string requires move constructors that don’t match Rust’s move semantics, so a std::string can’t be passed by value to Rust.
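
The UTF-8 point can be demonstrated with the standard library alone. A std::string is just a byte container, so the sketch below models its contents as Vec<u8> and shows that not every byte sequence can become a Rust String. The helper name bytes_to_string is hypothetical.

```rust
/// Tries to view raw bytes (what a `std::string` may hold) as Rust text.
/// On failure the original bytes are handed back unchanged.
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    String::from_utf8(bytes).map_err(|e| e.into_bytes())
}

fn main() {
    // Valid UTF-8 converts without loss.
    println!("{:?}", bytes_to_string(b"caf\xc3\xa9".to_vec()));
    // 0xFF can never appear in UTF-8, so this stays a byte vector.
    println!("{:?}", bytes_to_string(vec![b'h', b'i', 0xff]));
    // A lossy conversion substitutes U+FFFD for the invalid byte.
    println!("{}", String::from_utf8_lossy(&[b'h', b'i', 0xff]));
}
```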

Building in Android

Create a cc_library_static to build the C++ library, including the CXX generated header and source file.

cc_library_static {
    name: "libcxx_test_cpp",
    srcs: ["cxx_test.cpp"],
    generated_headers: [
        "cxx-bridge-header",
        "libcxx_test_bridge_header"
    ],
    generated_sources: ["libcxx_test_bridge_code"],
}
  • Point out that libcxx_test_bridge_header and libcxx_test_bridge_code are the dependencies for the CXX-generated C++ bindings. We’ll show how these are set up on the next slide.
  • Note that you also need to depend on the cxx-bridge-header library in order to pull in common CXX definitions.
  • Full docs for using CXX in Android can be found in the Android docs. You may want to share that link with the class so that students know where they can find these instructions again in the future.

Building in Android

Create two genrules: One to generate the CXX header, and one to generate the CXX source file. These are then used as inputs to the cc_library_static.

// Generate a C++ header containing the C++ bindings
// to the Rust exported functions in lib.rs.
genrule {
    name: "libcxx_test_bridge_header",
    tools: ["cxxbridge"],
    cmd: "$(location cxxbridge) $(in) --header > $(out)",
    srcs: ["lib.rs"],
    out: ["lib.rs.h"],
}

// Generate the C++ code that Rust calls into.
genrule {
    name: "libcxx_test_bridge_code",
    tools: ["cxxbridge"],
    cmd: "$(location cxxbridge) $(in) > $(out)",
    srcs: ["lib.rs"],
    out: ["lib.rs.cc"],
}
  • The cxxbridge tool is a standalone tool that generates the C++ side of the bridge module. It is included in Android and available as a Soong tool.
  • By convention, if your Rust source file is lib.rs your header file will be named lib.rs.h and your source file will be named lib.rs.cc. This naming convention isn’t enforced, though.

Building in Android

Create a rust_binary that depends on libcxx and your cc_library_static.

rust_binary {
    name: "cxx_test",
    srcs: ["lib.rs"],
    rustlibs: ["libcxx"],
    static_libs: ["libcxx_test_cpp"],
}

Interoperability with Java

Java can load shared objects via the Java Native Interface (JNI). The jni crate allows you to create a compatible library.

First, we create a Rust function to export to Java:

interoperability/java/src/lib.rs:

//! Rust <-> Java FFI demo.

use jni::objects::{JClass, JString};
use jni::sys::jstring;
use jni::JNIEnv;

/// HelloWorld::hello method implementation.
#[no_mangle]
pub extern "system" fn Java_HelloWorld_hello(
    env: JNIEnv,
    _class: JClass,
    name: JString,
) -> jstring {
    let input: String = env.get_string(name).unwrap().into();
    let greeting = format!("Hello, {input}!");
    let output = env.new_string(greeting).unwrap();
    output.into_inner()
}

interoperability/java/Android.bp:

rust_ffi_shared {
    name: "libhello_jni",
    crate_name: "hello_jni",
    srcs: ["src/lib.rs"],
    rustlibs: ["libjni"],
}

We then call this function from Java:

interoperability/java/HelloWorld.java:

class HelloWorld {
    private static native String hello(String name);

    static {
        System.loadLibrary("hello_jni");
    }

    public static void main(String[] args) {
        String output = HelloWorld.hello("Alice");
        System.out.println(output);
    }
}

interoperability/java/Android.bp:

java_binary {
    name: "helloworld_jni",
    srcs: ["HelloWorld.java"],
    main_class: "HelloWorld",
    required: ["libhello_jni"],
}
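
The Rust function is named Java_HelloWorld_hello because JNI locates a native method by symbol name: the prefix Java_, the escaped class name, an underscore, and the escaped method name. The sketch below reproduces the escaping rules for the common case. The helpers jni_escape and jni_symbol are hypothetical names, and the longer form used for overloaded methods (which appends the argument signature) is deliberately left out.

```rust
/// Escapes one component (class or method name) following the JNI rules
/// for native method names.
fn jni_escape(component: &str) -> String {
    let mut out = String::new();
    for c in component.chars() {
        match c {
            // Package separators become plain underscores.
            '.' | '/' => out.push('_'),
            // A literal underscore is escaped so it stays distinguishable.
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            c if c.is_ascii_alphanumeric() => out.push(c),
            // Everything else is written as `_0xxxx` per UTF-16 code unit.
            c => {
                let mut units = [0u16; 2];
                for unit in c.encode_utf16(&mut units).iter() {
                    out.push_str(&format!("_0{unit:04x}"));
                }
            }
        }
    }
    out
}

/// Short-form symbol name for a native method (no overload signature).
fn jni_symbol(class: &str, method: &str) -> String {
    format!("Java_{}_{}", jni_escape(class), jni_escape(method))
}

fn main() {
    println!("{}", jni_symbol("HelloWorld", "hello"));
    println!("{}", jni_symbol("com.example.My_Class", "do_it"));
}
```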

Finally, you can build, sync, and run the binary:

m helloworld_jni
adb sync  # requires adb root && adb remount
adb shell /system/bin/helloworld_jni

Exercises

This is a group exercise: we will look at one of the projects you work with and try to integrate some Rust into it. Some suggestions:

  • Call your AIDL service with a client written in Rust.

  • Move a function from your project to Rust and call it.

No solution is provided here since this is open-ended: it relies on someone in the class having a piece of code which you can turn into Rust on the fly.

Welcome to Rust in Chromium

Rust is supported for third-party libraries in Chromium, with first-party glue code to connect between Rust and existing Chromium C++ code.

Today, we’ll call into Rust to do something silly with strings. If you’ve got a corner of the code where you’re displaying a UTF-8 string to the user, feel free to follow this recipe in your part of the codebase instead of the exact part we talk about.

Setup

Make sure you can build and run Chromium. Any platform and set of build flags is OK, so long as your code is relatively recent (commit position 1223636 onwards, corresponding to November 2023):

gn gen out/Debug
autoninja -C out/Debug chrome
out/Debug/chrome # or on Mac, out/Debug/Chromium.app/Contents/MacOS/Chromium

(A component, debug build is recommended for quickest iteration time. This is the default!)

See How to build Chromium if you aren’t already at that point. Be warned: setting up to build Chromium takes time.

It’s also recommended that you have Visual Studio Code installed.

About the exercises

This part of the course has a series of exercises which build on each other. We’ll be doing them spread throughout the course instead of just at the end. If you don’t have time to complete a certain part, don’t worry: you can catch up in the next slot.

Comparing Chromium and Cargo Ecosystems

The Rust community typically uses cargo and libraries from crates.io. Chromium is built using gn and ninja and a curated set of dependencies.

When writing code in Rust, your choices are:

From here on we’ll be focusing on gn and ninja, because this is how Rust code can be built into the Chromium browser. At the same time, Cargo is an important part of the Rust ecosystem and you should keep it in your toolbox.

Mini exercise

Split into small groups and:

  • Brainstorm scenarios where cargo may offer an advantage and assess the risk profile of these scenarios.
  • Discuss which tools, libraries, and groups of people need to be trusted when using gn and ninja, offline cargo, etc.

Ask students to avoid peeking at the speaker notes before completing the exercise. Assuming folks taking the course are physically together, ask them to discuss in small groups of 3-4 people.

Notes/hints related to the first part of the exercise (“scenarios where Cargo may offer an advantage”):

  • It’s fantastic that when writing a tool, or prototyping a part of Chromium, one has access to the rich ecosystem of crates.io libraries. There is a crate for almost anything and they are usually quite pleasant to use. (clap for command-line parsing, serde for serializing/deserializing to/from various formats, itertools for working with iterators, etc.).

    • cargo makes it easy to try a library (just add a single line to Cargo.toml and start writing code)
    • It may be worth comparing how CPAN helped make perl a popular choice. Or comparing with python + pip.
  • Development experience is made really nice not only by core Rust tools (e.g. using rustup to switch to a different rustc version when testing a crate that needs to work on nightly, current stable, and older stable) but also by an ecosystem of third-party tools (e.g. Mozilla provides cargo vet for streamlining and sharing security audits; criterion crate gives a streamlined way to run benchmarks).

    • cargo makes it easy to add a tool via cargo install --locked cargo-vet.
    • It may be worth comparing with Chrome Extensions or VSCode extensions.
  • Broad, generic examples of projects where cargo may be the right choice:

    • Perhaps surprisingly, Rust is becoming increasingly popular in the industry for writing command line tools. The breadth and ergonomics of libraries is comparable to Python, while being more robust (thanks to the rich type system) and running faster (as a compiled, rather than interpreted language).
    • Participating in the Rust ecosystem requires using standard Rust tools like Cargo. Libraries that want to get external contributions, and want to be used outside of Chromium (e.g. in Bazel or Android/Soong build environments) should probably use Cargo.
  • Examples of Chromium-related projects that are cargo-based:

    • serde_json_lenient (experimented with in other parts of Google which resulted in PRs with performance improvements)
    • Fontations libraries like font-types
    • gnrt tool (we will meet it later in the course) which depends on clap for command-line parsing and on toml for configuration files.
      • Disclaimer: a unique reason for using cargo was the unavailability of gn when building and bootstrapping the Rust standard library as part of building the Rust toolchain.
      • run_gnrt.py uses Chromium’s copy of cargo and rustc. gnrt depends on third-party libraries downloaded from the internet, but run_gnrt.py asks cargo to allow only --locked content via Cargo.lock.

Students may identify the following items as being implicitly or explicitly trusted:

  • rustc (the Rust compiler) which in turn depends on the LLVM libraries, the Clang compiler, the rustc sources (fetched from GitHub, reviewed by Rust compiler team), binary Rust compiler downloaded for bootstrapping
  • rustup (it may be worth pointing out that rustup is developed under the umbrella of the https://github.com/rust-lang/ organization - same as rustc)
  • cargo, rustfmt, etc.
  • Various internal infrastructure (bots that build rustc, system for distributing the prebuilt toolchain to Chromium engineers, etc.)
  • Cargo tools like cargo audit, cargo vet, etc.
  • Rust libraries vendored into //third_party/rust (audited by security@chromium.org)
  • Other Rust libraries (some niche, some quite popular and commonly used)

Chromium Rust policy

Chromium does not yet allow first-party Rust except in rare cases as approved by Chromium’s Area Tech Leads.

Chromium’s policy on third party libraries is outlined here - Rust is allowed for third party libraries under various circumstances, including if they’re the best option for performance or for security.

Very few Rust libraries directly expose a C/C++ API, so that means that nearly all such libraries will require a small amount of first-party glue code.

[Diagram: existing Chromium C++, language boundary, Chromium Rust wrapper, crate API, existing Rust crate]

First-party Rust glue code for a particular third-party crate should normally be kept in third_party/rust/<crate>/<version>/wrapper.

Because of this, today’s course will be heavily focused on:

  • Bringing in third-party Rust libraries (“crates”)
  • Writing glue code to be able to use those crates from Chromium C++.

If this policy changes over time, the course will evolve to keep up.

Build rules

Rust code is usually built using cargo. Chromium builds with gn and ninja for efficiency — its static rules allow maximum parallelism. Rust is no exception.

Adding Rust code to Chromium

In some existing Chromium BUILD.gn file, declare a rust_static_library:

import("//build/rust/rust_static_library.gni")

rust_static_library("my_rust_lib") {
  crate_root = "lib.rs"
  sources = [ "lib.rs" ]
}

You can also add deps on other Rust targets. Later we’ll use this to depend upon third party code.

You must specify both the crate root, and a full list of sources. The crate_root is the file given to the Rust compiler representing the root file of the compilation unit — typically lib.rs. sources is a complete list of all source files which ninja needs in order to determine when rebuilds are necessary.

(There’s no such thing as a Rust source_set, because in Rust, an entire crate is a compilation unit. A static_library is the smallest unit.)

Students might be wondering why we need a gn template, rather than using gn’s built-in support for Rust static libraries. The answer is that this template provides support for CXX interop, Rust features, and unit tests, some of which we’ll use later.

Including unsafe Rust Code

Unsafe Rust code is forbidden in rust_static_library by default — it won’t compile. If you need unsafe Rust code, add allow_unsafe = true to the gn target. (Later in the course we’ll see circumstances where this is necessary.)

import("//build/rust/rust_static_library.gni")

rust_static_library("my_rust_lib") {
  crate_root = "lib.rs"
  sources = [
    "lib.rs",
    "hippopotamus.rs"
  ]
  allow_unsafe = true
}

Depending on Rust Code from Chromium C++

Simply add the above target to the deps of some Chromium C++ target.

import("//build/rust/rust_static_library.gni")

rust_static_library("my_rust_lib") {
  crate_root = "lib.rs"
  sources = [ "lib.rs" ]
}

# or source_set, static_library etc.
component("preexisting_cpp") {
  deps = [ ":my_rust_lib" ]
}
We'll see that this relationship only works if the Rust code exposes plain C APIs which can be called from C++, or if we use a C++/Rust interop tool.

Visual Studio Code

Types are elided in Rust code, which makes a good IDE even more useful than for C++. Visual Studio Code works well for Rust in Chromium. To use it:

  • Ensure your VSCode has the rust-analyzer extension, not earlier forms of Rust support
  • gn gen out/Debug --export-rust-project (or equivalent for your output directory)
  • ln -s out/Debug/rust-project.json rust-project.json
Example screenshot from VSCode

A demo of some of the code annotation and exploration features of rust-analyzer might be beneficial if the audience are naturally skeptical of IDEs.

The following steps may help with the demo (but feel free to instead use a piece of Chromium-related Rust that you are most familiar with):

  • Open components/qr_code_generator/qr_code_generator_ffi_glue.rs
  • Place the cursor over the QrCode::new call (around line 26) in qr_code_generator_ffi_glue.rs
  • Demo show documentation (typical bindings: vscode = ctrl k i; vim/CoC = K).
  • Demo go to definition (typical bindings: vscode = F12; vim/CoC = g d). (This will take you to //third_party/rust/.../qr_code-.../src/lib.rs.)
  • Demo outline and navigate to the QrCode::with_bits method (around line 164; the outline is in the file explorer pane in vscode; typical vim/CoC bindings = space o)
  • Demo type annotations (there are quite a few nice examples in the QrCode::with_bits method)

It may be worth pointing out that gn gen ... --export-rust-project will need to be rerun after editing BUILD.gn files (which we will do a few times throughout the exercises in this session).

Build rules exercise

In your Chromium build, add a new Rust target to //ui/base/BUILD.gn containing:

#[no_mangle]
pub extern "C" fn hello_from_rust() {
    println!("Hello from Rust!")
}

Important: note that no_mangle here is considered a type of unsafety by the Rust compiler, so you’ll need to allow unsafe code in your gn target.

Add this new Rust target as a dependency of //ui/base:base. Declare this function at the top of ui/base/resource/resource_bundle.cc (later, we’ll see how this can be automated by bindings generation tools):

extern "C" void hello_from_rust();

Call this function from somewhere in ui/base/resource/resource_bundle.cc - we suggest the top of ResourceBundle::MaybeMangleLocalizedString. Build and run Chromium, and ensure that “Hello from Rust!” is printed lots of times.

If you use VSCode, now set up Rust to work well in VSCode. It will be useful in subsequent exercises. If you’ve succeeded, you will be able to use right-click “Go to definition” on println!.

Where to find help

It's really important that students get this running, because future exercises will build on it.

This example is unusual because it boils down to the lowest-common-denominator interop language, C. Both C++ and Rust can natively declare and call C ABI functions. Later in the course, we’ll connect C++ directly to Rust.

allow_unsafe = true is required here because #[no_mangle] might allow Rust to generate two functions with the same name, and Rust can no longer guarantee that the right one is called.

If you need a pure Rust executable, you can also do that using the rust_executable gn template.

Testing

The Rust community typically authors unit tests in a module placed in the same source file as the code being tested. This was covered earlier in the course and looks like this:

#[cfg(test)]
mod tests {
    #[test]
    fn my_test() {
        todo!()
    }
}

In Chromium we place unit tests in a separate source file and we continue to follow this practice for Rust — this makes tests consistently discoverable and helps to avoid rebuilding .rs files a second time (in the test configuration).

This results in the following options for testing Rust code in Chromium:

  • Native Rust tests (i.e. #[test]). Discouraged outside of //third_party/rust.
  • gtest tests authored in C++ and exercising Rust via FFI calls. Sufficient when Rust code is just a thin FFI layer and the existing unit tests provide sufficient coverage for the feature.
  • gtest tests authored in Rust and using the crate under test through its public API (using pub mod for_testing { ... } if needed). This is the subject of the next few slides.

Mention that native Rust tests of third-party crates should eventually be exercised by Chromium bots. (Such testing is needed rarely — only after adding or updating third-party crates.)

Some examples may help illustrate when C++ gtest vs Rust gtest should be used:

  • QR has very little functionality in the first-party Rust layer (it’s just a thin FFI glue) and therefore uses the existing C++ unit tests for testing both the C++ and the Rust implementation (parameterizing the tests so they enable or disable Rust using a ScopedFeatureList).

  • Hypothetical/WIP PNG integration may need memory-safe implementations of pixel transformations that are provided by libpng but missing in the png crate - e.g. RGBA => BGRA, or gamma correction. Such functionality may benefit from separate tests authored in Rust.

rust_gtest_interop Library

The rust_gtest_interop library provides a way to:

  • Use a Rust function as a gtest testcase (using the #[gtest(...)] attribute)
  • Use expect_eq! and similar macros (similar to assert_eq! but not panicking and not terminating the test when the assertion fails).

Example:

use rust_gtest_interop::prelude::*;

#[gtest(MyRustTestSuite, MyAdditionTest)]
fn test_addition() {
    expect_eq!(2 + 2, 4);
}
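
The difference between expect_eq! and assert_eq! is that a failed expectation is recorded and the test keeps running, so one run can report several failures. The plain-Rust sketch below imitates that behavior to make it concrete. Expectations, expect_eq, and run_checks are hypothetical names written for this illustration and are not part of rust_gtest_interop.

```rust
/// Collects failed expectations instead of panicking (hypothetical helper,
/// not part of `rust_gtest_interop`).
#[derive(Default)]
struct Expectations {
    failures: Vec<String>,
}

impl Expectations {
    /// Records a failure when the values differ, then lets the test continue.
    fn expect_eq<T: PartialEq + std::fmt::Debug>(&mut self, actual: T, expected: T) {
        if actual != expected {
            self.failures.push(format!("expected {expected:?}, got {actual:?}"));
        }
    }

    /// True when no expectation has failed so far.
    fn passed(&self) -> bool {
        self.failures.is_empty()
    }
}

fn run_checks() -> Expectations {
    let mut t = Expectations::default();
    t.expect_eq(2 + 2, 4); // passes
    t.expect_eq(2 + 2, 5); // fails, but execution continues
    t.expect_eq("a".len(), 2); // still evaluated: a second failure is recorded
    t
}

fn main() {
    let t = run_checks();
    println!("passed: {}", t.passed());
    for failure in &t.failures {
        println!("  {failure}");
    }
}
```

With assert_eq!, the second check would panic and the third would never run.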

GN Rules for Rust Tests

The simplest way to build Rust gtest tests is to add them to an existing test binary that already contains tests authored in C++. For example:

test("ui_base_unittests") {
  ...
  sources += [ "my_rust_lib_unittest.rs" ]
  deps += [ ":my_rust_lib" ]
}

Authoring Rust tests in a separate static_library also works, but requires manually declaring the dependency on the support libraries:

rust_static_library("my_rust_lib_unittests") {
  testonly = true
  is_gtest_unittests = true
  crate_root = "my_rust_lib_unittest.rs"
  sources = [ "my_rust_lib_unittest.rs" ]
  deps = [
    ":my_rust_lib",
    "//testing/rust_gtest_interop",
  ]
}

test("ui_base_unittests") {
  ...
  deps += [ ":my_rust_lib_unittests" ]
}

chromium::import! Macro

After adding :my_rust_lib to GN deps, we still need to learn how to import and use my_rust_lib from my_rust_lib_unittest.rs. We haven’t provided an explicit crate_name for my_rust_lib so its crate name is computed based on the full target path and name. Fortunately we can avoid working with such an unwieldy name by using the chromium::import! macro from the automatically-imported chromium crate:

chromium::import! {
    "//ui/base:my_rust_lib";
}

use my_rust_lib::my_function_under_test;

Under the covers the macro expands to something similar to:

extern crate ui_sbase_cmy_urust_ulib as my_rust_lib;

use my_rust_lib::my_function_under_test;

More information can be found in the doc comment of the chromium::import macro.

rust_static_library supports specifying an explicit name via the crate_name property, but doing this is discouraged because the crate name has to be globally unique. crates.io guarantees uniqueness of its crate names, so cargo_crate GN targets (generated by the gnrt tool covered in a later section) use short crate names.
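
The computed crate name in the expansion above looks like an escaped form of the GN label. The sketch below reproduces that one example in plain Rust. The escaping rules (`/` becomes `_s`, `:` becomes `_c`, `_` becomes `_u`) are inferred only from the single name shown here, and `mangle_gn_target` is a hypothetical helper, so treat this as an illustration rather than the rule the build system actually applies.

```rust
/// Hypothetical reconstruction of the crate-name escaping, inferred only from
/// the example "//ui/base:my_rust_lib" -> "ui_sbase_cmy_urust_ulib".
pub fn mangle_gn_target(target: &str) -> String {
    let mut out = String::new();
    // Drop the leading "//" of an absolute GN label.
    for ch in target.trim_start_matches("//").chars() {
        match ch {
            '/' => out.push_str("_s"), // assumed escape for a path separator
            ':' => out.push_str("_c"), // assumed escape for the target separator
            '_' => out.push_str("_u"), // assumed escape for a literal underscore
            other => out.push(other),
        }
    }
    out
}
```

Seeing the escaping spelled out makes it clear why the chromium::import! macro is the more comfortable option.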

Testing exercise

Time for another exercise!

In your Chromium build:

  • Add a testable function next to hello_from_rust. Some suggestions: adding two integers received as arguments, computing the nth Fibonacci number, summing integers in a slice, etc.
  • Add a separate ..._unittest.rs file with a test for the new function.
  • Add the new tests to BUILD.gn.
  • Build the tests, run them, and verify that the new test works.
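
As a starting point, the suggested functions could look like the sketch below. The function names are illustrative choices, not names required by the exercise; in a Chromium checkout they would sit next to hello_from_rust and be exercised from a separate ..._unittest.rs file.

```rust
/// Adds two integers received as arguments.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

/// Computes the nth Fibonacci number iteratively (fib(0) = 0, fib(1) = 1).
/// Intended for small `n`; the u64 arithmetic overflows for n around 93 and above.
pub fn fibonacci(n: u32) -> u64 {
    let (mut prev, mut curr) = (0u64, 1u64);
    for _ in 0..n {
        let next = prev + curr;
        prev = curr;
        curr = next;
    }
    prev
}

/// Sums the integers in a slice; an empty slice sums to zero.
pub fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}
```

Pure functions like these are easy to check with expect_eq!, since each has an obvious expected value for a given input.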

Interoperability with C++

The Rust community offers multiple options for C++/Rust interop, with new tools being developed all the time. At the moment, Chromium uses a tool called CXX.

You describe your whole language boundary in an interface definition language (which looks a lot like Rust) and then CXX tools generate declarations for functions and types in both Rust and C++.

Overview diagram of cxx, showing that the same interface definition is used to create both C++ and Rust side code which then communicate via a lowest common denominator C API

See the CXX tutorial for a full example of using this.

Talk through the diagram. Explain that behind the scenes, this is doing just the same as you previously did. Point out that automating the process has the following benefits:

  • The tool guarantees that the C++ and Rust sides match (e.g. you get compile errors if the #[cxx::bridge] doesn’t match the actual C++ or Rust definitions, but with out-of-sync manual bindings you’d get Undefined Behavior)
  • The tool automates generation of FFI thunks (small, C-ABI-compatible, free functions) for non-C features (e.g. enabling FFI calls into Rust or C++ methods; manual bindings would require authoring such top-level, free functions manually)
  • The tool and the library can handle a set of core types - for example:
    • &[T] can be passed across the FFI boundary, even though it doesn’t guarantee any particular ABI or memory layout. With manual bindings, std::span<T> / &[T] have to be manually destructured and rebuilt out of a pointer and length. This is error-prone given that each language represents empty slices slightly differently.
    • Smart pointers like std::unique_ptr<T>, std::shared_ptr<T>, and/or Box are natively supported. With manual bindings, one would have to pass C-ABI-compatible raw pointers, which would increase lifetime and memory-safety risks.
    • rust::String and CxxString types understand and maintain differences in string representation across the languages (e.g. rust::String::lossy can build a Rust string from non-UTF8 input and rust::String::c_str can NUL-terminate a string).
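
The slice point above can be illustrated with a sketch of what a manual binding has to do on the Rust side. `slice_from_parts` is a hypothetical helper, not something CXX or Chromium provides; it shows the empty-slice pitfall, since `std::slice::from_raw_parts` requires a non-null pointer even when the length is zero, while an empty C++ span may carry a null data pointer.

```rust
use std::slice;

/// Rebuilds a Rust slice from a (pointer, length) pair, as a manual binding
/// would have to do.
///
/// # Safety
/// If `len > 0` and `ptr` is non-null, `ptr` must point to `len` readable
/// bytes that stay valid and unmodified for the lifetime `'a`.
pub unsafe fn slice_from_parts<'a>(ptr: *const u8, len: usize) -> &'a [u8] {
    if len == 0 || ptr.is_null() {
        // An empty C++ span may have a null data pointer, which
        // `slice::from_raw_parts` does not accept, so return a valid empty
        // slice instead. (A null pointer with a non-zero length is also
        // mapped to an empty slice in this sketch.)
        &[]
    } else {
        // SAFETY: guaranteed by the caller, see the safety contract above.
        unsafe { slice::from_raw_parts(ptr, len) }
    }
}
```

With CXX this special case is handled by the generated code and support library, so it does not have to be repeated at every call site.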

Examples

CXX requires that the whole C++/Rust boundary is declared in cxx::bridge modules inside .rs source code.

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        type MultiBuf;

        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        include!("example/include/blobstore.h");

        type BlobstoreClient;

        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(self: &BlobstoreClient, buf: &mut MultiBuf) -> Result<u64>;
    }
}

// Definitions of Rust types and functions go here

Point out:

  • Although this looks like a regular Rust mod, the #[cxx::bridge] procedural macro does complex things to it. The generated code is quite a bit more sophisticated - though this does still result in a mod called ffi in your code.
  • Native support for C++’s std::unique_ptr in Rust
  • Native support for Rust slices in C++
  • Calls from C++ to Rust, and Rust types (in the top part)
  • Calls from Rust to C++, and C++ types (in the bottom part)

Common misconception: It looks like a C++ header is being parsed by Rust, but this is misleading. This header is never interpreted by Rust, but simply #included in the generated C++ code for the benefit of C++ compilers.

Limitations of CXX

By far the most useful page when using CXX is the type reference.

CXX fundamentally suits cases where:

  • Your Rust-C++ interface is sufficiently simple that you can declare all of it.
  • You’re using only the types natively supported by CXX already, for example std::unique_ptr, std::string, &[u8] etc.

It has many limitations — for example lack of support for Rust’s Option type.

These limitations constrain us to using Rust in Chromium only for well isolated “leaf nodes” rather than for arbitrary Rust-C++ interop. When considering a use-case for Rust in Chromium, a good starting point is to draft the CXX bindings for the language boundary to see if it appears simple enough.

In addition, right now, Rust code in one component cannot depend on Rust code in another, due to linking details in our component build. That's another reason to restrict Rust to use in leaf nodes.

You should also discuss some of the other sticky points with CXX, for example:

  • Its error handling is based around C++ exceptions (given on the next slide)
  • Function pointers are awkward to use.

Error Handling

CXX’s support for Result<T,E> relies on C++ exceptions, so we can’t use that in Chromium. Alternatives:

  • The T part of Result<T, E> can be:

    • Returned via out parameters (e.g. via &mut T). This requires that T can be passed across the FFI boundary - for example T has to be:
      • A primitive type (like u32 or usize)
      • A type natively supported by cxx (like UniquePtr<T>) that has a suitable default value to use in a failure case (unlike Box<T>).
    • Retained on the Rust side, and exposed via reference. This may be needed when T is a Rust type, which cannot be passed across the FFI boundary, and cannot be stored in UniquePtr<T>.
  • The E part of Result<T, E> can be:

    • Returned as a boolean (e.g. true representing success, and false representing failure)
    • Preserving error details is in theory possible, but so far hasn’t been needed in practice.
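
The "boolean plus out parameter" shape described above can be sketched in plain Rust, without any CXX machinery. Both functions below are hypothetical examples: `parse_port` stands in for an ordinary Rust function returning a Result, and `parse_port_ffi` is the wrapper that would be declared in a bridge.

```rust
/// Ordinary Rust function returning a `Result` (hypothetical example).
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}

/// FFI-friendly wrapper: success is reported as a `bool`, and the `T` part is
/// written through an out parameter. On failure `out` is left unchanged.
pub fn parse_port_ffi(text: &str, out: &mut u16) -> bool {
    match parse_port(text) {
        Ok(port) => {
            *out = port;
            true
        }
        // The error details are dropped; only success or failure crosses over.
        Err(_) => false,
    }
}
```

The caller must initialize the out parameter before the call, because it is passed as a Rust reference.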

CXX Error Handling: QR Example

The QR code generator is an example where a boolean is used to communicate success vs failure, and where the successful result can be passed across the FFI boundary:

#[cxx::bridge(namespace = "qr_code_generator")]
mod ffi {
    extern "Rust" {
        fn generate_qr_code_using_rust(
            data: &[u8],
            min_version: i16,
            out_pixels: Pin<&mut CxxVector<u8>>,
            out_qr_size: &mut usize,
        ) -> bool;
    }
}

Students may be curious about the semantics of the out_qr_size output. This is not the size of the vector, but the size of the QR code (and admittedly it is a bit redundant - this is the square root of the size of the vector).

It may be worth pointing out the importance of initializing out_qr_size before calling into the Rust function. Creation of a Rust reference that points to uninitialized memory results in Undefined Behavior (unlike in C++, where only the act of dereferencing such memory results in UB).

If students ask about Pin, then explain why CXX needs it for mutable references to C++ data: the answer is that C++ data can’t be moved around like Rust data, because it may contain self-referential pointers.

CXX Error Handling: PNG Example

A prototype of a PNG decoder illustrates what can be done when the successful result cannot be passed across the FFI boundary:

#[cxx::bridge(namespace = "gfx::rust_bindings")]
mod ffi {
    extern "Rust" {
        /// This returns an FFI-friendly equivalent of `Result<PngReader<'a>,
        /// ()>`.
        fn new_png_reader<'a>(input: &'a [u8]) -> Box<ResultOfPngReader<'a>>;

        /// C++ bindings for the `crate::png::ResultOfPngReader` type.
        type ResultOfPngReader<'a>;
        fn is_err(self: &ResultOfPngReader) -> bool;
        fn unwrap_as_mut<'a, 'b>(
            self: &'b mut ResultOfPngReader<'a>,
        ) -> &'b mut PngReader<'a>;

        /// C++ bindings for the `crate::png::PngReader` type.
        type PngReader<'a>;
        fn height(self: &PngReader) -> u32;
        fn width(self: &PngReader) -> u32;
        fn read_rgba8(self: &mut PngReader, output: &mut [u8]) -> bool;
    }
}

PngReader and ResultOfPngReader are Rust types — objects of these types cannot cross the FFI boundary without indirection of a Box<T>. We can’t have an out_parameter: &mut PngReader, because CXX doesn’t allow C++ to store Rust objects by value.

This example illustrates that even though CXX doesn’t support arbitrary generics nor templates, we can still pass them across the FFI boundary by manually specializing / monomorphizing them into a non-generic type. In the example ResultOfPngReader is a non-generic type that forwards into appropriate methods of Result<T, E> (e.g. into is_err, unwrap, and/or as_mut).
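
The Rust side of such a manually monomorphized wrapper might look like the sketch below. `Reader`, `ResultOfReader` and `new_reader` are simplified, hypothetical stand-ins for the PNG types in the bridge (lifetimes and real decoding are omitted); the point is only how a non-generic type forwards to Result methods.

```rust
/// Stand-in for a Rust type such as `PngReader` (hypothetical field).
pub struct Reader {
    width: u32,
}

impl Reader {
    pub fn width(&self) -> u32 {
        self.width
    }
}

/// Non-generic wrapper around `Result<Reader, ()>`.
pub struct ResultOfReader(Result<Reader, ()>);

impl ResultOfReader {
    /// Forwards to `Result::is_err`.
    pub fn is_err(&self) -> bool {
        self.0.is_err()
    }

    /// Forwards to `Result::as_mut` followed by `unwrap`; panics on an error.
    pub fn unwrap_as_mut(&mut self) -> &mut Reader {
        self.0.as_mut().unwrap()
    }
}

/// Returns the wrapper behind a `Box`, since Rust objects cross by indirection.
pub fn new_reader(input: &[u8]) -> Box<ResultOfReader> {
    // Hypothetical parsing rule: the first byte is the width, and empty input
    // is treated as an error.
    let result = match input.first() {
        Some(&w) => Ok(Reader { width: u32::from(w) }),
        None => Err(()),
    };
    Box::new(ResultOfReader(result))
}
```

The caller is expected to check is_err before calling unwrap_as_mut, mirroring how the C++ side would use the bindings.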

Using cxx in Chromium

In Chromium, we define an independent #[cxx::bridge] mod for each leaf-node where we want to use Rust. You’d typically have one for each rust_static_library. Just add

cxx_bindings = [ "my_rust_file.rs" ]
   # list of files containing #[cxx::bridge], not all source files
allow_unsafe = true

to your existing rust_static_library target alongside crate_root and sources.

C++ headers will be generated at a sensible location, so you can just

#include "ui/base/my_rust_file.rs.h"

You will find some utility functions in //base to convert to/from Chromium C++ types to CXX Rust types — for example SpanToRustSlice.

Students may ask — why do we still need allow_unsafe = true?

The broad answer is that no C/C++ code is “safe” by the normal Rust standards. Calling back and forth to C/C++ from Rust may do arbitrary things to memory, and compromise the safety of Rust’s own data layouts. Presence of too many unsafe keywords in C/C++ interop can harm the signal-to-noise ratio of such a keyword, and is controversial, but strictly, bringing any foreign code into a Rust binary can cause unexpected behavior from Rust’s perspective.

The narrow answer lies in the diagram at the top of this page — behind the scenes, CXX generates Rust unsafe and extern "C" functions just like we did manually in the previous section.

Exercise: Interoperability with C++

Part one

  • In the Rust file you previously created, add a #[cxx::bridge] which specifies a single function, to be called from C++, called hello_from_rust, taking no parameters and returning no value.
  • Modify your previous hello_from_rust function to remove extern "C" and #[no_mangle]. This is now just a standard Rust function.
  • Modify your gn target to build these bindings.
  • In your C++ code, remove the forward-declaration of hello_from_rust. Instead, include the generated header file.
  • Build and run!

Part two

It’s a good idea to play with CXX a little. It helps you think about how flexible Rust in Chromium actually is.

Some things to try:

  • Call back into C++ from Rust. You will need:
    • An additional header file which you can include! from your cxx::bridge. You’ll need to declare your C++ function in that new header file.
    • An unsafe block to call such a function, or alternatively specify the unsafe keyword in your #[cxx::bridge] as described here.
    • You may also need to #include "third_party/rust/cxx/v1/crate/include/cxx.h"
  • Pass a C++ string from C++ into Rust.
  • Pass a reference to a C++ object into Rust.
  • Intentionally get the Rust function signatures mismatched from the #[cxx::bridge], and get used to the errors you see.
  • Intentionally get the C++ function signatures mismatched from the #[cxx::bridge], and get used to the errors you see.
  • Pass a std::unique_ptr of some type from C++ into Rust, so that Rust can own some C++ object.
  • Create a Rust object and pass it into C++, so that C++ owns it. (Hint: you need a Box).
  • Declare some methods on a C++ type. Call them from Rust.
  • Declare some methods on a Rust type. Call them from C++.

Part three

Now that you understand the strengths and limitations of CXX interop, think of a couple of use-cases for Rust in Chromium where the interface would be sufficiently simple. Sketch how you might define that interface.

Where to find help

As students explore Part Two, they're bound to have lots of questions about how to achieve these things, and also how CXX works behind the scenes.

Some of the questions you may encounter:

  • I’m seeing a problem initializing a variable of type X with type Y, where X and Y are both function types. This is because your C++ function doesn’t quite match the declaration in your cxx::bridge.
  • I seem to be able to freely convert C++ references into Rust references. Doesn’t that risk UB? For CXX’s opaque types, no, because they are zero-sized. For CXX trivial types yes, it’s possible to cause UB, although CXX’s design makes it quite difficult to craft such an example.

Adding Third Party Crates

Rust libraries are called “crates” and are found at crates.io. It’s very easy for Rust crates to depend upon one another. So they do!

Property                 | C++ library | Rust crate
Build system             | Lots        | Consistent: Cargo.toml
Typical library size     | Large-ish   | Small
Transitive dependencies  | Few         | Lots

For a Chromium engineer, this has pros and cons:

  • All crates use a common build system so we can automate their inclusion into Chromium…
  • … but, crates typically have transitive dependencies, so you will likely have to bring in multiple libraries.

We’ll discuss:

  • How to put a crate in the Chromium source code tree
  • How to make gn build rules for it
  • How to audit its source code for sufficient safety.

All of the things in the table on this slide are generalizations, and counter-examples can be found. But in general it's important for students to understand that most Rust code depends on other Rust libraries, because it's easy to do so, and that this has both benefits and costs.

Configuring the Cargo.toml file to add crates

Chromium has a single set of centrally-managed direct crate dependencies. These are managed through a single Cargo.toml:

[dependencies]
bitflags = "1"
cfg-if = "1"
cxx = "1"
# lots more...

As with any other Cargo.toml, you can specify more details about the dependencies — most commonly, you’ll want to specify the features that you wish to enable in the crate.

When adding a crate to Chromium, you’ll often need to provide some extra information in an additional file, gnrt_config.toml, which we’ll meet next.

Configuring gnrt_config.toml

Alongside Cargo.toml is gnrt_config.toml. This contains Chromium-specific extensions to crate handling.

If you add a new crate, you should specify at least the group. This is one of:

#   'safe': The library satisfies the rule-of-2 and can be used in any process.
#   'sandbox': The library does not satisfy the rule-of-2 and must be used in
#              a sandboxed process such as the renderer or a utility process.
#   'test': The library is only used in tests.

For instance,

[crate.my-new-crate]
group = 'test' # only used in test code

Depending on the crate source code layout, you may also need to use this file to specify where its LICENSE file(s) can be found.

Later, we’ll see some other things you will need to configure in this file to resolve problems.

Downloading Crates

A tool called gnrt knows how to download crates and how to generate BUILD.gn rules.

To start, download the crate you want like this:

cd chromium/src
vpython3 tools/crates/run_gnrt.py -- vendor

Although the gnrt tool is part of the Chromium source code, by running this command you will be downloading and running its dependencies from crates.io. See the earlier section discussing this security decision.

This vendor command may download:

  • Your crate
  • Direct and transitive dependencies
  • New versions of other crates, as required by cargo to resolve the complete set of crates required by Chromium.

Chromium maintains patches for some crates, kept in //third_party/rust/chromium_crates_io/patches. These will be reapplied automatically, but if patching fails you may need to take manual action.

Generating gn Build Rules

Once you’ve downloaded the crate, generate the BUILD.gn files like this:

vpython3 tools/crates/run_gnrt.py -- gen

Now run git status. You should find:

  • At least one new crate source code in third_party/rust/chromium_crates_io/vendor
  • At least one new BUILD.gn in third_party/rust/<crate name>/v<major semver version>
  • An appropriate README.chromium

The “major semver version” is a Rust “semver” version number.

Take a close look, especially at the things generated in third_party/rust.

Talk a little about semver, and specifically about how Chromium uses it to allow multiple incompatible versions of a crate, which is discouraged but sometimes necessary in the Cargo ecosystem.
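
A small sketch may help when talking about the version directory. The helper below is hypothetical and only mirrors Cargo's compatibility rule (for 0.y.z releases the minor version is the compatibility boundary). It is consistent with the paths that appear in this course, such as v1 and uwuify/v0_2, but the exact naming used by gnrt is an assumption here, and 0.0.z versions are not modelled.

```rust
/// Hypothetical helper deriving the `v<...>` directory component from a crate
/// version string such as "1.3.2" or "0.2.2".
pub fn version_dir(version: &str) -> Option<String> {
    let mut parts = version.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    if major > 0 {
        // From 1.0.0 onwards only the major version marks incompatibility.
        return Some(format!("v{major}"));
    }
    // For 0.y.z, Cargo treats a change in the minor version as incompatible.
    let minor: u64 = parts.next()?.parse().ok()?;
    Some(format!("v0_{minor}"))
}
```

Two versions that map to different directories can coexist in the tree, which is how incompatible versions of the same crate are kept apart.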

Resolving Problems

If your build fails, it may be because of a build.rs: programs which do arbitrary things at build time. This is fundamentally at odds with the design of gn and ninja which aim for static, deterministic, build rules to maximize parallelism and repeatability of builds.

Some build.rs actions are automatically supported; others require action:

Build script effect                                        | Supported by our gn templates | Work required by you
Checking rustc version to configure features on and off   | Yes                           | None
Checking platform or CPU to configure features on and off | Yes                           | None
Generating code                                            | Yes                           | Yes - specify in gnrt_config.toml
Building C/C++                                             | No                            | Patch around it
Arbitrary other actions                                    | No                            | Patch around it

Fortunately, most crates don’t contain a build script, and fortunately, most build scripts only do the top two actions.

Build Scripts Which Generate Code

If ninja complains about missing files, check the build.rs to see if it writes source code files.

If so, modify gnrt_config.toml to add build-script-outputs to the crate. If this is a transitive dependency, that is, one on which Chromium code should not directly depend, also add allow-first-party-usage=false. There are several examples already in that file:

[crate.unicode-linebreak]
allow-first-party-usage = false
build-script-outputs = ["tables.rs"]

Now rerun vpython3 tools/crates/run_gnrt.py -- gen to regenerate the BUILD.gn files, which informs ninja that this particular output file is an input to subsequent build steps.
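
For readers who have not seen a code-generating build script, the sketch below shows the general shape. The table content and the function names are hypothetical; a real build.rs would call something like `write_tables` from its `main`, passing the directory named by Cargo's OUT_DIR environment variable.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Renders Rust source for a small lookup table (hypothetical content).
pub fn render_table() -> String {
    let squares: Vec<String> = (0u32..8).map(|n| (n * n).to_string()).collect();
    format!("pub static SQUARES: [u32; 8] = [{}];\n", squares.join(", "))
}

/// Writes the generated source to `<out_dir>/tables.rs`, as a build script
/// would do with the directory named by `OUT_DIR`.
pub fn write_tables(out_dir: &Path) -> io::Result<PathBuf> {
    let path = out_dir.join("tables.rs");
    fs::write(&path, render_table())?;
    Ok(path)
}
```

The crate then typically pulls the file in with include!(concat!(env!("OUT_DIR"), "/tables.rs")), which is why ninja has to know about tables.rs as a build output.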

Build Scripts Which Build C++ or Take Arbitrary Actions

Some crates use the cc crate to build and link C/C++ libraries. Other crates parse C/C++ using bindgen within their build scripts. These actions can’t be supported in a Chromium context — our gn, ninja and LLVM build system is very specific in expressing relationships between build actions.

So, your options are:

  • Avoid these crates
  • Apply a patch to the crate.

Patches should be kept in third_party/rust/chromium_crates_io/patches/<crate> - see for example the patches against the cxx crate - and will be applied automatically by gnrt each time it upgrades the crate.

Depending on a Crate

Once you’ve added a third-party crate and generated build rules, depending on a crate is simple. Find your rust_static_library target, and add a dep on the :lib target within your crate.

Specifically,

//third_party/rust/<crate name>/v<major semver version>:lib

For instance,

rust_static_library("my_rust_lib") {
  crate_root = "lib.rs"
  sources = [ "lib.rs" ]
  deps = [ "//third_party/rust/example_rust_crate/v1:lib" ]
}

Auditing Third Party Crates

Adding new libraries is subject to Chromium’s standard policies, but of course also subject to security review. As you may be bringing in not just a single crate but also transitive dependencies, there may be a lot of code to review. On the other hand, safe Rust code can have limited negative side effects. How should you review it?

Over time Chromium aims to move to a process based around cargo vet.

Meanwhile, for each new crate addition, we are checking for the following:

  • Understand why each crate is used. What’s the relationship between crates? If the build system for each crate contains a build.rs or procedural macros, work out what they’re for. Are they compatible with the way Chromium is normally built?
  • Check each crate seems to be reasonably well maintained
  • Use cd third_party/rust/chromium_crates_io; cargo audit to check for known vulnerabilities (first you’ll need to cargo install cargo-audit, which ironically involves downloading lots of dependencies from the internet)
  • Ensure any unsafe code is good enough for the Rule of Two
  • Check for any use of fs or net APIs
  • Read all the code at a sufficient level to look for anything out of place that might have been maliciously inserted. (You can’t realistically aim for 100% perfection here: there’s often just too much code.)

These are just guidelines — work with reviewers from security@chromium.org to work out the right way to become confident of the crate.

Checking Crates into Chromium Source Code

git status should reveal:

  • Crate code in //third_party/rust/chromium_crates_io
  • Metadata (BUILD.gn and README.chromium) in //third_party/rust/<crate>/<version>

Please also add an OWNERS file in the latter location.

You should land all this, along with your Cargo.toml and gnrt_config.toml changes, into the Chromium repo.

Important: you need to use git add -f because otherwise .gitignore files may result in some files being skipped.

As you do so, you might find presubmit checks fail because of non-inclusive language. This is because Rust crate data tends to include names of git branches, and many projects still use non-inclusive terminology there. So you may need to run:

infra/update_inclusive_language_presubmit_exempt_dirs.sh > infra/inclusive_language_presubmit_exempt_dirs.txt
git add -p infra/inclusive_language_presubmit_exempt_dirs.txt # add whatever changes are yours

Keeping Crates Up to Date

As the OWNER of any third party Chromium dependency, you are expected to keep it up to date with any security fixes. It is hoped that we will soon automate this for Rust crates, but for now, it’s still your responsibility just as it is for any other third party dependency.

Exercise

Add uwuify to Chromium, turning off the crate’s default features. Assume that the crate will be used in shipping Chromium, but won’t be used to handle untrustworthy input.

(In the next exercise we’ll use uwuify from Chromium, but feel free to skip ahead and do that now if you like. Or, you could create a new rust_executable target which uses uwuify).

Students will need to download lots of transitive dependencies.

The total crates needed are:

  • instant,
  • lock_api,
  • parking_lot,
  • parking_lot_core,
  • redox_syscall,
  • scopeguard,
  • smallvec, and
  • uwuify.

If students are downloading even more than that, they probably forgot to turn off the default features.

Thanks to Daniel Liu for this crate!

Bringing It Together — Exercise

In this exercise, you’re going to add a whole new Chromium feature, bringing together everything you already learned.

The Brief from Product Management

A community of pixies has been discovered living in a remote rainforest. It’s important that we get Chromium for Pixies delivered to them as soon as possible.

The requirement is to translate all Chromium’s UI strings into Pixie language.

There’s no time to wait for proper translations, but fortunately Pixie language is very close to English, and it turns out there’s a Rust crate which does the translation.

In fact, you already imported that crate in the previous exercise.

(Obviously, real translations of Chrome require incredible care and diligence. Don’t ship this!)

Steps

Modify ResourceBundle::MaybeMangleLocalizedString so that it uwuifies all strings before display. In this special build of Chromium, it should always do this irrespective of the setting of mangle_localized_strings_.

If you’ve done everything right across all these exercises, congratulations, you should have created Chrome for pixies!

Chromium UI screenshot with uwu language

Students will likely need some hints here. Hints include:
  • UTF16 vs UTF8. Students should be aware that Rust strings are always UTF8, and will probably decide that it’s better to do the conversion on the C++ side using base::UTF16ToUTF8 and back again.
  • If students decide to do the conversion on the Rust side, they’ll need to consider String::from_utf16, consider error handling, and consider which CXX supported types can transfer a lot of u16s.
  • Students may design the C++/Rust boundary in several different ways, e.g. taking and returning strings by value, or taking a mutable reference to a string. If a mutable reference is used, CXX will likely tell the student that they need to use Pin. You may need to explain what Pin does, and then explain why CXX needs it for mutable references to C++ data: the answer is that C++ data can’t be moved around like Rust data, because it may contain self-referential pointers.
  • The C++ target containing ResourceBundle::MaybeMangleLocalizedString will need to depend on a rust_static_library target. The student probably already did this.
  • The rust_static_library target will need to depend on //third_party/rust/uwuify/v0_2:lib.
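
For the hint about converting on the Rust side, the standard-library calls involved can be sketched as follows. The wrapper function names are illustrative; String::from_utf16, String::from_utf16_lossy and str::encode_utf16 are the standard-library functions doing the work. How the u16 values are carried across the CXX boundary is left open here.

```rust
/// Strict conversion: returns `None` if the input is not valid UTF-16,
/// for example when it contains an unpaired surrogate.
pub fn utf16_to_string(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

/// Lossy conversion: invalid sequences are replaced with U+FFFD.
pub fn utf16_to_string_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

/// Converts a Rust string back to UTF-16 code units for the C++ side.
pub fn string_to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}
```

Choosing between the strict and the lossy variant is the error-handling decision mentioned in the hint: the strict one needs a failure path, the lossy one silently alters invalid input.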

Solutions

Solutions to the Chromium exercises can be found in this series of CLs.

Welcome to Bare Metal Rust

This is a standalone one-day course about bare-metal Rust, aimed at people who are familiar with the basics of Rust (perhaps from completing the Comprehensive Rust course). Ideally they also have some experience with bare-metal programming in another language such as C.

Today we will talk about “bare-metal” Rust: running Rust code without an operating system. This will be divided into several parts:

  • What is no_std Rust?
  • Writing firmware for microcontrollers.
  • Writing bootloader or kernel code for application processors.
  • Some useful crates for bare-metal Rust development.

For the microcontroller part of the course we will use the BBC micro:bit v2 as an example. It is a development board based on the Nordic nRF52833 microcontroller, with some LEDs and buttons, an I2C-connected accelerometer and compass, and an on-board SWD debugger.

To get started, install some of the tools you will need later. On gLinux or Debian:

sudo apt install gcc-aarch64-linux-gnu gdb-multiarch libudev-dev picocom pkg-config qemu-system-arm
rustup update
rustup target add aarch64-unknown-none thumbv7em-none-eabihf
rustup component add llvm-tools-preview
cargo install cargo-binutils cargo-embed

Give users in the plugdev group access to the micro:bit programmer:

echo 'SUBSYSTEM=="usb", ATTR{idVendor}=="0d28", MODE="0664", GROUP="plugdev"' |\
  sudo tee /etc/udev/rules.d/50-microbit.rules
sudo udevadm control --reload-rules

On MacOS:

xcode-select --install
brew install gdb picocom qemu
brew install --cask gcc-aarch64-embedded
rustup update
rustup target add aarch64-unknown-none thumbv7em-none-eabihf
rustup component add llvm-tools-preview
cargo install cargo-binutils cargo-embed

no_std

core:

  • Slices, &str, CStr
  • NonZeroU8
  • Option, Result
  • Display, Debug, write!
  • Iterator
  • panic!, assert_eq!
  • NonNull and all the usual pointer-related functions
  • Future and async/await
  • fence, AtomicBool, AtomicPtr, AtomicU32
  • Duration

alloc:

  • Box, Cow, Arc, Rc
  • Vec, BinaryHeap, BTreeMap, LinkedList, VecDeque
  • String, CString, format!

std:

  • Error
  • HashMap
  • Mutex, Condvar, Barrier, Once, RwLock, mpsc
  • File and the rest of fs
  • println!, Read, Write, Stdin, Stdout and the rest of io
  • Path, OsString
  • net
  • Command, Child, ExitCode
  • spawn, sleep and the rest of thread
  • SystemTime, Instant

Notes:

  • HashMap depends on RNG.
  • std re-exports the contents of both core and alloc.

A minimal no_std program

#![no_main]
#![no_std]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_panic: &PanicInfo) -> ! {
    loop {}
}

  • This will compile to an empty binary.
  • std provides a panic handler; without it we must provide our own.
  • It can also be provided by another crate, such as panic-halt.
  • Depending on the target, you may need to compile with panic = "abort" to avoid an error about eh_personality.
  • Note that there is no main or any other entry point; it's up to you to define your own. This will typically involve a linker script and some assembly code to set things up ready for Rust code to run.

alloc

To use alloc you must implement a global (heap) allocator.

#![no_main]
#![no_std]

extern crate alloc;
extern crate panic_halt as _;

use alloc::string::ToString;
use alloc::vec::Vec;
use buddy_system_allocator::LockedHeap;

#[global_allocator]
static HEAP_ALLOCATOR: LockedHeap<32> = LockedHeap::<32>::new();

static mut HEAP: [u8; 65536] = [0; 65536];

pub fn entry() {
    // Safe because `HEAP` is only used here and `entry` is only called once.
    unsafe {
        // Give the allocator some memory to allocate.
        HEAP_ALLOCATOR.lock().init(HEAP.as_mut_ptr() as usize, HEAP.len());
    }

    // Now we can do things that require heap allocation.
    let mut v = Vec::new();
    v.push("A string".to_string());
}

  • buddy_system_allocator is a third-party crate implementing a basic buddy system allocator (a memory allocation technique). Other crates are available, or you can write your own or hook into your existing allocator.
  • The const parameter of LockedHeap is the max order of the allocator; i.e. in this case it can allocate regions of up to 2**32 bytes.
  • If any crate in your dependency tree depends on alloc then you must have exactly one global allocator defined in your binary. Usually this is done in the top-level binary crate.
  • extern crate panic_halt as _ is necessary to ensure that the panic_halt crate is linked in so we get its panic handler.
  • This example will build but not run, as it doesn't have an entry point.
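
To give a feel for what writing your own allocator involves, here is a minimal sketch of a bump allocator implementing the GlobalAlloc trait from core. `BumpAllocator`, `Arena` and `ARENA_SIZE` are hypothetical names chosen for this illustration; it is a different and much simpler technique than the buddy system used above, it never frees memory, and it is not meant for production use.

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::ptr;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Size of the backing arena in bytes (arbitrary value for this sketch).
const ARENA_SIZE: usize = 1024;

/// Backing memory for the allocator, aligned to 16 bytes.
#[repr(align(16))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

/// Hypothetical bump allocator: hands out memory linearly and never frees it.
pub struct BumpAllocator {
    arena: Arena,
    /// Offset of the first free byte in the arena.
    next: AtomicUsize,
}

// SAFETY: the arena is only handed out in disjoint chunks, each reserved
// atomically through `next`.
unsafe impl Sync for BumpAllocator {}

impl BumpAllocator {
    pub const fn new() -> Self {
        Self { arena: Arena(UnsafeCell::new([0; ARENA_SIZE])), next: AtomicUsize::new(0) }
    }
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.arena.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let addr = base as usize + current;
            let aligned = match addr.checked_add(layout.align() - 1) {
                Some(a) => a & !(layout.align() - 1),
                None => return ptr::null_mut(),
            };
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= ARENA_SIZE => end,
                _ => return ptr::null_mut(), // out of memory
            };
            // Try to reserve `start..end`; retry if another thread got there first.
            match self.next.compare_exchange_weak(current, end, Ordering::Relaxed, Ordering::Relaxed) {
                // SAFETY: `start..end` lies inside the arena and was just reserved.
                Ok(_) => return unsafe { base.add(start) },
                Err(observed) => current = observed,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never reuses memory, so there is nothing to do.
    }
}
```

Such a type could be registered with #[global_allocator] in the same way as LockedHeap in the example above, as a static initialized with BumpAllocator::new().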

Microcontrollers

The cortex_m_rt crate provides (among other things) a reset handler for Cortex M microcontrollers.

#![no_main]
#![no_std]

extern crate panic_halt as _;

mod interrupts;

use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    loop {}
}

Next we'll look at how to access peripherals, with increasing levels of abstraction.

  • The cortex_m_rt::entry macro requires that the function have type fn() -> !, because returning to the reset handler doesn't make sense.
  • Run the example with cargo embed --bin minimal.

Raw MMIO

Most microcontrollers access peripherals via memory-mapped IO. Let's try turning on an LED on our micro:bit:

#![no_main]
#![no_std]

extern crate panic_halt as _;

mod interrupts;

use core::mem::size_of;
use cortex_m_rt::entry;

/// GPIO port 0 peripheral address
const GPIO_P0: usize = 0x5000_0000;

// GPIO peripheral offsets
const PIN_CNF: usize = 0x700;
const OUTSET: usize = 0x508;
const OUTCLR: usize = 0x50c;

// PIN_CNF fields
const DIR_OUTPUT: u32 = 0x1;
const INPUT_DISCONNECT: u32 = 0x1 << 1;
const PULL_DISABLED: u32 = 0x0 << 2;
const DRIVE_S0S1: u32 = 0x0 << 8;
const SENSE_DISABLED: u32 = 0x0 << 16;

#[entry]
fn main() -> ! {
    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    let pin_cnf_21 = (GPIO_P0 + PIN_CNF + 21 * size_of::<u32>()) as *mut u32;
    let pin_cnf_28 = (GPIO_P0 + PIN_CNF + 28 * size_of::<u32>()) as *mut u32;
    // Safe because the pointers are to valid peripheral control registers, and
    // no aliases exist.
    unsafe {
        pin_cnf_21.write_volatile(
            DIR_OUTPUT
                | INPUT_DISCONNECT
                | PULL_DISABLED
                | DRIVE_S0S1
                | SENSE_DISABLED,
        );
        pin_cnf_28.write_volatile(
            DIR_OUTPUT
                | INPUT_DISCONNECT
                | PULL_DISABLED
                | DRIVE_S0S1
                | SENSE_DISABLED,
        );
    }

    // Set pin 28 low and pin 21 high to turn the LED on.
    let gpio0_outset = (GPIO_P0 + OUTSET) as *mut u32;
    let gpio0_outclr = (GPIO_P0 + OUTCLR) as *mut u32;
    // Safe because the pointers are to valid peripheral control registers, and
    // no aliases exist.
    unsafe {
        gpio0_outclr.write_volatile(1 << 28);
        gpio0_outset.write_volatile(1 << 21);
    }

    loop {}
}
  • GPIO 0 pin 21 is connected to the first column of the LED matrix, and pin 28 to the first row.

Run the example with:

cargo embed --bin mmio
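
The address arithmetic in the raw MMIO example can be checked on a host machine without touching hardware. This sketch reuses the constants from the example; the helper functions pin_cnf_address and push_pull_output_config are hypothetical names introduced here for illustration.

```rust
use std::mem::size_of;

/// GPIO port 0 peripheral address and PIN_CNF offset, as in the example.
const GPIO_P0: usize = 0x5000_0000;
const PIN_CNF: usize = 0x700;

// PIN_CNF fields, as in the example.
const DIR_OUTPUT: u32 = 0x1;
const INPUT_DISCONNECT: u32 = 0x1 << 1;
const PULL_DISABLED: u32 = 0x0 << 2;
const DRIVE_S0S1: u32 = 0x0 << 8;
const SENSE_DISABLED: u32 = 0x0 << 16;

/// Address of the PIN_CNF register for `pin`: one 32-bit register per pin.
/// Hypothetical helper; it only computes the number and performs no access.
fn pin_cnf_address(pin: usize) -> usize {
    GPIO_P0 + PIN_CNF + pin * size_of::<u32>()
}

/// The value the example writes to configure a push-pull output.
fn push_pull_output_config() -> u32 {
    DIR_OUTPUT | INPUT_DISCONNECT | PULL_DISABLED | DRIVE_S0S1 | SENSE_DISABLED
}
```

Only the two low bits end up set in the configuration word; the zero-valued fields are spelled out to document intent rather than to change the result.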

Peripheral Access Crates

svd2rust generates mostly-safe Rust wrappers for memory-mapped peripherals from CMSIS-SVD files.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use nrf52833_pac::Peripherals;

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();
    let gpio0 = p.P0;

    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    gpio0.pin_cnf[21].write(|w| {
        w.dir().output();
        w.input().disconnect();
        w.pull().disabled();
        w.drive().s0s1();
        w.sense().disabled();
        w
    });
    gpio0.pin_cnf[28].write(|w| {
        w.dir().output();
        w.input().disconnect();
        w.pull().disabled();
        w.drive().s0s1();
        w.sense().disabled();
        w
    });

    // Set pin 28 low and pin 21 high to turn the LED on.
    gpio0.outclr.write(|w| w.pin28().clear());
    gpio0.outset.write(|w| w.pin21().set());

    loop {}
}
  • SVD (System View Description) files are XML files typically provided by silicon vendors which describe the memory map of the device.
    • They are organised by peripheral, register, field and value, with names, descriptions, addresses and so on.
    • SVD files are often buggy and incomplete, so there are various projects which patch the mistakes, add missing details, and publish the generated crates.
  • cortex-m-rt provides the vector table, among other things.
  • If you cargo install cargo-binutils then you can run cargo objdump --bin pac -- -d --no-show-raw-insn to see the resulting binary.

Run the example with:

cargo embed --bin pac

HAL crates

HAL crates for many microcontrollers provide wrappers around various peripherals. These generally implement traits from embedded-hal.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use nrf52833_hal::gpio::{p0, Level};
use nrf52833_hal::pac::Peripherals;
use nrf52833_hal::prelude::*;

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();

    // Create HAL wrapper for GPIO port 0.
    let gpio0 = p0::Parts::new(p.P0);

    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    let mut col1 = gpio0.p0_28.into_push_pull_output(Level::High);
    let mut row1 = gpio0.p0_21.into_push_pull_output(Level::Low);

    // Set pin 28 low and pin 21 high to turn the LED on.
    col1.set_low().unwrap();
    row1.set_high().unwrap();

    loop {}
}
  • set_low and set_high are methods on the embedded_hal OutputPin trait.
  • HAL crates exist for many Cortex-M and RISC-V devices, including various STM32, GD32, nRF, NXP, MSP430, AVR and PIC microcontrollers.

Run the example with:

cargo embed --bin hal

Board support crates

Board support crates provide a further level of wrapping for a specific board for convenience.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use microbit::hal::prelude::*;
use microbit::Board;

#[entry]
fn main() -> ! {
    let mut board = Board::take().unwrap();

    board.display_pins.col1.set_low().unwrap();
    board.display_pins.row1.set_high().unwrap();

    loop {}
}
  • In this case the board support crate is just providing more useful names, and a bit of initialisation.
  • The crate may also include drivers for some on-board devices outside of the microcontroller itself.
    • microbit-v2 includes a simple driver for the LED matrix.

Run the example with:

cargo embed --bin board_support

The type state pattern

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();
    let gpio0 = p0::Parts::new(p.P0);

    let pin: P0_01<Disconnected> = gpio0.p0_01;

    // let gpio0_01_again = gpio0.p0_01; // Error, moved.
    let pin_input: P0_01<Input<Floating>> = pin.into_floating_input();
    if pin_input.is_high().unwrap() {
        // ...
    }
    let mut pin_output: P0_01<Output<OpenDrain>> = pin_input
        .into_open_drain_output(OpenDrainConfig::Disconnect0Standard1, Level::Low);
    pin_output.set_high().unwrap();
    // pin_input.is_high(); // Error, moved.

    let _pin2: P0_02<Output<OpenDrain>> = gpio0
        .p0_02
        .into_open_drain_output(OpenDrainConfig::Disconnect0Standard1, Level::Low);
    let _pin3: P0_03<Output<PushPull>> =
        gpio0.p0_03.into_push_pull_output(Level::Low);

    loop {}
}
  • Pins don't implement Copy or Clone, so only one instance of each can exist. Once a pin is moved out of the port struct, nobody else can take it.
  • Changing the configuration of a pin consumes the old pin instance, so you can't keep using the old instance afterwards.
  • The type of a value indicates the state that it is in: in this case, the configuration state of a GPIO pin. This encodes the state machine into the type system, and ensures that you don't try to use a pin in a certain way without properly configuring it first. Illegal state transitions are caught at compile time.
  • You can call is_high on an input pin and set_high on an output pin, but not vice versa.
  • Many HAL crates follow this pattern.
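
The pattern itself does not depend on any HAL. Below is a minimal host-side sketch using PhantomData; the Pin type, its state markers and its methods are all hypothetical stand-ins, not the nrf52833_hal API.

```rust
use std::marker::PhantomData;

// Zero-sized marker types naming the states (hypothetical names).
struct Disconnected;
struct Input;
struct Output;

/// A simulated pin whose configuration state lives in the type parameter.
struct Pin<State> {
    level: bool,
    _state: PhantomData<State>,
}

impl Pin<Disconnected> {
    fn new() -> Self {
        Pin { level: false, _state: PhantomData }
    }
}

impl<S> Pin<S> {
    /// Consumes the pin in its old state and returns it as an input.
    fn into_input(self) -> Pin<Input> {
        Pin { level: self.level, _state: PhantomData }
    }

    /// Consumes the pin in its old state and returns it as an output.
    fn into_output(self, high: bool) -> Pin<Output> {
        Pin { level: high, _state: PhantomData }
    }
}

impl Pin<Input> {
    fn is_high(&self) -> bool {
        self.level
    }
}

impl Pin<Output> {
    fn set_high(&mut self) {
        self.level = true;
    }

    fn set_low(&mut self) {
        self.level = false;
    }

    fn is_set_high(&self) -> bool {
        self.level
    }
}

/// Walks through the legal transitions; the commented lines would not compile.
fn demo() -> (bool, bool) {
    let pin = Pin::<Disconnected>::new();
    let input = pin.into_input();
    let was_high = input.is_high();
    let mut output = input.into_output(false);
    // input.is_high();  // error: `input` was moved by `into_output`
    // output.is_high(); // error: no `is_high` on `Pin<Output>`
    output.set_low();
    output.set_high();
    (was_high, output.is_set_high())
}
```

Because the state markers are zero-sized, the type parameter costs nothing at run time; all of the checking happens in the compiler.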

embedded-hal

The embedded-hal crate provides a number of traits covering common microcontroller peripherals.

  • GPIO
  • ADC
  • I2C, SPI, UART, CAN
  • RNG
  • Timers
  • Watchdogs

Other crates then implement drivers in terms of these traits. For example, an accelerometer driver might need an I2C or SPI bus implementation.

  • There are implementations for many microcontrollers, as well as other platforms such as Linux on Raspberry Pi.
  • There is work in progress on an async version of embedded-hal, but it isn't stable yet.
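
To illustrate how a driver can be written against a trait rather than a concrete chip, here is a self-contained sketch. The OutputPin trait below is a local stand-in loosely modelled on the set_low and set_high methods mentioned earlier, not the real embedded_hal definition, and Led and MockPin are hypothetical names.

```rust
use std::convert::Infallible;

/// Local stand-in for a HAL pin trait (assumption: not embedded_hal's own).
trait OutputPin {
    type Error;
    fn set_low(&mut self) -> Result<(), Self::Error>;
    fn set_high(&mut self) -> Result<(), Self::Error>;
}

/// A tiny "driver" that works with any pin implementing the trait.
struct Led<P: OutputPin> {
    pin: P,
}

impl<P: OutputPin> Led<P> {
    fn new(pin: P) -> Self {
        Led { pin }
    }

    fn on(&mut self) -> Result<(), P::Error> {
        self.pin.set_high()
    }

    fn off(&mut self) -> Result<(), P::Error> {
        self.pin.set_low()
    }

    /// Gives the pin back to the caller.
    fn release(self) -> P {
        self.pin
    }
}

/// A mock pin for host-side testing that records its level and write count.
#[derive(Debug, Default)]
struct MockPin {
    high: bool,
    writes: u32,
}

impl OutputPin for MockPin {
    type Error = Infallible;

    fn set_low(&mut self) -> Result<(), Self::Error> {
        self.high = false;
        self.writes += 1;
        Ok(())
    }

    fn set_high(&mut self) -> Result<(), Self::Error> {
        self.high = true;
        self.writes += 1;
        Ok(())
    }
}

/// Drives the mock through the generic driver and reports its final state.
fn demo() -> (bool, u32) {
    let mut led = Led::new(MockPin::default());
    led.on().unwrap();
    led.off().unwrap();
    led.on().unwrap();
    let pin = led.release();
    (pin.high, pin.writes)
}
```

The same Led code would compile unchanged against a real pin type, provided that type implements the trait; that portability is the point of a shared HAL trait crate.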

probe-rs and cargo-embed

probe-rs is a handy toolset for embedded debugging, like OpenOCD but better integrated.

  • SWD (Serial Wire Debug) and JTAG via CMSIS-DAP, ST-Link and J-Link probes
  • GDB stub and Microsoft DAP (Debug Adapter Protocol) server
  • Cargo integration

cargo-embed is a cargo subcommand to build and flash binaries, log RTT (Real Time Transfers) output and connect GDB. It’s configured by an Embed.toml file in your project directory.

  • CMSIS-DAP is an Arm standard protocol over USB for an in-circuit debugger to access the CoreSight Debug Access Port of various Arm Cortex processors. It's what the on-board debugger on the BBC micro:bit uses.
  • ST-Link is a range of in-circuit debuggers from ST Microelectronics. J-Link is a range from SEGGER.
  • The Debug Access Port is usually either a 5-pin JTAG interface or a 2-pin Serial Wire Debug interface.
  • probe-rs is a library which you can integrate into your own tools.
  • The Microsoft Debug Adapter Protocol lets VSCode and other IDEs debug code running on any supported microcontroller.
  • cargo-embed is a binary built using the probe-rs library.
  • RTT (Real Time Transfers) is a mechanism to transfer data between the debug host and the target through a number of ring buffers.

Debugging

Embed.toml:

[default.general]
chip = "nrf52833_xxAA"

[debug.gdb]
enabled = true

In one terminal under src/bare-metal/microcontrollers/examples/:

cargo embed --bin board_support debug

In another terminal in the same directory:

On gLinux or Debian:

gdb-multiarch target/thumbv7em-none-eabihf/debug/board_support --eval-command="target remote :1337"

On macOS:

arm-none-eabi-gdb target/thumbv7em-none-eabihf/debug/board_support --eval-command="target remote :1337"

In GDB, try running:

b src/bin/board_support.rs:29
b src/bin/board_support.rs:30
b src/bin/board_support.rs:32
c
c
c

Other projects

  • RTIC
    • “Real-Time Interrupt-driven Concurrency”
    • Shared resource management, message passing, task scheduling, timer queue, etc.
  • Embassy
    • async executors with priorities, timers, networking, USB, etc.
  • TockOS
    • Security-focused RTOS with preemptive scheduling and Memory Protection Unit support.
  • Hubris
    • Microkernel RTOS from Oxide Computer Company with memory protection, unprivileged drivers, IPC, etc.
  • Bindings for FreeRTOS
  • Some platforms have std implementations, e.g. esp-idf.
  • RTIC can be considered either an RTOS or a concurrency framework.
    • It doesn't include any HALs.
    • It uses the Cortex-M NVIC (Nested Vectored Interrupt Controller) for scheduling rather than a proper kernel.
    • Cortex-M only.
  • Google uses TockOS on the Haven microcontroller for Titan security keys.
  • FreeRTOS is mostly written in C, but there are Rust bindings for writing applications.

Exercises

We will read the direction from an I2C compass, and log the readings to a serial port.

After looking at the exercises, you can look at the solutions provided.

Compass

We will read the direction from an I2C compass, and log the readings to a serial port. If you have time, try displaying it on the LEDs too, or use the buttons in some way.

Hints:

  • Check the documentation for the lsm303agr and microbit-v2 crates, as well as the micro:bit hardware.
  • The LSM303AGR Inertial Measurement Unit is connected to the internal I2C bus.
  • TWI is another name for I2C, so the I2C master peripheral is called TWIM.
  • The LSM303AGR driver needs something implementing the embedded_hal::blocking::i2c::WriteRead trait. The microbit::hal::Twim struct implements this.
  • You have a microbit::Board struct with fields for the various pins and peripherals.
  • You can also look at the nRF52833 datasheet if you want, but it shouldn't be necessary for this exercise.

Download the exercise template and look in the compass directory for the following files.

src/main.rs:

#![no_main]
#![no_std]

extern crate panic_halt as _;

use core::fmt::Write;
use cortex_m_rt::entry;
use microbit::{hal::{uarte::{Baudrate, Parity, Uarte}, Delay}, Board};

#[entry]
fn main() -> ! {
    let board = Board::take().unwrap();

    // Configure serial port.
    let mut serial = Uarte::new(
        board.UARTE0,
        board.uart.into(),
        Parity::EXCLUDED,
        Baudrate::BAUD115200,
    );

    // Use the system timer as a delay provider.
    let mut delay = Delay::new(board.SYST);

    // Set up the I2C controller and Inertial Measurement Unit.
    // TODO

    writeln!(serial, "Ready.").unwrap();

    loop {
        // Read compass data and log it to the serial port.
        // TODO
    }
}

Cargo.toml (you shouldn’t need to change this):

[workspace]

[package]
name = "compass"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
cortex-m-rt = "0.7.3"
embedded-hal = "1.0.0"
lsm303agr = "0.3.0"
microbit-v2 = "0.13.0"
panic-halt = "0.2.0"

Embed.toml (you shouldn’t need to change this):

[default.general]
chip = "nrf52833_xxAA"

[debug.gdb]
enabled = true

[debug.reset]
halt_afterwards = true

.cargo/config.toml (you shouldn’t need to change this):

[build]
target = "thumbv7em-none-eabihf" # Cortex-M4F

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
rustflags = ["-C", "link-arg=-Tlink.x"]

See the serial output on Linux with:

picocom --baud 115200 --imap lfcrlf /dev/ttyACM0

On macOS it should be something like the following (the device name may be slightly different):

picocom --baud 115200 --imap lfcrlf /dev/tty.usbmodem14502

Press Ctrl+A Ctrl+Q to quit picocom.

Bare Metal Rust: Morning Exercise

Compass

(back to exercise)

#![no_main]
#![no_std]

extern crate panic_halt as _;

use core::fmt::Write;
use cortex_m_rt::entry;
use core::cmp::{max, min};
use lsm303agr::{
    AccelMode, AccelOutputDataRate, Lsm303agr, MagMode, MagOutputDataRate,
};
use microbit::display::blocking::Display;
use microbit::hal::prelude::*;
use microbit::hal::twim::Twim;
use microbit::hal::uarte::{Baudrate, Parity, Uarte};
use microbit::hal::{Delay, Timer};
use microbit::pac::twim0::frequency::FREQUENCY_A;
use microbit::Board;

const COMPASS_SCALE: i32 = 30000;
const ACCELEROMETER_SCALE: i32 = 700;

#[entry]
fn main() -> ! {
    let board = Board::take().unwrap();

    // Configure serial port.
    let mut serial = Uarte::new(
        board.UARTE0,
        board.uart.into(),
        Parity::EXCLUDED,
        Baudrate::BAUD115200,
    );

    // Use the system timer as a delay provider.
    let mut delay = Delay::new(board.SYST);

    // Set up the I2C controller and Inertial Measurement Unit.
    writeln!(serial, "Setting up IMU...").unwrap();
    let i2c = Twim::new(board.TWIM0, board.i2c_internal.into(), FREQUENCY_A::K100);
    let mut imu = Lsm303agr::new_with_i2c(i2c);
    imu.init().unwrap();
    imu.set_mag_mode_and_odr(
        &mut delay,
        MagMode::HighResolution,
        MagOutputDataRate::Hz50,
    )
    .unwrap();
    imu.set_accel_mode_and_odr(
        &mut delay,
        AccelMode::Normal,
        AccelOutputDataRate::Hz50,
    )
    .unwrap();
    let mut imu = imu.into_mag_continuous().ok().unwrap();

    // Set up display and timer.
    let mut timer = Timer::new(board.TIMER0);
    let mut display = Display::new(board.display_pins);

    let mut mode = Mode::Compass;
    let mut button_pressed = false;

    writeln!(serial, "Ready.").unwrap();

    loop {
        // Read compass data and log it to the serial port.
        while !(imu.mag_status().unwrap().xyz_new_data()
            && imu.accel_status().unwrap().xyz_new_data())
        {}
        let compass_reading = imu.magnetic_field().unwrap();
        let accelerometer_reading = imu.acceleration().unwrap();
        writeln!(
            serial,
            "{},{},{}\t{},{},{}",
            compass_reading.x_nt(),
            compass_reading.y_nt(),
            compass_reading.z_nt(),
            accelerometer_reading.x_mg(),
            accelerometer_reading.y_mg(),
            accelerometer_reading.z_mg(),
        )
        .unwrap();

        let mut image = [[0; 5]; 5];
        let (x, y) = match mode {
            Mode::Compass => (
                scale(-compass_reading.x_nt(), -COMPASS_SCALE, COMPASS_SCALE, 0, 4)
                    as usize,
                scale(compass_reading.y_nt(), -COMPASS_SCALE, COMPASS_SCALE, 0, 4)
                    as usize,
            ),
            Mode::Accelerometer => (
                scale(
                    accelerometer_reading.x_mg(),
                    -ACCELEROMETER_SCALE,
                    ACCELEROMETER_SCALE,
                    0,
                    4,
                ) as usize,
                scale(
                    -accelerometer_reading.y_mg(),
                    -ACCELEROMETER_SCALE,
                    ACCELEROMETER_SCALE,
                    0,
                    4,
                ) as usize,
            ),
        };
        image[y][x] = 255;
        display.show(&mut timer, image, 100);

        // If button A is pressed, switch to the next mode and briefly blink all LEDs
        // on.
        if board.buttons.button_a.is_low().unwrap() {
            if !button_pressed {
                mode = mode.next();
                display.show(&mut timer, [[255; 5]; 5], 200);
            }
            button_pressed = true;
        } else {
            button_pressed = false;
        }
    }
}

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum Mode {
    Compass,
    Accelerometer,
}

impl Mode {
    fn next(self) -> Self {
        match self {
            Self::Compass => Self::Accelerometer,
            Self::Accelerometer => Self::Compass,
        }
    }
}

fn scale(value: i32, min_in: i32, max_in: i32, min_out: i32, max_out: i32) -> i32 {
    let range_in = max_in - min_in;
    let range_out = max_out - min_out;
    cap(min_out + range_out * (value - min_in) / range_in, min_out, max_out)
}

fn cap(value: i32, min_value: i32, max_value: i32) -> i32 {
    max(min_value, min(value, max_value))
}
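
To see how scale maps raw sensor values onto the 0..=4 LED coordinates, the two helper functions can be run on a host. The sketch below copies them from the solution unchanged; the sample inputs in the usage note are illustrative values, not real sensor readings.

```rust
use std::cmp::{max, min};

/// Linearly maps `value` from [min_in, max_in] to [min_out, max_out],
/// clamping the result to the output range.
fn scale(value: i32, min_in: i32, max_in: i32, min_out: i32, max_out: i32) -> i32 {
    let range_in = max_in - min_in;
    let range_out = max_out - min_out;
    cap(min_out + range_out * (value - min_in) / range_in, min_out, max_out)
}

/// Clamps `value` to [min_value, max_value].
fn cap(value: i32, min_value: i32, max_value: i32) -> i32 {
    max(min_value, min(value, max_value))
}
```

With COMPASS_SCALE = 30000, a reading of 0 lands on the middle LED (index 2), a reading of 30000 on index 4, and anything beyond the range is clamped to the edge, which is what keeps image[y][x] in bounds.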

Application processors

So far we've talked about microcontrollers, such as the Arm Cortex-M series. Now let's try writing something for Cortex-A. For simplicity we'll just work with QEMU's aarch64 ‘virt’ board.

  • Broadly speaking, microcontrollers don't have an MMU or multiple levels of privilege (exception levels on Arm CPUs, rings on x86), while application processors do.
  • QEMU supports emulating various different machines or board models for each architecture. The ‘virt’ board doesn't correspond to any particular real hardware, but is designed purely for virtual machines.

Getting Ready to Rust

Before we can start running Rust code, we need to do some initialisation.

.section .init.entry, "ax"
.global entry
entry:
    /*
     * Load and apply the memory management configuration, ready to enable MMU and
     * caches.
     */
    adrp x30, idmap
    msr ttbr0_el1, x30

    mov_i x30, .Lmairval
    msr mair_el1, x30

    mov_i x30, .Ltcrval
    /* Copy the supported PA range into TCR_EL1.IPS. */
    mrs x29, id_aa64mmfr0_el1
    bfi x30, x29, #32, #4

    msr tcr_el1, x30

    mov_i x30, .Lsctlrval

    /*
     * Ensure everything before this point has completed, then invalidate any
     * potentially stale local TLB entries before they start being used.
     */
    isb
    tlbi vmalle1
    ic iallu
    dsb nsh
    isb

    /*
     * Configure sctlr_el1 to enable MMU and cache and don't proceed until this
     * has completed.
     */
    msr sctlr_el1, x30
    isb

    /* Disable trapping floating point access in EL1. */
    mrs x30, cpacr_el1
    orr x30, x30, #(0x3 << 20)
    msr cpacr_el1, x30
    isb

    /* Zero out the bss section. */
    adr_l x29, bss_begin
    adr_l x30, bss_end
0:  cmp x29, x30
    b.hs 1f
    stp xzr, xzr, [x29], #16
    b 0b

1:  /* Prepare the stack. */
    adr_l x30, boot_stack_end
    mov sp, x30

    /* Set up exception vector. */
    adr x30, vector_table_el1
    msr vbar_el1, x30

    /* Call into Rust code. */
    bl main

    /* Loop forever waiting for interrupts. */
2:  wfi
    b 2b
  • This is the same as it would be for C: initialising the processor state, zeroing the BSS, and setting up the stack pointer.
    • The BSS (block starting symbol, for historical reasons) is the part of the object file which contains statically allocated variables which are initialised to zero. They are omitted from the image, to avoid wasting space on zeroes. The compiler assumes that the loader will take care of zeroing them.
  • The BSS may already be zeroed, depending on how memory is initialised and the image is loaded, but we zero it to be sure.
  • We need to enable the MMU and cache before reading or writing any memory. If we don't:
    • Unaligned accesses will fault. We build the Rust code for the aarch64-unknown-none target, which sets +strict-align to prevent the compiler generating unaligned accesses, so it should be fine in this case, but this is not necessarily the case in general.
    • If it were running in a VM, this can lead to cache coherency issues. The problem is that the VM is accessing memory directly with the cache disabled, while the host has cacheable aliases to the same memory. Even if the host doesn't explicitly access the memory, speculative accesses can lead to cache fills, and then changes from one or the other will get lost when the cache is cleaned or the VM enables the cache. (Cache is keyed by physical address, not VA or IPA.)
  • For simplicity, we just use a hardcoded pagetable (see idmap.S) which identity maps the first 1 GiB of address space for devices, the next 1 GiB for DRAM, and another 1 GiB higher up for more devices. This matches the memory layout that QEMU uses.
  • We also set up the exception vector (vbar_el1), which we'll see more about in later slides.
  • All examples this afternoon assume we will be running at exception level 1 (EL1). If you need to run at a different exception level, you'll need to modify entry.S accordingly.

Inline assembly

Sometimes we need to use assembly to do things that aren’t possible with Rust code. For example, to make an HVC (hypervisor call) to tell the firmware to power off the system:

#![no_main]
#![no_std]

use core::arch::asm;
use core::panic::PanicInfo;

mod exceptions;

const PSCI_SYSTEM_OFF: u32 = 0x84000008;

#[no_mangle]
extern "C" fn main(_x0: u64, _x1: u64, _x2: u64, _x3: u64) {
    // Safe because this only uses the declared registers and doesn't do
    // anything with memory.
    unsafe {
        asm!("hvc #0",
            inout("w0") PSCI_SYSTEM_OFF => _,
            inout("w1") 0 => _,
            inout("w2") 0 => _,
            inout("w3") 0 => _,
            inout("w4") 0 => _,
            inout("w5") 0 => _,
            inout("w6") 0 => _,
            inout("w7") 0 => _,
            options(nomem, nostack)
        );
    }

    loop {}
}

(If you actually want to do this, use the smccc crate, which has wrappers for all these functions.)

  • PSCI is the Arm Power State Coordination Interface, a standard set of functions to manage system and CPU power states, among other things. It is implemented by EL3 firmware and hypervisors on many systems.
  • The 0 => _ syntax means initialise the register to 0 before running the inline assembly code, and ignore its contents afterwards. We need to use inout rather than in because the call could potentially clobber the contents of the registers.
  • This main function needs to be #[no_mangle] and extern "C" because it is called from our entry point in entry.S.
  • _x0 to _x3 are the values of registers x0 to x3, which are conventionally used by the bootloader to pass things like a pointer to the device tree. According to the standard aarch64 calling convention (which is what extern "C" specifies), registers x0 to x7 are used for the first 8 arguments passed to a function, so entry.S doesn't need to do anything special except make sure it doesn't change these registers.
  • Run the example in QEMU with make qemu_psci under src/bare-metal/aps/examples.

Volatile memory access for MMIO

  • Use pointer::read_volatile and pointer::write_volatile.
  • Never hold a reference.
  • addr_of! lets you get fields of structs without creating an intermediate reference.
  • Volatile access: read or write operations may have side effects, so prevent the compiler or hardware from reordering, duplicating or eliding them.
    • Usually if you write and then read, e.g. via a mutable reference, the compiler may assume that the value read is the same as the value just written, and not bother actually reading memory.
  • Some existing crates for volatile access to hardware do hold references, but this is unsound. Whenever a reference exists, the compiler may choose to dereference it.
  • Use the addr_of! macro to get struct field pointers from a pointer to the struct.
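
A host-side sketch of these rules, using an ordinary struct in RAM as a stand-in for a device register block. FakeRegisters and write_then_read are hypothetical names; on real hardware the pointer would come from the device's base address instead of a local variable.

```rust
use std::ptr::{addr_of, addr_of_mut};

/// Stand-in for a memory-mapped register block (assumption: ordinary RAM here).
#[repr(C)]
struct FakeRegisters {
    data: u32,
    status: u32,
}

/// Writes `value` to `data`, then reads both fields back with volatile accesses.
///
/// # Safety
///
/// `regs` must be non-null, properly aligned, and valid for reads and writes.
unsafe fn write_then_read(regs: *mut FakeRegisters, value: u32) -> (u32, u32) {
    // SAFETY: the caller guarantees `regs` is valid. `addr_of!` and
    // `addr_of_mut!` produce field pointers without creating a reference.
    unsafe {
        addr_of_mut!((*regs).data).write_volatile(value);
        let data = addr_of!((*regs).data).read_volatile();
        let status = addr_of!((*regs).status).read_volatile();
        (data, status)
    }
}

/// Runs the accesses against a local struct standing in for the device.
fn demo() -> (u32, u32) {
    let mut regs = FakeRegisters { data: 0, status: 0x90 };
    // SAFETY: the pointer is derived from a live, exclusively owned local.
    unsafe { write_then_read(&mut regs, 0x41) }
}
```

In plain RAM the volatile read simply returns what was written; on a real device the read is a separate bus access and might return something else entirely, which is why the compiler must not optimise it away.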

Let's write a UART driver

The QEMU ‘virt’ machine has a PL011 UART (https://developer.arm.com/documentation/ddi0183/g), so let's write a driver for that.

const FLAG_REGISTER_OFFSET: usize = 0x18;
const FR_BUSY: u8 = 1 << 3;
const FR_TXFF: u8 = 1 << 5;

/// Minimal driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    base_address: *mut u8,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the 8 MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u8) -> Self {
        Self { base_address }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register() & FR_TXFF != 0 {}

        // Safe because we know that the base address points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            self.base_address.write_volatile(byte);
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register() & FR_BUSY != 0 {}
    }

    fn read_flag_register(&self) -> u8 {
        // Safe because we know that the base address points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { self.base_address.add(FLAG_REGISTER_OFFSET).read_volatile() }
    }
}
  • Note that Uart::new is unsafe while the other methods are safe. This is because as long as the caller of Uart::new guarantees that its safety requirements are met (i.e. that there is only ever one instance of the driver for a given UART, and nothing else aliasing its address space), then it is always safe to call write_byte later, because we can assume the necessary preconditions.
  • We could have done it the other way around (making new safe but write_byte unsafe), but that would be much less convenient to use, as every place that calls write_byte would need to reason about safety.
  • This is a common pattern for writing safe wrappers of unsafe code: moving the burden of proof for soundness from a large number of places to a smaller number of places.
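
The same shape can be tried on a host with a byte of ordinary memory standing in for a device register. Device and its methods are hypothetical names used only to show where the single unsafe obligation sits.

```rust
/// Stand-in driver: one raw pointer to a single byte-wide "register".
struct Device {
    reg: *mut u8,
}

impl Device {
    /// Constructs the driver.
    ///
    /// # Safety
    ///
    /// `reg` must be valid for volatile reads and writes for as long as the
    /// returned value is used, and nothing else may access it in that time.
    unsafe fn new(reg: *mut u8) -> Self {
        Device { reg }
    }

    /// Safe to call: the caller of `new` already promised `reg` is valid.
    fn write(&mut self, byte: u8) {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { self.reg.write_volatile(byte) }
    }

    /// Safe to call, for the same reason as `write`.
    fn read(&self) -> u8 {
        // SAFETY: guaranteed by the contract of `new`.
        unsafe { self.reg.read_volatile() }
    }
}

/// The only `unsafe` block the user writes is the one around `new`.
fn demo() -> u8 {
    let mut backing = 0u8;
    // SAFETY: `backing` outlives `dev` and is not touched while `dev` is used.
    let mut dev = unsafe { Device::new(&mut backing) };
    dev.write(b'A');
    dev.write(b'B');
    dev.read()
}
```

Every later call to write or read is ordinary safe code, so the soundness argument has to be made once, at construction.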

More traits

We derived the Debug trait. It would be useful to implement a few more traits too.

use core::fmt::{self, Write};

impl Write for Uart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.as_bytes() {
            self.write_byte(*c);
        }
        Ok(())
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Uart {}
  • Implementing Write lets us use the write! and writeln! macros with our Uart type.
  • Run the example in QEMU with make qemu_minimal under src/bare-metal/aps/examples.
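
The effect of implementing Write can be seen on a host by swapping the UART for an in-memory sink. ByteSink and log_reading are hypothetical names; the write_str body follows the same byte-by-byte loop as the Uart implementation above.

```rust
use std::fmt::{self, Write};

/// Stand-in for the UART: collects bytes in memory instead of sending them.
struct ByteSink {
    bytes: Vec<u8>,
}

impl ByteSink {
    fn write_byte(&mut self, byte: u8) {
        self.bytes.push(byte);
    }
}

impl Write for ByteSink {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.as_bytes() {
            self.write_byte(*c);
        }
        Ok(())
    }
}

/// Formats a reading through `writeln!` and returns the bytes that were "sent".
fn log_reading(x: i32, y: i32) -> Vec<u8> {
    let mut sink = ByteSink { bytes: Vec::new() };
    writeln!(sink, "x={x}, y={y}").unwrap();
    sink.bytes
}
```

Only write_str has to be provided; the formatting machinery behind write! and writeln! comes from the trait's default methods, which also work in no_std via core::fmt.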

A better UART driver

The PL011 actually has a bunch more registers, and adding offsets to construct pointers to access them is error-prone and hard to read. Plus, some of them are bit fields which would be nice to access in a structured way.

Offset  Register name  Width
0x00    DR             12
0x04    RSR            4
0x18    FR             9
0x20    ILPR           8
0x24    IBRD           16
0x28    FBRD           6
0x2c    LCR_H          8
0x30    CR             16
0x34    IFLS           6
0x38    IMSC           11
0x3c    RIS            11
0x40    MIS            11
0x44    ICR            11
0x48    DMACR          3
  • There are also some ID registers which have been omitted for brevity.

Bitflags

The bitflags crate is useful for working with bitflags.

use bitflags::bitflags;

bitflags! {
    /// Flags from the UART flag register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct Flags: u16 {
        /// Clear to send.
        const CTS = 1 << 0;
        /// Data set ready.
        const DSR = 1 << 1;
        /// Data carrier detect.
        const DCD = 1 << 2;
        /// UART busy transmitting data.
        const BUSY = 1 << 3;
        /// Receive FIFO is empty.
        const RXFE = 1 << 4;
        /// Transmit FIFO is full.
        const TXFF = 1 << 5;
        /// Receive FIFO is full.
        const RXFF = 1 << 6;
        /// Transmit FIFO is empty.
        const TXFE = 1 << 7;
        /// Ring indicator.
        const RI = 1 << 8;
    }
}
  • The bitflags! macro creates a newtype something like Flags(u16), along with a bunch of method implementations to get and set flags.
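
To give a feel for what such a newtype looks like, here is a hand-written sketch with a few of the flag-register bits. It is not the macro's actual expansion, and the method names from_raw, bits, contains and union are chosen here for illustration; the real crate generates a richer API.

```rust
/// Hand-rolled newtype over the raw register value (illustration only).
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
struct Flags(u16);

impl Flags {
    /// UART busy transmitting data.
    const BUSY: Flags = Flags(1 << 3);
    /// Receive FIFO is empty.
    const RXFE: Flags = Flags(1 << 4);
    /// Transmit FIFO is full.
    const TXFF: Flags = Flags(1 << 5);

    /// Wraps a raw register value without validating it.
    fn from_raw(bits: u16) -> Flags {
        Flags(bits)
    }

    /// Returns the raw bits.
    fn bits(self) -> u16 {
        self.0
    }

    /// True if every bit set in `other` is also set in `self`.
    fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }

    /// Returns the combination of both sets of flags.
    fn union(self, other: Flags) -> Flags {
        Flags(self.0 | other.0)
    }
}
```

A register value of 0b1_1000 would then report both BUSY and RXFE as set and TXFF as clear, without the caller writing any masks by hand.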

Multiple registers

We can use a struct to represent the memory layout of the UART's registers.

#[repr(C, align(4))]
struct Registers {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: ReceiveStatus,
    _reserved1: [u8; 19],
    fr: Flags,
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
    ibrd: u16,
    _reserved4: [u8; 2],
    fbrd: u8,
    _reserved5: [u8; 3],
    lcr_h: u8,
    _reserved6: [u8; 3],
    cr: u16,
    _reserved7: [u8; 2],
    ifls: u8,
    _reserved8: [u8; 3],
    imsc: u16,
    _reserved9: [u8; 2],
    ris: u16,
    _reserved10: [u8; 2],
    mis: u16,
    _reserved11: [u8; 2],
    icr: u16,
    _reserved12: [u8; 2],
    dmacr: u8,
    _reserved13: [u8; 3],
}
  • #[repr(C)] tells the compiler to lay the struct fields out in order, following the same rules as C. This is necessary for our struct to have a predictable layout, as the default Rust representation allows the compiler to (among other things) reorder fields however it sees fit.
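
A layout like this is easy to get wrong by a byte, so it can be worth checking field offsets against the register table on a host. The sketch below covers the registers up to IFLS, with rsr and fr replaced by plain integers of the same size as an assumption (the real ReceiveStatus and Flags types are not reproduced here). It uses offset_of!, which needs Rust 1.77 or later.

```rust
use std::mem::{offset_of, size_of};

/// First part of the PL011 register block, with padding sized so that each
/// field lands on the offset given in the register table.
#[repr(C, align(4))]
struct Registers {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: u8, // stand-in for ReceiveStatus (assumed to be one byte)
    _reserved1: [u8; 19],
    fr: u16, // stand-in for Flags (a u16 newtype)
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
    ibrd: u16,
    _reserved4: [u8; 2],
    fbrd: u8,
    _reserved5: [u8; 3],
    lcr_h: u8,
    _reserved6: [u8; 3],
    cr: u16,
    _reserved7: [u8; 2], // two bytes of padding place ifls at 0x34
    ifls: u8,
    _reserved8: [u8; 3],
}

/// Offsets of the named registers, in declaration order.
fn register_offsets() -> [usize; 9] {
    [
        offset_of!(Registers, dr),
        offset_of!(Registers, rsr),
        offset_of!(Registers, fr),
        offset_of!(Registers, ilpr),
        offset_of!(Registers, ibrd),
        offset_of!(Registers, fbrd),
        offset_of!(Registers, lcr_h),
        offset_of!(Registers, cr),
        offset_of!(Registers, ifls),
    ]
}

/// Total size of this partial register block in bytes.
fn registers_size() -> usize {
    size_of::<Registers>()
}
```

In a real driver the same checks could be written as const assertions next to the struct definition, so that a wrong padding size fails the build instead of silently shifting every later register.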

Driver

Now let's use the new Registers struct in our driver.

/// Driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    registers: *mut Registers,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the 8 MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self { registers: base_address as *mut Registers }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register().contains(Flags::TXFF) {}

        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            addr_of_mut!((*self.registers).dr).write_volatile(byte.into());
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register().contains(Flags::BUSY) {}
    }

    /// Reads and returns a pending byte, or `None` if nothing has been
    /// received.
    pub fn read_byte(&self) -> Option<u8> {
        if self.read_flag_register().contains(Flags::RXFE) {
            None
        } else {
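            // Safe because we know that self.registers points to the control
            // registers of a PL011 device which is appropriately mapped.
            let data = unsafe { addr_of!((*self.registers).dr).read_volatile() };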
            // TODO: Check for error conditions in bits 8-11.
            Some(data as u8)
        }
    }

    fn read_flag_register(&self) -> Flags {
        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).fr).read_volatile() }
    }
}
  • Note the use of addr_of! and addr_of_mut! to get pointers to individual fields without creating an intermediate reference, which would be unsound.
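The pattern can be tried on a normal machine by letting ordinary memory stand in for device registers. In the sketch below, `FakeRegisters` and the helper functions are hypothetical names introduced for illustration; the point is that the field pointer is derived directly from the raw struct pointer, and no `&` or `&mut` reference is created along the way.

```rust
use std::ptr::{addr_of, addr_of_mut};

/// Ordinary memory standing in for device registers (hypothetical layout).
#[repr(C)]
struct FakeRegisters {
    data: u16,
    status: u16,
}

/// Writes `value` to the `data` field through a raw pointer.
///
/// # Safety
///
/// `regs` must be valid for reads and writes and properly aligned.
unsafe fn write_data(regs: *mut FakeRegisters, value: u16) {
    // The field pointer comes straight from `regs`; no reference is created.
    unsafe { addr_of_mut!((*regs).data).write_volatile(value) }
}

/// Reads the `data` field through a raw pointer.
///
/// # Safety
///
/// `regs` must be valid for reads and properly aligned.
unsafe fn read_data(regs: *const FakeRegisters) -> u16 {
    unsafe { addr_of!((*regs).data).read_volatile() }
}

/// Reads the `status` field through a raw pointer.
///
/// # Safety
///
/// `regs` must be valid for reads and properly aligned.
unsafe fn read_status(regs: *const FakeRegisters) -> u16 {
    unsafe { addr_of!((*regs).status).read_volatile() }
}

fn main() {
    let mut backing = FakeRegisters { data: 0, status: 0x10 };
    let regs: *mut FakeRegisters = &mut backing;
    // SAFETY: `regs` points to a live, aligned `FakeRegisters`.
    unsafe {
        write_data(regs, 0x41);
        println!("data = {:#x}, status = {:#x}", read_data(regs), read_status(regs));
    }
}
```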

Using it

Let's write a small program using our driver to write to the serial console and echo incoming bytes.

#![no_main]
#![no_std]

mod exceptions;
mod pl011;

use crate::pl011::Uart;
use core::fmt::Write;
use core::panic::PanicInfo;
use log::error;
use smccc::psci::system_off;
use smccc::Hvc;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let mut uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };

    writeln!(uart, "main({x0:#x}, {x1:#x}, {x2:#x}, {x3:#x})").unwrap();

    loop {
        if let Some(byte) = uart.read_byte() {
            uart.write_byte(byte);
            match byte {
                b'\r' => {
                    uart.write_byte(b'\n');
                }
                b'q' => break,
                _ => {}
            }
        }
    }

    writeln!(uart, "Bye!").unwrap();
    system_off::<Hvc>().unwrap();
}
  • As in the inline assembly example, this main function is called from our entry point code in entry.S. See the speaker notes there for details.
  • Run the example in QEMU with make qemu under src/bare-metal/aps/examples.
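The writeln! calls in the program above only compile if the driver implements the fmt::Write trait, which needs just a write_str method. The following sketch shows the idea with a hypothetical `ByteSink` that collects bytes in a Vec instead of sending them to a UART, so it can run on a host machine.

```rust
use std::fmt::{self, Write};

/// Hypothetical byte sink standing in for the UART driver.
struct ByteSink {
    bytes: Vec<u8>,
}

impl ByteSink {
    /// Stores a single byte, where the real driver would transmit it.
    fn write_byte(&mut self, byte: u8) {
        self.bytes.push(byte);
    }
}

impl Write for ByteSink {
    /// Forwards every byte of the string to `write_byte`.
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for byte in s.bytes() {
            self.write_byte(byte);
        }
        Ok(())
    }
}

/// Formats a line into a fresh sink and returns the bytes written.
fn greet(x0: u64) -> Vec<u8> {
    let mut sink = ByteSink { bytes: Vec::new() };
    writeln!(sink, "main({x0:#x})").unwrap();
    sink.bytes
}

fn main() {
    let bytes = greet(0x2a);
    println!("{}", String::from_utf8_lossy(&bytes));
}
```

Once write_str exists, all of the formatting machinery (write!, writeln!, format specifiers) works without any heap allocation.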

Logging

It would be nice to be able to use the logging macros from the log crate. We can do this by implementing the Log trait.

use crate::pl011::Uart;
use core::fmt::Write;
use log::{LevelFilter, Log, Metadata, Record, SetLoggerError};
use spin::mutex::SpinMutex;

static LOGGER: Logger = Logger { uart: SpinMutex::new(None) };

struct Logger {
    uart: SpinMutex<Option<Uart>>,
}

impl Log for Logger {
    fn enabled(&self, _metadata: &Metadata) -> bool {
        true
    }

    fn log(&self, record: &Record) {
        writeln!(
            self.uart.lock().as_mut().unwrap(),
            "[{}] {}",
            record.level(),
            record.args()
        )
        .unwrap();
    }

    fn flush(&self) {}
}

/// Initialises UART logger.
pub fn init(uart: Uart, max_level: LevelFilter) -> Result<(), SetLoggerError> {
    LOGGER.uart.lock().replace(uart);

    log::set_logger(&LOGGER)?;
    log::set_max_level(max_level);
    Ok(())
}
  • The unwrap in log is safe because we initialise LOGGER before calling set_logger.
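The "empty static, filled in by init" pattern can be tried on a host machine. This sketch makes two substitutions that are assumptions for illustration: std::sync::Mutex takes the place of SpinMutex, and a String takes the place of the UART. The function names are hypothetical.

```rust
use std::fmt::Write;
use std::sync::Mutex;

/// Global sink, empty until `init` is called.
static SINK: Mutex<Option<String>> = Mutex::new(None);

/// Installs the sink. Must be called before `log_line`.
fn init(sink: String) {
    SINK.lock().unwrap().replace(sink);
}

/// Appends one formatted line; panics if `init` has not run yet.
fn log_line(level: &str, message: &str) {
    let mut guard = SINK.lock().unwrap();
    let sink = guard.as_mut().expect("logger used before init");
    writeln!(sink, "[{level}] {message}").unwrap();
}

/// Returns a copy of everything logged so far.
fn logged() -> String {
    SINK.lock().unwrap().clone().unwrap_or_default()
}

fn main() {
    init(String::new());
    log_line("INFO", "hello");
    print!("{}", logged());
}
```

The Option is what lets the static be constructed in a const context before any device exists; the cost is that every use has to handle, or rule out, the None case.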

Using it

We need to initialise the logger before we use it.

#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;

use crate::pl011::Uart;
use core::panic::PanicInfo;
use log::{error, info, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({x0:#x}, {x1:#x}, {x2:#x}, {x3:#x})");

    assert_eq!(x1, 42);

    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}
  • Note that our panic handler can now log details of panics.
  • Run the example in QEMU with make qemu_logger under src/bare-metal/aps/examples.

Exceptions

AArch64 defines an exception vector table with 16 entries, for 4 types of exceptions (synchronous, IRQ, FIQ, SError) from 4 states (current EL with SP0, current EL with SPx, lower EL using AArch64, lower EL using AArch32). We implement this in assembly to save volatile registers to the stack before calling into Rust code:

use log::error;
use smccc::psci::system_off;
use smccc::Hvc;

#[no_mangle]
extern "C" fn sync_exception_current(_elr: u64, _spsr: u64) {
    error!("sync_exception_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_current(_elr: u64, _spsr: u64) {
    error!("irq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_current(_elr: u64, _spsr: u64) {
    error!("fiq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_current(_elr: u64, _spsr: u64) {
    error!("serr_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn sync_lower(_elr: u64, _spsr: u64) {
    error!("sync_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_lower(_elr: u64, _spsr: u64) {
    error!("irq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_lower(_elr: u64, _spsr: u64) {
    error!("fiq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_lower(_elr: u64, _spsr: u64) {
    error!("serr_lower");
    system_off::<Hvc>().unwrap();
}
  • EL is exception level; all our examples this afternoon run in EL1.
  • For simplicity we aren't distinguishing between SP0 and SPx for the current EL exceptions, or between AArch32 and AArch64 for the lower EL exceptions.
  • For this example we just log the exception and power down, as we don't expect any of them to actually happen.
  • We can think of exception handlers and our main execution context more or less like different threads. Send and Sync will control what we can share between them, just like with threads. For example, if we want to share some value between exception handlers and the rest of the program, and it's Send but not Sync, then we'll need to wrap it in something like a Mutex and put it in a static.
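The 16 entries of the vector table sit at fixed offsets from the table's base address: the four states come in the order listed above, each with its four exception types, and every entry has room for 32 instructions (128 bytes). The sketch below computes those offsets; the enum and function names are hypothetical and exist only for this illustration.

```rust
/// Where the exception was taken from (the four states in the text).
#[derive(Copy, Clone, Debug)]
enum Source {
    CurrentElSp0 = 0,
    CurrentElSpx = 1,
    LowerElAArch64 = 2,
    LowerElAArch32 = 3,
}

/// The four exception types.
#[derive(Copy, Clone, Debug)]
enum Kind {
    Synchronous = 0,
    Irq = 1,
    Fiq = 2,
    SError = 3,
}

/// Each entry holds 32 four-byte instructions.
const ENTRY_SIZE: usize = 32 * 4;

/// Byte offset of a vector table entry from the table base.
fn vector_offset(source: Source, kind: Kind) -> usize {
    (source as usize * 4 + kind as usize) * ENTRY_SIZE
}

fn main() {
    let sources = [
        Source::CurrentElSp0,
        Source::CurrentElSpx,
        Source::LowerElAArch64,
        Source::LowerElAArch32,
    ];
    let kinds = [Kind::Synchronous, Kind::Irq, Kind::Fiq, Kind::SError];
    for source in sources {
        for kind in kinds {
            println!("{source:?} {kind:?}: {:#05x}", vector_offset(source, kind));
        }
    }
}
```

The small entry size is the reason the assembly stubs do little more than save registers and branch to a Rust function.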

Other projects

  • oreboot
    • “coreboot without the C”.
    • Supports x86, aarch64 and RISC-V.
    • Relies on LinuxBoot rather than having many drivers itself.
  • Rust RaspberryPi OS tutorial
    • Initialisation, UART driver, simple bootloader, JTAG, exception levels, exception handling, page tables, etc.
    • Some dodgy cache maintenance and initialisation in Rust; not necessarily a good example to copy for production code.
  • cargo-call-stack
    • Static analysis to determine maximum stack usage.
  • The RaspberryPi OS tutorial runs Rust code before the MMU and caches are enabled. This will read and write memory (e.g. the stack). However:
    • Without the MMU and cache, unaligned accesses will fault. It builds with aarch64-unknown-none, which sets +strict-align to prevent the compiler generating unaligned accesses, so it should be fine in this particular case, but this is not necessarily so in general.
    • If it were running in a VM, this could lead to cache coherency issues. The problem is that the VM is accessing memory directly with the cache disabled, while the host has cacheable aliases to the same memory. Even if the host doesn't explicitly access the memory, speculative accesses can lead to cache fills, and then changes from one or the other will get lost. Again, this is alright in this particular case (running directly on the hardware with no hypervisor), but it isn't a good pattern in general.

Useful crates

We'll go over a few crates which solve some common problems in bare-metal programming.

zerocopy

The zerocopy crate (from Fuchsia) provides traits and macros for safely converting between byte sequences and other types.

use zerocopy::AsBytes;

#[repr(u32)]
#[derive(AsBytes, Debug, Default)]
enum RequestType {
    #[default]
    In = 0,
    Out = 1,
    Flush = 4,
}

#[repr(C)]
#[derive(AsBytes, Debug, Default)]
struct VirtioBlockRequest {
    request_type: RequestType,
    reserved: u32,
    sector: u64,
}

fn main() {
    let request = VirtioBlockRequest {
        request_type: RequestType::Flush,
        sector: 42,
        ..Default::default()
    };

    assert_eq!(
        request.as_bytes(),
        &[4, 0, 0, 0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0]
    );
}

This is not suitable for MMIO (as it doesn't use volatile reads and writes), but can be useful for working with structures shared with hardware, e.g. by DMA, or sent over some external interface.

  • FromBytes can be implemented for types for which any byte pattern is valid, and so can safely be converted from an untrusted sequence of bytes.
  • Attempting to derive FromBytes for these types would fail, because RequestType doesn't use all possible u32 values as discriminants, so not all byte patterns are valid.
  • zerocopy::byteorder has types for byte-order aware numeric primitives.
  • Run the example with cargo run under src/bare-metal/useful-crates/zerocopy-example/. (It won't run in the Playground because of the crate dependency.)
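To illustrate why an enum like RequestType cannot accept arbitrary bytes, here is a hand-written checked conversion using only the standard library. It is a sketch of the underlying issue, not zerocopy's API: the `parse_request_type` helper is hypothetical, and little-endian byte order is assumed.

```rust
use std::convert::TryFrom;

#[repr(u32)]
#[derive(Copy, Clone, Debug, PartialEq)]
enum RequestType {
    In = 0,
    Out = 1,
    Flush = 4,
}

impl TryFrom<u32> for RequestType {
    /// The rejected raw value.
    type Error = u32;

    /// Accepts only the three discriminants the enum defines.
    fn try_from(value: u32) -> Result<Self, u32> {
        match value {
            0 => Ok(RequestType::In),
            1 => Ok(RequestType::Out),
            4 => Ok(RequestType::Flush),
            other => Err(other),
        }
    }
}

/// Decodes a request type from four little-endian bytes.
fn parse_request_type(bytes: [u8; 4]) -> Result<RequestType, u32> {
    RequestType::try_from(u32::from_le_bytes(bytes))
}

fn main() {
    println!("{:?}", parse_request_type([4, 0, 0, 0]));
    println!("{:?}", parse_request_type([1, 0, 0, 0]));
    println!("{:?}", parse_request_type([0, 0, 0, 0]));
    // 2 is not a valid discriminant, so the bytes are rejected.
    println!("{:?}", parse_request_type([2, 0, 0, 0]));
}
```

Reinterpreting the bytes `[2, 0, 0, 0]` directly as a RequestType would be undefined behaviour, which is why a conversion from untrusted bytes has to be checked.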

aarch64-paging

The aarch64-paging crate lets you create page tables according to the AArch64 Virtual Memory System Architecture.

use aarch64_paging::{
    idmap::IdMap,
    paging::{Attributes, MemoryRegion},
};

const ASID: usize = 1;
const ROOT_LEVEL: usize = 1;

// Create a new page table with identity mapping.
let mut idmap = IdMap::new(ASID, ROOT_LEVEL);
// Map a 2 MiB region of memory as read-only.
idmap.map_range(
    &MemoryRegion::new(0x80200000, 0x80400000),
    Attributes::NORMAL | Attributes::NON_GLOBAL | Attributes::READ_ONLY,
).unwrap();
// Set `TTBR0_EL1` to activate the page table.
idmap.activate();
  • For now it only supports EL1, but support for other exception levels should be straightforward to add.
  • This is used in Android for the Protected VM Firmware.
  • There's no easy way to run this example, as it needs to run on real hardware or under QEMU.
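Although the example itself needs a target to run on, the address arithmetic behind it can be checked on any machine. The sketch below assumes a 4 KiB translation granule, where each table level is indexed by 9 bits of the virtual address and a level 2 entry covers 2 MiB. It only illustrates that arithmetic and says nothing about how the crate is implemented; `table_indices` is a hypothetical helper.

```rust
/// Size covered by one level 2 entry with a 4 KiB granule.
const BLOCK_2MIB: u64 = 1 << 21;

/// Indices into the level 1, 2 and 3 tables for a virtual address,
/// assuming a 4 KiB granule (9 bits per level, 12-bit page offset).
fn table_indices(va: u64) -> (u64, u64, u64) {
    ((va >> 30) & 0x1ff, (va >> 21) & 0x1ff, (va >> 12) & 0x1ff)
}

fn main() {
    let start: u64 = 0x8020_0000;
    let end: u64 = 0x8040_0000;
    // The region mapped in the example is exactly one 2 MiB block.
    println!("region size: {:#x}", end - start);
    println!("block size:  {:#x}", BLOCK_2MIB);
    println!("indices of start: {:?}", table_indices(start));
}
```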

buddy_system_allocator

buddy_system_allocator is a third-party crate implementing a basic buddy system allocator. It can be used both for LockedHeap implementing GlobalAlloc so you can use the standard alloc crate (as we saw before), or for allocating other address space. For example, we might want to allocate MMIO space for PCI base address registers (BARs):

use buddy_system_allocator::FrameAllocator;
use core::alloc::Layout;

fn main() {
    let mut allocator = FrameAllocator::<32>::new();
    allocator.add_frame(0x200_0000, 0x400_0000);

    let layout = Layout::from_size_align(0x100, 0x100).unwrap();
    let bar = allocator
        .alloc_aligned(layout)
        .expect("Failed to allocate 0x100 byte MMIO region");
    println!("Allocated 0x100 byte MMIO region at {:#x}", bar);
}
  • PCI BARs always have alignment equal to their size.
  • Run the example with cargo run under src/bare-metal/useful-crates/allocator-example/. (It won't run in the Playground because of the crate dependency.)
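The "alignment equal to size" rule is easy to demonstrate with a much simpler scheme than a buddy system. The following sketch uses a plain bump allocator, which is a deliberate swap for illustration and not what buddy_system_allocator does; `BumpRange` and `alloc_bar` are hypothetical names, and sizes are assumed to be powers of two.

```rust
/// Rounds `addr` up to the next multiple of `align`, a power of two.
fn align_up(addr: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Hypothetical bump allocator over an address range, for illustration only.
struct BumpRange {
    next: usize,
    end: usize,
}

impl BumpRange {
    /// Allocates `size` bytes aligned to `size`, as a PCI BAR requires.
    fn alloc_bar(&mut self, size: usize) -> Option<usize> {
        let start = align_up(self.next, size);
        let end = start.checked_add(size)?;
        if end > self.end {
            return None;
        }
        self.next = end;
        Some(start)
    }
}

fn main() {
    let mut range = BumpRange { next: 0x200_0000, end: 0x400_0000 };
    for size in [0x100, 0x1000, 0x100] {
        match range.alloc_bar(size) {
            Some(bar) => println!("{size:#x} byte region at {bar:#x}"),
            None => println!("out of address space"),
        }
    }
}
```

A bump allocator wastes the gaps created by alignment and cannot free regions; a buddy allocator avoids both problems, which is one reason to use it for this job.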

tinyvec

Sometimes you want something which can be resized like a Vec, but without heap allocation. tinyvec provides this: a vector backed by an array or slice, which could be statically allocated or on the stack, which keeps track of how many elements are used and panics if you try to use more than are allocated.

use tinyvec::{array_vec, ArrayVec};

fn main() {
    let mut numbers: ArrayVec<[u32; 5]> = array_vec!(42, 66);
    println!("{numbers:?}");
    numbers.push(7);
    println!("{numbers:?}");
    numbers.remove(1);
    println!("{numbers:?}");
}
  • tinyvec requires that the element type implements Default for initialisation.
  • The Rust Playground includes tinyvec, so this example will run fine inline.
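The idea of an array plus a length counter fits in a few lines. The sketch below is a hand-written illustration, not tinyvec's implementation: `FixedVec` is a hypothetical type, and it requires `Copy` in addition to `Default` to keep the code short. It also shows why a `Default` bound is useful, since the unused slots of the array must hold some valid value.

```rust
/// Minimal fixed-capacity vector backed by an array (illustrative only).
#[derive(Debug)]
struct FixedVec<T, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedVec<T, N> {
    /// Creates an empty vector; unused slots hold `T::default()`.
    fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    /// Appends a value, panicking if the backing array is full.
    fn push(&mut self, value: T) {
        assert!(self.len < N, "capacity exceeded");
        self.items[self.len] = value;
        self.len += 1;
    }

    /// Removes and returns the element at `index`, shifting later ones down.
    fn remove(&mut self, index: usize) -> T {
        assert!(index < self.len, "index out of bounds");
        let removed = self.items[index];
        self.items.copy_within(index + 1..self.len, index);
        self.len -= 1;
        removed
    }

    /// Returns the elements currently in use.
    fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}

fn main() {
    let mut numbers: FixedVec<u32, 5> = FixedVec::new();
    numbers.push(42);
    numbers.push(66);
    numbers.push(7);
    numbers.remove(1);
    println!("{:?}", numbers.as_slice());
}
```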

spin

std::sync::Mutex and the other synchronisation primitives from std::sync are not available in core or alloc. How can we manage synchronisation or interior mutability, such as for sharing state between different CPUs?

The spin crate provides spinlock-based equivalents of many of these primitives.

use spin::mutex::SpinMutex;

static COUNTER: SpinMutex<u32> = SpinMutex::new(0);

fn main() {
    println!("count: {}", COUNTER.lock());
    *COUNTER.lock() += 2;
    println!("count: {}", COUNTER.lock());
}
  • Be careful to avoid deadlock if you take locks in interrupt handlers.
  • spin also has a ticket lock mutex implementation; equivalents of RwLock, Barrier and Once from std::sync; and Lazy for lazy initialisation.
  • The once_cell crate also has some useful types for late initialisation with a slightly different approach to using spin::once::Once.
  • The Rust Playground includes spin, so this example will run fine inline.
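To show what "spinlock-based" means, here is a minimal spinlock built from an atomic flag. It is a simplified sketch and not the spin crate's implementation; `SpinLock`, `Guard` and `count_with_threads` are hypothetical names, and host threads stand in for multiple CPUs.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Minimal spinlock protecting a value (illustrative only).
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: the lock ensures only one guard can access `value` at a time.
unsafe impl<T: Send> Sync for SpinLock<T> {}

/// Grants access to the value and releases the lock when dropped.
pub struct Guard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        Self { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Busy-waits until the flag can be flipped from false to true.
    pub fn lock(&self) -> Guard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        Guard { lock: self }
    }
}

impl<'a, T> Deref for Guard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: holding the guard means we hold the lock.
        unsafe { &*self.lock.value.get() }
    }
}

impl<'a, T> DerefMut for Guard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: holding the guard means we hold the lock.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<'a, T> Drop for Guard<'a, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

/// Increments a shared counter from several threads and returns the total.
fn count_with_threads(threads: u32, per_thread: u32) -> u32 {
    let counter = SpinLock::new(0u32);
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    *counter.lock() += 1;
                }
            });
        }
    });
    let total = *counter.lock();
    total
}

fn main() {
    println!("count: {}", count_with_threads(4, 1000));
}
```

A waiting CPU keeps polling instead of sleeping, so this approach only makes sense when critical sections are short. It also shows the deadlock risk: an interrupt handler that tries to take a lock already held by the code it interrupted would spin forever.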

Android

To build a bare-metal Rust binary in AOSP, you need to use a rust_ffi_static Soong rule to build your Rust code, then a cc_binary with a linker script to produce the binary itself, and then a raw_binary to convert the ELF to a raw binary ready to be run.

rust_ffi_static {
    name: "libvmbase_example",
    defaults: ["vmbase_ffi_defaults"],
    crate_name: "vmbase_example",
    srcs: ["src/main.rs"],
    rustlibs: [
        "libvmbase",
    ],
}

cc_binary {
    name: "vmbase_example",
    defaults: ["vmbase_elf_defaults"],
    srcs: [
        "idmap.S",
    ],
    static_libs: [
        "libvmbase_example",
    ],
    linker_scripts: [
        "image.ld",
        ":vmbase_sections",
    ],
}

raw_binary {
    name: "vmbase_example_bin",
    stem: "vmbase_example.bin",
    src: ":vmbase_example",
    enabled: false,
    target: {
        android_arm64: {
            enabled: true,
        },
    },
}

vmbase

For VMs running under crosvm on aarch64, the vmbase library provides a linker script and useful defaults for the build rules, along with an entry point, UART console logging and more.

#![no_main]
#![no_std]

use vmbase::{main, println};

main!(main);

pub fn main(arg0: u64, arg1: u64, arg2: u64, arg3: u64) {
    println!("Hello world");
}
  • The main! macro marks your main function, to be called from the vmbase entry point.
  • The vmbase entry point handles console initialisation, and issues a PSCI_SYSTEM_OFF to shut down the VM if your main function returns.

Exercises

We will write a driver for the PL031 real-time clock device.

After looking at the exercises, you can look at the solutions provided.

RTC driver

The QEMU aarch64 virt machine has a PL031 real-time clock at 0x9010000. For this exercise, you should write a driver for it.

  1. Use it to print the current time to the serial console. You can use the chrono crate for date/time formatting.
  2. Use the match register and raw interrupt status to busy-wait until a given time, e.g. 3 seconds in the future. (Call core::hint::spin_loop inside the loop.)
  3. Extension if you have time: enable and handle the interrupt generated by the RTC match. You can use the driver provided in the arm-gic crate to configure the Arm Generic Interrupt Controller (GIC).
    • Use the RTC interrupt, which is wired to the GIC as IntId::spi(2).
    • Once the interrupt is enabled, you can put the core to sleep via arm_gic::wfi(), which will cause the core to sleep until it receives an interrupt.

Download the exercise template and look in the rtc directory for the following files.

src/main.rs:

#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;

use crate::pl011::Uart;
use arm_gic::gicv3::GicV3;
use core::panic::PanicInfo;
use log::{error, info, trace, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base addresses of the GICv3.
const GICD_BASE_ADDRESS: *mut u64 = 0x800_0000 as _;
const GICR_BASE_ADDRESS: *mut u64 = 0x80A_0000 as _;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({:#x}, {:#x}, {:#x}, {:#x})", x0, x1, x2, x3);

    // Safe because `GICD_BASE_ADDRESS` and `GICR_BASE_ADDRESS` are the base
    // addresses of a GICv3 distributor and redistributor respectively, and
    // nothing else accesses those address ranges.
    let mut gic = unsafe { GicV3::new(GICD_BASE_ADDRESS, GICR_BASE_ADDRESS) };
    gic.setup();

    // TODO: Create instance of RTC driver and print current time.

    // TODO: Wait for 3 seconds.

    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}

src/exceptions.rs (you should only need to change this for the 3rd part of the exercise):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use arm_gic::gicv3::GicV3;
use log::{error, info, trace};
use smccc::psci::system_off;
use smccc::Hvc;

#[no_mangle]
extern "C" fn sync_exception_current(_elr: u64, _spsr: u64) {
    error!("sync_exception_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_current(_elr: u64, _spsr: u64) {
    trace!("irq_current");
    let intid =
        GicV3::get_and_acknowledge_interrupt().expect("No pending interrupt");
    info!("IRQ {intid:?}");
}

#[no_mangle]
extern "C" fn fiq_current(_elr: u64, _spsr: u64) {
    error!("fiq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_current(_elr: u64, _spsr: u64) {
    error!("serr_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn sync_lower(_elr: u64, _spsr: u64) {
    error!("sync_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_lower(_elr: u64, _spsr: u64) {
    error!("irq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_lower(_elr: u64, _spsr: u64) {
    error!("fiq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_lower(_elr: u64, _spsr: u64) {
    error!("serr_lower");
    system_off::<Hvc>().unwrap();
}
}

src/logger.rs (you shouldn’t need to change this):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: main
use crate::pl011::Uart;
use core::fmt::Write;
use log::{LevelFilter, Log, Metadata, Record, SetLoggerError};
use spin::mutex::SpinMutex;

static LOGGER: Logger = Logger { uart: SpinMutex::new(None) };

struct Logger {
    uart: SpinMutex<Option<Uart>>,
}

impl Log for Logger {
    fn enabled(&self, _metadata: &Metadata) -> bool {
        true
    }

    fn log(&self, record: &Record) {
        writeln!(
            self.uart.lock().as_mut().unwrap(),
            "[{}] {}",
            record.level(),
            record.args()
        )
        .unwrap();
    }

    fn flush(&self) {}
}

/// Initialises UART logger.
pub fn init(uart: Uart, max_level: LevelFilter) -> Result<(), SetLoggerError> {
    LOGGER.uart.lock().replace(uart);

    log::set_logger(&LOGGER)?;
    log::set_max_level(max_level);
    Ok(())
}
}

src/pl011.rs (you shouldn’t need to change this):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#![allow(unused)]

use core::fmt::{self, Write};
use core::ptr::{addr_of, addr_of_mut};

// ANCHOR: Flags
use bitflags::bitflags;

bitflags! {
    /// Flags from the UART flag register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct Flags: u16 {
        /// Clear to send.
        const CTS = 1 << 0;
        /// Data set ready.
        const DSR = 1 << 1;
        /// Data carrier detect.
        const DCD = 1 << 2;
        /// UART busy transmitting data.
        const BUSY = 1 << 3;
        /// Receive FIFO is empty.
        const RXFE = 1 << 4;
        /// Transmit FIFO is full.
        const TXFF = 1 << 5;
        /// Receive FIFO is full.
        const RXFF = 1 << 6;
        /// Transmit FIFO is empty.
        const TXFE = 1 << 7;
        /// Ring indicator.
        const RI = 1 << 8;
    }
}
// ANCHOR_END: Flags

bitflags! {
    /// Flags from the UART Receive Status Register / Error Clear Register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct ReceiveStatus: u16 {
        /// Framing error.
        const FE = 1 << 0;
        /// Parity error.
        const PE = 1 << 1;
        /// Break error.
        const BE = 1 << 2;
        /// Overrun error.
        const OE = 1 << 3;
    }
}

// ANCHOR: Registers
#[repr(C, align(4))]
struct Registers {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: ReceiveStatus,
    _reserved1: [u8; 18],
    fr: Flags,
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
    ibrd: u16,
    _reserved4: [u8; 2],
    fbrd: u8,
    _reserved5: [u8; 3],
    lcr_h: u8,
    _reserved6: [u8; 3],
    cr: u16,
    _reserved7: [u8; 2],
    ifls: u8,
    _reserved8: [u8; 3],
    imsc: u16,
    _reserved9: [u8; 2],
    ris: u16,
    _reserved10: [u8; 2],
    mis: u16,
    _reserved11: [u8; 2],
    icr: u16,
    _reserved12: [u8; 2],
    dmacr: u8,
    _reserved13: [u8; 3],
}
// ANCHOR_END: Registers

// ANCHOR: Uart
/// Driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    registers: *mut Registers,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self { registers: base_address as *mut Registers }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register().contains(Flags::TXFF) {}

        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            addr_of_mut!((*self.registers).dr).write_volatile(byte.into());
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register().contains(Flags::BUSY) {}
    }

    /// Reads and returns a pending byte, or `None` if nothing has been
    /// received.
    pub fn read_byte(&self) -> Option<u8> {
        if self.read_flag_register().contains(Flags::RXFE) {
            None
        } else {
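            // Safe because we know that self.registers points to the control
            // registers of a PL011 device which is appropriately mapped.
            let data = unsafe { addr_of!((*self.registers).dr).read_volatile() };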
            // TODO: Check for error conditions in bits 8-11.
            Some(data as u8)
        }
    }

    fn read_flag_register(&self) -> Flags {
        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).fr).read_volatile() }
    }
}
// ANCHOR_END: Uart

impl Write for Uart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.as_bytes() {
            self.write_byte(*c);
        }
        Ok(())
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Uart {}
}

Cargo.toml (you shouldn’t need to change this):

[workspace]

[package]
name = "rtc"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
arm-gic = "0.1.0"
bitflags = "2.4.2"
chrono = { version = "0.4.24", default-features = false }
log = "0.4.17"
smccc = "0.1.1"
spin = "0.9.8"

[build-dependencies]
cc = "1.0.73"

build.rs (you shouldn’t need to change this):

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use cc::Build;
use std::env;

fn main() {
    #[cfg(target_os = "linux")]
    env::set_var("CROSS_COMPILE", "aarch64-linux-gnu");
    #[cfg(not(target_os = "linux"))]
    env::set_var("CROSS_COMPILE", "aarch64-none-elf");

    Build::new()
        .file("entry.S")
        .file("exceptions.S")
        .file("idmap.S")
        .compile("empty")
}

entry.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

.macro adr_l, reg:req, sym:req
	adrp \reg, \sym
	add \reg, \reg, :lo12:\sym
.endm

.macro mov_i, reg:req, imm:req
	movz \reg, :abs_g3:\imm
	movk \reg, :abs_g2_nc:\imm
	movk \reg, :abs_g1_nc:\imm
	movk \reg, :abs_g0_nc:\imm
.endm

.set .L_MAIR_DEV_nGnRE,	0x04
.set .L_MAIR_MEM_WBWA,	0xff
.set .Lmairval, .L_MAIR_DEV_nGnRE | (.L_MAIR_MEM_WBWA << 8)

/* 4 KiB granule size for TTBR0_EL1. */
.set .L_TCR_TG0_4KB, 0x0 << 14
/* 4 KiB granule size for TTBR1_EL1. */
.set .L_TCR_TG1_4KB, 0x2 << 30
/* Disable translation table walk for TTBR1_EL1, generating a translation fault instead. */
.set .L_TCR_EPD1, 0x1 << 23
/* Translation table walks for TTBR0_EL1 are inner shareable. */
.set .L_TCR_SH_INNER, 0x3 << 12
/*
 * Translation table walks for TTBR0_EL1 are outer write-back read-allocate write-allocate
 * cacheable.
 */
.set .L_TCR_RGN_OWB, 0x1 << 10
/*
 * Translation table walks for TTBR0_EL1 are inner write-back read-allocate write-allocate
 * cacheable.
 */
.set .L_TCR_RGN_IWB, 0x1 << 8
/* Size offset for TTBR0_EL1 is 2**39 bytes (512 GiB). */
.set .L_TCR_T0SZ_512, 64 - 39
.set .Ltcrval, .L_TCR_TG0_4KB | .L_TCR_TG1_4KB | .L_TCR_EPD1 | .L_TCR_RGN_OWB
.set .Ltcrval, .Ltcrval | .L_TCR_RGN_IWB | .L_TCR_SH_INNER | .L_TCR_T0SZ_512

/* Stage 1 instruction access cacheability is unaffected. */
.set .L_SCTLR_ELx_I, 0x1 << 12
/* SP alignment fault if SP is not aligned to a 16 byte boundary. */
.set .L_SCTLR_ELx_SA, 0x1 << 3
/* Stage 1 data access cacheability is unaffected. */
.set .L_SCTLR_ELx_C, 0x1 << 2
/* EL0 and EL1 stage 1 MMU enabled. */
.set .L_SCTLR_ELx_M, 0x1 << 0
/* Privileged Access Never is unchanged on taking an exception to EL1. */
.set .L_SCTLR_EL1_SPAN, 0x1 << 23
/* SETEND instruction disabled at EL0 in aarch32 mode. */
.set .L_SCTLR_EL1_SED, 0x1 << 8
/* Various IT instructions are disabled at EL0 in aarch32 mode. */
.set .L_SCTLR_EL1_ITD, 0x1 << 7
.set .L_SCTLR_EL1_RES1, (0x1 << 11) | (0x1 << 20) | (0x1 << 22) | (0x1 << 28) | (0x1 << 29)
.set .Lsctlrval, .L_SCTLR_ELx_M | .L_SCTLR_ELx_C | .L_SCTLR_ELx_SA | .L_SCTLR_EL1_ITD | .L_SCTLR_EL1_SED
.set .Lsctlrval, .Lsctlrval | .L_SCTLR_ELx_I | .L_SCTLR_EL1_SPAN | .L_SCTLR_EL1_RES1

/**
 * This is a generic entry point for an image. It carries out the operations required to prepare the
 * loaded image to be run. Specifically, it zeroes the bss section using registers x25 and above,
 * prepares the stack, enables floating point, and sets up the exception vector. It preserves x0-x3
 * for the Rust entry point, as these may contain boot parameters.
 */
.section .init.entry, "ax"
.global entry
entry:
	/* Load and apply the memory management configuration, ready to enable MMU and caches. */
	adrp x30, idmap
	msr ttbr0_el1, x30

	mov_i x30, .Lmairval
	msr mair_el1, x30

	mov_i x30, .Ltcrval
	/* Copy the supported PA range into TCR_EL1.IPS. */
	mrs x29, id_aa64mmfr0_el1
	bfi x30, x29, #32, #4

	msr tcr_el1, x30

	mov_i x30, .Lsctlrval

	/*
	 * Ensure everything before this point has completed, then invalidate any potentially stale
	 * local TLB entries before they start being used.
	 */
	isb
	tlbi vmalle1
	ic iallu
	dsb nsh
	isb

	/*
	 * Configure sctlr_el1 to enable MMU and cache and don't proceed until this has completed.
	 */
	msr sctlr_el1, x30
	isb

	/* Disable trapping floating point access in EL1. */
	mrs x30, cpacr_el1
	orr x30, x30, #(0x3 << 20)
	msr cpacr_el1, x30
	isb

	/* Zero out the bss section. */
	adr_l x29, bss_begin
	adr_l x30, bss_end
0:	cmp x29, x30
	b.hs 1f
	stp xzr, xzr, [x29], #16
	b 0b

1:	/* Prepare the stack. */
	adr_l x30, boot_stack_end
	mov sp, x30

	/* Set up exception vector. */
	adr x30, vector_table_el1
	msr vbar_el1, x30

	/* Call into Rust code. */
	bl main

	/* Loop forever waiting for interrupts. */
2:	wfi
	b 2b

exceptions.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * Saves the volatile registers onto the stack. This currently takes 14
 * instructions, so it can be used in exception handlers with 18 instructions
 * left.
 *
 * On return, x0 and x1 are initialised to elr_el1 and spsr_el1 respectively,
 * which can be used as the first and second arguments of a subsequent call.
 */
.macro save_volatile_to_stack
	/* Reserve stack space and save registers x0-x18, x29 & x30. */
	stp x0, x1, [sp, #-(8 * 24)]!
	stp x2, x3, [sp, #8 * 2]
	stp x4, x5, [sp, #8 * 4]
	stp x6, x7, [sp, #8 * 6]
	stp x8, x9, [sp, #8 * 8]
	stp x10, x11, [sp, #8 * 10]
	stp x12, x13, [sp, #8 * 12]
	stp x14, x15, [sp, #8 * 14]
	stp x16, x17, [sp, #8 * 16]
	str x18, [sp, #8 * 18]
	stp x29, x30, [sp, #8 * 20]

	/*
	 * Save elr_el1 & spsr_el1. This is so that we can take a nested exception
	 * and still be able to unwind.
	 */
	mrs x0, elr_el1
	mrs x1, spsr_el1
	stp x0, x1, [sp, #8 * 22]
.endm

/**
 * Restores the volatile registers from the stack. This currently takes 14
 * instructions, so it can be used in exception handlers while still leaving 18
 * instructions left; if paired with save_volatile_to_stack, there are 4
 * instructions to spare.
 */
.macro restore_volatile_from_stack
	/* Restore registers x2-x18, x29 & x30. */
	ldp x2, x3, [sp, #8 * 2]
	ldp x4, x5, [sp, #8 * 4]
	ldp x6, x7, [sp, #8 * 6]
	ldp x8, x9, [sp, #8 * 8]
	ldp x10, x11, [sp, #8 * 10]
	ldp x12, x13, [sp, #8 * 12]
	ldp x14, x15, [sp, #8 * 14]
	ldp x16, x17, [sp, #8 * 16]
	ldr x18, [sp, #8 * 18]
	ldp x29, x30, [sp, #8 * 20]

	/* Restore registers elr_el1 & spsr_el1, using x0 & x1 as scratch. */
	ldp x0, x1, [sp, #8 * 22]
	msr elr_el1, x0
	msr spsr_el1, x1

	/* Restore x0 & x1, and release stack space. */
	ldp x0, x1, [sp], #8 * 24
.endm

/**
 * This is a generic handler for exceptions taken at the current EL while using
 * SP0. It behaves similarly to the SPx case by first switching to SPx, doing
 * the work, then switching back to SP0 before returning.
 *
 * Switching to SPx and calling the Rust handler takes 16 instructions. To
 * restore and return we need an additional 16 instructions, so we can implement
 * the whole handler within the allotted 32 instructions.
 */
.macro current_exception_sp0 handler:req
	msr spsel, #1
	save_volatile_to_stack
	bl \handler
	restore_volatile_from_stack
	msr spsel, #0
	eret
.endm

/**
 * This is a generic handler for exceptions taken at the current EL while using
 * SPx. It saves volatile registers, calls the Rust handler, restores volatile
 * registers, then returns.
 *
 * This also works for exceptions taken from EL0, if we don't care about
 * non-volatile registers.
 *
 * Saving state and jumping to the Rust handler takes 15 instructions, and
 * restoring and returning also takes 15 instructions, so we can fit the whole
 * handler in 30 instructions, under the limit of 32.
 */
.macro current_exception_spx handler:req
	save_volatile_to_stack
	bl \handler
	restore_volatile_from_stack
	eret
.endm

.section .text.vector_table_el1, "ax"
.global vector_table_el1
.balign 0x800
vector_table_el1:
sync_cur_sp0:
	current_exception_sp0 sync_exception_current

.balign 0x80
irq_cur_sp0:
	current_exception_sp0 irq_current

.balign 0x80
fiq_cur_sp0:
	current_exception_sp0 fiq_current

.balign 0x80
serr_cur_sp0:
	current_exception_sp0 serr_current

.balign 0x80
sync_cur_spx:
	current_exception_spx sync_exception_current

.balign 0x80
irq_cur_spx:
	current_exception_spx irq_current

.balign 0x80
fiq_cur_spx:
	current_exception_spx fiq_current

.balign 0x80
serr_cur_spx:
	current_exception_spx serr_current

.balign 0x80
sync_lower_64:
	current_exception_spx sync_lower

.balign 0x80
irq_lower_64:
	current_exception_spx irq_lower

.balign 0x80
fiq_lower_64:
	current_exception_spx fiq_lower

.balign 0x80
serr_lower_64:
	current_exception_spx serr_lower

.balign 0x80
sync_lower_32:
	current_exception_spx sync_lower

.balign 0x80
irq_lower_32:
	current_exception_spx irq_lower

.balign 0x80
fiq_lower_32:
	current_exception_spx fiq_lower

.balign 0x80
serr_lower_32:
	current_exception_spx serr_lower

idmap.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

.set .L_TT_TYPE_BLOCK, 0x1
.set .L_TT_TYPE_PAGE,  0x3
.set .L_TT_TYPE_TABLE, 0x3

/* Access flag. */
.set .L_TT_AF, 0x1 << 10
/* Not global. */
.set .L_TT_NG, 0x1 << 11
.set .L_TT_XN, 0x3 << 53

.set .L_TT_MT_DEV, 0x0 << 2			// MAIR #0 (DEV_nGnRE)
.set .L_TT_MT_MEM, (0x1 << 2) | (0x3 << 8)	// MAIR #1 (MEM_WBWA), inner shareable

.set .L_BLOCK_DEV, .L_TT_TYPE_BLOCK | .L_TT_MT_DEV | .L_TT_AF | .L_TT_XN
.set .L_BLOCK_MEM, .L_TT_TYPE_BLOCK | .L_TT_MT_MEM | .L_TT_AF | .L_TT_NG

.section ".rodata.idmap", "a", %progbits
.global idmap
.align 12
idmap:
	/* level 1 */
	.quad		.L_BLOCK_DEV | 0x0		    // 1 GiB of device mappings
	.quad		.L_BLOCK_MEM | 0x40000000	// 1 GiB of DRAM
	.fill		254, 8, 0x0			// 254 GiB of unmapped VA space
	.quad		.L_BLOCK_DEV | 0x4000000000 // 1 GiB of device mappings
	.fill		255, 8, 0x0			// 255 GiB of remaining VA space

image.ld (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * Code will start running at this symbol which is placed at the start of the
 * image.
 */
ENTRY(entry)

MEMORY
{
	image : ORIGIN = 0x40080000, LENGTH = 2M
}

SECTIONS
{
	/*
	 * Collect together the code.
	 */
	.init : ALIGN(4096) {
		text_begin = .;
		*(.init.entry)
		*(.init.*)
	} >image
	.text : {
		*(.text.*)
	} >image
	text_end = .;

	/*
	 * Collect together read-only data.
	 */
	.rodata : ALIGN(4096) {
		rodata_begin = .;
		*(.rodata.*)
	} >image
	.got : {
		*(.got)
	} >image
	rodata_end = .;

	/*
	 * Collect together the read-write data including .bss at the end which
	 * will be zero'd by the entry code.
	 */
	.data : ALIGN(4096) {
		data_begin = .;
		*(.data.*)
		/*
		 * The entry point code assumes that .data is a multiple of 32
		 * bytes long.
		 */
		. = ALIGN(32);
		data_end = .;
	} >image

	/* Everything beyond this point will not be included in the binary. */
	bin_end = .;

	/* The entry point code assumes that .bss is 16-byte aligned. */
	.bss : ALIGN(16)  {
		bss_begin = .;
		*(.bss.*)
		*(COMMON)
		. = ALIGN(16);
		bss_end = .;
	} >image

	.stack (NOLOAD) : ALIGN(4096) {
		boot_stack_begin = .;
		. += 40 * 4096;
		. = ALIGN(4096);
		boot_stack_end = .;
	} >image

	. = ALIGN(4K);
	PROVIDE(dma_region = .);

	/*
	 * Remove unused sections from the image.
	 */
	/DISCARD/ : {
		/* The image loads itself so doesn't need these sections. */
		*(.gnu.hash)
		*(.hash)
		*(.interp)
		*(.eh_frame_hdr)
		*(.eh_frame)
		*(.note.gnu.build-id)
	}
}

Makefile (you shouldn’t need to change this):

# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

UNAME := $(shell uname -s)
ifeq ($(UNAME),Linux)
	TARGET = aarch64-linux-gnu
else
	TARGET = aarch64-none-elf
endif
OBJCOPY = $(TARGET)-objcopy

.PHONY: build qemu_minimal qemu qemu_logger

all: rtc.bin

build:
	cargo build

rtc.bin: build
	$(OBJCOPY) -O binary target/aarch64-unknown-none/debug/rtc $@

qemu: rtc.bin
	qemu-system-aarch64 -machine virt,gic-version=3 -cpu max -serial mon:stdio -display none -kernel $< -s

clean:
	cargo clean
	rm -f *.bin

.cargo/config.toml (you shouldn’t need to change this):

[build]
target = "aarch64-unknown-none"
rustflags = ["-C", "link-arg=-Timage.ld"]

Run the code in QEMU with make qemu.

Bare Metal Rust: Afternoon

RTC driver

(back to exercise)

main.rs:

#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;
mod pl031;

use crate::pl031::Rtc;
use arm_gic::gicv3::{IntId, Trigger};
use arm_gic::{irq_enable, wfi};
use chrono::{TimeZone, Utc};
use core::hint::spin_loop;
use crate::pl011::Uart;
use arm_gic::gicv3::GicV3;
use core::panic::PanicInfo;
use log::{error, info, trace, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base addresses of the GICv3.
const GICD_BASE_ADDRESS: *mut u64 = 0x800_0000 as _;
const GICR_BASE_ADDRESS: *mut u64 = 0x80A_0000 as _;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

/// Base address of the PL031 RTC.
const PL031_BASE_ADDRESS: *mut u32 = 0x901_0000 as _;
/// The IRQ used by the PL031 RTC.
const PL031_IRQ: IntId = IntId::spi(2);

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({:#x}, {:#x}, {:#x}, {:#x})", x0, x1, x2, x3);

    // Safe because `GICD_BASE_ADDRESS` and `GICR_BASE_ADDRESS` are the base
    // addresses of a GICv3 distributor and redistributor respectively, and
    // nothing else accesses those address ranges.
    let mut gic = unsafe { GicV3::new(GICD_BASE_ADDRESS, GICR_BASE_ADDRESS) };
    gic.setup();

    // Safe because `PL031_BASE_ADDRESS` is the base address of a PL031 device,
    // and nothing else accesses that address range.
    let mut rtc = unsafe { Rtc::new(PL031_BASE_ADDRESS) };
    let timestamp = rtc.read();
    let time = Utc.timestamp_opt(timestamp.into(), 0).unwrap();
    info!("RTC: {time}");

    GicV3::set_priority_mask(0xff);
    gic.set_interrupt_priority(PL031_IRQ, 0x80);
    gic.set_trigger(PL031_IRQ, Trigger::Level);
    irq_enable();
    gic.enable_interrupt(PL031_IRQ, true);

    // Wait for 3 seconds, without interrupts.
    let target = timestamp + 3;
    rtc.set_match(target);
    info!("Waiting for {}", Utc.timestamp_opt(target.into(), 0).unwrap());
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    while !rtc.matched() {
        spin_loop();
    }
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    info!("Finished waiting");

    // Wait another 3 seconds for an interrupt.
    let target = timestamp + 6;
    info!("Waiting for {}", Utc.timestamp_opt(target.into(), 0).unwrap());
    rtc.set_match(target);
    rtc.clear_interrupt();
    rtc.enable_interrupt(true);
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    while !rtc.interrupt_pending() {
        wfi();
    }
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    info!("Finished waiting");

    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}

pl031.rs:

#![allow(unused)]
fn main() {
use core::ptr::{addr_of, addr_of_mut};

#[repr(C, align(4))]
struct Registers {
    /// Data register
    dr: u32,
    /// Match register
    mr: u32,
    /// Load register
    lr: u32,
    /// Control register
    cr: u8,
    _reserved0: [u8; 3],
    /// Interrupt Mask Set or Clear register
    imsc: u8,
    _reserved1: [u8; 3],
    /// Raw Interrupt Status
    ris: u8,
    _reserved2: [u8; 3],
    /// Masked Interrupt Status
    mis: u8,
    _reserved3: [u8; 3],
    /// Interrupt Clear Register
    icr: u8,
    _reserved4: [u8; 3],
}

/// Driver for a PL031 real-time clock.
#[derive(Debug)]
pub struct Rtc {
    registers: *mut Registers,
}

impl Rtc {
    /// Constructs a new instance of the RTC driver for a PL031 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the MMIO control registers of a
    /// PL031 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self { registers: base_address as *mut Registers }
    }

    /// Reads the current RTC value.
    pub fn read(&self) -> u32 {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).dr).read_volatile() }
    }

    /// Writes a match value. When the RTC value matches this then an interrupt
    /// will be generated (if it is enabled).
    pub fn set_match(&mut self, value: u32) {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).mr).write_volatile(value) }
    }

    /// Returns whether the match register matches the RTC value, whether or not
    /// the interrupt is enabled.
    pub fn matched(&self) -> bool {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        let ris = unsafe { addr_of!((*self.registers).ris).read_volatile() };
        (ris & 0x01) != 0
    }

    /// Returns whether there is currently an interrupt pending.
    ///
    /// This should be true if and only if `matched` returns true and the
    /// interrupt is masked.
    pub fn interrupt_pending(&self) -> bool {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        let mis = unsafe { addr_of!((*self.registers).mis).read_volatile() };
        (mis & 0x01) != 0
    }

    /// Sets or clears the interrupt mask.
    ///
    /// When the mask is true the interrupt is enabled; when it is false the
    /// interrupt is disabled.
    pub fn enable_interrupt(&mut self, mask: bool) {
        let imsc = if mask { 0x01 } else { 0x00 };
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).imsc).write_volatile(imsc) }
    }

    /// Clears a pending interrupt, if any.
    pub fn clear_interrupt(&mut self) {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).icr).write_volatile(0x01) }
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Rtc {}
}

Welcome to Concurrency in Rust

Rust has full support for concurrency using OS threads with mutexes and channels.

The Rust type system plays an important role in turning many concurrency bugs into compile-time errors. This is often referred to as fearless concurrency, since you can rely on the compiler to ensure correctness at runtime.

Threads

Rust threads work similarly to threads in other languages:

use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("Count in thread: {i}!");
            thread::sleep(Duration::from_millis(5));
        }
    });

    for i in 1..5 {
        println!("Main thread: {i}");
        thread::sleep(Duration::from_millis(5));
    }
}
  • Threads are all daemon threads; the main thread does not wait for them.
  • Thread panics are independent of each other.
    • Panics can carry a payload, which can be unpacked with downcast_ref.

Key points:

  • Notice that the spawned thread is stopped before it finishes counting, because the main thread does not wait for it.

  • Use let handle = thread::spawn(...) and later handle.join() to wait for the thread to finish.

  • Trigger a panic in the thread and notice how this does not affect main.

  • Use the Result return value from handle.join() to get access to the panic payload. This is a good time to talk about Any.
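The key points above can be tried out with a small sketch. The function names count_in_thread and panic_message are made up for this illustration; the sketch assumes the panic payload is either a &str or a String, which is what panic! produces for string messages.

```rust
use std::thread;

/// Spawns a thread that sums 1..=n and waits for it to finish.
fn count_in_thread(n: u32) -> u32 {
    let handle = thread::spawn(move || {
        let mut total = 0;
        for i in 1..=n {
            total += i;
        }
        total
    });
    // join() blocks until the thread has finished and yields its result.
    handle.join().unwrap()
}

/// Spawns a thread that panics and extracts the panic message, if the
/// payload is a string.
fn panic_message() -> Option<String> {
    let handle = thread::spawn(|| {
        panic!("something went wrong");
    });
    match handle.join() {
        Ok(()) => None,
        // The payload is a Box<dyn Any + Send>; try both string types.
        Err(payload) => payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned()),
    }
}

fn main() {
    println!("Sum computed in thread: {}", count_in_thread(4));
    // The panic in the spawned thread does not bring down main.
    println!("Panic payload: {:?}", panic_message());
}
```

The default panic hook still prints the panic message to standard error; only the main thread's control flow is unaffected.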

Scoped Threads

Normal threads cannot borrow from their environment:

use std::thread;

fn foo() {
    let s = String::from("Hello");
    thread::spawn(|| {
        println!("Length: {}", s.len());
    });
}

fn main() {
    foo();
}

However, you can use a scoped thread for this:

use std::thread;

fn main() {
    let s = String::from("Hello");

    thread::scope(|scope| {
        scope.spawn(|| {
            println!("Length: {}", s.len());
        });
    });
}
  • The reason is that when the thread::scope function completes, all the threads are guaranteed to be joined, so they can return borrowed data.
  • Normal Rust borrowing rules apply: either one thread borrows the data mutably, or any number of threads borrow it immutably.
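As a rough illustration of the borrowing rules, the sketch below lets two scoped threads borrow disjoint halves of a slice immutably and return their results through join. The function name total_len is an assumption made for this example.

```rust
use std::thread;

/// Returns the combined length of all words, computed by two scoped threads
/// that each borrow one half of the slice.
fn total_len(words: &[String]) -> usize {
    let (left, right) = words.split_at(words.len() / 2);
    thread::scope(|scope| {
        // Both closures only borrow immutably, so they may run concurrently.
        let a = scope.spawn(|| left.iter().map(|w| w.len()).sum::<usize>());
        let b = scope.spawn(|| right.iter().map(|w| w.len()).sum::<usize>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let words =
        vec![String::from("Hello"), String::from("scoped"), String::from("threads")];
    println!("Total length: {}", total_len(&words));
    // The borrow has ended, so `words` is still usable here.
    println!("{words:?}");
}
```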

Channels

Rust channels have two parts: a Sender<T> and a Receiver<T>. The two parts are connected via the channel, but you only see the end-points.

use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    tx.send(10).unwrap();
    tx.send(20).unwrap();

    println!("Received: {:?}", rx.recv());
    println!("Received: {:?}", rx.recv());

    let tx2 = tx.clone();
    tx2.send(30).unwrap();
    println!("Received: {:?}", rx.recv());
}
  • mpsc stands for Multi-Producer, Single-Consumer. Sender and SyncSender implement Clone (so you can make multiple producers), but Receiver does not.
  • send() and recv() return Result. If they return Err, it means the counterpart Sender or Receiver has been dropped and the channel is closed.
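The closing behaviour can be observed directly. This is a minimal sketch with hypothetical helper names; it relies only on mpsc::channel, send, recv, and drop.

```rust
use std::sync::mpsc;

/// Returns true if recv delivers the buffered message and then reports an
/// error once every sender has been dropped.
fn recv_fails_after_senders_dropped() -> bool {
    let (tx, rx) = mpsc::channel::<i32>();
    let tx2 = tx.clone();
    tx.send(1).unwrap();
    // The channel closes only when all clones of the sender are gone.
    drop(tx);
    drop(tx2);
    let first = rx.recv();
    let second = rx.recv();
    first == Ok(1) && second.is_err()
}

/// Returns true if send reports an error after the receiver was dropped.
fn send_fails_after_receiver_dropped() -> bool {
    let (tx, rx) = mpsc::channel::<i32>();
    drop(rx);
    tx.send(1).is_err()
}

fn main() {
    println!("recv fails: {}", recv_fails_after_senders_dropped());
    println!("send fails: {}", send_fails_after_receiver_dropped());
}
```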

Unbounded Channels

You get an unbounded and asynchronous channel with mpsc::channel():

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let thread_id = thread::current().id();
        for i in 1..10 {
            tx.send(format!("Message {i}")).unwrap();
            println!("{thread_id:?}: sent Message {i}");
        }
        println!("{thread_id:?}: done");
    });
    thread::sleep(Duration::from_millis(100));

    for msg in rx.iter() {
        println!("Main: got {msg}");
    }
}

Bounded Channels

With bounded (synchronous) channels, send can block the current thread:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::sync_channel(3);

    thread::spawn(move || {
        let thread_id = thread::current().id();
        for i in 1..10 {
            tx.send(format!("Message {i}")).unwrap();
            println!("{thread_id:?}: sent Message {i}");
        }
        println!("{thread_id:?}: done");
    });
    thread::sleep(Duration::from_millis(100));

    for msg in rx.iter() {
        println!("Main: got {msg}");
    }
}
  • Calling send blocks the current thread until there is space in the channel for the new message. The thread can be blocked indefinitely if nobody reads from the channel.
  • A call to send aborts with an error (which is why it returns Result) if the channel is closed. A channel is closed when the receiver is dropped.
  • A bounded channel with a size of zero is called a “rendezvous channel”. Every send blocks the current thread until another thread calls recv.
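To see the capacity limit without blocking, one option is SyncSender::try_send, which returns an error instead of waiting when the buffer is full. The helper fill_channel below is a hypothetical name used only for this sketch.

```rust
use std::sync::mpsc::{self, TrySendError};

/// Sends values into a bounded channel until it is full and returns how many
/// messages were accepted. Nothing reads from the channel in the meantime.
fn fill_channel(capacity: usize) -> usize {
    // Keep the receiver alive so the channel stays open.
    let (tx, _rx) = mpsc::sync_channel::<usize>(capacity);
    let mut sent = 0;
    loop {
        match tx.try_send(sent) {
            Ok(()) => sent += 1,
            // A blocking send() would wait here instead of returning.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => panic!("receiver was dropped"),
        }
    }
    sent
}

fn main() {
    println!("Accepted {} messages before the channel was full", fill_channel(3));
}
```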

Send and Sync

How does Rust know to forbid shared access across threads? The answer is in two traits:

  • Send: a type T is Send if it is safe to move a T across a thread boundary.
  • Sync: a type T is Sync if it is safe to move a &T across a thread boundary.

Send and Sync are unsafe traits. The compiler automatically derives them for your types as long as they only contain Send and Sync types. You can also implement them manually when you know it is valid.

  • One can think of these traits as markers that the type has certain thread-safety properties.
  • They can be used in generic constraints like normal traits.
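As a small sketch of using the traits in generic constraints, the hypothetical helper run_on_thread below accepts only closures and results that are Send, mirroring the bounds that thread::spawn itself requires. The function assert_send_sync is likewise an illustrative name for a compile-time check.

```rust
use std::thread;

/// Runs `f` on a new thread and returns its result.
/// Both the closure and its result must be Send to cross the thread boundary.
fn run_on_thread<T, F>(f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::spawn(f).join().unwrap()
}

/// Compiles only for types that are both Send and Sync.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    assert_send_sync::<Vec<String>>();
    let doubled = run_on_thread(|| {
        vec![1, 2, 3].into_iter().map(|x| x * 2).collect::<Vec<i32>>()
    });
    println!("{doubled:?}");
}
```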

Send

A type T is Send if it is safe to move a T value to another thread.

The effect of moving ownership to another thread is that destructors will run in that thread. So the question is when you can allocate a value in one thread and deallocate it in another.

As an example, a connection to the SQLite library must only be accessed from a single thread.
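The point about destructors can be made visible with a type whose Drop implementation reports the thread it runs on. Tracked and dropped_on_other_thread are names invented for this sketch.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, ThreadId};

/// A value that reports which thread ran its destructor.
struct Tracked {
    report: Sender<ThreadId>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        // Ignore the error if nobody is listening any more.
        let _ = self.report.send(thread::current().id());
    }
}

/// Moves a value to another thread and returns true if its destructor ran on
/// a thread other than the caller's.
fn dropped_on_other_thread() -> bool {
    let (tx, rx) = mpsc::channel();
    let value = Tracked { report: tx };
    thread::spawn(move || {
        let _moved = value; // Dropped when this closure returns.
    })
    .join()
    .unwrap();
    rx.recv().unwrap() != thread::current().id()
}

fn main() {
    println!("Dropped on another thread: {}", dropped_on_other_thread());
}
```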

Sync

A type T is Sync if it is safe to access a T value from multiple threads at the same time.

More precisely, the definition is:

T is Sync if and only if &T is Send.

This statement is essentially a shorthand way of saying that if a type is thread-safe for shared use, it is also thread-safe to pass references to it across threads.

This is because if a type is Sync, it can be shared across multiple threads without the risk of data races or other synchronization issues. A reference to it is therefore safe to move to another thread, since the data it refers to can be accessed safely from any thread.
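A minimal sketch of the definition in action: AtomicUsize is Sync, so a shared reference to it is Send and can be captured by several scoped threads at once. The function name count_in_parallel is an assumption made for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Increments one shared counter from several threads and returns the total.
fn count_in_parallel(threads: usize, per_thread: usize) -> usize {
    let counter = AtomicUsize::new(0);
    thread::scope(|scope| {
        for _ in 0..threads {
            // Each closure captures `&counter`; this compiles because
            // AtomicUsize is Sync, which makes &AtomicUsize Send.
            scope.spawn(|| {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("Total: {}", count_in_parallel(4, 1000));
}
```

Replacing AtomicUsize with Cell<usize> should make the example fail to compile, since Cell is not Sync.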

Examples

Send + Sync

Most types you come across are Send + Sync:

  • i8, f32, bool, char, &str, etc.
  • (T1, T2), [T; N], &[T], struct { x: T }, etc.
  • String, Option<T>, Vec<T>, Box<T>, etc.
  • Arc<T>: explicitly thread-safe via an atomic reference count.
  • Mutex<T>: explicitly thread-safe via internal locking.
  • AtomicBool, AtomicU8, etc.: use special atomic instructions.

Generic types are typically Send + Sync when their type parameters are Send + Sync.

Send + !Sync

These types can be moved to other threads, but they are not thread-safe. Typically this is because of interior mutability:

  • mpsc::Sender<T> (on toolchains older than Rust 1.72; newer releases also implement Sync for it)
  • mpsc::Receiver<T>
  • Cell<T>
  • RefCell<T>

!Send + Sync

These types are thread-safe, but they cannot be moved to another thread:

  • MutexGuard<T: Sync>: Uses OS level primitives which must be deallocated on the thread which created them.

!Send + !Sync

These types are not thread-safe and cannot be moved to other threads:

  • Rc<T>: each Rc<T> holds a reference to an RcBox<T>, which contains a non-atomic reference count.
  • *const T, *mut T: Rust assumes that raw pointers may have special concurrency considerations.
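The classifications above can be checked by the compiler. The helpers is_send and is_sync are hypothetical; they exist only so that a call compiles exactly when the trait bound holds.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex, MutexGuard};

/// Compiles only if T is Send.
fn is_send<T: Send>() -> bool {
    true
}

/// Compiles only if T is Sync.
fn is_sync<T: Sync>() -> bool {
    true
}

/// Each call below is a compile-time check of one claim from the lists above.
fn check_examples() -> bool {
    is_send::<Arc<Mutex<Vec<i32>>>>()
        && is_sync::<Arc<Mutex<Vec<i32>>>>()
        && is_send::<Cell<i32>>()
        && is_sync::<MutexGuard<'static, i32>>()
}

fn main() {
    println!("All checks compiled: {}", check_examples());
    // Uncommenting any of these lines should produce a compile error:
    // is_sync::<Cell<i32>>();
    // is_send::<MutexGuard<'static, i32>>();
    // is_send::<std::rc::Rc<i32>>();
}
```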

Shared State

Rust uses the type system to enforce synchronization of shared data. This is primarily done via two types:

  • Arc<T>, an atomically reference-counted T: handles sharing between threads and takes care of deallocating T when the last reference is dropped.
  • Mutex<T>: ensures mutually exclusive access to the T value.

Arc

Arc<T> allows shared read-only access via Arc::clone:

use std::sync::Arc;
use std::thread;

fn main() {
    let v = Arc::new(vec![10, 20, 30]);
    let mut handles = Vec::new();
    for _ in 1..5 {
        let v = Arc::clone(&v);
        handles.push(thread::spawn(move || {
            let thread_id = thread::current().id();
            println!("{thread_id:?}: {v:?}");
        }));
    }

    handles.into_iter().for_each(|h| h.join().unwrap());
    println!("v: {v:?}");
}
  • Arc stands for “Atomic Reference Counted”, a thread-safe version of Rc that uses atomic operations.
  • Arc<T> implements Clone whether or not T does. It implements Send and Sync if and only if T implements both.
  • Arc::clone() has the cost of the atomic operations it executes; after that, using the T is free.
  • Beware of reference cycles: Arc does not use a garbage collector to detect them.
    • std::sync::Weak can help.
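The reference counting can be observed with Arc::strong_count, and a Weak pointer shows how to hold a non-owning handle that does not keep the value alive. The function name reference_counts is made up for this sketch.

```rust
use std::sync::{Arc, Weak};

/// Returns the strong count while shared, the strong count after one clone is
/// dropped, and whether a Weak handle can still be upgraded after the last
/// Arc is gone.
fn reference_counts() -> (usize, usize, bool) {
    let data = Arc::new(vec![10, 20, 30]);
    let shared = Arc::clone(&data);
    // A Weak handle does not contribute to the strong count.
    let weak: Weak<Vec<i32>> = Arc::downgrade(&data);

    let while_shared = Arc::strong_count(&data);
    drop(shared);
    let after_drop = Arc::strong_count(&data);

    // Dropping the last Arc deallocates the vector.
    drop(data);
    let still_alive = weak.upgrade().is_some();

    (while_shared, after_drop, still_alive)
}

fn main() {
    println!("{:?}", reference_counts());
}
```

Using Weak for back-references (for example from a child to its parent) is one common way to avoid the reference cycles mentioned above.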

Mutex

Mutex<T> ensures mutual exclusion and allows mutable access to T behind a read-only interface (another form of interior mutability):

use std::sync::Mutex;

fn main() {
    let v = Mutex::new(vec![10, 20, 30]);
    println!("v: {:?}", v.lock().unwrap());

    {
        let mut guard = v.lock().unwrap();
        guard.push(40);
    }

    println!("v: {:?}", v.lock().unwrap());
}

Notice how we have a blanket implementation impl<T: Send> Sync for Mutex<T>.

  • Mutex in Rust looks like a collection with just one element: the protected data.
    • It is not possible to forget to acquire the mutex before accessing the protected data.
  • You can get a &mut T from a Mutex<T> by taking the lock. The MutexGuard ensures that the &mut T does not outlive the lock being held.
  • Mutex<T> implements both Send and Sync if and only if T implements Send.
  • A read-write lock counterpart: RwLock.
  • Why does lock() return a Result?
    • If the thread that held the Mutex panicked, the Mutex becomes “poisoned” to signal that the data it protected might be in an inconsistent state. Calling lock() on a poisoned mutex fails with a PoisonError. You can call into_inner() on the error to recover the data regardless.
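Poisoning can be reproduced deliberately. In this sketch (the function name recover_from_poison is an assumption), a thread panics while holding the lock, and the main thread then recovers the data through PoisonError::into_inner.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex on purpose and returns whether it was poisoned together
/// with the data recovered from it.
fn recover_from_poison() -> (bool, Vec<i32>) {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let data2 = Arc::clone(&data);

    let result = thread::spawn(move || {
        let mut guard = data2.lock().unwrap();
        guard.push(4);
        // The guard is dropped during unwinding, which poisons the mutex.
        panic!("panicking while holding the lock");
    })
    .join();
    assert!(result.is_err());

    let poisoned = data.is_poisoned();
    let values = match data.lock() {
        Ok(guard) => guard.clone(),
        // into_inner() hands back the guard despite the poisoning.
        Err(poison_error) => poison_error.into_inner().clone(),
    };
    (poisoned, values)
}

fn main() {
    let (poisoned, values) = recover_from_poison();
    println!("poisoned={poisoned}, values={values:?}");
}
```

Whether recovering is appropriate depends on the invariants of the protected data; here the push completed before the panic, so the vector is intact.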

Example

Let us see Arc and Mutex in action:

use std::thread;
// use std::sync::{Arc, Mutex};

fn main() {
    let v = vec![10, 20, 30];
    let handle = thread::spawn(|| {
        v.push(10);
    });
    v.push(1000);

    handle.join().unwrap();
    println!("v: {v:?}");
}

Possible solution:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let v = Arc::new(Mutex::new(vec![10, 20, 30]));

    let v2 = Arc::clone(&v);
    let handle = thread::spawn(move || {
        let mut v2 = v2.lock().unwrap();
        v2.push(10);
    });

    {
        let mut v = v.lock().unwrap();
        v.push(1000);
    }

    handle.join().unwrap();

    println!("v: {v:?}");
}

Notable parts:

  • v is wrapped in both Arc and Mutex, because their concerns are orthogonal.
    • Wrapping a Mutex in an Arc is a common pattern to share mutable state between threads.
  • v: Arc<_> needs to be cloned as v2 before it can be moved into another thread. Note that move was added to the closure signature.
  • Blocks are introduced to narrow the scope of the MutexGuard as much as possible.
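The same Arc<Mutex<_>> pattern scales to any number of threads. The sketch below, with the hypothetical function parallel_count, shares one counter between several threads and keeps each lock as short as a single statement.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from several threads and returns the total.
fn parallel_count(threads: u32, increments: u32) -> u32 {
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own Arc pointing at the same Mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The temporary guard is released at the end of this statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("Total: {}", parallel_count(4, 1000));
}
```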

Exercises

Let us practice our new concurrency skills with:

  • Dining philosophers: a classic problem in concurrency.

  • Multi-threaded link checker: a larger project where you will use Cargo to download dependencies and then check links in parallel.

After looking at the exercises, you can look at the solutions provided.

Dining Philosophers

The dining philosophers problem is a classic problem in concurrency:

Five philosophers dine together at the same table. Each philosopher has their own place at the table. There is a fork between each plate. The dish served is a kind of spaghetti which has to be eaten with two forks. Each philosopher can only alternately think and eat. Moreover, a philosopher can only eat their spaghetti when they have both a left and a right fork. Thus two forks will only be available when their two nearest neighbors are thinking, not eating. After a philosopher finishes eating, they put down both forks.

For this exercise, you will need a local Cargo installation. Copy the code below to a file called src/main.rs, fill in the blanks, and check that cargo run does not deadlock:

use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

struct Fork;

struct Philosopher {
    name: String,
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

impl Philosopher {
    fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .unwrap();
    }

    fn eat(&self) {
        // Pick up forks...
        println!("{} is eating...", &self.name);
        thread::sleep(Duration::from_millis(10));
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];

fn main() {
    // Create forks

    // Create philosophers

    // Make each of them think and eat 100 times

    // Output their thoughts
}

You can use the following Cargo.toml:

[package]
name = "dining-philosophers"
version = "0.1.0"
edition = "2021"

Multi-threaded Link Checker

Let us use our new knowledge to create a multi-threaded link checker. It should start at a webpage and check that the links on the page are valid. It should recursively check other pages on the same domain and keep doing this until all pages have been validated.

For this, you will need an HTTP client such as reqwest. Create a new Cargo project and add reqwest as a dependency with:

cargo new link-checker
cd link-checker
cargo add --features blocking,rustls-tls reqwest

If cargo add fails with error: no such subcommand, edit the Cargo.toml file by hand. Add the dependencies listed below.

You will also need a way to find links. We can use scraper for that:

cargo add scraper

Finally, we will need some way of handling errors. We use thiserror for that:

cargo add thiserror

The cargo add calls will update the Cargo.toml file to look like this:

[package]
name = "link-checker"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
reqwest = { version = "0.11.12", features = ["blocking", "rustls-tls"] }
scraper = "0.13.0"
thiserror = "1.0.37"

You can now download the start page. Try with a small site such as https://www.google.org/.

Your src/main.rs file should look something like this:

use reqwest::blocking::Client;
use reqwest::Url;
use scraper::{Html, Selector};
use thiserror::Error;

#[derive(Error, Debug)]
enum Error {
    #[error("request error: {0}")]
    ReqwestError(#[from] reqwest::Error),
    #[error("bad http response: {0}")]
    BadResponse(String),
}

#[derive(Debug)]
struct CrawlCommand {
    url: Url,
    extract_links: bool,
}

fn visit_page(client: &Client, command: &CrawlCommand) -> Result<Vec<Url>, Error> {
    println!("Checking {:#}", command.url);
    let response = client.get(command.url.clone()).send()?;
    if !response.status().is_success() {
        return Err(Error::BadResponse(response.status().to_string()));
    }

    let mut link_urls = Vec::new();
    if !command.extract_links {
        return Ok(link_urls);
    }

    let base_url = response.url().to_owned();
    let body_text = response.text()?;
    let document = Html::parse_document(&body_text);

    let selector = Selector::parse("a").unwrap();
    let href_values = document
        .select(&selector)
        .filter_map(|element| element.value().attr("href"));
    for href in href_values {
        match base_url.join(href) {
            Ok(link_url) => {
                link_urls.push(link_url);
            }
            Err(err) => {
                println!("On {base_url:#}: ignored unparsable {href:?}: {err}");
            }
        }
    }
    Ok(link_urls)
}

fn main() {
    let client = Client::new();
    let start_url = Url::parse("https://www.google.org").unwrap();
    let crawl_command = CrawlCommand{ url: start_url, extract_links: true };
    match visit_page(&client, &crawl_command) {
        Ok(links) => println!("Links: {links:#?}"),
        Err(err) => println!("Could not extract links: {err:#}"),
    }
}

Run the code in src/main.rs with

cargo run

Tasks

  • Use threads to check the links in parallel: send the URLs to be checked to a channel and let a few threads check the URLs in parallel.
  • Extend this to recursively extract links from all pages on the www.google.org domain. Put an upper limit of about 100 pages so that you don't end up being blocked by the site.

Concurrency: Morning Exercises

Dining Philosophers

(back to exercise)

use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

struct Fork;

struct Philosopher {
    name: String,
    left_fork: Arc<Mutex<Fork>>,
    right_fork: Arc<Mutex<Fork>>,
    thoughts: mpsc::SyncSender<String>,
}

impl Philosopher {
    fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .unwrap();
    }

    fn eat(&self) {
        println!("{} is trying to eat", &self.name);
        let _left = self.left_fork.lock().unwrap();
        let _right = self.right_fork.lock().unwrap();

        println!("{} is eating...", &self.name);
        thread::sleep(Duration::from_millis(10));
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];

fn main() {
    let (tx, rx) = mpsc::sync_channel(10);

    let forks = (0..PHILOSOPHERS.len())
        .map(|_| Arc::new(Mutex::new(Fork)))
        .collect::<Vec<_>>();

    for i in 0..forks.len() {
        let tx = tx.clone();
        let mut left_fork = Arc::clone(&forks[i]);
        let mut right_fork = Arc::clone(&forks[(i + 1) % forks.len()]);

        // To avoid a deadlock, we have to break the symmetry
        // somewhere. This will swap the forks without deinitializing
        // either of them.
        if i == forks.len() - 1 {
            std::mem::swap(&mut left_fork, &mut right_fork);
        }

        let philosopher = Philosopher {
            name: PHILOSOPHERS[i].to_string(),
            thoughts: tx,
            left_fork,
            right_fork,
        };

        thread::spawn(move || {
            for _ in 0..100 {
                philosopher.eat();
                philosopher.think();
            }
        });
    }

    drop(tx);
    for thought in rx {
        println!("{thought}");
    }
}
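
The solution above breaks the symmetry by swapping the forks of one philosopher. A closely related way to express the same idea is to number the forks and always lock the lower-numbered one first. The following is only a sketch of that variation under stated assumptions: the `dine` function, the per-fork usage counters, and the returned meal counts are hypothetical and exist so that progress can be observed; they are not part of the exercise.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Runs `philosophers` threads that each eat `meals` times and returns how
/// many meals each one finished. Assumes at least two philosophers, so that
/// every philosopher sits between two distinct forks.
fn dine(philosophers: usize, meals: usize) -> Vec<usize> {
    // Each fork stores how often it was used (illustrative only).
    let forks: Vec<Arc<Mutex<usize>>> =
        (0..philosophers).map(|_| Arc::new(Mutex::new(0))).collect();

    let handles: Vec<_> = (0..philosophers)
        .map(|i| {
            let left = i;
            let right = (i + 1) % philosophers;
            // Lock ordering: always take the lower-numbered fork first, so a
            // cycle of philosophers waiting on each other cannot form.
            let (low, high) = if left < right { (left, right) } else { (right, left) };
            let first = Arc::clone(&forks[low]);
            let second = Arc::clone(&forks[high]);
            thread::spawn(move || {
                let mut eaten = 0usize;
                for _ in 0..meals {
                    let mut a = first.lock().unwrap();
                    let mut b = second.lock().unwrap();
                    *a += 1;
                    *b += 1;
                    eaten += 1;
                    // Both guards are dropped here, releasing the forks.
                }
                eaten
            })
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let eaten = dine(5, 100);
    println!("meals eaten per philosopher: {eaten:?}");
}
```

For five philosophers this ends up equivalent to the swap in the solution: only the last philosopher, who sits between fork 4 and fork 0, takes the forks in the opposite order.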

Link Checker

(back to exercise)

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

use reqwest::blocking::Client;
use reqwest::Url;
use scraper::{Html, Selector};
use thiserror::Error;

#[derive(Error, Debug)]
enum Error {
    #[error("request error: {0}")]
    ReqwestError(#[from] reqwest::Error),
    #[error("bad http response: {0}")]
    BadResponse(String),
}

#[derive(Debug)]
struct CrawlCommand {
    url: Url,
    extract_links: bool,
}

fn visit_page(client: &Client, command: &CrawlCommand) -> Result<Vec<Url>, Error> {
    println!("Checking {:#}", command.url);
    let response = client.get(command.url.clone()).send()?;
    if !response.status().is_success() {
        return Err(Error::BadResponse(response.status().to_string()));
    }

    let mut link_urls = Vec::new();
    if !command.extract_links {
        return Ok(link_urls);
    }

    let base_url = response.url().to_owned();
    let body_text = response.text()?;
    let document = Html::parse_document(&body_text);

    let selector = Selector::parse("a").unwrap();
    let href_values = document
        .select(&selector)
        .filter_map(|element| element.value().attr("href"));
    for href in href_values {
        match base_url.join(href) {
            Ok(link_url) => {
                link_urls.push(link_url);
            }
            Err(err) => {
                println!("On {base_url:#}: ignored unparsable {href:?}: {err}");
            }
        }
    }
    Ok(link_urls)
}

struct CrawlState {
    domain: String,
    visited_pages: std::collections::HashSet<String>,
}

impl CrawlState {
    fn new(start_url: &Url) -> CrawlState {
        let mut visited_pages = std::collections::HashSet::new();
        visited_pages.insert(start_url.as_str().to_string());
        CrawlState { domain: start_url.domain().unwrap().to_string(), visited_pages }
    }

    /// Determine whether links within the given page should be extracted.
    fn should_extract_links(&self, url: &Url) -> bool {
        let Some(url_domain) = url.domain() else {
            return false;
        };
        url_domain == self.domain
    }

    /// Mark the given page as visited, returning false if it had already
    /// been visited.
    fn mark_visited(&mut self, url: &Url) -> bool {
        self.visited_pages.insert(url.as_str().to_string())
    }
}

type CrawlResult = Result<Vec<Url>, (Url, Error)>;
fn spawn_crawler_threads(
    command_receiver: mpsc::Receiver<CrawlCommand>,
    result_sender: mpsc::Sender<CrawlResult>,
    thread_count: u32,
) {
    let command_receiver = Arc::new(Mutex::new(command_receiver));

    for _ in 0..thread_count {
        let result_sender = result_sender.clone();
        let command_receiver = command_receiver.clone();
        thread::spawn(move || {
            let client = Client::new();
            loop {
                let command_result = {
                    let receiver_guard = command_receiver.lock().unwrap();
                    receiver_guard.recv()
                };
                let Ok(crawl_command) = command_result else {
                    // The sender got dropped. No more commands coming in.
                    break;
                };
                let crawl_result = match visit_page(&client, &crawl_command) {
                    Ok(link_urls) => Ok(link_urls),
                    Err(error) => Err((crawl_command.url, error)),
                };
                result_sender.send(crawl_result).unwrap();
            }
        });
    }
}

fn control_crawl(
    start_url: Url,
    command_sender: mpsc::Sender<CrawlCommand>,
    result_receiver: mpsc::Receiver<CrawlResult>,
) -> Vec<Url> {
    let mut crawl_state = CrawlState::new(&start_url);
    let start_command = CrawlCommand { url: start_url, extract_links: true };
    command_sender.send(start_command).unwrap();
    let mut pending_urls = 1;

    let mut bad_urls = Vec::new();
    while pending_urls > 0 {
        let crawl_result = result_receiver.recv().unwrap();
        pending_urls -= 1;

        match crawl_result {
            Ok(link_urls) => {
                for url in link_urls {
                    if crawl_state.mark_visited(&url) {
                        let extract_links = crawl_state.should_extract_links(&url);
                        let crawl_command = CrawlCommand { url, extract_links };
                        command_sender.send(crawl_command).unwrap();
                        pending_urls += 1;
                    }
                }
            }
            Err((url, error)) => {
                bad_urls.push(url);
                println!("Got crawling error: {:#}", error);
                continue;
            }
        }
    }
    bad_urls
}

fn check_links(start_url: Url) -> Vec<Url> {
    let (result_sender, result_receiver) = mpsc::channel::<CrawlResult>();
    let (command_sender, command_receiver) = mpsc::channel::<CrawlCommand>();
    spawn_crawler_threads(command_receiver, result_sender, 16);
    control_crawl(start_url, command_sender, result_receiver)
}

fn main() {
    let start_url = reqwest::Url::parse("https://www.google.org").unwrap();
    let bad_urls = check_links(start_url);
    println!("Bad URLs: {:#?}", bad_urls);
}
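
The core pattern in this solution is several worker threads pulling commands from one shared `mpsc::Receiver`. It can be easier to see in isolation, so here is a minimal sketch using only the standard library. It squares numbers instead of fetching pages; the `square_all` function and the choice of workload are assumptions made purely for illustration.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Squares every input on a pool of `workers` threads and returns the
/// results sorted, since completion order is not deterministic.
fn square_all(inputs: Vec<u64>, workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<u64>();
    let (result_tx, result_rx) = mpsc::channel::<u64>();
    // `Receiver` is not `Clone`, so the workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    for _ in 0..workers {
        let job_rx = Arc::clone(&job_rx);
        let result_tx = result_tx.clone();
        thread::spawn(move || loop {
            // The guard is a temporary, so the lock is released at the end
            // of this statement and is not held while working.
            let job = job_rx.lock().unwrap().recv();
            match job {
                Ok(n) => result_tx.send(n * n).unwrap(),
                Err(_) => break, // All senders dropped: no more jobs.
            }
        });
    }
    // Drop the original sender so `result_rx` ends once all workers exit.
    drop(result_tx);

    for n in inputs {
        job_tx.send(n).unwrap();
    }
    drop(job_tx); // Signal the workers that no more jobs are coming.

    let mut results: Vec<u64> = result_rx.iter().collect();
    results.sort_unstable();
    results
}

fn main() {
    println!("{:?}", square_all(vec![3, 1, 2], 4));
}
```

Unlike the link checker, the set of jobs here is known up front, so dropping the sender is enough to shut the pool down; the crawler instead counts pending URLs because new work is discovered while it runs.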

Async Rust

“Async” is a concurrency model where multiple tasks are executed concurrently by running each task until it would block, then switching to another task that is ready to make progress. The model allows running a larger number of tasks on a limited number of threads. This is because the per-task overhead is typically very low and operating systems provide primitives for efficiently identifying I/O that is able to proceed.

Rust's asynchronous operation is based on “futures”, which represent work that may be completed in the future. Futures are “polled” until they signal that they are complete.

Futures are polled by an async runtime, and several different runtimes are available.

Comparisons

  • Python has a similar model in its asyncio. However, its Future type is callback-based, and not polled. Async Python programs require a “loop”, similar to a runtime in Rust.

  • JavaScript's Promise is similar, but again callback-based. The language runtime implements the event loop, so many of the details of Promise resolution are hidden.

async/await

At a high level, async Rust code looks very much like “normal” sequential code:

use futures::executor::block_on;

async fn count_to(count: i32) {
    for i in 1..=count {
        println!("Count is: {i}!");
    }
}

async fn async_main(count: i32) {
    count_to(count).await;
}

fn main() {
    block_on(async_main(10));
}

Key points:

  • Note that this is a simplified example to show the syntax. There is no long-running operation or any real concurrency in it.

  • What is the return type of an async call?

    • Use let future: () = async_main(10); in main to see the type.

  • The “async” keyword is syntactic sugar. The compiler replaces the return type with a future.

  • You cannot make main async without additional instructions to the compiler on how to use the returned future.

  • You need an executor to run async code. block_on blocks the current thread until the provided future has run to completion.

  • .await asynchronously waits for the completion of another operation. Unlike block_on, .await doesn't block the current thread.

  • .await can only be used inside an async function (or block; these are introduced later).
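
To make the role of the executor less mysterious, here is a minimal sketch of what a function like block_on might do, written with the standard library only. It is not how futures::executor::block_on is actually implemented; `ThreadWaker`, `add`, and `sum_of_sums` are names invented for this illustration.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread by unparking it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Polls `future` on the current thread until it completes.
fn block_on<F: Future>(future: F) -> F::Output {
    // Futures must be pinned before they can be polled.
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until the waker is invoked, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

async fn add(a: i32, b: i32) -> i32 {
    a + b
}

async fn sum_of_sums() -> i32 {
    add(1, 2).await + add(3, 4).await
}

fn main() {
    println!("result: {}", block_on(sum_of_sums()));
}
```

Calling `sum_of_sums()` only builds a future; nothing is computed until `block_on` polls it.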

Future

Future is a trait, implemented by objects that represent an operation that may not be complete yet. A future can be polled, and poll returns a Poll.

#![allow(unused)]
fn main() {
use std::pin::Pin;
use std::task::Context;

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

pub enum Poll<T> {
    Ready(T),
    Pending,
}
}

An async function returns an impl Future. It's also possible (but uncommon) to implement Future for your own types. For example, the JoinHandle returned from tokio::spawn implements Future to allow joining to it.

The .await keyword, applied to a Future, causes the current async function to pause until that Future is ready, and then evaluates to its output.

  • The Future and Poll types are implemented exactly as shown; click the links to show the implementations in the docs.

  • We will not get to Pin and Context, as we will focus on writing async code rather than building new async primitives. Briefly:

    • Context allows a Future to schedule itself to be polled again when an event occurs.

    • Pin ensures that the Future isn't moved in memory, so that pointers into that future remain valid. This is required to allow references to remain valid after an .await.
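
As a concrete illustration of implementing Future for your own type, the sketch below defines a hypothetical `CountDown` future and polls it by hand. The `NoopWaker` and `poll_to_completion` helpers are assumptions made for this example; a real program would hand the future to a runtime instead of busy-polling it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that reports `Pending` a fixed number of times before completing.
struct CountDown {
    remaining: u32,
}

impl Future for CountDown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("liftoff");
        }
        self.remaining -= 1;
        // Ask to be polled again. A real future would instead arrange for
        // the waker to be called when an external event (timer, I/O) occurs.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// A waker that does nothing, which is enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` in a busy loop and returns (number of polls, output).
fn poll_to_completion<F: Future + Unpin>(mut future: F) -> (u32, F::Output) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(output) = Pin::new(&mut future).poll(&mut cx) {
            return (polls, output);
        }
    }
}

fn main() {
    let (polls, output) = poll_to_completion(CountDown { remaining: 3 });
    println!("{output} after {polls} polls");
}
```

`CountDown` contains only a `u32`, so it is `Unpin` and can be pinned with the safe `Pin::new`; futures generated from async blocks generally are not.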

Runtimes

A runtime provides support for performing operations asynchronously (a reactor) and is responsible for executing futures (an executor). Rust does not have a “built-in” runtime, but several options are available:

  • Tokio: performant, with a well-developed ecosystem of functionality like Hyper for HTTP or Tonic for gRPC.
  • async-std: aims to be a “std for async”, and includes a basic runtime in async_std::task.
  • smol: simple and lightweight.

Several larger applications have their own runtimes. For example, Fuchsia already has one.

  • Note that of the listed runtimes, only Tokio is supported in the Rust playground. The playground also does not permit any I/O, so most interesting async things can't run in the playground.

  • Futures are “inert” in that they do not do anything (not even start an I/O operation) unless there is an executor polling them. This differs from JavaScript Promises, for example, which will run to completion even if they are never used.
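
The “inert” behavior can be observed directly. The following sketch, which assumes nothing beyond the standard library, creates a future from an async function and checks a flag before and after the first poll. The names `set_flag`, `inert_demo`, and `NoopWaker` are made up for this example.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// A waker that does nothing, which is enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

async fn set_flag(flag: &Cell<bool>) {
    flag.set(true);
}

/// Returns the flag's value right after creating the future and again
/// after polling it once.
fn inert_demo() -> (bool, bool) {
    let flag = Cell::new(false);

    // Calling the async function only constructs the future.
    let future = set_flag(&flag);
    let before_poll = flag.get();

    // Polling is what actually runs the body.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    let poll = future.as_mut().poll(&mut cx);
    assert!(poll.is_ready());

    (before_poll, flag.get())
}

fn main() {
    let (before, after) = inert_demo();
    println!("flag before poll: {before}, after poll: {after}");
}
```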

Tokio

Tokio provides:

  • A multi-threaded runtime for executing asynchronous code.
  • An asynchronous version of the standard library.
  • A large ecosystem of libraries.

use tokio::time;

async fn count_to(count: i32) {
    for i in 1..=count {
        println!("Count in task: {i}!");
        time::sleep(time::Duration::from_millis(5)).await;
    }
}

#[tokio::main]
async fn main() {
    tokio::spawn(count_to(10));

    for i in 1..5 {
        println!("Main task: {i}");
        time::sleep(time::Duration::from_millis(5)).await;
    }
}

  • With the tokio::main macro we can now make main async.

  • The spawn function creates a new, concurrent “task”.

  • Note: spawn takes a Future; you don't call .await on count_to.

Further exploration:

  • Why does count_to not (usually) get to 10? This is an example of async cancellation. tokio::spawn returns a handle which can be awaited to wait until the task finishes.

  • Try count_to(10).await instead of spawning.

  • Try awaiting the task returned from tokio::spawn.

Tasks

Rust has a task system, which is a form of lightweight threading.

A task has a single top-level future which the executor polls to make progress. That future may have one or more nested futures that its poll method polls, corresponding loosely to a call stack. Concurrency within a task is possible by polling multiple child futures, such as racing a timer and an I/O operation.

use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:0").await?;
    println!("listening on port {}", listener.local_addr()?.port());

    loop {
        let (mut socket, addr) = listener.accept().await?;

        println!("connection from {addr:?}");

        tokio::spawn(async move {
            socket.write_all(b"Who are you?\n").await.expect("socket error");

            let mut buf = vec![0; 1024];
            let name_size = socket.read(&mut buf).await.expect("socket error");
            let name = std::str::from_utf8(&buf[..name_size]).unwrap().trim();
            let reply = format!("Thanks for dialing in, {name}!\n");
            socket.write_all(reply.as_bytes()).await.expect("socket error");
        });
    }
}

Copy this example into your prepared src/main.rs and run it from there.

Try connecting to it with a TCP connection tool like nc or telnet.

  • Ask students to visualize what the state of the example server would be with a few connected clients. What tasks exist? What are their futures?

  • This is the first time we’ve seen an async block. This is similar to a closure, but does not take any arguments. Its return value is a Future, similar to an async fn.

  • Refactor the async block into a function, and improve the error handling using ?.

Async Channels

Several crates have support for asynchronous channels. For instance tokio:

use tokio::sync::mpsc::{self, Receiver};

async fn ping_handler(mut input: Receiver<()>) {
    let mut count: usize = 0;

    while let Some(_) = input.recv().await {
        count += 1;
        println!("Received {count} pings so far.");
    }

    println!("ping_handler complete");
}

#[tokio::main]
async fn main() {
    let (sender, receiver) = mpsc::channel(32);
    let ping_handler_task = tokio::spawn(ping_handler(receiver));
    for i in 0..10 {
        sender.send(()).await.expect("Failed to send ping.");
        println!("Sent {} pings so far.", i + 1);
    }

    drop(sender);
    ping_handler_task.await.expect("Something went wrong in ping handler task.");
}

  • Change the channel size to 3 and see how it affects the execution.

  • Overall, the interface is similar to the sync channels seen in the morning class.

  • Try removing the std::mem::drop call. What happens? Why?

  • The Flume crate has channels that implement both sync and async send and recv. This can be convenient for complex applications with both I/O and heavy CPU processing tasks.

  • What makes working with async channels preferable is the ability to combine them with other futures to create complex control flow.
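
The effect of the channel size can be explored without a runtime. As a rough analogy only, the sketch below uses the bounded channel from the standard library (std::sync::mpsc::sync_channel) rather than tokio's channel, and counts how many messages fit before the channel reports that it is full. The `sends_until_full` helper is hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Fills a bounded channel of the given capacity while nobody is receiving
/// and returns how many sends were accepted before it reported `Full`.
fn sends_until_full(capacity: usize) -> usize {
    // `_receiver` is kept alive so the channel stays connected.
    let (sender, _receiver) = sync_channel::<()>(capacity);
    let mut accepted = 0;
    loop {
        match sender.try_send(()) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(())) => return accepted,
            Err(TrySendError::Disconnected(())) => {
                unreachable!("the receiver is still alive")
            }
        }
    }
}

fn main() {
    println!("a channel of size 3 accepted {} pings", sends_until_full(3));
}
```

In the async version, a full channel does not produce an error from send: the sending task is suspended at the .await until the receiver makes room, which is why shrinking the buffer changes how the "Sent" and "Received" lines interleave.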

Futures Control Flow

Futures can be combined together to produce concurrent compute flow graphs. We have already seen tasks, which function as independent threads of execution.

Join

A join operation waits until all of a set of futures are ready, and returns a collection of their results. This is similar to Promise.all in JavaScript or asyncio.gather in Python.

use anyhow::Result;
use futures::future;
use reqwest;
use std::collections::HashMap;

async fn size_of_page(url: &str) -> Result<usize> {
    let resp = reqwest::get(url).await?;
    Ok(resp.text().await?.len())
}

#[tokio::main]
async fn main() {
    let urls: [&str; 4] = [
        "https://google.com",
        "https://httpbin.org/ip",
        "https://play.rust-lang.org/",
        "BAD_URL",
    ];
    let futures_iter = urls.into_iter().map(size_of_page);
    let results = future::join_all(futures_iter).await;
    let page_sizes_dict: HashMap<&str, Result<usize>> =
        urls.into_iter().zip(results.into_iter()).collect();
    println!("{:?}", page_sizes_dict);
}

Copy this example into your prepared src/main.rs and run it from there.

  • For multiple futures of disjoint types, you can use a join! macro, but you must know how many futures you will have at compile time. The macro is currently provided by the futures crate (futures::join!); std::future::join! is not yet stable.

  • The risk of join is that one of the futures may never resolve, which would cause your program to stall.

  • You can also combine join_all with join!, for instance to join all requests to an HTTP service as well as a database query. Try adding a tokio::time::sleep to the future, using futures::join!. This is not a timeout (that requires select!, explained in the next chapter), but demonstrates join!.
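
To illustrate what a join operation does conceptually, here is a small sketch that polls a set of futures in turn until every one has produced a result. It is not how futures::future::join_all is implemented: it busy-polls with a no-op waker instead of sleeping until woken, and the `Delayed` future and `join_all_sketch` function are invented stand-ins for real I/O.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing, which is enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Completes with `value` after being polled `delay` times.
struct Delayed {
    delay: u32,
    value: u32,
}

impl Future for Delayed {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        if self.delay == 0 {
            Poll::Ready(self.value)
        } else {
            self.delay -= 1;
            Poll::Pending
        }
    }
}

/// Polls every unfinished future in turn until all have produced a result.
/// Results are returned in input order, regardless of completion order.
fn join_all_sketch<F: Future + Unpin>(mut futures: Vec<F>) -> Vec<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut results: Vec<Option<F::Output>> = futures.iter().map(|_| None).collect();

    while results.iter().any(Option::is_none) {
        for (future, slot) in futures.iter_mut().zip(results.iter_mut()) {
            // A completed future must not be polled again.
            if slot.is_none() {
                if let Poll::Ready(output) = Pin::new(future).poll(&mut cx) {
                    *slot = Some(output);
                }
            }
        }
    }
    results.into_iter().map(Option::unwrap).collect()
}

fn main() {
    let futures = vec![
        Delayed { delay: 3, value: 10 },
        Delayed { delay: 0, value: 20 },
        Delayed { delay: 1, value: 30 },
    ];
    println!("{:?}", join_all_sketch(futures));
}
```

The loop also makes the stall risk visible: if any one future never becomes ready, the `while` condition never turns false.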

Select

A select operation waits until any of a set of futures is ready, and responds to that future's result. In JavaScript, this is similar to Promise.race. In Python, it compares to asyncio.wait(task_set, return_when=asyncio.FIRST_COMPLETED).

Similar to a match statement, the body of select! has a number of arms, each of the form pattern = future => statement. When a future is ready, its return value is destructured by the pattern. The statement is then run with the resulting variables. The statement result becomes the result of the select! macro.

use tokio::sync::mpsc::{self, Receiver};
use tokio::time::{sleep, Duration};

#[derive(Debug, PartialEq)]
enum Animal {
    Cat { name: String },
    Dog { name: String },
}

async fn first_animal_to_finish_race(
    mut cat_rcv: Receiver<String>,
    mut dog_rcv: Receiver<String>,
) -> Option<Animal> {
    tokio::select! {
        cat_name = cat_rcv.recv() => Some(Animal::Cat { name: cat_name? }),
        dog_name = dog_rcv.recv() => Some(Animal::Dog { name: dog_name? })
    }
}

#[tokio::main]
async fn main() {
    let (cat_sender, cat_receiver) = mpsc::channel(32);
    let (dog_sender, dog_receiver) = mpsc::channel(32);
    tokio::spawn(async move {
        sleep(Duration::from_millis(500)).await;
        cat_sender.send(String::from("Felix")).await.expect("Failed to send cat.");
    });
    tokio::spawn(async move {
        sleep(Duration::from_millis(50)).await;
        dog_sender.send(String::from("Rex")).await.expect("Failed to send dog.");
    });

    let winner = first_animal_to_finish_race(cat_receiver, dog_receiver)
        .await
        .expect("Failed to receive winner");

    println!("Winner is {winner:?}");
}

  • In this example, we have a race between a cat and a dog. first_animal_to_finish_race listens to both channels and picks whichever arrives first. Since the dog takes 50ms, it wins against the cat, which takes 500ms.

  • You can use oneshot channels in this example, as the channels are supposed to receive only one send.

  • Try adding a deadline to the race, demonstrating selecting different sorts of futures.

  • Note that select! drops unmatched branches, which cancels their futures. It is easiest to use when every execution of select! creates new futures.

    • An alternative is to pass &mut future instead of the future itself, but this can lead to issues, further discussed in the pinning slide.
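
The core idea of select, racing futures and dropping the loser, can be sketched without a runtime. The code below is a simplification under stated assumptions: it busy-polls with a no-op waker and always checks the first future before the second, whereas tokio::select! by default picks randomly among ready branches. `Runner`, `Either`, and `select_sketch` are names invented for this example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing, which is enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Completes with `name` after being polled `delay` times.
struct Runner {
    name: &'static str,
    delay: u32,
}

impl Future for Runner {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.delay == 0 {
            Poll::Ready(self.name)
        } else {
            self.delay -= 1;
            Poll::Pending
        }
    }
}

/// Which of the two raced futures finished first, with its output.
#[derive(Debug, PartialEq)]
enum Either<A, B> {
    Left(A),
    Right(B),
}

/// Polls both futures in turn and returns the output of the first one that
/// is ready. The other future is dropped on return, which cancels it.
fn select_sketch<A, B>(mut a: A, mut b: B) -> Either<A::Output, B::Output>
where
    A: Future + Unpin,
    B: Future + Unpin,
{
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = Pin::new(&mut a).poll(&mut cx) {
            return Either::Left(output);
        }
        if let Poll::Ready(output) = Pin::new(&mut b).poll(&mut cx) {
            return Either::Right(output);
        }
    }
}

fn main() {
    let cat = Runner { name: "Felix", delay: 5 };
    let dog = Runner { name: "Rex", delay: 1 };
    println!("Winner is {:?}", select_sketch(cat, dog));
}
```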

Pitfalls of async/await

Async/await provides a convenient and efficient abstraction for concurrent asynchronous programming. However, the async/await model in Rust also comes with its share of pitfalls and footguns. We illustrate some of them in this chapter:

Blocking the executor

Most async runtimes only allow I/O tasks to run concurrently. This means that CPU-blocking tasks will block the executor and prevent other tasks from being executed. An easy workaround is to use async equivalent methods where possible.

use futures::future::join_all;
use std::time::Instant;

async fn sleep_ms(start: &Instant, id: u64, duration_ms: u64) {
    std::thread::sleep(std::time::Duration::from_millis(duration_ms));
    println!(
        "future {id} slept for {duration_ms}ms, finished after {}ms",
        start.elapsed().as_millis()
    );
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let start = Instant::now();
    let sleep_futures = (1..=10).map(|t| sleep_ms(&start, t, t * 10));
    join_all(sleep_futures).await;
}

  • Run the code and see that the sleeps happen consecutively rather than concurrently.

  • The "current_thread" flavor puts all tasks on a single thread. This makes the effect more obvious, but the bug is still present in the multi-threaded flavor.

  • Switch the std::thread::sleep to tokio::time::sleep and await its result.

  • Another fix would be to use tokio::task::spawn_blocking, which spawns an actual thread and transforms its handle into a future without blocking the executor.

  • You should not think of tasks as OS threads. They do not map 1 to 1 and most executors will allow many tasks to run on a single OS thread. This is particularly problematic when interacting with other libraries via FFI, where that library might depend on thread-local storage or map to specific OS threads (e.g., CUDA). Prefer tokio::task::spawn_blocking in such situations.

  • Use sync mutexes with care. Holding a mutex over an .await may cause another task to block, and that task may be running on the same thread.
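
To show the idea behind offloading blocking work, here is a sketch that runs a closure on a separate thread and exposes the result as a future. It is an illustration using only the standard library, not the implementation of tokio::task::spawn_blocking; `run_blocking`, `BlockingTask`, `Shared`, `ThreadWaker`, and the small `block_on` are all hypothetical helpers.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// State shared between the worker thread and the future awaiting it.
struct Shared<T> {
    result: Option<T>,
    waker: Option<Waker>,
}

/// Future that completes once the worker thread has stored its result.
struct BlockingTask<T> {
    shared: Arc<Mutex<Shared<T>>>,
}

/// Runs `work` on its own thread so it cannot stall the polling thread.
fn run_blocking<T, F>(work: F) -> BlockingTask<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker_shared = Arc::clone(&shared);
    thread::spawn(move || {
        let value = work(); // May block for as long as it likes.
        let mut guard = worker_shared.lock().unwrap();
        guard.result = Some(value);
        if let Some(waker) = guard.waker.take() {
            waker.wake(); // Tell the executor to poll the task again.
        }
    });
    BlockingTask { shared }
}

impl<T> Future for BlockingTask<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut guard = self.shared.lock().unwrap();
        match guard.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Remember who to wake when the worker finishes.
                guard.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Wakes the blocked thread by unparking it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: park until woken, then poll again.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

async fn compute() -> u64 {
    let slow = run_blocking(|| {
        thread::sleep(Duration::from_millis(20));
        40u64
    });
    let fast = run_blocking(|| 2u64);
    slow.await + fast.await
}

fn main() {
    println!("result: {}", block_on(compute()));
}
```

Both worker threads start as soon as `run_blocking` is called, so the fast one finishes while the slow one is still sleeping; the polling thread is parked rather than blocked inside the sleep.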

Pin

Async blocks and functions return types implementing the Future trait. The type returned is the result of a compiler transformation which turns local variables into data stored inside the future.

Some of those variables can hold pointers to other local variables. Because of that, the future should never be moved to a different memory location, as it would invalidate those pointers.

To prevent moving the future type in memory, it can only be polled through a pinned pointer. Pin is a wrapper around a reference that disallows all operations that would move the instance it points to into a different memory location.

use tokio::sync::{mpsc, oneshot};
use tokio::task::spawn;
use tokio::time::{sleep, Duration};

// A work item. In this case, just sleep for the given time and respond
// with a message on the `respond_on` channel.
#[derive(Debug)]
struct Work {
    input: u32,
    respond_on: oneshot::Sender<u32>,
}

// A worker which listens for work on a queue and performs it.
async fn worker(mut work_queue: mpsc::Receiver<Work>) {
    let mut iterations = 0;
    loop {
        tokio::select! {
            Some(work) = work_queue.recv() => {
                sleep(Duration::from_millis(10)).await; // Pretend to work.
                work.respond_on
                    .send(work.input * 1000)
                    .expect("failed to send response");
                iterations += 1;
            }
            // TODO: report number of iterations every 100ms
        }
    }
}

// A requester which requests work and waits for it to complete.
async fn do_work(work_queue: &mpsc::Sender<Work>, input: u32) -> u32 {
    let (tx, rx) = oneshot::channel();
    work_queue
        .send(Work { input, respond_on: tx })
        .await
        .expect("failed to send on work queue");
    rx.await.expect("failed waiting for response")
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(10);
    spawn(worker(rx));
    for i in 0..100 {
        let resp = do_work(&tx, i).await;
        println!("work result for iteration {i}: {resp}");
    }
}

  • You may recognize this as an example of the actor pattern. Actors typically call select! in a loop.

  • This serves as a summation of a few of the previous lessons, so take your time with it.

    • Naively add a _ = sleep(Duration::from_millis(100)) => { println!(..) } to the select!. This will never execute. Why?

    • Instead, add a timeout_fut containing that future outside of the loop:

      let mut timeout_fut = sleep(Duration::from_millis(100));
      loop {
          select! {
              ..,
              _ = timeout_fut => { println!(..); },
          }
      }
    • This still doesn't work. Follow the compiler errors, adding &mut to the timeout_fut in the select! to work around the move, then using Box::pin:

      let mut timeout_fut = Box::pin(sleep(Duration::from_millis(100)));
      loop {
          select! {
              ..,
              _ = &mut timeout_fut => { println!(..); },
          }
      }
    • This compiles, but once the timeout expires it is Poll::Ready on every iteration (a fused future would help with this). Update to reset timeout_fut every time it expires.

  • Box allocates on the heap. In some cases, std::pin::pin! (only recently stabilized, with older code often using tokio::pin!) is also an option, but that is difficult to use for a future that is reassigned.

  • Another alternative is to not use pin at all but spawn another task that will send to a oneshot channel every 100ms.

  • Data that contains pointers to itself is called self-referential. Normally, the Rust borrow checker would prevent self-referential data from being moved, as the references cannot outlive the data they point to. However, the code transformation for async blocks and functions is not verified by the borrow checker.

  • Pin is a wrapper around a reference. An object cannot be moved from its place using a pinned pointer. However, it can still be moved through an unpinned pointer.

  • The poll method of the Future trait uses Pin<&mut Self> instead of &mut Self to refer to the instance. That’s why it can only be called on a pinned pointer.
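
The reason Box::pin works in the snippets above can be checked with a small experiment: moving a Pin<Box<T>> moves only the pointer, while the pinned value stays at the same heap address. The sketch below assumes a hypothetical `NotMovable` type that opts out of Unpin, standing in for a compiler-generated future.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// A type that opts out of `Unpin`, like the futures generated for async
/// blocks and functions.
struct NotMovable {
    data: u32,
    _pin: PhantomPinned,
}

/// Returns the heap address of the pinned value.
fn address_of(value: &Pin<Box<NotMovable>>) -> usize {
    let inner: &NotMovable = &**value;
    inner as *const NotMovable as usize
}

/// Returns the pinned value's address before and after the `Pin<Box<_>>`
/// handle itself is moved into a vector.
fn addresses_around_move() -> (usize, usize) {
    let pinned = Box::pin(NotMovable { data: 7, _pin: PhantomPinned });
    let before = address_of(&pinned);

    // This moves the handle (a pointer), not the heap allocation.
    let holder = vec![pinned];
    let after = address_of(&holder[0]);

    // Shared access to the data still works through the pin.
    assert_eq!(holder[0].data, 7);
    (before, after)
}

fn main() {
    let (before, after) = addresses_around_move();
    println!("before: {before:#x}, after: {after:#x}");
}
```

What the pin rules out is obtaining a plain `&mut NotMovable` in safe code, which is what operations such as std::mem::swap would need in order to move the value out of its place.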

Async Traits

Async methods in traits were stabilized in Rust 1.75. However, a trait with a native async fn cannot yet be used as a trait object (dyn Trait), which the example below relies on.

The async_trait crate provides a workaround through a macro:

use async_trait::async_trait;
use std::time::Instant;
use tokio::time::{sleep, Duration};

#[async_trait]
trait Sleeper {
    async fn sleep(&self);
}

struct FixedSleeper {
    sleep_ms: u64,
}

#[async_trait]
impl Sleeper for FixedSleeper {
    async fn sleep(&self) {
        sleep(Duration::from_millis(self.sleep_ms)).await;
    }
}

async fn run_all_sleepers_multiple_times(
    sleepers: Vec<Box<dyn Sleeper>>,
    n_times: usize,
) {
    for _ in 0..n_times {
        println!("running all sleepers..");
        for sleeper in &sleepers {
            let start = Instant::now();
            sleeper.sleep().await;
            println!("slept for {}ms", start.elapsed().as_millis());
        }
    }
}

#[tokio::main]
async fn main() {
    let sleepers: Vec<Box<dyn Sleeper>> = vec![
        Box::new(FixedSleeper { sleep_ms: 50 }),
        Box::new(FixedSleeper { sleep_ms: 100 }),
    ];
    run_all_sleepers_multiple_times(sleepers, 5).await;
}
  • async_trait is easy to use, but note that it relies on heap allocations to do so. Each such allocation has a performance cost.

  • The language-support problems around async traits are complex and not worth describing in depth here. Niko Matsakis explains them well in this post, if you are interested in digging deeper.

  • Try creating a new sleeper struct that sleeps for a random amount of time and adding it to the Vec.
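
To make the heap allocation concrete, here is a rough sketch of the shape such a macro gives to an async method: an ordinary method that returns a boxed, pinned, type-erased future. This is an illustration under assumptions, not the macro's exact output. `BoxFuture`, `sleep_duration_ms`, `total_sleep_ms` and `NoopWaker` are hypothetical names, and `std::future::ready` stands in for the async body so that no runtime is needed.

```rust
use std::future::{ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical alias for a boxed, pinned, type-erased future.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hand-written equivalent of a trait with one async method.
trait Sleeper {
    fn sleep_duration_ms<'a>(&'a self) -> BoxFuture<'a, u64>;
}

struct FixedSleeper {
    sleep_ms: u64,
}

impl Sleeper for FixedSleeper {
    fn sleep_duration_ms<'a>(&'a self) -> BoxFuture<'a, u64> {
        // `Box::pin` allocates on the heap on every call.
        Box::pin(ready(self.sleep_ms))
    }
}

/// A waker that does nothing (hypothetical helper for manual polling).
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls each boxed future once and sums the values that are ready.
fn total_sleep_ms(sleepers: &[Box<dyn Sleeper>]) -> u64 {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut total = 0;
    for sleeper in sleepers {
        let mut fut = sleeper.sleep_duration_ms();
        if let Poll::Ready(ms) = fut.as_mut().poll(&mut cx) {
            total += ms;
        }
    }
    total
}
```

Because the return type is a concrete `Pin<Box<dyn Future>>`, the trait can be used as `Box<dyn Sleeper>`. The price is one allocation per call.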

Cancellation

Dropping a future means it can never be polled again. This is called cancellation, and it can happen at any await point. Care is needed to ensure the system works correctly even when futures are cancelled. For example, it shouldn't deadlock or lose data.

use std::io::{self, ErrorKind};
use std::time::Duration;
use tokio::io::{AsyncReadExt, AsyncWriteExt, DuplexStream};

struct LinesReader {
    stream: DuplexStream,
}

impl LinesReader {
    fn new(stream: DuplexStream) -> Self {
        Self { stream }
    }

    async fn next(&mut self) -> io::Result<Option<String>> {
        let mut bytes = Vec::new();
        let mut buf = [0];
        while self.stream.read(&mut buf[..]).await? != 0 {
            bytes.push(buf[0]);
            if buf[0] == b'\n' {
                break;
            }
        }
        if bytes.is_empty() {
            return Ok(None);
        }
        let s = String::from_utf8(bytes)
            .map_err(|_| io::Error::new(ErrorKind::InvalidData, "not UTF-8"))?;
        Ok(Some(s))
    }
}

async fn slow_copy(source: String, mut dest: DuplexStream) -> std::io::Result<()> {
    for b in source.bytes() {
        dest.write_u8(b).await?;
        tokio::time::sleep(Duration::from_millis(10)).await
    }
    Ok(())
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let (client, server) = tokio::io::duplex(5);
    let handle = tokio::spawn(slow_copy("hi\nthere\n".to_owned(), client));

    let mut lines = LinesReader::new(server);
    let mut interval = tokio::time::interval(Duration::from_millis(60));
    loop {
        tokio::select! {
            _ = interval.tick() => println!("tick!"),
            line = lines.next() => if let Some(l) = line? {
                print!("{}", l)
            } else {
                break
            },
        }
    }
    handle.await.unwrap()?;
    Ok(())
}
  • The compiler doesn't help with cancellation safety. You need to read the API documentation and consider what state your async fn holds.

  • Unlike panic and ?, cancellation is part of normal control flow (as opposed to error handling).

  • The example loses parts of the string.

    • Whenever the tick() branch finishes first, next() and its buf are dropped.

    • LinesReader can be made cancellation-safe by making buf part of the struct:

      struct LinesReader {
          stream: DuplexStream,
          bytes: Vec<u8>,
          buf: [u8; 1],
      }

      impl LinesReader {
          fn new(stream: DuplexStream) -> Self {
              Self { stream, bytes: Vec::new(), buf: [0] }
          }
          async fn next(&mut self) -> io::Result<Option<String>> {
              // prefix buf and bytes with self.
              // ...
              let raw = std::mem::take(&mut self.bytes);
              let s = String::from_utf8(raw)
              // ...
          }
      }
  • Interval::tick is cancellation-safe because it keeps track of whether a tick has been ‘delivered’.

  • AsyncReadExt::read is cancellation-safe because it either returns the data or doesn't read it.

  • AsyncBufReadExt::read_line is similar to the example and isn't cancellation-safe. See its documentation for details and alternatives.
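
The difference between the two versions of LinesReader can be reproduced without a runtime. The sketch below is a simplified model, not the course code: `Reader`, `NextLine`, `cancel_safe` and `NoopWaker` are hypothetical names, an in-memory queue stands in for the stream, and each poll reads a single byte. The first future is dropped after two polls, which is what select! does when another branch finishes first.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing (hypothetical helper for manual polling).
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// In-memory stand-in for the stream, plus a buffer that outlives each call.
struct Reader {
    source: VecDeque<u8>,
    bytes: Vec<u8>,
}

/// One call to `next()`. `local` models a variable inside the async fn.
struct NextLine<'a> {
    reader: &'a mut Reader,
    local: Vec<u8>,
    cancel_safe: bool,
}

impl<'a> Future for NextLine<'a> {
    type Output = String;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        let this = self.get_mut();
        let byte = this.reader.source.pop_front();
        // Partial input goes either into the struct or into future-local state.
        let buf = if this.cancel_safe { &mut this.reader.bytes } else { &mut this.local };
        if let Some(b) = byte {
            buf.push(b);
            if b != b'\n' {
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
        }
        let raw = std::mem::take(buf);
        Poll::Ready(String::from_utf8_lossy(&raw).into_owned())
    }
}

/// Cancels a first read after two bytes, then reads a line with a new future.
fn read_with_cancellation(cancel_safe: bool) -> String {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut reader = Reader { source: "hi\n".bytes().collect(), bytes: Vec::new() };
    {
        let mut first = NextLine { reader: &mut reader, local: Vec::new(), cancel_safe };
        assert!(Pin::new(&mut first).poll(&mut cx).is_pending()); // reads 'h'
        assert!(Pin::new(&mut first).poll(&mut cx).is_pending()); // reads 'i'
    } // `first` is dropped here: the read is cancelled.
    let mut second = NextLine { reader: &mut reader, local: Vec::new(), cancel_safe };
    loop {
        if let Poll::Ready(line) = Pin::new(&mut second).poll(&mut cx) {
            return line;
        }
    }
}
```

With `cancel_safe` set to false the second read returns only the newline, because "hi" was dropped together with the first future. With it set to true the full line comes back.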

Exercises

To practice your Async Rust skills, we have two more exercises for you:

  • Dining philosophers: we already saw this problem in the morning. This time you are going to implement it with Async Rust.

  • A broadcast chat application: this is a larger project that lets you experiment with more advanced Async Rust features.

After looking at the exercises, you can look at the solutions provided.

Dining Philosophers — Async

See the dining philosophers section for a description of the problem.

As before, you will need a local Cargo installation for this exercise. Copy the code below to a file called src/main.rs, fill in the blanks, and check that cargo run does not deadlock:

use std::sync::Arc;
use tokio::sync::mpsc::{self, Sender};
use tokio::sync::Mutex;
use tokio::time;

struct Fork;

struct Philosopher {
    name: String,
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

impl Philosopher {
    async fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .await
            .unwrap();
    }

    async fn eat(&self) {
        // Pick up forks...
        println!("{} is eating...", &self.name);
        time::sleep(time::Duration::from_millis(5)).await;
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];

#[tokio::main]
async fn main() {
    // Create forks

    // Create philosophers

    // Make them think and eat

    // Output their thoughts
}

Since this time you are using Async Rust, you'll need a tokio dependency. You can use the following Cargo.toml:

[package]
name = "dining-philosophers-async-dine"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1.26.0", features = ["sync", "time", "macros", "rt-multi-thread"] }

Also note that this time you have to use the Mutex and the mpsc module from the tokio crate.

  • Can you make your implementation single-threaded?

Broadcast Chat Application

In this exercise, we want to use our new knowledge to implement a broadcast chat application. We have a chat server that the clients connect to and publish their messages. The client reads user messages from the standard input and sends them to the server. The chat server broadcasts each message that it receives to all the clients.

For this, we use a broadcast channel on the server, and tokio_websockets for the communication between the client and the server.

Create a new Cargo project and add the following dependencies:

Cargo.toml:

[package]
name = "chat-async"
version = "0.1.0"
edition = "2021"

[dependencies]
futures-util = { version = "0.3.30", features = ["sink"] }
http = "1.0.0"
tokio = { version = "1.28.1", features = ["full"] }
tokio-websockets = { version = "0.5.1", features = ["client", "fastrand", "server", "sha1_smol"] }

The required APIs

You are going to need the following functions from tokio and tokio_websockets. Spend a few minutes to familiarize yourself with the API.

  • StreamExt::next() implemented by WebSocketStream: for asynchronously reading messages from a Websocket Stream.
  • SinkExt::send() implemented by WebSocketStream: for asynchronously sending messages on a Websocket Stream.
  • Lines::next_line(): for asynchronously reading user messages from the standard input.
  • Sender::subscribe(): for subscribing to a broadcast channel.
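
If broadcast channels are new to you, the following synchronous model may help. It is only a sketch built on std::sync::mpsc, not tokio's implementation: `Broadcast` and `demo` are hypothetical names, and it ignores the bounded capacity that the exercise sets with channel(16). What it does show is the core behaviour: every subscriber gets its own copy of each message sent after it subscribed.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical, synchronous model of a broadcast channel.
struct Broadcast {
    subscribers: Vec<Sender<String>>,
}

impl Broadcast {
    fn new() -> Self {
        Broadcast { subscribers: Vec::new() }
    }

    /// Registers a new receiver; it only sees messages sent from now on.
    fn subscribe(&mut self) -> Receiver<String> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends a copy to every subscriber and returns how many were reached.
    fn send(&self, msg: &str) -> usize {
        self.subscribers
            .iter()
            .filter(|tx| tx.send(msg.to_string()).is_ok())
            .count()
    }
}

/// Two subscribers, the second of which joins after the first message.
fn demo() -> (Vec<String>, Vec<String>) {
    let mut chat = Broadcast::new();
    let alice = chat.subscribe();
    chat.send("hello");
    let bob = chat.subscribe(); // too late for "hello"
    chat.send("world");
    (alice.try_iter().collect(), bob.try_iter().collect())
}
```

In `demo`, the first subscriber receives both messages and the second only "world". In the exercise, each connection handler plays the role of one subscriber.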

Two binaries

Normally in a Cargo project, you can have only one binary, and one src/main.rs file. In this project, we need two binaries. One for the client, and one for the server. You could potentially make them two separate Cargo projects, but we are going to put them in a single Cargo project with two binaries. For this to work, the client and the server code should go under src/bin (see the documentation).

Copy the following server and client code into src/bin/server.rs and src/bin/client.rs, respectively. Your task is to complete these files as described below.

src/bin/server.rs:

use futures_util::sink::SinkExt;
use futures_util::stream::StreamExt;
use std::error::Error;
use std::net::SocketAddr;
use tokio::net::{TcpListener, TcpStream};
use tokio::sync::broadcast::{channel, Sender};
use tokio_websockets::{Message, ServerBuilder, WebSocketStream};

async fn handle_connection(
    addr: SocketAddr,
    mut ws_stream: WebSocketStream<TcpStream>,
    bcast_tx: Sender<String>,
) -> Result<(), Box<dyn Error + Send + Sync>> {

    // TODO: For a hint, see the description of the task below.

}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let (bcast_tx, _) = channel(16);

    let listener = TcpListener::bind("127.0.0.1:2000").await?;
    println!("listening on port 2000");

    loop {
        let (socket, addr) = listener.accept().await?;
        println!("New connection from {addr:?}");
        let bcast_tx = bcast_tx.clone();
        tokio::spawn(async move {
            // Wrap the raw TCP stream into a websocket.
            let ws_stream = ServerBuilder::new().accept(socket).await?;

            handle_connection(addr, ws_stream, bcast_tx).await
        });
    }
}

src/bin/client.rs:

use futures_util::stream::StreamExt;
use futures_util::SinkExt;
use http::Uri;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio_websockets::{ClientBuilder, Message};

#[tokio::main]
async fn main() -> Result<(), tokio_websockets::Error> {
    let (mut ws_stream, _) =
        ClientBuilder::from_uri(Uri::from_static("ws://127.0.0.1:2000"))
            .connect()
            .await?;

    let stdin = tokio::io::stdin();
    let mut stdin = BufReader::new(stdin).lines();


    // TODO: For a hint, see the description of the task below.

}

Running the binaries

Run the server with:

cargo run --bin server

and the client with:

cargo run --bin client

Tasks

  • Implement the handle_connection function in src/bin/server.rs.
    • Hint: use tokio::select! to perform two tasks concurrently in a continuous loop. One task receives messages from the client and broadcasts them. The other sends messages received by the server to the client.
  • Complete the main function in src/bin/client.rs.
    • Hint: as before, use tokio::select! in a continuous loop to perform two tasks concurrently: (1) reading user messages from standard input and sending them to the server, and (2) receiving messages from the server and displaying them to the user.
  • Optional: once you are done, change the code to broadcast messages to all clients except the sender.
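
For the optional task, one possible approach (an assumption, not necessarily the intended solution) is to broadcast the sender's address together with the text and let each connection skip its own messages. The sketch below isolates that filtering step. `ChatMessage` and `visible_to` are hypothetical names; in the server the same comparison would sit in the branch that forwards broadcast messages to the client.

```rust
use std::net::SocketAddr;

/// Hypothetical message type: the text plus the address it came from.
#[derive(Clone, Debug, PartialEq)]
struct ChatMessage {
    from: SocketAddr,
    text: String,
}

/// Returns the texts that the connection at `me` should receive,
/// i.e. everything except its own messages.
fn visible_to<'a>(me: SocketAddr, messages: &'a [ChatMessage]) -> Vec<&'a str> {
    messages
        .iter()
        .filter(|msg| msg.from != me)
        .map(|msg| msg.text.as_str())
        .collect()
}
```

Using the peer address as the identity works here because handle_connection already receives `addr` for each connection.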

Concurrency: Afternoon Exercises

Dining Philosophers — Async

(back to exercise)

use std::sync::Arc;
use tokio::sync::mpsc::{self, Sender};
use tokio::sync::Mutex;
use tokio::time;

struct Fork;

struct Philosopher {
    name: String,
    left_fork: Arc<Mutex<Fork>>,
    right_fork: Arc<Mutex<Fork>>,
    thoughts: Sender<String>,
}

impl Philosopher {
    async fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .await
            .unwrap();
    }

    async fn eat(&self) {
        // Pick up forks...
        let _first_lock = self.left_fork.lock().await;
        // Add a delay before picking the second fork to allow the execution
        // to transfer to another task
        time::sleep(time::Duration::from_millis(1)).await;
        let _second_lock = self.right_fork.lock().await;

        println!("{} is eating...", &self.name);
        time::sleep(time::Duration::from_millis(5)).await;

        // The locks are dropped here
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Hypatia", "Plato", "Aristotle", "Pythagoras"];

#[tokio::main]
async fn main() {
    // Create forks
    let mut forks = vec![];
    (0..PHILOSOPHERS.len()).for_each(|_| forks.push(Arc::new(Mutex::new(Fork))));

    // Create philosophers
    let (philosophers, mut rx) = {
        let mut philosophers = vec![];
        let (tx, rx) = mpsc::channel(10);
        for (i, name) in PHILOSOPHERS.iter().enumerate() {
            // Both bindings are `mut` so that they can be swapped below.
            let mut left_fork = Arc::clone(&forks[i]);
            let mut right_fork = Arc::clone(&forks[(i + 1) % PHILOSOPHERS.len()]);
            // To avoid a deadlock, we have to break the symmetry
            // somewhere. This will swap the forks without deinitializing
            // either of them.
            if i == 0 {
                std::mem::swap(&mut left_fork, &mut right_fork);
            }
            philosophers.push(Philosopher {
                name: name.to_string(),
                left_fork,
                right_fork,
                thoughts: tx.clone(),
            });
        }
        (philosophers, rx)
        // tx is dropped here, so we don't need to explicitly drop it later
    };

    // Make them think and eat
    for phil in philosophers {
        tokio::spawn(async move {
            for _ in 0..100 {
                phil.think().await;
                phil.eat().await;
            }
        });
    }

    // Output their thoughts
    while let Some(thought) = rx.recv().await {
        println!("Here is a thought: {thought}");
    }
}
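
The symmetry-breaking step in this solution does not depend on async. The sketch below replays it with std threads and std::sync::Mutex, purely so that it runs without extra dependencies. `fork_order` and `dine` are hypothetical helpers, not part of the solution. Philosopher 0 picks up the forks in the opposite order from everyone else, so a cycle of philosophers each waiting for their neighbour cannot form.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns the indices of the forks philosopher `i` picks up, in order.
/// Philosopher 0 has the order swapped to break the symmetry.
fn fork_order(i: usize, n: usize) -> (usize, usize) {
    let (left, right) = (i, (i + 1) % n);
    if i == 0 {
        (right, left)
    } else {
        (left, right)
    }
}

/// Runs `n` philosophers for `rounds` meals each and returns the meal count.
fn dine(n: usize, rounds: usize) -> usize {
    let forks: Vec<Arc<Mutex<()>>> = (0..n).map(|_| Arc::new(Mutex::new(()))).collect();
    let meals = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let (a, b) = fork_order(i, n);
            let (first, second) = (Arc::clone(&forks[a]), Arc::clone(&forks[b]));
            let meals = Arc::clone(&meals);
            thread::spawn(move || {
                for _ in 0..rounds {
                    let _first = first.lock().unwrap();
                    let _second = second.lock().unwrap();
                    *meals.lock().unwrap() += 1;
                    // Both forks are released at the end of each iteration.
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let total = *meals.lock().unwrap();
    total
}
```

`dine(5, 100)` returns once all 500 meals have been eaten. Without the swap in `fork_order`, every philosopher could end up holding one fork and waiting forever for the next one.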

Broadcast Chat Application

(back to exercise)

src/bin/server.rs:

use futures_util::sink::SinkExt;
use futures_util::stream::StreamExt;
use std::error::Error;
use std::net::SocketAddr;
use tokio::net::{TcpListener, TcpStream};
use tokio::sync::broadcast::{channel, Sender};
use tokio_websockets::{Message, ServerBuilder, WebSocketStream};

async fn handle_connection(
    addr: SocketAddr,
    mut ws_stream: WebSocketStream<TcpStream>,
    bcast_tx: Sender<String>,
) -> Result<(), Box<dyn Error + Send + Sync>> {

    ws_stream
        .send(Message::text("Welcome to chat! Type a message".to_string()))
        .await?;
    let mut bcast_rx = bcast_tx.subscribe();

    // A continuous loop for concurrently performing two tasks: (1) receiving
    // messages from `ws_stream` and broadcasting them, and (2) receiving
    // messages on `bcast_rx` and sending them to the client.
    loop {
        tokio::select! {
            incoming = ws_stream.next() => {
                match incoming {
                    Some(Ok(msg)) => {
                        if let Some(text) = msg.as_text() {
                            println!("From client {addr:?} {text:?}");
                            bcast_tx.send(text.into())?;
                        }
                    }
                    Some(Err(err)) => return Err(err.into()),
                    None => return Ok(()),
                }
            }
            msg = bcast_rx.recv() => {
                ws_stream.send(Message::text(msg?)).await?;
            }
        }
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let (bcast_tx, _) = channel(16);

    let listener = TcpListener::bind("127.0.0.1:2000").await?;
    println!("listening on port 2000");

    loop {
        let (socket, addr) = listener.accept().await?;
        println!("New connection from {addr:?}");
        let bcast_tx = bcast_tx.clone();
        tokio::spawn(async move {
            // Wrap the raw TCP stream into a websocket.
            let ws_stream = ServerBuilder::new().accept(socket).await?;

            handle_connection(addr, ws_stream, bcast_tx).await
        });
    }
}

src/bin/client.rs:

use futures_util::stream::StreamExt;
use futures_util::SinkExt;
use http::Uri;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio_websockets::{ClientBuilder, Message};

#[tokio::main]
async fn main() -> Result<(), tokio_websockets::Error> {
    let (mut ws_stream, _) =
        ClientBuilder::from_uri(Uri::from_static("ws://127.0.0.1:2000"))
            .connect()
            .await?;

    let stdin = tokio::io::stdin();
    let mut stdin = BufReader::new(stdin).lines();

    // Continuous loop for concurrently sending and receiving messages.
    loop {
        tokio::select! {
            incoming = ws_stream.next() => {
                match incoming {
                    Some(Ok(msg)) => {
                        if let Some(text) = msg.as_text() {
                            println!("From server: {}", text);
                        }
                    },
                    Some(Err(err)) => return Err(err.into()),
                    None => return Ok(()),
                }
            }
            res = stdin.next_line() => {
                match res {
                    Ok(None) => return Ok(()),
                    Ok(Some(line)) => ws_stream.send(Message::text(line.to_string())).await?,
                    Err(err) => return Err(err.into()),
                }
            }

        }
    }
}

Thanks!

Thank you for taking Comprehensive Rust 🦀! We hope you enjoyed it and that it was useful.

We've had a lot of fun putting the course together. We know it is not perfect, so if you spotted any mistakes or have ideas for improvements, please get in touch with us on GitHub. We would love to hear from you.

Glossary

The following is a glossary which aims to give a short definition of many Rust terms. For translations, this also serves to connect the term back to the English original.

  • allocate:
    Dynamic memory allocation on the heap.
  • argument:
    Information that is passed into a function or method.
  • Bare-metal Rust:
    Low-level Rust development, often deployed to a system without an operating system. See Bare-metal Rust.
  • block:
    See Blocks and scope.
  • borrow:
    See Borrowing.
  • borrow checker:
    The part of the Rust compiler which checks that all borrows are valid.
  • brace:
    { and }. Also called curly brace, they delimit blocks.
  • build:
    The process of converting source code into executable code or a usable program.
  • call:
    To invoke or execute a function or method.
  • channel:
    Used to safely pass messages between threads.
  • Comprehensive Rust 🦀:
    The courses here are jointly called Comprehensive Rust 🦀.
  • concurrency:
    The execution of multiple tasks or processes at the same time.
  • Concurrency in Rust:
    See Concurrency in Rust.
  • constant:
    A value that does not change during the execution of a program.
  • control flow:
    The order in which the individual statements or instructions are executed in a program.
  • crash:
    An unexpected and unhandled failure or termination of a program.
  • enumeration:
    A data type that holds one of several named constants, possibly with an associated tuple or struct.
  • error:
    An unexpected condition or result that deviates from the expected behavior.
  • error handling:
    The process of managing and responding to errors that occur during program execution.
  • exercise:
    A task or problem designed to practice and test programming skills.
  • function:
    A reusable block of code that performs a specific task.
  • garbage collector:
    A mechanism that automatically frees up memory occupied by objects that are no longer in use.
  • generics:
    A feature that allows writing code with placeholders for types, enabling code reuse with different data types.
  • immutable:
    Unable to be changed after creation.
  • integration test:
    A type of test that verifies the interactions between different parts or components of a system.
  • keyword:
    A reserved word in a programming language that has a specific meaning and cannot be used as an identifier.
  • library:
    A collection of precompiled routines or code that can be used by programs.
  • macro:
    Rust macros can be recognized by a ! in the name. Macros are used when normal functions are not enough. A typical example is format!, which takes a variable number of arguments, which isn’t supported by Rust functions.
  • main function:
    Rust programs start executing with the main function.
  • match:
    A control flow construct in Rust that allows for pattern matching on the value of an expression.
  • memory leak:
    A situation where a program fails to release memory that is no longer needed, leading to a gradual increase in memory usage.
  • method:
    A function associated with an object or a type in Rust.
  • module:
    A namespace that contains definitions, such as functions, types, or traits, to organize code in Rust.
  • move:
    The transfer of ownership of a value from one variable to another in Rust.
  • mutable:
    A property in Rust that allows variables to be modified after they have been declared.
  • ownership:
    The concept in Rust that defines which part of the code is responsible for managing the memory associated with a value.
  • panic:
    An unrecoverable error condition in Rust that results in the termination of the program.
  • parameter:
    A value that is passed into a function or method when it is called.
  • pattern:
    A combination of values, literals, or structures that can be matched against an expression in Rust.
  • payload:
    The data or information carried by a message, event, or data structure.
  • program:
    A set of instructions that a computer can execute to perform a specific task or solve a particular problem.
  • programming language:
    A formal system used to communicate instructions to a computer, such as Rust.
  • receiver:
    The first parameter in a Rust method that represents the instance on which the method is called.
  • reference counting:
    A memory management technique in which the number of references to an object is tracked, and the object is deallocated when the count reaches zero.
  • return:
    A keyword in Rust used to indicate the value to be returned from a function.
  • Rust:
    A systems programming language that focuses on safety, performance, and concurrency.
  • Rust Fundamentals:
    Days 1 to 4 of this course.
  • Rust in Android:
    See Rust in Android.
  • Rust in Chromium:
    See Rust in Chromium.
  • safe:
    Refers to code that adheres to Rust’s ownership and borrowing rules, preventing memory-related errors.
  • scope:
    The region of a program where a variable is valid and can be used.
  • standard library:
    A collection of modules providing essential functionality in Rust.
  • static:
    A keyword in Rust used to define static variables or items with a 'static lifetime.
  • string:
    A data type storing textual data. See String vs str for more.
  • struct:
    A composite data type in Rust that groups together variables of different types under a single name.
  • test:
    A Rust module containing functions that test the correctness of other functions.
  • thread:
    A separate sequence of execution in a program, allowing concurrent execution.
  • thread safety:
    The property of a program that ensures correct behavior in a multithreaded environment.
  • trait:
    A collection of methods defined for an unknown type, providing a way to achieve polymorphism in Rust.
  • trait bound:
    An abstraction where you can require types to implement some traits of your interest.
  • tuple:
    A composite data type that contains variables of different types. Tuple fields have no names, and are accessed by their ordinal numbers.
  • type:
    A classification that specifies which operations can be performed on values of a particular kind in Rust.
  • type inference:
    The ability of the Rust compiler to deduce the type of a variable or expression.
  • undefined behavior:
    Actions or conditions in Rust that have no specified result, often leading to unpredictable program behavior.
  • union:
    A data type that can hold values of different types but only one at a time.
  • unit test:
    Rust comes with built-in support for running small unit tests and larger integration tests. See Unit Tests.
  • unit type:
    Type that holds no data, written as a tuple with no members.
  • unsafe:
    The subset of Rust which allows you to trigger undefined behavior. See Unsafe Rust.
  • variable:
    A memory location storing data. Variables are valid in a scope.
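
Several of these terms are easier to recognize in code. The short sketch below is illustrative only: `Shape`, `Area`, `total_area` and `summarize` are hypothetical names chosen for this example. The comments point out where each glossary term appears.

```rust
use std::f64::consts::PI;

/// enumeration: one of several named variants, here with struct-like payloads.
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// trait: a set of methods that a type can implement.
trait Area {
    /// method: `&self` is the receiver.
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // match: each arm is a pattern that binds the variant's fields.
        match self {
            Shape::Circle { radius } => PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }
}

/// generics with a trait bound: works for any `T` that implements `Area`.
fn total_area<T: Area>(items: &[T]) -> f64 {
    items.iter().map(|item| item.area()).sum()
}

/// move and tuple: `shapes` is moved into the function, which returns a
/// (count, total area) tuple.
fn summarize(shapes: Vec<Shape>) -> (usize, f64) {
    // borrow: `total_area` reads the shapes without taking ownership.
    let total = total_area(&shapes);
    (shapes.len(), total)
    // scope: `shapes` is dropped here, at the end of its owner's scope.
}
```

For example, calling `summarize` with a 2×3 rectangle and a 1×1 rectangle gives a count of 2 and a total area of 7.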

Other Rust Resources

The Rust community has created a wealth of free, high-quality resources online.

Official Documentation

The Rust project hosts many resources. These cover Rust in general:

  • The Rust Programming Language: the canonical free book about Rust. It covers the language in detail and includes a few projects for readers to build.
  • Rust by Example: covers the Rust syntax through a series of examples that showcase different constructs. Sometimes it includes small exercises where you are asked to expand on the code in the examples.
  • Rust Standard Library: full documentation of the standard library for Rust.
  • The Rust Reference: an incomplete book that describes the Rust grammar and memory model.

More specialized guides are hosted on the official Rust site:

  • The Rustonomicon: covers unsafe Rust, including working with raw pointers and interfacing with other languages (FFI).
  • Asynchronous Programming in Rust: covers the new asynchronous programming model that was introduced after the Rust Book was written.
  • The Embedded Rust Book: an introduction to using Rust on embedded devices without an operating system.

Unofficial Learning Material

A small selection of other guides and tutorials for Rust:

See The Little Book of Rust Books for even more Rust books.

Credits

The material here builds on the many great sources of Rust documentation. See the page on other resources for a full list of useful resources.

The material of Comprehensive Rust is licensed under the terms of the Apache 2.0 license. See LICENSE for details.

Rust by Example

Some examples and exercises have been copied and adapted from Rust by Example. See the third_party/rust-by-example/ directory for details, including the license terms.

Rust on Exercism

Some exercises have been copied and adapted from Rust on Exercism. See the third_party/rust-on-exercism/ directory for details, including the license terms.

CXX

The Interoperability with C++ section uses an image from CXX. See the third_party/cxx/ directory for details, including the license terms.