歡迎參加 Comprehensive Rust 🦀 課程


這個免費的 Rust 課程是由 Google 的 Android 團隊負責開發。本課程涵蓋 Rust 的全部內容,從基礎語法到進階主題 (泛型和錯誤處理等),應有盡有。

如需最新版課程,請造訪 https://google.github.io/comprehensive-rust/。假如您是在其他網址閱讀課程資料,別忘了查看這個連結的內容是否有更新。

本課程旨在教授 Rust 的知識。我們會假設您是從零開始學習 Rust,希望能夠:

  • 讓您對 Rust 語法和語言有全面的認識。
  • 讓您學會在 Rust 中修改現有程式及編寫新程式。
  • 向您介紹常見的 Rust 慣用語法。

我們將前三天的課程稱為「Rust 基礎知識」。

在此基礎上,我們將誠摯邀請您深入探討一或多個專題:

  • Android:這是半天的課程,會說明如何針對 Android 平台開發作業 (Android 開放原始碼計畫) 使用 Rust,並介紹與 C、C++ 和 Java 的互通性。
  • Bare-metal: a whole-day class on using Rust for bare-metal (embedded) development. Both microcontrollers and application processors are covered.
  • 並行:這個全天課程著重於 Rust 中的並行問題。我們將探討傳統並行 (使用執行緒和互斥鎖進行先占式排程) 以及 async/await 並行 (使用 future 進行合作多工處理)。

Non-Goal

Rust 是大型的程式語言,無法在幾天內就介紹完畢。因此,本課程會有一些 non-goal,包括:

假設

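This course assumes that you already know how to program. Rust is a statically-typed language, and we will sometimes make comparisons with C and C++ to better explain or contrast the Rust approach.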

如果您知道如何以 Python 或 JavaScript 等動態程式語言編寫程式,也很適合跟著我們學習 Rust。

這是「演講者備忘稿」的範例。我們會透過這些備忘稿補充投影片中未提到的資訊。這可能包括老師應提及的重點,以及課堂上典型問題的解答。

講授課程

本頁面的適用對象為課程講師。

以下提供一些背景資訊,說明 Google 內部近期採用的授課方式。

我們一般會從上午 10 點上課到下午 4 點,中間 1 小時午休。也就是說,上下午課程各為 2.5 小時。請注意,這只是建議的上課時間:您也可以將上午的課程訂為 3 小時,讓學員有更多時間練習。延長課程時間的缺點是,學員上了整整 6 小時的課,到了下午可能會非常疲倦。

在講授課程前,建議您注意下列事項:

  1. 請熟悉課程教材。我們已附上演講者備忘稿,協助突顯重點,也請您不吝提供更多演講者備忘稿內容!分享螢幕畫面時,請務必在彈出式視窗中開啟演講者備忘稿 (按一下「Speaker Notes」旁小箭頭的連結)。如此一來,您就能在課堂上分享簡潔的螢幕畫面。

  2. 請決定授課日期。由於課程長度至少為三個整天,建議您將授課日分散安排在兩週內。課程參與者曾表示,如果課程中間有間隔,可協助他們消化我們提供的所有資訊,對學習效果有助益。

  3. 找到可容納現場參與者的場地。建議的開班人數為 15 至 25 人。這樣的小班制教學可讓學員自在地提問,講師也有時間可以回答問題。請確認上課場地有_書桌_,可供講師和學員使用:您們都會需要能坐著使用筆電。 講師尤其會需要現場編寫許多程式碼,因此使用講台可能會造成不便。

  4. 在講課當天提早到上課場地完成設定。建議您直接在筆電上執行 mdbook serve 分享螢幕畫面 (請參閱安裝操作說明)。這可確保提供最佳效能,不會在您切換頁面時發生延遲。使用筆電也可讓您修正自己或課程參與者發現的錯字。

  5. 讓學員獨自或分成小組做習題。我們通常會在早上和下午各安排 30 至 45 分鐘的時間做習題,這包含檢討解題方式的時間。請務必詢問學員是否遇到難題,或需要您的協助。如果發現多位學員遇到相同問題,請向全班說明該問題,並提供解決方式:例如示範如何在標準程式庫 (The Rust Standard Library) 找到相關資訊。

以上為所有注意事項,祝您授課順利,並和我們一樣樂在其中!

請在授課後提供意見回饋,協助我們持續改善課程。您可以與我們分享您滿意的部分,以及值得改善的地方。也歡迎您的學生提供意見回饋

課程架構

本頁面的適用對象為課程講師。

Rust 基礎知識

我們會在前三天介紹 Rust 基礎知識。這幾天的步調會稍快,因為要探討許多層面:

  • 第 1 天:Rust 基本概念、語法、控制流程、建立及取用值。
  • Day 2: Memory management, ownership, compound data types, and the standard library.
  • Day 3: Generics, traits, error handling, testing, and unsafe Rust.

深入探索

除了為期 3 天的 Rust 基礎知識課程,我們也涵蓋更多專門主題:

Rust in Android

The Rust in Android deep dive is a half-day course on using Rust for Android platform development. This includes interoperability with C, C++, and Java.

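You will need an AOSP checkout. Make a checkout of the course repository on the same machine and move the src/android/ directory into the root of your AOSP checkout. This ensures that the Android build system sees the Android.bp files in src/android/.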

請確保 adb sync 可與模擬器或實際裝置搭配使用,並運用 src/android/build_all.sh 預先建構所有 Android 範例。請閱讀指令碼,瞭解指令碼執行的指令,並確保可以手動執行指令。

Bare-Metal Rust

The Bare-Metal Rust deep dive is a full day class on using Rust for bare-metal (embedded) development. Both microcontrollers and application processors are covered.

針對微控制器,您會需要預先購買 BBC micro:bit 第 2 版開發板。此外,所有人都需要按照歡迎頁面上的指示安裝多種套件。

Concurrency in Rust

The Concurrency in Rust deep dive is a full day class on classical as well as async/await concurrency.

您會需要設定新的 Crate,然後下載並準備執行依附元件。接著就能將範例複製貼上至 src/main.rs,使用這些範例進行實驗:

cargo init concurrency
cd concurrency
cargo add tokio --features full
cargo run

形式

本課程極具互動性,因此建議您根據各項疑問,帶領學員瞭解 Rust!

鍵盤快速鍵

以下為 mdBook 中實用的鍵盤快速鍵:

  • 向左鍵:前往上一頁。
  • 向右鍵:前往下一頁。
  • Ctrl + Enter:執行具有焦點的程式碼範例。
  • s:啟用搜尋列。

翻譯

本課程已由一群優秀的志工翻譯成其他語言:

使用右上角的語言選單即可切換語言。

不完整翻譯

目前有許多正在翻譯的語言版本。以下連結為最近更新的翻譯:

If you want to help with this effort, please see the instructions for how to get going. Translations are coordinated on the issue tracker.

使用 Cargo

您開始閱讀 Rust 內容後,很快就會認識 Cargo,這是在 Rust 生態系統中使用的標準工具,用於建構及執行 Rust 應用程式。以下簡要介紹 Cargo,以及如何在更廣大的生態系統和本訓練課程中運用 Cargo。

安裝

請按照 https://rustup.rs/ 中的指示操作。

This will give you the Cargo build tool (cargo) and the Rust compiler (rustc). You will also get rustup, a command-line utility that you can use to install and switch toolchains, set up cross compilation, etc.

  • On Debian/Ubuntu, you can also install Cargo, the Rust source, and the Rust formatter via apt. However, this gets you an outdated Rust version and may lead to unexpected behavior. The command would be:

    sudo apt install cargo rust-src rustfmt

  • We suggest using VS Code to edit the code (but any LSP-compatible editor works with rust-analyzer).

  • 有些人也偏好使用 JetBrains 系列的 IDE,這些工具會自行分析,但也各有缺點。如果您偏好這些工具,可以安裝 Rust 外掛程式。請注意,自 2023 年 1 月起,偵錯功能僅適用於 JetBrains IDEA 套件的 CLion 版本。

Rust 生態系統

Rust 生態系統包含多項工具,以下列出主要工具:

  • rustc:Rust 編譯器,可將 .rs 檔案轉換成二進位檔和其他中繼格式。

  • cargo: the Rust dependency manager and build tool. Cargo knows how to download dependencies, usually hosted on https://crates.io, and it will pass them to rustc when building your project. Cargo also comes with a built-in test runner which is used to execute unit tests.

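  • rustup: the Rust toolchain installer and updater. This tool is used to install and update rustc and cargo when new versions of Rust are released. In addition, rustup can also download documentation for the standard library. You can have multiple versions of Rust installed at once, and rustup lets you switch between them as needed.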

重要須知:

  • Rust 的發布時程相當緊湊,每六週就會推出新版本。新版本可與舊版本回溯相容,且會啟用新功能。

  • 發布版本 (release channel) 分為三種:「穩定版」、「Beta 版」和「Nightly 版」。

  • 「Nightly 版」會用於測試新功能,「Beta 版」則會每六週成為「穩定版」。

  • 您也可以透過其他註冊資料庫、git、資料夾等管道解析依附元件。

  • Rust also has editions: the current edition is Rust 2021. Previous editions were Rust 2015 and Rust 2018.

    • 這些版本可針對語言進行回溯不相容的變更。

    • 為避免破壞程式碼,版本皆為自行選擇採用:您可以透過 Cargo.toml 檔案選擇所需版本。

    • 為避免分割生態系統,Rust 編譯器可混合寫給不同版本的程式碼。

    • 請說明很少會略過 cargo 直接使用編譯器,大部分使用者都不會這麼做。

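    • It might be worth alluding that Cargo itself is an extremely powerful and comprehensive tool. It is capable of many advanced features including but not limited to:

      • Project/package structure
      • Workspaces
      • Dev dependencies and runtime dependency management/caching
      • Build scripting
      • Global installation
      • It is also extensible with sub-command plugins (such as cargo clippy).
    • Read more in the official Cargo Book.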

本訓練課程的程式碼範例

在本訓練課程中,我們主要會透過範例瞭解 Rust 語言,這些範例可在瀏覽器中執行。這麼做可讓設定程序更輕鬆,並確保所有人獲得一致的體驗。

我們仍建議安裝 Cargo,方便您更輕鬆做習題。在最後一天,我們會做規模較大的習題,讓您瞭解如何使用依附元件,而這需要使用 Cargo。

本課程的程式碼區塊皆完全為互動式:

fn main() {
    println!("Edit me!");
}

當焦點位於文字方塊時,按下 Ctrl + Enter鍵即可執行程式碼。

大部分程式碼範例都可供編輯,如上所示。有些程式碼範例無法編輯,原因如下:

  • 嵌入式遊樂場無法執行單元測試。請複製貼上程式碼,然後在實際的 Playground 中開啟,即可示範單元測試。

  • 當您一離開頁面,嵌入式遊樂場就會失去目前狀態!因此,學生應使用本機 Rust 安裝項目或透過 Playground 來做習題。

使用 Cargo 在本機執行程式碼

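If you want to experiment with the code on your own system, you will first need to install Rust. Do this by following the instructions in the Rust Book. This should give you a working rustc and cargo. At the time of writing, the latest stable Rust release has these version numbers: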

% rustc --version
rustc 1.69.0 (84c898d65 2023-04-16)
% cargo --version
cargo 1.69.0 (6e9a83356 2023-04-12)

由於 Rust 保有回溯相容性,您也可以使用任何後續版本。

完成上述步驟後,請按照下列步驟操作,在本訓練課程的任一範例中建構 Rust 二進位檔:

  1. 在要複製的範例中,按一下「Copy to clipboard」按鈕。

  2. 使用 cargo new exercise,為程式碼建立新的 exercise/ 目錄:

    $ cargo new exercise
         Created binary (application) `exercise` package
    
  3. 前往 exercise/,使用 cargo run 建構並執行二進位檔:

    $ cd exercise
    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.75s
         Running `target/debug/exercise`
    Hello, world!
    
  4. Replace the boilerplate code in src/main.rs with your own code. For example, using the example on the previous page, make src/main.rs look like:

    fn main() {
        println!("Edit me!");
    }
  5. 使用 cargo run 建構並執行更新版二進位檔:

    $ cargo run
       Compiling exercise v0.1.0 (/home/mgeisler/tmp/exercise)
        Finished dev [unoptimized + debuginfo] target(s) in 0.24s
         Running `target/debug/exercise`
    Edit me!
    
  6. 使用 cargo check 快速檢查專案中是否有錯誤,並使用 cargo build 在不執行的情況下編譯專案。您會在 target/debug/ 中看到一般偵錯版本的輸出內容。使用 cargo build --release,在 target/release/ 中產生經過最佳化的發布子版本。

  7. 只要編輯 Cargo.toml,即可為專案新增依附元件。執行 cargo 指令時,系統會自動下載及編譯缺少的依附元件。

建議您鼓勵課程參與者安裝 Cargo 及使用本機編輯器。這麼做能提供正常的開發環境,降低操作難度。

歡迎參加第 1 天課程

今天是學習 Rust 基礎知識的第一天,我們會探討許多內容:

  • 基本的 Rust 語法:變數、純量和複合型別、列舉、結構體、參照、函式和方法。

  • Control flow constructs: if, if let, while, while let, break, and continue.

  • 模式配對:解構列舉、結構和陣列。

請提醒學生以下事項:

  • 應該一有問題就提問,不要留到最後。
  • 本課程的宗旨是互動,非常鼓勵大家討論!
    • 老師應設法讓討論不要離題,例如確保討論的主題在於比較 Rust 和其他語言的運作方式。要找到適當的平衡點並不容易,但我們還是寧可讓學員討論,因為這比老師單向授課更能引起學生興趣。
  • 我們討論的議題,可能會超前投影片進度。
    • 這完全沒問題!複習是學習的重要一環。請記得,投影片只是輔助,您可以視情況略過不需要的部分。

第一天的規畫是說明 Rust 基礎概念,「只要剛好」能介紹到著名的借用檢查器就行了。Rust 處理記憶體的方式是一大特色,我們應該立即向學生展示這一點。

如果您是在教室授課,就很適合參考這裡的時間表。建議您將一天分為兩部分 (根據投影片安排):

  • 上午:9 點到 12 點
  • 下午:1 點到 4 點

您當然可以視需要調整這個時間表。但請務必要加入休息時段,建議每小時休息一次!

什麼是 Rust?

Rust 是一款新的程式設計語言,在 2015 年推出 1.0 版

  • Rust 是靜態編譯的程式語言,功能與 C++ 類似
    • rustc 使用 LLVM 做為後端。
  • Rust 支援許多平台和架構
    • x86、ARM、WebAssembly…
    • Linux、Mac、Windows…
  • Rust 適用於多種裝置:
    • 韌體和啟動載入器
    • 智慧螢幕、
    • 手機、
    • 電腦、
    • 伺服器。

Rust 適合用於與 C++ 同樣的領域,且具有以下特色:

  • 高靈活性。
  • 提供高度主控權。
  • 可縮減到十分受限的裝置規模,例如微控制器。
  • 沒有執行階段,也不使用垃圾收集機制。
  • 著重可靠性和安全性,但不犧牲效能。

Hello World!

我們直接來看看最簡單的 Rust 程式吧,也就是經典的 Hello World 程式:

fn main() {
    println!("Hello 🌍!");
}

您會看到:

  • 函式是以 fn 導入。
  • 區塊會用大括號分隔,這跟在 C 和 C++ 一樣。
  • main 函式是程式的進入點。
  • Rust 含有衛生巨集,例如 println!
  • Rust 字串採用 UTF-8 編碼,可包含任何萬國碼字元。

我們會藉由這張投影片,試著讓學生熟悉 Rust 程式碼。在接下來的三天裡,他們會大量接觸到這些內容,所以我們得從他們熟悉的小地方著手。

重要須知:

  • Rust 與 C/C++/Java 傳統中的其他語言非常相似。它是指令式的程式語言,除非絕對必要,否則不會嘗試改編任何內容。

  • Rust 是現代的程式語言,可完整支援萬國碼等等。

  • 當您想使用可變數量的引數時 (亦即無任何函式超載),可使用 Rust 的巨集。

  • 所謂「衛生」巨集,是指這類巨集不會誤從自身所用於的範圍內擷取 ID。Rust 巨集實際上只能算是部分衛生的巨集。

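  • Rust is multi-paradigm. For example, it has powerful object-oriented programming features, and, while it is not a functional language, it includes a range of functional concepts.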

簡短範例

以下是使用 Rust 語言的簡短範例程式:

fn main() {              // Program entry point
    let mut x: i32 = 6;  // Mutable variable binding
    print!("{x}");       // Macro for printing, like printf
    while x != 1 {       // No parenthesis around expression
        if x % 2 == 0 {  // Math like in other languages
            x = x / 2;
        } else {
            x = 3 * x + 1;
        }
        print!(" -> {x}");
    }
    println!();
}

這是「考拉茲猜想」的實作程式碼,考拉茲相信迴圈不管怎樣終會結束,但這尚未得證。您可以編輯該程式碼,試著輸入不同內容。

重要須知:

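  • Explain that all variables are statically typed. Try removing i32 to trigger type inference. Try with i8 instead and trigger a runtime integer overflow.

  • Change let mut x to let x and discuss the compiler error.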

  • 說明如果引數與格式字串不符,print! 會如何呈現編譯錯誤。

  • 說明要輸出比單一變數更複雜的運算式時,需如何使用 {} 做為預留位置。

  • 向學生介紹標準程式庫,示範如何搜尋具有格式化迷你語言規則的 std::fmt。請務必確保學生熟悉如何在標準程式庫中搜尋。

    • 在殼層中,rustup doc std::fmt 會開啟本機 std::fmt 說明文件上的瀏覽器。
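The i8 overflow mentioned in the notes above can also be made visible without a panic. The following is a minimal sketch, assuming a hypothetical helper named collatz_steps (it is not part of the course material) that uses the standard checked_mul and checked_add methods so that an overflowing step is reported as None:

```rust
/// Counts the Collatz steps needed to reach 1, deliberately using `i8`.
/// Returns `None` when `3 * x + 1` does not fit in an `i8`.
/// (`collatz_steps` is a hypothetical helper used only for illustration.)
fn collatz_steps(start: i8) -> Option<u32> {
    if start < 1 {
        return None; // the sequence is only defined for positive integers
    }
    let mut x = start;
    let mut steps = 0;
    while x != 1 {
        x = if x % 2 == 0 {
            x / 2
        } else {
            // checked_* methods return None instead of overflowing.
            x.checked_mul(3)?.checked_add(1)?
        };
        steps += 1;
    }
    Some(steps)
}

fn main() {
    println!("6  -> {:?}", collatz_steps(6));
    println!("27 -> {:?}", collatz_steps(27));
}
```

Starting from 27, the sequence reaches 47, and 3 * 47 does not fit in an i8, so the helper returns None where the original example would panic in a debug build.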

為什麼要使用 Rust?

Rust 的幾個獨特賣點如下:

  • 在編譯期間確保記憶體安全性。
  • 沒有未定義的執行階段行為。
  • 現代的語言特色。

請務必詢問全班同學,瞭解他們具備哪些語言的使用經驗。根據學生答覆,您可以強調不同的 Rust 功能:

  • 具備 C 或 C++ 經驗:Rust 會透過借用檢查器,徹底刪除一整類的「執行階段錯誤」。這不僅可讓您獲得像是 C 和 C++ 的效能,也不會造成記憶體安全問題。此外,您還能取得具備模式配對、內建依附元件管理機制等結構的新型語言。

  • 具備 Java、Go、Python、JavaScript…經驗:Rust 能讓您享有與這些語言相同的記憶體安全性,而且還可帶來使用類似高階語言的感受。此外,您也能獲得像 C 和 C++ 一樣快速可預期的成效 (無垃圾收集器),以及低階硬體的存取權限 (如有需要)。

編譯時期保證

編譯期間的靜態記憶體管理機制好處多多,包括:

  • 不會產生未初始化的變數。
  • 不會造成記憶體流失 (「一般」來說是這樣,請參閱附註)。
  • 不會導致重複釋放記憶體。
  • 不會使用已釋放的記憶體。
  • 不會產生 NULL 指標。
  • 不會產生忘記鎖定的互斥鎖。
  • 執行緒之間不會發生資料競爭。
  • 不會發生疊代器無效的情形。

在 (安全的) Rust 範疇內,可能還是有機會造成記憶體流失。以下是一些例子:

  • 您可能會使用 Box::leak,以致洩漏指標。如果您為了取得在執行階段中初始化或設定大小的靜態變數,就可能發生這個情況。
  • 您可能會透過 std::mem::forget 讓編譯器「忘記」某個值 (亦即解構函式永遠不會執行)。
  • You may also accidentally create a reference cycle with Rc or Arc.
  • 事實上,有些人會認為無限地填充集合是一種記憶體流失,而 Rust 並不能避免這種情況。

因此,以本課程的宗旨來說,「沒有記憶體流失」應理解為「幾乎沒有『意外的』記憶體流失」。
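Two of the leak scenarios listed above can be sketched in safe Rust as follows. The Node type and the function names are hypothetical and exist only for this illustration; Box::leak, Rc, and RefCell are the standard library items named in the text.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A node that may point at another node (hypothetical type).
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds two nodes that point at each other and returns the
/// strong count of the first one.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // `a` is now kept alive by the local variable and by `b.next`.
    // When the locals go out of scope, each count only drops to 1,
    // so neither node is ever freed.
    Rc::strong_count(&a)
}

/// Leaks a heap allocation on purpose to obtain a `'static` reference.
fn leaked_banner(name: &str) -> &'static str {
    Box::leak(format!("Hello, {name}!").into_boxed_str())
}

fn main() {
    println!("strong count inside the cycle: {}", make_cycle());
    println!("{}", leaked_banner("Rust"));
}
```

Both functions compile without unsafe, which is the point of the note: leaks are considered memory-safe in Rust, even though they are usually unintended.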

執行時期保證

在執行階段不會產生未定義的行為,好處如下:

  • 陣列存取行為會經過邊界檢查。
  • 整數溢位的行為是明確的 (恐慌或迴繞)。

重要須知:

  • 整數溢位的處理是透過 overflow-checks 編譯時間標記定義的。如果啟用的話,程式就會恐慌 (程式受控地異常終止),如未啟用,則會發生語意迴繞現象。根據預設,在偵錯模式 (cargo build) 中會發生恐慌,在發布模式 (cargo build --release) 中會發生迴繞。

  • 您無法使用編譯器參數停用邊界檢查,也無法直接透過 unsafe 關鍵字停用。不過,您可以使用 unsafe 呼叫 slice::get_unchecked 這類不執行邊界檢查的函式。
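When the overflow behavior should not depend on the build profile, the integer types offer explicit methods. The sketch below uses only standard library calls (wrapping_add, checked_add, slice get, and get_unchecked); the function names are made up for this example.

```rust
/// Adds with explicit wrap-around, independent of `overflow-checks`.
fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Adds and reports overflow as `None` instead of panicking or wrapping.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Bounds-checked lookup that returns `None` rather than panicking.
fn third(values: &[i32]) -> Option<i32> {
    values.get(2).copied()
}

fn main() {
    println!("250 + 10 wrapping: {}", add_wrapping(250, 10));
    println!("250 + 10 checked:  {:?}", add_checked(250, 10));
    println!("third of [1, 2]:   {:?}", third(&[1, 2]));

    let v = [1, 2, 3];
    // SAFETY: index 2 is in bounds for an array of length 3.
    let last = unsafe { *v.get_unchecked(2) };
    println!("unchecked access:  {last}");
}
```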

新潮的功能

Rust 是根據過去數十年累積的所有經驗打造而成。

語言特色

  • 列舉和模式配對。
  • 泛型。
  • 沒有 FFI 負擔。
  • 零成本的抽象化機制。

工具

  • 更好的編譯錯誤描述。
  • 內建依附元件管理工具。
  • 內建測試支援。
  • 卓越的語言伺服器通訊協定支援。

重要須知:

  • 與 C++ 類似,零成本抽象化機制是指您不必為使用記憶體或 CPU 的高階程式設計結構「付費」。舉例來說,使用 for 編寫迴圈時,應產生與使用 .iter().fold() 結構大致相同的低階指示。

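  • It may be worth mentioning that Rust enums are "algebraic data types", also known as "sum types", which allow the type system to express things like Option<T> and Result<T, E>.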

  • 提醒使用者詳讀錯誤訊息,許多開發人員已習慣忽略冗長的編譯器輸出結果。Rust 編譯器的表達能力比其他編譯器高出許多,通常都會提供「實用」的意見回饋,您可以直接將其複製貼到程式碼中。

  • 與 Java、Python 和 Go 等語言相比,Rust 標準程式庫較小。Rust 並不提供某些您可能認為是標準和基本項目的內容:

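    • a random number generator, but see rand.
    • support for SSL or TLS, but see rustls.
    • support for JSON, but see serde_json.

    The reasoning behind this is that functionality in the standard library cannot go away, so it has to be very stable. For the examples above, the Rust community is still working on finding the best solution, and perhaps there isn't a single "best solution" for some of these things.

    Rust comes with a built-in package manager in the form of Cargo, which makes it trivial to download and compile third-party crates. A consequence of this is that the standard library can be smaller.

    Discovering good third-party crates can be a problem. Sites like https://lib.rs/ help with this by letting you compare health metrics for crates to find a good and trusted one.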

  • rust-analyzer 是廣受支援的 LSP 實作項目,適用於主要的 IDE 和文字編輯器。

基本語法

如果您有 C、C++ 或 Java 基礎,會覺得大部分的 Rust 語法都似曾相識:

  • 區塊和範圍會以大括號分隔。
  • 行註解以 // 開頭,區塊註解則以 /* ... */ 分隔。
  • Keywords like if and while work the same.
  • 變數指派作業透過 = 完成,等於運算則透過 == 完成。

純量型別

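                         Types                             Literals
Signed integers          i8, i16, i32, i64, i128, isize    -10, 0, 1_000, 123_i64
Unsigned integers        u8, u16, u32, u64, u128, usize    0, 123, 10_u16
Floating point numbers   f32, f64                          3.14, -10.0e20, 2_f32
Strings                  &str                              "foo", "two\nlines"
Unicode scalar values    char                              'a', 'α', '∞'
Booleans                 bool                              true, false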

型別的寬度如下:

  • iN, uN, and fN are N bits wide,
  • isize and usize are the width of a pointer,
  • char 寬度為 32 位元
  • bool 寬度為 8 位元

除此之外,還有一些其他語法:

  • 原形字串可讓您建立停用逸出功能的 &str 值:r"\n" == "\\n"。只要在引號兩側使用等量的 #,即可嵌入雙引號:

    fn main() {
        println!(r#"<a href="link.html">link</a>"#);
        println!("<a href=\"link.html\">link</a>");
    }
  • 位元組字串可讓您直接建立 &[u8] 值:

    fn main() {
        println!("{:?}", b"abc");
        println!("{:?}", &[97, 98, 99]);
    }
  • 數字中的底線全都可以省略,寫出來只是為了方便閱讀。換句話說,1_000 可以寫成 1000 (或 10_00),而 123_i64 則可寫成 123i64
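The widths and literal forms described above can be checked directly. This is a small sketch using std::mem::size_of; the bits helper is a hypothetical name introduced only for this example.

```rust
use std::mem::size_of;

/// Width of a type in bits (hypothetical helper for illustration).
fn bits<T>() -> usize {
    size_of::<T>() * 8
}

fn main() {
    // Widths of the scalar types.
    println!("i64:   {} bits", bits::<i64>());
    println!("char:  {} bits", bits::<char>());
    println!("bool:  {} bits", bits::<bool>());
    println!("usize: {} bits", bits::<usize>());

    // A raw string keeps the backslash as an ordinary character.
    assert_eq!(r"\n", "\\n");
    // A byte string literal is a reference to an array of u8.
    assert_eq!(b"abc", &[97u8, 98, 99]);
    // Underscores and type suffixes do not change the value.
    assert_eq!(1_000, 10_00);
    assert_eq!(123_i64, 123i64);
}
```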

複合型別

          Types                  Literals
Arrays    [T; N]                 [20, 30, 40], [0; 3]
Tuples    (), (T,), (T1, T2)     (), ('x',), ('x', 1.2)

陣列指派與存取:

fn main() {
    let mut a: [i8; 10] = [42; 10];
    a[5] = 0;
    println!("a: {:?}", a);
}

元組指派與存取:

fn main() {
    let t: (i8, bool) = (7, true);
    println!("1st index: {}", t.0);
    println!("2nd index: {}", t.1);
}

重要須知:

陣列:

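  • A value of the array type [T; N] holds N (a compile-time constant) elements of the same type T. Note that the length of the array is part of its type, which means that [u8; 3] and [u8; 4] are considered two different types.

  • We can use literals to assign values to arrays.

  • In the main function, the print statement asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. We could also have used {a} and {a:?} without specifying the value after the format string.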

  • 加入 # (例如 {a:#?}) 可叫用方便閱讀的「美化排版」格式。

元組:

  • 和陣列一樣,元組有固定的長度。

  • 元組會將不同型別的值組成複合型別。

  • Fields of a tuple can be accessed by a period and the index of the value, e.g. t.0 and t.1.

  • 空白元組 () 也稱為「單位型別」。它既是型別,也是該型別唯一的有效值,亦即該型別及其值都以 () 表示。舉例來說,空白元組可用於表示函式或運算式沒有任何回傳值,我們會在之後的投影片看到這個例子。

    • 您可以將其視為其他程式設計語言中的 void,可能就不會感到陌生。

參照

和 C++ 一樣,Rust 具有參照:

fn main() {
    let mut x: i32 = 10;
    let ref_x: &mut i32 = &mut x;
    *ref_x = 20;
    println!("x: {x}");
}

注意事項:

  • 指派至 ref_x 時,我們必須對其解除參照,這類似於 C 和 C++ 指標。
  • 在某些情況下,尤其是在叫用方法時,Rust 會自動解除參照 (請嘗試使用 ref_x.count_ones())。
  • 宣告為 mut 的參照可在其生命週期內綁定至不同的值。

重要須知:

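  • Be sure to note the difference between let mut ref_x: &i32 and let ref_x: &mut i32. The first is a mutable binding that can be rebound to refer to different values, while the second is a reference through which the referenced value can be modified.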
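The difference between a rebindable reference and a reference to a mutable value is easier to see side by side. This is a minimal sketch; rebind and bump are hypothetical helper names.

```rust
/// Uses a `mut` binding of type `&i32`: the binding is pointed at two
/// different values in turn, but neither value is modified.
fn rebind(first: &i32, second: &i32) -> i32 {
    let mut r: &i32 = first;
    let before = *r;
    r = second; // rebinding is allowed because `r` is declared `mut`
    before + *r
}

/// Uses a `&mut i32`: the value behind the reference is modified.
fn bump(target: &mut i32) {
    *target += 1;
}

fn main() {
    println!("rebind: {}", rebind(&1, &2));

    let mut c = 10;
    let ref_c: &mut i32 = &mut c;
    bump(ref_c);
    println!("c: {c}");
}
```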

迷途參照

Rust 會以靜態方式禁止迷途參照:

fn main() {
    let ref_x: &i32;
    {
        let x: i32 = 10;
        ref_x = &x;
    }
    println!("ref_x: {ref_x}");
}
  • 所謂參照項目,可說是「借用」其參照的值。
  • Rust 會追蹤所有參照項目的生命週期,確保其存留時間夠長。
  • 我們會在講到擁有權時進一步探討「借用」。

切片

切片能讓您查看更大的集合:

fn main() {
    let mut a: [i32; 6] = [10, 20, 30, 40, 50, 60];
    println!("a: {a:?}");

    let s: &[i32] = &a[2..4];

    println!("s: {s:?}");
}
  • 切片會從切片型別借用資料。
  • 問題:如果在輸出 s 前修改 a[3],會有什麼影響?
  • 我們會建立一個切片,方法是先借用 a,然後在括號中指定起始和結束索引。

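  • If the slice starts at index 0, Rust's range syntax allows us to drop the starting index, meaning that &a[0..a.len()] and &a[..a.len()] are identical.

  • The same is true for the last index, so &a[2..a.len()] and &a[2..] are identical.

  • To easily create a slice of the full array, we can therefore use &a[..].

  • s is a reference to a slice of i32s. Notice that the type of s (&[i32]) no longer mentions the array length. This allows us to perform computation on slices of different sizes.

  • Slices always borrow from another object. In this example, a has to remain "alive" (in scope) for at least as long as our slice.

  • The question about modifying a[3] can spark an interesting discussion, but the answer is that for memory safety reasons you cannot do it through a at this point in the execution, although you can read the data from both a and s safely. It works before you create the slice, and again after the println, when the slice is no longer used. More details will be explained in the borrow checker section.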
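The range shorthands and the point about when a may be modified again can be tried out with a short program. This is a sketch only; the sum helper is a hypothetical name.

```rust
/// Sums a slice of any length; the array length is not part of `&[i32]`.
fn sum(s: &[i32]) -> i32 {
    s.iter().sum()
}

fn main() {
    let mut a = [10, 20, 30, 40, 50, 60];

    // The range shorthands select the same elements.
    assert_eq!(&a[0..a.len()], &a[..]);
    assert_eq!(&a[2..a.len()], &a[2..]);

    let s = &a[2..4];
    println!("sum of s: {}", sum(s));

    // `s` is not used after this point, so the borrow has ended
    // and `a` can be modified again.
    a[3] = 0;
    println!("sum after the change: {}", sum(&a[2..4]));
}
```

Moving the assignment a[3] = 0 above the println that uses s makes the borrow checker reject the program.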

String vs str

現在,我們可以瞭解 Rust 中有兩種字串型別:

fn main() {
    let s1: &str = "World";
    println!("s1: {s1}");

    let mut s2: String = String::from("Hello ");
    println!("s2: {s2}");
    s2.push_str(s1);
    println!("s2: {s2}");
    
    let s3: &str = &s2[6..];
    println!("s3: {s3}");
}

以 Rust 術語來說會是這樣:

  • &str 是對字串切片的不可變參照。
  • String 是可變動的字串緩衝區。
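  • &str introduces a string slice, which is an immutable reference to UTF-8 encoded string data stored in a block of memory. String literals ("Hello") are stored in the program's binary.

  • Rust's String type is a wrapper around a vector of bytes. As with a Vec<T>, it is owned.

  • As with many other types, String::from() creates a string from a string literal; String::new() creates a new empty string, to which string data can be added using the push() and push_str() methods.

  • The format!() macro is a convenient way to generate an owned string from dynamic values. It accepts the same format specification as println!().

  • You can borrow &str slices from a String via & and, optionally, a range selection.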

  • C++ 程式設計師請注意:您可以將 &str 想成 C++ 的 const char*,但這個 &str 將一律指向記憶體中的有效字串。Rust 的 String 大致等同於 C++ 的 std::string,主要差別是前者只能包含 UTF-8 編碼的位元組,且絕不會進行小字串最佳化。
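The notes above mention String::new(), push(), push_str(), format!(), and borrowing a &str from a String. A minimal sketch that exercises each of them follows; greet and build are hypothetical function names.

```rust
/// Builds an owned String from a dynamic value with `format!`.
fn greet(name: &str) -> String {
    format!("Hello {name}!")
}

/// Starts from an empty String and appends a char and string slices.
fn build() -> String {
    let mut s = String::new();
    s.push('H');
    s.push_str("ello ");
    s.push_str("World");
    s
}

fn main() {
    let owned = build();
    let whole: &str = &owned; // borrow the full string
    let tail: &str = &owned[6..]; // borrow a byte range of it
    println!("{whole} / {tail}");
    println!("{}", greet(tail));
}
```

Note that the range indexes bytes, not characters, so slicing in the middle of a multi-byte UTF-8 character panics at run time.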

函式

這是知名面試問題 FizzBuzz 的 Rust 版本:

fn main() {
    print_fizzbuzz_to(20);
}

fn is_divisible(n: u32, divisor: u32) -> bool {
    if divisor == 0 {
        return false;
    }
    n % divisor == 0
}

fn fizzbuzz(n: u32) -> String {
    let fizz = if is_divisible(n, 3) { "fizz" } else { "" };
    let buzz = if is_divisible(n, 5) { "buzz" } else { "" };
    if fizz.is_empty() && buzz.is_empty() {
        return format!("{n}");
    }
    format!("{fizz}{buzz}")
}

fn print_fizzbuzz_to(n: u32) {
    for i in 1..=n {
        println!("{}", fizzbuzz(i));
    }
}
  • 我們在 main 中參照以下所寫的函式。不需前向宣告,也不需標頭。
  • 宣告參數後面接有型別 (與某些程式設計語言相反),然後才是傳回型別。
  • 函式主體 (或任何區塊) 中的最後一個運算式會成為回傳值。您只要省略運算式結尾的 ; 即可。
  • 某些函式沒有回傳值,會傳回 () 這個「單位型別」。如果省略 -> () 傳回型別,編譯器則會推斷出這點。
  • The range expression in the for loop in print_fizzbuzz_to() contains =n, which causes it to include the upper bound.

Rustdoc

Rust 中的所有語言項目都能以特殊的 /// 語法來描述使用方法。

/// Determine whether the first argument is divisible by the second argument.
///
/// If the second argument is zero, the result is false.
fn is_divisible_by(lhs: u32, rhs: u32) -> bool {
    if rhs == 0 {
        return false;  // Corner case, early return
    }
    lhs % rhs == 0     // The last expression in a block is the return value
}

系統會將內容視為 Markdown。所有已發布的 Rust 程式庫 Crate,都會使用 rustdoc 工具自動記錄於 docs.rs 中。這種記錄 API 中所有公開項目的模式是慣用做法。

  • 向學生展示針對 rand Crate 產生的文件,路徑如下:docs.rs/rand

  • 本課程未在投影片中加入 rustdoc 只是為了節省空間,但在實際程式碼中不應這麼做。

  • 文件內註解會在稍後討論 (在模組相關頁面中),因此這裡不需提到。

  • Rustdoc 註解可包含能使用 cargo test 執行及測試的程式碼片段。我們會在「測試」一節討論這些測試。

方法

方法是與型別相關聯的函式。方法的 self 引數是與其相關聯的型別執行個體:

struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }

    fn inc_width(&mut self, delta: u32) {
        self.width += delta;
    }
}

fn main() {
    let mut rect = Rectangle { width: 10, height: 5 };
    println!("old area: {}", rect.area());
    rect.inc_width(5);
    println!("new area: {}", rect.area());
}
  • 我們將在今天的練習和明天的課程中深入探討更多方法。
  • 新增名為 Rectangle::new 的靜態方法,然後從 main 呼叫此方法:

    fn new(width: u32, height: u32) -> Rectangle {
        Rectangle { width, height }
    }
  • 雖然「技術上」來說,Rust 沒有自訂的建構函式,但靜態方法經常用來初始化結構體 (儘管並非必須)。因此,您可直接呼叫實際的建構函式 Rectangle { width, height }。詳情請參閱 Rustnomicon

  • 新增 Rectangle::square(width: u32) 建構函式,說明這類靜態方法可以採用任意參數。
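Putting the two suggestions above together, the Rectangle type with both static methods might look like the following sketch. The bodies of new and square follow the notes; nothing else about the type is changed.

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    /// Static method used as a constructor; there is no `self` parameter.
    fn new(width: u32, height: u32) -> Rectangle {
        Rectangle { width, height }
    }

    /// A second constructor with a different parameter list.
    fn square(width: u32) -> Rectangle {
        Rectangle::new(width, width)
    }

    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let rect = Rectangle::new(10, 5);
    let square = Rectangle::square(4);
    println!("rect area: {}", rect.area());
    println!("square area: {}", square.area());
}
```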

函式超載

Rust 不支援超載:

  • 每個函式都有單一實作項目:
    • 一律採用固定數量的參數。
    • 一律採用單組參數型別。
  • 不支援預設值:
    • 所有呼叫的引數數目都相同。
    • 有時系統會改用巨集。

不過,函式參數可能為泛型:

fn pick_one<T>(a: T, b: T) -> T {
    if std::process::id() % 2 == 0 { a } else { b }
}

fn main() {
    println!("coin toss: {}", pick_one("heads", "tails"));
    println!("cash prize: {}", pick_one(500, 1000));
}
  • 使用泛型時,標準程式庫的 Into<T> 可以為引數型別提供一種受限的多態性。我們會在後續章節中進一步說明。
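As a rough illustration of the Into<T> remark above, a single function can accept several argument types as long as each converts into the target type. The names shout and widen are hypothetical; the conversions used (&str and String into String, i8 and u32 into i64) are provided by the standard library.

```rust
/// Accepts anything that converts into a String.
fn shout(text: impl Into<String>) -> String {
    let owned: String = text.into();
    owned.to_uppercase()
}

/// Accepts any integer type that converts losslessly into i64.
fn widen(n: impl Into<i64>) -> i64 {
    n.into()
}

fn main() {
    println!("{}", shout("hi"));
    println!("{}", shout(String::from("ok")));
    println!("{}", widen(7i8) + widen(8u32));
}
```

This is not overloading: there is still one function with one signature, and the compiler generates a separate instance for each argument type used.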

第 1 天:上午練習

在這些練習中,我們將探索 Rust 的兩個部分:

  • 不同型別間的隱含轉換。

  • 陣列和 for 迴圈。

練習解題時的注意事項:

  • 如果可以,請在本機安裝 Rust。這樣即可在編輯器中使用自動完成功能。如要進一步瞭解如何安裝 Rust,請參閱「使用 Cargo」。

  • 或者,您也可以使用 Rust Playground。

系統會特意將程式碼片段設為無法編輯:如果您離開網頁,內嵌程式碼片段的狀態就會遺失。

完成練習後,您可以看看我們提供的解決方案

隱含轉換

Rust 不會自動在型別間套用「隱含轉換」 (與 C++ 不同)。您可以在類似下方的程式中發現這點:

fn multiply(x: i16, y: i16) -> i16 {
    x * y
}

fn main() {
    let x: i8 = 15;
    let y: i16 = 1000;

    println!("{x} * {y} = {}", multiply(x, y));
}

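The Rust integer types all implement the From<T> and Into<T> traits to let us convert between them. The From<T> trait has a single from() method and, similarly, the Into<T> trait has a single into() method. Implementing these traits is how a type expresses that it can be converted into another type.

The standard library has an implementation of From<i8> for i16, which means that we can convert a variable x of type i8 to an i16 by calling i16::from(x). Or, simpler, with x.into(), because the From<i8> for i16 implementation automatically creates an implementation of Into<i16> for i8.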

這同樣適用於您自身型別專屬的 From 實作,因此只要實作 From,就能自動取得相應的 Into 實作。
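To illustrate the point about your own types, here is a minimal sketch with two hypothetical wrapper types, Celsius and Fahrenheit. Only From is implemented by hand; the into() call works because the standard library derives the matching Into implementation.

```rust
/// Hypothetical temperature wrappers used only for this example.
struct Celsius(f64);
struct Fahrenheit(f64);

impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // Explicit call to the hand-written From implementation.
    let boiling = Fahrenheit::from(Celsius(100.0));
    // Into<Fahrenheit> for Celsius comes for free.
    let freezing: Fahrenheit = Celsius(0.0).into();
    println!("boiling: {} F, freezing: {} F", boiling.0, freezing.0);
}
```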

  1. 執行上述程式,並查看編譯器錯誤。

  2. 更新上述程式碼,使用 into() 執行轉換。

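  3. Change the types of x and y to other things (such as f32, bool, i128) to see which types you can convert to which other types. Try converting small types to big types and the other way around. Check the standard library documentation to see if From<T> is implemented for the pairs you check.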

陣列和 for 迴圈

我們已瞭解陣列的宣告方式可能如下:

fn main() {
    let array = [10, 20, 30];
}

您可以使用 {:?} 偵錯表示法輸出這樣的陣列:

fn main() {
    let array = [10, 20, 30];
    println!("array: {array:?}");
}

您可在 Rust 中使用 for 關鍵字,對陣列和範圍等項目進行疊代作業:

fn main() {
    let array = [10, 20, 30];
    print!("Iterating over array:");
    for n in &array {
        print!(" {n}");
    }
    println!();

    print!("Iterating over range:");
    for i in 0..3 {
        print!(" {}", array[i]);
    }
    println!();
}

Use the above to write a function pretty_print which pretty-prints a matrix and a function transpose which transposes a matrix (turns rows into columns):

           ⎛⎡1 2 3⎤⎞      ⎡1 4 7⎤
transpose  ⎜⎢4 5 6⎥⎟  ==  ⎢2 5 8⎥
           ⎝⎣7 8 9⎦⎠      ⎣3 6 9⎦

為這兩個函式進行硬式編碼,以便在 3 × 3 矩陣上執行。

將下方程式碼複製到 https://play.rust-lang.org/,並實作函式:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    unimplemented!()
}

fn pretty_print(matrix: &[[i32; 3]; 3]) {
    unimplemented!()
}

fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix:");
    pretty_print(&matrix);

    let transposed = transpose(matrix);
    println!("transposed:");
    pretty_print(&transposed);
}

加分題

您是否能夠使用 &[i32] 切片 (而非硬式編碼的 3 × 3 矩陣) 做為引數和傳回型別?例如針對二維切片使用 &[&[i32]]。原因為何?

請參閱 ndarray Crate,瞭解如何在確保實際工作環境品質的情況下實作。

如需加分題的解決方案和答案,請前往「解決方案」一節。

for n in &array 內使用 &array 參照項目是一個巧妙方式,可稍微預示下午將談到的擁有權問題。

沒有 & 的影響…

  • 迴圈會是取用陣列的迴圈。這是 2021 年版導入的變更。
  • 會發生隱含陣列複製的情形。由於 i32 是複製型別,因此 [i32; 3] 也會是複製型別。
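The effect of dropping the & can be observed with two small functions. This is a sketch with hypothetical names; it relies only on the fact stated above that [i32; 3] is a Copy type.

```rust
/// Iterates over a borrowed array; each `n` is a `&i32`.
fn sum_by_ref(array: &[i32; 3]) -> i32 {
    let mut total = 0;
    for n in array {
        total += *n;
    }
    total
}

/// Iterates over the array by value; each `n` is an `i32`.
fn sum_by_value(array: [i32; 3]) -> i32 {
    let mut total = 0;
    for n in array {
        total += n;
    }
    total
}

fn main() {
    let array = [10, 20, 30];
    println!("by value: {}", sum_by_value(array));
    // `array` was copied into the call above, so it is still usable here.
    println!("by ref:   {}", sum_by_ref(&array));
    println!("array:    {array:?}");
}
```

With an element type that is not Copy, such as String, the by-value call would move the array and the later uses would not compile.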

控制流程

如同我們所見,if 是 Rust 中的一種表達式。它可以用來根據條件執行兩個區塊之中的一個,而區塊的執行結果可以進一步轉變成 if 表達式的賦值。其他控制流程表達式在 Rust 中也有類似的用法。

區塊

A block in Rust contains a sequence of expressions. Each block has a value and a type, which are those of the last expression of the block:

fn main() {
    let x = {
        let y = 10;
        println!("y: {y}");
        let z = {
            let w = {
                3 + 4
            };
            println!("w: {w}");
            y * w
        };
        println!("z: {z}");
        z - y
    };
    println!("x: {x}");
}

If the last expression ends with ;, then the resulting value and type is ().

同樣的規則也適用於函式:函式的數值即為函式本體的回傳值:

fn double(x: i32) -> i32 {
    x + x
}

fn main() {
    println!("doubled: {}", double(7));
}

重點:

  • 這張投影片所表達的重點在於 Rust 中的區塊具有一個數值以及一個型別。
  • 你可以藉由改變區塊中的最後一行來觀察區塊數值的變化。舉例來說,新增或刪除一個分號,或者使用 return

if 表達式

你可以像在其他語言中使用 if 陳述式那樣地使用 if 表達式

fn main() {
    let mut x = 10;
    if x % 2 == 0 {
        x = x / 2;
    } else {
        x = 3 * x + 1;
    }
}

此外,你也可以將 if 當作表達式使用。每個區塊中的最後一行式子將成為 if 表達式的賦值:

fn main() {
    let mut x = 10;
    x = if x % 2 == 0 {
        x / 2
    } else {
        3 * x + 1
    };
}

因為 if 被當作表達式使用,它必須擁有一個特定的型別,因此兩個分支區塊必須擁有同樣的型別。試著在第二個範例中的 x / 2 之後加上 ;,並觀察其結果。

for 迴圈

The for loop is closely related to the while let loop. It will automatically call into_iter() on the expression and then iterate over it:

fn main() {
    let v = vec![10, 20, 30];

    for x in v {
        println!("x: {x}");
    }
    
    for i in (0..10).step_by(2) {
        println!("i: {i}");
    }
}

You can use break and continue here as usual.

  • 在 Rust 中,索引疊代不是只適用於該情況的特殊語法。
  • (0..10) 是實作 Iterator 特徵的範圍。
  • step_by is a method that returns another Iterator that skips every other element.
  • 請修改向量中的元素,並說明編譯器錯誤。將向量 v 變更為可變動項,並將 for 迴圈變更為 for x in v.iter_mut()
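The last two notes can be sketched as follows: iter_mut() yields mutable references, so the elements can be changed in place, and step_by(2) visits every second value of a range. The function names are hypothetical.

```rust
/// Doubles every element in place using a mutable iterator.
fn double_all(values: &mut [i32]) {
    for x in values.iter_mut() {
        *x *= 2; // `x` is a `&mut i32`, so it must be dereferenced
    }
}

/// Collects every second number of the range `0..limit`.
fn evens_below(limit: u32) -> Vec<u32> {
    (0..limit).step_by(2).collect()
}

fn main() {
    let mut v = vec![10, 20, 30];
    double_all(&mut v);
    println!("v: {v:?}");
    println!("evens: {:?}", evens_below(10));
}
```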

while 迴圈

while 關鍵字的運作方式與其他語言非常相似:

fn main() {
    let mut x = 10;
    while x != 1 {
        x = if x % 2 == 0 {
            x / 2
        } else {
            3 * x + 1
        };
    }
    println!("Final x: {x}");
}

break and continue

  • If you want to exit a loop early, use break.
  • If you want to immediately start the next iteration, use continue.

continue 以及 break 都可以選擇性地接收一個迴圈標籤,用來跳出巢狀迴圈中的某一層:

fn main() {
    let v = vec![10, 20, 30];
    let mut iter = v.into_iter();
    'outer: while let Some(x) = iter.next() {
        println!("x: {x}");
        let mut i = 0;
        while i < x {
            println!("x: {x}, i: {i}");
            i += 1;
            if i == 3 {
                break 'outer;
            }
        }
    }
}

在這個範例中,內層迴圈經過三次迭代後,我們使用 break 跳出外層迴圈。

loop 運算式

最後,有一個 loop 關鍵字會建立無限迴圈。

Here you must either break or return to stop the loop:

fn main() {
    let mut x = 10;
    loop {
        x = if x % 2 == 0 {
            x / 2
        } else {
            3 * x + 1
        };
        if x == 1 {
            break;
        }
    }
    println!("Final x: {x}");
}
  • 請使用 break 8 等值中斷 loop,然後顯示出來。
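  • Note that loop is the only looping construct which returns a non-trivial value. This is because it is guaranteed to be entered at least once (unlike while and for loops).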
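A loop that produces a value through break can be sketched like this. The function name is hypothetical, and the sketch assumes limit is below 2^31 so that the doubling does not overflow a u32.

```rust
/// Returns the smallest power of two that is strictly greater than `limit`.
/// Assumes `limit < 2^31` so that `p *= 2` cannot overflow.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut p = 1;
    // The value given to `break` becomes the value of the whole `loop`
    // expression, which is also the function's return value here.
    loop {
        if p > limit {
            break p;
        }
        p *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_above(100));
}
```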

變數

Rust 可透過靜態型別確保型別安全。根據預設,變數綁定不可變動:

fn main() {
    let x: i32 = 10;
    println!("x: {x}");
    // x = 20;
    // println!("x: {x}");
}
  • 由於型別推論的關係,i32 為選用項目。隨著課程進行,我們會逐漸減少示範型別的比例。

型別推斷

Rust 會觀察變數的「使用」方式,藉此判斷型別:

fn takes_u32(x: u32) {
    println!("u32: {x}");
}

fn takes_i8(y: i8) {
    println!("i8: {y}");
}

fn main() {
    let x = 10;
    let y = 20;

    takes_u32(x);
    takes_i8(y);
    // takes_u32(y);
}

這張投影片展示了 Rust 編譯器如何根據變數宣告和用法設下的限制來推斷型別。

請務必強調,以這種方式宣告的變數,並非「任一型別」這類可存放任何資料的動態型別。此類宣告產生的機器碼與型別的明確宣告相同。編譯器會替我們執行工作,並協助編寫更精簡的程式碼。

以下程式碼會指示編譯器使用 _ 做為預留位置,進而複製到特定泛型容器中,而無須明確指出包含的型別:

fn main() {
    let mut v = Vec::new();
    v.push((10, false));
    v.push((20, true));
    println!("v: {v:?}");

    let vv = v.iter().collect::<std::collections::HashSet<_>>();
    println!("vv: {vv:?}");
}

collect relies on FromIterator, which HashSet implements.

靜態和常數變數

靜態和常數變數是建立全域範圍值的兩種不同方式,這個值無法在程式執行期間移動或重新分配。

const

常數變數會在編譯期間評估,且無論用於何處,其值都會內嵌:

const DIGEST_SIZE: usize = 3;
const ZERO: Option<u8> = Some(42);

fn compute_digest(text: &str) -> [u8; DIGEST_SIZE] {
    let mut digest = [ZERO.unwrap_or(0); DIGEST_SIZE];
    for (idx, &b) in text.as_bytes().iter().enumerate() {
        digest[idx % DIGEST_SIZE] = digest[idx % DIGEST_SIZE].wrapping_add(b);
    }
    digest
}

fn main() {
    let digest = compute_digest("Hello");
    println!("Digest: {digest:?}");
}

根據《Rust RFC 手冊》所述,這類值會在使用時內嵌。

您只能在編譯期間呼叫標示為 const 的函式,以便產生 const 值,但可以在執行階段呼叫 const 函式。

static

靜態變數會在程式的整個執行過程中持續運作,因此不會移動:

static BANNER: &str = "Welcome to RustOS 3.14";

fn main() {
    println!("{BANNER}");
}

如《Rust RFC 手冊》所述,這類值在使用時不會內嵌,且具備實際相關聯的記憶體位置。這對不安全和嵌入的程式碼很有幫助,且變數在程式執行全程都會持續運作。當全域範圍值沒有需要物件識別子的理由時,通常首選會是使用 const

由於 static 變數可從任何執行緒存取,因此必須是 Sync。內部可變動性則可透過原子或類似的 Mutex 實現。也可能有可變動的靜態項目,但這些需要手動同步,因此每當存取這類項目時就需要動用 unsafe 程式碼。我們會在「不安全的 Rust」章節中探討可變動的靜態項目

  • 別忘了提到 const 的行為在語意上與 C++ 的 constexpr 相似。
  • 另一方面,static 則更類似於 C++ 中的 const 或可變動的全域變數。
  • static 提供物件識別子,也就是記憶體中的位址,和具有內部可變動性型別 (例如 Mutex<T>) 所需的狀態。
  • 需要在執行階段評估常數的情況雖不常見,但這會比使用靜態項目更有用且安全。
  • 您可以使用 std::thread_local 巨集來建立 thread_local 資料。

屬性表:

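Property                                       Static                               Constant
Has an address in memory                       Yes                                  No (inlined)
Lives for the entire duration of the program   Yes                                  No
Can be mutable                                 Yes (unsafe)                         No
Evaluated at compile time                      Yes (initialized at compile time)    Yes
Inlined wherever it is used                    No                                   Yes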
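Interior mutability in a static, as mentioned in the notes, can be sketched with an atomic counter and a Mutex, neither of which needs unsafe. The names CALLS, LOG, GREETING, and record are hypothetical, and the sketch assumes Rust 1.63 or later, where Mutex::new can be used in a static initializer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Statics have a fixed address and live for the whole program.
static CALLS: AtomicUsize = AtomicUsize::new(0);
static LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

// A const is inlined at each use and has no identity of its own.
const GREETING: &str = "hello";

/// Appends a message to the global log and returns the call count so far.
fn record(msg: &str) -> usize {
    LOG.lock().unwrap().push(format!("{GREETING}: {msg}"));
    CALLS.fetch_add(1, Ordering::SeqCst) + 1
}

fn main() {
    record("first");
    let count = record("second");
    println!("calls: {count}, log: {:?}", LOG.lock().unwrap());
}
```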

範圍和遮蔽

您可以遮蔽變量,包括來自外部範圍以及來自同一範圍的變量:

fn main() {
    let a = 10;
    println!("before: {a}");

    {
        let a = "hello";
        println!("inner scope: {a}");

        let a = true;
        println!("shadowed in inner scope: {a}");
    }

    println!("after: {a}");
}
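  • Definition: Shadowing is different from mutation, because after shadowing both variables' memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.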
  • 遮蔽變數可以有不同的型別。
  • 遮蔽一開始看起來模糊不清,但對於保留 .unwrap() 之後的值很方便。
  • 下列程式碼說明遮蔽範圍中不可變動的變數時,為何編譯器就是無法重複使用記憶體位置 (即使型別未變更也一樣)。
fn main() {
    let a = 1;
    let b = &a;
    let a = a + 1;
    println!("{a} {b}");
}
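A common use of shadowing is to keep one name while a value moves through several types. The following is a sketch with a hypothetical function; it uses unwrap_or rather than unwrap so that invalid input does not panic.

```rust
/// Trims and parses the input, reusing the name `input` at every step.
fn parse_and_double(input: &str) -> i32 {
    let input = input.trim(); // still a &str, now without whitespace
    let input: i32 = input.parse().unwrap_or(0); // now an i32
    input * 2
}

fn main() {
    println!("{}", parse_and_double(" 21 "));
    println!("{}", parse_and_double("not a number"));
}
```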

列舉

enum 關鍵字可建立具有幾個不同變體的型別:

fn generate_random_number() -> i32 {
    // Implementation based on https://xkcd.com/221/
    4  // Chosen by fair dice roll. Guaranteed to be random.
}

#[derive(Debug)]
enum CoinFlip {
    Heads,
    Tails,
}

fn flip_coin() -> CoinFlip {
    let random_number = generate_random_number();
    if random_number % 2 == 0 {
        return CoinFlip::Heads;
    } else {
        return CoinFlip::Tails;
    }
}

fn main() {
    println!("You got: {:?}", flip_coin());
}

重點:

  • 列舉可讓您在單一類別中收集一組值。
  • This page offers an enum type CoinFlip with two variants Heads and Tails. You might note the namespace when using variants.
  • 這或許是比較結構體和列舉的好時機:
    • 無論使用何者,都能取得沒有欄位的簡易版本 (單元結構體),或是具有不同欄位型別的版本 (變體負載)。
    • 無論使用何者,相關函式都會在 impl 區塊中定義。
    • 您甚至可以使用獨立的結構體實作列舉的不同變體,但比起在列舉中定義全部變體的情況,這麼做會讓變體的型別有所不同。

變體負載

您可以定義更豐富的列舉,讓列舉的變體攜帶資料。接著,您可以使用 match 陳述式,從各個變體擷取資料:

enum WebEvent {
    PageLoad,                 // Variant without payload
    KeyPress(char),           // Tuple struct variant
    Click { x: i64, y: i64 }, // Full struct variant
}

#[rustfmt::skip]
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad       => println!("page loaded"),
        WebEvent::KeyPress(c)    => println!("pressed '{c}'"),
        WebEvent::Click { x, y } => println!("clicked at x={x}, y={y}"),
    }
}

fn main() {
    let load = WebEvent::PageLoad;
    let press = WebEvent::KeyPress('x');
    let click = WebEvent::Click { x: 20, y: 80 };

    inspect(load);
    inspect(press);
    inspect(click);
}
  • 只有在與模式配對相符後,才能存取列舉變數中的值。此模式會將參照繫結至 => 後方「配對分支」中的欄位。
    • 系統會從上到下將運算式與模式進行配對。在 Rust 中,不會像在 C 或 C++ 中一樣出現貫穿 (fall-through) 情形。
    • 配對運算式具有值。此值是系統執行的配對分支中的最後一個運算式。
    • 我們會從上方開始尋找符合該值的模式,然後執行箭頭後方的程式碼。一旦發現相符項目,就會停止。
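  • Demonstrate what happens when the search is inexhaustive. Note the advantage the Rust compiler provides by confirming when all cases are handled.
  • match inspects a hidden discriminant field in the enum.
  • It is possible to retrieve the discriminant by calling std::mem::discriminant().
    • This is useful, for example, if implementing PartialEq for structs where comparing field values doesn't affect equality.
  • WebEvent::Click { ... } is not exactly the same as WebEvent::Click(Click) with a top-level struct Click { ... }. The inlined version cannot implement traits, for example.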
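The std::mem::discriminant() call mentioned above can be used to compare which variant two values hold while ignoring their payloads. This sketch reuses the WebEvent enum from the slide; same_kind and describe are hypothetical helpers.

```rust
use std::mem::discriminant;

enum WebEvent {
    PageLoad,
    KeyPress(char),
    Click { x: i64, y: i64 },
}

/// True when both events are the same variant, whatever their payload.
fn same_kind(a: &WebEvent, b: &WebEvent) -> bool {
    discriminant(a) == discriminant(b)
}

/// An exhaustive match: removing any arm is a compile error.
fn describe(event: &WebEvent) -> String {
    match event {
        WebEvent::PageLoad => String::from("page loaded"),
        WebEvent::KeyPress(c) => format!("pressed '{c}'"),
        WebEvent::Click { x, y } => format!("clicked at x={x}, y={y}"),
    }
}

fn main() {
    let a = WebEvent::KeyPress('a');
    let b = WebEvent::KeyPress('b');
    let c = WebEvent::Click { x: 20, y: 80 };
    println!("{} / {}: same kind? {}", describe(&a), describe(&b), same_kind(&a, &b));
    println!("{} / {}: same kind? {}", describe(&a), describe(&c), same_kind(&a, &c));
    println!("{}", describe(&WebEvent::PageLoad));
}
```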

列舉大小

Rust 列舉會緊密封裝,並考量因對齊而造成的限制:

use std::any::type_name;
use std::mem::{align_of, size_of};

fn dbg_size<T>() {
    println!("{}: size {} bytes, align: {} bytes",
        type_name::<T>(), size_of::<T>(), align_of::<T>());
}

enum Foo {
    A,
    B,
}

fn main() {
    dbg_size::<Foo>();
}

重點:

  • 在內部,Rust 會使用欄位 (判別值) 追蹤列舉變體。

  • 您可以視需要控制判別值,例如為了與 C 相容:

    #[repr(u32)]
    enum Bar {
        A,  // 0
        B = 10000,
        C,  // 10001
    }
    
    fn main() {
        println!("A: {}", Bar::A as u32);
        println!("B: {}", Bar::B as u32);
        println!("C: {}", Bar::C as u32);
    }

    如果沒有 repr,判別值型別會需要 2 個位元組,因為 10001 適合 2 個位元組。

  • 請嘗試其他型別,例如以下項目:

    • dbg_size::<bool>(): size 1 bytes, align: 1 bytes,
    • dbg_size::<Option<bool>>(): size 1 bytes, align: 1 bytes (niche optimization, see below),
    • dbg_size::<&i32>(): size 8 bytes, align: 8 bytes (on a 64-bit machine),
    • dbg_size::<Option<&i32>>(): size 8 bytes, align: 8 bytes (null pointer optimization, see below).
  • Niche optimization: Rust will merge unused bit patterns for the enum discriminant.

  • Null pointer optimization: for some types, Rust guarantees that size_of::<T>() equals size_of::<Option<T>>().

    如果想示範位元表示法實際運作時「可能」的樣子,可以使用下列範例程式碼。請務必注意,編譯器並無對這個表示法提供保證,因此這完全不安全。

    use std::mem::transmute;
    
    macro_rules! dbg_bits {
        ($e:expr, $bit_type:ty) => {
            println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
        };
    }
    
    fn main() {
        // TOTALLY UNSAFE. Rust provides no guarantees about the bitwise
        // representation of types.
        unsafe {
            println!("Bitwise representation of bool");
            dbg_bits!(false, u8);
            dbg_bits!(true, u8);
    
            println!("Bitwise representation of Option<bool>");
            dbg_bits!(None::<bool>, u8);
            dbg_bits!(Some(false), u8);
            dbg_bits!(Some(true), u8);
    
            println!("Bitwise representation of Option<Option<bool>>");
            dbg_bits!(Some(Some(false)), u8);
            dbg_bits!(Some(Some(true)), u8);
            dbg_bits!(Some(None::<bool>), u8);
            dbg_bits!(None::<Option<bool>>, u8);
    
            println!("Bitwise representation of Option<&i32>");
            dbg_bits!(None::<&i32>, usize);
            dbg_bits!(Some(&0i32), usize);
        }
    }

    如果想討論將超過 256 個 Option 鏈結在一起的情況,可以使用下列更複雜的範例。

    #![recursion_limit = "1000"]
    
    use std::mem::transmute;
    
    macro_rules! dbg_bits {
        ($e:expr, $bit_type:ty) => {
            println!("- {}: {:#x}", stringify!($e), transmute::<_, $bit_type>($e));
        };
    }
    
    // Macro to wrap a value in 2^n Some() where n is the number of "@" signs.
    // Increasing the recursion limit is required to evaluate this macro.
    macro_rules! many_options {
        ($value:expr) => { Some($value) };
        ($value:expr, @) => {
            Some(Some($value))
        };
        ($value:expr, @ $($more:tt)+) => {
            many_options!(many_options!($value, $($more)+), $($more)+)
        };
    }
    
    fn main() {
        // TOTALLY UNSAFE. Rust provides no guarantees about the bitwise
        // representation of types.
        unsafe {
            assert_eq!(many_options!(false), Some(false));
            assert_eq!(many_options!(false, @), Some(Some(false)));
            assert_eq!(many_options!(false, @@), Some(Some(Some(Some(false)))));
    
            println!("Bitwise representation of a chain of 128 Option's.");
            dbg_bits!(many_options!(false, @@@@@@@), u8);
            dbg_bits!(many_options!(true, @@@@@@@), u8);
    
            println!("Bitwise representation of a chain of 256 Option's.");
            dbg_bits!(many_options!(false, @@@@@@@@), u16);
            dbg_bits!(many_options!(true, @@@@@@@@), u16);
    
            println!("Bitwise representation of a chain of 257 Option's.");
            dbg_bits!(many_options!(Some(false), @@@@@@@@), u16);
            dbg_bits!(many_options!(Some(true), @@@@@@@@), u16);
            dbg_bits!(many_options!(None::<bool>, @@@@@@@@), u16);
        }
    }

Novel Control Flow

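Rust has a few control flow constructs which differ from other languages. They are used for pattern matching:

  • if let expressions
  • while let expressions
  • match expressions

if let expressions

The if let expression lets you execute different code depending on whether a value matches a pattern: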

fn main() {
    let arg = std::env::args().next();
    if let Some(value) = arg {
        println!("Program name: {value}");
    } else {
        println!("Missing name?");
    }
}

See pattern matching for more details on patterns in Rust.

  • Unlike match, if let does not have to cover all branches. This can make it more concise than match.

  • A common usage is handling Some values when working with Option.

  • Unlike match, if let does not support guard clauses for pattern matching.

  • Since 1.65, a similar let-else construct allows a destructuring assignment; if the pattern fails to match, it executes a block that must abort normal control flow (with panic/return/break/continue):

    fn main() {
        println!("{:?}", second_word_to_upper("foo bar"));
    }
     
    fn second_word_to_upper(s: &str) -> Option<String> {
        let mut it = s.split(' ');
        let (Some(_), Some(item)) = (it.next(), it.next()) else {
            return None;
        };
        Some(item.to_uppercase())
    }
    

while let loops

Like with if let, there is a while let variant which repeatedly tests a value against a pattern:

fn main() {
    let v = vec![10, 20, 30];
    let mut iter = v.into_iter();

    while let Some(x) = iter.next() {
        println!("x: {x}");
    }
}

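Here the iterator returned by v.into_iter() will return an Option<i32> on every call to next(). It returns Some(x) until it is done, after which it will return None. The while let lets us keep iterating through all items.

See pattern matching for more details on patterns in Rust.

  • Point out that the while let loop will keep going as long as the value matches the pattern.
  • You could rewrite the while let loop as an infinite loop with an if statement that breaks when there is no value to unwrap for iter.next(). The while let provides syntactic sugar for that scenario.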
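
The rewrite mentioned in the last note can be sketched as follows. The helper names sum_while_let and sum_loop are hypothetical and only serve to put the two forms side by side; both are assumed to do the same work.

```rust
// Hypothetical helpers: both functions sum the items of a vector.

// Version 1: `while let`, as on the slide.
fn sum_while_let(v: Vec<i32>) -> i32 {
    let mut iter = v.into_iter();
    let mut total = 0;
    while let Some(x) = iter.next() {
        total += x;
    }
    total
}

// Version 2: the same logic as an infinite loop with an explicit `if`.
fn sum_loop(v: Vec<i32>) -> i32 {
    let mut iter = v.into_iter();
    let mut total = 0;
    loop {
        let next = iter.next();
        if next.is_none() {
            break; // nothing left to unwrap, so leave the loop
        }
        total += next.unwrap();
    }
    total
}

fn main() {
    let v = vec![10, 20, 30];
    println!("while let: {}", sum_while_let(v.clone()));
    println!("loop:      {}", sum_loop(v));
}
```

The second version needs a temporary variable, an explicit check, and an unwrap, which is the boilerplate that while let removes.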

match expressions

The match keyword is used to match a value against one or more patterns. In that sense, it works like a series of if let expressions:

fn main() {
    match std::env::args().next().as_deref() {
        Some("cat") => println!("Will do cat things"),
        Some("ls")  => println!("Will ls some files"),
        Some("mv")  => println!("Let's move some files"),
        Some("rm")  => println!("Uh, dangerous!"),
        None        => println!("Hmm, no program name?"),
        _           => println!("Unknown program name!"),
    }
}

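Like if let, each match arm must have the same type. The type is the last expression of the block, if any. In the example above, the type is ().

See pattern matching for more details on patterns in Rust.

  • Save the match expression to a variable and print it out.
  • Remove .as_deref() and explain the error.
    • std::env::args().next() returns an Option<String>, but we cannot match against String.
    • as_deref() transforms an Option<T> to Option<&T::Target>. In our case, this turns Option<String> into Option<&str>.
    • We can now use pattern matching to match against the &str inside Option.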
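
A minimal sketch of the first note, assuming a hypothetical helper describe that returns the message instead of printing it. Because match is an expression, its value can be bound to a variable as long as every arm has the same type (here &'static str).

```rust
// Hypothetical helper: maps an optional program name to a message.
fn describe(program: Option<String>) -> &'static str {
    // `as_deref()` turns Option<String> into Option<&str>, so string
    // literals can be used as patterns.
    let message = match program.as_deref() {
        Some("cat") => "Will do cat things",
        Some("ls") => "Will ls some files",
        None => "Hmm, no program name?",
        _ => "Unknown program name!",
    };
    message
}

fn main() {
    let message = describe(std::env::args().next());
    println!("{message}");
}
```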

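Pattern Matching

The match keyword lets you match a value against one or more patterns. The comparisons are done from top to bottom and the first match wins.

The patterns can be simple values, similarly to switch in C and C++: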

fn main() {
    let input = 'x';

    match input {
        'q'                   => println!("Quitting"),
        'a' | 's' | 'w' | 'd' => println!("Moving around"),
        '0'..='9'             => println!("Number input"),
        _                     => println!("Something else"),
    }
}

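The _ pattern is a wildcard pattern which matches any value.

Key points:

  • You might point out how some specific characters are used in a pattern:
    • | as an or
    • .. can expand as much as it needs to
    • 1..=5 represents an inclusive range
    • _ is a wildcard
  • It can be useful to show how binding works, for instance by replacing the wildcard with a variable, or by removing the quotes around q.
  • You can demonstrate matching on a reference.
  • This might be a good time to bring up the concept of irrefutable patterns, as the term can show up in error messages.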
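
The binding and reference notes above could be demonstrated with a sketch like this one. The functions classify and starts_at_origin are hypothetical names introduced for illustration.

```rust
// Hypothetical helper: shows a binding with `@` and a catch-all binding.
fn classify(input: char) -> String {
    match input {
        'q' => String::from("quit"),
        // `digit @ ...` tests the range and binds the matched value.
        digit @ '0'..='9' => format!("digit {digit}"),
        // A plain identifier matches anything and binds it, like `_` with a name.
        other => format!("something else: {other}"),
    }
}

// Hypothetical helper: matching on a reference to a tuple.
fn starts_at_origin(point: &(i32, i32)) -> bool {
    match point {
        &(0, _) => true,
        _ => false,
    }
}

fn main() {
    println!("{}", classify('7'));
    println!("{}", classify('x'));
    println!("{}", starts_at_origin(&(0, 5)));
}
```

Replacing 'q' with a bare q in the first arm would turn that arm into a catch-all binding, which is the pitfall the note about removing the quotes refers to.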

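Destructuring Enums

Patterns can also be used to bind variables to parts of your values. This is how you inspect the structure of your types. Let us start with a simple enum type: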

enum Result {
    Ok(i32),
    Err(String),
}

fn divide_in_two(n: i32) -> Result {
    if n % 2 == 0 {
        Result::Ok(n / 2)
    } else {
        Result::Err(format!("cannot divide {n} into two equal parts"))
    }
}

fn main() {
    let n = 100;
    match divide_in_two(n) {
        Result::Ok(half) => println!("{n} divided in two is {half}"),
        Result::Err(msg) => println!("sorry, an error happened: {msg}"),
    }
}

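Here we have used the arms to destructure the Result value. In the first arm, half is bound to the value inside the Ok variant. In the second arm, msg is bound to the error message.

Key points:

  • The if/else expression returns an enum that is later unpacked with a match.
  • You can try adding a third variant to the enum definition and displaying the errors when running the code. Point out the places where your code is now inexhaustive and how the compiler tries to give you hints.

Destructuring Structs

You can also destructure structs: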

struct Foo {
    x: (u32, u32),
    y: u32,
}

#[rustfmt::skip]
fn main() {
    let foo = Foo { x: (1, 2), y: 3 };
    match foo {
        Foo { x: (1, b), y } => println!("x.0 = 1, b = {b}, y = {y}"),
        Foo { y: 2, x: i }   => println!("y = 2, x = {i:?}"),
        Foo { y, .. }        => println!("y = {y}, other fields were ignored"),
    }
}
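  • Change the literal values in foo to match with the other patterns.
  • Add a new field to Foo and make changes to the pattern as needed.
  • The distinction between a capture and a constant expression can be hard to spot. Try changing the 2 in the second arm to a variable, and see that it subtly doesn't work. Change it to a const and see it working again.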
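
The capture-versus-constant distinction from the last note can be sketched like this. The constant TWO and the function describe are hypothetical names; the struct mirrors the Foo from the slide.

```rust
struct Foo {
    x: (u32, u32),
    y: u32,
}

// A const in pattern position is compared against the value.
const TWO: u32 = 2;

// Hypothetical helper that returns the description instead of printing it.
fn describe(foo: Foo) -> String {
    match foo {
        // Matches only when y == 2, because TWO is a constant.
        // A `let two = 2;` used here instead would create a new binding
        // named `two` that matches every value of y.
        Foo { y: TWO, x } => format!("y = 2, x = {x:?}"),
        Foo { y, .. } => format!("y = {y}, other fields were ignored"),
    }
}

fn main() {
    println!("{}", describe(Foo { x: (1, 2), y: 2 }));
    println!("{}", describe(Foo { x: (1, 2), y: 3 }));
}
```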

Destructuring Arrays

You can destructure arrays, tuples, and slices by matching on their elements:

#[rustfmt::skip]
fn main() {
    let triple = [0, -2, 3];
    println!("Tell me about {triple:?}");
    match triple {
        [0, y, z] => println!("First is 0, y = {y}, and z = {z}"),
        [1, ..]   => println!("First is 1 and the rest were ignored"),
        _         => println!("All elements were ignored"),
    }
}
  • Destructuring of slices of unknown length also works with patterns of fixed length.

    fn main() {
        inspect(&[0, -2, 3]);
        inspect(&[0, -2, 3, 4]);
    }
    
    #[rustfmt::skip]
    fn inspect(slice: &[i32]) {
        println!("Tell me about {slice:?}");
        match slice {
            &[0, y, z] => println!("First is 0, y = {y}, and z = {z}"),
            &[1, ..]   => println!("First is 1 and the rest were ignored"),
            _          => println!("All elements were ignored"),
        }
    }
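  • Create a new pattern using _ to represent an element.

  • Add more values to the array.

  • Point out how .. will expand to account for a different number of elements.

  • Show matching against the tail with the patterns [.., b] and [a@..,b].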
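
A short sketch of the tail patterns mentioned in the last note. The function names last_element and split_tail are hypothetical.

```rust
// `[.., last]` matches any non-empty slice and binds only the final element.
fn last_element(slice: &[i32]) -> Option<i32> {
    match slice {
        [.., last] => Some(*last),
        [] => None,
    }
}

// `[init @ .., last]` additionally binds everything before the final
// element as a sub-slice.
fn split_tail(slice: &[i32]) -> Option<(&[i32], i32)> {
    match slice {
        [init @ .., last] => Some((init, *last)),
        [] => None,
    }
}

fn main() {
    println!("{:?}", last_element(&[0, -2, 3]));
    println!("{:?}", split_tail(&[0, -2, 3, 4]));
    println!("{:?}", split_tail(&[]));
}
```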

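Match Guards

When matching, you can add a guard to a pattern. This is an arbitrary Boolean expression which will be executed if the pattern matches: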

#[rustfmt::skip]
fn main() {
    let pair = (2, -2);
    println!("Tell me about {pair:?}");
    match pair {
        (x, y) if x == y     => println!("These are twins"),
        (x, y) if x + y == 0 => println!("Antimatter, kaboom!"),
        (x, _) if x % 2 == 1 => println!("The first one is odd"),
        _                    => println!("No correlation..."),
    }
}

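Key points:

  • Match guards as a separate syntax feature are important and necessary when we wish to concisely express more complex ideas than patterns alone would allow.
  • They are not the same as a separate if expression inside of the match arm. An if expression inside of the branch block (after =>) happens after the match arm is selected. Failing the if condition inside of that block won't result in other arms of the original match expression being considered.
  • You can use the variables defined in the pattern in your if expression.
  • The condition defined in the guard applies to every expression in a pattern with an |.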
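
The last point is easy to misread as "the guard only belongs to the final alternative". A small sketch, using a hypothetical function describe, shows that the guard covers the whole 1 | 2 | 3 pattern:

```rust
// Hypothetical helper: `allow` is an illustrative flag, not part of the slide.
fn describe(n: i32, allow: bool) -> &'static str {
    match n {
        // The guard is checked for 1, 2 and 3 alike, not just for 3.
        1 | 2 | 3 if allow => "small and allowed",
        // When the guard fails, matching continues with the next arm.
        1 | 2 | 3 => "small but not allowed",
        _ => "something else",
    }
}

fn main() {
    println!("{}", describe(1, true));
    println!("{}", describe(1, false));
    println!("{}", describe(10, true));
}
```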

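Day 1: Afternoon Exercises

We will look at two things:

  • The Luhn algorithm,

  • An exercise on pattern matching.

After looking at the exercises, you can look at the solutions provided.

Luhn Algorithm

The Luhn algorithm is used to validate credit card numbers. The algorithm takes a string as input and does the following to validate the credit card number:

  • Ignore all spaces. Reject numbers with fewer than two digits.

  • Moving from right to left, double every second digit: for the number 1234, we double 3 and 1. For the number 98765, we double 6 and 8.

  • After doubling a digit, sum the digits if the result is greater than 9. So doubling 7 becomes 14 which becomes 1 + 4 = 5.

  • Sum all the undoubled and doubled digits.

  • The credit card number is valid if the sum ends with 0.

Copy the code below to https://play.rust-lang.org/ and implement the function.

Try to solve the problem the "simple" way first, using for loops and integers. Then, revisit the solution and try to implement it with iterators.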

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub fn luhn(cc_number: &str) -> bool {
    unimplemented!()
}

#[test]
fn test_non_digit_cc_number() {
    assert!(!luhn("foo"));
    assert!(!luhn("foo 0 0"));
}

#[test]
fn test_empty_cc_number() {
    assert!(!luhn(""));
    assert!(!luhn(" "));
    assert!(!luhn("  "));
    assert!(!luhn("    "));
}

#[test]
fn test_single_digit_cc_number() {
    assert!(!luhn("0"));
}

#[test]
fn test_two_digit_cc_number() {
    assert!(luhn(" 0 0 "));
}

#[test]
fn test_valid_cc_number() {
    assert!(luhn("4263 9826 4026 9299"));
    assert!(luhn("4539 3195 0343 6467"));
    assert!(luhn("7992 7398 713"));
}

#[test]
fn test_invalid_cc_number() {
    assert!(!luhn("4223 9826 4026 9299"));
    assert!(!luhn("4539 3195 0343 6476"));
    assert!(!luhn("8273 1232 7352 0569"));
}

#[allow(dead_code)]
fn main() {}

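Welcome to Day 2

Now that we have seen a fair amount of Rust, we will continue with:

  • Memory management: stack vs heap, manual memory management, scope-based memory management, and garbage collection.

  • Ownership: move semantics, copying and cloning, borrowing, and lifetimes.

  • Structs and methods.

  • The Standard Library: String, Option and Result, Vec, HashMap, Rc and Arc.

  • Modules: visibility, paths, and filesystem hierarchy.

Memory Management

Traditionally, languages have fallen into two broad categories:

  • Full control via manual memory management: C, C++, Pascal, …
  • Full safety via automatic memory management at runtime: Java, Python, Go, Haskell, …

Rust offers a new mix:

Full control and safety via compile time enforcement of correct memory management.

It does this with an explicit ownership concept.

First, let's refresh how memory management works.

The Stack vs The Heap

  • Stack: Contiguous area of memory for local variables.

    • Values have fixed sizes known at compile time.
    • Extremely fast: just move a stack pointer.
    • Easy to manage: follows function calls.
    • Great memory locality.
  • Heap: Storage of values outside of function calls.

    • Values have dynamic sizes determined at runtime.
    • Slightly slower than the stack: some book-keeping needed.
    • No guarantee of memory locality.

Stack and Heap Example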

Creating a String puts fixed-sized metadata on the stack and dynamically sized data, the actual string, on the heap:

fn main() {
    let s1 = String::from("Hello");
}
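[Diagram: on the stack, s1 holds ptr, len = 5, and capacity = 5; ptr points to the bytes "Hello" on the heap.]

  • Mention that a String is backed by a Vec, so it has a capacity and length and can grow if mutable via reallocation on the heap.

  • If students ask about it, you can mention that the underlying memory is heap allocated using the System Allocator and custom allocators can be implemented using the Allocator API.

  • We can inspect the memory layout with unsafe code. However, you should point out that this is rightfully unsafe!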

    fn main() {
        let mut s1 = String::from("Hello");
        s1.push(' ');
        s1.push_str("world");
        // DON'T DO THIS AT HOME! For educational purposes only.
        // String provides no guarantees about its layout, so this could lead to
        // undefined behavior.
        unsafe {
            let (ptr, capacity, len): (usize, usize, usize) = std::mem::transmute(s1);
            println!("ptr = {ptr:#x}, len = {len}, capacity = {capacity}");
        }
    }

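Manual Memory Management

You allocate and deallocate heap memory yourself.

If not done with care, this can lead to crashes, bugs, security vulnerabilities, and memory leaks.

C Example

You must call free on every pointer you allocate with malloc: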

void foo(size_t n) {
    int* int_array = malloc(n * sizeof(int));
    //
    // ... lots of code
    //
    free(int_array);
}

Memory is leaked if the function returns early between malloc and free: the pointer is lost and we cannot deallocate the memory. Worse, freeing the pointer twice, or accessing a freed pointer can lead to exploitable security vulnerabilities.

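Scope-Based Memory Management

Constructors and destructors let you hook into the lifetime of an object.

By wrapping a pointer in an object, you can free memory when the object is destroyed. The compiler guarantees that this happens, even if an exception is raised.

This is often called resource acquisition is initialization (RAII) and gives you smart pointers.

C++ Example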

void say_hello(std::unique_ptr<Person> person) {
  std::cout << "Hello " << person->name << std::endl;
}
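  • The std::unique_ptr object is allocated on the stack, and points to memory allocated on the heap.
  • At the end of say_hello, the std::unique_ptr destructor will run.
  • The destructor frees the Person object it points to.

Special move constructors are used when passing ownership to a function: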

std::unique_ptr<Person> person = find_person("Carla");
say_hello(std::move(person));

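Automatic Memory Management

An alternative to manual and scope-based memory management is automatic memory management:

  • The programmer never allocates or deallocates memory explicitly.
  • A garbage collector finds unused memory and deallocates it for the programmer.

Java Example

The person object is not deallocated after sayHello returns: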

void sayHello(Person person) {
  System.out.println("Hello " + person.getName());
}

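Memory Management in Rust

Memory management in Rust is a mix:

  • Safe and correct like Java, but without a garbage collector.
  • Scope-based like C++, but the compiler enforces full adherence.
  • A Rust user can choose the right abstraction for the situation; some even have no cost at runtime, like in C.

Rust achieves this by modeling ownership explicitly.

  • If asked how at this point, you can mention that in Rust this is usually handled by RAII wrapper types such as Box, Vec, Rc, or Arc. These encapsulate ownership and memory allocation via various means, and prevent the potential errors in C.

  • You may be asked about destructors here; the Drop trait is the Rust equivalent.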
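
If destructors come up, a sketch like the following may help. It assumes a hypothetical type Noisy that records its own drop in a shared log, so the order of events can be inspected; none of these names come from the course.

```rust
use std::cell::RefCell;

// Hypothetical type: records a message in `log` when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    // Runs automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the recorded events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        {
            let _b = Noisy { name: "b", log: &log };
            log.borrow_mut().push(String::from("inner scope ends"));
        } // `_b` is dropped here
        log.borrow_mut().push(String::from("outer scope ends"));
    } // `_a` is dropped here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

The drop calls are inserted by the compiler at the end of each scope, which is the scope-based behaviour described above, without any explicit free.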

Ownership

All variable bindings have a scope where they are valid and it is an error to use a variable outside its scope:

struct Point(i32, i32);

fn main() {
    {
        let p = Point(3, 4);
        println!("x: {}", p.0);
    }
    println!("y: {}", p.1);
}
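  • At the end of the scope, the variable is dropped and the data is freed.
  • A destructor can run here to free up resources.
  • We say that the variable owns the value.

Move Semantics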

An assignment will transfer ownership between variables:

fn main() {
    let s1: String = String::from("Hello!");
    let s2: String = s1;
    println!("s2: {s2}");
    // println!("s1: {s1}");
}
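  • The assignment of s1 to s2 transfers ownership.
  • When s1 goes out of scope, nothing happens: it does not own anything.
  • When s2 goes out of scope, the string data is freed.
  • There is always exactly one variable binding which owns a value.
  • Mention that this is the opposite of the defaults in C++, which copies by value unless you use std::move (and the move constructor is defined!).

  • It is only the ownership that moves. Whether any machine code is generated to manipulate the data itself is a matter of optimization, and such copies are aggressively optimized away.

  • Simple values (such as integers) can be marked Copy (see later slides).

  • In Rust, clones are explicit (by using clone).

Moved Strings in Rust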

fn main() {
    let s1: String = String::from("Rust");
    let s2: String = s1;
}
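  • The heap data from s1 is reused for s2.
  • When s1 goes out of scope, nothing happens (it has been moved from).

Before move to s2:

[Diagram: on the stack, s1 holds ptr, len = 4, and capacity = 4; ptr points to the bytes "Rust" on the heap.]

After move to s2:

[Diagram: s1 (ptr, len = 4, capacity = 4) is now inaccessible; s2 (ptr, len = 4, capacity = 4) points to the same bytes "Rust" on the heap.]

Defensive Copies in Modern C++

Modern C++ solves this differently: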

std::string s1 = "Cpp";
std::string s2 = s1;  // Duplicate the data in s1.
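  • The heap data from s1 is duplicated and s2 gets its own independent copy.
  • When s1 and s2 go out of scope, they each free their own memory.

Before copy-assignment:

[Diagram: on the stack, s1 holds ptr, len = 3, and capacity = 3; ptr points to the bytes "Cpp" on the heap.]

After copy-assignment:

[Diagram: s1 and s2 each hold ptr, len = 3, and capacity = 3; each ptr points to its own copy of the bytes "Cpp" on the heap.]

Key points:

  • C++ has made a slightly different choice than Rust. Because = copies data, the string data has to be cloned. Otherwise we would get a double-free when either string goes out of scope.

  • C++ also has std::move, which is used to indicate when a value may be moved from. If the example had been s2 = std::move(s1), no heap allocation would take place. After the move, s1 would be in a valid but unspecified state. Unlike Rust, the programmer is allowed to keep using s1.

  • Unlike Rust, = in C++ can run arbitrary code as determined by the type which is being copied or moved.

Moves in Function Calls

When you pass a value to a function, the value is assigned to the function parameter. This transfers ownership: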

fn say_hello(name: String) {
    println!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    say_hello(name);
    // say_hello(name);
}
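  • With the first call to say_hello, main gives up ownership of name. Afterwards, name cannot be used anymore within main.
  • The heap memory allocated for name will be freed at the end of the say_hello function.
  • main can retain ownership if it passes name as a reference (&name) and if say_hello accepts a reference as a parameter.
  • Alternatively, main can pass a clone of name in the first call (name.clone()).
  • Rust makes it harder than C++ to inadvertently create copies by making move semantics the default, and by forcing programmers to make clones explicit.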
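
The two alternatives from the notes (passing a reference, or passing a clone) can be sketched side by side. The functions greeting_owned and greeting_borrowed are hypothetical variants of say_hello that return the text so it can be checked.

```rust
// Takes ownership of `name`: the caller can no longer use its String.
fn greeting_owned(name: String) -> String {
    format!("Hello {name}")
}

// Borrows `name`: the caller keeps ownership.
fn greeting_borrowed(name: &str) -> String {
    format!("Hello {name}")
}

fn main() {
    let name = String::from("Alice");
    println!("{}", greeting_borrowed(&name)); // borrow, `name` stays usable
    println!("{}", greeting_owned(name.clone())); // explicit copy is moved
    println!("{}", greeting_owned(name)); // `name` itself is moved here
    // println!("{name}"); // would not compile: `name` has been moved
}
```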

Copying and Cloning

While move semantics are the default, certain types are copied by default:

fn main() {
    let x = 42;
    let y = x;
    println!("x: {x}");
    println!("y: {y}");
}

These types implement the Copy trait.

You can opt-in your own types to use copy semantics:

#[derive(Copy, Clone, Debug)]
struct Point(i32, i32);

fn main() {
    let p1 = Point(3, 4);
    let p2 = p1;
    println!("p1: {p1:?}");
    println!("p2: {p2:?}");
}
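  • After the assignment, both p1 and p2 own their own data.
  • We can also use p1.clone() to explicitly copy the data.

Copying and cloning are not the same thing:

  • Copying refers to bitwise copies of memory regions and does not work on arbitrary objects.
  • Copying does not allow for custom logic (unlike copy constructors in C++).
  • Cloning is a more general operation and also allows for custom behavior by implementing the Clone trait.
  • Copying does not work on types that implement the Drop trait.

In the above example, try the following:

  • Add a String field to struct Point. It will not compile because String is not a Copy type.
  • Remove Copy from the derive attribute. The compiler error is now in the println! for p1.
  • Show that it works if you clone p1 instead.

If students ask about derive, it is sufficient to say that this is a way to generate code in Rust at compile time. In this case the default implementations of the Copy and Clone traits are generated.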
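
The experiment above ends with a type that is Clone but not Copy. A sketch of that end state, using a hypothetical LabeledPoint struct and helper relabeled_copy, might look like this:

```rust
// Hypothetical struct: the String field rules out `Copy`, so only `Clone`
// is derived.
#[derive(Clone, Debug, PartialEq)]
struct LabeledPoint {
    label: String,
    x: i32,
    y: i32,
}

// Borrows the original and returns an explicit, modified clone.
fn relabeled_copy(p: &LabeledPoint, label: &str) -> LabeledPoint {
    let mut copy = p.clone(); // duplicates the heap-allocated label too
    copy.label = String::from(label);
    copy
}

fn main() {
    let p1 = LabeledPoint { label: String::from("p1"), x: 3, y: 4 };
    let p2 = relabeled_copy(&p1, "p2");
    println!("p1: {p1:?}");
    println!("p2: {p2:?}");
}
```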

Borrowing

Instead of transferring ownership when calling a function, you can let a function borrow the value:

#[derive(Debug)]
struct Point(i32, i32);

fn add(p1: &Point, p2: &Point) -> Point {
    Point(p1.0 + p2.0, p1.1 + p2.1)
}

fn main() {
    let p1 = Point(3, 4);
    let p2 = Point(10, 20);
    let p3 = add(&p1, &p2);
    println!("{p1:?} + {p2:?} = {p3:?}");
}
  • The add function borrows two points and returns a new point.
  • The caller retains ownership of the inputs.

Notes on stack returns:

  • Demonstrate that the return from add is cheap because the compiler can eliminate the copy operation. Change the above code to print stack addresses and run it on the Playground or look at the assembly in Godbolt. In the “DEBUG” optimization level, the addresses should change, while they stay the same when changing to the “RELEASE” setting:

    #[derive(Debug)]
    struct Point(i32, i32);
    
    fn add(p1: &Point, p2: &Point) -> Point {
        let p = Point(p1.0 + p2.0, p1.1 + p2.1);
        println!("&p.0: {:p}", &p.0);
        p
    }
    
    pub fn main() {
        let p1 = Point(3, 4);
        let p2 = Point(10, 20);
        let p3 = add(&p1, &p2);
        println!("&p3.0: {:p}", &p3.0);
        println!("{p1:?} + {p2:?} = {p3:?}");
    }
  • The Rust compiler can do return value optimization (RVO).

  • In C++, copy elision has to be defined in the language specification because constructors can have side effects. In Rust, this is not an issue at all. If RVO does not happen, Rust will always perform a simple and efficient memcpy copy.

Shared and Unique Borrows

Rust puts constraints on the ways you can borrow values:

  • You can have one or more &T values at any given time, or
  • You can have exactly one &mut T value.
fn main() {
    let mut a: i32 = 10;
    let b: &i32 = &a;

    {
        let c: &mut i32 = &mut a;
        *c = 20;
    }

    println!("a: {a}");
    println!("b: {b}");
}
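  • The above code does not compile because a is borrowed as mutable (through c) and as immutable (through b) at the same time.
  • Move the println! statement for b before the scope that introduces c to make the code compile.
  • After that change, the compiler realizes that b is only ever used before the new mutable borrow of a through c. This is a feature of the borrow checker called "non-lexical lifetimes".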
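
A sketch of the fixed version, wrapped in a hypothetical function update so the values can be checked. The only change from the slide is that b is last used before c is created.

```rust
// Returns the value seen through `b` and the final value of `a`.
fn update() -> (i32, i32) {
    let mut a: i32 = 10;
    let b: &i32 = &a;
    let seen_through_b = *b; // last use of `b`: the shared borrow ends here

    {
        let c: &mut i32 = &mut a; // fine, no shared borrow is live any more
        *c = 20;
    }

    (seen_through_b, a)
}

fn main() {
    let (b, a) = update();
    println!("b: {b}");
    println!("a: {a}");
}
```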

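Lifetimes

A borrowed value has a lifetime:

  • The lifetime can be implicit: add(p1: &Point, p2: &Point) -> Point.
  • Lifetimes can also be explicit: &'a Point, &'document str.
  • Read &'a Point as "a borrowed Point which is valid for at least the lifetime a".
  • Lifetimes are always inferred by the compiler: you cannot assign a lifetime yourself.
    • Lifetime annotations create constraints; the compiler verifies that there is a valid solution.
  • Lifetimes for function arguments and return values must be fully specified, but Rust allows lifetimes to be elided in most cases with a few simple rules.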
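
As a small illustration of elision, the two hypothetical functions below are assumed to have the same signature as far as the compiler is concerned: with a single reference parameter, the returned reference is given that parameter's lifetime.

```rust
// Elided form: the compiler fills in the lifetime.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// The same signature with the lifetime written out.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split(' ').next().unwrap_or("")
}

fn main() {
    let text = String::from("foo bar");
    println!("{}", first_word(&text));
    println!("{}", first_word_explicit(&text));
}
```

With two reference parameters and a returned reference, as in the left_most example that follows in the course, elision no longer applies and the annotation has to be written by hand.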

Lifetimes in Function Calls

In addition to borrowing its arguments, a function can return a borrowed value:

#[derive(Debug)]
struct Point(i32, i32);

fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {
    if p1.0 < p2.0 { p1 } else { p2 }
}

fn main() {
    let p1: Point = Point(10, 10);
    let p2: Point = Point(20, 20);
    let p3: &Point = left_most(&p1, &p2);
    println!("left-most point: {:?}", p3);
}
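  • 'a is a generic parameter; it is inferred by the compiler.
  • Lifetimes start with ' and 'a is a typical default name.
  • Read &'a Point as "a borrowed Point which is valid for at least the lifetime a".
    • The "at least" part is important when parameters are in different scopes.

In the above example, try the following:

  • Move the declarations of p2 and p3 into a new scope ({ ... }), resulting in the following code: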

    #[derive(Debug)]
    struct Point(i32, i32);
    
    fn left_most<'a>(p1: &'a Point, p2: &'a Point) -> &'a Point {
        if p1.0 < p2.0 { p1 } else { p2 }
    }
    
    fn main() {
        let p1: Point = Point(10, 10);
        let p3: &Point;
        {
            let p2: Point = Point(20, 20);
            p3 = left_most(&p1, &p2);
        }
        println!("left-most point: {:?}", p3);
    }

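    Note how this does not compile since p3 outlives p2.

  • Reset the workspace and change the function signature to fn left_most<'a, 'b>(p1: &'a Point, p2: &'a Point) -> &'b Point. This will not compile because the relationship between the lifetimes 'a and 'b is unclear.

  • Another way to explain it:

    • Two references to two values are borrowed by a function and the function returns another reference.
    • It must have come from one of those two inputs (or from a global variable).
    • Which one is it? The compiler needs to know, so that at the call site the returned reference is not used for longer than the variable the reference came from.

Lifetimes in Data Structures

If a data type stores borrowed data, it must be annotated with a lifetime: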

#[derive(Debug)]
struct Highlight<'doc>(&'doc str);

fn erase(text: String) {
    println!("Bye {text}!");
}

fn main() {
    let text = String::from("The quick brown fox jumps over the lazy dog.");
    let fox = Highlight(&text[4..19]);
    let dog = Highlight(&text[35..43]);
    // erase(text);
    println!("{fox:?}");
    println!("{dog:?}");
}
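  • In the above example, the annotation on Highlight enforces that the data underlying the contained &str lives at least as long as any instance of Highlight that uses that data.
  • If text is consumed before the end of the lifetime of fox (or dog), the borrow checker throws an error.
  • Types with borrowed data force users to hold on to the original data. This can be useful for creating lightweight views, but it generally makes them somewhat harder to use.
  • When possible, make data structures own their data directly.
  • Some structs with multiple references inside can have more than one lifetime annotation. This can be necessary if there is a need to describe lifetime relationships between the references themselves, in addition to the lifetime of the struct itself. Those are very advanced use cases.

Structs

Like C and C++, Rust has support for custom structs: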

struct Person {
    name: String,
    age: u8,
}

fn main() {
    let mut peter = Person {
        name: String::from("Peter"),
        age: 27,
    };
    println!("{} is {} years old", peter.name, peter.age);
    
    peter.age = 28;
    println!("{} is {} years old", peter.name, peter.age);
    
    let jackie = Person {
        name: String::from("Jackie"),
        ..peter
    };
    println!("{} is {} years old", jackie.name, jackie.age);
}

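Key points:

  • Structs work like in C or C++.
    • Like in C++, and unlike in C, no typedef is needed to define a type.
    • Unlike in C++, there is no inheritance between structs.
  • Methods are defined in an impl block, which we will see in the following slides.
  • This may be a good time to let people know there are different types of structs.
    • Zero-sized structs, e.g. struct Foo;, might be used when implementing a trait on some type that has no data you want to store in the value itself.
    • The next slide will introduce tuple structs, used when the field names are not important.
  • The syntax ..peter allows us to copy the majority of the fields from the old struct without having to explicitly type it all out. It must always be the last element.

Tuple Structs

If the field names are unimportant, you can use a tuple struct: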

struct Point(i32, i32);

fn main() {
    let p = Point(17, 23);
    println!("({}, {})", p.0, p.1);
}

This is often used for single-field wrappers (called newtypes):

struct PoundsOfForce(f64);
struct Newtons(f64);

fn compute_thruster_force() -> PoundsOfForce {
    todo!("Ask a rocket scientist at NASA")
}

fn set_thruster_force(force: Newtons) {
    // ...
}

fn main() {
    let force = compute_thruster_force();
    set_thruster_force(force);
}
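  • Newtypes are a great way to encode additional information about the value in a primitive type, for example:
    • The number is measured in some units: Newtons in the example above.
    • The value passed some validation when it was created, so you no longer have to validate it again at every use: PhoneNumber(String) or OddNumber(u32).
  • Demonstrate how to add an f64 value to a Newtons type by accessing the single field in the newtype.
    • Rust generally doesn't like inexplicit things, like automatic unwrapping or, for instance, using booleans as integers.
    • Operator overloading is discussed on Day 3 (generics).
  • The example is a subtle reference to the Mars Climate Orbiter failure.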
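
The demonstration suggested above could look like this. The helper add_newtons is a hypothetical name; the point is only that the inner f64 has to be reached through .0 explicitly.

```rust
#[derive(Debug, PartialEq)]
struct Newtons(f64);

// Hypothetical helper: adds a plain f64 to a Newtons value.
fn add_newtons(force: Newtons, extra: f64) -> Newtons {
    // `force + extra` would not compile: the wrapper is never unwrapped
    // implicitly, so the single field is accessed with `.0`.
    Newtons(force.0 + extra)
}

fn main() {
    let force = add_newtons(Newtons(10.0), 2.5);
    println!("{force:?}");
}
```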

Field Shorthand Syntax

If you already have variables with the right names, then you can create the struct using a shorthand:

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: String, age: u8) -> Person {
        Person { name, age }
    }
}

fn main() {
    let peter = Person::new(String::from("Peter"), 27);
    println!("{peter:?}");
}
  • The new function could be written using Self as a type, as it is interchangeable with the struct type name.

    #[derive(Debug)]
    struct Person {
        name: String,
        age: u8,
    }
    impl Person {
        fn new(name: String, age: u8) -> Self {
            Self { name, age }
        }
    }
  • Implement the Default trait for the struct. Define some fields and use the default values for the other fields.

    #[derive(Debug)]
    struct Person {
        name: String,
        age: u8,
    }
    impl Default for Person {
        fn default() -> Person {
            Person {
                name: "Bot".to_string(),
                age: 0,
            }
        }
    }
    fn create_default() {
        let tmp = Person {
            ..Person::default()
        };
        let tmp = Person {
            name: "Sam".to_string(),
            ..Person::default()
        };
    }
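  • Methods are defined in the impl block.

  • Use struct update syntax to define a new structure using peter. Note that the variable peter will no longer be accessible afterwards.

  • Use {:#?} when printing structs to request the Debug representation.

Methods

Rust allows you to associate functions with your new types. You do this with an impl block: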

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn say_hello(&self) {
        println!("Hello, my name is {}", self.name);
    }
}

fn main() {
    let peter = Person {
        name: String::from("Peter"),
        age: 27,
    };
    peter.say_hello();
}

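Key points:

  • It can be helpful to introduce methods by comparing them to functions.
    • Methods are called on an instance of a type (such as a struct or enum); the first parameter represents the instance as self.
    • Developers may choose to use methods to take advantage of method receiver syntax and to help keep code more organized. By using methods we can keep all the implementation code in one predictable place.
  • Point out the use of the keyword self, a method receiver.
    • Show that self is an abbreviated form of self: Self and perhaps show how the struct name could also be used.
    • Explain that Self is a type alias for the type the impl block is in and can be used elsewhere in the block.
    • Note how self is used like other structs and dot notation can be used to refer to individual fields.
    • This might be a good time to demonstrate how &self differs from self by modifying the code and trying to run say_hello twice.
  • We describe the distinction between method receivers next.

Method Receiver

The &self above indicates that the method borrows the object immutably. There are other possible receivers for a method:

  • &self: borrows the object from the caller using a shared and immutable reference. The object can be used again afterwards.
  • &mut self: borrows the object from the caller using a unique and mutable reference. The object can be used again afterwards.
  • self: takes ownership of the object and moves it away from the caller. The method becomes the owner of the object. The object will be dropped (deallocated) when the method returns, unless its ownership is explicitly transmitted. Complete ownership does not automatically mean mutability.
  • mut self: same as above, but the method can mutate the object.
  • No receiver: this becomes a static method on the struct. Typically used to create constructors which are called new by convention.

Beyond variants on self, there are also special wrapper types allowed to be receiver types, such as Box<Self>.

Consider emphasizing "shared and immutable" and "unique and mutable". These constraints always come together in Rust due to borrow checker rules, and self is no exception. It isn't possible to reference a struct from multiple locations and call a mutating (&mut self) method on it.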
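
The receiver kinds listed above, including the Box<Self> wrapper, can be put side by side in a small sketch. Counter and its method names are hypothetical and exist only for this illustration.

```rust
// Hypothetical type used to show each receiver kind.
struct Counter {
    count: u32,
}

impl Counter {
    // No receiver: a static method, used as a constructor.
    fn new() -> Self {
        Counter { count: 0 }
    }

    // Unique and mutable borrow.
    fn increment(&mut self) {
        self.count += 1;
    }

    // Shared and immutable borrow.
    fn get(&self) -> u32 {
        self.count
    }

    // Takes ownership: the Counter cannot be used afterwards.
    fn into_count(self) -> u32 {
        self.count
    }

    // Wrapper receiver: only callable on a Box<Counter>.
    fn unbox_count(self: Box<Self>) -> u32 {
        self.count
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.increment();
    println!("count: {}", counter.get());
    println!("consumed: {}", counter.into_count());

    let boxed = Box::new(Counter::new());
    println!("boxed: {}", boxed.unbox_count());
}
```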

Example

#[derive(Debug)]
struct Race {
    name: String,
    laps: Vec<i32>,
}

impl Race {
    fn new(name: &str) -> Race {  // No receiver, a static method
        Race { name: String::from(name), laps: Vec::new() }
    }

    fn add_lap(&mut self, lap: i32) {  // Exclusive borrowed read-write access to self
        self.laps.push(lap);
    }

    fn print_laps(&self) {  // Shared and read-only borrowed access to self
        println!("Recorded {} laps for {}:", self.laps.len(), self.name);
        for (idx, lap) in self.laps.iter().enumerate() {
            println!("Lap {idx}: {lap} sec");
        }
    }

    fn finish(self) {  // Exclusive ownership of self
        let total = self.laps.iter().sum::<i32>();
        println!("Race {} is finished, total lap time: {}", self.name, total);
    }
}

fn main() {
    let mut race = Race::new("Monaco Grand Prix");
    race.add_lap(70);
    race.add_lap(68);
    race.print_laps();
    race.add_lap(71);
    race.print_laps();
    race.finish();
    // race.add_lap(42);
}

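Key points:

  • All four methods here use a different method receiver.
    • You can point out how that changes what the function can do with the variable values and if/how it can be used again in main.
    • You can showcase the error that appears when trying to call finish twice.
  • Note that although the method receivers are different, the non-static functions are called the same way in the main body. Rust enables automatic referencing and dereferencing when calling methods. Rust automatically adds in the &, *, and mut so that the object matches the method signature.
  • You might point out that print_laps is using a vector that is iterated over. We describe vectors in more detail in the afternoon.

Day 2: Morning Exercises

We will look at implementing methods in two contexts:

  • Storing books and querying the collection

  • Keeping track of health statistics for patients

After looking at the exercises, you can look at the solutions provided.

Storing Books

We will learn much more about structs and the Vec<T> type tomorrow. For now, you just need to know part of its API: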

fn main() {
    let mut vec = vec![10, 20];
    vec.push(30);
    let midpoint = vec.len() / 2;
    println!("middle value: {}", vec[midpoint]);
    for item in &vec {
        println!("item: {item}");
    }
}

Use this to model a library's book collection. Copy the code below to https://play.rust-lang.org/ and update the types to make it compile:

struct Library {
    books: Vec<Book>,
}

struct Book {
    title: String,
    year: u16,
}

impl Book {
    // This is a constructor, used below.
    fn new(title: &str, year: u16) -> Book {
        Book {
            title: String::from(title),
            year,
        }
    }
}

// Implement the methods below. Update the `self` parameter to
// indicate the method's required level of ownership over the object:
//
// - `&self` for shared read-only access,
// - `&mut self` for unique and mutable access,
// - `self` for unique access by value.
impl Library {
    fn new() -> Library {
        todo!("Initialize and return a `Library` value")
    }

    //fn len(self) -> usize {
    //    todo!("Return the length of `self.books`")
    //}

    //fn is_empty(self) -> bool {
    //    todo!("Return `true` if `self.books` is empty")
    //}

    //fn add_book(self, book: Book) {
    //    todo!("Add a new book to `self.books`")
    //}

    //fn print_books(self) {
    //    todo!("Iterate over `self.books` and print each book's title and year")
    //}

    //fn oldest_book(self) -> Option<&Book> {
    //    todo!("Return a reference to the oldest book (if any)")
    //}
}

// This shows the desired behavior. Uncomment the code below and
// implement the missing methods. You will need to update the
// method signatures, including the "self" parameter! You may
// also need to update the variable bindings within main.
fn main() {
    let library = Library::new();

    //println!("The library is empty: library.is_empty() -> {}", library.is_empty());
    //
    //library.add_book(Book::new("Lord of the Rings", 1954));
    //library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    //
    //println!("The library is no longer empty: library.is_empty() -> {}", library.is_empty());
    //
    //
    //library.print_books();
    //
    //match library.oldest_book() {
    //    Some(book) => println!("The oldest book is {}", book.title),
    //    None => println!("The library is empty!"),
    //}
    //
    //println!("The library has {} books", library.len());
    //library.print_books();
}

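Solution

Health Statistics

You're working on implementing a health-monitoring system. As part of that, you need to keep track of users' health statistics.

You'll start with some stubbed functions in an impl block as well as a User struct definition. Your goal is to implement the stubbed out methods on the User struct defined in the impl block.

Copy the code below to https://play.rust-lang.org/ and fill in the missing methods: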

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub struct User {
    name: String,
    age: u32,
    height: f32,
    visit_count: usize,
    last_blood_pressure: Option<(u32, u32)>,
}

pub struct Measurements {
    height: f32,
    blood_pressure: (u32, u32),
}

pub struct HealthReport<'a> {
    patient_name: &'a str,
    visit_count: u32,
    height_change: f32,
    blood_pressure_change: Option<(i32, i32)>,
}

impl User {
    pub fn new(name: String, age: u32, height: f32) -> Self {
        unimplemented!()
    }

    pub fn name(&self) -> &str {
        unimplemented!()
    }

    pub fn age(&self) -> u32 {
        unimplemented!()
    }

    pub fn height(&self) -> f32 {
        unimplemented!()
    }

    pub fn doctor_visits(&self) -> u32 {
        unimplemented!()
    }

    pub fn set_age(&mut self, new_age: u32) {
        unimplemented!()
    }

    pub fn set_height(&mut self, new_height: f32) {
        unimplemented!()
    }

    pub fn visit_doctor(&mut self, measurements: Measurements) -> HealthReport {
        unimplemented!()
    }
}

fn main() {
    let bob = User::new(String::from("Bob"), 32, 155.2);
    println!("I'm {} and my age is {}", bob.name(), bob.age());
}

#[test]
fn test_height() {
    let bob = User::new(String::from("Bob"), 32, 155.2);
    assert_eq!(bob.height(), 155.2);
}

#[test]
fn test_set_age() {
    let mut bob = User::new(String::from("Bob"), 32, 155.2);
    assert_eq!(bob.age(), 32);
    bob.set_age(33);
    assert_eq!(bob.age(), 33);
}

#[test]
fn test_visit() {
    let mut bob = User::new(String::from("Bob"), 32, 155.2);
    assert_eq!(bob.doctor_visits(), 0);
    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (120, 80),
    });
    assert_eq!(report.patient_name, "Bob");
    assert_eq!(report.visit_count, 1);
    assert_eq!(report.blood_pressure_change, None);

    let report = bob.visit_doctor(Measurements {
        height: 156.1,
        blood_pressure: (115, 76),
    });

    assert_eq!(report.visit_count, 2);
    assert_eq!(report.blood_pressure_change, Some((-5, -4)));
}

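Standard Library

Rust comes with a standard library which helps establish a set of common types used by Rust libraries and programs. This way, two libraries can work together smoothly because they both use the same String type.

The common vocabulary types include:

  • Option and Result types: used for optional values and error handling.

  • String: the default string type used for owned data.

  • Vec: a standard extensible vector.

  • HashMap: a hash map type with a configurable hashing algorithm.

  • Box: an owned pointer for heap-allocated data.

  • Rc: a shared reference-counted pointer for heap-allocated data.

  • In fact, Rust contains several layers of the Standard Library: core, alloc and std.
  • core includes the most basic types and functions that don't depend on libc, an allocator, or even the presence of an operating system.
  • alloc includes types which require a global heap allocator, such as Vec, Box and Arc.
  • Embedded Rust applications often only use core, and sometimes alloc.

Option and Result

The types represent optional data: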

fn main() {
    let numbers = vec![10, 20, 30];
    let first: Option<&i8> = numbers.first();
    println!("first: {first:?}");

    let idx: Result<usize, usize> = numbers.binary_search(&10);
    println!("idx: {idx:?}");
}
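  • Option and Result are widely used, not just in the standard library.
  • Option<&T> has zero space overhead compared to &T.
  • Result is the standard type to implement error handling, as we will see on Day 3.
  • binary_search returns Result<usize, usize>.
    • If found, Result::Ok holds the index where the element is found.
    • Otherwise, Result::Err contains the index where such an element should be inserted.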
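
One way to show why the Err index is useful is a sorted insert. The helper insert_sorted below is a hypothetical function, assuming the vector is already sorted in ascending order.

```rust
// Hypothetical helper: keeps `v` sorted and returns the index of `value`.
// Existing values are not inserted a second time.
fn insert_sorted(v: &mut Vec<i32>, value: i32) -> usize {
    match v.binary_search(&value) {
        // Already present: report where it was found.
        Ok(idx) => idx,
        // Missing: `idx` is the position that keeps the vector sorted.
        Err(idx) => {
            v.insert(idx, value);
            idx
        }
    }
}

fn main() {
    let mut numbers = vec![10, 20, 30];
    let idx = insert_sorted(&mut numbers, 25);
    println!("inserted at {idx}: {numbers:?}");
}
```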

String

String is the standard heap-allocated growable UTF-8 string buffer:

fn main() {
    let mut s1 = String::new();
    s1.push_str("Hello");
    println!("s1: len = {}, capacity = {}", s1.len(), s1.capacity());

    let mut s2 = String::with_capacity(s1.len() + 1);
    s2.push_str(&s1);
    s2.push('!');
    println!("s2: len = {}, capacity = {}", s2.len(), s2.capacity());

    let s3 = String::from("🇨🇭");
    println!("s3: len = {}, number of chars = {}", s3.len(),
             s3.chars().count());
}

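String implements Deref<Target = str>, which means that you can call all str methods on a String.

  • String::new returns a new empty string; use String::with_capacity when you know how much data you want to push to the string.
  • String::len returns the size of the String in bytes (which can be different from its length in characters).
  • String::chars returns an iterator over the actual characters. Note that a char can be different from what a human will consider a "character" due to grapheme clusters.
  • When people refer to strings they could either be talking about &str or String.
  • When a type implements Deref<Target = T>, the compiler will let you transparently call methods from T.
    • String implements Deref<Target = str>, which transparently gives it access to str's methods.
    • Write and compare let s3 = s1.deref(); and let s3 = &*s1;.
  • String is implemented as a wrapper around a vector of bytes; many of the operations you see supported on vectors are also supported on String, but with some extra guarantees.
  • Compare the different ways to index a String:
    • To a character by using s3.chars().nth(i).unwrap(), where i is in-bounds or out-of-bounds.
    • To a substring by using s3[0..4], where that slice is on character boundaries or not.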
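
The indexing comparison can be run without panics by using the Option-returning variants. The helpers nth_char and substring are hypothetical wrappers around chars().nth and str::get; the unwrap and s3[0..4] forms from the notes panic in exactly the cases where these return None.

```rust
// Character-based access: counts `char`s, not bytes.
fn nth_char(s: &str, i: usize) -> Option<char> {
    s.chars().nth(i)
}

// Byte-range access: `get` returns None when the range is out of bounds
// or does not fall on character boundaries (where `s[a..b]` would panic).
fn substring(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let s3 = String::from("🇨🇭"); // two chars of four bytes each
    println!("len = {}, chars = {}", s3.len(), s3.chars().count());
    println!("{:?}", nth_char(&s3, 0)); // in bounds
    println!("{:?}", nth_char(&s3, 2)); // out of bounds
    println!("{:?}", substring(&s3, 0, 4)); // on a character boundary
    println!("{:?}", substring(&s3, 0, 2)); // inside a character
}
```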

Vec

Vec is the standard resizable heap-allocated buffer:

fn main() {
    let mut v1 = Vec::new();
    v1.push(42);
    println!("v1: len = {}, capacity = {}", v1.len(), v1.capacity());

    let mut v2 = Vec::with_capacity(v1.len() + 1);
    v2.extend(v1.iter());
    v2.push(9999);
    println!("v2: len = {}, capacity = {}", v2.len(), v2.capacity());

    // Canonical macro to initialize a vector with elements.
    let mut v3 = vec![0, 0, 1, 2, 3, 4];

    // Retain only the even elements.
    v3.retain(|x| x % 2 == 0);
    println!("{v3:?}");

    // Remove consecutive duplicates.
    v3.dedup();
    println!("{v3:?}");
}

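Vec implements Deref<Target = [T]>, which means that you can call slice methods on a Vec.

  • Vec is a type of collection, along with String and HashMap. The data it contains is stored on the heap. This means the amount of data doesn't need to be known at compile time. It can grow or shrink at runtime.
  • Notice how Vec<T> is a generic type too, but you don't have to specify T explicitly. As always with Rust type inference, the T was established during the first push call.
  • vec![...] is a canonical macro to use instead of Vec::new() and it supports adding initial elements to the vector.
  • To index the vector you use [ ], but it will panic if out of bounds. Alternatively, using get will return an Option. The pop function will remove the last element.
  • Show iterating over a vector and mutating the value: for e in &mut v { *e += 50; }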
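
The last two notes can be combined into one short sketch. The helpers add_fifty and element_or_zero are hypothetical names; the fallback value 0 is an arbitrary choice for the illustration.

```rust
// Mutates every element in place through `&mut v`.
fn add_fifty(mut v: Vec<i32>) -> Vec<i32> {
    for e in &mut v {
        *e += 50;
    }
    v
}

// `get` returns an Option instead of panicking like `v[idx]` would.
fn element_or_zero(v: &[i32], idx: usize) -> i32 {
    v.get(idx).copied().unwrap_or(0)
}

fn main() {
    let mut v = add_fifty(vec![1, 2, 3]);
    println!("{v:?}");
    println!("in bounds: {}", element_or_zero(&v, 1));
    println!("out of bounds: {}", element_or_zero(&v, 10));
    println!("popped: {:?}", v.pop());
    println!("{v:?}");
}
```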

HashMap

Standard hash map with protection against HashDoS attacks:

use std::collections::HashMap;

fn main() {
    let mut page_counts = HashMap::new();
    page_counts.insert("Adventures of Huckleberry Finn".to_string(), 207);
    page_counts.insert("Grimms' Fairy Tales".to_string(), 751);
    page_counts.insert("Pride and Prejudice".to_string(), 303);

    if !page_counts.contains_key("Les Misérables") {
        println!("We know about {} books, but not Les Misérables.",
                 page_counts.len());
    }

    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        match page_counts.get(book) {
            Some(count) => println!("{book}: {count} pages"),
            None => println!("{book} is unknown.")
        }
    }

    // Use the .entry() method to insert a value if nothing is found.
    for book in ["Pride and Prejudice", "Alice's Adventure in Wonderland"] {
        let page_count: &mut i32 = page_counts.entry(book.to_string()).or_insert(0);
        *page_count += 1;
    }

    println!("{page_counts:#?}");
}
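  • HashMap is not defined in the prelude and needs to be brought into scope.

  • Try the following lines of code. The first line will see if a book is in the hash map and, if not, return an alternative value. The second line will insert the alternative value into the hash map if the book is not found.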

      let pc1 = page_counts
          .get("Harry Potter and the Sorcerer's Stone ")
          .unwrap_or(&336);
      let pc2 = page_counts
          .entry("The Hunger Games".to_string())
          .or_insert(374);
  • Unlike vec!, there is unfortunately no standard hashmap! macro.

    • Although, since Rust 1.56, HashMap implements From<[(K, V); N]>, which allows us to easily initialize a hash map from a literal array:

        let page_counts = HashMap::from([
          ("Harry Potter and the Sorcerer's Stone".to_string(), 336),
          ("The Hunger Games".to_string(), 374),
        ]);
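  • Alternatively, HashMap can be built from any Iterator which yields key-value tuples.

  • We are showing HashMap<String, i32>, and avoid using &str as key to make examples easier. Using references in collections can, of course, be done, but it can lead to complications with the borrow checker.

    • Try removing to_string() from the example above and see if it still compiles. Where do you think we might run into issues?
  • This type has several "method-specific" return types, such as std::collections::hash_map::Keys. These types often appear in searches of the Rust docs. Show students the docs for this type, and the helpful link back to the keys method.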
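
As a small sketch of building a HashMap from an iterator of key-value tuples (the function name title_lengths is an assumption made up for this example):

```rust
use std::collections::HashMap;

/// Hypothetical helper: maps each title to its length in bytes by
/// collecting an iterator of (key, value) tuples.
fn title_lengths(titles: &[&str]) -> HashMap<String, usize> {
    titles
        .iter()
        .map(|title| (title.to_string(), title.len()))
        .collect()
}

fn main() {
    let lengths = title_lengths(&["Emma", "Dracula"]);
    // Lookups on a HashMap<String, _> accept a &str key.
    println!("{:?}", lengths.get("Emma"));
    println!("{:?}", lengths.get("Unknown"));
}
```

collect works here because HashMap implements FromIterator for tuples; the target type is taken from the function's return type.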

Box

Box is an owned pointer to data on the heap:

fn main() {
    let five = Box::new(5);
    println!("five: {}", *five);
}
[Diagram: the variable five on the stack points to the value 5 on the heap.]

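Box<T> implements Deref<Target = T>, which means that you can call methods from T directly on a Box<T>.

  • Box is like std::unique_ptr in C++, except that it's guaranteed to be not null.
  • In the above example, you can even leave out the * in the println! statement thanks to Deref.
  • A Box can be useful when you:
    • have a type whose size can't be known at compile time, but the Rust compiler wants to know an exact size.
    • want to transfer ownership of a large amount of data. To avoid copying large amounts of data on the stack, instead store the data on the heap in a Box so only the pointer is moved.

Box with Recursive Data Structures

Recursive data types or data types with dynamic sizes need to use a Box: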

#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

fn main() {
    let list: List<i32> = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{list:?}");
}
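[Diagram: list on the stack holds Cons(1, pointer); on the heap, Cons(2, pointer) is followed by Nil.]
  • If Box was not used and we attempted to embed a List directly into the List, the compiler could not compute a fixed size for the enum in memory (List would be of infinite size).

  • Box solves this problem as it has the same size as a regular pointer and just points at the next element of the List in the heap.

  • Remove the Box in the List definition and show the compiler error. "Recursive with indirection" is a hint you might want to use a Box or reference of some kind, instead of storing a value directly.

Niche Optimization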

#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

fn main() {
    let list: List<i32> = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{list:?}");
}

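A Box cannot be empty, so the pointer is always valid and non-null. This allows the compiler to optimize the memory layout:

[Diagram: list on the stack holds 1 and a pointer; on the heap, the next element holds 2 and a null pointer, which represents Nil.]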

Rc

Rc is a reference-counted shared pointer. Use this when you need to refer to the same data from multiple places:

use std::rc::Rc;

fn main() {
    let mut a = Rc::new(10);
    let mut b = Rc::clone(&a);

    println!("a: {a}");
    println!("b: {b}");
}
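  • See Arc and Mutex if you are in a multi-threaded context.
  • You can downgrade a shared pointer into a Weak pointer to create cycles that will get dropped.
  • Rc's count ensures that its contained value is valid for as long as there are references.
  • Rc in Rust is like std::shared_ptr in C++.
  • Rc::clone is cheap: it creates a pointer to the same allocation and increases the reference count. It does not make a deep clone and can generally be ignored when looking for performance issues in code.
  • make_mut actually clones the inner value if necessary ("clone-on-write") and returns a mutable reference.
  • Use Rc::strong_count to check the reference count.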
  • Rc::downgrade gives you a weakly reference-counted object to create cycles that will be dropped properly (likely in combination with RefCell, on the next slide).
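
A short sketch of the points above (strong_count, make_mut and downgrade). The helper increment is a made-up name for the example.

```rust
use std::rc::Rc;

/// Hypothetical helper: increments the value behind `rc`, cloning the
/// value first if it is currently shared (clone-on-write).
fn increment(rc: &mut Rc<i32>) {
    *Rc::make_mut(rc) += 1;
}

fn main() {
    let mut a = Rc::new(10);
    let b = Rc::clone(&a);
    assert_eq!(Rc::strong_count(&a), 2);

    // `a` is shared with `b`, so make_mut gives `a` its own copy.
    increment(&mut a);
    assert_eq!((*a, *b), (11, 10));
    assert_eq!(Rc::strong_count(&b), 1);

    // A Weak pointer does not keep the value alive.
    let weak = Rc::downgrade(&b);
    assert_eq!(weak.upgrade().map(|rc| *rc), Some(10));
    drop(b);
    assert!(weak.upgrade().is_none());
}
```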

Cell and RefCell

Cell and RefCell implement what Rust calls interior mutability: mutation of values in an immutable context.

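Cell is typically used for simple types, as it requires copying or moving values. More complex interior types typically use a RefCell, which tracks shared and exclusive references at runtime and panics if they are misused.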

use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Default)]
struct Node {
    value: i64,
    children: Vec<Rc<RefCell<Node>>>,
}

impl Node {
    fn new(value: i64) -> Rc<RefCell<Node>> {
        Rc::new(RefCell::new(Node { value, ..Node::default() }))
    }

    fn sum(&self) -> i64 {
        self.value + self.children.iter().map(|c| c.borrow().sum()).sum::<i64>()
    }
}

fn main() {
    let root = Node::new(1);
    root.borrow_mut().children.push(Node::new(5));
    let subtree = Node::new(10);
    subtree.borrow_mut().children.push(Node::new(11));
    subtree.borrow_mut().children.push(Node::new(12));
    root.borrow_mut().children.push(subtree);

    println!("graph: {root:#?}");
    println!("graph sum: {}", root.borrow().sum());
}
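  • If we were using Cell instead of RefCell in this example, we would have to move the Node out of the Rc to push children, then move it back in. This is safe because there's always one, un-referenced value in the cell, but it's not ergonomic.
  • To do anything with a Node, you must call a RefCell method, usually borrow or borrow_mut.
  • Demonstrate that reference loops can be created by adding root to subtree.children (don't try to print it!).
  • To demonstrate a runtime panic, add a fn inc(&mut self) that increments self.value and calls the same method on its children. This will panic in the presence of the reference loop, with thread 'main' panicked at 'already borrowed: BorrowMutError'.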
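
The difference between Cell and RefCell can be sketched without a panic by using try_borrow_mut, which reports a borrow conflict as an Err instead of panicking. The helper name has_conflict is made up for this example.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical helper: returns true if a mutable borrow is refused while
/// a shared borrow of the same RefCell is still alive.
fn has_conflict(cell: &RefCell<Vec<i32>>) -> bool {
    let _reader = cell.borrow();
    cell.try_borrow_mut().is_err()
}

fn main() {
    // Cell: values are copied in and out, no references to the interior.
    let counter = Cell::new(1);
    counter.set(counter.get() + 1);
    assert_eq!(counter.get(), 2);

    // RefCell: borrows are checked at runtime.
    let values = RefCell::new(vec![1, 2]);
    assert!(has_conflict(&values));

    // Once the shared borrow is gone, a mutable borrow succeeds.
    values.borrow_mut().push(3);
    assert_eq!(values.borrow().len(), 3);
}
```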

Modules

We have seen how impl blocks let us namespace functions to a type.

Similarly, mod lets us namespace types and functions:

mod foo {
    pub fn do_something() {
        println!("In the foo module");
    }
}

mod bar {
    pub fn do_something() {
        println!("In the bar module");
    }
}

fn main() {
    foo::do_something();
    bar::do_something();
}
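  • Packages provide functionality and include a Cargo.toml file that describes how to build a bundle of 1+ crates.
  • Crates are a tree of modules, where a binary crate creates an executable and a library crate compiles to a library.
  • Modules define organization and scope, and are the focus of this section.

Visibility

Modules are a privacy boundary:

  • Module items are private by default (hides implementation details).
  • Parent and sibling items are always visible.
  • In other words, if an item is visible in module foo, it's visible in all the descendants of foo.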
mod outer {
    fn private() {
        println!("outer::private");
    }

    pub fn public() {
        println!("outer::public");
    }

    mod inner {
        fn private() {
            println!("outer::inner::private");
        }

        pub fn public() {
            println!("outer::inner::public");
            super::private();
        }
    }
}

fn main() {
    outer::public();
}
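  • Use the pub keyword to make modules public.

Additionally, there are advanced pub(...) specifiers to restrict the scope of public visibility.

  • See the Rust Reference.
  • Configuring pub(crate) visibility is a common pattern.
  • Less commonly, you can give visibility to a specific path.
  • In any case, visibility must be granted to an ancestor module (and all of its descendants).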
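
A minimal sketch of the restricted specifiers mentioned above. The module and function names are made up for illustration.

```rust
mod outer {
    pub mod inner {
        // Visible only inside `outer` (the parent module) and its descendants.
        pub(super) fn helper() -> &'static str {
            "inner::helper"
        }

        // Visible anywhere in the current crate, but not to other crates.
        pub(crate) fn crate_wide() -> &'static str {
            "inner::crate_wide"
        }
    }

    pub fn call_helper() -> &'static str {
        // Allowed: `outer` is the parent of `inner`.
        inner::helper()
    }
}

fn main() {
    println!("{}", outer::call_helper());
    println!("{}", outer::inner::crate_wide());
    // outer::inner::helper(); // error: function `helper` is private
}
```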

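Paths

Paths are resolved as follows:

  1. As a relative path:

    • foo or self::foo refers to foo in the current module,
    • super::foo refers to foo in the parent module.
  2. As an absolute path:

    • crate::foo refers to foo in the root of the current crate,
    • bar::foo refers to foo in the bar crate.

A module can bring symbols from another module into scope with use. You will typically see something like this at the top of each module: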

use std::collections::HashSet;
use std::mem::transmute;

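Filesystem Hierarchy

Omitting the module content will tell Rust to look for it in another file:

mod garden;

This tells Rust that the garden module content is found at src/garden.rs. Similarly, a garden::vegetables module can be found at src/garden/vegetables.rs.

The crate root is in:

  • src/lib.rs (for a library crate)
  • src/main.rs (for a binary crate)

Modules defined in files can be documented, too, using "inner doc comments". These document the item that contains them, in this case a module.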

//! This module implements the garden, including a highly performant germination
//! implementation.

// Re-export types from this module.
pub use seeds::SeedPacket;
pub use garden::Garden;

/// Sow the given seed packets.
pub fn sow(seeds: Vec<SeedPacket>) { todo!() }

/// Harvest the produce in the garden that is ready.
pub fn harvest(garden: &mut Garden) { todo!() }
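  • Before Rust 2018, modules needed to be located at module/mod.rs instead of module.rs, and this is still a working alternative for editions after 2018.

  • The main reason to introduce filename.rs as an alternative to filename/mod.rs was that many files named mod.rs can be hard to distinguish in IDEs.

  • Deeper nesting can use folders, even if the main module is a file: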

    src/
    ├── main.rs
    ├── top_module.rs
    └── top_module/
        └── sub_module.rs
    
  • The place Rust will look for modules can be changed with a compiler directive:

    #[path = "some/path.rs"]
    mod some_module;

    This is useful, for example, if you would like to place tests for a module in a file named some_module_test.rs, similar to the convention in Go.

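Day 2: Afternoon Exercises

The exercises for this afternoon will focus on strings and iterators.

After looking at the exercises, you can look at the solutions provided.

Iterators and Ownership

The ownership model of Rust affects many APIs. An example of this is the Iterator and IntoIterator traits.

Iterator

Traits are like interfaces: they describe behavior (methods) for a type. The Iterator trait simply says that you can call next until you get None back: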

pub trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

You use this trait like this:

fn main() {
    let v: Vec<i8> = vec![10, 20, 30];
    let mut iter = v.iter();

    println!("v[0]: {:?}", iter.next());
    println!("v[1]: {:?}", iter.next());
    println!("v[2]: {:?}", iter.next());
    println!("No more items: {:?}", iter.next());
}

What is the type returned by the iterator? Test your answer here:

fn main() {
    let v: Vec<i8> = vec![10, 20, 30];
    let mut iter = v.iter();

    let v0: Option<..> = iter.next();
    println!("v0: {v0:?}");
}

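Why is this type used?

IntoIterator

The Iterator trait tells you how to iterate once you have created an iterator. The related trait IntoIterator tells you how to create the iterator:

pub trait IntoIterator {
    type Item;
    type IntoIter: Iterator<Item = Self::Item>;

    fn into_iter(self) -> Self::IntoIter;
}

The syntax here means that every implementation of IntoIterator must declare two types:

  • Item: the type we iterate over, such as i8,
  • IntoIter: the Iterator type returned by the into_iter method.

Note that IntoIter and Item are linked: the iterator must have the same Item type, which means that it returns Option<Item>.

Like before, what is the type returned by the iterator?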

fn main() {
    let v: Vec<String> = vec![String::from("foo"), String::from("bar")];
    let mut iter = v.into_iter();

    let v0: Option<..> = iter.next();
    println!("v0: {v0:?}");
}

for Loops

Now that we know both Iterator and IntoIterator, we can build for loops. They call into_iter() on an expression and iterate over the resulting iterator:

fn main() {
    let v: Vec<String> = vec![String::from("foo"), String::from("bar")];

    for word in &v {
        println!("word: {word}");
    }

    for word in v {
        println!("word: {word}");
    }
}

What is the type of word in each loop?

Experiment with the code above and then consult the documentation for impl IntoIterator for &Vec<T> and impl IntoIterator for Vec<T> to check your answers.
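
To make the connection explicit, here is a rough sketch of what a for loop over &v does, written by hand. The compiler's actual expansion differs in detail, and the function name total_len is made up for this example.

```rust
/// Hypothetical helper: sums the lengths of all words by driving the
/// iterator manually instead of using a `for` loop.
fn total_len(words: &Vec<String>) -> usize {
    let mut total = 0;
    // `for word in words { ... }` behaves roughly like this:
    let mut iter = words.into_iter(); // uses IntoIterator for &Vec<String>
    while let Some(word) = iter.next() {
        // `word` is a `&String` because we iterate over a reference.
        total += word.len();
    }
    total
}

fn main() {
    let v = vec![String::from("foo"), String::from("quux")];
    println!("total length: {}", total_len(&v));
    // `v` is still usable here, since only a reference was iterated.
    println!("{v:?}");
}
```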

Strings and Iterators

In this exercise, you are implementing a routing component of a web server. The server is configured with a number of path prefixes which are matched against request paths. The path prefixes can contain a wildcard character which matches a full segment. See the unit tests below.

Copy the following code to https://play.rust-lang.org/ and make the tests pass. Try avoiding allocating a Vec for your intermediate results:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub fn prefix_matches(prefix: &str, request_path: &str) -> bool {
    unimplemented!()
}

#[test]
fn test_matches_without_wildcard() {
    assert!(prefix_matches("/v1/publishers", "/v1/publishers"));
    assert!(prefix_matches("/v1/publishers", "/v1/publishers/abc-123"));
    assert!(prefix_matches("/v1/publishers", "/v1/publishers/abc/books"));

    assert!(!prefix_matches("/v1/publishers", "/v1"));
    assert!(!prefix_matches("/v1/publishers", "/v1/publishersBooks"));
    assert!(!prefix_matches("/v1/publishers", "/v1/parent/publishers"));
}

#[test]
fn test_matches_with_wildcard() {
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/books"
    ));
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/bar/books"
    ));
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/books/book1"
    ));

    assert!(!prefix_matches("/v1/publishers/*/books", "/v1/publishers"));
    assert!(!prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/booksByAuthor"
    ));
}

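Welcome to Day 3

Today, we will cover some more advanced topics of Rust:

  • Traits: deriving traits, default methods, and important standard library traits.

  • Generics: generic data types, generic methods, monomorphization, and trait objects.

  • Error handling: panics, Result, and the try operator ?.

  • Testing: unit tests, documentation tests, and integration tests.

  • Unsafe Rust: raw pointers, static variables, unsafe functions, and extern functions.

Generics

Rust supports generics, which let you abstract algorithms or data structures (such as sorting or a binary tree) over the types used or stored.

Generic Data Types

You can use generics to abstract over the concrete field type: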

#[derive(Debug)]
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
    println!("{integer:?} and {float:?}");
}
  • Try declaring a new variable let p = Point { x: 5, y: 10.0 };

  • Fix the code to allow points that have elements of different types.

Generic Methods

You can declare a generic type on your impl block:

#[derive(Debug)]
struct Point<T>(T, T);

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.0  // + 10
    }

    // fn set_x(&mut self, x: T)
}

fn main() {
    let p = Point(5, 10);
    println!("p.x = {}", p.x());
}
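  • Q: Why is T specified twice in impl<T> Point<T> {}?
    • This is because it is a generic implementation section for a generic type. They are independently generic.
    • It means these methods are defined for any T.
    • It is possible to write impl Point<u32> { .. }.
      • Point is still generic and you can use Point<f64>, but methods in this block will only be available for Point<u32>.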
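
A small sketch of the last point, assuming a made-up method named sum that only exists for Point<u32>:

```rust
#[derive(Debug)]
struct Point<T>(T, T);

// Available for every T.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.0
    }
}

// Only available when T is u32 (`sum` is a hypothetical method).
impl Point<u32> {
    fn sum(&self) -> u32 {
        self.0 + self.1
    }
}

fn main() {
    let p = Point(5u32, 10u32);
    println!("p.x = {}, p.sum = {}", p.x(), p.sum());

    let q = Point(1.5, 2.5);
    println!("q.x = {}", q.x());
    // q.sum(); // error: no method named `sum` found for `Point<f64>`
}
```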

Monomorphization

Rust performs monomorphization at compile time: generic code is turned into non-generic code for concrete types, based on the call sites:

fn main() {
    let integer = Some(5);
    let float = Some(5.0);
}

The code above behaves as if you wrote:

enum Option_i32 {
    Some(i32),
    None,
}

enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
}

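This is a zero-cost abstraction: you get exactly the same result as if you had hand-coded the data structures without the abstraction.

Traits

Rust lets you abstract over types with traits. They're similar to interfaces: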

trait Pet {
    fn name(&self) -> String;
}

struct Dog {
    name: String,
}

struct Cat;

impl Pet for Dog {
    fn name(&self) -> String {
        self.name.clone()
    }
}

impl Pet for Cat {
    fn name(&self) -> String {
        String::from("The cat") // No name, cats won't respond to it anyway.
    }
}

fn greet<P: Pet>(pet: &P) {
    println!("Who's a cutie? {} is!", pet.name());
}

fn main() {
    let fido = Dog { name: "Fido".into() };
    greet(&fido);

    let captain_floof = Cat;
    greet(&captain_floof);
}

Trait Objects

Trait objects allow for values of different types, for instance in a collection:

trait Pet {
    fn name(&self) -> String;
}

struct Dog {
    name: String,
}

struct Cat;

impl Pet for Dog {
    fn name(&self) -> String {
        self.name.clone()
    }
}

impl Pet for Cat {
    fn name(&self) -> String {
        String::from("The cat") // No name, cats won't respond to it anyway.
    }
}

fn main() {
    let pets: Vec<Box<dyn Pet>> = vec![
        Box::new(Cat),
        Box::new(Dog { name: String::from("Fido") }),
    ];
    for pet in pets {
        println!("Hello {}!", pet.name());
    }
}

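Memory layout after allocating pets:

[Diagram: on the stack, pets holds ptr, len 2 and capacity 2; on the heap, each element points to its data (for example name: Fido) and to a method table containing <Dog as Pet>::name or <Cat as Pet>::name.]
  • Types that implement a given trait may be of different sizes. This makes it impossible to have things like Vec<Pet> in the example above.
  • dyn Pet is a way to tell the compiler about a dynamically sized type that implements Pet.
  • In the example, pets holds fat pointers to objects that implement Pet. The fat pointer consists of two components, a pointer to the actual object and a pointer to the virtual method table for the Pet implementation of that particular object.
  • Compare these outputs in the above example: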
        println!("{} {}", std::mem::size_of::<Dog>(), std::mem::size_of::<Cat>());
        println!("{} {}", std::mem::size_of::<&Dog>(), std::mem::size_of::<&Cat>());
        println!("{}", std::mem::size_of::<&dyn Pet>());
        println!("{}", std::mem::size_of::<Box<dyn Pet>>());

Deriving Traits

Rust derive macros work by automatically generating code that implements the specified traits for a data structure.

You can let the compiler derive a number of traits as follows:

#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct Player {
    name: String,
    strength: u8,
    hit_points: u8,
}

fn main() {
    let p1 = Player::default();
    let p2 = p1.clone();
    println!("Is {:?}\nequal to {:?}?\nThe answer is {}!", &p1, &p2,
             if p1 == p2 { "yes" } else { "no" });
}

Default Methods

Traits can implement behavior in terms of other trait methods:

trait Equals {
    fn equals(&self, other: &Self) -> bool;
    fn not_equals(&self, other: &Self) -> bool {
        !self.equals(other)
    }
}

#[derive(Debug)]
struct Centimeter(i16);

impl Equals for Centimeter {
    fn equals(&self, other: &Centimeter) -> bool {
        self.0 == other.0
    }
}

fn main() {
    let a = Centimeter(10);
    let b = Centimeter(20);
    println!("{a:?} equals {b:?}: {}", a.equals(&b));
    println!("{a:?} not_equals {b:?}: {}", a.not_equals(&b));
}
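  • Traits may specify pre-implemented (default) methods and methods that users are required to implement themselves. Methods with default implementations can rely on required methods.

  • Move method not_equals to a new trait NotEquals.

  • Make Equals a super trait for NotEquals.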

    trait NotEquals: Equals {
        fn not_equals(&self, other: &Self) -> bool {
            !self.equals(other)
        }
    }
  • Provide a blanket implementation of NotEquals for Equals.

    trait NotEquals {
        fn not_equals(&self, other: &Self) -> bool;
    }
    
    impl<T> NotEquals for T where T: Equals {
        fn not_equals(&self, other: &Self) -> bool {
            !self.equals(other)
        }
    }
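    • With the blanket implementation, you no longer need Equals as a super trait for NotEquals.

Trait Bounds

When working with generics, you often want to require the types to implement some trait, so that you can call this trait's methods.

You can do this with T: Trait or impl Trait: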

fn duplicate<T: Clone>(a: T) -> (T, T) {
    (a.clone(), a.clone())
}

// Syntactic sugar for:
//   fn add_42_millions<T: Into<i32>>(x: T) -> i32 {
fn add_42_millions(x: impl Into<i32>) -> i32 {
    x.into() + 42_000_000
}

// struct NotClonable;

fn main() {
    let foo = String::from("foo");
    let pair = duplicate(foo);
    println!("{pair:?}");

    let many = add_42_millions(42_i8);
    println!("{many}");
    let many_more = add_42_millions(10_000_000);
    println!("{many_more}");
}

Show a where clause, students will encounter it when reading code.

fn duplicate<T>(a: T) -> (T, T)
where
    T: Clone,
{
    (a.clone(), a.clone())
}
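  • It declutters the function signature if you have many parameters.
  • It has additional features making it more powerful.
    • If someone asks, the extra feature is that the type on the left of ":" can be arbitrary, like Option<T>.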
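
As an illustration of that extra feature, the sketch below puts a bound on Option<T> rather than on T itself. The function name describe is made up for this example.

```rust
use std::fmt::Debug;

/// Hypothetical helper: the bound is on `Option<T>`, not on `T` alone.
/// A bound on a type other than a bare parameter needs a `where` clause.
fn describe<T>(value: T) -> String
where
    Option<T>: Debug,
{
    format!("{:?}", Some(value))
}

fn main() {
    println!("{}", describe(5));
    println!("{}", describe("hi"));
}
```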

impl Trait

Similar to trait bounds, an impl Trait syntax can be used in function arguments and return values:

use std::fmt::Display;

fn get_x(name: impl Display) -> impl Display {
    format!("Hello {name}")
}

fn main() {
    let x = get_x("foo");
    println!("{x}");
}
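  • impl Trait allows you to work with types which you cannot name.

The meaning of impl Trait is a bit different in the different positions.

  • For a parameter, impl Trait is like an anonymous generic parameter with a trait bound.

  • For a return type, it means that the return type is some concrete type that implements the trait, without naming the type. This can be useful when you don't want to expose the concrete type in a public API.

    Inference is hard in return position. A function returning impl Foo picks the concrete type it returns, without writing it out in the source. A function returning a generic type like collect<B>() -> B can return any type satisfying B, and the caller may need to choose one, such as with let x: Vec<_> = foo.collect() or with the turbofish, foo.collect::<Vec<_>>().

This example is great, because it uses impl Display twice. It helps to explain that nothing here enforces that it is the same impl Display type. If we used a single T: Display, it would enforce the constraint that input T and return T type are the same type. It would not work for this particular function, as the type we expect as input is likely not what format! returns. If we wanted to do the same via : Display syntax, we'd need two independent generic parameters.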
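
To show that argument-position impl Trait is close to a named generic parameter, and that the returned type is independent of the input type, here is a short sketch. The function names greet_impl and greet_generic are made up for illustration.

```rust
use std::fmt::Display;

// Argument position: an anonymous generic parameter with a trait bound.
fn greet_impl(name: impl Display) -> impl Display {
    format!("Hello {name}")
}

// The same argument written with a named generic parameter.
fn greet_generic<T: Display>(name: T) -> impl Display {
    format!("Hello {name}")
}

fn main() {
    // The input is an integer or a &str; the hidden return type is String.
    println!("{}", greet_impl(42));
    println!("{}", greet_generic("foo"));
}
```

One practical difference is that callers can write greet_generic::<&str>("foo") with the turbofish, which is not possible for the impl Trait argument.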

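Important Traits

We will now look at some of the most common traits of the Rust standard library:

Iterators

You can implement the Iterator trait on your own types: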

struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        let new_next = self.curr + self.next;
        self.curr = self.next;
        self.next = new_next;
        Some(self.curr)
    }
}

fn main() {
    let fib = Fibonacci { curr: 0, next: 1 };
    for (i, n) in fib.enumerate().take(5) {
        println!("fib({i}): {n}");
    }
}
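  • The Iterator trait implements many common functional programming operations over collections (e.g. map, filter, reduce, etc). This is the trait where you can find all the documentation about them. In Rust these functions should produce code as efficient as equivalent imperative implementations.

  • IntoIterator is the trait that makes for loops work. It is implemented by collection types such as Vec<T> and references to them such as &Vec<T> and &[T]. Ranges also implement it. This is why you can iterate over a vector with for i in some_vec { .. } but some_vec.next() doesn't exist.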
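
As a sketch of the functional operations mentioned above, the Fibonacci iterator from this slide can be combined with take, filter and sum. The helper name sum_even_fib is made up for this example.

```rust
struct Fibonacci {
    curr: u32,
    next: u32,
}

impl Iterator for Fibonacci {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        let new_next = self.curr + self.next;
        self.curr = self.next;
        self.next = new_next;
        Some(self.curr)
    }
}

/// Hypothetical helper: sums the even values among the first `count`
/// numbers produced by the iterator above.
fn sum_even_fib(count: usize) -> u32 {
    Fibonacci { curr: 0, next: 1 }
        .take(count)
        .filter(|n| n % 2 == 0)
        .sum()
}

fn main() {
    // The first 10 values are 1, 1, 2, 3, 5, 8, 13, 21, 34, 55.
    println!("sum of even values: {}", sum_even_fib(10));
}
```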

FromIterator

FromIterator lets you build a collection from an Iterator.

fn main() {
    let primes = vec![2, 3, 5, 7];
    let prime_squares = primes
        .into_iter()
        .map(|prime| prime * prime)
        .collect::<Vec<_>>();
}

Iterator implements fn collect<B>(self) -> B where B: FromIterator<Self::Item>, Self: Sized

There are also implementations which let you do cool things like convert an Iterator<Item = Result<V, E>> into a Result<Vec<V>, E>.
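
A minimal sketch of that conversion, assuming a made-up helper named parse_all:

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parses every input, stopping at the first error.
/// Collecting an iterator of Result values into Result<Vec<_>, _> yields
/// Ok with all values, or the first Err encountered.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    println!("{:?}", parse_all(&["1", "2", "3"]));
    println!("{:?}", parse_all(&["1", "x", "3"]));
}
```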

From and Into

Types implement From and Into to facilitate type conversions:

fn main() {
    let s = String::from("hello");
    let addr = std::net::Ipv4Addr::from([127, 0, 0, 1]);
    let one = i16::from(true);
    let bigger = i32::from(123i16);
    println!("{s}, {addr}, {one}, {bigger}");
}

Into is automatically implemented when From is implemented:

fn main() {
    let s: String = "hello".into();
    let addr: std::net::Ipv4Addr = [127, 0, 0, 1].into();
    let one: i16 = true.into();
    let bigger: i32 = 123i16.into();
    println!("{s}, {addr}, {one}, {bigger}");
}
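  • That's why it is common to only implement From, as your type will get the Into implementation too.
  • When declaring a function argument input type like "anything that can be converted into a String", the rule is opposite: you should use Into. Your function will accept types that implement From and those that only implement Into.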
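
The second point can be sketched with a constructor that accepts anything convertible into a String. The User type is made up for this example.

```rust
/// Hypothetical type used only for this illustration.
struct User {
    name: String,
}

impl User {
    // Accepts &str, String, and any other type implementing Into<String>.
    fn new(name: impl Into<String>) -> Self {
        Self { name: name.into() }
    }
}

fn main() {
    let alice = User::new("alice");
    let bob = User::new(String::from("bob"));
    println!("{} and {}", alice.name, bob.name);
}
```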

Read and Write

Using Read and BufRead, you can abstract over u8 sources:

use std::io::{BufRead, BufReader, Read, Result};

fn count_lines<R: Read>(reader: R) -> usize {
    let buf_reader = BufReader::new(reader);
    buf_reader.lines().count()
}

fn main() -> Result<()> {
    let slice: &[u8] = b"foo\nbar\nbaz\n";
    println!("lines in slice: {}", count_lines(slice));

    let file = std::fs::File::open(std::env::current_exe()?)?;
    println!("lines in file: {}", count_lines(file));
    Ok(())
}

Similarly, Write lets you abstract over u8 sinks:

use std::io::{Result, Write};

fn log<W: Write>(writer: &mut W, msg: &str) -> Result<()> {
    writer.write_all(msg.as_bytes())?;
    writer.write_all("\n".as_bytes())
}

fn main() -> Result<()> {
    let mut buffer = Vec::new();
    log(&mut buffer, "Hello")?;
    log(&mut buffer, "World")?;
    println!("Logged: {:?}", buffer);
    Ok(())
}

The Drop Trait

Values which implement Drop can specify code to run when they go out of scope:

struct Droppable {
    name: &'static str,
}

impl Drop for Droppable {
    fn drop(&mut self) {
        println!("Dropping {}", self.name);
    }
}

fn main() {
    let a = Droppable { name: "a" };
    {
        let b = Droppable { name: "b" };
        {
            let c = Droppable { name: "c" };
            let d = Droppable { name: "d" };
            println!("Exiting block B");
        }
        println!("Exiting block A");
    }
    drop(a);
    println!("Exiting main");
}

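Discussion points:

  • Why doesn't Drop::drop take self?
    • Short answer: If it did, std::mem::drop would be called at the end of the block, resulting in another call to Drop::drop, and a stack overflow!
  • Try replacing drop(a) with a.drop().

The Default Trait

The Default trait produces a default value for a type.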

#[derive(Debug, Default)]
struct Derived {
    x: u32,
    y: String,
    z: Implemented,
}

#[derive(Debug)]
struct Implemented(String);

impl Default for Implemented {
    fn default() -> Self {
        Self("John Smith".into())
    }
}

fn main() {
    let default_struct = Derived::default();
    println!("{default_struct:#?}");

    let almost_default_struct = Derived {
        y: "Y is set!".into(),
        ..Derived::default()
    };
    println!("{almost_default_struct:#?}");

    let nothing: Option<Derived> = None;
    println!("{:#?}", nothing.unwrap_or_default());
}
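  • It can be implemented directly or it can be derived via #[derive(Default)].
  • A derived implementation will produce a value where all fields are set to their default values.
    • This means all types in the struct must implement Default too.
  • Standard Rust types often implement Default with reasonable values (e.g. 0, "", etc).
  • The partial struct copy works nicely with default.
  • The Rust standard library is aware that types can implement Default and provides convenience methods that use it.
  • The .. syntax is called struct update syntax.

Add and Mul

Operator overloading is implemented via traits in std::ops: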

#[derive(Debug, Copy, Clone)]
struct Point { x: i32, y: i32 }

impl std::ops::Add for Point {
    type Output = Self;

    fn add(self, other: Self) -> Self {
        Self {x: self.x + other.x, y: self.y + other.y}
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    println!("{:?} + {:?} = {:?}", p1, p2, p1 + p2);
}

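Discussion points:

  • You could implement Add for &Point. In which situations is that useful?
    • Answer: Add::add consumes self. If type T for which you are overloading the operator is not Copy, you should consider overloading the operator for &T as well. This avoids unnecessary cloning on the call site.
  • Why is Output an associated type? Could it be made a type parameter of the method?
    • Short answer: Function type parameters are controlled by the caller, but associated types (like Output) are controlled by the implementor of a trait.
  • You could implement Add for two different types, e.g. impl Add<(i32, i32)> for Point would add a tuple to a Point.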
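
The first and last discussion points can be sketched as follows. This variant of Point deliberately does not derive Copy, so adding references avoids moving or cloning the operands; the tuple variant is implemented on &Point here, which is a choice made for this example.

```rust
use std::ops::Add;

#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Add for references: the operands are only borrowed.
impl Add for &Point {
    type Output = Point;

    fn add(self, other: Self) -> Point {
        Point { x: self.x + other.x, y: self.y + other.y }
    }
}

// Add with a different right-hand side type: a tuple of offsets.
impl Add<(i32, i32)> for &Point {
    type Output = Point;

    fn add(self, (dx, dy): (i32, i32)) -> Point {
        Point { x: self.x + dx, y: self.y + dy }
    }
}

fn main() {
    let p1 = Point { x: 10, y: 20 };
    let p2 = Point { x: 100, y: 200 };
    println!("{:?}", &p1 + &p2);
    println!("{:?}", &p1 + (1, -1));
    // p1 and p2 are still usable because they were only borrowed.
    println!("{p1:?} {p2:?}");
}
```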

Closures

Closures or lambda expressions have types which cannot be named. However, they implement special Fn, FnMut, and FnOnce traits:

fn apply_with_log(func: impl FnOnce(i32) -> i32, input: i32) -> i32 {
    println!("Calling function on {input}");
    func(input)
}

fn main() {
    let add_3 = |x| x + 3;
    println!("add_3: {}", apply_with_log(add_3, 10));
    println!("add_3: {}", apply_with_log(add_3, 20));

    let mut v = Vec::new();
    let mut accumulate = |x: i32| {
        v.push(x);
        v.iter().sum::<i32>()
    };
    println!("accumulate: {}", apply_with_log(&mut accumulate, 4));
    println!("accumulate: {}", apply_with_log(&mut accumulate, 5));

    let multiply_sum = |x| x * v.into_iter().sum::<i32>();
    println!("multiply_sum: {}", apply_with_log(multiply_sum, 3));
}

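An Fn (e.g. add_3) neither consumes nor mutates captured values, or perhaps captures nothing at all. It can be called multiple times concurrently.

An FnMut (e.g. accumulate) might mutate captured values. You can call it multiple times, but not concurrently.

If you have an FnOnce (e.g. multiply_sum), you may only call it once. It might consume captured values.

FnMut is a subtype of FnOnce. Fn is a subtype of FnMut and FnOnce. I.e. you can use an FnMut wherever an FnOnce is called for, and you can use an Fn wherever an FnMut or FnOnce is called for.

The compiler also infers Copy (e.g. for add_3) and Clone (e.g. multiply_sum), depending on what the closure captures.

By default, closures will capture by reference if they can. The move keyword makes them capture by value.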

fn make_greeter(prefix: String) -> impl Fn(&str) {
    return move |name| println!("{} {}", prefix, name)
}

fn main() {
    let hi = make_greeter("Hi".to_string());
    hi("there");
}
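
In the same spirit, a move closure that mutates its captured state can be returned as an FnMut. This is a small sketch; the name make_counter is made up for the example.

```rust
/// Hypothetical helper: returns a closure that owns its running total.
/// `move` transfers `total` and `step` into the closure, and because the
/// closure mutates `total`, it implements FnMut but not Fn.
fn make_counter(step: i32) -> impl FnMut() -> i32 {
    let mut total = 0;
    move || {
        total += step;
        total
    }
}

fn main() {
    let mut count_by_two = make_counter(2);
    println!("{}", count_by_two());
    println!("{}", count_by_two());
}
```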

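Day 3: Morning Exercises

We will design a classical GUI library using traits and trait objects.

We will also look at enum dispatch with an exercise involving points and polygons.

After looking at the exercises, you can look at the solutions provided.

A Simple GUI Library

Let us design a classical GUI library using our new knowledge of traits and trait objects.

We will have a number of widgets in our library:

  • Window: has a title and contains other widgets.
  • Button: has a label and a callback function which is invoked when the button is pressed.
  • Label: has a label.

The widgets will implement a Widget trait, see below.

Copy the code below to https://play.rust-lang.org/, fill in the missing draw_into methods so that you implement the Widget trait: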

// TODO: remove this when you're done with your implementation.
#![allow(unused_imports, unused_variables, dead_code)]

pub trait Widget {
    /// Natural width of `self`.
    fn width(&self) -> usize;

    /// Draw the widget into a buffer.
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write);

    /// Draw the widget on standard output.
    fn draw(&self) {
        let mut buffer = String::new();
        self.draw_into(&mut buffer);
        println!("{buffer}");
    }
}

pub struct Label {
    label: String,
}

impl Label {
    fn new(label: &str) -> Label {
        Label {
            label: label.to_owned(),
        }
    }
}

pub struct Button {
    label: Label,
    callback: Box<dyn FnMut()>,
}

impl Button {
    fn new(label: &str, callback: Box<dyn FnMut()>) -> Button {
        Button {
            label: Label::new(label),
            callback,
        }
    }
}

pub struct Window {
    title: String,
    widgets: Vec<Box<dyn Widget>>,
}

impl Window {
    fn new(title: &str) -> Window {
        Window {
            title: title.to_owned(),
            widgets: Vec::new(),
        }
    }

    fn add_widget(&mut self, widget: Box<dyn Widget>) {
        self.widgets.push(widget);
    }

    fn inner_width(&self) -> usize {
        std::cmp::max(
            self.title.chars().count(),
            self.widgets.iter().map(|w| w.width()).max().unwrap_or(0),
        )
    }
}


impl Widget for Label {
    fn width(&self) -> usize {
        unimplemented!()
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        unimplemented!()
    }
}

impl Widget for Button {
    fn width(&self) -> usize {
        unimplemented!()
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        unimplemented!()
    }
}

impl Widget for Window {
    fn width(&self) -> usize {
        unimplemented!()
    }

    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        unimplemented!()
    }
}

fn main() {
    let mut window = Window::new("Rust GUI Demo 1.23");
    window.add_widget(Box::new(Label::new("This is a small text GUI demo.")));
    window.add_widget(Box::new(Button::new(
        "Click me!",
        Box::new(|| println!("You clicked the button!")),
    )));
    window.draw();
}

The output of the above program can be something simple like this:

========
Rust GUI Demo 1.23
========

This is a small text GUI demo.

| Click me! |

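If you want to draw aligned text, you can use the fill/alignment formatting operators. In particular, notice how you can pad with different characters (here a '/') and how you can control alignment: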

fn main() {
    let width = 10;
    println!("left aligned:  |{:/<width$}|", "foo");
    println!("centered:      |{:/^width$}|", "foo");
    println!("right aligned: |{:/>width$}|", "foo");
}

Using such alignment tricks, you can for example produce output like this:

+--------------------------------+
|       Rust GUI Demo 1.23       |
+================================+
| This is a small text GUI demo. |
| +-----------+                  |
| | Click me! |                  |
| +-----------+                  |
+--------------------------------+

Polygon Struct

We will create a Polygon struct which contains some points. Copy the code below to https://play.rust-lang.org/ and fill in the missing methods to make the tests pass:

// TODO: remove this when you're done with your implementation.
#![allow(unused_variables, dead_code)]

pub struct Point {
    // add fields
}

impl Point {
    // add methods
}

pub struct Polygon {
    // add fields
}

impl Polygon {
    // add methods
}

pub struct Circle {
    // add fields
}

impl Circle {
    // add methods
}

pub enum Shape {
    Polygon(Polygon),
    Circle(Circle),
}

#[cfg(test)]
mod tests {
    use super::*;

    fn round_two_digits(x: f64) -> f64 {
        (x * 100.0).round() / 100.0
    }

    #[test]
    fn test_point_magnitude() {
        let p1 = Point::new(12, 13);
        assert_eq!(round_two_digits(p1.magnitude()), 17.69);
    }

    #[test]
    fn test_point_dist() {
        let p1 = Point::new(10, 10);
        let p2 = Point::new(14, 13);
        assert_eq!(round_two_digits(p1.dist(p2)), 5.00);
    }

    #[test]
    fn test_point_add() {
        let p1 = Point::new(16, 16);
        let p2 = p1 + Point::new(-4, 3);
        assert_eq!(p2, Point::new(12, 19));
    }

    #[test]
    fn test_polygon_left_most_point() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);
        assert_eq!(poly.left_most_point(), Some(p1));
    }

    #[test]
    fn test_polygon_iter() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);

        let points = poly.iter().cloned().collect::<Vec<_>>();
        assert_eq!(points, vec![Point::new(12, 13), Point::new(16, 16)]);
    }

    #[test]
    fn test_shape_perimeters() {
        let mut poly = Polygon::new();
        poly.add_point(Point::new(12, 13));
        poly.add_point(Point::new(17, 11));
        poly.add_point(Point::new(16, 16));
        let shapes = vec![
            Shape::from(poly),
            Shape::from(Circle::new(Point::new(10, 20), 5)),
        ];
        let perimeters = shapes
            .iter()
            .map(Shape::perimeter)
            .map(round_two_digits)
            .collect::<Vec<_>>();
        assert_eq!(perimeters, vec![15.48, 31.42]);
    }
}

#[allow(dead_code)]
fn main() {}

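Since the method signatures are missing from the problem statements, the key part of the exercise is to specify those correctly. You don't have to modify the tests.

Other interesting parts of the exercise:

  • Derive a Copy trait for some structs, as in tests the methods sometimes don't borrow their arguments.
  • Discover that the Add trait must be implemented for two objects to be addable via "+". Note that we do not discuss generics until Day 3.

Error Handling

Error handling in Rust is done using explicit control flow:

  • Functions that can have errors list this in their return type.
  • There are no exceptions.

Panics

Rust will trigger a panic if a fatal error happens at runtime: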

fn main() {
    let v = vec![10, 20, 30];
    println!("v[100]: {}", v[100]);
}
  • Panics are for unrecoverable and unexpected errors.
    • Panics are symptoms of bugs in the program.
  • Use non-panicking APIs (such as Vec::get) if crashing is not acceptable.
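
A minimal sketch of the non-panicking alternative, assuming a made-up helper named element_or_zero:

```rust
/// Hypothetical helper: returns the element at `index`, or 0 if the index
/// is out of bounds, instead of panicking like `v[index]` would.
fn element_or_zero(v: &[i32], index: usize) -> i32 {
    v.get(index).copied().unwrap_or(0)
}

fn main() {
    let v = vec![10, 20, 30];
    println!("v[1]: {}", element_or_zero(&v, 1));
    println!("v[100]: {}", element_or_zero(&v, 100));
}
```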

Catching the Stack Unwinding

By default, a panic will cause the stack to unwind. The unwinding can be caught:

use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        println!("hello!");
    });
    assert!(result.is_ok());
    
    let result = panic::catch_unwind(|| {
        panic!("oh no!");
    });
    assert!(result.is_err());
}
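  • This can be useful in servers which should keep running even if a single request crashes.
  • This does not work if panic = 'abort' is set in your Cargo.toml.

Structured Error Handling with Result

We have seen the Result enum already. This is used pervasively when errors are expected as part of normal operation: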

use std::fs;
use std::io::Read;

fn main() {
    let file = fs::File::open("diary.txt");
    match file {
        Ok(mut file) => {
            let mut contents = String::new();
            file.read_to_string(&mut contents);
            println!("Dear diary: {contents}");
        },
        Err(err) => {
            println!("The diary could not be opened: {err}");
        }
    }
}
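  • As with Option, the successful value sits inside of Result, forcing the developer to explicitly extract it. This encourages error checking. In the case where an error should never happen, unwrap() or expect() can be called, and this is a signal of the developer intent too.
  • Result documentation is a recommended read. Not during the course, but it is worth mentioning. It contains a lot of convenience methods and functions that help functional-style programming.

Propagating Errors with ?

The try-operator ? is used to return errors to the caller. It lets you turn the common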

match some_expression {
    Ok(value) => value,
    Err(err) => return Err(err),
}

into the much simpler

some_expression?

We can use this to simplify our error handling code:

use std::{fs, io};
use std::io::Read;

fn read_username(path: &str) -> Result<String, io::Error> {
    let username_file_result = fs::File::open(path);
    let mut username_file = match username_file_result {
        Ok(file) => file,
        Err(err) => return Err(err),
    };

    let mut username = String::new();
    match username_file.read_to_string(&mut username) {
        Ok(_) => Ok(username),
        Err(err) => Err(err),
    }
}

fn main() {
    //fs::write("config.dat", "alice").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}

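Key points:

  • The username variable can be either Ok(string) or Err(error).
  • Use the fs::write call to test out the different scenarios: no file, empty file, file with username.
  • The return type of the function has to be compatible with the nested functions it calls. For instance, a function returning a Result<T, Err> can only apply the ? operator on a function returning a Result<AnyT, Err>. It cannot apply the ? operator on a function returning an Option<AnyT>, and it can only apply it on a function returning a Result<T, OtherErr> if Err implements From<OtherErr>. Reciprocally, a function returning an Option<T> can only apply the ? operator on a function returning an Option<AnyT>.
    • You can convert incompatible types into one another with the different Option and Result methods like Option::ok_or, Result::ok, Result::err.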
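
As a sketch of the last point, an Option can be turned into a Result before applying ? inside a function that returns Result. The function name and the error message are made up for this example, and ok_or_else is used as the lazily evaluated sibling of Option::ok_or.

```rust
/// Hypothetical helper: returns the first character in upper case.
/// `chars().next()` yields an Option, which `?` cannot propagate from a
/// function returning Result, so it is converted with ok_or_else first.
fn first_char_upper(s: &str) -> Result<char, String> {
    let c = s
        .chars()
        .next()
        .ok_or_else(|| String::from("empty input"))?;
    Ok(c.to_ascii_uppercase())
}

fn main() {
    println!("{:?}", first_char_upper("rust"));
    println!("{:?}", first_char_upper(""));
}
```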

Converting Error Types

The effective expansion of ? is a little more complicated than previously indicated:

expression?

works the same as

match expression {
    Ok(value) => value,
    Err(err)  => return Err(From::from(err)),
}

The From::from call here means we attempt to convert the error type to the type returned by the function:

use std::error::Error;
use std::fmt::{self, Display, Formatter};
use std::fs::{self, File};
use std::io::{self, Read};

#[derive(Debug)]
enum ReadUsernameError {
    IoError(io::Error),
    EmptyUsername(String),
}

impl Error for ReadUsernameError {}

impl Display for ReadUsernameError {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Self::IoError(e) => write!(f, "IO error: {e}"),
            Self::EmptyUsername(filename) => write!(f, "Found no username in {filename}"),
        }
    }
}

impl From<io::Error> for ReadUsernameError {
    fn from(err: io::Error) -> ReadUsernameError {
        ReadUsernameError::IoError(err)
    }
}

fn read_username(path: &str) -> Result<String, ReadUsernameError> {
    let mut username = String::with_capacity(100);
    File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(ReadUsernameError::EmptyUsername(String::from(path)));
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    let username = read_username("config.dat");
    println!("username or error: {username:?}");
}

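Key points:

  • The username variable can be either Ok(string) or Err(error).
  • Use the fs::write call to test out the different scenarios: no file, empty file, file with username.

It is good practice for all error types that don't need to be no_std to implement std::error::Error, which requires Debug and Display. At the time this material was written, the Error trait in core was only available on nightly, so such error types were not yet fully no_std compatible.

It's generally helpful for them to implement Clone and Eq too where possible, to make life easier for tests and consumers of your library. In this case we can't easily do so, because io::Error doesn't implement them.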
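
For contrast, here is a minimal sketch of an error type that can derive Clone and Eq because it holds no io::Error. ParseAgeError and parse_age are hypothetical names used only for this illustration.

```rust
use std::error::Error;
use std::fmt::{self, Display, Formatter};

/// Hypothetical error type: every field is Clone + Eq, so the derives work.
#[derive(Clone, Debug, PartialEq, Eq)]
enum ParseAgeError {
    Empty,
    Invalid(String),
}

impl Display for ParseAgeError {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Self::Empty => write!(f, "no age given"),
            Self::Invalid(text) => write!(f, "not a valid age: {text}"),
        }
    }
}

impl Error for ParseAgeError {}

/// Parses an age in the range of u8 from user input.
fn parse_age(text: &str) -> Result<u8, ParseAgeError> {
    let text = text.trim();
    if text.is_empty() {
        return Err(ParseAgeError::Empty);
    }
    text.parse::<u8>()
        .map_err(|_| ParseAgeError::Invalid(text.to_string()))
}

fn main() {
    // Because the error implements PartialEq, whole Results can be compared.
    println!("{}", parse_age("") == Err(ParseAgeError::Empty));
    println!("{:?}", parse_age("42"));
}
```

Being able to write assert_eq! directly on the Result is what makes such error types pleasant to test.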

Deriving Error Enums

The thiserror crate is a popular way to create an error enum like the one we wrote on the previous page:

use std::{fs, io};
use std::io::Read;
use thiserror::Error;

#[derive(Debug, Error)]
enum ReadUsernameError {
    #[error("Could not read: {0}")]
    IoError(#[from] io::Error),
    #[error("Found no username in {0}")]
    EmptyUsername(String),
}

fn read_username(path: &str) -> Result<String, ReadUsernameError> {
    let mut username = String::new();
    fs::File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(ReadUsernameError::EmptyUsername(String::from(path)));
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err}"),
    }
}

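thiserror's derive macro automatically implements std::error::Error, and optionally Display (if the #[error(...)] attributes are provided) and From (if the #[from] attribute is added). It also works for structs.

It doesn't affect your public API, which makes it a good fit for libraries.

Dynamic Error Types

Sometimes we want to allow any type of error to be returned without writing our own enum covering all the different possibilities. std::error::Error makes this easy.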

use std::fs;
use std::io::Read;
use thiserror::Error;
use std::error::Error;

#[derive(Clone, Debug, Eq, Error, PartialEq)]
#[error("Found no username in {0}")]
struct EmptyUsernameError(String);

fn read_username(path: &str) -> Result<String, Box<dyn Error>> {
    let mut username = String::new();
    fs::File::open(path)?.read_to_string(&mut username)?;
    if username.is_empty() {
        return Err(EmptyUsernameError(String::from(path)).into());
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err}"),
    }
}

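This saves on code, but gives up the ability to cleanly handle different error cases differently in the program. As such it's generally not a good idea to use Box<dyn Error> in the public API of a library, but it can be a good option in a program where you just want to display the error message somewhere.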
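
To make the trade-off concrete, the sketch below shows what handling distinct cases looks like once errors are boxed: the caller has to downcast and guess at the concrete types. NegativeError, parse_positive, and classify are hypothetical names for this example only.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error for this sketch: the number parsed but is negative.
#[derive(Debug)]
struct NegativeError(i64);

impl fmt::Display for NegativeError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{} is negative", self.0)
    }
}

impl Error for NegativeError {}

fn parse_positive(text: &str) -> Result<i64, Box<dyn Error>> {
    // `?` boxes the ParseIntError into Box<dyn Error> via From.
    let n: i64 = text.parse()?;
    if n < 0 {
        return Err(Box::new(NegativeError(n)));
    }
    Ok(n)
}

/// Recovers the concrete error type by downcasting. Nothing in the signature
/// of parse_positive tells the caller which types are worth trying.
fn classify(err: &(dyn Error + 'static)) -> &'static str {
    if err.downcast_ref::<NegativeError>().is_some() {
        "negative"
    } else if err.downcast_ref::<ParseIntError>().is_some() {
        "not a number"
    } else {
        "unknown"
    }
}

fn main() {
    for input in ["7", "-3", "seven"] {
        match parse_positive(input) {
            Ok(n) => println!("{input}: ok, {n}"),
            Err(err) => println!("{input}: {} ({err})", classify(&*err)),
        }
    }
}
```

With an error enum, the same distinction would be an exhaustive match checked by the compiler.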

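Adding Context to Errors

The widely used anyhow crate can help you add contextual information to your errors and allows you to have fewer custom error types: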

use std::fs;
use std::io::Read;
use anyhow::{Context, Result, bail};

fn read_username(path: &str) -> Result<String> {
    let mut username = String::with_capacity(100);
    fs::File::open(path)
        .with_context(|| format!("Failed to open {path}"))?
        .read_to_string(&mut username)
        .context("Failed to read")?;
    if username.is_empty() {
        bail!("Found no username in {path}");
    }
    Ok(username)
}

fn main() {
    //fs::write("config.dat", "").unwrap();
    match read_username("config.dat") {
        Ok(username) => println!("Username: {username}"),
        Err(err)     => println!("Error: {err:?}"),
    }
}
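  • anyhow::Result<V> is a type alias for Result<V, anyhow::Error>.
  • anyhow::Error is essentially a wrapper around Box<dyn Error>. As such it's again generally not a good choice for the public API of a library, but is widely used in applications.
  • The actual error type inside it can be extracted for examination if necessary.
  • Functionality provided by anyhow::Result<T> may be familiar to Go developers, as it provides similar usage patterns and ergonomics to (T, error) from Go.

Testing

Rust and Cargo come with a simple unit test framework:

  • Unit tests are supported throughout your code.

  • Integration tests are supported via the tests/ directory.

Unit Tests

Mark unit tests with #[test]: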

fn first_word(text: &str) -> &str {
    match text.find(' ') {
        Some(idx) => &text[..idx],
        None => &text,
    }
}

#[test]
fn test_empty() {
    assert_eq!(first_word(""), "");
}

#[test]
fn test_single_word() {
    assert_eq!(first_word("Hello"), "Hello");
}

#[test]
fn test_multiple_words() {
    assert_eq!(first_word("Hello World"), "Hello");
}

Use cargo test to find and run the unit tests.

Test Modules

Unit tests are often put in a nested module (run tests on the Playground):

fn helper(a: &str, b: &str) -> String {
    format!("{a} {b}")
}

pub fn main() {
    println!("{}", helper("Hello", "World"));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_helper() {
        assert_eq!(helper("foo", "bar"), "foo bar");
    }
}
  • This lets you unit test private helpers.
  • The #[cfg(test)] attribute is only active when you run cargo test.

Documentation Tests

Rust has built-in support for documentation tests:

/// Shortens a string to the given length.
///
/// ```
/// use playground::shorten_string;
/// assert_eq!(shorten_string("Hello World", 5), "Hello");
/// assert_eq!(shorten_string("Hello World", 20), "Hello World");
/// ```
pub fn shorten_string(s: &str, length: usize) -> &str {
    &s[..std::cmp::min(length, s.len())]
}
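  • Code blocks in /// comments are automatically seen as Rust code.
  • The code will be compiled and executed as part of cargo test.
  • Test the above code on the Rust Playground.

Integration Tests

If you want to test your library as a client, use an integration test.

Create a .rs file under tests/: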

use my_library::init;

#[test]
fn test_init() {
    assert!(init().is_ok());
}

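These tests only have access to the public API of your crate.

Useful crates for writing tests

Rust comes with only basic support for writing tests.

Here are some additional crates which we recommend for writing tests:

  • googletest: Comprehensive test assertion library in the tradition of GoogleTest for C++.
  • proptest: Property-based testing for Rust.
  • rstest: Support for fixtures and parameterised tests.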

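Unsafe Rust

The Rust language has two parts:

  • Safe Rust: memory safe, no undefined behavior possible.
  • Unsafe Rust: can trigger undefined behavior if preconditions are violated.

We will be seeing mostly safe Rust in this course, but it's important to know what Unsafe Rust is.

Unsafe code is usually small and isolated, and is wrapped in a safe abstraction layer. Its correctness should be carefully documented.

Unsafe Rust gives you access to five new capabilities:

  • Dereference raw pointers.
  • Access or modify mutable static variables.
  • Access union fields.
  • Call unsafe functions, including extern functions.
  • Implement unsafe traits.

We will briefly cover unsafe capabilities next. For full details, please see Chapter 19.1 in the Rust Book and the Rustonomicon.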

Unsafe Rust does not mean the code is incorrect. It means that the compiler no longer enforces Rust’s memory-safety rules for that code, so developers have to write correct code by themselves.
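
A minimal sketch of the "small, isolated, wrapped in a safe abstraction" pattern: the function below is safe to call because it establishes the precondition itself before entering the unsafe block. The name middle is made up for this example.

```rust
/// Returns the middle element of a slice, or None if the slice is empty.
fn middle<T>(items: &[T]) -> Option<&T> {
    if items.is_empty() {
        return None;
    }
    let idx = items.len() / 2;
    // SAFETY: `items` is non-empty, so `len / 2 < len` and `idx` is in bounds.
    Some(unsafe { items.get_unchecked(idx) })
}

fn main() {
    println!("{:?}", middle(&[1, 2, 3]));
    println!("{:?}", middle::<i32>(&[]));
}
```

Callers cannot trigger undefined behaviour through this function, so the unsafe block does not leak into its public interface. In real code, plain indexing or items.get(idx) would do the same job without unsafe.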

Dereferencing Raw Pointers

Creating pointers is safe, but dereferencing them requires unsafe:

fn main() {
    let mut num = 5;

    let r1 = &mut num as *mut i32;
    let r2 = r1 as *const i32;

    // Safe because r1 and r2 were obtained from references and so are
    // guaranteed to be non-null and properly aligned, the objects underlying
    // the references from which they were obtained are live throughout the
    // whole unsafe block, and they are not accessed either through the
    // references or concurrently through any other pointers.
    unsafe {
        println!("r1 is: {}", *r1);
        *r1 = 10;
        println!("r2 is: {}", *r2);
    }
}

It is good practice (and required by the Android Rust style guide) to write a comment for each unsafe block explaining how the code inside it satisfies the safety requirements of the unsafe operations it is doing.

In the case of pointer dereferences, this means that the pointers must be valid, i.e.:

  • The pointer must be non-null.
  • The pointer must be dereferenceable (within the bounds of a single allocated object).
  • The object must not have been deallocated.
  • There must not be concurrent accesses to the same location.
  • If the pointer was obtained by casting a reference, the underlying object must be live and no reference may be used to access the memory.

In most cases the pointer must also be properly aligned.
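
The requirements above can be sketched in code. The standard library method as_ref on raw pointers checks the non-null requirement for us; the remaining ones stay with the caller, which is why the function is still unsafe. read_or_none is a hypothetical helper for this illustration.

```rust
/// Reads the value behind `ptr`, returning None when the pointer is null.
///
/// # Safety
///
/// If `ptr` is non-null it must be properly aligned and point to a live,
/// initialised i32 that is not being written to concurrently.
unsafe fn read_or_none(ptr: *const i32) -> Option<i32> {
    // SAFETY: `as_ref` handles the null case; the caller guarantees the
    // remaining validity requirements documented above.
    unsafe { ptr.as_ref() }.copied()
}

fn main() {
    let value = 5;
    let valid: *const i32 = &value;
    let null: *const i32 = std::ptr::null();

    // SAFETY: `valid` was obtained from a reference to `value`, which is live
    // for the whole block; `null` is handled explicitly by the function.
    unsafe {
        println!("valid: {:?}", read_or_none(valid));
        println!("null: {:?}", read_or_none(null));
    }
}
```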

Mutable Static Variables

It is safe to read an immutable static variable:

static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("HELLO_WORLD: {HELLO_WORLD}");
}

However, since data races can occur, it is unsafe to read and write mutable static variables:

static mut COUNTER: u32 = 0;

fn add_to_counter(inc: u32) {
    unsafe { COUNTER += inc; }  // Potential data race!
}

fn main() {
    add_to_counter(42);

    unsafe { println!("COUNTER: {COUNTER}"); }  // Potential data race!
}

Using a mutable static is generally a bad idea, but there are some cases where it might make sense in low-level no_std code, such as implementing a heap allocator or working with some C APIs.
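
Where a plain counter is all that is needed, one possible alternative (an illustration, not something the course prescribes) is an atomic in an immutable static. This needs no unsafe at all, and the atomic types also live in core, so they are available in no_std code on targets that support atomics.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// An immutable static with interior mutability: no `static mut` needed.
static COUNTER: AtomicU32 = AtomicU32::new(0);

/// Adds `inc` to the counter and returns the new total.
fn add_to_counter(inc: u32) -> u32 {
    // `fetch_add` returns the previous value and wraps around on overflow.
    COUNTER.fetch_add(inc, Ordering::Relaxed).wrapping_add(inc)
}

fn main() {
    add_to_counter(42);
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed));
}
```

Relaxed ordering is enough here because the counter does not guard any other data; stronger orderings are needed when other memory accesses must be synchronised with it.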

Unions

Unions are like enums, but you need to track the active field yourself:

#[repr(C)]
union MyUnion {
    i: u8,
    b: bool,
}

fn main() {
    let u = MyUnion { i: 42 };
    println!("int: {}", unsafe { u.i });
    println!("bool: {}", unsafe { u.b });  // Undefined behavior!
}

Unions are very rarely needed in Rust as you can usually use an enum. They are occasionally needed for interacting with C library APIs.

If you just want to reinterpret bytes as a different type, you probably want std::mem::transmute or a safe wrapper such as the zerocopy crate.
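
As a sketch of that last point, here are two ways to view the bits of an f32 as a u32. The function names are made up for this example; f32::to_bits and std::mem::transmute are standard library items.

```rust
/// Safe: the standard library offers a dedicated conversion.
fn float_bits_safe(x: f32) -> u32 {
    x.to_bits()
}

/// Unsafe equivalent using transmute.
fn float_bits_transmute(x: f32) -> u32 {
    // SAFETY: f32 and u32 have the same size, and every bit pattern is a
    // valid u32.
    unsafe { std::mem::transmute::<f32, u32>(x) }
}

fn main() {
    println!("{:#010x}", float_bits_safe(1.0));
    println!("{:#010x}", float_bits_transmute(1.0));
}
```

When a safe conversion such as to_bits or from_le_bytes exists, it is usually preferable to transmute, which in turn is usually preferable to a union.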

Calling Unsafe Functions

A function or method can be marked unsafe if it has extra preconditions you must uphold to avoid undefined behaviour:

fn main() {
    let emojis = "🗻∈🌏";

    // Safe because the indices are in the correct order, within the bounds of
    // the string slice, and lie on UTF-8 sequence boundaries.
    unsafe {
        println!("emoji: {}", emojis.get_unchecked(0..4));
        println!("emoji: {}", emojis.get_unchecked(4..7));
        println!("emoji: {}", emojis.get_unchecked(7..11));
    }

    println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..7) }));

    // Not upholding the UTF-8 encoding requirement breaks memory safety!
    // println!("emoji: {}", unsafe { emojis.get_unchecked(0..3) });
    // println!("char count: {}", count_chars(unsafe { emojis.get_unchecked(0..3) }));
}

fn count_chars(s: &str) -> usize {
    s.chars().map(|_| 1).sum()
}

Writing Unsafe Functions

You can mark your own functions as unsafe if they require particular conditions to avoid undefined behaviour:

/// Swaps the values pointed to by the given pointers.
///
/// # Safety
///
/// The pointers must be valid and properly aligned.
unsafe fn swap(a: *mut u8, b: *mut u8) {
    let temp = *a;
    *a = *b;
    *b = temp;
}

fn main() {
    let mut a = 42;
    let mut b = 66;

    // Safe because ...
    unsafe {
        swap(&mut a, &mut b);
    }

    println!("a = {}, b = {}", a, b);
}

We wouldn’t actually use pointers for this because it can be done safely with references.

Note that unsafe code is allowed within an unsafe function without an unsafe block. We can prohibit this with #[deny(unsafe_op_in_unsafe_fn)]. Try adding it and see what happens.
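
For comparison, a safe version of the swap might look like the sketch below. swap_safe is a hypothetical name; std::mem::swap is the standard library function doing the work.

```rust
/// Swaps the values behind two mutable references.
///
/// The borrow checker guarantees that `a` and `b` are valid, aligned and
/// distinct, so no safety contract is needed.
fn swap_safe(a: &mut u8, b: &mut u8) {
    std::mem::swap(a, b);
}

fn main() {
    let mut a = 42;
    let mut b = 66;
    swap_safe(&mut a, &mut b);
    println!("a = {a}, b = {b}");
}
```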

Calling External Code

Functions from other languages might violate the guarantees of Rust. Calling them is thus unsafe:

extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        // Undefined behavior if abs misbehaves.
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}

This is usually only a problem for extern functions which do things with pointers which might violate Rust’s memory model, but in general any C function might have undefined behaviour under any arbitrary circumstances.

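The "C" in this example is the ABI; other ABIs are available too.

Implementing Unsafe Traits

Like with functions, you can mark a trait as unsafe if the implementation must guarantee particular conditions to avoid undefined behaviour.

For example, the zerocopy crate has an unsafe trait that looks something like this: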

use std::mem::size_of_val;
use std::slice;

/// ...
/// # Safety
/// The type must have a defined representation and no padding.
pub unsafe trait AsBytes {
    fn as_bytes(&self) -> &[u8] {
        unsafe {
            slice::from_raw_parts(self as *const Self as *const u8, size_of_val(self))
        }
    }
}

// Safe because u32 has a defined representation and no padding.
unsafe impl AsBytes for u32 {}

There should be a # Safety section on the Rustdoc for the trait explaining the requirements for the trait to be safely implemented.

The actual safety section for AsBytes is rather longer and more complicated.

The built-in Send and Sync traits are unsafe.
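
A usage sketch for the simplified trait above. The trait is repeated so that the example is self-contained; it is a stand-in for illustration, not the real zerocopy API.

```rust
use std::mem::size_of_val;
use std::slice;

/// Simplified stand-in for the trait shown above.
///
/// # Safety
///
/// The type must have a defined representation and no padding.
pub unsafe trait AsBytes {
    fn as_bytes(&self) -> &[u8] {
        // SAFETY: implementors promise that every byte of the value is
        // initialised, so viewing it as a byte slice is sound.
        unsafe {
            slice::from_raw_parts(self as *const Self as *const u8, size_of_val(self))
        }
    }
}

// SAFETY: u32 has a defined representation and no padding.
unsafe impl AsBytes for u32 {}

fn main() {
    let value: u32 = 0x1234_5678;
    // The byte order depends on the endianness of the target.
    println!("{:02x?}", value.as_bytes());
}
```

Note that calling as_bytes is safe: the unsafe keyword moved the proof obligation from every caller to the one person writing the unsafe impl.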

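Day 3: Afternoon Exercises

Let us build a safe wrapper for reading directory content!

For this exercise, we suggest using a local dev environment instead of the Playground. This will allow you to run your binary on your own machine.

To get started, follow the instructions for running locally.

After looking at the exercise, you can look at the solution provided.

Safe FFI Wrapper

Rust has great support for calling functions through a foreign function interface (FFI). We will use this to build a safe wrapper for the libc functions you would use from C to read the names of files in a directory.

You will want to consult the manual pages for opendir(3), readdir(3), and closedir(3).

You will also want to browse the std::ffi module. There you find a number of string types which you need for the exercise: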

Types                Encoding         Use
str and String       UTF-8            Text processing in Rust
CStr and CString     NUL-terminated   Communicating with C functions
OsStr and OsString   OS-specific      Communicating with the OS

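You will convert between all these types:

  • &str to CString: you need to allocate space for a trailing \0 character,
  • CString to *const i8: you need a pointer to call C functions,
  • *const i8 to &CStr: you need something which can find the trailing \0 character,
  • &CStr to &[u8]: a slice of bytes is the universal interface for "some unknown data",
  • &[u8] to &OsStr: &OsStr is a step towards OsString, use OsStrExt to create it,
  • &OsStr to OsString: you need to clone the data in &OsStr to be able to return it and call readdir again.

The Nomicon also has a very useful chapter about FFI.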
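
The chain of conversions listed above can be sketched in one function. This is only an illustration of the individual steps, not the exercise solution, and round_trip is a hypothetical name. It uses c_char rather than i8 because the signedness of C's char differs between platforms, and it only compiles on Unix-like targets because of OsStrExt.

```rust
use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::raw::c_char;
use std::os::unix::ffi::OsStrExt;

/// Walks a string through every conversion step and back to an owned value.
fn round_trip(text: &str) -> OsString {
    // &str to CString: allocates and appends the trailing \0.
    let c_string = CString::new(text).expect("text must not contain NUL bytes");
    // CString to raw pointer: this is what a C function would receive.
    let ptr: *const c_char = c_string.as_ptr();
    // Raw pointer to &CStr: scans for the trailing \0.
    // SAFETY: `ptr` comes from `c_string`, which is NUL-terminated and stays
    // alive until the end of this function.
    let c_str: &CStr = unsafe { CStr::from_ptr(ptr) };
    // &CStr to &[u8]: the bytes without the trailing \0.
    let bytes: &[u8] = c_str.to_bytes();
    // &[u8] to &OsStr, using OsStrExt.
    let os_str: &OsStr = OsStr::from_bytes(bytes);
    // &OsStr to OsString: copies the data so that it outlives `c_string`.
    os_str.to_owned()
}

fn main() {
    println!("{:?}", round_trip("Cargo.toml"));
}
```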

Copy the code below to https://play.rust-lang.org/ and fill in the missing functions and methods:

// TODO: remove this when you're done with your implementation.
#![allow(unused_imports, unused_variables, dead_code)]

mod ffi {
    use std::os::raw::{c_char, c_int};
    #[cfg(not(target_os = "macos"))]
    use std::os::raw::{c_long, c_ulong, c_ushort, c_uchar};

    // Opaque type. See https://doc.rust-lang.org/nomicon/ffi.html.
    #[repr(C)]
    pub struct DIR {
        _data: [u8; 0],
        _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
    }

    // Layout according to the Linux man page for readdir(3), where ino_t and
    // off_t are resolved according to the definitions in
    // /usr/include/x86_64-linux-gnu/{sys/types.h, bits/typesizes.h}.
    #[cfg(not(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_ino: c_ulong,
        pub d_off: c_long,
        pub d_reclen: c_ushort,
        pub d_type: c_uchar,
        pub d_name: [c_char; 256],
    }

    // Layout according to the macOS man page for dir(5).
    #[cfg(target_os = "macos")]
    #[repr(C)]
    pub struct dirent {
        pub d_fileno: u64,
        pub d_seekoff: u64,
        pub d_reclen: u16,
        pub d_namlen: u16,
        pub d_type: u8,
        pub d_name: [c_char; 1024],
    }

    extern "C" {
        pub fn opendir(s: *const c_char) -> *mut DIR;

        #[cfg(not(all(target_os = "macos", target_arch = "x86_64")))]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        // See https://github.com/rust-lang/libc/issues/414 and the section on
        // _DARWIN_FEATURE_64_BIT_INODE in the macOS man page for stat(2).
        //
        // "Platforms that existed before these updates were available" refers
        // to macOS (as opposed to iOS / wearOS / etc.) on Intel and PowerPC.
        #[cfg(all(target_os = "macos", target_arch = "x86_64"))]
        #[link_name = "readdir$INODE64"]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        pub fn closedir(s: *mut DIR) -> c_int;
    }
}

use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::unix::ffi::OsStrExt;

#[derive(Debug)]
struct DirectoryIterator {
    path: CString,
    dir: *mut ffi::DIR,
}

impl DirectoryIterator {
    fn new(path: &str) -> Result<DirectoryIterator, String> {
        // Call opendir and return an Ok value if that worked,
        // otherwise return Err with a message.
        unimplemented!()
    }
}

impl Iterator for DirectoryIterator {
    type Item = OsString;
    fn next(&mut self) -> Option<OsString> {
        // Keep calling readdir until we get a NULL pointer back.
        unimplemented!()
    }
}

impl Drop for DirectoryIterator {
    fn drop(&mut self) {
        // Call closedir as needed.
        unimplemented!()
    }
}

fn main() -> Result<(), String> {
    let iter = DirectoryIterator::new(".")?;
    println!("files: {:#?}", iter.collect::<Vec<_>>());
    Ok(())
}

Welcome to Rust in Android

Rust is supported for native platform development on Android. This means that you can write new operating system services in Rust, as well as extending existing services.

We will attempt to call Rust from one of your own projects today. So try to find a little corner of your code base where we can move some lines of code to Rust. The fewer dependencies and “exotic” types the better. Something that parses some raw bytes would be ideal.

Setup

We will be using an Android Virtual Device to test our code. Make sure you have access to one or create a new one with:

source build/envsetup.sh
lunch aosp_cf_x86_64_phone-userdebug
acloud create

Please see the Android Developer Codelab for details.

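Build Rules

The Android build system (Soong) supports Rust via a number of modules:

Module Type       Description
rust_binary       Produces a Rust binary.
rust_library      Produces a Rust library, and provides both rlib and dylib variants.
rust_ffi          Produces a Rust C library usable by cc modules, and provides both static and shared variants.
rust_proc_macro   Produces a proc-macro Rust library. These are analogous to compiler plugins.
rust_test         Produces a Rust test binary that uses the standard Rust test harness.
rust_fuzz         Produces a Rust fuzz binary leveraging libfuzzer.
rust_protobuf     Generates source and produces a Rust library that provides an interface for a particular protobuf.
rust_bindgen      Generates source and produces a Rust library containing Rust bindings to C libraries.

We will look at rust_binary and rust_library next.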

Rust Binaries

Let us start with a simple application. At the root of an AOSP checkout, create the following files:

hello_rust/Android.bp:

rust_binary {
    name: "hello_rust",
    crate_name: "hello_rust",
    srcs: ["src/main.rs"],
}

hello_rust/src/main.rs:

//! Rust demo.

/// Prints a greeting to standard output.
fn main() {
    println!("Hello from Rust!");
}

You can now build, push, and run the binary:

m hello_rust
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust" /data/local/tmp
adb shell /data/local/tmp/hello_rust
Hello from Rust!

Rust Libraries

You use rust_library to create a new Rust library for Android.

Here we declare a dependency on two libraries:

  • libgreetings, which we define below,
  • libtextwrap, which is a crate already vendored in external/rust/crates/.

hello_rust/Android.bp:

rust_binary {
    name: "hello_rust_with_dep",
    crate_name: "hello_rust_with_dep",
    srcs: ["src/main.rs"],
    rustlibs: [
        "libgreetings",
        "libtextwrap",
    ],
    prefer_rlib: true,
}

rust_library {
    name: "libgreetings",
    crate_name: "greetings",
    srcs: ["src/lib.rs"],
}

hello_rust/src/main.rs:

//! Rust demo.

use greetings::greeting;
use textwrap::fill;

/// Prints a greeting to standard output.
fn main() {
    println!("{}", fill(&greeting("Bob"), 24));
}

hello_rust/src/lib.rs:

//! Greeting library.

/// Greet `name`.
pub fn greeting(name: &str) -> String {
    format!("Hello {name}, it is very nice to meet you!")
}

You build, push, and run the binary like before:

m hello_rust_with_dep
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust_with_dep" /data/local/tmp
adb shell /data/local/tmp/hello_rust_with_dep
Hello Bob, it is very
nice to meet you!

AIDL

The Android Interface Definition Language (AIDL) is supported in Rust:

  • Rust code can call existing AIDL servers,
  • You can create new AIDL servers in Rust.

AIDL Interfaces

You declare the API of your service using an AIDL interface:

birthday_service/aidl/com/example/birthdayservice/IBirthdayService.aidl:

package com.example.birthdayservice;

/** Birthday service interface. */
interface IBirthdayService {
    /** Generate a Happy Birthday message. */
    String wishHappyBirthday(String name, int years);
}

birthday_service/aidl/Android.bp:

aidl_interface {
    name: "com.example.birthdayservice",
    srcs: ["com/example/birthdayservice/*.aidl"],
    unstable: true,
    backend: {
        rust: { // Rust is not enabled by default
            enabled: true,
        },
    },
}

Add vendor_available: true if your AIDL file is used by a binary in the vendor partition.

Service Implementation

We can now implement the AIDL service:

birthday_service/src/lib.rs:

//! Implementation of the `IBirthdayService` AIDL interface.
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::IBirthdayService;
use com_example_birthdayservice::binder;

/// The `IBirthdayService` implementation.
pub struct BirthdayService;

impl binder::Interface for BirthdayService {}

impl IBirthdayService for BirthdayService {
    fn wishHappyBirthday(&self, name: &str, years: i32) -> binder::Result<String> {
        Ok(format!(
            "Happy Birthday {name}, congratulations with the {years} years!"
        ))
    }
}

birthday_service/Android.bp:

rust_library {
    name: "libbirthdayservice",
    srcs: ["src/lib.rs"],
    crate_name: "birthdayservice",
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
    ],
}

AIDL Server

Finally, we can create a server which exposes the service:

birthday_service/src/server.rs:

//! Birthday service.
use birthdayservice::BirthdayService;
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::BnBirthdayService;
use com_example_birthdayservice::binder;

const SERVICE_IDENTIFIER: &str = "birthdayservice";

/// Entry point for birthday service.
fn main() {
    let birthday_service = BirthdayService;
    let birthday_service_binder = BnBirthdayService::new_binder(
        birthday_service,
        binder::BinderFeatures::default(),
    );
    binder::add_service(SERVICE_IDENTIFIER, birthday_service_binder.as_binder())
        .expect("Failed to register service");
    binder::ProcessState::join_thread_pool()
}

birthday_service/Android.bp:

rust_binary {
    name: "birthday_server",
    crate_name: "birthday_server",
    srcs: ["src/server.rs"],
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
        "libbirthdayservice",
    ],
    prefer_rlib: true,
}

Deploy

We can now build, push, and start the service:

m birthday_server
adb push "$ANDROID_PRODUCT_OUT/system/bin/birthday_server" /data/local/tmp
adb shell /data/local/tmp/birthday_server

In another terminal, check that the service runs:

adb shell service check birthdayservice
Service birthdayservice: found

You can also call the service with service call:

adb shell service call birthdayservice 1 s16 Bob i32 24
Result: Parcel(
  0x00000000: 00000000 00000036 00610048 00700070 '....6...H.a.p.p.'
  0x00000010: 00200079 00690042 00740072 00640068 'y. .B.i.r.t.h.d.'
  0x00000020: 00790061 00420020 0062006f 0020002c 'a.y. .B.o.b.,. .'
  0x00000030: 006f0063 0067006e 00610072 00750074 'c.o.n.g.r.a.t.u.'
  0x00000040: 0061006c 00690074 006e006f 00200073 'l.a.t.i.o.n.s. .'
  0x00000050: 00690077 00680074 00740020 00650068 'w.i.t.h. .t.h.e.'
  0x00000060: 00320020 00200034 00650079 00720061 ' .2.4. .y.e.a.r.'
  0x00000070: 00210073 00000000                   's.!.....        ')

AIDL Client

Finally, we can create a Rust client for our new service.

birthday_service/src/client.rs:

//! Birthday service.
use com_example_birthdayservice::aidl::com::example::birthdayservice::IBirthdayService::IBirthdayService;
use com_example_birthdayservice::binder;

const SERVICE_IDENTIFIER: &str = "birthdayservice";

/// Connect to the BirthdayService.
pub fn connect() -> Result<binder::Strong<dyn IBirthdayService>, binder::StatusCode> {
    binder::get_interface(SERVICE_IDENTIFIER)
}

/// Call the birthday service.
fn main() -> Result<(), binder::Status> {
    let name = std::env::args()
        .nth(1)
        .unwrap_or_else(|| String::from("Bob"));
    let years = std::env::args()
        .nth(2)
        .and_then(|arg| arg.parse::<i32>().ok())
        .unwrap_or(42);

    binder::ProcessState::start_thread_pool();
    let service = connect().expect("Failed to connect to BirthdayService");
    let msg = service.wishHappyBirthday(&name, years)?;
    println!("{msg}");
    Ok(())
}

birthday_service/Android.bp:

rust_binary {
    name: "birthday_client",
    crate_name: "birthday_client",
    srcs: ["src/client.rs"],
    rustlibs: [
        "com.example.birthdayservice-rust",
        "libbinder_rs",
    ],
    prefer_rlib: true,
}

Notice that the client does not depend on libbirthdayservice.

Build, push, and run the client on your device:

m birthday_client
adb push "$ANDROID_PRODUCT_OUT/system/bin/birthday_client" /data/local/tmp
adb shell /data/local/tmp/birthday_client Charlie 60
Happy Birthday Charlie, congratulations with the 60 years!

Changing API

Let us extend the API with more functionality: we want to let clients specify a list of lines for the birthday card:

package com.example.birthdayservice;

/** Birthday service interface. */
interface IBirthdayService {
    /** Generate a Happy Birthday message. */
    String wishHappyBirthday(String name, int years, in String[] text);
}
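
On the Rust side, the service implementation has to follow the new signature. The sketch below is a rough, self-contained approximation: the trait is hand-written here rather than generated by the AIDL compiler, BinderResult is a stand-in for binder::Result, and it assumes that an `in String[]` parameter reaches Rust as a &[String] slice.

```rust
/// Stand-in for binder::Result so the sketch compiles without Android crates.
type BinderResult<T> = Result<T, String>;

/// Hand-written approximation of the generated trait (an assumption).
#[allow(non_snake_case)]
trait IBirthdayService {
    fn wishHappyBirthday(&self, name: &str, years: i32, text: &[String])
        -> BinderResult<String>;
}

struct BirthdayService;

#[allow(non_snake_case)]
impl IBirthdayService for BirthdayService {
    fn wishHappyBirthday(
        &self,
        name: &str,
        years: i32,
        text: &[String],
    ) -> BinderResult<String> {
        let mut msg =
            format!("Happy Birthday {name}, congratulations with the {years} years!");
        // Append each extra line supplied by the client.
        for line in text {
            msg.push('\n');
            msg.push_str(line);
        }
        Ok(msg)
    }
}

fn main() {
    let lines = vec![String::from("Best wishes"), String::from("from the team")];
    match BirthdayService.wishHappyBirthday("Bob", 24, &lines) {
        Ok(msg) => println!("{msg}"),
        Err(err) => println!("Error: {err}"),
    }
}
```

Clients need the matching change: they pass a slice of lines as the third argument.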

Logging

You should use the log crate to automatically log to logcat (on-device) or stdout (on-host):

hello_rust_logs/Android.bp:

rust_binary {
    name: "hello_rust_logs",
    crate_name: "hello_rust_logs",
    srcs: ["src/main.rs"],
    rustlibs: [
        "liblog_rust",
        "liblogger",
    ],
    prefer_rlib: true,
    host_supported: true,
}

hello_rust_logs/src/main.rs:

//! Rust logging demo.

use log::{debug, error, info};

/// Logs a greeting.
fn main() {
    logger::init(
        logger::Config::default()
            .with_tag_on_device("rust")
            .with_min_level(log::Level::Trace),
    );
    debug!("Starting program.");
    info!("Things are going fine.");
    error!("Something went wrong!");
}

Build, push, and run the binary on your device:

m hello_rust_logs
adb push "$ANDROID_PRODUCT_OUT/system/bin/hello_rust_logs" /data/local/tmp
adb shell /data/local/tmp/hello_rust_logs

The logs show up in adb logcat:

adb logcat -s rust
09-08 08:38:32.454  2420  2420 D rust: hello_rust_logs: Starting program.
09-08 08:38:32.454  2420  2420 I rust: hello_rust_logs: Things are going fine.
09-08 08:38:32.454  2420  2420 E rust: hello_rust_logs: Something went wrong!

Interoperability

Rust has excellent support for interoperability with other languages. This means that you can:

  • Call Rust functions from other languages.
  • Call functions written in other languages from Rust.

When you call functions in a foreign language we say that you’re using a foreign function interface, also known as FFI.

Interoperability with C

Rust has full support for linking object files with a C calling convention. Similarly, you can export Rust functions and call them from C.

You can do it by hand if you want:

extern "C" {
    fn abs(x: i32) -> i32;
}

fn main() {
    let x = -42;
    let abs_x = unsafe { abs(x) };
    println!("{x}, {abs_x}");
}

We already saw this in the Safe FFI Wrapper exercise.

This assumes full knowledge of the target platform. Not recommended for production.

We will look at better options next.

Using Bindgen

The bindgen tool can auto-generate bindings from a C header file.

First create a small C library:

interoperability/bindgen/libbirthday.h:

typedef struct card {
  const char* name;
  int years;
} card;

void print_card(const card* card);

interoperability/bindgen/libbirthday.c:

#include <stdio.h>
#include "libbirthday.h"

void print_card(const card* card) {
  printf("+--------------\n");
  printf("| Happy Birthday %s!\n", card->name);
  printf("| Congratulations with the %i years!\n", card->years);
  printf("+--------------\n");
}

Add this to your Android.bp file:

interoperability/bindgen/Android.bp:

cc_library {
    name: "libbirthday",
    srcs: ["libbirthday.c"],
}

Create a wrapper header file for the library (not strictly needed in this example):

interoperability/bindgen/libbirthday_wrapper.h:

#include "libbirthday.h"

You can now auto-generate the bindings:

interoperability/bindgen/Android.bp:

rust_bindgen {
    name: "libbirthday_bindgen",
    crate_name: "birthday_bindgen",
    wrapper_src: "libbirthday_wrapper.h",
    source_stem: "bindings",
    static_libs: ["libbirthday"],
}

Finally, we can use the bindings in our Rust program:

interoperability/bindgen/Android.bp:

rust_binary {
    name: "print_birthday_card",
    srcs: ["main.rs"],
    rustlibs: ["libbirthday_bindgen"],
}

interoperability/bindgen/main.rs:

//! Bindgen demo.

use birthday_bindgen::{card, print_card};

fn main() {
    let name = std::ffi::CString::new("Peter").unwrap();
    let card = card {
        name: name.as_ptr(),
        years: 42,
    };
    unsafe {
        print_card(&card as *const card);
    }
}

Build, push, and run the binary on your device:

m print_birthday_card
adb push "$ANDROID_PRODUCT_OUT/system/bin/print_birthday_card" /data/local/tmp
adb shell /data/local/tmp/print_birthday_card

Finally, we can run auto-generated tests to ensure the bindings work:

interoperability/bindgen/Android.bp:

rust_test {
    name: "libbirthday_bindgen_test",
    srcs: [":libbirthday_bindgen"],
    crate_name: "libbirthday_bindgen_test",
    test_suites: ["general-tests"],
    auto_gen_config: true,
    clippy_lints: "none", // Generated file, skip linting
    lints: "none",
}
atest libbirthday_bindgen_test

Calling Rust

Exporting Rust functions and types to C is easy:

interoperability/rust/libanalyze/analyze.rs

//! Rust FFI demo.
#![deny(improper_ctypes_definitions)]

use std::os::raw::c_int;

/// Analyze the numbers.
#[no_mangle]
pub extern "C" fn analyze_numbers(x: c_int, y: c_int) {
    if x < y {
        println!("x ({x}) is smallest!");
    } else {
        println!("y ({y}) is probably larger than x ({x})");
    }
}

interoperability/rust/libanalyze/analyze.h

#ifndef ANALYSE_H
#define ANALYSE_H

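// extern "C" is C++ syntax, so only emit it when compiled as C++;
// main.c includes this header as plain C.
#ifdef __cplusplus
extern "C" {
#endif

void analyze_numbers(int x, int y);

#ifdef __cplusplus
}
#endif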

#endif

interoperability/rust/libanalyze/Android.bp

rust_ffi {
    name: "libanalyze_ffi",
    crate_name: "analyze_ffi",
    srcs: ["analyze.rs"],
    include_dirs: ["."],
}

We can now call this from a C binary:

interoperability/rust/analyze/main.c

#include "analyze.h"

int main() {
  analyze_numbers(10, 20);
  analyze_numbers(123, 123);
  return 0;
}

interoperability/rust/analyze/Android.bp

cc_binary {
    name: "analyze_numbers",
    srcs: ["main.c"],
    static_libs: ["libanalyze_ffi"],
}

Build, push, and run the binary on your device:

m analyze_numbers
adb push "$ANDROID_PRODUCT_OUT/system/bin/analyze_numbers" /data/local/tmp
adb shell /data/local/tmp/analyze_numbers

#[no_mangle] disables Rust’s usual name mangling, so the exported symbol will just be the name of the function. You can also use #[export_name = "some_name"] to specify whatever name you want.

Interoperability with C++

The CXX crate makes it possible to do safe interoperability between Rust and C++.

The overall approach looks like this:

See the CXX tutorial for a full example of using this.

  • At this point, the instructor should switch to the CXX tutorial.

  • Walk the students through the tutorial step by step.

  • Highlight how CXX presents a clean interface without unsafe code in both languages.

  • Show the correspondence between Rust and C++ types:

    • Explain how a Rust String cannot map to a C++ std::string (the latter does not uphold the UTF-8 invariant). Show that despite being different types, rust::String in C++ can be easily constructed from a C++ std::string, making it very ergonomic to use.

    • Explain that a Rust function returning Result<T, E> becomes a function which throws an E exception in C++ (and vice versa).

Interoperability with Java

Java can load shared objects via Java Native Interface (JNI). The jni crate allows you to create a compatible library.

First, we create a Rust function to export to Java:

interoperability/java/src/lib.rs:

//! Rust <-> Java FFI demo.

use jni::objects::{JClass, JString};
use jni::sys::jstring;
use jni::JNIEnv;

/// HelloWorld::hello method implementation.
#[no_mangle]
pub extern "system" fn Java_HelloWorld_hello(
    env: JNIEnv,
    _class: JClass,
    name: JString,
) -> jstring {
    let input: String = env.get_string(name).unwrap().into();
    let greeting = format!("Hello, {input}!");
    let output = env.new_string(greeting).unwrap();
    output.into_inner()
}

interoperability/java/Android.bp:

rust_ffi_shared {
    name: "libhello_jni",
    crate_name: "hello_jni",
    srcs: ["src/lib.rs"],
    rustlibs: ["libjni"],
}

Finally, we can call this function from Java:

interoperability/java/HelloWorld.java:

class HelloWorld {
    private static native String hello(String name);

    static {
        System.loadLibrary("hello_jni");
    }

    public static void main(String[] args) {
        String output = HelloWorld.hello("Alice");
        System.out.println(output);
    }
}

interoperability/java/Android.bp:

java_binary {
    name: "helloworld_jni",
    srcs: ["HelloWorld.java"],
    main_class: "HelloWorld",
    required: ["libhello_jni"],
}

Finally, you can build, sync, and run the binary:

m helloworld_jni
adb sync  # requires adb root && adb remount
adb shell /system/bin/helloworld_jni

Exercises

This is a group exercise: We will look at one of the projects you work with and try to integrate some Rust into it. Some suggestions:

  • Call your AIDL service with a client written in Rust.

  • Move a function from your project to Rust and call it.

No solution is provided here since this is open-ended: it relies on someone in the class having a piece of code which you can turn into Rust on the fly.

Welcome to Bare Metal Rust

This is a standalone one-day course about bare-metal Rust, aimed at people who are familiar with the basics of Rust (perhaps from completing the Comprehensive Rust course), and ideally also have some experience with bare-metal programming in some other language such as C.

Today we will talk about ‘bare-metal’ Rust: running Rust code without an OS underneath us. This will be divided into several parts:

  • What is no_std Rust?
  • Writing firmware for microcontrollers.
  • Writing bootloader / kernel code for application processors.
  • Some useful crates for bare-metal Rust development.

For the microcontroller part of the course we will use the BBC micro:bit v2 as an example. It’s a development board based on the Nordic nRF52833 microcontroller with some LEDs and buttons, an I2C-connected accelerometer and compass, and an on-board SWD debugger.

To get started, install some tools we’ll need later. On gLinux or Debian:

sudo apt install gcc-aarch64-linux-gnu gdb-multiarch libudev-dev picocom pkg-config qemu-system-arm
rustup update
rustup target add aarch64-unknown-none thumbv7em-none-eabihf
rustup component add llvm-tools-preview
cargo install cargo-binutils cargo-embed

And give users in the plugdev group access to the micro:bit programmer:

echo 'SUBSYSTEM=="usb", ATTR{idVendor}=="0d28", MODE="0664", GROUP="plugdev"' |\
  sudo tee /etc/udev/rules.d/50-microbit.rules
sudo udevadm control --reload-rules

On macOS:

xcode-select --install
brew install gdb picocom qemu
brew install --cask gcc-aarch64-embedded
rustup update
rustup target add aarch64-unknown-none thumbv7em-none-eabihf
rustup component add llvm-tools-preview
cargo install cargo-binutils cargo-embed

no_std

core:

  • Slices, &str, CStr
  • NonZeroU8
  • Option, Result
  • Display, Debug, write!
  • Iterator
  • panic!, assert_eq!
  • NonNull and all the usual pointer-related functions
  • Future and async/await
  • fence, AtomicBool, AtomicPtr, AtomicU32
  • Duration

alloc:

  • Box, Cow, Arc, Rc
  • Vec, BinaryHeap, BTreeMap, LinkedList, VecDeque
  • String, CString, format!

std:

  • Error
  • HashMap
  • Mutex, Condvar, Barrier, Once, RwLock, mpsc
  • File and the rest of fs
  • println!, Read, Write, Stdin, Stdout and the rest of io
  • Path, OsString
  • net
  • Command, Child, ExitCode
  • spawn, sleep and the rest of thread
  • SystemTime, Instant

  • HashMap depends on RNG.
  • std re-exports the contents of both core and alloc.
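
As a rough illustration of what is possible with core alone, the sketch below formats text into a fixed-size buffer using core::fmt::Write and write!, with no heap and no String. FixedBuf and format_reading are names made up for this example; only the main function at the end uses std, so that the sketch can be run on a host.

```rust
// Everything except `main` uses only `core` paths, so it would also build in
// a `#![no_std]` crate. `FixedBuf` is a made-up type for this sketch.
use core::fmt::{self, Write};

/// A fixed-capacity text buffer that needs no heap allocation.
pub struct FixedBuf<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }

    /// Returns the text written so far.
    pub fn as_str(&self) -> &str {
        // Only whole `&str` pieces are ever copied in, so this is valid UTF-8.
        core::str::from_utf8(&self.data[..self.len]).unwrap_or("")
    }
}

impl<const N: usize> Write for FixedBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let bytes = s.as_bytes();
        let end = self.len.checked_add(bytes.len()).ok_or(fmt::Error)?;
        if end > N {
            // Out of space: report an error rather than allocating.
            return Err(fmt::Error);
        }
        self.data[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}

/// Formats a reading with `write!` from `core`; no `String` is involved.
pub fn format_reading(value: u32) -> FixedBuf<32> {
    let mut buf = FixedBuf::new();
    // The buffer is large enough for any `u32`, so the result is ignored here.
    let _ = write!(buf, "reading={value}");
    buf
}

fn main() {
    // `println!` comes from `std` and is only used for the host demo.
    println!("{}", format_reading(42).as_str()); // prints "reading=42"
}
```

Returning an error when the buffer is full is one possible design; truncating silently would be another, depending on what the firmware needs.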

A minimal no_std program

#![no_main]
#![no_std]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_panic: &PanicInfo) -> ! {
    loop {}
}
  • This will compile to an empty binary.
  • std provides a panic handler; without it we must provide our own.
  • It can also be provided by another crate, such as panic-halt.
  • Depending on the target, you may need to compile with panic = "abort" to avoid an error about eh_personality.
  • Note that there is no main or any other entry point; it’s up to you to define your own entry point. This will typically involve a linker script and some assembly code to set things up ready for Rust code to run.

alloc

To use alloc you must implement a global (heap) allocator.

#![no_main]
#![no_std]

extern crate alloc;
extern crate panic_halt as _;

use alloc::string::ToString;
use alloc::vec::Vec;
use buddy_system_allocator::LockedHeap;

#[global_allocator]
static HEAP_ALLOCATOR: LockedHeap<32> = LockedHeap::<32>::new();

static mut HEAP: [u8; 65536] = [0; 65536];

pub fn entry() {
    // Safe because `HEAP` is only used here and `entry` is only called once.
    unsafe {
        // Give the allocator some memory to allocate.
        HEAP_ALLOCATOR
            .lock()
            .init(HEAP.as_mut_ptr() as usize, HEAP.len());
    }

    // Now we can do things that require heap allocation.
    let mut v = Vec::new();
    v.push("A string".to_string());
}
  • buddy_system_allocator is a third-party crate implementing a basic buddy system allocator. Other crates are available, or you can write your own or hook into your existing allocator.
  • The const parameter of LockedHeap is the max order of the allocator; i.e. in this case it can allocate regions of up to 2**32 bytes.
  • If any crate in your dependency tree depends on alloc then you must have exactly one global allocator defined in your binary. Usually this is done in the top-level binary crate.
  • extern crate panic_halt as _ is necessary to ensure that the panic_halt crate is linked in so we get its panic handler.
  • This example will build but not run, as it doesn’t have an entry point.
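
To give an idea of what writing your own allocator involves, here is a minimal sketch of a bump allocator implementing the GlobalAlloc trait. BumpAllocator, Heap and HEAP_SIZE are hypothetical names, and this is not how buddy_system_allocator works internally. To keep it runnable on a host, the sketch calls alloc directly rather than registering the allocator with #[global_allocator].

```rust
use core::alloc::{GlobalAlloc, Layout};
use core::cell::UnsafeCell;
use core::ptr::null_mut;
use core::sync::atomic::{AtomicUsize, Ordering};

const HEAP_SIZE: usize = 1024;

/// Backing storage for the heap (hypothetical type for this sketch).
#[repr(align(16))]
struct Heap(UnsafeCell<[u8; HEAP_SIZE]>);

/// Hands out memory front to back and never reuses freed blocks.
pub struct BumpAllocator {
    heap: Heap,
    next: AtomicUsize,
}

// Sound to share between contexts because `next` is updated atomically, so
// every successful allocation gets a range that overlaps no other.
unsafe impl Sync for BumpAllocator {}

impl BumpAllocator {
    pub const fn new() -> Self {
        Self { heap: Heap(UnsafeCell::new([0; HEAP_SIZE])), next: AtomicUsize::new(0) }
    }
}

unsafe impl GlobalAlloc for BumpAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.heap.0.get() as *mut u8;
        let mut current = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let addr = base as usize + current;
            let aligned = match addr.checked_add(layout.align() - 1) {
                Some(a) => a & !(layout.align() - 1),
                None => return null_mut(),
            };
            let start = aligned - base as usize;
            let end = match start.checked_add(layout.size()) {
                Some(end) if end <= HEAP_SIZE => end,
                _ => return null_mut(), // Out of memory.
            };
            // Claim the range; retry if another context got there first.
            match self.next.compare_exchange(current, end, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return unsafe { base.add(start) },
                Err(actual) => current = actual,
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never frees individual blocks.
    }
}

/// Allocates two blocks and checks that they are aligned and disjoint.
pub fn demo() -> bool {
    // In firmware this would be a `static` marked `#[global_allocator]`.
    let allocator = BumpAllocator::new();
    let layout = Layout::from_size_align(24, 8).unwrap();
    let (a, b) = unsafe { (allocator.alloc(layout), allocator.alloc(layout)) };
    !a.is_null()
        && !b.is_null()
        && a as usize % 8 == 0
        && b as usize % 8 == 0
        && b as usize >= a as usize + 24
}

fn main() {
    assert!(demo());
}
```

A bump allocator is about the simplest design possible; it is only suitable when memory is rarely or never freed, which is why general-purpose allocators such as a buddy system are usually preferred.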

Microcontrollers

The cortex_m_rt crate provides (among other things) a reset handler for Cortex M microcontrollers.

#![no_main]
#![no_std]

extern crate panic_halt as _;

mod interrupts;

use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    loop {}
}

Next we’ll look at how to access peripherals, with increasing levels of abstraction.

  • The cortex_m_rt::entry macro requires that the function have type fn() -> !, because returning to the reset handler doesn’t make sense.
  • Run the example with cargo embed --bin minimal

Raw MMIO

Most microcontrollers access peripherals via memory-mapped IO. Let’s try turning on an LED on our micro:bit:

#![no_main]
#![no_std]

extern crate panic_halt as _;

mod interrupts;

use core::mem::size_of;
use cortex_m_rt::entry;

/// GPIO port 0 peripheral address
const GPIO_P0: usize = 0x5000_0000;

// GPIO peripheral offsets
const PIN_CNF: usize = 0x700;
const OUTSET: usize = 0x508;
const OUTCLR: usize = 0x50c;

// PIN_CNF fields
const DIR_OUTPUT: u32 = 0x1;
const INPUT_DISCONNECT: u32 = 0x1 << 1;
const PULL_DISABLED: u32 = 0x0 << 2;
const DRIVE_S0S1: u32 = 0x0 << 8;
const SENSE_DISABLED: u32 = 0x0 << 16;

#[entry]
fn main() -> ! {
    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    let pin_cnf_21 = (GPIO_P0 + PIN_CNF + 21 * size_of::<u32>()) as *mut u32;
    let pin_cnf_28 = (GPIO_P0 + PIN_CNF + 28 * size_of::<u32>()) as *mut u32;
    // Safe because the pointers are to valid peripheral control registers, and
    // no aliases exist.
    unsafe {
        pin_cnf_21.write_volatile(
            DIR_OUTPUT | INPUT_DISCONNECT | PULL_DISABLED | DRIVE_S0S1 | SENSE_DISABLED,
        );
        pin_cnf_28.write_volatile(
            DIR_OUTPUT | INPUT_DISCONNECT | PULL_DISABLED | DRIVE_S0S1 | SENSE_DISABLED,
        );
    }

    // Set pin 28 low and pin 21 high to turn the LED on.
    let gpio0_outset = (GPIO_P0 + OUTSET) as *mut u32;
    let gpio0_outclr = (GPIO_P0 + OUTCLR) as *mut u32;
    // Safe because the pointers are to valid peripheral control registers, and
    // no aliases exist.
    unsafe {
        gpio0_outclr.write_volatile(1 << 28);
        gpio0_outset.write_volatile(1 << 21);
    }

    loop {}
}
  • GPIO 0 pin 21 is connected to the first row of the LED matrix, and pin 28 to the first column.

Run the example with:

cargo embed --bin mmio
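
The address and bit arithmetic from the example can be checked on a host without touching any hardware. The sketch below reuses the constants from the listing above; pin_cnf_address, push_pull_output and pin_mask are hypothetical helper names introduced only for illustration.

```rust
use core::mem::size_of;

/// GPIO port 0 peripheral address and PIN_CNF offset, as in the example.
const GPIO_P0: usize = 0x5000_0000;
const PIN_CNF: usize = 0x700;

// PIN_CNF fields that are non-zero for a push-pull output.
const DIR_OUTPUT: u32 = 0x1;
const INPUT_DISCONNECT: u32 = 0x1 << 1;

/// Address of the PIN_CNF register for the given pin: there is one 32-bit
/// register per pin, laid out consecutively.
pub const fn pin_cnf_address(pin: usize) -> usize {
    GPIO_P0 + PIN_CNF + pin * size_of::<u32>()
}

/// Configuration word for a push-pull output; the pull, drive and sense
/// fields are all zero in the example, so they do not contribute any bits.
pub const fn push_pull_output() -> u32 {
    DIR_OUTPUT | INPUT_DISCONNECT
}

/// Bit mask to write to OUTSET or OUTCLR to change a single pin.
pub const fn pin_mask(pin: u32) -> u32 {
    1 << pin
}

fn main() {
    println!("PIN_CNF[21] is at {:#x}", pin_cnf_address(21));
    println!("PIN_CNF[28] is at {:#x}", pin_cnf_address(28));
    println!("config word: {:#x}", push_pull_output());
    println!("mask for pin 21: {:#x}", pin_mask(21));
}
```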

Peripheral Access Crates

svd2rust generates mostly-safe Rust wrappers for memory-mapped peripherals from CMSIS-SVD files.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use nrf52833_pac::Peripherals;

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();
    let gpio0 = p.P0;

    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    gpio0.pin_cnf[21].write(|w| {
        w.dir().output();
        w.input().disconnect();
        w.pull().disabled();
        w.drive().s0s1();
        w.sense().disabled();
        w
    });
    gpio0.pin_cnf[28].write(|w| {
        w.dir().output();
        w.input().disconnect();
        w.pull().disabled();
        w.drive().s0s1();
        w.sense().disabled();
        w
    });

    // Set pin 28 low and pin 21 high to turn the LED on.
    gpio0.outclr.write(|w| w.pin28().clear());
    gpio0.outset.write(|w| w.pin21().set());

    loop {}
}
  • SVD (System View Description) files are XML files typically provided by silicon vendors which describe the memory map of the device.
    • They are organised by peripheral, register, field and value, with names, descriptions, addresses and so on.
    • SVD files are often buggy and incomplete, so there are various projects which patch the mistakes, add missing details, and publish the generated crates.
  • cortex-m-rt provides the vector table, among other things.
  • If you cargo install cargo-binutils then you can run cargo objdump --bin pac -- -d --no-show-raw-insn to see the resulting binary.

Run the example with:

cargo embed --bin pac

HAL crates

HAL crates for many microcontrollers provide wrappers around various peripherals. These generally implement traits from embedded-hal.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use nrf52833_hal::gpio::{p0, Level};
use nrf52833_hal::pac::Peripherals;
use nrf52833_hal::prelude::*;

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();

    // Create HAL wrapper for GPIO port 0.
    let gpio0 = p0::Parts::new(p.P0);

    // Configure GPIO 0 pins 21 and 28 as push-pull outputs.
    let mut col1 = gpio0.p0_28.into_push_pull_output(Level::High);
    let mut row1 = gpio0.p0_21.into_push_pull_output(Level::Low);

    // Set pin 28 low and pin 21 high to turn the LED on.
    col1.set_low().unwrap();
    row1.set_high().unwrap();

    loop {}
}
  • set_low and set_high are methods on the embedded_hal OutputPin trait.
  • HAL crates exist for many Cortex-M and RISC-V devices, including various STM32, GD32, nRF, NXP, MSP430, AVR and PIC microcontrollers.

Run the example with:

cargo embed --bin hal

Board support crates

Board support crates provide a further level of wrapping for a specific board for convenience.

#![no_main]
#![no_std]

extern crate panic_halt as _;

use cortex_m_rt::entry;
use microbit::hal::prelude::*;
use microbit::Board;

#[entry]
fn main() -> ! {
    let mut board = Board::take().unwrap();

    board.display_pins.col1.set_low().unwrap();
    board.display_pins.row1.set_high().unwrap();

    loop {}
}
  • In this case the board support crate is just providing more useful names, and a bit of initialisation.
  • The crate may also include drivers for some on-board devices outside of the microcontroller itself.
    • microbit-v2 includes a simple driver for the LED matrix.

Run the example with:

cargo embed --bin board_support

The type state pattern

#[entry]
fn main() -> ! {
    let p = Peripherals::take().unwrap();
    let gpio0 = p0::Parts::new(p.P0);

    let pin: P0_01<Disconnected> = gpio0.p0_01;

    // let gpio0_01_again = gpio0.p0_01; // Error, moved.
    let pin_input: P0_01<Input<Floating>> = pin.into_floating_input();
    if pin_input.is_high().unwrap() {
        // ...
    }
    let mut pin_output: P0_01<Output<OpenDrain>> = pin_input
        .into_open_drain_output(OpenDrainConfig::Disconnect0Standard1, Level::Low);
    pin_output.set_high().unwrap();
    // pin_input.is_high(); // Error, moved.

    let _pin2: P0_02<Output<OpenDrain>> = gpio0
        .p0_02
        .into_open_drain_output(OpenDrainConfig::Disconnect0Standard1, Level::Low);
    let _pin3: P0_03<Output<PushPull>> = gpio0.p0_03.into_push_pull_output(Level::Low);

    loop {}
}
  • Pins don’t implement Copy or Clone, so only one instance of each can exist. Once a pin is moved out of the port struct nobody else can take it.
  • Changing the configuration of a pin consumes the old pin instance, so you can’t keep using the old instance afterwards.
  • The type of a value indicates the state that it is in: e.g. in this case, the configuration state of a GPIO pin. This encodes the state machine into the type system, and ensures that you don’t try to use a pin in a certain way without properly configuring it first. Illegal state transitions are caught at compile time.
  • You can call is_high on an input pin and set_high on an output pin, but not vice-versa.
  • Many HAL crates follow this pattern.
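
The pattern itself does not depend on any HAL. The following is a simplified, host-runnable sketch: GpioPin and the Disconnected, Input and Output marker types are stand-ins invented for this example, not the nrf52833_hal types, and the pin level is just a field rather than a hardware register.

```rust
use core::marker::PhantomData;

// Marker types for the configuration states (made up for this sketch).
pub struct Disconnected;
pub struct Input;
pub struct Output;

/// A simulated pin whose configuration is tracked in the `State` parameter.
pub struct GpioPin<State> {
    number: u8,
    level_high: bool,
    _state: PhantomData<State>,
}

impl GpioPin<Disconnected> {
    pub fn new(number: u8) -> Self {
        Self { number, level_high: false, _state: PhantomData }
    }
}

impl<State> GpioPin<State> {
    pub fn number(&self) -> u8 {
        self.number
    }

    /// Consumes the pin and returns it in the input state.
    pub fn into_input(self) -> GpioPin<Input> {
        GpioPin { number: self.number, level_high: self.level_high, _state: PhantomData }
    }

    /// Consumes the pin and returns it in the output state.
    pub fn into_output(self, high: bool) -> GpioPin<Output> {
        GpioPin { number: self.number, level_high: high, _state: PhantomData }
    }
}

impl GpioPin<Input> {
    /// Only available on input pins.
    pub fn is_high(&self) -> bool {
        self.level_high
    }
}

impl GpioPin<Output> {
    /// Only available on output pins.
    pub fn set_high(&mut self) {
        self.level_high = true;
    }

    pub fn set_low(&mut self) {
        self.level_high = false;
    }

    pub fn is_set_high(&self) -> bool {
        self.level_high
    }
}

/// Walks a pin through the states and returns its final output level.
pub fn demo() -> bool {
    let pin = GpioPin::<Disconnected>::new(1);
    let input = pin.into_input();
    let _ = input.is_high();
    let mut output = input.into_output(false);
    // input.is_high(); // Error: `input` was moved by `into_output`.
    // output.is_high(); // Error: no such method on `GpioPin<Output>`.
    output.set_high();
    output.is_set_high()
}

fn main() {
    assert!(demo());
}
```

The marker types are zero-sized, so tracking the state this way costs nothing at run time.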

embedded-hal

The embedded-hal crate provides a number of traits covering common microcontroller peripherals.

  • GPIO
  • ADC
  • I2C, SPI, UART, CAN
  • RNG
  • Timers
  • Watchdogs

Other crates then implement drivers in terms of these traits, e.g. an accelerometer driver might need an I2C or SPI bus implementation.

  • There are implementations for many microcontrollers, as well as other platforms such as Linux on Raspberry Pi.
  • There is work in progress on an async version of embedded-hal, but it isn’t stable yet.
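
To show why writing drivers against traits is useful, here is a sketch of a driver that is generic over an output pin, together with a mock pin for testing on a host. The OutputPin trait below is a local stand-in modelled loosely on the shape of the embedded-hal trait rather than the real definition, and Led and MockPin are hypothetical names.

```rust
use core::convert::Infallible;

/// Local stand-in for an output-pin trait (an assumption for this sketch,
/// not the actual embedded-hal definition).
pub trait OutputPin {
    type Error;
    fn set_low(&mut self) -> Result<(), Self::Error>;
    fn set_high(&mut self) -> Result<(), Self::Error>;
}

/// Hypothetical driver for an active-low LED. It works with any pin type
/// that implements `OutputPin`, whichever microcontroller it comes from.
pub struct Led<P: OutputPin> {
    pin: P,
}

impl<P: OutputPin> Led<P> {
    pub fn new(pin: P) -> Self {
        Self { pin }
    }

    /// Turns the LED on by driving the pin low.
    pub fn on(&mut self) -> Result<(), P::Error> {
        self.pin.set_low()
    }

    /// Turns the LED off by driving the pin high.
    pub fn off(&mut self) -> Result<(), P::Error> {
        self.pin.set_high()
    }
}

/// Mock pin which just records the last level that was set.
#[derive(Debug, Default, PartialEq)]
pub struct MockPin {
    pub high: bool,
}

impl OutputPin for MockPin {
    type Error = Infallible;

    fn set_low(&mut self) -> Result<(), Self::Error> {
        self.high = false;
        Ok(())
    }

    fn set_high(&mut self) -> Result<(), Self::Error> {
        self.high = true;
        Ok(())
    }
}

/// Returns the mock pin level after `on()` and after `off()`.
pub fn demo() -> (bool, bool) {
    let mut led = Led::new(MockPin { high: true });
    led.on().unwrap();
    let high_when_on = led.pin.high;
    led.off().unwrap();
    let high_when_off = led.pin.high;
    (high_when_on, high_when_off)
}

fn main() {
    assert_eq!(demo(), (false, true));
}
```

The same Led code could then be used with a real HAL pin on the target and with MockPin in host-side tests.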

probe-rs, cargo-embed

probe-rs is a handy toolset for embedded debugging, like OpenOCD but better integrated.

  • SWD and JTAG via CMSIS-DAP, ST-Link and J-Link probes
  • GDB stub and Microsoft DAP server
  • Cargo integration

cargo-embed is a cargo subcommand to build and flash binaries, log RTT output and connect GDB. It’s configured by an Embed.toml file in your project directory.

  • CMSIS-DAP is an Arm standard protocol over USB for an in-circuit debugger to access the CoreSight Debug Access Port of various Arm Cortex processors. It’s what the on-board debugger on the BBC micro:bit uses.
  • ST-Link is a range of in-circuit debuggers from ST Microelectronics, J-Link is a range from SEGGER.
  • The Debug Access Port is usually either a 5-pin JTAG interface or 2-pin Serial Wire Debug.
  • probe-rs is a library which you can integrate into your own tools if you want to.
  • The Microsoft Debug Adapter Protocol lets VSCode and other IDEs debug code running on any supported microcontroller.
  • cargo-embed is a binary built using the probe-rs library.
  • RTT (Real Time Transfers) is a mechanism to transfer data between the debug host and the target through a number of ringbuffers.

Debugging

Embed.toml:

[default.general]
chip = "nrf52833_xxAA"

[debug.gdb]
enabled = true

In one terminal under src/bare-metal/microcontrollers/examples/:

cargo embed --bin board_support debug

In another terminal in the same directory:

gdb-multiarch target/thumbv7em-none-eabihf/debug/board_support --eval-command="target remote :1337"

In GDB, try running:

b src/bin/board_support.rs:29
b src/bin/board_support.rs:30
b src/bin/board_support.rs:32
c
c
c

Other projects

  • RTIC
    • “Real-Time Interrupt-driven Concurrency”
    • Shared resource management, message passing, task scheduling, timer queue
  • Embassy
    • async executors with priorities, timers, networking, USB
  • TockOS
    • Security-focused RTOS with preemptive scheduling and Memory Protection Unit support
  • Hubris
    • Microkernel RTOS from Oxide Computer Company with memory protection, unprivileged drivers, IPC
  • Bindings for FreeRTOS
  • Some platforms have std implementations, e.g. esp-idf.
  • RTIC can be considered either an RTOS or a concurrency framework.
    • It doesn’t include any HALs.
    • It uses the Cortex-M NVIC (Nested Vectored Interrupt Controller) for scheduling rather than a proper kernel.
    • Cortex-M only.
  • Google uses TockOS on the Haven microcontroller for Titan security keys.
  • FreeRTOS is mostly written in C, but there are Rust bindings for writing applications.

Exercises

We will read the direction from an I2C compass, and log the readings to a serial port.

After completing the exercise, you can look at the solution we provide.

Compass

We will read the direction from an I2C compass, and log the readings to a serial port. If you have time, try displaying it on the LEDs somehow too, or use the buttons somehow.

Hints:

  • Check the documentation for the lsm303agr and microbit-v2 crates, as well as the micro:bit hardware.
  • The LSM303AGR Inertial Measurement Unit is connected to the internal I2C bus.
  • TWI is another name for I2C, so the I2C master peripheral is called TWIM.
  • The LSM303AGR driver needs something implementing the embedded_hal::blocking::i2c::WriteRead trait. The microbit::hal::Twim struct implements this.
  • You have a microbit::Board struct with fields for the various pins and peripherals.
  • You can also look at the nRF52833 datasheet if you want, but it shouldn’t be necessary for this exercise.

Download the exercise template and look in the compass directory for the following files.

src/main.rs:

#![no_main]
#![no_std]

extern crate panic_halt as _;

use core::fmt::Write;
use cortex_m_rt::entry;
use microbit::{hal::uarte::{Baudrate, Parity, Uarte}, Board};

#[entry]
fn main() -> ! {
    let board = Board::take().unwrap();

    // Configure serial port.
    let mut serial = Uarte::new(
        board.UARTE0,
        board.uart.into(),
        Parity::EXCLUDED,
        Baudrate::BAUD115200,
    );

    // Set up the I2C controller and Inertial Measurement Unit.
    // TODO

    writeln!(serial, "Ready.").unwrap();

    loop {
        // Read compass data and log it to the serial port.
        // TODO
    }
}

Cargo.toml (you shouldn’t need to change this):

[workspace]

[package]
name = "compass"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
cortex-m-rt = "0.7.3"
embedded-hal = "0.2.6"
lsm303agr = "0.2.2"
microbit-v2 = "0.13.0"
panic-halt = "0.2.0"

Embed.toml (you shouldn’t need to change this):

[default.general]
chip = "nrf52833_xxAA"

[debug.gdb]
enabled = true

[debug.reset]
halt_afterwards = true

.cargo/config.toml (you shouldn’t need to change this):

[build]
target = "thumbv7em-none-eabihf" # Cortex-M4F

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
rustflags = ["-C", "link-arg=-Tlink.x"]

See the serial output on Linux with:

picocom --baud 115200 --imap lfcrlf /dev/ttyACM0

Or on macOS something like (the device name may be slightly different):

picocom --baud 115200 --imap lfcrlf /dev/tty.usbmodem14502

Use Ctrl+A Ctrl+Q to quit picocom.

Application processors

So far we’ve talked about microcontrollers, such as the Arm Cortex-M series. Now let’s try writing something for Cortex-A. For simplicity we’ll just work with QEMU’s aarch64 ‘virt’ board.

  • Broadly speaking, microcontrollers don’t have an MMU or multiple levels of privilege (exception levels on Arm CPUs, rings on x86), while application processors do.
  • QEMU supports emulating various different machines or board models for each architecture. The ‘virt’ board doesn’t correspond to any particular real hardware, but is designed purely for virtual machines.

Getting Ready to Rust

Before we can start running Rust code, we need to do some initialisation.

.section .init.entry, "ax"
.global entry
entry:
    /*
     * Load and apply the memory management configuration, ready to enable MMU and
     * caches.
     */
    adrp x30, idmap
    msr ttbr0_el1, x30

    mov_i x30, .Lmairval
    msr mair_el1, x30

    mov_i x30, .Ltcrval
    /* Copy the supported PA range into TCR_EL1.IPS. */
    mrs x29, id_aa64mmfr0_el1
    bfi x30, x29, #32, #4

    msr tcr_el1, x30

    mov_i x30, .Lsctlrval

    /*
     * Ensure everything before this point has completed, then invalidate any
     * potentially stale local TLB entries before they start being used.
     */
    isb
    tlbi vmalle1
    ic iallu
    dsb nsh
    isb

    /*
     * Configure sctlr_el1 to enable MMU and cache and don't proceed until this
     * has completed.
     */
    msr sctlr_el1, x30
    isb

    /* Disable trapping floating point access in EL1. */
    mrs x30, cpacr_el1
    orr x30, x30, #(0x3 << 20)
    msr cpacr_el1, x30
    isb

    /* Zero out the bss section. */
    adr_l x29, bss_begin
    adr_l x30, bss_end
0:  cmp x29, x30
    b.hs 1f
    stp xzr, xzr, [x29], #16
    b 0b

1:  /* Prepare the stack. */
    adr_l x30, boot_stack_end
    mov sp, x30

    /* Set up exception vector. */
    adr x30, vector_table_el1
    msr vbar_el1, x30

    /* Call into Rust code. */
    bl main

    /* Loop forever waiting for interrupts. */
2:  wfi
    b 2b
  • This is the same as it would be for C: initialising the processor state, zeroing the BSS, and setting up the stack pointer.
    • The BSS (block starting symbol, for historical reasons) is the part of the object file which contains statically allocated variables which are initialised to zero. They are omitted from the image, to avoid wasting space on zeroes. The compiler assumes that the loader will take care of zeroing them.
  • The BSS may already be zeroed, depending on how memory is initialised and the image is loaded, but we zero it to be sure.
  • We need to enable the MMU and cache before reading or writing any memory. If we don’t:
    • Unaligned accesses will fault. We build the Rust code for the aarch64-unknown-none target which sets +strict-align to prevent the compiler generating unaligned accesses, so it should be fine in this case, but this is not necessarily the case in general.
    • If it were running in a VM, this can lead to cache coherency issues. The problem is that the VM is accessing memory directly with the cache disabled, while the host has cacheable aliases to the same memory. Even if the host doesn’t explicitly access the memory, speculative accesses can lead to cache fills, and then changes from one or the other will get lost when the cache is cleaned or the VM enables the cache. (Cache is keyed by physical address, not VA or IPA.)
  • For simplicity, we just use a hardcoded pagetable (see idmap.S) which identity maps the first 1 GiB of address space for devices, the next 1 GiB for DRAM, and another 1 GiB higher up for more devices. This matches the memory layout that QEMU uses.
  • We also set up the exception vector (vbar_el1), which we’ll see more about later.
  • All examples this afternoon assume we will be running at exception level 1 (EL1). If you need to run at a different exception level you’ll need to modify entry.S accordingly.

Inline assembly

Sometimes we need to use assembly to do things that aren’t possible with Rust code. For example, to make an HVC to tell the firmware to power off the system:

#![no_main]
#![no_std]

use core::arch::asm;
use core::panic::PanicInfo;

mod exceptions;

const PSCI_SYSTEM_OFF: u32 = 0x84000008;

#[no_mangle]
extern "C" fn main(_x0: u64, _x1: u64, _x2: u64, _x3: u64) {
    // Safe because this only uses the declared registers and doesn't do
    // anything with memory.
    unsafe {
        asm!("hvc #0",
            inout("w0") PSCI_SYSTEM_OFF => _,
            inout("w1") 0 => _,
            inout("w2") 0 => _,
            inout("w3") 0 => _,
            inout("w4") 0 => _,
            inout("w5") 0 => _,
            inout("w6") 0 => _,
            inout("w7") 0 => _,
            options(nomem, nostack)
        );
    }

    loop {}
}

(If you actually want to do this, use the smccc crate which has wrappers for all these functions.)

  • PSCI is the Arm Power State Coordination Interface, a standard set of functions to manage system and CPU power states, among other things. It is implemented by EL3 firmware and hypervisors on many systems.
  • The 0 => _ syntax means initialise the register to 0 before running the inline assembly code, and ignore its contents afterwards. We need to use inout rather than in because the call could potentially clobber the contents of the registers.
  • This main function needs to be #[no_mangle] and extern "C" because it is called from our entry point in entry.S.
  • _x0 to _x3 are the values of registers x0 to x3, which are conventionally used by the bootloader to pass things like a pointer to the device tree. According to the standard aarch64 calling convention (which is what extern "C" specifies to use), registers x0 to x7 are used for the first 8 arguments passed to a function, so entry.S doesn’t need to do anything special except make sure it doesn’t change these registers.
  • Run the example in QEMU with make qemu_psci under src/bare-metal/aps/examples.

Volatile memory access for MMIO

  • Use pointer::read_volatile and pointer::write_volatile.
  • Never hold a reference.
  • addr_of! lets you get fields of structs without creating an intermediate reference.
  • Volatile access: read or write operations may have side-effects, so prevent the compiler or hardware from reordering, duplicating or eliding them.
    • Usually if you write and then read, e.g. via a mutable reference, the compiler may assume that the value read is the same as the value just written, and not bother actually reading memory.
  • Some existing crates for volatile access to hardware do hold references, but this is unsound. Whenever a reference exists, the compiler may choose to dereference it.
  • Use the addr_of! macro to get struct field pointers from a pointer to the struct.

Let’s write a UART driver

The QEMU ‘virt’ machine has a PL011 UART, so let’s write a driver for that.

const FLAG_REGISTER_OFFSET: usize = 0x18;
const FR_BUSY: u8 = 1 << 3;
const FR_TXFF: u8 = 1 << 5;

/// Minimal driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    base_address: *mut u8,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the 8 MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u8) -> Self {
        Self { base_address }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register() & FR_TXFF != 0 {}

        // Safe because we know that the base address points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            self.base_address.write_volatile(byte);
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register() & FR_BUSY != 0 {}
    }

    fn read_flag_register(&self) -> u8 {
        // Safe because we know that the base address points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { self.base_address.add(FLAG_REGISTER_OFFSET).read_volatile() }
    }
}
  • Note that Uart::new is unsafe while the other methods are safe. This is because as long as the caller of Uart::new guarantees that its safety requirements are met (i.e. that there is only ever one instance of the driver for a given UART, and nothing else aliasing its address space), then it is always safe to call write_byte later because we can assume the necessary preconditions.
  • We could have done it the other way around (making new safe but write_byte unsafe), but that would be much less convenient to use as every place that calls write_byte would need to reason about the safety.
  • This is a common pattern for writing safe wrappers of unsafe code: moving the burden of proof for soundness from a large number of places to a smaller number of places.
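
The same pattern can be seen in isolation in the following sketch: a single unsafe constructor states the requirements once, and every method after that is safe to call. Register is a hypothetical wrapper around one 32-bit register, backed here by an ordinary variable so that it can run on a host.

```rust
/// Hypothetical wrapper around a single 32-bit register.
pub struct Register {
    ptr: *mut u32,
}

impl Register {
    /// Constructs a wrapper for the register at the given address.
    ///
    /// # Safety
    ///
    /// `ptr` must be properly aligned, valid for volatile reads and writes
    /// for as long as the returned value is used, and not accessed through
    /// any other pointer or reference during that time.
    pub unsafe fn new(ptr: *mut u32) -> Self {
        Self { ptr }
    }

    /// Reads the register.
    pub fn read(&self) -> u32 {
        // Safe because the caller of `new` promised that `ptr` is valid.
        unsafe { self.ptr.read_volatile() }
    }

    /// Sets the bits in `mask`, leaving the other bits unchanged.
    pub fn set_bits(&mut self, mask: u32) {
        let value = self.read() | mask;
        // Safe because the caller of `new` promised that `ptr` is valid.
        unsafe { self.ptr.write_volatile(value) }
    }
}

/// The only place that has to argue about safety is the call to `new`.
pub fn demo() -> u32 {
    let mut backing: u32 = 0;
    // Safe because `backing` outlives `reg` and is only accessed via `reg`.
    let mut reg = unsafe { Register::new(&mut backing) };
    reg.set_bits(0b101);
    reg.set_bits(0b010);
    reg.read()
}

fn main() {
    assert_eq!(demo(), 0b111);
}
```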

More traits

We derived the Debug trait. It would be useful to implement a few more traits too.

use core::fmt::{self, Write};

impl Write for Uart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.as_bytes() {
            self.write_byte(*c);
        }
        Ok(())
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Uart {}
  • Implementing Write lets us use the write! and writeln! macros with our Uart type.
  • Run the example in QEMU with make qemu_minimal under src/bare-metal/aps/examples.

A better UART driver

The PL011 actually has a bunch more registers, and adding offsets to construct pointers to access them is error-prone and hard to read. Plus, some of them are bit fields which would be nice to access in a structured way.

Offset  Register name  Width (bits)
0x00    DR             12
0x04    RSR            4
0x18    FR             9
0x20    ILPR           8
0x24    IBRD           16
0x28    FBRD           6
0x2c    LCR_H          8
0x30    CR             16
0x34    IFLS           6
0x38    IMSC           11
0x3c    RIS            11
0x40    MIS            11
0x44    ICR            11
0x48    DMACR          3
  • There are also some ID registers which have been omitted for brevity.

Bitflags

The bitflags crate is useful for working with bitflags.

use bitflags::bitflags;

bitflags! {
    /// Flags from the UART flag register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct Flags: u16 {
        /// Clear to send.
        const CTS = 1 << 0;
        /// Data set ready.
        const DSR = 1 << 1;
        /// Data carrier detect.
        const DCD = 1 << 2;
        /// UART busy transmitting data.
        const BUSY = 1 << 3;
        /// Receive FIFO is empty.
        const RXFE = 1 << 4;
        /// Transmit FIFO is full.
        const TXFF = 1 << 5;
        /// Receive FIFO is full.
        const RXFF = 1 << 6;
        /// Transmit FIFO is empty.
        const TXFE = 1 << 7;
        /// Ring indicator.
        const RI = 1 << 8;
    }
}
  • The bitflags! macro creates a newtype something like Flags(u16), along with a bunch of method implementations to get and set flags.
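
To make the idea of a newtype with flag methods concrete, here is a hand-written approximation for three of the flags. This is only a sketch of the general shape, not the code that the bitflags! macro actually generates; the method names are chosen to resemble the ones the crate provides.

```rust
use core::ops::BitOr;

/// Hand-written newtype over the raw register value.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub struct Flags(u16);

impl Flags {
    /// UART busy transmitting data.
    pub const BUSY: Flags = Flags(1 << 3);
    /// Receive FIFO is empty.
    pub const RXFE: Flags = Flags(1 << 4);
    /// Transmit FIFO is full.
    pub const TXFF: Flags = Flags(1 << 5);

    /// Wraps a raw value, keeping any bits that have no named flag.
    pub const fn from_bits_retain(bits: u16) -> Self {
        Flags(bits)
    }

    /// Returns the raw value.
    pub const fn bits(self) -> u16 {
        self.0
    }

    /// Returns true if every bit set in `other` is also set in `self`.
    pub const fn contains(self, other: Flags) -> bool {
        (self.0 & other.0) == other.0
    }

    /// Sets all the bits that are set in `other`.
    pub fn insert(&mut self, other: Flags) {
        self.0 |= other.0;
    }
}

impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}

fn main() {
    // Pretend this value was just read from the flag register.
    let flags = Flags::from_bits_retain(0b0011_0000);
    println!("TX FIFO full: {}", flags.contains(Flags::TXFF));
    println!("busy: {}", flags.contains(Flags::BUSY));
}
```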

Multiple registers

We can use a struct to represent the memory layout of the UART’s registers.

#[repr(C, align(4))]
struct Registers {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: ReceiveStatus,
    _reserved1: [u8; 19],
    fr: Flags,
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
    ibrd: u16,
    _reserved4: [u8; 2],
    fbrd: u8,
    _reserved5: [u8; 3],
    lcr_h: u8,
    _reserved6: [u8; 3],
    cr: u16,
    _reserved7: [u8; 2],
    ifls: u8,
    _reserved8: [u8; 3],
    imsc: u16,
    _reserved9: [u8; 2],
    ris: u16,
    _reserved10: [u8; 2],
    mis: u16,
    _reserved11: [u8; 2],
    icr: u16,
    _reserved12: [u8; 2],
    dmacr: u8,
    _reserved13: [u8; 3],
}
  • #[repr(C)] tells the compiler to lay the struct fields out in order, following the same rules as C. This is necessary for our struct to have a predictable layout, as default Rust representation allows the compiler to (among other things) reorder fields however it sees fit.
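
Because the padding arrays are easy to get wrong, it can help to check the layout against the register table. The sketch below does this for the first four registers only, using core::mem::offset_of! (which needs Rust 1.77 or later). FirstRegisters is a cut-down stand-in for the struct above, with plain u8 and u16 fields in place of ReceiveStatus and Flags, on the assumption that those types are one and two bytes wide respectively.

```rust
use core::mem::{offset_of, size_of};

/// Cut-down copy of the register struct covering DR, RSR, FR and ILPR.
#[repr(C, align(4))]
pub struct FirstRegisters {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: u8, // stand-in for `ReceiveStatus`, assumed to be one byte
    _reserved1: [u8; 19],
    fr: u16, // stand-in for `Flags`
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
}

// Compile-time checks against the offsets in the register table: the build
// fails if a padding array has the wrong length.
const _: () = assert!(offset_of!(FirstRegisters, dr) == 0x00);
const _: () = assert!(offset_of!(FirstRegisters, rsr) == 0x04);
const _: () = assert!(offset_of!(FirstRegisters, fr) == 0x18);
const _: () = assert!(offset_of!(FirstRegisters, ilpr) == 0x20);

/// Returns the field offsets so they can also be inspected at run time.
pub fn offsets() -> [usize; 4] {
    [
        offset_of!(FirstRegisters, dr),
        offset_of!(FirstRegisters, rsr),
        offset_of!(FirstRegisters, fr),
        offset_of!(FirstRegisters, ilpr),
    ]
}

/// Total size of the cut-down struct, including the trailing padding array.
pub fn total_size() -> usize {
    size_of::<FirstRegisters>()
}

fn main() {
    println!("offsets: {:x?}, size: {:#x}", offsets(), total_size());
}
```

The same kind of assertions could be written for every field of the full struct, which would catch a mismatch between the struct and the table at compile time.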

Driver

Now let’s use the new Registers struct in our driver.

/// Driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    registers: *mut Registers,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the 8 MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self {
            registers: base_address as *mut Registers,
        }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register().contains(Flags::TXFF) {}

        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            addr_of_mut!((*self.registers).dr).write_volatile(byte.into());
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register().contains(Flags::BUSY) {}
    }

    /// Reads and returns a pending byte, or `None` if nothing has been received.
    pub fn read_byte(&self) -> Option<u8> {
        if self.read_flag_register().contains(Flags::RXFE) {
            None
        } else {
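            // Safe because we know that self.registers points to the control
            // registers of a PL011 device which is appropriately mapped.
            let data = unsafe { addr_of!((*self.registers).dr).read_volatile() };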
            // TODO: Check for error conditions in bits 8-11.
            Some(data as u8)
        }
    }

    fn read_flag_register(&self) -> Flags {
        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).fr).read_volatile() }
    }
}
  • Note the use of addr_of! / addr_of_mut! to get pointers to individual fields without creating an intermediate reference, which would be unsound.
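
The TODO in read_byte could be filled in along the following lines. This is only a sketch: it assumes the usual PL011 data register layout, with the received byte in bits 0 to 7 and the framing, parity, break and overrun error flags in bits 8 to 11. RxError and decode_dr are hypothetical names, and the order in which the errors are checked is an arbitrary choice.

```rust
/// Receive errors reported alongside a byte in the data register
/// (hypothetical type for this sketch).
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum RxError {
    Framing,
    Parity,
    Break,
    Overrun,
}

// Error bits in the value read from DR, as assumed above.
const DR_FE: u16 = 1 << 8;
const DR_PE: u16 = 1 << 9;
const DR_BE: u16 = 1 << 10;
const DR_OE: u16 = 1 << 11;

/// Splits a raw DR value into the received byte or the first error found.
pub fn decode_dr(raw: u16) -> Result<u8, RxError> {
    if raw & DR_OE != 0 {
        Err(RxError::Overrun)
    } else if raw & DR_BE != 0 {
        Err(RxError::Break)
    } else if raw & DR_PE != 0 {
        Err(RxError::Parity)
    } else if raw & DR_FE != 0 {
        Err(RxError::Framing)
    } else {
        // No error bits set: the low 8 bits are the data.
        Ok(raw as u8)
    }
}

fn main() {
    println!("{:?}", decode_dr(0x0041));
    println!("{:?}", decode_dr(0x0141));
}
```

With something like this, read_byte could return Option<Result<u8, RxError>> instead of silently discarding the error bits.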

Using it

Let’s write a small program using our driver to write to the serial console, and echo incoming bytes.

#![no_main]
#![no_std]

mod exceptions;
mod pl011;

use crate::pl011::Uart;
use core::fmt::Write;
use core::panic::PanicInfo;
use log::error;
use smccc::psci::system_off;
use smccc::Hvc;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let mut uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };

    writeln!(uart, "main({x0:#x}, {x1:#x}, {x2:#x}, {x3:#x})").unwrap();

    loop {
        if let Some(byte) = uart.read_byte() {
            uart.write_byte(byte);
            match byte {
                b'\r' => {
                    uart.write_byte(b'\n');
                }
                b'q' => break,
                _ => {}
            }
        }
    }

    writeln!(uart, "Bye!").unwrap();
    system_off::<Hvc>().unwrap();
}
  • As in the inline assembly example, this main function is called from our entry point code in entry.S. See the speaker notes there for details.
  • Run the example in QEMU with make qemu under src/bare-metal/aps/examples.
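
To see what the echo loop does without QEMU, its logic can be modelled on the host. This is only a sketch: the function `echo` is hypothetical, takes the received bytes as a slice, and returns the bytes that the loop above would write back.

```rust
/// Models the echo loop: every byte is written back, a carriage return is
/// followed by a line feed, and `q` ends the loop after being echoed.
fn echo(input: &[u8]) -> Vec<u8> {
    let mut output = Vec::new();
    for &byte in input {
        output.push(byte);
        match byte {
            b'\r' => output.push(b'\n'),
            b'q' => break,
            _ => {}
        }
    }
    output
}

fn main() {
    let written = echo(b"hi\rqx");
    // The trailing `x` is never echoed because `q` ended the loop.
    println!("{:?}", String::from_utf8_lossy(&written));
}
```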

Logging

It would be nice to be able to use the logging macros from the log crate. We can do this by implementing the Log trait.

use crate::pl011::Uart;
use core::fmt::Write;
use log::{LevelFilter, Log, Metadata, Record, SetLoggerError};
use spin::mutex::SpinMutex;

static LOGGER: Logger = Logger {
    uart: SpinMutex::new(None),
};

struct Logger {
    uart: SpinMutex<Option<Uart>>,
}

impl Log for Logger {
    fn enabled(&self, _metadata: &Metadata) -> bool {
        true
    }

    fn log(&self, record: &Record) {
        writeln!(
            self.uart.lock().as_mut().unwrap(),
            "[{}] {}",
            record.level(),
            record.args()
        )
        .unwrap();
    }

    fn flush(&self) {}
}

/// Initialises UART logger.
pub fn init(uart: Uart, max_level: LevelFilter) -> Result<(), SetLoggerError> {
    LOGGER.uart.lock().replace(uart);

    log::set_logger(&LOGGER)?;
    log::set_max_level(max_level);
    Ok(())
}
  • The unwrap in log is safe because we initialise LOGGER before calling set_logger.
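
The late-initialisation pattern used by `LOGGER` (a static holding a mutex around an `Option`) can be sketched on the host as follows. This is an illustration under stated assumptions: `std::sync::Mutex` stands in for `SpinMutex`, a `String` stands in for the UART, and `SINK`, `init`, `log_line` and `contents` are hypothetical names that do not come from the `log` crate.

```rust
use std::fmt::Write;
use std::sync::Mutex;

/// Hypothetical sink standing in for the UART; it just collects text.
/// It starts as `None` because a static cannot be built from runtime values.
static SINK: Mutex<Option<String>> = Mutex::new(None);

/// Installs the sink. Must be called before `log_line`.
fn init() {
    *SINK.lock().unwrap() = Some(String::new());
}

/// Formats one log line into the sink.
fn log_line(level: &str, message: &str) {
    let mut guard = SINK.lock().unwrap();
    // Panics if `init` has not run yet, like the `unwrap` in `log` above.
    let sink = guard.as_mut().expect("logger not initialised");
    writeln!(sink, "[{level}] {message}").unwrap();
}

/// Returns a copy of everything logged so far.
fn contents() -> String {
    SINK.lock().unwrap().clone().unwrap_or_default()
}

fn main() {
    init();
    log_line("INFO", "hello");
    print!("{}", contents());
}
```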

Using it

We need to initialise the logger before we use it.

#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;

use crate::pl011::Uart;
use core::panic::PanicInfo;
use log::{error, info, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({x0:#x}, {x1:#x}, {x2:#x}, {x3:#x})");

    assert_eq!(x1, 42);

    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}
  • Note that our panic handler can now log details of panics.
  • Run the example in QEMU with make qemu_logger under src/bare-metal/aps/examples.

Exceptions

AArch64 defines an exception vector table with 16 entries, for 4 types of exceptions (synchronous, IRQ, FIQ, SError) from 4 states (current EL with SP0, current EL with SPx, lower EL using AArch64, lower EL using AArch32). We implement this in assembly to save volatile registers to the stack before calling into Rust code:

use log::error;
use smccc::psci::system_off;
use smccc::Hvc;

#[no_mangle]
extern "C" fn sync_exception_current(_elr: u64, _spsr: u64) {
    error!("sync_exception_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_current(_elr: u64, _spsr: u64) {
    error!("irq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_current(_elr: u64, _spsr: u64) {
    error!("fiq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_current(_elr: u64, _spsr: u64) {
    error!("serr_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn sync_lower(_elr: u64, _spsr: u64) {
    error!("sync_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_lower(_elr: u64, _spsr: u64) {
    error!("irq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_lower(_elr: u64, _spsr: u64) {
    error!("fiq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_lower(_elr: u64, _spsr: u64) {
    error!("serr_lower");
    system_off::<Hvc>().unwrap();
}
  • EL is exception level; all our examples this afternoon run in EL1.
  • For simplicity we aren’t distinguishing between SP0 and SPx for the current EL exceptions, or between AArch32 and AArch64 for the lower EL exceptions.
  • For this example we just log the exception and power down, as we don’t expect any of them to actually happen.
  • We can think of exception handlers and our main execution context more or less like different threads. Send and Sync will control what we can share between them, just like with threads. For example, if we want to share some value between exception handlers and the rest of the program, and it’s Send but not Sync, then we’ll need to wrap it in something like a Mutex and put it in a static.
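
The layout of the 16-entry vector table described at the top of this section can be sketched as a small calculation. Each entry is 0x80 bytes (32 instructions), and entries are grouped by where the exception came from. The `Source` and `Kind` enums and the `vector_offset` function below are hypothetical names introduced only for this illustration.

```rust
/// Where the exception was taken from; each group has four entries.
#[derive(Clone, Copy, Debug)]
enum Source {
    CurrentSp0 = 0,
    CurrentSpx = 1,
    LowerAArch64 = 2,
    LowerAArch32 = 3,
}

/// The type of exception within a group.
#[derive(Clone, Copy, Debug)]
enum Kind {
    Synchronous = 0,
    Irq = 1,
    Fiq = 2,
    SError = 3,
}

/// Size of one vector table entry in bytes (32 instructions of 4 bytes).
const ENTRY_SIZE: u64 = 0x80;

/// Offset of the entry for `source` and `kind` from the table base.
fn vector_offset(source: Source, kind: Kind) -> u64 {
    (source as u64 * 4 + kind as u64) * ENTRY_SIZE
}

fn main() {
    for source in [
        Source::CurrentSp0,
        Source::CurrentSpx,
        Source::LowerAArch64,
        Source::LowerAArch32,
    ] {
        for kind in [Kind::Synchronous, Kind::Irq, Kind::Fiq, Kind::SError] {
            println!("{source:?}/{kind:?}: {:#05x}", vector_offset(source, kind));
        }
    }
}
```

The whole table therefore spans 16 × 0x80 = 0x800 bytes.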

Other projects

  • oreboot
    • “coreboot without the C”
    • Supports x86, aarch64 and RISC-V.
    • Relies on LinuxBoot rather than having many drivers itself.
  • Rust RaspberryPi OS tutorial
    • Initialisation, UART driver, simple bootloader, JTAG, exception levels, exception handling, page tables
    • Some questionable practices around cache maintenance and initialisation in Rust; not necessarily a good example to copy for production code.
  • cargo-call-stack
    • Static analysis to determine maximum stack usage.
  • The RaspberryPi OS tutorial runs Rust code before the MMU and caches are enabled. This will read and write memory (e.g. the stack). However:
    • Without the MMU and cache, unaligned accesses will fault. It builds with aarch64-unknown-none which sets +strict-align to prevent the compiler generating unaligned accesses so it should be alright, but this is not necessarily the case in general.
    • If it were running in a VM, this could lead to cache coherency issues. The problem is that the VM is accessing memory directly with the cache disabled, while the host has cacheable aliases to the same memory. Even if the host doesn’t explicitly access the memory, speculative accesses can lead to cache fills, and then changes from one or the other will get lost. Again this is alright in this particular case (running directly on the hardware with no hypervisor), but isn’t a good pattern in general.

Useful crates

We’ll go over a few crates which solve some common problems in bare-metal programming.

zerocopy

The zerocopy crate (from Fuchsia) provides traits and macros for safely converting between byte sequences and other types.

use zerocopy::AsBytes;

#[repr(u32)]
#[derive(AsBytes, Debug, Default)]
enum RequestType {
    #[default]
    In = 0,
    Out = 1,
    Flush = 4,
}

#[repr(C)]
#[derive(AsBytes, Debug, Default)]
struct VirtioBlockRequest {
    request_type: RequestType,
    reserved: u32,
    sector: u64,
}

fn main() {
    let request = VirtioBlockRequest {
        request_type: RequestType::Flush,
        sector: 42,
        ..Default::default()
    };

    assert_eq!(
        request.as_bytes(),
        &[4, 0, 0, 0, 0, 0, 0, 0, 42, 0, 0, 0, 0, 0, 0, 0]
    );
}

This is not suitable for MMIO (as it doesn’t use volatile reads and writes), but can be useful for working with structures shared with hardware e.g. by DMA, or sent over some external interface.

  • FromBytes can be implemented for types for which any byte pattern is valid, and so can safely be converted from an untrusted sequence of bytes.
  • Attempting to derive FromBytes for these types would fail, because RequestType doesn’t use all possible u32 values as discriminants, so not all byte patterns are valid.
  • zerocopy::byteorder has types for byte-order aware numeric primitives.
  • Run the example with cargo run under src/bare-metal/useful-crates/zerocopy-example/. (It won’t run in the Playground because of the crate dependency.)
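
To make the byte layout and the `FromBytes` restriction concrete without the crate, here is a hand-written sketch. It is not how zerocopy is implemented: `encode` is a hypothetical helper that lays the fields out explicitly in little-endian order (matching the assertion above on a little-endian host), and the `TryFrom<u32>` impl is an illustrative stand-in showing why not every `u32` is a valid `RequestType`.

```rust
use std::convert::TryFrom;

#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum RequestType {
    In = 0,
    Out = 1,
    Flush = 4,
}

impl TryFrom<u32> for RequestType {
    type Error = u32;

    /// Only 0, 1 and 4 are valid discriminants; anything else is rejected.
    fn try_from(value: u32) -> Result<Self, u32> {
        match value {
            0 => Ok(RequestType::In),
            1 => Ok(RequestType::Out),
            4 => Ok(RequestType::Flush),
            other => Err(other),
        }
    }
}

/// Lays out the three fields of the request by hand, in little-endian order.
fn encode(request_type: RequestType, reserved: u32, sector: u64) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&(request_type as u32).to_le_bytes());
    bytes[4..8].copy_from_slice(&reserved.to_le_bytes());
    bytes[8..16].copy_from_slice(&sector.to_le_bytes());
    bytes
}

fn main() {
    println!("{:?}", encode(RequestType::Flush, 0, 42));
    println!("{:?}", RequestType::try_from(2));
}
```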

aarch64-paging

The aarch64-paging crate lets you create page tables according to the AArch64 Virtual Memory System Architecture.

use aarch64_paging::{
    idmap::IdMap,
    paging::{Attributes, MemoryRegion},
};

const ASID: usize = 1;
const ROOT_LEVEL: usize = 1;

// Create a new page table with identity mapping.
let mut idmap = IdMap::new(ASID, ROOT_LEVEL);
// Map a 2 MiB region of memory as read-only.
idmap.map_range(
    &MemoryRegion::new(0x80200000, 0x80400000),
    Attributes::NORMAL | Attributes::NON_GLOBAL | Attributes::READ_ONLY,
).unwrap();
// Set `TTBR0_EL1` to activate the page table.
idmap.activate();
  • For now it only supports EL1, but support for other exception levels should be straightforward to add.
  • This is used in Android for the Protected VM Firmware.
  • There’s no easy way to run this example, as it needs to run on real hardware or under QEMU.
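
Although the page table itself cannot be activated on a host, the address arithmetic behind the mapped region can be checked there. The sketch below assumes a 4 KiB translation granule (9 bits of index per level, 12 bits of page offset); `table_index` is a hypothetical helper and not part of the aarch64-paging API.

```rust
/// Bits of page offset with a 4 KiB granule (an assumption of this sketch).
const PAGE_SHIFT: u32 = 12;
/// Bits of table index resolved at each level with a 4 KiB granule.
const BITS_PER_LEVEL: u32 = 9;

/// Returns the index into the translation table at `level` (0 to 3) that the
/// virtual address `va` selects.
fn table_index(va: u64, level: u32) -> u64 {
    assert!(level <= 3);
    let shift = PAGE_SHIFT + BITS_PER_LEVEL * (3 - level);
    (va >> shift) & 0x1ff
}

fn main() {
    let start: u64 = 0x8020_0000;
    let end: u64 = 0x8040_0000;
    println!("size = {} MiB", (end - start) >> 20);
    for level in 1..=3 {
        println!("level {level} index = {}", table_index(start, level));
    }
}
```

The start address has a level 3 index of zero and the region is exactly 2 MiB long, so under this assumption it lines up with a single level 2 entry.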

buddy_system_allocator

buddy_system_allocator is a third-party crate implementing a basic buddy system allocator. It can be used either through LockedHeap, which implements GlobalAlloc so that you can use the standard alloc crate (as we saw before), or for allocating other address space. For example, we might want to allocate MMIO space for PCI BARs:

use buddy_system_allocator::FrameAllocator;
use core::alloc::Layout;

fn main() {
    let mut allocator = FrameAllocator::<32>::new();
    allocator.add_frame(0x200_0000, 0x400_0000);

    let layout = Layout::from_size_align(0x100, 0x100).unwrap();
    let bar = allocator
        .alloc_aligned(layout)
        .expect("Failed to allocate 0x100 byte MMIO region");
    println!("Allocated 0x100 byte MMIO region at {:#x}", bar);
}
  • PCI BARs always have alignment equal to their size.
  • Run the example with cargo run under src/bare-metal/useful-crates/allocator-example/. (It won’t run in the Playground because of the crate dependency.)
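
The alignment rule for PCI BARs can be illustrated with a much simpler allocator than a buddy system. The following sketch uses a bump allocator instead (it never frees), purely to show the size-equals-alignment arithmetic; `BumpAllocator`, `alloc_bar` and `align_up` are hypothetical names and are unrelated to the buddy_system_allocator API.

```rust
/// Rounds `addr` up to the next multiple of `align` (a power of two).
fn align_up(addr: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Hands out address ranges from `next` up to (but excluding) `end`.
struct BumpAllocator {
    next: usize,
    end: usize,
}

impl BumpAllocator {
    fn new(start: usize, end: usize) -> Self {
        Self { next: start, end }
    }

    /// Allocates a BAR of `size` bytes, aligned to its own size.
    fn alloc_bar(&mut self, size: usize) -> Option<usize> {
        let start = align_up(self.next, size);
        let end = start.checked_add(size)?;
        if end > self.end {
            return None;
        }
        self.next = end;
        Some(start)
    }
}

fn main() {
    let mut allocator = BumpAllocator::new(0x200_0010, 0x400_0000);
    let small = allocator.alloc_bar(0x100).expect("out of MMIO space");
    let large = allocator.alloc_bar(0x1000).expect("out of MMIO space");
    println!("0x100 byte BAR at {small:#x}, 0x1000 byte BAR at {large:#x}");
}
```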

tinyvec

Sometimes you want something which can be resized like a Vec, but without heap allocation. tinyvec provides this: a vector backed by an array or slice, which may be statically allocated or live on the stack. It keeps track of how many elements are in use and panics if you try to use more than are allocated.

use tinyvec::{array_vec, ArrayVec};

fn main() {
    let mut numbers: ArrayVec<[u32; 5]> = array_vec!(42, 66);
    println!("{numbers:?}");
    numbers.push(7);
    println!("{numbers:?}");
    numbers.remove(1);
    println!("{numbers:?}");
}
  • tinyvec requires that the element type implement Default for initialisation.
  • The Rust Playground includes tinyvec, so this example will run fine inline.
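
The idea behind an array-backed vector can be sketched in a few lines. This is not tinyvec's implementation; `FixedVec` is a hypothetical type, limited to `u32` elements for brevity, which stores its elements in a fixed-size array and tracks how many are in use.

```rust
/// A vector of `u32` with a fixed capacity of `N`, stored inline.
#[derive(Debug)]
struct FixedVec<const N: usize> {
    items: [u32; N],
    len: usize,
}

impl<const N: usize> FixedVec<N> {
    fn new() -> Self {
        // Unused slots still need a value, which is why a default is needed.
        Self { items: [0; N], len: 0 }
    }

    /// Appends `value`, panicking if the backing array is full.
    fn push(&mut self, value: u32) {
        assert!(self.len < N, "capacity exceeded");
        self.items[self.len] = value;
        self.len += 1;
    }

    /// Removes and returns the element at `index`, shifting later ones down.
    fn remove(&mut self, index: usize) -> u32 {
        assert!(index < self.len, "index out of bounds");
        let removed = self.items[index];
        self.items.copy_within(index + 1..self.len, index);
        self.len -= 1;
        removed
    }

    /// The elements currently in use.
    fn as_slice(&self) -> &[u32] {
        &self.items[..self.len]
    }
}

fn main() {
    let mut numbers = FixedVec::<5>::new();
    numbers.push(42);
    numbers.push(66);
    numbers.push(7);
    numbers.remove(1);
    println!("{:?}", numbers.as_slice());
}
```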

spin

std::sync::Mutex and the other synchronisation primitives from std::sync are not available in core or alloc. How can we manage synchronisation or interior mutability, such as for sharing state between different CPUs?

The spin crate provides spinlock-based equivalents of many of these primitives.

use spin::mutex::SpinMutex;

static COUNTER: SpinMutex<u32> = SpinMutex::new(0);

fn main() {
    println!("count: {}", COUNTER.lock());
    *COUNTER.lock() += 2;
    println!("count: {}", COUNTER.lock());
}
  • Be careful to avoid deadlock if you take locks in interrupt handlers.
  • spin also has a ticket lock mutex implementation; equivalents of RwLock, Barrier and Once from std::sync; and Lazy for lazy initialisation.
  • The once_cell crate also has some useful types for late initialisation with a slightly different approach to spin::once::Once.
  • The Rust Playground includes spin, so this example will run fine inline.
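
To show roughly what a spinlock-based mutex does, here is a minimal sketch built on an atomic flag. It is not the spin crate's implementation, and `SpinLock` and `Guard` are hypothetical names; it uses `std` only so that it can be exercised with threads on a host, but the atomics and `UnsafeCell` it relies on are also available in `core`.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// A minimal spinlock protecting a value of type `T`.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// Safe because all access to `value` is serialised by the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

/// Gives access to the protected value and unlocks when dropped.
pub struct Guard<'a, T> {
    lock: &'a SpinLock<T>,
}

impl<T> SpinLock<T> {
    pub const fn new(value: T) -> Self {
        Self { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Busy-waits until the flag can be changed from unlocked to locked.
    pub fn lock(&self) -> Guard<'_, T> {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        Guard { lock: self }
    }
}

impl<'a, T> Deref for Guard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Safe because holding the guard means we hold the lock.
        unsafe { &*self.lock.value.get() }
    }
}

impl<'a, T> DerefMut for Guard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        // Safe because holding the guard means we hold the lock.
        unsafe { &mut *self.lock.value.get() }
    }
}

impl<'a, T> Drop for Guard<'a, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

static COUNTER: SpinLock<u32> = SpinLock::new(0);

fn main() {
    let threads: Vec<_> = (0..4)
        .map(|_| {
            std::thread::spawn(|| {
                for _ in 0..1000 {
                    *COUNTER.lock() += 1;
                }
            })
        })
        .collect();
    for thread in threads {
        thread.join().unwrap();
    }
    println!("count: {}", *COUNTER.lock());
}
```

Because `lock` simply loops until the flag is free, taking the same lock in an interrupt handler that preempted the lock holder on the same core would spin forever, which is the deadlock risk mentioned above.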

Android

To build a bare-metal Rust binary in AOSP, you need to use a rust_ffi_static Soong rule to build your Rust code, then a cc_binary with a linker script to produce the binary itself, and then a raw_binary to convert the ELF to a raw binary ready to be run.

rust_ffi_static {
    name: "libvmbase_example",
    defaults: ["vmbase_ffi_defaults"],
    crate_name: "vmbase_example",
    srcs: ["src/main.rs"],
    rustlibs: [
        "libvmbase",
    ],
}

cc_binary {
    name: "vmbase_example",
    defaults: ["vmbase_elf_defaults"],
    srcs: [
        "idmap.S",
    ],
    static_libs: [
        "libvmbase_example",
    ],
    linker_scripts: [
        "image.ld",
        ":vmbase_sections",
    ],
}

raw_binary {
    name: "vmbase_example_bin",
    stem: "vmbase_example.bin",
    src: ":vmbase_example",
    enabled: false,
    target: {
        android_arm64: {
            enabled: true,
        },
    },
}

vmbase

For VMs running under crosvm on aarch64, the vmbase library provides a linker script and useful defaults for the build rules, along with an entry point, UART console logging and more.

#![no_main]
#![no_std]

use vmbase::{main, println};

main!(main);

pub fn main(arg0: u64, arg1: u64, arg2: u64, arg3: u64) {
    println!("Hello world");
}
  • The main! macro marks your main function, to be called from the vmbase entry point.
  • The vmbase entry point handles console initialisation, and issues a PSCI_SYSTEM_OFF to shut down the VM if your main function returns.

Exercises

We will write a driver for the PL031 real-time clock device.

Once you have completed the exercise, you can look at the solution we provide.

RTC driver

The QEMU aarch64 virt machine has a PL031 real-time clock at 0x9010000. For this exercise, you should write a driver for it.

  1. Use it to print the current time to the serial console. You can use the chrono crate for date/time formatting.
  2. Use the match register and raw interrupt status to busy-wait until a given time, e.g. 3 seconds in the future. (Call core::hint::spin_loop inside the loop.)
  3. Extension if you have time: Enable and handle the interrupt generated by the RTC match. You can use the driver provided in the arm-gic crate to configure the Arm Generic Interrupt Controller.
    • Use the RTC interrupt, which is wired to the GIC as IntId::spi(2).
    • Once the interrupt is enabled, you can put the core to sleep via arm_gic::wfi(), which will cause the core to sleep until it receives an interrupt.

Download the exercise template and look in the rtc directory for the following files.

src/main.rs:

#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;

use crate::pl011::Uart;
use arm_gic::gicv3::GicV3;
use core::panic::PanicInfo;
use log::{error, info, trace, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base addresses of the GICv3.
const GICD_BASE_ADDRESS: *mut u64 = 0x800_0000 as _;
const GICR_BASE_ADDRESS: *mut u64 = 0x80A_0000 as _;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;

#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({:#x}, {:#x}, {:#x}, {:#x})", x0, x1, x2, x3);

    // Safe because `GICD_BASE_ADDRESS` and `GICR_BASE_ADDRESS` are the base
    // addresses of a GICv3 distributor and redistributor respectively, and
    // nothing else accesses those address ranges.
    let mut gic = unsafe { GicV3::new(GICD_BASE_ADDRESS, GICR_BASE_ADDRESS) };
    gic.setup();

    // TODO: Create instance of RTC driver and print current time.

    // TODO: Wait for 3 seconds.

    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}

src/exceptions.rs (you should only need to change this for the 3rd part of the exercise):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use arm_gic::gicv3::GicV3;
use log::{error, info, trace};
use smccc::psci::system_off;
use smccc::Hvc;

#[no_mangle]
extern "C" fn sync_exception_current(_elr: u64, _spsr: u64) {
    error!("sync_exception_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_current(_elr: u64, _spsr: u64) {
    trace!("irq_current");
    let intid = GicV3::get_and_acknowledge_interrupt().expect("No pending interrupt");
    info!("IRQ {intid:?}");
}

#[no_mangle]
extern "C" fn fiq_current(_elr: u64, _spsr: u64) {
    error!("fiq_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_current(_elr: u64, _spsr: u64) {
    error!("serr_current");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn sync_lower(_elr: u64, _spsr: u64) {
    error!("sync_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn irq_lower(_elr: u64, _spsr: u64) {
    error!("irq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn fiq_lower(_elr: u64, _spsr: u64) {
    error!("fiq_lower");
    system_off::<Hvc>().unwrap();
}

#[no_mangle]
extern "C" fn serr_lower(_elr: u64, _spsr: u64) {
    error!("serr_lower");
    system_off::<Hvc>().unwrap();
}
}

src/logger.rs (you shouldn’t need to change this):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: main
use crate::pl011::Uart;
use core::fmt::Write;
use log::{LevelFilter, Log, Metadata, Record, SetLoggerError};
use spin::mutex::SpinMutex;

static LOGGER: Logger = Logger {
    uart: SpinMutex::new(None),
};

struct Logger {
    uart: SpinMutex<Option<Uart>>,
}

impl Log for Logger {
    fn enabled(&self, _metadata: &Metadata) -> bool {
        true
    }

    fn log(&self, record: &Record) {
        writeln!(
            self.uart.lock().as_mut().unwrap(),
            "[{}] {}",
            record.level(),
            record.args()
        )
        .unwrap();
    }

    fn flush(&self) {}
}

/// Initialises UART logger.
pub fn init(uart: Uart, max_level: LevelFilter) -> Result<(), SetLoggerError> {
    LOGGER.uart.lock().replace(uart);

    log::set_logger(&LOGGER)?;
    log::set_max_level(max_level);
    Ok(())
}
}

src/pl011.rs (you shouldn’t need to change this):

#![allow(unused)]
fn main() {
// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#![allow(unused)]

use core::fmt::{self, Write};
use core::ptr::{addr_of, addr_of_mut};

// ANCHOR: Flags
use bitflags::bitflags;

bitflags! {
    /// Flags from the UART flag register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct Flags: u16 {
        /// Clear to send.
        const CTS = 1 << 0;
        /// Data set ready.
        const DSR = 1 << 1;
        /// Data carrier detect.
        const DCD = 1 << 2;
        /// UART busy transmitting data.
        const BUSY = 1 << 3;
        /// Receive FIFO is empty.
        const RXFE = 1 << 4;
        /// Transmit FIFO is full.
        const TXFF = 1 << 5;
        /// Receive FIFO is full.
        const RXFF = 1 << 6;
        /// Transmit FIFO is empty.
        const TXFE = 1 << 7;
        /// Ring indicator.
        const RI = 1 << 8;
    }
}
// ANCHOR_END: Flags

bitflags! {
    /// Flags from the UART Receive Status Register / Error Clear Register.
    #[repr(transparent)]
    #[derive(Copy, Clone, Debug, Eq, PartialEq)]
    struct ReceiveStatus: u16 {
        /// Framing error.
        const FE = 1 << 0;
        /// Parity error.
        const PE = 1 << 1;
        /// Break error.
        const BE = 1 << 2;
        /// Overrun error.
        const OE = 1 << 3;
    }
}

// ANCHOR: Registers
#[repr(C, align(4))]
struct Registers {
    dr: u16,
    _reserved0: [u8; 2],
    rsr: ReceiveStatus,
    // 18 bytes of padding put `fr` at offset 0x18, as `rsr` ends at 0x06.
    _reserved1: [u8; 18],
    fr: Flags,
    _reserved2: [u8; 6],
    ilpr: u8,
    _reserved3: [u8; 3],
    ibrd: u16,
    _reserved4: [u8; 2],
    fbrd: u8,
    _reserved5: [u8; 3],
    lcr_h: u8,
    _reserved6: [u8; 3],
    cr: u16,
    // 2 bytes of padding put `ifls` at offset 0x34, as `cr` ends at 0x32.
    _reserved7: [u8; 2],
    ifls: u8,
    _reserved8: [u8; 3],
    imsc: u16,
    _reserved9: [u8; 2],
    ris: u16,
    _reserved10: [u8; 2],
    mis: u16,
    _reserved11: [u8; 2],
    icr: u16,
    _reserved12: [u8; 2],
    dmacr: u8,
    _reserved13: [u8; 3],
}
// ANCHOR_END: Registers

// ANCHOR: Uart
/// Driver for a PL011 UART.
#[derive(Debug)]
pub struct Uart {
    registers: *mut Registers,
}

impl Uart {
    /// Constructs a new instance of the UART driver for a PL011 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the MMIO control registers of a
    /// PL011 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self {
            registers: base_address as *mut Registers,
        }
    }

    /// Writes a single byte to the UART.
    pub fn write_byte(&self, byte: u8) {
        // Wait until there is room in the TX buffer.
        while self.read_flag_register().contains(Flags::TXFF) {}

        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe {
            // Write to the TX buffer.
            addr_of_mut!((*self.registers).dr).write_volatile(byte.into());
        }

        // Wait until the UART is no longer busy.
        while self.read_flag_register().contains(Flags::BUSY) {}
    }

    /// Reads and returns a pending byte, or `None` if nothing has been received.
    pub fn read_byte(&self) -> Option<u8> {
        if self.read_flag_register().contains(Flags::RXFE) {
            None
        } else {
            let data = unsafe { addr_of!((*self.registers).dr).read_volatile() };
            // TODO: Check for error conditions in bits 8-11.
            Some(data as u8)
        }
    }

    fn read_flag_register(&self) -> Flags {
        // Safe because we know that self.registers points to the control
        // registers of a PL011 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).fr).read_volatile() }
    }
}
// ANCHOR_END: Uart

impl Write for Uart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for c in s.as_bytes() {
            self.write_byte(*c);
        }
        Ok(())
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Uart {}
}
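
A `#[repr(C)]` register struct like the one in pl011.rs is easy to get subtly wrong, because a miscounted padding array silently shifts every later field. One way to guard against this is to assert the field offsets at compile time. The sketch below restates the layout with plain integer fields in place of the bitflags types, with padding chosen so that each register lands on the offset documented for the PL011 (for example FR at 0x18, IFLS at 0x34, DMACR at 0x48). It assumes a toolchain where `offset_of!` is available (Rust 1.77 or later).

```rust
/// Host-side restatement of the PL011 register block, for layout checks only.
#[allow(dead_code)]
#[repr(C, align(4))]
struct Registers {
    dr: u16, // 0x00
    _reserved0: [u8; 2],
    rsr: u16, // 0x04
    _reserved1: [u8; 18],
    fr: u16, // 0x18
    _reserved2: [u8; 6],
    ilpr: u8, // 0x20
    _reserved3: [u8; 3],
    ibrd: u16, // 0x24
    _reserved4: [u8; 2],
    fbrd: u8, // 0x28
    _reserved5: [u8; 3],
    lcr_h: u8, // 0x2c
    _reserved6: [u8; 3],
    cr: u16, // 0x30
    _reserved7: [u8; 2],
    ifls: u8, // 0x34
    _reserved8: [u8; 3],
    imsc: u16, // 0x38
    _reserved9: [u8; 2],
    ris: u16, // 0x3c
    _reserved10: [u8; 2],
    mis: u16, // 0x40
    _reserved11: [u8; 2],
    icr: u16, // 0x44
    _reserved12: [u8; 2],
    dmacr: u8, // 0x48
    _reserved13: [u8; 3],
}

// These fail the build, rather than misbehaving at runtime, if a padding
// array has the wrong length.
const _: () = assert!(std::mem::offset_of!(Registers, fr) == 0x18);
const _: () = assert!(std::mem::offset_of!(Registers, cr) == 0x30);
const _: () = assert!(std::mem::offset_of!(Registers, ifls) == 0x34);
const _: () = assert!(std::mem::offset_of!(Registers, dmacr) == 0x48);

fn main() {
    println!("fr at {:#x}", std::mem::offset_of!(Registers, fr));
    println!("size {:#x}", std::mem::size_of::<Registers>());
}
```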

Cargo.toml (you shouldn’t need to change this):

[workspace]

[package]
name = "rtc"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
arm-gic = "0.1.0"
bitflags = "2.0.0"
chrono = { version = "0.4.24", default-features = false }
log = "0.4.17"
smccc = "0.1.1"
spin = "0.9.8"

[build-dependencies]
cc = "1.0.73"

build.rs (you shouldn’t need to change this):

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use cc::Build;
use std::env;

fn main() {
    #[cfg(target_os = "linux")]
    env::set_var("CROSS_COMPILE", "aarch64-linux-gnu");
    #[cfg(not(target_os = "linux"))]
    env::set_var("CROSS_COMPILE", "aarch64-none-elf");

    Build::new()
        .file("entry.S")
        .file("exceptions.S")
        .file("idmap.S")
        .compile("empty")
}

entry.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

.macro adr_l, reg:req, sym:req
	adrp \reg, \sym
	add \reg, \reg, :lo12:\sym
.endm

.macro mov_i, reg:req, imm:req
	movz \reg, :abs_g3:\imm
	movk \reg, :abs_g2_nc:\imm
	movk \reg, :abs_g1_nc:\imm
	movk \reg, :abs_g0_nc:\imm
.endm

.set .L_MAIR_DEV_nGnRE,	0x04
.set .L_MAIR_MEM_WBWA,	0xff
.set .Lmairval, .L_MAIR_DEV_nGnRE | (.L_MAIR_MEM_WBWA << 8)

/* 4 KiB granule size for TTBR0_EL1. */
.set .L_TCR_TG0_4KB, 0x0 << 14
/* 4 KiB granule size for TTBR1_EL1. */
.set .L_TCR_TG1_4KB, 0x2 << 30
/* Disable translation table walk for TTBR1_EL1, generating a translation fault instead. */
.set .L_TCR_EPD1, 0x1 << 23
/* Translation table walks for TTBR0_EL1 are inner shareable. */
.set .L_TCR_SH_INNER, 0x3 << 12
/*
 * Translation table walks for TTBR0_EL1 are outer write-back read-allocate write-allocate
 * cacheable.
 */
.set .L_TCR_RGN_OWB, 0x1 << 10
/*
 * Translation table walks for TTBR0_EL1 are inner write-back read-allocate write-allocate
 * cacheable.
 */
.set .L_TCR_RGN_IWB, 0x1 << 8
/* Size offset for TTBR0_EL1 is 2**39 bytes (512 GiB). */
.set .L_TCR_T0SZ_512, 64 - 39
.set .Ltcrval, .L_TCR_TG0_4KB | .L_TCR_TG1_4KB | .L_TCR_EPD1 | .L_TCR_RGN_OWB
.set .Ltcrval, .Ltcrval | .L_TCR_RGN_IWB | .L_TCR_SH_INNER | .L_TCR_T0SZ_512

/* Stage 1 instruction access cacheability is unaffected. */
.set .L_SCTLR_ELx_I, 0x1 << 12
/* SP alignment fault if SP is not aligned to a 16 byte boundary. */
.set .L_SCTLR_ELx_SA, 0x1 << 3
/* Stage 1 data access cacheability is unaffected. */
.set .L_SCTLR_ELx_C, 0x1 << 2
/* EL0 and EL1 stage 1 MMU enabled. */
.set .L_SCTLR_ELx_M, 0x1 << 0
/* Privileged Access Never is unchanged on taking an exception to EL1. */
.set .L_SCTLR_EL1_SPAN, 0x1 << 23
/* SETEND instruction disabled at EL0 in aarch32 mode. */
.set .L_SCTLR_EL1_SED, 0x1 << 8
/* Various IT instructions are disabled at EL0 in aarch32 mode. */
.set .L_SCTLR_EL1_ITD, 0x1 << 7
.set .L_SCTLR_EL1_RES1, (0x1 << 11) | (0x1 << 20) | (0x1 << 22) | (0x1 << 28) | (0x1 << 29)
.set .Lsctlrval, .L_SCTLR_ELx_M | .L_SCTLR_ELx_C | .L_SCTLR_ELx_SA | .L_SCTLR_EL1_ITD | .L_SCTLR_EL1_SED
.set .Lsctlrval, .Lsctlrval | .L_SCTLR_ELx_I | .L_SCTLR_EL1_SPAN | .L_SCTLR_EL1_RES1

/**
 * This is a generic entry point for an image. It carries out the operations required to prepare the
 * loaded image to be run. Specifically, it zeroes the bss section using registers x25 and above,
 * prepares the stack, enables floating point, and sets up the exception vector. It preserves x0-x3
 * for the Rust entry point, as these may contain boot parameters.
 */
.section .init.entry, "ax"
.global entry
entry:
	/* Load and apply the memory management configuration, ready to enable MMU and caches. */
	adrp x30, idmap
	msr ttbr0_el1, x30

	mov_i x30, .Lmairval
	msr mair_el1, x30

	mov_i x30, .Ltcrval
	/* Copy the supported PA range into TCR_EL1.IPS. */
	mrs x29, id_aa64mmfr0_el1
	bfi x30, x29, #32, #4

	msr tcr_el1, x30

	mov_i x30, .Lsctlrval

	/*
	 * Ensure everything before this point has completed, then invalidate any potentially stale
	 * local TLB entries before they start being used.
	 */
	isb
	tlbi vmalle1
	ic iallu
	dsb nsh
	isb

	/*
	 * Configure sctlr_el1 to enable MMU and cache and don't proceed until this has completed.
	 */
	msr sctlr_el1, x30
	isb

	/* Disable trapping floating point access in EL1. */
	mrs x30, cpacr_el1
	orr x30, x30, #(0x3 << 20)
	msr cpacr_el1, x30
	isb

	/* Zero out the bss section. */
	adr_l x29, bss_begin
	adr_l x30, bss_end
0:	cmp x29, x30
	b.hs 1f
	stp xzr, xzr, [x29], #16
	b 0b

1:	/* Prepare the stack. */
	adr_l x30, boot_stack_end
	mov sp, x30

	/* Set up exception vector. */
	adr x30, vector_table_el1
	msr vbar_el1, x30

	/* Call into Rust code. */
	bl main

	/* Loop forever waiting for interrupts. */
2:	wfi
	b 2b

exceptions.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * Saves the volatile registers onto the stack. This currently takes 14
 * instructions, so it can be used in exception handlers with 18 instructions
 * left.
 *
 * On return, x0 and x1 are initialised to elr_el1 and spsr_el1 respectively,
 * which can be used as the first and second arguments of a subsequent call.
 */
.macro save_volatile_to_stack
	/* Reserve stack space and save registers x0-x18, x29 & x30. */
	stp x0, x1, [sp, #-(8 * 24)]!
	stp x2, x3, [sp, #8 * 2]
	stp x4, x5, [sp, #8 * 4]
	stp x6, x7, [sp, #8 * 6]
	stp x8, x9, [sp, #8 * 8]
	stp x10, x11, [sp, #8 * 10]
	stp x12, x13, [sp, #8 * 12]
	stp x14, x15, [sp, #8 * 14]
	stp x16, x17, [sp, #8 * 16]
	str x18, [sp, #8 * 18]
	stp x29, x30, [sp, #8 * 20]

	/*
	 * Save elr_el1 & spsr_el1. This is so that we can take nested exceptions
	 * and still be able to unwind.
	 */
	mrs x0, elr_el1
	mrs x1, spsr_el1
	stp x0, x1, [sp, #8 * 22]
.endm

/**
 * Restores the volatile registers from the stack. This currently takes 14
 * instructions, so it can be used in exception handlers while still leaving 18
 * instructions; if paired with save_volatile_to_stack, there are 4
 * instructions to spare.
 */
.macro restore_volatile_from_stack
	/* Restore registers x2-x18, x29 & x30. */
	ldp x2, x3, [sp, #8 * 2]
	ldp x4, x5, [sp, #8 * 4]
	ldp x6, x7, [sp, #8 * 6]
	ldp x8, x9, [sp, #8 * 8]
	ldp x10, x11, [sp, #8 * 10]
	ldp x12, x13, [sp, #8 * 12]
	ldp x14, x15, [sp, #8 * 14]
	ldp x16, x17, [sp, #8 * 16]
	ldr x18, [sp, #8 * 18]
	ldp x29, x30, [sp, #8 * 20]

	/* Restore registers elr_el1 & spsr_el1, using x0 & x1 as scratch. */
	ldp x0, x1, [sp, #8 * 22]
	msr elr_el1, x0
	msr spsr_el1, x1

	/* Restore x0 & x1, and release stack space. */
	ldp x0, x1, [sp], #8 * 24
.endm

/**
 * This is a generic handler for exceptions taken at the current EL while using
 * SP0. It behaves similarly to the SPx case by first switching to SPx, doing
 * the work, then switching back to SP0 before returning.
 *
 * Switching to SPx and calling the Rust handler takes 16 instructions. To
 * restore and return we need an additional 16 instructions, so we can implement
 * the whole handler within the allotted 32 instructions.
 */
.macro current_exception_sp0 handler:req
	msr spsel, #1
	save_volatile_to_stack
	bl \handler
	restore_volatile_from_stack
	msr spsel, #0
	eret
.endm

/**
 * This is a generic handler for exceptions taken at the current EL while using
 * SPx. It saves volatile registers, calls the Rust handler, restores volatile
 * registers, then returns.
 *
 * This also works for exceptions taken from EL0, if we don't care about
 * non-volatile registers.
 *
 * Saving state and jumping to the Rust handler takes 15 instructions, and
 * restoring and returning also takes 15 instructions, so we can fit the whole
 * handler in 30 instructions, under the limit of 32.
 */
.macro current_exception_spx handler:req
	save_volatile_to_stack
	bl \handler
	restore_volatile_from_stack
	eret
.endm

.section .text.vector_table_el1, "ax"
.global vector_table_el1
.balign 0x800
vector_table_el1:
sync_cur_sp0:
	current_exception_sp0 sync_exception_current

.balign 0x80
irq_cur_sp0:
	current_exception_sp0 irq_current

.balign 0x80
fiq_cur_sp0:
	current_exception_sp0 fiq_current

.balign 0x80
serr_cur_sp0:
	current_exception_sp0 serr_current

.balign 0x80
sync_cur_spx:
	current_exception_spx sync_exception_current

.balign 0x80
irq_cur_spx:
	current_exception_spx irq_current

.balign 0x80
fiq_cur_spx:
	current_exception_spx fiq_current

.balign 0x80
serr_cur_spx:
	current_exception_spx serr_current

.balign 0x80
sync_lower_64:
	current_exception_spx sync_lower

.balign 0x80
irq_lower_64:
	current_exception_spx irq_lower

.balign 0x80
fiq_lower_64:
	current_exception_spx fiq_lower

.balign 0x80
serr_lower_64:
	current_exception_spx serr_lower

.balign 0x80
sync_lower_32:
	current_exception_spx sync_lower

.balign 0x80
irq_lower_32:
	current_exception_spx irq_lower

.balign 0x80
fiq_lower_32:
	current_exception_spx fiq_lower

.balign 0x80
serr_lower_32:
	current_exception_spx serr_lower
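
The instruction budgets quoted in the comments above can be checked with a little arithmetic. The sketch below is not part of the exercise: the constant and function names are made up for illustration, and the counts come from the macro comments (14 instructions each for save and restore) together with the 0x80-byte entry spacing and the 4-byte size of an AArch64 instruction.

```rust
/// Size of one AArch64 instruction in bytes.
const INSTRUCTION_BYTES: usize = 4;
/// Spacing between vector table entries (`.balign 0x80`).
const ENTRY_BYTES: usize = 0x80;
/// Number of entries in the table: 4 exception kinds for each of 4 sources.
const ENTRIES: usize = 16;
/// Instruction counts stated in the macro comments.
const SAVE: usize = 14;
const RESTORE: usize = 14;

/// Number of instructions that fit into a single vector entry.
fn entry_budget() -> usize {
    ENTRY_BYTES / INSTRUCTION_BYTES
}

/// current_exception_sp0: msr + save + bl + restore + msr + eret.
fn sp0_handler_len() -> usize {
    1 + SAVE + 1 + RESTORE + 1 + 1
}

/// current_exception_spx: save + bl + restore + eret.
fn spx_handler_len() -> usize {
    SAVE + 1 + RESTORE + 1
}

/// Total size of the vector table in bytes.
fn table_bytes() -> usize {
    ENTRIES * ENTRY_BYTES
}

fn main() {
    println!("budget per entry: {} instructions", entry_budget());
    println!("sp0 handler: {} instructions", sp0_handler_len());
    println!("spx handler: {} instructions", spx_handler_len());
    println!("table size: {:#x} bytes", table_bytes());
}
```

The SP0 handler uses the whole budget, while the SPx handler leaves two instructions free. The total table size matches the `.balign 0x800` alignment of `vector_table_el1`.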

idmap.S (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

.set .L_TT_TYPE_BLOCK, 0x1
.set .L_TT_TYPE_PAGE,  0x3
.set .L_TT_TYPE_TABLE, 0x3

/* Access flag. */
.set .L_TT_AF, 0x1 << 10
/* Not global. */
.set .L_TT_NG, 0x1 << 11
.set .L_TT_XN, 0x3 << 53

.set .L_TT_MT_DEV, 0x0 << 2			// MAIR #0 (DEV_nGnRE)
.set .L_TT_MT_MEM, (0x1 << 2) | (0x3 << 8)	// MAIR #1 (MEM_WBWA), inner shareable

.set .L_BLOCK_DEV, .L_TT_TYPE_BLOCK | .L_TT_MT_DEV | .L_TT_AF | .L_TT_XN
.set .L_BLOCK_MEM, .L_TT_TYPE_BLOCK | .L_TT_MT_MEM | .L_TT_AF | .L_TT_NG

.section ".rodata.idmap", "a", %progbits
.global idmap
.align 12
idmap:
	/* level 1 */
	.quad		.L_BLOCK_DEV | 0x0		    // 1 GiB of device mappings
	.quad		.L_BLOCK_MEM | 0x40000000	// 1 GiB of DRAM
	.fill		254, 8, 0x0			// 254 GiB of unmapped VA space
	.quad		.L_BLOCK_DEV | 0x4000000000 // 1 GiB of device mappings
	.fill		255, 8, 0x0			// 255 GiB of remaining VA space
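
To see what the descriptor constants above evaluate to, the same bit arithmetic can be reproduced in Rust. This is only an illustrative sketch: the constant names mirror the assembler symbols, while `GIB` and `level1_index` are hypothetical helpers introduced here, assuming (as the comments in the table say) that each level 1 entry covers 1 GiB.

```rust
// Bit fields copied from the assembler definitions above.
const TT_TYPE_BLOCK: u64 = 0x1;
const TT_AF: u64 = 0x1 << 10; // Access flag.
const TT_NG: u64 = 0x1 << 11; // Not global.
const TT_XN: u64 = 0x3 << 53;
const TT_MT_DEV: u64 = 0x0 << 2; // MAIR #0
const TT_MT_MEM: u64 = (0x1 << 2) | (0x3 << 8); // MAIR #1, inner shareable

const BLOCK_DEV: u64 = TT_TYPE_BLOCK | TT_MT_DEV | TT_AF | TT_XN;
const BLOCK_MEM: u64 = TT_TYPE_BLOCK | TT_MT_MEM | TT_AF | TT_NG;

/// Address range covered by one level 1 entry (assumed to be 1 GiB).
const GIB: u64 = 1 << 30;

/// Index of the level 1 entry that maps `addr` (hypothetical helper).
fn level1_index(addr: u64) -> u64 {
    addr / GIB
}

fn main() {
    println!("BLOCK_DEV = {BLOCK_DEV:#x}");
    println!("BLOCK_MEM = {BLOCK_MEM:#x}");
    println!("DRAM entry = {:#x}", BLOCK_MEM | 0x4000_0000);
    println!("index of 0x4000000000 = {}", level1_index(0x40_0000_0000));
}
```

The second device mapping lands at index 256, which agrees with the 1 + 1 + 254 entries that precede it in the table.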

image.ld (you shouldn’t need to change this):

/*
 * Copyright 2023 Google LLC
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * Code will start running at this symbol which is placed at the start of the
 * image.
 */
ENTRY(entry)

MEMORY
{
	image : ORIGIN = 0x40080000, LENGTH = 2M
}

SECTIONS
{
	/*
	 * Collect together the code.
	 */
	.init : ALIGN(4096) {
		text_begin = .;
		*(.init.entry)
		*(.init.*)
	} >image
	.text : {
		*(.text.*)
	} >image
	text_end = .;

	/*
	 * Collect together read-only data.
	 */
	.rodata : ALIGN(4096) {
		rodata_begin = .;
		*(.rodata.*)
	} >image
	.got : {
		*(.got)
	} >image
	rodata_end = .;

	/*
	 * Collect together the read-write data including .bss at the end which
	 * will be zero'd by the entry code.
	 */
	.data : ALIGN(4096) {
		data_begin = .;
		*(.data.*)
		/*
		 * The entry point code assumes that .data is a multiple of 32
		 * bytes long.
		 */
		. = ALIGN(32);
		data_end = .;
	} >image

	/* Everything beyond this point will not be included in the binary. */
	bin_end = .;

	/* The entry point code assumes that .bss is 16-byte aligned. */
	.bss : ALIGN(16)  {
		bss_begin = .;
		*(.bss.*)
		*(COMMON)
		. = ALIGN(16);
		bss_end = .;
	} >image

	.stack (NOLOAD) : ALIGN(4096) {
		boot_stack_begin = .;
		. += 40 * 4096;
		. = ALIGN(4096);
		boot_stack_end = .;
	} >image

	. = ALIGN(4K);
	PROVIDE(dma_region = .);

	/*
	 * Remove unused sections from the image.
	 */
	/DISCARD/ : {
		/* The image loads itself so doesn't need these sections. */
		*(.gnu.hash)
		*(.hash)
		*(.interp)
		*(.eh_frame_hdr)
		*(.eh_frame)
		*(.note.gnu.build-id)
	}
}

Makefile (you shouldn’t need to change this):

# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

UNAME := $(shell uname -s)
ifeq ($(UNAME),Linux)
	TARGET = aarch64-linux-gnu
else
	TARGET = aarch64-none-elf
endif
OBJCOPY = $(TARGET)-objcopy

.PHONY: build qemu_minimal qemu qemu_logger

all: rtc.bin

build:
	cargo build

rtc.bin: build
	$(OBJCOPY) -O binary target/aarch64-unknown-none/debug/rtc $@

qemu: rtc.bin
	qemu-system-aarch64 -machine virt,gic-version=3 -cpu max -serial mon:stdio -display none -kernel $< -s

clean:
	cargo clean
	rm -f *.bin

.cargo/config.toml (you shouldn’t need to change this):

[build]
target = "aarch64-unknown-none"
rustflags = ["-C", "link-arg=-Timage.ld"]

Run the code in QEMU with make qemu.

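Welcome to Concurrency in Rust

Rust has full support for concurrency using OS threads with mutexes and channels.

The Rust type system plays an important role in turning many runtime concurrency bugs into compile-time errors. This is often called "fearless concurrency", because you can rely on the compiler to ensure correctness at runtime.

Threads

Rust threads work similarly to threads in other languages: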

use std::thread;
use std::time::Duration;

fn main() {
    thread::spawn(|| {
        for i in 1..10 {
            println!("Count in thread: {i}!");
            thread::sleep(Duration::from_millis(5));
        }
    });

    for i in 1..5 {
        println!("Main thread: {i}");
        thread::sleep(Duration::from_millis(5));
    }
}
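  • Threads are all daemon threads: the main thread does not wait for them.
  • Thread panics are independent of each other.
    • Panics can carry a payload, which can be unpacked with downcast_ref.

Key points:

  • Notice that the spawned thread is stopped before it finishes counting, because the main thread does not wait for it.

  • Use let handle = thread::spawn(...) and later handle.join() to wait for the thread to finish.

  • Trigger a panic in the thread and notice that it does not affect main.

  • Use the Result return value from handle.join() to get access to the panic payload. This is a good time to talk about Any.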
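
The last two points can be sketched as follows. The function names are made up for this illustration; the only assumption is that panic! with a plain string literal carries a &'static str payload, with a String fallback for formatted messages.

```rust
use std::thread;

/// Spawns a thread, waits for it with join(), and returns its result.
fn sum_in_thread(n: u32) -> u32 {
    let handle = thread::spawn(move || (1..=n).sum::<u32>());
    handle.join().expect("worker thread panicked")
}

/// Spawns a thread that panics and recovers the payload as a String.
fn panic_message() -> String {
    let handle = thread::spawn(|| {
        panic!("something went wrong");
    });
    match handle.join() {
        Ok(()) => String::from("no panic"),
        // The payload is a Box<dyn Any + Send>, so it has to be downcast.
        Err(payload) => {
            if let Some(msg) = payload.downcast_ref::<&str>() {
                msg.to_string()
            } else if let Some(msg) = payload.downcast_ref::<String>() {
                msg.clone()
            } else {
                String::from("unknown payload type")
            }
        }
    }
}

fn main() {
    println!("sum: {}", sum_in_thread(10));
    // The panic in the worker does not take down the main thread.
    println!("panic payload: {}", panic_message());
}
```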

Scoped Threads

Normal threads cannot borrow from their environment:

use std::thread;

fn foo() {
    let s = String::from("Hello");
    thread::spawn(|| {
        println!("Length: {}", s.len());
    });
}

fn main() {
    foo();
}

However, you can use a scoped thread for this:

use std::thread;

fn main() {
    let s = String::from("Hello");

    thread::scope(|scope| {
        scope.spawn(|| {
            println!("Length: {}", s.len());
        });
    });
}
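  • The reason is that when the thread::scope function completes, all the threads are guaranteed to be joined, so they can return borrowed data.
  • Normal Rust borrowing rules apply: you can either borrow mutably by one thread, or immutably by any number of threads.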
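
As a small sketch of immutable borrowing by several scoped threads, the hypothetical parallel_sum function below lets two threads read disjoint halves of a borrowed slice and collects their results through the scoped join handles.

```rust
use std::thread;

/// Sums a slice using two scoped threads that borrow the data immutably.
fn parallel_sum(data: &[i32]) -> i32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        // Both closures only borrow `left` and `right`; nothing is moved.
        let left_handle = scope.spawn(|| left.iter().sum::<i32>());
        let right_handle = scope.spawn(|| right.iter().sum::<i32>());
        left_handle.join().unwrap() + right_handle.join().unwrap()
    })
}

fn main() {
    let numbers = vec![1, 2, 3, 4, 5, 6];
    println!("sum: {}", parallel_sum(&numbers));
    // `numbers` is still usable here because it was only borrowed.
    println!("numbers: {numbers:?}");
}
```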

Channels

Rust channels have two parts: a Sender<T> and a Receiver<T>. The two parts are connected via the channel, but you only see the end-points.

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    tx.send(10).unwrap();
    tx.send(20).unwrap();

    println!("Received: {:?}", rx.recv());
    println!("Received: {:?}", rx.recv());

    let tx2 = tx.clone();
    tx2.send(30).unwrap();
    println!("Received: {:?}", rx.recv());
}
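  • mpsc stands for Multi-Producer, Single-Consumer. Sender and SyncSender implement Clone (so you can make multiple producers), but Receiver does not.
  • send() and recv() return Result. If they return Err, it means the counterpart Sender or Receiver has been dropped and the channel is closed.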
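
A minimal sketch of the multi-producer side: each worker thread gets its own clone of the Sender, and the receiver loop ends once every Sender has been dropped. The function name and the values sent are illustrative only.

```rust
use std::sync::mpsc;
use std::thread;

/// Starts `workers` producer threads and collects everything they send.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();
    for id in 0..workers {
        // Each producer owns its own clone of the sender.
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id * 10).unwrap();
        });
    }
    // Drop the original sender so the channel closes once all workers finish.
    drop(tx);

    // iter() yields messages until the channel is closed.
    let mut received: Vec<u32> = rx.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("received: {:?}", collect_from_workers(3));
}
```

Without the drop(tx) call the original sender would stay alive and rx.iter() would never finish.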

Unbounded Channels

You get an unbounded and asynchronous channel with mpsc::channel():

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let thread_id = thread::current().id();
        for i in 1..10 {
            tx.send(format!("Message {i}")).unwrap();
            println!("{thread_id:?}: sent Message {i}");
        }
        println!("{thread_id:?}: done");
    });
    thread::sleep(Duration::from_millis(100));

    for msg in rx.iter() {
        println!("Main: got {msg}");
    }
}

Bounded Channels

With bounded (synchronous) channels, send can block the current thread:

use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (tx, rx) = mpsc::sync_channel(3);

    thread::spawn(move || {
        let thread_id = thread::current().id();
        for i in 1..10 {
            tx.send(format!("Message {i}")).unwrap();
            println!("{thread_id:?}: sent Message {i}");
        }
        println!("{thread_id:?}: done");
    });
    thread::sleep(Duration::from_millis(100));

    for msg in rx.iter() {
        println!("Main: got {msg}");
    }
}
  • Calling send will block the current thread until there is space in the channel for the new message. The thread can be blocked indefinitely if there is nobody who reads from the channel.
  • A call to send will abort with an error (that is why it returns Result) if the channel is closed. A channel is closed when the receiver is dropped.
  • A bounded channel with a size of zero is called a “rendezvous channel”. Every send will block the current thread until another thread calls recv.
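
To observe the capacity limit without blocking, one option is the non-blocking try_send method on SyncSender. The sketch below (with a made-up fill_channel helper) counts how many messages a bounded channel accepts before it reports that it is full.

```rust
use std::sync::mpsc::{self, TrySendError};

/// Sends into a bounded channel until it is full and returns how many
/// messages were accepted. The receiver is kept alive but never reads.
fn fill_channel(capacity: usize) -> usize {
    let (tx, _rx) = mpsc::sync_channel::<u32>(capacity);
    let mut accepted = 0;
    loop {
        match tx.try_send(0) {
            Ok(()) => accepted += 1,
            // A blocking send() would wait here instead of returning.
            Err(TrySendError::Full(_)) => break,
            // Reported when the receiver has been dropped.
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

fn main() {
    println!("a channel of size 3 accepted {} messages", fill_channel(3));
}
```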

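Send and Sync

How does Rust know to forbid shared access across threads? The answer is in two traits:

  • Send: a type T is Send if it is safe to move a T across a thread boundary.
  • Sync: a type T is Sync if it is safe to move a &T across a thread boundary.

Send and Sync are unsafe traits. The compiler will automatically derive them for your types as long as they contain only Send and Sync types. You can also implement them manually when you know it is valid.

  • One can think of these traits as markers that the type has certain thread-safety properties.
  • They can be used in generic constraints like normal traits.

Send

A type T is Send if it is safe to move a T value to another thread.

The effect of moving ownership to another thread is that destructors will run in that thread. So the question is when you can allocate a value in one thread and deallocate it in another.

As an example, a connection to an SQLite database must only be accessed from a single thread.

Sync

A type T is Sync if it is safe to access a T value from multiple threads at the same time.

More precisely, the definition is:

T is Sync if and only if &T is Send

This statement is a shorthand way of saying that if a type is thread-safe for shared use, it is also thread-safe to pass references to it across threads.

This is because if a type is Sync, it can be shared across multiple threads without the risk of data races or other synchronization issues, so it is safe to move it to another thread. A reference to the type is also safe to move to another thread, because the data it references can be accessed safely from any thread.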

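Examples

Send + Sync

Most types you come across are Send + Sync:

  • i8, f32, bool, char, &str, …
  • (T1, T2), [T; N], &[T], struct { x: T }, …
  • String, Option<T>, Vec<T>, Box<T>, …
  • Arc<T>: explicitly thread-safe via an atomic reference count.
  • Mutex<T>: explicitly thread-safe via internal locking.
  • AtomicBool, AtomicU8, …: use special atomic instructions.

The generic types are typically Send + Sync when the type parameters are Send + Sync.

Send + !Sync

These types can be moved to other threads, but they are not thread-safe. This is typically because of interior mutability:

  • mpsc::Sender<T>
  • mpsc::Receiver<T>
  • Cell<T>
  • RefCell<T>

!Send + Sync

These types are thread-safe, but they cannot be moved to another thread:

  • MutexGuard<T>: uses OS-level primitives which must be deallocated on the thread that created them.

!Send + !Sync

These types are not thread-safe and cannot be moved to other threads:

  • Rc<T>: each Rc<T> has a reference to an RcBox<T>, which contains a non-atomic reference count.
  • *const T, *mut T: Rust assumes raw pointers may have special concurrency considerations.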
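
Because Send and Sync can be used as ordinary generic bounds, the classifications above can be checked at compile time. The two helper functions below are hypothetical and exist only for this sketch; a call compiles only when the type satisfies the bound.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};

/// Compiles only when `T` is `Send`.
fn is_send<T: Send>() -> bool {
    true
}

/// Compiles only when `T` is `Sync`.
fn is_sync<T: Sync>() -> bool {
    true
}

fn main() {
    assert!(is_send::<String>() && is_sync::<String>());
    assert!(is_send::<Arc<Mutex<Vec<i32>>>>());
    assert!(is_sync::<Arc<Mutex<Vec<i32>>>>());

    // Cell<i32> can be moved to another thread, but not shared.
    assert!(is_send::<Cell<i32>>());

    // Uncommenting either line below is expected to fail to compile:
    // is_sync::<Cell<i32>>();        // Cell<i32> is Send but not Sync
    // is_send::<std::rc::Rc<i32>>(); // Rc<i32> is neither Send nor Sync

    println!("all checks compiled");
}
```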

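Shared State

Rust uses the type system to enforce synchronization of shared data. This is primarily done via two types:

  • Arc<T>, an atomically reference-counted T: handles sharing between threads and takes care of deallocating T when the last reference is dropped.
  • Mutex<T>: ensures mutually exclusive access to the T value.

Arc

Arc<T> allows shared read-only access via Arc::clone: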

use std::thread;
use std::sync::Arc;

fn main() {
    let v = Arc::new(vec![10, 20, 30]);
    let mut handles = Vec::new();
    for _ in 1..5 {
        let v = Arc::clone(&v);
        handles.push(thread::spawn(move || {
            let thread_id = thread::current().id();
            println!("{thread_id:?}: {v:?}");
        }));
    }

    handles.into_iter().for_each(|h| h.join().unwrap());
    println!("v: {v:?}");
}
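  • Arc stands for “Atomic Reference Counted”, a thread-safe version of Rc that uses atomic operations.
  • Arc<T> implements Clone whether or not T does. It implements Send and Sync if and only if T implements them both.
  • Arc::clone() has the cost of the atomic operations that get executed, but after that the use of the T is free.
  • Beware of reference cycles: Arc does not use a garbage collector to detect them.
    • std::sync::Weak can help.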
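
A brief sketch of how Weak behaves: a Weak reference does not keep the value alive, so upgrade() starts returning None once the last Arc is gone. The weak_lifecycle function is made up for this illustration.

```rust
use std::sync::{Arc, Weak};

/// Returns whether a Weak reference could be upgraded before and after
/// the last strong reference was dropped.
fn weak_lifecycle() -> (bool, bool) {
    let strong = Arc::new(String::from("shared"));
    // downgrade() creates a Weak that does not count towards ownership.
    let weak: Weak<String> = Arc::downgrade(&strong);

    let alive_before = weak.upgrade().is_some();
    drop(strong); // The last strong reference goes away here.
    let alive_after = weak.upgrade().is_some();

    (alive_before, alive_after)
}

fn main() {
    let (before, after) = weak_lifecycle();
    println!("upgrade before drop succeeded: {before}");
    println!("upgrade after drop succeeded: {after}");
}
```

Using Weak for back-references (for example from a child to its parent) is one common way to avoid the cycles mentioned above.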

Mutex

Mutex<T> ensures mutual exclusion and allows mutable access to T behind a read-only interface:

use std::sync::Mutex;

fn main() {
    let v = Mutex::new(vec![10, 20, 30]);
    println!("v: {:?}", v.lock().unwrap());

    {
        let mut guard = v.lock().unwrap();
        guard.push(40);
    }

    println!("v: {:?}", v.lock().unwrap());
}

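Notice how we have an impl<T: Send> Sync for Mutex<T> blanket implementation.

  • A Mutex in Rust is like a collection with just one element: the protected data.
    • You must acquire the mutex before you can access the protected data.
  • You can get an &mut T from an &Mutex<T> by taking the lock. The MutexGuard ensures that the &mut T does not outlive the lock being held.
  • Mutex<T> implements both Send and Sync iff (if and only if) T implements Send.
  • A read-write lock counterpart: RwLock.
  • Why does lock() return a Result?
    • If the thread that held the Mutex panicked, the Mutex becomes “poisoned” to signal that the data it protected might be in an inconsistent state. Calling lock() on a poisoned mutex fails with a PoisonError. You can call into_inner() on the error to recover the data regardless.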
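
The poisoning behaviour can be sketched like this. The recover_from_poison function is a made-up example: a worker thread panics while holding the lock, and the main thread then recovers the data through the PoisonError.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex by panicking while it is locked, then recovers the data.
fn recover_from_poison() -> Vec<i32> {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let data2 = Arc::clone(&data);

    let result = thread::spawn(move || {
        let mut guard = data2.lock().unwrap();
        guard.push(4);
        panic!("panic while holding the lock");
    })
    .join();
    assert!(result.is_err());
    assert!(data.is_poisoned());

    // lock() now returns Err, but the guard is still inside the error.
    let guard = match data.lock() {
        Ok(guard) => guard,
        Err(poisoned) => poisoned.into_inner(),
    };
    guard.clone()
}

fn main() {
    println!("recovered: {:?}", recover_from_poison());
}
```

Whether the recovered data is actually consistent depends on what the panicking thread was doing, so recovery is a decision for the application.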

Example

Let us see Arc and Mutex in action:

use std::thread;
// use std::sync::{Arc, Mutex};

fn main() {
    let v = vec![10, 20, 30];
    let handle = thread::spawn(|| {
        v.push(10);
    });
    v.push(1000);

    handle.join().unwrap();
    println!("v: {v:?}");
}

Possible solution:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let v = Arc::new(Mutex::new(vec![10, 20, 30]));

    let v2 = Arc::clone(&v);
    let handle = thread::spawn(move || {
        let mut v2 = v2.lock().unwrap();
        v2.push(10);
    });

    {
        let mut v = v.lock().unwrap();
        v.push(1000);
    }

    handle.join().unwrap();

    println!("v: {v:?}");
}

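Notable parts:

  • v is wrapped in both Arc and Mutex, because their concerns are orthogonal.
    • Wrapping a Mutex in an Arc is a common pattern for sharing mutable state between threads.
  • v: Arc<_> needs to be cloned as v2 before it can be moved into another thread. Note that move was added to the lambda signature.
  • Blocks are introduced to narrow the scope of the MutexGuard as much as possible.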
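
The same Arc<Mutex<_>> pattern scales to more than two threads. The following sketch uses a hypothetical parallel_count function in which several threads increment a shared counter.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from several threads and returns the total.
fn parallel_count(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own Arc pointing at the same Mutex.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The temporary guard is released at the end of the statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total: {}", parallel_count(4, 1000));
}
```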

Exercises

Let us practice our new concurrency skills with:

  • Dining philosophers: a classic problem in concurrency.

  • Multi-threaded link checker: a larger project where you’ll use Cargo to download dependencies and then check links in parallel.

After completing the exercises, you can look at the solutions provided.

Dining Philosophers

The dining philosophers problem is a classic problem in concurrency:

Five philosophers dine together at the same table. Each philosopher has their own place at the table. There is a fork between each plate. The dish served is a kind of spaghetti which has to be eaten with two forks. Each philosopher can only alternately think and eat. Moreover, a philosopher can only eat their spaghetti when they have both a left and right fork. Thus two forks will only be available when their two nearest neighbors are thinking, not eating. After an individual philosopher finishes eating, they will put down both forks.

You will need a local Cargo installation for this exercise. Copy the code below to a file called src/main.rs, fill out the blanks, and test that cargo run does not deadlock:

use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

struct Fork;

struct Philosopher {
    name: String,
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

impl Philosopher {
    fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .unwrap();
    }

    fn eat(&self) {
        // Pick up forks...
        println!("{} is eating...", &self.name);
        thread::sleep(Duration::from_millis(10));
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Plato", "Aristotle", "Thales", "Pythagoras"];

fn main() {
    // Create forks

    // Create philosophers

    // Make each of them think and eat 100 times

    // Output their thoughts
}

You can use the following Cargo.toml:

[package]
name = "dining-philosophers"
version = "0.1.0"
edition = "2021"

Multi-threaded Link Checker

Let us use our new knowledge to create a multi-threaded link checker. It should start at a webpage and check that links on the page are valid. It should recursively check other pages on the same domain and keep doing this until all pages have been validated.

For this, you will need an HTTP client such as reqwest. Create a new Cargo project and add reqwest as a dependency with:

cargo new link-checker
cd link-checker
cargo add --features blocking,rustls-tls reqwest

If cargo add fails with error: no such subcommand, then please edit the Cargo.toml file by hand. Add the dependencies listed below.

You will also need a way to find links. We can use scraper for that:

cargo add scraper

Finally, we’ll need some way of handling errors. We use thiserror for that:

cargo add thiserror

The cargo add calls will update the Cargo.toml file to look like this:

[package]
name = "link-checker"
version = "0.1.0"
edition = "2021"
publish = false

[dependencies]
reqwest = { version = "0.11.12", features = ["blocking", "rustls-tls"] }
scraper = "0.13.0"
thiserror = "1.0.37"

You can now download the start page. Try with a small site such as https://www.google.org/.

Your src/main.rs file should look something like this:

use reqwest::{blocking::Client, Url};
use scraper::{Html, Selector};
use thiserror::Error;

#[derive(Error, Debug)]
enum Error {
    #[error("request error: {0}")]
    ReqwestError(#[from] reqwest::Error),
    #[error("bad http response: {0}")]
    BadResponse(String),
}

#[derive(Debug)]
struct CrawlCommand {
    url: Url,
    extract_links: bool,
}

fn visit_page(client: &Client, command: &CrawlCommand) -> Result<Vec<Url>, Error> {
    println!("Checking {:#}", command.url);
    let response = client.get(command.url.clone()).send()?;
    if !response.status().is_success() {
        return Err(Error::BadResponse(response.status().to_string()));
    }

    let mut link_urls = Vec::new();
    if !command.extract_links {
        return Ok(link_urls);
    }

    let base_url = response.url().to_owned();
    let body_text = response.text()?;
    let document = Html::parse_document(&body_text);

    let selector = Selector::parse("a").unwrap();
    let href_values = document
        .select(&selector)
        .filter_map(|element| element.value().attr("href"));
    for href in href_values {
        match base_url.join(href) {
            Ok(link_url) => {
                link_urls.push(link_url);
            }
            Err(err) => {
                println!("On {base_url:#}: ignored unparsable {href:?}: {err}");
            }
        }
    }
    Ok(link_urls)
}

fn main() {
    let client = Client::new();
    let start_url = Url::parse("https://www.google.org").unwrap();
    let crawl_command = CrawlCommand{ url: start_url, extract_links: true };
    match visit_page(&client, &crawl_command) {
        Ok(links) => println!("Links: {links:#?}"),
        Err(err) => println!("Could not extract links: {err:#}"),
    }
}

Run the code in src/main.rs with

cargo run

Tasks

  • Use threads to check the links in parallel: send the URLs to be checked to a channel and let a few threads check the URLs in parallel.
  • Extend this to recursively extract links from all pages on the www.google.org domain. Put an upper limit of 100 pages or so, so that you don’t end up being blocked by the site.

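Async Rust

“Async” is a concurrency model in which multiple tasks are executed concurrently: each task runs until it would block, and then the system switches to another task that is ready to make progress. The model allows running a large number of tasks on a limited number of threads. This is because the per-task overhead is typically very low and operating systems provide primitives for efficiently identifying I/O that is able to proceed.

Rust’s asynchronous operation is based on “futures”, which represent work that may be completed in the future. Futures are “polled” until they signal that they are complete.

Futures are polled by an async runtime, and several different runtimes are available.

Comparisons

  • Python has a similar model in its asyncio. However, its Future type is callback-based rather than polled. Async Python programs require a “loop”, similar to a runtime in Rust.

  • JavaScript’s Promise is similar, but again callback-based. The language runtime implements the event loop, so many of the details of Promise resolution are hidden.

async/await

At a high level, async Rust code looks very much like “normal” sequential code: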

use futures::executor::block_on;

async fn count_to(count: i32) {
    for i in 1..=count {
        println!("Count is: {i}!");
    }
}

async fn async_main(count: i32) {
    count_to(count).await;
}

fn main() {
    block_on(async_main(10));
}

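Key points:

  • Note that this is a simplified example to show the syntax. There is no long-running operation or any real concurrency in it!

  • What is the return type of an async call?

    • Use let future: () = async_main(10); in main to see the type.
  • The “async” keyword is syntactic sugar. The compiler replaces the return type with a future.

  • You cannot make main async without additional instructions to the compiler on how to use the returned future.

  • You need an executor to run async code. block_on blocks the current thread until the provided future has run to completion.

  • .await asynchronously waits for the completion of another operation. Unlike block_on, .await does not block the current thread.

  • .await can only be used inside an async function (or block; these are introduced later).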

Futures

Future is a trait, implemented by objects that represent an operation that may not be complete yet. A future can be polled, and poll returns a Poll.

use std::pin::Pin;
use std::task::Context;

pub trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

pub enum Poll<T> {
    Ready(T),
    Pending,
}

An async function returns an impl Future. It’s also possible (but uncommon) to implement Future for your own types. For example, the JoinHandle returned from tokio::spawn implements Future to allow joining to it.

The .await keyword, applied to a Future, causes the current async function to pause until that Future is ready, and then evaluates to its output.

  • The Future and Poll types are implemented exactly as shown; click the links to show the implementations in the docs.

  • We will not get to Pin and Context, as we will focus on writing async code, rather than building new async primitives. Briefly:

    • Context allows a Future to schedule itself to be polled again when an event occurs.

    • Pin ensures that the Future isn’t moved in memory, so that pointers into that future remain valid. This is required to allow references to remain valid after an .await.
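
To make the poll protocol concrete, here is a minimal sketch that implements Future by hand and drives it with a tiny busy-polling loop built only from the standard library. Countdown, NoopWaker and run_to_completion are names invented for this illustration; a real runtime would park the thread until the waker fires instead of polling in a loop.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that returns Pending `remaining` times before completing.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("liftoff")
        } else {
            self.remaining -= 1;
            // Ask to be polled again. A real future would hand the waker
            // to whatever event source it is waiting on.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; sufficient for a busy-polling loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `future` until it is ready; returns its output and the poll count.
fn run_to_completion<F: Future>(future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // Pin the future on the heap so that it cannot move between polls.
    let mut future = Box::pin(future);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return (output, polls);
        }
    }
}

fn main() {
    let (output, polls) = run_to_completion(Countdown { remaining: 3 });
    println!("{output} after {polls} polls");
}
```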

Runtimes

A runtime provides support for performing operations asynchronously (a reactor) and is responsible for executing futures (an executor). Rust does not have a “built-in” runtime, but several options are available:

  • Tokio: performant, with a well-developed ecosystem of functionality like Hyper for HTTP or Tonic for gRPC.
  • async-std: aims to be a “std for async”, and includes a basic runtime in async_std::task.
  • smol: simple and lightweight.

Several larger applications have their own runtimes. For example, Fuchsia already has one.

  • Note that of the listed runtimes, only Tokio is supported in the Rust playground. The playground also does not permit any I/O, so most interesting async things can’t run in the playground.

  • Futures are “inert” in that they do not do anything (not even start an I/O operation) unless there is an executor polling them. This differs from JS Promises, for example, which will run to completion even if they are never used.
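
The “inert” property can be observed without any runtime. In the sketch below (all names are made up for illustration), an async block sets a flag; the flag stays unset after the future is created and only changes once the future is polled by hand with a do-nothing waker.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// A waker that does nothing; enough to poll a future once by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether the async block had run before and after being polled.
fn observe_laziness() -> (bool, bool) {
    let ran = Cell::new(false);

    // Creating the future does not execute its body.
    let future = async {
        ran.set(true);
    };
    let ran_before_poll = ran.get();

    // Poll the future once; its body has no .await, so it completes.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    let result = future.as_mut().poll(&mut cx);
    assert!(result.is_ready());

    (ran_before_poll, ran.get())
}

fn main() {
    let (before, after) = observe_laziness();
    println!("ran before poll: {before}, ran after poll: {after}");
}
```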

Tokio

Tokio provides:

  • A multi-threaded runtime for executing asynchronous code.
  • An asynchronous version of the standard library.
  • A large ecosystem of libraries.
use tokio::time;

async fn count_to(count: i32) {
    for i in 1..=count {
        println!("Count in task: {i}!");
        time::sleep(time::Duration::from_millis(5)).await;
    }
}

#[tokio::main]
async fn main() {
    tokio::spawn(count_to(10));

    for i in 1..5 {
        println!("Main task: {i}");
        time::sleep(time::Duration::from_millis(5)).await;
    }
}
  • With the tokio::main macro we can now make main async.

  • The spawn function creates a new, concurrent “task”.

  • Note: spawn takes a Future, you don’t call .await on count_to.

Further exploration:

  • Why does count_to not (usually) get to 10? This is an example of async cancellation. tokio::spawn returns a handle which can be awaited to wait until it finishes.

  • Try count_to(10).await instead of spawning.

  • Try awaiting the task returned from tokio::spawn.

Tasks

Rust has a task system, which is a form of lightweight threading.

A task has a single top-level future which the executor polls to make progress. That future may have one or more nested futures that its poll method polls, corresponding loosely to a call stack. Concurrency within a task is possible by polling multiple child futures, such as racing a timer and an I/O operation.

use tokio::io::{self, AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:6142").await?;
    println!("listening on port 6142");

    loop {
        let (mut socket, addr) = listener.accept().await?;

        println!("connection from {addr:?}");

        tokio::spawn(async move {
            if let Err(e) = socket.write_all(b"Who are you?\n").await {
                println!("socket error: {e:?}");
                return;
            }

            let mut buf = vec![0; 1024];
            let reply = match socket.read(&mut buf).await {
                Ok(n) => {
                    let name = std::str::from_utf8(&buf[..n]).unwrap().trim();
                    format!("Thanks for dialing in, {name}!\n")
                }
                Err(e) => {
                    println!("socket error: {e:?}");
                    return;
                }
            };

            if let Err(e) = socket.write_all(reply.as_bytes()).await {
                println!("socket error: {e:?}");
            }
        });
    }
}

Copy this example into your prepared src/main.rs and run it from there.

  • Ask students to visualize what the state of the example server would be with a few connected clients. What tasks exist? What are their Futures?

  • This is the first time we’ve seen an async block. This is similar to a closure, but does not take any arguments. Its return value is a Future, similar to an async fn.

  • Refactor the async block into a function, and improve the error handling using ?.

Async Channels

Several crates have support for asynchronous channels. For instance tokio:

use tokio::sync::mpsc::{self, Receiver};

async fn ping_handler(mut input: Receiver<()>) {
    let mut count: usize = 0;

    while let Some(_) = input.recv().await {
        count += 1;
        println!("Received {count} pings so far.");
    }

    println!("ping_handler complete");
}

#[tokio::main]
async fn main() {
    let (sender, receiver) = mpsc::channel(32);
    let ping_handler_task = tokio::spawn(ping_handler(receiver));
    for i in 0..10 {
        sender.send(()).await.expect("Failed to send ping.");
        println!("Sent {} pings so far.", i + 1);
    }

    drop(sender);
    ping_handler_task.await.expect("Something went wrong in ping handler task.");
}
  • Change the channel size to 3 and see how it affects the execution.

  • Overall, the interface is similar to the sync channels as seen in the morning class.

  • Try removing the std::mem::drop call. What happens? Why?

  • The Flume crate has channels that implement both sync and async send and recv. This can be convenient for complex applications with both IO and heavy CPU processing tasks.

  • What makes working with async channels preferable is the ability to combine them with other futures to create complex control flow.

Futures Control Flow

Futures can be combined together to produce concurrent compute flow graphs. We have already seen tasks, that function as independent threads of execution.

Join

A join operation waits until all of a set of futures are ready, and returns a collection of their results. This is similar to Promise.all in JavaScript or asyncio.gather in Python.

use anyhow::Result;
use futures::future;
use reqwest;
use std::collections::HashMap;

async fn size_of_page(url: &str) -> Result<usize> {
    let resp = reqwest::get(url).await?;
    Ok(resp.text().await?.len())
}

#[tokio::main]
async fn main() {
    let urls: [&str; 4] = [
        "https://google.com",
        "https://httpbin.org/ip",
        "https://play.rust-lang.org/",
        "BAD_URL",
    ];
    let futures_iter = urls.into_iter().map(size_of_page);
    let results = future::join_all(futures_iter).await;
    let page_sizes_dict: HashMap<&str, Result<usize>> =
        urls.into_iter().zip(results.into_iter()).collect();
    println!("{:?}", page_sizes_dict);
}

Copy this example into your prepared src/main.rs and run it from there.

  • For multiple futures of disjoint types, you can use the join! macro, but you must know how many futures you will have at compile time. The macro is currently provided by the futures crate as futures::join!; a std::future::join! equivalent exists but is not yet stable.

  • The risk of join is that one of the futures may never resolve, which would cause your program to stall.

  • You can also combine join_all with join! for instance to join all requests to an http service as well as a database query. Try adding a tokio::time::sleep to the future, using futures::join!. This is not a timeout (that requires select!, explained in the next chapter), but demonstrates join!.

Select

A select operation waits until any of a set of futures is ready, and responds to that future’s result. In JavaScript, this is similar to Promise.race. In Python, it compares to asyncio.wait(task_set, return_when=asyncio.FIRST_COMPLETED).

Similar to a match statement, the body of select! has a number of arms, each of the form pattern = future => statement. When the future is ready, the statement is executed with the variables in pattern bound to the future’s result.

use tokio::sync::mpsc::{self, Receiver};
use tokio::time::{sleep, Duration};

#[derive(Debug, PartialEq)]
enum Animal {
    Cat { name: String },
    Dog { name: String },
}

async fn first_animal_to_finish_race(
    mut cat_rcv: Receiver<String>,
    mut dog_rcv: Receiver<String>,
) -> Option<Animal> {
    tokio::select! {
        cat_name = cat_rcv.recv() => Some(Animal::Cat { name: cat_name? }),
        dog_name = dog_rcv.recv() => Some(Animal::Dog { name: dog_name? })
    }
}

#[tokio::main]
async fn main() {
    let (cat_sender, cat_receiver) = mpsc::channel(32);
    let (dog_sender, dog_receiver) = mpsc::channel(32);
    tokio::spawn(async move {
        sleep(Duration::from_millis(500)).await;
        cat_sender
            .send(String::from("Felix"))
            .await
            .expect("Failed to send cat.");
    });
    tokio::spawn(async move {
        sleep(Duration::from_millis(50)).await;
        dog_sender
            .send(String::from("Rex"))
            .await
            .expect("Failed to send dog.");
    });

    let winner = first_animal_to_finish_race(cat_receiver, dog_receiver)
        .await
        .expect("Failed to receive winner");

    println!("Winner is {winner:?}");
}
  • In this example, we have a race between a cat and a dog. first_animal_to_finish_race listens to both channels and will pick whichever arrives first. Since the dog takes 50ms, it wins against the cat, which takes 500ms.

  • You can use oneshot channels in this example as the channels are supposed to receive only one send.

  • Try adding a deadline to the race, demonstrating selecting different sorts of futures.

  • Note that select! drops unmatched branches, which cancels their futures. It is easiest to use when every execution of select! creates new futures.

    • An alternative is to pass &mut future instead of the future itself, but this can lead to issues, further discussed in the pinning slide.

Pitfalls of async/await

Async / await provides convenient and efficient abstraction for concurrent asynchronous programming. However, the async/await model in Rust also comes with its share of pitfalls and footguns. We illustrate some of them in this chapter:

Blocking the executor

Most async runtimes only allow IO tasks to run concurrently. This means that a task that blocks the CPU will block the executor and prevent other tasks from running. An easy workaround is to use the async equivalents of blocking methods where possible.

use futures::future::join_all;
use std::time::Instant;

async fn sleep_ms(start: &Instant, id: u64, duration_ms: u64) {
    std::thread::sleep(std::time::Duration::from_millis(duration_ms));
    println!(
        "future {id} slept for {duration_ms}ms, finished after {}ms",
        start.elapsed().as_millis()
    );
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let start = Instant::now();
    let sleep_futures = (1..=10).map(|t| sleep_ms(&start, t, t * 10));
    join_all(sleep_futures).await;
}
  • Run the code and see that the sleeps happen consecutively rather than concurrently.

  • The "current_thread" flavor puts all tasks on a single thread. This makes the effect more obvious, but the bug is still present in the multi-threaded flavor.

  • Switch the std::thread::sleep to tokio::time::sleep and await its result.

  • Another fix would be to use tokio::task::spawn_blocking, which spawns an actual thread and transforms its handle into a future without blocking the executor.

  • You should not think of tasks as OS threads. They do not map 1 to 1 and most executors will allow many tasks to run on a single OS thread. This is particularly problematic when interacting with other libraries via FFI, where that library might depend on thread-local storage or map to specific OS threads (e.g., CUDA). Prefer tokio::task::spawn_blocking in such situations.

  • Use sync mutexes with care. Holding a mutex over an .await may cause another task to block, and that task may be running on the same thread.

Pinning

When you await a future, all local variables (that would ordinarily be stored on a stack frame) are instead stored in the Future for the current async block. If your future has pointers to data on the stack, those pointers might get invalidated. This is unsafe.

Therefore, you must guarantee that the addresses your future points to don’t change. That is why we need to pin futures. Using the same future repeatedly in a select! often leads to issues with pinned values.

use tokio::sync::{mpsc, oneshot};
use tokio::task::spawn;
use tokio::time::{sleep, Duration};

// A work item. In this case, just sleep for the given time and respond
// with a message on the `respond_on` channel.
#[derive(Debug)]
struct Work {
    input: u32,
    respond_on: oneshot::Sender<u32>,
}

// A worker which listens for work on a queue and performs it.
async fn worker(mut work_queue: mpsc::Receiver<Work>) {
    let mut iterations = 0;
    loop {
        tokio::select! {
            Some(work) = work_queue.recv() => {
                sleep(Duration::from_millis(10)).await; // Pretend to work.
                work.respond_on
                    .send(work.input * 1000)
                    .expect("failed to send response");
                iterations += 1;
            }
            // TODO: report number of iterations every 100ms
        }
    }
}

// A requester which requests work and waits for it to complete.
async fn do_work(work_queue: &mpsc::Sender<Work>, input: u32) -> u32 {
    let (tx, rx) = oneshot::channel();
    work_queue
        .send(Work {
            input,
            respond_on: tx,
        })
        .await
        .expect("failed to send on work queue");
    rx.await.expect("failed waiting for response")
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(10);
    spawn(worker(rx));
    for i in 0..100 {
        let resp = do_work(&tx, i).await;
        println!("work result for iteration {i}: {resp}");
    }
}
  • You may recognize this as an example of the actor pattern. Actors typically call select! in a loop.

  • This serves as a summation of a few of the previous lessons, so take your time with it.

    • Naively add a _ = sleep(Duration::from_millis(100)) => { println!(..) } to the select!. This will never execute. Why?

    • Instead, add a timeout_fut containing that future outside of the loop:

      let mut timeout_fut = sleep(Duration::from_millis(100));
      loop {
          select! {
              ..,
              _ = timeout_fut => { println!(..); },
          }
      }
    • This still doesn’t work. Follow the compiler errors, adding &mut to the timeout_fut in the select! to work around the move, then using Box::pin:

      let mut timeout_fut = Box::pin(sleep(Duration::from_millis(100)));
      loop {
          select! {
              ..,
              _ = &mut timeout_fut => { println!(..); },
          }
      }
    • This compiles, but once the timeout expires it is Poll::Ready on every iteration (a fused future would help with this). Update to reset timeout_fut every time it expires.

  • Box allocates on the heap. In some cases, std::pin::pin! (only recently stabilized, with older code often using tokio::pin!) is also an option, but that is difficult to use for a future that is reassigned.

  • Another alternative is to not use pin at all but spawn another task that will send to a oneshot channel every 100ms.
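
To make the reason for pinning concrete without any runtime, the following sketch polls a future by hand using only the standard library. The names NoopWaker, YieldOnce, add_after_yield and poll_to_completion are hypothetical helpers for this illustration; they are not part of tokio or of the worker example. The future returned by add_after_yield keeps a reference to one of its own locals across an await point, which is why it must not move once it has been polled. std::pin::pin! pins it to the stack, and as_mut() lets the same pinned future be polled repeatedly.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for polling by hand in this sketch.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

async fn add_after_yield(a: u32, b: u32) -> u32 {
    let local = a; // stored inside the future across the await
    let local_ref = &local; // a pointer into the future itself
    YieldOnce { yielded: false }.await;
    *local_ref + b
}

/// Polls `fut` until it is ready; returns its output and the number of polls.
fn poll_to_completion<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // pin! pins the future to this stack frame, without a heap allocation.
    let mut fut = pin!(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        // as_mut() reborrows the pinned future so it can be polled again.
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}

fn main() {
    let (sum, polls) = poll_to_completion(add_after_yield(2, 3));
    println!("sum = {sum}, polls = {polls}");
}
```

The first poll stops at the await and the second one finishes, so the future is polled twice through the same pinned location. Box::pin would work the same way, at the cost of a heap allocation.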

Async Traits

Async methods in traits are not yet supported on the stable channel. (An experimental feature exists in nightly and should be stabilized in the mid term.)

The crate async_trait provides a workaround through a macro:

use async_trait::async_trait;
use std::time::Instant;
use tokio::time::{sleep, Duration};

#[async_trait]
trait Sleeper {
    async fn sleep(&self);
}

struct FixedSleeper {
    sleep_ms: u64,
}

#[async_trait]
impl Sleeper for FixedSleeper {
    async fn sleep(&self) {
        sleep(Duration::from_millis(self.sleep_ms)).await;
    }
}

async fn run_all_sleepers_multiple_times(sleepers: Vec<Box<dyn Sleeper>>, n_times: usize) {
    for _ in 0..n_times {
        println!("running all sleepers..");
        for sleeper in &sleepers {
            let start = Instant::now();
            sleeper.sleep().await;
            println!("slept for {}ms", start.elapsed().as_millis());
        }
    }
}

#[tokio::main]
async fn main() {
    let sleepers: Vec<Box<dyn Sleeper>> = vec![
        Box::new(FixedSleeper { sleep_ms: 50 }),
        Box::new(FixedSleeper { sleep_ms: 100 }),
    ];
    run_all_sleepers_multiple_times(sleepers, 5).await;
}
  • async_trait is easy to use, but note that it relies on heap allocations to achieve this, and each allocation has a performance cost.

  • The challenges in language support for async traits run deep in Rust's design and are probably not worth describing in depth. Niko Matsakis did a good job of explaining them in this post if you are interested in digging deeper.

  • Try creating a new sleeper struct that will sleep for a random amount of time and adding it to the Vec.
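
The heap allocation mentioned above comes from the shape of the generated code. As a rough sketch, assuming the expansion described in the async_trait documentation (an async fn becomes a method that returns a boxed, pinned future), a hand-written equivalent might look like the following. The Labeled trait, the Fixed struct, and the tiny block_on executor are hypothetical and use only the standard library, so the sketch runs without tokio or async_trait.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Roughly the shape a macro would produce for `async fn label(&self) -> String`.
trait Labeled {
    fn label<'a>(&'a self) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
}

struct Fixed {
    ms: u64,
}

impl Labeled for Fixed {
    fn label<'a>(&'a self) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
        // Box::pin is the heap allocation paid on every call.
        Box::pin(async move { format!("fixed sleeper, {}ms", self.ms) })
    }
}

/// A waker that does nothing; enough for this busy-polling sketch.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for illustration only: polls in a loop until ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Because the method returns a boxed future, the trait works with `dyn`.
fn labels(items: &[Box<dyn Labeled>]) -> Vec<String> {
    items.iter().map(|item| block_on(item.label())).collect()
}

fn main() {
    let items: Vec<Box<dyn Labeled>> =
        vec![Box::new(Fixed { ms: 50 }), Box::new(Fixed { ms: 100 })];
    for label in labels(&items) {
        println!("{label}");
    }
}
```

Returning a boxed trait object is what lets the trait be used behind Box<dyn Labeled>, and it is also where the per-call allocation comes from.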

Cancellation

Dropping a future implies it can never be polled again. This is called cancellation and it can occur at any await point. Care is needed to ensure the system works correctly even when futures are cancelled. For example, it shouldn’t deadlock or lose data.

use std::io::{self, ErrorKind};
use std::time::Duration;
use tokio::io::{AsyncReadExt, AsyncWriteExt, DuplexStream};

struct LinesReader {
    stream: DuplexStream,
}

impl LinesReader {
    fn new(stream: DuplexStream) -> Self {
        Self { stream }
    }

    async fn next(&mut self) -> io::Result<Option<String>> {
        let mut bytes = Vec::new();
        let mut buf = [0];
        while self.stream.read(&mut buf[..]).await? != 0 {
            bytes.push(buf[0]);
            if buf[0] == b'\n' {
                break;
            }
        }
        if bytes.is_empty() {
            return Ok(None)
        }
        let s = String::from_utf8(bytes)
            .map_err(|_| io::Error::new(ErrorKind::InvalidData, "not UTF-8"))?;
        Ok(Some(s))
    }
}

async fn slow_copy(source: String, mut dest: DuplexStream) -> std::io::Result<()> {
    for b in source.bytes() {
        dest.write_u8(b).await?;
        tokio::time::sleep(Duration::from_millis(10)).await
    }
    Ok(())
}

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let (client, server) = tokio::io::duplex(5);
    let handle = tokio::spawn(slow_copy("hi\nthere\n".to_owned(), client));

    let mut lines = LinesReader::new(server);
    let mut interval = tokio::time::interval(Duration::from_millis(60));
    loop {
        tokio::select! {
            _ = interval.tick() => println!("tick!"),
            line = lines.next() => if let Some(l) = line? {
                print!("{}", l)
            } else {
                break
            },
        }
    }
    handle.await.unwrap()?;
    Ok(())
}
  • The compiler doesn’t help with cancellation-safety. You need to read API documentation and consider what state your async fn holds.

  • Unlike panic and ?, cancellation is part of normal control flow (vs error-handling).

  • The example loses parts of the string.

    • Whenever the tick() branch finishes first, next() and its buf are dropped.

    • LinesReader can be made cancellation-safe by making buf part of the struct:

      struct LinesReader {
          stream: DuplexStream,
          bytes: Vec<u8>,
          buf: [u8; 1],
      }

      impl LinesReader {
          fn new(stream: DuplexStream) -> Self {
              Self { stream, bytes: Vec::new(), buf: [0] }
          }
          async fn next(&mut self) -> io::Result<Option<String>> {
              // prefix buf and bytes with self.
              // ...
              let raw = std::mem::take(&mut self.bytes);
              let s = String::from_utf8(raw)
              // ...
          }
      }
  • Interval::tick is cancellation-safe because it keeps track of whether a tick has been ‘delivered’.

  • AsyncReadExt::read is cancellation-safe because it either returns or doesn’t read data.

  • AsyncBufReadExt::read_line is similar to the example and isn’t cancellation-safe. See its documentation for details and alternatives.
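
The data loss can be reproduced without tokio. The sketch below is a miniature of the example under stated assumptions: Source, YieldOnce, read_pair_local, read_pair_kept and poll_then_drop are hypothetical names, and poll_then_drop stands in for select! dropping the branch that did not finish first. One reader keeps its partial result in a local variable, the other keeps it in state that outlives the future.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Pending on the first poll, Ready on the second: stands in for an I/O wait.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical byte source. A byte is consumed only after the last await
/// point, so a cancelled read never loses data.
struct Source {
    data: Vec<u8>,
    pos: usize,
}

impl Source {
    async fn read_byte(&mut self) -> Option<u8> {
        YieldOnce(false).await;
        let byte = self.data.get(self.pos).copied()?;
        self.pos += 1;
        Some(byte)
    }
}

/// Not cancellation-safe: partial progress lives in a local variable.
async fn read_pair_local(src: &mut Source) -> Vec<u8> {
    let mut bytes = Vec::new();
    while bytes.len() < 2 {
        match src.read_byte().await {
            Some(b) => bytes.push(b),
            None => break,
        }
    }
    bytes
}

/// Cancellation-safe: partial progress lives in `acc`, outside the future.
async fn read_pair_kept(src: &mut Source, acc: &mut Vec<u8>) -> Vec<u8> {
    while acc.len() < 2 {
        match src.read_byte().await {
            Some(b) => acc.push(b),
            None => break,
        }
    }
    std::mem::take(acc)
}

/// Polls `fut` up to `max_polls` times, then drops it (cancelling it if unfinished).
fn poll_then_drop<F: Future>(fut: F, max_polls: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    for _ in 0..max_polls {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some(out);
        }
    }
    None
}

fn lossy_run() -> Vec<u8> {
    let mut src = Source { data: vec![1, 2, 3, 4], pos: 0 };
    // Cancel after two polls: byte 1 has already been read into the local Vec.
    let _ = poll_then_drop(read_pair_local(&mut src), 2);
    poll_then_drop(read_pair_local(&mut src), usize::MAX).unwrap_or_default()
}

fn safe_run() -> Vec<u8> {
    let mut src = Source { data: vec![1, 2, 3, 4], pos: 0 };
    let mut acc = Vec::new();
    let _ = poll_then_drop(read_pair_kept(&mut src, &mut acc), 2);
    poll_then_drop(read_pair_kept(&mut src, &mut acc), usize::MAX).unwrap_or_default()
}

fn main() {
    println!("local state: {:?}", lossy_run());
    println!("kept state:  {:?}", safe_run());
}
```

With the local Vec, the first byte is dropped together with the cancelled future and the retry yields [2, 3]. With the accumulator kept outside the future, the retry continues where the cancelled call stopped and yields [1, 2].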

Exercises

To practice your Async Rust skills, we again have two exercises for you:

  • Dining philosophers: we already saw this problem in the morning. This time you are going to implement it with Async Rust.

  • A Broadcast Chat Application: this is a larger project that allows you to experiment with more advanced Async Rust features.

After completing the exercises, you can look at the solutions we provide.

Dining Philosophers - Async

See dining philosophers for a description of the problem.

As before, you will need a local Cargo installation for this exercise. Copy the code below to a file called src/main.rs, fill in the blanks, and test that cargo run does not deadlock:

use std::sync::Arc;
use tokio::time;
use tokio::sync::mpsc::{self, Sender};
use tokio::sync::Mutex;

struct Fork;

struct Philosopher {
    name: String,
    // left_fork: ...
    // right_fork: ...
    // thoughts: ...
}

impl Philosopher {
    async fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name)).await
            .unwrap();
    }

    async fn eat(&self) {
        // Pick up forks...
        println!("{} is eating...", &self.name);
        time::sleep(time::Duration::from_millis(5)).await;
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Plato", "Aristotle", "Thales", "Pythagoras"];

#[tokio::main]
async fn main() {
    // Create forks

    // Create philosophers

    // Make them think and eat

    // Output their thoughts
}

Since this time you are using Async Rust, you’ll need a tokio dependency. You can use the following Cargo.toml:

[package]
name = "dining-philosophers-async-dine"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = {version = "1.26.0", features = ["sync", "time", "macros", "rt-multi-thread"]}

Also note that this time you have to use the Mutex and the mpsc module from the tokio crate.

  • Can you make your implementation single-threaded?

Broadcast Chat Application

In this exercise, we want to use our new knowledge to implement a broadcast chat application. We have a chat server that the clients connect to and publish their messages. The client reads user messages from the standard input, and sends them to the server. The chat server broadcasts each message that it receives to all the clients.

For this, we use a broadcast channel on the server, and tokio_websockets for the communication between the client and the server.

Create a new Cargo project and add the following dependencies:

Cargo.toml:

[package]
name = "chat-async"
version = "0.1.0"
edition = "2021"

[dependencies]
futures-util = { version = "0.3.28", features = ["sink"] }
http = "0.2.9"
tokio = { version = "1.28.1", features = ["full"] }
tokio-websockets = { version = "0.4.0", features = ["client", "fastrand", "server", "sha1_smol"] }

The required APIs

You are going to need the following functions from tokio and tokio_websockets. Spend a few minutes to familiarize yourself with the API.

  • StreamExt::next() implemented by WebsocketStream: for asynchronously reading messages from a Websocket Stream.
  • SinkExt::send() implemented by WebsocketStream: for asynchronously sending messages on a Websocket Stream.
  • Lines::next_line(): for asynchronously reading user messages from the standard input.
  • Sender::subscribe(): for subscribing to a broadcast channel.

Two binaries

Normally in a Cargo project, you can have only one binary, and one src/main.rs file. In this project, we need two binaries. One for the client, and one for the server. You could potentially make them two separate Cargo projects, but we are going to put them in a single Cargo project with two binaries. For this to work, the client and the server code should go under src/bin (see the documentation).

Copy the following server and client code into src/bin/server.rs and src/bin/client.rs, respectively. Your task is to complete these files as described below.

src/bin/server.rs:

use futures_util::sink::SinkExt;
use futures_util::stream::StreamExt;
use std::error::Error;
use std::net::SocketAddr;
use tokio::net::{TcpListener, TcpStream};
use tokio::sync::broadcast::{channel, Sender};
use tokio_websockets::{Message, ServerBuilder, WebsocketStream};

async fn handle_connection(
    addr: SocketAddr,
    mut ws_stream: WebsocketStream<TcpStream>,
    bcast_tx: Sender<String>,
) -> Result<(), Box<dyn Error + Send + Sync>> {

    // TODO: For a hint, see the description of the task below.

}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let (bcast_tx, _) = channel(16);

    let listener = TcpListener::bind("127.0.0.1:2000").await?;
    println!("listening on port 2000");

    loop {
        let (socket, addr) = listener.accept().await?;
        println!("New connection from {addr:?}");
        let bcast_tx = bcast_tx.clone();
        tokio::spawn(async move {
            // Wrap the raw TCP stream into a websocket.
            let ws_stream = ServerBuilder::new().accept(socket).await?;

            handle_connection(addr, ws_stream, bcast_tx).await
        });
    }
}

src/bin/client.rs:

use futures_util::stream::StreamExt;
use futures_util::SinkExt;
use http::Uri;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio_websockets::{ClientBuilder, Message};

#[tokio::main]
async fn main() -> Result<(), tokio_websockets::Error> {
    let (mut ws_stream, _) =
        ClientBuilder::from_uri(Uri::from_static("ws://127.0.0.1:2000"))
            .connect()
            .await?;

    let stdin = tokio::io::stdin();
    let mut stdin = BufReader::new(stdin).lines();


    // TODO: For a hint, see the description of the task below.

}

Running the binaries

Run the server with:

cargo run --bin server

and the client with:

cargo run --bin client

Tasks

  • Implement the handle_connection function in src/bin/server.rs.
    • Hint: Use tokio::select! for concurrently performing two tasks in a continuous loop. One task receives messages from the client and broadcasts them. The other sends messages received by the server to the client.
  • Complete the main function in src/bin/client.rs.
    • Hint: As before, use tokio::select! in a continuous loop for concurrently performing two tasks: (1) reading user messages from standard input and sending them to the server, and (2) receiving messages from the server, and displaying them for the user.
  • Optional: Once you are done, change the code to broadcast messages to all clients except the sender of the message.

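Thanks!

Thank you for taking Comprehensive Rust 🦀! We hope you enjoyed it and that you can put what you learned to use.

We had a lot of fun putting the course together. The course is not perfect, so if you spotted any mistakes or have ideas for improvements, please get in contact with us on GitHub. We would love to hear from you.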

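Other Rust Resources

The Rust community has created a wealth of high-quality, free resources online.

Official Documentation

The Rust project hosts many resources. These cover Rust in general:

  • The Rust Programming Language: the canonical free book about Rust. It covers the language in detail and includes a few projects for readers to build.
  • Rust By Example: introduces the Rust syntax through a series of examples that showcase different constructs. It sometimes includes small exercises that ask you to expand on the code in the examples.
  • Rust Standard Library: the full documentation of the Rust standard library.
  • The Rust Reference: an incomplete book that describes the Rust grammar and memory model.

More specialized guides are hosted on the official Rust site:

Unofficial Learning Material

A small selection of other guides and tutorials for Rust:

Please see the Little Book of Rust Books for even more Rust books.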

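Credits

The material here builds on many great sources of Rust documentation. See the page on other resources for a full list of useful resources.

The material of Comprehensive Rust is licensed under the terms of the Apache 2.0 license; please see LICENSE for details.

Rust by Example

Some examples and exercises have been copied and adapted from Rust by Example. Please see the third_party/rust-by-example/ directory for details, including the license terms.

Rust on Exercism

Some exercises have been copied and adapted from Rust on Exercism. Please see the third_party/rust-on-exercism/ directory for details, including the license terms.

CXX

The "With C++" part of the Interoperability section uses an image from CXX. Please see the third_party/cxx/ directory for details, including the license terms.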

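Solutions

You will find solutions to the exercises on the following pages.

Feel free to ask questions about the solutions on GitHub. Let us know if you have a different or better solution than the one presented here.

Note: Please ignore the // ANCHOR: label and // ANCHOR_END: label comments you see in the solutions. They are there to make it possible to reuse parts of the solutions as the exercises.

Day 1 Morning Exercises

Arrays and for Loops

(back to exercise)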

// Copyright 2022 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: transpose
fn transpose(matrix: [[i32; 3]; 3]) -> [[i32; 3]; 3] {
    // ANCHOR_END: transpose
    let mut result = [[0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            result[j][i] = matrix[i][j];
        }
    }
    result
}

// ANCHOR: pretty_print
fn pretty_print(matrix: &[[i32; 3]; 3]) {
    // ANCHOR_END: pretty_print
    for row in matrix {
        println!("{row:?}");
    }
}

// ANCHOR: tests
#[test]
fn test_transpose() {
    let matrix = [
        [101, 102, 103], //
        [201, 202, 203],
        [301, 302, 303],
    ];
    let transposed = transpose(matrix);
    assert_eq!(
        transposed,
        [
            [101, 201, 301], //
            [102, 202, 302],
            [103, 203, 303],
        ]
    );
}
// ANCHOR_END: tests

// ANCHOR: main
fn main() {
    let matrix = [
        [101, 102, 103], // <-- the comment makes rustfmt add a newline
        [201, 202, 203],
        [301, 302, 303],
    ];

    println!("matrix:");
    pretty_print(&matrix);

    let transposed = transpose(matrix);
    println!("transposed:");
    pretty_print(&transposed);
}

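Bonus Question

This requires more advanced concepts. It might seem that we could use a slice of slices (&[&[i32]]) as the input type to transpose and thus make the function handle a matrix of any size. However, this quickly breaks down: the return type cannot be &[&[i32]], because the function needs to own the data it returns.

You can attempt to use something like Vec<Vec<i32>>, but this does not work out of the box either: it is hard to convert from Vec<Vec<i32>> to &[&[i32]], so you cannot easily use pretty_print either.

Once we get to traits and generics, we will be able to use the std::convert::AsRef trait to abstract over anything that can be referenced as a slice.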

use std::convert::AsRef;
use std::fmt::Debug;

fn pretty_print<T, Line, Matrix>(matrix: Matrix)
where
    T: Debug,
    // A line references a slice of items
    Line: AsRef<[T]>,
    // A matrix references a slice of lines
    Matrix: AsRef<[Line]>
{
    for row in matrix.as_ref() {
        println!("{:?}", row.as_ref());
    }
}

fn main() {
    // &[&[i32]]
    pretty_print(&[&[1, 2, 3], &[4, 5, 6], &[7, 8, 9]]);
    // [[&str; 2]; 2]
    pretty_print([["a", "b"], ["c", "d"]]);
    // Vec<Vec<i32>>
    pretty_print(vec![vec![1, 2], vec![3, 4]]);
}

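In addition, the type itself does not enforce that the child slices all have the same length, so such a variable could contain an invalid matrix.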
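
As one possible way to combine these points, the sketch below accepts any slice-like matrix through AsRef, returns an owned Vec<Vec<T>>, and checks the row lengths at run time because the types cannot. The name transpose_any and the choice to return Option are assumptions made for this illustration, not part of the exercise.

```rust
/// Transposes any slice-like matrix into an owned Vec<Vec<T>>.
/// Returns None when the rows do not all have the same length.
fn transpose_any<T, Line, Matrix>(matrix: Matrix) -> Option<Vec<Vec<T>>>
where
    T: Clone,
    // A line references a slice of items
    Line: AsRef<[T]>,
    // A matrix references a slice of lines
    Matrix: AsRef<[Line]>,
{
    let rows = matrix.as_ref();
    // Width of the first row; an empty matrix transposes to an empty matrix.
    let width = rows.first().map_or(0, |row| row.as_ref().len());
    // Reject ragged input, which the types alone cannot rule out.
    if rows.iter().any(|row| row.as_ref().len() != width) {
        return None;
    }
    // One output row per input column.
    let mut result = vec![Vec::new(); width];
    for row in rows {
        for (j, item) in row.as_ref().iter().enumerate() {
            result[j].push(item.clone());
        }
    }
    Some(result)
}

fn main() {
    // Vec<Vec<i32>> with 2 rows and 3 columns
    println!("{:?}", transpose_any(vec![vec![1, 2, 3], vec![4, 5, 6]]));
    // [[&str; 2]; 2]
    println!("{:?}", transpose_any([["a", "b"], ["c", "d"]]));
    // Ragged input is rejected.
    println!("{:?}", transpose_any(vec![vec![1, 2], vec![3]]));
}
```

Returning owned vectors avoids the problem of a borrowed return type, at the cost of cloning every element.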

Day 1 Afternoon Exercises

Luhn Algorithm

(back to exercise)

// ANCHOR: luhn
pub fn luhn(cc_number: &str) -> bool {
    // ANCHOR_END: luhn
    let mut digits_seen = 0;
    let mut sum = 0;
    for (i, ch) in cc_number.chars().rev().filter(|&ch| ch != ' ').enumerate() {
        match ch.to_digit(10) {
            Some(d) => {
                sum += if i % 2 == 1 {
                    let dd = d * 2;
                    dd / 10 + dd % 10
                } else {
                    d
                };
                digits_seen += 1;
            }
            None => return false,
        }
    }

    if digits_seen < 2 {
        return false;
    }

    sum % 10 == 0
}

fn main() {
    let cc_number = "1234 5678 1234 5670";
    println!(
        "Is {cc_number} a valid credit card number? {}",
        if luhn(cc_number) { "yes" } else { "no" }
    );
}

// ANCHOR: unit-tests
#[test]
fn test_non_digit_cc_number() {
    assert!(!luhn("foo"));
    assert!(!luhn("foo 0 0"));
}

#[test]
fn test_empty_cc_number() {
    assert!(!luhn(""));
    assert!(!luhn(" "));
    assert!(!luhn("  "));
    assert!(!luhn("    "));
}

#[test]
fn test_single_digit_cc_number() {
    assert!(!luhn("0"));
}

#[test]
fn test_two_digit_cc_number() {
    assert!(luhn(" 0 0 "));
}

#[test]
fn test_valid_cc_number() {
    assert!(luhn("4263 9826 4026 9299"));
    assert!(luhn("4539 3195 0343 6467"));
    assert!(luhn("7992 7398 713"));
}

#[test]
fn test_invalid_cc_number() {
    assert!(!luhn("4223 9826 4026 9299"));
    assert!(!luhn("4539 3195 0343 6476"));
    assert!(!luhn("8273 1232 7352 0569"));
}
// ANCHOR_END: unit-tests

Pattern Matching

TBD.

Day 2 Morning Exercises

Designing a Library

(back to exercise)

// ANCHOR: setup
struct Library {
    books: Vec<Book>,
}

struct Book {
    title: String,
    year: u16,
}

impl Book {
    // This is a constructor, used below.
    fn new(title: &str, year: u16) -> Book {
        Book {
            title: String::from(title),
            year,
        }
    }
}

// Implement the methods below. Update the `self` parameter to
// indicate the method's required level of ownership over the object:
//
// - `&self` for shared read-only access,
// - `&mut self` for unique and mutable access,
// - `self` for unique access by value.
impl Library {
    // ANCHOR_END: setup

    // ANCHOR: Library_new
    fn new() -> Library {
        // ANCHOR_END: Library_new
        Library { books: Vec::new() }
    }

    // ANCHOR: Library_len
    //fn len(self) -> usize {
    //    todo!("Return the length of `self.books`")
    //}
    // ANCHOR_END: Library_len
    fn len(&self) -> usize {
        self.books.len()
    }

    // ANCHOR: Library_is_empty
    //fn is_empty(self) -> bool {
    //    todo!("Return `true` if `self.books` is empty")
    //}
    // ANCHOR_END: Library_is_empty
    fn is_empty(&self) -> bool {
        self.books.is_empty()
    }

    // ANCHOR: Library_add_book
    //fn add_book(self, book: Book) {
    //    todo!("Add a new book to `self.books`")
    //}
    // ANCHOR_END: Library_add_book
    fn add_book(&mut self, book: Book) {
        self.books.push(book)
    }

    // ANCHOR: Library_print_books
    //fn print_books(self) {
    //    todo!("Iterate over `self.books` and print each book's title and year")
    //}
    // ANCHOR_END: Library_print_books
    fn print_books(&self) {
        for book in &self.books {
            println!("{}, published in {}", book.title, book.year);
        }
    }

    // ANCHOR: Library_oldest_book
    //fn oldest_book(self) -> Option<&Book> {
    //    todo!("Return a reference to the oldest book (if any)")
    //}
    // ANCHOR_END: Library_oldest_book
    fn oldest_book(&self) -> Option<&Book> {
        // Using a closure and a built-in method:
        // self.books.iter().min_by_key(|book| book.year)

        // Longer hand-written solution:
        let mut oldest: Option<&Book> = None;
        for book in self.books.iter() {
            if oldest.is_none() || book.year < oldest.unwrap().year {
                oldest = Some(book);
            }
        }

        oldest
    }
}

// ANCHOR: main
// This shows the desired behavior. Uncomment the code below and
// implement the missing methods. You will need to update the
// method signatures, including the "self" parameter! You may
// also need to update the variable bindings within main.
fn main() {
    let library = Library::new();

    //println!("The library is empty: library.is_empty() -> {}", library.is_empty());
    //
    //library.add_book(Book::new("Lord of the Rings", 1954));
    //library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    //
    //println!("The library is no longer empty: library.is_empty() -> {}", library.is_empty());
    //
    //
    //library.print_books();
    //
    //match library.oldest_book() {
    //    Some(book) => println!("The oldest book is {}", book.title),
    //    None => println!("The library is empty!"),
    //}
    //
    //println!("The library has {} books", library.len());
    //library.print_books();
}
// ANCHOR_END: main

#[test]
fn test_library_len() {
    let mut library = Library::new();
    assert_eq!(library.len(), 0);
    assert!(library.is_empty());

    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    assert_eq!(library.len(), 2);
    assert!(!library.is_empty());
}

#[test]
fn test_library_is_empty() {
    let mut library = Library::new();
    assert!(library.is_empty());

    library.add_book(Book::new("Lord of the Rings", 1954));
    assert!(!library.is_empty());
}

#[test]
fn test_library_print_books() {
    let mut library = Library::new();
    library.add_book(Book::new("Lord of the Rings", 1954));
    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    // We could try and capture stdout, but let us just call the
    // method to start with.
    library.print_books();
}

#[test]
fn test_library_oldest_book() {
    let mut library = Library::new();
    assert!(library.oldest_book().is_none());

    library.add_book(Book::new("Lord of the Rings", 1954));
    assert_eq!(
        library.oldest_book().map(|b| b.title.as_str()),
        Some("Lord of the Rings")
    );

    library.add_book(Book::new("Alice's Adventures in Wonderland", 1865));
    assert_eq!(
        library.oldest_book().map(|b| b.title.as_str()),
        Some("Alice's Adventures in Wonderland")
    );
}

Day 2 Afternoon Exercises

Strings and Iterators

(back to exercise)

// ANCHOR: prefix_matches
pub fn prefix_matches(prefix: &str, request_path: &str) -> bool {
    // ANCHOR_END: prefix_matches

    let mut request_segments = request_path.split('/');

    for prefix_segment in prefix.split('/') {
        let Some(request_segment) = request_segments.next() else {
            return false;
        };
        if request_segment != prefix_segment && prefix_segment != "*" {
            return false;
        }
    }
    true

    // Alternatively, Iterator::zip() lets us iterate simultaneously over prefix
    // and request segments. The zip() iterator is finished as soon as one of
    // the source iterators is finished, but we need to iterate over all prefix
    // segments. A neat trick that makes zip() work is to use map() and chain()
    // to produce an iterator that returns Some(str) for each request segment,
    // and then returns None indefinitely.
}

// ANCHOR: unit-tests
#[test]
fn test_matches_without_wildcard() {
    assert!(prefix_matches("/v1/publishers", "/v1/publishers"));
    assert!(prefix_matches("/v1/publishers", "/v1/publishers/abc-123"));
    assert!(prefix_matches("/v1/publishers", "/v1/publishers/abc/books"));

    assert!(!prefix_matches("/v1/publishers", "/v1"));
    assert!(!prefix_matches("/v1/publishers", "/v1/publishersBooks"));
    assert!(!prefix_matches("/v1/publishers", "/v1/parent/publishers"));
}

#[test]
fn test_matches_with_wildcard() {
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/books"
    ));
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/bar/books"
    ));
    assert!(prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/books/book1"
    ));

    assert!(!prefix_matches("/v1/publishers/*/books", "/v1/publishers"));
    assert!(!prefix_matches(
        "/v1/publishers/*/books",
        "/v1/publishers/foo/booksByAuthor"
    ));
}
// ANCHOR_END: unit-tests

fn main() {}
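
The zip()-based alternative mentioned in the comment at the end of prefix_matches can be sketched as follows. The function name prefix_matches_zip is hypothetical. The request segments are wrapped in Some and followed by an endless run of None, so that zip() keeps going until the prefix segments run out and a request path that ends too early is detected.

```rust
/// A zip()-based variant of prefix_matches (the name is hypothetical).
pub fn prefix_matches_zip(prefix: &str, request_path: &str) -> bool {
    // Some(segment) for every request segment, then None forever, so that
    // zip() only stops once all prefix segments have been visited.
    let request_segments = request_path
        .split('/')
        .map(Some)
        .chain(std::iter::repeat(None));

    for (prefix_segment, request_segment) in prefix.split('/').zip(request_segments) {
        match request_segment {
            // The request path ended before the prefix did.
            None => return false,
            // A literal prefix segment must match exactly; "*" matches anything.
            Some(segment) if prefix_segment != "*" && prefix_segment != segment => {
                return false;
            }
            Some(_) => {}
        }
    }
    true
}

fn main() {
    let wildcard = "/v1/publishers/*/books";
    println!("{}", prefix_matches_zip(wildcard, "/v1/publishers/foo/books"));
    println!("{}", prefix_matches_zip("/v1/publishers", "/v1"));
}
```

This trades the explicit next() call and let-else of the loop version for a single iterator pipeline; both versions compare the same segments in the same order.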

Day 3 Morning Exercises

A Simple GUI Library

(back to exercise)

// ANCHOR: setup
pub trait Widget {
    /// Natural width of `self`.
    fn width(&self) -> usize;

    /// Draw the widget into a buffer.
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write);

    /// Draw the widget on standard output.
    fn draw(&self) {
        let mut buffer = String::new();
        self.draw_into(&mut buffer);
        println!("{buffer}");
    }
}

pub struct Label {
    label: String,
}

impl Label {
    fn new(label: &str) -> Label {
        Label {
            label: label.to_owned(),
        }
    }
}

pub struct Button {
    label: Label,
    callback: Box<dyn FnMut()>,
}

impl Button {
    fn new(label: &str, callback: Box<dyn FnMut()>) -> Button {
        Button {
            label: Label::new(label),
            callback,
        }
    }
}

pub struct Window {
    title: String,
    widgets: Vec<Box<dyn Widget>>,
}

impl Window {
    fn new(title: &str) -> Window {
        Window {
            title: title.to_owned(),
            widgets: Vec::new(),
        }
    }

    fn add_widget(&mut self, widget: Box<dyn Widget>) {
        self.widgets.push(widget);
    }

    fn inner_width(&self) -> usize {
        std::cmp::max(
            self.title.chars().count(),
            self.widgets.iter().map(|w| w.width()).max().unwrap_or(0),
        )
    }
}

// ANCHOR_END: setup

// ANCHOR: Window-width
impl Widget for Window {
    fn width(&self) -> usize {
        // ANCHOR_END: Window-width
        // Add 4 columns for the borders and their padding.
        self.inner_width() + 4
    }

    // ANCHOR: Window-draw_into
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        // ANCHOR_END: Window-draw_into
        let mut inner = String::new();
        for widget in &self.widgets {
            widget.draw_into(&mut inner);
        }

        let inner_width = self.inner_width();

        // TODO: after learning about error handling, you can change
        // draw_into to return Result<(), std::fmt::Error>. Then use
        // the ?-operator here instead of .unwrap().
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
        writeln!(buffer, "| {:^inner_width$} |", &self.title).unwrap();
        writeln!(buffer, "+={:=<inner_width$}=+", "").unwrap();
        for line in inner.lines() {
            writeln!(buffer, "| {:inner_width$} |", line).unwrap();
        }
        writeln!(buffer, "+-{:-<inner_width$}-+", "").unwrap();
    }
}

// ANCHOR: Button-width
impl Widget for Button {
    fn width(&self) -> usize {
        // ANCHOR_END: Button-width
        self.label.width() + 8 // add a bit of padding
    }

    // ANCHOR: Button-draw_into
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        // ANCHOR_END: Button-draw_into
        let width = self.width();
        let mut label = String::new();
        self.label.draw_into(&mut label);

        writeln!(buffer, "+{:-<width$}+", "").unwrap();
        for line in label.lines() {
            writeln!(buffer, "|{:^width$}|", &line).unwrap();
        }
        writeln!(buffer, "+{:-<width$}+", "").unwrap();
    }
}

// ANCHOR: Label-width
impl Widget for Label {
    fn width(&self) -> usize {
        // ANCHOR_END: Label-width
        self.label
            .lines()
            .map(|line| line.chars().count())
            .max()
            .unwrap_or(0)
    }

    // ANCHOR: Label-draw_into
    fn draw_into(&self, buffer: &mut dyn std::fmt::Write) {
        // ANCHOR_END: Label-draw_into
        writeln!(buffer, "{}", &self.label).unwrap();
    }
}

// ANCHOR: main
fn main() {
    let mut window = Window::new("Rust GUI Demo 1.23");
    window.add_widget(Box::new(Label::new("This is a small text GUI demo.")));
    window.add_widget(Box::new(Button::new(
        "Click me!",
        Box::new(|| println!("You clicked the button!")),
    )));
    window.draw();
}
// ANCHOR_END: main
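
The TODO in `Window::draw_into` suggests returning a `Result` once error handling has been covered. The following is a minimal sketch of that change, not the course's official solution. It assumes a modified `Widget` trait whose `draw_into` returns `std::fmt::Result`; the `render` helper and `FailingWriter` type are hypothetical and exist only to exercise the error path.

```rust
use std::fmt::{self, Write};

// Hypothetical variant of the exercise's `Widget` trait: `draw_into` returns
// `fmt::Result` so callers decide how to handle a failed write.
pub trait Widget {
    /// Natural width of `self`.
    fn width(&self) -> usize;

    /// Draw the widget into a buffer, propagating formatting errors.
    fn draw_into(&self, buffer: &mut dyn Write) -> fmt::Result;
}

pub struct Label {
    label: String,
}

impl Label {
    pub fn new(label: &str) -> Label {
        Label { label: label.to_owned() }
    }
}

impl Widget for Label {
    fn width(&self) -> usize {
        self.label.lines().map(|line| line.chars().count()).max().unwrap_or(0)
    }

    fn draw_into(&self, buffer: &mut dyn Write) -> fmt::Result {
        writeln!(buffer, "{}", self.label)
    }
}

pub struct Window {
    title: String,
    widgets: Vec<Box<dyn Widget>>,
}

impl Window {
    pub fn new(title: &str) -> Window {
        Window { title: title.to_owned(), widgets: Vec::new() }
    }

    pub fn add_widget(&mut self, widget: Box<dyn Widget>) {
        self.widgets.push(widget);
    }

    fn inner_width(&self) -> usize {
        let widest = self.widgets.iter().map(|w| w.width()).max().unwrap_or(0);
        widest.max(self.title.chars().count())
    }
}

impl Widget for Window {
    fn width(&self) -> usize {
        self.inner_width() + 4
    }

    fn draw_into(&self, buffer: &mut dyn Write) -> fmt::Result {
        let mut inner = String::new();
        for widget in &self.widgets {
            // `?` replaces `.unwrap()`: the first failed write ends the call.
            widget.draw_into(&mut inner)?;
        }

        let inner_width = self.inner_width();
        writeln!(buffer, "+-{:-<inner_width$}-+", "")?;
        writeln!(buffer, "| {:^inner_width$} |", self.title)?;
        writeln!(buffer, "+={:=<inner_width$}=+", "")?;
        for line in inner.lines() {
            writeln!(buffer, "| {:inner_width$} |", line)?;
        }
        writeln!(buffer, "+-{:-<inner_width$}-+", "")
    }
}

/// Hypothetical helper: render any widget into a fresh `String`.
pub fn render(widget: &dyn Widget) -> Result<String, fmt::Error> {
    let mut out = String::new();
    widget.draw_into(&mut out)?;
    Ok(out)
}

/// A writer that always fails, used to show that errors now reach the caller.
pub struct FailingWriter;

impl Write for FailingWriter {
    fn write_str(&mut self, _s: &str) -> fmt::Result {
        Err(fmt::Error)
    }
}

fn main() {
    let mut window = Window::new("Hello");
    window.add_widget(Box::new(Label::new("World")));
    print!("{}", render(&window).expect("writing to a String should not fail"));
    assert!(window.draw_into(&mut FailingWriter).is_err());
}
```

With this signature the provided `draw` method would also need to decide what to do with the error, for example by returning it or by keeping a single `unwrap` at the outermost call.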

Points and Polygons

(back to exercise)

#[derive(Debug, Copy, Clone, PartialEq, Eq)]
// ANCHOR: Point
pub struct Point {
    // ANCHOR_END: Point
    x: i32,
    y: i32,
}

// ANCHOR: Point-impl
impl Point {
    // ANCHOR_END: Point-impl
    pub fn new(x: i32, y: i32) -> Point {
        Point { x, y }
    }

    pub fn magnitude(self) -> f64 {
        f64::from(self.x.pow(2) + self.y.pow(2)).sqrt()
    }

    pub fn dist(self, other: Point) -> f64 {
        (self - other).magnitude()
    }
}

impl std::ops::Add for Point {
    type Output = Self;

    fn add(self, other: Self) -> Self::Output {
        Self {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

impl std::ops::Sub for Point {
    type Output = Self;

    fn sub(self, other: Self) -> Self::Output {
        Self {
            x: self.x - other.x,
            y: self.y - other.y,
        }
    }
}

// ANCHOR: Polygon
pub struct Polygon {
    // ANCHOR_END: Polygon
    points: Vec<Point>,
}

// ANCHOR: Polygon-impl
impl Polygon {
    // ANCHOR_END: Polygon-impl
    pub fn new() -> Polygon {
        Polygon { points: Vec::new() }
    }

    pub fn add_point(&mut self, point: Point) {
        self.points.push(point);
    }

    pub fn left_most_point(&self) -> Option<Point> {
        self.points.iter().min_by_key(|p| p.x).copied()
    }

    pub fn iter(&self) -> impl Iterator<Item = &Point> {
        self.points.iter()
    }

    pub fn length(&self) -> f64 {
        if self.points.is_empty() {
            return 0.0;
        }

        let mut result = 0.0;
        let mut last_point = self.points[0];
        for point in &self.points[1..] {
            result += last_point.dist(*point);
            last_point = *point;
        }
        result += last_point.dist(self.points[0]);
        result
        // Alternatively, Iterator::zip() lets us iterate over the points as
        // pairs, but we need to pair each point with the next one, and the
        // last point with the first point. Because the zip() iterator finishes
        // as soon as one of its source iterators finishes, a neat trick is to
        // build the second iterator with Iterator::cycle and Iterator::skip,
        // then use map and sum to calculate the total length.
    }
}

// ANCHOR: Circle
pub struct Circle {
    // ANCHOR_END: Circle
    center: Point,
    radius: i32,
}

// ANCHOR: Circle-impl
impl Circle {
    // ANCHOR_END: Circle-impl
    pub fn new(center: Point, radius: i32) -> Circle {
        Circle { center, radius }
    }

    pub fn circumference(&self) -> f64 {
        2.0 * std::f64::consts::PI * f64::from(self.radius)
    }

    pub fn dist(&self, other: &Self) -> f64 {
        self.center.dist(other.center)
    }
}

// ANCHOR: Shape
pub enum Shape {
    Polygon(Polygon),
    Circle(Circle),
}
// ANCHOR_END: Shape

impl From<Polygon> for Shape {
    fn from(poly: Polygon) -> Self {
        Shape::Polygon(poly)
    }
}

impl From<Circle> for Shape {
    fn from(circle: Circle) -> Self {
        Shape::Circle(circle)
    }
}

impl Shape {
    pub fn perimeter(&self) -> f64 {
        match self {
            Shape::Polygon(poly) => poly.length(),
            Shape::Circle(circle) => circle.circumference(),
        }
    }
}

// ANCHOR: unit-tests
#[cfg(test)]
mod tests {
    use super::*;

    fn round_two_digits(x: f64) -> f64 {
        (x * 100.0).round() / 100.0
    }

    #[test]
    fn test_point_magnitude() {
        let p1 = Point::new(12, 13);
        assert_eq!(round_two_digits(p1.magnitude()), 17.69);
    }

    #[test]
    fn test_point_dist() {
        let p1 = Point::new(10, 10);
        let p2 = Point::new(14, 13);
        assert_eq!(round_two_digits(p1.dist(p2)), 5.00);
    }

    #[test]
    fn test_point_add() {
        let p1 = Point::new(16, 16);
        let p2 = p1 + Point::new(-4, 3);
        assert_eq!(p2, Point::new(12, 19));
    }

    #[test]
    fn test_polygon_left_most_point() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);
        assert_eq!(poly.left_most_point(), Some(p1));
    }

    #[test]
    fn test_polygon_iter() {
        let p1 = Point::new(12, 13);
        let p2 = Point::new(16, 16);

        let mut poly = Polygon::new();
        poly.add_point(p1);
        poly.add_point(p2);

        let points = poly.iter().cloned().collect::<Vec<_>>();
        assert_eq!(points, vec![Point::new(12, 13), Point::new(16, 16)]);
    }

    #[test]
    fn test_shape_perimeters() {
        let mut poly = Polygon::new();
        poly.add_point(Point::new(12, 13));
        poly.add_point(Point::new(17, 11));
        poly.add_point(Point::new(16, 16));
        let shapes = vec![
            Shape::from(poly),
            Shape::from(Circle::new(Point::new(10, 20), 5)),
        ];
        let perimeters = shapes
            .iter()
            .map(Shape::perimeter)
            .map(round_two_digits)
            .collect::<Vec<_>>();
        assert_eq!(perimeters, vec![15.48, 31.42]);
    }
}
// ANCHOR_END: unit-tests

fn main() {}
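
The closing comment in `Polygon::length` describes an iterator-based alternative without showing it. A minimal sketch of that idea follows, assuming a hypothetical free function `closed_path_length` that takes a slice of points instead of a `Polygon`. The `Point` type is a trimmed copy of the one above, with `dist` computed via `f64::hypot`.

```rust
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
pub struct Point {
    x: i32,
    y: i32,
}

impl Point {
    pub fn new(x: i32, y: i32) -> Point {
        Point { x, y }
    }

    pub fn dist(self, other: Point) -> f64 {
        let dx = f64::from(self.x - other.x);
        let dy = f64::from(self.y - other.y);
        dx.hypot(dy)
    }
}

/// Hypothetical free-function version of `Polygon::length`.
pub fn closed_path_length(points: &[Point]) -> f64 {
    points
        .iter()
        // `cycle().skip(1)` yields points[1], points[2], ..., then wraps
        // around to points[0]; zip stops once the first iterator is done.
        .zip(points.iter().cycle().skip(1))
        .map(|(a, b)| a.dist(*b))
        .sum()
}

fn main() {
    let triangle = [Point::new(12, 13), Point::new(17, 11), Point::new(16, 16)];
    println!("{:.2}", closed_path_length(&triangle)); // prints 15.48
}
```

An empty slice yields an empty zip and therefore a length of zero, so the explicit `is_empty` check from the loop version is not needed here.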

Day 3: Afternoon Exercises

Safe FFI Wrapper

(back to exercise)

// ANCHOR: ffi
mod ffi {
    use std::os::raw::{c_char, c_int};
    #[cfg(not(target_os = "macos"))]
    use std::os::raw::{c_long, c_ulong, c_ushort, c_uchar};

    // Opaque type. See https://doc.rust-lang.org/nomicon/ffi.html.
    #[repr(C)]
    pub struct DIR {
        _data: [u8; 0],
        _marker: core::marker::PhantomData<(*mut u8, core::marker::PhantomPinned)>,
    }

    // Layout according to the Linux man page for readdir(3), where ino_t and
    // off_t are resolved according to the definitions in
    // /usr/include/x86_64-linux-gnu/{sys/types.h, bits/typesizes.h}.
    #[cfg(not(target_os = "macos"))]
    #[repr(C)]
    pub struct dirent {
        pub d_ino: c_ulong,
        pub d_off: c_long,
        pub d_reclen: c_ushort,
        pub d_type: c_uchar,
        pub d_name: [c_char; 256],
    }

    // Layout according to the macOS man page for dir(5).
    #[cfg(target_os = "macos")]
    #[repr(C)]
    pub struct dirent {
        pub d_fileno: u64,
        pub d_seekoff: u64,
        pub d_reclen: u16,
        pub d_namlen: u16,
        pub d_type: u8,
        pub d_name: [c_char; 1024],
    }

    extern "C" {
        pub fn opendir(s: *const c_char) -> *mut DIR;

        #[cfg(not(all(target_os = "macos", target_arch = "x86_64")))]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        // See https://github.com/rust-lang/libc/issues/414 and the section on
        // _DARWIN_FEATURE_64_BIT_INODE in the macOS man page for stat(2).
        //
        // "Platforms that existed before these updates were available" refers
        // to macOS (as opposed to iOS / watchOS / etc.) on Intel and PowerPC.
        #[cfg(all(target_os = "macos", target_arch = "x86_64"))]
        #[link_name = "readdir$INODE64"]
        pub fn readdir(s: *mut DIR) -> *const dirent;

        pub fn closedir(s: *mut DIR) -> c_int;
    }
}

use std::ffi::{CStr, CString, OsStr, OsString};
use std::os::unix::ffi::OsStrExt;

#[derive(Debug)]
struct DirectoryIterator {
    path: CString,
    dir: *mut ffi::DIR,
}
// ANCHOR_END: ffi

// ANCHOR: DirectoryIterator
impl DirectoryIterator {
    fn new(path: &str) -> Result<DirectoryIterator, String> {
        // Call opendir and return an Ok value if that worked,
        // otherwise return Err with a message.
        // ANCHOR_END: DirectoryIterator
        let path = CString::new(path).map_err(|err| format!("Invalid path: {err}"))?;
        // SAFETY: path.as_ptr() cannot be NULL.
        let dir = unsafe { ffi::opendir(path.as_ptr()) };
        if dir.is_null() {
            Err(format!("Could not open {:?}", path))
        } else {
            Ok(DirectoryIterator { path, dir })
        }
    }
}

// ANCHOR: Iterator
impl Iterator for DirectoryIterator {
    type Item = OsString;
    fn next(&mut self) -> Option<OsString> {
        // Keep calling readdir until we get a NULL pointer back.
        // ANCHOR_END: Iterator
        // SAFETY: self.dir is never NULL.
        let dirent = unsafe { ffi::readdir(self.dir) };
        if dirent.is_null() {
            // We have reached the end of the directory.
            return None;
        }
        // SAFETY: dirent is not NULL and dirent.d_name is NUL
        // terminated.
        let d_name = unsafe { CStr::from_ptr((*dirent).d_name.as_ptr()) };
        let os_str = OsStr::from_bytes(d_name.to_bytes());
        Some(os_str.to_owned())
    }
}

// ANCHOR: Drop
impl Drop for DirectoryIterator {
    fn drop(&mut self) {
        // Call closedir as needed.
        // ANCHOR_END: Drop
        if !self.dir.is_null() {
            // SAFETY: self.dir is not NULL.
            if unsafe { ffi::closedir(self.dir) } != 0 {
                panic!("Could not close {:?}", self.path);
            }
        }
    }
}

// ANCHOR: main
fn main() -> Result<(), String> {
    let iter = DirectoryIterator::new(".")?;
    println!("files: {:#?}", iter.collect::<Vec<_>>());
    Ok(())
}
// ANCHOR_END: main

#[cfg(test)]
mod tests {
    use super::*;
    use std::error::Error;

    #[test]
    fn test_nonexisting_directory() {
        let iter = DirectoryIterator::new("no-such-directory");
        assert!(iter.is_err());
    }

    #[test]
    fn test_empty_directory() -> Result<(), Box<dyn Error>> {
        let tmp = tempfile::TempDir::new()?;
        let iter = DirectoryIterator::new(
            tmp.path().to_str().ok_or("Non UTF-8 character in path")?,
        )?;
        let mut entries = iter.collect::<Vec<_>>();
        entries.sort();
        assert_eq!(entries, &[".", ".."]);
        Ok(())
    }

    #[test]
    fn test_nonempty_directory() -> Result<(), Box<dyn Error>> {
        let tmp = tempfile::TempDir::new()?;
        std::fs::write(tmp.path().join("foo.txt"), "The Foo Diaries\n")?;
        std::fs::write(tmp.path().join("bar.png"), "<PNG>\n")?;
        std::fs::write(tmp.path().join("crab.rs"), "//! Crab\n")?;
        let iter = DirectoryIterator::new(
            tmp.path().to_str().ok_or("Non UTF-8 character in path")?,
        )?;
        let mut entries = iter.collect::<Vec<_>>();
        entries.sort();
        assert_eq!(entries, &[".", "..", "bar.png", "crab.rs", "foo.txt"]);
        Ok(())
    }
}
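
The first step of `DirectoryIterator::new` converts a Rust string into a NUL-terminated C string, and `next` borrows a C string back from a raw pointer. The sketch below isolates those two conversions so they can be tried without calling into libc. The helpers `to_c_path` and `c_bytes` are hypothetical names introduced for illustration.

```rust
use std::ffi::{CStr, CString};

/// Hypothetical helper mirroring the path conversion in
/// `DirectoryIterator::new`.
fn to_c_path(path: &str) -> Result<CString, String> {
    CString::new(path).map_err(|err| format!("Invalid path: {err}"))
}

/// Copies the bytes of a C string (without the trailing NUL) into a `Vec`.
fn c_bytes(s: &CStr) -> Vec<u8> {
    s.to_bytes().to_vec()
}

fn main() {
    let c_path = to_c_path("/tmp").unwrap();
    // CString appends the terminating NUL that C APIs such as opendir expect.
    println!("{:?}", c_path.as_bytes_with_nul());

    // SAFETY: c_path.as_ptr() points to a valid NUL-terminated string that
    // outlives this borrow.
    let borrowed = unsafe { CStr::from_ptr(c_path.as_ptr()) };
    println!("{:?}", c_bytes(borrowed));

    // An interior NUL byte cannot be represented, so the conversion fails.
    println!("{:?}", to_c_path("bad\0path"));
}
```

This is why `new` returns `Err` for such a path before `opendir` is ever called.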

Bare Metal Rust: Morning Exercise

Compass

(back to exercise)

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: top
#![no_main]
#![no_std]

extern crate panic_halt as _;

use core::fmt::Write;
use cortex_m_rt::entry;
// ANCHOR_END: top
use core::cmp::{max, min};
use lsm303agr::{AccelOutputDataRate, Lsm303agr, MagOutputDataRate};
use microbit::display::blocking::Display;
use microbit::hal::prelude::*;
use microbit::hal::twim::Twim;
use microbit::hal::uarte::{Baudrate, Parity, Uarte};
use microbit::hal::Timer;
use microbit::pac::twim0::frequency::FREQUENCY_A;
use microbit::Board;

const COMPASS_SCALE: i32 = 30000;
const ACCELEROMETER_SCALE: i32 = 700;

// ANCHOR: main
#[entry]
fn main() -> ! {
    let board = Board::take().unwrap();

    // Configure serial port.
    let mut serial = Uarte::new(
        board.UARTE0,
        board.uart.into(),
        Parity::EXCLUDED,
        Baudrate::BAUD115200,
    );

    // Set up the I2C controller and Inertial Measurement Unit.
    // ANCHOR_END: main
    writeln!(serial, "Setting up IMU...").unwrap();
    let i2c = Twim::new(board.TWIM0, board.i2c_internal.into(), FREQUENCY_A::K100);
    let mut imu = Lsm303agr::new_with_i2c(i2c);
    imu.init().unwrap();
    imu.set_mag_odr(MagOutputDataRate::Hz50).unwrap();
    imu.set_accel_odr(AccelOutputDataRate::Hz50).unwrap();
    let mut imu = imu.into_mag_continuous().ok().unwrap();

    // Set up display and timer.
    let mut timer = Timer::new(board.TIMER0);
    let mut display = Display::new(board.display_pins);

    let mut mode = Mode::Compass;
    let mut button_pressed = false;

    // ANCHOR: loop
    writeln!(serial, "Ready.").unwrap();

    loop {
        // Read compass data and log it to the serial port.
        // ANCHOR_END: loop
        while !(imu.mag_status().unwrap().xyz_new_data
            && imu.accel_status().unwrap().xyz_new_data)
        {}
        let compass_reading = imu.mag_data().unwrap();
        let accelerometer_reading = imu.accel_data().unwrap();
        writeln!(
            serial,
            "{},{},{}\t{},{},{}",
            compass_reading.x,
            compass_reading.y,
            compass_reading.z,
            accelerometer_reading.x,
            accelerometer_reading.y,
            accelerometer_reading.z,
        )
        .unwrap();

        let mut image = [[0; 5]; 5];
        let (x, y) = match mode {
            Mode::Compass => (
                scale(-compass_reading.x, -COMPASS_SCALE, COMPASS_SCALE, 0, 4) as usize,
                scale(compass_reading.y, -COMPASS_SCALE, COMPASS_SCALE, 0, 4) as usize,
            ),
            Mode::Accelerometer => (
                scale(
                    accelerometer_reading.x,
                    -ACCELEROMETER_SCALE,
                    ACCELEROMETER_SCALE,
                    0,
                    4,
                ) as usize,
                scale(
                    -accelerometer_reading.y,
                    -ACCELEROMETER_SCALE,
                    ACCELEROMETER_SCALE,
                    0,
                    4,
                ) as usize,
            ),
        };
        image[y][x] = 255;
        display.show(&mut timer, image, 100);

        // If button A is pressed, switch to the next mode and briefly blink all LEDs on.
        if board.buttons.button_a.is_low().unwrap() {
            if !button_pressed {
                mode = mode.next();
                display.show(&mut timer, [[255; 5]; 5], 200);
            }
            button_pressed = true;
        } else {
            button_pressed = false;
        }
    }
}

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum Mode {
    Compass,
    Accelerometer,
}

impl Mode {
    fn next(self) -> Self {
        match self {
            Self::Compass => Self::Accelerometer,
            Self::Accelerometer => Self::Compass,
        }
    }
}

fn scale(value: i32, min_in: i32, max_in: i32, min_out: i32, max_out: i32) -> i32 {
    let range_in = max_in - min_in;
    let range_out = max_out - min_out;
    cap(
        min_out + range_out * (value - min_in) / range_in,
        min_out,
        max_out,
    )
}

fn cap(value: i32, min_value: i32, max_value: i32) -> i32 {
    max(min_value, min(value, max_value))
}
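
The `scale` and `cap` helpers are plain integer arithmetic, so their behavior can be checked on a host machine without a micro:bit. The sketch below copies both functions unchanged and feeds them a few sample magnetometer readings (made-up values, not measurements) to show how each reading maps to an LED column in the range 0 to 4.

```rust
use std::cmp::{max, min};

fn scale(value: i32, min_in: i32, max_in: i32, min_out: i32, max_out: i32) -> i32 {
    let range_in = max_in - min_in;
    let range_out = max_out - min_out;
    cap(
        min_out + range_out * (value - min_in) / range_in,
        min_out,
        max_out,
    )
}

fn cap(value: i32, min_value: i32, max_value: i32) -> i32 {
    max(min_value, min(value, max_value))
}

fn main() {
    const COMPASS_SCALE: i32 = 30000;
    // Sample readings chosen for illustration, including two that fall
    // outside the expected input range.
    for reading in [-40000, -30000, -15000, 0, 15000, 30000, 40000] {
        let column = scale(reading, -COMPASS_SCALE, COMPASS_SCALE, 0, 4);
        println!("{reading:>7} -> column {column}");
    }
    // Columns printed, in order: 0, 0, 1, 2, 3, 4, 4.
}
```

Capping matters because the result is cast to `usize` and used as an array index: an out-of-range reading would otherwise produce a negative value or an index past the 5x5 image. The standard library's `i32::clamp` computes the same result as `cap` when the minimum does not exceed the maximum.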

Bare Metal Rust: Afternoon Exercise

RTC driver

(back to exercise)

main.rs

// ANCHOR: top
#![no_main]
#![no_std]

mod exceptions;
mod logger;
mod pl011;
// ANCHOR_END: top
mod pl031;

use crate::pl031::Rtc;
use arm_gic::gicv3::{IntId, Trigger};
use arm_gic::{irq_enable, wfi};
use chrono::{TimeZone, Utc};
use core::hint::spin_loop;
// ANCHOR: imports
use crate::pl011::Uart;
use arm_gic::gicv3::GicV3;
use core::panic::PanicInfo;
use log::{error, info, trace, LevelFilter};
use smccc::psci::system_off;
use smccc::Hvc;

/// Base addresses of the GICv3.
const GICD_BASE_ADDRESS: *mut u64 = 0x800_0000 as _;
const GICR_BASE_ADDRESS: *mut u64 = 0x80A_0000 as _;

/// Base address of the primary PL011 UART.
const PL011_BASE_ADDRESS: *mut u32 = 0x900_0000 as _;
// ANCHOR_END: imports

/// Base address of the PL031 RTC.
const PL031_BASE_ADDRESS: *mut u32 = 0x901_0000 as _;
/// The IRQ used by the PL031 RTC.
const PL031_IRQ: IntId = IntId::spi(2);

// ANCHOR: main
#[no_mangle]
extern "C" fn main(x0: u64, x1: u64, x2: u64, x3: u64) {
    // Safe because `PL011_BASE_ADDRESS` is the base address of a PL011 device,
    // and nothing else accesses that address range.
    let uart = unsafe { Uart::new(PL011_BASE_ADDRESS) };
    logger::init(uart, LevelFilter::Trace).unwrap();

    info!("main({:#x}, {:#x}, {:#x}, {:#x})", x0, x1, x2, x3);

    // Safe because `GICD_BASE_ADDRESS` and `GICR_BASE_ADDRESS` are the base
    // addresses of a GICv3 distributor and redistributor respectively, and
    // nothing else accesses those address ranges.
    let mut gic = unsafe { GicV3::new(GICD_BASE_ADDRESS, GICR_BASE_ADDRESS) };
    gic.setup();
    // ANCHOR_END: main

    // Safe because `PL031_BASE_ADDRESS` is the base address of a PL031 device,
    // and nothing else accesses that address range.
    let mut rtc = unsafe { Rtc::new(PL031_BASE_ADDRESS) };
    let timestamp = rtc.read();
    let time = Utc.timestamp_opt(timestamp.into(), 0).unwrap();
    info!("RTC: {time}");

    GicV3::set_priority_mask(0xff);
    gic.set_interrupt_priority(PL031_IRQ, 0x80);
    gic.set_trigger(PL031_IRQ, Trigger::Level);
    irq_enable();
    gic.enable_interrupt(PL031_IRQ, true);

    // Wait for 3 seconds, without interrupts.
    let target = timestamp + 3;
    rtc.set_match(target);
    info!(
        "Waiting for {}",
        Utc.timestamp_opt(target.into(), 0).unwrap()
    );
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    while !rtc.matched() {
        spin_loop();
    }
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    info!("Finished waiting");

    // Wait another 3 seconds for an interrupt.
    let target = timestamp + 6;
    info!(
        "Waiting for {}",
        Utc.timestamp_opt(target.into(), 0).unwrap()
    );
    rtc.set_match(target);
    rtc.clear_interrupt();
    rtc.enable_interrupt(true);
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    while !rtc.interrupt_pending() {
        wfi();
    }
    trace!(
        "matched={}, interrupt_pending={}",
        rtc.matched(),
        rtc.interrupt_pending()
    );
    info!("Finished waiting");

    // ANCHOR: main_end
    system_off::<Hvc>().unwrap();
}

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    error!("{info}");
    system_off::<Hvc>().unwrap();
    loop {}
}
// ANCHOR_END: main_end

pl031.rs

use core::ptr::{addr_of, addr_of_mut};

#[repr(C, align(4))]
struct Registers {
    /// Data register
    dr: u32,
    /// Match register
    mr: u32,
    /// Load register
    lr: u32,
    /// Control register
    cr: u8,
    _reserved0: [u8; 3],
    /// Interrupt Mask Set or Clear register
    imsc: u8,
    _reserved1: [u8; 3],
    /// Raw Interrupt Status
    ris: u8,
    _reserved2: [u8; 3],
    /// Masked Interrupt Status
    mis: u8,
    _reserved3: [u8; 3],
    /// Interrupt Clear Register
    icr: u8,
    _reserved4: [u8; 3],
}

/// Driver for a PL031 real-time clock.
#[derive(Debug)]
pub struct Rtc {
    registers: *mut Registers,
}

impl Rtc {
    /// Constructs a new instance of the RTC driver for a PL031 device at the
    /// given base address.
    ///
    /// # Safety
    ///
    /// The given base address must point to the MMIO control registers of a
    /// PL031 device, which must be mapped into the address space of the process
    /// as device memory and not have any other aliases.
    pub unsafe fn new(base_address: *mut u32) -> Self {
        Self {
            registers: base_address as *mut Registers,
        }
    }

    /// Reads the current RTC value.
    pub fn read(&self) -> u32 {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of!((*self.registers).dr).read_volatile() }
    }

    /// Writes a match value. When the RTC value matches it, an interrupt is
    /// generated (if the interrupt is enabled).
    pub fn set_match(&mut self, value: u32) {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).mr).write_volatile(value) }
    }

    /// Returns whether the match register matches the RTC value, whether or not
    /// the interrupt is enabled.
    pub fn matched(&self) -> bool {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        let ris = unsafe { addr_of!((*self.registers).ris).read_volatile() };
        (ris & 0x01) != 0
    }

    /// Returns whether there is currently an interrupt pending.
    ///
    /// This should be true if and only if `matched` returns true and the
    /// interrupt is enabled, i.e. its mask bit is set.
    pub fn interrupt_pending(&self) -> bool {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        let mis = unsafe { addr_of!((*self.registers).mis).read_volatile() };
        (mis & 0x01) != 0
    }

    /// Sets or clears the interrupt mask.
    ///
    /// When the mask is true the interrupt is enabled; when it is false the
    /// interrupt is disabled.
    pub fn enable_interrupt(&mut self, mask: bool) {
        let imsc = if mask { 0x01 } else { 0x00 };
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).imsc).write_volatile(imsc) }
    }

    /// Clears a pending interrupt, if any.
    pub fn clear_interrupt(&mut self) {
        // Safe because we know that self.registers points to the control
        // registers of a PL031 device which is appropriately mapped.
        unsafe { addr_of_mut!((*self.registers).icr).write_volatile(0x01) }
    }
}

// Safe because it just contains a pointer to device memory, which can be
// accessed from any context.
unsafe impl Send for Rtc {}
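
The driver is only correct if the `Registers` struct places each field at the offset the hardware expects. The sketch below checks that layout on a host machine, assuming a copy of the struct with `Default` derived purely so an instance can be constructed; `register_offsets` is a hypothetical helper. The offsets it prints follow from `#[repr(C)]` and the field sizes, and should be compared against the register summary in the PL031 technical reference manual.

```rust
use core::mem::{align_of, size_of};
use core::ptr::addr_of;

// Same field layout as `Registers` in pl031.rs. `Default` is derived here
// only so that an instance can be created on the host.
#[allow(dead_code)]
#[derive(Default)]
#[repr(C, align(4))]
struct Registers {
    dr: u32,
    mr: u32,
    lr: u32,
    cr: u8,
    _reserved0: [u8; 3],
    imsc: u8,
    _reserved1: [u8; 3],
    ris: u8,
    _reserved2: [u8; 3],
    mis: u8,
    _reserved3: [u8; 3],
    icr: u8,
    _reserved4: [u8; 3],
}

/// Byte offset of each named register from the start of the block.
fn register_offsets() -> [(&'static str, usize); 8] {
    let regs = Registers::default();
    let base = addr_of!(regs) as usize;
    [
        ("dr", addr_of!(regs.dr) as usize - base),
        ("mr", addr_of!(regs.mr) as usize - base),
        ("lr", addr_of!(regs.lr) as usize - base),
        ("cr", addr_of!(regs.cr) as usize - base),
        ("imsc", addr_of!(regs.imsc) as usize - base),
        ("ris", addr_of!(regs.ris) as usize - base),
        ("mis", addr_of!(regs.mis) as usize - base),
        ("icr", addr_of!(regs.icr) as usize - base),
    ]
}

fn main() {
    for (name, offset) in register_offsets() {
        println!("{name:>4}: {offset:#04x}");
    }
    println!(
        "size = {:#x}, align = {}",
        size_of::<Registers>(),
        align_of::<Registers>()
    );
}
```

The three-byte `_reserved` arrays are what keep each one-byte register on its own 4-byte boundary; removing one would shift every later register.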

Concurrency: Morning Exercise

Dining Philosophers

(back to exercise)

// ANCHOR: Philosopher
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

struct Fork;

struct Philosopher {
    name: String,
    // ANCHOR_END: Philosopher
    left_fork: Arc<Mutex<Fork>>,
    right_fork: Arc<Mutex<Fork>>,
    thoughts: mpsc::SyncSender<String>,
}

// ANCHOR: Philosopher-think
impl Philosopher {
    fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .unwrap();
    }
    // ANCHOR_END: Philosopher-think

    // ANCHOR: Philosopher-eat
    fn eat(&self) {
        // ANCHOR_END: Philosopher-eat
        println!("{} is trying to eat", &self.name);
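        // Bind the guards to named variables so both forks stay locked until
        // the end of this method; a bare `_` would release them immediately.
        let _left = self.left_fork.lock().unwrap();
        let _right = self.right_fork.lock().unwrap();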

        // ANCHOR: Philosopher-eat-end
        println!("{} is eating...", &self.name);
        thread::sleep(Duration::from_millis(10));
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Plato", "Aristotle", "Thales", "Pythagoras"];

fn main() {
    // ANCHOR_END: Philosopher-eat-end
    let (tx, rx) = mpsc::sync_channel(10);

    let forks = (0..PHILOSOPHERS.len())
        .map(|_| Arc::new(Mutex::new(Fork)))
        .collect::<Vec<_>>();

    for i in 0..forks.len() {
        let tx = tx.clone();
        let mut left_fork = Arc::clone(&forks[i]);
        let mut right_fork = Arc::clone(&forks[(i + 1) % forks.len()]);

        // To avoid a deadlock, we have to break the symmetry
        // somewhere. This will swap the forks without deinitializing
        // either of them.
        if i == forks.len() - 1 {
            std::mem::swap(&mut left_fork, &mut right_fork);
        }

        let philosopher = Philosopher {
            name: PHILOSOPHERS[i].to_string(),
            thoughts: tx,
            left_fork,
            right_fork,
        };

        thread::spawn(move || {
            for _ in 0..100 {
                philosopher.eat();
                philosopher.think();
            }
        });
    }

    drop(tx);
    for thought in rx {
        println!("{thought}");
    }
}
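
Swapping the forks of the last philosopher amounts to a lock ordering rule: every philosopher picks up the lower-numbered fork first. The sketch below expresses that rule directly. It is a simplified variant rather than the solution above: the `dine` function and its shared meal counter are assumptions added so the outcome can be checked, and the channel of thoughts is left out.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct Fork;

/// Runs `philosophers` threads that each eat `meals` times and returns the
/// total number of meals eaten. Hypothetical helper for illustration.
fn dine(philosophers: usize, meals: usize) -> usize {
    // With a single philosopher both forks would be the same mutex.
    assert!(philosophers >= 2, "need at least two philosophers");

    let forks: Vec<_> = (0..philosophers)
        .map(|_| Arc::new(Mutex::new(Fork)))
        .collect();
    let eaten = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..philosophers)
        .map(|i| {
            let left = i;
            let right = (i + 1) % philosophers;
            // Always lock the lower-numbered fork first. Only the last
            // philosopher ends up reversing left and right.
            let (first, second) = if left < right {
                (left, right)
            } else {
                (right, left)
            };
            let first = Arc::clone(&forks[first]);
            let second = Arc::clone(&forks[second]);
            let eaten = Arc::clone(&eaten);
            thread::spawn(move || {
                for _ in 0..meals {
                    let _first = first.lock().unwrap();
                    let _second = second.lock().unwrap();
                    *eaten.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let total = *eaten.lock().unwrap();
    total
}

fn main() {
    println!("meals eaten: {}", dine(5, 100));
}
```

Because all threads acquire locks in the same global order, no cycle of waiting threads can form, so every thread finishes and the total is `philosophers * meals`.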

Link Checker

(back to exercise)

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// ANCHOR: setup
use reqwest::{blocking::Client, Url};
use scraper::{Html, Selector};
use thiserror::Error;

#[derive(Error, Debug)]
enum Error {
    #[error("request error: {0}")]
    ReqwestError(#[from] reqwest::Error),
    #[error("bad http response: {0}")]
    BadResponse(String),
}
// ANCHOR_END: setup

// ANCHOR: visit_page
#[derive(Debug)]
struct CrawlCommand {
    url: Url,
    extract_links: bool,
}

fn visit_page(client: &Client, command: &CrawlCommand) -> Result<Vec<Url>, Error> {
    println!("Checking {:#}", command.url);
    let response = client.get(command.url.clone()).send()?;
    if !response.status().is_success() {
        return Err(Error::BadResponse(response.status().to_string()));
    }

    let mut link_urls = Vec::new();
    if !command.extract_links {
        return Ok(link_urls);
    }

    let base_url = response.url().to_owned();
    let body_text = response.text()?;
    let document = Html::parse_document(&body_text);

    let selector = Selector::parse("a").unwrap();
    let href_values = document
        .select(&selector)
        .filter_map(|element| element.value().attr("href"));
    for href in href_values {
        match base_url.join(href) {
            Ok(link_url) => {
                link_urls.push(link_url);
            }
            Err(err) => {
                println!("On {base_url:#}: ignored unparsable {href:?}: {err}");
            }
        }
    }
    Ok(link_urls)
}
// ANCHOR_END: visit_page

struct CrawlState {
    domain: String,
    visited_pages: std::collections::HashSet<String>,
}

impl CrawlState {
    fn new(start_url: &Url) -> CrawlState {
        let mut visited_pages = std::collections::HashSet::new();
        visited_pages.insert(start_url.as_str().to_string());
        CrawlState {
            domain: start_url.domain().unwrap().to_string(),
            visited_pages,
        }
    }

    /// Determine whether links within the given page should be extracted.
    fn should_extract_links(&self, url: &Url) -> bool {
        let Some(url_domain) = url.domain() else {
            return false;
        };
        url_domain == self.domain
    }

    /// Mark the given page as visited, returning false if it had already
    /// been visited.
    fn mark_visited(&mut self, url: &Url) -> bool {
        self.visited_pages.insert(url.as_str().to_string())
    }
}

type CrawlResult = Result<Vec<Url>, (Url, Error)>;
fn spawn_crawler_threads(
    command_receiver: mpsc::Receiver<CrawlCommand>,
    result_sender: mpsc::Sender<CrawlResult>,
    thread_count: u32,
) {
    let command_receiver = Arc::new(Mutex::new(command_receiver));

    for _ in 0..thread_count {
        let result_sender = result_sender.clone();
        let command_receiver = command_receiver.clone();
        thread::spawn(move || {
            let client = Client::new();
            loop {
                let command_result = {
                    let receiver_guard = command_receiver.lock().unwrap();
                    receiver_guard.recv()
                };
                let Ok(crawl_command) = command_result else {
                    // The sender got dropped. No more commands coming in.
                    break;
                };
                let crawl_result = match visit_page(&client, &crawl_command) {
                    Ok(link_urls) => Ok(link_urls),
                    Err(error) => Err((crawl_command.url, error)),
                };
                result_sender.send(crawl_result).unwrap();
            }
        });
    }
}

fn control_crawl(
    start_url: Url,
    command_sender: mpsc::Sender<CrawlCommand>,
    result_receiver: mpsc::Receiver<CrawlResult>,
) -> Vec<Url> {
    let mut crawl_state = CrawlState::new(&start_url);
    let start_command = CrawlCommand { url: start_url, extract_links: true };
    command_sender.send(start_command).unwrap();
    let mut pending_urls = 1;

    let mut bad_urls = Vec::new();
    while pending_urls > 0 {
        let crawl_result = result_receiver.recv().unwrap();
        pending_urls -= 1;

        match crawl_result {
            Ok(link_urls) => {
                for url in link_urls {
                    if crawl_state.mark_visited(&url) {
                        let extract_links = crawl_state.should_extract_links(&url);
                        let crawl_command = CrawlCommand { url, extract_links };
                        command_sender.send(crawl_command).unwrap();
                        pending_urls += 1;
                    }
                }
            }
            Err((url, error)) => {
                bad_urls.push(url);
                println!("Got crawling error: {:#}", error);
                continue;
            }
        }
    }
    bad_urls
}

fn check_links(start_url: Url) -> Vec<Url> {
    let (result_sender, result_receiver) = mpsc::channel::<CrawlResult>();
    let (command_sender, command_receiver) = mpsc::channel::<CrawlCommand>();
    spawn_crawler_threads(command_receiver, result_sender, 16);
    control_crawl(start_url, command_sender, result_receiver)
}

fn main() {
    let start_url = reqwest::Url::parse("https://www.google.org").unwrap();
    let bad_urls = check_links(start_url);
    println!("Bad URLs: {:#?}", bad_urls);
}
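
The core of `spawn_crawler_threads` is a pool of workers that share a single `mpsc::Receiver` behind an `Arc<Mutex<...>>`, because a `Receiver` cannot be cloned. Here is a stripped-down sketch of that pattern using only the standard library. The function `square_all` is hypothetical, and squaring a number stands in for visiting a page.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Squares every input on a small pool of worker threads.
/// (Hypothetical helper; squaring stands in for visiting a page.)
fn square_all(inputs: Vec<u64>, thread_count: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<u64>();
    let (result_tx, result_rx) = mpsc::channel::<u64>();
    // `Receiver` cannot be cloned, so the workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));

    for _ in 0..thread_count {
        let job_rx = Arc::clone(&job_rx);
        let result_tx = result_tx.clone();
        thread::spawn(move || loop {
            // The lock guard is a temporary, so it is released at the end
            // of this statement, before the job is processed.
            let job = job_rx.lock().unwrap().recv();
            let Ok(n) = job else {
                // The job sender was dropped: no more work is coming.
                break;
            };
            result_tx.send(n * n).unwrap();
        });
    }
    // Only the workers' clones of the result sender remain after this.
    drop(result_tx);

    for n in inputs {
        job_tx.send(n).unwrap();
    }
    // Dropping the job sender lets the workers leave their loops.
    drop(job_tx);

    // Sort so the result does not depend on thread scheduling.
    let mut results: Vec<u64> = result_rx.iter().collect();
    results.sort();
    results
}

fn main() {
    println!("{:?}", square_all(vec![1, 2, 3, 4], 2));
}
```

Unlike this sketch, the link checker cannot simply drop its command sender up front, since new commands are discovered while results arrive. That is why `control_crawl` tracks `pending_urls` instead.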

Concurrency: Afternoon Exercise

Dining Philosophers - Async

(back to exercise)

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: Philosopher
use std::sync::Arc;
use tokio::time;
use tokio::sync::mpsc::{self, Sender};
use tokio::sync::Mutex;

struct Fork;

struct Philosopher {
    name: String,
    // ANCHOR_END: Philosopher
    left_fork: Arc<Mutex<Fork>>,
    right_fork: Arc<Mutex<Fork>>,
    thoughts: Sender<String>,
}

// ANCHOR: Philosopher-think
impl Philosopher {
    async fn think(&self) {
        self.thoughts
            .send(format!("Eureka! {} has a new idea!", &self.name))
            .await
            .unwrap();
    }
    // ANCHOR_END: Philosopher-think

    // ANCHOR: Philosopher-eat
    async fn eat(&self) {
        // Pick up forks...
        // ANCHOR_END: Philosopher-eat
        let _first_lock = self.left_fork.lock().await;
        // Add a delay before picking the second fork to allow the execution
        // to transfer to another task
        time::sleep(time::Duration::from_millis(1)).await;
        let _second_lock = self.right_fork.lock().await;

        // ANCHOR: Philosopher-eat-body
        println!("{} is eating...", &self.name);
        time::sleep(time::Duration::from_millis(5)).await;
        // ANCHOR_END: Philosopher-eat-body

        // The locks are dropped here
        // ANCHOR: Philosopher-eat-end
    }
}

static PHILOSOPHERS: &[&str] =
    &["Socrates", "Plato", "Aristotle", "Thales", "Pythagoras"];

#[tokio::main]
async fn main() {
    // ANCHOR_END: Philosopher-eat-end
    // Create forks
    let mut forks = vec![];
    (0..PHILOSOPHERS.len()).for_each(|_| forks.push(Arc::new(Mutex::new(Fork))));

    // Create philosophers
    let (philosophers, mut rx) = {
        let mut philosophers = vec![];
        let (tx, rx) = mpsc::channel(10);
        for (i, name) in PHILOSOPHERS.iter().enumerate() {
            let mut left_fork = Arc::clone(&forks[i]);
            let mut right_fork = Arc::clone(&forks[(i + 1) % PHILOSOPHERS.len()]);
            // To avoid a deadlock, we have to break the symmetry
            // somewhere. This will swap the forks without deinitializing
            // either of them.
            if i == 0 {
                std::mem::swap(&mut left_fork, &mut right_fork);
            }
            philosophers.push(Philosopher {
                name: name.to_string(),
                left_fork,
                right_fork,
                thoughts: tx.clone(),
            });
        }
        (philosophers, rx)
        // tx is dropped here, so we don't need to explicitly drop it later
    };

    // Make them think and eat
    for phil in philosophers {
        tokio::spawn(async move {
            for _ in 0..100 {
                phil.think().await;
                phil.eat().await;
            }
        });
    }

    // Output their thoughts
    while let Some(thought) = rx.recv().await {
        println!("Here is a thought: {thought}");
    }
}
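
The symmetry-breaking swap deserves a closer look. If every philosopher picks up the left fork first, all of them can end up holding one fork while waiting for the next, and nobody makes progress. Swapping the forks for one philosopher removes that cycle. The sketch below illustrates the idea with standard-library threads and `std::sync::Mutex` rather than Tokio (a deliberate simplification). The helpers `fork_order` and `dine` are hypothetical, and the sketch assumes at least two philosophers.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns the indices of the forks philosopher `i` picks up, in order.
/// Philosopher 0 swaps the two forks, mirroring the solution above.
fn fork_order(i: usize, n: usize) -> (usize, usize) {
    let (left, right) = (i, (i + 1) % n);
    if i == 0 {
        (right, left)
    } else {
        (left, right)
    }
}

/// Runs `n` philosophers (assumes `n >= 2`) for `rounds` meals each and
/// returns the total number of meals eaten.
fn dine(n: usize, rounds: usize) -> usize {
    let forks: Vec<Arc<Mutex<()>>> =
        (0..n).map(|_| Arc::new(Mutex::new(()))).collect();
    let meals = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..n)
        .map(|i| {
            let (first, second) = fork_order(i, n);
            let first = Arc::clone(&forks[first]);
            let second = Arc::clone(&forks[second]);
            let meals = Arc::clone(&meals);
            thread::spawn(move || {
                for _ in 0..rounds {
                    // Both guards are held until the end of the iteration.
                    let _first = first.lock().unwrap();
                    let _second = second.lock().unwrap();
                    *meals.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let total = *meals.lock().unwrap();
    total
}

fn main() {
    println!("meals eaten: {}", dine(5, 100));
}
```

With five philosophers the acquisition pairs are (1, 0), (1, 2), (2, 3), (3, 4) and (4, 0). Every pair is consistent with the single order 1, 2, 3, 4, 0, so no cycle of waiting philosophers can form.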

Broadcast Chat Application

(back to exercise)

src/bin/server.rs:

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: setup
use futures_util::sink::SinkExt;
use futures_util::stream::StreamExt;
use std::error::Error;
use std::net::SocketAddr;
use tokio::net::{TcpListener, TcpStream};
use tokio::sync::broadcast::{channel, Sender};
use tokio_websockets::{Message, ServerBuilder, WebsocketStream};
// ANCHOR_END: setup

// ANCHOR: handle_connection
async fn handle_connection(
    addr: SocketAddr,
    mut ws_stream: WebsocketStream<TcpStream>,
    bcast_tx: Sender<String>,
) -> Result<(), Box<dyn Error + Send + Sync>> {
    // ANCHOR_END: handle_connection

    ws_stream
        .send(Message::text("Welcome to chat! Type a message".into()))
        .await?;
    let mut bcast_rx = bcast_tx.subscribe();

    // A continuous loop for concurrently performing two tasks: (1) receiving
    // messages from `ws_stream` and broadcasting them, and (2) receiving
    // messages on `bcast_rx` and sending them to the client.
    loop {
        tokio::select! {
            incoming = ws_stream.next() => {
                match incoming {
                    Some(Ok(msg)) => {
                        if let Some(text) = msg.as_text() {
                            println!("From client {addr:?} {text:?}");
                            bcast_tx.send(text.into())?;
                        }
                    }
                    Some(Err(err)) => return Err(err.into()),
                    None => return Ok(()),
                }
            }
            msg = bcast_rx.recv() => {
                ws_stream.send(Message::text(msg?)).await?;
            }
        }
    }
    // ANCHOR: main
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let (bcast_tx, _) = channel(16);

    let listener = TcpListener::bind("127.0.0.1:2000").await?;
    println!("listening on port 2000");

    loop {
        let (socket, addr) = listener.accept().await?;
        println!("New connection from {addr:?}");
        let bcast_tx = bcast_tx.clone();
        tokio::spawn(async move {
            // Wrap the raw TCP stream into a websocket.
            let ws_stream = ServerBuilder::new().accept(socket).await?;

            handle_connection(addr, ws_stream, bcast_tx).await
        });
    }
}
// ANCHOR_END: main

src/bin/client.rs:

// Copyright 2023 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//      http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// ANCHOR: setup
use futures_util::stream::StreamExt;
use futures_util::SinkExt;
use http::Uri;
use tokio::io::{AsyncBufReadExt, BufReader};
use tokio_websockets::{ClientBuilder, Message};

#[tokio::main]
async fn main() -> Result<(), tokio_websockets::Error> {
    let (mut ws_stream, _) =
        ClientBuilder::from_uri(Uri::from_static("ws://127.0.0.1:2000"))
            .connect()
            .await?;

    let stdin = tokio::io::stdin();
    let mut stdin = BufReader::new(stdin).lines();

    // ANCHOR_END: setup
    // Continuous loop for concurrently sending and receiving messages.
    loop {
        tokio::select! {
            incoming = ws_stream.next() => {
                match incoming {
                    Some(Ok(msg)) => {
                        if let Some(text) = msg.as_text() {
                            println!("From server: {}", text);
                        }
                    },
                    Some(Err(err)) => return Err(err.into()),
                    None => return Ok(()),
                }
            }
            res = stdin.next_line() => {
                match res {
                    Ok(None) => return Ok(()),
                    Ok(Some(line)) => ws_stream.send(Message::text(line)).await?,
                    Err(err) => return Err(err.into()),
                }
            }

        }
    }
}
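
The server relies on a broadcast channel: every connection subscribes, and each message that is sent reaches all current subscribers, including the connection that sent it. To make that fan-out behavior concrete without pulling in Tokio, here is a rough single-threaded stand-in built on `std::sync::mpsc`. The `Broadcaster` type is hypothetical and only approximates the semantics. It is not how `tokio::sync::broadcast` is implemented, and it ignores capacity limits and lagging receivers.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

/// A hypothetical, single-threaded stand-in for a broadcast channel.
struct Broadcaster {
    subscribers: Vec<Sender<String>>,
}

impl Broadcaster {
    fn new() -> Self {
        Broadcaster { subscribers: Vec::new() }
    }

    /// Registers a new subscriber. It sees only messages sent afterwards.
    fn subscribe(&mut self) -> Receiver<String> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    /// Delivers a copy of `message` to every live subscriber and returns
    /// how many received it. Subscribers whose receiver was dropped are
    /// removed from the list.
    fn send(&mut self, message: &str) -> usize {
        self.subscribers
            .retain(|tx| tx.send(message.to_string()).is_ok());
        self.subscribers.len()
    }
}

fn main() {
    let mut chat = Broadcaster::new();
    let alice = chat.subscribe();
    let bob = chat.subscribe();
    chat.send("hello");
    println!("alice got {:?}", alice.try_recv());
    println!("bob got {:?}", bob.try_recv());
}
```

In the real server each connection task owns its receiver and awaits it inside `tokio::select!`, so delivery happens concurrently rather than through polling with `try_recv`.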