test(jailer): enable multi-threaded test #4438

cm-iwata · 2024-02-09T03:31:54Z

Changes

Previously, all tests shared the same temporary file/directory, which caused conflicts when the tests ran on multiple threads.
This PR resolves those conflicts by incorporating random strings into the file/directory names.
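
As a rough illustration of the idea, the sketch below gives every caller its own scratch directory. It is not the code from this PR: the PR uses random strings, while this sketch swaps in the process id plus an atomic counter so that it needs only the standard library. The names `unique_test_dir` and `NEXT_ID` are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical process-wide counter; each call takes the next value.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical helper: creates and returns a directory that no other
/// caller in this process receives, so parallel tests do not share paths.
pub fn unique_test_dir(prefix: &str) -> io::Result<PathBuf> {
    let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
    // The process id separates concurrent test binaries; the counter
    // separates threads inside one binary.
    let dir = std::env::temp_dir().join(format!("{}-{}-{}", prefix, std::process::id(), id));
    fs::create_dir_all(&dir)?;
    Ok(dir)
}
```

A test would call `unique_test_dir("jailer-sketch")` at the start and build all of its file paths under the returned directory.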

Reason

Fixes #4412

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

If a specific issue led to this PR, this PR closes the issue.
The description of changes is clear and encompassing.
Any required documentation changes (code and docs) are included in this PR.
API changes follow the Runbook for Firecracker API changes.
User-facing changes are mentioned in CHANGELOG.md.
All added/changed functionality is tested.
New TODOs link to an issue.
Commits meet contribution quality standards.

This functionality cannot be added in rust-vmm.

codecov · 2024-02-09T09:04:03Z

Codecov Report

Attention: Patch coverage is 96.82540%, with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.66%. Comparing base (19f3623) to head (e01d256).

❗ Current head e01d256 differs from pull request most recent head 05b0fea. Consider uploading reports for the commit 05b0fea to get more accurate results

Files	Patch %	Lines
src/jailer/src/env.rs	85.71%	1 Missing ⚠️
src/jailer/src/main.rs	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4438      +/-   ##
==========================================
- Coverage   82.13%   81.66%   -0.48%     
==========================================
  Files         255      243      -12     
  Lines       31268    29871    -1397     
==========================================
- Hits        25683    24394    -1289     
+ Misses       5585     5477     -108

Flag	Coverage Δ
4.14-c5n.metal	`78.99% <96.82%> (-0.65%)`	⬇️
4.14-c7g.metal	`?`
4.14-m5n.metal	`78.98% <96.82%> (-0.65%)`	⬇️
4.14-m6a.metal	`78.12% <96.82%> (-0.72%)`	⬇️
4.14-m6g.metal	`77.08% <96.82%> (+0.39%)`	⬆️
4.14-m6i.metal	`78.97% <96.82%> (-0.65%)`	⬇️
4.14-m7g.metal	`77.08% <96.82%> (+0.39%)`	⬆️
5.10-c5n.metal	`81.64% <96.82%> (-0.52%)`	⬇️
5.10-m5n.metal	`81.62% <96.82%> (-0.52%)`	⬇️
5.10-m6a.metal	`80.86% <96.82%> (-0.58%)`	⬇️
5.10-m6g.metal	`79.96% <96.82%> (+0.49%)`	⬆️
5.10-m6i.metal	`81.62% <96.82%> (-0.52%)`	⬇️
5.10-m7g.metal	`79.96% <96.82%> (+0.49%)`	⬆️
6.1-c5n.metal	`81.64% <96.82%> (-0.52%)`	⬇️
6.1-m5n.metal	`81.62% <96.82%> (-0.52%)`	⬇️
6.1-m6a.metal	`80.86% <96.82%> (-0.59%)`	⬇️
6.1-m6g.metal	`79.96% <96.82%> (+0.49%)`	⬆️
6.1-m6i.metal	`81.62% <96.82%> (-0.51%)`	⬇️
6.1-m7g.metal	`79.96% <96.82%> (+0.50%)`	⬆️

Flags with carried forward coverage won't be shown.

zulinx86 · 2024-02-09T09:36:49Z

Hello @cm-iwata ,

Thank you for your contribution!
Could you please fix the clippy test errors?
https://buildkite.com/firecracker/firecracker-pr/builds/8932#018d8d15-d42f-4863-969e-4df250266426/55-402

cm-iwata · 2024-02-09T10:25:25Z

@zulinx86
Sorry, fixed it.

zulinx86 · 2024-02-12T13:45:58Z

@cm-iwata

Could you please do the following?

1. Add a commit message explaining the context, such as why this commit is needed ( https://buildkite.com/firecracker/firecracker-pr/builds/8976#018d9cd9-1a22-4508-b4bb-f1da0279335a/55-182 )
2. Fix all the unit test failures ( https://buildkite.com/firecracker/firecracker-pr/builds/8976#018d9cd9-1a2e-407f-b159-6eca360663da )
3. Post the result of running the tests in parallel here, as instructed in "Fix running cargo test in parallel for Jailer #4412", since we still cannot run all the Rust unit tests in parallel at the moment.

Thanks!

zulinx86 · 2024-02-12T17:02:56Z

It's a matter of taste, but it might be better to use "test(jailer)" rather than "fix(jailer)". "fix" may look like it fixes a bug in the jailer, whereas I see this change as an improvement to the jailer unit tests.

cm-iwata · 2024-02-13T02:25:00Z

@zulinx86
Thank you for the review!

> Add a commit message explaining the context like why this commit is needed

Fixed.

> Fix all the unit test failures

I believe I've fixed the relevant parts.
However, the tests were already passing in my dev environment before the fix, so I haven't been able to confirm that the fix is what makes them pass.
Also, some unit tests fail in my dev environment, so I haven't been able to confirm that all unit tests pass.
Could you please run CI?

Note: I use GCE for my dev environment.

> Post the result here when running it in parallel as instructed in Fix running cargo test in parallel for vmm #4412 since we cannot still run all the rust unit tests in parallel at the moment.

Here is the result.

root@dbaa0c959c45:/firecracker# cargo test --package jailer
warning: patch for `kvm-bindings` uses the features mechanism. default-features and features will not take effect because the patch dependency does not support this mechanism
    Finished test [unoptimized + debuginfo] target(s) in 0.10s
     Running unittests src/main.rs (build/cargo_target/debug/deps/jailer-2ccd13efbd56ca77)

running 32 tests
test cgroup::tests::test_cgroup_builder_no_mounts ... ok
test cgroup::tests::test_get_controller ... ok
test env::tests::test_dup2 ... ok
test env::tests::test_join_netns ... ok
test env::tests::test_parse_resource_limits ... ok
test cgroup::tests::test_inherit_from_parent ... ok
test env::tests::test_copy_exec_to_chroot ... ok
test resource_limits::tests::test_default_resource_limits ... ok
test env::tests::test_validate_exec_file ... ok
test cgroup::tests::test_cgroup_builder_v2 ... ok
test resource_limits::tests::test_display_resource ... ok
test cgroup::tests::test_cgroup_builder_v1_with_v2_mounts ... ok
test resource_limits::tests::test_from_resource ... ok
test resource_limits::tests::test_install ... ok
test resource_limits::tests::test_set_resource_limits ... ok
test tests::test_clean_env_vars ... ok
test env::tests::test_save_exec_file_pid ... ok
test cgroup::tests::test_cgroup_builder_v1 ... ok
test cgroup::tests::test_cgroup_builder_v2_with_v1_mounts ... ok
test tests::test_to_cstring ... ok
test env::tests::test_copy_cache_info ... ok
test env::tests::test_cgroups_parsing ... ok
test env::tests::test_setup_jailed_folder ... ok
test env::tests::test_userfaultfd_dev ... ok
test env::tests::test_mknod_and_own_dev ... ok
test cgroup::tests::test_cgroup_build ... ok
test cgroup::tests::test_cgroup_build_invalid ... ok
test cgroup::tests::test_cgroup_v2_write_value ... ok
test env::tests::test_new_env ... ok
test tests::test_fds_close_range ... ok
test tests::test_sanitize_process ... ok
test tests::test_fds_proc ... ok

test result: ok. 32 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.12s

JonathanWoollett-Light · 2024-03-11T10:56:27Z

src/jailer/src/cgroup.rs

+pub struct CgroupBuilder<'a> {
    version: u8,
    hierarchies: HashMap<String, PathBuf>,
    mount_points: Vec<CgroupMountPoint>,
+    proc_mounts_path: &'a str,
 }


This may use more memory than before.

It would be better to do:

Suggested change

-pub struct CgroupBuilder<'a> {
-    version: u8,
-    hierarchies: HashMap<String, PathBuf>,
-    mount_points: Vec<CgroupMountPoint>,
-    proc_mounts_path: &'a str,
-}
+pub struct CgroupBuilder {
+    version: u8,
+    hierarchies: HashMap<String, PathBuf>,
+    mount_points: Vec<CgroupMountPoint>,
+    proc_mounts_path: &'static str,
+}

cm-iwata · 2024-03-13

@JonathanWoollett-Light
Thank you for your suggestion. However, the field needs to hold dynamic values, because the environment must be isolated between the threads running tests.
If a 'static lifetime were used, multiple test threads would access the same directory, which could cause tests to fail.

JonathanWoollett-Light · 2024-03-13

I only saw static strings passed as proc_mounts_path. If the number of threads in tests is constant, we could have a static string for each. Could you point to where a non-static string is used?

This is a nit regardless.

cm-iwata · 2024-03-14

Although only for testing, dynamic values are used in the following places:

firecracker/src/jailer/src/cgroup.rs, lines 477 to 479 in 14f8f38:

let mock_proc_mounts = format!(
    "/tmp/firecracker/test/{}/jailer/proc/mounts",
    rand::rand_alphanumerics(4).into_string().unwrap()

firecracker/src/jailer/src/cgroup.rs, line 608 in 14f8f38:

let builder = CgroupBuilder::new(1, mock_cgroups.proc_mounts_path.as_str());

How about changing the type of proc_mounts_path to Cow<'static, str>? With this change, the production code can accept the constant PROC_MOUNTS, while the test code can dynamically generate a String. What do you think about this approach?
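
A minimal sketch of what the Cow<'static, str> idea could look like, assuming a simplified stand-in type. `BuilderSketch` and its methods are hypothetical and are not the real CgroupBuilder; only the constant name PROC_MOUNTS and the path format come from this thread.

```rust
use std::borrow::Cow;

// Same value as the production constant discussed above.
pub const PROC_MOUNTS: &str = "/proc/mounts";

/// Hypothetical stand-in for CgroupBuilder that keeps only the path field.
#[derive(Debug)]
pub struct BuilderSketch {
    proc_mounts_path: Cow<'static, str>,
}

impl BuilderSketch {
    /// Accepts a `&'static str` (borrowed, no allocation) or a `String`
    /// (owned), so no lifetime parameter is needed on the struct.
    pub fn new(proc_mounts_path: impl Into<Cow<'static, str>>) -> Self {
        Self {
            proc_mounts_path: proc_mounts_path.into(),
        }
    }

    /// Returns the stored path as a plain string slice.
    pub fn proc_mounts_path(&self) -> &str {
        &self.proc_mounts_path
    }

    /// True when the path is the borrowed static string.
    pub fn is_borrowed(&self) -> bool {
        matches!(self.proc_mounts_path, Cow::Borrowed(_))
    }
}
```

Production code would call `BuilderSketch::new(PROC_MOUNTS)`, while a test would pass the result of `format!(...)`. The trade-off is one enum discriminant check when the path is read, in exchange for dropping the `<'a>` parameter.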

zulinx86 · 2024-03-13T13:53:33Z

@cm-iwata Could you please squash 3134792 and 33ced74 into one commit? Two identical commits are shown in https://github.com/firecracker-microvm/firecracker/pull/4438/commits.

zulinx86 · 2024-03-13T16:10:04Z

src/jailer/src/cgroup.rs

+    pub fn get_mock_proc_mounts() -> String {
+        format!(
+            "/tmp/firecracker/test/{}/jailer/proc/mounts",
+            rand::rand_alphanumerics(4).into_string().unwrap()
+        )
+    }


get_mock_proc_mounts() is always called just before MockCgroupFs::new(). How about moving get_mock_proc_mounts() inside MockCgroupFs::new()? By doing this and owning the string, the lifetime parameter <'a> of MockCgroupFs can be removed as follows:

-    pub fn get_mock_proc_mounts() -> String {
-        format!(
-            "/tmp/firecracker/test/{}/jailer/proc/mounts",
-            rand::rand_alphanumerics(4).into_string().unwrap()
-        )
-    }
-
     #[derive(Debug)]
-    pub struct MockCgroupFs<'a> {
+    pub struct MockCgroupFs {
         mounts_file: File,
-        pub proc_mount_path: &'a str,
+        pub proc_mount_path: String,
     }

     // Helper object that simulates the layout of the cgroup file system
     // This can be used for testing regardless of the availability of a particular
     // version of cgroups on the system
-    impl<'a> MockCgroupFs<'a> {
+    impl MockCgroupFs {
         pub fn create_file_with_contents<P: AsRef<Path> + Debug>(
             filename: P,
             contents: &str,
@@ -480,10 +473,12 @@ pub mod test_util {
             Ok(())
         }

-        pub fn new(
-            mock_proc_mount: &'a str,
-        ) -> std::result::Result<MockCgroupFs<'a>, std::io::Error> {
-            let mock_proc_dir = Path::new(mock_proc_mount).parent().unwrap();
+        pub fn new() -> std::result::Result<MockCgroupFs, std::io::Error> {
+            let mock_proc_mount = format!(
+                "/tmp/firecracker/test/{}/jailer/proc/mounts",
+                rand::rand_alphanumerics(4).into_string().unwrap()
+            );
+            let mock_proc_dir = Path::new(&mock_proc_mount).parent().unwrap();

cm-iwata · 2024-03-14

Fixed in 540eea5 and 14f8f38.

zulinx86 · 2024-03-13T16:41:45Z

src/jailer/src/cgroup.rs

-        let f = File::open(PROC_MOUNTS)
-            .map_err(|err| JailerError::FileOpen(PathBuf::from(PROC_MOUNTS), err))?;
+        let f = File::open(b.proc_mounts_path)
+            .map_err(|err| JailerError::FileOpen(PathBuf::from(&b.proc_mounts_path), err))?;


nitpicking, but & can be removed.

Suggested change

-            .map_err(|err| JailerError::FileOpen(PathBuf::from(&b.proc_mounts_path), err))?;
+            .map_err(|err| JailerError::FileOpen(PathBuf::from(b.proc_mounts_path), err))?;

cm-iwata · 2024-03-14

Fixed in e77496a.

zulinx86 · 2024-03-13T16:42:10Z

src/jailer/src/cgroup.rs

@@ -72,7 +68,8 @@ impl CgroupBuilder {
        ).map_err(JailerError::RegEx)?;

        for l in BufReader::new(f).lines() {
-            let l = l.map_err(|err| JailerError::ReadLine(PathBuf::from(PROC_MOUNTS), err))?;
+            let l =
+                l.map_err(|err| JailerError::ReadLine(PathBuf::from(&b.proc_mounts_path), err))?;


nitpicking: See #4438 (comment)

Suggested change

-                l.map_err(|err| JailerError::ReadLine(PathBuf::from(&b.proc_mounts_path), err))?;
+                l.map_err(|err| JailerError::ReadLine(PathBuf::from(b.proc_mounts_path), err))?;

cm-iwata · 2024-03-14

Fixed in e77496a.

zulinx86 · 2024-03-13T16:46:23Z

src/jailer/src/cgroup.rs

        mounts_file: File,
+        pub proc_mount_path: &'a str,


nitpicking: the name proc_mounts_path is used for CgroupBuilder and the file path /proc/mounts, so proc_mounts_path would be preferable here as well.

Suggested change

-        pub proc_mount_path: &'a str,
+        pub proc_mounts_path: &'a str,

cm-iwata · 2024-03-14

Fixed in a9979ed.

zulinx86 · 2024-03-13T17:42:08Z

src/jailer/src/cgroup.rs

@@ -478,28 +480,45 @@ pub mod test_util {
            Ok(())
        }

-        pub fn new() -> std::result::Result<MockCgroupFs, std::io::Error> {
+        pub fn new(
+            mock_proc_mount: &'a str,


nitpicking: See #4438 (comment).

Suggested change

-            mock_proc_mount: &'a str,
+            mock_proc_mounts: &'a str,

cm-iwata · 2024-03-14

Fixed in a9979ed.

cm-iwata · 2024-03-14T04:14:54Z

> Could you please squash 3134792 and 33ced74 into one commit?

Sorry, I made a mistake while rebasing.
I have squashed the two commits.

cm-iwata · 2024-03-14T07:05:58Z

I thought the code was functioning correctly after the fix, but I've noticed occasional test failures.
Could you please advise on the following issues?

1. Too many open files

Sometimes tests fail as follows.

---- tests::test_fds_proc stdout ----
thread 'tests::test_fds_proc' panicked at src/jailer/src/main.rs:409:33:
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }

---- tests::test_fds_close_range stdout ----
thread 'tests::test_fds_close_range' panicked at src/jailer/src/main.rs:409:33:
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::test_sanitize_process stdout ----
thread 'tests::test_sanitize_process' panicked at src/jailer/src/main.rs:409:33:
called `Result::unwrap()` on an `Err` value: Os { code: 24, kind: Uncategorized, message: "Too many open files" }

The code causing this issue can be found here.

firecracker/src/jailer/src/main.rs, lines 405 to 408 in b2eee19:

for i in 0..n {
    let maybe_file = File::create(format!("{}/{}", &tmp_dir_path, i));
    fds.push(maybe_file.unwrap().into_raw_fd());
}

Each of these tests creates 100 files, so when they run on multiple threads at once, the file descriptor limit is exceeded. Changing the loop bound n from 100 to 10 resolved the issue. Is there a specific reason for using 100? If not, would it be acceptable to change n to 10?

2. Bad file descriptor

Even after fixing 1, tests may still fail.

---- env::tests::test_new_env stdout ----
thread 'env::tests::test_new_env' panicked at src/jailer/src/env.rs:911:14:
This another new environment should be created successfully.: ReadLine("/tmp/firecracker/test/MCzQ/jailer/proc/mounts", Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I believe the cause is the following code.

firecracker/src/jailer/src/main.rs, lines 442 to 445 in b2eee19:

fn test_fds_proc() {
    run_close_fds_test(close_fds_by_reading_proc);
}

The error should occur in the following sequence:

1. Thread 1 opens fd 4.
2. Thread 2 opens fd 5.
3. Thread 1 closes fd 4 and fd 5.
4. Thread 2 accesses fd 5, and a "Bad file descriptor" error occurs.

Commenting out the above test cases allows the tests to run successfully. Do you have any suggestions for how to fix the code? I couldn't think of any good ideas, so I would appreciate any advice.
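
One possible direction, offered only as a sketch and not something proposed or adopted in this thread: guard the process-wide descriptor table with a reader-writer lock, so that tests which close every descriptor run exclusively while ordinary tests run concurrently with each other. The names `FD_TABLE_LOCK`, `shared_fd_access`, and `exclusive_fd_access` are hypothetical, and the scheme only helps if every test that opens descriptors opts in.

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

// Hypothetical process-wide lock standing in for "who may touch the fd table".
pub static FD_TABLE_LOCK: RwLock<()> = RwLock::new(());

/// For tests that only open and use their own descriptors.
/// Many of these guards can be held at the same time.
pub fn shared_fd_access() -> RwLockReadGuard<'static, ()> {
    // A panic in another test poisons the lock; the guarded data is `()`,
    // so it is safe to keep going with the inner guard.
    FD_TABLE_LOCK.read().unwrap_or_else(|e| e.into_inner())
}

/// For tests that close descriptors they did not open, such as the
/// close-all-fds tests. Blocks until no shared guard is held.
pub fn exclusive_fd_access() -> RwLockWriteGuard<'static, ()> {
    FD_TABLE_LOCK.write().unwrap_or_else(|e| e.into_inner())
}
```

With this in place, a test like test_fds_proc would hold `exclusive_fd_access()` for its whole body, and a test like test_new_env would hold `shared_fd_access()`. The cost is that the fd-closing tests no longer overlap with anything else.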

cm-iwata marked this pull request as ready for review February 9, 2024 05:19

xmarcalx added the Priority: Low label Feb 12, 2024

xmarcalx requested review from pb8o, zulinx86 and sudanl0 February 12, 2024 10:31

cm-iwata force-pushed the fix/parallel-jailer-test branch from e0f47d6 to 7ddda51 Compare February 13, 2024 00:14

cm-iwata requested review from xmarcalx and kalyazin as code owners February 13, 2024 00:14

cm-iwata force-pushed the fix/parallel-jailer-test branch 4 times, most recently from 3134792 to 33ced74 Compare February 13, 2024 02:10

JonathanWoollett-Light reviewed Mar 11, 2024

zulinx86 changed the title ~~fix(jailer): enable multi-threaded test~~ test(jailer): enable multi-threaded test Mar 13, 2024

zulinx86 reviewed Mar 13, 2024

cm-iwata force-pushed the fix/parallel-jailer-test branch from e01d256 to 6ec6be2 Compare March 14, 2024 00:54

cm-iwata added 2 commits March 14, 2024 01:05

refactor: changed variable name in jailer

a9979ed

Renamed several variables. Because these variables are typically assigned to `/proc/mounts`. Signed-off-by: Tomoya Iwata <iwata.tomoya@classmethod.jp>

refactor(jailer): simplify MockCgroupFs prepare process

540eea5

Move mocked path prepare process inside `MockCgroupFs::new()`. Signed-off-by: Tomoya Iwata <iwata.tomoya@classmethod.jp>

cm-iwata added 2 commits March 14, 2024 02:13

refactor(jailer): remove unnecessary &

e77496a

Remove unnecessary `&` from function call arguments Signed-off-by: Tomoya Iwata <iwata.tomoya@classmethod.jp>

refactor(jailer): simplify MockCgroupFs prepare process(aarch64)

14f8f38

Move mocked path prepare process inside `MockCgroupFs::new()`. Signed-off-by: Tomoya Iwata <iwata.tomoya@classmethod.jp>

zulinx86 mentioned this pull request Mar 18, 2024

Enable unit tests to run in parallel #1569

Open

2 tasks

cm-iwata added 2 commits April 30, 2024 03:57

Merge branch 'main' into fix/parallel-jailer-test

a7c8e41

Merge branch 'main' into fix/parallel-jailer-test

05b0fea
