Roadmap: AST-level mutations #867

Open
3 of 11 tasks
stanislaw opened this issue May 29, 2021 · 10 comments

Comments

@stanislaw
Member

stanislaw commented May 29, 2021

I would like to add the AST-level roadmap here; it may be related to #775. I don't think it is a blocker for 1.0 in any way, but let's keep these prospects connected:

  • Make Mull Runner read the AST information from the binary, not from the compilation database (new source code provider).
  • Mutations in the preprocessed sources and macros.
  • Remove the legacy "white" AST search.
  • Passing the command-line arguments to the Clang plugin.
    • Try plugin option (like cl::opt::Whatever) and try -mllvm -whatever?
    • The Clang plugin args system is quite verbose, so it could be reasonable to move to a YAML config file-based approach.
  • Support incremental testing.
  • Simplify distribution of the plugin by including it in the binary distributions of Mull.
  • For CMake: implement the necessary boilerplate to support finding Mull's plugin and runner via find_package(Mull)...
  • It would be great to introduce a mapping with the conventional AOR-like naming scheme.
  • Equivalent mutant detection using code coverage.
stanislaw changed the title from "AST-level mutations roadmap" to "Roadmap: AST-level mutations" on Jun 7, 2021
@m42e
Contributor

m42e commented Nov 14, 2021

@stanislaw what is the current status? Is there anything you need or want support with?

@stanislaw
Member Author

stanislaw commented Nov 14, 2021

Hi @m42e, thanks for asking! You are making great contributions, which Alex and I appreciate!

The AST-level work is currently waiting for my longer Christmas and New Year holidays.

There is one annoying blocker where it is not possible to compile a C/C++ file when the coverage flags are enabled and the mutations are applied at the same time.

We need the regular coverage information so that the Mull runner can decide to skip mutations in code that the tests never execute: it is more efficient to run mutations only on the covered code anyway.
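
As a rough illustration of that coverage-based filtering, here is a minimal sketch. It is not Mull's actual implementation: the `Mutant` struct, the `CoveredLines` alias, and `selectCovered` are hypothetical names made up for this example, and coverage is simplified to a set of (file, line) pairs.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mutant record; a real tool would carry more detail.
struct Mutant {
  std::string file;
  int line;
  std::string id;
};

// Hypothetical coverage model: the (file, line) pairs executed by the tests.
using CoveredLines = std::set<std::pair<std::string, int>>;

// Keep only the mutants located on covered lines. Mutants on uncovered
// lines cannot be killed by any test, so running them is wasted effort.
std::vector<Mutant> selectCovered(const std::vector<Mutant> &mutants,
                                  const CoveredLines &covered) {
  std::vector<Mutant> result;
  for (const Mutant &mutant : mutants) {
    if (covered.count(std::make_pair(mutant.file, mutant.line)) != 0) {
      result.push_back(mutant);
    }
  }
  return result;
}
```

With one covered line and two mutants, only the mutant on the covered line would be kept for execution.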

This is the original branch with the issue: https://github.com/mull-project/mull/commits/code-coverage-experiment.

The problem is as follows: when a mutation is applied to an AST node, we inject it as follows (pseudocode):

if (ENV contains mutation identifier) {
  run mutated code
} else {
  run nonmutated code
}
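
For readers who prefer something compilable, the pseudocode above could look roughly like this in plain C++. This is only a sketch under assumptions: the helper `mutantEnabled`, the identifier `example_add_to_sub_sum_1`, and the choice to treat the identifier as the name of an environment variable are all made up for illustration and are not taken from Mull.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: a mutant counts as enabled when an environment
// variable named after its identifier is set (an assumption of this sketch).
bool mutantEnabled(const std::string &mutantId) {
  return std::getenv(mutantId.c_str()) != nullptr;
}

// Original code: `return a + b;`
// After injecting an add-to-sub mutation, both variants live in one binary
// and the environment selects which one runs.
int sum(int a, int b) {
  if (mutantEnabled("example_add_to_sub_sum_1")) {
    return a - b; // mutated code
  }
  return a + b; // non-mutated code
}
```

The extra `if` is the branch that has no counterpart in the original source file, which is where the source-location question described next comes from.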

The problem is that currently all the new nodes that Mull adds are given a NULL source location, because the added mutation code cannot be traced back to the non-mutated source file. This contradicts the expectation of the AST-level code coverage code, which assumes that the source locations are realistic and consistent for all nodes in the AST.

To reproduce this best, you should compile any LLVM version supported by Mull from source with assertions enabled. Then, if you try to run the sandbox test in my branch, you will hit at least one LLVM-internal assert saying that the Mull AST code breaks the invariants about the source locations (because it uses the NULL_LOCATION for the injected mutated code).

One idea that @AlexDenisov and I had as a workaround was to create an ephemeral file in the AST context and redirect the mutated locations there, as if they were included by the original AST unit using a preprocessor. This way, we could potentially make LLVM happy about the placement of the mutations in the source code and, with that, switch Mull completely to the AST approach while preserving the current behavior of Mull one-to-one.

Let me know if this gives you enough context.

@m42e
Contributor

m42e commented Nov 14, 2021

Thanks. I am really keen on helping this project. I really like it and think it is a great benefit for everyone, and everyone should use it :). I heard of mutation testing many years ago but did not have a good mandate to invest further in it. Now I have the opportunity to set up a project, and we totally rely on Mull as one of our tools for quality checks.

OK, so I think you have already put a lot of thought into this and probably have a plan for how to deal with it. I do not want to interfere with your plan at this point, but I would be happy to assist either by implementing or by discussing some interesting topics. So feel free to reach out if there is something I can do, or to pair up during the holiday season.

@m42e
Contributor

m42e commented Nov 14, 2021

I will still have a look at it anyway ;) out of curiosity.

@stanislaw
Member Author

Definitely go ahead! When I was looking at this conflict between Mull and code coverage on the AST level, I thought that the coverage could be less opinionated and conflicting. I didn't have enough time to spend on that issue, though, so I would be curious to know whether you confirm my findings or solve it in some other way.

@m42e
Contributor

m42e commented Nov 15, 2021

After a few attempts I managed to build it against LLVM 13.0.0.

Well, I started looking into it, but stumbled across a few things:

  • I had some issues with the SectionAttr: when printing the AST, it had to be set to something (I hard-coded it to zero instead of "not calculated"). This may be very bad, but it got me past the issue.
  • I changed the CMake file to also build compiler-rt; otherwise the instrumented objects report a missing library during linking.
  • I had to remove the debug flag from mull-runner, as LLVM reports it has been registered twice.

With exactly what is in your branch, I got the error you expected (I hope it is the same):

....CoverageMappingGen.cpp:144: void {anonymous}::SourceMappingRegion::setEndLoc(clang::SourceLocation): Assertion `Loc.isValid() && "Setting an invalid end location"' failed.

I tried setting the location to exactly the original location, and that seems to overcome the issue. I am not sure about the side effects, though. I'll play around a bit.

EDIT:
In the coverage data I can also see the branch inserted by Mull:

        3:   11:  voidSum(a, b, &result);
        3:   11-block  0
branch  0 taken 33%
branch  1 taken 67%
        1:   11-block  1
        2:   11-block  2

@stanislaw
Member Author

I don't remember the original error, but this one looks similar. I would also expect that LLVM has drifted a bit in the newer implementations, because I was mostly testing against LLVM 9 (no specific reason). Sounds like you are on the right track! What we need is simply the absence of conflicts with LLVM's internal machinery (no internal asserts) and working coverage that Mull can rely on to skip the non-covered parts.

@stanislaw
Member Author

Hey @m42e, just curious: have you made any progress regarding the conflict between Mull and Coverage on the AST level? I am still committed to fighting this problem around the holidays (20.12-10.01) unless you have solved it already 🗡️

@m42e
Contributor

m42e commented Dec 3, 2021

To be honest, I did not spend too much time on it, as I wasn't sure I fully understood the issue. I have created a branch ast-wip in my clone, where I made minimal fixes to make it work with LLVM 13.

I'd be happy to assist but didn't want to aim for the wrong goal.

@AlexDenisov
Member

We can now build against the official LLVM/Clang packages from the Ubuntu repos (#926).
This means we can ship Clang plugins safely, without depending on the precompiled LLVM versions from llvm.org.
