keep track of the set of included files #239

Open
andrewrk opened this issue Feb 7, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@andrewrk
Contributor

andrewrk commented Feb 7, 2022

When integrating with Zig as a frontend, Zig needs to know about all the included header files in order to avoid false positive cache hits. Additionally, Zig wants to know this information about every header file:

  • mtime
  • inode
  • file size
  • hash of the file contents using whatever internal hashing algorithm we are using for the cache system
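As a rough illustration of the four items above, here is a minimal sketch of gathering them on a POSIX system. The struct and function names are hypothetical and belong to neither Aro nor Zig, and FNV-1a is only a stand-in for whatever hash the cache system actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical record; the field names are illustrative, not Aro's or Zig's. */
struct header_info {
    long long mtime_sec;      /* last modification time, in seconds */
    unsigned long long inode; /* inode number reported by stat() */
    long long size;           /* file size in bytes */
    uint64_t content_hash;    /* FNV-1a stand-in for the cache's real hash */
};

/* Fold `len` bytes into a running 64-bit FNV-1a hash. */
static uint64_t fnv1a_update(uint64_t h, const unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Fill `out` for `path`. Returns 0 on success, -1 if stat or read fails. */
static int header_info_collect(const char *path, struct header_info *out) {
    assert(path != NULL && out != NULL);

    struct stat st;
    if (stat(path, &st) != 0) return -1;

    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
    unsigned char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        h = fnv1a_update(h, buf, n);
    }
    int read_failed = ferror(f);
    fclose(f);
    if (read_failed) return -1;

    out->mtime_sec = (long long)st.st_mtime;
    out->inode = (unsigned long long)st.st_ino;
    out->size = (long long)st.st_size;
    out->content_hash = h;
    return 0;
}
```

Note that the sketch calls stat() and then opens the file separately, so the metadata and the hashed bytes could disagree if the file changes in between.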

Keeping the metadata for header files in Aro seems reasonable to me, but having Aro do the hashing seems unreasonable because it would tightly couple Aro to Zig's caching mechanism. Some other ideas to accomplish this are:

  • have an option to keep all header sources in memory so that the API user can access them via Compilation and do whatever hashing is desired
  • allow specifying a "load file" callback so that Zig could gain this information before sending the contents to Aro.
  • keep only the file paths and Zig can redundantly open the file again, and do whatever it needs to.
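To make the "load file" callback idea more concrete, here is a minimal sketch in C of what such a hook might look like. Every name here (`load_file_fn`, `load_options`, `host_state`, `compiler_read_include`) is hypothetical and is not Aro's actual API. The point is only that the API user reads the file itself and therefore sees the raw bytes before the compiler does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type: returns a malloc'ed buffer holding the raw
 * bytes of `path` and stores its length in `*len_out`, or returns NULL. */
typedef char *(*load_file_fn)(void *ctx, const char *path, size_t *len_out);

/* Hypothetical option a compiler could accept from the API user. */
struct load_options {
    load_file_fn load;
    void *ctx;
};

/* Host-side bookkeeping; a real host would record metadata and a hash here. */
struct host_state {
    size_t files_loaded;
    size_t bytes_seen;
};

/* Host callback: reads the file, observes the raw bytes, then hands them over. */
static char *host_load_file(void *ctx, const char *path, size_t *len_out) {
    assert(ctx != NULL && path != NULL && len_out != NULL);
    struct host_state *state = ctx;

    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) { fclose(f); return NULL; }

    size_t n;
    while ((n = fread(buf + len, 1, cap - len, f)) > 0) {
        len += n;
        if (len == cap) { /* buffer full: double it before reading more */
            char *grown = realloc(buf, cap * 2);
            if (grown == NULL) { free(buf); fclose(f); return NULL; }
            buf = grown;
            cap *= 2;
        }
    }
    int read_failed = ferror(f);
    fclose(f);
    if (read_failed) { free(buf); return NULL; }

    state->files_loaded += 1;
    state->bytes_seen += len;
    *len_out = len;
    return buf;
}

/* Stand-in for the compiler side: every include goes through the callback. */
static char *compiler_read_include(const struct load_options *opts,
                                   const char *path, size_t *len_out) {
    return opts->load(opts->ctx, path, len_out);
}
```

Compared with the third idea in the list, this avoids opening each header twice, at the cost of the API user owning the file I/O.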

Ideally it would also report the file paths that were checked and did not exist so that we can detect a stale compilation if a new file was placed in one of those locations. For example, if the include dirs are -Ifoo/ -Ibar/ and an #include "hello.h" directive ends up including bar/hello.h then we want to know that a new file named foo/hello.h should cause a cache miss.
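The -Ifoo/ -Ibar/ example can be sketched as a search loop that remembers every candidate it probed before the hit. This is a hypothetical illustration, not how Aro resolves includes: the names, the fixed-size arrays, and the `only_bar_hello_exists` predicate (which simulates a tree where only bar/hello.h exists) are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_MISSES 16
#define MAX_PATH_LEN 256

/* Hypothetical result of resolving one #include directive. */
struct search_result {
    char found[MAX_PATH_LEN];              /* empty string if nothing matched */
    char misses[MAX_MISSES][MAX_PATH_LEN]; /* candidates probed and not found */
    size_t miss_count;
};

/* Existence check, injected so the search can be exercised without real files. */
typedef int (*exists_fn)(const char *path);

/* Example predicate: simulates a tree in which only bar/hello.h exists. */
static int only_bar_hello_exists(const char *path) {
    return strcmp(path, "bar/hello.h") == 0;
}

/* Try each include dir in order; record every candidate that did not exist.
 * Each dir is expected to end with '/', as in -Ifoo/ -Ibar/. */
static void resolve_include(const char *const *dirs, size_t ndirs,
                            const char *name, exists_fn exists,
                            struct search_result *out) {
    assert(dirs != NULL && name != NULL && exists != NULL && out != NULL);
    out->found[0] = '\0';
    out->miss_count = 0;
    for (size_t i = 0; i < ndirs; i++) {
        char candidate[MAX_PATH_LEN];
        int n = snprintf(candidate, sizeof candidate, "%s%s", dirs[i], name);
        if (n < 0 || (size_t)n >= sizeof candidate) continue; /* path too long */
        if (exists(candidate)) {
            strcpy(out->found, candidate);
            return;
        }
        if (out->miss_count < MAX_MISSES) {
            strcpy(out->misses[out->miss_count++], candidate);
        }
    }
}
```

With dirs {"foo/", "bar/"} and name "hello.h", the result holds bar/hello.h as the hit and foo/hello.h as the single miss, which is exactly the path whose later creation should cause a cache miss.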

@Vexu
Owner

Vexu commented Feb 7, 2022

#235 will keep track of all included files for each file. Storing the metadata, and optionally a list of checked files, is also easy to add. For the file content hash, a file-loaded callback, called before the contents are spliced and normalized to LF, would probably be the simplest solution.

@Vexu Vexu added the enhancement New feature or request label Feb 7, 2022
@ehaas
Collaborator

ehaas commented May 31, 2022

There's an edge case in the Zig caching mechanism due to __has_include - it's probably not a realistic real-world usage of it, but I thought I'd point it out since it exists in the current zig cc:

#if __has_include("test.h")
#define FOO 1
#else
#define FOO 2
#endif

#include <stdio.h>
int main(void) {
    printf("%d\n", FOO);
}

If you compile this once with zig cc while test.h does not exist, the program prints 2. If you then touch test.h and compile again, zig cc uses the cache and does not recompile anything, so the program still prints 2. Recompiling from scratch makes it print 1.

It's probably rare to check for the existence of a header but then not include it, so maybe it's not worth worrying about. One solution could be to also have a callback for __has_include and __has_include_next which receives the checked path and whether or not the file was found, so that info can also be hashed.
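A minimal sketch of that callback idea, in C: each __has_include probe reports the checked path and whether it was found, and the receiver folds both into a running digest. The names (`has_include_fn`, `probe_digest`) are hypothetical, and FNV-1a is only a stand-in for the hash the cache really uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type invoked once per __has_include or
 * __has_include_next probe. */
typedef void (*has_include_fn)(void *ctx, const char *path, int found);

/* Running digest of all probes seen during one compilation. */
struct probe_digest {
    uint64_t hash;
    size_t probes;
};

static void probe_digest_init(struct probe_digest *d) {
    assert(d != NULL);
    d->hash = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
    d->probes = 0;
}

/* Matches has_include_fn: mixes the path bytes, then the found flag. */
static void probe_digest_record(void *ctx, const char *path, int found) {
    assert(ctx != NULL && path != NULL);
    struct probe_digest *d = ctx;
    for (const unsigned char *p = (const unsigned char *)path; *p != '\0'; p++) {
        d->hash ^= *p;
        d->hash *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
    }
    d->hash ^= (found ? 1u : 0u); /* the result is part of the cache key */
    d->hash *= 0x100000001b3ULL;
    d->probes += 1;
}
```

Because the found flag is mixed in, a probe of test.h that fails and a later probe of the same path that succeeds yield different digests, so the touch test.h case above would no longer be a cache hit.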

3 participants