This repository has been archived by the owner on Apr 29, 2024. It is now read-only.

Allow large models to be compiled, avoiding RuntimeError: Can't emit artifacts #81

Open
34plw5d2 opened this issue Feb 6, 2023 · 1 comment

Comments

34plw5d2 commented Feb 6, 2023

Summary

When compiled, large models whose executable object would span more than 2 GB of virtual address space fail during linking with the following error (example with Concrete-ML 0.6.1, Concrete-numpy 0.9.0):

File /usr/local/lib/python3.8/dist-packages/concrete/compiler/library_support.py:155, in LibrarySupport.compile(self, mlir_program, options)
	150 if not isinstance(options, CompilationOptions):
	151     raise TypeError(
	152         f"options must be of type CompilationOptions, not {type(options)}"
	153     )
	154 return LibraryCompilationResult.wrap(
--> 155     self.cpp().compile(mlir_program, options.cpp())
	156 )

RuntimeError: Can't emit artifacts: Command failed:ld --shared -o /tmp/tmpXXXXXXXX/sharedlib.so /tmp/tmpXXXXXXXX.module-0.mlir.o /usr/local/lib/python3.8/dist-packages/concrete_compiler.libs/libConcretelangRuntime-14f67b9a.so -rpath=/usr/local/lib/python3.8/dist-packages/concrete_compiler.libs --disable-new-dtags 2>&1
Code:256
/tmp/tmpXXXXXXXX.module-0.mlir.o: in function `main':
LLVMDialectModule:(.text+0x65): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0x8dc9): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0x8e06): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0x8fa9): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0xb0e4): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0xb121): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0xd69a): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0xd87c): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0xddf7): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0x100f3): relocation truncated to fit: R_X86_64_PC32 against `.data.rel.ro'
LLVMDialectModule:(.text+0x10130): additional relocation overflows omitted from the output
/tmp/tmpXXXXXXXX/sharedlib.so: PC-relative offset overflow in PLT entry for `_dfr_start'
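
For context, `R_X86_64_PC32` stores a signed 32-bit displacement, so the target has to sit within roughly ±2 GiB of the referencing instruction. A minimal sketch of that range check follows; the helper name `fits_pc32` and the addresses are illustrative assumptions, not taken from the linker or from Concrete:

```python
# Illustrative only: models the range check behind "relocation truncated to fit".
# R_X86_64_PC32 computes S + A - P (symbol + addend - place) and stores the
# result in a signed 32-bit field.
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_pc32(symbol_addr: int, addend: int, place: int) -> bool:
    """Return True if S + A - P fits in a signed 32-bit displacement."""
    value = symbol_addr + addend - place
    return INT32_MIN <= value <= INT32_MAX


# Hypothetical addresses: code at 0x65, data 1 GiB away vs. 3 GiB away.
near = fits_pc32(symbol_addr=0x65 + 2**30, addend=-4, place=0x65)
far = fits_pc32(symbol_addr=0x65 + 3 * 2**30, addend=-4, place=0x65)
print(near, far)  # True False
```

Once `.data.rel.ro` ends up more than about 2 GiB away from `.text`, every such reference fails this check, which would match the repeated overflow messages above.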

Problem to solve

Enable compilation of models that exceed the 2 GB virtual address limit.

Proposals

According to man ld, the following flags might help solve this issue:

       --no-keep-memory
           ld normally optimizes for speed over memory usage by caching the symbol tables of input files in memory. 
           This option tells ld to instead optimize for memory usage, by rereading the symbol tables as necessary.
           This may be required if ld runs out of memory space while linking a large executable.

       --large-address-aware
           If given, the appropriate bit in the "Characteristics" field of the COFF header is set to indicate
           that this executable supports virtual addresses greater than 2 gigabytes.  This should be used
           in conjunction with the /3GB or /USERVA=value megabytes switch in the "[operating systems]"
           section of the BOOT.INI. Otherwise, this bit has no effect.  [This option is specific to PE targeted
           ports of the linker]

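These flags have a performance cost and might not be a suitable default, since large models are not necessarily the target for Concrete libraries. Note also that, per the man page excerpt, --large-address-aware is specific to PE targets, so it may not apply to the Linux build shown above.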

From a user's point of view, the flag(s) could be exposed through options/arguments when calling the compiler in Concrete-numpy (and Concrete-ML).
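
A rough sketch of what such a pass-through could look like. `build_link_command` and `extra_linker_flags` are hypothetical names, not part of the Concrete API, and the paths are placeholders; the base command just mirrors the `ld` invocation from the error above.

```python
import shlex
from typing import List, Optional, Sequence


def build_link_command(
    output: str,
    objects: Sequence[str],
    rpath: str,
    extra_linker_flags: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble an ld invocation, appending user-supplied flags (hypothetical helper)."""
    cmd = [
        "ld", "--shared", "-o", output,
        *objects,
        f"-rpath={rpath}",
        "--disable-new-dtags",
    ]
    # User-supplied flags go last, so the default command is unchanged
    # when no extra flags are given.
    cmd.extend(extra_linker_flags or [])
    return cmd


# Placeholder paths, for illustration only.
cmd = build_link_command(
    output="sharedlib.so",
    objects=["module-0.mlir.o"],
    rpath="/path/to/libs",
    extra_linker_flags=["--no-keep-memory"],
)
print(shlex.join(cmd))
```

Keeping the extra flags opt-in would leave the current default behaviour untouched for models that link fine today.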

rudy-6-4 commented Feb 6, 2023

@34plw5d2
Is it possible to share your code? Either the Python code or the MLIR code (in circuit.mlir)?
This would help us reproduce the issue and eventually add some optimisation for circuit size.

If not, could you describe the details of the FHE circuit? For example, the number of different TLUs in your model, or any other detail that makes the model big.

Thank you
