
feat: use composition for non-interactive encrypted training [BLOCKED BY CP] #660

Draft: RomanBredehoft wants to merge 29 commits into main from feat/use_composition_encrypted_training_4374
Conversation

@RomanBredehoft (Collaborator) commented Apr 29, 2024

@cla-bot added the cla-signed label Apr 29, 2024
@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch from 62e5ad5 to ce9fb57 on April 30, 2024 at 08:46
@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch 2 times, most recently from a420bef to 2ffb370 on May 23, 2024 at 09:58
@RomanBredehoft changed the title from "feat: use composition for non-interactive encrypted training" to "feat: use composition for non-interactive encrypted training [BLOCKED BY CP]" on May 27, 2024
@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch 5 times, most recently from 332ee71 to afca943 on May 29, 2024 at 13:30
src/concrete/ml/sklearn/linear_model.py
for output_i, input_i in self._composition_mapping.items()
)

if len(q_results) == 1:
Collaborator:

Can you assert on the shape here? The input/output shapes should match.

Collaborator Author:

I've added the checks in the new _add_requant_for_composition method (name to be confirmed).

@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch 3 times, most recently from 2d2ed7a to 454f8fa on June 3, 2024 at 13:52
@@ -290,6 +293,61 @@ def _set_output_quantizers(self) -> List[UniformQuantizer]:
)
return output_quantizers

# Remove this once we handle the re-quantization step in post-training only
# FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/4472
def _add_requant_for_composition(self, composition_mapping: Optional[Dict]):
Collaborator Author:

New (private) method for the quantized module: it avoids adding a parameter to the init and thus keeps things really internal.

max_output_pos = len(self.output_quantizers) - 1
max_input_pos = len(self.input_quantizers) - 1

for output_position, input_position in composition_mapping.items():
Collaborator Author:

Make sure the mapping is of the form {0: 1, 3: 2}.
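A minimal sketch of what these checks can look like, assuming they reduce to bounds validation on the mapped positions (the helper name and error messages below are illustrative, not the actual Concrete ML code):

```python
from typing import Dict

def check_composition_mapping(mapping: Dict[int, int], n_outputs: int, n_inputs: int) -> None:
    # The mapping must be of the form {output_position: input_position},
    # e.g. {0: 1, 3: 2}, with every position within bounds
    max_output_pos = n_outputs - 1
    max_input_pos = n_inputs - 1

    for output_position, input_position in mapping.items():
        if not 0 <= output_position <= max_output_pos:
            raise ValueError(
                f"Output position {output_position} is out of range "
                f"(expected a value in [0, {max_output_pos}])"
            )
        if not 0 <= input_position <= max_input_pos:
            raise ValueError(
                f"Input position {input_position} is out of range "
                f"(expected a value in [0, {max_input_pos}])"
            )

# Example: 4 outputs and 4 inputs, looping outputs 1..3 back into inputs 1..3
check_composition_mapping({1: 1, 2: 2, 3: 3}, n_outputs=4, n_inputs=4)
```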


# Ignore [arg-type] check from mypy as it is not able to see that the input to `quant`
# cannot be None
q_x = tuple(
Collaborator Author:

These are needed to match how CP works with encrypt, i.e. encrypt(None, x) = (None, x_enc), since we do not encrypt all inputs at the same time with composition.
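A self-contained sketch of this pattern, with a dummy quantizer standing in for Concrete ML's UniformQuantizer (the class, function name and scale below are illustrative):

```python
import numpy

class DummyQuantizer:
    """Stand-in for a UniformQuantizer with a `quant` method, as in the diff above."""

    def quant(self, values: numpy.ndarray) -> numpy.ndarray:
        return numpy.rint(values * 2**6).astype(numpy.int64)

def quantize_some_inputs(quantizers, *x):
    # Keep None placeholders so positions line up with Concrete's
    # encrypt(None, x) == (None, x_enc) behavior: with composition, not all
    # inputs are encrypted at the same time
    return tuple(
        quantizer.quant(value) if value is not None else None
        for quantizer, value in zip(quantizers, x)
    )

# Only the second input is provided here; the others stay None
q_x = quantize_some_inputs([DummyQuantizer()] * 4, None, numpy.array([0.5, 1.0]), None, None)
assert q_x[0] is None and q_x[2] is None and q_x[3] is None
```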


# Similarly, we only quantize the weight and bias values using the third and fourth
# position parameter
_, _, q_weights, q_bias = self.training_quantized_module.quantize_input(

@@ -181,16 +184,32 @@ def _compile_torch_or_onnx_model(
for each input. By default all arguments will be encrypted.
reduce_sum_copy (bool): if the inputs of QuantizedReduceSum should be copied to avoid
bit-width propagation
composition_mapping (Optional[Dict]): Dictionary that maps output positions with input
Collaborator Author:

Adding this new parameter to the private function _compile_torch_or_onnx_model instead of the other public ones, in order to keep things internal.

# If a mapping between input and output quantizers is set, add a re-quantization step at the
# end of the forward call. This is only useful for composable circuits in order to make sure
# that input and output quantizers match
if composition_mapping is not None:
Collaborator Author:

This is where we decide whether or not to add the requant step.
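A heavily simplified sketch of that decision point (only _add_requant_for_composition comes from the diff above; the surrounding function is illustrative):

```python
def compile_with_optional_requant(quantized_module, composition_mapping=None):
    # Composable circuits get the extra re-quantization step so that each
    # mapped output is expressed under the quantizer of the input it loops
    # back into; regular circuits skip it entirely
    if composition_mapping is not None:
        quantized_module._add_requant_for_composition(composition_mapping)

    # ... the usual compilation steps would follow here
    return quantized_module
```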

# Additionally, there is no point in computing the following in case of a partial fit,
# as it only represents a single iteration
if self.early_stopping and not is_partial_fit:
weights_float, bias_float = self._decrypt_dequantize_training_output(
Collaborator Author:

We keep early stopping possible with composition by adding this decrypt/dequant step here. Since this is only meant for development, we believe the extra cost is not really an issue.
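A sketch of how the early-stopping check can use the decrypted, dequantized parameters (the logistic loss and tolerance below are illustrative assumptions, not the actual implementation):

```python
import numpy

def should_stop_early(previous_loss, weights_float, bias_float, X, y, tolerance=1e-4):
    # Compute a clear-text logistic loss (labels y in {-1, 1}) on the decrypted
    # parameters; the decrypt/dequant round trip is acceptable here since early
    # stopping is only meant for development runs
    logits = X @ weights_float + bias_float
    loss = numpy.mean(numpy.log1p(numpy.exp(-y * logits)))
    return abs(previous_loss - loss) < tolerance
```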

# FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/4477
# We should also rename the input arguments to remove the `serialized` part, as we now accept
# both serialized and deserialized input values
# FIXME: https://github.com/zama-ai/concrete-ml-internal/issues/4476
def run(
Collaborator Author:

We now allow both serialized and deserialized inputs, which avoids having to deserialize and re-serialize at each server call with composition.
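A sketch of the kind of input normalization this implies on the server side, assuming encrypted values arrive either as bytes or as already-deserialized objects (the deserialize callable stands for Concrete's value deserialization and is an assumption here):

```python
from typing import Any, Callable, Optional, Tuple, Union

def normalize_inputs(
    inputs: Tuple[Union[bytes, Any, None], ...],
    deserialize: Callable[[bytes], Any],
) -> Tuple[Optional[Any], ...]:
    # Accept both serialized (bytes) and deserialized encrypted values: with
    # composition, a previous server output can be fed straight back in,
    # avoiding a deserialize/serialize round trip at each call
    return tuple(
        deserialize(value) if isinstance(value, bytes) else value
        for value in inputs
    )
```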

@@ -357,97 +388,78 @@ def get_serialized_evaluation_keys(self) -> bytes:
return self.client.evaluation_keys.serialize()

 def quantize_encrypt_serialize(
-    self, x: Union[numpy.ndarray, Tuple[numpy.ndarray, ...]]
-) -> Union[bytes, Tuple[bytes, ...]]:
+    self, *x: Optional[numpy.ndarray]
Collaborator Author:

We now allow unpacking. This is not a breaking change, since tuple support was only added recently by @jfrery.

This is mainly to make things more coherent with the other methods and with Concrete.
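A hedged usage sketch of the new call style (the client is assumed to be an FHEModelClient-like object; the position of the weight and bias values follows the comment in the diff above):

```python
def encrypt_training_parameters(client, weights, bias):
    # Only the weight and bias values, in third and fourth position, are
    # quantized and encrypted in this call; the first two positions are passed
    # as None and come back as None, mirroring Concrete's encrypt(None, x, ...)
    _, _, serialized_weights, serialized_bias = client.quantize_encrypt_serialize(
        None, None, weights, bias
    )
    return serialized_weights, serialized_bias
```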

 def deserialize_decrypt(
-    self, serialized_encrypted_quantized_result: Union[bytes, Tuple[bytes, ...]]
+    self, *serialized_encrypted_quantized_result: Optional[bytes]
Collaborator Author:

Same as above.

 def deserialize_decrypt_dequantize(
-    self, serialized_encrypted_quantized_result: Union[bytes, Tuple[bytes, ...]]
-) -> numpy.ndarray:
+    self, *serialized_encrypted_quantized_result: Optional[bytes]
Collaborator Author:

Same as above.

@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch from 4cc0545 to c54b458 on June 6, 2024 at 12:23
@RomanBredehoft marked this pull request as draft on June 6, 2024 at 14:53
@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch from a8eaab9 to 9f352b7 on June 7, 2024 at 08:50
@jfrery (Collaborator) commented Jun 7, 2024

If so, are we sure the new PBS dequant/requant isn't impacting the convergence?

@RomanBredehoft: Good question, I don't know. The requant part makes sure that the values are the same as what we were doing before in interactive training, so that part is OK. But I guess you are more worried about the PBS part and the fact that we are using rounding, right? In that case, yes, I am not sure how we could easily assess that it does not impact the convergence. All I can say for now is that the tests pass and the notebook looks good 😅 @jfrery

@jfrery: I did the analysis -> #660 (comment). All in all, it looks like the convergence isn't impacted. Good to go!

@RomanBredehoft force-pushed the feat/use_composition_encrypted_training_4374 branch from 7a52745 to 1037569 on June 10, 2024 at 16:20

Coverage passed ✅

Coverage details

---------- coverage: platform linux, python 3.8.18-final-0 -----------
Name    Stmts   Miss  Cover   Missing
-------------------------------------
TOTAL    7878      0   100%

60 files skipped due to complete coverage.
