Multi-objective experiments generate duplicated data #2392
Comments
Taking a look!

@HuizhiXu Thanks for reporting!

Also, to address your initial issue, Ax sometimes suggests previously sampled arms if there is observational noise in your metrics, or if Ax is unsure of the degree of observational noise, in order to get a better idea of the true mean and noise at a given point in the search space. If you would like to prevent Ax from re-suggesting points, please see my comment above for some pointers.
@bernardbeckerman thank you for your response and guidance. The reasons I chose the Developer API over the Service API are as follows.

For example, the Service API's parameter constraints are defined as strings:

```python
parameter_constraints = ["A + B + C + C + E <= 100.0"]
```

In contrast, the Developer API allows for typed constraint objects:

```python
parameter_constraints.append(
    SumConstraint(
        parameters=parameters,
        is_upper_bound=True,
        bound=param_cons.bound,
    )
)
```

Additionally, there are settings for the search space and objective configuration that I find more aligned with the Pydantic style in the Developer API. However, overall, I am not aware of the specific advantages that the Developer API has over the Service API.

Furthermore, I would like to know whether the following approach to deduplication in my code is correct. I have defined a deduplication check; here is the code snippet:

```python
def deduplicated_arm(arm):
    # Returns True if this arm's signature has already been seen.
    if arm.signature in arms_by_signature_for_deduplication:
        return True
    return False

def generate(i, model_moo):
    generator_run = model_moo.gen(n=1)
    trial = self.experiment.new_trial(generator_run=generator_run)
    arm = trial.arms[0]
    if deduplicated_arm(arm):
        logger.info("Duplicated arm")
        trial.mark_abandoned(reason="duplication")
        return
    logger.info("Not duplicated arm")
    self.arms_by_signature_for_deduplication[arm.signature] = arm
    ...
```
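For reference, a typed sum constraint like the `SumConstraint` above reduces to a simple linear feasibility check. A minimal pure-Python sketch (the helper `satisfies_sum_constraint` is hypothetical, for illustration only, and not part of Ax):

```python
def satisfies_sum_constraint(params, names, bound, is_upper_bound=True):
    """Feasibility check for a linear sum constraint over named parameters."""
    total = sum(params[n] for n in names)
    return total <= bound if is_upper_bound else total >= bound

# Example: the sum of A, B, and C must stay at or below 100.0.
params = {"A": 20.0, "B": 30.0, "C": 25.0}
print(satisfies_sum_constraint(params, ["A", "B", "C"], 100.0))  # True
```

Both APIs express this same check; the Developer API just makes the pieces (parameters, bound, direction) explicit fields rather than parts of a string to parse.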
These are excellent points! I'll discuss these with the team and see if there's an opportunity to clarify our documentation and/or move toward typed inputs in the service API. Thanks for raising these!
This looks good, although to match this snippet more closely, it might be better to check for duplication on the generator arms themselves, and retry `gen` until you either succeed in generating a unique arm or reach some maximum number of tries (draws):

```python
def is_duplicate(arm, arms_by_signature_for_deduplication):
    return arm.signature in arms_by_signature_for_deduplication

def generate(model_moo):
    arm_is_unique = False
    n_gen_draws = 0
    while not arm_is_unique and n_gen_draws < max_gen_draws_for_deduplication:
        generator_run = model_moo.gen(n=1)
        arm_is_unique = not is_duplicate(
            generator_run.arms[0], self.arms_by_signature_for_deduplication
        )
        n_gen_draws += 1
    if not arm_is_unique:
        raise RepeatedPointsException(
            f"Could not generate a unique arm after {max_gen_draws_for_deduplication} tries."
        )
    trial = self.experiment.new_trial(generator_run=generator_run)
    arm = trial.arms[0]
    self.arms_by_signature_for_deduplication[arm.signature] = arm
```

This way you can give the generation strategy a few tries before you give up, and you won't end up with a ton of abandoned trials on your experiment, but rather can handle the exception however suits your use case.
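The retry-until-unique pattern above can be sketched in plain Python. Here `StubModel`, `signature`, and `generate_unique` are hypothetical stand-ins for illustration, not Ax APIs; the stub deliberately emits a duplicate candidate to show the retry in action:

```python
class StubModel:
    """Hypothetical stand-in for an Ax model: yields a fixed sequence of
    candidate parameterizations, including an intentional duplicate."""
    def __init__(self):
        self._queue = [
            {"x": 1, "y": 2},
            {"x": 1, "y": 2},  # duplicate of the first candidate
            {"x": 3, "y": 4},
        ]
        self._i = 0

    def gen(self):
        params = self._queue[self._i % len(self._queue)]
        self._i += 1
        return params

def signature(params):
    # Canonical signature for a parameterization: sorted (name, value) pairs.
    return tuple(sorted(params.items()))

def generate_unique(model, seen, max_draws=5):
    """Retry model.gen() until an unseen arm appears or max_draws is exhausted."""
    for _ in range(max_draws):
        params = model.gen()
        sig = signature(params)
        if sig not in seen:
            seen[sig] = params
            return params
    raise RuntimeError(f"Could not generate a unique arm in {max_draws} draws.")

seen = {}
model = StubModel()
first = generate_unique(model, seen)   # {'x': 1, 'y': 2}
second = generate_unique(model, seen)  # the duplicate draw is skipped
print(second)  # {'x': 3, 'y': 4}
```

The second call consumes the duplicate candidate, rejects it against `seen`, and returns the next fresh point, so no abandoned trials accumulate.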
Summary: Update docs to point users toward the Service API ([context](#2392)). Reviewed By: mpolson64 Differential Revision: D56635704 fbshipit-source-id: f68dd0e2248fd662c82d1c5c9f8c712466022240
Hello, the deduplication function with the Developer API is working well, preventing any duplication from occurring. We have also taken your advice and recently switched to the Ax Service API. Thank you for your assistance.
Hello,
I have been utilizing the multi-objective optimization approach for my specific use case. Here is a brief overview of the code I have implemented:
The code has been functioning effectively up until now. However, I have recently encountered an error. It appears that the generator may be producing duplicate data points during its operation.
Trial 23 and Trial 24 have the same parameters.
Upon investigating potential causes, I found a similar issue here: "Completed Multi-objective NAS experiments without metrics and repeated parameter selections". The user in that issue also faced a problem with duplicate data generation. The resolution suggested for their situation involved utilizing the Ax Service API and setting the `should_deduplicate` parameter to `True` in the call to `choose_generation_strategy`.

Given this information, I am inquiring whether there is an equivalent parameter available within the Ax Developer API that I could use to address the issue of duplicate data generation in my case. If such a parameter exists, I would greatly appreciate guidance on how to implement it in my current code setup.