feat: add autonomy controller and autonomy manager #2033

Open
wants to merge 4 commits into
base: master

vie-serendipity
Contributor

chore: gci write

chore: add parameter

fix: run autonomy manager asynchronously

fix: disable autonomy manager if it's a cloud node

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:
/kind bug
/kind documentation
/kind enhancement
/kind good-first-issue
/kind feature
/kind question
/kind design
/sig ai
/sig iot
/sig network
/sig storage

/kind feature

What this PR does / why we need it:

Supplement autonomy ability.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


other Note

@rambohe-ch
Member

@vie-serendipity Thanks for posting this pull request. Is it ready for review?

@vie-serendipity
Contributor Author

@rambohe-ch I think it's ready for review; I just need to revise it and add more tests.

@vie-serendipity vie-serendipity marked this pull request as ready for review April 26, 2024 09:43
codecov bot commented Apr 28, 2024

Codecov Report

Attention: Patch coverage is 64.68401% with 95 lines in your changes missing coverage. Please review.

Project coverage is 56.15%. Comparing base (c61299e) to head (08f3266).
Report is 4 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...manager/controller/autonomy/autonomy_controller.go | 45.58% | 33 Missing and 4 partials ⚠️ |
| pkg/yurthub/cachemanager/cache_manager.go | 52.85% | 31 Missing and 2 partials ⚠️ |
| pkg/yurthub/cachemanager/error_keys.go | 82.17% | 15 Missing and 8 partials ⚠️ |
| pkg/yurthub/proxy/local/faketoken.go | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2033      +/-   ##
==========================================
+ Coverage   56.09%   56.15%   +0.05%     
==========================================
  Files         186      188       +2     
  Lines       18092    18344     +252     
==========================================
+ Hits        10149    10301     +152     
- Misses       6910     6995      +85     
- Partials     1033     1048      +15     

| Flag | Coverage Δ |
|---|---|
| unittests | 56.15% <64.68%> (+0.05%) ⬆️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@vie-serendipity
Contributor Author

/rerun

@vie-serendipity
Contributor Author

/rerun

cm.Lock()
cm.errorKeys[keyBuildInfo] = err
cm.Unlock()
cm.deltaFIFO <- Delta{Key: keyBuildInfo, Err: err, Type: Added}
Member

I'm worried that this channel could overflow if too many errors occur. Should we handle this corner case?
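
One possible way to address the overflow concern is a non-blocking send on a bounded channel, so the producer never stalls when the buffer is full. The sketch below is only an illustration: the `Delta` struct is a simplified stand-in for the PR's type, and `tryEnqueue` and the buffer size are hypothetical names and values, not part of the PR.

```go
package main

import "fmt"

// Delta is a simplified, hypothetical stand-in for the Delta type in the PR.
type Delta struct {
	Key string
	Err error
}

// tryEnqueue attempts a non-blocking send on a buffered channel.
// It returns false when the buffer is full, so the caller can decide
// whether to drop the item, count it, or retry later.
func tryEnqueue(ch chan<- Delta, d Delta) bool {
	select {
	case ch <- d:
		return true
	default:
		return false
	}
}

func main() {
	// Buffer size of 2 is an arbitrary value chosen for the demonstration.
	ch := make(chan Delta, 2)
	dropped := 0
	for i := 0; i < 5; i++ {
		d := Delta{Key: fmt.Sprintf("key-%d", i), Err: fmt.Errorf("cache failure %d", i)}
		if !tryEnqueue(ch, d) {
			dropped++
		}
	}
	fmt.Printf("queued=%d dropped=%d\n", len(ch), dropped)
}
```

Whether dropping is acceptable depends on how the consumer uses the deltas; since the errors are also recorded in `cm.errorKeys`, a dropped delta might be recoverable from that map, but that is an assumption about the design.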

// ConsistencyValidate check the data consistency between cache manager and api server
func (am *AutonomyManager) ConsistencyValidate(client kubernetes.Interface) {
klog.Info("start to check the data consistency between cache manager and api server")
node, err := client.CoreV1().Nodes().Get(context.TODO(), am.nodeName, metav1.GetOptions{})
Member

Having Yurthub fetch the node data every 20s is very costly, because it will significantly increase cloud-edge traffic.

}
}

func EnsureAutonomyCondition(client kubernetes.Interface, node *v1.Node, oldConditionStatus []v1.ConditionStatus, expectedConditionStatus v1.ConditionStatus, reason, message string) error {
Member

It is not a good idea for Yurthub to update the Node status directly. How about intercepting the node status update request from kubelet and modifying the node status condition in that request?
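
The core of the interception idea is an in-place "upsert" of one condition in the status that kubelet already sends, rather than a separate write from Yurthub. A minimal sketch of that step follows. It uses a local `Condition` struct instead of the real `v1.NodeCondition` to stay self-contained, and the `setCondition` helper and the `"Autonomy"` condition type are hypothetical names for illustration.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for v1.NodeCondition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// setCondition replaces the condition with the same Type if one exists,
// otherwise appends it. The other conditions are left untouched.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

func main() {
	// Conditions as they might arrive in a kubelet status update.
	conds := []Condition{{Type: "Ready", Status: "True"}}

	// "Autonomy" is an assumed condition type name for this sketch.
	conds = setCondition(conds, Condition{
		Type:    "Autonomy",
		Status:  "True",
		Reason:  "CacheConsistent",
		Message: "local cache matches the API server",
	})
	for _, c := range conds {
		fmt.Printf("%s=%s\n", c.Type, c.Status)
	}
}
```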

return
}
key := errorKey.(storage.KeyBuildInfo)
unstructuredObj, err := dynamicClient.Resource(schema.GroupVersionResource{
Member

We need to make sure that dynamicClient has the correct RBAC permissions to get resources from kube-apiserver.

if node.Labels == nil {
node.Labels = make(map[string]string)
}
node.Labels[projectinfo.GetAutonomyStatusLabel()] = "true"
Member

Isn't GetAutonomyStatusLabel stored in the node annotations?

Member

By the way, we also need to set the GetAutonomyStatusLabel label to false when the node condition is false.

Contributor Author

GetAutonomyAnnotation is in the node annotations, and users set it to enable node autonomy. The actual autonomy status is reflected in the label GetAutonomyStatusLabel.
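
The split described in this thread (an annotation as the user's request, a label as the observed status that is written as both "true" and "false") could be sketched roughly as follows. The `Node` struct is a local stand-in for `v1.Node`, and the key strings and `syncAutonomyLabel` are hypothetical; the real keys come from `projectinfo.GetAutonomyAnnotation()` and `projectinfo.GetAutonomyStatusLabel()`.

```go
package main

import (
	"fmt"
	"strconv"
)

// Node is a simplified, hypothetical stand-in for v1.Node metadata.
type Node struct {
	Annotations map[string]string
	Labels      map[string]string
}

// Placeholder keys for this sketch only; the real values are defined
// in the project's projectinfo package.
const (
	autonomyAnnotation  = "example.io/autonomy"
	autonomyStatusLabel = "example.io/autonomy-status"
)

// syncAutonomyLabel writes the observed autonomy status into the label,
// covering the false case as well as the true case.
func syncAutonomyLabel(n *Node, autonomous bool) {
	if n.Labels == nil {
		n.Labels = make(map[string]string)
	}
	n.Labels[autonomyStatusLabel] = strconv.FormatBool(autonomous)
}

func main() {
	// The user opts in through the annotation.
	n := &Node{Annotations: map[string]string{autonomyAnnotation: "true"}}

	// The controller reports the observed status through the label.
	syncAutonomyLabel(n, true)
	fmt.Println(n.Labels[autonomyStatusLabel])

	syncAutonomyLabel(n, false)
	fmt.Println(n.Labels[autonomyStatusLabel])
}
```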

sonarcloud bot commented Jun 6, 2024

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

2 participants