[doc][yba] 2024.1 YBA CLI landing page #22209

Open
wants to merge 196 commits into master

Conversation

ddhodge
Contributor

@ddhodge ddhodge commented Apr 30, 2024

YBA CLI landing page
DOC-330

@netlify /preview/yugabyte-platform/anywhere-automation/

rajmaddy89 and others added 30 commits April 17, 2024 17:52
…p string

Summary:
When we first look at the input config, the problem with the current logic is that it finds the first instance of the string directly; instead, we need to match via regex with whitespace characters.

For example:
Validation FAILS:
host all +ldap_service_users 10.0.0.0/8 ldap  ldapurl=ldaps://ldap.dev.schwab.com:636 ldapsearchattribute=""sAMAccountName"" ldapbasedn=""OU=ServiceAccount,DC=csdev,DC=corp""
ldapbinddn=""CN=svc.yb_ldap_dev,OU=ServiceAccount,DC=csdev,DC=corp"" ldapbindpasswd=""Password""

The above fails because the first instance of `ldap` found is inside `+ldap_service_users`, which is wrong; the correct first instance is the whole word delimited by whitespace characters.

Validation SUCCEEDS:
host all +asldwfhhasg 10.0.0.0/8 ldap  ldapurl=ldaps://ldap.dev.schwab.com:636 ldapsearchattribute=""sAMAccountName"" ldapbasedn=""OU=ServiceAccount,DC=csdev,DC=corp"" ldapbinddn=""CN=svc.yb_ldap_dev,OU=ServiceAccount,DC=csdev,DC=corp""
ldapbindpasswd=""Password""
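
For illustration, here is a minimal standalone sketch of the whitespace-delimited matching idea (hypothetical code, not the actual YBA validation):

```
#include <iostream>
#include <regex>
#include <string>

int main() {
  // Abbreviated form of the failing hba line from the summary above.
  const std::string line =
      "host all +ldap_service_users 10.0.0.0/8 ldap ldapurl=ldaps://example.com:636";

  // Naive substring search finds "ldap" inside "+ldap_service_users" -- wrong.
  std::cout << "substring match at index " << line.find("ldap") << "\n";

  // Whitespace-delimited regex finds only the standalone auth-method token.
  const std::regex ldap_token(R"((^|\s)(ldap)(\s|$))");
  std::smatch m;
  if (std::regex_search(line, m, ldap_token)) {
    std::cout << "token match at index " << m.position(2) << "\n";
  }
}
```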

Test Plan:
Please refer to the screenshots
{F170693}

{F170694}

Reviewers: jmak

Reviewed By: jmak

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34230
…ount

Summary:
In this issue we see a tserver crash whose root cause is that
yb-master's catalog version went backwards. The tserver had already seen the
version pair (320, 242) for DB 16429, but later yb-master's catalog version
pair for DB 16429 became (319, 242). Timing-wise it coincided with a PITR
restore operation, and we see the following log repeatedly after the PITR restore operation:

```
E0401 18:14:53.567183 31982 tablet_server.cc:917] Ignoring ysql db 16429 catalog
version update: new version too old. New: 319, Old: 320
```

It is unexpected for catalog version to go backwards on a PITR restore operation.

This lasted for more than 20 minutes until a ysql_dump-based restore operation
happened. As part of the restore operation we were running a global-impact DDL
statement:

```
\if :use_roles
    ALTER DATABASE "postgres_88" OWNER TO postgres;
ALTER DATABASE
\endif
```

This DDL will increment catalog versions for all databases, including 16429. Now
the new pair for 16429 becomes (320, 320). Note that the second number is the
breaking version. When it changes it is always equal to the first number. It is
this new pair (320, 320) that caused the tserver to crash: the tserver had seen the
breaking version associated with current version 320 as 242, and it is not possible
for the breaking version to change while the current version stays the same.

This bug has so far only appeared once in about 80+ runs of the integration
test `test_cross_db_concurrent_ddls`. The root cause is that the master
version went backwards. However, we only have a LOG(DFATAL) when that happens.
If it happens in a production environment running a release build,
this can simply go undetected for a long time and eventually crash when a new
breaking DDL statement bumps up the current catalog version and breaking
version at the same time.

I added a new gflag --ysql_min_new_version_ignored_count: if we see a stale
catalog version returned from master this many consecutive times, we crash the
tserver early so it syncs up with master again. This helps to
* reproduce the bug more easily in our integration test (which now runs with --log_ysql_catalog_versions=true)
* avoid running a tserver for too long while its catalog version is out
  of sync with master.

This diff is only limited to per-database catalog version mode because we do not
want to change the behavior in global catalog version mode at this time.
Jira: DB-10651
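
As a rough sketch of the new counting logic (names and the default threshold are assumptions; the real check lives in the tserver heartbeat path):

```
#include <cstdint>
#include <iostream>

// Illustrative stand-in for the check guarded by
// --ysql_min_new_version_ignored_count; not the actual tserver code.
struct CatalogVersionTracker {
  uint64_t current_version = 0;
  int ignored_count = 0;
  int min_new_version_ignored_count = 10;  // assumed gflag value

  // Returns true when the tserver should fail fast and resync with master.
  bool OnMasterVersion(uint64_t new_version) {
    if (new_version >= current_version) {
      current_version = new_version;
      ignored_count = 0;
      return false;
    }
    // Stale version from master: tolerate a few, then give up early.
    return ++ignored_count >= min_new_version_ignored_count;
  }
};

int main() {
  CatalogVersionTracker t;
  t.OnMasterVersion(320);
  for (int i = 0; i < 12; ++i) {
    if (t.OnMasterVersion(319)) {
      std::cout << "would FATAL after " << t.ignored_count << " stale updates\n";
      break;
    }
  }
}
```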

Test Plan:
(1) ./yb_build.sh release --cxx-test pg_catalog_version-test

(2) Manual test
* # create a RF-3 local cluster

  ./bin/yb-ctl create --rf 3

* ./bin/ysqlsh -c "create table foo(id int)"

* # run a DDL that increments DB yugabyte's current_version to 2

  ./bin/ysqlsh -c "alter table foo add column v1 text"

* # verify database yugabyte's current_version is 2

  ./bin/ysqlsh -c "select * from pg_yb_catalog_version"

* # manually force DB yugabyte's version to go back to 1

  ./bin/ysqlsh -c "set yb_non_ddl_txn_for_sys_tables_allowed=1; update pg_yb_catalog_version set current_version = 1"

* Wait and see all 3 tservers crash with the expected log messages:

```
F0415 23:51:02.940397 29969 tablet_server.cc:924] Ignoring ysql db 13248 catalog version update: new
version too old. New: 1, Old: 2, ignored count: 19
F0415 23:51:05.871065 30011 tablet_server.cc:924] Ignoring ysql db 13248 catalog version update: new
version too old. New: 1, Old: 2, ignored count: 31
F0415 23:51:17.931600 29927 tablet_server.cc:924] Ignoring ysql db 13248 catalog version update: new
version too old. New: 1, Old: 2, ignored count: 48
```

Reviewers: jason

Reviewed By: jason

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D34146
… SST files only retained for CDC"

Summary:
D33131 introduced a segmentation fault that was identified in multiple tests.
```
* thread #1, name = 'yb-tserver', stop reason = signal SIGSEGV
  * frame #0: 0x00007f4d2b6f3a84 libpthread.so.0`__pthread_mutex_lock + 4
    frame #1: 0x000055d6d1e1190b yb-tserver`yb::tablet::MvccManager::SafeTimeForFollower(yb::HybridTime, std::__1::chrono::time_point<yb::CoarseMonoClock, std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l>>>) const [inlined] std::__1::unique_lock<std::__1::mutex>::unique_lock[abi:v170002](this=0x00007f4ccb6feaa0, __m=0x0000000000000110) at unique_lock.h:41:11
    frame #2: 0x000055d6d1e118f5 yb-tserver`yb::tablet::MvccManager::SafeTimeForFollower(this=0x00000000000000f0, min_allowed=<unavailable>, deadline=yb::CoarseTimePoint @ 0x00007f4ccb6feb08) const at mvcc.cc:500:32
    frame #3: 0x000055d6d1ef58e3 yb-tserver`yb::tablet::TransactionParticipant::Impl::ProcessRemoveQueueUnlocked(this=0x000037e27d26fb00, min_running_notifier=0x00007f4ccb6fef28) at transaction_participant.cc:1537:45
    frame #4: 0x000055d6d1efc11a yb-tserver`yb::tablet::TransactionParticipant::Impl::EnqueueRemoveUnlocked(this=0x000037e27d26fb00, id=<unavailable>, reason=<unavailable>, min_running_notifier=0x00007f4ccb6fef28, expected_deadlock_status=<unavailable>) at transaction_participant.cc:1516:5
    frame #5: 0x000055d6d1e3afbe yb-tserver`yb::tablet::RunningTransaction::DoStatusReceived(this=0x000037e2679b5218, status_tablet="d5922c26c9704f298d6812aff8f615f6", status=<unavailable>, response=<unavailable>, serial_no=56986, shared_self=std::__1::shared_ptr<yb::tablet::RunningTransaction>::element_type @ 0x000037e2679b5218) at running_transaction.cc:424:16
    frame #6: 0x000055d6d0d7db5f yb-tserver`yb::client::(anonymous namespace)::TransactionRpcBase::Finished(this=0x000037e29c80b420, status=<unavailable>) at transaction_rpc.cc:67:7
```
This diff reverts the change to unblock the tests.

The proper fix for this problem is WIP.
Jira: DB-10780, DB-10466

Test Plan: Jenkins: urgent

Reviewers: rthallam

Reviewed By: rthallam

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D34245
* Sandbox support for Innovation track

* typo
…from multiple tablets of a tserver leading to undetected deadlocks

Summary:
The local waiting txn registry at the Tablet Server maintains the wait-for dependencies arising at all tablet leaders hosted on the node. When a request is processed at the wait-queue, the local registry sends a partial update containing just the wait-for dependencies of that request. The registry keeps accumulating all dependencies, and periodically sends full update requests comprising the wait-for dependencies of all outstanding requests.

`UpdateTransactionWaitingForStatusRequestPB` proto is used for sending the wait-for dependency info. Each `WaitingTransaction` in `UpdateTransactionWaitingForStatusRequestPB` has a `wait_start_time` which is populated at the local registry and is set to `clock_->Now()`.

The deadlock detector maintains a container `waiters_` indexed by the key pair <txn_id, tserver_uuid>. In the existing implementation, the detector overwrites the wait-for dependencies of a waiter when it encounters a `WaitingTransaction` with a later timestamp than the existing one. Since multiple requests of the same txn (at the same tablet or at multiple tablets) with different blockers can exist at a given time, this led to incomplete wait-for info at the detector, thus resulting in undetected deadlocks.

This diff addresses the issue by changing the logic at the detector to keep track of all dependencies of the waiter and not overwrite it based on start time. Each time the detector sees a `WaitingTransaction`, it triggers probes for just the new wait-for dependencies contained in the message and appends the blocker info to the existing waiter record (if any). Additionally, we propagate the request id info of the waiter to the deadlock detector, and store a list of request ids blocked on the `{blocker_id, subtxn, status_tablet}` tuple.

Note that the diff doesn't change the periodic deadlock probing algorithm at the detector.
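
A minimal sketch of the append-instead-of-overwrite bookkeeping (types heavily simplified; the real detector keys waiters by <txn_id, tserver_uuid> and stores subtxn and request-id info):

```
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <utility>

using WaiterKey = std::pair<std::string, std::string>;  // <txn_id, tserver_uuid>
using BlockerId = std::string;                          // blocker txn id

std::map<WaiterKey, std::set<BlockerId>> waiters;

// Append new blockers to the waiter record instead of overwriting it, and
// return only the genuinely new edges so probes are triggered just for those.
std::set<BlockerId> AddWaitFor(const WaiterKey& w, const std::set<BlockerId>& blockers) {
  auto& existing = waiters[w];
  std::set<BlockerId> fresh;
  for (const auto& b : blockers) {
    if (existing.insert(b).second) fresh.insert(b);
  }
  return fresh;
}

int main() {
  WaiterKey w{"txn-w", "ts-1"};
  AddWaitFor(w, {"b1"});  // edge w->b1 recorded
  AddWaitFor(w, {"b2"});  // edge w->b2 recorded; w->b1 is kept, not erased
  std::cout << waiters[w].size() << "\n";  // 2: both dependencies retained
}
```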

Sample vlogs from the detector for one of the below tests
```
[ts-1] I0416 05:31:20.734838 129157 deadlock_detector.cc:611] vlog4: T 9eed4e1f5b2745aeb8333e8fe0fd9c61 D c78320b1-e14c-44ba-b6fe-1518c4be25d2 Adding new wait-for relationship -- waiter txn id: 631c3948-6816-403c-890a-95560fc4615f blocker id: 6cb37588-3d67-4301-8336-cf944222cdab, status tablet: 9eed4e1f5b2745aeb8333e8fe0fd9c61, blocking subtxn info: [2, 2], waiting_requests (id, start_time): [{11, { days: 19829 time: 05:31:20.733994 }}] received from TS: 25db8deb4335401ab1261c01e8634ab8
[ts-1] I0416 05:31:20.735044 129157 deadlock_detector.cc:629] vlog4: T 9eed4e1f5b2745aeb8333e8fe0fd9c61 D c78320b1-e14c-44ba-b6fe-1518c4be25d2 Updated blocking data -- txn_id_: 631c3948-6816-403c-890a-95560fc4615f, tserver_uuid_: 25db8deb4335401ab1261c01e8634ab8, waiter_data_: [blocker id: b9356bf9-6bb6-4ef6-be4e-96dcfec46ddd, status tablet: a8643356f0e24c1a836b029f1bea0dfa, blocking subtxn info: [4, 4], waiting_requests (id, start_time): [{10, { days: 19829 time: 05:31:20.733203 }}], blocker id: 6cb37588-3d67-4301-8336-cf944222cdab, status tablet: 9eed4e1f5b2745aeb8333e8fe0fd9c61, blocking subtxn info: [2, 2], waiting_requests (id, start_time): [{11, { days: 19829 time: 05:31:20.733994 }}, {9, { days: 19829 time: 05:31:20.727855 }}]]
[ts-1] I0416 05:31:20.735205 129157 deadlock_detector.cc:244] vlog4: T 9eed4e1f5b2745aeb8333e8fe0fd9c61 D c78320b1-e14c-44ba-b6fe-1518c4be25d2 - probe(c78320b1-e14c-44ba-b6fe-1518c4be25d2, 1) AddBlocker: waiting_txn_id: 631c3948-6816-403c-890a-95560fc4615f, blocker id: b9356bf9-6bb6-4ef6-be4e-96dcfec46ddd, status tablet: a8643356f0e24c1a836b029f1bea0dfa, blocking subtxn info: [4, 4], waiting_requests (id, start_time): [{10, { days: 19829 time: 05:31:20.733203 }}], probe_num: 1, min_probe_num: 0
[ts-1] I0416 05:31:20.735318 129157 deadlock_detector.cc:244] vlog4: T 9eed4e1f5b2745aeb8333e8fe0fd9c61 D c78320b1-e14c-44ba-b6fe-1518c4be25d2 - probe(c78320b1-e14c-44ba-b6fe-1518c4be25d2, 1) AddBlocker: waiting_txn_id: 631c3948-6816-403c-890a-95560fc4615f, blocker id: 6cb37588-3d67-4301-8336-cf944222cdab, status tablet: 9eed4e1f5b2745aeb8333e8fe0fd9c61, blocking subtxn info: [2, 2], waiting_requests (id, start_time): [{11, { days: 19829 time: 05:31:20.733994 }}, {9, { days: 19829 time: 05:31:20.727855 }}], probe_num: 1, min_probe_num: 0
```

Test Plan:
Jenkins

./yb_build.sh --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueueRF1Test.TestDeadlockAcrossMultipleTablets -n 20
./yb_build.sh --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueueRF1Test.TestDetectorPreservesBlockerSubtxnInfo -n 20
./yb_build.sh --cxx-test='TEST_F(UnsignedIntSetTest, Hash) {'

Test 1 fails consistently prior to this diff: w waits on blockers b1 and b2; the test ensures that the detector doesn't erase w->b1 on seeing w->b2.
Test 2 ensures that the detector doesn't overwrite the blocking subtxn info of a given blocker: w waits on b1 (subtxn 2) and b1 (subtxn 3); the test asserts that the detector doesn't erase w->b1 (subtxn 2) on seeing w->b1 (subtxn 3).

Reviewers: rsami

Reviewed By: rsami

Subscribers: yql, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D33641
Summary:
Audit logs were not exported from a newly added read replica cluster. We were sending `AuditLogConfig` as null when calling the read replica API. Since the primary cluster already has the `AuditLogConfig`, we should use that instead when provisioning the new nodes.

The following flow didn't work before but does now:
```
Create a universe without RR.
Enable DB audit logging.
Add RR to this universe.
Verify --> Audit logs are not visible on DD
```

Test Plan:
Manually tested the following flow:
Tried the following scenarios:
Case 1:
```
Create a universe with primary cluster and RR.
Enable DB audit logs.
Verify. ---> Works as expected. Audit logs from primary cluster nodes and RR nodes are visible on DD.
```

Case 2:
```
Create a universe without RR.
Enable DB audit logging.
Add RR to this universe.
Verify --> Works as expected. Audit logs from primary cluster nodes and RR nodes are visible on DD.
```

Case 3 (Patched this diff on my diff: https://phorge.dev.yugabyte.com/D33949):
```
Create a universe without RR.
Enable DB audit logging.
Add RR to this universe.
Verify --> Works as expected. Audit logs from primary cluster nodes and RR nodes are visible on DD.
Add new node to universe.
Verify --> Works as expected. Audit logs from both primary cluster nodes and RR nodes are visible on DD.
```

Reviewers: amalyshev

Reviewed By: amalyshev

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D33995
Summary:
In case the AMI is not present for a region in the bundle, we were falling
back to retrieving it from the region.
This diff makes the following changes:
1. For YBA-managed bundles, we will read the AMIs from the YBA metadata in
case they are not present in the bundle.

2. For custom bundles, we will fail without any fallback mechanism.

Removes the dependency on region -> ybImage.

Test Plan:
Created a provider with custom bundle.
Removed the ybImage from the bundle.
Deployed the universe using the same. Verified that it failed.

Created a provider with YBA managed bundles.
Removed the ybImage from the bundle.
Deployed the universe. Verified that it picks up from the YBA's image metadata.

Reviewers: amalyshev, nbhatia

Reviewed By: amalyshev

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34119
Summary:
Special characters like `$` get escaped when passwords are provided via flags. The help text
indicates using single quotes (`''`) so these values are parsed correctly.

Test Plan: Test create universe and ysql connection with single quotes

Reviewers: skurapati, rohita.payideti

Reviewed By: rohita.payideti

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34264
Test Plan:
Start tserver with rpc bind and http addresses set differently and verify that the display shows this. Tested with --rpc_bind_addresses set to 0.0.0.0 and verified that the hostname shows up instead (this seems to depend on whether ybdb can discover a local hostname)

{F166067}

Reviewers: hsunder

Reviewed By: hsunder

Subscribers: hsunder, esheng, ybase, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D33726
Summary:
Created a new class to exclude base upgrade tasks from unit tests.

The general idea is to distinguish the checking of basic subtasks (like `WaitForServer` etc.) from logic that is specific to a particular upgrade.
In this approach, each upgrade test checks only the actual upgrade subtasks (like `InstanceActions`) and the nodes these actions are applied to.

There should also be a test that checks the basic sequence (but it is currently missing in this diff!).

The problem with the approach in master is that all upgrade tests check both things, and we have to modify all of the tests if we alter the basic logic.

Test Plan: sbt test

Reviewers: nsingh, sanketh

Reviewed By: nsingh

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D33032
…rver proxy if not using a distributed transaction

Summary:
Before this revision, every RollbackToSubTransaction operation in PG would lead to a corresponding RPC call to the local tserver. The local tserver used to
return early in case there was no distributed transaction.

This revision adds the logic in the PG layer (pg_session) to skip sending the RPC if the transaction is read-only or a fast-path transaction i.e., has NON_TRANSACTIONAL isolation level. Note that we were already doing that for transaction commit/aborts but weren't skipping the RPC for rollback of sub-transaction.

This change was proposed as part of the implementation of the PG compatible logical replication support. While streaming the changes to the Walsender, it starts and aborts transactions for every transaction that gets streamed. This is required for reading PG catalog tables. As a result, we were seeing a lot of unnecessary RPC calls to the local tserver.
Jira: DB-10402
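
A minimal sketch of the skip condition (names are assumptions; the actual gate sits in pg_session):

```
#include <iostream>

enum class IsolationLevel { NON_TRANSACTIONAL, SNAPSHOT, SERIALIZABLE };

// Only distributed transactions have sub-transaction state on the tserver,
// so read-only and fast-path transactions can skip the rollback RPC.
bool ShouldSendRollbackToSubTxnRpc(bool read_only, IsolationLevel isolation) {
  return !read_only && isolation != IsolationLevel::NON_TRANSACTIONAL;
}

int main() {
  std::cout << std::boolalpha
            << ShouldSendRollbackToSubTxnRpc(true, IsolationLevel::SNAPSHOT) << "\n"            // false
            << ShouldSendRollbackToSubTxnRpc(false, IsolationLevel::NON_TRANSACTIONAL) << "\n"  // false
            << ShouldSendRollbackToSubTxnRpc(false, IsolationLevel::SNAPSHOT) << "\n";          // true
}
```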

Test Plan: All tests

Reviewers: asrinivasan, pjain

Reviewed By: pjain

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34162
Summary: This diff enables the OS Patching runtime flag by default.

Test Plan: iTest pipeline

Reviewers: amalyshev, nbhatia, #yba-api-review!

Reviewed By: amalyshev

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34184
…record if needed

Summary:
When there's a `GetChanges` request (req_1) and service layer receives a `CacheMissError`, it refetches the enum labels and executes a new internal `GetChanges` request (req_2) for a fresh `GetChangesResponse`.

Now suppose this is the first `GetChanges` request from the connector, which still hasn't received the DDL record. After the service clears the response, it looks at the `cached_schema_details` object while making `req_2` to decide whether or not to publish the DDL record. But since we have already populated `cached_schema_details` while processing `req_1`, we do not populate the DDL record, so the client will not receive the DDL record in `GetChangesResponse`, causing it to fail while decoding further change events.

**Solution:**

This diff implements a simple solution by clearing the `cached_schema_details` while executing `req_2` if the connector/client has indicated that it needs the schema i.e. if `req->need_schema_info() == true`.
Jira: DB-9701
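
A minimal sketch of the fix, with illustrative types (the real logic lives in the CDC service layer):

```
#include <iostream>
#include <map>
#include <string>

struct GetChangesRequest { bool need_schema_info = false; };  // illustrative
using SchemaDetailsMap = std::map<std::string, std::string>;  // table_id -> schema

// Before re-issuing the internal request (req_2) after a CacheMissError:
// if the client asked for schema info, drop the cache populated by req_1
// so the retry re-emits the DDL record instead of assuming it was shipped.
void PrepareRetry(const GetChangesRequest& req, SchemaDetailsMap* cached_schema_details) {
  if (req.need_schema_info) {
    cached_schema_details->clear();
  }
}

int main() {
  SchemaDetailsMap cache{{"table-1", "schema-v1"}};
  GetChangesRequest req;
  req.need_schema_info = true;
  PrepareRetry(req, &cache);
  std::cout << cache.size() << "\n";  // 0: DDL record will be repopulated
}
```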

Test Plan:
```
./yb_build.sh --cxx-test cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestPopulationOfDDLRecordUponCacheMiss
```

Reviewers: skumar, asrinivasan, stiwary

Reviewed By: skumar

Subscribers: ycdcxcluster

Differential Revision: https://phorge.dev.yugabyte.com/D34107
Summary:
This change addresses a bug introduced in diff D32566 that caused some tablet metrics to have the wrong metric_type. It also fixes a pre-existing issue with how metric attributes are stored.

**Root Cause:**
D32566 started storing table metrics in the aggregation map for grouping purposes. (Previously, table metrics were flushed directly if they were at the table level, as no aggregation was needed.) Because of this, it also started saving the attributes of these table metrics in an attributes map with their entity_id, which is the table_id, as the key.

However, there were two problems:
- Table-level attribute collision: Both table and tablet metrics used the table_id as the key for storing their attributes, leading to collisions and incorrect attributes when aggregating them at the table level.
- Potential pre-existing stream-level attribute collision: Even before D32566, using entity_id as the key wasn't ideal because some metrics like XClusterMetric and CdcsdkMetric have different attribute structures despite having the same entity_id (stream_id in this case).

**Fix:**
This change addresses both issues by storing metric attributes with a composite key consisting of:
- metric_type: Identifies the specific type of metric (e.g., XClusterMetric, CdcsdkMetric).
- entity_id: Identifies the entity the metric belongs to (e.g., table_id, stream_id).
This approach ensures unique keys for storing metric attributes and avoids collisions based solely on entity_id.
Jira: DB-10501
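
A minimal sketch of the composite-key idea (types illustrative):

```
#include <iostream>
#include <map>
#include <string>
#include <utility>

using MetricAttributes = std::map<std::string, std::string>;
// Composite key: <metric_type, entity_id> instead of entity_id alone.
using AttributeKey = std::pair<std::string, std::string>;

int main() {
  std::map<AttributeKey, MetricAttributes> attributes;
  // Same entity_id ("stream-1") but different metric types: no collision.
  attributes[{"XClusterMetric", "stream-1"}] = {{"producer_universe", "u1"}};
  attributes[{"CdcsdkMetric", "stream-1"}]   = {{"namespace", "db1"}};
  std::cout << attributes.size() << "\n";  // 2 distinct attribute records
}
```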

Test Plan:
Jenkins: urgent

To verify the fix addresses both issues, a `DCHECK` was added in `PrometheusWriter::AddAggregatedEntry`. This check compares the stored attribute map with the incoming attribute map. If there's a mismatch, it indicates a collision. This DCHECK effectively covers both scenarios:
- Table level Attribute Collision: Detected by `PrometheusMetricFilterTest.TestV1Default`
- Potential Pre-existing Stream level Attribute Collision: Detected by `MetricsTest.VerifyHelpAndTypeTags`

Reviewers: mlillibridge, rthallam

Reviewed By: mlillibridge

Subscribers: bogdan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D33396
* partition by region node settings

* screen shot

* update CLI

* update CLI

* minor edit

* review comments

* cli help edits

* update screenshots
* remove redis/yedis references from docs

* remove old realworld apps

* Apply suggestions from code review

Co-authored-by: Dwight Hodge <79169168+ddhodge@users.noreply.github.com>

* fix broken links

* fix external links

* fix link

---------

Co-authored-by: Dwight Hodge <79169168+ddhodge@users.noreply.github.com>
Co-authored-by: Dwight Hodge <ghodge@yugabyte.com>
…tional replication setup

Summary:
This commit modifies the behavior when a user adds a YSQL table to a bidirectional replication. With these changes, the bootstrapping process is always skipped when adding a table to a bidirectional replication, regardless of whether it is required or not.

The detection of bidirectional replication operates at the database granularity. This means that when adding a table to a replication, the replication is considered bidirectional if any sibling table (i.e., other tables within the same database as the table being added) is already part of a bidirectional replication.

Note: because the bootstrapping is skipped, it will be the responsibility of the user to ensure the existing data are copied over.
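
For illustration, a standalone sketch of the database-granularity detection (YBA itself is Java; the names here are assumptions):

```
#include <iostream>
#include <map>
#include <set>
#include <string>

// The replication is treated as bidirectional for a new table if any sibling
// table in the same database is already part of a bidirectional replication.
bool IsBidirectionalForTable(
    const std::string& table_id,
    const std::map<std::string, std::string>& table_to_database,
    const std::set<std::string>& bidirectional_tables) {
  const std::string& db = table_to_database.at(table_id);
  for (const auto& sibling : bidirectional_tables) {
    auto it = table_to_database.find(sibling);
    if (it != table_to_database.end() && it->second == db) {
      return true;  // sibling in the same DB -> skip bootstrapping
    }
  }
  return false;
}

int main() {
  const std::map<std::string, std::string> db_of = {
      {"t1", "db1"}, {"t2", "db1"}, {"t3", "db2"}};
  const std::set<std::string> bidi = {"t2"};
  std::cout << IsBidirectionalForTable("t1", db_of, bidi) << "\n";  // 1
  std::cout << IsBidirectionalForTable("t3", db_of, bidi) << "\n";  // 0
}
```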

Test Plan:
 - Made sure the user is able to add tables to a bidirectional replication, whether or not bootstrapping is required.
- Made sure for unidirectional replication, it does bootstrapping if required (previous behavior).

Reviewers: #yba-api-review, cwang, jmak, sanketh

Reviewed By: #yba-api-review, sanketh

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34228
…smatch

Summary:
With changes to the send-backup code, sending backups to followers on a lower YBA version is now an error, which meant we weren't syncing the config. As part of config sync we generate version mismatch events. If we sync the config to all instances, we both update the config correctly (so the standby shows the correct state even when it is on a lower version) and the active correctly fires an alert to upgrade the standby.

Also moves the backup update code to when we send the backup, and simplifies the code.

Test Plan: Setup HA, upgrade standby, then promote. Ensure that alert fires and config looks correct on standby.

Reviewers: dshubin, sanketh

Reviewed By: sanketh

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34125
…elCache failure

Summary:
`CREATE PUBLICATION FOR ALL TABLES` invalidates the entire relcache (via CacheInvalidateRelcacheAll) and hence is a global-impact DDL. This test fails as a result.

This diff fixes the test failure by checking the PG backend version via `SELECT version()` and adjusting the test result depending on whether it is PG11 or PG15.
Jira: DB-10952

Test Plan:
In both master (PG11) and PG15 branches, apply the diff patch and run:

./yb_build.sh release --cxx-test pg_catalog_version-test --gtest_filter PgCatalogVersionTest.InvalidateWholeRelCache

Reviewers: aagrawal

Reviewed By: aagrawal

Subscribers: jason, yql

Differential Revision: https://phorge.dev.yugabyte.com/D34262
…@yugabyte-ui-common-component library

Summary:
Handle ASH as a special case, as it has OUTLIER-style buttons in OVERALL mode.
Ensure the graph API call is made when WAIT EVENT, WAIT EVENT CLASS, or WAIT EVENT COMPONENT is selected.

Test Plan: Tested locally via TS Web UI

Reviewers: amalyshev, cdavid

Reviewed By: cdavid

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34243
Summary:
We have a timezone-related bug in the ASH retrieval code, which breaks retrieval.
The old code could also skip some sample events.
Both are fixed now.
It also synchronizes universe details retrieval with the universe metadata update.

Test Plan: Unit tested + tested ASH retrieval manually

Reviewers: rmadhavan, cdavid

Reviewed By: rmadhavan, cdavid

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34236
…tIndex

Summary:
Use std::upper_bound instead of std::lower_bound, which allows finding
the answer in one statement.

Move the sanity check forward so error messages are more accurate.
Jira: DB-10789
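
For illustration, a standalone sketch of why upper_bound answers the lookup in one statement (data and context are hypothetical):

```
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
  // Sorted start offsets of consecutive segments.
  const std::vector<int> starts = {0, 10, 25, 40};
  const int index = 25;

  // upper_bound yields the first start strictly greater than index, so the
  // containing segment is the one just before it. With lower_bound, an extra
  // equality check would be needed when index lands exactly on a boundary.
  const auto it = std::upper_bound(starts.begin(), starts.end(), index);
  const auto segment = std::distance(starts.begin(), it) - 1;
  std::cout << "index " << index << " falls in segment " << segment << "\n";  // 2
}
```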

Test Plan: ./yb_build.sh --cxx-test client_client-test

Reviewers: arybochkin, dmitry, mlillibridge, timur

Reviewed By: arybochkin

Subscribers: ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D33935
…eys when yb.cloud.enabled is true

Summary: This is a workaround to let itests pass. The way we read whether YBM is enabled is messy. Another pass to make it uniform can be done later; it may require validating that all the paths work.

Test Plan: Itest should pass.

Reviewers: cwang, yshchetinin, sanketh, kvikraman

Reviewed By: yshchetinin

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34257
… log anchor session

Summary:
When RBS is done from a non-leader peer, the rbs source creates a session id of the form `<requestor_uuid>-<tablet_id>-<MonoTime::Now()>`. It sends back the same identifier to the destination node, and the rbs destination node uses this for subsequent calls. The same id is used for propagating the log anchor information to the leader peer, i.e., we use this session id in `RegisterLogAnchorRequestPB`.

While creating a session for anchoring the log on the leader, the logic was similar to the following:
```
auto tablet_peer_result = tablet_peer_lookup_->GetServingTablet(req->tablet_id());
...
auto it = log_anchors_map_.find(req->owner_info());
if (it == log_anchors_map_.end()) {
  ...
} else {
  tablet_peer.reset(it->second->tablet_peer_.get());         // <- this line creates a problem
}
```
When re-using the session, `tablet_peer.reset` takes ownership of the underlying managed `TabletPeer` object and fails to consider the existing shared_ptrs. So once it goes out of scope (in `RemoteBootstrapServiceImpl::RegisterLogAnchor`), `~TabletPeer()` is called on the underlying object, which leads to the fatal below.
```
../../src/yb/tablet/maintenance_manager.cc:101] Check failed: !manager_.get() You must unregister the LogGCOp(7b72199ed9714f649d10be762212950d) Op before destroying it.
```

This diff addresses the issue by using the `=` operator, which rightly tracks the existing shared_ptrs as well and doesn't destruct the underlying object once `tablet_peer` goes out of scope.

Note: To repro this in a test, `RemoteBootstrapServiceImpl::RegisterLogAnchor` should be called with the same `owner_info` set in `RegisterLogAnchorRequestPB`. But that is only possible when the rbs source re-uses its rbs session, whose session id is computed with a suffix of `MonoTime::Now()`. So we weren't able to simulate the above crash in a test, but it was observed in the logs reported in the community forum.
Jira: DB-10926
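
For illustration, a standalone sketch of the reset-vs-assignment difference (the buggy line is left commented out, since executing it would double-free):

```
#include <iostream>
#include <memory>

struct TabletPeer {
  ~TabletPeer() { std::cout << "~TabletPeer()\n"; }
};

int main() {
  auto owner = std::make_shared<TabletPeer>();  // e.g. held via log_anchors_map_

  {
    std::shared_ptr<TabletPeer> tablet_peer;
    // BUG: reset(raw_ptr) creates a brand-new control block, so this scope
    // believes it solely owns the object and ~TabletPeer() runs at scope exit
    // even though `owner` still points at it:
    //   tablet_peer.reset(owner.get());

    // FIX: copy assignment shares the existing control block; the object
    // stays alive because `owner` still holds a reference.
    tablet_peer = owner;
  }

  std::cout << "owner still valid: " << (owner != nullptr) << "\n";  // 1
}
```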

Test Plan: Jenkins

Reviewers: amitanand

Reviewed By: amitanand

Subscribers: ybase

Differential Revision: https://phorge.dev.yugabyte.com/D34205
Summary:
Show the release date as an empty string in release details.
For customers the release date will in general never be empty; it will always have a date. For internal usage, however, the release date for most dev builds will be empty.

Test Plan:
Please refer to the screenshot
{F171445}

Reviewers: jmak, dshubin

Reviewed By: dshubin

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34291
…eUniverseReplicationRequestPB

Summary:
For xCluster DR we want to be able to failover very quickly. DeleteUniverseReplication cleans up the streams on the source, which will time out since the source is unavailable during failover.
When `skip_producer_stream_deletion` is set on `DeleteUniverseReplicationRequestPB` we will skip the cleanup process.

**Upgrade/Rollback safety:**
New field is optional and false by default, so safe for upgrade and rollbacks.

Fixes yugabyte#22050
Jira: DB-10965

Test Plan: XClusterTest.DeleteWithoutStreamCleanup

Reviewers: slingam, jhe, xCluster

Reviewed By: slingam

Subscribers: ybase, xCluster

Differential Revision: https://phorge.dev.yugabyte.com/D34286
Summary:
All connections that `postgres_fdw` establishes to foreign servers are kept open in the local session for re-use.
With option `use_remote_estimate true` specified during a foreign table's creation, when PG estimates the cost of the foreign table, it executes a SQL statement remotely using the existing open connection to the foreign server where the foreign table resides.
With changes made in commit 9a27aff, open PG connections need to refresh catalog cache because ANALYZE increments catalog version.
Thus, the plan in test `TestPgRegressContribPostgresFdw` changed based on cost because open connections use up-to-date statistics instead of stable statistics.
Jira: DB-10738

Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressContribPostgresFdw'

Reviewers: tverona, myang

Reviewed By: myang

Subscribers: jason, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34071
…d tservers do not have any tablets assigned to them

Summary: Verify that tablet count is zero on the blacklisted nodes after wait for data move. Also, made a fix to host port comparison for YBM dual NIC.

Test Plan:
1. Create a universe.
2. Run full move (2 times for previously blacklisted nodes not in the universe).
3. Verified the log messages.

```
2024-04-15T20:51:17.743Z  [debug] 3525d0d2-3c25-4cb7-b717-35f2f2dbeb6a UniverseTaskBase.java:2207 [TaskPool-EditUniverse(f6314ef5-7324-4671-a335-eea42bc4f758)-3] com.yugabyte.yw.commissioner.tasks.UniverseTaskBase Making url request to endpoint: http://10.9.120.245:7000/dump-entities
2024-04-15T20:51:18.891Z  [info]  AsyncYBClient.java:2758 [yb-nio-1] org.yb.client.AsyncYBClient Discovered tablet YB Master for table YB Master with partition ["", "")
2024-04-15T20:51:18.940Z  [debug] 3525d0d2-3c25-4cb7-b717-35f2f2dbeb6a UniverseTaskBase.java:2271 [TaskPool-EditUniverse(f6314ef5-7324-4671-a335-eea42bc4f758)-3] com.yugabyte.yw.commissioner.tasks.UniverseTaskBase Number of tablets on tserver yb-admin-nsingh-test-universe1-n1 is 0 tablets
2024-04-15T20:51:18.940Z  [debug] 3525d0d2-3c25-4cb7-b717-35f2f2dbeb6a UniverseTaskBase.java:2271 [TaskPool-EditUniverse(f6314ef5-7324-4671-a335-eea42bc4f758)-3] com.yugabyte.yw.commissioner.tasks.UniverseTaskBase Number of tablets on tserver yb-admin-nsingh-test-universe1-n2 is 0 tablets
2024-04-15T20:51:18.940Z  [debug] 3525d0d2-3c25-4cb7-b717-35f2f2dbeb6a UniverseTaskBase.java:2271 [TaskPool-EditUniverse(f6314ef5-7324-4671-a335-eea42bc4f758)-3] com.yugabyte.yw.commissioner.tasks.UniverseTaskBase Number of tablets on tserver yb-admin-nsingh-test-universe1-n3 is 0 tablets
2024-04-15T20:51:18.940Z  [debug] 3525d0d2-3c25-4cb7-b717-35f2f2dbeb6a UniverseTaskBase.java:2271 [TaskPool-EditUniverse(f6314ef5-7324-4671-a335-eea42bc4f758)-3] com.yugabyte.yw.commissioner.tasks.UniverseTaskBase Number of tablets on tserver yb-admin-nsingh-test-universe1-n4 is 0 tablets

```

Also tested with on-prem.
1. Create an onprem universe.
2. Run full move. The old nodes are DEAD but not blacklisted.
3. Run the yb-admin command to blacklist the old nodes and confirm from the master leader UI that the DEAD node is blacklisted.
4. Run full move again. It completed successfully.

Reviewers: cwang, sanketh, yshchetinin

Reviewed By: cwang, yshchetinin

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34140
…view Flag

Summary:
Convert enable_xcluster_api_v2 to a Preview flag
Jira: DB-10928

Test Plan: Jenkins

Reviewers: slingam, xCluster

Reviewed By: slingam

Subscribers: ybase

Differential Revision: https://phorge.dev.yugabyte.com/D34295
…y test with Connection Manager enabled

Summary:
In the test `org.yb.pgsql.TestYbPgStatActivity.testMemUsageOfQueryFromPgStatActivity`, we check the RSS memory consumed in a session before and after doing certain operations, with the help of a second connection.
With YSQL Connection Manager enabled, both connections would use the same physical connection, defeating the purpose of running this test. This patch skips the test whenever Connection Manager is enabled at the time of running it.
Jira: DB-10907

Test Plan:
Ensure the test below is skipped when executed:
```./yb_build.sh --enable-ysql-conn-mgr-test --java-test org.yb.pgsql.TestYbPgStatActivity#testMemUsageOfQueryFromPgStatActivity```

Reviewers: rbarigidad

Reviewed By: rbarigidad

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34096
vipul-yb and others added 12 commits April 30, 2024 04:34
…r/dr

Summary:
Added support to remove a dropped index table/table from xcluster/dr.
As part of this change, we avoid fetching index tables for already dropped tables, and customers will have to remove tables and index tables separately.
The newer DB version does not require any additional ignore-error flags, but since the old DB version requires them, we pass the ignore-error flag while removing dropped tableIDs from replication for both the old and newer versions.

Test Plan:
 - Create universe and setup xcluster/dr having tables and indexes
 - drop tables and indexes from source
 - remove tables and indexes from xcluster/dr
 - verify that the tables do not exist anymore in the xcluster_config available in the master UI and the YBA DB.
 - verified on the old YBDB version, where the DB errors out while altering replication, and on the newer DB version, which ignores the errors without any flag.

Additional test case:

 - Removed tables from 2 databases in a single edit-replication request: removed 1 table from DB_1 that is not dropped, and removed 2 tables from DB_2, where one is dropped and the other is not.

Reviewers: hzare, cwang, sanketh, #yba-api-review

Reviewed By: hzare, sanketh, #yba-api-review

Subscribers: jmak, hsunder, yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34407
Summary: Increase the limit for az names from 25 to 100 characters.

Test Plan:
Verified the migration works fine. Monitoring UTs and itests for any failures.
The ticket mentions the ap-southeast-1-sggov-sin-1a zone, but since this is a private region we won't have
access to it for testing.

Reviewers: #yba-api-review, sneelakantan

Reviewed By: #yba-api-review, sneelakantan

Subscribers: sanketh, yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34465
…er attributes as well

Summary:
Currently during our YBA ↔︎ YBDB LDAP sync, we assume that the user name we want to sync will be present on the DN. While that is true for most scenarios, some customers may want to sync a user present on a different attribute, for example `sAMAccountName`.

This diff performs the sync based on the attribute the user specified in the payload. If the specified `ldapUserfield` is not present in the `DN`, the user name will be retrieved from this attribute on the LDAP server. If the attribute is also not found, the user is simply skipped from the sync.

Test Plan:
Manual testing
  - Triggered the sync with the ldapUserfield and observed the sync where the user name is retrieved from the dn
  - Triggered the sync with the ldapUserfield [not present on the DN] and synced only the users that have this attribute set on the LDAP server

Reviewers: #yba-api-review!, svarshney

Reviewed By: svarshney

Subscribers: yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34518
…l Flag value

Summary:
In this diff, support has been added to allow changing the value of the publication refresh interval via the flag `cdcsdk_publication_list_refresh_interval_secs`. In order to protect LSN determinism, the values of publication refresh times will be persisted until a suitable acknowledgement is received. For this purpose a new key called `pub_refresh_times` has been added to the data map in cdc_state. It will contain comma-separated values of the publication refresh times which have been popped from the priority queue but not yet acknowledged.

In GetConsistentChanges, whenever the tablet queue for publication_refresh_records becomes empty, the new entry added to the tablet queue will also be added to the `pub_refresh_times` list and persisted in the state table. This ensures that before shipping any LSN with a commit time greater than the last_pub_refresh_time, we persist the next pub_refresh_time. When an acknowledgement reaches the virtual WAL, in `UpdateAndPersistLSN()`, the `pub_refresh_times` list is trimmed so that it contains only those values which are strictly greater than the acknowledged publication refresh time. The field `last_pub_refresh_time` in the state table slot entry holds the latest acknowledged publication refresh time.

This diff also changes the precision of the flag `cdcsdk_publication_list_refresh_interval` from microseconds to seconds, to improve the usability of the flag. For test purposes, the flags `TEST_cdcsdk_use_microseconds_refresh_interval` and `TEST_cdcsdk_publication_list_refresh_interval_micros` can be used to set the refresh interval in microseconds.
Jira: DB-10688
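
A minimal sketch of the trim step, using a plain vector in place of the comma-separated state-table value:

```
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

// Keep only the publication refresh times strictly greater than the
// acknowledged one (illustrative; the real list is the "pub_refresh_times"
// entry in the cdc_state data map).
void TrimPubRefreshTimes(std::vector<uint64_t>* pub_refresh_times,
                         uint64_t acked_refresh_time) {
  pub_refresh_times->erase(
      std::remove_if(pub_refresh_times->begin(), pub_refresh_times->end(),
                     [&](uint64_t t) { return t <= acked_refresh_time; }),
      pub_refresh_times->end());
}

int main() {
  std::vector<uint64_t> times = {100, 200, 300};
  TrimPubRefreshTimes(&times, 200);
  std::cout << times.size() << "\n";  // 1: only 300 remains
}
```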

Test Plan:
Jenkins: urgent
Jenkins: test regex: .*CDCSDKConsumptionConsistentChangesTest.*
./yb_build.sh --cxx-test integration-tests_cdcsdk_consumption_consistent_changes-test --gtest_filter CDCSDKConsumptionConsistentChangesTest.TestChangingPublicationRefreshInterval
./yb_build.sh --cxx-test integration-tests_cdcsdk_consumption_consistent_changes-test --gtest_filter CDCSDKConsumptionConsistentChangesTest.TestLSNDeterminismWithChangingPubRefreshInterval

Reviewers: skumar, asrinivasan, stiwary, siddharth.shah

Reviewed By: asrinivasan

Subscribers: ybase, ycdcxcluster

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34460
… operations

Summary:
This diff sets a randomly generated root_request_id for background operations
Jira: DB-10887

Test Plan: Jenkins

Reviewers: amitanand

Reviewed By: amitanand

Subscribers: hbhanawat, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D34627
Summary:
Adds validation for SSH keys in the case of onprem
providers.
Also changes the validation error format to be in sync with other
providers.

Given that these validations are not consumed by any client as of now (we default to false: https://github.com/yugabyte/yugabyte-db/blob/master/managed/src/main/resources/v1.routes#L61), it should be safe to change the format.

Test Plan:
Manually tried creating onprem provider. Verified that the errors
are thrown as expected.

```
{
    "success": false,
    "error": {
        "$.regions[0].zones[0].name": [
            "Cannot contain any special characters except '-' and '_'."
        ],
        "$.allAccessKeys[0].keyInfo.sshPrivateKeyContent": [
            "Not a valid RSA key!"
        ],
        "errorSource": [
            "providerValidation"
        ],
        "$.regions[0].zones[1].code": [
            "Cannot contain any special characters except '-' and '_'."
        ],
        "$.regions[0].zones[1].name": [
            "Cannot contain any special characters except '-' and '_'."
        ]
    },
```

Reviewers: asharma, amalyshev, #yba-api-review, sneelakantan

Reviewed By: asharma, #yba-api-review, sneelakantan

Subscribers: dkumar, yugaware

Differential Revision: https://phorge.dev.yugabyte.com/D34593
Summary:
The stateful service client creates its own messenger, reactor, and yb client. This uses up 4 threads, plus its own meta cache.
Instead we should reuse the server process's yb_client.

Fixes yugabyte#22102
Jira: DB-11035

Test Plan: Jenkins

Reviewers: pjain

Reviewed By: pjain

Subscribers: ybase

Differential Revision: https://phorge.dev.yugabyte.com/D34360
…-examples application (yugabyte#21900)

* Upgrade the gorm docs to use latest go version and gorm v2

* Mentioned the use of smart drivers

* Harsh daryani896 patch 1 (#2)

* Update pgx driver version from v4 to v5 in docs.

* review comments and copied to preview

---------

Co-authored-by: Harsh Daryani <82017686+HarshDaryani896@users.noreply.github.com>
Co-authored-by: aishwarya24 <ashchakravarthy@gmail.com>
* added r2dbc smart driver

* added supported versions

* added multiple hosts

* fixed typo

* added table defaults

* added table defaults

* edited the table parameters and URL

* more updates from review

* missed edit

* changed name to maintain consistency
Summary:
This diff enables DDL atomicity feature by default.
(1) changing the default value of several gflags from false to true.
--ysql_yb_ddl_rollback_enabled
--report_ysql_ddl_txn_status_to_master
--ysql_ddl_transaction_wait_for_ddl_verification

(2) code cleanup related to (1); for example, some unit tests needed to
explicitly enable one or more of these 3 gflags. Now that they are turned on by
default, that code is removed.

(3) other unit test updates related to (1). For example, in pg_packed_row-test.cc,
the test output has changed from `PACKED_ROW[2]` to `PACKED_ROW[3]`. The
number in the bracket represents the table schema version. With DDL atomicity, a DDL
such as `ALTER TABLE test DROP COLUMN v2` causes the schema version of table
test to bump by 2 after the DDL commits. Without DDL atomicity, the schema
version of table test used to only bump up by 1 after the DDL commits.
Jira: DB-11028

Test Plan: Jenkins run

Reviewers: hsunder

Reviewed By: hsunder

Subscribers: ycdcxcluster, hsunder

Differential Revision: https://phorge.dev.yugabyte.com/D30471
…ber of CPU used

Summary: Fix an issue where the number of used/available cores is not calculated correctly in the Sankey diagram for CPU usage.

Test Plan: no test plan

Reviewers: nikhil

Reviewed By: nikhil

Subscribers: yugabyted-dev, djiang

Differential Revision: https://phorge.dev.yugabyte.com/D34575
@ddhodge ddhodge self-assigned this Apr 30, 2024
@ddhodge ddhodge added the area/documentation Documentation needed label Apr 30, 2024
@ddhodge ddhodge added this to In progress in Documentation via automation Apr 30, 2024

netlify bot commented Apr 30, 2024

Deploy Preview for infallible-bardeen-164bc9 ready!

Name Link
🔨 Latest commit d81006c
🔍 Latest deploy log https://app.netlify.com/sites/infallible-bardeen-164bc9/deploys/663472edb105d000089fc1e3
😎 Deploy Preview https://deploy-preview-22209--infallible-bardeen-164bc9.netlify.app/preview/yugabyte-platform/anywhere-automation/

@ddhodge ddhodge changed the title [doc][yba] YBA CLI landing page [doc][yba] 2024.1 YBA CLI landing page Apr 30, 2024
Contributor

@subramanian-neelakantan subramanian-neelakantan left a comment


Looks great Dwight! A few minor suggestions and comments to remove unsupported material.

Collaborator

@aishwarya24 aishwarya24 left a comment


LGTM. Thanks!

@@ -19,6 +19,7 @@ Use the following automation tools to manage your YugabyteDB Anywhere installati
| :--------- | :---------- |
| [REST API](anywhere-api/) | Deploy and manage database universes using a REST API. |
| [Terraform provider](anywhere-terraform/) | Provider for automating YugabyteDB Anywhere resources that are accessible via the API. |
| [CLI](anywhere-cli/) | Manage YugabyteDB Anywhere resources from the command line. {{<badge/tp>}} |
Collaborator


Add a tile below for automation.

Contributor

@subramanian-neelakantan subramanian-neelakantan left a comment


Looks good so far. Thanks.

Labels
area/documentation Documentation needed
Projects
Documentation
In progress