Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR] Amazon Aurora PostgreSQL support (required changes provided) #1791

Open
gaplo917 opened this issue Apr 25, 2024 · 6 comments
Open

[FR] Amazon Aurora PostgreSQL support (required changes provided) #1791

gaplo917 opened this issue Apr 25, 2024 · 6 comments

Comments

@gaplo917
Copy link

馃専 Feature Request

Support Amazon Aurora PostgreSQL (Standards and Serverless V2) on AWS.

馃摑 Description

Amazon Aurora PostgreSQL is a fully managed, PostgreSQL鈥揷ompatible, and ACID鈥揷ompliant relational database engine that combines the speed, reliability, and manageability of Amazon Aurora with the simplicity and cost-effectiveness of open-source databases. Aurora PostgreSQL is a drop-in replacement for PostgreSQL and makes it simple and cost-effective to set up, operate, and scale your new and existing PostgreSQL deployments, thus freeing you to focus on your business and applications. --- Working with Amazon Aurora PostgreSQL

Aurora PostgreSQL usually is a drop-in replacement for PostgreSQL but unfortunately not in this case. When I tried to initial the stack, it fails
aurora error

After deep dive into the source code, I found the following codes that check replication lag
https://github.com/matter-labs/zksync-era/blob/main/core/lib/dal/src/system_dal.rs#L20-L39

Aurora PostgreSQL use underlying architecture that does not use WAL to do replication. It doesn't have the pg_last_wal_receive_lsn, pg_last_wal_replay_lsn , pg_last_xact_replay_timestamp pg functions. Thus, it throws SQLSTATE 0A000 feature_not_supported exception.

I think there are two potential options to support Aurora PostgreSQL and I could provide the PR once the community come into agreement.

Option 1.

Add a new feature flag and using conditional statement to switch the replication lag query

SELECT
        highest_lsn_rcvd = current_read_lsn as synced,
        ROUND(replica_lag_in_msec / 1000)::int   as lag
    FROM aurora_replica_status()
    ORDER BY replica_lag_in_msec DESC NULLS LAST
    LIMIT 1;

Option 2.

Automatically handle fallback case using pg function. Create a sql file to create the function automatcially before checking the replication lag,

CREATE OR REPLACE FUNCTION get_replication_lag_sec()
    RETURNS TABLE (synced BOOLEAN,lag INTEGER) AS $$
DECLARE
BEGIN
    RETURN QUERY SELECT
        pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() as synced,
        EXTRACT(SECONDS FROM now() - pg_last_xact_replay_timestamp())::int AS lag;
EXCEPTION WHEN SQLSTATE '0A000' THEN
    -- SQLSTATE 0A000 feature_not_supported
    -- SQLERRM: Function pg_last_xlog_receive_location() is currently not supported for Aurora
    RETURN QUERY SELECT
        highest_lsn_rcvd = current_read_lsn as synced,
        ROUND(replica_lag_in_msec / 1000)::int   as lag
    FROM aurora_replica_status()
    ORDER BY replica_lag_in_msec DESC NULLS LAST
    LIMIT 1;
END;
$$ LANGUAGE plpgsql;

modify the codes https://github.com/matter-labs/zksync-era/blob/main/core/lib/dal/src/system_dal.rs#L28-L34 to

select * from get_replication_lag_sec();

馃 Rationale

Aurora PostgreSQL provides serverless infrastructure that able to near-instant autoscale vertically on compute, memory, and IOPS.

  • Simplify the adoption of zksync stack on AWS.
  • Prevent over-provisioning on the database and theoretically more environmental friendly
@EmilLuta
Copy link
Contributor

Hey @gaplo917. Thanks for the issue submission. At the moment, we mostly run in GCP (we're agnostic on most areas, but there are a few ties to GCP). That said, we want to move the codebase in a modular place where you could have your own implementation of of each component (in this case, you could write your own snapshotter from scratch and just inject it at compile time).

For the moment, it's unlikely that we'll support other infrastructure, but we will open the capabilities for anyone to build on top in the near future.

@hiteshdamke
Copy link

thankss

@gaplo917
Copy link
Author

gaplo917 commented May 6, 2024

@EmilLuta Thank you for the prompt reply. The modular structure sounds promising and I can imagine that it's a huge chunk of workload that will take time.

Would you mind if I leave this FR open, so if others encounter the same issue, they are able to find it in the issues list?

@EmilLuta
Copy link
Contributor

EmilLuta commented May 6, 2024

I believe it's discoverable even as closed, but let's keep it open as it's not been addressed. Thanks @gaplo917!

@trader2021
Copy link

good nice work

@tradeupo
Copy link

tradeupo commented May 9, 2024

lfg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants