
Incorrect label saved when writing metrics with Prometheus remote write after flushing memtables #3881

Closed
0neSe7en opened this issue May 8, 2024 · 5 comments · Fixed by #3927

0neSe7en commented May 8, 2024

What type of bug is this?

Incorrect result

What subsystems are affected?

Standalone mode, Storage Engine, Datanode

Minimal reproduce step

Start a clean GreptimeDB instance.
Write these four metrics via Prometheus remote write:

{
  "records": {
    "schema": {
      "column_schemas": [
        {
          "name": "greptime_timestamp",
          "data_type": "TimestampMillisecond"
        },
        {
          "name": "greptime_value",
          "data_type": "Float64"
        },
        {
          "name": "test_label",
          "data_type": "String"
        }
      ]
    },
    "rows": [
      [
        1712717731000,
        0.0,
        "10min"
      ],
      [
        1714623931000,
        0.0,
        "10min"
      ],
      [
        1712070000000,
        0.0,
        "daily"
      ],
      [
        1712761200000,
        0.0,
        "daily"
      ]
    ]
  }
}

Wait until the first memtable flush has finished.
Write a new metric via Prometheus remote write:

timestamp=1714643131
value=0.1
labels={"test_label": "10min"}

Then select * from test_metric order by greptime_timestamp desc

What did you expect to see?

greptime_timestamp  greptime_value  test_label
1714643131000       0.1             10min   <----
1714623931000       0               10min
1712761200000       0               daily
1712717731000       0               10min
1712070000000       0               daily

What did you see instead?

greptime_timestamp  greptime_value  test_label
1714643131000       0.1             daily   <----
1714623931000       0               10min
1712761200000       0               daily
1712717731000       0               10min
1712070000000       0               daily

It then changed back to the expected value 10min after restarting GreptimeDB.

What operating system did you use?

macOS 14.2.1 (23C71)

What version of GreptimeDB did you use?

commit: 9038e1b version: 0.7.2

Relevant log output and stack trace

2024-05-08T07:48:01.981797Z  INFO cmd: short_version: 0.7.2, full_version: greptimedb-HEAD-9038e1b
2024-05-08T07:48:01.982335Z  INFO cmd: command line arguments
2024-05-08T07:48:01.982339Z  INFO cmd: argument: ../greptime
2024-05-08T07:48:01.982342Z  INFO cmd: argument: standalone
2024-05-08T07:48:01.982344Z  INFO cmd: argument: start
2024-05-08T07:48:01.982346Z  INFO cmd: argument: -c
2024-05-08T07:48:01.982348Z  INFO cmd: argument: ./config.toml
2024-05-08T07:48:01.982926Z  INFO cmd::standalone: Standalone start command: StartCommand {
    http_addr: None,
    rpc_addr: None,
    mysql_addr: None,
    postgres_addr: None,
    opentsdb_addr: None,
    influxdb_enable: false,
    config_file: Some(
        "./config.toml",
    ),
    tls_mode: None,
    tls_cert_path: None,
    tls_key_path: None,
    user_provider: None,
    env_prefix: "GREPTIMEDB_STANDALONE",
    data_home: None,
}
2024-05-08T07:48:01.983465Z  INFO cmd::standalone: Building standalone instance with MixOptions {
    data_home: "./greptimedb/",
    procedure: ProcedureConfig {
        max_retry_times: 12,
        retry_delay: 500ms,
        max_metadata_value_size: None,
    },
    metadata_store: KvBackendConfig {
        file_size: 256.0MiB,
        purge_threshold: 4.0GiB,
    },
    frontend: FrontendOptions {
        mode: Standalone,
        node_id: None,
        default_timezone: None,
        heartbeat: HeartbeatOptions {
            interval: 18s,
            retry_interval: 3s,
        },
        http: HttpOptions {
            addr: "127.0.0.1:4000",
            timeout: 30s,
            disable_dashboard: false,
            body_limit: 64.0MiB,
        },
        grpc: GrpcOptions {
            addr: "127.0.0.1:4001",
            runtime_size: 8,
            max_recv_message_size: 512.0MiB,
            max_send_message_size: 512.0MiB,
        },
        mysql: MysqlOptions {
            enable: true,
            addr: "127.0.0.1:4002",
            runtime_size: 2,
            tls: TlsOption {
                mode: Disable,
                cert_path: "",
                key_path: "",
                watch: false,
            },
            reject_no_database: None,
        },
        postgres: PostgresOptions {
            enable: true,
            addr: "127.0.0.1:4003",
            runtime_size: 2,
            tls: TlsOption {
                mode: Disable,
                cert_path: "",
                key_path: "",
                watch: false,
            },
        },
        opentsdb: OpentsdbOptions {
            enable: true,
            addr: "127.0.0.1:4242",
            runtime_size: 2,
        },
        influxdb: InfluxdbOptions {
            enable: true,
        },
        prom_store: PromStoreOptions {
            enable: true,
            with_metric_engine: true,
        },
        otlp: OtlpOptions {
            enable: true,
        },
        meta_client: None,
        logging: LoggingOptions {
            dir: "/tmp/greptimedb/logs",
            level: None,
            enable_otlp_tracing: false,
            otlp_endpoint: None,
            tracing_sample_ratio: None,
            append_stdout: true,
        },
        datanode: DatanodeOptions {
            client: DatanodeClientOptions {
                timeout: 10s,
                connect_timeout: 1s,
                tcp_nodelay: true,
            },
        },
        user_provider: None,
        export_metrics: ExportMetricsOption {
            enable: false,
            write_interval: 30s,
            self_import: None,
            remote_write: None,
        },
    },
    datanode: DatanodeOptions {
        mode: Standalone,
        node_id: Some(
            0,
        ),
        require_lease_before_startup: false,
        init_regions_in_background: false,
        rpc_addr: "127.0.0.1:4001",
        rpc_hostname: None,
        rpc_runtime_size: 8,
        rpc_max_recv_message_size: 512.0MiB,
        rpc_max_send_message_size: 512.0MiB,
        heartbeat: HeartbeatOptions {
            interval: 3s,
            retry_interval: 3s,
        },
        http: HttpOptions {
            addr: "127.0.0.1:4000",
            timeout: 30s,
            disable_dashboard: false,
            body_limit: 64.0MiB,
        },
        meta_client: None,
        wal: RaftEngine(
            RaftEngineConfig {
                dir: None,
                file_size: 256.0MiB,
                purge_threshold: 4.0GiB,
                purge_interval: 60s,
                read_batch_size: 128,
                sync_write: false,
                enable_log_recycle: true,
                prefill_log_files: false,
                sync_period: None,
            },
        ),
        storage: StorageConfig {
            global_ttl: None,
            data_home: "./greptimedb/",
            store: File(
                FileConfig,
            ),
            providers: [],
        },
        region_engine: [
            Mito(
                MitoConfig {
                    num_workers: 8,
                    worker_channel_size: 128,
                    worker_request_batch_size: 64,
                    manifest_checkpoint_distance: 10,
                    compress_manifest: false,
                    max_background_jobs: 4,
                    auto_flush_interval: 180s,
                    global_write_buffer_size: 1.0GiB,
                    global_write_buffer_reject_size: 2.0GiB,
                    sst_meta_cache_size: 128.0MiB,
                    vector_cache_size: 512.0MiB,
                    page_cache_size: 512.0MiB,
                    enable_experimental_write_cache: false,
                    experimental_write_cache_path: "",
                    experimental_write_cache_size: 512.0MiB,
                    sst_write_buffer_size: 8.0MiB,
                    scan_parallelism: 0,
                    parallel_scan_channel_size: 32,
                    allow_stale_entries: false,
                    inverted_index: InvertedIndexConfig {
                        create_on_flush: Auto,
                        create_on_compaction: Auto,
                        apply_on_query: Auto,
                        write_buffer_size: 8.0MiB,
                        mem_threshold_on_create: Some(
                            64.0MiB,
                        ),
                        intermediate_path: "",
                    },
                    memtable: TimeSeries,
                },
            ),
        ],
        logging: LoggingOptions {
            dir: "/tmp/greptimedb/logs",
            level: None,
            enable_otlp_tracing: false,
            otlp_endpoint: None,
            tracing_sample_ratio: None,
            append_stdout: true,
        },
        enable_telemetry: true,
        export_metrics: ExportMetricsOption {
            enable: false,
            write_interval: 30s,
            self_import: None,
            remote_write: None,
        },
    },
    logging: LoggingOptions {
        dir: "/tmp/greptimedb/logs",
        level: None,
        enable_otlp_tracing: false,
        otlp_endpoint: None,
        tracing_sample_ratio: None,
        append_stdout: true,
    },
    wal_meta: RaftEngine,
}
2024-05-08T07:48:01.987162Z  INFO log: Recovering raft logs takes 1.356833ms
2024-05-08T07:48:01.987654Z  INFO common_procedure::store: The procedure state store path is: procedure/
2024-05-08T07:48:02.026542Z  INFO datanode::store::fs: The file storage home is: ./greptimedb/
2024-05-08T07:48:02.031361Z  INFO datanode::datanode: Creating raft-engine logstore with config: RaftEngineConfig { dir: None, file_size: 256.0MiB, purge_threshold: 4.0GiB, purge_interval: 60s, read_batch_size: 128, sync_write: false, enable_log_recycle: true, prefill_log_files: false, sync_period: None } and storage path: ./greptimedb/wal/
2024-05-08T07:48:02.031614Z  INFO log: Recovering raft logs takes 228.292µs
2024-05-08T07:48:02.031666Z  INFO common_runtime::global: Creating runtime with runtime_name: global-read, thread_name: read-worker, work_threads: 8.
2024-05-08T07:48:02.032038Z  INFO common_runtime::global: Creating runtime with runtime_name: global-write, thread_name: write-worker, work_threads: 8.
2024-05-08T07:48:02.032460Z  INFO common_runtime::global: Creating runtime with runtime_name: global-bg, thread_name: bg-worker, work_threads: 8.
2024-05-08T07:48:02.042700Z  INFO mito2::worker: Start region worker thread 0, init_check_delay: 7s
2024-05-08T07:48:02.042672Z  INFO mito2::worker: Start region worker thread 5, init_check_delay: 85s
2024-05-08T07:48:02.042674Z  INFO mito2::worker: Start region worker thread 6, init_check_delay: 49s
2024-05-08T07:48:02.042704Z  INFO mito2::worker: Start region worker thread 1, init_check_delay: 149s
2024-05-08T07:48:02.042742Z  INFO mito2::worker: Start region worker thread 2, init_check_delay: 104s
2024-05-08T07:48:02.042544Z  INFO datanode::region_server: Region Engine mito is registered
2024-05-08T07:48:02.042671Z  INFO mito2::worker: Start region worker thread 4, init_check_delay: 170s
2024-05-08T07:48:02.042720Z  INFO mito2::worker: Start region worker thread 7, init_check_delay: 177s
2024-05-08T07:48:02.042778Z  INFO mito2::worker: Start region worker thread 3, init_check_delay: 134s
2024-05-08T07:48:02.043248Z  INFO datanode::region_server: Region Engine metric is registered
2024-05-08T07:48:02.044935Z  INFO datanode::datanode: going to open 0 region(s)
2024-05-08T07:48:02.045159Z  INFO datanode::datanode: all regions are opened
2024-05-08T07:48:02.373499Z  INFO cmd: Starting app: greptime-standalone
2024-05-08T07:48:02.373695Z  INFO common_greptimedb_telemetry: Attention: GreptimeDB now collects anonymous usage data to help improve its roadmap and prioritize features.
2024-05-08T07:48:02.373704Z  INFO common_greptimedb_telemetry: To learn more about this anonymous program and how to deactivate it if you don't want to participate, please visit the following URL:
2024-05-08T07:48:02.373706Z  INFO common_greptimedb_telemetry: https://docs.greptime.com/reference/telemetry
2024-05-08T07:48:02.374571Z  INFO common_greptimedb_telemetry: reporting greptimedb version: StatisticData { os: "macos", version: "0.7.2", arch: "x86_64", mode: Standalone, git_commit: "9038e1b7694043e71eb13e8b6ebfdc2d21780b21", nodes: Some(1), uuid: "4ade4765-72db-43a3-96eb-90cfb903bd18" }
2024-05-08T07:48:02.374772Z  INFO common_procedure::local: LocalManager is start.
2024-05-08T07:48:02.374791Z  INFO common_procedure::local: LocalManager start to recover
2024-05-08T07:48:02.376027Z  INFO common_procedure::local: LocalManager finish recovery, cost: 1ms
2024-05-08T07:48:02.378007Z  INFO servers::server: MySQL server started at 127.0.0.1:4002
2024-05-08T07:48:02.378180Z  INFO servers::server: Service MYSQL_SERVER is started at 127.0.0.1:4002
2024-05-08T07:48:02.378679Z  INFO servers::server: Postgres server started at 127.0.0.1:4003
2024-05-08T07:48:02.378696Z  INFO servers::server: Service POSTGRES_SERVER is started at 127.0.0.1:4003
2024-05-08T07:48:02.378969Z  INFO servers::server: OpenTSDB server started at 127.0.0.1:4242
2024-05-08T07:48:02.379181Z  INFO servers::server: Service OPENTSDB_SERVER is started at 127.0.0.1:4242
2024-05-08T07:48:02.379635Z  INFO servers::grpc: gRPC server is bound to 127.0.0.1:4001
2024-05-08T07:48:02.383516Z  INFO servers::server: Service GRPC_SERVER is started at 127.0.0.1:4001
2024-05-08T07:48:02.385426Z  INFO servers::http: Enable dashboard service at '/dashboard'
2024-05-08T07:48:02.385616Z  INFO servers::http: HTTP server is bound to 127.0.0.1:4000
2024-05-08T07:48:02.385633Z  INFO servers::server: Service HTTP_SERVER is started at 127.0.0.1:4000
2024-05-08T07:48:17.864660Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateDatabase-3bbaee79-8a70-4695-a63f-7ab1d32ecedc starts
2024-05-08T07:48:17.867577Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateDatabase-3bbaee79-8a70-4695-a63f-7ab1d32ecedc done
2024-05-08T07:48:17.869403Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateDatabase-3bbaee79-8a70-4695-a63f-7ab1d32ecedc exits
2024-05-08T07:48:17.869547Z  INFO sql{protocol="http" request_type="sql"}:do_query:execute_stmt:create_database:DdlManager::submit_ddl_task: common_meta::ddl_manager: Database greptime.test_incorrect is created via procedure_id ProcedureId(3bbaee79-8a70-4695-a63f-7ab1d32ecedc)
2024-05-08T07:48:27.408040Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts: operator::insert: Physical metric table `greptime.test_incorrect.greptime_physical_table` does not exist, try creating table
2024-05-08T07:48:27.416822Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateTable-395a75bd-ab9b-4e69-a8a8-bd16d6bfa799 starts
2024-05-08T07:48:27.433125Z  INFO mito2::manifest::manager: Checkpoint not found in data/greptime/test_incorrect/1024/1024_0000000000/metadata/manifest/, build manifest from scratch
2024-05-08T07:48:27.436132Z  INFO mito2::manifest::manager: Creating region manifest in data/greptime/test_incorrect/1024/1024_0000000000/metadata/manifest/ with metadata RegionMetadata { column_metadatas: [[ts TimestampMillisecond not null default=Value(Timestamp(Timestamp { value: 0, unit: Millisecond })) Timestamp 0], [k String not null Tag 1], [v String null Field 2]], time_index: 0, primary_key: [1], region_id: 4398063288320(1024, 16777216), schema_version: 0 }
2024-05-08T07:48:27.444637Z  INFO mito2::worker::handle_create: A new region created, region: RegionMetadata { column_metadatas: [[ts TimestampMillisecond not null default=Value(Timestamp(Timestamp { value: 0, unit: Millisecond })) Timestamp 0], [k String not null Tag 1], [v String null Field 2]], time_index: 0, primary_key: [1], region_id: 4398063288320(1024, 16777216), schema_version: 0 }
2024-05-08T07:48:27.445931Z  INFO mito2::manifest::manager: Checkpoint not found in data/greptime/test_incorrect/1024/1024_0000000000/data/manifest/, build manifest from scratch
2024-05-08T07:48:27.445983Z  INFO mito2::manifest::manager: Creating region manifest in data/greptime/test_incorrect/1024/1024_0000000000/data/manifest/ with metadata RegionMetadata { column_metadatas: [[greptime_timestamp TimestampMillisecond not null Timestamp 0], [greptime_value Float64 null Field 1], [__table_id UInt32 not null Tag 2147483652], [__tsid UInt64 not null Tag 2147483651]], time_index: 0, primary_key: [2147483652, 2147483651], region_id: 4398046511104(1024, 0), schema_version: 0 }
2024-05-08T07:48:27.451466Z  INFO mito2::worker::handle_create: A new region created, region: RegionMetadata { column_metadatas: [[greptime_timestamp TimestampMillisecond not null Timestamp 0], [greptime_value Float64 null Field 1], [__table_id UInt32 not null Tag 2147483652], [__tsid UInt64 not null Tag 2147483651]], time_index: 0, primary_key: [2147483652, 2147483651], region_id: 4398046511104(1024, 0), schema_version: 0 }
2024-05-08T07:48:27.451575Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4398046511104(1024, 0)"}:handle_request{request_type="Create"}: metric_engine::engine::create: Created physical metric region 4398046511104(1024, 0)
2024-05-08T07:48:27.451990Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4398046511104(1024, 0)"}:handle_request{request_type="Create"}: datanode::region_server: Region 4398046511104(1024, 0) is registered to engine metric
2024-05-08T07:48:27.453265Z  INFO LocalManager::submit_root_procedure: common_meta::ddl::create_table: Created table metadata for table 1024
2024-05-08T07:48:27.453283Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateTable-395a75bd-ab9b-4e69-a8a8-bd16d6bfa799 done
2024-05-08T07:48:27.453345Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateTable-395a75bd-ab9b-4e69-a8a8-bd16d6bfa799 exits
2024-05-08T07:48:27.453392Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts:create_table_inner:DdlManager::submit_ddl_task: common_meta::ddl_manager: Table: 1024 is created via procedure_id ProcedureId(395a75bd-ab9b-4e69-a8a8-bd16d6bfa799)
2024-05-08T07:48:27.453408Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts:create_table_inner: operator::statement::ddl: Successfully created table 'greptime.test_incorrect.greptime_physical_table' with table id 1024
2024-05-08T07:48:27.453622Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts: operator::insert: Successfully created table greptime.test_incorrect.greptime_physical_table
2024-05-08T07:48:27.455206Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateLogicalTables-debaa241-f99f-483b-8685-a3df09a25d2f starts
2024-05-08T07:48:27.462557Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4402341478400(1025, 0)"}:handle_request{request_type="Create"}: metric_engine::engine::create: Found new columns [[test_label String null Tag 2]] to add to physical region 4398046511104(1024, 0)
2024-05-08T07:48:27.462805Z  INFO mito2::worker::handle_alter: Try to alter region: 4398046511104(1024, 0), request: RegionAlterRequest { schema_version: 0, kind: AddColumns { columns: [AddColumn { column_metadata: [test_label String null Tag 2], location: None }] } }
2024-05-08T07:48:27.469639Z  INFO mito2::worker::handle_alter: Schema of region 4398046511104(1024, 0) is altered from 0 to 1
2024-05-08T07:48:27.473607Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4402341478400(1025, 0)"}:handle_request{request_type="Create"}: metric_engine::engine::create: Create region 4402341478400(1025, 0) leads to adding columns [[test_label String null Tag 2]] to physical region 4398046511104(1024, 0)
2024-05-08T07:48:27.476734Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4402341478400(1025, 0)"}:handle_request{request_type="Create"}: metric_engine::engine::create: Created new logical region 4402341478400(1025, 0) on physical region 4398046511104(1024, 0)
2024-05-08T07:48:27.476791Z  INFO LocalManager::submit_root_procedure:RegionInvoker::handle_region_request:RegionServer::handle_region_request{region_id="4402341478400(1025, 0)"}:handle_request{request_type="Create"}: datanode::region_server: Region 4402341478400(1025, 0) is registered to engine metric
2024-05-08T07:48:27.479679Z  INFO LocalManager::submit_root_procedure: common_meta::ddl::create_logical_tables::update_metadata: Created 1 tables [1025] metadata for physical table 1024
2024-05-08T07:48:27.479704Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Procedure metasrv-procedure::CreateLogicalTables-debaa241-f99f-483b-8685-a3df09a25d2f done
2024-05-08T07:48:27.479760Z  INFO LocalManager::submit_root_procedure: common_procedure::local::runner: Runner metasrv-procedure::CreateLogicalTables-debaa241-f99f-483b-8685-a3df09a25d2f exits
2024-05-08T07:48:27.479933Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts:create_logical_tables:DdlManager::submit_ddl_task: common_meta::ddl_manager: 1 logical tables on physical table: 1024 is created via procedure_id ProcedureId(debaa241-f99f-483b-8685-a3df09a25d2f)
2024-05-08T07:48:27.480133Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts:create_logical_tables: operator::statement::ddl: Successfully created logical tables: [1025]
2024-05-08T07:48:27.480304Z  INFO remote_write{protocol="prometheus" request_type="remote_write"}:handle_metric_row_inserts: operator::insert: Successfully created logical tables
2024-05-08T07:49:02.040635Z  INFO log_store::raft_engine::log_store: Successfully purged logstore files, namespaces need compaction: []
2024-05-08T07:50:02.043500Z  INFO log_store::raft_engine::log_store: Successfully purged logstore files, namespaces need compaction: []
2024-05-08T07:51:02.045894Z  INFO log_store::raft_engine::log_store: Successfully purged logstore files, namespaces need compaction: []
2024-05-08T07:51:27.537349Z  INFO mito2::flush: Successfully flush memtables, region: 4398063288320(1024, 16777216), reason: Periodically, files: [FileId(02769029-3c9a-4bcf-be3c-c1009ecaae28)], cost: 0.044660541s
2024-05-08T07:51:27.537664Z  INFO mito2::region: Applying RegionEdit { files_to_add: [FileMeta { region_id: 4398063288320(1024, 16777216), file_id: FileId(02769029-3c9a-4bcf-be3c-c1009ecaae28), time_range: (Timestamp { value: 0, unit: Millisecond }, Timestamp { value: 0, unit: Millisecond }), level: 0, file_size: 4010, available_indexes: [InvertedIndex], index_file_size: 418 }], files_to_remove: [], compaction_time_window: None, flushed_entry_id: Some(4), flushed_sequence: Some(4) } to region 4398063288320(1024, 16777216)
2024-05-08T07:51:27.541253Z  INFO mito2::flush: Successfully flush memtables, region: 4398046511104(1024, 0), reason: Periodically, files: [FileId(4ba01c4d-b673-4f9a-9ce5-6d70a28222c2)], cost: 0.029436084s
2024-05-08T07:51:27.545809Z  INFO mito2::worker::handle_flush: Region 4398063288320(1024, 16777216) flush finished, tries to bump wal to 4
2024-05-08T07:51:27.546029Z  INFO log_store::raft_engine::log_store: Namespace 4398063288320 obsoleted 4 entries, compacted index: 4, span: (None, None)
2024-05-08T07:51:27.546301Z  INFO mito2::compaction::twcs: Compaction window for region 4398063288320(1024, 16777216) is not present, inferring from files: 3600
2024-05-08T07:51:27.546615Z  INFO mito2::region: Applying RegionEdit { files_to_add: [FileMeta { region_id: 4398046511104(1024, 0), file_id: FileId(4ba01c4d-b673-4f9a-9ce5-6d70a28222c2), time_range: (Timestamp { value: 1712070000000, unit: Millisecond }, Timestamp { value: 1714623931000, unit: Millisecond }), level: 0, file_size: 3427, available_indexes: [InvertedIndex], index_file_size: 360 }], files_to_remove: [], compaction_time_window: None, flushed_entry_id: Some(1), flushed_sequence: Some(4) } to region 4398046511104(1024, 0)
2024-05-08T07:51:27.551884Z  INFO mito2::worker::handle_flush: Region 4398046511104(1024, 0) flush finished, tries to bump wal to 1
2024-05-08T07:51:27.551919Z  INFO log_store::raft_engine::log_store: Namespace 4398046511104 obsoleted 1 entries, compacted index: 1, span: (None, None)
2024-05-08T07:51:27.551927Z  INFO mito2::compaction::twcs: Compaction window for region 4398046511104(1024, 0) is not present, inferring from files: 31536000
2024-05-08T07:52:02.047711Z  INFO log_store::raft_engine::log_store: Successfully purged logstore files, namespaces need compaction: []

v0y4g3r (Contributor) commented May 8, 2024

Thanks for the feedback. I tried to reproduce this issue on the same build but failed.

  1. Start GreptimeDB.
  2. Insert the first 4 rows using:
cat <<EOF | base64 --decode | curl -i -XPOST --data-binary @- -H "Content-Encoding: compress" localhost:4000/v1/prometheus/write
yAHICjAKEAoIX19uYW1lX18SBHRlc3QKEwoKdGVzdF9sYWJlbBIFMTBtaW4SBxC4qbqv7DEKqjIAEPjEs7zzkjIAMGRhaWx5EgcQgPvL+um2MgAQu5fE7DE=
EOF
  3. Wait for the flush to finish.
  4. Insert the last row using:
cat <<EOF | base64 --decode | curl -i -XPOST --data-binary @- -H "Content-Encoding: compress" localhost:4000/v1/prometheus/write
O+gKOQoQCghfX25hbWVfXxIEdGVzdAoTCgp0ZXN0X2xhYmVsEgUxMG1pbhIQCZqZmZmZmbk/EPi0x8XzMQ==
EOF
  5. Query the result:
mysql> select *, cast(greptime_timestamp as bigint) as ts from test order by greptime_timestamp desc;
+---------------------+----------------+------------+---------------+
| greptime_timestamp  | greptime_value | test_label | ts            |
+---------------------+----------------+------------+---------------+
| 2024-05-02 09:45:31 |            0.1 | 10min      | 1714643131000 |
| 2024-05-02 04:25:31 |              0 | 10min      | 1714623931000 |
| 2024-04-10 15:00:00 |              0 | daily      | 1712761200000 |
| 2024-04-10 02:55:31 |              0 | 10min      | 1712717731000 |
| 2024-04-02 15:00:00 |              0 | daily      | 1712070000000 |
+---------------------+----------------+------------+---------------+
5 rows in set (0.01 sec)

0neSe7en (Author) commented May 8, 2024

Oh, OK. I sorted the timestamps for the first four metrics before writing them.

You can try with:

cat <<EOF | base64 --decode | curl -i -XPOST --data-binary @- -H "Content-Encoding: compress" localhost:4000/v1/prometheus/write
yAHQCjAKEAoIX19uYW1lX18SBHRlc3QKEwoKdGVzdF9sYWJlbBIFZGFpbHkSBxCA+8v66TEKMAqCMgAwMTBtaW4SBxC4qbqv7JIyAABkEWQIu5fEljIANDEwbWluEgcQ+MSzvPMx
EOF
cat <<EOF | base64 --decode | curl -i -XPOST --data-binary @- -H "Content-Encoding: compress" localhost:4000/v1/prometheus/write
O+gKOQoQCghfX25hbWVfXxIEdGVzdAoTCgp0ZXN0X2xhYmVsEgUxMG1pbhIQCZqZmZmZmbk/EPi0x8XzMQ==
EOF

and got:

greptime_timestamp  greptime_value  test_label
1714643131000       0.1             daily
1714623931000       0               10min
1712761200000       0               daily
1712717731000       0               10min
1712070000000       0               daily

The config file that I used:

[wal]
file_size = "256MB"
purge_threshold = "4GB"
purge_interval = "1m"
read_batch_size = 128
sync_write = false

# Mito engine options
[[region_engine]]
[region_engine.mito]
num_workers = 8
worker_channel_size = 128
worker_request_batch_size = 64
manifest_checkpoint_distance = 10
compress_manifest = false
max_background_jobs = 4
auto_flush_interval = "2m"
global_write_buffer_size = "1GB"
global_write_buffer_reject_size = "2GB"
sst_meta_cache_size = "128MB"
vector_cache_size = "512MB"
page_cache_size = "512MB"
sst_write_buffer_size = "8MB"
scan_parallelism = 0
parallel_scan_channel_size = 32
allow_stale_entries = false

v0y4g3r (Contributor) commented May 10, 2024

We've confirmed the issue and will fix it ASAP.

v0y4g3r (Contributor) commented May 14, 2024

This bug is related to the internal encoding of GreptimeDB's memtable.

In PartitionTreeMemtable, we use two encodings for the primary keys. In the outer PartitionTree, we use a sparse encoder to facilitate fast primary key lookups:

self.sparse_encoder
.encode_to_vec(kv.primary_keys(), pk_buffer)?;

and we store the mapping from sparse key to pk index:

inner.pk_to_pk_id.insert(sparse_key, pk_id);

In the inner Partition, by contrast, we use a memory-comparable encoding for the primary keys so that we can restore the primary key values from the encoded bytes when reading memtables:

row_codec.encode_to_vec(key_value.primary_keys(), primary_key)?;

and we also store the mem-comparable encoded primary key in a dictionary, which assigns it a pk index:

let pk_index = self.dict_builder.insert_key(primary_key, metrics);

But when a PartitionTreeMemtable is frozen, for example when it is flushed to disk, the inner dictionary may re-order the keys:

for (i, pk_index) in self.pk_to_index.values_mut().enumerate() {

For some primary key combinations, the sparsely encoded bytes and the mem-comparable encoded bytes end up associated with two different pk indices (because the byte orderings of the sparse encoding and the mem-comparable encoding differ), so the last inserted metric is resolved to an incorrect label.
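
As a simplified illustration of that effect, here is a minimal, runnable Rust sketch. It is not GreptimeDB code: the "sparse" and "mem-comparable" encodings and all names are hypothetical, and it only shows how a pk index map built against one key representation goes stale once the dictionary holding the other representation is re-sorted on freeze.

use std::collections::HashMap;

fn main() {
    // Pretend "daily" was inserted before "10min".
    let insertion_order = ["daily", "10min"];

    // Outer map: sparse-encoded key -> pk index assigned at insertion time.
    let mut sparse_to_index: HashMap<String, usize> = HashMap::new();
    // Inner dictionary: mem-comparable keys, initially in insertion order.
    let mut dict: Vec<String> = Vec::new();

    for (i, label) in insertion_order.iter().enumerate() {
        sparse_to_index.insert(format!("sparse:{label}"), i);
        dict.push(format!("mcmp:{label}"));
    }

    // Freeze: the dictionary sorts its keys, which reassigns pk indices by
    // sorted position. The outer map keeps the old, now stale, indices.
    dict.sort();

    // A later write for "10min" resolves through the outer map to index 1,
    // but slot 1 of the sorted dictionary now holds the "daily" key.
    let idx = sparse_to_index["sparse:10min"];
    println!("lookup for 10min -> pk index {idx} -> {}", dict[idx]);
    // Prints: lookup for 10min -> pk index 1 -> mcmp:daily
}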

#3927 fixes this by also carrying the sparsely encoded key in the inner dictionary, so that when the dictionary is reordered it can update the outer mapping:

https://github.com/GreptimeTeam/greptimedb/pull/3927/files#diff-c0a79e64c7242941bbf66373b6b9d7a407184c3826e88118d8b716b80b0972dcR141-R143
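
A rough sketch of that idea (hypothetical types and names, not the actual patch): each dictionary entry also carries its sparse key, so the freeze step can re-sort by the mem-comparable key and then repoint the outer map to the new positions.

use std::collections::HashMap;

// Each dictionary entry keeps both encodings of the same primary key.
struct DictEntry {
    sparse_key: Vec<u8>,
    mcmp_key: Vec<u8>,
}

// On freeze, sort by the mem-comparable key as before, then walk the new
// order and update the outer sparse_key -> pk index mapping accordingly.
fn freeze(entries: &mut Vec<DictEntry>, sparse_to_index: &mut HashMap<Vec<u8>, usize>) {
    entries.sort_by(|a, b| a.mcmp_key.cmp(&b.mcmp_key));
    for (new_index, entry) in entries.iter().enumerate() {
        sparse_to_index.insert(entry.sparse_key.clone(), new_index);
    }
}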

@0neSe7en Big thanks for your feedback!

0neSe7en (Author) commented:

Awesome, thanks for your work and explanation!
