Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] load_relcache_init_file should clear yb_table_properties #22342

Closed
1 task done
myang2021 opened this issue May 10, 2024 · 0 comments
Closed
1 task done

[YSQL] load_relcache_init_file should clear yb_table_properties #22342

myang2021 opened this issue May 10, 2024 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@myang2021
Copy link
Contributor

myang2021 commented May 10, 2024

Jira Link: DB-11248

Description

typedef struct RelationData
{
    ...
    YbTableProperties yb_table_properties; /* NULL if not loaded */
    ...
} RelationData;

In write_relcache_init_file

        /* first write the relcache entry proper */
        write_item(rel, sizeof(RelationData), fp);

This is going to write the contents of RelationData as raw bytes to the rel cache init file fp.

In load_relcache_init_file, we load the raw bytes back into RelationData:

        /* first read the relation descriptor length */
        nread = fread(&len, 1, sizeof(len), fp);
        if (nread != sizeof(len))
        {
            if (nread == 0)
                break;          /* end of file */
            goto read_failed;
        }

        /* safety check for incompatible relcache layout */
        if (len != sizeof(RelationData))
            goto read_failed;

        /* allocate another relcache header */
        if (num_rels >= max_rels)
        {
            max_rels *= 2;
            rels = (Relation *) repalloc(rels, max_rels * sizeof(Relation));
        }

        rel = rels[num_rels++] = (Relation) palloc(len);

        /* then, read the Relation structure */
        if (fread(rel, 1, len, fp) != len)
            goto read_failed;


The bug here is that yb_table_properties is a pointer, it is a private memory address in the PG backend process that invoked write_relcache_init_file, and then it will be interpreted within the PG backend process that invoked load_relcache_init_file. It is not valid to do that because it can point to some random bytes meant for other things. In the worst case the same raw pointer may point to an address that is not even mmap'ed into the PG backend process and can lead to SEGV if accessed.

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@myang2021 myang2021 added the area/ysql Yugabyte SQL (YSQL) label May 10, 2024
@myang2021 myang2021 self-assigned this May 10, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels May 10, 2024
myang2021 added a commit that referenced this issue May 10, 2024
Summary:
1. Related data structures:

```
typedef struct YbTablePropertiesData* YbTableProperties;

typedef struct RelationData
{
    ...
    YbTableProperties yb_table_properties; /* NULL if not loaded */
    ...
} RelationData;
```

Note that `yb_table_properties` is a pointer.

2. In `write_relcache_init_file`

```
        /* first write the relcache entry proper */
        write_item(rel, sizeof(RelationData), fp);
```
This is going to write the contents of RelationData as raw bytes to the rel cache init file fp.

3. In `load_relcache_init_file`, we load the raw bytes back into RelationData:

```
        /* first read the relation descriptor length */
        nread = fread(&len, 1, sizeof(len), fp);
        if (nread != sizeof(len))
        {
            if (nread == 0)
                break;          /* end of file */
            goto read_failed;
        }

        /* safety check for incompatible relcache layout */
        if (len != sizeof(RelationData))
            goto read_failed;

        /* allocate another relcache header */
        if (num_rels >= max_rels)
        {
            max_rels *= 2;
            rels = (Relation *) repalloc(rels, max_rels * sizeof(Relation));
        }

        rel = rels[num_rels++] = (Relation) palloc(len);

        /* then, read the Relation structure */
        if (fread(rel, 1, len, fp) != len)
            goto read_failed;
```

4. The bug here is that `yb_table_properties` is a pointer, it is a private memory
address in the PG backend process that invoked `write_relcache_init_file`, and
then it will be interpreted within the PG backend process that invoked
`load_relcache_init_file`. It is not valid to do that because it can point to some
random bytes meant for other things. In the worst case the same raw pointer may
point to an address that is not even mmap'ed into the PG backend process that
loads the data and can lead to SEGV if accessed.

This diff fixes the bug by adding to `load_relcache_init_file`:
```
        /* YB properties will be loaded lazily */
        rel->yb_table_properties = NULL;
```
We are already clearing yb_table_properties in `AllocateRelationDesc`.

Note:

This appears to be a recent change. In the past when `write_relcache_init_file` is invoked, we have not set `yb_table_properties` but after commit `07db0b57cee0aee372a6820638fd07cb20c499a9` we are because
in `ybcSetupScanPlan` we now call `YbGetTableProperties`:

```
static void
ybcSetupScanPlan(bool xs_want_itup, YbScanDesc ybScan, YbScanPlan scan_plan)
{
    Relation relation = ybScan->relation;
    Relation index = ybScan->index;
    YbTableProperties yb_table_prop_relation = YbGetTableProperties(relation);
```
Jira: DB-11248

Test Plan: jenkins

Reviewers: kfranz, fizaa

Reviewed By: fizaa

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34931
jharveysmith pushed a commit that referenced this issue May 24, 2024
Summary:
1. Related data structures:

```
typedef struct YbTablePropertiesData* YbTableProperties;

typedef struct RelationData
{
    ...
    YbTableProperties yb_table_properties; /* NULL if not loaded */
    ...
} RelationData;
```

Note that `yb_table_properties` is a pointer.

2. In `write_relcache_init_file`

```
        /* first write the relcache entry proper */
        write_item(rel, sizeof(RelationData), fp);
```
This is going to write the contents of RelationData as raw bytes to the rel cache init file fp.

3. In `load_relcache_init_file`, we load the raw bytes back into RelationData:

```
        /* first read the relation descriptor length */
        nread = fread(&len, 1, sizeof(len), fp);
        if (nread != sizeof(len))
        {
            if (nread == 0)
                break;          /* end of file */
            goto read_failed;
        }

        /* safety check for incompatible relcache layout */
        if (len != sizeof(RelationData))
            goto read_failed;

        /* allocate another relcache header */
        if (num_rels >= max_rels)
        {
            max_rels *= 2;
            rels = (Relation *) repalloc(rels, max_rels * sizeof(Relation));
        }

        rel = rels[num_rels++] = (Relation) palloc(len);

        /* then, read the Relation structure */
        if (fread(rel, 1, len, fp) != len)
            goto read_failed;
```

4. The bug here is that `yb_table_properties` is a pointer, it is a private memory
address in the PG backend process that invoked `write_relcache_init_file`, and
then it will be interpreted within the PG backend process that invoked
`load_relcache_init_file`. It is not valid to do that because it can point to some
random bytes meant for other things. In the worst case the same raw pointer may
point to an address that is not even mmap'ed into the PG backend process that
loads the data and can lead to SEGV if accessed.

This diff fixes the bug by adding to `load_relcache_init_file`:
```
        /* YB properties will be loaded lazily */
        rel->yb_table_properties = NULL;
```
We are already clearing yb_table_properties in `AllocateRelationDesc`.

Note:

This appears to be a recent change. In the past when `write_relcache_init_file` is invoked, we have not set `yb_table_properties` but after commit `7e9e6eabe0f408d6c4a307eeaaa612b5910d1c60` we are because
in `ybcSetupScanPlan` we now call `YbGetTableProperties`:

```
static void
ybcSetupScanPlan(bool xs_want_itup, YbScanDesc ybScan, YbScanPlan scan_plan)
{
    Relation relation = ybScan->relation;
    Relation index = ybScan->index;
    YbTableProperties yb_table_prop_relation = YbGetTableProperties(relation);
```
Jira: DB-11248

Test Plan: jenkins

Reviewers: kfranz, fizaa

Reviewed By: fizaa

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34931
svarnau pushed a commit that referenced this issue May 25, 2024
Summary:
1. Related data structures:

```
typedef struct YbTablePropertiesData* YbTableProperties;

typedef struct RelationData
{
    ...
    YbTableProperties yb_table_properties; /* NULL if not loaded */
    ...
} RelationData;
```

Note that `yb_table_properties` is a pointer.

2. In `write_relcache_init_file`

```
        /* first write the relcache entry proper */
        write_item(rel, sizeof(RelationData), fp);
```
This is going to write the contents of RelationData as raw bytes to the rel cache init file fp.

3. In `load_relcache_init_file`, we load the raw bytes back into RelationData:

```
        /* first read the relation descriptor length */
        nread = fread(&len, 1, sizeof(len), fp);
        if (nread != sizeof(len))
        {
            if (nread == 0)
                break;          /* end of file */
            goto read_failed;
        }

        /* safety check for incompatible relcache layout */
        if (len != sizeof(RelationData))
            goto read_failed;

        /* allocate another relcache header */
        if (num_rels >= max_rels)
        {
            max_rels *= 2;
            rels = (Relation *) repalloc(rels, max_rels * sizeof(Relation));
        }

        rel = rels[num_rels++] = (Relation) palloc(len);

        /* then, read the Relation structure */
        if (fread(rel, 1, len, fp) != len)
            goto read_failed;
```

4. The bug here is that `yb_table_properties` is a pointer, it is a private memory
address in the PG backend process that invoked `write_relcache_init_file`, and
then it will be interpreted within the PG backend process that invoked
`load_relcache_init_file`. It is not valid to do that because it can point to some
random bytes meant for other things. In the worst case the same raw pointer may
point to an address that is not even mmap'ed into the PG backend process that
loads the data and can lead to SEGV if accessed.

This diff fixes the bug by adding to `load_relcache_init_file`:
```
        /* YB properties will be loaded lazily */
        rel->yb_table_properties = NULL;
```
We are already clearing yb_table_properties in `AllocateRelationDesc`.

Note:

This appears to be a recent change. In the past when `write_relcache_init_file` is invoked, we have not set `yb_table_properties` but after commit `07db0b57cee0aee372a6820638fd07cb20c499a9` we are because
in `ybcSetupScanPlan` we now call `YbGetTableProperties`:

```
static void
ybcSetupScanPlan(bool xs_want_itup, YbScanDesc ybScan, YbScanPlan scan_plan)
{
    Relation relation = ybScan->relation;
    Relation index = ybScan->index;
    YbTableProperties yb_table_prop_relation = YbGetTableProperties(relation);
```
Jira: DB-11248

Test Plan: jenkins

Reviewers: kfranz, fizaa

Reviewed By: fizaa

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34931
svarnau pushed a commit that referenced this issue May 25, 2024
Summary:
1. Related data structures:

```
typedef struct YbTablePropertiesData* YbTableProperties;

typedef struct RelationData
{
    ...
    YbTableProperties yb_table_properties; /* NULL if not loaded */
    ...
} RelationData;
```

Note that `yb_table_properties` is a pointer.

2. In `write_relcache_init_file`

```
        /* first write the relcache entry proper */
        write_item(rel, sizeof(RelationData), fp);
```
This is going to write the contents of RelationData as raw bytes to the rel cache init file fp.

3. In `load_relcache_init_file`, we load the raw bytes back into RelationData:

```
        /* first read the relation descriptor length */
        nread = fread(&len, 1, sizeof(len), fp);
        if (nread != sizeof(len))
        {
            if (nread == 0)
                break;          /* end of file */
            goto read_failed;
        }

        /* safety check for incompatible relcache layout */
        if (len != sizeof(RelationData))
            goto read_failed;

        /* allocate another relcache header */
        if (num_rels >= max_rels)
        {
            max_rels *= 2;
            rels = (Relation *) repalloc(rels, max_rels * sizeof(Relation));
        }

        rel = rels[num_rels++] = (Relation) palloc(len);

        /* then, read the Relation structure */
        if (fread(rel, 1, len, fp) != len)
            goto read_failed;
```

4. The bug here is that `yb_table_properties` is a pointer, it is a private memory
address in the PG backend process that invoked `write_relcache_init_file`, and
then it will be interpreted within the PG backend process that invoked
`load_relcache_init_file`. It is not valid to do that because it can point to some
random bytes meant for other things. In the worst case the same raw pointer may
point to an address that is not even mmap'ed into the PG backend process that
loads the data and can lead to SEGV if accessed.

This diff fixes the bug by adding to `load_relcache_init_file`:
```
        /* YB properties will be loaded lazily */
        rel->yb_table_properties = NULL;
```
We are already clearing yb_table_properties in `AllocateRelationDesc`.

Note:

This appears to be a recent change. In the past when `write_relcache_init_file` is invoked, we have not set `yb_table_properties` but after commit `07db0b57cee0aee372a6820638fd07cb20c499a9` we are because
in `ybcSetupScanPlan` we now call `YbGetTableProperties`:

```
static void
ybcSetupScanPlan(bool xs_want_itup, YbScanDesc ybScan, YbScanPlan scan_plan)
{
    Relation relation = ybScan->relation;
    Relation index = ybScan->index;
    YbTableProperties yb_table_prop_relation = YbGetTableProperties(relation);
```
Jira: DB-11248

Test Plan: jenkins

Reviewers: kfranz, fizaa

Reviewed By: fizaa

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D34931
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

2 participants