Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support random() in batch queries #16694

Open
fuyufjh opened this issue May 10, 2024 · 2 comments
Open

Support random() in batch queries #16694

fuyufjh opened this issue May 10, 2024 · 2 comments
Assignees
Labels
help wanted Issues that need help from contributors type/feature
Milestone

Comments

@fuyufjh
Copy link
Contributor

fuyufjh commented May 10, 2024

Is your feature request related to a problem? Please describe.

Handy for generating testing data.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@github-actions github-actions bot added this to the release-1.10 milestone May 10, 2024
@xiangjinwu
Copy link
Contributor

The corresponding setseed function seems non-trivial. Note the following query is actually deterministic:

test=# select case n when 1 then setseed(0.4) else null end, random() from generate_series(1, 10) as t(n);
 case |       random        
------+---------------------
      |  0.6722690410337004
      |  0.9936586191615264
      | 0.19947461195735405
      |    0.90340354600701
      |   0.516998156895498
      |  0.7913052710048887
      |   0.531550972556847
      | 0.48302401286537666
      | 0.14525565255427209
      | 0.19221106009513123
(10 rows)

So what random() really does in PostgreSQL is mutating a session-level variable:
https://www.postgresql.org/docs/16/sql-set.html#:~:text=TO%20value.-,SEED,-Sets%20the%20internal
https://github.com/postgres/postgres/blob/REL_16_1/src/backend/utils/misc/guc_tables.c#L3695-L3704
https://github.com/postgres/postgres/blob/REL_16_1/src/common/pg_prng.c#L232-L234

Of course we can provide a weaker random() function first without the ability to generate deterministic sequence given a seed.

cc @wangrunji0408 may be more familiar with RNGs given experience working on madsim.

@fuyufjh fuyufjh added the help wanted Issues that need help from contributors label May 13, 2024
@fuyufjh
Copy link
Contributor Author

fuyufjh commented May 13, 2024

So what random() really does in PostgreSQL is mutating a session-level variable:

Of course we can provide a weaker random() function first without the ability to generate deterministic sequence given a seed.

Either LGTM. If the session-level variable approach involves lots of work, I will prefer the 2nd

@wangrunji0408 wangrunji0408 self-assigned this May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issues that need help from contributors type/feature
Projects
None yet
Development

No branches or pull requests

3 participants