[Bug]: [date 5.19]tke regression: mo reported error no such table sysbench_db.sbtest9 #16232

Closed
1 task done
heni02 opened this issue May 20, 2024 · 4 comments
Labels
kind/bug Something isn't working phase/testing severity/s0 Extreme impact: Cause the application to break down and seriously affect the use
@heni02
Contributor

heni02 commented May 20, 2024

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

main

Commit ID

fae5807

Other Environment Information

- Hardware parameters:
- OS type:
- Others:

Actual Behavior

job:https://github.com/matrixorigin/mo-nightly-regression/actions/runs/9148415689/job/25154937743
image (WeCom screenshot 885d325f-465d-4131-8508-23568b737c25)

mo log:
https://grafana.ci.matrixorigin.cn/explore?panes=%7B%22ya3%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-regression-20240519%5C%22%7D%20%7C%3D%20%60no%20such%20table%20sysbench_db.sbtest9%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221716154940507%22,%22to%22:%221716155264102%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

sysbench 100w (1,000,000 rows) write_only 1000 test

Additional information

No response

@heni02 heni02 added kind/bug Something isn't working severity/s0 Extreme impact: Cause the application to break down and seriously affect the use labels May 20, 2024
@heni02 heni02 added this to the 1.2.0 milestone May 20, 2024
@heni02 heni02 modified the milestones: 1.2.0, 1.2.1 May 20, 2024
@jensenojs
Contributor

jensenojs commented May 20, 2024

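Following the mo log link and searching for sbtest9 turns up the logs below. Two excerpts are shown here.

The first excerpt comes after a large number of DELETE FROM sbtest9 xxx statements had run: the log at 21:44:09.640 and the entries after it show that the table was dropped. The series of "table sbtest9 does not exist" errors that follow should therefore count as expected behavior.
image

The logs relevant to this issue are below: sbtest9 was re-created at 21:44:10.478, yet an insert into this table at 21:44:11.166 still reported no such table sysbench_db.sbtest9.
image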

@jensenojs
Contributor

jensenojs commented May 20, 2024

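I asked Han Feng (韩枫) for help and used the logs to narrow the problem down further. The initial suspicion is that something is wrong with the CN's timestamp, though it is not clear what triggers it (transaction retry?). The diagnosis is as follows.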

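  1. At 21:44:11.478 the logs show that the newly created sbtest9 has id 273407 (pod: nightly-regression-dis-dn-0)

    • Meanwhile, the sbtest9 flushed to disk at 21:44:11.239 has id 273393, which is the table that was dropped earlier (pod: nightly-regression-dis-dn-0)
    • At 21:44:10.478 the newly created sbtest9 was flushed to disk; the insert at 21:44:11.166 failed with tbl not found
      image
  2. However, the later logs (first row in the figure below) show that the CN can commit the drop request for 273407 normally

    • Highlighted first row: pod: nightly-regression-dis-dn-0
image (WeCom screenshot 73d1f424-61e4-462d-9486-26ed18dee805)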

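So the creation of 273407 succeeded on the TN. For some reason the CN could not see the table about 1 second later, yet it was able to drop it normally 4 seconds later. The second row in the figure below confirms that 1392 rows were actually inserted into 273407 and 230 rows were deleted.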
image


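Addendum: the CN that created sbtest9 at 21:44:10.478 should be cn-4pvb4, as shown by the log entry at 21:44:10.470, while the later insert error came from cn-hhwxg. The create sbtest9 and the insert into sbtest9 did not run on the same CN.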

create sbtest9 pod :
image

insert pod :
image

@jensenojs
Contributor

During highly concurrent transaction execution there was no sync commit, so external consistency is not guaranteed.
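
To make the conclusion concrete, here is a minimal sketch of why a missing sync commit can break external consistency across CNs. It is a toy model under stated assumptions, not the MatrixOne implementation: the `TN` and `CN` classes, `create_table`, and the idea that sync commit pushes the new commit timestamp to every other CN before the client is acknowledged are all hypothetical simplifications.

```python
from typing import Dict, List


class TN:
    """Toy transaction node: hands out commit timestamps and records table creation."""

    def __init__(self) -> None:
        self.clock = 0
        self.tables: Dict[str, int] = {}  # table name -> commit ts of its creation

    def commit_create(self, name: str) -> int:
        self.clock += 1
        self.tables[name] = self.clock
        return self.clock


class CN:
    """Toy compute node: reads at the latest commit timestamp it knows about."""

    def __init__(self, tn: TN) -> None:
        self.tn = tn
        self.known_ts = 0

    def catch_up(self, ts: int) -> None:
        self.known_ts = max(self.known_ts, ts)

    def insert(self, name: str) -> str:
        created_ts = self.tn.tables.get(name)
        if created_ts is None or created_ts > self.known_ts:
            raise LookupError(f"no such table {name}")
        return "ok"


def create_table(cn: CN, others: List[CN], name: str, sync_commit: bool) -> int:
    """Commit CREATE TABLE through cn; optionally notify the other CNs before returning."""
    ts = cn.tn.commit_create(name)
    cn.catch_up(ts)
    if sync_commit:
        # Assumed behavior: every other CN learns the commit ts before the client is acked.
        for other in others:
            other.catch_up(ts)
    return ts


def run(sync_commit: bool) -> str:
    tn = TN()
    cn_a, cn_b = CN(tn), CN(tn)
    create_table(cn_a, [cn_b], "sysbench_db.sbtest9", sync_commit)
    try:
        return cn_b.insert("sysbench_db.sbtest9")
    except LookupError as err:
        return str(err)


print(run(sync_commit=False))
print(run(sync_commit=True))
```

Without the notification step, the second CN reads with a timestamp older than the create and reports the table as missing even though the create has already been acknowledged to the client. With it, the insert succeeds. In a real system the lagging CN would eventually catch up on its own, which is consistent with the drop succeeding a few seconds later.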

@heni02
Contributor Author

heni02 commented May 28, 2024

Confirmed, closed.

@heni02 heni02 closed this as completed May 28, 2024

3 participants