Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE] Attribute isBitcast to PTR_TO_INT pseudoinstruction #388

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sayon
Copy link

@sayon sayon commented Feb 10, 2024

Code Review Checklist

Purpose

Correctly attribute pointer to int pseudoinstruction as a bitcast instruction.

Ticket Number

CP-1533

Requirements

  • Have the requirements been met?
  • Have stakeholder(s) approved the change?

Implementation

  • Does this code change accomplish what it is supposed to do?
  • Can this solution be simplified?
  • Does this change add unwanted compile-time or run-time dependencies?
  • Could an additional framework, API, library, or service improve the solution?
  • Could we reuse part of LLVM instead of implementing the patch or a part of it?
  • Is the code at the right abstraction level?
  • Is the code modular enough?
  • Can a better solution be found in terms of maintainability, readability, performance, or security?
  • Does similar functionality already exist in the codebase? If yes, why isn’t it reused?
  • Are there any best practices, design patterns or language-specific patterns that could substantially improve this code?

Logic Errors and Bugs

  • Can you think of any use case in which the
    code does not behave as intended?
  • Can you think of any inputs or external events
    that could break the code?

Error Handling and Logging

  • Is error handling done the correct way?
  • Should any logging or debugging information
    be added or removed?
  • Are error messages user-friendly?
  • Are there enough log events and are they
    written in a way that allows for easy
    debugging?

Maintainability

  • Is the code easy to read?
  • Is the code not repeated (DRY Principle)?
  • Is the code method/class not too long?

Dependencies

  • Were updates to documentation, configuration, or readme files made as required by this change?
  • Are there any potential impacts on other parts of the system or backward compatibility?

Security

  • Does the code introduce any security vulnerabilities?

Performance

  • Do you think this code change decreases
    system performance?
  • Do you see any potential to improve the
    performance of the code significantly?

Testing and Testability

  • Is the code testable?
  • Have automated tests been added, or have related ones been updated to cover the change?
    • For changes to mutable state
  • Do tests reasonably cover the code change (unit/integration/system tests)?
    • Line Coverage
    • Region Coverage
    • Branch Coverage
  • Are there some test cases, input or edge cases
    that should be tested in addition?

Readability

  • Is the code easy to understand?
  • Which parts were confusing to you and why?
  • Can the readability of the code be improved by
    smaller methods?
  • Can the readability of the code be improved by
    different function, method or variable names?
  • Is the code located in the right
    file/folder/package?
  • Do you think certain methods should be
    restructured to have a more intuitive control
    flow?
  • Is the data flow understandable?
  • Are there redundant or outdated comments?
  • Could some comments convey the message
    better?
  • Would more comments make the code more
    understandable?
  • Could some comments be removed by making the code itself more readable?
  • Is there any commented-out code?

Documentation

  • Is there sufficient documentation?
  • Is the ReadMe.md file up to date?

Best Practices

  • Follow Single Responsibility principle?
  • Are different errors handled correctly?
  • Are errors and warnings logged?
  • Magic values avoided?
  • No unnecessary comments?
  • Minimal nesting used?

Experts' Opinion

  • Do you think a specific expert, like a security
    expert or a usability expert, should look over
    the code before it can be accepted?
  • Will this code change impact different teams, and should they review the change as well?

@sayon sayon changed the title Attribute isBitcast to PTR_TO_INT pseudoinstruction [DO NOT MERGE] Attribute isBitcast to PTR_TO_INT pseudoinstruction Feb 10, 2024
@sayon
Copy link
Author

sayon commented Feb 10, 2024

When I run the tests locally with --load-system-contracts system-contracts-stable-build at least some of the failing tests pass. Seems that a problem occurs in system contracts.

In most or all of the failing tests exception: true, is present:

Y+M3B3 0.8.24     FAILED tests/solidity/simple/immutable/unaligned/mixed_data_origin.sol::main[main:1] (expected (
    return_data: [
        "0x0000000000000000000000000000000000000000000000000000000000000094",
    ],
    exception: false,
    events: [],
), found (
    return_data: [],
    exception: true,
    events: [],
), calldata ab3ae255000000000000000000000000000000000000000000000000000000000000002a)

@sayon sayon force-pushed the iz-cpr-1533-attributing-isbitcast branch from 4fdb83a to d824481 Compare April 17, 2024 17:07
Copy link

github-actions bot commented Apr 20, 2024

Benchmark results:

╔═╡ Size (-%) ╞════════════════╡ All M3B3 ╞═╗
║ Mean                                0.002 ║
║ Best                                3.077 ║
║ Worst                               0.000 ║
║ Total                               0.002 ║
╠═╡ Cycles (-%) ╞══════════════╡ All M3B3 ╞═╣
║ Mean                                0.004 ║
║ Best                                3.571 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════════════╡ All M3B3 ╞═╣
║ Mean                                0.001 ║
║ Best                                3.571 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════════════╡ All MzB3 ╞═╗
║ Mean                                0.002 ║
║ Best                                3.175 ║
║ Worst                              -2.985 ║
║ Total                               0.003 ║
╠═╡ Cycles (-%) ╞══════════════╡ All MzB3 ╞═╣
║ Mean                                0.005 ║
║ Best                                3.846 ║
║ Worst                              -0.546 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════════════╡ All MzB3 ╞═╣
║ Mean                                0.001 ║
║ Best                                2.317 ║
║ Worst                              -0.721 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles M3B3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞════════╡ Precompiles MzB3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞══════╡ Precompiles MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞════════╡ Precompiles MzB3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life M3B3 ╞═╗
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Cycles (-%) ╞════════╡ Real life M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life M3B3 ╞═╣
║ Mean                                0.000 ║
║ Best                                0.000 ║
║ Worst                               0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

╔═╡ Size (-%) ╞══════════╡ Real life MzB3 ╞═╗
║ Mean                                0.043 ║
║ Best                                0.593 ║
║ Worst                               0.000 ║
║ Total                               0.021 ║
╠═╡ Cycles (-%) ╞════════╡ Real life MzB3 ╞═╣
║ Mean                                0.229 ║
║ Best                                3.333 ║
║ Worst                              -0.076 ║
║ Total                               0.013 ║
╠═╡ Ergs (-%) ╞══════════╡ Real life MzB3 ╞═╣
║ Mean                                0.040 ║
║ Best                                1.382 ║
║ Worst                              -0.000 ║
║ Total                               0.000 ║
╚═══════════════════════════════════════════╝

@sayon
Copy link
Author

sayon commented Apr 22, 2024

With 'isBitcast = 1' on PTR_TO_INT, the offender is the register coalescer.

IR dump after register coalescer with isBitcast = 0

bb.0.entry:
  successors: %bb.4(0x40000000), %bb.1(0x40000000); %bb.4(50.00%), %bb.1(50.00%)
  liveins: $r1, $r2
  %15:gr256 = COPY $r   2
>>>
  %14:grptr = fatptr COPY $r1
  PTR_ADDrrs_s %14:grptr, $r0, i256 0, i256 0, @ptr_calldata, 0 :: (store (s256) into @ptr_calldata)
>>>

  %26:gr256 = ADDirr_s i256 128, $r0, 0
  ST1i i256 64, %26:gr256, i256 0 :: (store (s256) into `ptr addrspace(1) inttoptr (i256 64 to ptr addrspace(1))`, align 64, addrspace 1)
  %127:gr256 = COPY %14:grptr
  %28:gr256 = SHRxrr_s i256 96, %127:gr256, 0
  %0:gr256 = ANDcrr_s %const.0, 0, %28:gr256, 0
  dead %29:gr256 = ANDirr_v i256 1, %15:gr256, 0, implicit-def $flags
  JC %bb.4, i256 2, implicit $flags
  J %bb.1

IR dump after register coalescer with isBitcast = 1

bb.0.entry:
  successors: %bb.4(0x40000000), %bb.1(0x40000000); %bb.4(50.00%), %bb.1(50.00%)
  liveins: $r1, $r2
>>>
  %15:gr256 = COPY $r2
  %14:gr256 = fatptr COPY $r1
>>>
  PTR_ADDrrs_s %14:gr256, $r0, i256 0, i256 0, @ptr_calldata, 0 :: (store (s256) into @ptr_calldata)
  %26:gr256 = ADDirr_s i256 128, $r0, 0
  ST1i i256 64, %26:gr256, i256 0 :: (store (s256) into `ptr addrspace(1) inttoptr (i256 64 to ptr addrspace(1))`, align 64, addrspace 1)
  %28:gr256 = SHRxrr_s i256 96, %14:gr256, 0
  %0:gr256 = ANDcrr_s %const.0, 0, %28:gr256, 0
  dead %29:gr256 = ANDirr_v i256 1, %15:gr256, 0, implicit-def $flags
  JC %bb.4, i256 2, implicit $flags
  J %bb.1

Fragment of debug output for regalloc

96B	%127:gr256 = COPY %14:grptr
	Considering merging to GR256 with %127 in %14
		RHS = %127 [96r,112r:0) 0@96r  weight:0.000000e+00
		LHS = %14 [32r,192B:0)[432B,1008r:0)[1248B,1696r:0) 0@32r  weight:0.000000e+00
		merge %127:0@96r into %14:0@32r --> @32r
		erased:	96r	%127:gr256 = COPY %14:grptr
		updated: 112B	%28:gr256 = SHRxrr_s i256 96, %14:gr256, 0
	Success: %127 -> %14
	Result = %14 [32r,192B:0)[432B,1008r:0)[1248B,1696r:0) 0@32r  weight:0.000000e+00

The problem is therefore in coalescing a COPY which is not actually a copy by semantic. It is more like a copy from a subregister, but there is no hierarchy on grptr/gr256.

I think this may be important because any COPY instruction between regclasses (int->ptr specifically) is affected.

Fix

Currently, the fix is to disable cross-regclass joining of live intervals . It requires either patching LLVM source code (which is ugly but effective) or making use of their subregister machinery (which would have prevented this from happening, but requires some redesign of our regclasses and other parts of TD files).

The gains here do not seem to be substantial so maybe we should just keep it for later. Otherwise we have to patch ~90 broken lit tests. WDYT?

@akiramenai
Copy link
Collaborator

@sayon what is the asm we finally emit for the basic block you shown?

@sayon
Copy link
Author

sayon commented Apr 23, 2024

Without bitcast, or with bitcast but no cross-regclass coalescing

.func_begin0:
    ...
add	r0, r0, r1
ptr.add	r4, r0, stack-[7]
near_call	r0, @__sha3, @DEFAULT_UNWIND
ptr.add	stack-[7], r0, r2
ptr.add.s	36, r2, r2
ld	r2, r2

With bitcast, bugged

.func_begin0:
    ...
add	r0, r0, r1
add	r4, r0, stack-[7]               ; 32-byte Folded Spill
near_call	r0, @__sha3, @DEFAULT_UNWIND
add	stack-[7], r0, r2               ; 32-byte Folded Reload
ptr.add.s	36, r2, r2
ld	r2, r2

; %bb.10:                               ; %abi_decode_address.exit.i
	st.1	0, r1
	st.1	32, r0
	add	r0, r0, r1
	add	r4, r0, stack-[7]               ; 32-byte Folded Spill
	near_call	r0, @__sha3, @DEFAULT_UNWIND
	add	stack-[7], r0, r2               ; 32-byte Folded Reload
	ptr.add.s	36, r2, r2
    ld	r2, r2
    

bitcast-0.zasm.txt
bitcast-1-bugged.zasm.txt

@akiramenai
Copy link
Collaborator

@sayon thank you.
If I got it right cross regclass coalescing is what we desired to enable. I wonder if it's possible to not to touch original register classes when coalescing instead of prohibit doing it cross class.

@sayon
Copy link
Author

sayon commented Apr 23, 2024

Bitcast instructions are basically converted to COPY by peephole optimizer. If in a cross-class copy we keep the original register classes, then we will end up with instructions like SHR taking grptr as a register class. If this is ok, I can try it out.

@akiramenai
Copy link
Collaborator

@sayon if it's easy to try, please do.
Technically arithmetic instructions with fatptr are ok, they are just interpreted as numbers. LLVM's verifiers might complain, but we can check it instead of guessing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants