[MC] Refine the definition of NOP instructions in TableGen files #483

Open
atrosinenko opened this issue Apr 16, 2024 · 6 comments
@atrosinenko
Collaborator

atrosinenko commented Apr 16, 2024

The spec describes the NOP encoding as mostly similar to that of arithmetic instructions, except that it has no modifiers:

OpNoOp src dst ⇒ 1 + 4 × src + 1 × dst

that is, 6 variants of source operands, 4 variants of destination operands.
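
To make the count concrete, here is a minimal Python sketch of the formula above. The function name `nop_opcode` is hypothetical, and treating `src` and `dst` as zero-based indices (0..5 and 0..3) is an assumption rather than something the spec excerpt states.

```python
def nop_opcode(src: int, dst: int) -> int:
    """Opcode value per the formula: OpNoOp src dst => 1 + 4*src + 1*dst.

    Assumption: src and dst are zero-based operand-kind indices.
    """
    if not 0 <= src < 6:
        raise ValueError("src must be in 0..5 (6 source operand kinds)")
    if not 0 <= dst < 4:
        raise ValueError("dst must be in 0..3 (4 destination operand kinds)")
    return 1 + 4 * src + dst


# Enumerate every (src, dst) combination covered by the formula.
all_opcodes = sorted(nop_opcode(s, d) for s in range(6) for d in range(4))
```

Under that assumption the formula covers 24 distinct, contiguous opcode values, which is the encoding space the question below is about.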

The question is how many instructions should be defined in TableGen files:

  • 4x2 variants, similar to arithmetic instructions (inputs: reg, imm, code, "any stack" x outputs: reg, "any stack")
  • 3 variants: nop, incsp, decsp
  • something in between?
@atrosinenko atrosinenko self-assigned this Apr 16, 2024
@asl
Collaborator

asl commented Apr 16, 2024

@atrosinenko Some of this encoding space will be taken by stack push / pop instructions in #453

@sayon

sayon commented Apr 16, 2024

I confirm that after the redesign nop should have no operands. At the moment, three distinct instructions (nop, spadd, spdec) share the same concrete mnemonic (marked "legacy"): see here.

The type asm_instruction describes distinct instructions:

 | OpNoOp
 | OpSpAdd (in1: in_reg) (ofs: imm_in)
 | OpSpSub (in1: in_reg) (ofs: imm_in)

The matching between asm and mach variants transforms all three into NOP: see here and here.

@atrosinenko
Collaborator Author

As far as I understand, the spec on binary encoding of the instructions mentions 24 variants of NOP (6 source operand kinds x 4 destination operand kinds). On the assembly side, the new syntax only defines three instructions: incsp (note that reg+imm, reg and imm operands of incsp are essentially the same stack+=[...] output operand kind), decsp and no-operand nop.

@sayon @hedgar2017 On encoding, the three assembly mnemonics can be mapped to the SrcReg+DstSpRelativePush, SrcSpRelativePop+DstReg and SrcReg+DstReg variants. What is expected on decoding: rejecting the 21 other opcodes, silently converting all non-SP-modifying operand modes to SrcReg/DstReg at some point, or something else?

@atrosinenko atrosinenko assigned sayon and hedgar2017 and unassigned atrosinenko May 22, 2024
@atrosinenko
Collaborator Author

atrosinenko commented May 22, 2024

  • among these 24 variants of NOP, the strangest one is nop stack-=[...], reg, stack+=[...]
  • updated the description: the question is how to model NOP opcodes in the backend

@asl
Collaborator

asl commented May 22, 2024

@atrosinenko I doubt @sayon @hedgar2017 could answer your question about "what should be in tablegen files". If the spec is unclear, maybe you can reformulate your question in terms of assembler behavior?

@atrosinenko
Collaborator Author

From the user's point of view:

  • the spec on binary encoding assumes 24 variants of NOP instructions: OpNoOp src dst ⇒ 1 + 4 × src + 1 × dst
  • the spec on assembly syntax defines three different instructions: incsp and decsp (with register and/or immediate operands) and no-operand form of nop
  • each of nop, incsp and decsp can be mapped to one particular 11-bit opcode value, say, nop to 1, incsp to 2 and decsp to 5 (assuming unused operands are encoded as registers)
  • no "generic nop" assembly instruction is defined

The question is: do we want to understand non-canonical uses of NOP? For example, someone might use the spec to implement their own code emission, and we might want to disassemble the result for debugging purposes (via LLDB, objdump, etc.).

For example, nop (the precise syntax described by the spec) can be encoded as arith_nop r0, r0, r0 (explanatory syntax derived from arithmetic instructions), incsp reg+imm as arith_nop r0, r0, stack+=[reg+imm] and decsp reg+imm as arith_nop stack-=[reg+imm], r0, r0. Then, if the disassembler gets arith_nop 0, r0, stack+=[reg+imm] (SrcImm input operand, meaning the 11-bit opcode field is 18 instead of 2) or arith_nop r0, r1, stack+=[reg+imm] (meaning rs1 is non-canonical), both can be naturally understood as incsp reg+imm (losing the semantically insignificant details).
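
A minimal Python sketch of this lenient decoding, to show what "naturally understood" could mean in practice. The function name and constants are hypothetical. The operand-kind indices are only inferred from the numbers quoted in this thread (incsp = 2 implies the stack+=[...] destination kind is index 1, decsp = 5 implies the stack-=[...] source kind is index 1); they are not taken from the spec itself.

```python
from typing import Optional

SRC_SP_POP = 1   # assumed index of the stack-=[...] source kind (decsp -> 5)
DST_SP_PUSH = 1  # assumed index of the stack+=[...] destination kind (incsp -> 2)


def canonical_mnemonic(opcode: int) -> Optional[str]:
    """Map any of the 24 NOP opcodes to the mnemonic with the same SP effect.

    Returns None for the pop+push combination, for which the thread
    defines no assembly mnemonic.
    """
    if not 1 <= opcode <= 24:
        raise ValueError("not a NOP opcode")
    # Invert opcode = 1 + 4*src + dst.
    src, dst = divmod(opcode - 1, 4)
    pops = src == SRC_SP_POP
    pushes = dst == DST_SP_PUSH
    if pops and pushes:
        return None
    if pushes:
        return "incsp"
    if pops:
        return "decsp"
    # Neither operand touches SP, so the source/destination kinds are
    # semantically insignificant and the instruction is a plain nop.
    return "nop"
```

With these assumptions, opcode 18 (SrcImm source, stack+=[...] destination) decodes to incsp just like the canonical opcode 2.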

Another (easier to implement) option could be replacing the line

OpNoOp src dst ⇒ 1 + 4 × src + 1 × dst

with

OpNoOp ⇒ 1
OpIncSp ⇒ 2
OpDecSp ⇒ 5

assuming that all operand fields must be zero for OpNoOp, that only rd0 and Imm1 can be non-zero for OpIncSp, and that only rs0 and Imm0 can be non-zero for OpDecSp (and there were some opcodes defined in between, but they do not exist in the 1.5.0 version of the ISA).
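
A minimal Python sketch of what the strict option could look like on the decoding side. The function and table names are hypothetical, and the field names passed as keyword arguments (rs0, rs1, rd0, imm0, imm1, ...) are assumed spellings of the operand fields mentioned above, not an authoritative field list.

```python
# Opcode -> (mnemonic, operand fields allowed to be non-zero), as proposed above.
ALLOWED_NONZERO = {
    1: ("nop", set()),
    2: ("incsp", {"rd0", "imm1"}),
    5: ("decsp", {"rs0", "imm0"}),
}


def decode_strict(opcode: int, **fields: int) -> str:
    """Accept only the three canonical opcodes with zeroed unused fields."""
    if opcode not in ALLOWED_NONZERO:
        raise ValueError(f"opcode {opcode} is not defined")
    mnemonic, allowed = ALLOWED_NONZERO[opcode]
    # Any non-zero field outside the allowed set makes the encoding invalid.
    bad = sorted(name for name, value in fields.items()
                 if value != 0 and name not in allowed)
    if bad:
        raise ValueError(f"{mnemonic}: fields must be zero: {', '.join(bad)}")
    return mnemonic
```

For instance, `decode_strict(2, rd0=3, imm1=8)` is accepted as incsp, while opcode 18 or a nop with a non-zero rs1 is rejected instead of being silently canonicalized.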
