tact-asm
v0.0.6
Tact Assembly
This repository contains an assembler and disassembler implementation for TVM bitcode.
This implementation provides a complete cycle:
Text -> Internal representation -> Cells -> BoC -> Cells -> Internal representation -> Text.
In other words, text assembly that goes through every compilation and decompilation step comes back as the same text assembly.
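The Text -> Internal representation -> Text part of this cycle can be illustrated with a small sketch. The `Instr` type and the `parse` and `print` functions below are hypothetical and are not this package's API; the sketch also skips the Cells and BoC stages entirely. It only shows what "the same text comes back" means as a checkable property.

```typescript
// Hypothetical IR for illustration only; not the package's real types.
type Instr = { name: string; args: string[]; body?: Instr[] };

// Text -> IR: one instruction per line; a trailing "{" opens a nested block.
function parse(text: string): Instr[] {
    const root: Instr[] = [];
    const stack: Instr[][] = [root];
    for (const raw of text.split("\n")) {
        const line = raw.trim();
        if (line === "") continue;
        if (line === "}") {
            if (stack.length === 1) throw new Error("unbalanced '}'");
            stack.pop();
            continue;
        }
        const current = stack[stack.length - 1];
        if (current === undefined) throw new Error("empty block stack");
        const parts = line.split(/\s+/);
        const opensBlock = parts[parts.length - 1] === "{";
        if (opensBlock) parts.pop();
        const [name, ...args] = parts;
        if (name === undefined) throw new Error("missing instruction name");
        const instr: Instr = opensBlock ? { name, args, body: [] } : { name, args };
        current.push(instr);
        if (instr.body) stack.push(instr.body);
    }
    if (stack.length !== 1) throw new Error("unclosed '{'");
    return root;
}

// IR -> Text: the inverse of parse, with four-space indentation for blocks.
function print(instrs: Instr[], indent: string = ""): string {
    const lines: string[] = [];
    for (const instr of instrs) {
        const head = [instr.name, ...instr.args].join(" ");
        if (instr.body) {
            lines.push(`${indent}${head} {`);
            if (instr.body.length > 0) lines.push(print(instr.body, indent + "    "));
            lines.push(`${indent}}`);
        } else {
            lines.push(indent + head);
        }
    }
    return lines.join("\n");
}

const source = "DUP\nPUSHCONT_SHORT {\n    DROP\n    PUSHINT_16 26856\n}\nIFJMP";
const roundTripped = print(parse(source));
```

Under these assumptions `roundTripped` equals `source`; the real assembler makes the same guarantee across the whole cycle, including the binary stages.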
The internal representation in this implementation is much more convenient to manipulate than the IR in https://github.com/tact-lang/ton-opcode. This makes it possible, for example, to build a bitcode optimizer on top of this assembler that works on raw bitcode rather than on a high-level language like Tact or FunC.
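As a rough idea of what an optimizer over such an IR could look like, here is a minimal peephole pass that removes adjacent `SWAP` `SWAP` pairs. The `Instr` shape and the `removeDoubleSwap` function are hypothetical, not this package's types. The rewrite also assumes the stack holds at least two values at that point; otherwise the original code would throw and the optimized code would not, and gas usage changes either way.

```typescript
// Hypothetical IR for illustration only; not the package's real types.
type Instr = { name: string; args: string[]; body?: Instr[] };

const isSwap = (i: Instr | undefined): boolean =>
    i !== undefined && i.name === "SWAP" && i.args.length === 0;

// Drop adjacent "SWAP SWAP" pairs, recursing into nested continuation bodies.
function removeDoubleSwap(instrs: Instr[]): Instr[] {
    const out: Instr[] = [];
    for (const instr of instrs) {
        const rewritten: Instr = instr.body
            ? { ...instr, body: removeDoubleSwap(instr.body) }
            : instr;
        if (isSwap(rewritten) && isSwap(out[out.length - 1])) {
            out.pop(); // the previous SWAP and this one cancel out
        } else {
            out.push(rewritten);
        }
    }
    return out;
}

const optimized = removeDoubleSwap([
    { name: "DUP", args: [] },
    { name: "SWAP", args: [] },
    { name: "SWAP", args: [] },
    {
        name: "PUSHCONT_SHORT",
        args: [],
        body: [
            { name: "SWAP", args: [] },
            { name: "SWAP", args: [] },
            { name: "DROP", args: [] },
        ],
    },
    { name: "SWAP", args: [] },
]);
```

Because the pass works on instructions rather than on source text, it applies equally to code that originally came from Tact, FunC, or a decompiled contract.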
During compilation, the assembler collects additional mappings that can be used to convert a TVM log into a full trace
that refers to specific instructions in the decompiled version of the contract.
A proof of concept can be found in the ./src/debugger folder; it collects the instructions, gas, and
stack at each step.
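The idea of joining a log with the collected mappings can be sketched as follows. Every name and data shape here is an assumption made for illustration (`LogStep`, `buildTrace`, the `cellHash:offset` key format, and the sample values are all invented); the actual structures live in ./src/debugger and may differ.

```typescript
// All shapes below are hypothetical; see ./src/debugger for the real ones.
type Loc = { cellHash: string; offset: number };
type LogStep = Loc & { gas: number; stack: string[] };
type TraceStep = { line: number; text: string; gas: number; stack: string[] };

const locKey = (loc: Loc): string => `${loc.cellHash}:${loc.offset}`;

// Attach a listing line to every log step whose location is in the mapping.
function buildTrace(
    log: LogStep[],
    mapping: Map<string, number>, // location key -> 1-based listing line
    listing: string[],
): TraceStep[] {
    const trace: TraceStep[] = [];
    for (const step of log) {
        const line = mapping.get(locKey(step));
        if (line === undefined) continue; // location not covered by the mapping
        const text = listing[line - 1];
        if (text === undefined) throw new Error(`mapping points past listing: ${line}`);
        trace.push({ line, text, gas: step.gas, stack: step.stack });
    }
    return trace;
}

// Invented sample data, only to show the shape of the result.
const listing = ["SETCP 0", "DUP", "IFNOTRET"];
const mapping = new Map<string, number>([
    ["aa:0", 1],
    ["aa:16", 2],
    ["aa:24", 3],
]);
const trace = buildTrace(
    [
        { cellHash: "aa", offset: 0, gas: 100, stack: ["1"] },
        { cellHash: "aa", offset: 16, gas: 90, stack: ["1"] },
        { cellHash: "bb", offset: 0, gas: 80, stack: [] },
    ],
    mapping,
    listing,
);
```

Steps whose location is missing from the mapping are skipped here; a real debugger would more likely report them.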
This implementation can also generate a coverage report for a contract from its BoC and the logs from the sandbox.
A proof of concept can be found in the ./src/coverage folder.
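A minimal sketch of how such a report could be computed, assuming the logs have already been resolved to the set of listing lines that were executed. The `coverageReport` function, its result shape, and the `+`/`-` markers are invented for this example and say nothing about the format the code in ./src/coverage actually produces.

```typescript
// Hypothetical report shape; the real one in ./src/coverage may differ.
type Coverage = { covered: number; total: number; percent: number; report: string };

// Mark each listing line as executed ("+") or not ("-") and count the hits.
function coverageReport(listing: string[], executedLines: Set<number>): Coverage {
    let covered = 0;
    const rows: string[] = [];
    listing.forEach((text, index) => {
        const hit = executedLines.has(index + 1); // lines are 1-based
        if (hit) covered += 1;
        rows.push(`${hit ? "+" : "-"} ${text}`);
    });
    const total = listing.length;
    const percent = total === 0 ? 100 : (covered / total) * 100;
    return { covered, total, percent, report: rows.join("\n") };
}

// Invented run: execution returned early at IFNOTRET.
const result = coverageReport(
    ["DUP", "IFNOTRET", "INC", "THROWIF_SHORT 32"],
    new Set<number>([1, 2]),
);
```

Counting per instruction rather than per source line is what working from the BoC makes possible, since no high-level source is needed.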
Validity
The assembler was tested on 106k contracts from the blockchain: it successfully decompiled every contract and compiled it back into equivalent Cells.
Why?
Reasons not to use Fift:
- It is difficult to use on the web: it requires a WASM build of the compiler, which greatly increases the package size
- Fift is not just a plain assembler: it handles INLINECALLDICT in a special way, implicitly splits a Cell into parts when the reference or code size limit is reached, and so on
- We do not control its code, and it is written in itself, which makes modifying it extremely difficult
- Fift implicitly combines several instructions into one, where a specific implementation is selected based on its arguments
- To write custom logic for how to lay out code, we need to write a lot of unreadable Fift code (see the selector hack)
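To make the point about argument-based selection concrete: a single `PUSHINT` mnemonic can map to several encodings depending on the size of its argument. The sketch below picks a variant from the value ranges of the TVM encodings. `PUSHINT_16` and `PUSHINT_LONG` appear in the example contract in this README; the names `PUSHINT_4` and `PUSHINT_8` are assumed by analogy, and `selectPushInt` is a hypothetical helper, not part of this package or of Fift.

```typescript
// Choose the shortest PUSHINT encoding that can hold the value.
// Ranges follow the TVM encodings: 4-bit form (-5..10), 8-bit, 16-bit, long.
function selectPushInt(value: number): string {
    if (!Number.isSafeInteger(value)) {
        throw new RangeError("this sketch only handles safe integers");
    }
    if (value >= -5 && value <= 10) return "PUSHINT_4"; // assumed name
    if (value >= -128 && value <= 127) return "PUSHINT_8"; // assumed name
    if (value >= -32768 && value <= 32767) return "PUSHINT_16";
    return "PUSHINT_LONG";
}

const picked = [0, 100, 512, 26856, 85143].map(selectPushInt);
```

With explicit variant names in the assembly text, this choice is visible to the reader and to tools instead of being made implicitly by the assembler.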
In the future, this assembler will be used as a target for a new backend in the Tact compiler.
Example contract
Hex:
b5ee9c7201010101005e0000b8ff0020dd2082014c97ba94308168e8e0a4f260810200d71820d31f018168e8baf2e052d32f01f823bef2e04dd74c59f9010182f0871740fee9829718bd7dfddb66c215758869fdb682780dc6298283dd6134d67ff910f2a3f800ed55
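For readers unfamiliar with the BoC container, the following sketch reads the header of the hex string above according to the `serialized_boc#b5ee9c72` layout. The function and type names are invented for this example and are not this package's API; the sketch ignores the optional CRC and does not parse cells beyond the two descriptor bytes of the first one.

```typescript
// Illustrative sketch; none of these names come from the package.
type BocHeader = {
    hasIdx: boolean;
    hasCrc32c: boolean;
    refSize: number; // bytes per cell index
    offBytes: number; // bytes per offset
    cells: number;
    roots: number;
    absent: number;
    totCellsSize: number;
    rootList: number[];
    dataOffset: number; // where cell data starts
};

function hexToBytes(hex: string): number[] {
    if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
        throw new Error("invalid hex string");
    }
    const bytes: number[] = [];
    for (let i = 0; i < hex.length; i += 2) {
        bytes.push(parseInt(hex.slice(i, i + 2), 16));
    }
    return bytes;
}

function parseBocHeader(bytes: number[]): BocHeader {
    let pos = 0;
    // Read an n-byte big-endian unsigned integer and advance the cursor.
    const readUint = (n: number): number => {
        if (pos + n > bytes.length) throw new Error("unexpected end of BoC");
        let value = 0;
        for (const b of bytes.slice(pos, pos + n)) value = value * 256 + b;
        pos += n;
        return value;
    };
    if (readUint(4) !== 0xb5ee9c72) throw new Error("not a serialized_boc");
    const flags = readUint(1);
    const hasIdx = (flags & 0x80) !== 0;
    const hasCrc32c = (flags & 0x40) !== 0;
    const refSize = flags & 0x07;
    const offBytes = readUint(1);
    const cells = readUint(refSize);
    const roots = readUint(refSize);
    const absent = readUint(refSize);
    const totCellsSize = readUint(offBytes);
    const rootList: number[] = [];
    for (let i = 0; i < roots; i++) rootList.push(readUint(refSize));
    if (hasIdx) pos += cells * offBytes; // skip the optional index
    return {
        hasIdx, hasCrc32c, refSize, offBytes, cells, roots, absent,
        totCellsSize, rootList, dataOffset: pos,
    };
}

const boc = hexToBytes(
    "b5ee9c7201010101005e0000b8ff0020dd2082014c97ba94308168e8e0a4f260810200d71820d31f018168e8baf2e052d32f01f823bef2e04dd74c59f9010182f0871740fee9829718bd7dfddb66c215758869fdb682780dc6298283dd6134d67ff910f2a3f800ed55",
);
const header = parseBocHeader(boc);
// A cell starts with two descriptor bytes: reference count, then data length in half-bytes.
const d1 = boc[header.dataOffset] ?? 0;
const d2 = boc[header.dataOffset + 1] ?? 0;
const rootCell = { refs: d1 & 0x07, dataBytes: Math.ceil(d2 / 2) };
```

For this contract the header describes a single root cell with no references, and the 92 data bytes of that cell hold the instructions listed below.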
SETCP 0
DUP
IFNOTRET
DUP
PUSHINT_LONG 85143
EQUAL
PUSHCONT_SHORT {
    DROP
    PUSHINT_16 26856
}
IFJMP
INC
THROWIF_SHORT 32
PUSHINT_16 512
LDSLICEX
DUP
LDU 32
SWAP
PUSHINT_16 26856
EQUAL
THROWIFNOT 82
LDU 48
SWAP
NOW
GEQ
THROWIFNOT 77
PLDREFIDX 0
ROTREV
HASHSU
SWAP
PUSHINT_LONG 61103320625414998982152303707971264142854662587410920750381746782697495516799
CHKSIGNU
THROWIFNOT_SHORT 35
ACCEPT
POPCTR c5
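To show how bytes of the cell map onto this listing, here is a deliberately tiny decoder that understands only the opcodes used at the start of the contract (`SETCP`, `DUP`, `DROP`, `INC`, `EQUAL`, `IFNOTRET`, `IFJMP`, `PUSHINT_16`, `PUSHINT_LONG`, `PUSHCONT_SHORT`). It is a sketch under simplifying assumptions and not the package's disassembler: `decode` is a hypothetical name, every handled instruction happens to be byte-aligned, and `PUSHINT_LONG` is limited to values that fit a JavaScript number.

```typescript
function hexToBytes(hex: string): number[] {
    if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
        throw new Error("invalid hex string");
    }
    const bytes: number[] = [];
    for (let i = 0; i < hex.length; i += 2) {
        bytes.push(parseInt(hex.slice(i, i + 2), 16));
    }
    return bytes;
}

// Opcodes without operands that this sketch knows about.
const simple: { [op: number]: string } = {
    0x20: "DUP", 0x30: "DROP", 0xa4: "INC",
    0xba: "EQUAL", 0xdd: "IFNOTRET", 0xe0: "IFJMP",
};

function decode(code: number[], indent: string = ""): string[] {
    const out: string[] = [];
    let pos = 0;
    const next = (): number => {
        const b = code[pos];
        if (b === undefined) throw new Error("unexpected end of code");
        pos += 1;
        return b;
    };
    while (pos < code.length) {
        const op = next();
        const name = simple[op];
        if (name !== undefined) {
            out.push(indent + name);
        } else if (op === 0xff) {
            const cp = next();
            if (cp >= 0xf0) throw new Error("only SETCP 0..239 is handled");
            out.push(`${indent}SETCP ${cp}`);
        } else if (op === 0x81) {
            // 16-bit signed big-endian operand
            const hi = next();
            const lo = next();
            const raw = hi * 256 + lo;
            out.push(`${indent}PUSHINT_16 ${raw >= 0x8000 ? raw - 0x10000 : raw}`);
        } else if (op === 0x82) {
            // 5-bit length l, then a signed integer of 8l + 19 bits
            const first = next();
            const len = first >> 3;
            if (len > 3) throw new Error("integer too long for this sketch");
            const bits = 8 * len + 19;
            let value = first & 0x07;
            for (let i = 0; i < len + 2; i++) value = value * 256 + next();
            if (value >= Math.pow(2, bits - 1)) value -= Math.pow(2, bits);
            out.push(`${indent}PUSHINT_LONG ${value}`);
        } else if (op >= 0x90 && op <= 0x9f) {
            // Low nibble is the body length in bytes
            const size = op & 0x0f;
            if (pos + size > code.length) throw new Error("truncated continuation");
            const body = code.slice(pos, pos + size);
            pos += size;
            out.push(`${indent}PUSHCONT_SHORT {`, ...decode(body, indent + "    "), `${indent}}`);
        } else {
            throw new Error(`opcode 0x${op.toString(16)} is outside this sketch`);
        }
    }
    return out;
}

// The first 17 bytes of the cell data from the hex above.
const decoded = decode(hexToBytes("ff0020dd2082014c97ba94308168e8e0a4"));
```

Here `decoded` contains the same instructions as the first twelve lines of the listing, from `SETCP 0` through `INC`.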