what is it?
The libdisasm library provides basic disassembly of Intel x86 instructions from a binary stream. The intent is to provide an easy-to-use disassembler which can be called from any application; the disassembly can be produced in either AT&T or Intel syntax, as well as in an intermediate format which includes detailed instruction and operand type information.
This disassembler is derived from libi386.so in the bastard project; as such it is x86 specific and will not be expanded to include other CPU architectures. Releases for libdisasm are generated automatically alongside releases of the bastard; it is not a standalone project, though it is a standalone library.
The recent spate of objdump output analyzers has proven that many of the people (not necessarily programmers) interested in writing disassemblers have little knowledge of, or interest in, C programming; as a result, these "disassemblers" have been written in Perl. To address this audience, a HOWTO has been provided which demonstrates how to use the libdisasm opcode tables to implement a true disassembler in Perl:
It is hoped that the state of the art of disassemblers in Linux/UNIX will improve beyond mere objdump backends in the near future.
how does it work?
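The basic usage of the library is to initialize it with x86_init(), disassemble each instruction with x86_disasm(), and shut it down with x86_cleanup().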
These routines have the following prototypes:
int x86_init( enum x86_options options, DISASM_REPORTER reporter, void *arg);
unsigned int x86_disasm( unsigned char *buf, unsigned int buf_len,
                         unsigned long buf_rva, unsigned int offset,
                         x86_insn_t * insn );
int x86_cleanup(void);
Instructions are disassembled to an intermediate format:
typedef struct {
    enum x86_op_type     type;      /* operand type */
    enum x86_op_datatype datatype;  /* operand size */
    enum x86_op_access   access;    /* operand access [RWX] */
    enum x86_op_flags    flags;     /* misc flags */
    union {
        /* sizeof will have to work on these union members! */
        /* immediate values */
        char            sbyte;
        short           sword;
        long            sdword;
        qword           sqword;
        unsigned char   byte;
        unsigned short  word;
        unsigned long   dword;
        qword           qword;
        float           sreal;
        double          dreal;
        /* misc large/non-native types */
        unsigned char   extreal[10];
        unsigned char   bcd[10];
        qword           dqword[2];
        unsigned char   simd[16];
        unsigned char   fpuenv[28];
        /* absolute address */
        void          * address;
        /* offset from segment */
        unsigned long   offset;
        /* ID of CPU register */
        x86_reg_t       reg;
        /* offsets from current insn */
        char            relative_near;
        long            relative_far;
        /* effective address [expression] */
        x86_ea_t        expression;
    } data;
} x86_op_t;
typedef struct {
    /* information about the instruction */
    unsigned long addr;             /* load address */
    unsigned long offset;           /* offset into file/buffer */
    enum x86_insn_group group;      /* meta-type, e.g. INS_EXEC */
    enum x86_insn_type type;        /* type, e.g. INS_BRANCH */
    enum x86_insn_note note;        /* note, e.g. RING0 */
    unsigned char bytes[MAX_INSN_SIZE];
    unsigned char size;             /* size of insn in bytes */
    /* 16/32-bit mode settings */
    unsigned char addr_size;        /* default address size : 2 or 4 */
    unsigned char op_size;          /* default operand size : 2 or 4 */
    /* CPU/instruction set */
    enum x86_insn_cpu cpu;
    enum x86_insn_isa isa;
    /* flags */
    enum x86_flag_status flags_set; /* flags set or tested by insn */
    enum x86_flag_status flags_tested;
    /* stack */
    unsigned char stack_mod;        /* 0 or 1 : is the stack modified? */
    long stack_mod_val;             /* val stack is modified by if known */
    /* the instruction proper */
    enum x86_insn_prefix prefix;    /* prefixes ORed together */
    char prefix_string[MAX_PREFIX_STR]; /* prefixes [might be truncated] */
    char mnemonic[MAX_MNEM_STR];
    x86_oplist_t *operands;         /* list of explicit/implicit operands */
    size_t operand_count;           /* total number of operands */
    size_t explicit_count;          /* number of explicit operands */
} x86_insn_t;
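Because the operand payload is a union, a caller has to check the type and datatype fields before reading any member of data. The following is a minimal sketch of that pattern, written against cut-down stand-in types so that it compiles without the library. demo_op_t, the DEMO_* enumerators and demo_imm_value() are hypothetical names invented for this illustration; they are not part of libdis.h, and the real enum values are not listed on this page.

```c
#include <stddef.h>

/* Hypothetical stand-ins for enum x86_op_type and enum x86_op_datatype. */
enum demo_op_type     { DEMO_OP_REGISTER, DEMO_OP_IMMEDIATE };
enum demo_op_datatype { DEMO_BYTE, DEMO_WORD, DEMO_DWORD };

/* Cut-down operand: two tags plus a union, in the style of x86_op_t. */
typedef struct {
    enum demo_op_type     type;
    enum demo_op_datatype datatype;
    union {
        signed char sbyte;   /* explicitly signed for this demo */
        short       sword;
        long        sdword;
        int         reg_id;  /* placeholder for a register ID */
    } data;
} demo_op_t;

/*
 * Read a signed immediate operand into *out.
 * Returns 1 on success, 0 if the arguments are NULL or the operand
 * is not an immediate. Only the union member selected by datatype
 * is ever read.
 */
int demo_imm_value(const demo_op_t *op, long *out)
{
    if (op == NULL || out == NULL || op->type != DEMO_OP_IMMEDIATE)
        return 0;

    switch (op->datatype) {
    case DEMO_BYTE:  *out = op->data.sbyte;  return 1;
    case DEMO_WORD:  *out = op->data.sword;  return 1;
    case DEMO_DWORD: *out = op->data.sdword; return 1;
    }
    return 0;
}
```

The same shape would presumably carry over to the real structure: switch on the operand type first, then on the datatype, and only then touch the matching union member.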
The x86_format_insn() routine can be used to generate a string representation:
int x86_format_insn(x86_insn_t *insn, char *buf, int len, enum x86_asm_format);
...so that a simple disassembler can be implemented in C with the following code:
#include <stdio.h>
#include <libdis.h>

/* BUF_SIZE and LINE_SIZE are defined by the caller */
unsigned char buf[BUF_SIZE];    /* buffer of bytes to disassemble */
char line[LINE_SIZE];           /* buffer of line to print */
int pos = 0;                    /* current position in buffer */
int size;                       /* size of instruction */
x86_insn_t insn;                /* instruction */

x86_init(opt_none, NULL, NULL);

while ( pos < BUF_SIZE ) {
    /* disassemble address */
    size = x86_disasm(buf, BUF_SIZE, 0, pos, &insn);
    if ( size ) {
        /* print instruction */
        x86_format_insn(&insn, line, LINE_SIZE, intel_syntax);
        printf("%s\n", line);
        pos += size;
    } else {
        /* skip one byte and try to resynchronize */
        printf("Invalid instruction\n");
        pos++;
    }
}

x86_cleanup();
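The loop above is a linear sweep: it advances by the decoded instruction size, and by a single byte whenever decoding fails. As a rough sketch of that control flow which runs without the library, the code below swaps x86_disasm() for a toy decoder passed in as a callback. All demo_* names are hypothetical, and the toy decoder only knows three encodings (NOP 0x90, RET 0xC3, and MOV EAX, imm32 as 0xB8 plus four bytes); it treats every other byte as invalid, which a real decoder would not.

```c
#include <stddef.h>

/* Callback: size of the instruction at buf[pos], or 0 if it is invalid. */
typedef unsigned int (*demo_decode_fn)(const unsigned char *buf,
                                       size_t len, size_t pos);

/* Toy decoder standing in for x86_disasm(); illustration only. */
unsigned int demo_toy_decode(const unsigned char *buf, size_t len, size_t pos)
{
    if (pos >= len)
        return 0;

    switch (buf[pos]) {
    case 0x90:                          /* NOP */
    case 0xC3:                          /* RET */
        return 1;
    case 0xB8:                          /* MOV EAX, imm32 */
        return (len - pos >= 5) ? 5 : 0; /* truncated operand is invalid */
    default:
        return 0;
    }
}

typedef struct {
    size_t decoded;  /* instructions successfully decoded */
    size_t skipped;  /* bytes skipped after a failed decode */
} demo_sweep_stats;

/* Linear sweep: step by instruction size, or by one byte on failure. */
demo_sweep_stats demo_sweep(const unsigned char *buf, size_t len,
                            demo_decode_fn decode)
{
    demo_sweep_stats stats = { 0, 0 };
    size_t pos = 0;

    while (pos < len) {
        unsigned int size = decode(buf, len, pos);
        if (size) {
            stats.decoded++;
            pos += size;
        } else {
            stats.skipped++;
            pos++;
        }
    }
    return stats;
}
```

Skipping one byte on failure is the simplest way to resynchronize, at the cost of possibly decoding data bytes as instructions afterwards.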
why not use libopcodes?
Get out.
No, really. Leave.
where are the files?
The latest release can always be found here:
what about support?
Support can be obtained through the bastard SourceForge project help system:
who's behind it?