HardCore Assembler v1.0

by Epi/Tristesse^O.S.


INTRODUCTION

HardCore Assembler (hcasm) is a cross-assembler that generates code for the 6502 and 65c816 processors. It is designed to be fully compatible with xasm and adds some new features. However, some xasm features will be supported only in future releases.

Differences between xasm and hcasm

New features in hcasm:

Currently unsupported xasm features:


USAGE

System requirements

Source code requirements

Command line syntax

To run hcasm, write:
HCA source [options]
source is the name of the source file. If no extension is given, .HCA is implied.

You can place options before or after the source file name and combine them freely:

-bw -i source -c -m -l:listing.lst -q -t
is equivalent to:
source -bwicmqtl:listing.lst
With no options given, hcasm assembles source[.HCA] into a file with .HCO extension, which is default for object files.

Assembling options

Available options are:
-b[w]
Enable automatic selection of branch range.
By default, a 'Branch out of range' error is generated if the destination address is outside the range of standard relative branches (-128..127), and a 'Branch would be sufficient' warning is generated if a standard branch (Bxx) could be used in place of an existing Jxx pseudo-command. With this option you need not care about choosing the proper branch type.
-bw enables warnings on any branches changed by hcasm.

-c
Enable listing false conditionals.

-i
Disable listing included source files.

-l[:filename]
Enable generating listing.
If no filename given, source.HCL is assumed. This setting may be changed inside the source file by using opt l.

-m
Disable listing the contents of macro definitions every time they are used.

-o:filename
Set object file name. Default is source.HCO.

-q
Enable quiet mode.
No messages will be listed if assembly is correct.

-t[:filename]
Enable creating label table.
If no filename given, table is written to source.HCT.
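
The way options can be combined may be easier to see in a small model. The following Python sketch is not part of hcasm; parse_options is a hypothetical helper that assumes only -l, -o and -t accept a :filename suffix, that the filename runs to the end of the argument, and that any other letter is a simple switch. It shows that the two command lines from the "Command line syntax" section describe the same settings.

```python
# Options that may carry a ":filename" argument, per the option list above.
WITH_FILENAME = {"l", "o", "t"}

def parse_options(args):
    """Split hcasm-style arguments into (source, options).

    options maps an option name to True, or to its filename if one was given.
    This is an illustrative model, not hcasm's actual parser.
    """
    source, options = None, {}
    for arg in args:
        if not arg.startswith("-"):
            source = arg                      # the only non-option argument
            continue
        letters = arg[1:]
        i = 0
        while i < len(letters):
            ch = letters[i]
            if ch == "b" and letters[i + 1:i + 2] == "w":
                options["bw"] = True          # -bw: -b plus warnings
                i += 2
            elif ch in WITH_FILENAME and letters[i + 1:i + 2] == ":":
                options[ch] = letters[i + 2:]  # filename runs to the end
                break
            else:
                options[ch] = True
                i += 1
    return source, options

separate = parse_options("-bw -i source -c -m -l:listing.lst -q -t".split())
combined = parse_options("source -bwicmqtl:listing.lst".split())
print(separate == combined)
```

Note that in the combined form the letter t is followed by l rather than by a colon, so it is read as -t without a filename.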

Exit codes

Exit codes have the same meaning as in xasm:
3 = bad parameters, assembling not started
2 = error occurred
1 = warning(s) only
0 = no errors, no warnings

Listing structure

A line of listing contains:

Label table structure

A line of label table contains:


SYNTAX

Fields

Lines starting with an asterisk (*) or a semicolon (;), as well as empty lines, are ignored. This lets you put comment lines in the source code. You may optionally type a label name at the beginning to assign the current value of the origin counter to it.

All other lines are divided into fields, which are separated with one or more blank characters. From left to right, field types are as follows:

Label field
It is optional and starts from the first column. Its purpose is to define a label or a macro.

Repeat count
Like xasm, hcasm allows a single line to be assembled multiple times. The repeat count, which can be any valid expression, has to be preceded by a colon.
Examples:
:4	asl	@
:2	dta	a(*)
In the latter example each DTA has different operand value.
If repeat count equals zero, remaining part of line is not assembled. This allows compact single-line conditional assembly.

Instruction field
The only required field, which has to be preceded with at least one blank character if this is the first field in a line. In place of an instruction you can use:

CPU instructions and pseudo-commands can be linked. The instruction pairing feature, known from xasm, has been extended in hcasm: you can now link any number of instructions by separating them with a colon (:). All linked instructions share one operand, as in the example:

	lda:cmp:req	$14
which is equivalent to:
	lda	$14
	cmp	$14
	req	$14
Note that REQ pseudo-command takes no operand - $14 is a comment for it.
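
As a rough model of instruction linking, the following Python sketch splits a linked instruction field and gives every mnemonic the same operand text. expand_linked is a hypothetical helper written for this illustration, not a function of hcasm.

```python
def expand_linked(instruction, operand=""):
    """Return one (mnemonic, operand) pair per colon-separated instruction.

    Every linked instruction receives the same operand text, as described
    above; instructions that take no operand treat it as a comment.
    """
    return [(mnemonic, operand) for mnemonic in instruction.split(":")]

for mnemonic, operand in expand_linked("lda:cmp:req", "$14"):
    print(f"\t{mnemonic}\t{operand}")
```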

Operand field
Number of operands needed is constant for all instructions, except macros.
6502 and 65c816 commands require operand with proper addressing mode.

Comment
A comment in a statement does not need to start with a special character such as ;. The comment field is implied once the appropriate number of operands has been taken.

Expressions

Expressions are numbers combined with operators and square brackets. All numbers are 32-bit signed integer values and all expressions give results in the same range, since they are calculated using the 32-bit arithmetic.

Numbers

In place of number you can use:

Operators

Binary operators:

+ Addition
- Subtraction
* Multiplication
/ Division
% Remainder
& Bitwise and
| Bitwise or
^ Bitwise xor
<< Arithmetic shift left
>> Arithmetic shift right
= Equal
== Equal (same as =)
<> Not equal
!= Not equal (same as <>)
< Less than
> Greater than
<= Less or equal
>= Greater or equal
&& Logical and
|| Logical or

Unary operators:

+ Plus (does nothing)
- Minus (changes sign)
~ Bitwise not (complements all bits)
! Logical not (changes true to false and vice versa)
< Low (extracts low byte)
> High (extracts high byte)
\ Extracts high word of double word

Operator precedence:

first  [] (brackets)
       + - ~ < > \ (unary)
       * / % & << >> (binary)
       + - | ^ (binary)
       = == <> != < > <= >= (binary)
       ! (unary)
       && (binary)
last   || (binary)

Note that although the operators are similar to those used in C, C++ and Java, their priorities differ from the priorities in those languages.

Compare and logical operators assume that zero is false and a non-zero is true. They return 1 for true.
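
To see how these priorities differ from C, here is a minimal Python sketch of an evaluator that follows the precedence table above. It is an illustration only and makes several assumptions: it covers decimal and $hex numbers, square brackets, the unary operators + - ~ < > and the first three binary levels; it leaves out !, &&, ||, the \ operator, labels and 32-bit wrap-around; and it uses floor division for /. The names LEVELS, APPLY, tokenize and evaluate are invented for this sketch.

```python
import operator
import re

def flag(fn):
    """Wrap a comparison so it returns 1 for true and 0 for false."""
    return lambda a, b: int(fn(a, b))

# Binary operator levels as listed in the manual, tightest first.
LEVELS = [
    ["*", "/", "%", "&", "<<", ">>"],
    ["+", "-", "|", "^"],
    ["=", "==", "<>", "!=", "<=", ">=", "<", ">"],
]
APPLY = {
    "*": operator.mul, "/": operator.floordiv, "%": operator.mod,
    "&": operator.and_, "<<": operator.lshift, ">>": operator.rshift,
    "+": operator.add, "-": operator.sub,
    "|": operator.or_, "^": operator.xor,
    "=": flag(operator.eq), "==": flag(operator.eq),
    "<>": flag(operator.ne), "!=": flag(operator.ne),
    "<": flag(operator.lt), ">": flag(operator.gt),
    "<=": flag(operator.le), ">=": flag(operator.ge),
}
TOKEN = re.compile(r"\s*(\$[0-9A-Fa-f]+|\d+|<<|>>|<=|>=|<>|==|!=|[-+*/%&|^=<>~\[\]])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def unary():
        tok = take()
        if tok == "[":                       # brackets restart at the loosest level
            value = binary(len(LEVELS) - 1)
            if take() != "]":
                raise ValueError("missing ]")
            return value
        if tok == "+":
            return unary()
        if tok == "-":
            return -unary()
        if tok == "~":
            return ~unary()
        if tok == "<":
            return unary() & 0xFF            # low byte
        if tok == ">":
            return (unary() >> 8) & 0xFF     # high byte
        return int(tok[1:], 16) if tok.startswith("$") else int(tok)

    def binary(level):
        if level < 0:
            return unary()
        left = binary(level - 1)
        while peek() in LEVELS[level]:
            left = APPLY[take()](left, binary(level - 1))
        return left

    return binary(len(LEVELS) - 1)

print(evaluate("1+3&2"))     # & binds like *, so this is 1+[3&2]
print(evaluate(">$ff+5"))    # unary > binds tighter than binary +
print(evaluate(">[$ff+5]"))
```

In C, 1+3&2 groups as (1+3)&2 and gives 0; with the table above it groups as 1+[3&2] and gives 3.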


List of directives

blk dta eif eli els emc end equ ert icl ift ini ins mac opt org run smb

Assembly control

Changing assembly options

Some options can be changed during the assembly process with the OPT directive.

Following options are available:

Default setting is c-h+l-o+. C+ enables commands and addressing modes specific for 65c816 CPU. O- tells assembler not to write data to object file. It doesn't modify DOS headers, so be careful with this option.

Examples:

	opt	o-	disable writing to object file
	opt	c+l+	enable 65c816 CPU, enable generating listing

Conditional assembly

You can construct conditionally assembled fragments by using following directives:
IFT - assemble if expression is true
ELI - else if
ELS - else
EIF - end if

Conditional constructions can be nested.
Example:
noscr	equ	1
widescr	equ	1
	ift	noscr
	lda	#0
	eli	widescr
	lda	#$23
	els
	lda	#$22
	eif
	sta	$22f
Above example can be rewritten using line repeating feature:
noscr	equ	1
widescr	equ	1
:noscr	lda	#0
:!noscr&&widescr	lda	#$23
:!noscr&&!widescr	lda	#$22
	sta	$22f

Breaking assembly

There are two ways to stop assembly process:
ERT - generate error if an expression is true
Examples:
	ert	*>$c000
	ert	len1>$ff||len2>$ff
END - end assembling file
The remaining part of the file is not assembled. If this statement does not occur, the assembler stops when it encounters the end of the file.
Example:
	end

Including files

You can include another source file by using ICL. It specifies a file to be included in the assembly as if the contents of the referenced file appeared in place of the ICL statement. The included file may contain other ICL statements. The .HCA extension is added if none is given.

Examples:

	icl	'macros.hca'
	icl	'd:\atari\hca\fileio'

Labels

Defining labels

Labels represent 32-bit integers and they can be defined in three ways:
EQU - assign value of an expression to the label
Expression can't contain SDX symbols. Nested forward references are available, but keep in mind that assembler must take more passes to get forward referenced value.

Examples:

label	equ	value
value	equ	6*$10
here	equ	*
SMB - assign SDX symbol to the label
You can use symbols only in SDX-specific blocks, and you have to put an 'update symbols' block at the end of the code to tell the SDX loader which symbols you have used and where.
A label defined with SMB has the value 0. You can combine it with operators, but remember that only simple addition and subtraction are reasonable.

Examples:

mul	smb	"MUL_32"
pf	smb	"PRINTF"
Putting label name before an instruction other than EQU and SMB, or in an empty line, assigns current origin counter value to it.

Label names

There's no limit on label name length.
The first character of a label name can be a letter or an underscore (_). The other characters can also be digits, dots (.) and question marks (?). The latter two characters have a special meaning:
When the first character of a label is a question mark, the label is a local label. Locals are defined only in the source code segment between two global labels. References to local labels cannot cross a global label definition.
If you want to use labels inside macro definition, you have to place a dot at the beginning of label name. Assembler adds specific information about source line number and nesting level before it, so it will have unique name every time macro is executed.

Macros

Defining macros

Macro definition can be performed using following directives:
MAC - start macro definition
It has no operands. Macro name should be entered in label field.
EMC - end macro definition.
Every single character in the lines between these directives is stored in a buffer, and each time you use the macro, it is inserted into the source in place of the line with the macro name. You can use other macros inside the one being defined.
Macros should be defined before use, but the definitions need not be at the beginning of the source file.

Example:

pushall	mac
	pha
	txa:pha
	tya:pha
	emc

Parameters

Each macro can be executed with up to nine parameters. They are separated with at least one blank character. Inside macro definition they are marked with .1 to .9. Specified parameters are copied in place of these symbols before the line is assembled.
There is also special symbol .0 which gives total number of parameters.
If you want the number preceded with a dot not to be treated as a parameter, use double dot (..).

Example:

move.b	mac
	ert	.0<2
	lda	.1
	sta	.2
	emc
This macro does the same as MVA pseudo-command. When it is used as move.b source,y+ dest,x+, hcasm will assemble following code:
	ert	2<2
	lda	source,y+
	sta	dest,x+
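
The substitution of .0 to .9 can be modelled in a few lines of Python. This is a sketch under assumptions, not hcasm's implementation: expand_macro is a hypothetical helper, a missing parameter is assumed to expand to an empty string, a double dot is assumed to leave a single literal dot, and the addressing mode symbols such as .<1 or .?1 are not handled.

```python
import re

# ".." is tried first so that an escaped dot is never read as a parameter.
PARAM = re.compile(r"\.\.|\.([0-9])")

def expand_macro(body_lines, params):
    """Return the macro body with .0 to .9 replaced by the given parameters."""
    def substitute(match):
        digit = match.group(1)
        if digit is None:              # ".." stands for a literal dot (assumed)
            return "."
        index = int(digit)
        if index == 0:                 # .0 is the total number of parameters
            return str(len(params))
        # Assumption: a parameter that was not supplied expands to nothing.
        return params[index - 1] if index <= len(params) else ""
    return [PARAM.sub(substitute, line) for line in body_lines]

body = ["\tert\t.0<2", "\tlda\t.1", "\tsta\t.2"]
for line in expand_macro(body, ["source,y+", "dest,x+"]):
    print(line)
```

Labels that start with a dot followed by a letter are left untouched by this pattern, because only a dot followed by a digit is treated as a parameter.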
Another example, for copying 32-bit values from memory to memory:
move.d	mac

	ert	.0<2
	ert	.?1&$0f3f
	ert	.?2&$0f3f

	mva	.<1.~1.!1.>1	.<2.~2.!2.>2
	mva	.<1.~11+.!1.>1	.<2.~21+.!2.>2
	mva	.<1.~12+.!1.>1	.<2.~22+.!2.>2
	mva	.<1.~13+.!1.>1	.<2.~23+.!2.>2
	
	emc
The symbols used relate to the operand addressing mode and have the following meaning:

.<x - returns addressing mode prefix, like # or [, of parameter #x.
.~x - returns operand size prefix, like v: or l:.
.!x - returns only operand, without addressing mode prefix and suffix.
.>x - returns addressing mode suffix, like ,x) or ],0+.
.?x - returns addressing mode identifier.

The addressing mode identifier is a bit-oriented word value describing the addressing mode used and some attributes of the expression used in the parameter:
bit no.  FEDC BA987654 3210
meaning  swlv -+0SYX]) [(@#
where 1 on specific position:
s  means that expression contains symbol.
w  means that expression value is greater than byte.
l  means that expression value is greater than word.
v  means that expression value should be updated with blk u addr.
- or +  describe pseudo-addressing modes with decrementing/incrementing index registers.
0  describes pseudo-addressing modes with resetting index register before instruction.
S  describes stack pointer addressing modes (65c816 only).
Y or X  describe indexed addressing modes.
[ and ]  together describe long indirect addressing mode (65c816 only).
( and )  together describe indirect addressing mode.
@  describes accumulator addressing mode.
#  describes immediate addressing mode.

Examples:
$4000  abs        absolute
$6440  long,X+    long, X indexed, with post-incrementation
$0b94  (ds,S),0+  stack pointer indirect, with Y register zeroed before and incremented after

The assembler checks if addressing mode is correct, so it's impossible to get e.g. $6625, which could mean (#long,S+].

In the above example, .?x&$0F3F gives a non-zero value for addressing modes other than absolute (including zero-page), optionally indexed, so ert generates an error if you use any other addressing mode.
Note that .<1 and .<2 in this macro are not needed, because absolute addressing mode has no prefix, so they give empty strings.
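
The bit layout of the addressing mode identifier can be decoded mechanically. The Python sketch below follows the bit table given above; the names FLAGS, decode_mode and is_plain_absolute are invented for this illustration, and the default mask is the $0F3F value used in the move.d macro.

```python
# Flag characters from bit 15 (leftmost) down to bit 0, as in the table above.
FLAGS = "swlv-+0SYX])[(@#"

def decode_mode(identifier):
    """Return the flag characters whose bits are set in the identifier."""
    return [name for i, name in enumerate(FLAGS) if identifier & (1 << (15 - i))]

def is_plain_absolute(identifier, mask=0x0F3F):
    """True if no prefix, indirection, stack or pseudo-mode bit is set."""
    return (identifier & mask) == 0

print(decode_mode(0x4000))        # absolute with a word-sized value
print(decode_mode(0x6440))        # long, X indexed, post-incremented
print(is_plain_absolute(0x4040))  # absolute,X passes the mask
print(is_plain_absolute(0x0001))  # immediate does not
```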


Defining blocks

There are several directives for defining blocks of code:
ORG - set origin counter
You can also define block address to load at. It can be useful when you write code which has to be moved into another memory area, unavailable during loading.
Syntax is:
	org	origin[,load_address]
You can also set some options applied to the new header (if headers are enabled):
  • a: tells the assembler to always make a header, even if it is unnecessary, as in ORG *.
  • f: works the same as a:, but additionally generates a $FFFF prefix before the header. hcasm adds it at the beginning of the file by default, so use this option only if you want the $FF's somewhere inside.
Examples:
	org	$600
	org	f:$700
	org	$c000,*	doesn't create new block  
table	org	*+100
In the latter example table points to 100 bytes of uninitialized data (label is assigned to * before ORG directive is executed).

BLK - define block of code
Following block types are supported:
	blk	n[one] origin[,load_address]
Starts new block without generating DOS header. load_address is discarded, it's only for compatibility with definitions of other block types.
	blk	d[os] origin[,load_address]
Starts standard binary block. If it's the first block in object file, or previous block was of different type, a $FFFF prefix is generated before header.
	blk	s[parta] origin[,load_address]
Same as blk dos, but generates $FFFA prefix (always), making the object file specific for Sparta DOS X.
	blk	r[eloc] mem
Starts relocatable block for SDX, loaded into memory indexed by mem. Although values in the range of 0..3 are allowed, only 0 (main) and 2 (extended) are used by SDX.
	blk	e[mpty] size mem
Starts an empty relocatable block for SDX, defined in the memory indexed by mem. You can't put any data into a block of this type; it is intended only for defining storage areas in relocatable programs.
	blk	u[pdate] a[ddress]|s[ymbols]|n[ew]
Creates special block(s) for updating addresses or symbols in previous reloc and sparta blocks.
With blk update new you can define a new global symbol for the SDX environment. It requires a symbol name and an address, which has to be defined inside a resident relocatable block. A symbol whose name starts with @ can be executed directly from SDX's command processor.

Examples:

	blk	none $8000
	blk	dos $c000,$6000
	blk	sparta $3c00
	blk	reloc 0
	blk	empty $200 2
	blk	upd n start "FASTIO"
The latter example shows definition of symbol FASTIO in SDX environment with the value of the label start assigned to it.

RUN - generate run address
The Atari executable program should have a run address specified. A program may be loaded in many areas of memory and started from any address.
	run	addr
is equivalent to:
	org	$2e0
	dta	a(addr)
Examples:
	run	start
	run	main
INI - generate init address
The Atari executable program may have some routines which are executed during loading process. There may be many init blocks in one file.
Examples:
	ini	init
	ini	showpic
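
For orientation, here is a Python sketch of the bytes that a standard binary block and a run or init address amount to. The manual does not spell out the header layout; the sketch assumes the conventional Atari DOS binary format (optional $FFFF prefix, then start and end addresses as little-endian words, then the data) and the usual init vector at $2E2. The function names dos_block, run_block and ini_block are hypothetical, and whether hcasm emits the $FFFF prefix for a given block depends on the rules described for blk dos.

```python
import struct

def dos_block(origin, data, prefix=True):
    """Build one binary block: [$FFFF] start end data (addresses little-endian)."""
    end = origin + len(data) - 1                  # end address is inclusive
    header = struct.pack("<HH", origin, end)
    return (b"\xff\xff" if prefix else b"") + header + bytes(data)

def run_block(addr, prefix=False):
    """Equivalent of: org $2e0 / dta a(addr)."""
    return dos_block(0x2E0, struct.pack("<H", addr), prefix)

def ini_block(addr, prefix=False):
    """Same idea for an init address, assuming the vector at $2E2."""
    return dos_block(0x2E2, struct.pack("<H", addr), prefix)

print(run_block(0x0600, prefix=True).hex())
```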

Defining data

Data can be defined in two ways:
DTA - define data
Currently the following data types are available:
  • integers
    • bytes: b(200) or simply 200
    • words: a($89AB) or v($89AB) - v type means that word should be updated with blk u addr
    • long words: e($F7F7F7)
    • double words: f($12345678) or g($12345678) - g uses big endian
    • low bytes: l(511) defines byte 255
    • high bytes: h(511) defines byte 1; note that h(value) is always the second byte of value, even if it's larger than 2 bytes.
  • text strings
    • ASCII: c'Text' or c"Text"
    • ANTIC: d'Text' or d"Text"
    Within a string, you can put single quotation mark using two successive quotation marks. An asterisk (*) after a string inverts all characters.
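
The integer types can be summarised as byte sequences. The following Python sketch is an illustration, not hcasm code: dta and SIZES are names invented here, little-endian order is assumed for a, v, e and f (the manual only states that g is big endian), e is assumed to be three bytes wide, and out-of-range values are simply masked rather than reported as errors.

```python
# Widths in bytes of the little-endian integer types (assumed from the list above).
SIZES = {"a": 2, "v": 2, "e": 3, "f": 4}

def dta(kind, value):
    """Return the bytes a single integer DTA operand would produce (sketch)."""
    value &= 0xFFFFFFFF                       # expressions are 32-bit
    if kind in ("b", "l"):
        return bytes([value & 0xFF])          # byte, or low byte of the value
    if kind == "h":
        return bytes([(value >> 8) & 0xFF])   # always the second byte
    if kind == "g":
        return value.to_bytes(4, "big")       # double word, big endian
    size = SIZES[kind]
    return (value & ((1 << (8 * size)) - 1)).to_bytes(size, "little")

print(dta("a", 0x89AB).hex())
print(dta("f", 0x12345678).hex())
print(dta("g", 0x12345678).hex())
```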

INS - insert contents of file
Copies every byte of specified file into the object file and updates the origin counter, as if these bytes were defined with DTA.
You may specify the range of the file to insert. The syntax is as follows:
	ins 'file'[,offset[,length]]
First byte in file has offset 0.
If offset is negative, it is counted from the end of file.
Examples:
	ins	'picture.raw'
	ins	'file',-256	insert last 256 bytes of file
	ins	'file',10,10	insert bytes 10..19 of file
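
The offset and length rules correspond to a simple slice. In this Python sketch, ins_range is a hypothetical helper that takes the file contents as bytes; it does not model any error checking hcasm may perform on out-of-range values.

```python
def ins_range(contents, offset=0, length=None):
    """Return the part of the file that INS would insert (sketch)."""
    if offset < 0:
        offset += len(contents)           # negative offsets count from the end
    end = len(contents) if length is None else offset + length
    return contents[offset:end]

data = bytes(range(256)) * 2              # a 512-byte stand-in for 'file'
print(len(ins_range(data)))               # the whole file
print(len(ins_range(data, -256)))         # the last 256 bytes
print(list(ins_range(data, 10, 10)))      # bytes 10..19
```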

Commands

For 6502 and 65c816 commands, standard mnemonics are used.
In 65c816 mode, both INA/DEA and INC/DEC @ forms are allowed for INC and DEC in accumulator addressing mode. This convention has also been applied to commands available on the 6502: ASL, LSR, ROL and ROR, so you can also use SLA, SRA, RLA and RRA for these commands in accumulator addressing mode. This makes it easier to link them with other instructions.
Both HLT and STP mean the same.
All PEA addressing modes have the same mnemonic.

You can also use undocumented 6502 commands by including file undoc.hca, which contains macro definitions and short descriptions for these commands. Remember that they won't work on 65c816, which gains ground quickly.

hcasm makes all xasm pseudo-commands (built-in macros) available. They are:

ADD - addition without carry
If you have ever programmed 6502, you must have noticed that you had to use a CLC before ADC for every simple addition.
hcasm can do it for you. ADD replaces two instructions: CLC and ADC.

SUB - subtraction
It is SEC and SBC.

RCC, RCS, REQ, RMI, RNE, RPL, RVC, RVS - conditional repeat
These are branches to the previous instruction. They take no operand, because the branch target is the address of previously assembled instruction.
Example:
	ldx	#0
	mva:rne	$500,x $600,x+
The example code copies memory $500-$5ff to $600-$6ff. Here is the same written with standard 6502 commands only:
	ldx	#0
loop	lda	$500,x
	sta	$600,x
	inx	
	bne	loop
SCC, SCS, SEQ, SMI, SNE, SPL, SVC, SVS - conditional skip
These are branches over the next instructions. No operand is required, because the target is the address of instruction following the next instruction.
Example:
	lda	#40
	add:sta	$80
	scc:inc	$81
In the above example word-size variable $80 is incremented by 40.
Neither the conditional repeat nor the conditional skip pseudo-commands require an operand, so they can be linked with any other command.

JCC, JCS, JEQ, JMI, JNE, JPL, JVC, JVS - conditional jumps
These are a kind of 'long' branches. While standard branches (Bxx) have range of -128..+127, these jumps have range of all 64 kB. Unlike xasm, hcasm can also automatically select branch range.
Example:
	jne	dest
is equivalent to:
	seq:jmp	dest
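
Whether a standard branch would be sufficient comes down to one subtraction. The Python sketch below assumes the usual 6502 rule that a relative branch is two bytes long and its offset is counted from the address of the following instruction; branch_offset and fits_short_branch are names made up for this example and say nothing about how hcasm implements the -b option internally.

```python
def branch_offset(pc, target):
    """Signed offset a Bxx at address pc would need to reach target."""
    return target - (pc + 2)              # counted from the next instruction

def fits_short_branch(pc, target):
    """True if a standard relative branch (-128..127) can reach target."""
    return -128 <= branch_offset(pc, target) <= 127

print(fits_short_branch(0x600, 0x681))    # offset +127, Bxx is enough
print(fits_short_branch(0x600, 0x682))    # offset +128, needs Jxx
print(fits_short_branch(0x600, 0x582))    # offset -128, Bxx is enough
```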
INW - increment word
Increments a 16-bit word in the memory.
Example:
	inw	dest
is equivalent to:
	inc	dest
	sne:inc	dest+1
MVA, MVX, MVY - move byte using accumulator, X or Y
Each of these pseudo-commands requires two operands and substitutes two commands:
	mva source dest = lda source : sta dest
	mvx source dest = ldx source : stx dest
	mvy source dest = ldy source : sty dest
MWA, MWX, MWY - move word using accumulator, X or Y
These pseudo-commands require two operands and are combinations of two MV*'s: one to move low byte, and the other to move high byte.
You can't use indirect or pseudo addressing modes with MW*. The destination must be an absolute address (optionally indexed).
When source is also absolute, a mw* src dest will be:
	mv*	src	dest
	mv*	src+1	dest+1
When source is an immediate, a mw* #immed dest will be:
	mv*	<immed	dest
	mv*	>immed	dest+1
When <immed equals >immed and immed is not forward-referenced, hcasm uses optimization:
	mv*	<immed	dest
	st*	dest+1
If possible, MWX and MWY use increment/decrement commands. E.g. mwx #1 dest is assembled as:
	mvx	#1	dest
	dex
	stx	dest+1

Addressing modes

All addressing modes are entered in standard convention except the accumulator addressing mode, which should be marked with a @ character (as in Quick Assembler or xasm).

There are two extra immediate addressing modes: < and >, which use low/high byte of 16-bit word constant. They are for Quick Assembler compatibility. You can use traditional #< and #>. Note lda >$ff+5 loads 1 (>$104), while lda #>$ff+5 loads 5 (0+5) to accumulator, because unary operator > has higher priority than the binary plus.

hcasm examines the expression and uses the shortest addressing mode available for it. You may override it with z:, a: and l: (for long addressing modes) prefixes. Note that instructions like LDA have the same op-codes for both 8 and 16-bit immediate modes. Operand size is switched by X and M flags.
Since square brackets can be used to represent long indirect modes for 65c816 as well as to change priorities in expressions, use + character before left bracket to override absolute addressing modes.
All word values, except immediates, will be updated by blk u addr, if they point to relocatable blocks. If you want immediate word to be updated, use v: prefix.
Examples:

	nop
	asl	@
	lda	>$1234	assembles to lda #$12
	lda	$100,x
	lda	a:0	generates 16-bit address
	jmp	($0a)
	lda	($80),y
Following examples are valid only in 65c816 mode:
	lda	$12345
	jsr	(690,x)
	jmp	[l:$a000]	generates 24-bit address
	ldx	#v:$1532	includes value position to address update block
	lda	(2,s),y
	lda	z:$1234,x	generates 8-bit address $34 (will work only if D register contains $1200)
There are also pseudo addressing modes, which are similar to pseudo-commands. You may use them just like standard addressing modes in all commands and pseudo-commands, except for MWA, MWX and MWY:
	cmd a,x+      =  cmd a,x     : inx
	cmd a,x-      =  cmd a,x     : dex
	cmd a,y+      =  cmd a,y     : iny
	cmd a,y-      =  cmd a,y     : dey
	cmd (z),y+    =  cmd (z),y   : iny
	cmd (z),y-    =  cmd (z),y   : dey
	cmd (z,0)     =  ldx #0      : cmd (z,x)
	cmd (z),0     =  ldy #0      : cmd (z),y
	cmd (z),0+    =  ldy #0      : cmd (z),y   : iny
	cmd (z),0-    =  ldy #0      : cmd (z),y   : dey
In 65c816 mode, hcasm makes also these pseudo-addressing modes available:
	cmd [z],y+    =  cmd [z],y   : iny
	cmd [z],y-    =  cmd [z],y   : dey
	cmd (z,s),y+  =  cmd (z,s),y : iny
	cmd (z,s),y-  =  cmd (z,s),y : dey
	cmd [z],0     =  ldy #0      : cmd [z],y 
	cmd [z],0+    =  ldy #0      : cmd [z],y   : iny
	cmd [z],0-    =  ldy #0      : cmd [z],y   : dey
	cmd (z,s),0   =  ldy #0      : cmd (z,s),y
	cmd (z,s),0+  =  ldy #0      : cmd (z,s),y : iny
	cmd (z,s),0-  =  ldy #0      : cmd (z,s),y : dey
Note that 65c816 has also indirect modes without indexing which can replace (z),0 and [z],0 if you don't need indexing later.


CREDITS

Idea, design, development and testing:
Adrian Matoga (Epi/Tristesse^O.S.)
Send questions/comments/bug reports to:
epi /at/ atari /dot/ pl

Thanks to Fox, MMMG, J. Harris and TeBe for some good ideas.

HC Assembler is freeware. Use at own risk.
Copyright © 2005 Tristesse - All rights reserved.