X Sharp
Developer | COSMOS Development Team, part of the Cosmos operating system |
---|---|
First appeared | 2009 |
Typing discipline | weak (almost none) |
Platform | x86 |
Filename extensions | .xs |
X# is a low-level programming language developed for the x86 processor architecture as part of the Cosmos operating system to make operating system development easier. X# is designed to bring some C-like syntax to assembly language. X# began as an aid for the debugging services of Cosmos. The X# compiler is an open-source console program with an atypical architecture. It parses lines of code into tokens and compares them with patterns. Matched X# code patterns are then translated to Intel-syntax x86 assembly, usually for the NASM assembler. In early versions, X# statements mapped mostly 1:1 to assembly instructions, although that was not the reason the X# compiler was written.
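The token-and-pattern approach described above can be illustrated with a minimal Python sketch. It is not the actual X# compiler: the pattern table, the regular expressions and the function name translate_line are assumptions made for illustration, and only three statement forms that appear in this article are covered.

```python
import re

# Hypothetical pattern table: (regex over one source line, assembly template).
# Only a handful of 16/32-bit register names are recognised in this sketch.
REG = r"(E?[A-D]X|E?[SD]I|E?[SB]P)"
PATTERNS = [
    (re.compile(rf"^{REG}\s*=\s*{REG}$"), "mov {0}, {1}"),
    (re.compile(rf"^{REG}\+\+$"), "inc {0}"),
    (re.compile(rf"^{REG}--$"), "dec {0}"),
]


def translate_line(line: str) -> str:
    """Translate one X#-style line to Intel-syntax assembly, or raise ValueError."""
    code = line.split("//", 1)[0].strip()  # drop a trailing // comment
    for pattern, template in PATTERNS:
        match = pattern.match(code)
        if match:
            operands = [group.lower() for group in match.groups()]
            return template.format(*operands)
    raise ValueError(f"no pattern matches: {line!r}")
```

Because each pattern is anchored to a whole line, a statement that is split across lines cannot match, which is in keeping with the line-oriented syntax of the language.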
Syntax
The syntax of X# is simple. Although it resembles C, X# syntax differs in places and is stricter.
Comments
X# supports only one kind of comment, the C-style single-line comment, which starts with a double forward slash (//).
Constants
X# supports the definition of named constants, which are declared outside of functions. Defining a numeric constant is similar to C++, for instance const i = 0. Referencing the constant elsewhere requires a # before the name, for instance "#i".
- To define a string constant, single quotes ('') are used. To use a single quote in a string constant, it must be escaped by placing a backslash before it, as in 'I\'m so happy'. X# strings are null-terminated.
- Hexadecimal constants are prefixed with a dollar sign ($), followed by the constant ($B8000).
- Decimal constants are not decorated but may not start with 0.
- Binary and octal constants aren't supported yet.
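The numeric literal rules above can be summarised in a short Python sketch. The function name parse_numeric_constant is hypothetical, and one detail is an assumption: a lone 0 is accepted (the article itself writes const i = 0), while longer decimal literals with a leading 0 are rejected.

```python
import string


def parse_numeric_constant(text: str) -> int:
    """Parse an X#-style numeric literal following the rules in this section."""
    if text.startswith("$"):
        digits = text[1:]
        if not digits or any(c not in string.hexdigits for c in digits):
            raise ValueError(f"not a hexadecimal constant: {text!r}")
        return int(digits, 16)
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a decimal constant: {text!r}")
    # Assumption: "0" alone is fine, but "012" is not a valid decimal literal.
    if len(text) > 1 and text[0] == "0":
        raise ValueError("decimal constants may not start with 0")
    return int(text)
```

For example, parse_numeric_constant("$B8000") yields the same value as the Python literal 0xB8000, while "012" and "$G1" are rejected.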
Labels
Labels in X# are mostly equivalent to labels in other assembly languages. The instruction to jump to a label uses the goto mnemonic, as opposed to the conventional jump or jmp mnemonic.
CodeLabel1:
goto CodeLabel2
Namespaces
X# program files must begin with a namespace directive. X# lacks a namespace hierarchy, so any directive will change the current namespace until it's changed again or the file ends. Variables or constants in different namespaces may have the same name as the namespace is prefixed to the member's name on assembly output. Namespaces cannot reference each other except through "cheats" using native-assembly-level operations.
namespace FIRST
// Every variable or constant name will be prefixed with FIRST and an underscore. Hence the true full name of the variable below
// is FIRST_aVar.
var aVar
namespace SECOND
// It's not a problem to name another variable aVar. Its true name is SECOND_aVar.
var aVar
namespace FIRST
// This code is now back to the FIRST namespace until the file ends.
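The prefixing rule shown in the comments above can be modelled with a few lines of Python. This is only a sketch of the naming scheme as the article describes it; the function name mangle_names is hypothetical, and only namespace and var lines are recognised.

```python
def mangle_names(lines):
    """Return the assembly-level names of variables declared in X#-style lines."""
    namespace = None
    names = []
    for raw in lines:
        code = raw.split("//", 1)[0].strip()  # ignore // comments
        if not code:
            continue
        words = code.split()
        if words[0] == "namespace" and len(words) == 2:
            namespace = words[1]  # stays current until changed again
        elif words[0] == "var" and len(words) >= 2:
            if namespace is None:
                raise ValueError("file must begin with a namespace directive")
            names.append(f"{namespace}_{words[1]}")
    return names
```

Feeding it the example file above produces FIRST_aVar and SECOND_aVar, so the two variables do not collide after translation.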
Functions
All X# executable code should be placed in functions defined by the 'function' keyword. Unlike C, X# does not support any formal parameter declaration in the header of a function, so the conventional parentheses after the function name are omitted. Because the code parser matches fixed single-line patterns, the opening curly bracket can't be placed on the next line, unlike in many other C-style languages.
function xSharpFunction {
// function code
}
Because X# is a low-level language, no stack frames are inserted, so by default the return EIP address is on the top of the stack when a function is entered. Unlike function headers, X# function calls do contain arguments enclosed in parentheses. Arguments passed to functions can be registers, addresses, or constants, and they are pushed onto the stack in reverse order. Note that x86 cannot push or pop one-byte registers.
function xSharpFunction {
EAX = $10
anotherFunction(EAX);
return
}
function anotherFunction {
//function code
}
The return keyword returns execution to the return EIP address saved on the stack.
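The reverse-order argument convention can be sketched in Python as a small code generator. The helper emit_call is hypothetical, and the exact instructions the real compiler emits are not confirmed by this article; the sketch only shows what "pushed in reverse order" means for the resulting stack layout.

```python
def emit_call(name, args):
    """Return Intel-syntax lines that push args right to left, then call name."""
    lines = [f"push {arg}" for arg in reversed(args)]
    lines.append(f"call {name}")
    return lines
```

With this ordering the first argument is pushed last, so once the call has pushed the return address it sits at ESP + 4. That matches the strlen example later in this article, which reads its first argument from ESP[4].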
Arithmetic and bitwise operations
X# can work with three low-level data structures: the registers, the stack and the memory, on different ports. The registers are the base of all normal operations for X#. A register can be copied to another by writing DST = SRC as opposed to mov or load/store instructions. Registers can be incremented or decremented just as easily. Arithmetic operations (add, subtract, multiply, divide) are written as dest op src, where src is a constant, variable, or register, and dest is both an operand and the location where the result is stored.
Examples of assignment and arithmetic operations are shown below.
ESI = 12345 // assign 12345 to ESI
EDX = #constantForEDX // assign #constantForEDX to EDX
EAX = EBX // move EBX to EAX => mov eax, ebx
EAX-- // decrement EAX => dec eax
EAX++ // increment EAX => inc eax
EAX + 2 // add 2 to eax => add eax, 2
EAX - $80 // subtract 0x80 from eax => sub eax, 0x80
BX * CX // multiply BX by CX => mul cx -- division, multiplication and modulo should preserve registers
CX / BX // divide CX by BX => div bx
CX mod BX // remainder of CX/BX to BX => div bx
Register shifts are written as in C, and rotations use similar operators.
DX << 10 // shift left by 10 bits
CX >> 8 // shift right by 8 bits
EAX <~ 6 // rotate left by 6 bits
EAX ~> 4 // rotate right by 4 bits
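The difference between shifting and rotating can be shown with a small Python model of a fixed-width register. The helper names rotate_left and rotate_right are illustrative and not part of X#; the register width defaults to 32 bits, as for EAX.

```python
def rotate_left(value: int, count: int, bits: int = 32) -> int:
    """Rotate value left by count positions within a bits-wide register."""
    mask = (1 << bits) - 1
    count %= bits
    value &= mask
    # Bits shifted out on the left re-enter on the right.
    return ((value << count) | (value >> (bits - count))) & mask


def rotate_right(value: int, count: int, bits: int = 32) -> int:
    """Rotate value right by count positions within a bits-wide register."""
    return rotate_left(value, bits - (count % bits), bits)
```

A shift discards the bits that leave the register, whereas a rotation keeps them: shifting 0x80000001 left by one bit in 32 bits leaves 0x00000002, while rotating it left by one bit gives 0x00000003.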
Other bitwise operations are similar to arithmetic operations.
DL & $08 // perform bitwise AND on DL with 0x08 and store the result in DL
CX | 1 // set the lowest bit of CX to 1 (make it odd)
EAX = ~ECX // perform bitwise NOT on ECX and store the result in EAX
EAX ^ EAX // clear EAX by XORing it with itself
Stack
Stack manipulation in X# is performed using + and - prefixes, where + pushes a register, value, constant or all registers onto the stack and - pops a value into a register. All constants are pushed onto the stack as double words unless stated otherwise (pushing single bytes is not supported).
+ESI // push esi
-EDI // pop into edi
+All // save all registers => pushad
-All // load all registers => popad
+$1badb002 // push 0x1badb002 on the stack
+$cafe as word // \/
+$babe as word // push 0xcafebabe
+#VideoMemory // push value of constant VideoMemory
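Why pushing $cafe and then $babe as words leaves the double word 0xcafebabe on top of the stack follows from two x86 properties: the stack grows toward lower addresses, and memory is little-endian. The toy Python model below demonstrates this; the Stack class is purely illustrative and performs no bounds checking.

```python
import struct


class Stack:
    """Toy model of a downward-growing x86 stack in little-endian memory."""

    FORMATS = {2: "<H", 4: "<I"}  # word and double word

    def __init__(self, size: int = 64):
        self.memory = bytearray(size)
        self.esp = size  # empty stack: ESP points just past the end

    def push(self, value: int, width: int = 4) -> None:
        fmt = self.FORMATS[width]  # KeyError for unsupported widths such as 1
        self.esp -= width
        struct.pack_into(fmt, self.memory, self.esp, value)

    def pop(self, width: int = 4) -> int:
        (value,) = struct.unpack_from(self.FORMATS[width], self.memory, self.esp)
        self.esp += width
        return value
```

Pushing 0xCAFE and then 0xBABE with width=2 places 0xBABE at the lower address, so a subsequent pop() of a double word reads the bytes BE BA FE CA and returns 0xCAFEBABE.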
Variables
Variables are defined within namespaces (as there are no stack frames, local variables aren't supported) using the var keyword. Arrays can be defined by adding the array's type and size at the end of the declaration. Variables and arrays are zeroed by default. To reference a variable's value, prefix its name with a dot. Adding an @ before the dot references the variable's address instead.
namespace XSharpVariables
var zeroVar // variable will be assigned zero
var myVar1 = $f000beef // variable will be assigned 0xf000beef
var someString = 'Hello XSharp!' // variable will be assigned 'Hello XSharp!\0'
var buffer byte[1024] // variable of size 1024 bytes will be assigned 1024 zero bytes
...
EAX = .myVar1 // moves value of myVar1 (0xf000beef) to EAX
ESI = @.someString // moves address of someString to ESI
CL = .someString // moves first character of someString ('H') to CL
.zeroVar = EAX // assigns the value of EAX to zeroVar
X# can access an address with a specified offset using square brackets:
var someString = 'Hello XSharp!' //variable will be assigned to 'Hello XSharp!\0'
...
ESI = @.someString // load address of someString to ESI
CL = 'B' // set CL to 'B'
ESI[0] = CL // overwrite the 'H' at the start with 'B'
CH = ESI[1] // move second character ('e') from string to CH
ESI[4] = $00 // end string
//Value of someString will be 'Bell' (or 'Bell\0 XSharp!\0')
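The effect of writing through an offset into a null-terminated string can be modelled in Python with a bytearray standing in for memory. This is a sketch under the assumption that 'B' is actually stored at offset 0; the helper c_string is hypothetical and simply reads bytes up to the first zero byte.

```python
def c_string(buffer: bytearray, start: int = 0) -> bytes:
    """Return the bytes from start up to (not including) the first zero byte."""
    end = buffer.index(0, start)  # ValueError if no terminator is present
    return bytes(buffer[start:end])


# 'Hello XSharp!' followed by the terminating zero byte.
some_string = bytearray(b"Hello XSharp!\x00")

second_char = some_string[1]  # corresponds to reading offset 1
some_string[0] = ord("B")     # store 'B' over the leading 'H'
some_string[4] = 0            # terminate the string after four characters
```

After these writes the buffer still holds all fourteen bytes, but anything that treats it as a null-terminated string stops at offset 4 and sees only Bell.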
Comparison
There are two ways of comparing values: pure comparison and if-comparison.
- Pure comparison leaves the result in FLAGS so it can be used in native assembly or with the if keyword without specifying comparison members.
- If-comparison compares two members directly after an if keyword.
Here are two ways of writing a (slow) X# string length (strlen) function:
// Method 1: using pure comparison
function strlen {
ESI = ESP[4] // get pointer to string passed as first argument
ECX ^ ECX // clear ECX
Loop:
AL = ESI[ECX] // get next character
AL ?= 0 // is it 0? save to FLAGS
if = return // if ZF is set, return
ECX++ // else increment ECX
goto Loop // loop...
}
// Method 2: using if-comparison
function strlen {
ESI = ESP[4] // get pointer to string passed as first argument
ECX ^ ECX // clear ECX
Loop:
AL = ESI[ECX]
if AL = 0 return // AL = 0? return
ECX++
goto Loop // loop....
}
There are six available comparison operators: < > = <= >= !=. These operators can be used in both comparisons and loops. Note that there's also a bitwise AND operator which tests bits:
AL ?& $80 // test AL MSB
if = return // if ZF is set, the test resulted in 0, so the MSB is not set: return
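The flag logic of the bit test can be checked with a short Python model of the x86 TEST instruction, which ANDs its operands, discards the result and sets the zero flag when that result is 0. The function names below are illustrative and not part of X#.

```python
def zero_flag_after_test(value: int, mask: int) -> bool:
    """Model x86 TEST: ZF is set exactly when (value AND mask) equals 0."""
    return (value & mask) == 0


def msb_is_set(al: int) -> bool:
    """Return True when bit 7 of an 8-bit value is set."""
    return not zero_flag_after_test(al & 0xFF, 0x80)
```

So for AL = $7F the zero flag is set and the if = branch is taken, while for AL = $80 the zero flag is clear and execution falls through.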