oitofelix - QDot 8086

QDot 8086

QDot 8086 is a mid-level programming language targeting the original IBM PC architecture, written as a set of macros for NASM, the Netwide Assembler. The idea behind it is to make it easy to write small, fast, correct and maintainable code in a language almost as expressive as C, without giving up the control that assembly language grants to programmers. It supports functions with an arbitrary number of parameters and multiple return values, global and function-local variables, loop and conditional flow-control constructs, evaluation of arbitrarily complex stack-based expressions, symbol importing and primitive debugging. To accomplish this, NASM’s powerful preprocessing and assembling capabilities are combined into machinery that very closely resembles a compiler. QDot also has a companion standard library that is fully BIOS-based, and thus OS-independent. It provides array processing, keyboard, video, disk and speaker I/O, timing, low-level debugging, math functions, user interface procedures and, last but not least, a versatile metamorphic boot loader that makes it simple to build a binary that is simultaneously a valid DOS executable and a bootable image, a property known as run-within-OS-or-bootstrap-itself-without-OS. A couple of programs have already been implemented in QDot as a proof of concept: Terminal Matrix 8086 and DeciMatrix 8086. QDot currently supports only the tiny memory model (.COM binaries, whose code, data and stack all fit within a single 64 KB segment).

Download

QDot is free software under GPLv3+ and you can obtain its source code here.

Documentation

Use

In order to use QDot just include it in your main source code file, like this:

%include "qdot/qdot.qdt"

Make sure to put QDot’s lib directory in NASM’s include search path, for instance, by using the -i switch. When compiling your code, use the plain binary output format by means of the -fbin option. A typical command-line invocation for compilation of QDot sources looks like:

nasm -isrc -isrc/lib/ -fbin -oprog.com src/prog.qdt

This assumes that all source code is inside the src directory, including QDot’s main lib directory, and that the main source file is called prog.qdt. Every source file written in the QDot language is also a valid NASM file, provided you include all of QDot’s macro definitions, but by convention we use the .QDT extension for QDot source code.

QDot is dubbed this way because of the ubiquitous ? character used as the first component in the name of all NASM macros defined for QDot. That character, somewhat unusual for identifier names, makes it easy to distinguish between code using QDot’s facilities and standard, or third-party, NASM code.

Often, it’s useful to identify which QDot version was used in a particular build. The macro ?version is defined as QDot’s version string.

Numerical constants

Since QDot is implemented on top of NASM, its syntax and semantics for numerical constants are the same as NASM’s. You can read more about them in the numerical constants section of the NASM manual. Every numeric constant in a QDot 8086 context is a 16-bit number. Examples of numerical constants are 61640, 0F0C8h and 1111000011001000b, all of them representing the same number in decimal, hexadecimal and binary, respectively. QDot equates the symbols ?TRUE and ?FALSE to numerical constants that represent the respective boolean values. Similarly, the ?NULL symbol equates to a numerical constant that represents the null pointer.

See the file lib/qdot/stack.qdt for the implementation details of predefined numerical constants.

Variables

All variables in QDot 8086 are 16 bits wide. To perform byte-wide operations one has to resort to a pair of special operators: @byte and @byte=, for reading and writing, respectively. There are two scopes for variables: global and function-local. Global variables are the usual NASM labels pointing to data space, and function-local variables are context-local labels defined at the function level. Therefore, local variable names always start with %$ at the top-most level inside a QDot function, and deeper references must add one $ character per nesting level, since context fall-through lookup has unfortunately been removed from NASM. QDot pushes a new context onto the NASM preprocessor’s context stack for each nested flow-control construct. Examples of global and function-local variable references are [video_rows] and %$count, respectively. The latter would be referred to as %$$count if used one context deeper, and so on.

Operators

An operator is a symbol used in stack-based expressions to transform the stack and/or variable values. QDot has operators analogous to almost every C language operator, and a few more. Suppose that a, b and c refer to the three top-most stack values, in this order from top to bottom, that var refers to a global or local variable, and that id refers to a function symbol. The operators work as follows:

Arithmetic
- +: pop a and b, then push their sum;
- -: pop a and b, then push the difference between a and b;
- *: pop a and b, then push their product;
- /: pop a and b, then push the quotient of a divided by b;
- %: pop a and b, then push the remainder of a divided by b;
- min: pop a and b, then push the lowest of both, or one of them if they are equal (unsigned comparison);
- min-: pop a and b, then push the lowest of both, or one of them if they are equal (signed comparison);
- max: pop a and b, then push the greatest of both, or one of them if they are equal (unsigned comparison);
- max-: pop a and b, then push the greatest of both, or one of them if they are equal (signed comparison);
- neg: pop a, then push its additive inverse;
- inc: pop a, then push its successor;
- dec: pop a, then push its predecessor;
- abs: pop a, then push its absolute value;
Bitwise
- ~: pop a, then push its bitwise negation;
- &: pop a and b, then push their bitwise conjunction;
- |: pop a and b, then push their bitwise disjunction;
- ^: pop a and b, then push their bitwise exclusive disjunction;
- <<: pop a and b, then push a bitwise-left-shifted by b positions;
- >>: pop a and b, then push a bitwise-right-shifted by b positions;
Boolean
- !: pop a, then push its boolean negation;
- &&: pop a and b, then push their boolean conjunction;
- ||: pop a and b, then push their boolean disjunction;
Comparison
- ==: pop a and b, then push ?TRUE if they are equal, ?FALSE otherwise;
- !=: pop a and b, then push ?TRUE if they are different, ?FALSE otherwise;
- <: pop a and b, then push ?TRUE if a is less than b, ?FALSE otherwise (unsigned comparison);
- <-: pop a and b, then push ?TRUE if a is less than b, ?FALSE otherwise (signed comparison);
- <=: pop a and b, then push ?TRUE if a is less than or equal to b, ?FALSE otherwise (unsigned comparison);
- <=-: pop a and b, then push ?TRUE if a is less than or equal to b, ?FALSE otherwise (signed comparison);
- >: pop a and b, then push ?TRUE if a is greater than b, ?FALSE otherwise (unsigned comparison);
- >-: pop a and b, then push ?TRUE if a is greater than b, ?FALSE otherwise (signed comparison);
- >=: pop a and b, then push ?TRUE if a is greater than or equal to b, ?FALSE otherwise (unsigned comparison);
- >=-: pop a and b, then push ?TRUE if a is greater than or equal to b, ?FALSE otherwise (signed comparison);
Assignment
- =, var: pop a, then assign it to var;
- =$, var: assign a to var;
- +=, var: pop a, then sum it to var;
- -=, var: pop a, then subtract it from var;
- *=, var: pop a, then store in var the product of both;
- /=, var: pop a, then store in var the quotient of the division of var by a;
- %=, var: pop a, then store in var the remainder of the division of var by a;
- &=, var: pop a, then store in var the bitwise conjunction of both;
- |=, var: pop a, then store in var the bitwise disjunction of both;
- ^=, var: pop a, then store in var the bitwise exclusive disjunction of both;
- <<=, var: pop a, then store in var itself bitwise-left-shifted by a positions;
- >>=, var: pop a, then store in var itself bitwise-right-shifted by a positions;
- ++, var: increment var by one;
- ++$, var: increment var by one, then push it;
- $++, var: push var, then increment it by one;
- --, var: decrement var by one;
- --$, var: decrement var by one, then push it;
- $--, var: push var, then decrement it by one;
Dereference
- @word: pop a, then push the word value it points to;
- @word=: pop a and b, then assign b to the word a points to;
- @byte: pop a, then push the word extension of the byte it points to;
- @byte=: pop a and b, then assign the low byte of b to the byte a points to;
Function call
- call, id: call function whose symbol is id;
- dcall: pop a, then call the function a points to;
Flow
- ?:: pop a, b and c, then push b if a is ?TRUE, otherwise push c;
Stack
- drop: pop a;
- dup: push a;
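
The difference between the unsigned operators and their signed counterparts (those with a trailing -) can be illustrated outside QDot. The following Python sketch only models how the same 16-bit pattern is interpreted in each case; the helper names to_signed, less_unsigned and less_signed are hypothetical and are not part of QDot:

```python
MASK = 0xFFFF  # QDot 8086 values are 16 bits wide


def to_signed(value):
    """Interpret a 16-bit pattern as a two's-complement signed number."""
    value &= MASK
    return value - 0x10000 if value & 0x8000 else value


def less_unsigned(a, b):
    """Model of an unsigned comparison such as the < operator."""
    return (a & MASK) < (b & MASK)


def less_signed(a, b):
    """Model of a signed comparison such as the <- operator."""
    return to_signed(a) < to_signed(b)


# 0FFFFh is 65535 when read as unsigned, but -1 when read as signed.
print(less_unsigned(0xFFFF, 1))  # False
print(less_signed(0xFFFF, 1))    # True
```

The same reasoning applies to min/min-, max/max- and the other comparison pairs.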

See the file lib/qdot/stack.qdt for the implementation details of operators.

Expressions

QDot’s core concept is the stack-based expression. An expression is a comma-separated list of numerical constants, variables and operators. Expressions can be used by themselves for their side effects, as conditions in flow-control constructs, or to calculate return values. The elements of the list are read by the compiler from right to left, and each of them changes the stack, and potentially variables, in that order; therefore, expressions in QDot are said to use “Polish”, or “prefix”, notation. Numerical constants are always pushed onto the stack. Variables have their values pushed onto the stack, unless they are the operand of an assignment operator, in which case the top-most stack value is popped, possibly operated on, and stored into the respective variable. Operators pop stack values, possibly pushing results back onto the stack or storing them into variables. An example of a stand-alone expression that is equivalent to the C expression x += (x - a++) * x + 3 is:

? +=, %$x, +, *, -, %$x, $++, %$a, %$x, 3
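
To make the right-to-left evaluation order concrete, here is a minimal Python model of the expression above. It is only a sketch of the semantics described in this section, not QDot’s implementation: the evaluate function is hypothetical, it handles just the few operators this example needs, and it assumes results wrap around to 16 bits.

```python
MASK = 0xFFFF  # assumption: results wrap around to 16 bits

BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,  # a is the top-most value, b the one below it
    "*": lambda a, b: a * b,
}


def evaluate(expr, variables):
    """Evaluate a comma-separated expression from right to left."""
    tokens = [t.strip() for t in expr.split(",")]
    stack = []
    i = len(tokens) - 1
    while i >= 0:
        tok = tokens[i]
        if tok in BINARY:
            a = stack.pop()
            b = stack.pop()
            stack.append(BINARY[tok](a, b) & MASK)
        elif tok in variables:
            # A variable is the operand of the operator written to its left.
            op = tokens[i - 1] if i > 0 else None
            if op == "=":
                variables[tok] = stack.pop()
                i -= 1
            elif op == "+=":
                variables[tok] = (variables[tok] + stack.pop()) & MASK
                i -= 1
            elif op == "$++":
                stack.append(variables[tok])
                variables[tok] = (variables[tok] + 1) & MASK
                i -= 1
            else:
                stack.append(variables[tok])
        else:
            stack.append(int(tok) & MASK)
        i -= 1
    return stack


variables = {"%$x": 5, "%$a": 2}
leftover = evaluate("+=, %$x, +, *, -, %$x, $++, %$a, %$x, 3", variables)
print(leftover, variables)  # [] {'%$x': 23, '%$a': 3}
```

With x = 5 and a = 2, the C expression gives (5 - 2) * 5 + 3 = 18, so x becomes 23 and a becomes 3. The stack ends up empty, which is what a stand-alone expression is expected to leave behind.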

Although QDot’s syntax and semantics would allow an expression to push values onto the stack in order to save them for later use, this is considered bad practice and can potentially break QDot’s implementation in unexpected ways. Therefore, the behavior of programs that make use of this kind of misleading hack is formally undefined. The only exception is, of course, the return expressions of a function, which may evaluate to multiple values that are pushed onto the stack prior to returning, but that is done transparently by means of the ?return and ?endfunc clauses. The rule of thumb is: stack-based expressions must be designed to be clean and atomic, in the sense that they must leave the stack in the state they found it. The stack is a medium of operation within an expression and NOT across expressions. If the expression is a flow-control condition, where a boolean value is expected, it must evaluate to that value and only that value; otherwise it must evaluate to nothing. For instance, all stand-alone expressions (those introduced by a bare ?) must evaluate to nothing. This rule extends to low-level push and pop instructions: pop everything you push.

The current implementation of QDot 8086 reserves all of the processor’s general-purpose registers, ax, bx, cx and dx, to perform operations within expressions. Therefore, you should not presume that any of them remains unchanged after the evaluation of an expression, be it a stand-alone, flow-control or return expression.

See the file lib/qdot/stack.qdt for the implementation details of stack-based expressions.

Flow-control

There is a nearly one-to-one match between QDot’s loop and conditional flow-control constructs and those of the C language. QDot supports two conditional constructs, ?if and ?switch, and three loop constructs, ?do, ?while and ?for.

The syntax for the ?if block is:

?if condition_0
  ...
?elseif condition_1
  .
  .
  .
?elseif condition_n
  ...
?else
  ...
?endif

Where condition_0…condition_n are mandatory expressions that must each evaluate to a boolean value. The conditions are evaluated in sequence until one of them evaluates to ?TRUE, in which case its corresponding sub-block is executed. If no expression evaluates to ?TRUE, the ?else sub-block is executed if there is one; otherwise no code is executed. The ?elseif and ?else clauses may be omitted.

The syntax for the ?switch block is:

?switch [value]
  ?case condition_0
    .
    .
    .
  ?case condition_n
    ...
  ?default
    ...
?endswitch

Where [value] is an optional expression that must evaluate to nothing and condition_0…condition_n are mandatory expressions that must each evaluate to a boolean value. The [value] expression is evaluated, if given; then condition_0…condition_n are evaluated in sequence until one of them evaluates to ?TRUE, in which case its corresponding sub-block is executed. If no expression evaluates to ?TRUE, the ?default sub-block is executed if there is one; otherwise no code is executed. There is no need for a break statement as in C, because there is no fall-through. As you can see, this is essentially an ?if/?elseif/?else/?endif block preceded by a stand-alone expression that can be used, for example, to initialize a variable. The ?default clause may be omitted.

There are two special clauses that can be used within loop blocks: ?continue, which jumps to the next iteration, and ?break, which resumes execution at the first line of code following the block. As loop blocks can be nested, those clauses receive the name of the loop they apply to as a mandatory argument. The loop’s name is given by a label preceding its opening clause, and may be omitted if no ?continue or ?break clause refers to it.

The syntax for the ?do block is:

label: ?do
  ...
?while condition

Where condition is a mandatory expression that must evaluate to a boolean value. The code within the block is executed and then condition is evaluated. If it evaluates to ?TRUE, this process repeats, otherwise the execution resumes at the first line of code following the block.

The syntax for the ?while block is:

label: ?while condition
  ...
?endwhile

Where condition is a mandatory expression that must evaluate to a boolean value. The condition expression is evaluated. If it evaluates to ?TRUE, the code within the block is executed and this process repeats; otherwise the execution resumes at the first line of code following the block.

The syntax for the ?for block is:

label: ?for [init]
  ?cond condition
  ...
?next [next]

Where [init] and [next] are optional expressions that must evaluate to nothing and condition is a mandatory expression that must evaluate to a boolean value. The [init] expression is evaluated exactly once as the very first step. Then, condition is evaluated. In case it evaluates to ?TRUE the code within the block is executed, and then [next] is evaluated and this process repeats, otherwise the execution resumes at the very first line of code following the block.

See the file lib/qdot/flow.qdt for the implementation details of flow-control constructs.

Functions

Functions are defined using the following syntax:

?func funcname, %$par_0,...,%$par_n
  ?local %$var_0,...,%$var_n
  ...
  ?return [retvals_0]
  ...
  ?return [retvals_1]
  ...
?endfunc [retvals_n]

Where,

funcname is a valid identifier to be used as the function global name;
%$par_0,…,%$par_n are the function parameter variables;
%$var_0,…,%$var_n are the function local variables;
[retvals_0],…,[retvals_n] are the optional expressions for return values;

For example, a non-recursive function to calculate the factorial of a number can be defined like this:

?func factorial, %$n
  ?local %$p, %$i
  ?for =, %$$p, 1, =, %$$i, 2
    ?cond <=, %$$i, %$$n
    ? *=, %$$p, %$$i
  ?next ++, %$$i
?endfunc %$p

For instance, that function can be called using the call operator and have its return value incremented by one and collected in the caller’s variable %$f, like this:

? =, %$f, inc, call, factorial, 8

Although correct, this example is just illustrative, because this function wouldn’t allow you to calculate factorials of numbers above 8, because QDot 8086’s variables are only 16-bit wide.
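
The limit of 8 follows directly from the 16-bit width. As a quick check outside QDot, this Python sketch (the function name largest_exact_factorial_arg is hypothetical) finds the largest argument whose factorial still fits in an unsigned word:

```python
import math


def largest_exact_factorial_arg(bits=16):
    """Return the largest n such that n! fits in an unsigned word of `bits` bits."""
    limit = (1 << bits) - 1
    n = 0
    while math.factorial(n + 1) <= limit:
        n += 1
    return n


print(largest_exact_factorial_arg())         # 8
print(math.factorial(8), math.factorial(9))  # 40320 362880
```

Since 8! = 40320 is below 65535 and 9! = 362880 is above it, 8 is the last argument that yields an exact result.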

The order in which arguments are written in calling expressions is the natural order that someone familiar with C would expect: first argument first, last argument last. Since the compiler parses the expression from right to left, the last argument of a function call is deeper on the stack than the first argument, which lies on top. The same rule applies to return expressions: first return value on top, last return value deeper on the stack.

You can return an arbitrary number of values by pushing them onto the stack using the ?return and ?endfunc clauses. By the way, this is the only acceptable context where an expression should ultimately evaluate to multiple values. A single function can have a different number of return values depending upon arbitrary conditions, however the caller must be prepared to pop every single value returned, therefore every possible combination of the number of return values and conditions applicable must be known and paid attention to at compilation time.

The number of arguments in a call to a function must match its number of parameters. You can, however, refer to “additional arguments” pushed onto the stack prior to the call by using the ?argument macro, like this: ?argument(i), where i is a number greater than the index of the last parameter. Thus, if a function has 2 parameters, the expression ?argument(2) points to the first “additional argument”. As one would expect, those arguments aren’t popped in the process of calling and returning from a function, and must therefore be dealt with explicitly by the caller.

The current implementation of QDot 8086 uses the bp register to keep track of function arguments, local variables and the return address. Therefore, it’s imperative that this register remain untouched by user code. At a return point (a ?return or ?endfunc clause), the sp register is used to keep track of return values, and therefore must hold the same value it had at the function’s beginning.

See the file lib/qdot/func.qdt for the implementation details of function definition, calling and returning.

Symbol importing

A “module” is defined as “a file that provides importable symbols”. With a little help from NASM’s single-line macro definitions and preprocessor includes, in QDot you can define importable symbols that rely on dependency trees of arbitrary complexity and depth. The following idiomatic syntax is used to define an importable symbol:

%ifdef IMPORT_symbol
%ifndef IMPORTED_symbol
%define IMPORTED_symbol

?import symboldep_0_0,...,symboldep_0_l
%include "file_0"
.
.
.
?import symboldep_n_0,...,symboldep_n_m
%include "file_n"

symbol_definition

%endif ; IMPORTED_symbol
%endif ; IMPORT_symbol

Where,

symbol is the symbol being defined;
symboldep_i_0,…,symboldep_i_j are the symbols, defined in the file file_i, on which symbol depends: namely, the symbols that its definition makes reference to;
symbol_definition is the definition of symbol proper. It may be anything that defines symbol globally, ranging from NASM’s equates, or labels used with pseudo-instructions like db, to function definitions;

To illustrate this, let’s re-define the factorial function to return 0 in case of overflow. First we declare an importable symbol defining the maximum allowable value for the factorial argument:

%ifdef IMPORT_FACTORIAL_ARGMAX
%ifndef IMPORTED_FACTORIAL_ARGMAX
%define IMPORTED_FACTORIAL_ARGMAX

FACTORIAL_ARGMAX equ 8

%endif ; IMPORTED_FACTORIAL_ARGMAX
%endif ; IMPORT_FACTORIAL_ARGMAX

Then we re-define the factorial function as an importable symbol depending upon the previously defined FACTORIAL_ARGMAX symbol:

%ifdef IMPORT_factorial
%ifndef IMPORTED_factorial
%define IMPORTED_factorial

?import FACTORIAL_ARGMAX
%include "math/factorial.qdt"

?func factorial, %$n
  ?local %$p, %$i
  ?if >, %$$n, FACTORIAL_ARGMAX
    ?return 0
  ?endif
  ?for =, %$$p, 1, =, %$$i, 2
    ?cond <=, %$$i, %$$n
    ? *=, %$$p, %$$i
  ?next ++, %$$i
?endfunc %$p

%endif ; IMPORTED_factorial
%endif ; IMPORT_factorial

And finally we could import and use it like this:

?import factorial
%include "math/factorial.qdt"
...
  ? =, %$f, inc, call, factorial, 9     ; now '%$f' equals 1
...

Notice that one can, in principle, define more than one symbol inside a single importable symbol block. That feature should be used wisely, though, in order to keep the module sane and clean. A legitimate use for it is to define multiple symbols that make up a single whole, like a 32-bit variable with one symbol for the high word and another for the low word, like this:

;;;;;;;;;;;;;;
; random_seed
;;;;;;;;;;;;;;

%ifdef IMPORT_random_seed
%ifndef IMPORTED_random_seed
%define IMPORTED_random_seed

random_seed_0 dw 0000h
random_seed_1 dw 0000h

%endif ; IMPORTED_random_seed
%endif ; IMPORT_random_seed

Notice also, that a module can depend upon symbols defined by itself — and that’s actually pretty common. The recursive %include preprocessor directives should not pose a problem for two reasons: first, an importable symbol must never depend on itself, and second, no piece of code inside a module, be it an importable symbol block or otherwise, may be allowed to be included twice.

If the programmer wants the main source file of a program to be a module, so that it can depend on symbols defined by itself (an elegant choice), they must make sure it obeys the two principles above. To that end, the execution entry point must be isolated from the rest of the file and the remaining code must be made into importable symbol blocks. The very first code inside the main source module must be along these lines:

%push SRC_PROG_QDT ; push a preprocessor context for this module

%ifndef SRC_PROG_QDT ; ensure this block won't be included twice
%define SRC_PROG_QDT

CPU 8086
ORG 100h

%include "qdot/qdot.qdt" ; enable QDot language

mov sp, 0FFFEh ; setup stack for QDot expressions
mov bp, sp

call main  ; call the main procedure defined in this very module
           ; somewhere outside this block

?import main  ; import the main symbol used above.  This
              ; will expand to all the code it depends upon, and so on
              ; recursively, thus all the program's source code will
              ; be compiled and assembled here
%include "src/prog.qdt"

SECTION .bss ; here is the binary end
.
.
.
%endif ; SRC_PROG_QDT
.
. ; here are all importable symbol blocks defined by the
. ; main module
.
%pop SRC_PROG_QDT ; the module ends here

See the file lib/qdot/func.qdt for the implementation details of symbol importing.

Debugging

Plain .COM binaries don’t support any kind of meta-information, since they are raw memory images. Therefore, any executable built from QDot source code lacks debugging information. To ease debugging, a practice of paramount importance in software development at any scale, QDot defines the ?debug clause. This construct inserts an arbitrary string, and a jump over it, into the assembled binary at the place of declaration; the string can then be used to locate a specific part of the code without changing the executable’s behavior. Thus, for instance, one can find where the code for a function starts by putting a ?debug clause at its beginning, then compiling and telling the debugger to search for the string passed to that ?debug clause.

For example, if we would like to easily find, within a debug session, the entry point for the factorial function we defined, the following code put at the very line before the function definition does the job:

?debug 'factorial'

Then it suffices to search for the string 'factorial' in the debugger. When defining several points of reference for debugging, make sure to use related but distinct strings, like 'factorial_0',…,'factorial_n'.

See the file lib/qdot/func.qdt for the implementation details of debugging support.

Standard library

Although the language is, in a strict sense, feature-complete and should remain stable, the standard library is the result of the needs I’ve come across while developing particular projects. Therefore, one should expect the standard library to be ever evolving according to the use case scenarios I shall face in the future. At this stage it covers well some specific areas, not so well others and lacks a lot of features one may expect for projects sufficiently different from the ones I’ve developed with QDot to the present date.

The standard library is designed to be deployed at source-level for each project that depends upon it. It means that a copy of it should go along each project that makes use of it. The library is modular enough, though, so one may just include the relevant bits of it, omitting the rest, without breaking anything.

In order to use a particular symbol defined by the library you should import it as described in “Symbol importing”. For example, if you want to use the function that clears the screen, you must code

?import video_cls
%include "kernel/video.qdt"

within the importable symbol block you are defining for the function that uses it, like this:

;;;;;;;;;;
; cmd_cls
;;;;;;;;;;

%ifdef IMPORT_cmd_cls
%ifndef IMPORTED_cmd_cls
%define IMPORTED_cmd_cls

?import video_cls
%include "kernel/video.qdt"

?func cmd_cls, %$cmd_args
  ? call, video_cls
?endfunc

%endif ; IMPORTED_cmd_cls
%endif ; IMPORT_cmd_cls

Inside the lib directory, every directory other than qdot belongs to the standard library. Each of them is associated with a major area of application, and inside them one can find the standard library’s modules, each corresponding to a specialization within its respective area. Currently the standard library comprises the following directories:

kernel: I/O, memory handling, bootstrapping and debugging;
math: Mathematical functions;
os: OS-dependent routines;
ui: High-level user interface procedures;

In order to avoid clashes in the global name space, by convention, every function symbol has a prefix which corresponds to the module to which it pertains. For instance, the above video_cls function pertains to the module video.qdt, that in turn is contained inside the kernel directory.

Below, functions are presented with the following syntax:

%$r0, ..., %$rn, func, %$a0, ..., %$an: imperative text demonstrating behavior;

where:

func is the function symbol;
%$r0,…,%$rn are the function return values;
%$a0,…,%$an are the function arguments;

Return values and arguments may be omitted in case they are not expected by the particular function at hand. An imperative text goes along with the function demonstrating how it behaves. For the sake of simplicity, each return value is regarded as a variable to which the function can assign to, in order to return a value at its respective position.

Single-line macros are presented with the syntax:

macro(a0,...,an): imperative text demonstrating expansion;

And multi-line macros with:

macro a0, ..., an: imperative text demonstrating high-level meaning;

where a0…an are the arguments of the macros, that may be omitted in case the macro at hand doesn’t expect them. An imperative text of expansion or a high-level meaning goes along each macro description. Notice that single-line and multi-line macros accepting no argument can’t be distinguished by the syntax presentation, in which case that might be done in the text, if necessary at all.

kernel/memory.qdt

This module could have been called array.qdt, but for the sake of consistency with its hardware-oriented kernel fellows, it has been dubbed memory.qdt. This module defines several functions useful in processing logical memory structures called “arrays”, like characters (dimension 0), strings (dimension 1) and higher-dimensional arrays.

The standard library defines a string as a sequence of zero or more characters terminated by END. There is a macro specifically employed to define strings:

string: define an END terminated sequence of characters;

It’s used like this:

label:
  string 'This is a string!'

The label symbol is used to refer to the string’s beginning. Besides END, there are two more characters defined by the standard library as having special meaning in strings: LF and COLOR_ESC. The former terminates a row and the latter introduces an embedded text attribute code. Both are handled as expected by memory and video routines. Normally, one doesn’t need to use them directly because there are higher-level macros to help (for COLOR_ESC see the cor and bcor macros of kernel/video.qdt):

row: define an LF terminated sequence of characters;

It’s used like this:

row 'This is a row!'

However, that is not a complete string because it lacks the terminator character. In order to define a multi-row and complete string one needs to use the array definition macros:

array: open an array definition block;
endarray: close an array definition block;

Use it like this:

label:
  array
    row "This is the first row."
    row "This is the second row."
    row "This is the third row."
  endarray

That is a 1-dimensional array, that is to say, a string. Moreover, it’s possible to define an array of strings, namely, a 2-dimensional array:

label:
  array
    string "This string is at index 0"
    string "This string is at index 1"
    string "This string is at index 2"
  endarray

Or even an array of 2-dimensional arrays — a 3-dimensional array:

label:
  array

    array
      string "This is an 1-dimensional element at 0,0"
      string "This is an 1-dimensional element at 0,1"
      string "This is an 1-dimensional element at 0,2"
    endarray

    array
      string "This is an 1-dimensional element at 1,0"
      string "This is an 1-dimensional element at 1,1"
      string "This is an 1-dimensional element at 1,2"
    endarray

    array
      string "This is an 1-dimensional element at 2,0"
      string "This is an 1-dimensional element at 2,1"
      string "This is an 1-dimensional element at 2,2"
    endarray

  endarray

And so on, recursively, to arbitrarily high dimensionality. The general definition given by the standard library for an array of dimension n is “a sequence of characters terminated by n consecutive END characters”. Notice that, as in the string case, the label symbol is used as a reference to the array itself. With those definitions in mind we can begin to explore the functions available for array processing.

The two general array functions that can work with arrays of arbitrary dimensions are:

%$len, memory_array_len, %$array, %$dim: supposing %$array has %$dim dimensions, assign the number of elements of dimension %$dim minus one contained in it to %$len;
%$ptr, memory_array_elem, %$array, %$dim, %$i: supposing %$array has %$dim dimensions, assign the address of the element at index %$i to %$ptr;
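
The recursive array definition can be modeled in a few lines. The Python sketch below is not the library’s implementation: it assumes END is the byte value 0 (the actual value is not stated here), represents memory as a bytes object, and uses the hypothetical names skip, array_len and array_elem, with positions given as offsets into that object rather than real addresses.

```python
END = 0  # assumption: the real value of END is defined by the library


def skip(data, pos, dim):
    """Return the offset just past the array of dimension `dim` starting at `pos`."""
    if dim == 0:
        return pos + 1  # a character occupies a single byte
    while data[pos] != END:
        pos = skip(data, pos, dim - 1)
    return pos + 1  # step over this array's own END


def array_len(data, pos, dim):
    """Count the elements of dimension `dim` - 1 in the array at `pos`."""
    count = 0
    while data[pos] != END:
        pos = skip(data, pos, dim - 1)
        count += 1
    return count


def array_elem(data, pos, dim, i):
    """Return the offset of the element at index `i` of the array at `pos`."""
    for _ in range(i):
        pos = skip(data, pos, dim - 1)
    return pos


# Three strings followed by the closing END of the enclosing array.
memory = b"ab\0cd\0ef\0\0"
print(array_len(memory, 0, 2))      # 3
print(array_elem(memory, 0, 2, 1))  # 3, the offset of "cd"
print(array_len(memory, 3, 1))      # 2, the length of "cd"
```

Note that the dimension has to be passed in on every call, since the byte sequence alone does not say how many nested levels it contains.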

Notice that, because the array definition is so broad, there is no way for a general array function to unambiguously identify the dimensionality of an array just by inspecting its contents, since there is no hard end to it. That meta-information has to be stored externally and provided in each function call. Below are the restricted functions for 1-dimensional arrays (strings) and 0-dimensional arrays (characters).

A string has a property called “length”, that is defined as “how far the END character is placed from its beginning”. Intuitively, that’s how many characters there are in a string excepting the terminator character.

%$len, memory_strlen, %$str: assign the length of the string %$str to %$len;

Another string property is called “width”, that is roughly defined as “the maximum number of characters between two adjacent LF characters within a string”. Intuitively, that’s how much horizontal space is required to draw that string on screen.

%$width, memory_strwidth, %$str: assign the width of the string %$str to %$width;

Similarly, there is yet another string property called “height”, that is roughly defined as “the number of LF characters within a string”. Intuitively, that’s how much vertical space is required to draw that string on screen.

%$height, memory_strheight, %$str: assign the height of the string %$str to %$height;
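
The three string properties can be illustrated with a short Python model. This is a sketch built from the rough definitions above, assuming END is the byte 0 and LF is ASCII 10. The library's handling of edge cases, such as embedded color escape codes, is not covered, and the function names are hypothetical.

```python
END = 0    # assumption: the END character is the NUL byte
LF = 0x0A  # ASCII line feed


def strlen(buf):
    """Distance from the start of the string to its END character."""
    return buf.index(END)


def strwidth(buf):
    """Length of the longest run of characters between LF characters."""
    lines = buf[:strlen(buf)].split(bytes([LF]))
    return max(len(line) for line in lines)


def strheight(buf):
    """Number of LF characters in the string."""
    return buf[:strlen(buf)].count(bytes([LF]))


banner = b"ab\nabcd\nx\0"
print(strlen(banner), strwidth(banner), strheight(banner))
```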

To ensure a character is uppercase you can use the function:

%$uchar, memory_uppercase_char, %$char: assign the uppercase correspondent of %$char, in case it’s in the range a–z, otherwise %$char unchanged, to %$uchar;

To apply that procedure in-place to each character of a given string one may use:

memory_uppercase_str, %$str: convert all lowercase characters of the %$str string to uppercase;

Another string transformation function exists to obfuscate string contents in a reversibly symmetric way.

memory_rot47, %$str: rotate all printable ASCII characters (but space) from string %$str by 47 positions. It’s its own inverse, that is, applying it twice one obtains the original string;
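
As an illustration of why the transformation is its own inverse, here is a Python sketch of the usual ROT47 scheme: the 94 printable ASCII characters from '!' to '~' are rotated by 47, which is half of 94. It models the behaviour described above and is not taken from the library source.

```python
def rot47(text):
    """Rotate printable ASCII characters (except space) by 47 positions."""
    out = []
    for ch in text:
        code = ord(ch)
        if 33 <= code <= 126:  # '!' through '~', 94 characters
            code = 33 + (code - 33 + 47) % 94
        out.append(chr(code))
    return "".join(out)


secret = rot47("Hello, World!")
print(secret)
print(rot47(secret))  # applying it twice restores the original
```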

One can also compare two strings to see if they are equal, that is, whether they have the same length and character sequence:

%$eq, memory_streq, %$str0, %$str1: assign ?TRUE to %$eq if string %$str0 is equal to string %$str1, ?FALSE otherwise;

It’s also possible to copy one string to another memory location:

memory_copy_str, %$dest, %$orig: copy the string at %$orig to the %$dest address;

Sometimes it’s useful to obtain indexes for specific characters within strings. That can be accomplished with the following function:

%$i, memory_str_char_index, %$str, %$char: assign the index of the first occurrence of character %$char within string %$str, or -1 in case there is none, to %$i;

On the other hand, one can find the first character that doesn’t match a given character for arbitrary memory locations.

%$nptr, memory_skip_char, %$ptr, %$char: assign the address of the first character not equal to %$char starting at %$ptr to %$nptr. Notice that this function doesn’t care about string boundaries;

Related to this:

%$count, memory_charseq_len, %$ptr, %$char: assign the number of consecutive occurrences of %$char at %$ptr to %$count;

See the file lib/kernel/memory.qdt for the implementation details of kernel memory routines.

kernel/video.qdt

This module is used to query and set some video properties, handle screen cursor positioning and draw to the screen. Before any video operation can be done, the video subsystem has to be probed and initialized by the following routine:

video_init: probe and initialize the video subsystem. This routine supports the holy trinity of IBM-PC graphics adapters: CGA, EGA and VGA, which it sets to the maximum resolution available: 25, 43 and 50 rows, respectively.

There are a dozen global variables used to get and set general video properties that will regulate how video functions handle the display. After calling video_init the following two pairs of read-only global variables are available for getting video resolution information:

video_rows: number of rows the screen currently shows;
video_cols: number of columns the screen currently shows;
video_maxrow: its value is [video_rows] minus one;
video_maxcol: its value is [video_cols] minus one;

The following read/write global variables are initialized as well, but to default values based upon the above variables. However, they can be changed at will in order to drive any applicable video function.

video_page: video page in which operations take place. It defaults to 0;
video_win_rows: number of rows of the current window. It defaults to [video_rows];
video_win_cols: number of columns of the current window. It defaults to [video_cols];
video_win_minrow: number of the top-most row of the current window. It defaults to 0;
video_win_mincol: number of the left-most column of the current window. It defaults to 0;
video_win_maxrow: number of the bottom-most row of the current window. It defaults to [video_maxrow];
video_win_maxcol: number of the right-most column of the current window. It defaults to [video_maxcol];
video_win_color: default text attribute for the current window. It defaults to color(LIGHT_GRAY,BLACK);

Often it’s necessary to specify text attributes for output procedures. The following equates define colors that can be used for specifying foreground and background colors: BLACK, BLUE, GREEN, CYAN, RED, MAGENTA, BROWN, LIGHT_GRAY, DARK_GRAY, LIGHT_BLUE, LIGHT_GREEN, LIGHT_CYAN, LIGHT_RED, LIGHT_MAGENTA, YELLOW, WHITE.

A well-defined text attribute is specified by three parameters: the foreground color, the background color, and the blinking text status. The foreground and background colors are specified by the above equates. The following macros assist in making a complete text attribute specification.

color(fore,back): expand to the text attribute that has fore as its foreground color, back as its background color, and non-blinking text;
bcolor(fore,back): expand to the text attribute that has fore as its foreground color, back as its background color, and blinking text;

Text attributes may also be embedded in strings so as to make it easy to draw colored text messages or ASCII art by means of the video_draw_str function. The following macros are used in string definitions.

cor(fore,back): expand to the string color escape code and text attribute that has fore as its foreground color, back as its background color, and non-blinking text;
bcor(fore,back): expand to the string color escape code and text attribute that has fore as its foreground color, back as its background color, and blinking text;

To enable blinking text a trade-off has to be made. One has to give up the following background colors: DARK_GRAY, LIGHT_BLUE, LIGHT_GREEN, LIGHT_CYAN, LIGHT_RED, LIGHT_MAGENTA, YELLOW, WHITE, because in that case the intensity bit becomes the blinking bit. When blinking is enabled, DARK_GRAY maps to BLACK, YELLOW maps to BROWN, WHITE maps to LIGHT_GRAY, and every light color maps to its non-light version. In order to enable or disable blinking text use the function video_blink_mode, which takes a boolean indicating whether to enable blinking mode as its sole parameter and returns nothing.
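
The trade-off follows from the layout of the standard PC text attribute byte: bits 0 to 3 hold the foreground color, bits 4 to 6 the background color, and bit 7 is either background intensity or blink. The Python sketch below models that layout. It assumes the color equates take the conventional values 0 to 15 in the order listed above, which the library source should confirm, and the function names are hypothetical.

```python
COLORS = [
    "BLACK", "BLUE", "GREEN", "CYAN", "RED", "MAGENTA", "BROWN",
    "LIGHT_GRAY", "DARK_GRAY", "LIGHT_BLUE", "LIGHT_GREEN", "LIGHT_CYAN",
    "LIGHT_RED", "LIGHT_MAGENTA", "YELLOW", "WHITE",
]  # assumption: equate values follow this conventional order


def attribute(fore, back):
    """Pack foreground and background colors into one attribute byte."""
    return (COLORS.index(back) << 4) | COLORS.index(fore)


def decode(attr, blink_mode):
    """Return (foreground, background, blinking) as the adapter shows it."""
    fore = COLORS[attr & 0x0F]
    if blink_mode:
        # Bit 7 is the blink flag, so only 8 background colors remain.
        return fore, COLORS[(attr >> 4) & 0x07], bool(attr & 0x80)
    return fore, COLORS[(attr >> 4) & 0x0F], False


attr = attribute("WHITE", "YELLOW")
print(decode(attr, blink_mode=False))
print(decode(attr, blink_mode=True))
```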

Sometimes it’s useful to draw to a video page that’s not the current one, for instance, to make an off-screen drawing, so the user doesn’t notice the flickering caused by direct drawing to a visible spot, or to show transitory information without changing the current page’s content. Every video function will work on the video page given by the video_page global variable, which user code may modify. Of course, that’s not necessarily the current video page but one may make it so by calling the following function:

video_select_page, %$p: set %$p as the current video page and make it the operational page by storing it in [video_page];

When drawing to screen, one may want to disable the screen cursor and only enable it when reading input from keyboard. To that end, there are a couple of functions:

video_disable_cursor: disable video cursor;
video_enable_cursor: enable video cursor;

The video module has a rich set of cursor position related routines. Those are divided into two classes: getting and setting. In the getting class one finds functions that return cursor positions:

%$r, video_row: assign the current cursor position row to %$r;
%$c, video_col: assign the current cursor position column to %$c;
%$r, %$c, video_pos: assign the current cursor position row to %$r and column to %$c;

Given a string it’s also possible to calculate the starting row or column at which it has to be drawn in order to appear centered on the screen.

%$r, video_cent_str_row, %$str: assign the starting row at which the string %$str has to be drawn in order to appear vertically centered on the screen to %$r;
%$c, video_cent_str_col, %$str: assign the starting column at which the string %$str has to be drawn in order to appear horizontally centered on the screen to %$c;

In the setting class one can find a larger variety of procedures:

video_setpos, %$r, %$c: set the cursor position to the row %$r and column %$c;
video_setrow, %$r: set the cursor position to the row %$r and the current column;
video_setcol, %$c: set the cursor position to the current row and the column %$c;
video_setpos_rel, %$dr, %$dc: set the cursor position to the current row plus %$dr and the current column plus %$dc. Both parameters may be negative integers;
video_setrow_cent: set the cursor position to the row at the middle of the current window, and the current column;
video_setcol_cent: set the cursor position to the current row, and the column at the middle of the current window;

Furthermore, there are two pairs of routines that aid in terminal-like output:

video_next_col: if not at the current window’s last column, advance cursor by one column. Otherwise, if not at the current window’s last row, advance cursor by one row. Otherwise, scroll up the current window by one row and rewind the cursor to the current window’s first column;
video_prev_col: if not at the current window’s first column, rewind cursor by one column. Otherwise, if not at the current window’s first row, rewind cursor by one row and advance cursor to the current window’s last column. Otherwise do nothing;
video_next_row: if not at the current window’s last row, advance cursor by one row. Otherwise scroll up the current window by one row;
video_prev_row: if not at the current window’s first row, rewind cursor by one row. Otherwise do nothing;

As you may have noticed in these functions, the current window may be scrolled. That’s the job of the following procedures:

video_scroll_up_win, %$rows: scroll up the current window by %$rows;
video_scroll_down_win, %$rows: scroll down the current window by %$rows;

Finally, it’s possible to draw to the screen by using the following routines:

video_cls: clear the current window and put cursor at the minimum allowable position;
video_draw_char, %$char, %$color, %$count: draw the character %$char, with text attribute %$color, %$count times starting at the current cursor position;
video_draw_char_hfull, %$char, %$color, %$dy0, %$dy1: draw the character %$char, with text attribute %$color, filling from the first to the last column of the current row plus %$dy0 to the current row plus %$dy1. The parameters %$dy0 and %$dy1 may be negative;
video_draw_str, %$str, %$color: draw the string %$str, with text attribute %$color, starting at the current cursor position. The %$color parameter may be overridden by color escape sequences (see cor and bcor macros) embedded within the string;
video_draw_str_hcent, %$str, %$color: same as video_draw_str, but center the string horizontally;
video_draw_str_vhcent, %$str, %$color: same as video_draw_str, but center the string horizontally and vertically;

Reciprocally, it’s also possible to read characters that have been drawn to the screen.

%$char, %$color, video_read_char: assign the character at the current cursor position to %$char and its text attribute to %$color;

See the file lib/kernel/video.qdt for the implementation details of kernel video routines.

kernel/keyboard.qdt

This module handles keyboard input. Prior to reading any input one might want to set up the keyboard delay and rate of repetition.

keyboard_mindelay_maxrate: set keyboard delay and rate of repetition to the minimum and maximum values allowed, respectively;

For reading from keyboard a character at a time, one can use these routines:

%$char, keyboard_read_char: in case the keyboard buffer is non-empty, remove its next character and assign that to %$char, otherwise wait for it to become non-empty and then repeat this process;
%$char, keyboard_check_char: in case the keyboard buffer is non-empty, assign its next character to %$char, otherwise assign -1 to %$char;

There are some equates, that the standard library defines, corresponding to relevant keyboard keys: RETURN and BACKSPACE. In order to read an entire string at once there is a specialized procedure:

keyboard_read_str, %$buffer, %$max, %$color, %$outchar: read at most %$max printable ASCII characters, discarding the others, from the keyboard buffer, waiting if it is or becomes empty, and put them in %$buffer, until RETURN is read. If BACKSPACE is read, discard the last character put into %$buffer, in case there is one, otherwise do nothing. If %$outchar is PRINT, output each read character to screen with text attribute %$color, else if it is NOPRINT suppress output, otherwise output the character in %$outchar with text attribute %$color. After returning %$buffer is a string (END terminated character sequence), and there is no RETURN character in it;

Notice that a value of, let’s say, '*' in %$outchar is useful to implement hidden password input. Similarly, NOPRINT in %$outchar can be used when revealing the password length is a concern.
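
The editing rules of keyboard_read_str can be modelled in a few lines of Python. The sketch below feeds key codes from a list instead of the BIOS keyboard buffer and returns both the resulting buffer and what would have been echoed. It assumes RETURN and BACKSPACE are ASCII 13 and 8, that printable characters arriving after %$max is reached are discarded, and that BACKSPACE also removes the last echoed character. PRINT and NOPRINT are stand-in sentinels, not the library's actual equate values.

```python
RETURN = 0x0D       # assumption: ASCII carriage return
BACKSPACE = 0x08    # assumption: ASCII backspace
PRINT = object()    # hypothetical stand-ins for the PRINT
NOPRINT = object()  # and NOPRINT equates


def read_str(keys, max_len, outchar=PRINT):
    """Return (buffer, echoed) after processing keys until RETURN."""
    buffer, echoed = [], []
    for key in keys:
        if key == RETURN:
            break
        if key == BACKSPACE:
            if buffer:  # nothing to erase on an empty buffer
                buffer.pop()
                if outchar is not NOPRINT:
                    echoed.pop()
            continue
        if not 0x20 <= key <= 0x7E or len(buffer) >= max_len:
            continue  # discard non-printable or excess characters
        buffer.append(chr(key))
        if outchar is PRINT:
            echoed.append(chr(key))
        elif outchar is not NOPRINT:
            echoed.append(outchar)  # mask character, e.g. '*'
    return "".join(buffer), "".join(echoed)


typed = [ord("a"), ord("b"), BACKSPACE, ord("c"), RETURN]
print(read_str(typed, 8, "*"))
```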

As you may have noticed, the above keyboard input routines don’t read characters directly from the keyboard but rather from its buffer. More often than not one wants to ensure its buffer is empty prior to invoking these procedures, so as to synchronize reading and inputting, giving the impression input is being read directly from the keyboard. To that end, the remaining routines deal with flushing the keyboard buffer.

keyboard_flush_buffer: empty the keyboard buffer;
keyboard_flush_buffer_from_char, %$char: keep discarding %$char from buffer until a different character is read;
%$b, keyboard_flush_buffer_from_char_with_resistence, %$char, %$resist: if %$char is read %$resist times in a row, assign ?TRUE to %$b, otherwise assign ?FALSE. Discard all occurrences of %$char from the buffer;

See the file lib/kernel/keyboard.qdt for the implementation details of kernel keyboard routines.

kernel/timer.qdt

This module has facilities that enable programs to wait synchronously or asynchronously for a given amount of time. The synchronous wait functions are:

timer_sleep, %$t: wait %$t clock ticks before returning. A second has approximately 18.2 clock ticks;
timer_wait, %$m: wait %$m milliseconds before returning.

The asynchronous wait function is:

%$b, timer_alarm, %$t: if %$t is non-zero, set the alarm to %$t clock ticks in the future and return nothing, else assign ?TRUE to %$b if the alarm went off, otherwise assign ?FALSE to it.

Notice that, for precision’s sake, this function has to be called frequently enough to make the time between calls insignificant in comparison to the wait time.

The time can be specified in seconds to the timer_sleep and timer_alarm functions by means of an intermediate macro:

timer_seconds2ticks(s): expand to the number of clock ticks in s seconds.
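
The 18.2 figure comes from the PC timer hardware: the 1,193,182 Hz timer input clock divided by 65,536 gives roughly 18.2065 ticks per second. The Python sketch below shows the arithmetic. How the QDot macro rounds its result is not stated here, so the rounding to the nearest tick is an assumption, and the function names are hypothetical.

```python
PIT_HZ = 1193182                   # PC timer input clock in Hz
TICKS_PER_SECOND = PIT_HZ / 65536  # about 18.2065


def seconds_to_ticks(seconds):
    """Convert seconds to clock ticks (assumed: nearest whole tick)."""
    return round(seconds * TICKS_PER_SECOND)


def ticks_to_seconds(ticks):
    """Convert clock ticks back to seconds."""
    return ticks / TICKS_PER_SECOND


print(seconds_to_ticks(10))
print(ticks_to_seconds(182))
```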

See the file lib/kernel/timer.qdt for the implementation details of kernel timer routines.

kernel/speaker.qdt

This module handles the internal IBM-PC speaker. Currently it’s comprised solely of the following function:

speaker_beep, %$f, %$t: produce a tone of frequency %$f for the duration of %$t clock ticks.

See the file lib/kernel/speaker.qdt for the implementation details of kernel speaker routines.

kernel/disk.qdt

This module provides low-level disk access and is used by the kernel/boot.qdt module when booting from floppies, hard disks, or something that emulates them.

%$spt, %$tpc, disk_get_parameters, %$disk: assign the number of sectors per track and tracks per cylinder of disk %$disk to %$spt and %$tpc, respectively;
%$c, %$h, %$s, disk_sector_to_chs, %$disk, %$sector: convert the logical sector %$sector of disk %$disk into its CHS address, given by %$c, %$h and %$s;
disk_read_sectors, %$buffer_segment, %$buffer_offset, %$disk, %$start_sector, %$count: read %$count sectors starting at the logical sector %$start_sector from disk %$disk to the memory buffer segment and offset given by %$buffer_segment and %$buffer_offset, respectively;
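
The conversion performed by disk_sector_to_chs is the classic logical-sector-to-CHS mapping. A Python sketch of that arithmetic follows, assuming logical sectors are numbered from 0 and CHS sector numbers from 1, as the BIOS disk services expect. The function name and parameter order are hypothetical.

```python
def sector_to_chs(sector, sectors_per_track, tracks_per_cylinder):
    """Convert a 0-based logical sector into (cylinder, head, sector)."""
    track, s = divmod(sector, sectors_per_track)
    cylinder, head = divmod(track, tracks_per_cylinder)
    return cylinder, head, s + 1  # CHS sectors are numbered from 1


# Geometry of a 1.44 MB floppy: 18 sectors per track, 2 heads.
print(sector_to_chs(0, 18, 2))
print(sector_to_chs(2879, 18, 2))
```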

See the file lib/kernel/disk.qdt for the implementation details of kernel disk routines.

kernel/boot.qdt

This module is intended to provide the run-within-OS-or-bootstrap-itself-without-OS capability for programs using the standard library. It’s special and must be used differently from the other modules. To use it one has to make two changes: its symbol boot_sector must be imported at the entry point block of the main module, and the assembler must be instructed to pad the final binary with zeroes so its size becomes sector-aligned (a multiple of 512 bytes).

With respect to the first requirement, the following code must be put right after the inclusion of the QDot language definition (file qdot/qdot.qdt):

?import boot_sector
%include "kernel/boot.qdt"

As for the second requirement, the following code must be placed immediately above the beginning of the .bss section.

times 512 - ($ - $$) % 512 db 0
PROGRAM_END:
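
The padding expression can be checked with a little arithmetic. In the Python sketch below, size stands for $ - $$, the number of bytes assembled so far. Note that, as written, the expression emits a full extra sector of zeroes when the size is already a multiple of 512. The result is still sector-aligned.

```python
SECTOR = 512


def padding(size):
    """Number of zero bytes emitted by: times 512 - (size) % 512 db 0."""
    return SECTOR - size % SECTOR


for size in (1, 700, 1024):
    total = size + padding(size)
    print(size, padding(size), total, total % SECTOR == 0)
```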

Borrowing the example from the Symbol importing section, one should finally have:

%push SRC_PROG_QDT ; push a preprocessor context for this module

%ifndef SRC_PROG_QDT ; ensure this block won't be included twice
%define SRC_PROG_QDT

CPU 8086
ORG 100h

%include "qdot/qdot.qdt" ; enable QDot language

?import boot_sector         ; place the standard library's boot sector at
%include "kernel/boot.qdt"  ; the binary's first 512 bytes

mov sp, 0FFFEh ; setup stack for QDot expressions
mov bp, sp

call main  ; call the main procedure defined in this very module
           ; somewhere outside this block

?import main  ; import the main symbol used above.  This
              ; will expand to all the code it depends upon, and so on
              ; recursively, thus all the program's source code will
              ; be compiled and assembled here
%include "src/prog.qdt"

; Make this program's size a multiple of the sector size (512 bytes)
; so the boot loader can exactly load it.

times 512 - ($ - $$) % 512 db 0

PROGRAM_END: ; This global symbol is used by the boot loader to
             ; calculate the binary's size

SECTION .bss ; here is the binary end
.
.
.
%endif ; SRC_PROG_QDT
.
. ; here are all importable symbol blocks defined by the
. ; main module
.
%pop SRC_PROG_QDT ; the module ends here

After these changes your program becomes simultaneously a bootable image and a valid DOS executable. It can be booted from a floppy disk, a hard drive, an optical disk or even a USB mass storage device. In the case of optical disks, you can generate an ISO image using the resulting executable as a valid El Torito no-emulation boot image. For all other cases it should suffice to write the executable to the drive, starting at its very first sector.

See the file lib/kernel/boot.qdt for the implementation details of kernel boot routines.

kernel/debug.qdt

This module provides primitive debugging support for the boot process, since a proper debugger is not usually available at that stage.

When investigating the root of a problem, it’s important to inspect variable values and print information to the screen.

debug_print_char, %$char: print the character in the low byte of %$char in teletype mode;
debug_print_nibble, %$n: print the low nibble of %$n in hexadecimal;
debug_print_byte, %$n: print the low byte of %$n in hexadecimal;
debug_print_word, %$n: print the word value %$n in hexadecimal;
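
The three hexadecimal routines naturally build on one another: a byte is two nibbles and a word is two bytes. The Python sketch below models that layering. Whether the library prints uppercase or lowercase digits is not stated here, so the uppercase digit table is an assumption, and the function names are hypothetical.

```python
DIGITS = "0123456789ABCDEF"  # assumption: uppercase hexadecimal digits


def hex_nibble(n):
    """Hexadecimal digit for the low nibble of n."""
    return DIGITS[n & 0x0F]


def hex_byte(n):
    """Two hexadecimal digits for the low byte of n."""
    return hex_nibble(n >> 4) + hex_nibble(n)


def hex_word(n):
    """Four hexadecimal digits for the low word of n."""
    return hex_byte(n >> 8) + hex_byte(n)


print(hex_nibble(0x3C), hex_byte(0x1A2B), hex_word(0x1A2B))
```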

See the file lib/kernel/debug.qdt for the implementation details of kernel debug routines.

math/random.qdt

This module provides a 32-bit Galois LFSR pseudo-random number generator. In order to use it one has to first initialize it.

random_init: seed the random number generator with the current time;

Then, one can generate random numbers with:

%$rnd, random_number, %$a, %$b: assign to %$rnd a random number between %$a and %$b inclusive;
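
For readers unfamiliar with the technique, here is a Python sketch of a 32-bit Galois LFSR and one simple way to map its state onto an inclusive range. The tap mask 0x80200003 (polynomial x^32 + x^22 + x^2 + x + 1) is a commonly cited maximal-length choice and is an assumption here. The taps, seeding and range mapping actually used by the module should be checked in its source. The class and function names are hypothetical.

```python
TAPS = 0x80200003  # assumption: x^32 + x^22 + x^2 + x + 1


def lfsr_step(state):
    """Advance a 32-bit Galois LFSR by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAPS  # apply the feedback taps when a 1 is shifted out
    return state


class Random:
    def __init__(self, seed):
        # The all-zero state is a fixed point, so it must be avoided.
        self.state = (seed & 0xFFFFFFFF) or 1

    def number(self, a, b):
        """Return a pseudo-random integer between a and b inclusive."""
        self.state = lfsr_step(self.state)
        return a + self.state % (b - a + 1)


rng = Random(seed=12345)
print([rng.number(1, 6) for _ in range(10)])
```

The modulo mapping is slightly biased when the range size does not divide 2^32 - 1, which rarely matters for animations and games.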

See the file lib/math/random.qdt for the implementation details of math random functions.

os/dos.qdt

This module handles functions specific to DOS.

%$b, dos_check_if_running: assign ?TRUE to %$b if DOS is running, ?FALSE otherwise;
dos_exit: quit program and return to DOS;

See the file lib/os/dos.qdt for the implementation details of DOS functions.

ui/prompt.qdt

This module is used to present prompts to the user and read back their input.

prompt_read_str, %$str, %$buffer, %$max, %$color, %$outchar: print the string %$str, which may have text attribute codes, and call keyboard_read_str with the remaining arguments;
prompt_read_str_vhcent, %$dy, %$dx, %$str, %$buffer, %$max, %$color, %$outchar: center prompt horizontally and vertically on the screen, call video_setpos_rel with arguments %$dy and %$dx, and call prompt_read_str with the remaining arguments;
%$b, prompt_yes_or_no, %$str, %$char_yes, %$char_no, %$char_default, %$color: print the string %$str and the character %$char_default, with default text attribute %$color, making %$char_default the current selection, and wait for keyboard input that matches %$char_yes, %$char_no, or RETURN. In the first two cases, change the current selection to the respective character, replacing the previous one on screen. In the last case, assign ?TRUE to %$b if the current selection is %$char_yes, or ?FALSE if it is %$char_no. If any other character is input, ring the bell and keep waiting.

See the file lib/ui/prompt.qdt for the implementation details of user interface prompt functions.

ui/command.qdt

This module provides basic command-line processing capabilities. To understand it, the following definitions are fundamental.

A token is any sequence of characters within a string that doesn’t contain the space character.
Tokenization is the process of identifying tokens in a string and copying them to an array, whose elements will be each token in the order they are found in the original string.

Considering that, to tokenize a string one can use:

command_extract_args, %$cmd_args, %$cmd_buffer: tokenize the string %$cmd_buffer into the array %$cmd_args. The memory region pointed to by %$cmd_args must have the same size as %$cmd_buffer.

The standard library is capable of invoking commands automatically, given a command-line string. In order to do that, however, it’s necessary to define a command table using the following macros:

cmdtable: open a command table definition;
cmd name, func: map the command whose name is given by the string name to the function func;
endcmdtable: close a command table definition;

For example, consider the command table, where prog_cmdtable is the symbol used to refer to it:

prog_cmdtable:
  cmdtable
    cmd 'VER', cmd_ver
    cmd 'HELP', cmd_help
    cmd 'EXIT', cmd_exit
  endcmdtable

Then, the following routine is used to automatically invoke any of these possible commands:

%$b, command_run, %$cmdargs, %$cmdstr, %$cmdtable: tokenize the command string %$cmdstr into the %$cmdargs array, which must have the same size as the former, then look up the command named by the first element of %$cmdargs in the command table %$cmdtable. If it exists, call the associated function and assign ?TRUE to %$b; otherwise do nothing and assign ?FALSE to %$b. A command function is called with %$cmdargs as its sole argument, so the function has access to its command-line arguments;

Therefore, faced with the string "HELP VER" in %$cmdstr, command_run would call cmd_help like this:

? call, cmd_help, %$cmdargs

and return ?TRUE, whilst for the string "QUIT" it would just return ?FALSE right away.
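
The tokenize-then-dispatch flow can be modelled compactly in Python. The sketch below is an analogy rather than the library's implementation: a dictionary stands in for the command table, a list of strings for the token array, and the handler bodies are invented for illustration.

```python
def extract_args(cmdstr):
    """Split a command string into its space-separated tokens."""
    return [token for token in cmdstr.split(" ") if token]


def command_run(cmdstr, cmdtable):
    """Call the function mapped to the first token; report whether it exists."""
    args = extract_args(cmdstr)
    if not args or args[0] not in cmdtable:
        return False
    cmdtable[args[0]](args)  # the handler receives all tokens
    return True


calls = []  # records handler invocations for demonstration purposes

prog_cmdtable = {
    "VER": lambda args: calls.append(("ver", args)),
    "HELP": lambda args: calls.append(("help", args)),
    "EXIT": lambda args: calls.append(("exit", args)),
}

print(command_run("HELP VER", prog_cmdtable))
print(command_run("QUIT", prog_cmdtable))
print(calls)
```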

See the file lib/ui/command.qdt for the implementation details of user interface command-line functions.

ui/progbar.qdt

This module assists in drawing progress bars used to represent an ongoing process.

progbar_draw, %$size, %$ticks, %$callback, %$callback_data, %$numcalls: draw a progress bar of size %$size that takes %$ticks clock ticks to fill (ignoring the callback processing time). Evenly distribute throughout this time %$numcalls calls to the function %$callback, passing to it %$callback_data as its only argument;
progbar_draw_hfull, %$ticks, %$callback, %$callback_data, %$numcalls: call progbar_draw with a full-width size and the remaining arguments;
progbar_draw_r, %$size, %$ticks, %$callback, %$callback_data, %$numcalls: right align cursor and call progbar_draw with all arguments;

See the file lib/ui/progbar.qdt for the implementation details of user interface progress bar functions.

ui/scrnsvr.qdt

This module has animation procedures intended to be used as screen-savers by applications.

scrnsvr_dcmatrix, %$forever: start DeciMatrix screen-saver. If %$forever is ?FALSE return at any key press, otherwise never return;

See the file lib/ui/scrnsvr.qdt for the implementation details of user interface screen-saver functions.