Ms Dos - Looking For An Open Source DOS .Com Program Written in Assembly - Retrocomputing Stack Exchange
I'm writing a NASM-compatible assembler targeting the Intel 8086, and I'm looking for an existing open
source program written in assembly, with which I can showcase the capabilities (and understand the
shortcomings) of my assembler. My plan with the source code is to convert it to NASM syntax manually,
and then compile it with both NASM and my assembler, look at the difference, and see which subset of
NASM is still missing or incorrect in my assembler.
It should be an existing program for DOS 8086 or DOS 286, whose assembly source is available,
and doesn't contain newer-than-286 instructions, protected mode instructions or floating point
instructions. (Programs requiring a 386 or 8087 are not suitable.)
Preferably, the executable program should be a DOS .com file. My assembler doesn't support DOS
.exe directly, but an .exe shorter than 64 KiB can be arranged by emitting the .exe header with dw
instructions.
Preferably, the executable program size should be between 10 and 50 KiB, ideally near 30 KiB.
There are some boot sector games by Oscar Toledo G., e.g. F-Bird and others by the same GitHub
user, but they are only 512 bytes or less.
The source code shouldn't make heavy use of assembler macros. (Conditional assembly and
symbolic constants (e.g. NASM %define answer 42) are fine though.) Thus GW-BASIC dated
1983-02-10 is not a good candidate, because it uses lots of MASM macros.
The full source code (accounting for every single byte of the program file) should be available. It's
OK if a few bytes are missing, I can add them from the disassembly.
Preferably the program shouldn't be a development tool (e.g. assembler, compiler, linker or
interpreter), an operating system (e.g. msdos.sys, command.com, himem.sys) or a system
maintenance tool (e.g. format.com, chkdsk.com, antivirus), but it should be a game or some other
interactive productivity tool (e.g. text editor, picture editor, e-learning tool). This way it can be
showcased in a DOS emulator as something directly useful on its own.
I was considering:
The game Arcade Volleyball: it's a timeless enjoyable game and it matches the desired file size, but its
source code is not available, and the DOS port was written in Turbo C, not in assembly.
The game Paranoid: it's a timeless enjoyable game, the program file is a 26 KiB .com file, and it
doesn't use 386 instructions. Unfortunately it's not open source.
Volkov Commander 4.05: it's very well-known, its built-in text editor can handle files larger than 64
KiB, the program file is a 47 KiB .com file, and it doesn't use 386 instructions. Unfortunately it's not
open source.
Mostly I write single-line unit tests checking whether a single assembly instruction compiles and
what machine code it generates.
I also write multi-line unit tests with labels and various distances. Such a test checks that a short
jump (with a 1-byte offset) is generated for a label which is not too far away (roughly within ±127
bytes). These optimizations depend on the -O... flag specified on the command line, so I check the
machine code output against the expected bytes at multiple optimization levels.
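To make the short-versus-near decision concrete, here is a minimal Python sketch of how an unconditional 8086 jmp could be encoded. It is not taken from any assembler's source; the function name and the allow_short parameter are hypothetical, and allow_short merely stands in for whatever the chosen optimization level permits.

```python
def encode_jmp(instr_addr: int, target_addr: int, allow_short: bool = True) -> bytes:
    """Encode an unconditional 8086 jmp located at instr_addr (illustrative only)."""
    # A short jump is 2 bytes: opcode EB + rel8. The displacement is
    # measured from the address of the *next* instruction.
    rel8 = target_addr - (instr_addr + 2)
    if allow_short and -128 <= rel8 <= 127:
        return bytes([0xEB, rel8 & 0xFF])
    # Otherwise fall back to a near jump: opcode E9 + rel16 (little-endian),
    # again relative to the end of the (now 3-byte) instruction.
    rel16 = (target_addr - (instr_addr + 3)) & 0xFFFF
    return bytes([0xE9, rel16 & 0xFF, rel16 >> 8])
```

Because the displacement counts from the end of the instruction, the exact short range is -128 to +127 from there, which is why the limit sits right around 127 bytes.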
My goal is for my assembler to be compatible with NASM, and if different versions of NASM
behave differently, then my assembler should match what NASM 0.98.39 compiled for a Linux i386
host does. Thus I also run the unit tests above with various versions of NASM and compare the
machine code output. I don't compare the warning or error messages though.
Accepted behaviors of my assembler in general: (1) both my assembler and NASM 0.98.39
succeed for a particular input + command-line pair, and produce identical machine code in the
output file; (2) either NASM 0.98.39 or my assembler fails (with non-zero exit code) for a particular
input + command-line pair. An example for (2): my assembler fails if it encounters %macro. Another
example for (2): NASM fails for db $*3.
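The two acceptance rules above can be written down as a small predicate. This is only a sketch of the rule as stated; the function and parameter names are made up for illustration and do not come from any real test harness.

```python
from typing import Optional


def is_accepted(nasm_exit: int, nasm_out: Optional[bytes],
                mine_exit: int, mine_out: Optional[bytes]) -> bool:
    """Return True if the pair of results is acceptable under rules (1) and (2)."""
    # Rule (2): if either assembler exits with a non-zero code, the case is
    # accepted regardless of what (if anything) was written to the output.
    if nasm_exit != 0 or mine_exit != 0:
        return True
    # Rule (1): both succeeded, so the output files must be byte-identical.
    return nasm_out is not None and nasm_out == mine_out
```

The only unacceptable outcome is therefore both assemblers succeeding while producing different machine code.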
I'd like to have as few failures (and as many matches) as possible, within my design limitations. An
example design limitation of my assembler: it supports only bits 16, not bits 32 or bits 64.
Sometimes I relax a small design limitation, but the one above is unlikely to change.
ms-dos assembly
edited Dec 15, 2022 at 11:48 by Toby Speight; asked Dec 9, 2022 at 12:13 by pts
Comments are not for extended discussion; this conversation has been moved to chat. – Chenmunka ♦ Dec 12,
2022 at 8:45
The small / tiny text editors section on Free software for DOS includes a few text editors provided with
source code, such as SuperTed. They are all smaller than 10KiB, so they don’t quite fit your
requirements.
Any large DOS freeware / shareware compilation will have a few programs with assembly language
source code; I don’t have any specific examples to highlight but a quick grep through SimtelNet
archives reveals a few candidates.
A number of FreeDOS programs are written in x86 assembly language, including games such as Floppy
Bird and Invaders.
A few demoscene productions for the PC include source code, usually in assembly language, but all
those I’m aware of require a 386.
answered Dec 9, 2022 at 12:47 by Stephen Kitt (edited Dec 9, 2022 at 13:07)
1 Thank you for the links! Floppy Bird program size is 8704 bytes, Invaders program size is 9194 bytes. I'm still
looking for something bigger for the demo. The listings in your answer can keep me busy. – pts Dec 9, 2022 at
19:14
3 The Styx remastered source code might be worth a look, but building the executable involves linking.
– Stephen Kitt Dec 9, 2022 at 19:50
I've also looked at ibiblio.org/pub/micro/pc-stuff/freedos/files/games , and besides Invaders, there was a 2K
(2048-byte .com program) Tetris which I can use, but that's still too small. – pts Dec 10, 2022 at 19:22
2 FreeDOS is a good option, there are quite a few asm programs but you have to ensure you install the source
packages (or everything). – paxdiablo Dec 14, 2022 at 2:51
1 @StephenKitt: Thank you for recommending Styx Remastered! Linking wasn't a big issue, it was easy to hardcode
(with dw instructions) the EXE header to the NASM source. Translating from A86 to NASM syntax was also
straightforward, but it needed a few hours of work. In the end, it has uncovered an optimization bug in my
assembler. Translated source: github.com/pts/mininasm/tree/master/demo/styx – pts Jan 3, 2023 at 4:29
I wasn't going to post it as an answer, because I felt it exceeded the specified criteria, but I wrote a
comment that the OP may enjoy testing their assembler on this actively-developed 8086 open-source
assembly code: https://github.com/Joshua-Riek/x86-kernel. It's a 16-bit real mode kernel.
Given that the OP found it helpful, I'm posting the recommendation as an answer in the event that the
comments get deleted at a future date, and that this may help others.
Another project with code that may be useful is an open-source assembly bootloader targeted at
8086 processors: https://github.com/Joshua-Riek/x86-bootloader. The project is designed to be
assembled with NASM.
answered Dec 11, 2022 at 8:31 by End Anti-Semitic Hate
2 Thank you for both project links! After fixing some bugs in my assembler and rewriting the use of the features my
assembler doesn't support, I've compiled both projects with NASM 0.98.39 and my assembler, and the resulting
executable programs are identical. – pts Dec 11, 2022 at 23:17
3 You're welcome. That's very impressive, and a significant accomplishment. Well done! Do you think you may
choose to release your assembler some day? – End Anti-Semitic Hate Dec 12, 2022 at 7:51
github.com/pts/mininasm with source code and binary releases. – pts Dec 13, 2022 at 4:06
I've taken a look at all .com files of at least 6000 bytes on the FreeDOS 1.2 CD. Here is the list of
programs written in assembly:
32935 NASM devel/insight.zip.ex/progs/insight/insight.com
27566 JWasm base/debug.zip.ex/bin/debugx.com
22476 JWasm base/debug.zip.ex/bin/debug.com
23994 TASM net/fdnet.zip.ex/network/pcntpk.com
16256 TASM util/doslfn.zip.ex/bin/doslfn.com
15373 TASM32 util/wde.zip.ex/bin/wde.com
6002 NASM util/fdshield.zip.ex/bin/fdshield.com
Source code of debug.com and debugx.com (written for JWasm, MASM can also compile it; it uses
macros heavily): https://github.com/Baron-von-Riedesel/DOS-debug
WDE, doslfn.com, debug.com and debugx.com use 32-bit registers (e.g. eax) a lot, so they won't
run on an 8086, and it would be pointless to convert their source code to my assembler.
insight.com (within FreeDOS 1.2) uses 32-bit instructions very sparingly. Maybe it's possible to
patch them out. It uses multi-line macros sparingly. I'm getting about 400 errors from my
assembler. Maybe I can convert it in a couple of hours, and then this modified Insight will be the
largest real-world project my assembler is able to compile so far.
Please note that some .exe programs may also be written in assembly, I'll take a look at them later.
I've taken a closer look at Insight (insight.com is 32935 bytes, most of it is generated from NASM
source), because that was the largest open-source DOS .com program I was able to find. Surprisingly
few changes were necessary to avoid NASM features not supported by my assembler. I've
fixed some bugs, most of them related to local labels. I've even added a few small features (such as
labels starting with ..@ , directive absolute $ and data instruction resw ... ). Finally I discovered a
serious design flaw in the jump size optimizer, or at least my understanding of how it converges to a fixed
point was wrong. I disabled some checks, and it generated the correct insight.com file (bitwise identical
to the one distributed and to the one NASM generates). This is great news: no machine code generation
bugs found! However, I still had to fix the optimizer in the general case. Final case study with full source
code here: https://github.com/pts/insight-mininasm
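For readers wondering what "converging to a fixed point" means for a jump size optimizer, here is a small Python sketch of one common textbook strategy: start with every jump short, then repeatedly grow the ones that turn out not to fit, until a pass changes nothing. This is not the algorithm used by mininasm or NASM; the item format ("label", "data", "jmp" tuples) and the function name are invented for this illustration.

```python
def relax_jumps(items, origin=0x100):
    """Pick a size (2 = short, 3 = near) for each jmp; illustrative only.

    items is a list of ("label", name), ("data", byte_count) or ("jmp", name).
    """
    # Optimistic start: assume every jump fits in the 2-byte short form.
    sizes = {i: 2 for i, item in enumerate(items) if item[0] == "jmp"}
    while True:
        # Step 1: lay out the program using the current size guesses.
        addr, labels, starts = origin, {}, {}
        for i, (kind, arg) in enumerate(items):
            if kind == "label":
                labels[arg] = addr
            elif kind == "data":
                addr += arg
            else:  # "jmp"
                starts[i] = addr
                addr += sizes[i]
        # Step 2: grow every short jump whose target is outside rel8 range.
        changed = False
        for i, start in starts.items():
            if sizes[i] == 2:
                rel = labels[items[i][1]] - (start + 2)
                if not -128 <= rel <= 127:
                    sizes[i] = 3
                    changed = True
        # Sizes only ever grow, so the loop must stop: that is the fixed point.
        if not changed:
            return [sizes[i] for i in sorted(sizes)]
```

A case worth trying is `[("jmp", "end"), ("data", 125), ("jmp", "far"), ("label", "end"), ("data", 200), ("label", "far")]`: growing the second jump pushes the label end one byte further away, which in turn forces the first jump to grow in the next pass. Cascades like this are why a single pass is not enough.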
Oddly enough, games/flpybird/flpybird.com was missing from the FreeDOS 1.2 CD, even though this
package is available in FreeDOS 1.2. It's an 8704-byte DOS .com program written in NASM with only a
few simple macros (which I was able to change to those supported by my assembler), and with a few 186
and 386 instructions. Since I've added 186 instruction support to my assembler recently, I only had to
code the 386 instructions as db or dw when porting flpybird to my assembler. I've also discovered a
parsing bug and a code generation bug in my assembler while making the port, so it was worth my time.
Short case study here: https://github.com/pts/mininasm/tree/master/demo/flpybird
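Coding an unsupported instruction as db simply means spelling out its machine code bytes by hand. As a rough illustration, the hypothetical Python helper below turns a byte string into a db line. Which 386 instructions flpybird actually uses is not stated here, so push fs (encoded as 0F A0) is only an example.

```python
def as_db_line(machine_code: bytes, comment: str = "") -> str:
    """Render raw machine code bytes as a NASM-style db line (illustrative only)."""
    line = "db " + ", ".join("0x%02X" % b for b in machine_code)
    # Keep the original mnemonic as a comment so the source stays readable.
    return line + ("  ; " + comment if comment else "")


# Example: the 386 instruction push fs.
print(as_db_line(b"\x0f\xa0", "push fs"))
```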
The game games/psrinvad/invaders.com was relatively straightforward to port: I just had to search-and-
replace single-line macros (several dozen, so I've automated it) with their simple definitions in the NASM
source. The final .com program file is 9194 bytes long. It's interesting that the source code of psrinvad
contains autogenerated assembly source for many assemblers targeting DOS (not only NASM).
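An automated search-and-replace of single-line macros could look roughly like the Python sketch below. The exact macro forms in the psrinvad source aren't described here, so this assumes the simplest case: parameterless %define NAME body lines. It also ignores nested definitions, comments and string literals. The function name is hypothetical.

```python
import re

# Matches "%define NAME body" and captures the name and the trimmed body.
DEFINE_RE = re.compile(r"^\s*%define\s+(\w+)\s+(.*?)\s*$")


def expand_defines(source: str) -> str:
    """Inline parameterless %define macros and drop their definition lines."""
    defines = {}
    out = []
    for line in source.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            defines[m.group(1)] = m.group(2)
            continue  # the definition line itself is removed
        if defines:
            # Replace whole words only, so "answer" does not touch "answers".
            names = "|".join(re.escape(n) for n in defines)
            line = re.sub(r"\b(?:%s)\b" % names,
                          lambda mo: defines[mo.group(0)], line)
        out.append(line)
    return "\n".join(out)
```

Diffing the assembled output before and after such a rewrite is a cheap way to confirm that the expansion did not change the generated machine code.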
The debugger ldebug is written in NASM, it makes heavy use of macros, and it has a multistage build
process, creating a compressed .exe in the end.
The game Styx Remastered was compiled using A86 and linked with TLINK. The styx.exe (without
PKLITE compression) contains about 25600 bytes of 8086 machine code, the rest of the .exe file is
headers, data and padding. Translating the linking was easy: I've hardcoded the DOS .exe header as
dw instructions in the NASM source. Translating from A86 to NASM syntax was also straightforward, but
it needed a few hours of work. In the end, it has uncovered an optimization convergence bug in my
assembler (which I still have to understand and fix). Translated source:
https://github.com/pts/mininasm/tree/master/demo/styx
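For anyone who wants to try the same trick of hardcoding the .exe header, here is a Python sketch that computes the 14 header words of a minimal DOS MZ header for a single-segment image without relocations. These are not the values used in the Styx port; the function name, the parameters and the choices CS = SS = 0 and a 1 KiB stack are assumptions made for this illustration.

```python
import struct


def mz_header(image_size: int, entry_ip: int = 0, stack_size: int = 0x400) -> bytes:
    """Build a 32-byte MZ header for a relocation-free image (illustrative only)."""
    header_size = 32                      # 2 paragraphs of 16 bytes each
    file_size = header_size + image_size
    sp = image_size + stack_size          # stack placed right after the image
    if sp > 0xFFFE:
        raise ValueError("image plus stack does not fit in one 64 KiB segment")
    fields = (
        0x5A4D,                           # signature, the bytes "MZ"
        file_size % 512,                  # bytes used in the last 512-byte page
        (file_size + 511) // 512,         # number of 512-byte pages in the file
        0,                                # number of relocation entries
        header_size // 16,                # header size in paragraphs
        (stack_size + 15) // 16,          # minimum extra paragraphs (for the stack)
        0xFFFF,                           # maximum extra paragraphs
        0,                                # initial SS, relative to load segment
        sp,                               # initial SP
        0,                                # checksum (left as 0 in this sketch)
        entry_ip,                         # initial IP
        0,                                # initial CS, relative to load segment
        0x1C,                             # offset of the (empty) relocation table
        0,                                # overlay number
    )
    # 14 little-endian words = 28 bytes, zero-padded to a whole paragraph.
    return struct.pack("<14H", *fields).ljust(header_size, b"\0")
```

Each of the 14 words then becomes one dw value at the top of the NASM source, followed by two zero words of padding, with the program image starting right after.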
The largest program I was able to find written in 8086 assembly is Double Dragon II. The executable
program file dragon.exe released on 1989-11-07 is 82056 bytes long. It's probably written in MASM
5.10a or earlier, the earliest .asm source file is dated 1989-08-28. Almost all of the source files were
found in a deleted (but recoverable) file on a floppy disk. The files are archived in multiple places, e.g.
on Internet Archive. Some source files are missing. It is unclear how much source code is missing.
answered Dec 11, 2022 at 2:10 by pts (edited Jan 10, 2023 at 0:33)
1 Note that Debug/X, much like my fork lDebug, both optionally use 386+ features. They will run fine on 8086s (or
eg NEC V20, as lCDebug is tested by me). The DPMI-enabled DebugX and lDebugX can also use 286+ features
but only do so if Protected Mode is entered. Anyway, lDebug heavily uses the NASM preprocessor, creates an MZ
.EXE output file (but as is all with two NASM assembly steps, no external linker), is generally larger than 64 KiB,
and both the FreeDOS Debug/X and lDebug are "development tools". – ecm Dec 11, 2022 at 12:34
1 doslfn however does make mandatory use of 386+ registers and operations. I'm unsure about Insight, it's been a
while since I have touched my Insight repo. – ecm Dec 11, 2022 at 12:43
1 Insight is documented as running on an 8086 as well. Much like lDebug it uses a patch table to change some
instructions by writing an o32 prefix. (lDebug instead writes nop bytes and overwrites various o32 , a32 , or
other instructions.) There are also two spots in its run functions where it explicitly dispatches based on a variable, eg
hg.pushbx.org/ecm/insight/file/73e1b07abd73/src/trace.inc#l172 for saving/restoring fs / gs – ecm Dec 11,
2022 at 12:54
1 @ecm: I'm trying to compile your Insight repo with NASM, but the file lmacros3.mac seems to be missing. Also in
tools/mkhelp.c, the hh (char) size specifiers are missing from "; HELP: %hhd %hhd %hhd %hhd %hhd . – pts
Dec 11, 2022 at 23:56
1 The macro collection is available at hg.pushbx.org/ecm/lmacros and I also added clone/build instructions to the
readme. Specifically the lmacros have to be in a location found by NASM in this command:
hg.pushbx.org/ecm/insight/file/73e1b07abd73/src/makefile#l35 That is, either the current directory or
../../lmacros/ or ../../../lmacros/ . I also made sure that mkhelp builds without warnings at -Wall ,
thanks for telling me about it! – ecm Dec 12, 2022 at 8:29
2 I made a bunch of test files for my assembler and checked their output code with NASM, too. For instance
addressing modes or segment override, maybe it could help. – vitsoft Dec 13, 2022 at 19:23
@ecm: I wasn't able to find IDebug, IDebugX or ICDebug using Google. Could you please provide links for the
source code of each? – pts Jan 2, 2023 at 14:56
1 @ecm: OK, I've found IDebug and IDebugX: hg.pushbx.org/ecm/ldebug/file/tip/source Or is it? The name of the
repo starts with a small l rather than a capital I. – pts Jan 2, 2023 at 15:10
Yes, it is "L"Debug with a small letter L. lDDebug(X) can be built using maked. No such shortcut exists for
lCDebug(X) yet, you need to manually run build_name=cdebug ./mak.sh -D_DEBUG -D_DEBUG_COND to build
it. (Plus x and -D_PM to make lCDebugX instead of lCDebug.) But as I mentioned lDebug does use the
preprocessor a lot, you can easily view debug.mac and asmtabs.asm + lmacros. – ecm Jan 2, 2023 at 16:03
Might not meet these criteria, but it could be a good test anyway:
I haven’t read through the source code, so I don’t know what kind of macros are in use, but DOS 2.0
DEBUG.COM is 11.5KB, and generated from DEBUG.ASM, which Microsoft open-sourced.
GW-BASIC is also open-sourced, but contains a dozen macros or more, so it might not be suitable for
your test.
answered Dec 17, 2022 at 21:10 by Jacob Krall (edited Dec 18, 2022 at 1:24)
1 I also worked on Microsoft's free software Debug version a bit to add some features (recreated) from more recent
(non-free) versions as well as a few extensions: hg.pushbx.org/ecm/msdebug I'm not satisfied with this as it still
requires DOS to handle int 20h differently from int 21h function 4Ch, which eg FreeDOS does not
implement. Also it's still in old MASM dialect. – ecm Dec 17, 2022 at 21:42