Text Editor Loading/Saving & Basic Tokenizer - Amateur OS Dev (x86 asm)

Video Link: https://www.youtube.com/watch?v=M0tcMsKF0Io

Duration: 1:19:46

Adding loading & saving of files to the text editor and, after a hair-growth hiatus, adding a basic lexer/tokenizer to the kernel for running commands/programs going forward.
Notes:
The leading spaces/newlines when loading files in the text editor are due to incrementing cursor_x under .increment: before moving the cursor, so the cursor is always offset by one additional space. This should be fixed in the future, when I remember it's broken :)

I'm also slowly reading through a few other books in my spare time, including one on x86 assembly. Learned about the movsx/movzx instructions recently, which sign extend or zero extend a smaller register (or a smaller value in memory) into a larger register. For example, instead of

xor ch, ch
mov cl, {some byte of something}

You can do

movzx cx, {8 bit register or byte [memory]}

and ch will be zeroed out, which is pretty neat. Note that movzx/movsx need a 386 or newer CPU.
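
To make the difference between the two concrete, here is a small fasm sketch. The values 0F0h and 7Fh and the label some_byte are made up for illustration and are not from the video; it assumes 16 bit code running on a 386 or newer.

```asm
use16

    mov al, 0F0h            ; AL = F0h (240 unsigned, -16 signed)

    movzx cx, al            ; zero extend: CX = 00F0h, CH is cleared
    movsx dx, al            ; sign extend: DX = FFF0h, DH is filled with AL's top bit

    mov bx, some_byte       ; BX = address of the byte below
    movzx cx, byte [bx]     ; memory form: CX = 007Fh, "byte" gives the source size

    jmp $                   ; hang here so execution never runs into the data

some_byte: db 7Fh           ; hypothetical data byte
```

So movzx is the one to reach for with unsigned values like lengths and counts, and movsx with signed values like relative offsets.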

As seen in the loading text file code, the cmov{condition} instructions move a source register or [memory] operand into a destination register, if the condition is true. For example,

cmp al, 'T'
cmove ax, [bx]

will move the word at the address held in BX into AX if al = 'T', which satisfies the 'equal' condition (zero flag set). If the condition is false, AX is left unchanged. cmovne, cmovg, cmovnc, etc. all work the same way with their own conditions.
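
As a rough comparison, the cmove above does the same job as a compare, jump and mov. The sketch below is fasm syntax; the labels are made up for illustration, and it assumes a CPU with cmov support (Pentium Pro or newer). Also worth knowing: cmov has no 8 bit form and cannot take an immediate as the source.

```asm
use16

load_if_t_branch:
    cmp al, 'T'
    jne .skip               ; not 'T', leave AX alone
    mov ax, [bx]            ; AX = word at the address in BX
.skip:
    ret

load_if_t_cmov:
    cmp al, 'T'
    cmove ax, [bx]          ; same result, no jump needed
    ret

unsigned_max:
    cmp cx, dx
    cmovb cx, dx            ; if CX < DX (unsigned), CX = DX, so CX = larger of the two
    ret
```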

Shown in the tokenizer.asm code is the lea {register}, {[memory]} instruction, for "load effective address". Instead of moving the data at the memory location into the destination register, the address expression inside the brackets is computed and the resulting address itself is moved. For example,

mov bx, 10h
org 2000h
address_a: times 20 db 0

... Some other code here ...

lea di, [address_a+bx]

moves the address address_a + 10h (16 bytes past address_a) into di. address_a here would be at location 2000h, so di should point to 2000h + 10h = 2010h. di will NOT be the data at location 2010h, but will be the number 2010h itself. [di] would be the data at location 2010h.
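
Putting lea next to a plain mov with the same operand may help show the difference. This is only a sketch in fasm syntax. It reuses the address_a name from the example, but here the actual address of address_a depends on where the code is loaded, so the comments are written relative to it.

```asm
use16

    mov bx, 10h

    lea di, [address_a+bx]  ; DI = address_a + 10h, memory is not read
    mov al, [di]            ; AL = the byte stored at address_a + 10h
    mov ax, [address_a+bx]  ; AX = the word stored at address_a + 10h, memory IS read

    ; the same address as the lea, computed the long way:
    mov di, address_a       ; DI = address_a
    add di, bx              ; DI = address_a + 10h, but this also changes the flags

    jmp $                   ; hang here so execution never runs into the data

address_a: times 20 db 0
```

Unlike add, lea leaves the flags untouched, so it can sit between a cmp and the conditional jump or cmov that uses the result.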

It's hard to remember all of the instructions all the time to gauge the best use case when recording these videos, but there's always future refactoring and yak shaving available :)
I'll try to remember to review code on occasion and look for potential clean-ups and improvements.

Video outline:
0:00 - Add loading files to text editor
17:21 - Add saving files to text editor
37:17 - Basic lexer/tokenizer example in C
40:55 - Tokenizer example in Asm
48:29 - Add tokenizer code to kernel & debug
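
For anyone who wants the general idea of a tokenizer before watching, below is a minimal sketch in fasm syntax. It is NOT the tokenizer.asm from the video or the repos. The routine name, the register usage, and the choice to split only on spaces are all assumptions made for illustration. It splits a 0 terminated string in place by overwriting the space after each token with 0, and saves the start address of each token in a table.

```asm
use16

; tokenize (hypothetical routine)
; IN:  DS:SI = 0 terminated input string, modified in place
;      DS:DI = table of words to receive each token's start address
; OUT: CX = number of tokens found
; Clobbers AL, BX, SI, DI. The caller must make the table big enough.
tokenize:
    cld                     ; make lodsb move forward through the string
    xor cx, cx              ; token count = 0
.skip_spaces:
    lodsb                   ; AL = [SI], SI = SI + 1
    cmp al, ' '
    je .skip_spaces         ; ignore spaces between tokens
    test al, al
    jz .done                ; hit the terminator, no more tokens
    lea bx, [si-1]          ; BX = address of the token's first character
    mov [di], bx            ; save it in the table
    add di, 2
    inc cx
.scan_token:
    lodsb
    test al, al
    jz .done                ; string ended inside the last token
    cmp al, ' '
    jne .scan_token         ; still inside the token
    mov byte [si-1], 0      ; replace the space with 0 to end the token
    jmp .skip_spaces
.done:
    ret
```

With an input of "load  file" this should give CX = 2, with the table pointing at "load" and "file", each now 0 terminated. A real lexer would also classify the tokens (command, number, string, etc.), which this sketch does not attempt.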
----------------------------------------------------------------------------------------------------------------------------------------
Playlist for this series:
https://www.youtube.com/playlist?list=PLT7NbkyNWaqajsw8Xh7SP9KJwjfpP8TNX

Git repos:
https://git.sr.ht/~queso_fuego/quesos
https://github.com/queso-fuego/amateuros

Software used:
VMware Workstation Player - https://www.vmware.com/products/workstation-player/workstation-player-evaluation.html
openBSD - https://www.openbsd.org/
qemu - https://www.qemu.org/
vim - https://www.vim.org/ (neovim is probably better :p)
fasm - https://flatassembler.net/
fasm docs - https://flatassembler.net/docs.php?article=manual

Suggest content you would like to see live, be it programming, gaming, or otherwise, through youtube comments, twitter, or by email.

Contact:
email - fuegoqueso@gmail.com
twitter - @Queso_Fuego

Thoughts/Notes/Suggestions/Other - Drop a message in the video comments, by twitter, or by email

Credits:
Music from https://incompetech.com:
"Your Call" by Kevin MacLeod (https://incompetech.com)
Licence: CC BY (http://creativecommons.org/licenses/by/4.0/)

The blue title tags:
#osdev #programming #x86

Tags:
osdev
os
dev
operating system
development
queso fuego
assembler
assembly
16 bit
real mode
x86
fasm
flat assembler
qemu
vim
tokenizer