Koppany's Blog: 2016

Sunday, September 11, 2016

Super Simple Interpreter

I was wondering how to write a simple interpreter in assembly for a 6510 processor (#c64) and I dawned upon probably the simplest interpreter ever (aside from a brainfuck interpreter).

Here is the basic pseudo code of the program:

define array of lines
define dictionary of variables
define left variable
define right variable
define counter and set to 1

loop
    write '> '
    get line
    append line to array of lines if line not empty
    break loop if line is 'run'
end loop

loop for each line
    split line by spaces

    if first element is 'print' or 'write' then
        if second element is single quote do
            get second to last element
            join with spaces
            write string excluding first and last character
        end
        else do
            find second element from dictionary of variables
            get value associated with element
            turn value to string
            write value string
        end
        if first element is 'print' do
            write '\n'
        end
    end

    else if second element is '=' do
        if third element is number do
            set left to number
        end
        else do
            get variable name
            find value from variable dictionary
            set left to value
        end
        if line is > 3 elements do
            if fifth element is number do
                set right to number
            end
            else do
                get variable name
                find value from variable dictionary
                set right to value
            end
            if fourth element is '+' do
                left = left + right
            end
            if fourth element is '-' do
                left = left - right
            end
            if fourth element is '*' do
                left = left * right
            end
            if fourth element is '/' do
                left = left / right
            end
        end
        set dictionary position of first element to value of left
    end

    else do
        write 'Error on line ' + counter
        break loop
    end

    counter = counter + 1
end loop

The syntax of this interpreted language is this:
Each operation is its own line.
The mathematical operations are limited to +, -, *, /
All numbers are integers and operations return integers.
Strings are only allowed in print/write statements.
Strings are enveloped in single quotes.
Variable names must be single letters.
All individual parts of a line must be separated by 1 space.
Write writes to screen without newline.
Print is like write but with a newline.
No direct string concatenation, use write statements for that.

Example program:

a = 5
write 'A is: '
print a
b = a + 5
b = b / 7
write 'B is: '
print b

Example output:

A is: 5
B is: 1

To test its simple nature, I wrote the original version of the interpreter in Python 2.7 and was able to make it in 26 lines of code. Here is the source:

import sys
varis = {}
lines = []
write = sys.stdout.write
while 1: #get input
    inp = raw_input("> ")
    if inp == "run": break
    lines.append(inp)
for y in range(len(lines)): #parse program
    x = lines[y].split()
    if x[0] == "print" or x[0] == "write":
        if x[1][0] == "'": write(" ".join(x[1:])[1:-1])
        else: write(str(varis[x[1]]))
        if x[0] == "print": write("\n")
    elif len(x) > 2 and x[1] == "=":
        if ord(x[2][0]) > 96 and ord(x[2][0]) < 123: left = varis[x[2]]
        else: left = int(x[2])
        if len(x) > 3:
            if ord(x[4][0]) > 96 and ord(x[4][0]) < 123: right = varis[x[4]]
            else: right = int(x[4])
            if x[3] == "+": left += right
            elif x[3] == "-": left -= right
            elif x[3] == "*": left *= right
            elif x[3] == "/": left /= right
        varis[x[0]] = left
    else: print "Error: line "+str(y+1)

I then wrote a modified version of it in Axon (the main programming language I use at work) and I was able to make it in 31 lines of code. Here is the source:

(x) => do
  vars: {}
  x = x.replace("\n",";").split(";")
  t: []
  cou: 1
  left: ""
  right: ""
  x.each y=> do
    t = y.split(" ")
    if(t[0] == "print") do
      if(t[1][0..0] == "'") echo(t[1..-1].concat(" ")[1..-2])
      else echo(vars.trap(t[1]))
    end
    else if(t[1] == "=") do
      if(t[2][0] > 47 and t[2][0] < 58) left = parseInt(t[2])
      else left = vars.trap(t[2])
      if(t.size > 3) do
        if(t[4][0] > 47 and t[4][0] < 58) right = parseInt(t[4])
        else right = vars.trap(t[4])
        if(t[3] == "+") left = left + right
        else if(t[3] == "-") left = left - right
        else if(t[3] == "*") left = left * right
        else if(t[3] == "/") left = floor(left / right)
      end
      vars = vars.set(t[0], left)
    end
    else echo("Error: line "+cou)
    cou = cou + 1
  end
end

The differences in each of the implementations is that the Python one is more like a command line in which you type in your lines and then type in "run" when you want to execute the program, and it supports both "write" and "print". The Axon implementation only has "print" and is a function that takes in the whole program as a string parameter, so I made that to also work with semicolons between the statements.

Friday, April 8, 2016

Easily Install Oracle's Java On Ubuntu Linux

Hello world, today I have a quick "life hack" type thing for Ubuntu people.

Where I work, I am kind of the Linux guy, installing our software for clients who wish to run it on Ubuntu machines.
The problem is that our software runs on Java, and some of the people either don't know how to install Java or have it installed in a weird way.

I was asked what was the easiest way to install Java, and here you go:


sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer

(agree to the two prompts)

(wait for download to finish)

java -version

The first 2 commands installs the webupd8team Java repository and updates apt-get so you can install from there.
The next command actually installs it, with a license agreement and a download.
The last command just confirms that you have Java installed.

Sunday, March 20, 2016

Making C64 Program Tap Files

Before I go ahead and tell you guys how to make a tap file that is loadable in a C64 emulator, I'm going to give you the link to the source of all my info: http://c64tapes.org/dokuwiki/doku.php?id=loaders:rom_loader and http://c64tapes.org/dokuwiki/doku.php?id=analyzing_loaders

This is where I learned how to make tap files (although it took me two weeks to find that page).

Enough chat, here is the structure of a C64 tap file.
There are 3 different kinds of pulses:
-    a long pulse lasting about 672 µS represented as ASCII char 86 or hex 56
-    a medium pulse lasting about 512 µS represented as ASCII char 66 or hex 42
-    a short pulse lasting about 352 µS represented as ASCII char 48 or hex 30
Physically, these pulses are square wave pulses running at 50% duty with a period of the specified µS.
I will refer to long as L, medium as M, and small as S. Data encoded using a combination of these pulses will be referred to as LMS format.

Now these pulses alone don't make up binary data (as I learned the hard way).
Bits are represented as a pair of pulses, a 0 bit being SM, a 1 bit being MS, a start bit being LM, and a data end bit being a LS.
A LMS byte is represented as a certain sequence of pulses which looks like this:

LM     xx    xx    xx    xx    xx    xx    xx    xx    nn

start  bit0  bit1  bit2  bit3  bit4  bit5  bit6  bit7  check

There is a start bit in front of every byte, but each byte doesn't have its own end bit. The check bit at the end is calculated with the bitwise formula:
1 xor bit0 xor bit1 xor bit2 xor bit3 xor bit4 xor bit5 xor bit6 xor bit7
Or more conveniently, if the byte has an even number of 1's, the check bit is 1, if the byte has an odd number of 1's, the check bit is 0.

Now that single bytes are covered, it's time to learn how whole data is encoded in a tap file.

Every C64 tap file starts off with 20 bytes of information, kind of like meta data. All information should be coded in hex, as this is raw data.
Bytes 0 to 11 are C64-TAPE-RAW, byte 12 is the number 1, bytes 13 to 15 are just 0, and bytes 16 to 19 is the length of the actual tap data in least significant byte first (LSBF) format. The tap data length is usually added last, since we don't know how long our tap data is yet.

The rest of the bytes are the data.
We start with 27136 small (S) pulses, this is the leader block used to calibrate the datasette speed.
Then we start the header with 9 sync bytes, from 137 down to 129 (LMS).
Then we write the header file type, in this case just the number 1 (LMS).
Next is the starting address for the data, two bytes in LSBF format (LMS).
Next is the ending address for the data plus 1, also two bytes in LSBF format (LMS).
Then the bytes in PETSCII code for the file name, no more than 16 bytes (LMS).
Then 16-lengthOfFileName bytes of the number 32 for title padding (LMS).
Then write the number 32 for 171 times, this is the header body, not sure what it's for (LMS).
Then a checksum byte for the header, this is all the bytes from the header file type (byte1) to the end of the header body (last byte) (LMS), use this bitwise formula to calculate:
0 xor byte1 xor byte2 xor ... xor last byte
Then a data end bit of LS to end the header.
Then we write a small pulse 79 times as an interblock gap.

Now we write the header again, but with some changes.
Write the 9 sync bytes, but this time from 9 to 1 (LMS).
Then write the exact same header data from the header file type to the data end bit (LMS).
Next write 78 small pulses as the header end trailer.
Then write 5376 small pulses as the interrecord gap.

Now we can start writing our actual program data.
Before we do that, you'll need to know how to make program data (useful link: https://www.c64-wiki.com/index.php/BASIC_token) or get it from RAM. (Or use my tapwriter.py program on GitHub)
Start the program with 9 sync bytes, from 137 down to 129 (LMS).
Next write the bytes of your program exactly how it would appear in ram from the starting address to the ending address (LMS).
Then write your checksum byte, calculated just like on the header blocks, only from after the sync bytes all the way to before this checksum byte (LMS).
Then write your data end bit of LS to end the header.
Then write a small pulse 79 times as the interblock gap.

Now we write the program data again, just like with the header.
Write the 9 sync bytes, but from 9 to 1 (LMS).
Copy the program block from after the sync bytes all the way to the data end bit.
Then write a small pulse 78 times for the data end trailer.

That's all there is to it, just remember to update the tap data length in bytes 16 to 19, this is only the length from byte 20 to the end.

This isn't the most efficient way to store data, since it takes about 10 seconds just to get to the header after the leader block, but it works and it's pretty cool to have a program load off a tap file that you made (forged with your own hands in the heart of a dying star).

If you understand how to create a tap file, and can actually get it to work (took me 3 days), then it wouldn't be hard to reverse the protocol to read the data off of the cassette or even directly from the C64, just as I will be doing for my C64 Arduino Internet project.

GitHub program files:
https://github.com/koppanyh/C64Net/blob/master/tapwriter.py