Koppany's Blog

Saturday, August 19, 2023

Advertising your Minecraft server with pings: How to return custom ICMP Echo Reply messages

So I was at DEF CON last week (world's largest and most fun hacker cons, I highly recommend going), and I saw many wonderful and fun things there. One of those was what is called the S.O.D.A Machine, which is basically a soda machine that outputs credentials to a virtual machine rather than an actual soda. Naturally, I had to try it.

I heard about it from a tweet shortly before the con and thought it would be fun to try. It was the 2nd thing I was most excited for (the first being getting an LED installed in my hand) and made sure my friends knew that we were going to be doing this.

So we went to the chillout area on Thursday only to find out the machine was out of order. The next day we went and again found it out of order. However it started working later that day, so I sacrificed a dollar to receive the legendary VM credentials. Score.

Top of the receipt, it was kind of long, it said TL;DR: DFIU after all the legal parts.

At this point I had no idea what to do with it. I didn't think that far ahead. So I fired up my little disposable 2010s netbook running Manjaro on a spinning rust drive and SSHed in just to see that it worked, and it did. A minimal Debian environment running as root.

The next day on Saturday, one of my other friends was curious about DEF CON and ~~stole~~ borrowed the badge of the one that was with me the prior day (it's ok, she slept the first half of the day and did not miss the badge). We wanted to do some actual hacking, so we got the idea to run a Minecraft server on it.

That's all well and good, but not if no one knows about it. So we thought about how to advertise the server without being so obvious, and somehow we arrived at the possibility of returning a custom message in the ICMP reply when someone pings it so that they could have instructions for how to connect. Then we could just spread little strips of paper (ripped out of my tiny yellow legal pad) that said something like "Ping me: 172.31.65.46" and log each ping to see if it's actually working. Bonus points for logging IPs on the Minecraft server to see how many of those pings lead to our advertisement scheme working. Minecraft for real hackers 😎

With the plan laid out, my friend took the netbook and started to try to set up the Minecraft server while I tried to do some badge hacking on the DEFCON Furs badge (that was infinitely cooler than the plastic square that was the standard DEF CON badge). We spent maybe around an hour on the con floor doing this before it was time to go do something else. The MC server installation ran into too many dependency errors and didn't get finished. I decided to stay down on the con floor while my friends went to go shopping (I think).

I would spend the rest of the con thinking about this project and I even took some Wireshark traces and did a bit of research on the ICMP protocol. I also tried to see how one would even go about returning custom messages (and hoping there was a simple program I could just run to do this, but I couldn't find one (not that I really looked)). I found out that you could disable the ping response which was part of the OS itself. You could then basically do a packet capture to see when you received an ICMP echo request packet and craft your own ICMP echo reply packet as a response. I didn't get as far as actually implementing it, but at least I had a vague high-level idea of what to do.

This is the magic sauce to turn off the OS's ping replies

Not to be deterred, I still really wanted to do it, since it would definitely be a learning experience. So I spent a few hours for the next few days after the con to chisel away at this project. At this point I didn't have my trusty S.O.D.A. virtual machine, so I just used one of my cloud servers as the dev environment for this, SSHing in and writing all the code through Vim with tmux, as I would've on the S.O.D.A.

I found out that you could write a packet sniffer in python using only the standard libraries (that's how I roll), so I wrote a test program to just output the ICMP requests so that I could match them up with the Wireshark traces and try to decode them. I wrote an IP packet header decoder class (as per the RFC) to detect when there was an ICMP packet (which apparently is all you get with socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)). Then I wrote an ICMP decoder class (also as per the RFC) specifically for detecting when that ICMP packet was an echo request.

With that, I then modified the classes to work in reverse, so that I could just manually change whatever fields I wanted and then call .toBytes() to get them to turn into a byte list to be sent out. This way the ICMP byte list could be inserted into the IP header's data field, then the header turned into a byte list that is a whole response. I even had to learn how to calculate the checksums used by these packets, which turns out to be pretty simple.

The basic approach to generating the reply for an ICMP echo request is to swap the source and destination IP addresses in the IP header, set the ICMP's type to 0 (8 indicates a request, 0 is a reply), and recalculate the checksums on both ICMP and IP header. In this case, also updating the data field to be something other than whatever the reply had in it.

So that's what I did, and... it didn't work. After a few hours of combing over everything and comparing real ICMP echo replies to the ones I'm generating and checking that I was generating the checksums correctly, I finally figured it out. Turns out Python's socket .sendto() already encapsulates whatever you give it with an IP header before sending it out. So my full IP packet was being wrapped in an IP packet. I had to use tcpdump on the server to capture these malformed packets for analysis in Wireshark on my laptop, and then it was immediately obvious that this was the problem.

So I just removed the part where I put my ICMP packet into an IP header and sent the raw ICMP packet back to the sender and it worked. Yay for doing things the hard way.

Once that was done, I was finally getting those sweet sweet ICMP echo replies coming back to me. Since the con was over, I just encoded the message "if you can read this then you're a haxxor B)" as a test payload. Some ping applications didn't like my custom reply (looking at you macOS). Turns out this is because the spec requires that the data payload in the reply is exactly the same as the request. Oh well, the spec can go fly a kite, we have secret data to return!

Only took a few days to get to this point

Now that I've got this working, maybe my friend and I will revisit the idea at DEF CON 32 next year. Who knows, maybe it'll even work.

For those interested, here's the full code in all its full ~~jank~~ glory (it's exactly as it's running on my cloud server, I didn't expect to write this blog post so I didn't make it pretty):

# https://www.uv.mx/personal/angelperez/files/2018/10/sniffers_texto.pdf
# https://unix.stackexchange.com/questions/412446/how-to-disable-ping-response-icmp-echo-in-linux-all-the-time

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)

def hexify(v):
	h = hex(v)[2:]
	if len(h) == 1:
		return "0" + h
	return h

def calculateChecksum(data):
	index = 0
	checksum = 0
	while index < len(data):
		high = data[index] << 8
		index += 1
		low = 0 if index >= len(data) else data[index]
		index += 1
		checksum += high | low
	while checksum > 0xFFFF:
		carry = checksum >> 16
		checksum &= 0xFFFF
		checksum += carry
	return checksum ^ 0xFFFF

class Stream:
	def __init__(self, data):
		self.data = data
		self.index = 0
	def readU8(self):
		val = self.data[self.index]
		self.index += 1
		return val
	def readU16(self):
		val = self.readU8() << 8
		return val | self.readU8()
	def readU32(self):
		val = self.readU16() << 16
		return val | self.readU16()

class IPHeader:
	def __init__(self, stream):
		self.version = stream.readU8()
		self.header_len = (self.version & 0x0F) * 4
		self.version >>= 4
		self.tos = stream.readU8()
		self.total_len = stream.readU16()
		self.identification = stream.readU16()
		self.frag_flags = stream.readU16()
		self.fragment = self.frag_flags & 0x1FFF
		self.frag_flags >>= 13
		self.ttl = stream.readU8()
		self.protocol = stream.readU8()
		self.checksum = stream.readU16()
		self.source_addr = stream.readU32()
		self.dest_addr = stream.readU32()
		self.options = []
		for _ in range(self.header_len - 20):
			self.options.append(stream.readU8())
		self.data = None
	def isICMP(self):
		return self.protocol == 1
	def toBytes(self):
		total_len = 20 + len(self.options) + len(self.data)
		data = [
			self.version << 4 | self.header_len >> 2,
			self.tos,
			total_len >> 8, total_len & 0xFF, #self.total_len >> 8, self.total_len & 0xFF,
			self.identification >> 8, self.identification & 0xFF,
			self.frag_flags << 5 | self.fragment >> 8, self.fragment & 0xFF,
			64, #self.ttl,
			self.protocol,
			0, 0, #self.checksum >> 8, self.checksum & 0xFF,
			self.source_addr >> 24, (self.source_addr & 0xFF0000) >> 16, (self.source_addr & 0xFF00) >> 8, self.source_addr & 0xFF,
			self.dest_addr >> 24, (self.dest_addr & 0xFF0000) >> 16, (self.dest_addr & 0xFF00) >> 8, self.dest_addr & 0xFF,
		]
		data.extend(self.options)
		data.extend(self.data)
		self.checksum = calculateChecksum(data)
		data[10] = self.checksum >> 8
		data[11] = self.checksum & 0xFF
		return data
	def __repr__(self):
		return f'IPHeader(ver:{self.version}, hln:{self.header_len}, tos:{self.tos}, tln:{self.total_len}, idn:{self.identification}, fgs:{self.frag_flags}, frg:{self.fragment}, ttl:{self.ttl}, ptc:{self.protocol}, chk:{self.checksum}, src:{self.source_addr}, dst:{self.dest_addr}, opt:{self.options}, dat:{self.data})'

class ICMPEcho:
	def __init__(self, stream):
		self.type = stream.readU8()
		self.code = stream.readU8()
		self.checksum = stream.readU16()
		self.identifier = None
		self.seq_num = None
		self.data = None
	def isEchoReq(self):
		return self.type == 8
	def finalize(self, stream, header):
		self.identifier = stream.readU16()
		self.seq_num = stream.readU16()
		self.data = []
		for _ in range(header.total_len - header.header_len - 8):
			self.data.append(stream.readU8())
	def toBytes(self):
		data = [
			self.type,
			self.code,
			0, 0, #self.checksum >> 8, self.checksum & 0xFF,
			self.identifier >> 8, self.identifier & 0xFF,
			self.seq_num >> 8, self.seq_num & 0xFF
		]
		data.extend(self.data)
		self.checksum = calculateChecksum(data)
		data[2] = self.checksum >> 8
		data[3] = self.checksum & 0xFF
		return data
	def __repr__(self):
		return f'ICMPHeader(typ:{self.type}, cod:{self.code}, chk:{self.checksum}, idn:{self.identifier}, seq:{self.seq_num}, dat:{self.data})'

message = list(b'If you can read this, then you\'re a hacker B)')
message = list(b'1F Y0U C4N r34D 7H15 7H3N Y0Ur3 4 H4XX0r B)')

while True:
	p, i = s.recvfrom(65565)
	print()
	print(' '.join([hexify(x) for x in p]))
	stream = Stream(p)
	header = IPHeader(stream)
	print(header)
	if not header.isICMP():
		continue
	icmp = ICMPEcho(stream)
	if not icmp.isEchoReq():
		continue
	icmp.finalize(stream, header)
	print(icmp)
	icmp.type = 0
	icmp.data = message
	p = bytes(icmp.toBytes())
	#source_addr = header.source_addr
	#header.source_addr = header.dest_addr
	#header.dest_addr = source_addr
	#header.data = icmp.toBytes()
	##break
	#p = bytes(header.toBytes())
	#print(' '.join([hexify(x) for x in p]))
	#stream = Stream(p)
	#header = IPHeader(stream)
	#print(header)
	#icmp = ICMPEcho(stream)
	#icmp.finalize(stream, header)
	#print(icmp)
	print(s.sendto(p, i))

Wednesday, November 16, 2022

Tertis, A Supercon Production

Two weekends ago (Nov 4-6, 2022), I attended the Hackaday Supercon for the first time. It was probably the most amazing con I've ever been to. It helped that the badge was something I really really wanted to play with, so naturally, I spent 2 weeks (actually the 4 days prior, because procrastination) reading almost all of the documentation so that I could be a good enough expert of the system. The badge was intended to be programmed by hand, so I decided to rise up to the challenge of writing Tetris for it by hand... on a yellow paper legal pad... with a pen (not a pencil)... and then entering it all in manually... like a madman... without any use of a computer for the entire weekend... because why not...

I had written Tetris for my first time ever in C++ 3 weeks before the con to get an idea of how it would work. In order to simplify the tetromino rotations, I picked one block to be the center and would rotate all the other blocks around that one. That made things interesting when rotating the square or the line. I decided to call the project "Tertis" (not Tetris) because of how jank it was seemingly going to be to play. But it actually wasn't bad. I spent half an hour playing it at 1 am on a workday.

Oops, all ANSI escapes

(P.S. Because of the whole Twitter fiasco and the possibility of losing tweets or whatever, I'll be posting pictures of my tweets along with links to get to them instead of embedding the tweets directly. I don't think anything will happen, but just in case.)

Here's how it all went down:

Friday

I got to the con super early to be one of the first 100 people to receive the poster. Naturally, I got there half an hour early... only to wait at the wrong building and trick a small crowd of people to line up behind me. A nice Supercon volunteer told us we were at the wrong place and where to go. We had to walk to the other building, but we still got our posters, so all was good. Whoops.

Since I hadn't had access to the badge before this, Friday was dedicated to figuring out all the little peculiarities of the architecture and testing my understanding of the manual. My first program just read from the random register and displayed it on the display in a loop, so I could test programming. For my second program, I one-upped a friend by whipping up a little routine on paper with 14 instructions that would just fill the screen with randomness for maximum blinkenlight effect, because his program only did half the screen. I'd end up passing this routine out throughout the con to people who just wanted the badge to do something; since it was open source, they'd just take a picture of my page of assembly. I'd later end up submitting this program as the first punch card for another team building a punch card reader.

It ain't much, but it's honest work

No idea if I filled this in correctly

With all the introductory exploration out of the way, I then started the design for Tertis. Because I'm doing it all on paper, and inserting newlines is generally not possible, I came up with a system in which I'd specify a staring address for each subroutine, write the subroutine, test it, then specify the end address as being a full empty 16 instruction page after the page containing the last instruction of the subroutine. This way if I had to change anything, then I had at least 16 instructions of buffer to insert things without breaking the defined addresses of any subroutines down the line. Took me a little bit to get this system down. This ended up having the happy result that if I started from the most primitive subroutines and worked up, then I could test each one individually as I wrote them, and the subroutines they depend on would've been tested already and would probably not need to change and mess up the addressing. I later added a listing with the registers used, so subroutines calling subroutines would know what was safe to use and what registers would be modified by the call. Also the size of the subroutines so I could know how many instructions would need to be entered in.

My first subroutine would start on 0x010, leaving 16 instructions at the beginning for initialization and jumping to the main program loop. This was to be a subroutine that draws a single pixel at a specified x-y coordinate on registers R0 and R1, with a "color" value (on/off) at R2. I quickly ran into the limitations of the processor. Turns out there's no way to programmatically set/clear a bit. You can set/clear a literal bit, but you can't set/clear a bit who's position is specified in a register. This made things messy and I ended up setting the least significant bit and just shifting it over in a loop as many times as needed before ORing (set) or ANDing (clearing) the bit on the specified display page. This was tested, and I could indeed set/clear a single pixel by the x-y coordinates.

Assembly intensifies

I then wrote a subroutine to read a 4 element array of x-y coordinates from memory and draw it to the screen with the specified color in R3. The tetrominoes would be represented by the coordinates of their individual blocks, so moving/rotating would just require calculating the new coordinates for each block. This subroutine was tested by drawing and clearing the L tetromino over and over to test the max framerate and see if there'd be any flashing during movement. There was... with an update rate of less than 1 Hz. Not gonna lie, I was actually kind of pissed that it was midnight and that's all I had to show for all my hard work at the end of the day (remember that it's all written and keyed by hand). There was no way I was going to be able to do Tetris at that speed (none of the other necessary subroutines had even been written yet, those would only add more processing delay).

On the drive home, I was talking with my friend (the same one who's random blinkenlight routine I rewrote) and he suggested using using a jump table like how the Hamlet program did for storing the relevant bits that created letters on the display. I figured it might actually be faster to calculate the table address to retrieve the bit I need set rather than do the shifting method. I'd do it tomorrow because this was a spicy subroutine and it was past midnight.

I got home and decided not to go to sleep. Instead, I'd continue badge badge hacking for a bit more. It was clear that too many loops were evil where pixel rendering was concerned, so I'd unroll the loop on the subroutine that drew the 4 x-y coordinate pairs. The subroutine ended up being twice as long, but would run twice as fast without the loop overhead. I went to sleep around 2 am happy knowing that there was at least a way forward and my whole Tertis project wasn't destroyed (probably).

I. Am. Speed.

Saturday

I woke up after a nice little 4 hours of sleep, picked up my friend, and went to the con. After claiming a spot at the tables, I got my food and sat down to rewrite the pixel drawing routine. I had it rewritten, tested, and debugged, along with the x-y coordinate array renderer before lunch. A quick test and high speed video of the drawing and clearing of the L tetromino showed that now I was getting between 60 and 100 Hz framerates (you could see the 60 Hz flashing of the overhead LED lights in that video). That's more like it! More than fast enough to support running Tertis at reasonable speeds!

At this point, I was slowly getting famous. I was told that I was known as "the Tetris guy", which definitely added no pressure to the already insane goal I had. And people would even come to me either to see the crazy guy who's doing all this by hand without a computer, or to ask for help because I read the manuals and knew some tricks. I even had a few regulars who were extra impressed by me being constantly hunched over the yellow paper notepad and would come by every few hours to see what progress I've made.

I managed to squeeze in 2 talks, the PEV talk by Bradley Gawthorp and Samy Kamkar's talk about making gas discharge tubes by hand. Unfortunately, I didn't see most things I wanted to because I was deep in the badge hacking. There was an expectation (at least in my mind) that I get Tertis working in time for the badge hacking competition, so no time for talks.

All that colour coding, for naught...

I spent a good part of the rest of the day chatting with people about various nerd things. The rest of the time after that was spent trying to figure out how to load the tetrominoes into the x-y coordinate array in the most efficient way. I had planned to just encode all the variations and their rotations in a jump table so I could just load them in any rotation I needed. Unfortunately, each tetromino would need 8 instructions, and with 7 tetrominoes with 4 possible rotations, that would've taken 224 instructions... just for the jump table that holds the coordinates. Not even counting the logic to actually calculate the jump address and then read and save the values to the x-y array. I thought about just jumping to sections that load the whole tetromino in one go, but that would require about 16 instructions per tetromino, not counting overhead logic.

I eventually settled on doing the direct load approach on only the base 7 tetrominoes and not their rotations. So that would come out to about 119 instructions with overhead included. (In retrospect, it would've been better to do the jump table solution with the 7 base tetrominoes, but it was late and I was tired and wasn't thinking straight.)

I stubbed out the subroutine and encoded the L tetromino only. I calculated how large the subroutine would be with the other 6 so that I could calculate the program space needed to store it all so I could work on the other subroutines that come after it. Then with this single tetromino, I'd have enough to work on and test the translation and rotation and collision subroutines that I'd need for full functionality.

I ended the night by writing a demo routine that loaded the L tetromino and just moved it down. The translation subroutine didn't exist yet, so I entered all the instructions to add 1 to each y coordinate manually.

Was really hoping to be farther along than this

I went home, stayed up until 2-ish am again, and worked on writing out the next 2 tetrominoes into the loader subroutine.

Sunday

This time I woke up with almost 5 hours of sleep. Score! Again, I picked up the friendo and went to the con.

I was kind of in the flow of things by now, able to quickly crank out the other subroutines for rotating, translating, and wrapping tetrominoes around the screen. I had also entered in so many instructions by hand that I was able to key some in without looking up their opcode number or parameters. As time went on, I became able to key in more and more instructions and read them just by looking at their binary representation on the badge. I'm still not sure if that's a good thing or a bad thing...

It took me quite a bit of time to test these, but nothing too bad. However, I started writing the collision subroutine with the intent to have a stacking tetromino demo in time for the badge hacking competition, and this is where things started to break down. It took me a bit of time to write it, but even longer to test. It was broken for some reason and I couldn't figure out where. I poured over the dissassembly and didn't see anything, so I had to write test after test to try and figure out what part of the subroutine was failing. I eventually got that working (ended up being a single off-by-one-bit entry error), but I was really starting to feel the pressure of the badge hacking deadline.

With less than 1.5 hours left, I decided to make the push and try to get the stacking tetromino demo working. Things went from bad to worse. Because I was in a rush and mildly panicking, my slow and methodic and right way of writing routines by hand wasn't cutting it. I kept making large changes to the demo subroutine, scratching out large parts of code and injecting chunks here and there, which kept requiring me to recalculate relative jumps generally made a mess. The more I tried to fix it, the less it worked.

People were nice and saw that I was working, so they generally came up to see if I thought I'd get it finished, but they let me work. Unfortunately, I didn't make it. It got to the point where I just had a line on the bottom of the screen and I wanted to get a single T tetromino to fall on the line and stay. It did, kinda, but part of the tetromino would disappear on contact. I have no idea why.

Time was up, I entered into the contest under the math category thinking I could show my broken Tertis and talk about the math of the rotation subroutine and how you rotate coordinates around an arbitrary rotation point with 4 bit math. But I was extremely nervous and probably very incoherent on stage as I rambled on about how it was supposed to be Tetris but I ran out of time. I even forgot to mention a name for the project so people would know what to vote for, so it was dubbed "Failed Tetris".

I was barely beat out (probably by a lot) by Kenneth's 128 bit counter. I was there to witness the beginning of his "no documentation, no spoilers" badge hacking campaign on Friday morning, being one of the first to confirm his suspicions of how things worked without giving hints. I have nothing but extreme respect to this madman for getting as far as he did with no spoilers and no documentation, only pressing random buttons and figuring out how things worked.

The con was officially over, I had failed in my task. I regret not just showing a single tetromino rotating on screen and talking about the rotation subroutine and mentioning that all of it was written by hand and keyed in manually, with no use of a computer all weekend long and no documentation after Friday. I probably would have won (or maybe not, idk).

We all went to a bar afterward to continue the party. It was a grand time with food and drink and most importantly, a tab that was being paid for by the con (thanks, Hackaday!). I sat with the friends I had made and we talked about all sorts of nerd things and had a blast. Then that bar closed up and so we moved over to the next closest bar that was still open.

I sat at a table alone and tried to finish that demo, determined not to leave the con empty handed (and with less street cred than I could've had). I just rewrote the demo subroutine from scratch, not even trying to fix the mess that I had created in the last one. Some people from the other table were jokingly reminding me that the con had ended and it was safe to relax now 😂

I was in the process of entering it in when someone sat down and we conversed for a few hours, slowly picking up 2 more people at the table to talk with. It was a great time. Turned out the first conversant was good friends with Simone Giertz. What a small place our world of hackers/makers/engineers is. I hope to get more entangled in it.

I left for home early (shortly after midnight) because I had to work the next day (more like later that day). I didn't get the demo done, but whatever, the con was over and I was gone. I got home and decided to quickly type in the new demo subroutine just to see if it would've worked. It did. The first time. Dang...

Why couldn't I type this in really quick at the bar to restore my honour?

Monday

I only had 5.5 hours of sleep, so needless to say, after 3 days of torturing myself, I was very ineffective at work. I went through my email inbox, did some mandatory trainings, and basically stared at code without actually changing anything because I had nothing but Tertis on the mind.

After work, I took an hour to write a bot to let me know when the Pinecil soldering iron was in stock. I want one now because I saw them at the con and they seem so convenient (also because the screen on my TS100 is dead).

Then I spent a bit of time to write a delay subroutine that would also handle user input to read the buttons and shift and rotate the tetrominoes. The test program just made it fall (without collision, so it would wrap around to the start) and let the user shift and rotate it as it fell. There was another off-by-one-bit error where it read from the wrong address when checking if the "shift left" key was pressed, and that took me forever to find.

Even this would've been a better demo...

I then tried to write the subroutine to clear the full rows from the screen. Yeah... that one didn't go so well. I started late and was so tired that I ended up with some weird spaghetti code that didn't work. It would kind of shift down the bottom 2 rows and then stop for some reason. I wasn't going to figure it out that late at night.

Tuesday

I woke up with a little over 5 hours of sleep, and I was dying at this point. I had to actually go into the office and it was raining, so I left extra early to make sure I could get to the bus on time. Surprisingly, traffic somehow improved with the rain. I guess it was already so bad that it just underflowed.

I started writing this blog post on the bus heading to work and managed to crank out the first 1500-ish words in 1.5 hours.

I also brought my badge to work to show my coworkers what I got so far. One of them was excited to see Tetris running, but alas, I didn't get that far. They were still very impressed by the 2 demos I showed them (stacking blocks, and user input). They were scared to play with the badge because they thought it was very fragile, but I let them know that none of the badges would've survived the weekend if that was true 🤣

During a presentation at work about code reviews, I figured out how to add a score counter while making collision handling easier. If I didn't use the bottom row, then I'd know if a collision happened when the tetromino reached into the bottom row. Then the standard "undo previous move" algorithm would work for resetting its position and signalling that it reached the ground, as opposed to having separate handling logic for if the ground was detected. Then this empty row could be used to display the score in binary.

Other than that, I didn't do much. I was busy all day and was so tired by the end of it.

Wednesday

I woke up with 6.5 hours of sleep. Score! Finally.

I got to the bus and wrote the score counter subroutine on my little yellow pad. Then I worked on this writeup some more.

During work, I came up with a different way to write the subroutine to clear full rows from the screen. This would use a few more registers, but would simplify the algorithm a bit since we'd be storing precomputed values rather than calculating everything on the fly as I had tried previously.

On the bus ride home, there was no WiFi, so I used the time to write out the score counter subroutine (it just incremented the binary number on screen), and then I wrote out the new row clearing subroutine. Then I started filling out the rest of the spawn subroutine that I started writing on Saturday, but I didn't finish because the bus arrived at my stop.

I went home and keyed in the subroutines for the score counter and row cleaner. After the usual off-by-one-bit debugging and a round of verification, that was done. The cleanup subroutine turned out to be missing a jump statement. This must've been the first logic bug I've had outside of a demo/main routine. That was quickly fixed.

As usual, I ended up staying up super late to try and get Tertis done. I committed the entry point routine to paper (all it did was set the clock speed, display page, and timer interval before jumping to the game loop). Then I started writing the main loop, which would tie together all the subroutines that had been written. It was late and I was tired, so there were so many bugs and things I had to move around. It was kind of a mess. Then I forgot the cleanup subroutine, which I had to add in.

The handwritten main loop algorithm

Eventually I got it working and finished writing and entering the tetromino spawn subroutine. I played for a bit to test it and there turned out to be a weird bug where one of the tetrominoes wasn't being spawned correctly. It took me forever to debug because I wasn't sure which one it was. I'd seen each one of them spawned correctly at some point, so what was happening? It turned out to be an off-by-one-bit error on the part for spawning a square where one single x coordinate wasn't being written, so it used whatever value was there previously, which is why I'd seen it be correct sometimes.

By this point it was very late, but I had a fully working Tertis. I took a video and posted it to Twitter and the Hackaday Discord server for proof that I had done it.

The proof is in the ~~pudding~~ badge!

I crashed and burned by 1:30 am knowing I had done it.

Thursday

I had only just over 4.5 hours of sleep. Surprisingly, I wasn't feeling too bad. Only my eyes didn't like it and were hurting a bit, but otherwise I was fine.

I worked on the writeup during the ride to work. And then I just worked. I was apparently too tired to be distracted by multiple things, so I was able to mostly focus on my actual job.

Around lunch time, I started getting spammed with notifications from Twitter. Apparently my video of functioning Tertis on the badge had become a bit viral. No idea how it started, but I started getting likes and follows left and right along with a few retweets and comments. It was a bit annoying that it was happening during work hours, but I also liked the attention 🤣

After lunch, I showed my coworkers the working version of Tertis and let them play with it. They were my beta testers, and I could see that the way I ended the game (by underflowing the stack pointer) didn't work because they kept pressing buttons which brought them to different programming modes instead of restarting the game or whatever they were thinking would happen.

When it was time to go home, I decided use my bus time to implement some kind of proper "game over" indicator that wouldn't cause the problems seen during testing. I figured nothing fancy, it would just cause your score to blink at the bottom of the screen in a loop so pressing buttons wouldn't do anything (unless you hit the break button). Elliot Williams was right when he said that using a stack crash to indicate game over was spooky 😂

On the bus, I also tried to write some mods for Tertis that would allow you to move your tetromino so it wraps around the screen, and then a mod to make the score counter count the number of rows that are cleared rather than the number of tetrominoes spawned. Unfortunately it became too dark to write by that point.

I got home and wrote those mods, then keyed everything in and tested them. My tests involved activating and deactivating the mods individually to see that everything works as expected.

I also noticed that I got followed by the official Hackaday Twitter account at one point. Cool. I guess I'm a real maker-hacker now.

I probably gained more followers on this one day than the past 3 years combined.

Friday

My watch says I got 7 hours of sleep. I'm pretty sure that's a lie...

I took a bit of time to take pictures of every active page of my paper assembly repository, even if the code in that page was unrelated or didn't make it into the production version of Tertis because I wanted to show the total journey and how beat up the pad had become.

I used those pictures to create a gif of flipping through the pages. Then I made a grid of all the pages so people could inspect the pages.

AnImAtIoN!1!

Saturday

I was busy this day, so I did nothing for the Tertis project. (Easiest entry, heck yeah!)

Sunday

I did some more writing of the writeup, along with moving my writings from a temporary Notepad++ instance to an actual Blogger draft.

Then it was a matter of doing all the formatting and inserting links and pictures.

I was too tired from the past week of not sleeping, so I didn't get too far in this even though I spent a few hours on it.

Monday

I came home from work and worked on the writeup a bit more. I was bored of that so I quickly moved over to transcribing all 439 lines of handwritten assembly into a digital file that could be built with the official assembler program. I also included comments and stuff to make it less confusing to other people (or they can at least know what each subroutine is and what it does, even if they can't tell how it works).

Octavian (who won the supersized badge) was cool and tested out my transcribed assembly for me on his Nibbler badge emulator while I went to go eat dinner. He even found and fixed a bug or two in my transcription (no idea how, he's on a different level).

Then I created a nice little utility to download programs from the badge onto a computer since the repo only had a utility to go the other way. I used that to pull the original Tertis program from the badge to compare it against the assembled version of the program. They came out basically identical, so I could confirm that the transcription was correct.

I then forked the repo, made a "tertis" branch, threw my transcribed assembly in, and made a pull request. This was the first time I'd ever done that in GitHub, it was easier than I expected. I also had to make a little fix to the win_flash program since the flush statement wasn't doing anything for me (must be the weird generic drivers I have), so I created as separate pull request for both the serial utility I made and the one I fixed.

Tuesday

I was busy with a "friendsgiving" thing after work, so I didn't do much this day.

I did check to see that my pull request was approved, and it was. Cool! You can find the Tertis source code here if you want to try it out.

I then started making the magazine cover for the fake magazine that I'd use for the type-in listing of the code (more on this in the Extra section below). I went to sleep really late after that.

Would you believe me if I told you this wasn't a real magazine cover?

Wednesday

I didn't keep track of how much I slept in the previous four days, it couldn't have been a lot, but today I slept for only 5 hours and I really felt it all day (to the point where even my job was impacted).

After work, I just sat and finished the magazine. It actually took less time than I expected and the results were great. I even managed to trick a friend into thinking it was real! 🤣

I submitted a small PR to my Tertis source code for a bug that I found in one of the comments for how to do the score counter mod. Hopefully that shows up tomorrow.

I also just powered through to finish this writeup and submit it to Hackaday. My goal was to get it all done before 10:00 pm, but that didn't happen. Anyways, I didn't have to stay up until 1:00 am, I got it done at around 10:30 pm.

Conclusion

Overall, I had fun with this little project. I'll definitely be filing it under "Never Again", but I can't say it wasn't cool to write and debug code as my ancestors once did.

The interesting thing is that almost none of my subroutines had any logic errors in them. That was strange because I'm not used to writing assembly, and correcting mistakes is very hard on paper (maybe that's why, got it right the first time out of necessity). Some had to be rewritten for efficiency, but they all pretty much worked the first time. Any bugs I did encounter were from messing up during program entry (usually being a single bit off or messing up with turning a negative number into a 2's complement binary value in my head). The only logic errors I had were with the subroutine to stack blocks for my demo on stage, and one other supporting subroutine later down the line.

I also ended up using no memory aside from the pages for the screen, and the coordinates for the active tetromino. Everything else was stored in registers (which take up memory, but that doesn't count, just like the stack). And to make things more interesting, I chose not to refer to the manual after the first day of the con (which is a lie because I peeked once on Sunday to get the table for the timer delays). So all that studying paid off.

My little yellow paper legal pad also came out of this ordeal with battle scars. Below is a picture of all the pages that were written on. Most of them went into the production version of Tertis. They're all battered and wrinkled and have grease stains (because I ate while working a lot) and folded/ripped pages like a true engineer's notebook.

I dare you to type it in using just this

There is definitely room for improvement and optimizations. Most of the subroutines are not optimal because they were written in a hurry and/or late at night and you can't move things around on paper as you think of improvements, so I had to commit to whatever course I chose when I started writing each subroutine.

The Hackaday Supercon is an amazing convention, and if you ever have the chance to go and have even the smallest inkling of interest, I highly recommend it! I plan to be there next year, and the year after that, and the year after that (it helps that I'm kind of local).

So what's next? I don't know. I've been thinking I might play around with the badge for a bit and write some stuff (Snake, Pong, Flappy Bird, Lunar Lander, a Brainfuck interpreter, the ~~crappiest~~ smallest Forth interpreter, or maybe even Doom). But this time with the assembler on a computer. Writing Tertis by hand on paper and entering it in manually with buttons was an experience and all that, but not one I'm in a hurry to repeat. Or, I could just work on other hardware projects not involving the badge. Who knows, only time will tell.

Extra

Here is the transcribed source code to Tertis on Hackaday's repo.

Kuba Tyszko, who gave a talk on cracking encrypted software, gave me the suggestion that I put my code into a magazine like the type-in programs of old. So I did exactly that. But not only that, I had to create the magazine that it goes inside (which apparently fooled some people, so I guess I did it right)! Here you go, have fun, happy hacking:

https://drive.google.com/file/d/18-dJTzrF_ELPg95-Ffr47-QUv_PNf2bI/view

If you actually type it in by hand (or even if you load it over serial), please send images/videos on Twitter and tag @koppanyh with #TetrisGuy and #Tertis (not Tetris) so I can see it. I'll even be giving out free retweets to the first infinity people who do this!

Monday, December 20, 2021

The CountdownSort Algorithm (or the 2nd worst sorting algorithm)

I was thinking about the SleepSort algorithm and how you might make it less unreliable. The problem with SleepSort is that if you try to make it faster by scaling down the sleep delays to milli/microsecond levels, then you run into some rather undeterministic behavior by the OS's scheduler in that it can't do exact millisecond-scale timing of when the threads should run. This is a normal aspect of non-realtime operating systems, but it makes SleepSort unreliable along with just plain slow. The solution is to operate on the second-scale (process schedulers do have the resolution for this), but then sorting a list with huge numbers could literally take more time than humans have had on Earth.

What if we could speed things up and have them be reliable and deterministic? We can; enter CountdownSort.

The idea is that instead of making processes with delays, you make nodes in a linked list with a counter. That counter is initialized with the same value as the delay in SleepSort, but we iterate through the linked list and decrement the counter for every node in each iteration. Once a node reaches 0, then it is pulled off the linked list and added to the end of a sorted linked list. The process repeats until there are no more nodes in the unsorted linked list.

Here I've written the basic algorithm in C++ (and man it's been a loooong looong time since I've written something like this in C++).

#include <iostream>
using namespace std;

class Node {
    void print2() {
        cout << value << " ";
        if(next != NULL) next->print2();
    }
    public:
    Node* next = NULL;
    int counter = 0;
    int value;
    Node(int val) {
        value = val;
        counter = val; //counter value defined here
    }
    ~Node() { if(next != NULL) delete next; }
    Node* append(int val) { return next = new Node(val); }
    void print() {
        cout << value << " ";
        if(next != NULL) next->print2();
        cout << endl;
    }
};

Node* countdownSort(Node* root) {
    Node* sortedRoot = NULL;
    Node* sortedCur = NULL;
    while(root != NULL) { //repeat until nothing left to sort
        Node* prev = NULL;
        Node* cur = root;
        while(cur != NULL) { //iterate through the list
            if(--cur->counter <= 0) { //add to sorted list
                //pop from old
                if(prev == NULL) root = cur->next;
                else prev->next = cur->next;
                //add to new
                if(sortedCur == NULL) sortedRoot = sortedCur = cur;
                else { sortedCur->next = cur; sortedCur = cur; }
                cur = cur->next;                sortedCur->next = NULL;
            } else {
                prev = cur;
                cur = cur->next;
            }
        }
    }
    return sortedRoot;
}

int main(){
    Node* root = new Node(3); // initialize our list
    Node* cur = root;
    cur = cur->append(7);
    cur = cur->append(5);
    cur = cur->append(2);
    root->print(); //print: 3 7 5 2
    root = countdownSort(root); //sort
    root->print(); //print: 2 3 5 7
}

You can see that in the main function we add the numbers in this order: 3, 7, 5, 2. Then we print it out, giving us: 3 7 5 2. Then we use CountdownSort to sort it and end up with: 2 3 5 7. Success!

The space complexity is O(n) since we only use the nodes that exist, so no further memory is used. But even though it works, it's still a terrible algorithm! There are 2 iterations needed to get to the point where we move 2 to the sorted list in the beginning. Then after 3 there are an additional 2 iterations to move the next value, 5, to the sorted list. Same for the value 7. So we eventually get to the point where each value is added to the sorted list, but we skip any steps in the middle! So if our list of values looked like [100(100), 2(2)] (where the value within the parenthesis is the counter for the real value outside of the parenthesis), then there would be 98 useless iterations in the process of sorting just 2 numbers, or almost the same cost as sorting 100 values (minus the part where we iterate through multiple values). As such, the worst-case time complexity is O(n*m) where n is the number of nodes and m is the largest counter value. This example takes 17 iterations total.

Welp, we made the 2nd worst sorting algorithm (it runs way faster than SleepSort and only wastes a bunch of CPU power in the process), so our job here is done. Or maybe not, it turns out you can actually optimize this into a half decent sorting algorithm.

We said that you need 2 iterations in order for the counter to reach the first value, but what if you could offset all the counters in the entire list by the right amount to make it so the smallest item would be added to the sorted list on the first iteration without ruining the relative offsets of each item. Instead of the counters starting at [100(100), 2(2)] (as per our previous example), we start them at [100(99), 2(1)] so the 2nd item (our smallest) would be added to the sorted list on the first iteration.

Sounds good, but if there's a gap then the next smallest item, 100(99), would still need more than one more iteration to be added to the sorted list. So what if on each iteration we just keep track of the current smallest value in the list? We can do that easily since we have to iterate through the remaining items anyways. So each iteration will always result in the addition of the next smallest item, and no wasted iterations would exist. Take our new list, [98(98), 100(100), 2(2)], find the smallest, and offset the whole list so it's [98(97), 100(99), 2(1)]. We pop item 3 since that counter reaches 0 and are left with [98(97), 100(99)], our last iteration tells us the smallest counter is 97 so we offset the list so it's now [98(1), 100(3)] and pop the first item when the counter reaches 0 and send that off to the sorted list. Our list is now [100(3)] and since we know 3 is the smallest number, we offset the list to become [100(1)] and then send that to the sorted list since the counter reaches 0.

If we add that modification to the function, then we end up with the following:

#include <iostream>
#include <climits>
using namespace std;

class Node {
    void print2() {
        cout << value << " ";
        if(next != NULL) next->print2();
    }
    public:
    Node* next = NULL;
    int counter = 0;
    int value;
    Node(int val) {
        value = val;
        counter = val; //counter value defined here
    }
    ~Node() { if(next != NULL) delete next; }
    Node* append(int val) { return next = new Node(val); }
    void print() {
        cout << value << " ";
        if(next != NULL) next->print2();
        cout << endl;
    }
};

//find the smallest counter in the list
int smallestCounter(Node* cur) {
    int smallest = cur->counter;
    while(cur != NULL) {
        if(cur->counter < smallest) smallest = cur->counter;
        cur = cur->next;
    }
    return smallest;
}

Node* countdownSort(Node* root) {
    Node* sortedRoot = NULL;
    Node* sortedCur = NULL;
    int offset = smallestCounter(root);
    while(root != NULL) { //repeat until nothing left to sort
        Node* prev = NULL;
        Node* cur = root;
        int newSmallest = INT_MAX;
        while(cur != NULL) { //iterate through the list
            cur->counter -= offset;
            if(cur->counter <= 0) { //add to sorted list
                //pop from old
                if(prev == NULL) root = cur->next;
                else prev->next = cur->next;
                //add to new
                if(sortedCur == NULL) sortedRoot = sortedCur = cur;
                else { sortedCur->next = cur; sortedCur = cur; }
                cur = cur->next;
                sortedCur->next = NULL;
            } else {
                if(cur->counter < newSmallest)
                    newSmallest = cur->counter;
                prev = cur;
                cur = cur->next;
            }
        }
        offset = newSmallest;
    }
    return sortedRoot;
}

int main(){
    Node* root = new Node(3); // initialize our list
    Node* cur = root;
    cur = cur->append(7);
    cur = cur->append(5);
    cur = cur->append(2);
    root->print(); //print: 3 7 5 2
    root = countdownSort(root); //sort
    root->print(); //print: 2 3 5 7
}

The only thing that was changed was adding the function to find the smallest counter before the first iteration, keeping track of the new smallest counter (but only if it's not being removed from the list), and then counting down by the entire offset instead of just by 1. That last one is actually better than trying to offset everything to make the smallest counter 1 and then using the decrement to reach 0 since you already know how much you have to subtract to reach 0 anyways. This slight modification to the algorithm results in the same sorting taking only 10 iterations total, with a new worst-case (and average case if all numbers are unique) time complexity of O(n^2) (actually more like O(n^2-n), which given an input of 4 actually returns 10, but we don't count the little n).

And now we've got an almost good sorting algorithm!

At this point it turns out to be basically SelectionSort (which is a real-world sorting algorithm) but using linked lists and wasting memory on a counter and finding the minimum value by subtraction rather than comparisons. Since we're wasting memory, it would've been worth making it a doubly linked list so we could find the previous node easier and just write an extra method that would automatically bridge the previous and next nodes together and release this node from it's linking responsibilities, but I wrote the example code as I wrote this post, so that didn't happen (and it's late at night, so still not gonna happen).

The limitations of this algorithm is that the counter requires positive numbers, so if you want to use 0 or negative numbers then you have to somehow make sure the counter is always positive and not 0. One way to do this is to make the counter an unsigned int and add INT_MAX to it so the lowest negative numbers start at 0 (probably an off-by-one error waiting to happen here) and the lowest positive numbers start at around 2 billion (halfway up UINT_MAX). Also this probably wouldn't work on strings or complex objects since you'd have to find a way to make counter values that maintain relative ordering with each other, and that probably requires sorting to generate new values, which defeats the entire purpose.

This post started as a joke, but ended up being an actually serious analysis of an algorithm. The original Discord chat that prompted this post and explored the time complexities and optimizations is much more entertaining. Oh well, at least we had fun 😂

Sunday, May 10, 2020

Using a 3D printer as a CNC plastic welding machine

Intro:
A friend and I were discussing fusing plastic sheets and creating complex patterns and specific curves with the seams. This reminded me of a thought I had a while back to use a 3D printer as a CNC plastic fusing machine since the position is computer controlled and the print head can be set to specific temperatures for melting the plastic. There is a technique to use a soldering iron to fuse plastic sheets by hand, and this would be the robotic equivalent.

If this works with such thin plastic and on a smallish scale, then it could be a good method of producing inflatable pneumatic actuators for soft robots (which is something I've been trying to figure out how to do cheaply and easily for a little while now).

The Plan:
Before being able to use my 3D printer (Monoprice Maker Select Plus) as a CNC plastic fusing machine, I'd need to find out the correct temperature and speed to use on my material of choice: disposable Subway sandwich bags, the thin ones for a single sub.
According to Google, a thin plastic shopping bag is about 0.5mil, which comes out to about 12.7 microns, which is thin, and according to my friend might be harder to work with due to how thin the plastic is (I can confirm having tried to fuse this plastic manually with an unregulated soldering iron before), so this test is absolutely necessary.

The idea I came up with for the test looked something like this in order to test a variety of speeds and temperatures:

The rows would keep on increasing in temperature as they went up, and the speed would keep on getting faster.

I decided this wouldn't be able to fit enough rows on my tiny printer, so I made it more like a grid of temperatures with 12 lines each starting at a slow speed and ending with a fast speed (although this was later reversed to start fast and end slow).

If I was going to melt unknown (probably LDPE) plastic with my print head, I didn't want to contaminate it or risk it in any way, so I added 2 layers of aluminum foil to the print head and secured it with a wire loop.

Then I cut out a cardboard holder for the plastic and clamped it onto the build plate with binder clips. It's a sandwich with a cardboard square underneath the plastic sheets, the sheets themselves, then the cardboard frame above that.

The cardboard on the bottom also works to keep the print head from scratching the build plate if something went wrong, the worst that would happen is the plastic would be ripped and the cardboard would have indentations in it.

The Software:
I wrote the control software for this in Python 2.7 (I know, I know, Python 2.7 is dead, but until the python command stops bringing up version 2.7 in the Ubuntu terminal, then I'll probably keep on using it, and no I don't want to change the link to point to Python 3 😛). The repo is at https://github.com/koppanyh/Plastic-Fuser
All it has to do is generate the G-code to be sent to the printer and make sure the lines are within the cardboard frame, and at the proper height and temperatures and position and everything.
I spent an entire day writing it, which is sad because I thought this little experiment would last about half a day and be done, not take me 1.5 days to do in which more than a day was spent just on the software.

I tried to make a nice interface, so the output is color coded (error are even red and everything) and the calibration section was designed so you could only type in what you wanted and skip what was already good.

Since in my testing the calibration data didn't change often (only if you messed around with the cardboard frame) it made sense to be able to save and restore it from a file so you wouldn't have to go through the calibration multiple times.

The calibration step would ask you to input the coords for a specific corner, like "upper left", and would present you a prompt to type in X's value followed by a default value which was the last X value you typed in. If you type in a value, then it sets X at that value, if you hit enter it will use the previous value and if you type # then it moves onto the next corner. After X is typed in, it asks for Y and follows the same input convention. The Z axis stays at 10mm the whole time so it's hovering over the plastic.

Once all the corners are specified, then it calculates the center by averaging all the X coordinates for the center X and averaging all Y coordinates for the center Y. It looks like this (green point) and is better than the red point because it's the position that is more or less the farthest from all the edges at once (the red can get really close to the edges if you squish the square in just the right way).

Then the software visits all 5 points and for each one asks the Z height. The prompt for each Z is the same as for X or Y in that you can enter new, reuse previous, or move onto the next action.

The software then calculates and instructs the printer to draw lines at the positions and temperatures and speeds detailed in the grid image above.

To make matters complicated, if the build platform quad is not made of right angles, then the software will calculate the paths so they are squished in the same manner as the work surface. So if the top is bigger than the bottom (upside down trapezoid) then the lines will be angled in a way to diverge following this expansion.

To make things even more complicated, the cardboard piece at the bottom was not flat, and so was warped. I added code to attempt "bed leveling" in software to correct the Z height for this. It worked ok, but I realized a major flaw. The software calculates the Z for the starting point and the ending point, then draws a 3D line directly between the 2. This probably does not follow the surface of the cardboard when modeled as a pyramid with the 5 points. My poorly illustrated drawing attempts to show this. On the left you have the pyramid with the line going from one side to the other. In an ideal world it would climb up the starting face, then across the middle face, then down the ending face (depicted on the right by the line that goes over the pyramid cross-section). In reality it goes straight through the pyramid (the line that goes through the cross-section). Fortunately the difference in heights is around a millimeter or less, so this inaccuracy isn't a big deal since the cardboard is a bit flexible.

The software was given a PRODUCTION flag that when set to true will output over serial port to the printer, or when false will output to a debug.gcode file. This G-code file was viewed in https://ncviewer.com to make sure the paths were correct and to see if the warp compensation was working good enough. I had to fix a few little bugs dealing with the coordinate system, but for the most part it seemed to work more or less the first time. Good thing I could preview the movements before actually using the printer.

The First Test:
With the software done it was finally time to run the first actual test. There was a pretest where I ran the software but gave the Z axis a +10 offset so it wouldn't touch the plastic and turned off thermals just so I could see the path once in software. It looked good so I was able to undo those little changes to have it run the real deal.

This test would be as planned, 9 thermal regions with 12 lines each. The temperature would start at 115°C and increase 15°C on each following region to end up with 280°C at the last region. The lines in each region would have the first one drawn at 11994mm/min and be decremented by 954mm/min until the 12th line would be drawn at 1500mm/min.

Here is a video of the first 115C region being drawn:
https://twitter.com/koppanyh/status/1259620567235170305?s=20

Issues:
There were some major issues with this test. At the start I could see that the plastic needed to be stretched tighter because the extruder seemed to be dragging the plastic a bit. Then I saw the hotend's blower fan nozzle hitting the binder clip at the bottom middle, I'm sure this contributed to a bit of misalignment. Then I noticed the temperatures were changing to the next temperature almost halfway through drawing each thermal zone. This was because apparently the M109 command doesn't wait for movements to be done before it tries to change the temperature, so it would be changing the temperature as soon as the buffer would have space for the command to be sent. Some of the faster lines would go really slow for some reason and then would finally run at the expected speed once the speed was at or below around 8000mm/min. And when the temperature was high enough the hotend would poke holes in the plastic because the plastic would stick to it and when it pulled away it would pull a hole in the plastic.

The Second Test:
With those issues, I killed the test less than halfway through, made minor code modifications, made tiny hardware adjustments, and started the test from scratch with fresh plastic. This was the progress of the first test:

I solved the loose plastic, blower nozzle collision, and hole poking problems in one go by manually taking care to keep the plastic tight and using 3 mini binder clips at the bottom.

The issue of temperatures changing before it was time was solved by adding the M400 command right before the M109 commands, this way it would wait for the current movements to be done before it changed the temperature and queued up the next set of movements.

I didn't know how to solve the issue where some fast lines run way slower than expected, but I didn't run into that issue this second time around. My theory is that it was some kind of weird behavior related to changing the Z height for warp compensation (which is theoretically a buggy feature anyways) and the new calibration after moving the cardboard somehow didn't trigger this behavior.

I changed the thermal zones to start at 145°C and increase 5°C on each following zone so it would end at 185°C. But I kept the 9 zones with 12 lines and speeds the same. This is what it looked like:

More Issues:
Of course, second time's not the charm, but this test did go much better than the first one. At thermal increments of 5°C, my printer did not have the accuracy to maintain this, so as it drew out one thermal zone I would notice the temperature wandering between ±5°C. This means that the results in one thermal zone would be similar to its neighbors' results. The Z height calibrations were too high on the lower right side, so you can see that some of those lines are only partially drawn. I forgot an M400 command at the end to delay turning off the heater, because of this the final thermal zone started at 185°C dropped to about 170°C by the end. There was still the issue of the plastic bunching up as it was dragged a bit by the hotend, this would create a skipping patter on some lines when the hotend would just run over those bunches. And at higher temperatures I noticed color bleeding from the plastic onto the cardboard, which would have stained my precious heated bed if the cardboard wasn't there to protect it.

Results:
Once this second test was done, the results were good enough to analyze the seams.

Immediately the bottom row (145°C, 150°C, 155°C) just peeled apart with pretty much no effort. It wasn't just the partially drawn lines because they had sections that were contacted by the hotend. These temperatures were discarded.

I labeled and cut the rest of the thermal zones out to analyze them one by one (picture was before I cut them to individual sections).

I peeled each one apart, starting from the left where the lines were drawn fast and ending where the lines were drawn slow. The common observation was that the first half (the highest speeds) always came apart without a fight.

At 160°C only the slowest speed (1500mm/min) didn't practically fall apart by itself, but it was still peeled apart too easily.

At 165°C it acted like at 160°C, but the last one required just a tiny bit more force to peel apart, still too easy though.

At 170°C the last 2 (1500mm/min and 2454mm/min) hung on nicely near the end, but the higher heat and bunching issue caused it to form holes along the seam as a pulled it before the seams finally gave up. The force to do this was starting to get acceptable though.

At 175°C the fastest 8 seams broke pretty easily. The next 2 seams looked nice and didn't break without a fight. The last 2 seams had holes from the heat, they seemed pretty strong though.

At 180°C it pretty much acted like at 175°C but the last 4 seams all looked nice and were pretty strong, but there was significant bunching on these. The nicest results were on 4362mm/min, just like at 175°C.

At 185°C the first 6 seams broke apart too easily, and this was the one where a missing M400 command caused it to start cooling off before all the lines were drawn. The remaining seams were pretty strong, but there was heavy bunching on all the lines and some small holes on the slowest 2 speeds.

Conclusion:
None of the seams produced in this experiment were particularly acceptable for pneumatic applications. Part of the reason could have been that variables such as contact surface or contact pressure were not regulated in any way, but could make big differences in how nicely the plastic was fused.

Based on my limited testing and experience trying to do this process manually, it seems that Subway bags are not a good material to use for manufacturing pneumatic bladders using thermal plastic fusing. It is possible that it could work, but it would most likely need either a more precise setup, or a different approach altogether.

It will be necessary to continue exploring ways to cheaply, quickly, and easily make soft robotic pneumatic actuators.

Regardless of the failure of this project, I learned a lot about G-code, and interfacing with a printer over serial connection. I also learned that the Subway bags I've hoarded are not as useful as I was hoping they'd be 😛.

Future Improvements/Research:
If anyone wants to replicate or continue this experiment, here are some suggestions for ways to improve or explore different options.

Use a better surface than warped cardboard. Something that is level by default and preferably still able to take a bit of a beating if your extruder goes down too far without damaging the extruder itself.

Maybe create a surface that is suspended by springs, so the hotend can go down a bit further and the springs will provide almost constant pressure at any point of the plastic instead of it being at the whim of the warpage and warp compensation/calibration.

Pay better attention to making the plastic as tight as possible. This is harder than I thought and I probably should have spent more time on this. Don't make the same mistake as me!

Having a rounded heating tip would probably be more effective than a point covered in aluminum foil. Maybe something like a brass screw with the same thread as the extruder with one side rounded into a hemisphere. This way it can screw into the hotend and provide a much nicer contact point that might help with the bunching problem.

Miscellaneous:
While analyzing the thermal zones, I found some dried Meatball Marinara sauce on the bag. It had been there for quite some time, but it was in a small enough amount that I didn't notice it before until I started looking closely to inspect the seams. Good thing there was cardboard and aluminum foil separating the printer from the plastic because I don't want that stuff defiling my printer 🤣. This most likely won't be a problem for normal people because normal people don't eat as many sandwiches as I do (at least I hope).

I also realized that I kind of wrote this like a detailed lab writeup essay from college, which is funny because I absolutely hated writing anything and there's no way I would have been able to write all of this in less than a day in one sitting the way I did here. It helps that the language and format are much less formal than what's expected in college, and the topic is much more interesting than the kind of stuff I had to do in college 😛.