Hiding kilobytes C=Hacking Issue 7.txt

From ReplayResources
Jump to navigationJump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Taken from The Fridge C=Hacking Section C=Hacking Issue #7

This issue of C=Hacking #7 was stripped down to just the Hiding kilobytes article by Marko Mäkela which contains the interesting sections "_Freezer cartridges_" and "_Building an unbeatable freezer circuit_".



Hiding kilobytes
by Marko M"akel"a <Marko.Makela@Helsinki.FI>

Most Commodore 64 programs do not utilize even nearly all of the 64 kB
random access memory space. By default, there are only 44 kilobytes of
accessible RAM. This article describes how you can take the hiding 20
kilobytes to use.


_Memory management_

  The Commodore 64 has access to more memory than its processor can
directly handle. This is possible by banking the memory. There are
five user configurable inputs that affect the banking. Three of them
can be controlled by program, and the rest two serve as control lines
on the memory expansion port.

  The 6510 MPU has an integrated I/O port with six I/O lines. This
port is accessed through the memory locations 0 and 1. The location 0
is the Data Direction Register for the Peripheral data Register, which
is mapped to the other location. When a bit in the DDR is set, the
corresponding PR bit controls the state of a corresponding Peripheral
line as an output. When it is clear, the state of the Peripheral line
is reflected by the Peripheral register. The Peripheral lines are
numbered from 0 to 5, and they are mapped to the DDR and PR bits 0 - 5,
respectively. The 8502 processor, which is used in the Commodore 128,
has seven Peripheral lines in its I/O port. The seventh line is connected
to the Caps lock or ASC/CC key.

  The I/O lines have the following functions:

     Direction  Line  Function
     ---------  ----  --------
        out      P5   Cassette motor control. (0 = motor spins)
        in       P4   Cassette sense. (0 = PLAY button depressed)
        out      P3   Cassette write data.
        out      P2   CHAREN
        out      P1   HIRAM
        out      P0   LORAM

  The default value of the DDR register is $2F, so all lines except
Cassette sense are outputs. The default PR value is $37 (Datassette
motor stopped, and all three memory management lines high).

  Like most chips in the Commodore 64, the 6510 MPU uses the NMOS
(N-channel metal oxide semiconductor) technology. The NMOS switches
produce strong logical '0' levels, but weak '1' levels. The opposite
is the PMOS (P-channel metal oxide semiconductor) technology, which
cannot pull strong signals low, but is able to drive them high. The CMOS
technology (complementary metal oxide semiconductor), which combines
these two technologies, is able to drive both logical levels.

  Because most integrated circuits in the C64 use the NMOS technology,
all hardware lines that are not outputs are driven to +5 volts with a
weak current. This is usually accomplished by pull-up resistors, large
resistances between the hardware lines and the +5 volt power supply
line. The resistors can be inside a chip or on the printed circuit
board. This allows any NMOS or CMOS chip to drive the line to the
desired state (low or high voltage level).

  The difference between an input and an output is that an output uses
more current to drive the signal to the desired level. An input and an
output outputting logical '1' are equivalent for any other inputting
chip. But if a chip is trying to drive a signal to ground level, it
needs more current to sink an output than an input. You can even use
outputs as inputs, i.e. read them in your program.

        You can use this feature to distinquish between the left
      shift and the shift lock keys, although they are connected
      to same hardware lines. The shift lock key has smaller
      resistance than the left shift. If you make both CIA 1
      ports to outputs (write $FF to $DC03 and $DC01) prior
      reading the left shift key, only shift lock can change the
      values you read from CIA 1 port B ($DC01).)

  So, if you turn any memory management line to input, the external
pull-up resistors will make it look like it is outputting logical
'1'. This is actually why the computer always switches the ROMs in
upon startup: Pulling the -RESET line low resets all Peripheral lines
to inputs, thus driving all three processor-driven memory management
lines high.

  The two remaining memory management lines are -EXROM and -GAME on
the cartridge port. Each line has a pull-up resistor, so the lines are
'1' by default. (In the Commodore 128, you can set the state of these
two lines prior to selecting the C64 mode, provided that you write the
mode switch routine yourself.)

  Even though the memory banking has been implemented with a 82S100
Programmable _Logic_ Array, there is only one control line that seems
to behave logically at first sight, the -CHAREN line. It is mostly
used to choose between I/O address space and the character generator
ROM. The following memory map introduces the oddities of -CHAREN and
the other memory management lines. It is based on the memory maps in
the Commodore 64 Programmer's Reference Guide, pp. 263 - 267, and some
errors and inaccuracies have been corrected.

  The leftmost column of the table contains addresses in hexadecimal
notation. The columns aside it introduce all possible memory
configurations. The default mode is on the left, and the absolutely
most rarely used Ultimax game console configuration is on the right.
(There have been at least two Ultimax cartridges on the market.) Each
memory configuration column has one or more four-digit binary numbers
as a title. The bits, from left to right, represent the state of the
-LORAM, -HIRAM, -GAME and -EXROM lines, respectively. The bits whose
state does not matter are marked with "x". For instance, when the
Ultimax video game configuration is active (the -GAME line is shorted
to ground, -EXROM kept high), the -LORAM and -HIRAM lines have no effect.


      default                                                         Ultimax
        1111    101x    1000    011x    001x    1110    0100    1100    xx01
                                00x0
10000
-----------------------------------------------------------------------------
 F000
        Kernal  RAM     RAM     Kernal  RAM     Kernal  Kernal  Kernal  ROMH(*
 E000
-----------------------------------------------------------------------------
 D000   IO/C    IO/C    IO/RAM  IO/C    RAM     IO/C    IO/C    IO/C    I/O
-----------------------------------------------------------------------------
 C000   RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM      -
-----------------------------------------------------------------------------
 B000
        BASIC   RAM     RAM     RAM     RAM     BASIC   ROMH    ROMH     -
 A000
-----------------------------------------------------------------------------
 9000
        RAM     RAM     RAM     RAM     RAM     ROML    RAM     ROML    ROML(*
 8000
-----------------------------------------------------------------------------
 7000

 6000
        RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM      -
 5000

 4000
-----------------------------------------------------------------------------
 3000    

 2000   RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM      -

 1000
-----------------------------------------------------------------------------
 0000   RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM     RAM
-----------------------------------------------------------------------------

    *) Internal memory does not respond to write accesses to these areas.


    Legend: Kernal      E000-FFFF       Kernal ROM.

            IO/C        D000-DFFF       I/O address space or Character
                                        generator ROM, selected by -CHAREN.
                                        If the CHAREN bit is clear,
                                        the character generator ROM is
                                        chosen. If it is set, the
                                        I/O chips are accessible.

            IO/RAM      D000-DFFF       I/O address space or RAM,
                                        selected by -CHAREN.
                                        If the CHAREN bit is clear,
                                        the character generator ROM is
                                        chosen. If it is set, the
                                        internal RAM is accessible.

            I/O         D000-DFFF       I/O address space.
                                        The -CHAREN line has no effect.

            BASIC       A000-BFFF       BASIC ROM.

            ROMH        A000-BFFF or    External ROM with the -ROMH line
                        E000-FFFF       connected to its -CS line.

            ROML        8000-9FFF       External ROM with the -ROML line
                                        connected to its -CS line.

            RAM         various ranges  Commodore 64's internal RAM.

            -           1000-7FFF and   Open address space. 
                        A000-CFFF       The Commodore 64's memory chips
                                        do not detect any memory accesses
                                        to this area except the VIC-II's
                                        DMA and memory refreshes.

    NOTE:   Whenever the processor tries to write to any ROM area
            (Kernal, BASIC, CHAROM, ROML, ROMH), the data will get
            "through the ROM" to the C64's internal RAM.

            For this reason, you can easily copy data from ROM to RAM,
            without any bank switching. But implementing external
            memory expansions without DMA is very hard, as you have to
            use the Ultimax memory configuration, or the data will be
            written both to internal and external RAM.

            However, this is not true for the Ultimax game
            configuration. In that mode, the internal RAM ignores all
            memory accesses outside the area $0000-$0FFF, unless they are
            performed by the VIC, and you can write to external memory
            at $1000-$CFFF and $E000-$FFFF, if any, without changing
            the contents of the internal RAM.


_A note concerning the I/O area_

  The I/O area is divided as follows:

     Address range  Owner
     -------------  -----
       D000-D3FF    MOS 6567/6569 VIC-II Video Interface Controller
       D400-D7FF    MOS 6581 SID Sound Interface Device
       D800-DBFF    Color RAM (only lower nybbles are connected)
       DC00-DCFF    MOS 6526 CIA Complex Interface Adapter #1
       DD00-DDFF    MOS 6526 CIA Complex Interface Adapter #2
       DE00-DEFF    User expansion #1 (-I/O1 on Expansion Port)
       DF00-DFFF    User expansion #2 (-I/O2 on Expansion Port)

  As you can see, the address ranges for the chips are much larger
than required. Because of this, you can access the chips through
multiple memory areas. The VIC-II appears in its window every $40
addresses. For instance, the addresses $D040 and $D080 are both mapped
to the Sprite 0 X co-ordinate register. The SID has one register
selection line less, thus it appears at every $20 bytes. The CIA
chips have only 16 registers, so there are 16 copies of each in their
memory area.

  However, you should not use other addresses than those specified by
Commodore. For instance, the Commodore 128 mapped its additional I/O
chips to this same memory area, and the SID responds only to the
addresses D400-D4FF, also when in C64 mode. And the Commodore 65,
which unfortunately did not make its way to the market, could narrow
the memory window reserved for the MOS 6569/6567 VIC-II (or CSG 4567
VIC-III in that machine).


_The video chip_

  The MOS 6567/6569 VIC-II Video Interface Controller has access to
only 16 kilobytes at a time. To enable the VIC-II to access the whole
64 kB memory space, the main memory is divided to four banks of 16 kB
each. The lines PA0 and PA1 of the second CIA are the inverse of the
virtual VIC-II address lines VA14 and VA15, respectively. To select a
VIC-II bank other than the default, you must program the CIA lines to
output the desired bit pair. For instance, the following code selects
the memory area $4000-$7FFF (bank 1) for the video controller:

    LDA $DD02 ; Data Direction Register A
    ORA #$03  ; Set pins PA0 and PA1 to outputs
    STA $DD02
    LDA $DD00
    AND #$FC  ; Mask the lowmost bit pair off
    ORA #$02  ; Select VIC-II bank 1 (the inverse of binary 01 is 10)
    STA $DD00

  Why should you set the pins to outputs? Hardware RESET resets all
I/O lines to inputs, and thanks to the CIA's internal pull-up
resistors, the inputs actually output logical high voltage level. So,
upon -RESET, the video bank 0 is selected automatically, and older
Kernals could leave it uninitialized.

  Note that the VIC-II always fetches its information from the
internal RAM, totally ignoring the memory configuration lines. There
is only one exception to this rule: The character generator ROM.
Unless the Ultimax mode is selected, VIC-II "sees" character generator
ROM in the memory areas 1000-1FFF and 9000-9FFF. If the Ultimax
configuration is active, the VIC-II will fetch all data from the
internal RAM.


_An application: Making an operating system extension_

  If you are making a memory resident program and want to make it as
invisible to the system as possible, probably the best method is
keeping most of your code under the I/O area (in the RAM at
$D000-$DFFF). This area is very safe, since programs utilizing it are
rare, since they are very difficult to implement and to debug. You
need only a short routine in the normally visible RAM that pushes the
current value of the processor's I/O register $01 on stack, switches
RAM on to $D000-$DFFF and jumps to this area. Returning from the
$D000-$DFFF area is possible even without any routine in the normally
visible RAM area. Just write an RTS or an RTI to an I/O register and
return through it.

  But what if your program needs to use I/O? And how can you write the
return instruction to an I/O register while the I/O area is switched
off? You need a swap area for your program in normally visible memory.
The first thing your routine at $D000-$DFFF does is copying the I/O
routines (or the whole program) to normally visible memory, swapping
the bytes. For instance, if your I/O routines are initially being
stored at $D200-$D3FF, exchange the bytes in $D200-$D3FF with the
contents of $C000-$C1FF. Now you can call the I/O routines from your
routine at $D000-$DFFF, and the I/O routines can switch the I/O area
temporarily on to access the I/O chips. And right before exiting your
program at $D000-$DFFF swaps the old contents of that I/O routine area
in, e.g. exchanges the memory areas $D200-$D3FF and $C000-$C1FF
again.

  What I/O registers can you use for the return instruction? There are
basically two alternatives: 8-bit VIC sprite registers or either CIA's
serial port register. The VIC registers are easiest to use, as they
act precisely like memory places: you can easily write the desired
value to a register. But the CIA register is usually better, as
changing the VIC registers might change the screen layout.

  However, also the SP register has some drawbacks: If the machine's
CNT1 and CNT2 lines are connected to a frequency source, you must stop
either CIA's Timer A to use the SP register method. Normally the 1st
CIA's Timer A is the main hardware interrupt source. And if you use
the Kernal's RS232, you cannot stop the 2nd CIA's Timer A either. Also,
if you don't want to lose any CIA interrupts, you might want to know
that executing the RTS or RTI at SP register has the side effect of
reading the Interrupt Control Register, thus acknowledging an interrupt
that might have been waiting.

  If you can't use either method, you can use either CIA's ToD seconds
or minutes or ToD alarm time for storing an RTI. Or, if you don't want
to alter any registers, use the VIC-II's light pen register. Before
exiting, wait for appropriate raster line and trig the light pen latch
with CIA1's PB4 bit. However, this method assumes that the control
port 1's button/light pen line remains up for that frame. After
trigging the light pen, causing the light pen Y co-ordinate register
($D014) to be $40 or $60, you have more than half a frame time to
restore the state of the I/O chips and return through the register.

  You can also use the SID to store an RTI or RTS command. How is this
possible, you might ask. After all, the chip consists of read only or
write only registers. However, there are two registers that can be
controlled by program, the envelope generator and oscillator outputs
of the third voice. This method requires you to change the frequency
of voice 3 and to select a waveform for it. This will affect on the
voice output by turning the voice 3 off, but who would keep the voice
3 producing a tone while calling an operating system routine?

  Also keep in mind that the user could press RESTORE while the Kernal
ROM and I/O areas are disabled. You could write your own non-maskable
interrupt (NMI) handler (using the NMI vector at $FFFA), but a fast
loader that uses very tight timing would still stop working if the
user pressed RESTORE in the middle of a data block transfer. So, to
make a robust program, you have to disable NMI interrupts. But how is
this possible? They are Non-Maskable after all. The NMI interrupt is
edge-sensitive, the processor jumps to NMI handler only when the -NMI
line drops from +5V to ground. To disable the interrupt, simply cause
an NMI with CIA2's timer, but don't read the Interrupt Control
register. If you need to read $DD0D in your program, you must add a
NMI handler just in case the user presses RESTORE. And don't forget to
raise the -NMI line upon exiting the program.  Otherwise the RESTORE
key does not work until the user issues a -RESET or reads the ICR
register explicitly. (The Kernal does not read $DD0D, unless it is
handling an interrupt.) This can be done automatically by the two
following SP register examples due to one of the 6510's undocumented
features (refer to the descriptions of RTS and RTI below).
 
        ; Returning via VIC sprite 7 X co-ordinate register

        Initialization:   ; This is executed when I/O is switched on
                LDA #$60
                STA $D015 ; Write RTS to VIC register $15.

        Exiting:          ; NOTE: This procedure must start at VIC register
                          ; $12. You have multiple alternatives, as the VIC
                          ; appears in memory at $D000+$40*n, where $0<=n<=$F.

                PLA       ; Pull the saved 6510 I/O register state from stack
                STA $01   ; Restore original memory bank configuration
                          ; Now the processor fetches the RTS command from the
                          ; VIC register $15.


        ; Returning via CIA 2's ToD or ToD alarm seconds register

        Initialization:   ; This is executed when I/O is switched on
                LDA #$40  
                STA $DD08 ; Set ToD tenths of seconds
                          ; (clear it so that the seconds register
                          ; would not overflow)
                          ; If ToD alarm register is selected, this
                          ; instruction will be unnecessary.
                STA $DD09 ; Set ToD seconds
                LDA $DD0B ; Read ToD hours (freeze ToD display)

        Exiting:          ; NOTE: This procedure must start at CIA 2 register
                          ; $6. As the CIA 2 appears in memory at $DD00+$10*n,
                          ; where 0<=n<=$F, you have sixteen alternatives.
                PLA
                STA $01   ; Restore original memory bank configuration
                          ; Now the processor fetches the RTS command from
                          ; the CIA 2 register $9.

 
        ; Returning via CIA 2's SP register (assuming that CNT2 is stable)

        Initialization:   ; This is executed when I/O is switched on
                LDA $DD0E ; CIA 2's Control Register A
                AND #$BF  ; Set Serial Port to input
                STA $DD0E ; (make the SP register to act as a memory place)
                LDA #$60
                STA $DD0C ; Write RTS to CIA 2 register $C.

        Exiting:          ; NOTE: This procedure must start at CIA 2 register
                          ; $9. As the CIA 2 appears in memory at $DD00+$10*n,
                          ; where 0<=n<=$F, you have sixteen alternatives.
                PLA
                STA $01   ; Restore original memory bank configuration
                          ; Now the processor fetches the RTS command from
                          ; the CIA 2 register $C.


        ; Returning via CIA 2's SP register, stopping the Timer A
        ; and forcing SP2 and CNT2 to output

        Initialization:   ; This is executed when I/O is switched on
                LDA $DD0E ; CIA 2's Control Register A
                AND #$FE  ; Stop Timer A
                ORA #$40  ; Set Serial Port to output
                STA $DD0E ; (make the SP register to act as a memory place)
                LDA #$60
                STA $DD0C ; Write RTS to CIA register $C.

        Exiting:          ; NOTE: This procedure must start at CIA 2 register
                          ; $9. As the CIA 2 appears in memory at $DD00+$10*n,
                          ; where 0<=n<=$F, you have sixteen alternatives.
                PLA
                STA $01   ; Restore original memory bank configuration
                          ; Now the processor fetches the RTS command from
                          ; the CIA 2 register $C.

        ; Returning via SID oscillator 3 output register

        Initialization:   ; This is executed when I/O is switched on
                LDA #$20  ; Select sawtooth waveform
                STA $D412 ; but do not enable the sound
                LDY #$00  ; Select frequency
                STY $D40E ; (system clock)/$FF00,
                LDA #$FF  ; causing the OSC3 output to increment by one
                STY $D40F ; every $10000/$FF00 cycles.

                LDA #$0E
                LDX #$60

                BIT $D41B ; Wait for the oscillator 3 output
                BMI *-3   ; to be in the range
                BVS *-5   ; $00-$3F.
                BIT $D41B ; Wait for the oscillator 3 output
                BVC *-3   ; to be at least $40.

                STA $D40F ; Slow down the frequency to (system clock)/$0E00.
                CPX $D41B ; Wait for the oscillator 3
                BNE *-3   ; output to reach $60 (RTS)

                STY $D40F ; Reset the frequency of voice 3
                          ; (stop the OSC3 register from increasing)

        Exiting:          ; NOTE: This procedure must start at SID register
                          ; $18. As the SID appears in memory at $D400+$20*n,
                          ; where 0<=n<=$20, you have thirty-two alternatives.
                          ; However, in C128 there are only eight alternatives,
                          ; as the SID is only at $D400-$D4FF.

                PLA
                STA $01   ; Restore original memory bank configuration
                          ; Now the processor fetches the RTS command from
                          ; the SID register $1B.


  For instance, if you want to make a highly compatible fast loader,
make the ILOAD vector ($0330) point to the beginning of the stack
area. Remember that the BASIC interpreter uses the first bytes of
stack while converting numbers to text. A good address is $0120.
Robust programs practically never use so much stack that it could
corrupt this routine. Usually only crunched programs (demos and alike)
use all stack in the decompression phase. They also make use of the
$D000-$DFFF area.

  This stack routine will jump to your routine at $D000-$DFFF, as
described above. For performance's sake, copy the whole byte transfer
loop to the swap area, e.g. $C000-$C1FF, and call that subroutine
after doing the preliminary work. But what about files that load over
$C000-$C1FF? Wouldn't that destroy the transfer loop and jam the
machine? Not necessarily. If you copy those bytes to your swap area at
$D000-$DFFF, they will be loaded properly, as your program restores
the original $C000-$C1FF area.

  If you want to make your program user-friendly, put a vector
initialization routine to the stack area as well, so that the user can
restore the fast loader by issuing a SYS command, rather than loading
it each time he has pressed RESET.


_An example: A "hello world" program_

  To help you in getting started, I have written a small example
program that echoes the famous message "hello, world!" to standard
output (normally screen) using the Kernal's CHROUT subroutine. After
the initialization routine has been run, the program can be started by
commanding SYS 300. I used the Commodore 128's machine language
monitor to put it up, but it was still pretty difficult to debug the
program. Here it is in uuencoded format:

begin 644 hello
M`0@+",D'GC(P-C$```!XI0%(*?B%`:(,O3`(G2P!RA#WHHN]/`B=8]W*T/=H
MA0%88*4!JBGX"01XA0%,I-WF`:*!C@W=H@".!=WHC@3=HMV.#MVB0(X,W<8!
M8*4!2`D#A0&@#+DSP"#2_X@0]VB%`6`A1$Q23U<@+$],3$5(BDBM^O](K?O_
M2*D6C?K_J<"-^_\@W-T@`,!HC?O_:(WZ_R`=P"#<W6BHJ0!(NOX"`=`#_@,!
5A`&@/[X`P+EDW9D`P(J99-V($/!@
`
end

  In order to fully understand the operation of this program, you need
to know how the instructions RTI, RTS and PHA work. There is some work
going on to reverse engineer the NMOS 6502 microprocessor to large
extent, and it is now known for most instructions what memory places
they access during their execution and for what purpose. The internal
procedures haven't been described in detail yet, but these
descriptions should be easier to read anyway.

  For curiosity, I quote here the description of all instructions that
use the stack. The descriptions of internal operations are yet
inaccurate, but the memory accesses have been verified with an
oscilloscope. I will mail copies the whole document upon request. When
finished, the document will be put on an FTP site.

     JSR

        #  address R/W description
       --- ------- --- -------------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  fetch address's low byte to latch, increment PC
        3  $0100,S  R
        4  $0100,S  W  push PCH on stack, decrement S
        5  $0100,S  W  push PCL on stack, decrement S
        6    PC     R  copy latch to PCL, fetch address's high byte to
                       latch, copy latch to PCH


     RTS

        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  R  increment S
        4  $0100,S  R  pull PCL from stack, increment S
        5  $0100,S  R  pull PCH from stack
        6    PC     R  increment PC


     BRK

        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  W  push PCH on stack, decrement S
        4  $0100,S  W  push PCL on stack, decrement S
        5  $0100,S  W  push P on stack (with B flag set), decrement S,
                       set I flag
        6   $FFFE   R  fetch PCL
        7   $FFFF   R  fetch PCH


     RTI

        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  R  increment S
        4  $0100,S  R  pull P from stack, increment S
        5  $0100,S  R  pull PCL from stack, increment S
        6  $0100,S  R  pull PCH from stack


     PHA, PHP

        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  W  push register on stack, decrement S


     PLA, PLP

        #  address R/W description
       --- ------- --- -----------------------------------------------
        1    PC     R  fetch opcode, increment PC
        2    PC     R  read next instruction byte (and throw it away),
                       increment PC
        3  $0100,S  R  increment S
        4  $0100,S  R  pull register from stack


  The example program consists of three parts. The first part
transfers the other parts to appropriate memory areas. The second part
is located in stack area (300-312), and it invokes the third part, the
main module.

  The loader part ($0801-$08C7) is as follows:

        1993 SYS2061

        080D SEI          Disable interrupts.
        080E LDA $01
        0810 PHA          Store the state of the processor's I/O lines.
        0811 AND #$F8
        0813 STA $01      Select 64 kB RAM memory configuration.

        0815 LDX #$0C     Copy the invoking part to 300-312.
        0817 LDA $0830,X
        081A STA $012C,X
        081D DEX
        081E BPL $0817

        0820 LDX #$8B     Copy the main part to $DD64-$DDEE.
        0822 LDA $083C,X
        0825 STA $DD63,X
        0828 DEX
        0829 BNE $0822

        082B PLA          Restore original memory configuration.
        082C STA $01
        082E CLI          Enable interrupts.
        082F RTS          Return.

  The user invokes the following part by issuing SYS 300. This part
changes the memory configuration and jumps to the main part.

        012C LDA $01
        012E TAX          Store original memory configuration to X register.
        012F AND #$F8
        0131 ORA #$04
        0133 SEI          Disable interrupts.
        0134 STA $01      Select 64 kB RAM memory configuration.
        0136 JMP $DDA4    Jump to the main part.

  The main part actually consists of two parts. It may be a bit
complicated, and it might teach new tricks to you.

        DDA4 TXA
        DDA5 PHA          Push original memory configuration on stack.
        DDA6 LDA $FFFA
        DDA9 PHA
        DDAA LDA $FFFB
        DDAD PHA          Store the original values of $FFFA and $FFFB.
        DDAE LDA #$16
        DDB0 STA $FFFA    Set ($FFFA) to point to RTI.
        DDB3 LDA #$C0
        DDB5 STA $FFFB
        DDB8 JSR $DDDC    Swap the auxiliary routines in.
        DDBB JSR $C000    Disable NMI's and initialize CIA2.
        DDBE PLA
        DDBF STA $FFFB    Restore original values to $FFFA and $FFFB.
        DDC2 PLA
        DDC3 STA $FFFA
        DDC6 JSR $C01D    Print the message.
        DDC9 JSR $DDDC    Swap the auxiliary routines out.
        DDCC PLA
        DDCD TAY          Load original memory configuration to Y register.
        DDCE LDA #$00     Push desired stack register value on stack
        DDD0 PHA          (clear all flags, especially the I flag).
        DDD1 TSX
        DDD2 INC $0102,X  Increment the return address.
        DDD5 BNE $DDDA    (RTS preincrements it, but RTI does not.)
        DDD7 INC $0103,X
        DDDA STY $01      Restore original memory configuration.

        (The 6510 fetches the next instruction from $DDDC, which is now
        connected to the CIA2's register $C, the Serial Port register.
        The initialization routine wrote an RTI to it. The processor also
        reads from $DDDD as a side effect of the instruction fetch,
        thus re-enabling NMI's.)

        DDDC LDY #$3F     Subroutine: Swap the memory areas $C000-$C03F
        DDDE LDX $C000,Y              and $DD64-$DDA3 with each other.
        DDE1 LDA $DD64,Y
        DDE4 STA $C000,Y
        DDE7 TXA
        DDE8 STA $DD64,Y
        DDEB DEY
        DDEC BPL $DDDE 
        DDEE RTS

        C000 INC $01      Enable the I/O area.
        C002 LDX #$81
        C004 STX $DD0D    Enable Timer A interrupts of CIA2.
        C007 LDX #$00
        C009 STX $DD05
        C00C INX
        C00D STX $DD04    Prepare Timer A to count from 1 to 0.
        C010 LDX #$DD
        C012 STX $DD0E    Cause an interrupt.

        (The instruction sets SP to output, makes Timer A to count
        system clock pulses, forces the CIA to load the initial value
        to the counter, selects one-shot counting and starts the timer.)

        C015 LDX #$40

        (The processor now jumps to the NMI handler ($C016), and
        the SP register starts to act as a memory place.)

        C017 STX $DD0C    Write an RTI to Serial Port register.
        C01A DEC $01      Disable the I/O area.
        C01C RTS          Return.

        C01D LDA $01
        C01F PHA
        C020 ORA #$03     Enable I/O and ROMs.
        C022 STA $01
        C024 LDY #$0C     Print the message.
        C026 LDA $C033,Y
        C029 JSR $FFD2
        C02C DEY
        C02D BPL $C026
        C02F PLA
        C030 STA $01      Restore the 64 kB memory configuration.
        C032 RTS

        C033 "!DLROW ,OLLEH"

        (The string is backwards in memory, since I don't want to
        waste cycles in explicit comparisons. This method results in
        more readable code than doing a forward loop with an index
        value $100-(number of characters).)

  This program is not excellent. It has the following bugs:

   o The 6510's memory management lines P0 and P1 (LORAM and HIRAM,
     respectively) are assumed to be outputs. If you issued the
     command POKE0,PEEK(0)AND252, this program would not work.
     This could be easily corrected by setting the P0 and P1 lines
     to output in the beginning of the interfacing routine (300 - 312):

        LDA $00
        ORA #$02
        STA $00

   o The program does not restore the original state of the CIA2
     Control Register A or Interrupt Control Register. It might be
     impossible to start using the Kernal's RS-232 routines after
     running this.

   o If the user redirected output to cassette or RS-232, interrupts
     would be required. However, they are completely disabled.

   o If a non-maskable interrupt occurs while the loader part is being
     executed, the program will screw up. This will happen also in the
     main part, if an NMI is issued after disabling ROMs and I/O in
     $0134 but before exchanging the contents of the memory places
     $C016 and $DD7A.


_Freezer cartridges_

There are many cartridges that let you to stop almost any program for
"back-up" purposes. One of the most popular of these freezer
cartridges is the Action Replay VI made by Datel Electronics back in
1989. The cartridge has 8 kilobytes RAM and 32 kilobytes ROM on board,
and it has a custom chip for fiddling with the C64 cartridge port
lines -EXROM, -GAME, -IRQ, -NMI and BA.

If the -NMI line is not asserted (the NMI interrupts are enabled), all
freezer cartridges should be able to halt any program. When the user
presses the "freeze" button, the cartridges halt the processor by
dropping the BA line low. Then they switch some of their own ROM to
the $E000 - $FFFF block by selecting the UltiMax configuration with
the -EXROM and -GAME lines. After this, they assert the -NMI line and
release the BA line. After completing the current instruction, the
processor will take the NMI interrupt and load the program counter
from the vector at $FFFA, provided that the NMI line was not asserted
already.

This approach is prone to many flaws. Firstly, if the processor is
executing a write instruction when the program is being halted, and if
the write occurred outside the area $0000 - $0FFF, the data would get
lost, if the UltiMax configuration was asserted too early. This can be
corrected to some extent by waiting at least two cycles after
asserting the BA line, as the processor will not stop during write
cycles. However, this is of no help if the processor has not gotten to
the write stage yet.

Secondly, if the instruction being executed is outside the area
$0000 - $0FFF, or if it accesses any data outside that area, the
processor will fetch either wrong parameters or incorrect data, or
both. If the instruction does not write anything, will only corrupt
one processor register.

Thirdly, if the NMI interrupts are disabled, pressing the "freeze"
button does not have any other immediate effect than leaving the
UltiMax mode asserted, which makes any system RAM outside the area
$0000 - $0FFF unavailable. It also forces the I/O area ($D000 - $DFFF)
on. If the program has any instructions or data outside the lowmost
four kilobytes, it will eventually jam, as that data will be something
else than the program expects.

One might except that reading from open address space should return
random bytes. But, in at least two C64's, the bytes read are mostly
$BD, which is the opcode for LDA absolute,X. So, if the processor has
a "good luck", it will happily execute only LDA $BDBD,X commands, and
it might survive to the cartridge ROM area without jamming. Or it
could eventually fetch a BRK and jump to the cartridge ROM via the
IRQ/BRK vector at $FFFE. The Action Replay VI has the familiar
autostart data in the beginning of both the ROML and ROMH blocks by
default, and that data could be interpreted as sensible commands. The
Action Replay VI was indeed able to freeze my test program, even
though I had covered its -RESET, -IRQ and -NMI lines with a piece of
tape, until I relocated the program to the first 4 kilobyte block.


_Building an unbeatable freezer circuit_

As you can see, it is totally impossible to design a freezer cartridge
that freezes any program. If the program to be freezed has disabled
the NMI interrupts, and if its code runs mostly at $0000 - $0FFF or
$D000 - $DFFF, the computer will more probably hang than succeed in
freezing the program.

However, it is possible to make some internal modifications to a C64,
so that it can freeze literally any program. You need to expand your
machine to 256 kilobytes following the documents on ftp.funet.fi in
the /pub/cbm/hardware/256kB directory. It will let you to reset the
computer so that all of the 64 kilobytes the previous program used,
will remain intact. If you add a switch to one of the memory expansion
controller's chip selection lines, the program being examined will
have no way to screw the machine up, as the additional memory management
registers will not be available.

A few enhancements to this circuit are required so that you can freeze
the programs without losing the state of the I/O chips. You will also
need to replace the Kernal ROM chip with your own code, if you do not
want to lose the state of the A, X, P and S registers. Unfortunately
this circuit will not preserve the state of the processor's Peripheral
lines (its built-in I/O port mapped to the memory addresses 0 and 1),
nor does it record the program counter (PC). I have a partial solution
to the PC problem, though.

If you are interested in this project, contact me. I will design the
additional hardware, and I will program the startup routines, but I
certainly do not have the time to program all of the freezer software.
Most of the freezer software could be in RAM, so it would be very easy
to develop it, and you could even use existing tools by patching them
slightly.

=============================================================================