Release: Turbo mode, MIDI support
This new release adds major improvements, which I describe in this article.
Turbo mode
zeST now has a turbo mode, allowing the 68000 CPU to run at 50 MHz, instead of the original 8 MHz.
Of course, the goal of turbo mode is no longer cycle exactness, but rather to achieve the maximum performance your FPGA hardware allows. Switch turbo mode off, and you’re back to cycle-exact 8 MHz again.
A short history of memory management in zeST
So, how is it possible? In a previous post about bus and clock management, I explained how we could achieve cycle-exact behaviour of an Atari ST on top of a generic FPGA SoC prototyping board that only features standard components such as DDR memory shared between the SoC’s CPU and the FPGA logic. In summary, I explained that the zeST architecture on the FPGA actually runs on a 100 MHz system clock, and that I produced clock enable signals to reproduce a kind of virtual 8 MHz clock for the ST to run on. The interesting aspect of this was that it allows “pausing” the ST clock while data is being fetched from memory and “resuming” it as soon as the data is available on the bus. This way, memory accesses by the ST always appear to happen in one clock cycle, just as on the original machine. The article then showed that the result was a very distorted clock signal, with a succession of “long cycles” for memory accesses, followed by “short cycles” to catch up with the delays introduced by the long cycles.
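To make the clock-enable idea more concrete, here is a minimal Python model of it. It is only a sketch and not the actual FPGA logic: the function name, the `stall_ticks` table and the `min_ticks` lower bound on the length of a cycle are assumptions made for illustration.

```python
def st_clock_enables(n_cycles, stall_ticks, sys_mhz=100, st_mhz=8, min_ticks=4):
    """Return the system-clock tick at which each virtual ST cycle ends.

    stall_ticks maps an ST cycle index to the number of extra system ticks
    spent waiting for memory during that cycle (hypothetical values).
    min_ticks is an assumed minimum length for one ST cycle.
    """
    ends = []
    now = 0
    for i in range(n_cycles):
        # Tick where this cycle would end on an undistorted 8 MHz clock
        # (12.5 system ticks per ST cycle, rounded down).
        ideal = (i + 1) * sys_mhz // st_mhz
        # A cycle cannot end before its minimum length plus any memory stall.
        earliest = now + min_ticks + stall_ticks.get(i, 0)
        # Late cycles end as soon as they can (catching up); others wait
        # for their ideal position.
        now = max(ideal, earliest)
        ends.append(now)
    return ends


if __name__ == "__main__":
    ends = st_clock_enables(5, {1: 20})
    lengths = [b - a for a, b in zip([0] + ends, ends)]
    print(ends, lengths)
```

With a 20-tick stall on the second cycle, the cycle lengths come out as 12, 24, 4, 10 and 12 system ticks: one long cycle for the memory access, then short cycles until the virtual clock is back on schedule.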
It became clear that my approach to handling memory accesses at the time was highly inefficient. For every memory access made by the 68000 CPU, I was performing a single 16-bit memory access request to the DDR. This resulted in additional access time penalties and even led to memory starvation in extreme cases.
This led to the introduction of a cache memory in my memory management module. This not only reduced strain on the main DDR and optimised accesses to previously accessed data, but also enabled prefetching of data through burst-mode memory accesses. As a result, a significant portion of memory accesses (roughly, more than 90%) is now served from the cache rather than from main memory. Since those cache accesses complete in a few clock cycles, I noticed that the memory management module was idle most of the time. Memory was no longer the bottleneck in zeST performance.
Hence the idea to implement a turbo mode.
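The effect of the cache described above can be illustrated with a toy direct-mapped cache in which every miss fills a whole line, as a burst read would. This is a sketch under assumptions: the class name, the number of lines and the line size are hypothetical and do not describe zeST's actual memory manager.

```python
class BurstCache:
    """Toy direct-mapped cache; a miss fills a full line in one burst."""

    def __init__(self, n_lines=64, line_bytes=32):
        # Both sizes are hypothetical, chosen only for the example.
        self.n_lines = n_lines
        self.line_bytes = line_bytes
        self.tags = [None] * n_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Record one access at byte address addr; return True on a hit."""
        line_no = addr // self.line_bytes
        index = line_no % self.n_lines
        if self.tags[index] == line_no:
            self.hits += 1
            return True
        # Miss: one burst read brings in the whole line, so the following
        # accesses to neighbouring addresses will hit.
        self.tags[index] = line_no
        self.misses += 1
        return False


if __name__ == "__main__":
    cache = BurstCache()
    # Sequential 16-bit accesses over 1 KB, like a linear instruction fetch.
    for addr in range(0, 1024, 2):
        cache.access(addr)
    print(cache.hits, cache.misses)
```

In this purely sequential example only the first access to each 32-byte line misses. Real workloads behave differently, so the numbers here are an illustration and not a measurement of zeST.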
Enhanced architecture
To implement turbo mode, I had to make some modifications to the ST architecture. The global architecture remains at the standard 8 MHz, but the CPU runs at 50 MHz when turbo mode is enabled. This required implementing an additional memory bus directly between the CPU and the memory manager to bypass the rest of the architecture that still operates at 8 MHz.
When turbo mode is disabled, the additional memory bus is deactivated, and all accesses revert to the standard ST architecture, maintaining cycle-exactness.
Memory accesses in turbo mode no longer pause the CPU clock while waiting for data from memory. Instead, the architecture takes advantage of the 68000’s support for wait states: the CPU simply waits until the data becomes available. In the current state of zeST, two wait cycles are necessary when the data is in the memory manager’s cache, and memory accesses that go to the DDR take as many wait cycles as necessary.
When turbo is on and the CPU needs to access hardware registers on parts still running at 8 MHz, its speed is momentarily reduced to 8 MHz for the duration of the access, then immediately restored to turbo speed when the access is finished.
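The rules of this section can be summarised as a small decision function. This is a sketch of the behaviour described in the text, not zeST's implementation: the names `Route` and `route_access` and the bus labels are hypothetical.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    bus: str                    # "st" = standard ST bus, "direct" = extra turbo bus
    cpu_mhz: int                # CPU speed during the access
    wait_cycles: Optional[int]  # None = as many as the DDR needs


def route_access(turbo_on: bool, hw_register: bool, cached: bool) -> Route:
    """Pick the bus, CPU speed and wait cycles for one CPU access."""
    if not turbo_on:
        # Cycle-exact mode: the clock is paused instead of adding wait states.
        return Route("st", 8, 0)
    if hw_register:
        # Registers live on the 8 MHz side: slow down for this access only.
        return Route("st", 8, 0)
    if cached:
        # Data already in the memory manager's cache: two wait cycles.
        return Route("direct", 50, 2)
    # Cache miss: wait for the DDR, however long it takes.
    return Route("direct", 50, None)


if __name__ == "__main__":
    print(route_access(turbo_on=True, hw_register=False, cached=True))
```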
Performance
I ran some experiments with GEMBench, and the results are promising.
The results show a performance increase of over 400% on average, while integer division achieves the full 50 MHz potential with 627% acceleration.
These results can be attributed to the fact that in general, the CPU continuously performs memory accesses for fetching instructions, copying data, etc. With 2 wait cycles being added, each memory access now takes 6 cycles instead of the usual 4. This effectively reduces the CPU’s performance to something similar to a CPU running at 33.3 MHz without any wait cycles.
The only exceptions are when performing complicated integer computations, such as multiplications and divisions. In these cases, most of the CPU activity is focused on computation rather than memory accesses.
It may be possible to reduce the number of wait cycles. This will be investigated for a future release.
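The 33.3 MHz figure follows from simple arithmetic, reproduced below. The function name is hypothetical, and the model assumes a CPU that does nothing but back-to-back memory accesses, which is a simplification.

```python
def effective_mhz(clock_mhz: float, base_cycles: int, wait_cycles: int) -> float:
    """Clock of a wait-state-free CPU with the same memory access rate."""
    return clock_mhz * base_cycles / (base_cycles + wait_cycles)


if __name__ == "__main__":
    for waits in (2, 1, 0):
        mhz = effective_mhz(50, 4, waits)
        # Ratio to the stock 8 MHz ST clock.
        print(waits, round(mhz, 1), round(mhz / 8, 2))
```

With two wait cycles this gives 50 × 4 / 6 ≈ 33.3 MHz, about 4.2 times the stock 8 MHz. Under the same simplification, one wait cycle would give 40 MHz, and none the full 50 MHz.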
Note: Turbo mode currently comes with a restriction on memory size: it only works with memory sizes of 2M, or 4M and above. If you enable turbo mode while an incompatible memory size is selected, the memory size is adjusted to the next larger compatible setting, and a reset is forced. Once the memory size is compatible, you may switch turbo mode on and off without needing to reset.
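The adjustment rule in the note can be sketched as follows. The list of memory settings is hypothetical (the actual choices offered by the zeST menu may differ), as are the function names, and the compatibility test is my reading of "2M, 4M and above".

```python
# Hypothetical memory settings in KB; the real zeST menu may offer others.
SETTINGS_KB = (256, 512, 1024, 2048, 2560, 4096, 8192)


def turbo_compatible(size_kb: int) -> bool:
    """Assumed reading of the rule: exactly 2M, or 4M and above."""
    return size_kb == 2048 or size_kb >= 4096


def adjust_for_turbo(size_kb: int, settings_kb=SETTINGS_KB):
    """Return (new_size_kb, reset_forced) when turbo mode is enabled."""
    if turbo_compatible(size_kb):
        return size_kb, False
    # Closest larger setting that turbo mode accepts.
    bigger = [s for s in sorted(settings_kb) if s > size_kb and turbo_compatible(s)]
    if not bigger:
        raise ValueError("no turbo-compatible memory setting available")
    return bigger[0], True


if __name__ == "__main__":
    print(adjust_for_turbo(512))   # a size below 2M
    print(adjust_for_turbo(2560))  # a size between 2M and 4M
```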
MIDI support
MIDI is now supported in zeST.
On zeST, MIDI works with USB MIDI devices, including USB MIDI cable adapters and devices with direct USB connectivity.
In the Settings menu you can directly select the connected USB MIDI devices as MIDI in, out or both.
More information is available in the documentation.
MIDI is handled by a Linux process running on the board’s ARM processor. Whenever a data byte becomes available from the chosen input MIDI device, it is immediately transferred to a dedicated ACIA-like device on the FPGA side. Similarly, whenever a byte is emitted by the ST’s 68000 processor to the ACIA device, an interrupt informs the ARM process that a byte is to be transferred to the MIDI output device.
Note that because Linux does not manage MIDI as a generic serial device, unlike the way it works on the ST, some ST-specific, non-standard MIDI communication protocols do not work. I am especially thinking about the Midi Maze II game, which implements a 16-player token ring-like local network on top of MIDI. Too bad it does not work on zeST, because it is a very fun game! The only way I see it working is for its author to rework the communication protocol so that all messages are embedded in standard MIDI-compatible packets.
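As an illustration of what embedding messages in standard MIDI-compatible packets could look like, here is one possible approach: wrapping arbitrary bytes in a System Exclusive message (`0xF0` ... `0xF7`), whose data bytes must keep their top bit clear. This is only a sketch of a common 7-bit packing scheme, not something Midi Maze II or zeST implements. The function names are hypothetical, and `0x7D` is used as the manufacturer ID on the assumption that it is the one set aside for non-commercial use.

```python
def sysex_wrap(payload: bytes, manufacturer_id: int = 0x7D) -> bytes:
    """Pack 8-bit payload bytes into a SysEx message with 7-bit data bytes."""
    out = bytearray([0xF0, manufacturer_id])
    for i in range(0, len(payload), 7):
        group = payload[i:i + 7]
        # First byte of each group collects the top bits of up to 7 bytes.
        msbs = 0
        for bit, b in enumerate(group):
            msbs |= (b >> 7) << bit
        out.append(msbs)
        # Then the low 7 bits of each payload byte.
        out.extend(b & 0x7F for b in group)
    out.append(0xF7)
    return bytes(out)


def sysex_unwrap(message: bytes) -> bytes:
    """Inverse of sysex_wrap: recover the original 8-bit payload."""
    body = message[2:-1]  # drop 0xF0, manufacturer ID and trailing 0xF7
    payload = bytearray()
    i = 0
    while i < len(body):
        msbs = body[i]
        group = body[i + 1:i + 8]
        for bit, b in enumerate(group):
            payload.append(b | (((msbs >> bit) & 1) << 7))
        i += 1 + len(group)
    return bytes(payload)


if __name__ == "__main__":
    packet = sysex_wrap(bytes([0x80, 0x01]))
    print(packet.hex(" "))
    print(sysex_unwrap(packet).hex(" "))
```

The price of this kind of encoding is one extra byte for every seven payload bytes, plus the three framing bytes per message.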
Additional improvements
- Shifter fixes so some demos or utilities such as Spectrum 512 or MPP that change colours while racing the beam no longer show incorrect colour artifacts.
- MFP fixes for better compatibility.
- System menu reorganisation so the most useful settings come first, and the most exotic come last.
- EmuTOS update: the supplied default EmuTOS system ROM has been updated to the latest 1.4 release.
That’s all folks!
Thank you for reading this release report.
As usual, download and installation instructions are available on the getting started page.