A Machine for Testing of Static RAM
by Ralph Klimek, 2007, based on work from 1987
Abstract. A digital machine for the rapid go/no-go testing of some
commodity static ram chips. The design principles are based on
the Zero Bit Computer/Sequencer.
Introduction.
Way back in 1987 I was servicing a large minicomputer, the Pyramid 90X,
which was built from 74Sxxx series logic. It ran from a 32 MHz
clock divided into 4 phases, and had a RISC architecture that gave it about
8 MIPS. Built of discrete SSI and MSI devices, the only VLSI devices
were the console processor which used a Motorola 68000 and
assorted high speed bipolar static RAMs. We did repair these
machines to chip level; back then, just swapping boards until it
worked was not an option. The CPU was heavily microcoded and came
with extensive, very effective diagnostics that ran at both the "cpu"
level and at the microcode level. The machine had many caches,
instruction buffers, microcode caches and writable control stores, all
implemented with high speed bipolar static rams. In those days, the
static ram failure rate was fairly high, and Monash University had 6 of
these machines at one stage. Diagnosing static ram failures was
rendered easy by the microcode diagnostics. They basically told you
which chip to change. All well and good, but what if the bad ram was in
the writable control store? The diagnostic program could not then run at all.
This machine was born of the need to bulk check the machine ram
and our small stock of spare ram. Somehow bad ram had found its way
into the spares and maybe a third of our spares were suspect. One fine
day a Pyramid died with a control store parity error, so not even diagnostics
would run. These particular static rams ran very hot in
normal use and even though they were packaged in ceramic DIPs
they failed often. With some foresight the designers of the
Pyramid had placed these chips in good quality sockets so that it was
practical to just remove and test all of them.
So over a weekend I struggled to complete this testing machine. This Pyramid mini "mattered" and had been down for a while and the powers-that-be were not impressed. I had on hand an Intel type 2149 that we knew was bad and one 2149 that we thought was good. So with that I wire wrapped away my weekend and by Sunday night it could tell good from bad. Why not a microprocessor design? That would have meant for me a Z80, EPROM, static ram, debugger, display and maybe a week of futzing around to get a sign of life. The hard wired design would "just work", as it did. The dead mini was repaired in 10 minutes with the aid of this machine. It became quite popular in our group.
The core of the design is based on what I call a "Zero Bit Computer".
It is a micro sequencer based on a small handful of MSI 74xxx logic.
The unique design attribute of the zero bit computer design paradigm is
that there exists a one to one isomorphism between a plain natural
language description of a sequencing problem and the wiring (that
would be micro-programming) of the zero bit computer. I have discussed this
idea in another article on this site. There are two zero bit
computers in this design. One creates the chip enable and
write enable signals for the static ram under test and ensures
correct setup and hold times of address, data and the WE signal. The
other does "gross" sequencing: address generation, write cycle, read
back and compare cycle, next data pattern cycle. The machine will
write/read/compare ones, zeros, and a selection of address bits, and halt
on error and raise an alarm. It could test 1 bit, 4 bit and 8 bit wide
rams with up to 16 address lines.
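The gross sequencing described above can be modelled in software. What follows is a minimal sketch in Python, not a description of the actual wiring: the names run_test, GoodRam and StuckBitRam are hypothetical, and it assumes that each pattern is written to the whole chip before being read back, which the text suggests but does not state outright. The 10 address lines and 4 data bits used in the example are likewise assumed values.

```python
# Software model only: the real machine is hard-wired logic, not a program.
# run_test, GoodRam and StuckBitRam are hypothetical names for illustration.

def run_test(ram, address_lines, data_width):
    """Write, read back and compare each pattern; stop at the first mismatch.

    Returns None if every pass succeeds, else
    (pattern_name, address, expected, got) for the first failure.
    """
    mask = (1 << data_width) - 1
    size = 1 << address_lines
    patterns = [
        ("ones", lambda addr: mask),
        ("zeros", lambda addr: 0),
        ("address", lambda addr: addr & mask),  # low address bits used as data
    ]
    for name, pattern in patterns:            # next data pattern cycle
        for addr in range(size):              # write cycle
            ram.write(addr, pattern(addr))
        for addr in range(size):              # read back and compare cycle
            expected = pattern(addr)
            got = ram.read(addr) & mask
            if got != expected:
                return (name, addr, expected, got)  # halt on error
    return None


class GoodRam:
    """A fault-free RAM model."""
    def __init__(self):
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells.get(addr, 0)


class StuckBitRam(GoodRam):
    """Models a bad output driver: data bit 2 always reads back as 0."""
    def read(self, addr):
        return super().read(addr) & ~0b0100


print(run_test(GoodRam(), address_lines=10, data_width=4))
print(run_test(StuckBitRam(), address_lines=10, data_width=4))
```

The fault-free model passes every pattern and returns None, while the stuck-bit model fails at the first address of the all-ones pass, which is the go/no-go behaviour the machine was built for.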
I built it from discarded mainframe components at zero net cost. This being a university department, that was a good thing, as we had no budget for this kind of playing around.
The above drawing shows the two micro sequencers, address generator,
test data multiplexer, and random pattern data generator for 1 bit wide
static rams. The "phase" sequencer performs the "inner loop" of
the microprogram, which is the generation of the WE/ signal for the
static ram DUT and not much else. The phase sequencer also
defined the read data validation window. Why do this rather than just
use a one-shot? Because the sequencer is deterministic, it just
works the way it was told to. A one-shot is an analog device with a
rich baggage train of pathology!
The main sequencer orchestrates the microprogram flow and directly implements the "high level" program description. The chips labelled data mux are responsible for the generation of the test data. The muxes can select ONE, ZERO, address bits or the parity of the address bits.
The parity was used for testing single bit width devices. I would have had to play around with shift registers to write the "address" into a single bit wide ram; it was just easier to generate a deterministic, random-looking function of the address value with just one parity generator. Even though this could not possibly detect every possible per-bit error, it could with very high probability detect any faulty ram chip. Despite the completeness of the testing algorithm, the vast majority of ram chip failures were due to bad I/O line drivers. I only recall detecting one chip that could not decode its addresses properly.
A table of static rams that my machine could test. These were all the
ones that we used in the Pyramid, VAX and Burroughs mainframes here at Monash in 1987.
This is the data read back and compare circuit. Xor gates are used to perform bit wise comparison. Tristate RAMs and discrete Din/Dout chips had to be catered for, as well as different chip data widths of 1, 4 and 8 bits.
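The bit wise comparison can be expressed in a few lines of Python. This is only a sketch of the idea: compare and failing_bits are hypothetical names, and the masking step stands in for whatever the hardware did to ignore unused data lines on narrower chips.

```python
# Illustrative model of the XOR compare stage; names are hypothetical.

def compare(expected, read_back, data_width):
    """Return a mask of mismatching data lines (0 means the word is good)."""
    mask = (1 << data_width) - 1          # ignore lines the chip does not have
    return (expected ^ read_back) & mask


def failing_bits(expected, read_back, data_width):
    """List the data line numbers that disagree."""
    diff = compare(expected, read_back, data_width)
    return [bit for bit in range(data_width) if diff & (1 << bit)]


print(failing_bits(0b1010, 0b1000, data_width=4))
```

Any non-zero result from compare corresponds to the halt-on-error condition, and the set bits point at the data lines that disagree.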
The machine was wired on these prototype boards that were not prototype boards at all. They were salvaged from redundant Burroughs type BX350 disk pack drive controllers, which were mini computers in their own right. Most of the chips are recycled Burroughs mainframe 74xxx series logic. The reset button would also TRISTATE the D.U.T buffers so that chips could be quickly inserted without power cycling the machine. The dip switches formed the data input into the main sequencer. This permitted the machine to be debugged state by state. You need this when your wirewrap error rate is about 1 in 50.
Burroughs used the best quality wirewrap boards in their very expensive products.