nCUBE was founded in 1983 by a group of Intel employees frustrated by Intel's reluctance to enter the parallel computing market. Their frustration proved somewhat premature, as Intel entered the market in 1989. The first generation of nCUBE's hypercube machines was released in December 1985; like Intel's, these were based on work done on the Cosmic Cube. The second generation was launched in June 1989. A third generation was planned for 1995, but never released.
The first nCUBE machine to be released was the nCUBE 10, in late 1985. It was based on a set of custom chips, including a 32-bit ALU and a 64-bit IEEE 754 FPU, combined with 128 kB of RAM onto a board known as a module. Each module delivered 2 MIPS, 500 kFLOPS (32-bit single precision) or 300 kFLOPS (64-bit double precision), and ran the Vertex OS.
The name referred to the machine's ability to build an order-ten hypercube, supporting 1024 CPUs in a single machine. Some of the modules were used strictly for input/output; these included the nChannel storage-control card, frame buffers, and the InterSystem card that allowed nCUBEs to be attached to each other. At least one host board needed to be installed, acting as the terminal driver. It could also partition the machine into sub-cubes and allocate them separately to different users.
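The hypercube topology behind the "order-ten" naming is easy to sketch. In the standard binary hypercube numbering (an illustrative convention, not taken from nCUBE documentation), node addresses are n-bit numbers and two nodes are directly linked exactly when their addresses differ in one bit:

```python
def neighbors(node, order):
    """Return the addresses of a node's direct neighbors in an
    order-`order` hypercube: flip each of the `order` address bits."""
    return [node ^ (1 << d) for d in range(order)]

# An order-ten hypercube has 2**10 = 1024 nodes, each with 10 links.
print(len(neighbors(0, 10)))      # 10 neighbors per node
print(sorted(neighbors(0, 3)))    # [1, 2, 4] in a small order-3 cube
```

This is why an order-ten machine tops out at 1024 CPUs: the address space is exactly 2^10.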
For the second series the naming was changed, and nCUBE created the single-chip nCUBE 2 processor. This was otherwise similar to the nCUBE 10's CPU, but ran faster at 25 MHz to provide about 7 MIPS and 3.5 MFLOPS; this was later improved to 30 MHz in the 2S model. RAM was increased as well, with 4 to 16 MB on a "single wide" 1" × 3.5" module, double that on the "double wide" module, and quadruple that on a double-wide, double-sided module. The I/O cards generally had less RAM, with different backend interfaces to support SCSI, HIPPI, etc.
Each nCUBE 2 CPU also included thirteen I/O channels running at 20 Mbps. One of these was dedicated to I/O duties, while the other twelve were used as the interconnect between CPUs. Each channel used wormhole routing to forward messages along. The machines themselves were wired up as order-twelve hypercubes, allowing up to 4096 CPUs in a single machine.
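The path a message takes through such a network can be illustrated with dimension-ordered (e-cube) routing, a common route-selection scheme for hypercubes; this is a sketch of the general technique, not a claim about nCUBE's exact implementation (wormhole routing itself concerns how a message is pipelined along the path, not which path is chosen):

```python
def ecube_route(src, dst):
    """List the node addresses visited going from src to dst, correcting
    one differing address bit per hop (lowest dimension first)."""
    path = [src]
    node = src
    while node != dst:
        diff = node ^ dst        # bits still to be corrected
        lowest = diff & -diff    # isolate the lowest set bit
        node ^= lowest           # flip that bit: one hop in that dimension
        path.append(node)
    return path

print(ecube_route(0b101, 0b011))  # [5, 7, 3]: two hops, one per differing bit
```

In an order-twelve hypercube the two addresses differ in at most 12 bits, so any message reaches its destination in at most 12 hops.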
Each module ran a 200 kB microkernel called nCX, but the system now used a Sun Microsystems workstation as the front end and no longer needed the host controller. nCX included a parallel filesystem that could do 96-way striping for high performance. C and C++ were available, as were NQS, Linda, and Parasoft's Express, all supported by an in-house compiler team.
The largest nCUBE 2 system installed was at Sandia, a 1024-CPU system that reached 1.91 GFLOPS in testing.
The planned nCUBE 3 CPU included several improvements, and moved to a 64-bit ALU. Among the other improvements was a process shrink to 0.5 μm, allowing the clock speed to be increased to 50 MHz (with plans for 66 and 100 MHz). The CPU was also superscalar and included 16 kB instruction and data caches, and an MMU for virtual memory support.
Additional I/O links were added, with two dedicated to I/O and sixteen used as interconnects, allowing up to 65,536 CPUs in the hypercube. The channels operated at 100 Mbps, due to the use of 2-bit-parallel links instead of the serial lines used previously. The nCUBE 3 also added fault-tolerant adaptive routing support in addition to fixed routing, although in retrospect it is not entirely clear why.
A fully loaded nCUBE 3 machine could use up to 65,536 processors, for 3 TIPS and 6.5 TFLOPS. The maximum memory would be 65 TB, with a network I/O capability of 24 TB/second. The design was thus heavily biased toward I/O, which is usually the limiting factor. The nChannel board provided 16 I/O channels, each supporting transfers at 20 MB/second.
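The aggregate figures follow directly from the node count. A back-of-envelope check, assuming roughly 50 MIPS and 100 MFLOPS per node (per-node rates inferred from the quoted totals, not from nCUBE specifications):

```python
# Back-of-envelope check of the quoted nCUBE 3 aggregate figures.
nodes = 65_536            # 2**16, a full sixteen-link hypercube
per_node_mips = 50        # assumed: ~1 instruction/cycle at 50 MHz
per_node_mflops = 100     # assumed per-node floating-point rate

total_tips = nodes * per_node_mips / 1e6      # millions -> trillions
total_tflops = nodes * per_node_mflops / 1e6

print(round(total_tips, 2), "TIPS")     # ~3.3 TIPS
print(round(total_tflops, 2), "TFLOPS") # ~6.6 TFLOPS
```

These land close to the quoted 3 TIPS and 6.5 TFLOPS, consistent with the totals simply being the per-node peak rates multiplied out across a full machine.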