Desktop supercomputer NF5588

Brief Introduction

The NF5588 is built on an advanced dual-socket, quad-core CPU + GPU heterogeneous architecture and delivers peak computing power of up to 4 Tflops. It serves as both a high-performance supercomputer and a high-end workstation. Newly developed by Inspur, it combines excellent performance, excellent reliability, and high scalability. It suits industries with demanding performance and reliability requirements, such as life science, finance, securities, animation, telecommunications, energy, and medium-sized enterprises, where it can run a wide range of critical applications.

Functional Features

Collaborative computing acceleration architecture

The system introduces GPU computing units, which outperform a traditional design built on a single type of computing unit. It pairs the latest Westmere-EP processor cores with NVIDIA Tesla compute accelerators, and the CPU and GPU each do what they do best: the CPU handles logic, branching, and I/O communication, while the GPU handles intensive, parallel computation. Computing resources are allocated sensibly and the available computing power is fully used, so computational performance improves sharply. Notably, the collaborative computing acceleration architecture does not replace the traditional technical route; it complements and supports it. It brings the GPU's strengths to bear and, in certain applications, lets users solve problems in much less time or tackle larger problems.

Large-scale parallel computing processing core

Whereas a multi-core CPU runs a relatively small number of threads together, a GPU runs thousands of threads concurrently, which lets the system process far more data at once. Game and animation rendering is a good example: each pixel can be calculated without regard to the order in which the others are calculated (or the method can be arranged so that order does not matter). With a million threads, a million pixels can be rendered at the same time, so all of them are finished in the time it takes to calculate one pixel. At present, a Tesla processing unit provides 448 processing cores with a peak processing speed of about 1 Tflops. Thanks to the collaborative, expandable architecture, the number of GPUs can be increased to match customer requirements. Generally, applications are accelerated by one to two orders of magnitude.
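The order-independence argument for pixel rendering can be illustrated with a small sketch. It is not GPU code: it only shows that when each pixel depends on nothing but its own coordinates, visiting the pixels in any order (and therefore all at once) gives the same image. The `shade` function and the image size are hypothetical and chosen purely for illustration.

```python
import random

WIDTH, HEIGHT = 8, 4  # tiny image, illustrative only


def shade(x, y):
    """Hypothetical per-pixel function: the result depends only on (x, y)."""
    return (x * 31 + y * 17) % 256


def render(order):
    """Fill the image by visiting pixel coordinates in the given order."""
    image = [[0] * WIDTH for _ in range(HEIGHT)]
    for x, y in order:
        image[y][x] = shade(x, y)
    return image


# Row-by-row order, as a sequential CPU loop would visit the pixels.
coords = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]

# An arbitrary order, standing in for threads that finish in any sequence.
shuffled = coords[:]
random.Random(0).shuffle(shuffled)

sequential = render(coords)
scrambled = render(shuffled)

# No pixel reads another pixel's value, so both images are identical.
print(sequential == scrambled)
```

Because no pixel reads another pixel's result, the work can be split across as many threads as there are pixels without any coordination between them.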

High-speed I/O switch between processing units

Each device has its own dedicated connection, so the GPU does not have to contend for bandwidth on a shared system bus, and data transfer efficiency is correspondingly higher. The traditional PCI bus can transmit in only one direction at any given time, whereas PCI-E uses dual-simplex links that transmit in both directions simultaneously and offer higher transfer rates and signal quality; the difference is similar to that between half duplex and full duplex. The Inspur Yitian supercomputer adopts PCI-E 2.0 x16, with a total bandwidth of 16 GB/s (8 GB/s in each direction).
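The figure of 16 quoted for PCI-E 2.0 x16 can be reproduced with a short calculation, which gives it in gigabytes per second counting both directions. The per-lane signalling rate (5 GT/s) and the 8b/10b line encoding are taken from the PCI Express 2.0 specification rather than from this datasheet, and the function name is hypothetical.

```python
def pcie_bandwidth_gb_per_s(transfer_rate_gt_s, lanes,
                            payload_bits=8, encoded_bits=10):
    """Usable bandwidth in GB/s for one direction of a PCI-E link.

    Assumes 8b/10b encoding (8 payload bits per 10 bits on the wire),
    which is what PCI-E 1.x and 2.0 use.
    """
    usable_bits_per_lane = transfer_rate_gt_s * 1e9 * payload_bits / encoded_bits
    return usable_bits_per_lane * lanes / 8 / 1e9  # bits -> bytes -> GB


# Assumption: PCI-E 2.0 signals at 5 GT/s per lane; x16 means 16 lanes.
per_direction = pcie_bandwidth_gb_per_s(5.0, 16)
both_directions = 2 * per_direction  # dual-simplex: both directions at once

print(f"per direction:   {per_direction:.1f} GB/s")
print(f"both directions: {both_directions:.1f} GB/s")
```

Under these assumptions a x16 link carries 8 GB/s each way, and the combined figure is 16 GB/s.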

Latest Intel QPI technology

To improve cooperation between the CPU and GPU, the Inspur Yitian supercomputer adopts Intel QPI (QuickPath Interconnect) technology, which provides a transfer rate of 6.4 GT/s and speeds up communication. The QPI bus provides direct interconnection between processors: in multi-processor operation, each processor can send data to the others without going through the chipset, which significantly increases overall system performance. Nehalem-architecture processors, with their integrated memory controller and PCI-E 2.0 graphics interface, together with the emergence of processor-integrated graphics, will make further use of the QPI architecture's capabilities.
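As a rough guide to what 6.4 GT/s means in bytes, the transfer rate can be converted to bandwidth. The sketch assumes each QPI transfer carries 16 bits (2 bytes) of data per direction, which comes from Intel's QPI documentation and not from this datasheet; the constant names are hypothetical.

```python
QPI_TRANSFER_RATE_GT_S = 6.4        # from the datasheet
QPI_PAYLOAD_BYTES_PER_TRANSFER = 2  # assumption: 16 data bits per transfer, per direction

# Bandwidth of one QPI link in one direction.
per_direction_gb_s = QPI_TRANSFER_RATE_GT_S * QPI_PAYLOAD_BYTES_PER_TRANSFER

# A QPI link is bidirectional, so both directions can be counted together.
both_directions_gb_s = 2 * per_direction_gb_s

print(f"per direction:   {per_direction_gb_s:.1f} GB/s")
print(f"both directions: {both_directions_gb_s:.1f} GB/s")
```

Under that assumption, one link moves 12.8 GB/s each way, or 25.6 GB/s counting both directions.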

Excellent programming environment

CUDA (Compute Unified Device Architecture) is designed for solving complex computational problems. It consists of the CUDA instruction set architecture (ISA) and the parallel computation engine inside the GPU. Developers can write programs for the CUDA architecture in C. C is one of the most widely used high-level programming languages, and programs written in it run efficiently on CUDA-compatible processors. Support for other languages, including FORTRAN and C++, is planned.

Standard C language for parallel application development on the GPU.
Standard numerical libraries for the Fast Fourier Transform (FFT) and basic linear algebra subroutines (BLAS).
Dedicated CUDA driver for fast data transfer between the GPU and the CPU.
CUDA driver that interoperates with OpenGL and DirectX graphics drivers.
Support for 32/64-bit Linux, 32/64-bit Windows XP, and Mac operating systems.

Thousands of software developers currently use the free CUDA software tools to build a wide variety of professional and home applications, ranging from video and audio processing and physics simulation to oil and gas exploration, product design, medical imaging, and scientific research.

Inspur has a strong GPU application development and migration team

A dedicated, innovative team for hybrid CPU + GPU applications.
Solves management and scheduling problems on mixed-architecture clusters.
Integrates CPU and GPU computing power.
Distills a methodology for migrating applications to the hybrid architecture.

Application Cases

Life science: molecular dynamics; gene sequencing; protein folding; computational chemistry.
Engineering science: CAD/CAM/CAE; astrophysics; CFD; Mathematics and LabVIEW.
Government and national defense: weapons; image processing; battlefield simulation.
Medical: MRI; CT; image-guided therapy.
Petroleum and petrochemicals: seismic data processing; reservoir simulation.
Finance: risk analysis; derivatives modeling; trading algorithms.
Visualization: games; animation.
Electronic design automation (EDA): SPICE; Verilog; 3D EM.

Technical Specifications



Processor: Intel Xeon 5600 series
Computing unit: Up to 4 NVIDIA Tesla C2050/C2070 cards, or next-generation cards
System bus: QuickPath Interconnect, up to 6.4 GT/s
Memory: 12 DIMMs, DDR3 1066/1333 MHz, up to 96 GB
HDD controller: Integrated SATA controller; SAS controller supported
RAID: HostRAID 0, 1, 10; external SAS RAID controller with cache supported
Hard disk drives: Up to 8 hot-swap 3.5" SATA/SAS HDDs
Expansion slots: 4 PCI-E 2.0 x16; 2 PCI-E 2.0 x16 (x4 link); 1 PCI-E 1.0 x8 (x4 link); 2 PCI 33 MHz
Network controller: 2 integrated Gigabit network controllers
Power supply: 1400 W high-efficiency redundant power supply
Graphics card: Integrated Matrox G200eW; optional FX370LP/FX1800/FX3800/FX5800
Optical drive: Optional external DVD-ROM
Operating temperature:
AC input: 100-240 V, 50/60 Hz
Dimensions: 178 mm (W) x 746 mm (D) x 452 mm (H)
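Two headline figures in the specifications can be cross-checked with simple arithmetic: the peak of up to 4 Tflops and the 96 GB memory ceiling. The sketch below is a back-of-the-envelope check, not a benchmark. The 1.15 GHz core clock and the convention of counting one fused multiply-add as two floating-point operations are assumptions about the Tesla C2050 that do not appear in this datasheet; the result is a single-precision peak.

```python
# Figures taken from the datasheet.
CUDA_CORES_PER_CARD = 448
MAX_GPU_CARDS = 4
MAX_MEMORY_GB = 96
DIMM_SLOTS = 12

# Assumptions not stated in the datasheet.
CORE_CLOCK_GHZ = 1.15          # assumed Tesla C2050 processor clock
FLOPS_PER_CORE_PER_CYCLE = 2   # one fused multiply-add counted as 2 flops

# Peak single-precision throughput of one card, in Gflops.
per_card_gflops = CUDA_CORES_PER_CARD * CORE_CLOCK_GHZ * FLOPS_PER_CORE_PER_CYCLE

# Aggregate GPU peak for a fully populated system, in Tflops.
system_tflops = MAX_GPU_CARDS * per_card_gflops / 1000

# Largest module size implied by the memory ceiling.
gb_per_dimm = MAX_MEMORY_GB / DIMM_SLOTS

print(f"per card: {per_card_gflops:.1f} Gflops")
print(f"4 cards:  {system_tflops:.2f} Tflops")
print(f"per DIMM: {gb_per_dimm:.0f} GB")
```

Under these assumptions each card peaks at roughly 1 Tflops, four cards give slightly over 4 Tflops, and the 96 GB maximum corresponds to 8 GB in each of the 12 DIMM slots.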




About Lotus

Lotus Information Systems provides world-class turn-key system integration and IT services and solutions

