Sunday, September 11, 2011

Data-Level Parallelism in Vector, SIMD, and GPU Architectures

The fourth chapter of Computer Architecture: A Quantitative Approach gives a comprehensive overview of some alternatives to sequential processor architectures: vector processors, SIMD multimedia extensions, and GPUs (Graphics Processing Units). I will not repeat or summarize the contents of the book; instead, I would like to share my personal impressions about how those approaches might be adopted in the context of cloud computing. Fortunately, I have had some development and research experience with these topics.

Vector Processors

It is not very likely that vector processors will become popular in data centers soon. Most developers are not familiar with vector programming, and the magic that automatically converts arbitrary serial code into vector instructions has not been invented yet. The situation could be better with architecture-specific hints for compilers or with more general programming models (e.g., OpenCL). I suspect vector processors could be promising for some workloads if they are used as coprocessors, but that could also hurt cost-effectiveness, since only a limited range of applications would benefit from the extra hardware.


SIMD Multimedia Extensions

Thanks to their incremental approach, multimedia extension instructions have been usable since the beginning of virtualization. However, each x86 vendor (Intel and AMD) has a slightly different instruction set, which implies potential portability issues.


GPU

The major problem with GPUs is that they do not support virtualized environments, on which cloud computing heavily depends. While Amazon EC2 has recently begun to offer GPU instances, it seems that those instances run on bare machines, not on multi-tenant, virtualized machines. Even on a non-virtualized machine, sharing the same GPU is very problematic: sharing is based on time slicing, without any notion of priority scheduling. A shared GPU often shows suboptimal performance, and it is not fault-tolerant yet.

Power consumption is another major problem. Even a single high-end GPU can consume as much power as a whole desktop machine. Considering that energy consumption is critical in cloud computing, one must obtain significant speedups from GPUs to justify such power consumption.

The price of GPUs is yet another problem. Although a desktop GPU is available for a couple of hundred dollars, server-class GPUs (with better reliability, ECC DRAM support, and high-performance double-precision arithmetic) cost up to a couple of thousand dollars for a single card. Even though cloud service providers could get significant discounts, the price is still too high for GPUs to be widely deployed.
