
Data Parallel C++

Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL

Paperback | English | 2020 | ISBN 9781484255735

Summary

Learn how to accelerate C++ programs using data parallelism. This open access book enables C++ programmers to be at the forefront of this exciting and important new development that is helping to push computing to new levels. It is full of practical advice, detailed explanations, and code examples to illustrate key topics. 

Data parallelism in C++ enables access to the parallel resources of a modern heterogeneous system, freeing you from being locked into any particular computing device. A single C++ application can now use any combination of devices—including GPUs, CPUs, FPGAs, and AI ASICs—that are suitable to the problems at hand. This book begins by introducing data parallelism and foundational topics for effective use of the SYCL standard from the Khronos Group and Data Parallel C++ (DPC++), the open source compiler used in this book. Later chapters cover advanced topics including error handling, hardware-specific programming, communication and synchronization, and memory model considerations.
Data Parallel C++ provides you with everything needed to use SYCL for programming heterogeneous systems.

What You'll Learn

• Accelerate C++ programs using data-parallel programming
• Target multiple device types (e.g. CPU, GPU, FPGA)
• Use SYCL and SYCL compilers
• Connect with computing's heterogeneous future via Intel's oneAPI initiative

Who This Book Is For

Those new to data-parallel programming, and computer programmers interested in data-parallel programming using C++.

Specifications

ISBN13: 9781484255735
Language: English
Binding: paperback
Publisher: Apress


Table of Contents

Chapter 1: Introduction
Sets the expectation that the book describes SYCL 1.2.1 with Intel extensions, and that most extensions are proof points of features that should end up in a future version of SYCL. Overviews the notion that different accelerator architectures do well on different workloads, and introduces accelerator architectures (without overdoing the topic). Provides level-setting on parallelism and relevant terminology, the language landscape, and SYCL history.
• SYCL key feature overview (single source, C++, multi-accelerator), intended to draw people in and show simple code
• Language versions and extensions covered by this book
• Mixed-architecture compute and modern architectures
• Classes of parallelism
• Accelerator programming landscape (OpenMP, CUDA, TBB, OpenACC, AMD HCC, Kokkos, RAJA)
• Evolution of SYCL

Chapter 2: Where code executes
Describes which parts of code run natively on the CPU versus on "devices". Differentiates between accelerator devices and the "host device". Shows more code to increase reader familiarity with program structure.
• Single source programming model
• Built-in device selectors
• Writing a custom device selector

Chapter 3: Data management and ordering the uses of data
Overviews the primary ways that data is made accessible to both host and device(s): USM and buffers. Introduces command groups as futures for execution, and the concept of dependencies between nodes forming a DAG.
• Intro
• Unified Shared Memory
• Buffers
• DAG mechanism

Chapter 4: Expressing parallelism
The multiple alternative constructs for expressing parallelism are hard to comprehend from the spec, especially for anyone without major parallel programming experience. This chapter must position the parallelism mechanisms relative to each other, and leave the reader with a conceptual understanding of each, plus an understanding of how to use the most common forms.
• Parallelism within kernels
• Overview of language features for expressing parallelism
• Basic data parallel kernels
• Explicit ND-Range kernels
• Hierarchical parallelism kernels
• Choosing a parallelism/coding style

Chapter 5: Error handling
SYCL uses C++-style error handling. This is different from, and more modern than, what users of OpenCL and CUDA are accustomed to. This chapter must frame the differences, and provide samples from which readers can manage exceptions easily in their code.
• Exception-based
• Synchronous and asynchronous exceptions
• Strategies for error management
• Fallback queue mechanism

Chapter 6: USM in detail
USM is a key usability feature when porting code, from C++ for example. When mixed with differing hardware capabilities, the USM landscape isn't trivial to understand. This key chapter must leave the reader with an understanding of USM across different hardware capability levels, what is guaranteed at each level, and how to write code with USM features.
• Usability
• Device capability levels
• Allocating memory
• Use of data in kernels
• Sharing of data between host and devices
• Data ownership and migration
• USM as a usability feature
• USM as a performance feature
• Relation to OpenCL SVM

Chapter 7: Buffers in detail
Buffers will be available on all hardware, and are an important feature for people writing code that doesn't have pointer-based data structures, particularly when implicit dependence management is desired. This chapter must cover the more complex aspects of buffers in an accessible way, including when data movement is triggered, sub-buffer dependencies, and advanced host/buffer synchronization (mutexes).
• Buffer construction
• Access modes (e.g. discard_write) and set_final_data
• Device accessors
• Host accessors
• Sub-buffers for finer-grained DAG dependencies
• Explicit data motion
• Advanced buffer data sharing between device and host

Chapter 8: DAG scheduling in detail
Must describe the DAG mechanism from a high level, which the spec does not do. Must describe the in-order simplifications, and common gotchas that people hit with the DAG (e.g. reading data before buffer destruction, and therefore before kernel execution).
• Queues
• Common gotchas with DAGs
• Synchronizing with the host program
• Manual dependency management

Chapter 9: Local memory and work-group barriers
• "Local" memory
• Managing "local" memory
• Work-group barriers

Chapter 10: Defining kernels
• Lambdas
• Functors
• OpenCL interop objects

Chapter 11: Vectors
• Vector data types
• Swizzles
• Mapping to hardware

Chapter 12: Device-specific extension mechanism
• TBD

Chapter 13: Programming for GPUs
• Use of sub-groups
• Device partitioning
• Data movement
• Images and samplers
• TBD

Chapter 14: Programming for CPUs
• Loop vectorization
• Use of sub-groups
• TBD

Chapter 15: Programming for FPGAs
• Pipes
• Memory controls
• Loop controls

Chapter 16: Address spaces and multi_ptr
• Address spaces
• The multi_ptr class
• Interfacing with external code

Chapter 17: Using libraries
• Linking to external code
• Exchanging data with libraries

Chapter 18: Working with OpenCL
• Interoperability
• Program objects
• Build options
• Using SPIR-V kernels

Chapter 19: Memory model and atomics
• The memory model
• Fences
• Buffer atomics
• USM atomics
