
Review questions:

  • Describe the idea of hardware multi-threading.
  • How is prefetching an alternative solution to the problem that hardware multi-threading tries to solve?
  • Describe a situation where prefetching may not be possible. (Hint: let's think about properties of the memory access.)
  • What is the "cost" of implementing hardware multi-threading? (what additional resources are required?)
  • Hardware multi-threading essentially involves swapping in an instruction stream from another thread when the current thread stalls, to hide long memory access latencies. This keeps compute units that would otherwise sit idle doing useful work.
  • Prefetching hides memory latency by issuing the memory access before the data is actually required, so the data is already in the cache when the instruction that needs it executes.
  • Prefetching data for irregular memory accesses (e.g., pointer chasing) is difficult.
  • Additional hardware to keep the execution state of the additional threads (registers) and scheduling logic to select the thread to execute.
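
To make the last point concrete, here is a toy cycle-level sketch of a single core with several hardware thread contexts. It is only an illustration under simplifying assumptions, not a model of any real processor: every thread computes for `compute_cycles`, then waits `mem_latency` cycles on a load, and the core picks the first runnable thread each cycle. The names `hw_context`, `simulate_core`, and `MAX_THREADS` are made up for this example.

```c
#include <stddef.h>

enum { MAX_THREADS = 16 };  /* assumed upper bound, chosen for this sketch */

/* Per-thread state the core must keep on chip. In real hardware this is
   the register file and program counter; storing one copy per thread is
   the "cost" of hardware multi-threading. */
struct hw_context {
    int work_left;   /* compute cycles left before the next memory access */
    int stall_left;  /* cycles left until the outstanding load returns */
};

/* Returns how many of total_cycles the core spent doing useful work,
   or -1 if the arguments are out of range. */
int simulate_core(int num_threads, int compute_cycles,
                  int mem_latency, int total_cycles)
{
    struct hw_context ctx[MAX_THREADS];
    if (num_threads < 1 || num_threads > MAX_THREADS || compute_cycles < 1)
        return -1;

    for (int t = 0; t < num_threads; t++) {
        ctx[t].work_left = compute_cycles;
        ctx[t].stall_left = 0;
    }

    int busy = 0;
    for (int cycle = 0; cycle < total_cycles; cycle++) {
        /* Scheduling logic: select the first thread that is not stalled. */
        int chosen = -1;
        for (int t = 0; t < num_threads; t++) {
            if (ctx[t].stall_left == 0) { chosen = t; break; }
        }

        /* Outstanding loads make progress whether or not the core is busy. */
        for (int t = 0; t < num_threads; t++) {
            if (ctx[t].stall_left > 0) ctx[t].stall_left--;
        }

        if (chosen >= 0) {
            busy++;
            if (--ctx[chosen].work_left == 0) {
                /* The thread issues a load and stalls; the core will pick
                   another context on the next cycle. */
                ctx[chosen].stall_left = mem_latency;
                ctx[chosen].work_left = compute_cycles;
            }
        }
    }
    return busy;
}
```

With one cycle of compute per three-cycle load, a single context keeps the core busy only a quarter of the time, while four contexts keep it busy every cycle. The price is four copies of the per-thread state plus the selection logic.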

@nandita. Great answers!

I'll only call attention to one thing. You say "irregular memory accesses" make prefetching difficult. Irregular accesses may certainly thwart the design of practical hardware prefetchers (for everyone in the class, a hardware prefetcher is a hardware unit in a processor that predicts what memory addresses an instruction stream will access in the future and prefetches the contents of the predicted addresses into the cache), but really it's the unpredictability of memory accesses, not their irregularity, that makes prefetching fundamentally difficult. For example, imagine I have an array of addresses I am going to access in the future:

1 5 0 0 6 7 2 10 42 0 4 0 0 10 30 2 ...

This is certainly an irregular pattern, but it would be prefetchable if I had this list ahead of time. (A gather DMA unit in a stream processor, or a vertex fetch unit in a modern GPU, might do exactly this.)
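
As a rough software analogue, here is a minimal sketch of a gather loop that prefetches from a known index list. It assumes GCC or Clang, where `__builtin_prefetch` is a compiler builtin that acts only as a hint. The function name `gather_sum` and the constant `PREFETCH_DISTANCE` are made up for illustration, and the distance of 8 is an arbitrary guess rather than a tuned value.

```c
#include <stddef.h>

/* Assumed tuning knob: how many iterations ahead to prefetch. */
enum { PREFETCH_DISTANCE = 8 };

/* Sums data[idx[i]] for an irregular index list that is known up front. */
long gather_sum(const int *data, const size_t *idx, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* The whole index list is available now, so the address needed
           PREFETCH_DISTANCE iterations from now can be computed without
           waiting on any earlier load of data[]. */
        if (i + PREFETCH_DISTANCE < n) {
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
        sum += data[idx[i]];
    }
    return sum;
}
```

The access pattern into `data` is as irregular as the index list, yet every future address is computable in advance, which is what makes the prefetch possible.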

However, the example you mention, linked list traversal, is a great example of unpredictable data access, since the address the processor needs next is not known until the prior node is visited. I'd claim it's this unpredictability, not the irregularity of the access pattern, that is the fundamental reason why data in a linked list traversal would be tough to prefetch.
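
For contrast, here is the same kind of sketch for a linked list, again assuming GCC or Clang for `__builtin_prefetch`; the `node` layout and the name `list_sum` are hypothetical. The best the loop can do is prefetch one hop ahead, because the address of the node after that is stored inside a node that has not been loaded yet.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Sums the values in a singly linked list. */
long list_sum(const struct node *head)
{
    long sum = 0;
    for (const struct node *n = head; n != NULL; n = n->next) {
        /* n->next is the only future address known at this point.
           Finding n->next->next would require the load of n->next to
           complete first, so there is no way to run further ahead. */
        if (n->next != NULL) {
            __builtin_prefetch(n->next, 0, 1);
        }
        sum += n->value;
    }
    return sum;
}
```

Unless the work per node is long enough to cover a full memory latency, this one-hop prefetch hides very little, since each address is only discovered once the previous load returns.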