the c++20 synchronization library...bryce adelstein lelbach cuda c++ core libraries lead iso c++...
TRANSCRIPT
![Page 1: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/1.jpg)
unique_future<std::uint64_t>fibonacci(execution_policy auto&& s, std::uint64_t n) {if (n < 2) co_return n;
auto n1 = async(s, fibonacci<decltype(s)>, s, n - 1);auto n2 = fibonacci(s, n - 2);
co_return co_await n1 + co_await n2;}
Bryce Adelstein Lelbach, Meeting C++ 2019
Copyright (C) 2019 Bryce Adelstein Lelbach
The C++20 Synchronization Library
![Page 2: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/2.jpg)
Bryce Adelstein Lelbach
CUDA C++ Core Libraries Lead
ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair
THE C++20 SYNCHRONIZATION LIBRARY@blelbach
![Page 3: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/3.jpg)
includecpp.org
![Page 4: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/4.jpg)
Bryce Adelstein Lelbach
CUDA C++ Core Libraries Lead
ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair
THE C++20 SYNCHRONIZATION LIBRARY@blelbach
![Page 5: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/5.jpg)
namespace stdr = std::ranges;namespace stdv = std::views;
void f(std::invocable auto&&);// ^ Constrained template.
5Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 6: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/6.jpg)
Recipe For a Tasking Runtime
Worker threads.
Multi-consumer, multi-producer concurrent queue.
Termination detection mechanism.
Parallel algorithms.
6Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 7: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/7.jpg)
Recipe For a Tasking Runtime
Worker threads.
Multi-consumer, multi-producer concurrent queue.
Termination detection mechanism.
Parallel algorithms.
7Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 8: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/8.jpg)
struct thread_group {private:std::vector<std::thread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
8Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 9: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/9.jpg)
struct thread_group {private:std::vector<std::thread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
int main() {std::atomic<std::uint64_t> count(0);
{ thread_group tg(6, [&] { ++count; });
}
std::cout << count << "\n";}
9Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 10: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/10.jpg)
struct thread_group {private:std::vector<std::thread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
int main() {std::atomic<std::uint64_t> count(0);
{ thread_group tg(6, [&] { ++count; });
}
std::cout << count << "\n";}
10Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 11: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/11.jpg)
struct thread_group {private:std::vector<std::thread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}
~thread_group() {stdr::for_each(members, [] (std::thread& t) { t.join(); });
}};
11Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 12: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/12.jpg)
struct thread_group {private:std::vector<std::jthread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
12Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 13: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/13.jpg)
std::jthread
Just like std::thread, except:
When destroyed, if the thread is joinable, it joins instead of calling terminate.
Joining Thread
13Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 14: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/14.jpg)
struct thread_group {private:std::vector<std::jthread> members;
public:thread_group(std::uint64_t n, std::invocable auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
14Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 15: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/15.jpg)
struct thread_group {private:std::vector<std::jthread> members;
public:thread_group(std::uint64_t n, std::invocable<std::stop_token> auto&& f) {for (auto i : stdv::iota(0, n)) members.emplace_back(f);
}};
int main() {std::atomic<std::uint64_t> count(0);
{thread_group tg(6,[&] (std::stop_token s) { while (!s.stop_requested()) ++count; }
);}
std::cout << count << "\n";}
15Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 16: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/16.jpg)
std::jthread
Just like std::thread, except:
When destroyed, if the thread is joinable, it joins instead of calling terminate.
It supports interruption.
std::jthread invocables will be passed a std::stop_token parameter if they support it.
Interruption API:
[[nodiscard]] stop_source std::jthread::get_stop_source() noexcept;
[[nodiscard]] stop_token std::jthread::get_stop_token() const noexcept;
bool std::jthread::request_stop() noexcept;
Joining Thread
16Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 17: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/17.jpg)
std::stop_*
std::stop_source (analogous to a promise)
Producer of stop requests.
Owns the shared state (if any).
std::stop_token (analogous to future)
Handle to a std::stop_source.
Consumer only; can query for stop requests, but can’t make them.
std::stop_callback (analogous to future::then)
Mechanism for registering invocables to be run upon receiving a stop request.
Interruption Facilities
17Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 18: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/18.jpg)
CV Interruption Support
struct condition_variable_any {template <typename Lock, typename Predicate>bool wait(Lock& lock, stop_token stoken, Predicate pred);
template <typename Lock, class Clock, typename Duration, typename Predicate>bool wait_until(Lock& lock, stop_token stoken,
chrono::time_point<Clock, Duration> const& abs, Predicate pred);template <typename Lock, typename Rep, typename Period, typename Predicate>bool wait_for(Lock& lock, stop_token stoken,
const chrono::duration<Rep, Period>& rel, Predicate pred);};
18Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 19: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/19.jpg)
Recipe For a Tasking Runtime
Worker threads.
Multi-consumer, multi-producer concurrent queue.
Termination detection mechanism.
Parallel algorithms.
19Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 20: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/20.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
20Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 21: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/21.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
21Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 22: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/22.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
22Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 23: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/23.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
23Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 24: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/24.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
24Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 25: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/25.jpg)
std::counting_semaphore
template <ptrdiff_t least_max_value = implementation-defined>struct counting_semaphore {static constexpr ptrdiff_t max() noexcept;
constexpr explicit counting_semaphore(ptrdiff_t desired);
void release(ptrdiff_t update = 1);
void acquire();bool try_acquire() noexcept;template <typename Rep, typename Period>bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template <typename Clock, typename Duration>bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
};25Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 26: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/26.jpg)
std::counting_semaphore
template <ptrdiff_t least_max_value = implementation-defined>struct counting_semaphore {static constexpr ptrdiff_t max() noexcept;
constexpr explicit counting_semaphore(ptrdiff_t desired);
void release(ptrdiff_t update = 1);
void acquire();bool try_acquire() noexcept;template <typename Rep, typename Period>bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template <typename Clock, typename Duration>bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
};26Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 27: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/27.jpg)
std::counting_semaphore
template <ptrdiff_t least_max_value = implementation-defined>struct counting_semaphore {static constexpr ptrdiff_t max() noexcept;
constexpr explicit counting_semaphore(ptrdiff_t desired);
void release(ptrdiff_t update = 1);
void acquire();bool try_acquire() noexcept;template <typename Rep, typename Period>bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template <typename Clock, typename Duration>bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
};27Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 28: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/28.jpg)
std::counting_semaphore
template <ptrdiff_t least_max_value = implementation-defined>struct counting_semaphore {static constexpr ptrdiff_t max() noexcept;
constexpr explicit counting_semaphore(ptrdiff_t desired);
void release(ptrdiff_t update = 1);
void acquire();bool try_acquire() noexcept;template <typename Rep, typename Period>bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template <typename Clock, typename Duration>bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
};28Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 29: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/29.jpg)
std::counting_semaphore
template <ptrdiff_t least_max_value = implementation-defined>struct counting_semaphore {static constexpr ptrdiff_t max() noexcept;
constexpr explicit counting_semaphore(ptrdiff_t desired);
void release(ptrdiff_t update = 1);
void acquire();bool try_acquire() noexcept;template <typename Rep, typename Period>bool try_acquire_for(const chrono::duration<Rep, Period>& rel_time);
template <typename Clock, typename Duration>bool try_acquire_until(const chrono::time_point<Clock, Duration>& abs_time);
};29Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 30: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/30.jpg)
std::counting_semaphore
using binary_semaphore = counting_semaphore<1>;
30Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 31: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/31.jpg)
std::mutex vs std::counting_semaphore<N>
31Copyright (C) 2019 Bryce Adelstein Lelbach
std::mutex
Ensures a resource is only accessed by one thread at a time.
Each thread which needs the resource blocks until it receives it.
Thread identity:
Only the locking thread may unlock.
A locked mutex is unlocked once.
std::counting_semaphore<N>
Does not limit how many threads access resources concurrently.
Each thread which needs a resource blocks until it receives one.
No thread identity:
Any thread may release.
A thread may release up to N count.
![Page 32: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/32.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
32Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 33: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/33.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
33Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 34: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/34.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u){std::scoped_lock l(items_mtx);items.emplace(std::forward<decltype(u)>(u));
}};
34Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 35: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/35.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u){std::scoped_lock l(items_mtx);items.emplace(std::forward<decltype(u)>(u));
}};
35Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 36: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/36.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u){std::scoped_lock l(items_mtx);items.emplace(std::forward<decltype(u)>(u));
}};
36Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 37: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/37.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:void enqueue(std::convertible_to<T> auto&& u) {remaining_space.acquire();push(std::forward<decltype(u)>(u));items_produced.release();
}};
37Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 38: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/38.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:void enqueue(std::convertible_to<T> auto&& u) {remaining_space.acquire();push(std::forward<decltype(u)>(u));items_produced.release();
}};
38Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 39: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/39.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:void enqueue(std::convertible_to<T> auto&& u) {remaining_space.acquire();push(std::forward<decltype(u)>(u));items_produced.release();
}};
39Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 40: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/40.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:void enqueue(std::convertible_to<T> auto&& u) {remaining_space.acquire();push(std::forward<decltype(u)>(u));items_produced.release();
}};
40Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 41: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/41.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
41Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 42: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/42.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
T pop() {std::optional<T> tmp;std::scoped_lock l(items_mtx);tmp = std::move(items.front());items.pop();return std::move(*tmp);
}};
42Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 43: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/43.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
T pop() {std::optional<T> tmp;std::scoped_lock l(items_mtx);tmp = std::move(items.front());items.pop();return std::move(*tmp);
}};
43Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 44: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/44.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
T pop() {std::optional<T> tmp;std::scoped_lock l(items_mtx);tmp = std::move(items.front());items.pop();return std::move(*tmp);
}};
44Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 45: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/45.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:T dequeue() {items_produced.acquire();T tmp = pop();remaining_space.release();return std::move(tmp);
}};
45Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 46: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/46.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:T dequeue() {items_produced.acquire();T tmp = pop();remaining_space.release();return std::move(tmp);
}};
46Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 47: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/47.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:T dequeue() {items_produced.acquire();T tmp = pop();remaining_space.release();return std::move(tmp);
}};
47Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 48: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/48.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:T dequeue() {items_produced.acquire();T tmp = pop();remaining_space.release();return std::move(tmp);
}};
48Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 49: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/49.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:std::optional<T> try_dequeue() {if (!items_produced.try_acquire()) return {};T tmp = pop();remaining_space.release();return std::move(tmp);
}};
49Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 50: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/50.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
public:std::optional<T> try_dequeue() {if (!items_produced.try_acquire()) return {};T tmp = pop();remaining_space.release();return std::move(tmp);
}};
50Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 51: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/51.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
51Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 52: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/52.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; std::mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
52Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 53: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/53.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {while (flag.test_and_set(std::memory_order_acquire));
}
void unlock() {flag.clear(std::memory_order_release);
}};
53Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 54: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/54.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {for (std::uint64_t k = 0; flag.test_and_set(std::memory_order_acquire); ++k) {if (k < 16) __asm__ __volatile__( "rep; nop" : : : "memory" );else if (k < 64) sched_yield();else {timespec rqtp = { 0, 0 };rqtp.tv_sec = 0; rqtp.tv_nsec = 1000;nanosleep(&rqtp, nullptr);
}}
}
void unlock() {flag.clear(std::memory_order_release);
}};
54Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 55: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/55.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {for (std::uint64_t k = 0; flag.test_and_set(std::memory_order_acquire); ++k) {if (k < 16) __asm__ __volatile__( "rep; nop" : : : "memory" );else if (k < 64) sched_yield();else {timespec rqtp = { 0, 0 };rqtp.tv_sec = 0; rqtp.tv_nsec = 1000;nanosleep(&rqtp, nullptr);
}}
}
void unlock() {flag.clear(std::memory_order_release);
}};
55Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 56: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/56.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {for (std::uint64_t k = 0; flag.test_and_set(std::memory_order_acquire); ++k) {if (k < 16) __asm__ __volatile__( "rep; nop" : : : "memory" );else if (k < 64) sched_yield();else {timespec rqtp = { 0, 0 };rqtp.tv_sec = 0; rqtp.tv_nsec = 1000;nanosleep(&rqtp, nullptr);
}}
}
void unlock() {flag.clear(std::memory_order_release);
}};
56Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 57: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/57.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {for (std::uint64_t k = 0; flag.test_and_set(std::memory_order_acquire); ++k) {if (k < 16) __asm__ __volatile__( "rep; nop" : : : "memory" );else if (k < 64) sched_yield();else {timespec rqtp = { 0, 0 };rqtp.tv_sec = 0; rqtp.tv_nsec = 1000;nanosleep(&rqtp, nullptr);
}}
}
void unlock() {flag.clear(std::memory_order_release);
}};
57Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 58: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/58.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {for (std::uint64_t k = 0; flag.test_and_set(std::memory_order_acquire); ++k) {if (k < 16) __asm__ __volatile__( "rep; nop" : : : "memory" );else if (k < 64) sched_yield();else {timespec rqtp = { 0, 0 };rqtp.tv_sec = 0; rqtp.tv_nsec = 1000;nanosleep(&rqtp, nullptr);
}}
}
void unlock() {flag.clear(std::memory_order_release);
}};
58Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 59: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/59.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {while (flag.test_and_set(std::memory_order_acquire))flag.wait(true, std::memory_order_relaxed);
}
void unlock() {flag.clear(std::memory_order_release);flag.notify_one();
}};
59Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 60: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/60.jpg)
struct spin_mutex {private:std::atomic<bool> flag = ATOMIC_VAR_INIT(false);
public:void lock() {while (flag.exchange(true, std::memory_order_acquire))flag.wait(true, std::memory_order_relaxed);
}
void unlock() {flag.store(false, std::memory_order_release);flag.notify_one();
}};
60Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 61: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/61.jpg)
std::atomic{_flag}::wait/notify
template <typename T>struct atomic {void wait(T old, memory_order = memory_order::seq_cst) const volatile noexcept;void wait(T old, memory_order = memory_order::seq_cst) const noexcept;void notify_one() volatile noexcept;void notify_one() noexcept;void notify_all() volatile noexcept;void notify_all() noexcept;
};
61Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 62: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/62.jpg)
std::atomic{_flag}::wait/notify
template <typename T>struct atomic {void wait(T old, memory_order = memory_order::seq_cst) const volatile noexcept;void wait(T old, memory_order = memory_order::seq_cst) const noexcept;void notify_one() volatile noexcept;void notify_one() noexcept;void notify_all() volatile noexcept;void notify_all() noexcept;
};
62Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 63: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/63.jpg)
std::atomic{_flag}::wait/notify
template <typename T>struct atomic {void wait(T old, memory_order = memory_order::seq_cst) const volatile noexcept;void wait(T old, memory_order = memory_order::seq_cst) const noexcept;void notify_one() volatile noexcept;void notify_one() noexcept;void notify_all() volatile noexcept;void notify_all() noexcept;
};
63Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 64: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/64.jpg)
std::atomic{_flag}::wait/notify
struct atomic_flag {void wait(bool old, memory_order = memory_order::seq_cst) const volatile noexcept;void wait(bool old, memory_order = memory_order::seq_cst) const noexcept;void notify_one() volatile noexcept;void notify_one() noexcept;void notify_all() volatile noexcept;void notify_all() noexcept;
};
64Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 65: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/65.jpg)
std::atomic_flag::test
struct atomic_flag {void wait(bool old, memory_order = memory_order::seq_cst) const volatile noexcept;void wait(bool old, memory_order = memory_order::seq_cst) const noexcept;void notify_one() volatile noexcept;void notify_one() noexcept;void notify_all() volatile noexcept;void notify_all() noexcept;
bool test(memory_order = memory_order::seq_cst) const volatile noexcept;bool test(memory_order = memory_order::seq_cst) const noexcept;
};
65Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 66: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/66.jpg)
std::atomic{_flag} wait and notify
Futex. Supported for certain size objects on Linux and Windows.
Condition Variables. Supported for certain size objects on Linux and Mac.
Contention Table. Used to optimize futex notify or to hold CVs.
Timed back-off. Supported on everything.
Spinlock. Supported on everything. Note: performance is terrible.
Some possible implementation strategies
66Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 67: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/67.jpg)
struct spin_mutex {private:std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:void lock() {while (flag.test_and_set(std::memory_order_acquire))flag.wait(true, std::memory_order_relaxed);
}
void unlock() {flag.clear(std::memory_order_release);flag.notify_one();
}};
67Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 68: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/68.jpg)
68Copyright (C) 2019 Bryce Adelstein Lelbach
spin_mutex
unlock
Thread A
lock
unlock
Thread B
lock
![Page 69: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/69.jpg)
69Copyright (C) 2019 Bryce Adelstein Lelbach
spin_mutex
lock
unlock
Thread A
lock
unlock
Thread B
UNFAIR
![Page 70: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/70.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 70Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 71: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/71.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 71Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 72: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/72.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 72Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 73: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/73.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 73Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 74: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/74.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 74Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 75: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/75.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 75Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 76: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/76.jpg)
76Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 77: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/77.jpg)
struct ticket_mutex {private:std::atomic<int> in = ATOMIC_VAR_INIT(0);std::atomic<int> out = ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 77Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 78: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/78.jpg)
struct ticket_mutex {private:alignas(std::hardware_destructive_interference_size) std::atomic<int> in= ATOMIC_VAR_INIT(0);
alignas(std::hardware_destructive_interference_size) std::atomic<int> out= ATOMIC_VAR_INIT(0);
public:void lock() {auto const my = in.fetch_add(1, std::memory_order_acquire);while (true) {auto const now = out.load(std::memory_order_acquire);if (now == my) return;out.wait(now, std::memory_order_relaxed);
}}
void unlock() {out.fetch_add(1, std::memory_order_release);out.notify_all();
}}; 78Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 79: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/79.jpg)
template <typename T, std::uint64_t QueueDepth>struct concurrent_bounded_queue {private:std::queue<T> items; ticket_mutex items_mtx;std::counting_semaphore<QueueDepth> items_produced{0};std::counting_semaphore<QueueDepth> remaining_space{QueueDepth};
void push(std::convertible_to<T> auto&& u);T pop();
public:constexpr concurrent_bounded_queue() = default;
void enqueue(std::convertible_to<T> auto&& u);
T dequeue();std::optional<T> try_dequeue();
};
79Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 80: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/80.jpg)
Recipe For a Tasking Runtime
Worker threads.
Multi-consumer, multi-producer concurrent queue.
Termination detection mechanism.
Parallel algorithms.
80Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 81: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/81.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s);
public:bounded_depth_task_manager(std::uint64_t n);
void submit(std::invocable auto&& f);};
81Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 82: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/82.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s);
public:bounded_depth_task_manager(std::uint64_t n);
void submit(std::invocable auto&& f);};
82Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 83: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/83.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
public:void submit(std::invocable auto&& f) {tasks.enqueue(std::forward<decltype(f)>(f));
}};
83Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 84: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/84.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s);
public:bounded_depth_task_manager(std::uint64_t n);
void submit(std::invocable auto&& f);};
84Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 85: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/85.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
}
public:bounded_depth_task_manager(std::uint64_t n): threads(n, [&] (std::stop_token s) { process_tasks(s); })
{}};
85Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 86: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/86.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
}
public:bounded_depth_task_manager(std::uint64_t n): threads(n, [&] (std::stop_token s) { process_tasks(s); })
{}};
86Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 87: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/87.jpg)
int main() {std::atomic<std::uint64_t> count(0);
{bounded_depth_task_manager<64> tm(6);
for (auto i : stdv::iota(0, 256))tm.submit([&] { ++count; });
}
std::cout << count << "\n";}
87Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 88: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/88.jpg)
int main() {std::atomic<std::uint64_t> count(0);
{bounded_depth_task_manager<64> tm(6);
for (auto i : stdv::iota(0, 256))tm.submit([&] { ++count; });
}
std::cout << count << "\n";}
88Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 89: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/89.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
}
public:bounded_depth_task_manager(std::uint64_t n): threads(n, [&] (std::stop_token s) { process_tasks(s); })
{}};
89Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 90: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/90.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:concurrent_bounded_queue<std::any_invocable<void()>, QueueDepth> tasks;thread_group threads;
void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:bounded_depth_task_manager(std::uint64_t n): threads(n, [&] (std::stop_token s) { process_tasks(s); })
{}};
90Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 91: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/91.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
91Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 92: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/92.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
92Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 93: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/93.jpg)
std::latch
struct latch {static constexpr ptrdiff_t max() noexcept;
constexpr explicit latch(ptrdiff_t expected);
latch(const latch&) = delete;latch& operator=(const latch&) = delete;
void count_down(ptrdiff_t update = 1);bool try_wait() const noexcept;void wait() const;void arrive_and_wait(ptrdiff_t update = 1);
};
93Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 94: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/94.jpg)
std::latch
struct latch {static constexpr ptrdiff_t max() noexcept;
constexpr explicit latch(ptrdiff_t expected);
latch(const latch&) = delete;latch& operator=(const latch&) = delete;
void count_down(ptrdiff_t update = 1);bool try_wait() const noexcept;void wait() const;void arrive_and_wait(ptrdiff_t update = 1);
};
94Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 95: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/95.jpg)
std::latch
struct latch {static constexpr ptrdiff_t max() noexcept;
constexpr explicit latch(ptrdiff_t expected);
latch(const latch&) = delete;latch& operator=(const latch&) = delete;
void count_down(ptrdiff_t update = 1);bool try_wait() const noexcept;void wait() const;void arrive_and_wait(ptrdiff_t update = 1);
};
95Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 96: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/96.jpg)
std::latch
struct latch {static constexpr ptrdiff_t max() noexcept;
constexpr explicit latch(ptrdiff_t expected);
latch(const latch&) = delete;latch& operator=(const latch&) = delete;
void count_down(ptrdiff_t update = 1);bool try_wait() const noexcept;void wait() const;void arrive_and_wait(ptrdiff_t update = 1);
};
96Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 97: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/97.jpg)
std::latch
struct latch {static constexpr ptrdiff_t max() noexcept;
constexpr explicit latch(ptrdiff_t expected);
latch(const latch&) = delete;latch& operator=(const latch&) = delete;
void count_down(ptrdiff_t update = 1);bool try_wait() const noexcept;void wait() const;void arrive_and_wait(ptrdiff_t update = 1);
};
97Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 98: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/98.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
98Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 99: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/99.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
99Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 100: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/100.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
100Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 101: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/101.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
101Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 102: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/102.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
102Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 103: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/103.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
while (true) {if (auto f = tasks.try_dequeue()) std::move(*f)();else break;
}}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
103Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 104: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/104.jpg)
template <std::uint64_t QueueDepth>struct bounded_depth_task_manager {private:void process_tasks(std::stop_token s) {while (!s.stop_requested())tasks.dequeue()();
}
public:~bounded_depth_task_manager() {
std::latch l(threads.size() + 1);for (auto i : stdv::iota(0, threads.size()))submit([&] { l.arrive_and_wait(); });
threads.request_stop();l.count_down();
}};
104Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 105: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/105.jpg)
Recipe For a Tasking Runtime
Worker threads.
Multi-consumer, multi-producer concurrent queue.
Termination detection mechanism.
Parallel algorithms.
105Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 106: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/106.jpg)
template <stdr::range I, std::random_access_iterator O,typename T, std::invocable</* ... */> BO>
requires /* ... */void histogram(I&& input, O output, T inc, OP op) {stdr::for_each(input, [&] (auto&& t) { output[op(t)] += inc; });
}
106Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 107: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/107.jpg)
template <stdr::range I, std::random_access_iterator O,typename T, std::invocable</* ... */> BO>
requires /* ... */void histogram(I&& input, O output, T inc, OP op) {stdr::for_each(input, [&] (auto&& t) { output[op(t)] += inc; });
}
107Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 108: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/108.jpg)
Histogram
108Copyright (C) 2019 Bryce Adelstein Lelbach
H e l l o C + +
![Page 109: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/109.jpg)
Histogram
109Copyright (C) 2019 Bryce Adelstein Lelbach
e l o C H +
1 2 1 1 1 2 1
H e l l o C + +
![Page 110: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/110.jpg)
template <execution_policy EP,stdr::random_access_range I, std::random_access_iterator O,typename T, std::invocable</* ... */> BO>
requires /* ... */void histogram(EP&& exec, I&& input, O output, T inc, OP op);
110Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 111: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/111.jpg)
template <execution_policy EP,stdr::random_access_range I, std::random_access_iterator O,typename T, std::invocable</* ... */> BO>
requires /* ... */void histogram(EP&& exec, I&& input, O output, T inc, OP op);
111Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 112: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/112.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
// ...}
112Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 113: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/113.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
// ...}
113Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 114: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/114.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
// ...}
114Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 115: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/115.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
std::latch l(chunks);
// ...}
115Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 116: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/116.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit(// ...
);}
116Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 117: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/117.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit(// ...
);}
117Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 118: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/118.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
// ...}
);}
118Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 119: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/119.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
// ...}
);}
119Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 120: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/120.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
stdr::for_each(stdr::begin(input) + my_begin,stdr::begin(input) + my_end,[&] (auto&& t) {output[op(t)] += inc;
});
// ...}
);}
120Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 121: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/121.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
stdr::for_each(stdr::begin(input) + my_begin,stdr::begin(input) + my_end,[&] (auto&& t) {output[op(t)] += inc;
});
// ...}
);}
121Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 122: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/122.jpg)
e l o C H +
1 2 1 1 1 2 1
Histogram
122Copyright (C) 2019 Bryce Adelstein Lelbach
H e l l o C + +
H e l l o C + +
![Page 123: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/123.jpg)
e l o C H +
1 2 1 1 1 2 1
Histogram
123Copyright (C) 2019 Bryce Adelstein Lelbach
H e l l o C + +
H e l l o C + +
![Page 124: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/124.jpg)
Histogram
124Copyright (C) 2019 Bryce Adelstein Lelbach
e l o C H +
1 2 1 1 1 2 1
H e l l o C + +
H e l l o C + +
![Page 125: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/125.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
stdr::for_each(stdr::begin(input) + my_begin,stdr::begin(input) + my_end,[&] (auto&& t) {std::atomic_ref r(output[op(t)]);r.fetch_add(inc, std::memory_order_relaxed);
});
// ...}
);}
125Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 126: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/126.jpg)
std::atomic_ref<T>
126Copyright (C) 2019 Bryce Adelstein Lelbach
std::atomic<T> holds a T.
template <struct T>struct atomic {private:T data; // exposition only
public:// ...
};
std::atomic_ref<T> does not hold a T.
template <struct T>struct atomic_ref {private:T* ptr; // exposition only
public:explicit atomic_ref(T&);
// Otherwise, same API as std::atomic.};
![Page 127: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/127.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
stdr::for_each(stdr::begin(input) + my_begin,stdr::begin(input) + my_end,[&] (auto&& t) {std::atomic_ref r(output[op(t)]);r.fetch_add(inc, std::memory_order_relaxed);
});
// ...}
);}
127Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 128: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/128.jpg)
std::atomic<floating-point>
template<> struct atomic<floating-point> {floating-point fetch_add(floating-point,
memory_order = memory_order_seq_cst) volatile noexcept;floating-point fetch_add(floating-point,
memory_order = memory_order_seq_cst) noexcept;floating-point fetch_sub(floating-point,
memory_order = memory_order_seq_cst) volatile noexcept;floating-point fetch_sub(floating-point,
memory_order = memory_order_seq_cst) noexcept;};
128Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 129: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/129.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &input, &l] {auto const my_begin = chunk * chunk_size;auto const my_end = std::min(elements, (chunk + 1) * chunk_size);
stdr::for_each(stdr::begin(input) + my_begin,stdr::begin(input) + my_end,[&] (auto&& t) {std::atomic_ref r(output[op(t)]);r.fetch_add(inc, std::memory_order_relaxed);
});
l.count_down();}
);}
129Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 130: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/130.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
std::latch l(chunks);
for (std::uint64_t chunk = 0; chunk < chunks; ++chunk)exec.submit(// ...
);
l.wait();}
130Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 131: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/131.jpg)
void histogram(EP&& exec, I&& input, O output, T inc, OP op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents() * 4;std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
std::latch l(chunks);
for (std::uint64_t chunk = 0; chunk < chunks; ++chunk)exec.submit(// ...
);
l.wait();}
131Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 132: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/132.jpg)
template <execution_policy EP,stdr::random_access_range I, std::random_access_iterator O,std::invocable</* ... */> BO>
requires /* ... */void inclusive_scan(EP&& exec, I&& input, O output, OP op);
132Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 133: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/133.jpg)
Inclusive Scan
133Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc abcd abcde abcdef abcdefg abcdefgh abcdefghi
![Page 134: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/134.jpg)
Inclusive Scan
134Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a b c d e f g h i
![Page 135: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/135.jpg)
Inclusive Scan
135Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc d de def g gh ghi
std::inclusive_scan std::inclusive_scan std::inclusive_scan
![Page 136: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/136.jpg)
Inclusive Scan
136Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc d de def g gh ghi
![Page 137: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/137.jpg)
Inclusive Scan
137Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc d de def g gh ghi
![Page 138: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/138.jpg)
std::inclusive_scan std::inclusive_scan std::inclusive_scan
Inclusive Scan
138Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
abc def ghiaggregates =
a ab abc d de def g gh ghi
![Page 139: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/139.jpg)
Inclusive Scan
139Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
abc def ghiaggregates =
std::inclusive_scan
a ab abc d de def g gh ghi
std::inclusive_scan std::inclusive_scan std::inclusive_scan
![Page 140: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/140.jpg)
Inclusive Scan
140Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc d de def g gh ghi
std::inclusive_scan std::inclusive_scan std::inclusive_scan
abc abcdef abcdefghiaggregates =
std::inclusive_scan
![Page 141: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/141.jpg)
std::inclusive_scan
Inclusive Scan
141Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc abcd abcde abcdef abcdefg abcdefgh abcdefghi
Increment by aggregates[0] Increment by aggregates[1]
a ab abc d de def g gh ghi
std::inclusive_scan std::inclusive_scan std::inclusive_scan
abc abcdef abcdefghiaggregates =
![Page 142: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/142.jpg)
Inclusive Scan
142Copyright (C) 2019 Bryce Adelstein Lelbach
a b c d e f g h i
a ab abc abcd abcde abcdef abcdefg abcdefgh abcdefghi
Increment by aggregates[0] Increment by aggregates[1]
abc def ghiaggregates =
std::inclusive_scan
a ab abc d de def g gh ghi
std::inclusive_scan std::inclusive_scan std::inclusive_scanUpsweep
Downsweep
![Page 143: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/143.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier</* ... */> upsweep_barrier(chunks, /* ... */);
std::latch downsweep_latch(chunks);
// ... }
143Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 144: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/144.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier</* ... */> upsweep_barrier(chunks, /* ... */);
std::latch downsweep_latch(chunks);
// ... }
144Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 145: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/145.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier</* ... */> upsweep_barrier(chunks, /* ... */);
std::latch downsweep_latch(chunks);
// ... }
145Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 146: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/146.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier</* ... */> upsweep_barrier(chunks, /* ... */);
std::latch downsweep_latch(chunks);
// ... }
146Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 147: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/147.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier</* ... */> upsweep_barrier(chunks, /* ... */);
std::latch downsweep_latch(chunks);
// ... }
147Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 148: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/148.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op);
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
148Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 149: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/149.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op);
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
149Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 150: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/150.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op);
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
150Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 151: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/151.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op);
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
151Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 152: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/152.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op);
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
152Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 153: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/153.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier<std::function<void()>> upsweep_barrier(chunks,[&] { stdr::inclusive_scan(aggregates, aggregates.begin(), op); });
std::latch downsweep_latch(chunks);
// ... }
153Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 154: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/154.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};154Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 155: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/155.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};155Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 156: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/156.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};156Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 157: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/157.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};157Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 158: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/158.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};158Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 159: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/159.jpg)
std::barrier
template <typename CompletionFunction = see below>struct barrier {using arrival_token = see below;
static constexpr ptrdiff_t max() noexcept;
constexpr explicit barrier(ptrdiff_t expected,CompletionFunction f = CompletionFunction());
[[nodiscard]] arrival_token arrive(ptrdiff_t update = 1);void wait(arrival_token&& arrival) const;
void arrive_and_wait();void arrive_and_drop();
};159Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 160: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/160.jpg)
std::latch vs std::barrier
160Copyright (C) 2019 Bryce Adelstein Lelbach
std::latch
Supports asynchronous arrival.
Single phase.
No thread identity:
Threads may arrive multiple times.
Any thread may wait on a latch.
No completion function.
std::barrier
Supports asynchronous arrival.
Multi phase.
Thread identity:
A thread may arrive only once per phase.
Only a thread who has arrived may wait.
Supports completion functions.
![Page 161: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/161.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
using T = stdr::ranges_value_t<I>;std::vector<T> aggregates(chunks);
std::barrier<std::function<void()>> upsweep_barrier(chunks,[&] { stdr::inclusive_scan(aggregates, aggregates.begin(), op); });
std::latch downsweep_latch(chunks);
// ... }
161Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 162: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/162.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op),
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
162Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 163: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/163.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
auto const this_begin = chunk * chunk_size;auto const this_end = std::min(elements, (chunk + 1) * chunk_size);aggregates[chunk] = *--stdr::inclusive_scan(stdr::begin(input) + this_begin,
stdr::begin(input) + this_end,output + this_begin,op),
upsweep_barrier.arrive_and_wait();
// ...}));
// ... }
163Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 164: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/164.jpg)
std::barrierSynchronization
164Copyright (C) 2019 Bryce Adelstein Lelbach
N x arrives
1 x completion
N x waits complete
happens-before
happens-before
![Page 165: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/165.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
// ...
upsweep_barrier.arrive_and_wait();
if (0 != chunk)stdr::for_each(output + this_begin, output + this_end,
[&, chunk] (auto& t) { t = op(std::move(t), aggregates[chunk - 1]); });
downsweep_latch.count_down();}));
// ... }
165Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 166: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/166.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
// ...
upsweep_barrier.arrive_and_wait();
if (0 != chunk)stdr::for_each(output + this_begin, output + this_end,
[&, chunk] (auto& t) { t = op(std::move(t), aggregates[chunk - 1]); });
downsweep_latch.count_down();}));
// ... }
166Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 167: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/167.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
// ...
upsweep_barrier.arrive_and_wait();
if (0 != chunk)stdr::for_each(output + this_begin, output + this_end,
[&, chunk] (auto& t) { t = op(std::move(t), aggregates[chunk - 1]); });
downsweep_latch.count_down();}));
// ... }
167Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 168: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/168.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {// ...
for (auto chunk : stdv::iota(0, chunks))exec.submit([=, &aggregates, &upsweep_barrier, &downsweep_latch] {
// ...
upsweep_barrier.arrive_and_wait();
if (0 != chunk)stdr::for_each(output + this_begin, output + this_end,
[&, chunk] (auto& t) { t = op(std::move(t), aggregates[chunk - 1]); });
downsweep_latch.count_down();}));
// ... }
168Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 169: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/169.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
std::vector<T> aggregates(chunks);
std::barrier<std::function<void()>> upsweep_barrier(chunks,[&] { stdr::inclusive_scan(aggregates, aggregates.begin(), op); });
std::latch downsweep_latch(chunks);
for (auto chunk : stdv::iota(0, chunks))exec.submit(
// ...);
downsweep_latch.wait();}
169Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 170: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/170.jpg)
void inclusive_scan(EP&& exec, I&& input, O&& output, BO&& op) {std::uint64_t const elements = stdr::distance(input);std::uint64_t const chunks = exec.concurrent_agents();std::uint64_t const chunk_size = (elements + chunks - 1) / chunks;
std::vector<T> aggregates(chunks);
std::barrier<std::function<void()>> upsweep_barrier(chunks,[&] { stdr::inclusive_scan(aggregates, aggregates.begin(), op); });
std::latch downsweep_latch(chunks);
for (auto chunk : stdv::iota(0, chunks))exec.submit(
// ...);
downsweep_latch.wait();}
170Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 171: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/171.jpg)
C++20 Synchronization Library
std::atomic<T> et al
wait/notify interface
std::atomic_ref<T>
test interface for std::atomic_flag
Floating-point specializations
171Copyright (C) 2019 Bryce Adelstein Lelbach
std::latch & std::barrier
std::counting_semaphore
std::jthread
Joining destructor
std::stop_* interruption mechanism
![Page 172: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/172.jpg)
libcu++
Opt-in, heterogeneous, incremental C++ standard library for CUDA.
Port of LLVM’s libc++; contributed C++20 sync library upstream.
Version 1 (next week): <atomic> (Pascal+), <type_traits>.
Version 2 (1H 2020): atomic<T>::wait/notify (Volta+), <barrier>(Volta+), <latch> (Volta+), <counting_semaphore> (Volta+), <chrono>, <ratio>, <functional> minus function.
Future priorities: <complex>, <tuple>, <array>, <utility>, <cmath>, string processing, …
The CUDA C++ Standard Library
172Copyright (C) 2019 Bryce Adelstein Lelbach
![Page 173: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/173.jpg)
173Copyright (C) 2019 Bryce Adelstein Lelbach
#include <atomic>std::atomic<int> x;
#include <cuda/std/atomic>cuda::std::atomic<int> x;
#include <cuda/atomic>cuda::atomic<int, cuda::thread_scope_block> x;
std:: ISO C++, __host__ only.
cuda::std:: CUDA C++, __host__ __device__.
Strictly conforming to ISO C++.
cuda:: CUDA C++, __host__ __device__.
Conforming extensions to ISO C++.
![Page 174: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/174.jpg)
174Copyright (C) 2019 Bryce Adelstein Lelbach
#include <atomic>std::atomic<int> x;
#include <cuda/std/atomic>cuda::std::atomic<int> x;
#include <cuda/atomic>cuda::atomic<int, cuda::thread_scope_block> x;
std:: ISO C++, __host__ only.
cuda::std:: CUDA C++, __host__ __device__.
Strictly conforming to ISO C++.
cuda:: CUDA C++, __host__ __device__.
Conforming extensions to ISO C++.
CUDA is the only GPU platform that implements C++ parallel forward progress and the C++ memory model; not possible
with OpenCL or SYCL.
![Page 175: The C++20 Synchronization Library...Bryce Adelstein Lelbach CUDA C++ Core Libraries Lead ISO C++ Library Evolution Incubator Chair, ISO C++ Tooling Study Group Chair THE C++20 SYNCHRONIZATION](https://reader036.vdocuments.site/reader036/viewer/2022070114/6082b32108a7de75d8074634/html5/thumbnails/175.jpg)
175Copyright (C) 2019 Bryce Adelstein Lelbach
@blelbach