Pulse · NVIDIA/cccl

November 18, 2024 – November 25, 2024

138 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Start reworking our math functions
#2749 commented on Nov 25, 2024 • 20 new comments
backport `unreachable`
#2852 commented on Nov 22, 2024 • 13 new comments
[STF] reduce access mode
#2830 commented on Nov 25, 2024 • 3 new comments
Add environment to encapsulate information needed for `cudax::vector`
#2775 commented on Nov 25, 2024 • 1 new comment
Higher-level, `cuda::vector<T, Properties...>` (like `thrust::host_vector/device_vector`)
#2057 commented on Nov 20, 2024 • 0 new comments
Provide Run-Length Decode API
#599 commented on Nov 20, 2024 • 0 new comments
[FEA]: Port libcxx <ranges> views and backport to C++17
#93 commented on Nov 20, 2024 • 0 new comments
[DOC]: Write dev docs for how to use Windows images for local development
#94 commented on Nov 20, 2024 • 0 new comments
[FEA]: Extend devcontainer/launch.sh to build custom containers on the fly
#126 commented on Nov 20, 2024 • 0 new comments
Thrust large input support
#49 commented on Nov 20, 2024 • 0 new comments
[FEA]: Use P2322R6 to determine accumulator type in Thrust
#153 commented on Nov 20, 2024 • 0 new comments
[FEA]: Intrusive Decoupled Look-Back
#220 commented on Nov 20, 2024 • 0 new comments
[FEA]: Relax shuffle-based warp exchange requirements
#271 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Thrust/CUB shouldn't invoke user-defined operators on out-of-bounds data
#459 commented on Nov 20, 2024 • 0 new comments
Specialize relevant `cuda::(std::)` types for `__half/bfloat16/fp8`
#525 commented on Nov 20, 2024 • 0 new comments
[FEA]: Transition to new Thrust benchmarks
#557 commented on Nov 20, 2024 • 0 new comments
Use parallel implementations for CPU execution policies
#831 commented on Nov 20, 2024 • 0 new comments
[FEA]: Remove CDP (RDC) architecture filtering logic from Thrust/CUB tests
#1137 commented on Nov 20, 2024 • 0 new comments
Replace mdspan with upstream libc++ implementation
#1185 commented on Nov 20, 2024 • 0 new comments
Proof-of-concept parallel ranges algorithms
#1213 commented on Nov 20, 2024 • 0 new comments
[DOC]: Add CUB examples utilizing multi-dimensional thread blocks
#1659 commented on Nov 20, 2024 • 0 new comments
expose device properties, meminfo etc
#2081 commented on Nov 20, 2024 • 0 new comments
[FEA]: Improve performance of `thrust::all_of`.
#2113 commented on Nov 20, 2024 • 0 new comments
[FEA]: Synchronous data structures that use a cuda::mr::resource to allocate their memory
#2129 commented on Nov 20, 2024 • 0 new comments
[FEA]: Asynchronous data structures that use a cuda::mr::asyncresource to allocate their memory
#2131 commented on Nov 20, 2024 • 0 new comments
[FEA]: Non-owning view types with properties to ensure type safe access to allocated memory
#2132 commented on Nov 20, 2024 • 0 new comments
[FEA]: Prototype a design to ensure asynchronous operations on different streams work nicely with `cuda::launch`
#2143 commented on Nov 20, 2024 • 0 new comments
Publish `cuda-cooperative` Python package
#2148 commented on Nov 20, 2024 • 0 new comments
Roll out address stability optimization
#2403 commented on Nov 20, 2024 • 0 new comments
[FEA]: Multi-dimensional TMA exposure
#39 commented on Nov 20, 2024 • 0 new comments
[EPIC] std::simd support in libcu++
#30 commented on Nov 20, 2024 • 0 new comments
[FEA]: Refactor user-facing Thrust types to use libcu++
#34 commented on Nov 20, 2024 • 0 new comments
[FEA]: Evaluate and retune radix sort and merge sort
#55 commented on Nov 20, 2024 • 0 new comments
[FEA]: Implement 1D TMA support in memcpy_async
#58 commented on Nov 20, 2024 • 0 new comments
[FEA]: Explore potential improvements for DeviceMemcpy::Batched
#59 commented on Nov 20, 2024 • 0 new comments
[FEA]: Add `thread_scope_cluster` and implement `atomic(_ref)<thread_scope_cluster>`
#73 commented on Nov 20, 2024 • 0 new comments
[FEA]: Add support for remote shared memory to barrier<thread_scope_block>
#75 commented on Nov 20, 2024 • 0 new comments
[FEA]: Provide `std::dims`
#2810 commented on Nov 20, 2024 • 0 new comments
[FEA]: Provide `submdspan` for padded layouts
#2809 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Add Hopper features to `cuda::ptx`
#1340 commented on Nov 21, 2024 • 0 new comments
Benchmark and plot before/after performance of BabelStream based on `thrust::transform`
#2717 commented on Nov 21, 2024 • 0 new comments
[Do Not Merge] Implement `<ranges>`
#198 commented on Nov 19, 2024 • 0 new comments
Add compute-sanitizer testing to CI.
#1879 commented on Nov 20, 2024 • 0 new comments
Add simple kernel for deterministic reduction
#2234 commented on Nov 21, 2024 • 0 new comments
`basic_any`: a utility for defining type-erasing wrappers in terms of an interface description
#2633 commented on Nov 22, 2024 • 0 new comments
Add CI with STF MathLib builds
#2651 commented on Nov 19, 2024 • 0 new comments
Configure `workflow-run-job-linux` to use sccache-dist build cluster
#2672 commented on Nov 25, 2024 • 0 new comments
Add missing template parameter to BlockRadixRank example.
#2736 commented on Nov 19, 2024 • 0 new comments
[WIP] Support fancy iterators in cuda.parallel
#2788 commented on Nov 23, 2024 • 0 new comments
backport std integer comparison functions to C++11
#2805 commented on Nov 22, 2024 • 0 new comments
Try to avoid UB in `thrust::reference`
#2813 commented on Nov 25, 2024 • 0 new comments
new type-erased memory resources
#2824 commented on Nov 22, 2024 • 0 new comments
Test proclaim_copyable_arguments for lambdas
#2833 commented on Nov 23, 2024 • 0 new comments
[BUG]: MSVC < 2022 doesn't properly handle thrust's member function detector.
#1731 commented on Nov 20, 2024 • 0 new comments
[EPIC] RAPIDS Should not need to patch CCCL
#1939 commented on Nov 20, 2024 • 0 new comments
[FEA]: Extend cub::DeviceSegmentedSort to support custom comparators
#1577 commented on Nov 20, 2024 • 0 new comments
Benchmark regression testing
#2011 commented on Nov 20, 2024 • 0 new comments
Standard abstraction for specifying thread grid hierarchy and dimensions
#2037 commented on Nov 20, 2024 • 0 new comments
Provide cmath for floating_point<M,E>
#2161 commented on Nov 20, 2024 • 0 new comments
Make bfloat alias for `floating_point<8,7>` that dispatches to accelerated instructions where possible
#2182 commented on Nov 20, 2024 • 0 new comments
[FEA]: Consider compiling CCCL tests with enabled warnings for local memory and register spilling
#2253 commented on Nov 20, 2024 • 0 new comments
[DOC]: Clarity check of CUB device-scope docs
#2323 commented on Nov 20, 2024 • 0 new comments
Return algorithm diagrams to CUB docs
#2319 commented on Nov 20, 2024 • 0 new comments
Make half_t alias for `floating_point<5,10>` that dispatches to accelerated instructions where possible
#2181 commented on Nov 20, 2024 • 0 new comments
[EPIC] CUDA Next asynchronous programming model
#2041 commented on Nov 20, 2024 • 0 new comments
[DOC]: Clarity check of CUB block-scope docs
#2324 commented on Nov 20, 2024 • 0 new comments
[DOC]: Clarity check of CUB warp-scope docs
#2325 commented on Nov 20, 2024 • 0 new comments
[THEME] CUDA Runtime Modernization
#1646 commented on Nov 20, 2024 • 0 new comments
[FEA]: Abstract build step in cccl.c.parallel
#2525 commented on Nov 20, 2024 • 0 new comments
[EXTERNAL] Contribute back Numba documentation updates
#2722 commented on Nov 18, 2024 • 0 new comments
[DOC]: `copy_n` docs formatting bug
#2748 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Improve documentation content by adopting diataxis
#2841 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Fix sphinx
#2807 commented on Nov 20, 2024 • 0 new comments
[EPIC] make the Thrust parallel algorithms available in `cuda::std`
#2818 commented on Nov 20, 2024 • 0 new comments
[FEA]: Implement cuda.parallel.device_for
#2537 commented on Nov 20, 2024 • 0 new comments
[DOC]: Design approach to CCCL documentation authoring
#2327 commented on Nov 20, 2024 • 0 new comments
[FEA]: Add more benchmarks for `thrust::transform`
#2814 commented on Nov 20, 2024 • 0 new comments
[THEME] Add support for 128b `atomic_ref` in device code.
#2048 commented on Nov 20, 2024 • 0 new comments
[EPIC]: `mdspan`-based algorithms
#2471 commented on Nov 20, 2024 • 0 new comments
[EPIC] Improve separation and eliminate redundancy between Thrust and CUB
#24 commented on Nov 20, 2024 • 0 new comments
[EPIC] Consolidate kernels between Thrust and CUB
#26 commented on Nov 20, 2024 • 0 new comments
[EPIC] CUB Performance Tuning
#27 commented on Nov 20, 2024 • 0 new comments
[EPIC] CUB Test Catch2 Migration
#28 commented on Nov 20, 2024 • 0 new comments
[EPIC] Port CUB device-scope algorithm tests to use Catch2
#29 commented on Nov 20, 2024 • 0 new comments
[EPIC] Extended Floating-Point Support
#31 commented on Nov 20, 2024 • 0 new comments
[EPIC] Replace & refactor Thrust/CUB types w/ libcu++
#33 commented on Nov 20, 2024 • 0 new comments
[THEME] Hopper CUDA C++ Feature Exposure
#35 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Unify testing infrastructure
#2806 commented on Nov 19, 2024 • 0 new comments
[FEA]: Make cuda.parallel available on PyPI
#2555 commented on Nov 19, 2024 • 0 new comments
[FEA]: Match cuda.parallel support matrix with CuPy
#2554 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel version of radix sort
#2552 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel version of partition
#2551 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel version of select
#2550 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel.segmented_reduce algorithm
#2549 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel.merge_sort algorithm
#2546 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel version of segmented radix sort
#2553 commented on Nov 19, 2024 • 0 new comments
[FEA]: Provide `mdarray`
#2474 commented on Nov 19, 2024 • 0 new comments
[DOC]: Format function signatures
#2339 commented on Nov 19, 2024 • 0 new comments
[FEA]: Research stateful operators for cuda.parallel
#2538 commented on Nov 19, 2024 • 0 new comments
[FEA]: Implement cuda.parallel version of `cub::DoubleBuffer`
#2548 commented on Nov 19, 2024 • 0 new comments
[FEA]: Support fancy iterators in cuda.parallel
#2479 commented on Nov 20, 2024 • 0 new comments
[BUG]: Proclaiming copyable arguments for lambdas fails to compile
#2834 commented on Nov 20, 2024 • 0 new comments
[DOC]: `thrust::partition_copy` docs are wrong
#2792 commented on Nov 20, 2024 • 0 new comments
[DOC]: `thrust::stable_sort` docs are wrong
#2747 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Improve usability of architecture specific features in libcudacxx
#1083 commented on Nov 20, 2024 • 0 new comments
Redesign libcudacxx architecture specific testing
#1084 commented on Nov 20, 2024 • 0 new comments
[FEA]: Add SM90 to architecture list for CI
#1092 commented on Nov 20, 2024 • 0 new comments
[EPIC] Expand CUB APIs for user-specified algorithmic properties
#1186 commented on Nov 20, 2024 • 0 new comments
[EPIC] Clarify support for CCCL headers with host-only translation units
#1374 commented on Nov 20, 2024 • 0 new comments
[FEA]: Move rapids repositories away from thrust types in API interfaces
#1382 commented on Nov 20, 2024 • 0 new comments
[EPIC] Roadmap for cuda/memory_resource
#1502 commented on Nov 20, 2024 • 0 new comments
[EPIC] Third-party testing in CI
#1507 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Reproducible floating-point reductions
#1558 commented on Nov 20, 2024 • 0 new comments
[FEA]: Run tests through `compute-sanitizer` in CI
#1618 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Setup nightly/weekly CI
#1619 commented on Nov 20, 2024 • 0 new comments
Provide generic implementation of `floating_point<M, E>` limited to arithmetic operations
#1666 commented on Nov 20, 2024 • 0 new comments
[EPIC] Optimize `thrust::transform` for newer architectures
#1947 commented on Nov 20, 2024 • 0 new comments
Create generic floating-point wrapping type for <NumExponentBits, NumMantissaBits>
#1665 commented on Nov 20, 2024 • 0 new comments
Modern versions of cudaMemset and cudaMemcpy
#2000 commented on Nov 20, 2024 • 0 new comments
`cuda::launch` kernel-launch API
#2038 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Atomics Improvements
#2047 commented on Nov 20, 2024 • 0 new comments
[EPIC]: TMA Exposure
#36 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Hopper Cluster support
#37 commented on Nov 20, 2024 • 0 new comments
[EPIC]: 1D TMA/BULK exposure via `memcpy_async`
#38 commented on Nov 20, 2024 • 0 new comments
[EPIC] Fork/Join Parallel Ranges
#44 commented on Nov 20, 2024 • 0 new comments
[EPIC] Heterogeneous, sequential ranges support
#45 commented on Nov 20, 2024 • 0 new comments
[THEME] Asynchronous Parallel Algorithms
#46 commented on Nov 20, 2024 • 0 new comments
[EPIC] Universal 64-bit index type support in Thrust/CUB algorithms
#47 commented on Nov 20, 2024 • 0 new comments
[FEA]: CUB large input support
#50 commented on Nov 20, 2024 • 0 new comments
[EPIC] Docs Overhaul
#51 commented on Nov 20, 2024 • 0 new comments
[EPIC]: Implement `<cuda/std/ranges>` and associated headers
#61 commented on Nov 20, 2024 • 0 new comments
[EPIC] GitHub Actions CI
#68 commented on Nov 20, 2024 • 0 new comments
[EPIC] Track future breaking changes
#101 commented on Nov 20, 2024 • 0 new comments
[EPIC] Setup Windows container and script infrastructure
#248 commented on Nov 20, 2024 • 0 new comments
[FEA]: Vectorize memory operations in CUB algorithms
#307 commented on Nov 20, 2024 • 0 new comments
[FEA]: Automate release process
#653 commented on Nov 20, 2024 • 0 new comments
`thrust::all_of` is slower than a naive reduction
#720 commented on Nov 20, 2024 • 0 new comments
Add additional protections against undefined uses of CUDA extended lambdas
#1004 commented on Nov 20, 2024 • 0 new comments

Overview

38 Pull requests merged by 12 people

16 Pull requests opened by 9 people

16 Issues closed by 9 people

50 Issues opened by 15 people

138 Unresolved conversations