rocm

Here are 123 public repositories matching this topic...

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving mlops llm inferentia llmops llm-serving trainium

Updated May 23, 2024
Python

ROCm / hipBLASLt

hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library

machine-learning amd assembly matrix-multiplication blas hip gpu-computing gemm rocm radeon-open-compute

Updated May 23, 2024
Assembly

apache / tvm

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators

javascript machine-learning performance deep-learning metal compiler gpu vulkan opencl tensor spirv rocm tvm

Updated May 23, 2024
Python

ROCm / aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.

amd llvm openmp clang rocm

Updated May 23, 2024
Fortran

eliranwong / MultiAMDGPU_AIDev_Ubuntu

Multi-AMD-GPU setup for AI development on Ubuntu with ROCm

ai ubuntu amd gpu amdgpu rocm amd-gpu freegenius

Updated May 22, 2024

ROCm / rocRAND

RAND library for HIP programming language

gpu random cuda rng hip rocm

Updated May 22, 2024
C++

ROCm / rocFFT

Next generation FFT implementation for ROCm

fast amd gpu fourier transform fft hip rocm

Updated May 23, 2024
C++

patientx / ComfyUI-Zluda

The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface. Now ZLUDA-enhanced for better AMD GPU performance.

windows amd cuda rocm stable-diffusion comfyui zluda

Updated May 22, 2024
Python

PennyLaneAI / pennylane-lightning

The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane

hpc gpu parallel openmp mpi distributed-computing cuda quantum-computing rocm quantum-machine-learning

Updated May 23, 2024
C++

DejvBayer / afft

C++17 wrapper library for fft-related computations on CPUs and GPUs

cuda fft hip dct mkl dst cufft rocm dtt fftw3 pocketfft vkfft

Updated May 22, 2024
C++

deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

nodejs python c deep-learning cpp tensorflow cuda molecular-dynamics pytorch computational-chemistry lammps materials-science ipi rocm ase potential-energy deepmd

Updated May 23, 2024
C++

cupy / cupy

NumPy & SciPy for GPU

python gpu numpy cuda cublas scipy tensor cudnn rocm cupy cusolver nccl curand cusparse nvrtc cutensor nvtx cusparselt

Updated May 22, 2024
Python

shivaraj-bh / ollama-flake

Run ollama natively - powered by Nix

services nix cuda rocm flakes ollama open-webui

Updated May 22, 2024
Nix

eth-cscs / DLA-Future

DLA-Future

linear-algebra mpi cuda scalapack task-based rocm cholesky-decomposition eigensolver generalized-eigensolver stdexec p2300 distributed-linear-algebra

Updated May 22, 2024
C++

pika-org / pika

pika builds on C++ std::execution with fiber, CUDA, HIP, and MPI support.

cplusplus cpp gpu concurrency mpi cuda parallelism hip rocm stdexec p2300

Updated May 22, 2024
C++

ROCm / hipBLAS

ROCm BLAS marshalling library

cuda blas hip rocm

Updated May 22, 2024
C++

ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform

blas hip rocm

Updated May 22, 2024
C++

quokka-astro / quokka

Two-moment AMR radiation hydrodynamics (with self-gravity, particles, and chemistry) on CPUs/GPUs for astrophysics

gpu cuda particles astrophysics hip hydrodynamics astrochemistry rocm adaptive-mesh-refinement self-gravity

Updated May 22, 2024
C++

ROCm / rocPRIM

ROCm Parallel Primitives

amd gpu parallel cuda primitive hip rocm

Updated May 22, 2024
C++

ROCm / MIVisionX

The MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

Updated May 23, 2024
C++
