1 unstable release: 0.0.0 (Nov 14, 2024)
225KB, 6K SLoC
Cetana
An advanced machine learning library, written in Rust, that empowers developers to build intelligent applications with ease.
Cetana (चेतन) is a Sanskrit word meaning "consciousness" or "intelligence," reflecting the library's goal of bringing machine intelligence to your applications.
Overview
Cetana is a Rust machine learning library designed to run efficient, flexible operations across multiple compute platforms. It aims for a clean, safe API without sacrificing performance. The CPU backend is currently in progress, and GPU backends are planned.
Features
- Type-safe Tensor Operations
- Neural Network Building Blocks
- Automatic Differentiation
- Model Serialization
- Multiple Activation Functions (ReLU, Sigmoid, Tanh)
- Basic Optimizers and Loss Functions
- CPU Backend (with planned GPU support)
Example Usage
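As a rough illustration of the building blocks listed under Features, here is a self-contained sketch in plain Rust of a single dense layer followed by ReLU. It does not use Cetana's actual API: `Tensor`, `matmul`, `map`, `relu`, and `sigmoid` are hypothetical names chosen for this example, and shapes are checked at run time through `Result` rather than through the type system.

```rust
// Standalone sketch: none of these names come from Cetana's API.

/// A minimal row-major 2-D tensor of f32 values (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Tensor {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Tensor {
    /// Builds a tensor, returning an error when the data length does not match the shape.
    fn new(rows: usize, cols: usize, data: Vec<f32>) -> Result<Self, String> {
        if data.len() != rows * cols {
            return Err(format!("expected {} values, got {}", rows * cols, data.len()));
        }
        Ok(Self { rows, cols, data })
    }

    /// Matrix product `self * other`, reporting a shape mismatch as an error.
    fn matmul(&self, other: &Tensor) -> Result<Tensor, String> {
        if self.cols != other.rows {
            return Err(format!(
                "shape mismatch: {}x{} * {}x{}",
                self.rows, self.cols, other.rows, other.cols
            ));
        }
        let mut out = vec![0.0; self.rows * other.cols];
        for i in 0..self.rows {
            for k in 0..self.cols {
                let a = self.data[i * self.cols + k];
                for j in 0..other.cols {
                    out[i * other.cols + j] += a * other.data[k * other.cols + j];
                }
            }
        }
        Tensor::new(self.rows, other.cols, out)
    }

    /// Applies `f` to every element and returns a new tensor of the same shape.
    fn map(&self, f: impl Fn(f32) -> f32) -> Tensor {
        Tensor {
            rows: self.rows,
            cols: self.cols,
            data: self.data.iter().map(|&x| f(x)).collect(),
        }
    }
}

/// ReLU activation: max(x, 0).
fn relu(x: f32) -> f32 {
    x.max(0.0)
}

/// Logistic sigmoid activation.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

fn main() -> Result<(), String> {
    let input = Tensor::new(1, 2, vec![1.0, -2.0])?;
    let weights = Tensor::new(2, 2, vec![0.5, -1.0, -0.25, 2.0])?;

    // Forward pass of one dense layer (no bias) followed by an activation.
    let pre_activation = input.matmul(&weights)?;
    println!("relu:    {:?}", pre_activation.map(relu).data);
    println!("sigmoid: {:?}", pre_activation.map(sigmoid).data);
    println!("tanh:    {:?}", pre_activation.map(f32::tanh).data);
    Ok(())
}
```

Returning `Result` from the constructor and from `matmul` keeps shape errors recoverable instead of panicking, which is one common way to approach a safe tensor API.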
Compute Backends
- CPU (in progress)
- CUDA
- Metal Performance Shaders (MPS)
- Vulkan
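One common way to support several compute platforms is to hide each one behind a shared trait and pick an implementation at run time. The sketch below illustrates that idea only. `Backend`, `CpuBackend`, `Device`, and `select_backend` are hypothetical names, not Cetana's API, and the GPU variants simply fall back to the CPU implementation.

```rust
// Hypothetical backend abstraction; these names are illustrative, not Cetana's API.

/// Operations a compute backend would need to provide (an assumed, minimal set).
trait Backend {
    fn name(&self) -> &'static str;
    /// Element-wise addition of two equally sized buffers.
    fn add(&self, a: &[f32], b: &[f32]) -> Vec<f32>;
}

struct CpuBackend;

impl Backend for CpuBackend {
    fn name(&self) -> &'static str {
        "cpu"
    }

    fn add(&self, a: &[f32], b: &[f32]) -> Vec<f32> {
        assert_eq!(a.len(), b.len(), "buffers must have equal length");
        a.iter().zip(b).map(|(x, y)| x + y).collect()
    }
}

/// Device kinds mirroring the backend list above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Device {
    Cpu,
    Cuda,
    Mps,
    Vulkan,
}

/// Returns a backend for the requested device. Only the CPU is implemented
/// in this sketch, so every GPU device falls back to it.
fn select_backend(device: Device) -> Box<dyn Backend> {
    match device {
        Device::Cpu => Box::new(CpuBackend),
        Device::Cuda | Device::Mps | Device::Vulkan => Box::new(CpuBackend),
    }
}

fn main() {
    for device in [Device::Cpu, Device::Cuda, Device::Mps, Device::Vulkan] {
        let backend = select_backend(device);
        let sum = backend.add(&[1.0, 2.0], &[3.0, 4.0]);
        println!("{:?} requested, running on {}: {:?}", device, backend.name(), sum);
    }
}
```

With this shape, code written against `Backend` would not need to change when a GPU implementation is added; only `select_backend` would.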
Roadmap
Phase 1: Core Implementation (CPU)
- Basic tensor operations
  - Addition, subtraction
  - Matrix multiplication
  - Element-wise operations
  - Broadcasting support
- Neural Network Modules
  - Linear layers
  - Activation functions (ReLU, Sigmoid, Tanh)
  - Convolutional layers
  - Pooling layers
- Automatic Differentiation
  - Backpropagation
  - Gradient computation
  - Auto Grad
- Loss Functions
  - MSE (Mean Squared Error)
  - Cross Entropy
  - Binary Cross Entropy
- Training Utilities
  - Basic training loops
  - Advanced batch processing
    - Mini-batch handling
    - Batch normalization
    - Dropout layers
  - Data loaders
    - Dataset abstraction
    - Data augmentation
    - Custom dataset support
- Model Serialization
  - Save/Load models
  - Export/Import weights
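To show how a loss function, gradient computation, and a basic training loop fit together, here is a small standalone sketch that fits a line with full-batch gradient descent on the MSE loss. It is written in plain Rust and does not call into Cetana. The gradients are derived by hand rather than by automatic differentiation, and `mse` and `fit` are hypothetical helper names.

```rust
// Standalone sketch in plain Rust; it does not call into Cetana.

/// Mean squared error between predictions and targets.
fn mse(pred: &[f32], target: &[f32]) -> f32 {
    let n = pred.len() as f32;
    pred.iter()
        .zip(target)
        .map(|(p, t)| (p - t).powi(2))
        .sum::<f32>()
        / n
}

/// Fits y = w * x + b with full-batch gradient descent and returns (w, b).
fn fit(xs: &[f32], ys: &[f32], lr: f32, epochs: usize) -> (f32, f32) {
    let n = xs.len() as f32;
    let (mut w, mut b) = (0.0f32, 0.0f32);
    for _ in 0..epochs {
        // Hand-derived gradients of the MSE loss with respect to w and b.
        let (mut grad_w, mut grad_b) = (0.0f32, 0.0f32);
        for (&x, &y) in xs.iter().zip(ys) {
            let err = (w * x + b) - y;
            grad_w += 2.0 * err * x / n;
            grad_b += 2.0 * err / n;
        }
        // Gradient descent update.
        w -= lr * grad_w;
        b -= lr * grad_b;
    }
    (w, b)
}

fn main() {
    // Data generated from y = 2x + 1.
    let xs: [f32; 4] = [0.0, 1.0, 2.0, 3.0];
    let ys: [f32; 4] = [1.0, 3.0, 5.0, 7.0];

    let (w, b) = fit(&xs, &ys, 0.05, 2000);
    let pred: Vec<f32> = xs.iter().map(|x| w * x + b).collect();
    println!("w = {:.3}, b = {:.3}, loss = {:.6}", w, b, mse(&pred, &ys));
}
```

An automatic differentiation engine would replace the hand-written gradient lines by recording the forward computation and propagating gradients backward through it.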
Phase 2: GPU Acceleration
- CUDA Backend
  - Basic initialization
  - Memory management
  - Basic operations
  - Advanced operations
  - cuBLAS integration
- MPS Backend (Apple Silicon)
  - Basic operations
  - Performance optimizations
- Vulkan Backend
  - Device initialization
  - Basic compute pipeline
  - Memory management
  - Advanced operations
  - Performance optimizations
Phase 3: Advanced Features
- Distributed Training
  - Multi-GPU support
  - Data parallelism
  - Model parallelism
- Automatic Mixed Precision
- Model Quantization
- Performance Profiling
  - Operation timing
  - Memory usage tracking
  - Bottleneck analysis
- Advanced Optimizations
  - Kernel fusion
  - Memory pooling
  - Operation scheduling
  - Graph optimization
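The operation timing and bottleneck analysis items could be approached with a small wall-clock profiler. The following is a minimal sketch using only the standard library. `Profiler`, `time`, and `bottleneck` are hypothetical names, not part of Cetana, and memory usage tracking is not covered here.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Accumulates total wall-clock time and call count per named operation
/// (hypothetical helper, not part of Cetana).
#[derive(Default)]
struct Profiler {
    totals: HashMap<String, (Duration, u32)>,
}

impl Profiler {
    /// Runs `op`, records how long it took under `name`, and returns its result.
    fn time<T>(&mut self, name: &str, op: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let result = op();
        let elapsed = start.elapsed();
        let entry = self
            .totals
            .entry(name.to_string())
            .or_insert((Duration::ZERO, 0));
        entry.0 += elapsed;
        entry.1 += 1;
        result
    }

    /// Returns the name of the operation with the largest total time, if any.
    fn bottleneck(&self) -> Option<&str> {
        self.totals
            .iter()
            .max_by_key(|(_, (total, _))| *total)
            .map(|(name, _)| name.as_str())
    }
}

fn main() {
    let mut profiler = Profiler::default();

    // A cheap operation and a deliberately slow one.
    let sum = profiler.time("sum", || (0..1_000u64).sum::<u64>());
    profiler.time("sleep", || std::thread::sleep(Duration::from_millis(20)));

    println!("sum = {}", sum);
    for (name, (total, calls)) in &profiler.totals {
        println!("{}: {:?} over {} call(s)", name, total, calls);
    }
    println!("bottleneck: {:?}", profiler.bottleneck());
}
```

Wrapping each tensor operation in a call like `time` keeps the measurement code out of the operations themselves. Wall-clock timing of this kind only covers work that completes synchronously on the CPU.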
Phase 4: High-Level APIs
- Model Zoo
- Pre-trained Models
- Easy-to-use Training APIs
- Integration Examples
- Comprehensive Documentation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Dependencies
~0–9MB, ~82K SLoC