50 releases
new 0.4.2 | Jan 12, 2025 |
---|---|
0.3.6 | Dec 11, 2024 |
0.3.5 | Nov 14, 2024 |
0.2.0 | Jul 29, 2024 |
#53 in Images
420 downloads per month
2MB, 36K SLoC
Image scaling library in Rust
Scales images in different color spaces in Rust, using SIMD and multithreading.
Supported SIMD targets: NEON, SSE, AVX2, AVX-512, AVX-VNNI, WASM.
Colorspace
This library provides conveniences for scaling in different color spaces.
Prebuilt options are available for CIE L*a*b, CIE L*u*v, CIE L*c*h, Linear, Sigmoidal, Oklab and Jzazbz.
These transformations are also very efficient.
Prefer downscaling in linear colorspace or XYZ.
Upscaling may be done in LAB/LUV or on sigmoidized components, and is also efficient in sRGB.
f16 (the “binary16” type defined in IEEE 754-2008) is well supported.
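To illustrate why downscaling in linear colorspace is recommended, here is a minimal, standalone sketch that averages a black and a white sample both ways. It uses the standard sRGB transfer function; the function names are illustrative only and are not part of this crate's API.

```rust
/// Decode an 8-bit sRGB-encoded value to linear light in [0, 1].
fn srgb_to_linear(v: u8) -> f32 {
    let c = v as f32 / 255.0;
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// Encode linear light in [0, 1] back to an 8-bit sRGB value.
fn linear_to_srgb(l: f32) -> u8 {
    let l = l.clamp(0.0, 1.0);
    let c = if l <= 0.0031308 {
        l * 12.92
    } else {
        1.055 * l.powf(1.0 / 2.4) - 0.055
    };
    (c * 255.0).round() as u8
}

/// Average two samples directly on the sRGB-encoded values.
fn average_encoded(a: u8, b: u8) -> u8 {
    ((a as u16 + b as u16 + 1) / 2) as u8
}

/// Average two samples in linear light, then re-encode to sRGB.
fn average_linear(a: u8, b: u8) -> u8 {
    linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) * 0.5)
}

fn main() {
    // Averaging on encoded values gives a darker result than averaging in linear light.
    println!("encoded average: {}", average_encoded(0, 255));
    println!("linear average:  {}", average_linear(0, 255));
}
```

The linear average comes out noticeably brighter than the encoded one, which is the brightness a viewer would perceive from a fine black-and-white pattern seen from a distance.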
Example integration with the `image` crate
    let img = ImageReader::open("./assets/asset.png")
        .unwrap()
        .decode()
        .unwrap();
    let dimensions = img.dimensions();
    let mut bytes = Vec::from(img.as_bytes());

    let mut scaler = LinearScaler::new(ResamplingFunction::Lanczos3);
    scaler.set_threading_policy(ThreadingPolicy::Adaptive);
    // ImageStore::<u8, 4>: (u8, 4) represents RGBA, (u8, 3) represents RGB, etc.
    let store =
        ImageStore::<u8, 4>::from_slice(&mut bytes, dimensions.0 as usize, dimensions.1 as usize)
            .unwrap();
    let mut dst_store =
        ImageStoreMut::<u8, 4>::alloc(dimensions.0 as usize / 2, dimensions.1 as usize / 2);
    // The scaler writes into `dst_store`, so the resized pixels are read from there.
    scaler.resize_rgba(&store, &mut dst_store, true).unwrap();
    let resized_image = dst_store.as_bytes();
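The comment in the example ties the const parameter to the channel count. As a rough illustration of what such an interleaved layout implies, the sketch below uses a hypothetical `PackedView` type (an assumption for illustration, not the crate's `ImageStore`) that checks the buffer length against `width * height * N` and indexes one pixel.

```rust
/// Hypothetical stand-in for a packed (interleaved) image view with N channels.
/// This is not the crate's type; it only illustrates the layout.
struct PackedView<'a, const N: usize> {
    data: &'a [u8],
    width: usize,
    height: usize,
}

impl<'a, const N: usize> PackedView<'a, N> {
    /// Returns None when the slice length does not equal width * height * N.
    fn from_slice(data: &'a [u8], width: usize, height: usize) -> Option<Self> {
        let expected = width.checked_mul(height)?.checked_mul(N)?;
        if data.len() != expected {
            return None;
        }
        Some(Self { data, width, height })
    }

    /// Borrows the N interleaved channel values of the pixel at (x, y).
    fn pixel(&self, x: usize, y: usize) -> Option<&[u8]> {
        if x >= self.width || y >= self.height {
            return None;
        }
        let start = (y * self.width + x) * N;
        self.data.get(start..start + N)
    }
}

fn main() {
    // Two RGBA pixels in one row: 2 * 1 * 4 = 8 bytes.
    let bytes = [1u8, 2, 3, 4, 5, 6, 7, 8];
    let view = PackedView::<4>::from_slice(&bytes, 2, 1).expect("length matches RGBA layout");
    println!("{:?}", view.pixel(1, 0));
    // The same 8 bytes cannot be a 2x1 RGB image, which would need 6 bytes.
    println!("{}", PackedView::<3>::from_slice(&bytes, 2, 1).is_none());
}
```

A length check of this kind is why the channel count should match the pixel format of the decoded buffer before the bytes are wrapped.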
Fastest paths using SIMD
Although all implementations are fast, not every path is implemented using SIMD, so some paths are slower.
`~` - Partially implemented
Pixel format | NEON | SSE | AVX2 | AVX-512 | WASM |
---|---|---|---|---|---|
RGBA (8 bit) | x | x | x | x(avxvnni) | ~ |
RGB (8 bit) | x | x | x | x(avxvnni) | ~ |
Plane (8 bit) | x | x | ~ | ~ | ~ |
RGBA (8+ bit) | x | x | ~ | x(avxvnni) | - |
RGB (8+ bit) | x | x | ~ | ~ | - |
Plane (8+ bit) | ~ | ~ | ~ | ~ | - |
RGBA (f32) | x | x | x | - | - |
RGB (f32) | x | x | ~ | - | - |
Plane (f32) | x | x | ~ | - | - |
RGBA (f16) | x | x | x | - | - |
RGB (f16) | x | ~ | ~ | - | - |
Plane (f16) | ~ | ~ | ~ | - | - |
AR30/RA30 | x | - | - | - | - |
Features
- To enable `f16` support, activate the `half` feature.
- `nightly_avx512` activates the AVX-512 feature set and requires the `nightly` compiler channel.
Target features with runtime dispatch
Runtime dispatch is used on x86 and on aarch64 NEON.
`neon`: optional target features are available; enable them when compiling on a supported platform to get the full feature set.
`avx2`, `fma`, `sse4.1`, `f16c`: detected automatically if available; no additional action is needed, and the best path is called.
`avx512`: requires the `nightly_avx512` feature and the `nightly` compiler channel; it is detected at runtime and used if available.
`avxvnni`: requires the `nightly_avx512` feature and the `nightly` compiler channel; it is detected at runtime and used if available.
AVX-VNNI is a helpful extension on modern Intel and AMD CPUs; consider turning it on to get maximum performance.
`fullfp16`: NEON target detection is performed at runtime; when the feature is available, the best paths for f16 images are used on ARM.
WASM: activating the `simd128` target feature in the build flags is mandatory.
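The runtime dispatch described above can be sketched with the standard library's feature-detection macros. This is only an illustration of the general technique, not the crate's internal dispatcher; `select_path` and the returned labels are hypothetical.

```rust
/// Picks a label for the widest code path the running CPU supports.
/// Hypothetical helper for illustration; the real dispatcher is internal to the crate.
fn select_path() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // These checks happen at runtime, so one binary can serve different CPUs.
        if std::arch::is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if std::arch::is_x86_feature_detected!("sse4.1") {
            return "sse4.1";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    // Portable fallback when no SIMD extension is detected.
    "scalar"
}

fn main() {
    println!("selected path: {}", select_path());
}
```

A real dispatcher would typically resolve to a function pointer once and reuse it for every row, so that detection is not repeated in the hot loop.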
About f16
To enable full `f16` support, the `half` feature should be used, and `f16c` enabled when targeting x86 platforms.
For NEON, `f16` support uses runtime detection: if the CPU supports the feature, the very fast path is available.
Even when the `half` feature is activated, `f16` will be slow if the platform does not support it or the required features are not enabled.
Performance
NEON tests were run on an Apple M3 Pro. AVX2 tests were run on a Digital Ocean Premium Intel CPU-Optimized instance with 4 vCPUs.
Example comparison with fast-image-resize (`fir` in the tables below):
time to downscale an RGB 4928x3279 image by a factor of 4.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 10.47 | 6.97 |
fir | 15.62 | 21.74 |
Example comparison: time to downscale an RGBA 4928x3279 image by a factor of 4 with pre-multiplying alpha.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 43.72 | 13.56 |
fir | 62.31 | 33.32 |
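For readers unfamiliar with why pre-multiplying alpha is an extra step worth paying for, the sketch below averages an opaque red pixel with a fully transparent green one, with and without premultiplication. It is a standalone illustration with hypothetical helper names, not the crate's implementation.

```rust
/// Multiplies the color channels of interleaved RGBA8 pixels by alpha, in place.
fn premultiply_rgba8(pixels: &mut [u8]) {
    for px in pixels.chunks_exact_mut(4) {
        let a = px[3] as u32;
        for c in &mut px[..3] {
            // Rounded division by 255.
            *c = ((*c as u32 * a + 127) / 255) as u8;
        }
    }
}

/// Undoes premultiplication; fully transparent pixels are left as they are.
fn unpremultiply_rgba8(pixels: &mut [u8]) {
    for px in pixels.chunks_exact_mut(4) {
        let a = px[3] as u32;
        if a == 0 {
            continue;
        }
        for c in &mut px[..3] {
            *c = ((*c as u32 * 255 + a / 2) / a).min(255) as u8;
        }
    }
}

/// Per-channel rounded average of two RGBA8 pixels.
fn average(a: [u8; 4], b: [u8; 4]) -> [u8; 4] {
    let mut out = [0u8; 4];
    for i in 0..4 {
        out[i] = ((a[i] as u16 + b[i] as u16 + 1) / 2) as u8;
    }
    out
}

/// Averages two pixels in premultiplied space and converts the result back.
fn average_premultiplied(mut a: [u8; 4], mut b: [u8; 4]) -> [u8; 4] {
    premultiply_rgba8(&mut a);
    premultiply_rgba8(&mut b);
    let mut out = average(a, b);
    unpremultiply_rgba8(&mut out);
    out
}

fn main() {
    let red = [255, 0, 0, 255]; // opaque red
    let hidden_green = [0, 255, 0, 0]; // fully transparent green
    // Straight averaging lets the invisible green leak into the result.
    println!("straight:      {:?}", average(red, hidden_green));
    // Premultiplied averaging keeps the color of the visible pixel.
    println!("premultiplied: {:?}", average_premultiplied(red, hidden_green));
}
```

The premultiply and unpremultiply passes touch every pixel, which is consistent with the higher timings reported for the pre-multiplied case.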
Example comparison: time to downscale an RGBA 4928x3279 image by a factor of 4 without pre-multiplying alpha.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 11.13 | 7.76 |
fir | 20.17 | 25.92 |
Apple Accelerate | - | 9.98 |
Example comparison: time to downscale an RGBA 4928x3279 10-bit image by a factor of 4 with pre-multiplying alpha.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 85.34 | 22.68 |
fir | 146.23 | 53.95 |
Example comparison: time to downscale an RGBA 4928x3279 10-bit image by a factor of 4 without pre-multiplying alpha.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 19.15 | 8.91 |
fir | 58.57 | 38.46 |
Apple Accelerate | - | 27.63 |
Example comparison: time to downscale an RGB 4000x6000 10-bit image by a factor of 4 using NEON.
Lanczos3 | AVX | NEON |
---|---|---|
pic-scale | 25.68 | 15.56 |
fir | 68.84 | 39.40 |
Example in sRGB
In general, you should not downsize an image in the sRGB colorspace; however, if speed matters more than a more accurate result, you may skip linearization.
    let mut scaler = Scaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in linear
At the moment only the sRGB transfer function is supported. This is also a well-optimized path, so it is reasonably fast.
    let mut scaler = LinearScaler::new(ResamplingFunction::Lanczos3);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in CIE L*a*b
    let mut scaler = LabScaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in CIE L*u*v
    let mut scaler = LuvScaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in CIE XYZ colorspace
    let mut scaler = XYZScaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in LCh colorspace
    let mut scaler = LChScaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Example in Oklab colorspace
    let mut scaler = OklabScaler::new(ResamplingFunction::Hermite);
    scaler.set_threading_policy(ThreadingPolicy::Single);
    let store = ImageStore::<u8, 4>::from_slice(&bytes, width, height).unwrap();
    let mut dst_store = ImageStoreMut::<u8, 4>::alloc(width / 2, height / 2);
    let resized = scaler.resize_rgba(
        &store,
        &mut dst_store,
        true
    );
Build C bindings
See `picscale/include/picscale.h` for more info.
    cd picscale && RUSTFLAGS="-C strip=symbols" cargo +nightly build -Z build-std=std,panic_abort --release
Resampling filters
Over 30 resampling filters are supported.
Bilinear
Nearest
Cubic
MitchellNetravalli
CatmullRom
Hermite
BSpline
Hann
Bicubic
Hamming
Hanning
Blackman
And others
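To give a feel for what a resampling filter is, here is a small standalone sketch of three kernels using their textbook definitions, plus a helper that builds normalized weights for one output sample. The function names and the weight layout are assumptions for illustration; the crate's own kernels and weight generation may differ.

```rust
use std::f32::consts::PI;

/// Normalized sinc, with the removable singularity at 0 handled explicitly.
fn sinc(x: f32) -> f32 {
    if x.abs() < 1e-6 {
        1.0
    } else {
        (PI * x).sin() / (PI * x)
    }
}

/// Triangle (bilinear) kernel, support 1.
fn bilinear(x: f32) -> f32 {
    let x = x.abs();
    if x < 1.0 { 1.0 - x } else { 0.0 }
}

/// Hermite kernel 2|x|^3 - 3|x|^2 + 1, support 1.
fn hermite(x: f32) -> f32 {
    let x = x.abs();
    if x < 1.0 { (2.0 * x - 3.0) * x * x + 1.0 } else { 0.0 }
}

/// Lanczos kernel with 3 lobes, support 3.
fn lanczos3(x: f32) -> f32 {
    if x.abs() < 3.0 { sinc(x) * sinc(x / 3.0) } else { 0.0 }
}

/// Weights for one output sample centered at `center` (in source pixel
/// coordinates) when downscaling by `scale`; returns the first source index
/// and the weights, normalized so that they sum to 1.
fn normalized_weights(
    kernel: fn(f32) -> f32,
    support: f32,
    center: f32,
    scale: f32,
) -> (i32, Vec<f32>) {
    // The kernel is stretched by the scale factor so that it covers
    // every source pixel contributing to the output sample.
    let radius = support * scale;
    let first = (center - radius).floor() as i32;
    let last = (center + radius).ceil() as i32;
    let mut weights: Vec<f32> = (first..=last)
        .map(|i| kernel((i as f32 + 0.5 - center) / scale))
        .collect();
    let sum: f32 = weights.iter().sum();
    for w in &mut weights {
        *w /= sum;
    }
    (first, weights)
}

fn main() {
    println!("hermite(0.5)  = {}", hermite(0.5));
    println!("bilinear(0.5) = {}", bilinear(0.5));
    let (first, weights) = normalized_weights(lanczos3, 3.0, 8.0, 4.0);
    println!("first source index {first}, {} taps", weights.len());
}
```

Wider kernels such as Lanczos3 read more source pixels per output sample than Bilinear or Hermite, which is the usual trade-off between sharpness and speed.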
This project is licensed under either of
at your option.
Dependencies: ~1.2–1.8MB, ~37K SLoC