Enhancing DeepSeek Models with MLA and FP8 Optimizations in vLLM (neuralmagic.com)
2 points by hochmartinez 53 days ago | past
Multimodal Model Quantization Support Through LLM Compressor by Neural Magic (neuralmagic.com)
1 point by BUFU 60 days ago | past
What happens if we remove 50 percent of Llama? (neuralmagic.com)
231 points by BUFU 4 months ago | past | 132 comments
We Ran Over Half a Million Evaluations on Quantized LLMs (neuralmagic.com)
12 points by eldar_ciki 6 months ago | past | 2 comments
Pushing the Boundaries of Mixed-Precision LLM Inference with Marlin (neuralmagic.com)
2 points by mwitiderrick 10 months ago | past
Fast Llama 2 on CPUs with Sparse Fine-Tuning and DeepSparse (neuralmagic.com)
238 points by mwitiderrick on Nov 23, 2023 | past | 26 comments
Build Scalable NLP and Computer Vision Pipelines with DeepSparse (neuralmagic.com)
1 point by mwitiderrick on June 8, 2023 | past
Achieving 1,000X CPU Performance Boost with Sparse Models in MLPerf (neuralmagic.com)
1 point by NM_Ricky on April 5, 2023 | past | 1 comment
SparseGPT: Remove 100B Parameters for Free (neuralmagic.com)
3 points by homarp on March 24, 2023 | past | 1 comment
SparseGPT: Remove 100B Parameters for Free (neuralmagic.com)
2 points by todsacerdoti on March 24, 2023 | past
Sparsify Image Classification Models Faster with SparseML and Deep Lake (neuralmagic.com)
1 point by mwitiderrick on March 16, 2023 | past
YOLOv8 Detection 10x Faster with DeepSparse (neuralmagic.com)
1 point by mwitiderrick on Jan 19, 2023 | past
Image Segmentation: Your Ultimate Guide to Easy Deployment and Fast Inferencing (neuralmagic.com)
2 points by mwitiderrick on Jan 5, 2023 | past | 2 comments
Search Documents Quickly with Extractive Question Answering (neuralmagic.com)
1 point by mwitiderrick on Dec 15, 2022 | past | 1 comment
Accelerate Customer Review Classification with Sparse Transformers (neuralmagic.com)
1 point by mwitiderrick on Nov 22, 2022 | past | 1 comment
Neural Network inference on commodity CPUs using sparsity (neuralmagic.com)
2 points by atylerrice on Sept 21, 2022 | past | 3 comments
Using compound sparsification for faster BERT on CPUs with better accuracy (neuralmagic.com)
4 points by szpcela on Sept 24, 2021 | past
YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance (neuralmagic.com)
121 points by T-A on Sept 10, 2021 | past | 53 comments
Show HN: YOLOv3 – Pruning and Quantizing to Improve Object Detection Performance (neuralmagic.com)
4 points by markurtz on June 23, 2021 | past
A Software Architecture for the Future of ML (neuralmagic.com)
2 points by beefman on May 29, 2021 | past
