A PyTorch tutorial explains how to visualize and understand GPU memory usage during training, including how to estimate memory requirements and reduce consumption.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 87 | 2024-12-24 10:49:39 AM
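The memory-estimation part of such a tutorial boils down to simple accounting over parameters, gradients, and optimizer states. A minimal sketch of that arithmetic, with the byte sizes (bf16 weights/gradients, two fp32 Adam moments) as assumptions and activations deliberately ignored:

```python
def estimate_training_memory_gib(n_params: float,
                                 weight_bytes: int = 2,       # bf16 weights (assumed)
                                 grad_bytes: int = 2,         # bf16 gradients (assumed)
                                 opt_states: int = 2,         # Adam: two moments per param
                                 opt_state_bytes: int = 4) -> float:
    """Rough lower bound on training memory in GiB.

    Counts weights + gradients + optimizer states only;
    activation memory depends on batch size and sequence length
    and is not included here.
    """
    weights = n_params * weight_bytes
    grads = n_params * grad_bytes
    optimizer = n_params * opt_states * opt_state_bytes
    return (weights + grads + optimizer) / 1024**3

# A 7B-parameter model under these assumptions needs ~78 GiB
# before any activations are counted.
print(round(estimate_training_memory_gib(7e9), 1))
```

Estimates like this explain at a glance why multi-GPU setups or memory-saving optimizers are needed long before activations enter the picture.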
NVIDIA's LogitsProcessorZoo gives finer control over text generation by language models. It directly modifies the probability distribution used to select the next token, going beyond standard decoding strategies such as beam search and nucleus sampling.
Source: https://huggingface.co/blog
AI | NVIDIA | Rating: 81 | 2024-12-23 09:29:16 AM
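The core idea behind any logits processor can be illustrated without the library itself: edit the raw logits (mask tokens, rescale by temperature) before converting them to probabilities. The function below is a hypothetical, dependency-free sketch, not the LogitsProcessorZoo API:

```python
import math

def apply_logits_processing(logits, banned_ids=(), temperature=1.0):
    """Illustrative logits processing (hypothetical helper):
    mask banned tokens with -inf, rescale by temperature,
    then renormalize via a numerically stable softmax."""
    processed = [
        -math.inf if i in banned_ids else logit / temperature
        for i, logit in enumerate(logits)
    ]
    m = max(processed)
    exps = [math.exp(x - m) for x in processed]
    total = sum(exps)
    return [e / total for e in exps]

# Banning token 1 zeroes its probability; the remaining
# mass is redistributed over tokens 0 and 2.
probs = apply_logits_processing([2.0, 1.0, 0.5], banned_ids={1})
```

Real processors in libraries like this follow the same pattern, composing several such transformations in sequence before the sampler draws a token.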
Artificial Analysis releases Big Bench Audio, a new evaluation dataset for assessing the reasoning capabilities of audio language models, adapting questions from Big Bench Hard into the audio domain.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 84 | 2024-12-20 02:20:42 PM
A new family of encoder-only models called ModernBERT has been introduced, offering better performance and faster processing than BERT. It is intended as a drop-in replacement for BERT-like models and will be included in the transformers library.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 86 | 2024-12-19 04:30:43 PM
IBM, Princeton, CMU, and UIUC have developed Bamba-9B, an inference-efficient Hybrid Mamba2 model. It provides a 2.5x throughput improvement and 2x latency speedup over standard transformers in vLLM. The model is available in transformers, vLLM, TRL, and llama.cpp, along with recipes for tuning, training, and extended pretraining.
Source: https://huggingface.co/blog
AI | IBM | Rating: 81 | 2024-12-18 07:30:42 PM
Technology Innovation Institute (TII) released Falcon3, a family of five open-source decoder-only large language models under 10 billion parameters. These models focus on improving performance in science, math, and code.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 87 | 2024-12-17 09:30:36 AM
A benchmark study compares language-model workloads on two Google Cloud Compute Engine Xeon-based CPU instances, N2 and C4. The C4 instance delivers 10x to 24x higher throughput in text embedding and 2.3x to 3.6x higher throughput in text generation than N2. Even accounting for C4's higher hourly price, it retains a 7x to 19x total-cost-of-ownership (TCO) advantage in text embedding and a 1.7x to 2.9x TCO advantage in text generation. This suggests that lightweight agentic AI solutions can be deployed effectively on CPUs.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 65 | 2024-12-17 06:00:42 AM
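The relationship between the throughput and TCO figures is simple: cost per unit of work scales as hourly price divided by throughput, so the TCO advantage is the throughput gain divided by the price ratio. A quick check with an assumed price ratio (the ~1.3x figure below is illustrative, not from the study):

```python
def tco_advantage(throughput_gain: float, price_ratio: float) -> float:
    """Cost per unit of work = hourly price / throughput,
    so the TCO advantage = throughput gain / price ratio."""
    return throughput_gain / price_ratio

# Assuming C4 costs ~1.3x more per hour than N2 (illustrative figure),
# the reported embedding throughput gains map onto TCO advantages
# in roughly the range the study reports:
print(round(tco_advantage(10, 1.3), 1))  # low end of text embedding
print(round(tco_advantage(24, 1.3), 1))  # high end of text embedding
```

With that assumed price ratio, the 10x-24x throughput range lands near the 7x-19x TCO range quoted above, which is a useful sanity check on the numbers.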
A new tool, the Synthetic Data Generator, creates custom datasets using Large Language Models. The no-code application turns a plain-text data description into a synthetic dataset, letting users build datasets and models quickly without writing code.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 86 | 2024-12-16 02:01:30 PM
LeMaterial, an open-source initiative, aims to simplify and accelerate materials research by providing a unified dataset and harmonized data format.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 56 | 2024-12-10 06:00:34 PM
Amazon Bedrock now supports 83 open models from Hugging Face, allowing customers to build Generative AI applications with ease.
Source: https://huggingface.co/blog
AI | HuggingFace | Rating: 87 | 2024-12-09 04:30:58 PM