
Llama3.2-1B-Instruct

Generate text responses to prompts, enabling natural language conversations and content creation

Model Properties

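The inference pipeline consists of a prefill step, which processes the whole prompt, and a token-by-token decoding step, which generates the response one token at a time.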
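The two-step pipeline can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_forward` is a stand-in for a real model call, its list-based cache only mimics the role of a key/value cache, and none of the names come from the Hailo API.

```python
from typing import List, Tuple


def toy_forward(tokens: List[int], cache: List[int]) -> Tuple[int, List[int]]:
    """Hypothetical stand-in for a model call (not a real LLM).

    Consumes `tokens`, appends them to the cache, and returns
    (next_token, updated_cache). A real model would store attention
    keys and values in the cache instead of raw token ids.
    """
    new_cache = cache + tokens
    next_token = (sum(new_cache) + len(new_cache)) % 50
    return next_token, new_cache


def generate(prompt: List[int], max_new_tokens: int, eos_token: int = 0) -> List[int]:
    """Generate up to `max_new_tokens` tokens from `prompt`."""
    if max_new_tokens <= 0:
        return []

    # Prefill step: the whole prompt is processed in a single pass,
    # which fills the cache and yields the first output token.
    next_token, cache = toy_forward(prompt, [])
    output = [next_token]

    # Token-by-token step: each pass consumes only the newest token
    # and reuses the cache built so far.
    while len(output) < max_new_tokens and output[-1] != eos_token:
        next_token, cache = toy_forward([output[-1]], cache)
        output.append(next_token)

    return output


if __name__ == "__main__":
    print(generate([1, 2, 3], max_new_tokens=4))
```

In this sketch the prefill cost grows with the prompt length, while each decoding step handles a single token. That split is why time to first token and tokens per second are usually reported as separate figures.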

License name: Llama 3.2 Community License
Number of parameters: 1B
Model Size: 1.79 GB

Technical Details

Operations: 29.4 GOPs per input token
Context Length: 2048 tokens
Numerical Scheme: A8W4 (8-bit activations, 4-bit weights), symmetric, group-wise
Inference API: C++, Python, Hailo-Ollama
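As a rough illustration of what "symmetric, group-wise" 4-bit weight quantization means, the sketch below quantizes a flat list of weights with one scale per group and no zero point. The group size, the clamping range of [-7, 7], and the function names are assumptions made for this example. The page does not state the parameters used for the compiled model.

```python
from typing import List, Tuple


def quantize_groupwise_symmetric(
    weights: List[float], group_size: int = 4, bits: int = 4
) -> Tuple[List[int], List[float]]:
    """Quantize `weights` to signed integers with one scale per group.

    Symmetric: the integer range is centred on zero and there is no
    zero point, so a weight of 0.0 always maps to 0.
    Group-wise: each run of `group_size` weights gets its own scale.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4 bits (assumed range [-7, 7])
    quantized: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # An all-zero group gets an arbitrary scale of 1.0.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            quantized.append(max(-qmax, min(qmax, q)))
    return quantized, scales


def dequantize(quantized: List[int], scales: List[float], group_size: int = 4) -> List[float]:
    """Map integers back to approximate float weights."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


if __name__ == "__main__":
    w = [0.7, -0.35, 0.1, 0.0, 7.0, -3.5, 1.0, 0.0]
    q, s = quantize_groupwise_symmetric(w, group_size=4)
    print(q, s)
    print(dequantize(q, s, group_size=4))
```

Per-group scales keep the rounding error of each weight within half of its own group's scale, so a group of small weights is not forced onto the coarse grid needed by a group of large ones.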

Performance Metrics

Load Time: 3.839 s
Time To First Token: 0.49 s
TPS (tokens per second): 8.48
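These figures can be combined into a back-of-the-envelope response-time estimate. The sketch assumes that TPS describes the steady decoding rate after the first token and that the listed time to first token applies to the prompt in question. The page does not state the prompt length used for the measurement, so real latencies may differ.

```python
# Figures copied from the metrics listed on this page.
LOAD_TIME_S = 3.839          # one-time model load
TIME_TO_FIRST_TOKEN_S = 0.49
TOKENS_PER_SECOND = 8.48


def estimate_response_time(new_tokens: int, include_load: bool = False) -> float:
    """Estimate the seconds needed to produce `new_tokens` output tokens.

    Assumption: the first token arrives after TIME_TO_FIRST_TOKEN_S and
    every further token takes 1 / TOKENS_PER_SECOND seconds.
    """
    total = LOAD_TIME_S if include_load else 0.0
    if new_tokens <= 0:
        return total
    total += TIME_TO_FIRST_TOKEN_S + (new_tokens - 1) / TOKENS_PER_SECOND
    return total


if __name__ == "__main__":
    for n in (1, 50, 200):
        print(n, round(estimate_response_time(n), 2))
```

Under these assumptions the load time matters only for the first request after start-up, and for longer replies the decoding rate dominates the total.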

Explore Related Models

Qwen2.5-Coder-1.5B-Instruct
Generate text responses to prompts, enabling natural language understanding, multilingual support, and code generation

Qwen2.5-1.5B-Instruct
Generate text responses to prompts, enabling natural language understanding, multilingual support, and content creation