Qwen2.5-1.5B-Instruct

Generate text responses to prompts, enabling natural language understanding,
multilingual support, and content creation.

Model Properties

The pipeline consists of two models: a prefill model and a TBT model.

License name: Apache License 2.0
Number of parameters: 1.5B
Model Size: 1.82 GB

Technical Details

Operations: 29.4 GOPs per input token
Context Length: 2048
Numerical Scheme: A8W4, symmetric, channel-wise
Inference API: C++, Hailo-Ollama
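
The "A8W4, symmetric, channel-wise" numerical scheme is commonly read as 8-bit activations with 4-bit weights, where each output channel gets its own scale and no zero-point. The sketch below illustrates only the weight side of that idea in plain Python. It is a generic textbook illustration, not Hailo's actual quantizer: the function names, the sample weights, and the choice of the symmetric range [-7, 7] are all assumptions made for this example.

```python
def quantize_channel_symmetric(weights, bits=4):
    """Quantize one channel's weights using a single symmetric scale.

    Hypothetical helper for illustration; not part of any Hailo API.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit; range is [-7, 7] (assumed)
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Symmetric scheme: no zero-point, so 0.0 always maps to integer 0.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_channel(q, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [v * scale for v in q]


def quantize_matrix(matrix, bits=4):
    """Channel-wise: each row (one output channel) gets its own scale."""
    return [quantize_channel_symmetric(row, bits) for row in matrix]


# Made-up weights: two channels with very different magnitudes.
weights = [
    [0.70, -0.32, 0.10, 0.00],
    [0.02, -0.012, 0.014, 0.006],
]

quantized = quantize_matrix(weights)
for row, (q, scale) in zip(weights, quantized):
    restored = dequantize_channel(q, scale)
    worst = max(abs(r - w) for r, w in zip(restored, row))
    print(f"codes={q} scale={scale:.6f} max_error={worst:.6f}")
```

Using one scale per channel keeps the small-magnitude second channel from being rounded to zero, which is what would happen if both channels shared the first channel's larger scale.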

Performance Metrics

First Load Time: 10.7077 s
Time To First Token: 0.325097 s
TPS (tokens per second): 7.99287
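
As a rough aid for reading these figures, the sketch below combines them into a back-of-the-envelope estimate of response time and prefill compute. It rests on assumptions the page does not state: that the first token arrives at the reported time to first token regardless of prompt length, that every later token arrives at a constant 1/TPS interval, and that prefill cost scales linearly at 29.4 GOPs per input token. The function names are hypothetical.

```python
# Figures copied from the tables on this page.
TIME_TO_FIRST_TOKEN_S = 0.325097
TOKENS_PER_SECOND = 7.99287
GOPS_PER_INPUT_TOKEN = 29.4
CONTEXT_LENGTH = 2048


def estimate_response_time(output_tokens):
    """Estimate seconds to stream `output_tokens` tokens.

    Assumption: the first token lands at TTFT, and each remaining
    token takes 1 / TPS seconds.
    """
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    return TIME_TO_FIRST_TOKEN_S + (output_tokens - 1) / TOKENS_PER_SECOND


def prefill_gops(prompt_tokens):
    """Estimate total operations (in GOPs) to process a prompt.

    Assumption: cost is linear in the number of input tokens.
    """
    if not 0 <= prompt_tokens <= CONTEXT_LENGTH:
        raise ValueError("prompt_tokens must fit in the context length")
    return prompt_tokens * GOPS_PER_INPUT_TOKEN


if __name__ == "__main__":
    for n in (1, 50, 200):
        print(f"{n:>4} output tokens: ~{estimate_response_time(n):.1f} s")
    print(f"512-token prompt: ~{prefill_gops(512):.0f} GOPs of prefill")
```

Treat the results as order-of-magnitude guidance only; real latency depends on prompt length, host load, and sampling settings that the reported metrics do not break out.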
Accuracy

Test    Evaluation Metric    Full Precision Accuracy    Post Quantization Accuracy
MMLU    accuracy             59                         51

Explore More Models

GenAI Models

Qwen2.5-Coder-1.5B: Generate text responses to prompts, enabling natural language understanding, multilingual support, and code generation.

DeepSeek-R1-Distill-Qwen-1.5B: Generate text responses to prompts, enabling natural language understanding, multilingual support, content creation, and advanced reasoning.