Qwen2.5-Coder-1.5B

Generate text responses to prompts, enabling natural language understanding, multilingual support, and code generation

Model Properties

The pipeline consists of a prefill model and a TBT model, optimized for coding tasks.

License name: Apache License 2.0
Number of parameters: 1.5B
Model Size: 1.64 GB

Technical Details

Operations: 29.4 GOPs per input token
Context Length: 2048
Numerical Scheme: A8W4, symmetric, channel-wise
Inference API: C++, Hailo-Ollama
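
The numerical scheme above is commonly read as 8-bit activations with 4-bit weights, where each weight channel gets its own symmetric scale (no zero point). The following is a minimal sketch of that general idea in plain Python, not Hailo's actual quantizer: the function names, the choice of one row per output channel, and the clipping range of [-7, 7] are all assumptions made for illustration.

```python
def quantize_symmetric_per_channel(weights, num_bits=4):
    """Quantize each row (assumed to be one output channel) with its own scale.

    Symmetric means the integer grid is centred on zero, so only a scale
    (and no zero point) is stored per channel.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 7 for 4-bit weights (assumed range [-7, 7])
    quantized, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        # Guard against an all-zero channel to avoid division by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        row = [max(-qmax, min(qmax, round(w / scale))) for w in channel]
        quantized.append(row)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Map the integer codes back to approximate floating-point weights."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Hypothetical 2-channel weight matrix used only for demonstration.
example_weights = [[0.7, -0.2, 0.0], [7.0, 1.0, -3.0]]
codes, channel_scales = quantize_symmetric_per_channel(example_weights)
restored = dequantize(codes, channel_scales)
```

Because each channel has its own scale, a channel with small weights keeps the full 4-bit resolution even when another channel has much larger values.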

Performance Metrics

First Load Time: 8.7 s
Time To First Token: 0.31 s
Tokens Per Second (TPS): 8.19
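
These figures can be combined into a rough end-to-end latency estimate. The sketch below assumes the first token arrives at the time-to-first-token value and every later token is generated at the steady rate listed above; real latency will vary with prompt length and system load, and the function name and parameters are illustrative only.

```python
FIRST_LOAD_S = 8.7    # one-time model load, from the metrics above
TTFT_S = 0.31         # time to first token, in seconds
TPS = 8.19            # steady-state tokens per second


def estimate_response_time(num_output_tokens, include_load=False):
    """Estimate the seconds needed to generate num_output_tokens tokens.

    Assumes token 1 arrives after TTFT_S and each remaining token
    takes 1 / TPS seconds.
    """
    if num_output_tokens < 1:
        raise ValueError("num_output_tokens must be at least 1")
    total = TTFT_S + (num_output_tokens - 1) / TPS
    if include_load:
        total += FIRST_LOAD_S
    return total


# Roughly 12.4 s for a 100-token answer once the model is loaded.
warm_100 = estimate_response_time(100)
cold_100 = estimate_response_time(100, include_load=True)
```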
Accuracy

Test: MMLU
Evaluation Metric: accuracy
Full Precision Accuracy: 48
Post Quantization Accuracy: 43
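
To put the quantization cost in perspective, the short sketch below computes the absolute and relative change between the two MMLU scores. It assumes both values are percentages on the same benchmark split; the function name is illustrative.

```python
FULL_PRECISION_MMLU = 48.0   # assumed to be a percentage
POST_QUANT_MMLU = 43.0       # assumed to be a percentage


def accuracy_drop(full_precision, post_quantization):
    """Return (absolute drop in points, relative drop in percent)."""
    absolute = full_precision - post_quantization
    relative = absolute / full_precision * 100.0
    return absolute, relative


abs_drop, rel_drop = accuracy_drop(FULL_PRECISION_MMLU, POST_QUANT_MMLU)
```

With the values above this gives a 5-point absolute drop, which is a little over 10% of the full-precision score.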

Explore More Models

DeepSeek-R1-Distill-Qwen-1.5B
Generate text responses to prompts, enabling natural language understanding, multilingual support, content creation, and advanced reasoning

Qwen2 1.5B Instruct
Generate text responses to prompts, enabling natural language conversations and content creation