Qwen2-1.5B-Instruct

Generate text responses to prompts, enabling natural language conversations
and content creation.

Model Properties

The pipeline consists of two models: a prefill model and a TBT model.

License name: Apache License 2.0
Number of parameters: 1.5B
Model Size: 1.56 GB

Technical Details

Operations: 29.4 GOPs per input token
Context Length: 2048
Numerical Scheme: A8W4, symmetric, channel-wise
Inference API: C++, Hailo-Ollama
Compiled Model:
Pre Compiled Model:
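
The numerical scheme listed above (A8W4, symmetric, channel-wise) is commonly read as 8-bit activations and 4-bit weights, with a zero point fixed at 0 and one scale factor per output channel. The following is a minimal sketch of what symmetric, channel-wise 4-bit weight quantization generally looks like. It illustrates the term only and is not the compiler's actual implementation; the function names and the choice to clamp to [-7, 7] are assumptions made for this example.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize_symmetric_per_channel(
    weights: Matrix, bits: int = 4
) -> Tuple[List[List[int]], List[float]]:
    """Quantize each row (one output channel) with its own scale.

    Hypothetical helper for illustration: symmetric means the zero
    point is 0, so only a scale is stored per channel.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed values
    quantized: List[List[int]] = []
    scales: List[float] = []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # An all-zero channel gets a dummy scale to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q_row = [max(-qmax, min(qmax, round(w / scale))) for w in row]
        quantized.append(q_row)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized: List[List[int]], scales: List[float]) -> Matrix:
    """Recover approximate float weights from integers and per-channel scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


if __name__ == "__main__":
    example = [[7.0, -3.0, 1.0], [0.5, -0.2, 0.1]]
    q, s = quantize_symmetric_per_channel(example)
    print("integers:", q)
    print("scales:  ", s)
    print("restored:", dequantize(q, s))
```

Because each channel has its own scale, a channel with small weights keeps more relative precision than it would under a single tensor-wide scale. In this sketch the per-weight rounding error is bounded by half of that channel's scale.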

Performance Metrics

First Load Time: 8.34639 s
Time To First Token: 0.322963 s
Tokens Per Second (TPS): 8.12567
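
As a rough illustration of how these figures combine, the sketch below estimates the wall-clock time for a response of a given length. It assumes that the time to first token covers prompt processing plus the first output token, and that TPS is the steady-state rate for every token after that. The page does not state the prompt length or measurement conditions behind these numbers, so the result is an approximation, and the function name is hypothetical.

```python
# Figures copied from the Performance Metrics above.
FIRST_LOAD_S = 8.34639   # one-time model load, in seconds
TTFT_S = 0.322963        # time to first token, in seconds
TPS = 8.12567            # tokens per second after the first token


def estimate_response_time(output_tokens: int, include_load: bool = False) -> float:
    """Estimate seconds needed to generate `output_tokens` tokens.

    Assumption: the first token costs TTFT_S and each later token
    costs 1 / TPS seconds.
    """
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    total = TTFT_S + (output_tokens - 1) / TPS
    if include_load:
        total += FIRST_LOAD_S
    return total


if __name__ == "__main__":
    print(f"per-token time: {1000 / TPS:.0f} ms")
    print(f"100 tokens, model already loaded: {estimate_response_time(100):.1f} s")
    print(f"100 tokens, including first load: {estimate_response_time(100, True):.1f} s")
```

Under these assumptions, a 100-token reply takes about 12.5 s once the model is loaded, and the one-time load adds roughly 8.3 s to the first request.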