Qwen3-1.7B-Instruct

Generates text responses to prompts, enabling natural-language conversations and content creation.

Model Properties

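The inference pipeline consists of a prefill step, which processes the whole prompt in one pass, and a token-by-token decoding step, which generates the response one token at a time.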
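The two-step pipeline can be pictured with a minimal Python sketch. Everything in it is hypothetical and for illustration only: `prefill`, `decode_step`, `generate`, and the toy next-token rule are not part of the Hailo API, and a real model would carry a key/value cache as its state instead of a plain token list.

```python
from typing import List, Tuple

# Toy vocabulary size (assumption for illustration only).
VOCAB_SIZE = 10


def prefill(prompt_tokens: List[int]) -> Tuple[List[int], int]:
    """Process the whole prompt at once; return the state and the first new token."""
    state = list(prompt_tokens)
    # Toy rule standing in for the model: next token = sum of tokens so far, mod vocab.
    first_token = sum(state) % VOCAB_SIZE
    return state, first_token


def decode_step(state: List[int], last_token: int) -> Tuple[List[int], int]:
    """Consume one previously generated token and produce the next one."""
    state.append(last_token)
    next_token = sum(state) % VOCAB_SIZE
    return state, next_token


def generate(prompt_tokens: List[int], max_new_tokens: int) -> List[int]:
    """Run prefill once, then loop the token-by-token step."""
    if max_new_tokens <= 0:
        return []
    state, token = prefill(prompt_tokens)
    output = [token]
    for _ in range(max_new_tokens - 1):
        state, token = decode_step(state, token)
        output.append(token)
    return output
```

In this picture the prefill cost is paid once per prompt, while the decode step runs once per generated token, which is why time to first token and tokens per second are reported separately.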

License name: Apache License 2.0
Number of parameters: 1.7B
Model Size: 1.79 GB

Technical Details

Context Length: 2048 tokens
Numerical Scheme: A8W4, symmetric, group-wise
Inference API: C++, Python, Hailo-Ollama
Compiled Model:
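As a rough illustration of the weight half of the A8W4 scheme listed above (4-bit weights, symmetric, group-wise), the sketch below quantizes a flat list of weights in fixed-size groups, each with its own scale and no zero point. The function names, the group size, and the rounding rule are assumptions made for this example; the page does not state how the compiled model actually chooses them, and activation (A8) quantization is not shown.

```python
from typing import List, Tuple


def quantize_groupwise_symmetric(
    weights: List[float], group_size: int, bits: int = 4
) -> Tuple[List[int], List[float]]:
    """Quantize weights in groups; each group shares one scale (symmetric, no zero point)."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed codes
    codes: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to qmax; avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        scales.append(scale)
        for w in group:
            q = round(w / scale)
            codes.append(max(-qmax, min(qmax, q)))  # clamp to the symmetric range
    return codes, scales


def dequantize(codes: List[int], scales: List[float], group_size: int) -> List[float]:
    """Recover approximate weights from integer codes and per-group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(codes)]


weights = [0.7, -0.3, 0.1, 0.0, 14.0, -6.0, 2.0, 0.0]
codes, scales = quantize_groupwise_symmetric(weights, group_size=4)
restored = dequantize(codes, scales, group_size=4)
```

Because each group has its own scale, the second group (magnitudes up to 14.0) does not force a coarse step size onto the first group (magnitudes up to 0.7), which is the usual motivation for group-wise over per-tensor scaling.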

Performance Metrics

First Load Time: 6.75 s
Time to First Token: 0.62 s
Tokens per Second (TPS): 4.78
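The figures above can be combined into a back-of-the-envelope estimate of how long a response takes. The sketch assumes that the first token arrives at the time to first token and that every later token takes 1/TPS seconds; the page does not say how the metrics were measured (prompt length, device, or whether TPS includes the first token), so treat the result as an approximation. The function name is hypothetical.

```python
FIRST_LOAD_S = 6.75   # one-time model load, from the metrics above
TTFT_S = 0.62         # time to first token, from the metrics above
TPS = 4.78            # tokens per second, from the metrics above


def estimate_response_time(new_tokens: int, include_load: bool = False) -> float:
    """Estimate seconds to generate `new_tokens` tokens under the stated assumptions."""
    total = FIRST_LOAD_S if include_load else 0.0
    if new_tokens <= 0:
        return total
    # First token at TTFT, each remaining token at 1 / TPS seconds.
    total += TTFT_S + (new_tokens - 1) / TPS
    return total


warm_100 = estimate_response_time(100)                     # model already loaded
cold_100 = estimate_response_time(100, include_load=True)  # includes first load
```

Under these assumptions a 100-token reply takes a little over 21 seconds on a loaded model, with the one-time load adding 6.75 seconds to the first request.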