Qwen3-1.7B-Instruct

Generates text responses to prompts, enabling natural-language conversation and content creation.

Model Properties

The pipeline consists of a prefill step, which processes the prompt, and a token-by-token generation step.
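The two-step structure can be illustrated with a toy loop. This is a minimal sketch, not the Hailo runtime: `prefill`, `decode_step`, and `generate` are hypothetical names, the "cache" is just a list of token ids, and the next-token rule is an arbitrary placeholder standing in for the model.

```python
from typing import List

Cache = List[int]  # stand-in for a key/value cache: the token ids seen so far


def prefill(prompt: List[int]) -> Cache:
    """Process the whole prompt in a single pass and return the populated cache."""
    return list(prompt)


def decode_step(cache: Cache) -> int:
    """Produce one new token from the cache and append it to the cache.

    Placeholder rule (sum of cached ids modulo 100); a real model would run
    one forward pass here.
    """
    token = sum(cache) % 100
    cache.append(token)
    return token


def generate(prompt: List[int], max_new_tokens: int, context_length: int = 2048) -> List[int]:
    """Run prefill once, then decode token by token until a limit is reached."""
    cache = prefill(prompt)
    output: List[int] = []
    # Stop when enough tokens were produced or the context window is full.
    while len(output) < max_new_tokens and len(cache) < context_length:
        output.append(decode_step(cache))
    return output
```

The prompt is handled once, while every generated token costs one further step, which is why prompt latency and generation speed are typically reported as separate figures.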

License name: Apache License 2.0
Number of parameters: 1.7B
Model Size: 1.79 GB

Device: Hailo-10H

Context Length: 2048

Numerical Scheme: A8W4, symmetric, group-wise
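A8W4 is usually read as 8-bit activations with 4-bit weights. The weight side of a symmetric, group-wise scheme can be sketched as follows. This is an illustration only: the group size of 32, the clamping range of [-7, 7], and all function names are assumptions, and the page does not state how the compiled model actually quantizes its weights.

```python
from typing import List, Tuple

QuantGroup = Tuple[List[int], float]  # (integer codes, scale) for one group


def quantize_group(weights: List[float], bits: int = 4) -> QuantGroup:
    """Symmetric quantization of one group: no zero point, one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def quantize_groupwise(weights: List[float], group_size: int = 32, bits: int = 4) -> List[QuantGroup]:
    """Split the weights into consecutive groups and quantize each one separately."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def dequantize_groupwise(groups: List[QuantGroup]) -> List[float]:
    """Reconstruct approximate weights from codes and per-group scales."""
    return [code * scale for codes, scale in groups for code in codes]
```

Giving each group its own scale keeps a single large weight from degrading the resolution of the whole tensor; the cost is storing one scale per group.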

Inference API: C++, Python, Hailo-Ollama


First Load Time: 6.75 s

Time To First Token: 0.62 s

Tokens Per Second (TPS): 4.78
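The figures above can be combined into a rough response-time estimate. The sketch below assumes that TPS describes steady-state generation after the first token and that the first load time is paid once per session; the page does not specify either, and `estimate_response_time` is a hypothetical helper.

```python
FIRST_LOAD_S = 6.75   # one-time model load, from the listing above
TTFT_S = 0.62         # time to first token
TPS = 4.78            # tokens per second during generation


def estimate_response_time(new_tokens: int, include_load: bool = False) -> float:
    """Estimate the seconds needed to produce `new_tokens` tokens.

    The first token arrives after TTFT_S; each remaining token is assumed
    to take 1 / TPS seconds.
    """
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    total = TTFT_S + (new_tokens - 1) / TPS
    if include_load:
        total += FIRST_LOAD_S
    return total
```

Under these assumptions a 100-token reply takes about 21 seconds, so generation speed rather than prompt processing dominates the wait for longer answers.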