Qwen2-VL-2B-Instruct

Name: Qwen2-VL-2B-Instruct
Brand: Hailo AI

Generate multimodal responses by interpreting both text and images,
enabling vision-language understanding and content creation.

Model Properties

The pipeline encodes each input image with a vision encoder and passes
the resulting vision tokens, together with the text prompt, to a
language model that generates the output text.

License name: Apache License 2.0
Number of parameters: 2B
Model Size: 2.18 GB

Hailo-10H

Image Input Size: [336, 336, 3]

Numerical Scheme: A8W4, symmetric, channel-wise
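
As a rough illustration of what "A8W4, symmetric, channel-wise" implies for weights, the sketch below quantizes each output channel (one row of a weight matrix) to signed 4-bit codes using a single scale per channel and no zero point. A8W4 is read here as 8-bit activations and 4-bit weights. This is a generic textbook scheme written in plain Python, not Hailo's actual quantizer; the function names and the choice to clip to [-7, 7] are assumptions made for the example.

```python
def quantize_channel_symmetric(weights, bits=4):
    """Quantize one channel's weights with a single symmetric scale.

    Generic illustration only; the clipping range [-qmax, qmax] is an
    assumption, not a documented detail of the Hailo toolchain.
    """
    qmax = 2 ** (bits - 1) - 1              # 7 for signed 4-bit codes
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Symmetric: zero maps to code 0, so no zero point is stored.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def quantize_channelwise(matrix, bits=4):
    """Apply the per-channel quantizer to every row of a weight matrix."""
    return [quantize_channel_symmetric(row, bits) for row in matrix]


if __name__ == "__main__":
    demo = [[1.75, -0.5, 0.0, 0.25], [-14.0, 6.0, 2.0]]
    for codes, scale in quantize_channelwise(demo):
        print(codes, scale, dequantize(codes, scale))
```

Using one scale per channel rather than one per tensor keeps a channel with small weights from being crushed by another channel's large range, which matters more at 4 bits than at 8.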

Vision Tokens Per Frame: 144
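
The figure of 144 vision tokens is consistent with the 336 x 336 input size if each token covers a 28 x 28 pixel cell. The sketch below assumes the upstream Qwen2-VL layout (14-pixel patches, with each 2 x 2 group of patches merged into one token) also applies to this compiled build; the function name is hypothetical.

```python
def vision_tokens(height, width, patch=14, merge=2):
    """Count vision tokens for one frame.

    Assumes 14-pixel patches merged 2x2 into one token, as described
    for upstream Qwen2-VL; not confirmed for the Hailo build.
    """
    cell = patch * merge                    # 28 pixels per token side
    if height % cell or width % cell:
        raise ValueError("image sides must be multiples of %d" % cell)
    return (height // cell) * (width // cell)


if __name__ == "__main__":
    print(vision_tokens(336, 336))          # 12 x 12 grid of tokens
```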

Context Length: 2048
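
To get a feel for how images eat into the context window, the sketch below subtracts the vision tokens from the context length. It assumes that vision tokens, the text prompt, and the generated tokens all share the same 2048-token window, and it ignores any special or chat-template tokens; both points are assumptions, and the helper name is hypothetical.

```python
CONTEXT_LENGTH = 2048
VISION_TOKENS_PER_FRAME = 144


def remaining_text_tokens(num_frames,
                          context_length=CONTEXT_LENGTH,
                          tokens_per_frame=VISION_TOKENS_PER_FRAME):
    """Tokens left for prompt text plus generated output.

    Assumes vision tokens count against the same context window.
    """
    used = num_frames * tokens_per_frame
    if used > context_length:
        raise ValueError("frames alone exceed the context length")
    return context_length - used


if __name__ == "__main__":
    for frames in (0, 1, 4):
        print(frames, remaining_text_tokens(frames))
```

Under these assumptions a single image leaves 1904 tokens for text, and at most 14 frames fit in the window at all.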

First Load Time: 17.1006 s

Text Time To First Token: 0.322963 s

Image Time To First Token: 0.93134 s

TPS (tokens per second): 8.15536
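
The timing figures can be combined into a rough response-time estimate. The sketch below assumes that TPS is the steady decode rate after the first token and that the time to first token already covers prompt and image processing; neither is stated on this page, and the one-time first load is left out. The function name is hypothetical.

```python
TEXT_TTFT_S = 0.322963
IMAGE_TTFT_S = 0.93134
TPS = 8.15536


def estimate_response_seconds(new_tokens, ttft_s, tps=TPS):
    """Estimate wall-clock time to generate `new_tokens` tokens.

    Assumes the first token arrives after `ttft_s` and every later
    token arrives at the steady rate `tps`.
    """
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    return ttft_s + (new_tokens - 1) / tps


if __name__ == "__main__":
    print(round(estimate_response_seconds(100, TEXT_TTFT_S), 2))
    print(round(estimate_response_seconds(100, IMAGE_TTFT_S), 2))
```

Under these assumptions a 100-token answer to an image prompt takes about 13 seconds, so generation time is dominated by the decode rate rather than by the time to first token.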

