Microsoft Phi 4 Multimodal

Phi-4 Multimodal Instruct is a 5.6B-parameter foundation model for multimodal reasoning and instruction following across text and visual inputs, optimized for low-latency inference on edge and mobile devices. It supports multiple languages and targets developers and enterprises building AI applications for scientific, mathematical, and document-analysis work.

Provider: Microsoft
Proprietary
No API
Context: 131.1K
Multimodal

LLM Specifications

Context Length: 131.1K

Pricing

Input Cost: $0.05 / 1M tokens
Output Cost: $0.10 / 1M tokens
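
The per-token rates listed above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost_usd` and the sample token counts are assumptions made for this example, not part of any Microsoft or provider API, and it assumes billing is strictly linear in token count.

```python
# Rates copied from the listing above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.05
OUTPUT_USD_PER_MILLION = 0.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper: assumes cost scales linearly with token count.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 20,000 input tokens and 1,000 output tokens.
    print(f"${estimate_cost_usd(20_000, 1_000):.6f}")
```

At these rates, output tokens cost twice as much as input tokens, so long generations dominate the bill faster than long prompts do.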

Supported Formats

Text, Image