A subset of models is designated as premium: they offer better performance at a higher per-token cost. The premium models are GPT-4, Claude 2.1, Claude 3 Opus & Sonnet, and Mistral Large. All paid plans include access to premium models, at varying capacities. See our plans page for more details.
Understanding the context size offered for each AI model in the Vello suite is crucial for getting the best results from your tasks. The current context sizes, in tokens, for each model are:
- GPT-4 & Variants: 3,000 tokens (Pro Plan: 4,000 tokens)
- Claude 3 Series:
  - Claude 3 Opus: 4,000 tokens (Pro Plan: 6,000 tokens)
  - Claude 3 Sonnet: 100,000 tokens (Pro Plan: 150,000 tokens)
  - Claude 3 Haiku: 200,000 tokens
- GPT-3.5 Turbo & Variants: up to 16,384 tokens
- Gemini Pro: 32,000 tokens
- Claude 2: 100,000 tokens
- Mistral & Davinci Codes: varies, up to 8,000 tokens
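The per-plan limits above can be thought of as a simple lookup table. A minimal sketch, assuming hypothetical model keys and plan names (the real identifiers used by the Vello API may differ):

```python
# Illustrative lookup table for the context limits listed above.
# Model keys and plan names here are assumptions for the example only.
CONTEXT_LIMITS = {
    "gpt-4": {"default": 3_000, "pro": 4_000},
    "claude-3-opus": {"default": 4_000, "pro": 6_000},
    "claude-3-sonnet": {"default": 100_000, "pro": 150_000},
    "claude-3-haiku": {"default": 200_000},
    "gpt-3.5-turbo": {"default": 16_384},
    "gemini-pro": {"default": 32_000},
    "claude-2": {"default": 100_000},
}

def context_limit(model: str, plan: str = "default") -> int:
    """Return the token limit for a model, falling back to the default tier
    when the plan has no model-specific override."""
    limits = CONTEXT_LIMITS[model]
    return limits.get(plan, limits["default"])
```

For example, `context_limit("claude-3-opus", "pro")` returns 6,000, while `context_limit("claude-3-haiku", "pro")` falls back to the default 200,000 because Haiku has no Pro-specific limit.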
These token limits currently apply across all premium plans. We plan to increase the context sizes for all models over time as we better understand our costs and as API costs decrease. If you need the full context size of any model, the Flex Plan offers pay-as-you-go pricing and full contexts for all models at cost.
Flex is an option for users who need more volume or larger context sizes than the Plus or Pro plans provide. It offers pay-as-you-go pricing and full context sizes for all models at cost. Tokens are pieces of words; 100 tokens correspond to approximately 75 words.
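The 100-tokens-to-75-words ratio gives a quick back-of-the-envelope way to estimate how many tokens a piece of text will consume. A rough sketch (actual tokenizers vary by model, so treat this only as an approximation):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ratio above:
    100 tokens ~ 75 words, so tokens ~ words / 0.75."""
    words = len(text.split())
    return round(words / 0.75)
```

So a 75-word prompt estimates to about 100 tokens, which you can compare against a model's context size before sending a request.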