Replies: 1 comment
The max_tokens parameter should work with AsyncOpenAI. Can you check a few things? First, verify that it is actually limiting the response: request a low cap and inspect the token usage reported back. If max_tokens is working, usage.completion_tokens should come in at or below the cap, and finish_reason should be "length" whenever the reply was cut short.
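A minimal sketch of that check (the model name, prompt, and cap here are assumptions, not from the thread; the `openai` import is deferred so defining the helper does not require the package or an API key):

```python
import asyncio

async def ask_with_cap(prompt: str, cap: int = 50):
    # AsyncOpenAI is the async client in the official `openai` package (v1+);
    # imported here so the sketch parses without the package installed.
    from openai import AsyncOpenAI

    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; substitute your own
        messages=[{"role": "user", "content": prompt}],
        max_tokens=cap,  # hard upper bound on the completion length
    )
    # If the cap is enforced, completion_tokens stays at or under it.
    return resp.usage.completion_tokens, resp.choices[0].finish_reason

def hit_token_cap(finish_reason: str) -> bool:
    # "length" is the finish_reason reported when max_tokens truncated the reply
    return finish_reason == "length"
```

Running `asyncio.run(ask_with_cap("Explain asyncio in detail."))` should return a token count at or below 50, with a finish_reason of "length" if the answer was truncated.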
A couple of things might be going on here. What model are you using, and what token count are you seeing in the response?
I would like to bound the maximum number of tokens in the response, to help with formatting, when working with AsyncOpenAI, but I could not find anything about it. I tried passing max_tokens to the request, but the argument is seemingly ignored.
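For reference, the kind of call being attempted looks roughly like this (the original snippet was not shown; the model name and prompt are placeholders, and the `openai` import is deferred so the sketch parses without the package installed):

```python
import asyncio

async def main() -> None:
    # AsyncOpenAI is the async client in the official `openai` package (v1+)
    from openai import AsyncOpenAI

    client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model; original snippet not shown
        messages=[{"role": "user", "content": "Summarize asyncio in one paragraph."}],
        max_tokens=50,  # the cap that appears to be ignored
    )
    print(resp.choices[0].message.content)

# run with: asyncio.run(main())
```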