A "Pending Queue" Pattern on top of Vercel AI SDK for Non-Parallelizable Tools

A "Pending Queue" Pattern on top of Vercel AI SDK for Non-Parallelizable Tools

Nov 25, 2025

Yulei Sheng

We recently adopted Gemini 3 Pro in our system. While it's great in general and aggressively uses parallel tool calls (which increases speed and reduces token consumption), it introduces a challenge. Although the majority of our tools can be called in parallel, there are a few that absolutely cannot.

There are two typical solutions:

1. Prompt Engineering: Instructing the model to call these tools on their own works, but not 100% of the time, due to the probabilistic nature of LLMs.

2. Disable Parallel Tool Calls Altogether: We want to avoid this because we prefer the speed and cost benefits of parallelization.

So, we built a "Pending Queue" pattern on top of the Vercel AI SDK.

System Requirements

We have several requirements for the system:

1. Fully leverage the power of LLM parallel tool calls.

2. Detect bad parallel tool calls before execution to prevent side effects.

3. Help the model self-recover.

The "Pending Queue" Pattern

Instead of executing tools immediately, we decouple execution from the tool call. Here is the architecture, followed by a concrete trace of how a step plays out:

1. Flag: Mark tools that cannot be called in parallel as 'nonParallelizable'.

2. Intercept: Before sending tools to 'streamText', we wrap 'nonParallelizable' tools. When called, the wrapper:

a. Pushes the real execution closure into a Pending Queue.

b. Returns a placeholder result immediately.

3. Validate: Once the AI SDK has executed all tool calls for the step, we inspect the batch.

4. Run or Reject:

- If the batch is invalid (a 'nonParallelizable' tool was called alongside others): Reject the execution of the 'nonParallelizable' tools. Replace their placeholder results with a clear error message asking the agent to call each tool individually. NOTE: The parallelizable tools in the batch have already executed successfully by this point, so we keep their results as-is; rejecting the batch is a no-op for them.

- If the batch is valid (the 'nonParallelizable' tool was called alone): Fetch the original execution closure from the pending queue. Execute it. Replace the placeholder result with the actual result.

5. Send Back to LLM: Send the final tool call results (including any error messages or delayed execution results) back to the LLM.
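
Concretely, a step plays out like the trace below. The tool names are hypothetical; assume 'deploy_production' is flagged 'nonParallelizable':

// Step 1 (invalid batch): the model calls two tools in parallel
//   search_docs(...)        -> runs normally; its real result is kept
//   deploy_production(...)  -> queued, then rejected; its placeholder is
//                              replaced with a "must be called alone" error
//
// Step 2 (valid batch): the model retries with a single call
//   deploy_production(...)  -> executed from the pending queue; the placeholder
//                              is replaced with the actual result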

Implementation

Here is a simplified generic implementation using TypeScript.

1. The Tool Wrapper

First, we wrap our tools. If a tool is marked 'nonParallelizable', we don't run it; we queue it.

// Store the real execution logic, keyed by the SDK's toolCallId
const pendingQueue = new Map<string, () => Promise<unknown>>();

type ToolExecute<Args = unknown, Result = unknown> = (args: Args, options: { toolCallId: string }) => Promise<Result>;

type ToolDefinition<Args = unknown, Result = unknown> = {
  name: string;
  execute: ToolExecute<Args, Result>;
  nonParallelizable?: boolean;
};

function wrapTool<Args, Result>(tool: ToolDefinition<Args, Result>): ToolDefinition<Args, Result> {
  if (!tool.nonParallelizable) return tool;

  const originalExecute = tool.execute;

  return {
    ...tool,
    async execute(args, options) {
      const { toolCallId } = options;

      // 1. Enqueue the real execution
      pendingQueue.set(toolCallId, () => originalExecute(args, options));

      // 2. Return a placeholder so the SDK thinks it's "done"
      return { status: 'pending' } as Result;
    },
  };
}
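
Before handing the registry to the SDK, every tool goes through wrapTool. Here is a minimal usage sketch, assuming a plain record of ToolDefinition objects; the tool names and bodies are hypothetical:

// Hypothetical registry; names and logic are illustrative only
const tools: Record<string, ToolDefinition> = {
  search_docs: {
    name: 'search_docs',
    execute: async () => ({ hits: [] }),
  },
  deploy_production: {
    name: 'deploy_production',
    nonParallelizable: true,
    execute: async () => ({ deployed: true }),
  },
};

// Parallel-safe tools pass through unchanged; flagged tools get the queueing wrapper
const wrappedTools = Object.fromEntries(
  Object.entries(tools).map(([name, tool]) => [name, wrapTool(tool)]),
);
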
2. The Execution Loop

In your main agent loop (where you handle the model's response), you validate the entire batch before finalizing results.

type ToolCall = {
  toolCallId: string;
  toolName: string;
  result?: unknown;
};

type ToolRegistryEntry = {
  nonParallelizable?: boolean;
};

type ToolRegistry = Record<string, ToolRegistryEntry>;

async function handleModelResponse(response: { toolCalls: ToolCall[] }, tools: ToolRegistry) {
  const { toolCalls } = response;

  // --- Validation Phase ---
  const nonParallelCalls = toolCalls.filter((call) => tools[call.toolName]?.nonParallelizable);

  // Check: Is a non-parallel tool mixed with others?
  if (toolCalls.length > 1 && nonParallelCalls.length > 0) {
    // Clear the queue - we are not executing these
    nonParallelCalls.forEach((call) => pendingQueue.delete(call.toolCallId));

    // Return a "recoverable error" only for the offending calls.
    // The parallelizable tools already ran, so their results pass through untouched.
    return toolCalls.map((call) => {
      if (!tools[call.toolName]?.nonParallelizable) {
        return { toolCallId: call.toolCallId, result: call.result };
      }
      return {
        toolCallId: call.toolCallId,
        result: {
          success: false,
          error: `Tool [${call.toolName}] must be called alone. Please call it in its own step.`,
        },
      };
    });
  }

  // --- Execution Phase ---
  // If we are here, the batch is valid (e.g. the non-parallel tool is alone).
  const results = await Promise.all(
    toolCalls.map(async (call) => {
      const queued = pendingQueue.get(call.toolCallId);
      if (queued) {
        // Run the deferred execution now and swap the placeholder for the real result
        pendingQueue.delete(call.toolCallId);
        return { toolCallId: call.toolCallId, result: await queued() };
      }

      // Otherwise it was a normal tool that already ran; keep its result
      return { toolCallId: call.toolCallId, result: call.result };
    }),
  );

  return results;
}
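
To tie the pieces together, here is a minimal, hypothetical driver loop. callModel is a stand-in for your actual streamText/generateText invocation via the AI SDK; the only requirement is that every batch of tool calls flows through handleModelResponse before results are reported back to the model. It reuses the hypothetical tools and wrappedTools from the earlier sketch:

// callModel is a placeholder for your real model invocation via the AI SDK
declare function callModel(
  messages: unknown[],
  tools: ToolRegistry,
): Promise<{ text: string; toolCalls: ToolCall[] }>;

async function runAgent(userMessage: string) {
  const messages: unknown[] = [{ role: 'user', content: userMessage }];

  while (true) {
    const response = await callModel(messages, wrappedTools);

    // No tool calls means the model produced its final answer
    if (response.toolCalls.length === 0) return response.text;

    // Validate the batch, then execute queued tools or reject them
    const toolResults = await handleModelResponse(response, wrappedTools);

    // Feed the results (real, delayed, or error) back into the history
    for (const { toolCallId, result } of toolResults) {
      messages.push({ role: 'tool', toolCallId, content: JSON.stringify(result) });
    }
  }
}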

Benefits

1. No "Ghost" Side Effects

Because we return a placeholder (`status: 'pending'`) initially, the dangerous code never runs if validation fails. You don't have to roll back database transactions or undo API calls.

2. Self-Correcting Agents

By returning a specific error message ("Must be called alone"), you turn a system failure into a prompt. The model sees the error in the message history and self-corrects in the next step.

// Model sees this in history:
{
  "role": "tool",
  "content": "Error: Tool [deploy_production] must be called alone. Please call it in its own step."
}

// Model corrects itself by calling deploy_production alone

3. Fully Leverage Parallel Tool Calls

We don't have to disable parallel tool calls globally just for the 1% of tools that can't be run in parallel.

4. Compatibility

This pattern works cleanly on top of the Vercel AI SDK and any model provider.

Automate the "Write" Way

Automate the "Write" Way

Our Aident Playbook Editor (APE) turns plain English instructions into smart and reliable automations.

No coding, no complicated setups—just describe your task, and watch your tools and AI agents work seamlessly together.

Our Aident Playbook Editor (APE) turns plain English instructions into smart and reliable automations. No coding, no complicated setups—just describe your task, and watch your tools and AI agents work seamlessly together.

Our Aident Playbook Editor (APE) turns plain English instructions into

smart and reliable automations. No coding, no complicated setups—

just describe your task, and watch your tools and AI agents work seamlessly together.

Turn search trends into SEO-ready blog posts.

Turn search trends into SEO-ready blog posts.

Turn search trends into SEO-ready blog posts.

Trend-Driven Blog Maker

Trend-Driven Blog Maker

Trend-Driven Blog Maker

Track post data and flag top performers instantly.

Track post data and flag top performers instantly.

Track post data and flag top performers instantly.

Social Insight Collector

Social Insight Collector

Social Insight Collector

Refresh old hits into new, high-performing formats.

Refresh old hits into new, high-performing formats.

Refresh old hits into new, high-performing formats.

Evergreen Content Recycler

Evergreen Content Recycler

Evergreen Content Recycler

Auto-create and schedule Visuals from trending topics.

Auto-create and schedule Visuals from trending topics.

Auto-create and schedule Visuals from trending topics.

Auto Social Post Designer

Auto Social Post Designer

Auto Social Post Designer

Auto-launch and optimize ads across all platforms.

Auto-launch and optimize ads across all platforms.

Auto-launch and optimize ads across all platforms.

Smart Ads Pilot

Smart Ads Pilot

Smart Ads Pilot

Post trending content, on-brand and on time.

Post trending content, on-brand and on time.

Post trending content, on-brand and on time.

Social Media Auto Poster

Social Media Auto Poster

Social Media Auto Poster

Subscribe to Know More

Subscribe to Know More