January 24, 2025

Mastering LLM Function Calling: A Guide to Enhancing AI Capabilities

Abigail Wall


LLM function calling allows large language models (LLMs) to interact with the real world by executing external functions. Instead of just generating text, LLMs can now trigger actions based on user requests. To be clear, these are not agents but function calling is something an agent would use to translate language into concrete actions in the real world (or within digital environments). Imagine an agent needing to complete a multi-step task, like ordering a pizza. Function calling allows the LLM to break this down:

  1. Understand the user's order ("I want a large pepperoni pizza with extra cheese")
  2. Call a function to find the nearest pizza place
  3. Call another function to place the order with the correct details
  4. Call a final function to provide the user with an order confirmation and estimated delivery time
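The chain above can be sketched with plain Python, where each function's output feeds the next call. All function names and return values here are hypothetical stand-ins for real integrations:

```python
# Hypothetical helpers standing in for real delivery/payment integrations.
def find_nearest_pizza_place(address: str) -> str:
    return "Mario's Pizzeria"

def place_order(restaurant: str, size: str, toppings: list) -> str:
    return "ORD-123"  # returns an order ID

def confirm_order(order_id: str) -> str:
    return f"Order {order_id} confirmed, estimated delivery in 30 minutes"

# Each call's output feeds the next, mirroring how the LLM chains functions.
restaurant = find_nearest_pizza_place("123 Main St")
order_id = place_order(restaurant, size="large", toppings=["pepperoni"])
print(confirm_order(order_id))
```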

More technically, function calling lets the LLM interface with a structured function signature rather than just free-form text, parsing the user's text into the arguments the function needs (inputs, types, constraints). The translation happens via models trained to understand context and map natural language to structured function calls. In our pizza example, the LLM will:

1. Parse the intent ("order pizza")

2. Extract key parameters:

   - Size: "large"

   - Toppings: "pepperoni"

   - Implied parameters like delivery (if applicable)

3. Map these to function parameters through:

   - Intent recognition

   - Named entity extraction

   - Predefined mapping rules

So "I want a large pepperoni pizza" gets translated to:

order_pizza(
    size="large",
    toppings=["pepperoni"]
)
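Once the model emits a structured call like this, the application still has to route it to real code. A minimal dispatch sketch (the `order_pizza` implementation and the registry are illustrative, not a specific provider's API):

```python
def order_pizza(size: str, toppings: list) -> str:
    return f"Ordered a {size} pizza with {', '.join(toppings)}"

# Registry mapping tool names to callables.
TOOLS = {"order_pizza": order_pizza}

# The structured call a model might emit after parsing the user's request.
tool_call = {
    "name": "order_pizza",
    "arguments": {"size": "large", "toppings": ["pepperoni"]},
}

# Look up the function by name and invoke it with the parsed arguments.
result = TOOLS[tool_call["name"]](**tool_call["arguments"])
print(result)  # Ordered a large pizza with pepperoni
```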

[Figure: flowchart of the function-calling process]

Function-Calling Formats

While there isn't a universal standard for function-call instructions across all LLMs, many are converging toward similar JSON structures. A tool definition describes a function's capabilities: a name matching the regex ^[a-zA-Z0-9_-]{1,64}$, a detailed description of the tool's purpose, and an input schema expressed in JSON Schema. The input schema defines the expected parameters, their types, descriptions, and any constraints such as enumerated values. For optimal performance, tool definitions should provide comprehensive, precise descriptions of functionality, parameter behavior, usage scenarios, and limitations, prioritizing detailed explanations over brief examples. Here is a general example; each LLM currently has its own specific format.

{
  "name": "get_weather",
  "description": "Get the current weather in a given location",
  "input_schema": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g., San Francisco, CA"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"],
        "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
      }
    },
    "required": ["location"]
  }
}
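Before executing a call, the arguments the model produced can be checked against this input schema. Below is a minimal sketch of required-field and enum validation; a production system would typically use a full JSON Schema validator instead:

```python
def validate_args(args: dict, schema: dict) -> list:
    """Return a list of validation errors; an empty list means the arguments are valid."""
    errors = []
    props = schema.get("properties", {})
    # Check that every required parameter is present.
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required parameter: {key}")
    # Check each supplied argument against the schema.
    for key, value in args.items():
        spec = props.get(key)
        if spec is None:
            errors.append(f"unexpected parameter: {key}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key} must be one of {spec['enum']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

print(validate_args({"location": "San Francisco, CA", "unit": "kelvin"}, schema))
# ["unit must be one of ['celsius', 'fahrenheit']"]
```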

Function-Calling Challenges and Implications

While function calling represents a significant advancement in LLM capabilities, it is not without challenges that developers need to address. One major hurdle is the ambiguous nature of user input. Natural language can be vague or incomplete, potentially leading to incorrect function calls. To overcome this, developers can employ techniques like intent recognition and named entity extraction to better understand and clarify user requests.

Another common pitfall is errors in function definitions. Poorly defined tools can result in unexpected behavior and hinder the AI's ability to perform tasks accurately. To mitigate this, it's crucial to provide comprehensive descriptions and input schemas for each tool, ensuring clarity and consistency. Additionally, the LLM itself has limitations and may struggle with complex or multi-step tasks. Breaking these tasks down into smaller, more manageable steps and using frameworks like LangChain to orchestrate workflows can significantly improve performance.

Robust error handling is also essential for reliable function calling. If a function call fails, the system should be equipped to retry, prompt the user for clarification, or fall back to an alternative tool. Continuous training of LLMs is equally important, as it improves their ability to parse user input and map it to the correct function, reducing errors over time.
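A retry-with-fallback pattern like the one described above might look like the following sketch; the backoff timing, retry count, and fallback tool are all application-specific choices, and the weather functions are illustrative:

```python
import time

def call_with_retry(func, args, retries=3, delay=1.0, fallback=None):
    """Attempt a tool call several times, then fall back to an alternative tool."""
    for attempt in range(retries):
        try:
            return func(**args)
        except Exception:
            time.sleep(delay * 2 ** attempt)  # exponential backoff between attempts
    if fallback is not None:
        return fallback(**args)
    raise RuntimeError("tool call failed and no fallback is available")

def flaky_weather_api(location: str) -> str:
    raise ConnectionError("service unavailable")

def cached_weather(location: str) -> str:
    return f"Cached weather for {location}: 20°celsius"

print(call_with_retry(flaky_weather_api, {"location": "Paris"},
                      retries=2, delay=0.1, fallback=cached_weather))
```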

Beyond technical challenges, function calling raises ethical considerations. Protecting user privacy is paramount, especially when handling sensitive data like location or payment details. Developers must ensure secure data handling and compliance with regulations like GDPR. 

To enhance security, several best practices should be followed. Validating inputs to ensure they meet expected formats and constraints before processing is crucial. Sanitizing outputs by removing any sensitive or unnecessary information from function responses adds another layer of protection. Finally, monitoring function calls to detect and respond to suspicious activity helps maintain the integrity and security of the system.
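Output sanitization can be as simple as stripping sensitive fields before a function response is handed back to the model. A sketch, with illustrative field names:

```python
# Fields that should never be echoed back into the model's context.
SENSITIVE_KEYS = {"api_key", "card_number", "ssn"}

def sanitize_response(response: dict) -> dict:
    """Remove sensitive fields from a tool response before it reaches the model."""
    return {k: v for k, v in response.items() if k not in SENSITIVE_KEYS}

raw = {"status": "ok", "card_number": "4111 1111 1111 1111", "total": 18.50}
print(sanitize_response(raw))  # {'status': 'ok', 'total': 18.5}
```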

LLMs' Divergent Approaches to Function Calling

Different LLM providers have unique approaches to function calling. Here’s a quick comparison among closed-source providers:

| Provider | Function-Calling Approach | Strengths |
| --- | --- | --- |
| OpenAI | Uses JSON-based function definitions with clear input schemas and constraints. | High accuracy, extensive documentation, and strong community support. |
| Anthropic | Focuses on structured prompts and explicit tool definitions. | Emphasis on safety and ethical considerations. |
| Cohere | Leverages natural language instructions with minimal structured input. | Simplicity and ease of use for basic use cases. |
| Google | Integrates with Vertex AI, offering advanced orchestration and multi-modal capabilities. | Enterprise-grade scalability and integration with Google Cloud services. |

Framework Alternatives: LangChain & Abstraction Layers

A universal standard for function calling across LLMs would reduce this variation; for now, frameworks like LangChain absorb the differences. LangChain simplifies the integration of LLMs with external tools by providing a unified interface for defining and executing functions. It abstracts away the differences between LLM providers, allowing developers to define tools once and use them across multiple platforms. For example, LangChain can handle the variations in function-calling formats between OpenAI and Anthropic, ensuring consistent behavior. It also supports multi-step workflows, error handling, and fallback mechanisms, some of the many reasons it has become the default choice for many developers. Here is a LangChain version of the weather tool.

from langchain_core.tools import StructuredTool

def get_weather(location: str, unit: str = "celsius") -> str:
    # Simulate fetching weather data
    return f"Weather in {location}: 22°{unit}"

# StructuredTool infers the input schema from the function's type hints,
# so there is no need to spell out the JSON Schema by hand.
tool = StructuredTool.from_function(
    func=get_weather,
    name="get_weather",
    description="Get the current weather in a given location",
)

# Execute the tool
response = tool.invoke({"location": "San Francisco", "unit": "fahrenheit"})
print(response)  # Output: "Weather in San Francisco: 22°fahrenheit"
