
Gemini + Klavis AI Integration

This tutorial demonstrates how to use Google’s Gemini function calling with Klavis MCP (Model Context Protocol) servers.

Prerequisites

Before we begin, you’ll need a Gemini API key and a Klavis API key. Both are read by the setup code in this tutorial.

Installation

First, install the required packages:
pip install google-genai klavis

Full Code Examples

For complete working examples, check out the source code:

Setup Environment Variables

import os
import webbrowser
from google import genai
from google.genai import types
from klavis import Klavis
from klavis.types import McpServerName, ToolFormat

# Set environment variables (you can also use a .env file)
os.environ["GEMINI_API_KEY"] = "YOUR_GEMINI_API_KEY"  # Replace with your actual Gemini API key
os.environ["KLAVIS_API_KEY"] = "YOUR_KLAVIS_API_KEY"  # Replace with your actual Klavis API key

# Initialize clients
gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
klavis_client = Klavis(api_key=os.getenv("KLAVIS_API_KEY"))
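
Writing keys directly into the source is convenient in a notebook but easy to leak when the file is shared. As a minimal sketch, assuming the keys are already exported in your shell, a small check can fail early with a clear message. `require_env` is a hypothetical helper written for this tutorial, not part of the Gemini or Klavis SDKs.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value or value.startswith("YOUR_"):
        # Catches both a missing variable and an unreplaced placeholder.
        raise RuntimeError(f"Set the {name} environment variable before running this tutorial.")
    return value
```

With this in place, the clients could be created as `genai.Client(api_key=require_env("GEMINI_API_KEY"))` and `Klavis(api_key=require_env("KLAVIS_API_KEY"))`.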

Case Study 1: Gemini + YouTube MCP Server

Step 1 - Create YouTube MCP Server using Klavis

youtube_mcp_instance = klavis_client.mcp_server.create_server_instance(
    server_name=McpServerName.YOUTUBE,
    user_id="1234",
    platform_name="Klavis",
)

print(f"🔗 YouTube MCP server created at: {youtube_mcp_instance.server_url}, and the instance id is {youtube_mcp_instance.instance_id}")

Step 2 - Create a general method to use an MCP Server with Gemini

def gemini_with_mcp_server(mcp_server_url: str, user_query: str):
    # Get tools from MCP server
    mcp_server_tools = klavis_client.mcp_server.list_tools(
        server_url=mcp_server_url,
        format=ToolFormat.GEMINI,
    )
    
    # Prepare conversation contents
    contents = [types.Content(role="user", parts=[types.Part(text=user_query)])]
    
    # Generate response with function calling
    response = gemini_client.models.generate_content(
        model='gemini-1.5-pro',
        contents=contents,
        config=types.GenerateContentConfig(tools=mcp_server_tools.tools)
    )
    
    if response.candidates and response.candidates[0].content.parts:
        contents.append(response.candidates[0].content)
        
        # Check if there are function calls to execute
        has_function_calls = False
        for part in response.candidates[0].content.parts:
            if hasattr(part, 'function_call') and part.function_call:
                has_function_calls = True
                print(f"🔧 Calling function: {part.function_call.name}")
                
                try:
                    # Execute tool call via Klavis
                    function_result = klavis_client.mcp_server.call_tools(
                        server_url=mcp_server_url,
                        tool_name=part.function_call.name,
                        tool_args=dict(part.function_call.args),
                    )
                    
                    # Create function response in the proper format
                    function_response = {'result': function_result.result}
                    
                except Exception as e:
                    print(f"Function call error: {e}")
                    function_response = {'error': str(e)}
                
                # Add function response to conversation
                function_response_part = types.Part.from_function_response(
                    name=part.function_call.name,
                    response=function_response,
                )
                function_response_content = types.Content(
                    role='tool', 
                    parts=[function_response_part]
                )
                contents.append(function_response_content)
        
        if has_function_calls:
            # Generate final response after function calls
            final_response = gemini_client.models.generate_content(
                model='gemini-1.5-pro',
                contents=contents,
                config=types.GenerateContentConfig(tools=mcp_server_tools.tools)
            )
            return final_response.text
        else:
            # No function calls, return original response
            return response.text
    else:
        return "No response generated."
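
The function above handles a single round of tool calls. If the model asks for another tool after seeing the first results, the second response may not contain any text. One way to handle that is a bounded loop, sketched below. To keep the example runnable without API keys, the model and the tool executor are passed in as plain callables. `run_tool_loop`, `generate`, `run_tool`, and the `get_transcript` tool name are hypothetical and are not part of google-genai or the Klavis SDK.

```python
from types import SimpleNamespace
from typing import Any, Callable, List


def run_tool_loop(
    generate: Callable[[List[Any]], Any],
    run_tool: Callable[[Any], Any],
    contents: List[Any],
    max_rounds: int = 5,
) -> Any:
    """Call generate() repeatedly until the reply contains no function calls.

    generate(contents) returns a reply object exposing a `parts` list;
    run_tool(function_call) returns the message to append for that call.
    """
    for _ in range(max_rounds):
        reply = generate(contents)
        contents.append(reply)
        calls = [p.function_call for p in reply.parts if getattr(p, "function_call", None)]
        if not calls:
            return reply  # plain answer, nothing left to execute
        for call in calls:
            contents.append(run_tool(call))
    raise RuntimeError(f"Model still requested tools after {max_rounds} rounds")


# Fake model for demonstration: asks for one tool, then answers.
def fake_generate(contents: List[Any]) -> Any:
    if len(contents) == 1:
        call = SimpleNamespace(name="get_transcript", args={})
        return SimpleNamespace(parts=[SimpleNamespace(text=None, function_call=call)])
    return SimpleNamespace(parts=[SimpleNamespace(text="Summary ready.", function_call=None)])


history = ["user question"]
final = run_tool_loop(fake_generate, lambda call: f"result of {call.name}", history)
print(final.parts[0].text)
```

In the real function, `generate` would wrap `gemini_client.models.generate_content` and `run_tool` would wrap the Klavis `call_tools` request plus the `types.Content` construction shown above. The `max_rounds` cap keeps a misbehaving model from looping indefinitely.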

Step 3 - Summarize your favorite video!

YOUTUBE_VIDEO_URL = "https://www.youtube.com/watch?v=LCEmiRjPEtQ"  # pick a video you like!

result = gemini_with_mcp_server(
    mcp_server_url=youtube_mcp_instance.server_url, 
    user_query=f"Please provide a complete summary of this YouTube video with timestamps: {YOUTUBE_VIDEO_URL}"
)

print(result)
✅ Great! You’ve successfully created an AI agent that uses Gemini’s function calling with Klavis MCP servers to summarize YouTube videos!

Case Study 2: Gemini + Gmail MCP Server (OAuth needed)

# Create Gmail MCP server instance
gmail_mcp_server = klavis_client.mcp_server.create_server_instance(
    server_name=McpServerName.GMAIL,
    user_id="1234",
    platform_name="Klavis",
)

# Redirect to Gmail OAuth page for authorization
webbrowser.open(gmail_mcp_server.oauth_url)

print(f"🔐 Opening OAuth authorization for Gmail. If you are not redirected, please open the following URL in your browser: {gmail_mcp_server.oauth_url}")
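
In a notebook, the authorization and the email request run as separate cells, so you can finish the OAuth flow before continuing. In a single script, the request could go out before authorization completes. One way to pause is sketched below. `authorize_then_continue` is a hypothetical helper, and its `open_url` and `wait` parameters exist only so the function can be exercised without a browser or keyboard.

```python
import webbrowser
from typing import Callable


def authorize_then_continue(
    oauth_url: str,
    open_url: Callable[[str], object] = webbrowser.open,
    wait: Callable[[str], str] = input,
) -> None:
    """Open the OAuth page, then block until the user confirms they finished."""
    open_url(oauth_url)
    print(f"If no browser window opened, visit: {oauth_url}")
    wait("Press Enter once you have authorized access... ")
```

Calling `authorize_then_continue(gmail_mcp_server.oauth_url)` would replace the `webbrowser.open` and `print` lines above.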
EMAIL_RECIPIENT = "zihaolin@klavis.ai" # Replace with your email
EMAIL_SUBJECT = "Test Gemini + Gmail MCP Server"
EMAIL_BODY = "Hello World from Gemini!"

result = gemini_with_mcp_server(
    mcp_server_url=gmail_mcp_server.server_url, 
    user_query=f"Please send an email to {EMAIL_RECIPIENT} with subject {EMAIL_SUBJECT} and body {EMAIL_BODY}"
)

print(result)
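
The query above places the subject and body in the sentence without delimiters, so the model has to guess where each field ends. A minimal sketch of a prompt builder that quotes each field explicitly follows. `build_email_prompt` is a hypothetical helper, and the address is a placeholder.

```python
import json


def build_email_prompt(recipient: str, subject: str, body: str) -> str:
    """Build the send-email instruction with the subject and body quoted."""
    # json.dumps adds double quotes and escapes any quotes inside the text.
    return (
        f"Please send an email to {recipient} "
        f"with subject {json.dumps(subject, ensure_ascii=False)} "
        f"and body {json.dumps(body, ensure_ascii=False)}"
    )


print(build_email_prompt("user@example.com", "Test Gemini + Gmail MCP Server", "Hello World from Gemini!"))
```

The returned string can be passed as `user_query` to `gemini_with_mcp_server`.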

Summary

This tutorial demonstrated how to integrate Google’s Gemini function calling with Klavis MCP servers to build AI applications. We covered two practical examples:

🎥 YouTube Integration: Built an AI assistant that summarizes YouTube videos by extracting transcripts and producing detailed, timestamped summaries.

📧 Gmail Integration: Created an AI-powered email assistant that sends emails through Gmail with OAuth authentication.

Next Steps

Explore More MCP Servers

Try other available servers like Slack, Notion, GitHub, etc.

Multimodal Workflows

Build workflows that combine text, images, and other media.

Production Deployment

Scale these patterns for production applications.

Custom Integrations

Build custom MCP servers for your specific needs.

Useful Resources

Happy building! 🚀