Introducing Spring AI Amazon Bedrock Nova Integration via Converse API

Engineering | Christian Tzolov | December 10, 2024 | ...

The Amazon Bedrock Nova models represent a new generation of foundation models supporting a broad range of use cases, from text and image understanding to video-to-text analysis.

With the Spring AI Bedrock Converse API integration, developers can seamlessly connect to these advanced Nova models and build sophisticated conversational applications with minimal effort.

This blog post introduces the key features of Amazon Nova models, demonstrates their integration with Spring AI's Bedrock Converse API, and provides practical examples for text, image, video, document processing, and function calling.

What are Amazon Nova Models?

Amazon Nova offers three tiers of models—Nova Pro, Nova Lite, and Nova Micro—to address different performance and cost requirements:

Specification	Nova Pro	Nova Lite	Nova Micro
Modalities	Text, Image, Video-to-text	Text, Image, Video-to-text	Text
Model ID	amazon.nova-pro-v1:0	amazon.nova-lite-v1:0	amazon.nova-micro-v1:0
Context window (tokens)	300K	300K	128K

Nova Pro and Lite support multimodal capabilities, including text, image, and video inputs, while Nova Micro is optimized for text-only interactions at a lower cost.
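
The table can also be captured in code when an application needs to choose a model per request. The following is only a sketch: `Modality` and `NovaModel` are hypothetical types, not part of Spring AI or the AWS SDK, and the ordering of tiers from Micro to Pro as "smallest first" is an assumption.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

/** Input types named in the comparison table. */
enum Modality { TEXT, IMAGE, VIDEO }

/** Hypothetical catalog of the three Nova tiers; values are copied from the table. */
enum NovaModel {
    // Declared smallest tier first (an assumption), so values() is already ordered by size.
    MICRO("amazon.nova-micro-v1:0", 128_000, EnumSet.of(Modality.TEXT)),
    LITE("amazon.nova-lite-v1:0", 300_000, EnumSet.allOf(Modality.class)),
    PRO("amazon.nova-pro-v1:0", 300_000, EnumSet.allOf(Modality.class));

    final String modelId;
    final int contextTokens;
    final Set<Modality> modalities;

    NovaModel(String modelId, int contextTokens, Set<Modality> modalities) {
        this.modelId = modelId;
        this.contextTokens = contextTokens;
        this.modalities = modalities;
    }

    /** Returns the first (smallest) tier that accepts every required modality. */
    static Optional<NovaModel> smallestSupporting(Set<Modality> required) {
        return Arrays.stream(values())
                .filter(model -> model.modalities.containsAll(required))
                .findFirst();
    }
}
```

A text-only request resolves to Nova Micro, while any request that includes an image or a video resolves to Nova Lite. The resulting `modelId` is the value expected by the model option in the chat configuration.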

Setting Up the Integration

Prerequisites

AWS Configuration: You need:
- AWS credentials with access to Amazon Bedrock
- Necessary permissions to use Nova models
- Models enabled in the Amazon Bedrock console

Spring AI Dependency: Add the Spring AI Bedrock Converse starter to your Spring Boot project:

Maven:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-bedrock-converse-spring-boot-starter</artifactId>
</dependency>

Gradle:

dependencies {
    implementation 'org.springframework.ai:spring-ai-bedrock-converse-spring-boot-starter'
}

Application Configuration: Configure application.properties for Amazon Bedrock:

spring.ai.bedrock.aws.region=us-east-1
spring.ai.bedrock.aws.access-key=${AWS_ACCESS_KEY_ID}
spring.ai.bedrock.aws.secret-key=${AWS_SECRET_ACCESS_KEY}
spring.ai.bedrock.aws.session-token=${AWS_SESSION_TOKEN}

spring.ai.bedrock.converse.chat.options.model=amazon.nova-pro-v1:0

spring.ai.bedrock.converse.chat.options.temperature=0.8
spring.ai.bedrock.converse.chat.options.max-tokens=1000

For more details, refer to the Chat Properties documentation.
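
Because the properties above read the credentials from environment variables, a quick check before startup can save a confusing failure later. The sketch below is plain Java and not part of Spring AI; `BedrockEnvCheck` is a hypothetical name. It treats `AWS_SESSION_TOKEN` as optional, on the assumption that it is only set when temporary credentials are in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check for the variables referenced in application.properties. */
final class BedrockEnvCheck {

    // AWS_SESSION_TOKEN is left out on purpose: it is assumed to be needed only for temporary credentials.
    private static final List<String> REQUIRED = List.of("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY");

    private BedrockEnvCheck() {
    }

    /** Returns the required variables that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (!missing.isEmpty()) {
            System.err.println("Missing AWS environment variables: " + missing);
        }
    }
}
```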

Key Features of the Bedrock Nova Integration

1. Text Completion

Text-based chat completion is straightforward:

String response = ChatClient.create(chatModel)
    .prompt("Tell me a joke about AI.")
    .call()
    .content();

2. Multimodal Input

Nova Pro and Lite support multimodal inputs, enabling text and visual data processing. Spring AI provides a portable Multimodal API that supports Bedrock Nova models.

Text + Image

Nova Pro and Lite support multiple image modalities. These models can analyze images, answer questions about them, classify them, and generate summaries based on provided instructions. They support base64-encoded images in image/jpeg, image/png, image/gif, and image/webp formats.

Example combining user text with an image:

String response = ChatClient.create(chatModel)
    .prompt()
    .user(u -> u.text("Explain what you see in this picture.")
        .media(Media.Format.IMAGE_PNG, new ClassPathResource("/test.png")))
    .call()
    .content();

This code sends the test.png image together with the text message "Explain what you see in this picture." and generates a response like:

The image shows a close-up view of a wire fruit basket containing several pieces of fruit...
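
Since only the four image formats listed above are accepted, it can be useful to check a file name before building the prompt. This is a minimal sketch in plain Java: `NovaImageFormats` is a hypothetical helper, and the mapping from file extensions to MIME types is a common convention rather than something defined by Spring AI or Bedrock.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup from file extension to one of the supported image MIME types. */
final class NovaImageFormats {

    // Extension-to-MIME mapping by convention; only the four supported formats are listed.
    private static final Map<String, String> MIME_BY_EXTENSION = Map.of(
            "jpg", "image/jpeg",
            "jpeg", "image/jpeg",
            "png", "image/png",
            "gif", "image/gif",
            "webp", "image/webp");

    private NovaImageFormats() {
    }

    /** Returns the MIME type for a supported image file name, or empty if the format is not supported. */
    static Optional<String> mimeTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension to inspect
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(MIME_BY_EXTENSION.get(extension));
    }
}
```

An empty result signals that the image should be converted to a supported format before it is attached to the prompt.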

Text + Video

Amazon Nova Pro and Lite accept a single video per request payload, provided either as base64-encoded content or as an Amazon S3 URI.

Supported video formats include video/x-matroska, video/quicktime, video/mp4, video/webm, video/x-flv, video/mpeg, video/x-ms-wmv, and video/3gpp.

Example combining user text with a video:

String response = ChatClient.create(chatModel)
    .prompt()
    .user(u -> u.text("Explain what you see in this video.")
        .media(Media.Format.VIDEO_MP4, new ClassPathResource("/test.video.mp4")))
    .call()
    .content();

This code sends the test.video.mp4 video together with the text message "Explain what you see in this video." and generates a response like:

The video shows a group of baby chickens, also known as chicks, huddled together on a surface ...
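
The example above loads the video from the classpath. When a video may instead live in Amazon S3, the application first has to tell the two kinds of source apart. The sketch below only parses and validates an `s3://bucket/key` string in plain Java. `S3Location` is a hypothetical type, and how such a location is then handed to Spring AI is not covered here.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical value type for a video stored in Amazon S3. */
record S3Location(String bucket, String key) {

    /** Parses "s3://bucket/key"; returns empty for anything that is not a complete S3 URI. */
    static Optional<S3Location> parse(String value) {
        URI uri;
        try {
            uri = URI.create(value);
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // not a syntactically valid URI
        }
        if (!"s3".equalsIgnoreCase(uri.getScheme())) {
            return Optional.empty(); // local path or another scheme
        }
        String bucket = uri.getAuthority();
        String path = uri.getPath();
        if (bucket == null || bucket.isEmpty() || path == null || path.length() < 2) {
            return Optional.empty(); // bucket or object key is missing
        }
        return Optional.of(new S3Location(bucket, path.substring(1)));
    }
}
```

A caller can treat an empty result as a local or classpath resource and fall back to the inline approach shown in the example.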

Text + Document

Nova Pro and Lite support two kinds of document input:

- Text document types (txt, csv, html, md, etc.) for text understanding and answering questions based on textual elements
- Media document types (pdf, docx, xlsx) for vision-based understanding, such as analyzing charts and graphs

Example combining user text with a media document:

String response = ChatClient.create(chatModel)
    .prompt()
    .user(u -> u.text(
            "You are a very professional document summarization specialist. Please summarize the given document.")
        .media(Media.Format.DOC_PDF, new ClassPathResource("/spring-ai-reference-overview.pdf")))
    .call()
    .content();

This code sends the spring-ai-reference-overview.pdf document together with the text message and generates a response like:

Introduction:

Spring AI is designed to simplify the development of applications with artificial intelligence (AI) capabilities, aiming to avoid unnecessary complexity....

3. Function Calling

Nova models support Tool/Function Calling for integration with external tools.

Define a Function

@Bean
@Description("Get the weather in a location. Return temperature in Celsius or Fahrenheit.")
public Function<WeatherRequest, WeatherResponse> weatherFunction() {
    return new MockWeatherService();
}
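
The request, response, and service types used by this bean are not shown in this post. One possible shape, given purely as an illustration, is sketched below. The record fields, the `Unit` enum, and the fixed temperature are all assumptions; only the class names come from the bean definition above.

```java
import java.util.function.Function;

/** Temperature unit, matching the "Celsius or Fahrenheit" wording in the bean description. */
enum Unit { C, F }

/** Assumed request shape: the location to look up and the desired unit. */
record WeatherRequest(String location, Unit unit) {}

/** Assumed response shape: the temperature and the unit it is expressed in. */
record WeatherResponse(double temperature, Unit unit) {}

/** Mock implementation that returns a fixed reading instead of calling a real weather API. */
final class MockWeatherService implements Function<WeatherRequest, WeatherResponse> {

    @Override
    public WeatherResponse apply(WeatherRequest request) {
        // Placeholder value in Celsius; a real service would look up request.location().
        double celsius = 22.0;
        Unit unit = request.unit() == null ? Unit.C : request.unit();
        double value = unit == Unit.F ? celsius * 9 / 5 + 32 : celsius;
        return new WeatherResponse(value, unit);
    }
}
```

Keeping the service a plain `java.util.function.Function` means it can be unit tested without a model or a Spring context.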

Use the Function in a Chat Prompt

String response = ChatClient.create(this.chatModel)
        .prompt("What's the weather like in Boston?")
        .function("weatherFunction") // bean name
        .inputType(WeatherRequest.class)
        .call()
        .content();

Resources

Getting Started

- Spring AI Documentation - Comprehensive guide to Spring AI
- Spring AI Bedrock Converse API Guide - Detailed API documentation

Amazon Bedrock Resources

- Amazon Bedrock Nova Documentation - Official Nova models documentation
- Amazon Bedrock Console - Manage and monitor your Bedrock resources
- Nova Model Capabilities - Detailed information about Nova model parameters and capabilities

Code Examples

- Spring AI Bedrock Nova Demo - Complete example project showcasing integration features
- BedrockNovaChatClientIT.java - Integration test examples
- Spring AI Samples Repository - Additional code samples and use cases

Tanzu AI Server

VMware Tanzu Platform 10 integrates Amazon Bedrock Nova models through the VMware Tanzu AI Server, powered by Spring AI. This integration provides:

- Enterprise-Grade AI Deployment: Production-ready solution for deploying AI applications within your VMware Tanzu environment
- Simplified Model Access: Streamlined access to Amazon Bedrock Nova models through a unified interface
- Security and Governance: Enterprise-level security controls and governance features
- Scalable Infrastructure: Built on Spring AI, the integration supports scalable deployment of AI applications while maintaining high performance

For more information about deploying AI applications with Tanzu AI Server, visit the VMware Tanzu AI documentation.

Conclusion

The integration of Spring AI with Amazon Bedrock Nova models via the Converse API enables powerful capabilities for building advanced conversational applications. Nova Pro and Lite provide comprehensive tools for developing multimodal experiences across text, images, videos, and documents. Function calling extends these capabilities further by enabling interaction with external tools and services.

Start building advanced AI applications with Nova models and Spring AI today!

Spring blog