In the rapidly evolving world of artificial intelligence, developers are constantly seeking ways to enhance their AI applications. Spring AI, a Java framework for building AI-powered applications, has introduced a powerful feature: Spring AI Advisors.
The advisors can supercharge your AI applications, making them more modular, portable and easier to maintain.
At their core, Spring AI Advisors are components that intercept and potentially modify the flow of chat-completion requests and responses in your AI applications. The key player in this system is the AroundAdvisor, which allows developers to dynamically transform or utilize information within these interactions.
The main benefits of using Advisors include encapsulating recurring Gen AI patterns, transforming the data sent to and from the model, and keeping your AI pipelines modular, portable, and easy to maintain.
The Advisor system operates as a chain, with each Advisor in the sequence having the opportunity to process both the incoming request and the outgoing response. Here's a simplified flow:
1. An AdvisedRequest is created from the user's prompt, along with an empty advisor context.
2. Each Advisor in the chain can process, and potentially modify, the request before it reaches the Chat Model.
3. The Chat Model's reply is wrapped in an AdvisedResponse, a combination of the original ChatResponse and the advise context accumulated on the input path of the chain.
4. Each Advisor can then process or modify the response on the way back.
5. The ChatResponse from the final AdvisedResponse is returned to the client.

Spring AI comes with several pre-built Advisors to handle common scenarios and Gen AI patterns, such as the MessageChatMemoryAdvisor for conversational memory and the QuestionAnswerAdvisor for Retrieval Augmented Generation (RAG).
With the ChatClient API you can register the advisors needed for your pipeline:
var chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(
        new MessageChatMemoryAdvisor(chatMemory), // chat-memory advisor
        new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()) // RAG advisor
    )
    .build();

String response = chatClient.prompt()
    // Set chat memory parameters at runtime
    .advisors(advisor -> advisor.param("chat_memory_conversation_id", "678")
        .param("chat_memory_response_size", 100))
    .user(userText)
    .call()
    .content();
The Advisor API consists of CallAroundAdvisor and CallAroundAdvisorChain for non-streaming scenarios, and StreamAroundAdvisor and StreamAroundAdvisorChain for streaming scenarios. It also includes AdvisedRequest, which represents the not-yet-sealed Prompt request data, and AdvisedResponse, which carries the chat completion data. Both AdvisedRequest and AdvisedResponse have an advise-context field, used to share state across the advisor chain.
Creating a custom Advisor is straightforward. Let's implement a simple logging Advisor to demonstrate the process:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import reactor.core.publisher.Flux;

// The advisor API types and the MessageAggregator utility come from the Spring AI
// chat client advisor packages; the exact package names depend on your Spring AI version.

public class SimpleLoggerAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {

    private static final Logger logger = LoggerFactory.getLogger(SimpleLoggerAdvisor.class);

    @Override
    public String getName() { // unique name identifying this advisor
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() { // position of this advisor in the chain
        return 0;
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        logger.debug("BEFORE: {}", advisedRequest);

        AdvisedResponse advisedResponse = chain.nextAroundCall(advisedRequest);

        logger.debug("AFTER: {}", advisedResponse);

        return advisedResponse;
    }

    @Override
    public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
        logger.debug("BEFORE: {}", advisedRequest);

        Flux<AdvisedResponse> advisedResponses = chain.nextAroundStream(advisedRequest);

        return new MessageAggregator().aggregateAdvisedResponse(advisedResponses,
            advisedResponse -> logger.debug("AFTER: {}", advisedResponse));
    }
}
This Advisor logs the request before it's processed and the response after it's received, providing valuable insight into the AI interaction. The aggregateAdvisedResponse(...) utility combines the streamed AdvisedResponse chunks into a single AdvisedResponse, returning the original stream unchanged and accepting a Consumer callback for the completed result. It preserves the original content and context.
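To try it out, you could register the logger advisor on a ChatClient. The snippet below is a minimal sketch that reuses the chatModel and userText variables from the earlier example:

var chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(new SimpleLoggerAdvisor()) // log every request/response pair
    .build();

String answer = chatClient.prompt()
    .user(userText)
    .call()
    .content();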
Let's implement a more advanced Advisor based on the Re-Reading (Re2) technique, inspired by this paper, which can improve the reasoning capabilities of large language models:
import java.util.HashMap;
import java.util.Map;

import reactor.core.publisher.Flux;

// The advisor API types come from the Spring AI chat client advisor packages;
// the exact package names depend on your Spring AI version.

public class ReReadingAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {

    private static final String DEFAULT_USER_TEXT_ADVISE = """
            {re2_input_query}
            Read the question again: {re2_input_query}
            """;

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return 0;
    }

    // Rewrites the user prompt so the original question is repeated (the Re2 technique).
    private AdvisedRequest before(AdvisedRequest advisedRequest) {

        String inputQuery = advisedRequest.userText(); // original user query

        Map<String, Object> params = new HashMap<>(advisedRequest.userParams());
        params.put("re2_input_query", inputQuery);

        return AdvisedRequest.from(advisedRequest)
            .withUserText(DEFAULT_USER_TEXT_ADVISE)
            .withUserParams(params)
            .build();
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        return chain.nextAroundCall(before(advisedRequest));
    }

    @Override
    public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
        return chain.nextAroundStream(before(advisedRequest));
    }
}
This Advisor modifies the input query to include a "re-reading" step, potentially improving the AI model's understanding and reasoning about the question.
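As with the built-in advisors, you could register the ReReadingAdvisor on a ChatClient and combine it with other advisors in the same chain. This is a minimal sketch reusing the chatModel variable from the earlier example:

var chatClient = ChatClient.builder(chatModel)
    .defaultAdvisors(
        new SimpleLoggerAdvisor(),  // logs the request and response
        new ReReadingAdvisor())     // rewrites the user prompt with the re-reading step
    .build();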
Beyond the basics, a few aspects of advisor management deserve a closer look: execution order, state sharing through the advise-context, and streaming support.
The order of Advisors in the chain is crucial and is determined by the getOrder() method: Advisors with lower order values are executed first. Because the advisor chain operates like a stack, the first advisor in the chain is the first to process the request and the last to process the response. To make an advisor run first, give it an order close to the Ordered.HIGHEST_PRECEDENCE value; to make it run last, use a value close to Ordered.LOWEST_PRECEDENCE. If multiple advisors share the same order value, their relative execution order is not guaranteed.
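For example, a custom advisor can expose its order through a constructor parameter so you can position it explicitly in the chain. The TimingAdvisor below is a hypothetical sketch, not part of Spring AI, built only on the interface methods shown above:

public class TimingAdvisor implements CallAroundAdvisor {

    private final int order;

    public TimingAdvisor(int order) {
        this.order = order; // e.g. Ordered.HIGHEST_PRECEDENCE + 10 to run near the start of the chain
    }

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return this.order;
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        long start = System.currentTimeMillis();
        AdvisedResponse advisedResponse = chain.nextAroundCall(advisedRequest);
        System.out.println(getName() + " took " + (System.currentTimeMillis() - start) + " ms");
        return advisedResponse;
    }
}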
Both the AdvisedRequest and the AdvisedResponse share an advise-context object. You can use the advise-context to share state between the advisors in the chain and to build more complex processing scenarios that involve multiple advisors.
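As a rough illustration, one advisor could write a flag into the context on the request path and read it back on the response path. Note that the adviseContext() accessor and the withAdviseContext(...) builder method used below are assumptions; the exact names may differ in your Spring AI version:

import java.util.HashMap;
import java.util.Map;

public class TagRequestAdvisor implements CallAroundAdvisor {

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return 0;
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        // Copy the shared context and add a value for downstream advisors (accessor names assumed).
        Map<String, Object> context = new HashMap<>(advisedRequest.adviseContext());
        context.put("request_tag", "priority");

        AdvisedRequest tagged = AdvisedRequest.from(advisedRequest)
            .withAdviseContext(context)
            .build();

        AdvisedResponse advisedResponse = chain.nextAroundCall(tagged);

        // On the way back, the same key is visible in the response's context.
        Object tag = advisedResponse.adviseContext().get("request_tag");
        System.out.println("Tag seen on response path: " + tag);

        return advisedResponse;
    }
}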
Spring AI supports both streaming and non-streaming Advisors. Non-streaming Advisors work with complete requests and responses, while streaming Advisors handle continuous streams using reactive programming concepts (e.g., Flux for responses). For streaming advisors, it's crucial to note that a single AdvisedResponse instance represents only a chunk (i.e., part) of the entire Flux<AdvisedResponse> response; for non-streaming advisors, the AdvisedResponse encompasses the complete response.
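As a small sketch, a streaming-only advisor can post-process each chunk with ordinary reactive operators. The ChunkCountingAdvisor below is a made-up example that simply counts the chunks flowing through the stream:

import java.util.concurrent.atomic.AtomicInteger;

import reactor.core.publisher.Flux;

public class ChunkCountingAdvisor implements StreamAroundAdvisor {

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return 0;
    }

    @Override
    public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
        AtomicInteger chunkCount = new AtomicInteger(); // counts the streamed chunks for this request
        return chain.nextAroundStream(advisedRequest)
            .doOnNext(chunk -> chunkCount.incrementAndGet())
            .doOnComplete(() -> System.out.println("Streamed " + chunkCount.get() + " chunks"));
    }
}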
Spring AI Advisors provide a powerful and flexible way to enhance your AI applications. By leveraging this API, you can create more sophisticated, reusable, and maintainable AI components. Whether you're implementing custom logic, managing conversation history, or improving model reasoning, Advisors offer a clean and efficient solution.
We encourage you to experiment with Spring AI Advisors in your projects and share your custom implementations with the community. The possibilities are endless, and your innovations could help shape the future of AI application development!
Happy coding, and may your AI applications be ever more intelligent and responsive!