Cloud Native Diary #12

Exciting projects with Java, LLMs, and Generative AI. Conference news from Spring I/O and Devoxx UK. OpenTelemetry and supply chain security.

A llama walking down the street in a sunny and beautiful city that resembles Barcelona. He he he.

In Cloud Native Diary, I periodically share my journey working with application development, platform engineering, and cloud native technologies.

The past couple of months have been really busy. In this issue, you'll find lots of content about Java, Generative AI, OpenTelemetry, Buildpacks, supply chain security, and news from events. Ready? Let's get to it!

Java and AI

Everything is evolving so fast in the field of artificial intelligence. More and more organizations are researching how to infuse their software solutions with AI. Some of them are finding quite exciting use cases. Unfortunately, others are skipping over the question, "What problem are we trying to solve?" and are only focused on chasing the hype, with disastrous effects.

I'm personally interested in finding sensible use cases to solve real problems through software and provide value to the end users. As part of that research activity, I've been actively working on reaching reasonable technical solutions to achieve production-grade, AI-infused applications. Java is a vital part of those solutions.

Spring AI

Spring AI provides a framework to infuse artificial intelligence capabilities into Java applications, including integrations with machine learning model services (chat, image, speech, embedding, multimodal) and vector stores.

The project is under active development and, over the past few weeks, has witnessed massive activity that resulted in the first 1.0.0 milestone. Exciting! If you want to try it, you can bootstrap a new project with Spring AI from start.spring.io. Also, feel free to check out my GitHub repository with tons of examples showcasing the many features of Spring AI and demonstrating actual use cases such as classification, semantic search, structured data extraction, question answering with documents, and chatbots.

GitHub - ThomasVitale/llm-apps-java-spring-ai: Samples showing how to build Java applications powered by Generative AI and LLMs using Spring AI and Spring Boot.

Since the last issue of Cloud Native Diary, I have contributed several improvements to Spring AI and have more in the backlog.

  • Updated the Ollama integration to keep it in sync with the Ollama API, enhanced the test setup to use the Ollama Testcontainers module, and added the new llama3 and phi3 models (#gh-662).
  • Fixed the image model configuration in the OpenAI integration and added the new gpt-4o model (#gh-672, #gh-722).
  • Integrated Spring AI with the Kubernetes Service Binding API, making it possible to bind applications automatically with model inference services and vector stores in the cloud (#gh-500).
  • Configured a Spring Boot Starter to streamline the usage of the Spring AI integration with Hugging Face (#gh-839).
  • Updated the Mistral integration to keep it in sync with the current model offering (#gh-826).
  • Extended the pgvector documentation with more detailed information on how to use this PostgreSQL extension to store embeddings and run it as a Spring Boot dev service with Testcontainers or Docker Compose (#gh-825).
  • Improved the documentation explaining the main framework concepts, Testcontainers, function calling, and embeddings (#gh-647, #gh-666).
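
Several of the items above deal with embeddings and vector stores. As a rough illustration of the idea behind semantic search (not how Spring AI or pgvector implement it), the sketch below ranks a few hand-written vectors by cosine similarity. The class name, the sample texts, and the three-dimensional vectors are all made up for the example; real embeddings come from a model and have hundreds or thousands of dimensions.

```java
import java.util.Map;

public class EmbeddingSearchSketch {

    // Cosine similarity between two vectors of equal length.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the key of the stored vector most similar to the query,
    // or null if the store is empty.
    static String mostSimilar(Map<String, double[]> store, double[] query) {
        String bestKey = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : store.entrySet()) {
            double score = cosineSimilarity(entry.getValue(), query);
            if (score > bestScore) {
                bestScore = score;
                bestKey = entry.getKey();
            }
        }
        return bestKey;
    }

    public static void main(String[] args) {
        // Hypothetical, hand-written "embeddings" for illustration only.
        Map<String, double[]> store = Map.of(
                "piano sonata", new double[] {0.9, 0.1, 0.0},
                "container image", new double[] {0.0, 0.2, 0.9},
                "string quartet", new double[] {0.8, 0.3, 0.1});
        double[] query = {0.1, 0.1, 0.95};
        System.out.println(mostSimilar(store, query)); // prints "container image"
    }
}
```

In a real application, the vector store runs this kind of similarity search for you, typically with an index so that it scales beyond a handful of documents.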

Something still missing in Spring AI yet fundamental to achieving production-grade applications is dedicated observability for the model integrations. I'm actively working on a proposal to contribute to the project. In the meantime, you can peek at what I implemented for my demo application at Spring I/O.

LangChain4j

LangChain4j is another framework for building LLM-powered Java applications. For a few months, I worked on a dedicated Spring Boot extension to bring native support for LangChain4j and used it in my presentation at KubeCon with Lize Raes.

GitHub - ThomasVitale/langchain4j-spring-boot: LangChain4j support in Spring Boot to build AI and LLM-powered applications.

I'm now working with Dmytro Liubarskyi, the creator of LangChain4j, to contribute what I implemented to the official Spring Boot extension for LangChain4j. The first step was to update the project baseline to Java 17 and Spring Boot 3.2. More contributions will follow. 

If you'd like to try it, feel free to check out my GitHub repository, which has examples showcasing some of LangChain4j's features.

GitHub - ThomasVitale/llm-apps-java-langchain4j: Samples showing how to build Java applications powered by Generative AI and LLMs using LangChain4j and Spring Boot.

Ollama

Ollama is a great way to run LLMs locally, saving money and keeping your data private. It quickly became one of the tools I used the most every day. I use it as a native application and with the Ollama Testcontainers module integrated with Java and Spring Boot.

For a while now, I've been maintaining a collection of OCI images for popular open-source or free large language models to run on Ollama. They are very convenient when I run Ollama with Testcontainers. I was thrilled to find out that more people are using these images and finding them useful. Eddú Meléndez (Docker, Testcontainers) used them in his presentation at Devoxx UK while showing how Testcontainers, Ollama, and Spring AI simplify building LLM applications and improve the developer experience.

You can find the custom, multi-arch images in this GitHub repository. They are published weekly and include signatures and SLSA attestations so that you can verify their integrity and provenance. I use them with Spring AI, LangChain4j, and Testcontainers to build and test LLM-powered applications.

GitHub - ThomasVitale/llm-images: Catalog of OCI images for popular open-source or open Large Language Models.

OpenTelemetry

OpenTelemetry aims to provide "high-quality, ubiquitous, and portable telemetry to enable effective observability". It offers unified APIs, SDKs, and protocols to collect and process any telemetry signal produced by software applications, including logs, metrics, traces, events, and profiles.

It's the de facto standard for observability, supported by all major vendors and tools. Work is even in progress to define semantic conventions to structure telemetry signals from LLM-powered applications. I'm actively following that in my implementation for Spring AI.

Spring Boot and OpenTelemetry

Spring Boot offers partial support for OpenTelemetry via Micrometer, a framework that provides unified observability APIs and acts as a facade toward specific backends.

I have previously raised some issues about this topic, and some of them have been addressed in the latest versions of Spring Boot. Others are still pending, including the inclusion of an "OpenTelemetry" starter option in the Spring Initializr. I hope to see more focus on OpenTelemetry from Spring Boot in the future.

On GitHub, you can find some examples I made to show the current state of the OpenTelemetry support for metrics and traces.

GitHub - ThomasVitale/spring-boot-opentelemetry

Buildpacks and OpenTelemetry

A while back, due to the partial support for OpenTelemetry in Spring Boot, I implemented a dedicated Buildpack to include the OpenTelemetry Java Agent in any containerized Java application, including Spring ones. I then contributed the project to Paketo, and I currently maintain it.

This week, I delivered a major change to migrate to the new 2.x line of the OpenTelemetry Java Agent. I also updated this sample application showcasing how to containerize a Spring Boot application and automatically configure it with OpenTelemetry.

If you tried the Paketo Buildpack for OpenTelemetry, feel free to reach out and share your experience!

GitHub - paketo-buildpacks/opentelemetry

Buildpacks

The most exciting thing happening to Buildpacks is something I've long been waiting for. When you use Pack, you can now create multi-platform builders and buildpacks. This new feature has just landed in version 0.34.0!

The Paketo community has been working on producing multi-platform builders and buildpacks for a while and can now take advantage of the official support upstream. The buildpacks for Java, Go, and Rust have been officially published with support for AMD64 and ARM64 architectures. You can try them out with the paketobuildpacks/builder-jammy-buildpackless-tiny builder. Multi-platform builders for common use cases are still in the works and are planned to be delivered together with the base image upgrade from Ubuntu 22.04 to Ubuntu 24.04.

How about Spring Boot? You can start using the new multi-platform buildpacks today for your Java applications. If you want to containerize your application to run on the JVM, provide this configuration to the Spring Boot plugin (shown here for Gradle; the Maven plugin accepts equivalent settings).

tasks.named('bootBuildImage') {
    builder = "paketobuildpacks/builder-jammy-buildpackless-tiny"
    buildpacks = [ "gcr.io/paketo-buildpacks/java" ]
}

If your application is configured to be compiled to a native executable with GraalVM, you can provide this configuration to the Spring Boot plugin instead (again shown for Gradle; the Maven plugin accepts equivalent settings).

tasks.named('bootBuildImage') {
    builder = "paketobuildpacks/builder-jammy-buildpackless-tiny"
    buildpacks = [ "paketobuildpacks/java-native-image" ]
}

I've successfully run the new multi-platform buildpacks on macOS, Windows, and some Linux machines. However, in a few cases, I noticed that the Spring Boot plugin gets stuck during the containerization process with Buildpacks when using the new Java multi-platform buildpack. I have reported the issue to the Spring team, which is currently looking into a fix.

Speaking of Buildpacks and Spring Boot, I also investigated an issue causing the Gradle plugin to fail on Podman. I want to thank Scott Frederick from the Spring team for a nice pair-troubleshooting session and for ultimately fixing the bug. Podman Desktop is my primary container runtime, and I'm glad I can use it with all my favorite tools, including Spring Boot and Buildpacks.

Devoxx UK

Last May, I attended Devoxx UK and had a great time connecting with the developer community in London. My session was about Securing the Supply Chain for Your Java Applications, from source code to dependencies to build up to the final artifacts and deployments. This year, I was also glad to contribute to the Devoxx UK organization by helping on the program committee.

You can find all the examples from my presentation (and more) on my GitHub repository focused on supply chain security, showcasing different tools and techniques to enhance your application projects. You can also check out the slides.

GitHub - ThomasVitale/supply-chain-security-java: Samples showing how to secure the supply chain for Java applications.

Spring I/O

Spring I/O was a blast! If you're a Spring developer, Barcelona is the place to be to join this fantastic conference. It was great to catch up with old friends, meet new ones, and connect with the developer community.

Workshop

My Spring I/O experience started with a full-day workshop on Securing the Supply Chain for Your Spring Boot Applications. It was a very productive day, and I really enjoyed the many conversations I had with the attendees about security topics and what we can do as Java developers.

Dapr and Spring Boot

On the first conference day, I got back on stage with my friend Mauricio Salatino. Recently, we've been looking into how to bring Dapr closer to the Spring Boot experience to tackle the complexity of distributed systems. In our session, we shared what we've accomplished so far and previewed some of our planned future work.

The source code for the distributed system we used in the presentation is in this GitHub repository. You can also check out the slides.

GitHub - salaboy/example-voting-app: Example distributed app composed of multiple containers for Docker, Compose, Swarm, and Kubernetes

Dapr currently provides a Java SDK to integrate with the Dapr control plane in a convenient way. However, it's not very flexible at the moment when it comes to configuration, especially in the context of a cloud native application using Spring Boot. One area we're investigating is how to solve that problem. You can check this draft pull request where I've been experimenting with a few ideas.

In addition to the Java SDK, Spring Boot support can also be improved and integrated with common Spring APIs for data and messaging integrations. I contributed some of that work in this pull request, defining auto-configuration for the new Dapr APIs that Mauricio implemented to make Dapr available via Spring Data and Spring Messaging.

Concerto for Java and AI

On the second conference day, I went back on stage with a very ambitious goal: coding live AI-infused applications and composing music live. I did it. And it was super fun! The recording will be published soon on YouTube. The slides are available here.

Going beyond the hype, I explored some use cases for Generative AI and showed how to enhance an existing Java application. Using Spring AI, I covered the following scenarios: classification, semantic search, question answering, structured data extraction, and speech transcription.

Exploring the path to production for LLM-powered applications, I covered some of the OWASP Top 10 LLM Security Risks, prompt design strategies, and how to handle observability and resilience for large language models.

I'm working on a series of articles to dive deeper into each of those topics, so I won't say more here. In the meantime, you can check out the application I used for my presentation. It's a music composer assistant. At the end of the presentation, I let the audience pick a movie scene, used the application to plan a composition strategy, and improvised and recorded the soundtrack live. So fun!

GitHub - ThomasVitale/concerto-for-java-and-ai

Coffee + Software with Josh Long

To close a wonderful conference, Josh Long invited me to his show "Coffee + Software," which was streamed live from the event venue. We talked about cloud native development with Spring Boot, supply chain security, and LLM-powered applications with Spring AI.

CNCF Aarhus

In April, I was again invited to the CNCF meetup in Aarhus (Denmark). I met with the local cloud native community and shared the latest and greatest about building cloud native and Kubernetes native Java applications using Spring Boot.

You can find all the examples in this GitHub repository and the slides here.

GitHub - ThomasVitale/cloud-native-aarhus-april-2024

Java is an excellent choice for cloud native development. I also got to show some of the latest features added to the language. For example, did you know this is a complete and valid Java 23 program (with preview features enabled)?

void main(String[] args) {
  var book = new Book(args[0], args[1]);
  println("Great book: " + book.title());
}

record Book(String title, String author){}

And did you know that you can run that Java program from a Main.java file like this?

$ java --enable-preview Main.java "The Hobbit" "J.R.R. Tolkien"
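
For comparison, here is a rough sketch of how the same program might look without preview features, using an explicit class and a static main method. The names ClassicMain and describe are made up for this illustration; records require Java 16 or later.

```java
public class ClassicMain {

    // Same record as in the compact program, nested here to keep a single file.
    record Book(String title, String author) {}

    // Builds the same message that the compact program prints.
    static String describe(Book book) {
        return "Great book: " + book.title();
    }

    public static void main(String[] args) {
        var book = new Book(args[0], args[1]);
        System.out.println(describe(book));
    }
}
```

Saved as ClassicMain.java, it can be launched the same way, with no extra flags: java ClassicMain.java "The Hobbit" "J.R.R. Tolkien" prints "Great book: The Hobbit".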

Alexa's Input (AI)

Last month, I also had the pleasure of being a guest on Alexa's Input podcast. Many thanks to Alexa Griffith for inviting me. I enjoyed our conversation about generative AI, exploring how things will change for cloud native platforms that need to support LLM workloads, and how to offer a good experience to developers who start building AI-infused applications.

‎Alexa’s Input (AI): Generative AI and Cloud-Native with Thomas Vitale on Apple Podcasts
‎Show Alexa’s Input (AI), Ep Generative AI and Cloud-Native with Thomas Vitale - 19 May 2024

CNCF WG App Development

I want to end this Cloud Native Diary by sharing some exciting news: the CNCF has officially approved a new working group focused on application development, which is part of the TAG App Delivery (Technical Advisory Group). I've been nominated and elected as one of the new working group's Chairs, along with Mauricio Salatino (Diagrid) and Daniel Oh (Red Hat). More news will come soon, so stay tuned! In the meantime, you can join us on the CNCF Slack in the #wg-app-development channel.

Cover image generated with DALL-E 3.