ascheman/maven-gpt

Maven GPT Web Application

Mission statement

This project (Maven GPT) aims to answer common Apache Maven questions, in particular for developers. It currently provides a web service that accepts HTTP GET requests at http://localhost:8080/ask.

The next steps will be to extend it with a suitable UI and host it on the public Internet. The author hopes to deliver a valuable service to the Maven community. Additionally, a small group of people will try to gain a better understanding of common problems with Maven (or at least its documentation) and report these findings back to the Maven developer community.

Background

Maven describes itself as a software project management and comprehension tool, provided by the Apache Software Foundation (ASF). In practice, Maven is mostly used to build and test software written in Java (or in other languages from the JVM universe).

There are many sources of information about Maven, such as:

  • The Maven Project Site

  • Mailing Lists

  • An ASF hosted Confluence and Jira

  • Misc. source code repositories (hosted by the ASF, GitHub and others)

  • Countless blog articles, conference talks, etc.

However, even for experienced Maven users or developers, it is sometimes hard to answer questions or give background information (design decisions, current requirements, good practices, etc.). Answers and discussions are also sometimes very opinionated: what seems to be a great approach in one context could be an antipattern in another.

Solution Outline

Maven GPT (currently) uses a simple AI model to generate a response to the question. The AI model uses an off-the-shelf GPT (like OpenAI ChatGPT) and additional information, e.g., content from the ASF Confluence.

TechStack

Context View
@startuml
skinparam handwritten true

!define CLOUDOGUURL https://raw.githubusercontent.com/cloudogu/plantuml-cloudogu-sprites/master
!includeurl CLOUDOGUURL/common.puml
!includeurl CLOUDOGUURL/dogus/cloudogu.puml
!includeurl CLOUDOGUURL/dogus/confluence.puml
!includeurl CLOUDOGUURL/tools/elastic.puml

actor "Maven User/Developer" as user #beige
interface "LLM" as llm

node "localhost" #lightgreen {
TOOL_ELASTIC(es, "Vector\nDatabase")
llm -[hidden]- es
control "Asynchronous\nDocument\nLoader" as dl #orange
control "AI Agent" as agent #orange
}

DOGU_CONFLUENCE(confluence,"ASF Confluence")

agent -[hidden]- dl

dl -right-> es : Upload\nvectorized\nknowledge
dl -down--> confluence : Analyze\nDocumentation

user -down-> agent : query
agent -right-> es : enrich query
agent -right-> llm : context based query

note right of llm #beige
OpenAI API
(or other Cloud
provided or local
hosted LLM)
end note

@enduml
  • The project uses Spring Boot to provide its service.

  • The underlying LangChain4J technology makes it possible to use various Large Language Models (LLMs).

    Note
    Currently, we only use OpenAI with an older model. The model and its parameters are configured in src/main/resources/application.properties.
  • A vector database (or vectorized retrieval store, i.e., Elasticsearch) runs in the background to enable Retrieval Augmented Generation (RAG).

Building and running

If you are familiar with Spring Boot, you may find other ways to play around with the project.

Prerequisites

Note
Currently, the project is only prepared to run locally (on your machine).
  • Install Java 21, e.g., via SDKman.

  • Get an OpenAI API Token and store it in the environment:

    • To obtain an OpenAI API token, you will need to create an account on the OpenAI website. Once you have created an account, navigate to the API page and click on the "Get API Key" button. You will then be prompted to enter your billing information and select a plan. After completing these steps, you will be provided with an API key that you can use to access the OpenAI API.

    • Store the key locally (for your convenience).

    • Provide it for subsequent steps by either

      • Adding it to application.properties (not recommended), or

      • Creating a particular Spring profile, or

      • Setting an environment variable OPENAI_API_TOKEN (cf. DirEnv to store it in the long run).

  • Download (update) input sources

    • ASF Confluence Maven

      mkdir -p download/cwiki
      cd download/cwiki
      wget -P display/MAVEN -m --no-parent https://cwiki.apache.org/confluence/display/MAVEN/Index
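Before moving on, it can help to confirm that the token from the prerequisites is actually visible in the current shell. The following is a minimal sketch, assuming the environment variable option (OPENAI_API_TOKEN) was chosen; the function name require_openai_token is hypothetical and not part of the project.

```shell
# Hypothetical helper (not part of the project): fail early if the token is missing.
require_openai_token() {
  if [ -z "${OPENAI_API_TOKEN:-}" ]; then
    echo "OPENAI_API_TOKEN is not set; export it before starting the application." >&2
    return 1
  fi
  # Never print the token itself; only report its length.
  echo "OPENAI_API_TOKEN is set (${#OPENAI_API_TOKEN} characters)."
}

# Possible usage (the key value is a placeholder):
#   export OPENAI_API_TOKEN="<your-key>"
#   require_openai_token && ./mvnw spring-boot:run
```

Reporting only the length avoids leaking the key into terminal scrollback or logs.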

Run the Vector Database

Run the backing services (Elasticsearch and Kibana).

docker compose up -d
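Elasticsearch usually needs some time after the containers start before it accepts requests. The sketch below polls the HTTP port until it answers. It assumes the unauthenticated endpoint http://localhost:9200 used elsewhere in this README; the function name wait_for_es, the ES_URL override, and the retry values are hypothetical.

```shell
# Hypothetical helper: poll Elasticsearch until it answers or the attempts run out.
# ES_URL is an assumed override; the default is the port used in this README.
wait_for_es() {
  url="${ES_URL:-http://localhost:9200}"
  attempts="${1:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors, -s silences progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "Elasticsearch is reachable at $url"
      return 0
    fi
    sleep 2
    i=$((i + 1))
  done
  echo "Elasticsearch did not respond at $url after $attempts attempts" >&2
  return 1
}

# Possible usage:
#   docker compose up -d && wait_for_es
```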

Load Data into Vector Database

Load data into the vector store (Elasticsearch). This only needs to be performed once after each download/update.

Note

Delete the content of the ES store before reloading the data.

curl -X DELETE http://localhost:9200/maven-gpt

Then run the document loader class.

./mvnw spring-boot:run -Ploaddata

Once the data is loaded, you should see it in Kibana in the respective index (maven-gpt).
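As an alternative to Kibana, the load can also be checked from the command line. This sketch queries the Elasticsearch _count API for the index; it assumes the local, unauthenticated setup from this README, and the function name es_doc_count is hypothetical.

```shell
# Hypothetical helper: ask Elasticsearch how many documents the index holds.
# Index name and port are the ones used in this README.
es_doc_count() {
  index="${1:-maven-gpt}"
  # -f fails on HTTP errors (e.g., a missing index), -sS hides progress but shows errors.
  curl -fsS "http://localhost:9200/${index}/_count"
}

# Possible usage:
#   es_doc_count            # default index maven-gpt
#   es_doc_count other-idx  # any other index name
```

The response is a small JSON document with a count field; a count of zero (or an error for a missing index) suggests the loader has not run successfully yet.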

Run GPT Engine (AI Agent)

Start the application.

./mvnw spring-boot:run

Then access the endpoint

curl "http://localhost:8080/ask?message=Which%20plugins%20handle%20the%20build%20lifecycle?"

This should respond with something like

{"result":"The plugins that handle the build lifecycle in Apache Maven are categorized into different groups based on their functionalities. Group 1 consists of core lifecycle plugins such as maven-clean-plugin, maven-compiler-plugin, maven-deploy-plugin, maven-help-plugin, maven-install-plugin, maven-gpg-plugin, maven-resources-plugin, maven-source-plugin, and maven-toolchains-plugin. Group 2 includes site-
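Encoding spaces by hand quickly becomes tedious for longer questions. A possible shortcut is to let curl do the percent-encoding; the wrapper below is a sketch, assuming the /ask endpoint and its message parameter as described above. The function name ask_maven_gpt is hypothetical.

```shell
# Hypothetical wrapper around the /ask endpoint from this README.
ask_maven_gpt() {
  # -G sends the data as a query string on a GET request;
  # --data-urlencode percent-encodes the value after "message=".
  curl -sS -G --data-urlencode "message=$1" "http://localhost:8080/ask"
}

# Possible usage:
#   ask_maven_gpt "Which plugins handle the build lifecycle?"
```

Passing the question as a single quoted argument also keeps the shell from interpreting characters such as ? or & in the question text.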

Testing/Usage

IntelliJ HTTP Requests in src/test/http-requests/application.http show some manual testing and usage examples.

Administration

Configuration

TBD

Elasticsearch administration

IntelliJ HTTP Requests in src/test/http-requests/elasticsearch.http provide some useful RESTful access patterns for the underlying Elasticsearch engine.

Ideas

  • Load data from other sources, e.g., Mojohaus Plugins.

  • Generate and verify questions from Stack Overflow.

  • Add a feedback option to the UI (once it is created) for users of the service (collect the feedback in a DB and evaluate it frequently).
