The project uses the following technologies:
- Java 11
- Spring Boot 2.4.1
- Spring Cloud
- Spring Cloud Stream
- Apache Kafka & Apache Zookeeper
- Stanford Core NLP 4.2.0
- MongoDB
- GitHub Actions
- GitHub Packages
The steps defined in the CI workflow are the following (a command-level sketch follows this list):
- Build: compiles all the microservices. This step runs in every pipeline.
- Test: runs the tests in the project.
  - CI on every push: this pipeline runs only unit tests in this step.
  - CI on PRs and the main branch: it runs unit and integration tests.
- Publish artifacts: publishes the artifacts to GitHub Packages.
- Publish docker: builds the Docker images and publishes them to Docker Hub repositories.
- Publish chart: builds a Helm chart and publishes it to gh-pages.
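Roughly, these steps map to commands like the following (a minimal sketch based on the build commands elsewhere in this README; the exact workflow definition may differ, and the `publish` Gradle task in particular is an assumption):

```sh
./gradlew build -PskipTests        # Build: compile all the microservices
./gradlew test                     # Test: unit (and, on PRs/main, integration) tests
./gradlew publish                  # Publish artifacts to GitHub Packages (assumed task name)
(cd docker && ./build-image.sh)    # Publish docker: build and tag the images
PACKAGE=true ./build-all.sh        # Publish chart: package the Helm chart
```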
The workflow is also integrated with GitHub, so these checks must pass on every pull request.
Although this coding test does not implement any security, in a production environment these APIs should have a security layer using JWT tokens or the OAuth2 standard. I would also suggest placing a gateway in front of these services, so the APIs are only exposed through it.
Both APIs are documented with Swagger following the OpenAPI 3 Specification.
- NLP-Processor Swagger: http://localhost:8081/swagger
- Patent-Manager Swagger: http://localhost:8082/swagger
Both APIs expose the basic Actuator endpoints to check whether the services are running (a curl example follows this list):
- NLP-Processor health endpoint: http://localhost:8081/actuator/health
- Patent-Manager health endpoint: http://localhost:8082/actuator/health
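For example, once the services are up you can hit these endpoints with curl; Spring Boot's Actuator reports a small JSON status document:

```sh
curl http://localhost:8081/actuator/health   # NLP-Processor
curl http://localhost:8082/actuator/health   # Patent-Manager
# Expected response when healthy: {"status":"UP"}
```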
All asynchronous communication between microservices is handled by Apache Kafka and ZooKeeper instead of HTTP, since it is more reliable and gives us fault tolerance.
All synchronous communication between microservices is handled by calling the REST APIs over HTTP.
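To inspect the asynchronous traffic you can use Kafdrop (see the exposed ports below) or the Kafka tooling inside the broker container (a sketch; the service name `kafka` and the exact tool location depend on the docker-compose setup and are assumptions here):

```sh
# List the topics the microservices communicate on
docker-compose exec kafka kafka-topics.sh --list --bootstrap-server localhost:9092
```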
In order to build all the components, from the root folder you have to do one of the following (see the build loop sketch after this list):
- Build the Helm chart locally:
  Note: you need to have the kompose and helm CLIs installed.
  Run the script `build-all.sh` in the root folder with the following command:
  `PACKAGE=true BUILD_IMAGES=true ./build-all.sh`
- Compile all microservices and build the Docker images:
  Go to the nlp-processor and patent-manager folders and run the following command in both:
  `./gradlew build && cd docker && ./build-image.sh`
- Compile all microservices:
  Go to the nlp-processor and patent-manager folders and run the following command in both:
  `./gradlew build`
- Compile all microservices, skipping tests:
  Go to the nlp-processor and patent-manager folders and run the following command in both:
  `./gradlew build -PskipTests`
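As a convenience, the per-module options above can be run from the root folder with a small shell loop (a sketch reusing the exact commands listed; adjust the Gradle arguments to the variant you want):

```sh
# Build both microservices from the root folder
for svc in nlp-processor patent-manager; do
  (cd "$svc" && ./gradlew build)   # add -PskipTests to skip the tests
done
```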
From the root folder, run the following command to start the Docker containers with the Docker Hub images:
REGISTRY=rogomdi/ docker-compose up -d
Or if you have built them on your computer:
docker-compose up -d
When you run Docker Compose, these are the ports exposed on your machine (a quick verification sketch follows this list):
- 2181 for Zookeeper
- 27017 for MongoDB
- 9000 for Kafdrop (Kafka UI)
- 9092 for Kafka
- 8081 for NLP-Processor
- 8082 for Patent-Manager
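To quickly verify that everything came up:

```sh
docker-compose ps                            # all containers should report "Up"
curl http://localhost:8082/actuator/health   # {"status":"UP"}
```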
Note: you need to have the kompose and helm CLIs installed.
From the root folder, run the following command to deploy it, building the images locally:
PACKAGE=true BUILD_IMAGES=true ./build-all.sh
Or you can run it with the images uploaded to Docker Hub by the CD pipeline:
REGISTRY=rogomdi/ ./build-all.sh
Install the chart by running: `helm install basf-coding-challenge basf-test-1.0.0-local.tgz`
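Once installed, you can check that the release and its pods are running (the dev namespace matches the commands below):

```sh
helm list
kubectl get pods -n dev
```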
Since we have not configured an Ingress controller, you will need to forward the service ports to access the APIs and Kafdrop. To do that, run the following commands:
- Exposing kafdrop:
  `POD_NAME=$(kubectl get pods -n dev | grep kafdrop | awk '{print $1}')`
  `kubectl port-forward $POD_NAME 9000:9000 -n dev`
- Exposing patent-manager:
  `POD_NAME=$(kubectl get pods -n dev | grep patent-manager | awk '{print $1}')`
  `kubectl port-forward $POD_NAME 8082:8082 -n dev`
- Exposing nlp-processor:
  `POD_NAME=$(kubectl get pods -n dev | grep nlp-processor | awk '{print $1}')`
  `kubectl port-forward $POD_NAME 8081:8081 -n dev`
- Why use Kafka instead of RabbitMQ?
  Since Kafka is designed to deliver thousands of messages at a lower latency than RabbitMQ, it is the appropriate technology here.
- Why use a NoSQL database such as MongoDB?
  As mentioned in the statement, we need to store a lot of data, and the schema for a patent may change in the future. In this case, a NoSQL database is a good option.
- Why use GitHub Actions?
  I selected it because it is easy to configure and fully integrated with GitHub.
- Why have both a synchronous and an asynchronous process?
  In my opinion, the synchronous way is useful for debugging and for processing a few patents. Of course, if you want to process a huge amount of data, the asynchronous way is the best option.
- If I process a ZIP asynchronously, how do I know when the NLP process has finished?
  If you want to look up a patent, you can use the API to request it by its UUID or application, as illustrated after this list. Other, better approaches would be to notify through a WebSocket, or to publish to a Kafka topic once the process is finished and read messages from it.
- Why do you have two microservices in the same repository?
  Since this is a coding challenge, I think it is simpler to look into a single repository instead of cloning three different repositories. In production, each microservice should have its own repository and CI/CD pipelines.
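For illustration, the patent lookup mentioned above could look like the following (the endpoint path here is hypothetical; check the Patent-Manager Swagger documentation for the real one):

```sh
# Hypothetical path; see the Swagger UI at http://localhost:8082/swagger for the actual endpoint
curl http://localhost:8082/api/patents/123e4567-e89b-12d3-a456-426614174000
```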