Here you will learn how to configure the VMware Spring Cloud Data Flow for Kubernetes (SCDF for Kubernetes) installation resources before deploying to the Kubernetes cluster. Before configuring the installation resources, verify that you have followed the steps described in:
Configuration of Spring Cloud Data Flow for Kubernetes involves a number of features:
- Support for multiple deployment environments
- Application properties configuration
- Kubernetes resources configuration
- Support for air-gapped environments
Support for multiple deployment environments
The SCDF for Kubernetes installation process uses Kustomize, a tool for customizing Kubernetes resource configuration. For each resource, Kustomize takes a base configuration file and applies patch files that modify the base configuration to target different deployment environments. Kustomize is built into the kubectl CLI.
SCDF for Kubernetes provides Kustomize configuration files for a development environment and for a production environment. The development environment is intended for quick exploration of SCDF for Kubernetes. Its configuration provisions a RabbitMQ message broker and a PostgreSQL database.
Directory structure
The “Spring Cloud Data Flow for Kubernetes Legacy Installation” archive, downloaded from the Broadcom Support portal, contains the following directory structure:
bin
: Scripts for installing to the development environment, as well as a script for performing image relocationapps
: Kubernetes resource files and application configuration files to support deployment to multiple environmentsservices
: Kubernetes resource files for deploying RabbitMQ and PostgreSQL to the development environment
The apps
directory contains the following directory structure, based on Kustomize naming conventions.
.
├── data-flow
│ ├── images
│ ├── kustomize
│ │ ├── base
│ │ └── overlays
│ └── schemas
├── ingress
│ └── kustomize
│ ├── base
│ └── overlays
└── skipper
├── images
├── kustomize
│ ├── base
│ └── overlays
└── schemas
Each application included as part of SCDF for Kubernetes has a directory:
- The
data-flow
directory contains the Data Flow server. - The
skipper
directory contains the Skipper server. - The
ingress
directory contains configuration files for an Ingress resource.
Each of the two application directories (data-flow
and skipper
) contains the following directory structure:
images
: Container images (only present if you have downloaded and extracted the “SCDF for Kubernetes installation images” archive)kustomize
: Kubernetes and application configuration files for use with Kustomizeschemas
: Database schemas (you can install these manually if you do not want each server to install them on startup)
For each application, you can edit files in the kustomize
directory to configure the application and Kubernetes resources. The contents of the directory are shown here.
.
├── base
│ ├── deployment.yaml
│ ├── kustomization.yaml
│ ├── role-binding.yaml
│ ├── roles.yaml
│ ├── service-account.yaml
│ └── service.yaml
└── overlays
├── dev
│ ├── application.yaml
│ ├── application-database.yaml
│ ├── deployment-patch.yaml
│ ├── kustomization.yaml
│ └── service-patch.yaml
└── production
├── application.yaml
├── application-database.yaml
├── deployment-patch.yaml
├── kustomization.yaml
└── service-patch.yaml
The base directory |
---|
The base directory contains Kubernetes resource files with default configuration values. These files are then patched by files in the overlays directory. The kustomization.yaml file references the other YAML files in the directory. |
The overlays directory |
The overlays directory contains one subdirectory for each deployment environment: a dev subdirectory and a production subdirectory.
Within each subdirectory, a kustomization.yaml file references the kustomization.yaml file in the base directory, and adds more patches to modify values that are unique to the target deployment environment. |
You can set the target namespace to deploy, and add labels and name prefixes, by adding additional configuration fields to the kustomization.yaml
file. See the Kustomize documentation for information about the additional fields that are available.
Configuration steps
Within each environment-specific subdirectory in the overlays
directory, the application.yaml
and application-database.yaml
files is a Spring Boot configuration file that you can edit to configure the application (the Data Flow server or Skipper server). For example, you can edit application.yaml
and application-database.yaml
to configure spring.datasource
properties or spring.cloud.dataflow.features
properties.
Also within each environment-specific subdirectory, the deployment-patch.yaml
and service-patch.yaml
files are used to patch the base
Kubernetes resource files, substituting values that are appropriate for the specific environment. For example, you might configure memory or CPU allocations and ingress configuration using one set of values for your dev
environment and another for your production
environment.
For information about how to configure specific settings, see the following sections.
Configure database
You can customize database configuration settings in the application-database.yaml
files. These files are preconfigured to connect to a PostgreSQL database, which can be installed either by using the manifests provided in services/dev/postgresql
or by installing the bitnami/postgresql
Helm chart.
Configure settings for database connections
To use a different database, you must replace or remove the secret volume
and volumeMount
in the deployment-patch.yaml
file. If you want to replace the secret, VMware recommends that you provide database-password
as the path for the password key. This minimizes the changes needed in the application.yaml
or application-database.yaml
file.
You must modify database settings for both the skipper
application and the data-flow
application. The following example uses a secret named postgresql
with a postgresql-password
key. You can specify additional keys that are available in the secret, as long as they reference the paths that you specify in the application.yaml
or application-database.yaml
file.
containers:
- name: skipper
volumeMounts:
- name: database
mountPath: /workspace/runtime/secrets/database
readOnly: true
...
volumes:
- name: database
secret:
secretName: postgresql
items:
- key: postgresql-password
path: database-password
In the application-database.yaml
file, you must edit settings under spring.datasource
. Update the url
, username
, password
, and driverClassName
values to match your database settings, as shown in this example.
spring:
datasource:
url: jdbc:postgresql://database-host:5432/dataflow-db
username: postgres
password: ${database-password}
driverClassName: org.postgresql.Driver
testOnBorrow: true
validationQuery: "SELECT 1"
Initialize the database schema
To initialize your database schema, you can use the DDL scripts provided in apps/skipper/schemas
and apps/data-flow/schemas
. SCDF for Kubernetes provides DDL scripts for IBM DB2, MySQL, Oracle, PostgreSQL, and Microsoft SQL Server. If applying multiple files, ensure that you apply them in the correct order (the V1-
file first, then the V2-
file, then the V3-
file, and so on).
If you manually initialize the schema for the database, you can disable automatic database initialization for the Skipper and Data Flow server applications. Add the following settings to the application.yaml
files for the applications:
spring:
...
jpa:
hibernate:
ddlAuto: none
flyway:
enabled: false
Add a JDBC driver (optional)
The published application artifacts for SCDF for Kubernetes include JDBC drivers for MySQL (using the MariaDB driver), HSQLDB, PostgreSQL, and embedded H2. See Spring Cloud Data Flow Artifacts for details. If desired, you can use your own JDBC driver instead.
To add a JDBC driver, perform the following steps:
-
Download the “SCDF Pro Server jar file” release artifact from the Broadcom Support portal.
-
Download the server JAR files for Skipper and the Composed Task Runner:
$ wget https://repo1.maven.org/maven2/org/springframework/cloud/spring-cloud-skipper-server/2.11.5/spring-cloud-skipper-server-2.11.5.jar $ wget https://repo1.maven.org/maven2/org/springframework/cloud/spring-cloud-dataflow-composed-task-runner/2.11.5/spring-cloud-dataflow-composed-task-runner-2.11.5.jar
-
Within each application directory (
apps/data-flow
andapps/skipper
), create a directory for the new JDBC driver:$ mkdir -p BOOT-INF/lib
-
Copy your JDBC driver jar to the new directories.
-
Add the driver to the downloaded JAR files:
$ jar -u0f spring-cloud-skipper-server-2.11.5.jar BOOT-INF/lib/*.jar $ jar -u0f scdf-pro-server-1.6.4.jar BOOT-INF/lib/*.jar $ jar -u0f spring-cloud-dataflow-composed-task-runner-2.11.5.jar BOOT-INF/lib/*.jar
-
Create new container images using the updated server JAR files. The easiest way to build new container images is to use the pack CLI utility. Use a base image that is based on JDK 8.
-
First, create a
registry_prefix
environment variable containing the prefix that you want to use:$ export registry_prefix=registry.example.com/example
-
Then build the three container images.
-
To build the container image for the
skipper
app, run:$ pack build --env BP_JVM_VERSION=8 \ --path spring-cloud-skipper-server-2.11.5.jar \ --builder gcr.io/paketo-buildpacks/builder:base \ --publish ${registry_prefix}/spring-cloud-skipper-server:2.11.5-jdbc
-
To build the container image for the
data-flow
app, run:$ pack build --env BP_JVM_VERSION=8 \ --path scdf-pro-server-1.6.4.jar \ --builder gcr.io/paketo-buildpacks/builder:base \ --publish ${registry_prefix}/scdf-pro-server:1.6.4-jdbc
-
To build the container image for the
composed-task-runner
app, run:$ pack build --env BP_JVM_VERSION=8 \ --path spring-cloud-dataflow-composed-task-runner-2.11.5.jar \ --builder gcr.io/paketo-buildpacks/builder:base \ --publish ${registry_prefix}/spring-cloud-dataflow-composed-task-runner:2.11.5-jdbc
-
-
-
Update your Kustomize manifests with the locations of the new images.
For the
skipper
application, update thenewName
andnewTag
values inapps/skipper/kustomize/overlays/production/kustomization.yaml
. Update thedigest
field with the new image digest, or remove thedigest
field altogether.images: - name: springcloud/spring-cloud-skipper-server # Used for Kustomize matching newName: registry.example.com/example/spring-cloud-skipper-server newTag: 2.11.5-jdbc configMapGenerator: - name: skipper files: - bootstrap.yaml - application.yaml - application-database.yaml bases: - ../../base patches: - deployment-patch.yaml - service-patch.yaml
For the
data-flow
application, update thenewName
andnewTag
values inapps/data-flow/kustomize/overlays/production/kustomization.yaml
. Update thedigest
field with the new image digest, or remove thedigest
field altogether.images: - name: springcloud/spring-cloud-dataflow-server # Used for Kustomize matching newName: registry.example.com/example/scdf-pro-server newTag: 1.6.4-jdbc configMapGenerator: - name: scdf-server files: - bootstrap.yaml - application.yaml - application-database.yaml bases: - ../../base patches: - deployment-patch.yaml - service-patch.yaml
For the
composed-task-runner
image, update theapps/data-flow/kustomize/overlays/production/application.yaml
file. Update thespring.cloud.dataflow.task.composedTaskRunnerUri
entry to match your new image. The value should be prefixed withdocker://
. You can provide either a tag or a digest that matches your new image, as shown in the following example:spring: cloud: config: enabled: false dataflow: server: uri: http://${SCDF_SERVER_SERVICE_HOST}:${SCDF_SERVER_SERVICE_PORT} features: schedules-enabled: true task: composedtaskrunner: uri: docker://registry.example.com/my-project/spring-cloud-dataflow-composed-task-runner:2.11.5-jdbc ...
If you add your own JDBC driver, you should manually initialize the database schema and disable automatic database initialization for the Skipper and Data Flow server applications. See Initializing the Database Schema.
Configure message broker
You can customize message broker configuration settings in the application.yaml
and the deployment-patch.yaml
files. These files are preconfigured to connect to a RabbitMQ broker, which can be installed in one of two ways:
- Using the manifests provided in
services/dev/rabbitmq
- Installing the
bitnami/rabbitmq
Helm chart
Use RabbitMQ as the message Broker
If you want to use RabbitMQ as the message broker, review the skipper
application settings provided in the Kustomize overlays for the dev
and production
environments. As mentioned earlier, the default configuration connects to a RabbitMQ broker that runs in the same namespace as SCDF for Kubernetes.
If your RabbitMQ broker runs in a different namespace, or if you use an external broker, you must edit the deployment-patch.yaml
file in dev
or production
overlays. In the volume
and container.volumeMounts
sections, remove references to the rabbitmq
secret.
The RabbitMQ broker connection settings are located in the application.yaml
file for the Skipper application. This file is included in the dev
and production
overlays. When deploying a Spring Cloud Stream application, SCDF for Kubernetes passes the value of the environmentVariables
field to the application.
The configuration must include the following environment variables:
SPRING_RABBITMQ_HOST
SPRING_RABBITMQ_PORT
SPRING_RABBITMQ_USERNAME
SPRING_RABBITMQ_PASSWORD
If the RabbitMQ cluster runs in the same namespace as SCDF for Kubernetes, you can reference the service environment variables provided by Kubernetes. If not, replace these references with the actual values.
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
environmentVariables: 'SPRING_CLOUD_CONFIG_ENABLED=false,SPRING_RABBITMQ_HOST=${RABBITMQ_SERVICE_HOST},SPRING_RABBITMQ_PORT=${RABBITMQ_SERVICE_PORT_AMQP},SPRING_RABBITMQ_USERNAME=user,SPRING_RABBITMQ_PASSWORD=${rabbitmq-password}'
Setting SPRING_CLOUD_CONFIG_ENABLED
to false
prevents Spring Cloud Stream applications from attempting to discover a Spring Cloud Config server instance.
Use Kafka as the message broker
If you want to use Kafka as the message broker, review the skipper
application settings provided in the Kustomize overlays for the dev
and production
environments. Edit the deployment-patch.yaml
file. In the volume
and container.volumeMounts
sections, remove references to the rabbitmq
secret.
The Kafka broker connection settings are located in the application.yaml
file for the Skipper application. This file is included in the dev
and production
overlays.
You can either provide your own Kafka cluster, or use the bitnami/kafka
Helm chart to install one. The following example assumes that you used the Helm chart. If you are using a different Kafka installation, adjust the environment variables specified in the application.yaml
file.
Remove or comment out the line with environmentVariables
for RabbitMQ settings and uncomment the line with the corresponding Kafka settings. The value of the environmentVariables
property is passed to the deployed Spring Cloud Stream apps.
The configuration must include the following environment variables:
SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS
(value: the host and port of the Kafka broker)SPRING_CLOUD_STREAM_KAFKA_BINDER_ZK_NODES
(value: the host and port for ZooKeeper)
If the Kafka cluster runs in the same namespace as SCDF for Kubernetes, you can reference the service environment variables provided by Kubernetes. If not, replace these references with the actual values.
spring:
cloud:
skipper:
server:
platform:
kubernetes:
accounts:
default:
environmentVariables: 'SPRING_CLOUD_CONFIG_ENABLED=false,SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS=${KAFKA_SERVICE_HOST}:${KAFKA_SERVICE_PORT},SPRING_CLOUD_STREAM_KAFKA_BINDER_ZK_NODES=${KAFKA_ZOOKEEPER_SERVICE_HOST}:${KAFKA_ZOOKEEPER_SERVICE_PORT}'
Setting SPRING_CLOUD_CONFIG_ENABLED
to false
prevents Spring Cloud Stream applications from attempting to discover a Spring Cloud Config server instance.
Configure the ingress resource
If your Kubernetes cluster supports Kubernetes Services of type LoadBalancer
, you may use that type of service to provision a load balancer using an Ingress Controller such as NGINX or Contour.
If your cluster has an ingress controller configured, you can use an Ingress resource manager to manage external access to the Data Flow server service.
Edit the ingress-patch.yaml
file in either the dev
overlay (apps/ingress/kustomize/overlays/dev/
) or the production
overlay (apps/ingress/kustomize/overlays/production/
). Under spec.rules
, update the host
entry, providing the DNS name required for your ingress controller configuration.
If your DNS name is data-flow.example.com
, add that DNS name as the host:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: scdf-ingress
spec:
rules:
- host: data-flow.example.com
http:
paths:
- backend:
serviceName: scdf-server
servicePort: 80
path: /
Configure security
The Spring Cloud Data Flow and Skipper servers use the Spring Security library to configure authentication and authorization with OAuth 2.0 and OpenID Connect. You can use this to integrate SCDF for Kubernetes into Single Sign-On (SSO) environments.
To configure the servers, edit the application.yaml
configuration file for each server. In the Spring Cloud Data Flow Reference Guide, the Security section describes how to configure the Data Flow server, and the Azure section describes the configuration to use with Azure Active Directory.
To configure the application.yaml
file for the Skipper server, follow the same steps, using the same documentation from the Spring Cloud Data Flow Reference Guide.
Configure container registry
The Data Flow server displays Spring Boot application metadata, which is stored as a label in the container image. To retrieve this metadata, you can configure the Data Flow server to access the container registry for the application images that you will deploy.
Place this configuration under the spring.cloud.dataflow.container.registry-configurations
section of the application.yaml
file. For examples of how to configure this section for Docker Hub, JFrog Artifactory, Amazon Elastic Container Registry, Azure Container Registry, and Harbor, see the open-source Spring Cloud Data Flow documentation.
Configure application ImagePullSecrets
When launching a Spring Cloud Task or Spring Cloud Stream application, the Kubernetes nodes must be able to pull the corresponding container image. If the image is stored in a private repository, you must provide authentication information for the nodes to use in pulling the image from a registry.
You can provide this authentication information by creating a secret in the namespace where you will install SCDF for Kubernetes. To create the secret and configure the SCDF for Kubernetes server applications to use it, follow these steps.
-
Set
username
andpassword
environment variables. These variables should contain your login credentials for the registry that stores the application images.$ registry=<your registry host> $ username=<your user name> $ password=<your password>
-
Run the
kubectl create secret
command, using the variables set in the previous step.$ kubectl create secret \ docker-registry scdf-apps-regcred \ --docker-server=${registry} \ --docker-username=${username} \ --docker-password=${password}
-
Add the name of the secret to the
application.yaml
file for the Skipper application. This will be used for Spring Cloud Stream applications.spring: cloud: skipper: server: platform: kubernetes: accounts: default: imagePullSecret: scdf-apps-regcred
If you want to use the Composed Task Runner (CTR), create a secret that contains the credentials for the appropriate registry (either the Tanzu Net registry, or if you relocated the CTR image, the registry to which you relocated the image).
-
Add the name of the secret to the
application.yaml
file for the Data Flow application. This will be used for Spring Cloud Task applications.spring: cloud: dataflow: task: platform: kubernetes: accounts: default: imagePullSecret: scdf-apps-regcred
Configure task limiting
SCDF for Kubernetes provides a configurable limit for the maximum number of concurrently-running tasks. This limit can prevent the saturation of IaaS or hardware resources. If the number of concurrently running tasks on a platform instance is greater than or equal to the limit, the next task launch request fails, and SCDF for Kubernetes returns an error message through the RESTful API, the shell, or the UI.
By default, the concurrent task limit is set to 20. You can configure the limit by setting the maximumConcurrentTasks
property in the account:
spring:
cloud:
dataflow:
task:
platform:
kubernetes:
accounts:
default:
namespace: default
maximumConcurrentTasks: 40
limits:
memory: 1024Mi
A successful task launch results in a running pod, which is expected to eventually either complete or fail. Because there is no convenient way to identify the pods that correspond to a task, SCDF for Kubernetes counts only pods that were launched by the Data Flow server. The task launcher provides the task ID as a label in the pod metadata, and SCDF for Kubernetes filters all running pods by this label.
Next: Configure monitoring for SCDF for Kubernetes
For information about configuring monitoring for SCDF for Kubernetes, see Configuring monitoring for SCDF for Kubernetes.
Content feedback and comments