Connector for Tanzu Greenplum and Tanzu GemFire 4.0

Using the Connector

Last Updated February 14, 2025

The Connector for VMware Greenplum and VMware GemFire is available as a separate download for VMware Greenplum 5.x and 6.x from Broadcom Support Portal. Before you can use the connector, you must download the connector JAR file and copy it to each VMware GemFire host.

To use the connector, you specify configuration details with gfsh commands or within a cache.xml file. Do not mix the use of gfsh for configuration with the use of a cache.xml file.

To do an explicit mapping of fields, or to map only a subset of the fields, specify all configuration in a cache.xml file.

Downloading the Connector JAR File and Copying to GemFire Hosts

The Connector for Greenplum and GemFire is available as a separate download for Greenplum Database 5.x and 6.x from Broadcom Support Portal. The connector download package is a .tar.gz file that contains the connector and javadoc JAR files as well as a license file.

Perform these steps to download and unpack the connector package:

  1. Navigate to the VMware Greenplum product on Broadcom Support Portal and select Connector for Greenplum and GemFire under the desired Greenplum release.

    The connector download file name format is gemfire-greenplum-<version>.tar.gz. For example:

    gemfire-greenplum-4.0.1.tar.gz
    

    For more information about download prerequisites, troubleshooting, and instructions, see Download Broadcom products and software.

  2. Make note of the directory to which the file was downloaded.

  3. Follow the instructions in Verifying the VMware Greenplum Software Download in the VMware Greenplum documentation to verify the integrity of the Connector for Greenplum and GemFire software.

  4. Unpack the .tar.gz file. For example:

    $ tar xzvf gemfire-greenplum-4.0.1.tar.gz
    

    Unpacking the file creates a directory named gemfire-greenplum-<version> in the current working directory. The directory contents include the connector and javadoc JAR files as well as a license file:

    gemfire-greenplum-4.0.1-javadoc.jar
    gemfire-greenplum-4.0.1.jar
    open_source_license_Connector_for_VMware_Greenplum_and_VMware_GemFire_4.0.1_GA.txt
    
  5. For each host in the GemFire cluster, copy the connector JAR file to the path_to_product/extensions/ directory on the GemFire host. This step ensures that the connector JAR file is loaded on GemFire startup. Refer to Installing GemFire Extensions in the GemFire documentation for further information.
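
    For example, here is a minimal sketch of this copy step, assuming two hypothetical host names (gemfire-host1 and gemfire-host2) and a GemFire installation under /opt/gemfire. Substitute your own host names and the actual path_to_product location:

    $ scp gemfire-greenplum-4.0.1.jar gemfire-host1:/opt/gemfire/extensions/
    $ scp gemfire-greenplum-4.0.1.jar gemfire-host2:/opt/gemfire/extensions/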

Using gfsh Commands to Specify Configuration

gfsh may be used to configure all aspects of the transfer and the mapping, as follows:

  • If domain objects are not on the classpath, configure PDX serialization with the GemFire configure pdx command after starting locators, but before starting servers. For example:

    gfsh>configure pdx --read-serialized=true \
      --auto-serializable-classes=io.pivotal.gemfire.demo.entity.*
    
  • After starting servers, use the GemFire create jndi-binding command to specify all aspects of the data source. For example,

    gfsh>create jndi-binding --name=datasource --type=SIMPLE \
      --jdbc-driver-class="org.postgresql.Driver" \
      --username="g2c_user" --password="changeme" \
      --connection-url="jdbc:postgresql://localhost:5432/gemfire_db"
    
  • After creating regions, set up the gpfdist protocol by using configure gpfdist-protocol. For example,

    gfsh>configure gpfdist-protocol --port=8000
    
  • Specify the mapping of the Greenplum Database table to the GemFire region with the create gpdb-mapping command. For example,

    gfsh>create gpdb-mapping --region=/Child --data-source=datasource \
      --pdx-name="io.pivotal.gemfire.demo.entity.Child" --table=child --id=id,parent_id
    

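For reference, here is a minimal sketch of the kind of domain class the mapping example above assumes. The class name is taken from the --pdx-name option, the key field names are inferred from the --id=id,parent_id option, and everything else is illustrative; the actual demo class may differ:

    package io.pivotal.gemfire.demo.entity;

    // Hypothetical domain class for the gpdb-mapping example above. With an
    // implicit mapping, the field names are assumed to mirror the child
    // table's column names.
    public class Child {
        private int id;         // key column: id
        private int parent_id;  // key column: parent_id
        private String name;    // illustrative non-key field

        public Child() {}       // no-arg constructor for deserialization

        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public int getParentId() { return parent_id; }
        public void setParentId(int parentId) { this.parent_id = parentId; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }
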
Specifying Configuration with a cache.xml File

To provide configuration details within a cache.xml file, specify the correct xsi:schemaLocation attribute in that file.

For the v4.0.x connector, use

http://schema.pivotal.io/gemfire/gpdb/gpdb-4.0.xsd
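
For example, the opening cache element might pair each namespace with its schema location as follows. This is a minimal sketch: the GemFire cache namespace and the gpdb namespace URI are assumptions inferred from the schema URL above, and the mapping elements themselves are defined by gpdb-4.0.xsd:

    <cache xmlns="http://geode.apache.org/schema/cache"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:gpdb="http://schema.pivotal.io/gemfire/gpdb"
           xsi:schemaLocation="http://geode.apache.org/schema/cache
               http://geode.apache.org/schema/cache/cache-1.0.xsd
               http://schema.pivotal.io/gemfire/gpdb
               http://schema.pivotal.io/gemfire/gpdb/gpdb-4.0.xsd"
           version="1.0">
      <!-- region definitions and gpdb mapping elements go here -->
    </cache>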

Connector Requirements and Caveats

  • Export is supported from partitioned GemFire regions only. Data cannot be exported from replicated regions. Data can be imported to replicated regions.

  • The number of Greenplum Database segments must be greater than or equal to the number of GemFire servers. If there is a high ratio of Greenplum Database segments to GemFire servers, the Greenplum configuration parameter gp_external_max_segs may be used to limit Greenplum Database concurrency. See gp_external_max_segs for details on this parameter. An approach to finding the best setting begins with identifying a representative import operation:

    • Measure the performance of the representative import operation with the default setting.
    • Measure again with gp_external_max_segs set to half the total number of Greenplum Database segments. If there is no gain in performance, then the parameter does not need to be adjusted.
    • Continue halving the value of gp_external_max_segs at each iteration until there is no further performance improvement, or until the value equals the number of GemFire servers. A sketch of setting this parameter follows this list.
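
    For example, on a cluster with 64 Greenplum Database segments and 8 GemFire servers, the halving sequence to test would be 32, then 16, then 8. Here is a minimal sketch of setting the parameter, assuming the standard gpconfig utility on the Greenplum master host:

    $ gpconfig -c gp_external_max_segs -v 32    # set the value cluster-wide
    $ gpstop -u                                 # reload configuration without a restart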

Upgrading Java Applications from Version 2.4 to Version 3.x

API changes introduced in version 3.0.0, which are also present in this connector version, require code revisions in all applications that use the import or export functionality.

For this sample version 2.4 export operation, an upsert operation was implied:

// Version 2.4 API
long numberExported = GpdbService.createOperation(region).exportRegion();

Here is the equivalent version 3.x code, which makes the upsert explicit:

// Version 3.x API
ExportConfiguration exportConfig = ExportConfiguration.builder(region)
   .setType(ExportType.UPSERT)
   .build();
ExportResult result = GpdbService.exportRegion(exportConfig);
int numberExported = result.getExportedCount();

For this sample version 2.4 import operation:

// Version 2.4 API
long numberImported = GpdbService.createOperation(region).importRegion();

Here is the equivalent version 3.x code to implement the import operation:

// Version 3.x API
ImportConfiguration importConfig = ImportConfiguration.builder(region)
   .build();
ImportResult result = GpdbService.importRegion(importConfig);
int numberImported = result.getImportedCount();

Note that the counts in the new result objects are of type int instead of type long, for consistency with the connector's internal use of JDBC, which reports row counts as int values.