This documentation uses <PXF_INSTALL_DIR> to refer to the VMware Tanzu Greenplum Platform Extension Framework (PXF) installation directory. Its value depends on how you installed PXF:
- If you installed PXF as part of Greenplum Database, its value is $GPHOME/pxf.
- If you installed the PXF rpm package, its value is /usr/local/pxf-gp<greenplum-major-version>, or the directory of your choosing (RHEL only).
<PXF_INSTALL_DIR> includes both the PXF executables and the PXF runtime configuration files and directories. In PXF 7.x, to enable the PXF C extension to identify the PXF runtime configuration directory, the Greenplum Database configuration parameter pxf.pxf_base is set to $PXF_BASE during PXF initialization by the pxf [cluster] register command. The default $PXF_BASE is <PXF_INSTALL_DIR>. The default PXF cluster uses a nested cluster runtime configuration directory, $PXF_BASE/clusters/default.
If you want to store your configuration and runtime files in a different location, see Relocating $PXF_BASE.
This documentation uses <PXF_INSTALL_DIR> to reference the PXF installation directory and the $PXF_BASE environment variable to reference the PXF runtime configuration directory. PXF uses the variable internally. You need to set it in your shell environment only if you explicitly relocate the directory.
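The defaulting rule described above can be sketched as a small shell function. This is only an illustration: effective_pxf_base is a hypothetical helper that PXF does not ship, and the installation directory is passed in as an argument because <PXF_INSTALL_DIR> is a documentation placeholder, not a variable that PXF defines. The path /usr/local/pxf-gp6 in the usage line is an example value.

```shell
# Hypothetical helper, not part of PXF.
# Argument 1: the PXF installation directory (the <PXF_INSTALL_DIR> placeholder).
effective_pxf_base() {
  local install_dir="$1"
  # Use $PXF_BASE when it is set and non-empty; otherwise fall back to the
  # installation directory, which is the documented default location.
  printf '%s\n' "${PXF_BASE:-$install_dir}"
}
```

For example, `effective_pxf_base /usr/local/pxf-gp6` prints the installation directory unless PXF_BASE is set in the calling shell, in which case it prints the relocated directory.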
PXF Installation Directories
The following PXF files and directories are installed to <PXF_INSTALL_DIR> when you install the PXF 7.x rpm package:
Directory | Description |
---|---|
application/ | The PXF Server application JAR file. |
bin/ | The PXF command line executable directory. |
commit.sha | The commit identifier for this PXF release. |
gpextable/ | The PXF extension files. PXF copies the pxf.control file from this directory to the Greenplum installation ($GPHOME) on a single host when you run the pxf register command, or on all hosts in the cluster when you run the pxf [cluster] register command from the Greenplum coordinator host. |
share/ | The directory for shared PXF files that you may require depending on the external data stores that you access. share/ initially includes only the PXF HBase JAR file. |
templates/ | The PXF directory for server configuration file templates. |
version | The PXF version. |
PXF Configuration Directories
After you prepare the PXF cluster by running the pxf [cluster] prepare command, the following directories are installed into $PXF_BASE for the PXF deployment:
Directory | Description |
---|---|
clusters/ | The configuration directory for PXF clusters for the PXF deployment; each subdirectory contains the configuration directories for a PXF cluster, and the name of the subdirectory identifies the name of the cluster. The default cluster is named default. The Greenplum Database administrator may configure other clusters. |
The pxf [cluster] prepare command also installs the following PXF directories into $PXF_BASE/clusters/default for the default PXF cluster:
Directory | Description |
---|---|
conf/ | The location of user-customizable PXF cluster configuration files. This directory contains the cluster.txt and hosts.txt files. |
groups/ | The configuration directory for PXF service groups; each subdirectory contains the configurations for a service group, and the name of the subdirectory identifies the name of the service group. The default service group is named default. The Greenplum Database administrator may configure other service groups. |
keytabs/ | The location of the PXF Service Kerberos principal keytab file. The keytabs/ directory and contained files are readable only by the Greenplum Database installation user, typically gpadmin. |
lib/ | The location of user-added runtime dependencies. The native/ subdirectory is the default PXF runtime directory for native libraries. |
servers/ | The configuration directory for PXF servers; each subdirectory contains a server definition, and the name of the subdirectory identifies the name of the server. Servers are shared across all PXF service groups in a PXF cluster. The default server is named default. The Greenplum Database administrator may configure other servers. |
The pxf [cluster] prepare command also installs the following PXF directories into $PXF_BASE/clusters/default/groups/default for the default PXF service group in the default PXF cluster:
Directory | Description |
---|---|
conf/ | The location of user-customizable PXF service group configuration files for runtime and logging configuration settings. This directory contains the pxf-application.properties, pxf-env.sh, pxf-log4j2.xml, and pxf-profiles.xml files. |
logs/ | The runtime log file directory. The logs/ directory and log files are readable only by the Greenplum Database installation user, typically gpadmin. |
run/ | The PXF run directory. After starting PXF, this directory contains a PXF process id file, pxf-app.pid. run/ and contained files and directories are readable only by the Greenplum Database installation user, typically gpadmin. |
Refer to Configuring PXF and Starting PXF for detailed information about the PXF configuration and startup commands and procedures.
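To make the nesting of the tables above easier to see, the following sketch prints the directories described for the default cluster and the default service group under a given base directory. The function name pxf_default_layout is hypothetical and not a PXF command; the directory names are taken only from the tables in this topic.

```shell
# Hypothetical helper, not part of PXF: list the directories that this topic
# describes for the default cluster and its default service group.
pxf_default_layout() {
  local base="$1"
  local cluster="$base/clusters/default"
  local group="$cluster/groups/default"
  local d
  # Directories installed for the default PXF cluster.
  for d in conf groups keytabs lib servers; do
    printf '%s\n' "$cluster/$d"
  done
  # Directories installed for the default service group in that cluster.
  for d in conf logs run; do
    printf '%s\n' "$group/$d"
  done
}
```

One possible use, assuming PXF_BASE is set in your shell, is to pipe the output into a loop that tests each path with `[ -d ... ]` to spot directories missing after a prepare or sync.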
Relocating $PXF_BASE
If you require that $PXF_BASE reside in a directory distinct from <PXF_INSTALL_DIR>, you can change it from the default location to a location of your choosing after you install PXF 7.x.
PXF provides the pxf [cluster] prepare command to prepare a new $PXF_BASE location. The command copies the runtime and configuration directories identified above to the file system location that you specify in the PXF_BASE environment variable.
For example, to relocate $PXF_BASE to the /path/to/dir directory on all Greenplum hosts and PXF hosts, run the following commands:
1. Set up $PXF_BASE in the /path/to/dir directory on the Greenplum coordinator host with the prepare command:

    gpadmin@coordinator$ PXF_BASE=/path/to/dir pxf cluster prepare

2. Sync the /path/to/dir directory to the Greenplum standby coordinator host, segment hosts, and any PXF hosts with the sync command:

    gpadmin@coordinator$ pxf cluster sync

3. Register the new $PXF_BASE location with Greenplum Database:

    gpadmin@coordinator$ PXF_BASE=/path/to/dir pxf cluster register

4. Start the PXF cluster with the new $PXF_BASE location. When your $PXF_BASE is different from <PXF_INSTALL_DIR>, inform PXF by setting the PXF_BASE environment variable when you run a pxf command:

    gpadmin@coordinator$ PXF_BASE=/path/to/dir pxf cluster start

5. Set the environment variable in the .bashrc shell initialization script for the PXF installation owner (typically the gpadmin user) as follows:

    export PXF_BASE=/path/to/dir
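The prepare, sync, register, and start steps above could be wrapped in a script along the following lines. This is a sketch, not a supported tool: run, relocate_pxf_base, and the DRY_RUN variable are hypothetical names introduced here, and the only pxf invocations used are the ones shown in the steps. By default the sketch prints each command instead of running it, so you can review the sequence first.

```shell
# Hypothetical wrapper, not part of PXF.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Argument 1: the new $PXF_BASE directory, for example /path/to/dir.
relocate_pxf_base() {
  local new_base="$1"
  if [ -z "$new_base" ]; then
    echo "usage: relocate_pxf_base <dir>" >&2
    return 1
  fi
  # Same order as the documented steps; stop at the first failure.
  run env PXF_BASE="$new_base" pxf cluster prepare &&
  run pxf cluster sync &&
  run env PXF_BASE="$new_base" pxf cluster register &&
  run env PXF_BASE="$new_base" pxf cluster start
}
```

Running `relocate_pxf_base /path/to/dir` as the gpadmin user on the coordinator host prints the four commands; setting DRY_RUN=0 would execute them. The sketch deliberately leaves the .bashrc change to you, since editing a shell initialization file automatically is easy to get wrong.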