GCP PubSub Ingester

Gravwell provides an ingester capable of fetching entries from Google Cloud Platform’s PubSub stream service. The ingester can process multiple PubSub streams within a single GCP project. Setting up a PubSub stream is outside the scope of this document. To configure the PubSub ingester for an existing stream, you will need:

  • The Google Project ID

  • A file containing GCP service account credentials (see the Creating a service account documentation)

  • The name of a PubSub topic

Once the stream is configured, each record in the PubSub stream topic will be stored as a single entry in Gravwell.
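Before pointing the ingester at a credentials file, it can help to confirm that the file at least looks like a service account key. The following is a minimal sketch, not part of Gravwell: the function name is hypothetical, and the field names it checks (type, project_id, private_key, client_email) reflect the usual layout of a Google service account key file rather than anything stated in this document.

```python
import json
from pathlib import Path

# Fields normally present in a Google service account key file (assumption:
# this list is illustrative and is not taken from the Gravwell documentation).
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_file(path):
    """Return a list of problems; an empty list means the file looks usable."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except OSError as err:
        return [f"cannot read {path}: {err}"]
    except ValueError as err:  # includes json.JSONDecodeError
        return [f"{path} is not valid JSON: {err}"]

    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if not data.get(key)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f'type is "{data["type"]}", expected "service_account"')
    return problems
```

A check like this only inspects the file's shape; it does not verify that the account has permission to read the topic.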

Installation

To install the Debian package, make sure the Gravwell Debian repository is configured as described in the quickstart. Then run the following command as root:

apt update && apt install gravwell-pubsub

To install the Redhat package, make sure the Gravwell Redhat repository is configured as described in the quickstart. Then run the following command as root:

yum install gravwell-pubsub

To install via the standalone shell installer, download the installer from the downloads page, then run the following command as root, replacing X.X.X with the appropriate version:

bash gravwell_pubsub_ingest_installer_X.X.X.sh

You may be prompted for additional configuration during the installation.

There is currently no Docker image for this ingester.

Basic Configuration

The PubSub ingester uses the unified global configuration block described in the ingester section. Like most other Gravwell ingesters, PubSub supports multiple upstream indexers, TLS, cleartext, and named pipe connections, a local cache, and local logging.

The configuration file is at /opt/gravwell/etc/pubsub_ingest.conf. The ingester will also read configuration snippets from its configuration overlay directory (/opt/gravwell/etc/pubsub_ingest.conf.d).

PubSub Examples

[PubSub "gravwell"]
	Topic-Name=mytopic	# the pubsub topic you want to ingest
	Tag-Name=gcp
	Parse-Time=false
	Assume-Local-Timezone=true

[PubSub "my_other_topic"]
	Topic-Name=foo # the pubsub topic you want to ingest
	Tag-Name=gcp
	Assume-Local-Timezone=false

Installation and configuration

First, download the installer from the Downloads page, then install the ingester:

root@gravserver ~# bash gravwell_pubsub_ingest_installer.sh

If the Gravwell services are present on the same machine, the installation script should automatically extract the Ingest-Auth parameter and configure the ingester with it. Next, open the /opt/gravwell/etc/pubsub_ingest.conf configuration file and set it up for your PubSub topic. Once you have modified the configuration as described below, start the service with the following command: systemctl start gravwell_pubsub_ingest.service

The example below shows a sample configuration which connects to an indexer on the local machine (note the Pipe-Backend-target setting) and feeds it from a single PubSub topic named “mytopic”, which is part of the “myproject-127400” GCP project.

[Global]
Ingest-Secret = IngestSecrets
Connection-Timeout = 0
Insecure-Skip-TLS-Verify = false
Pipe-Backend-target=/opt/gravwell/comms/pipe #a named pipe connection, this should be used when ingester is on the same machine as a backend
Log-Level=ERROR #options are OFF INFO WARN ERROR

# The GCP project ID to use
Project-ID="myproject-127400"
Google-Credentials-Path=/opt/gravwell/etc/google-compute-credentials.json

[PubSub "gravwell"]
	Topic-Name=mytopic	# the pubsub topic you want to ingest
	Tag-Name=gcp
	Parse-Time=false
	Assume-Local-Timezone=true

Note the following essential fields:

  • Project-ID - the Project ID string for a GCP project

  • Google-Credentials-Path - the path to a file containing GCP service account credentials in JSON format

  • Topic-Name - the name of a PubSub topic within the specified GCP project
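The three essential fields can be checked mechanically before the service is started. The sketch below is an illustration only: it uses a deliberately simple hand-written parser (not Gravwell's own configuration loader), compares key names case-insensitively as a convenience, and treats everything after a "#" as a comment, so values containing "#" are not handled. The function names are hypothetical.

```python
import re

# Matches section headers such as [Global] or [PubSub "gravwell"].
SECTION_RE = re.compile(r'^\[\s*([A-Za-z0-9_-]+)(?:\s+"([^"]*)")?\s*\]$')


def parse_config(text):
    """Parse config text into {(section, name): {lowercased key: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            current = sections.setdefault((match.group(1), match.group(2)), {})
            continue
        if current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip().lower()] = value.strip().strip('"')
    return sections


def missing_essentials(text):
    """Return a list of essential fields that are absent or empty."""
    sections = parse_config(text)
    problems = []

    global_cfg = sections.get(("Global", None), {})
    for key in ("Project-ID", "Google-Credentials-Path"):
        if not global_cfg.get(key.lower()):
            problems.append(f"[Global] is missing {key}")

    pubsubs = {name: cfg for (kind, name), cfg in sections.items() if kind == "PubSub"}
    if not pubsubs:
        problems.append('no [PubSub "..."] section found')
    for name, cfg in pubsubs.items():
        if not cfg.get("topic-name"):
            problems.append(f'[PubSub "{name}"] is missing Topic-Name')
    return problems
```

Calling missing_essentials() on the contents of /opt/gravwell/etc/pubsub_ingest.conf returns an empty list when all three fields are present. This complements, rather than replaces, running the ingester itself against the configuration.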

You can configure multiple PubSub sections to support multiple different PubSub topics within a single GCP project.
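When many topics need to be ingested, the PubSub sections can be generated rather than typed by hand. This is a small sketch under stated assumptions: the helper name is hypothetical, it emits only the Topic-Name and Tag-Name settings shown in the examples above, and it applies one tag to every topic.

```python
def render_pubsub_sections(topics, tag="gcp"):
    """Render one [PubSub "<name>"] section per entry in `topics`.

    `topics` maps a section name to the PubSub topic it should read.
    """
    blocks = []
    for section_name, topic in topics.items():
        blocks.append(
            f'[PubSub "{section_name}"]\n'
            f"\tTopic-Name={topic}\n"
            f"\tTag-Name={tag}\n"
        )
    # Separate sections with a blank line, matching the examples above.
    return "\n".join(blocks)
```

For example, render_pubsub_sections({"gravwell": "mytopic", "my_other_topic": "foo"}) produces two sections equivalent to the ones in the PubSub examples, minus the time-related settings. The resulting text could be appended to the main configuration file or saved as a snippet in the overlay directory.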

You can test the configuration by running /opt/gravwell/bin/gravwell_pubsub_ingester -v by hand; if it prints no errors, the configuration is probably acceptable.

The PubSub ingester does not provide the Ignore-Timestamps option found in many other ingesters. PubSub messages include an arrival timestamp; by default, the ingester will use that as the Gravwell timestamp. If Parse-Time=true is specified in the data consumer definition, the ingester will instead attempt to extract a timestamp from the message body.
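The timestamp choice described above can be pictured with a short sketch. This is an illustration of the decision only, not the ingester's actual code: the function names are hypothetical, the extractor recognizes only RFC 3339 style timestamps with an explicit zone (the real ingester may accept many more formats), and falling back to the arrival time when nothing can be extracted is an assumption.

```python
import re
from datetime import datetime

# Simplified pattern: date, "T", time, optional fraction, then "Z" or an offset.
RFC3339_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?(Z|[+-]\d{2}:\d{2})"
)


def extract_timestamp(body):
    """Return the first RFC 3339 style timestamp in `body`, or None."""
    match = RFC3339_RE.search(body)
    if match is None:
        return None
    base, zone = match.groups()
    if zone == "Z":
        zone = "+00:00"
    try:
        # Fractional seconds are dropped in this sketch.
        return datetime.fromisoformat(base + zone)
    except ValueError:  # e.g. month 13 or hour 25
        return None


def entry_timestamp(arrival_time, body, parse_time=False):
    """Pick the timestamp for an entry, mirroring the Parse-Time setting."""
    if parse_time:
        parsed = extract_timestamp(body)
        if parsed is not None:
            return parsed
    # Default, and assumed fallback: the PubSub arrival timestamp.
    return arrival_time
```

With parse_time left at False, the message body is never inspected and the arrival time is always used, which corresponds to the default behavior described above.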