google_api_dataflow v0.0.1 API Reference
Modules
API calls for all endpoints tagged Projects
Handle Tesla connections for GoogleApi.Dataflow.V1b3
Helper functions for deserializing responses into models
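Taken together, the three modules above (the Projects API calls, the Tesla connection, and the response deserializer) are what a typical request flows through. A minimal sketch, assuming the usual google_api_* generator conventions; `Connection.new/1` and `dataflow_projects_jobs_list/3` are assumed names, not verified against v0.0.1:

```elixir
# Minimal sketch, assuming the standard google_api_* client layout:
# a Tesla connection built from an OAuth2 access token, then a generated
# Projects call whose JSON response is deserialized into model structs.
# Function names here are assumptions, not verified against v0.0.1.
token = System.get_env("GOOGLE_OAUTH2_TOKEN")
conn = GoogleApi.Dataflow.V1b3.Connection.new(token)

case GoogleApi.Dataflow.V1b3.Api.Projects.dataflow_projects_jobs_list(conn, "my-gcp-project") do
  {:ok, response} -> IO.inspect(response.jobs, label: "jobs")
  {:error, reason} -> IO.inspect(reason, label: "request failed")
end
```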
Obsolete in favor of ApproximateReportedProgress and ApproximateSplitRequest
A progress measurement of a WorkItem by a worker
A suggestion by the service to the worker to dynamically split the WorkItem
A structured message reporting an autoscaling decision made by the Dataflow service
Settings for WorkerPool autoscaling
Description of an interstitial value between transforms in an execution stage
Description of a transform executed as part of an execution stage
All configuration data for a particular Computation
A position that encapsulates an inner position and an index for the inner position. A ConcatPosition can be used by a reader of a source that encapsulates a set of other sources
CounterMetadata includes all static non-name non-value counter attributes
Identifies a counter within a per-job namespace. Counters whose structured names are the same get merged into a single value for the job
A single message which encapsulates structured name and metadata for a given counter
An update to a Counter sent from a worker
Modeled after information exposed by /proc/stat
A request to create a Cloud Dataflow job from a template
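For orientation, a create-from-template request is essentially a Cloud Storage path to a staged template plus a job name, template parameters, and runtime environment settings. A sketch of the request body as a plain Elixir map; the field names follow the public v1b3 REST API, while the bucket paths and Word_Count parameters are illustrative values:

```elixir
# Sketch of a createJobFromTemplate request body as a plain map.
# Field names follow the v1b3 REST API; paths and parameter values
# are illustrative, not taken from a real project.
request = %{
  "jobName" => "wordcount-nightly",
  "gcsPath" => "gs://dataflow-templates/latest/Word_Count",
  "parameters" => %{
    "inputFile" => "gs://apache-beam-samples/shakespeare/kinglear.txt",
    "output" => "gs://my-bucket/wordcount/output"
  },
  "environment" => %{"tempLocation" => "gs://my-bucket/tmp"}
}

IO.inspect(request)
```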
Identifies the location of a custom source
Data disk assignment for a given VM instance
Specification of one of the bundles produced as a result of splitting a Source (e.g. when executing a SourceSplitRequest, or when splitting an active task using WorkItemStatus.dynamic_source_split), relative to the source being split
Describes the data disk used by a workflow job
Data attached to a pipeline or transform to provide descriptive information
A metric value representing a distribution
When a task splits using WorkItemStatus.dynamic_source_split, this message describes the two parts of the split relative to the description of the current task's input
Describes the environment in which a Dataflow Job runs
A message describing the state of a particular execution stage
Description of the composing transforms, names/ids, and input/outputs of a stage of execution. Some composing transforms and sources may have been generated by the Dataflow service during execution planning
Indicates which location failed to respond to a request for data
An instruction that copies its inputs (zero or more) to its (single) output
A metric value representing a list of floating point numbers
A representation of a floating point mean metric contribution
Request to get updated debug configuration for component
Response to a get debug configuration request
The response to a GetTemplate request
An input of an instruction, as a reference to an output of a producer instruction
An output of an instruction
A metric value representing a list of integers
A representation of an integer mean metric contribution
Defines a job to be run by the Cloud Dataflow service
Additional information about how a Cloud Dataflow job will be executed that isn't contained in the submitted job
Contains information about how a particular google.dataflow.v1beta3.Step will be executed
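To make the Job model above more concrete, here is a sketch of a minimal Job resource as a plain map, showing only a few of its many fields. The field names follow the v1b3 REST API; the job name and bucket are hypothetical, and `steps` would normally be produced by an SDK rather than written by hand:

```elixir
# Sketch of a minimal Job resource. Field names follow the v1b3 REST API;
# values are illustrative only.
job = %{
  "name" => "nightly-aggregation",
  "type" => "JOB_TYPE_BATCH",
  "environment" => %{
    "tempStoragePrefix" => "gs://my-bucket/tmp",
    "workerPools" => [%{"numWorkers" => 3}]
  },
  # Steps are normally generated by the SDK that builds the pipeline.
  "steps" => []
}
```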
A particular message pertaining to a Dataflow job
JobMetrics contains a collection of metrics describing the detailed progress of a Dataflow job. Metrics correspond to user-defined and system-defined metrics in the job. This resource captures only the most recent values of each metric; time-series data can be queried for them (under the same metric names) from Cloud Monitoring
Data disk assignment information for a specific key-range of a sharded computation. Currently we only support UTF-8 character splits to simplify encoding into JSON
Location information for a specific key-range of a sharded computation. Currently we only support UTF-8 character splits to simplify encoding into JSON
Parameters to provide to the template being launched
Response to the request to launch a template
Request to lease WorkItems
Response to a request to lease WorkItems
Response to a request to list job messages
Response to a request to list Cloud Dataflow jobs. This may be a partial response, depending on the page size in the ListJobsRequest
Bucket of values for Distribution's logarithmic histogram
MapTask consists of an ordered set of instructions, each of which describes one particular low-level operation for the worker to perform in order to accomplish the MapTask's WorkItem. Each instruction must appear in the list before any instruction that depends on its output
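The ordering rule above (every instruction listed before anything that reads its output) is easy to check mechanically. The sketch below uses a deliberately simplified instruction shape, an id plus the ids it reads, rather than the actual ParallelInstruction model:

```elixir
defmodule MapTaskOrder do
  # Returns true if every instruction only reads outputs of instructions
  # that appear earlier in the list (the MapTask ordering invariant).
  def valid?(instructions) do
    instructions
    |> Enum.reduce_while(MapSet.new(), fn %{id: id, reads: reads}, seen ->
      if Enum.all?(reads, &MapSet.member?(seen, &1)),
        do: {:cont, MapSet.put(seen, id)},
        else: {:halt, :invalid}
    end)
    |> case do
      :invalid -> false
      _seen -> true
    end
  end
end

true = MapTaskOrder.valid?([%{id: :read, reads: []}, %{id: :pardo, reads: [:read]}])
```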
The metric short id is returned to the user alongside an offset into ReportWorkItemStatusRequest
Identifies a metric, by describing the source which generated the metric
Describes the state of a metric
Describes mounted data disk
Information about an output of a multi-output DoFn
Basic metadata about a counter
The packages that must be installed in order for a worker to run the steps of the Cloud Dataflow job that will be assigned to its worker pool. This is the mechanism by which the Cloud Dataflow SDK causes code to be loaded onto the workers. For example, the Cloud Dataflow Java SDK might use this to install jars containing the user's code and all of the various dependencies (libraries, data files, etc.) required in order for that code to run
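A sketch of what such a package list looks like: each entry names an artifact and points at its staged Cloud Storage location. The `name` and `location` field names follow the v1b3 REST API; the bucket and jar names are illustrative:

```elixir
# Illustrative worker-pool package list; paths are hypothetical.
packages = [
  %{"name" => "pipeline-deps.jar", "location" => "gs://my-bucket/staging/pipeline-deps.jar"},
  %{"name" => "user-code.jar", "location" => "gs://my-bucket/staging/user-code.jar"}
]

IO.inspect(packages)
```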
An instruction that does a ParDo operation. Takes one main input and zero or more side inputs, and produces zero or more outputs. Runs user code
Describes a particular operation comprising a MapTask
Structured data associated with this message
Metadata for a specific parameter
An instruction that does a partial group-by-key. One input and one output
A descriptive representation of the submitted pipeline as well as its executed form. This data is provided by the Dataflow service to make it easier to visualize the pipeline and interpret Dataflow-provided metrics
Position defines a position within a collection of data. The value can be either the end position, a key (used with ordered collections), a byte offset, or a record index
Identifies a pubsub location to use for transferring data into or out of a streaming Dataflow job
An instruction that reads records. Takes no inputs, produces one output
Request to report the status of WorkItems
Response from a request to report the status of WorkItems
Represents the level of parallelism in a WorkItem's input, reported by the worker
Worker metrics exported from workers. This contains resource utilization metrics accumulated from a variety of sources. For more information, see go/df-resource-signals
Service-side response to WorkerMessage reporting resource utilization
The environment values to set at runtime
Request to send encoded debug information
Response to a send capture request. The response carries no data
A request for sending worker messages to the service
The response to the worker messages
Describes a particular function to invoke
Information about an output of a SeqMapTask
A task which consists of a shell command for the worker to execute
Information about a side input of a DoFn or an input of a SeqDoFn
A sink that records can be encoded and written to
A source that records can be read and decoded from
DEPRECATED in favor of DynamicSourceSplit
A request to compute the SourceMetadata of a Source
The result of a SourceGetMetadataOperation
Metadata about a Source useful for automatically optimizing and tuning the pipeline, etc
A work item that represents the different operations that can be performed on a user-defined Source specification
The result of a SourceOperationRequest, specified in ReportWorkItemStatusRequest.source_operation when the work item is completed
Hints for splitting a Source into bundles (parts for parallel processing) using SourceSplitRequest
Represents the operation to split a high-level Source specification into bundles (parts for parallel processing). At a high level, splitting of a source into bundles happens as follows: SourceSplitRequest is applied to the source. If it returns SOURCE_SPLIT_OUTCOME_USE_CURRENT, no further splitting happens and the source is used "as is". Otherwise, splitting is applied recursively to each produced DerivedSource. As an optimization, for any Source, if its does_not_need_splitting is true, the framework assumes that splitting this source would return SOURCE_SPLIT_OUTCOME_USE_CURRENT, and doesn't initiate a SourceSplitRequest. This applies both to the initial source being split and to bundles produced from it
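The recursion described above can be summarized in a few lines. In the sketch below, `split_fun` stands in for the worker-side splitting call and either answers `:use_current` or returns a list of derived sources; the function names and source shape are illustrative, not part of the generated client:

```elixir
defmodule SourceSplitting do
  # Recursively expand a source into bundles, mirroring the protocol above:
  # a source marked does_not_need_splitting is used as-is; otherwise the
  # split function either answers :use_current or returns derived sources,
  # each of which is split again.
  def bundles(source, split_fun) do
    if source[:does_not_need_splitting] do
      [source]
    else
      case split_fun.(source) do
        :use_current -> [source]
        derived when is_list(derived) -> Enum.flat_map(derived, &bundles(&1, split_fun))
      end
    end
  end
end
```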
The response to a SourceSplitRequest
DEPRECATED in favor of DerivedSource
A representation of an int64, n, that is immune to precision loss when encoded in JSON
Description of an input or output of an execution stage
State family configuration
The `Status` type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC. The error model is designed to be simple to use and understand for most users, and flexible enough to meet unexpected needs.

Overview: the `Status` message contains three pieces of data: error code, error message, and error details. The error code should be an enum value of google.rpc.Code, but it may accept additional error codes if needed. The error message should be a developer-facing English message that helps developers understand and resolve the error. If a localized user-facing error message is needed, put the localized message in the error details or localize it in the client. The optional error details may contain arbitrary information about the error. There is a predefined set of error detail types in the package `google.rpc` that can be used for common error conditions.

Language mapping: the `Status` message is the logical representation of the error model, but it is not necessarily the actual wire format. When the `Status` message is exposed in different client libraries and different wire protocols, it can be mapped differently. For example, it will likely be mapped to some exceptions in Java, but more likely to some error codes in C.

Other uses: the error model and the `Status` message can be used in a variety of environments, either with or without APIs, to provide a consistent developer experience across different environments. Example uses of this error model include: partial errors, where a service embeds the `Status` in the normal response to indicate partial failures; workflow errors, where each step of a multi-step workflow may carry a `Status` message for error reporting; batch operations, where the `Status` message should be used directly inside the batch response, one for each error sub-response; asynchronous operations, where the status of operation results embedded in a response should be represented directly using the `Status` message; and logging, where API errors stored in logs can use the `Status` message directly after any stripping needed for security or privacy reasons
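A sketch of a `Status` payload as it might appear in an error response: a google.rpc.Code value, a developer-facing message, and optional details. The message text is illustrative:

```elixir
# Illustrative Status payload; the message text is made up.
status = %{
  "code" => 9,  # FAILED_PRECONDITION in google.rpc.Code
  "message" => "The job is not in a cancellable state.",
  "details" => []
}
```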
Defines a particular step within a Cloud Dataflow job. A job consists of multiple steps, each of which performs some specific operation as part of the overall job. Data is typically passed from one step to another as part of the job. Here's an example of a sequence of steps which together implement a Map-Reduce job:

* Read a collection of data from some source, parsing the collection's elements.
* Validate the elements.
* Apply a user-defined function to map each element to some value and extract an element-specific key value.
* Group elements with the same key into a single element with that key, transforming a multiply-keyed collection into a uniquely-keyed collection.
* Write the elements out to some data sink.

Note that the Cloud Dataflow service may be used to run many different types of jobs, not just Map-Reduce
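The step sequence above can be pictured as a list of Step maps using the `kind`, `name`, and `properties` fields. The kind strings and the empty properties here are illustrative; real steps are emitted by an SDK, not written by hand:

```elixir
# Illustrative Step list for the Map-Reduce example; values are made up.
steps = [
  %{"kind" => "ParallelRead", "name" => "ReadLines", "properties" => %{}},
  %{"kind" => "ParallelDo", "name" => "ValidateAndMap", "properties" => %{}},
  %{"kind" => "GroupByKey", "name" => "GroupByWord", "properties" => %{}},
  %{"kind" => "ParallelWrite", "name" => "WriteCounts", "properties" => %{}}
]
```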
Describes a stream of data, either as input to be processed or as output of a streaming Dataflow job
Configuration information for a single streaming computation
Describes full or partial data disk assignment information of the computation ranges
A task which describes what action should be performed for the specified streaming computation ranges
A task that carries configuration information for streaming computations
A task which initializes part of a streaming Dataflow job
Identifies the location of a streaming side input
Identifies the location of a streaming computation stage, for stage-to-stage communication
A metric value representing a list of strings
A rich message format, including a human readable string, a key for identifying the message, and structured data associated with the message for programmatic consumption
Taskrunner configuration settings
Metadata describing a template
Global topology of the streaming Dataflow job, including all computations and their sharded locations
Description of the type, names/ids, and input/outputs for a transform
WorkItem represents basic information about a WorkItem to be executed in the cloud
The Dataflow service's idea of the current state of a WorkItem being processed by a worker
Conveys a worker's progress through the work described by a WorkItem
WorkerHealthReport contains information about the health of a worker. The VM should be identified by the labels attached to the WorkerMessage that this health ping belongs to
WorkerHealthReportResponse contains information returned to the worker in response to a health ping
WorkerMessage provides information to the backend about a worker
A message code is used to report status and error messages to the service. The message codes are intended to be machine readable. The service will take care of translating these into user understandable messages if necessary. Example use cases: 1. Worker processes reporting successful startup. 2. Worker processes reporting specific errors (e.g. package staging failure)
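A sketch of such a message code: a machine-readable code string plus structured parameters. The code string and parameter key shown here are illustrative, not values enumerated by the service:

```elixir
# Illustrative WorkerMessageCode; the code and parameters are made up.
message_code = %{
  "code" => "HARNESS_STARTED",
  "parameters" => %{"worker_id" => "wordcount-batch-worker-01"}
}
```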
A worker_message response allows the server to pass information to the sender
Describes one particular pool of Cloud Dataflow workers to be instantiated by the Cloud Dataflow service in order to perform the computations required by a job. Note that a workflow job may use multiple pools, in order to match the various computational requirements of the various stages of the job
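A sketch of one WorkerPool entry, including the AutoscalingSettings described earlier in this list. Field names follow the v1b3 REST API; the machine type, disk size, and worker counts are illustrative:

```elixir
# Illustrative WorkerPool configuration; sizes and machine type are made up.
worker_pool = %{
  "kind" => "harness",
  "numWorkers" => 3,
  "machineType" => "n1-standard-4",
  "diskSizeGb" => 50,
  "autoscalingSettings" => %{
    "algorithm" => "AUTOSCALING_ALGORITHM_BASIC",
    "maxNumWorkers" => 20
  },
  "packages" => []
}
```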
Provides data to pass through to the worker harness
An instruction that writes records. Takes one input, produces no outputs
Helper functions for building Tesla requests