grepros 1.2.2
grep for ROS bag files and live topics
ParquetSink Class Reference

Writes messages to Apache Parquet files.

Inheritance diagram for ParquetSink: ParquetSink inherits from Sink.

Public Member Functions

 __init__ (self, args=None, **kwargs)
 
 close (self)
 Writes out any remaining messages, closes writers, clears structures.
 
 emit (self, topic, msg, stamp=None, match=None, index=None)
 Writes message to a Parquet file.
 
 validate (self)
 Returns whether the required libraries (pandas and pyarrow) are available, the overwrite option is valid, and the file base is writable.
 
- Public Member Functions inherited from Sink
 __init__ (self, args=None, **kwargs)
 
 __enter__ (self)
 Context manager entry.
 
 __exit__ (self, exc_type, exc_value, traceback)
 Context manager exit, closes sink.
 
 autodetect (cls, target)
 Returns true if target is recognizable as output for this sink class.
 
 bind (self, source)
 Attaches source to sink.
 
 close (self)
 Shuts down output, closing any files or connections.
 
 configure (self, args=None, **kwargs)
 Updates sink configuration.
 
 emit (self, topic, msg, stamp=None, match=None, index=None)
 Outputs ROS message.
 
 emit_meta (self)
 Outputs source metainfo like bag header as debug stream, if not already emitted.
 
 flush (self)
 Writes out any pending data to disk.
 
 is_highlighting (self)
 Returns whether this sink requires highlighted matches.
 
 thread_excepthook (self, text, exc)
 Handles exception, used by background threads.
 
 validate (self)
 Returns whether sink prerequisites are met (like ROS environment set if LiveSink).
 

Public Attributes

 COMMON_TYPES
 
 MESSAGE_TYPE_BASECOLS
 
 MESSAGE_TYPE_NESTCOLS
 
 valid
 
 WRITER_ARGS
 
- Public Attributes inherited from Sink
 args
 
 source
 inputs.Source instance bound to this sink
 
 valid
 Result of validate()
 

Static Public Attributes

dict ARROW_TYPES
 Mapping from pyarrow type names and aliases to pyarrow type constructors.
 
int CHUNK_SIZE = 100
 Number of dataframes to cache before writing, per type.
 
dict COMMON_TYPES
 Mapping from ROS common type names to pyarrow type constructors.
 
 DEFAULT_ARGS = dict(EMIT_FIELD=(), META=False, NOEMIT_FIELD=(), WRITE_OPTIONS={}, VERBOSE=False)
 Constructor argument defaults.
 
 DEFAULT_TYPE = pyarrow.string() if pyarrow else None
 Fallback pyarrow type if mapped type not found.
 
tuple FILE_EXTENSIONS = (".parquet", )
 Auto-detection file extensions.
 
list MESSAGE_TYPE_BASECOLS
 Default columns for message type tables.
 
list MESSAGE_TYPE_NESTCOLS
 Additional default columns for message type tables with nesting output.
 
dict WRITER_ARGS = {"version": "2.6"}
 Custom arguments for pyarrow.parquet.ParquetWriter.
 
- Static Public Attributes inherited from Sink
 DEFAULT_ARGS = dict(META=False)
 Constructor argument defaults.
 
tuple FILE_EXTENSIONS = ()
 Auto-detection file extensions for subclasses, as (".ext", )
 

Detailed Description

Writes messages to Apache Parquet files.

Definition at line 35 of file parquet.py.

Constructor & Destructor Documentation

◆ __init__()

__init__ (self, args=None, **kwargs)
Parameters
args                arguments as namespace or dictionary, case-insensitive; or a single path as the base name of Parquet files to write
args.emit_field     message fields to emit in output if not all
args.noemit_field   message fields to skip in output
args.write          base name of Parquet files to write
args.write_options
{"column": additional columns as {name: (rostype, value)},
 "type": {rostype: PyArrow type or typename like "uint8"},
 "writer": dictionary of arguments passed to ParquetWriter,
 "idgenerator": callable or iterable for producing message IDs
                like uuid.uuid4 or itertools.count();
                nesting uses UUID values by default,
 "column-k=rostype:v": one "column"-argument
                       in flat string form,
 "type-k=v": one "type"-argument in flat string form,
 "writer-k=v": one "writer"-argument in flat string form,
 "nesting": "array" to recursively insert arrays
            of nested types, or "all" for any nesting,
 "overwrite": whether to overwrite existing file
              (default false)}
args.meta           whether to print metainfo
args.verbose        whether to print debug information
kwargs              any and all arguments as keyword overrides, case-insensitive

Reimplemented from Sink.

Definition at line 121 of file parquet.py.
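
The flat string forms of write_options mirror the nested ones: "column-k=rostype:v" corresponds to one entry of "column", and likewise for "type" and "writer". The following is a minimal sketch of that correspondence. The helper expand_flat_options is hypothetical and not part of grepros, and the sink may parse these options differently internally; the option names and values in the sample input are illustrative only.

```python
def expand_flat_options(flat):
    """
    Folds flat "column-k", "type-k" and "writer-k" keys into nested dictionaries.

    Hypothetical helper for illustration only, not part of grepros.
    """
    nested = {}
    for key, value in flat.items():
        category, sep, name = key.partition("-")
        if sep and category in ("column", "type", "writer"):
            if category == "column":
                # Flat column values have the form "rostype:value"
                rostype, _, colvalue = value.partition(":")
                value = (rostype, colvalue)
            nested.setdefault(category, {})[name] = value
        else:
            nested[key] = value  # Options like "nesting" or "overwrite" pass through
    return nested


flat = {"column-bag_hash": "string:26dfba2c", "type-time": "float64",
        "writer-compression": "snappy", "nesting": "array"}
write_options = expand_flat_options(flat)
```

The resulting dictionary has the nested shape described for args.write_options, with one "column", one "type" and one "writer" entry next to the unchanged "nesting" option.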

Member Function Documentation

◆ close()

close (self)

Writes out any remaining messages, closes writers, clears structures.

Reimplemented from Sink.

Definition at line 180 of file parquet.py.

◆ emit()

emit (self, topic, msg, stamp=None, match=None, index=None)

Writes message to a Parquet file.

Reimplemented from Sink.

Definition at line 171 of file parquet.py.

◆ validate()

validate (self)

Returns whether the required libraries (pandas and pyarrow) are available, the overwrite option is valid, and the file base is writable.

Reimplemented from Sink.

Definition at line 145 of file parquet.py.

Member Data Documentation

◆ ARROW_TYPES

dict ARROW_TYPES
static
Initial value:
= {
"bool": pyarrow.bool_, "bool_": pyarrow.bool_,
"float16": pyarrow.float16, "float64": pyarrow.float64,
"float32": pyarrow.float32, "decimal128": pyarrow.decimal128,
"int8": pyarrow.int8, "uint8": pyarrow.uint8,
"int16": pyarrow.int16, "uint16": pyarrow.uint16,
"int32": pyarrow.int32, "uint32": pyarrow.uint32,
"int64": pyarrow.int64, "uint64": pyarrow.uint64,
"date32": pyarrow.date32, "time32": pyarrow.time32,
"date64": pyarrow.date64, "time64": pyarrow.time64,
"timestamp": pyarrow.timestamp, "duration": pyarrow.duration,
"binary": pyarrow.binary, "large_binary": pyarrow.large_binary,
"string": pyarrow.string, "large_string": pyarrow.large_string,
"utf8": pyarrow.string, "large_utf8": pyarrow.large_utf8,
"list": pyarrow.list_, "list_": pyarrow.list_,
"large_list": pyarrow.large_list,
} if pyarrow else {}

Mapping from pyarrow type names and aliases to pyarrow type constructors.

Definition at line 45 of file parquet.py.
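
The "if pyarrow else {}" guard keeps the class importable when pyarrow is not installed. A minimal sketch of that pattern, together with a name lookup that falls back to DEFAULT_TYPE, might look as follows. The mapping here is abbreviated to constructors that take no arguments (entries like "timestamp" or "list" need parameters), and resolve_type is a hypothetical helper, not a grepros function.

```python
try:
    import pyarrow
except ImportError:
    pyarrow = None  # Optional dependency: mappings stay empty without it

# Abbreviated stand-ins for the class attributes documented here
ARROW_TYPES = {
    "uint8": pyarrow.uint8, "float64": pyarrow.float64, "string": pyarrow.string,
} if pyarrow else {}
DEFAULT_TYPE = pyarrow.string() if pyarrow else None


def resolve_type(typename):
    """Returns pyarrow type instance for name, or DEFAULT_TYPE if unknown (hypothetical helper)."""
    constructor = ARROW_TYPES.get(typename)
    return constructor() if constructor else DEFAULT_TYPE
```

Without pyarrow every lookup yields None, which is why validate() needs to report the missing library before any message is emitted.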

◆ CHUNK_SIZE

int CHUNK_SIZE = 100
static

Number of dataframes to cache before writing, per type.

Definition at line 42 of file parquet.py.
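
Caching per type and writing in chunks can be sketched roughly as below. ChunkBuffer is an illustrative stand-in using plain lists instead of dataframes; it is an assumption about the general technique and not the sink's actual implementation.

```python
import collections

CHUNK_SIZE = 100  # Same value as the documented class attribute


class ChunkBuffer:
    """Caches items per type and hands them to a writer in chunks (illustrative stand-in)."""

    def __init__(self, write, chunk_size=CHUNK_SIZE):
        self._write = write            # Callable(typename, list of items)
        self._chunk_size = chunk_size
        self._caches = collections.defaultdict(list)

    def add(self, typename, item):
        """Caches item, writing out the cache for its type once it is full."""
        cache = self._caches[typename]
        cache.append(item)
        if len(cache) >= self._chunk_size:
            self._flush(typename)

    def close(self):
        """Writes out any remaining items and clears caches."""
        for typename in list(self._caches):
            self._flush(typename)
        self._caches.clear()

    def _flush(self, typename):
        """Passes non-empty cache for type to the writer and empties it."""
        cache = self._caches[typename]
        if cache:
            self._write(typename, list(cache))
            del cache[:]


written = []
buffer = ChunkBuffer(lambda typename, items: written.append((typename, len(items))))
for i in range(250):
    buffer.add("std_msgs/String", i)
buffer.close()
```

With 250 items of one type, the writer is called twice with full chunks of 100 and once more on close with the remaining 50, which matches the role close() plays in writing out any remaining messages.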

◆ COMMON_TYPES [1/2]

dict COMMON_TYPES
static
Initial value:
= {
"int8": pyarrow.int8(), "int16": pyarrow.int16(), "int32": pyarrow.int32(),
"uint8": pyarrow.uint8(), "uint16": pyarrow.uint16(), "uint32": pyarrow.uint32(),
"int64": pyarrow.int64(), "uint64": pyarrow.uint64(), "bool": pyarrow.bool_(),
"string": pyarrow.string(), "wstring": pyarrow.string(), "uint8[]": pyarrow.binary(),
"float32": pyarrow.float32(), "float64": pyarrow.float64(),
} if pyarrow else {}

Mapping from ROS common type names to pyarrow type constructors.

Definition at line 67 of file parquet.py.

◆ COMMON_TYPES [2/2]

COMMON_TYPES

Definition at line 377 of file parquet.py.

◆ DEFAULT_ARGS

DEFAULT_ARGS = dict(EMIT_FIELD=(), META=False, NOEMIT_FIELD=(), WRITE_OPTIONS={}, VERBOSE=False)
static

Constructor argument defaults.

Definition at line 91 of file parquet.py.

◆ DEFAULT_TYPE

DEFAULT_TYPE = pyarrow.string() if pyarrow else None
static

Fallback pyarrow type if mapped type not found.

Definition at line 76 of file parquet.py.

◆ FILE_EXTENSIONS

tuple FILE_EXTENSIONS = (".parquet", )
static

Auto-detection file extensions.

Definition at line 39 of file parquet.py.

◆ MESSAGE_TYPE_BASECOLS [1/2]

list MESSAGE_TYPE_BASECOLS
static
Initial value:
= [("_topic", "string"),
("_timestamp", "time"), ]

Default columns for message type tables.

Definition at line 79 of file parquet.py.

◆ MESSAGE_TYPE_BASECOLS [2/2]

MESSAGE_TYPE_BASECOLS

Definition at line 450 of file parquet.py.

◆ MESSAGE_TYPE_NESTCOLS [1/2]

list MESSAGE_TYPE_NESTCOLS
static
Initial value:
= [("_id", "string"),
("_parent_type", "string"),
("_parent_id", "string"), ]

Additional default columns for message type tables with nesting output.

Definition at line 83 of file parquet.py.
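
How the base and nesting columns could fit together in one row is sketched below. This is an assumption inferred from the column names and the "idgenerator" option: make_idgenerator and make_row are hypothetical helpers, the linking of a nested row to its parent via _parent_type and _parent_id is assumed rather than confirmed, and IDs are converted to strings because the "_id" column is declared as "string".

```python
import itertools
import uuid

MESSAGE_TYPE_BASECOLS = [("_topic", "string"), ("_timestamp", "time")]
MESSAGE_TYPE_NESTCOLS = [("_id", "string"), ("_parent_type", "string"),
                         ("_parent_id", "string")]


def make_idgenerator(source=uuid.uuid4):
    """Returns a zero-argument function producing string IDs from a callable or iterable."""
    if callable(source):
        return lambda: str(source())
    iterator = iter(source)
    return lambda: str(next(iterator))


def make_row(topic, stamp, fields, nextid, parent=None):
    """Returns row dict with base and nesting columns; parent is (typename, id) or None."""
    row = {"_topic": topic, "_timestamp": stamp, "_id": nextid(),
           "_parent_type": parent[0] if parent else None,
           "_parent_id": parent[1] if parent else None}
    row.update(fields)
    return row


# itertools.count() as an iterable ID source; uuid.uuid4 would be the callable variant
nextid = make_idgenerator(itertools.count(1))
pose = make_row("/pose", 1700000000.5, {"frame_id": "map"}, nextid)
point = make_row("/pose", 1700000000.5, {"x": 1.0, "y": 2.0}, nextid,
                 parent=("geometry_msgs/PoseStamped", pose["_id"]))
```

Here the nested row carries its own ID plus the type and ID of the row it belongs to, so the tables written for different message types can be joined afterwards.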

◆ MESSAGE_TYPE_NESTCOLS [2/2]

MESSAGE_TYPE_NESTCOLS

Definition at line 451 of file parquet.py.

◆ valid

valid

Definition at line 165 of file parquet.py.

◆ WRITER_ARGS [1/2]

dict WRITER_ARGS = {"version": "2.6"}
static

Custom arguments for pyarrow.parquet.ParquetWriter.

Definition at line 88 of file parquet.py.

◆ WRITER_ARGS [2/2]

WRITER_ARGS

Definition at line 378 of file parquet.py.


The documentation for this class was generated from the following file:

parquet.py