grepros 1.3.0
grep for ROS bag files and live topics
ParquetSink Class Reference
Inheritance diagram: ParquetSink inherits from Sink.

Public Member Functions

 __init__ (self, args=None, **kwargs)
 
 close (self)
 
 emit (self, topic, msg, stamp=None, match=None, index=None)
 
 validate (self)
 
Public Member Functions inherited from Sink
 __enter__ (self)
 
 __exit__ (self, exc_type, exc_value, traceback)
 
 autodetect (cls, target)
 
 bind (self, source)
 
 configure (self, args=None, **kwargs)
 
 emit_meta (self)
 
 flush (self)
 
 is_highlighting (self)
 
 thread_excepthook (self, text, exc)
 

Public Attributes

 COMMON_TYPES
 
 MESSAGE_TYPE_BASECOLS
 
 MESSAGE_TYPE_NESTCOLS
 
 valid
 
 WRITER_ARGS
 
Public Attributes inherited from Sink
 args
 
 source
 inputs.Source instance bound to this sink
 
 valid
 Result of validate()
 

Static Public Attributes

dict ARROW_TYPES
 Mapping from pyarrow type names and aliases to pyarrow type constructors.
 
int CHUNK_SIZE = 100
 Number of dataframes to cache before writing, per type.
 
dict COMMON_TYPES
 Mapping from ROS common type names to pyarrow type constructors.
 
 DEFAULT_ARGS = dict(EMIT_FIELD=(), META=False, NOEMIT_FIELD=(), WRITE_OPTIONS={}, VERBOSE=False)
 Constructor argument defaults.
 
 DEFAULT_TYPE = pyarrow.string() if pyarrow else None
 Fallback pyarrow type if mapped type not found.
 
tuple FILE_EXTENSIONS = (".parquet", )
 Auto-detection file extensions.
 
list MESSAGE_TYPE_BASECOLS
 Default columns for message type tables.
 
list MESSAGE_TYPE_NESTCOLS
 Additional default columns for message type tables with nesting output.
 
dict WRITER_ARGS = {"version": "2.6"}
 Custom arguments for pyarrow.parquet.ParquetWriter.
 
Static Public Attributes inherited from Sink
 DEFAULT_ARGS = dict(META=False)
 Constructor argument defaults.
 
tuple FILE_EXTENSIONS = ()
 Auto-detection file extensions for subclasses, as (".ext", )
 

Detailed Description

Writes messages to Apache Parquet files.

Definition at line 35 of file parquet.py.

Constructor & Destructor Documentation

◆ __init__()

__init__ (   self,
  args = None,
**  kwargs 
)
@param   args                 arguments as namespace or dictionary, case-insensitive;
                              or a single path as the base name of Parquet files to write
@param   args.emit_field      message fields to emit in output if not all
@param   args.noemit_field    message fields to skip in output
@param   args.write           base name of Parquet files to write
@param   args.write_options   ```
                              {"column": additional columns as {name: (rostype, value)},
                               "type": {rostype: PyArrow type or typename like "uint8"},
                               "writer": dictionary of arguments passed to ParquetWriter,
                               "idgenerator": callable or iterable for producing message IDs
                                              like uuid.uuid4 or itertools.count();
                                              nesting uses UUID values by default,
                               "column-k=rostype:v": one "column"-argument
                                                     in flat string form,
                               "type-k=v": one "type"-argument in flat string form,
                               "writer-k=v": one "writer"-argument in flat string form,
                               "nesting": "array" to recursively insert arrays
                                          of nested types, or "all" for any nesting,
                               "overwrite": whether to overwrite existing file
                                            (default false)}
                              ```
@param   args.meta            whether to print metainfo
@param   args.verbose         whether to print debug information
@param   kwargs               any and all arguments as keyword overrides, case-insensitive

Reimplemented from Sink.

Definition at line 94 of file parquet.py.
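
The `write_options` description above lists both nested keys ("column", "type", "writer") and their flat string forms ("column-k=rostype:v", "type-k=v", "writer-k=v"). The following is a minimal sketch of how the flat forms could map onto the nested ones. The helper `nest_write_options` and all sample values are hypothetical illustrations, not part of grepros, and the actual parsing in grepros may differ.

```python
def nest_write_options(flat):
    """Groups flat "column-k", "type-k" and "writer-k" keys into nested dictionaries."""
    nested = {}
    for key, value in flat.items():
        prefix, sep, name = key.partition("-")
        if sep and prefix in ("column", "type", "writer"):
            if prefix == "column":
                # Flat column values are assumed to take the form "rostype:value"
                rostype, _, text = value.partition(":")
                value = (rostype, text)
            nested.setdefault(prefix, {})[name] = value
        else:
            # Options like "nesting" or "overwrite" pass through unchanged
            nested[key] = value
    return nested


# Hypothetical sample options in flat string form
options = nest_write_options({
    "column-bag_hash": "string:26dfba2c",
    "type-char": "uint8",
    "writer-version": "1.0",
    "nesting": "array",
})
```

Under these assumptions, `options` ends up with "column", "type" and "writer" dictionaries next to the untouched "nesting" entry, which is the nested shape the parameter description uses.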

Member Function Documentation

◆ close()

close (   self)
Writes out any remaining messages, closes writers, clears structures.

Reimplemented from Sink.

Definition at line 179 of file parquet.py.

◆ emit()

emit (   self,
  topic,
  msg,
  stamp = None,
  match = None,
  index = None 
)
Writes message to a Parquet file.

Reimplemented from Sink.

Definition at line 170 of file parquet.py.

◆ validate()

validate (   self)
Returns whether the required libraries (pandas and pyarrow) are available, the overwrite option is valid, and the file base is writable.

Reimplemented from Sink.

Definition at line 140 of file parquet.py.
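
The three conditions named for validate() can be illustrated with a standalone sketch using only the standard library. The function `check_prerequisites`, its parameters, and the accepted overwrite values are assumptions made for illustration; this is not the grepros implementation, which may check these conditions differently.

```python
import importlib.util
import os
import tempfile


def check_prerequisites(basename, overwrite="false", libraries=("pandas", "pyarrow")):
    """Returns a list of problems found; an empty list means all checks passed."""
    problems = []
    for name in libraries:
        # find_spec() returns None for a top-level module that cannot be located
        if importlib.util.find_spec(name) is None:
            problems.append("library not available: %s" % name)
    # Assumption: overwrite is given as a true/false flag, possibly as text
    if str(overwrite).lower() not in ("true", "false"):
        problems.append("invalid overwrite value: %r" % (overwrite,))
    # Assumption: "writable" means the target folder exists and allows writing
    folder = os.path.dirname(os.path.abspath(basename))
    if not os.access(folder, os.W_OK):
        problems.append("folder not writable: %s" % folder)
    return problems


# Usage with a hypothetical base name in the system temporary folder
problems = check_prerequisites(os.path.join(tempfile.gettempdir(), "example.parquet"))
```

Collecting problems in a list rather than returning a bare boolean is a design choice of this sketch, so that a caller can report every failed condition at once.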

Member Data Documentation

◆ ARROW_TYPES

dict ARROW_TYPES
static
Initial value:
= {
"bool": pyarrow.bool_, "bool_": pyarrow.bool_,
"float16": pyarrow.float16, "float64": pyarrow.float64,
"float32": pyarrow.float32, "decimal128": pyarrow.decimal128,
"int8": pyarrow.int8, "uint8": pyarrow.uint8,
"int16": pyarrow.int16, "uint16": pyarrow.uint16,
"int32": pyarrow.int32, "uint32": pyarrow.uint32,
"int64": pyarrow.int64, "uint64": pyarrow.uint64,
"date32": pyarrow.date32, "time32": pyarrow.time32,
"date64": pyarrow.date64, "time64": pyarrow.time64,
"timestamp": pyarrow.timestamp, "duration": pyarrow.duration,
"binary": pyarrow.binary, "large_binary": pyarrow.large_binary,
"string": pyarrow.string, "large_string": pyarrow.large_string,
"utf8": pyarrow.string, "large_utf8": pyarrow.large_utf8,
"list": pyarrow.list_, "list_": pyarrow.list_,
"large_list": pyarrow.large_list,
} if pyarrow else {}

Mapping from pyarrow type names and aliases to pyarrow type constructors.

Definition at line 45 of file parquet.py.

◆ CHUNK_SIZE

int CHUNK_SIZE = 100
static

Number of dataframes to cache before writing, per type.

Definition at line 42 of file parquet.py.
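
Caching a fixed number of items per type before writing can be sketched as follows. The class `ChunkedBuffer` and its `write` callback are hypothetical stand-ins; the sketch only illustrates the general buffering pattern that a per-type chunk size implies, together with the final flush that close() is described as doing, and does not reproduce how grepros handles dataframes.

```python
from collections import defaultdict


class ChunkedBuffer:
    """Caches items per type name and writes them out in chunks."""

    CHUNK_SIZE = 100  # Number of items to cache before writing, per type

    def __init__(self, write):
        self._write = write                # Callback invoked as write(typename, items)
        self._caches = defaultdict(list)   # {typename: [item, ]}

    def add(self, typename, item):
        """Caches the item, writing out the cache for its type once it is full."""
        cache = self._caches[typename]
        cache.append(item)
        if len(cache) >= self.CHUNK_SIZE:
            self._flush(typename)

    def _flush(self, typename):
        """Writes out and drops any cached items for the type."""
        items = self._caches.pop(typename, [])
        if items:
            self._write(typename, items)

    def close(self):
        """Writes out any remaining items for all types."""
        for typename in list(self._caches):
            self._flush(typename)


# Usage: record the size of each written chunk
written = []
buffer = ChunkedBuffer(lambda typename, items: written.append((typename, len(items))))
for i in range(250):
    buffer.add("std_msgs/Bool", i)
buffer.close()
```

With 250 items of one type, the callback in this sketch receives two full chunks during add() and the remaining partial chunk on close().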

◆ COMMON_TYPES [1/2]

dict COMMON_TYPES
static
Initial value:
= {
"int8": pyarrow.int8(), "int16": pyarrow.int16(), "int32": pyarrow.int32(),
"uint8": pyarrow.uint8(), "uint16": pyarrow.uint16(), "uint32": pyarrow.uint32(),
"int64": pyarrow.int64(), "uint64": pyarrow.uint64(), "bool": pyarrow.bool_(),
"string": pyarrow.string(), "wstring": pyarrow.string(), "uint8[]": pyarrow.binary(),
"float32": pyarrow.float32(), "float64": pyarrow.float64(),
} if pyarrow else {}

Mapping from ROS common type names to pyarrow type constructors.

Definition at line 67 of file parquet.py.

◆ COMMON_TYPES [2/2]

COMMON_TYPES

Definition at line 364 of file parquet.py.

◆ DEFAULT_ARGS

DEFAULT_ARGS = dict(EMIT_FIELD=(), META=False, NOEMIT_FIELD=(), WRITE_OPTIONS={}, VERBOSE=False)
static

Constructor argument defaults.

Definition at line 91 of file parquet.py.

◆ DEFAULT_TYPE

DEFAULT_TYPE = pyarrow.string() if pyarrow else None
static

Fallback pyarrow type if mapped type not found.

Definition at line 76 of file parquet.py.
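
A fallback type comes into play when a ROS type name has no mapped entry. The sketch below shows one plausible lookup order: user-given overrides first, then the common types, then the fallback. The function `resolve_type`, the lookup order, and the plain strings standing in for pyarrow type objects are all assumptions for illustration, chosen so the example runs without pyarrow installed.

```python
def resolve_type(rostype, overrides, common, default):
    """Returns the first mapped entry for the ROS type name, else the fallback."""
    for mapping in (overrides, common):
        if rostype in mapping:
            return mapping[rostype]
    return default


# Plain strings stand in for pyarrow types such as pyarrow.uint8() or pyarrow.binary()
COMMON = {"uint8": "uint8", "string": "string", "uint8[]": "binary"}
FALLBACK = "string"

mapped   = resolve_type("uint8[]", {}, COMMON, FALLBACK)
override = resolve_type("uint8", {"uint8": "int16"}, COMMON, FALLBACK)
unmapped = resolve_type("geometry_msgs/Pose", {}, COMMON, FALLBACK)
```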

◆ FILE_EXTENSIONS

tuple FILE_EXTENSIONS = (".parquet", )
static

Auto-detection file extensions.

Definition at line 39 of file parquet.py.

◆ MESSAGE_TYPE_BASECOLS [1/2]

list MESSAGE_TYPE_BASECOLS
static
Initial value:
= [("_topic", "string"),
("_timestamp", "time"), ]

Default columns for message type tables.

Definition at line 79 of file parquet.py.

◆ MESSAGE_TYPE_BASECOLS [2/2]

MESSAGE_TYPE_BASECOLS

Definition at line 435 of file parquet.py.

◆ MESSAGE_TYPE_NESTCOLS [1/2]

list MESSAGE_TYPE_NESTCOLS
static
Initial value:
= [("_id", "string"),
("_parent_type", "string"),
("_parent_id", "string"), ]

Additional default columns for message type tables with nesting output.

Definition at line 83 of file parquet.py.
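
The nesting columns `_id`, `_parent_type` and `_parent_id` suggest that rows of nested types link back to the row they came from, with IDs supplied by the "idgenerator" write option (a callable like uuid.uuid4 or an iterable like itertools.count()). The sketch below illustrates that linkage under stated assumptions: the helpers `make_id_function` and `make_nested_rows`, the row dictionaries, and the sample values are hypothetical and do not reflect how grepros builds its tables.

```python
import itertools
import uuid


def make_id_function(idgenerator=None):
    """Returns a zero-argument function producing a new ID on each call."""
    if idgenerator is None:
        return lambda: str(uuid.uuid4())  # UUID values when nothing is given
    if callable(idgenerator):
        return idgenerator
    iterator = iter(idgenerator)          # Iterables yield one ID per call
    return lambda: next(iterator)


def make_nested_rows(parent_type, children, next_id):
    """Returns a parent row followed by child rows that reference it."""
    parent = {"_id": next_id(), "_parent_type": None, "_parent_id": None}
    rows = [parent]
    for child in children:
        row = {"_id": next_id(), "_parent_type": parent_type, "_parent_id": parent["_id"]}
        row.update(child)
        rows.append(row)
    return rows


# Usage with an iterable ID generator; IDs are kept exactly as produced
next_id = make_id_function(itertools.count(1))
rows = make_nested_rows("geometry_msgs/PoseArray", [{"x": 1.0}, {"x": 2.0}], next_id)
```

Normalizing both kinds of ID generator into a single function keeps the row-building code independent of whether the IDs come from a callable or from an iterable.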

◆ MESSAGE_TYPE_NESTCOLS [2/2]

MESSAGE_TYPE_NESTCOLS

Definition at line 436 of file parquet.py.

◆ valid

valid

Definition at line 164 of file parquet.py.

◆ WRITER_ARGS [1/2]

dict WRITER_ARGS = {"version": "2.6"}
static

Custom arguments for pyarrow.parquet.ParquetWriter.

Definition at line 88 of file parquet.py.

◆ WRITER_ARGS [2/2]

WRITER_ARGS

Definition at line 365 of file parquet.py.


The documentation for this class was generated from the following file: parquet.py