Plugins
grepros supports loading custom plugins, mainly for additional output formats.
Load one or more Python modules or classes as plugins:
--plugin some.python.module some.other.module.Class
Specifying --plugin someplugin together with --help will include plugin options in the printed help.
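For example, to include the MCAP plugin's options in the help output:

--plugin grepros.plugins.mcap --help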
There are a number of built-in plugins not loaded by default:
--plugin grepros.plugins.embag
Uses the embag library for reading ROS1 bags: significantly faster than the default reader, but the library tends to be unstable.
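For example, to grep a ROS1 bag through embag instead of the default reader (bag filename here is illustrative):

--plugin grepros.plugins.embag --path my.bag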
--plugin grepros.plugins.mcap
Read or write messages in MCAP format.
Requires mcap, and mcap_ros1_support or mcap_ros2_support.
In ROS2, messages grepped from MCAP files can only be published to live topics if the same message type packages are locally installed.
Write bags in MCAP format:
--plugin grepros.plugins.mcap \
--write path/to/my.mcap [format=mcap] [overwrite=true|false] \
        [rollover-size=NUM] [rollover-count=NUM] [rollover-duration=NUM] [rollover-template=STR]
If the file already exists, a unique counter is appended to the name of the new file, e.g. my.2.mcap, unless specified to overwrite.
Specifying write format=mcap is not required if the filename ends with .mcap.
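For example, to grep one topic into an MCAP file, overwriting any existing output (topic and paths illustrative):

--plugin grepros.plugins.mcap --topic /rosout --write logs/rosout.mcap overwrite=true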
--plugin grepros.plugins.parquet \
--write path/to/my.parquet [format=parquet] [overwrite=true|false] \
        [column-name=rostype:value] [type-rostype=arrowtype] \
        [idgenerator=callable] [nesting=array|all] [writer-argname=argvalue]
Write messages to Apache Parquet files (columnar storage format, version 2.6), each message type to a separate file, named path/to/package__MessageType__typehash/my.parquet for package/MessageType (typehash being the MD5 hashsum of the message type definition). Adds fields _topic string() and _timestamp timestamp("ns") to each type.
If a file already exists, a unique counter is appended to the name of the new file, e.g. package__MessageType__typehash/my.2.parquet, unless specified to overwrite. Specifying format=parquet is not required if the filename ends with .parquet.
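For example, writing diagnostic_msgs/DiagnosticArray messages with --write out/my.parquet would, per the naming scheme above, produce a file like out/diagnostic_msgs__DiagnosticArray__&lt;typehash&gt;/my.parquet.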
By default, message IDs are only added when populating nested message types, as field _id string() with UUID content. To explicitly add ID columns:

--write path/to/my.parquet idgenerator="itertools.count()"

Column type is auto-detected from the produced ID values: int64/float64 for numerics, string for anything else (non-numerics cast to string).
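As a sketch, assuming idgenerator accepts any Python expression yielding successive values (as the itertools.count() example above suggests), IDs could come from a custom module; the module and function names here are hypothetical:

# my_ids.py -- hypothetical helper module, not part of grepros
import itertools

def run_ids(prefix="run1-"):
    # Yields "run1-0", "run1-1", ...; values are strings,
    # so the ID column would be auto-detected as string.
    for n in itertools.count():
        yield "%s%d" % (prefix, n)

given as idgenerator="my_ids.run_ids()", assuming the module is importable.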
Supports adding supplementary columns with fixed values to Parquet files:
--write path/to/my.parquet column-bag_hash=string:26dfba2c
Supports custom mapping between ROS and pyarrow types with type-rostype=arrowtype:

--write path/to/my.parquet type-time="timestamp('ns')"
--write path/to/my.parquet type-uint8[]="list(uint8())"
Time/duration types are flattened into separate integer columns secs and nsecs, unless they are mapped to pyarrow types explicitly, like:

--write path/to/my.parquet type-time="timestamp('ns')" type-duration="duration('ns')"
Supports additional arguments given to pyarrow.parquet.ParquetWriter, as:
--write path/to/my.parquet writer-argname=argvalue
For example, specifying no compression:
--write path/to/my.parquet writer-compression=null
The value is interpreted as JSON if possible, e.g. writer-use_dictionary=false.
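For reference, a minimal sketch of what such pass-through arguments would amount to on the pyarrow side, assuming grepros forwards them as keyword arguments (the schema here is a stand-in):

import pyarrow
import pyarrow.parquet

schema = pyarrow.schema([("_topic",     pyarrow.string()),
                         ("_timestamp", pyarrow.int64())])
# writer-compression=null writer-use_dictionary=false would correspond to:
writer = pyarrow.parquet.ParquetWriter("path/to/my.parquet", schema,
                                       compression=None,     # JSON null -> Python None
                                       use_dictionary=False)
writer.close()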
To recursively populate nested array fields:
--write path/to/my.parquet nesting=array
E.g. for diagnostic_msgs/DiagnosticArray, this would populate files with the following schemas:
diagnostic_msgs__DiagnosticArray = pyarrow.schema([
  ("header.seq",         pyarrow.int64()),
  ("header.stamp.secs",  pyarrow.int32()),
  ("header.stamp.nsecs", pyarrow.int32()),
  ("header.frame_id",    pyarrow.string()),
  ("status",             pyarrow.string()),  # [_id from "diagnostic_msgs/DiagnosticStatus", ]
  ("_topic",             pyarrow.string()),
  ("_timestamp",         pyarrow.int64()),
  ("_id",                pyarrow.string()),
  ("_parent_type",       pyarrow.string()),
  ("_parent_id",         pyarrow.string()),
])

diagnostic_msgs__DiagnosticStatus = pyarrow.schema([
  ("level",        pyarrow.int16()),
  ("name",         pyarrow.string()),
  ("message",      pyarrow.string()),
  ("hardware_id",  pyarrow.string()),
  ("values",       pyarrow.string()),  # [_id from "diagnostic_msgs/KeyValue", ]
  ("_topic",       pyarrow.string()),  # _topic from "diagnostic_msgs/DiagnosticArray"
  ("_timestamp",   pyarrow.int64()),   # _timestamp from "diagnostic_msgs/DiagnosticArray"
  ("_id",          pyarrow.string()),
  ("_parent_type", pyarrow.string()),  # "diagnostic_msgs/DiagnosticArray"
  ("_parent_id",   pyarrow.string()),  # _id from "diagnostic_msgs/DiagnosticArray"
])

diagnostic_msgs__KeyValue = pyarrow.schema([
  ("key",          pyarrow.string()),
  ("value",        pyarrow.string()),
  ("_topic",       pyarrow.string()),  # _topic from "diagnostic_msgs/DiagnosticStatus"
  ("_timestamp",   pyarrow.int64()),   # _timestamp from "diagnostic_msgs/DiagnosticStatus"
  ("_id",          pyarrow.string()),
  ("_parent_type", pyarrow.string()),  # "diagnostic_msgs/DiagnosticStatus"
  ("_parent_id",   pyarrow.string()),  # _id from "diagnostic_msgs/DiagnosticStatus"
])
Without nesting, array field values are inserted as JSON with full subtype content.
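As a sketch of consuming these nested outputs, the _parent_id column lets child rows be joined back to their parent rows (directory names with &lt;typehash&gt; are placeholders; pandas is assumed to be installed):

import pyarrow.parquet

array  = pyarrow.parquet.read_table(
    "diagnostic_msgs__DiagnosticArray__<typehash>/my.parquet").to_pandas()
status = pyarrow.parquet.read_table(
    "diagnostic_msgs__DiagnosticStatus__<typehash>/my.parquet").to_pandas()
# Attach each DiagnosticStatus row to the DiagnosticArray row it came from:
joined = status.merge(array, left_on="_parent_id", right_on="_id",
                      suffixes=("", "_parent"))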
To recursively populate all nested message types:
--write path/to/my.parquet nesting=all
E.g. for diagnostic_msgs/DiagnosticArray, this would, in addition to the above, also populate:
std_msgs__Header = pyarrow.schema([
  ("seq",          pyarrow.int64()),
  ("stamp.secs",   pyarrow.int32()),
  ("stamp.nsecs",  pyarrow.int32()),
  ("frame_id",     pyarrow.string()),
  ("_topic",       pyarrow.string()),  # _topic from "diagnostic_msgs/DiagnosticArray"
  ("_timestamp",   pyarrow.int64()),   # _timestamp from "diagnostic_msgs/DiagnosticArray"
  ("_id",          pyarrow.string()),
  ("_parent_type", pyarrow.string()),  # "diagnostic_msgs/DiagnosticArray"
  ("_parent_id",   pyarrow.string()),  # _id from "diagnostic_msgs/DiagnosticArray"
])
--plugin grepros.plugins.sql \
--write path/to/my.sql [format=sql] [overwrite=true|false] \
        [nesting=array|all] [dialect=clickhouse|postgres|sqlite] \
        [dialect-file=path/to/dialects.yaml]
Write an SQL schema to the output file: a CREATE TABLE statement for each message type and a CREATE VIEW statement for each topic.
If the file already exists, a unique counter is appended to the name of the new file, e.g. my.2.sql, unless specified to overwrite.
Specifying format=sql is not required if the filename ends with .sql.
To create tables for nested array message type fields:
--write path/to/my.sql nesting=array
To create tables for all nested message types:
--write path/to/my.sql nesting=all
A specific SQL dialect can be specified (defaults to sqlite):

--write path/to/my.sql dialect=clickhouse|postgres|sqlite
Additional dialects, or updates for existing dialects, can be loaded from a YAML or JSON file:
--write path/to/my.sql dialect=mydialect dialect-file=path/to/dialects.yaml
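Putting these options together, a full invocation might look like the following (topic name illustrative):

--plugin grepros.plugins.sql --topic /diagnostics --write schema.sql dialect=postgres nesting=all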
Writing your own

Supported (but not required) plugin interface methods:

init(args): invoked at startup with command-line arguments
load(category, args): invoked with category "search" or "source" or "sink", using the returned value for the specified component if not None
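A minimal module sketch implementing this interface (the sink class is hypothetical and its required methods are not shown; returning None from load() defers to grepros defaults):

# myplugin.py -- load with: --plugin myplugin
class MySink:
    # Hypothetical output sink; required sink methods omitted here.
    def __init__(self, args):
        self.args = args

def init(args):
    # Invoked at startup with command-line arguments.
    pass

def load(category, args):
    # Invoked with category "search", "source" or "sink";
    # the returned value is used for that component if not None.
    return MySink(args) if category == "sink" else None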
Plugins are free to modify grepros internals, like adding command-line arguments to grepros.main.ARGUMENTS, or adding sink types to grepros.outputs.MultiSink.
Convenience methods:

plugins.add_write_format(name, cls, label=None, options=()): adds an output plugin to defaults
plugins.get_argument(name): returns a command-line argument dictionary, or None
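For example, the hypothetical sink above could be registered as a write format from its module's init(), presumably making it selectable via format=myformat in --write options (a sketch; the label argument is cosmetic):

# in myplugin.py
from grepros import plugins

def init(args):
    # Register MySink under the format name "myformat".
    plugins.add_write_format("myformat", MySink, label="MyFormat")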