grep for ROS bag files and live topics
Index
Installation
Using the program
Matching and filtering
Outputs
Command-line options
Plugins
embag
mcap
parquet
sql
Writing your own
Changelog
API documentation
grepros supports loading custom plugins, mainly for additional output formats.
Load one or more Python modules or classes as plugins:
--plugin some.python.module some.other.module.Class
Specifying --plugin someplugin together with --help will include the plugin's options in the printed help.
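For example, to see the Parquet plugin's options in the help output:
--plugin grepros.plugins.parquet --help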
There are a number of built-in plugins not loaded by default:
--plugin grepros.plugins.embag
Use the embag library for reading ROS1 bags.
Significantly faster, but the library tends to be unstable.
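For example, grepping a ROS1 bag with the embag reader (a hedged example: -n stands in for grepros's general input-file option):
--plugin grepros.plugins.embag "some text" -n path/to/my.bag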
--plugin grepros.plugins.mcap
Read or write messages in MCAP format.
Requires mcap, and mcap_ros1_support or mcap_ros2_support.
In ROS2, messages grepped from MCAP files can only be published to live topics if the same message type packages are locally installed.
Write bags in MCAP format:
--plugin grepros.plugins.mcap \
--write path/to/my.mcap [format=mcap] [overwrite=true|false]
[rollover-size=NUM] [rollover-count=NUM] [rollover-duration=NUM]
[rollover-template=STR]
If the file already exists, a unique counter is appended to the name of the new file,
e.g. my.2.mcap, unless overwrite=true is specified.
Specifying format=mcap is not required if the filename ends with .mcap.
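A written file can be checked with the mcap library directly; a minimal sketch, assuming the mcap Python package is installed and the file path from the example above:
import mcap.reader

with open("path/to/my.mcap", "rb") as f:
    reader = mcap.reader.make_reader(f)
    for schema, channel, message in reader.iter_messages():
        # channel.topic and schema.name identify the stream;
        # message.log_time is nanoseconds since epoch
        print(channel.topic, schema.name, message.log_time)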
--plugin grepros.plugins.parquet \
--write path/to/my.parquet [format=parquet] [overwrite=true|false] \
[column-name=rostype:value] [type-rostype=arrowtype] \
[idgenerator=callable] [nesting=array|all] [writer-argname=argvalue]
Write messages to Apache Parquet files (columnar storage format, version 2.6),
each message type to a separate file, named path/to/package__MessageType__typehash/my.parquet
for package/MessageType (typehash being the MD5 hash of the message type definition).
Adds fields _topic string() and _timestamp timestamp("ns") to each type.
If a file already exists, a unique counter is appended to the name of the new file,
e.g. package__MessageType__typehash/my.2.parquet, unless overwrite=true is specified.
Specifying format=parquet is not required if the filename ends with .parquet.
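Written files can be read back with pyarrow; a minimal sketch, with the directory name standing in for the package__MessageType__typehash layout described above:
import pyarrow.parquet as pq

table = pq.read_table("path/to/package__MessageType__typehash/my.parquet")
print(table.schema)     # includes the added _topic and _timestamp columns
print(table.num_rows)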
By default, message IDs are only added when populating nested message types,
as field _id string() with UUID content. To explicitly add ID columns:
--write path/to/my.parquet idgenerator="itertools.count()"
Column type is auto-detected from produced ID values: int64/float64 for numerics,
string for anything else (non-numerics cast to string).
Supports adding supplementary columns with fixed values to Parquet files:
--write path/to/my.parquet column-bag_hash=string:26dfba2c
Supports custom mapping between ROS and pyarrow types with type-rostype=arrowtype:
--write path/to/my.parquet type-time="timestamp('ns')"
--write path/to/my.parquet type-uint8[]="list(uint8())"
Time/duration types are flattened into separate integer columns secs and nsecs,
unless they are mapped to pyarrow types explicitly, like:
--write path/to/my.parquet type-time="timestamp('ns')" type-duration="duration('ns')"
Supports passing additional arguments to pyarrow.parquet.ParquetWriter, as:
--write path/to/my.parquet writer-argname=argvalue
For example, specifying no compression:
--write path/to/my.parquet writer-compression=null
The value is interpreted as JSON if possible, e.g. writer-use_dictionary=false.
To recursively populate nested array fields:
--write path/to/my.parquet nesting=array
E.g. for diagnostic_msgs/DiagnosticArray, this would populate files with the following schemas:
diagnostic_msgs__DiagnosticArray = pyarrow.schema([
("header.seq", pyarrow.int64()),
("header.stamp.secs", pyarrow.int32()),
("header.stamp.nsecs", pyarrow.int32()),
("header.frame_id", pyarrow.string()),
("status", pyarrow.string()), # [_id from "diagnostic_msgs/DiagnosticStatus", ]
("_topic", pyarrow.string()),
("_timestamp", pyarrow.int64()),
("_id", pyarrow.string()),
("_parent_type", pyarrow.string()),
("_parent_id", pyarrow.string()),
])
diagnostic_msgs__DiagnosticStatus = pyarrow.schema([
("level", pyarrow.int16()),
("name", pyarrow.string()),
("message", pyarrow.string()),
("hardware_id", pyarrow.string()),
("values"", pyarrow.string()), # [_id from "diagnostic_msgs/KeyValue", ]
("_topic", pyarrow.string()), # _topic from "diagnostic_msgs/DiagnosticArray"
("_timestamp", pyarrow.int64()), # _timestamp from "diagnostic_msgs/DiagnosticArray"
("_id", pyarrow.string()),
("_parent_type", pyarrow.string()), # "diagnostic_msgs/DiagnosticArray"
("_parent_id", pyarrow.string()), # _id from "diagnostic_msgs/DiagnosticArray"
])
diagnostic_msgs__KeyValue = pyarrow.schema([
("key" pyarrow.string()),
("value", pyarrow.string()),
("_topic", pyarrow.string()), # _topic from "diagnostic_msgs/DiagnosticStatus"
("_timestamp", pyarrow.int64()), # _timestamp from "diagnostic_msgs/DiagnosticStatus"
("_id", pyarrow.string()),
("_parent_type", pyarrow.string()), # "diagnostic_msgs/DiagnosticStatus"
("_parent_id", pyarrow.string()), # _id from "diagnostic_msgs/DiagnosticStatus"
])
Without nesting, array field values are inserted as JSON with full subtype content.
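Such JSON columns can be decoded after reading; a minimal sketch with pyarrow and the standard library, assuming the DiagnosticArray example above was written without nesting:
import json
import pyarrow.parquet as pq

table = pq.read_table("path/to/diagnostic_msgs__DiagnosticArray__typehash/my.parquet")
for row in table.to_pylist():
    # "status" holds the DiagnosticStatus array as one JSON string
    statuses = json.loads(row["status"])
    print(len(statuses), "statuses in message from", row["_topic"])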
To recursively populate all nested message types:
--write path/to/my.parquet nesting=all
E.g. for diagnostic_msgs/DiagnosticArray, this would, in addition to the above, populate:
std_msgs__Header = pyarrow.schema([
("seq", pyarrow.int64()),
("stamp.secs", pyarrow.int32()),
("stamp.nsecs", pyarrow.int32()),
("frame_id", pyarrow.string()),
("_topic", pyarrow.string()), # _topic from "diagnostic_msgs/DiagnosticArray"
("_timestamp", pyarrow.int64()), # _timestamp from "diagnostic_msgs/DiagnosticArray"
("_id", pyarrow.string()),
("_parent_type", pyarrow.string()), # "diagnostic_msgs/DiagnosticArray"
("_parent_id", pyarrow.string()), # _id from "diagnostic_msgs/DiagnosticArray"
])
--plugin grepros.plugins.sql \
--write path/to/my.sql [format=sql] [overwrite=true|false] \
[nesting=array|all] [dialect=clickhouse|postgres|sqlite] \
[dialect-file=path/to/dialects.yaml]
Writes an SQL schema to the output file: a CREATE TABLE for each message type and a CREATE VIEW for each topic.
If the file already exists, a unique counter is appended to the name of the new file,
e.g. my.2.sql, unless overwrite=true is specified.
Specifying format=sql is not required if the filename ends with .sql.
To create tables for nested array message type fields:
--write path/to/my.sql nesting=array
To create tables for all nested message types:
--write path/to/my.sql nesting=all
A specific SQL dialect can be specified (defaults to sqlite):
--write path/to/my.sql dialect=clickhouse|postgres|sqlite
Additional dialects, or updates for existing dialects, can be loaded from a YAML or JSON file:
--write path/to/my.sql dialect=mydialect dialect-file=path/to/dialects.yaml
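Since the output is a plain SQL script, it can be applied with standard tools; a minimal sketch using Python's sqlite3 module, assuming the default sqlite dialect was used:
import sqlite3

with open("path/to/my.sql") as f:
    schema = f.read()
db = sqlite3.connect("my.db")
db.executescript(schema)   # creates the tables and views defined in the script
db.close()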
Supported (but not required) plugin interface methods:
init(args):
    invoked at startup with command-line arguments
load(category, args):
    invoked with category "search" or "source" or "sink",
    using returned value for specified component if not None
Plugins are free to modify grepros internals, like adding command-line arguments
to grepros.main.ARGUMENTS or adding sink types to grepros.outputs.MultiSink.
Convenience methods:
plugins.add_write_format(name, cls, label=None, options=()):
    adds an output plugin to defaults
plugins.get_argument(name):
    returns a command-line argument dictionary, or None
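Putting the above together, a plugin module might look as follows; a minimal sketch in which MySink, its SinkBase parent, and the emit() signature are assumptions for illustration, not documented API:
from grepros import plugins
from grepros.outputs import SinkBase   # base class name is an assumption; see grepros.outputs

class MySink(SinkBase):
    """Hypothetical sink printing the topic of each matched message."""

    def emit(self, topic, msg, stamp, match=None, index=None):
        # method signature assumed for illustration
        print("Match in %s" % topic)

def init(args):
    # invoked at startup with command-line arguments:
    # register MySink as an output format selectable via --write
    plugins.add_write_format("mysink", MySink, label="MySink")

def load(category, args):
    # invoked for "search", "source" and "sink";
    # returning None keeps the default component
    return None
Such a module is loaded like any built-in: --plugin mypackage.mymodule.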