Processing Collected Data

Once you have collected new data with the Stick, you need to process it into a dataset before you can train policies on it. This step shows how to do that.

Clone the Repo

git clone git@github.com:notmahi/dobb-e
cd dobb-e/stick-data-collection

Installation

On your machine, in a new conda/virtual environment, run:
```
mamba env create -f conda_env.yaml
```

Usage

For extracting a single environment:

Compress the video taken with the Record3D app and export the data as .r3d files.
Get the files on your machine.
1. Using Google Drive:
  1. [Only once] Generate a Google Service Account API key so that you can download from private folders on Google Drive. This Stack Overflow answer has instructions: https://stackoverflow.com/a/72076913
  2. [Only once] Rename the .json key file to client_secret.json and put it in the same directory as gdrive_downloader.py.
  3. Upload the .zip file into its own folder on Google Drive, then copy the folder_id from the folder's URL into GDRIVE_FOLDER_ID in the ./do-all.sh file.
2. Manually:
  - Comment out the GDRIVE_FOLDER_ID line in ./do-all.sh and create the following hierarchy locally:
    dataset/
    |--- task1/
    |------ home1/
    |--------- env1/
    |------------ {data_file}.zip
    |--------- env2/
    |------------ {data_file}.zip
    |--------- env.../
    |------------ {data_file}.zip
    |------ home2/
    |------ home.../
    |--- task2/
    |--- task.../
  - The .zip files should contain .r3d files exported from the Record3D app in the previous step.
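
The two options above can be sketched in Python. This is only an illustration, not part of the repository: `drive_folder_id` and `stage_zip` are hypothetical helper names, and the first one assumes the usual Google Drive folder URL shape, where the ID is the path segment after `/folders/`.

```python
import re
import shutil
from pathlib import Path


def drive_folder_id(url):
    """Return the segment after /folders/ in a Google Drive folder URL."""
    match = re.search(r"/folders/([A-Za-z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"no folder ID found in {url!r}")
    return match.group(1)


def stage_zip(zip_path, root, task, home, env):
    """Copy one exported .zip into <root>/<task>/<home>/<env>/.

    Creates the intermediate folders if needed and returns the new path.
    """
    env_dir = Path(root) / task / home / env
    env_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(zip_path, env_dir))
```

For the manual route, calling `stage_zip("export.zip", "dataset", "task1", "home1", "env1")` would place the file at `dataset/task1/home1/env1/export.zip`, matching the hierarchy shown above.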
Modify the required variables in do-all.sh:
1. TASK_NO: the task ID; see gdrive_downloader.py for more information.
2. HOME: the name or ID of the home.
3. ROOT_FOLDER: the folder where the data is stored after downloading.
4. EXPORT_FOLDER: the folder where the dataset is stored after processing. It should be different from ROOT_FOLDER.
5. ENV_NO: the current environment number within the same home and task set.
6. GRIPPER_MODEL_PATH: the path to the gripper model. The model should already be in the GitHub repo, and can also be downloaded from http://dl.dobb-e.com/models/gripper_model.pth.
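
Before running the script, it can help to sanity-check the values you chose. The sketch below is a hypothetical helper, not something shipped with the repo: it does not parse do-all.sh, it only checks values you pass in by hand against the two constraints stated above (EXPORT_FOLDER differs from ROOT_FOLDER, and the gripper model file exists).

```python
from pathlib import Path


def check_settings(root_folder, export_folder, gripper_model_path):
    """Return a list of problems with the chosen settings (empty if none)."""
    problems = []
    # Resolve both folders so that "data" and "./data" count as the same path.
    if Path(root_folder).resolve() == Path(export_folder).resolve():
        problems.append("EXPORT_FOLDER must differ from ROOT_FOLDER")
    if not Path(gripper_model_path).is_file():
        problems.append(f"GRIPPER_MODEL_PATH not found: {gripper_model_path}")
    return problems
```

An empty list means both checks passed; anything else is a message describing what to fix.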
Change the current working directory to the root folder of the local repository and run:
```
./do-all.sh
```

Split the extracted data to include a validation set for each environment. The data should follow the hierarchy below. Be sure to change the corresponding paths in r3d_files.txt to include “_val”.

dataset/
|--- task1/
|------ home1/
|--------- env1/
|--------- env1_val/
|--------- env2/
|--------- env2_val/
|--------- env.../
|------ home2/
|--------- env1/
|--------- env1_val/
|--------- env.../
|------ home.../
|--- task2/
|------ home1/
|--------- env1/
|--------- env1_val/
|--------- env.../
|------ home2/
|--------- env1/
|--------- env1_val/
|--------- env.../
|------ home.../
|--- task.../
|--- r3d_files.txt
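
One way to produce this layout is sketched below. It rests on assumptions the text above does not confirm: that each environment folder contains one subfolder per demonstration, and that you can load the paths from r3d_files.txt as a list of strings (the file's exact format is not described here, so reading and writing it is left out). `split_env` and `update_paths` are hypothetical names.

```python
import random
import shutil
from pathlib import Path


def split_env(env_dir, val_fraction=0.1, seed=0):
    """Move a random subset of demo folders from env_dir to <env_dir>_val.

    Assumes one subfolder per demonstration. Returns a dict mapping each
    moved folder's old path to its new path.
    """
    env_dir = Path(env_dir)
    val_dir = env_dir.with_name(env_dir.name + "_val")
    demos = sorted(p for p in env_dir.iterdir() if p.is_dir())
    # Keep at least one demo for validation, unless there is only one demo.
    n_val = max(1, round(len(demos) * val_fraction)) if len(demos) > 1 else 0
    chosen = random.Random(seed).sample(demos, n_val)
    moved = {}
    for demo in chosen:
        val_dir.mkdir(exist_ok=True)
        target = val_dir / demo.name
        shutil.move(str(demo), str(target))
        moved[str(demo)] = str(target)
    return moved


def update_paths(paths, moved):
    """Rewrite path strings so that moved demos point into the _val folder."""
    updated = []
    for path in paths:
        p = Path(path)
        for old, new in moved.items():
            if p == Path(old) or Path(old) in p.parents:
                p = Path(new) / p.relative_to(old)
                break
        updated.append(str(p))
    return updated
```

After calling `split_env("dataset/task1/home1/env1")`, pass the returned mapping and the path list to `update_paths` and write the result back to r3d_files.txt in whatever format it originally used.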

