
Processing Collected Data

Once you have collected new data with the Stick, you need to process it into a dataset before you can train policies on it. This page walks you through that processing step.

Clone the Repo

git clone git@github.com:notmahi/dobb-e
cd dobb-e/stick-data-collection

Installation

  • On your machine, create a new conda/virtual environment:

    mamba env create -f conda_env.yaml

Usage

For extracting a single environment:

  1. Compress the video recorded with the Record3D app:

    Export Data
  2. Get the files on your machine.

    1. Using Google Drive:

      1. [Only once] Generate a Google Service Account API key so that you can download from private folders on Google Drive. This Stack Overflow answer has instructions on how to do so: https://stackoverflow.com/a/72076913

      2. [Only once] Rename the .json file to client_secret.json and put it in the same directory as gdrive_downloader.py

      3. Upload the .zip file into its own folder on Google Drive, then copy the folder_id from the folder's URL and set it as GDRIVE_FOLDER_ID in the ./do-all.sh file.
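As a minimal sketch of the "copy folder_id from URL" step: the helper below pulls the ID out of a Google Drive folder link. It assumes the link has the usual `.../folders/<folder_id>` shape. The function name `folder_id_from_url` and the example ID are illustrative and not part of the repo.

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL.

    Assumption: the URL looks like https://drive.google.com/drive/folders/<id>,
    optionally followed by a query string such as ?usp=sharing.
    """
    # urlparse().path excludes the query string, so "?usp=..." is ignored.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "folders" not in parts:
        raise ValueError(f"no 'folders' segment in URL: {url}")
    idx = parts.index("folders")
    if idx + 1 >= len(parts):
        raise ValueError(f"no folder id after 'folders' in URL: {url}")
    return parts[idx + 1]


if __name__ == "__main__":
    # Made-up ID, for illustration only.
    print(folder_id_from_url("https://drive.google.com/drive/folders/abc123XYZ?usp=sharing"))
```

The returned string is the value to paste into GDRIVE_FOLDER_ID in ./do-all.sh.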

    2. Manually:

      • Comment out the GDRIVE_FOLDER_ID line in ./do-all.sh and create the following hierarchy locally:

        dataset/
        |--- task1/
        |------ home1/
        |--------- env1/
        |------------ {data_file}.zip
        |--------- env2/
        |------------ {data_file}.zip
        |--------- env.../
        |------------ {data_file}.zip
        |------ home2/
        |------ home.../
        |--- task2/
        |--- task.../
      • The .zip files should contain .r3d files exported from the Record3D app in the previous step.
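If you prefer to script the manual layout, the sketch below copies one exported .zip into `dataset/<task>/<home>/<env>/`, after first checking that the archive actually contains at least one .r3d file. The helper names `r3d_members` and `place_zip` are hypothetical, and the `task1`/`home1`/`env1` folder names simply mirror the placeholders in the tree above. Check gdrive_downloader.py and do-all.sh for the naming the pipeline really expects.

```python
import shutil
import zipfile
from pathlib import Path


def r3d_members(zip_path):
    """List the names of all .r3d entries inside the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return [n for n in zf.namelist() if n.lower().endswith(".r3d")]


def place_zip(zip_path, dataset_root, task, home, env):
    """Copy zip_path to dataset_root/task/home/env/ and return the new path.

    Raises ValueError if the archive holds no .r3d files.
    """
    zip_path = Path(zip_path)
    if not r3d_members(zip_path):
        raise ValueError(f"{zip_path} contains no .r3d files")
    env_dir = Path(dataset_root) / task / home / env
    env_dir.mkdir(parents=True, exist_ok=True)  # creates missing parents too
    return Path(shutil.copy2(zip_path, env_dir / zip_path.name))


if __name__ == "__main__":
    # Placeholder names; replace with your own export and folder names.
    print(place_zip("export.zip", "dataset", "task1", "home1", "env1"))
```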

  3. Modify required variables in do-all.sh.

    1. TASK_NO: task ID; see gdrive_downloader.py for more information.

    2. HOME: name or ID of the home.

    3. ROOT_FOLDER: folder where the data is stored after downloading.

    4. EXPORT_FOLDER: folder where the dataset is stored after processing. It should be different from ROOT_FOLDER.

    5. ENV_NO: current environment number within the same home and task set.

    6. GRIPPER_MODEL_PATH: path to the gripper model. It should already be in the GitHub repo, and can also be downloaded from http://dl.dobb-e.com/models/gripper_model.pth.
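Two of these settings are easy to get wrong: EXPORT_FOLDER must not point at the same place as ROOT_FOLDER, and GRIPPER_MODEL_PATH must point at an existing file. The following is a small pre-flight check you could run yourself before launching the script. It is a sketch only: `check_settings` is a hypothetical helper, not something do-all.sh provides, and the example paths are placeholders.

```python
from pathlib import Path


def check_settings(root_folder, export_folder, gripper_model_path):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # Resolve both folders so "data" and "./data/" compare as equal.
    root = Path(root_folder).expanduser().resolve()
    export = Path(export_folder).expanduser().resolve()
    if root == export:
        problems.append("EXPORT_FOLDER must differ from ROOT_FOLDER")
    if not Path(gripper_model_path).expanduser().is_file():
        problems.append(f"GRIPPER_MODEL_PATH is not a file: {gripper_model_path}")
    return problems


if __name__ == "__main__":
    # Placeholder values; use the ones you set in do-all.sh.
    for problem in check_settings("data/raw", "data/export", "gripper_model.pth"):
        print("WARNING:", problem)
```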

  4. Change the current working directory to the root folder of your local repository and run:

    ./do-all.sh
  5. Split the extracted data so that each environment has its own validation set. The data should follow the hierarchy below. (Be sure to change the corresponding paths in r3d_files.txt to include “_val”.)

    dataset/
    |--- task1/
    |------ home1/
    |--------- env1/
    |--------- env1_val/
    |--------- env2/
    |--------- env2_val/
    |--------- env.../
    |------ home2/
    |--------- env1/
    |--------- env1_val/
    |--------- env.../
    |------ home.../
    |--- task2/
    |------ home1/
    |--------- env1/
    |--------- env1_val/
    |--------- env.../
    |------ home2/
    |--------- env1/
    |--------- env1_val/
    |--------- env.../
    |------ home.../
    |--- task.../
    |--- r3d_files.txt
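Updating r3d_files.txt by hand is error-prone once several trajectories have moved. The sketch below rewrites the affected paths at the text level. It assumes, without confirmation from this page, that every moved item appears in r3d_files.txt as a path containing `<env_dir>/<name>`. The function name `mark_as_validation` and the trajectory names `10` and `100` are illustrative. It does not move any files; it only edits the listing text.

```python
import re
from typing import Iterable


def mark_as_validation(listing: str, env_dir: str, moved: Iterable[str]) -> str:
    """Rewrite '<env_dir>/<name>' to '<env_dir>_val/<name>' for each moved name.

    The lookahead stops 'env1/10' from also matching inside 'env1/100'.
    """
    for name in moved:
        old = f"{env_dir}/{name}"
        new = f"{env_dir}_val/{name}"
        pattern = re.escape(old) + r"(?![A-Za-z0-9_])"
        # A function replacement avoids any backslash interpretation in `new`.
        listing = re.sub(pattern, lambda _m, new=new: new, listing)
    return listing


if __name__ == "__main__":
    text = (
        "dataset/task1/home1/env1/10/a.r3d\n"
        "dataset/task1/home1/env1/100/b.r3d\n"
    )
    print(mark_as_validation(text, "dataset/task1/home1/env1", ["10"]))
```

Because already-rewritten paths contain `env1_val/` rather than `env1/`, running the function twice on the same text leaves it unchanged. Keep a backup of r3d_files.txt and inspect the result before training.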
