# Processing Collected Data
Once you have collected new data with the Stick, you need to process it into a dataset before you can train policies on it. This section walks you through that step.
## Clone the Repo

```bash
git clone git@github.com:notmahi/dobb-e
cd dobb-e/stick-data-collection
```

## Installation
- On your machine, in a new conda/virtual environment:

  ```bash
  mamba env create -f conda_env.yaml
  ```
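If you prefer conda over mamba, the same file works; a minimal sketch follows. The environment name here is a placeholder — the real name comes from the `name:` field in `conda_env.yaml`.

```bash
# conda works as a drop-in replacement for mamba here.
conda env create -f conda_env.yaml

# Activate the environment before running the scripts below.
# "stick-data" is a placeholder; use the name: field from conda_env.yaml.
conda activate stick-data
```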
## Usage
For extracting a single environment:
- Compress the video taken from the Record3D app using the app's Export Data option.
- Get the files onto your machine.
  - Using Google Drive:
    - [Only once] Generate a Google Service Account API key to download from private folders on Google Drive. Instructions are in this Stack Overflow answer: https://stackoverflow.com/a/72076913
    - [Only once] Rename the `.json` key file to `client_secret.json` and put it in the same directory as `gdrive_downloader.py`.
    - Upload the `.zip` file into its own folder on Google Drive, and copy the folder ID from the URL into the `GDRIVE_FOLDER_ID` variable in `./do-all.sh` (the URL format is sketched below).
 
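  For reference, the folder ID is the last path segment of the Drive folder URL; the ID below is a made-up example, not a real folder:

  ```bash
  # A Google Drive folder URL has the form:
  #   https://drive.google.com/drive/folders/<FOLDER_ID>
  # Copy the trailing segment into do-all.sh:
  GDRIVE_FOLDER_ID="1AbCdEfGhIjKlMnOpQrStUv"   # placeholder value
  ```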
  - Manually:
    - Comment out the `GDRIVE_FOLDER_ID` line in `./do-all.sh` and create the following hierarchy locally (a sketch of the `mkdir` commands follows this list):

      ```
      dataset/
      |--- task1/
      |------ home1/
      |--------- env1/
      |------------ {data_file}.zip
      |--------- env2/
      |------------ {data_file}.zip
      |--------- env.../
      |------------ {data_file}.zip
      |------ home2/
      |------ home.../
      |--- task2/
      |--- task.../
      ```

    - The `.zip` files should contain the `.r3d` files exported from the Record3D app in the previous step.
 
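  A minimal sketch for creating that hierarchy; the task/home/env names and the `.zip` filename are placeholders, not required values:

  ```bash
  # Create one env folder per recording and drop the exported .zip inside.
  # Folder names are placeholders; use your own task/home/env labels.
  mkdir -p dataset/task1/home1/env1 dataset/task1/home1/env2
  cp ~/Downloads/recording_export.zip dataset/task1/home1/env1/
  ```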
 
- Modify the required variables in `do-all.sh` (an illustrative example follows this list):
  - `TASK_NO`: task ID; see `gdrive_downloader.py` for more information.
  - `HOME`: name or ID of the home.
  - `ROOT_FOLDER`: folder where the data is stored after downloading.
  - `EXPORT_FOLDER`: folder where the dataset is stored after processing. Must be different from `ROOT_FOLDER`.
  - `ENV_NO`: current environment number within the same home and task set.
  - `GRIPPER_MODEL_PATH`: path to the gripper model. It should already be in the GitHub repo, and can also be downloaded from http://dl.dobb-e.com/models/gripper_model.pth.
 
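For illustration, the variable block in `do-all.sh` might look like this once edited; every value below is a placeholder, not one shipped with the repo:

```bash
# Placeholder values -- substitute your own task/home/environment labels.
TASK_NO=1                                    # task id; see gdrive_downloader.py
HOME=home1                                   # name or ID of the home
ROOT_FOLDER=data/raw                         # downloaded data lands here
EXPORT_FOLDER=data/processed                 # must differ from ROOT_FOLDER
ENV_NO=1                                     # environment number in this home/task
GRIPPER_MODEL_PATH=./gripper_model.pth       # in the repo, or from dl.dobb-e.com
GDRIVE_FOLDER_ID=1AbCdEfGhIjKlMnOpQrStUv     # comment out for manual download
```

If the gripper model is missing from your checkout, it can be fetched with `wget http://dl.dobb-e.com/models/gripper_model.pth`.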
- Change the current working directory to the local repository's root folder and run `./do-all.sh`.
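Concretely, assuming `do-all.sh` lives in `stick-data-collection` as the clone step at the top of this page suggests:

```bash
cd dobb-e/stick-data-collection
./do-all.sh
```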
- Split the extracted data to include a validation set for each environment (be sure to change the corresponding paths in `r3d_files.txt` to include "`_val`"). The data should follow the hierarchy below; one ad-hoc way to do the split is sketched after it.

  ```
  dataset/
  |--- task1/
  |------ home1/
  |--------- env1/
  |--------- env1_val/
  |--------- env2/
  |--------- env2_val/
  |--------- env.../
  |------ home2/
  |--------- env1/
  |--------- env1_val/
  |--------- env.../
  |------ home.../
  |--- task2/
  |------ home1/
  |--------- env1/
  |--------- env1_val/
  |--------- env.../
  |------ home2/
  |--------- env1/
  |--------- env1_val/
  |--------- env.../
  |------ home.../
  |--- task.../
  |--- r3d_files.txt
  ```
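The docs do not prescribe a splitting tool; the sketch below is one ad-hoc way to do it, assuming each `env` folder holds one subdirectory per extracted episode (if the extractor lays files out differently, adapt accordingly):

```bash
#!/usr/bin/env bash
# Sketch: move the last episode of every environment into an env*_val folder.
# Assumes dataset/task*/home*/env*/ each contain per-episode subdirectories.
for env in dataset/task*/home*/env*/; do
  case "$env" in *_val/) continue ;; esac    # skip validation folders themselves
  val="${env%/}_val"
  last=$(find "$env" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1)
  if [ -n "$last" ]; then
    mkdir -p "$val"
    mv "$last" "$val/"
  fi
done
```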