# Processing Collected Data

## Clone the Repo

```bash
git clone git@github.com:notmahi/dobb-e
cd dobb-e/stick-data-collection
```

## Installation

* On your machine, create a new conda environment from `conda_env.yaml`:

  ```bash
  mamba env create -f conda_env.yaml
  ```

## Usage

To extract a single environment:

1. Compress the video recorded with the Record3D app:

   ![Export Data](/files/GhBKBjdV56PbCaVP30UO)
2. Get the files on your machine.
   1. **Using Google Drive:**
      1. \[Only once] Generate a Google Service Account API key so you can download from private folders on Google Drive. This Stack Overflow answer has instructions: <https://stackoverflow.com/a/72076913>
      2. \[Only once] Rename the .json key file to `client_secret.json` and put it in the same directory as `gdrive_downloader.py`.
      3. Upload the `.zip` file into its own folder on Google Drive, then copy the folder ID from the folder's URL and set it as `GDRIVE_FOLDER_ID` in `./do-all.sh`.
   2. **Manually:**
      * Comment out the `GDRIVE_FOLDER_ID` line in `./do-all.sh` and create the following hierarchy locally:

        ```bash
        dataset/
        |--- task1/
        |------ home1/
        |--------- env1/
        |------------ {data_file}.zip
        |--------- env2/
        |------------ {data_file}.zip
        |--------- env.../
        |------------ {data_file}.zip
        |------ home2/
        |------ home.../
        |--- task2/
        |--- task.../
        ```
      * Each `.zip` file should contain the `.r3d` files exported from the Record3D app in step 1.
3. Modify the required variables in `do-all.sh`:
   1. `TASK_NO`: task ID; see `gdrive_downloader.py` for more information.
   2. `HOME`: name or ID of the home.
   3. `ROOT_FOLDER`: folder where the data is stored after downloading.
   4. `EXPORT_FOLDER`: folder where the dataset is stored after processing. It should be different from `ROOT_FOLDER`.
   5. `ENV_NO`: current environment number within the same home and task set.
   6. `GRIPPER_MODEL_PATH`: path to the gripper model. The model should already be in the GitHub repo; it can also be downloaded from <http://dl.dobb-e.com/models/gripper_model.pth>.
4. Change the current working directory to the root folder of the local repository and run:

   ```bash
   ./do-all.sh
   ```
5. Split the extracted data to include a validation set for each environment. The data should follow the hierarchy below. Be sure to change the corresponding paths in `r3d_files.txt` to include `_val`.

   ```bash
   dataset/
   |--- task1/
   |------ home1/
   |--------- env1/
   |--------- env1_val/
   |--------- env2/
   |--------- env2_val/
   |--------- env.../
   |------ home2/
   |--------- env1/
   |--------- env1_val/
   |--------- env.../
   |------ home.../
   |--- task2/
   |------ home1/
   |--------- env1/
   |--------- env1_val/
   |--------- env.../
   |------ home2/
   |--------- env1/
   |--------- env1_val/
   |--------- env.../
   |------ home.../
   |--- task.../
   |--- r3d_files.txt
   ```
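
The split in step 5 is done by hand, so a small helper can keep the moved directories and the paths listed in `r3d_files.txt` consistent. The following is a minimal sketch, not part of the repository. It assumes that each item to hold out lives somewhere under `dataset/<task>/<home>/<env>/`, and that the entries of `r3d_files.txt` have already been loaded as a list of path strings (the file format is not described on this page, so reading and writing it is left out). The function names `to_val_path`, `move_to_val`, and `rewrite_listed_paths`, and the `traj3` directory name, are hypothetical.

```python
import shutil
from pathlib import Path, PurePosixPath


def to_val_path(path: str, root: str) -> str:
    """Rename the <env> component of root/<task>/<home>/<env>/... to <env>_val."""
    parts = list(PurePosixPath(path).relative_to(root).parts)
    if len(parts) < 3:
        raise ValueError(f"expected task/home/env under {root!r}, got {path!r}")
    if not parts[2].endswith("_val"):  # leave paths that are already in a _val env
        parts[2] += "_val"
    return str(PurePosixPath(root, *parts))


def move_to_val(item: Path, root: Path) -> Path:
    """Move one file or directory into the matching <env>_val directory."""
    dest = Path(to_val_path(item.as_posix(), root.as_posix()))
    if dest.exists():
        raise FileExistsError(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(item), str(dest))
    return dest


def rewrite_listed_paths(paths, root, moved):
    """Rewrite only the listed paths that were moved to a validation env."""
    moved = set(moved)
    return [to_val_path(p, root) if p in moved else p for p in paths]


if __name__ == "__main__":
    # Assumed example path; "traj3" is a made-up directory name.
    print(to_val_path("dataset/task1/home1/env1/traj3", "dataset"))
    # dataset/task1/home1/env1_val/traj3
```

Which items to hold out, and how many, is a choice this page does not specify. The sketch only guarantees that a moved item and its rewritten entry end up pointing at the same `_val` location.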

