SDK_Document_Lens_Bulk

Installation

[ ]:
!python3 -m pip install bioturing_connector

1. Connect to host server:

Run this step before any further analysis.

The user’s token is generated on the host website.

[42]:
import numpy as np
import pandas as pd
from bioturing_connector.typing import Species
from bioturing_connector.typing import ChunkSize
from bioturing_connector.typing import StudyType
from bioturing_connector.typing import StudyUnit
from bioturing_connector.typing import InputMatrixType
from bioturing_connector.lens_bulk_connector import LensBulkConnector

connector = LensBulkConnector(
  host="https://talk2data.bioturing.com/lens_bulk/",
  token="930e9375d5164aa7a4a36593a52c6cd5",
  ssl=True
)
[43]:
connector.test_connection()
Connecting to host at https://talk2data.bioturing.com/lens_bulk/api/v1/test_connection
Connection successful: BioTuring Lens Bulk server
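
The connection cell above embeds the token directly in the notebook. As a minimal sketch of an alternative, the token could be read from an environment variable so that it does not end up in shared notebooks. The variable name `BIOTURING_TOKEN` and the helper `load_token` are assumptions made for this illustration; they are not part of bioturing_connector.

```python
import os

# Hypothetical variable name; pick whatever fits your environment.
TOKEN_ENV_VAR = "BIOTURING_TOKEN"


def load_token(env=os.environ):
    """Return the token stored under TOKEN_ENV_VAR, or raise if it is missing."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before connecting")
    return token


# Demonstration with a stand-in mapping instead of the real environment.
print(load_token({TOKEN_ENV_VAR: "my-secret-token"}))
```

The returned string can then be passed as the `token` argument of `LensBulkConnector`.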

2. List groups, studies and s3

2.1. Get info of available groups

[4]:
user_groups = connector.get_user_groups()
user_groups
[4]:
[{'group_id': '48ba44afb7f14f51a6f6f1dc6f4c3ea9', 'group_name': 'Demo'},
 {'group_id': 'all_members', 'group_name': 'All members'},
 {'group_id': 'bioturing_public_studies',
  'group_name': 'BioTuring Public Studies'},
 {'group_id': 'd32297be0bb543688994cca14f58b14e',
  'group_name': 'BioTuring Spatial'},
 {'group_id': 'personal', 'group_name': 'Personal workspace'}]
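
Later steps take a `group_id`, while users usually remember the group name. A small sketch of a lookup over the list shown above follows; `find_group_id` is a hypothetical helper, and the sample list copies two entries from the output of step 2.1.

```python
def find_group_id(groups, group_name):
    """Return the group_id whose group_name matches exactly."""
    matches = [g["group_id"] for g in groups if g["group_name"] == group_name]
    if len(matches) != 1:
        raise LookupError(f"Expected one group named {group_name!r}, found {len(matches)}")
    return matches[0]


# Entries copied from the output of get_user_groups() above.
sample_groups = [
    {"group_id": "48ba44afb7f14f51a6f6f1dc6f4c3ea9", "group_name": "Demo"},
    {"group_id": "personal", "group_name": "Personal workspace"},
]
print(find_group_id(sample_groups, "Demo"))
```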

2.2. List all available studies in a group

[7]:
# Using group_id from step 2.1

study_list = connector.get_all_studies_info_in_group(
  group_id='personal',
  species=Species.HUMAN.value,
)
study_list
[7]:
[{'uuid': 'f6f4c94460af44fabaa07ac77087351c',
  'study_title': 'TBD',
  'study_hash_id': 'MERGED_VISIUM',
  'created_by': 'dev@bioturing.com'},
 {'uuid': 'b25ff33bead3453680e802963d3e9caf',
  'study_title': 'TBD',
  'study_hash_id': 'GEOMX',
  'created_by': 'dev@bioturing.com'}]
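
Sections 4 and 5 need the study `uuid`, not the `study_hash_id`. One possible way to move between the two is to index the returned list by hash ID. The helper name `index_studies_by_hash_id` is an assumption for this sketch, and a list is kept per hash ID because the source does not say whether hash IDs are unique.

```python
def index_studies_by_hash_id(studies):
    """Map each study_hash_id to the list of uuids that carry it."""
    index = {}
    for study in studies:
        index.setdefault(study["study_hash_id"], []).append(study["uuid"])
    return index


# Entries copied from the output of get_all_studies_info_in_group() above.
sample_studies = [
    {"uuid": "f6f4c94460af44fabaa07ac77087351c", "study_hash_id": "MERGED_VISIUM"},
    {"uuid": "b25ff33bead3453680e802963d3e9caf", "study_hash_id": "GEOMX"},
]
study_index = index_studies_by_hash_id(sample_studies)
print(study_index["GEOMX"])
```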

2.3. List all s3 buckets of the current user

[ ]:
connector.get_user_s3()
[{'id': '505e49d2abee405f8a7b4ce2628d5270',
  'bucket': 'bioturingdebug',
  'prefix': ''},
 {'id': 'd938706094354d7eb4726d6c9b07de9c',
  'bucket': 'talk2data',
  'prefix': ''}]

2.4. List all shared s3 of a group

[ ]:
connector.get_shared_s3_of_group('all_members')
[]

3. Submit study

NOTE: Get group_id from step “2.1. Get info of available groups”

3.1. Option 1: Submit study from s3

Parameters:
----
group_id: str
      ID of the group to submit the data to.
s3_id: str
      ID of s3 bucket. Default: None
      If s3_id is not provided, we will use the first s3 bucket configured on the platform.
batch_info: List[dict]
      File path and batch name information, the path DOES NOT include bucket path configured on platform!
      Example:
        For DSP format:
          [{
            'matrix': 's3_path/data_1/matrix.xlsx',
            'image': 's3_path/data_1/image.ome.tiff',
          }, {...}]
        For Visium format:
          [{
            'matrix': 's3_path/data_1/matrix.h5',
            'image': 's3_path/data_1/image.tiff',
            'position': 's3_path/data_1/tissue_positions_list.csv',
            'scale': 's3_path/data_1/scalefactors_json.json'
          }, {...}]
        For Visium RDS format:
          [{
            'matrix': 's3_path/GSE128223_1.rds'
          }, {...}]
        For Visium Anndata format:
          [{
            'matrix': 's3_path/GSE128223_1.h5ad'
          }, {...}]
study_id: str
      Will be name of study (eg: VISIUM_PBMC)
      If no value is provided, default id will be a random uuidv4 string
name: str
      Name of the study.
authors: List[str]
      Authors of the study.
abstract: str
      Abstract of the study.
species: str
      Species of the study.
      Support:
            Species.HUMAN.value
            Species.MOUSE.value
            Species.NON_HUMAN_PRIMATE.value
            Species.OTHERS.value
study_type: int
      Format of the study
      Support:
            StudyType.DSP.value
            StudyType.VISIUM.value
            StudyType.VISIUM_RDS.value
            StudyType.VISIUM_ANN.value

3.1.1. Visium format

[52]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 'demo_data/visium_test/Visium_FFPE_Human_Prostate_IF_filtered_feature_bc_matrix.h5',
    'image': 'demo_data/visium_test/tissue_hires_image.png',
    'position': 'demo_data/visium_test/tissue_positions.csv',
    'scale': 'demo_data/visium_test/scalefactors_json.json',
}, {...}]

connector.submit_study_from_s3(
  group_id='personal',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM.value
)
[2023-09-26 06:24] Waiting in queue
[2023-09-26 06:24] Downloading demo_data/visium_test/Visium_FFPE_Human_Prostate_IF_filtered_feature_bc_matrix.h5 from s3: 262.1 KB / 19.0 MB
[2023-09-26 06:24] Downloading demo_data/visium_test/tissue_hires_image.png from s3: 262.1 KB / 4.4 MB
[2023-09-26 06:24] Downloading demo_data/visium_test/tissue_positions.csv from s3: 191.8 KB / 191.8 KB
[2023-09-26 06:24] Downloading demo_data/visium_test/scalefactors_json.json from s3: 148 B / 148 B
[2023-09-26 06:24] File downloaded
[2023-09-26 06:24] Reading batch: visium_test
[2023-09-26 06:24] [visium_test] Indexing matrix
[2023-09-26 06:24] [visium_test] Indexing images
[2023-09-26 06:24] Finish batch: visium_test
[2023-09-26 06:24] Preprocessing expression matrix: 3460 cells x 17943 genes
[2023-09-26 06:24] Filtered: 3460 cells remain
[2023-09-26 06:24] Waiting in queue (matrix processing)
[2023-09-26 06:24] Normalizing expression matrix (matrix processing)
[2023-09-26 06:24] Running PCA (matrix processing)
[2023-09-26 06:24] Running venice binarizer (matrix processing)
[2023-09-26 06:25] Study was successfully submitted
[2023-09-26 06:25] DONE!!!
Study submitted successfully!
[52]:
True

3.1.2. DSP format

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/data_1/matrix.xlsx',
    'image': 's3_path/data_1/image.ome.tiff',
  }, {...}]

connector.submit_study_from_s3(
  group_id='personal',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.DSP.value
)

3.1.3. Visium Scanpy object

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/GSE128223_1.h5ad'
}, {...}]

connector.submit_study_from_s3(
  group_id='personal',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_ANN.value
)

3.1.4. Visium Seurat object

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/GSE128223_1.rds'
}, {...}]

connector.submit_study_from_s3(
  group_id='personal',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_RDS.value
)

3.2. Option 2: Submit study from local machine

Parameters:
------
group_id: str
      ID of the group to submit the data to.
batch_info: List[dict]
      File path and batch name information
      Example:
        For DSP format:
          [{
            'name': 'data_1',
            'matrix': 'local_path/data_1/matrix.xlsx',
            'image': 'local_path/data_1/image.ome.tiff',
          }, {...}]
        For Visium format:
          [{
            'name': 'data_1',
            'matrix': 'local_path/data_1/matrix.h5',
            'image': 'local_path/data_1/image.tiff',
            'position': 'local_path/data_1/tissue_positions_list.csv',
            'scale': 'local_path/data_1/scalefactors_json.json'
          }, {...}]
        For Visium RDS format:
          [{
            'matrix': 'local_path/GSE128223_1.rds'
          }, {...}]
        For Visium Anndata format:
          [{
            'matrix': 'local_path/GSE128223_1.h5ad'
          }, {...}]
study_id: str
      If no value is provided, default id will be a random uuidv4 string
name: str
      Name of the study.
authors: List[str]
      Authors of the study.
abstract: str
      Abstract of the study.
species: str
      Species of the study.
      Support:
            Species.HUMAN.value
            Species.MOUSE.value
            Species.NON_HUMAN_PRIMATE.value
            Species.OTHERS.value
study_type: int
      Format of the study
      Support:
            StudyType.DSP.value
            StudyType.VISIUM.value
            StudyType.VISIUM_RDS.value
            StudyType.VISIUM_ANN.value
chunk_size: int
      Size of each chunk for uploading. Default: ChunkSize.CHUNK_100_MB.value
      Support:
            ChunkSize.CHUNK_5_MB.value
            ChunkSize.CHUNK_100_MB.value
            ChunkSize.CHUNK_500_MB.value
            ChunkSize.CHUNK_1_GB.value

3.2.1. Visium format

[4]:
## Support multiple batches per submission
batch_info = [{
    'name': 'test_visium',
    'matrix': '/data/dev/SonVo/visium_test/Visium_FFPE_Human_Prostate_IF_filtered_feature_bc_matrix.h5',
    'image': '/data/dev/SonVo/visium_test/tissue_hires_image.png',
    'position': '/data/dev/SonVo/visium_test/tissue_positions.csv',
    'scale': '/data/dev/SonVo/visium_test/scalefactors_json.json',
}]
connector.submit_study_from_local(
  group_id='personal',
  batch_info=batch_info,
  study_id='test_visium',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM.value,
)
test_visiummatrix.h5 - chunk_0:  18%|███▋                | 18.1M/100M [00:00<00:01, 84.8MMB/s]
test_visiumhires.png - chunk_0:   4%|▉                   | 4.24M/100M [00:00<00:01, 62.5MMB/s]
test_visiumposition.csv - chunk_0:   0%|▎                   | 188k/100M [00:00<00:13, 7.72MMB/s]
test_visiumscale.json - chunk_0:   0%|                    | 510/100M [00:00<1:06:43, 26.2kMB/s]
[2023-09-26 06:36] Waiting in queue
[2023-09-26 06:36] Reading batch: test_visium
[2023-09-26 06:36] [test_visium] Indexing matrix
[2023-09-26 06:36] [test_visium] Indexing images
[2023-09-26 06:36] Finish batch: test_visium
[2023-09-26 06:36] Preprocessing expression matrix: 3460 cells x 17943 genes
[2023-09-26 06:36] Filtered: 3460 cells remain
[2023-09-26 06:36] Waiting in queue (matrix processing)
[2023-09-26 06:36] Normalizing expression matrix (matrix processing)
[2023-09-26 06:36] Running PCA (matrix processing)
[2023-09-26 06:36] Running kNN (matrix processing)
[2023-09-26 06:36] Study was successfully submitted
[2023-09-26 06:36] DONE!!!
Study submitted successfully!
[4]:
True
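
Because a local submission uploads every file listed in `batch_info`, it can help to confirm that the paths exist before starting. The following is a minimal sketch under the assumption that the file-bearing keys are the four shown in the examples above; `find_missing_files` is a hypothetical helper.

```python
from pathlib import Path

# Keys that hold file paths in the batch_info examples above.
FILE_KEYS = ("matrix", "image", "position", "scale")


def find_missing_files(batch_info):
    """Return every listed path that does not point to an existing file."""
    missing = []
    for batch in batch_info:
        for key in FILE_KEYS:
            path = batch.get(key)
            if path is not None and not Path(path).is_file():
                missing.append(path)
    return missing


# Placeholder paths from the parameter description; they do not exist locally.
example = [{"name": "data_1", "matrix": "local_path/data_1/matrix.xlsx"}]
print(find_missing_files(example))
```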

3.2.2. DSP format

[ ]:
## Support multiple batches per submission
batch_info = [{
    'name': 'data_1',
    'matrix': 'local_path/data_1/matrix.xlsx',
    'image': 'local_path/data_1/image.ome.tiff',
}, {...}]

connector.submit_study_from_local(
  group_id='personal',
  batch_info=batch_info,
  study_id='test_visium',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.DSP.value,
)

3.2.3. Visium Scanpy object

[ ]:
## Support multiple batches per submission
batch_info = [{
    'matrix': 'local_path/GSE128223_1.h5ad'
}, {...}]

connector.submit_study_from_local(
  group_id='personal',
  batch_info=batch_info,
  study_id='test_visium',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_ANN.value,
)

3.2.4. Visium Seurat object

[ ]:
## Support multiple batches per submission
batch_info = [{
    'matrix': 'local_path/GSE128223_1.rds'
}, {...}]

connector.submit_study_from_local(
  group_id='personal',
  batch_info=batch_info,
  study_id='test_visium',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_RDS.value,
)

3.3. Option 3: Submit study with shared s3 of a group

Parameters:
----
group_id: str
      ID of the group to submit the data to.
shared_s3_id: str
      ID of s3 bucket.
batch_info: List[dict]
      File path and batch name information, the path DOES NOT include bucket path configured on platform!
      Example:
        For DSP format:
          [{
            'matrix': 's3_path/data_1/matrix.xlsx',
            'image': 's3_path/data_1/image.ome.tiff',
          }, {...}]
        For Visium format:
          [{
            'matrix': 's3_path/data_1/matrix.h5',
            'image': 's3_path/data_1/image.tiff',
            'position': 's3_path/data_1/tissue_positions_list.csv',
            'scale': 's3_path/data_1/scalefactors_json.json'
          }, {...}]
        For Visium RDS format:
          [{
            'matrix': 's3_path/GSE128223_1.rds'
          }, {...}]
        For Visium Anndata format:
          [{
            'matrix': 's3_path/GSE128223_1.h5ad'
          }, {...}]
study_id: str
      Will be name of study (eg: VISIUM_PBMC)
      If no value is provided, default id will be a random uuidv4 string
name: str
      Name of the study.
authors: List[str]
      Authors of the study.
abstract: str
      Abstract of the study.
species: str
      Species of the study.
      Support:
            Species.HUMAN.value
            Species.MOUSE.value
            Species.NON_HUMAN_PRIMATE.value
            Species.OTHERS.value
study_type: int
      Format of the study
      Support:
            StudyType.DSP.value
            StudyType.VISIUM.value
            StudyType.VISIUM_RDS.value
            StudyType.VISIUM_ANN.value

3.3.1. Visium format

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 'demo_data/visium_test/Visium_FFPE_Human_Prostate_IF_filtered_feature_bc_matrix.h5',
    'image': 'demo_data/visium_test/tissue_hires_image.png',
    'position': 'demo_data/visium_test/tissue_positions.csv',
    'scale': 'demo_data/visium_test/scalefactors_json.json',
}, {...}]

connector.submit_study_from_shared_s3(
  group_id='6b3cfc27fa694779a1b2a5015e438b94',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM.value,
  shared_s3_id='15de18d355b4ce0a1u512a5b45c8e3c'
)

3.3.2. DSP format

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/data_1/matrix.xlsx',
    'image': 's3_path/data_1/image.ome.tiff',
  }, {...}]

connector.submit_study_from_shared_s3(
  group_id='6b3cfc27fa694779a1b2a5015e438b94',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.DSP.value,
  shared_s3_id='15de18d355b4ce0a1u512a5b45c8e3c'
)

3.3.3. Visium Scanpy object

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/GSE128223_1.h5ad'
}, {...}]

connector.submit_study_from_shared_s3(
  group_id='6b3cfc27fa694779a1b2a5015e438b94',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_ANN.value,
  shared_s3_id='15de18d355b4ce0a1u512a5b45c8e3c'
)

3.3.4. Visium Seurat object

[ ]:
## The path DOES NOT include the bucket path configured on platform
## Support multiple batches per submission
batch_info = [{
    'matrix': 's3_path/GSE128223_1.rds'
}, {...}]

connector.submit_study_from_shared_s3(
  group_id='6b3cfc27fa694779a1b2a5015e438b94',
  batch_info=batch_info,
  study_id='visium_test',
  name='This is my first study',
  authors=['Huy Nguyen', 'Thao Truong'],
  species=Species.HUMAN.value,
  study_type=StudyType.VISIUM_RDS.value,
  shared_s3_id='15de18d355b4ce0a1u512a5b45c8e3c'
)

4. Submit metadata

NOTE: Get group_id and study_id (uuid) from step “2. List groups, studies and s3”

4.1. Submit a dataframe directly

This is example metadata. The Barcodes column must be the DataFrame index.

[9]:
meta_df = pd.read_csv('MERGED_VISIUM_metadata.tsv', sep='\t', index_col=0)
meta_df
[9]:
                                Batches
Barcodes
spatial_TATGGCAGACTTTCGA-1      spatial
spatial_CTTCGTGCCCGCATCG-1      spatial
spatial_AAACGGGTTGGTATCC-1      spatial
spatial_TGCAAACCCACATCAA-1      spatial
spatial_GACGGGATGTCTTATG-1      spatial
...                             ...
visium_test_AGTATACACAGCGACA-1  visium_test
visium_test_TGTGGTTGCTAAAGCT-1  visium_test
visium_test_TGATTCCCGGTTACCT-1  visium_test
visium_test_AACATTGTGACTCGAG-1  visium_test
visium_test_GCTCTTTCCGCTAGTG-1  visium_test

7495 rows × 1 columns
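
Since the metadata is keyed by barcode, a quick comparison against the study's barcodes can catch mismatches before submission. The sketch below is one way to do that. Whether the platform requires an exact match is not stated in this document, so treat the check as a convenience. `compare_barcodes` is a hypothetical helper.

```python
def compare_barcodes(metadata_barcodes, study_barcodes):
    """Report barcodes present on only one side."""
    meta = set(metadata_barcodes)
    study = set(study_barcodes)
    return {
        "missing_from_metadata": sorted(study - meta),
        "unknown_to_study": sorted(meta - study),
    }


# In practice the inputs could be meta_df.index and the list returned by
# connector.get_barcodes(...); short literal lists are used here instead.
report = compare_barcodes(
    ["spatial_TATGGCAGACTTTCGA-1", "spatial_CTTCGTGCCCGCATCG-1"],
    ["spatial_TATGGCAGACTTTCGA-1", "spatial_AAACGGGTTGGTATCC-1"],
)
print(report)
```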

[22]:
connector.submit_metadata_from_dataframe(
    species=Species.HUMAN.value,
    study_id='f6f4c94460af44fabaa07ac77087351c',
    group_id='personal',
    df=meta_df
)
[22]:
'Successful'

4.2. Submit file from local / server

[23]:
connector.submit_metadata_from_local(
    species=Species.HUMAN.value,
    study_id='f6f4c94460af44fabaa07ac77087351c',
    group_id='personal',
    file_path='./MERGED_VISIUM_metadata.tsv'
)
[23]:
'Successful'

4.3. Submit file from s3

[ ]:
connector.submit_metadata_from_s3(
    species=Species.HUMAN.value,
    study_id='f6f4c94460af44fabaa07ac77087351c',
    group_id='personal',
    file_path='test_bucket/GSE128223_meta.tsv'        #This path DOES NOT include the bucket path configured on platform e.g. s3://bioturing_bucket
)

4.4. Submit file from shared s3 of a group

[ ]:
connector.submit_metadata_from_shared_s3(
    species=Species.HUMAN.value,
    study_id='a1558f8ed6064095be86a091a4118c4a',
    group_id='bioturing_public_studies',              #This function does NOT apply to group_id='personal'
    file_path='test_bucket/GSE128223_meta.tsv',        #This path DOES NOT include the bucket path configured on platform e.g. s3://bioturing_bucket
    shared_s3_id='ce26142487ed4a3697bb8902bf9d9670'
)

5. Access study data

NOTE: Get study_id (uuid) from step “2.2. List all available studies in a group”

5.1. Get barcodes

[29]:
barcodes = np.array(connector.get_barcodes(
  study_id='f6f4c94460af44fabaa07ac77087351c',
  species=Species.HUMAN.value,
))
print(barcodes)
['spatial_TATGGCAGACTTTCGA-1' 'spatial_CTTCGTGCCCGCATCG-1'
 'spatial_AAACGGGTTGGTATCC-1' ... 'visium_test_TGATTCCCGGTTACCT-1'
 'visium_test_AACATTGTGACTCGAG-1' 'visium_test_GCTCTTTCCGCTAGTG-1']
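
The barcodes printed above look like `<batch name>_<original barcode>`. Assuming that pattern holds for the whole study (it is inferred from the output, not documented), the number of barcodes per batch can be tallied as sketched here. `count_by_batch` is a hypothetical helper.

```python
from collections import Counter


def count_by_batch(barcodes):
    """Count barcodes per batch, taking the text before the last underscore."""
    return Counter(barcode.rsplit("_", 1)[0] for barcode in barcodes)


# Barcodes copied from the printed output above.
sample_barcodes = [
    "spatial_TATGGCAGACTTTCGA-1",
    "spatial_CTTCGTGCCCGCATCG-1",
    "spatial_AAACGGGTTGGTATCC-1",
    "visium_test_TGATTCCCGGTTACCT-1",
    "visium_test_AACATTGTGACTCGAG-1",
]
batch_counts = count_by_batch(sample_barcodes)
print(dict(batch_counts))
```

Splitting on the last underscore keeps batch names such as `visium_test`, which themselves contain an underscore, intact.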

5.2. Get features

[30]:
features = np.array(connector.get_features(
  study_id='f6f4c94460af44fabaa07ac77087351c',
  species=Species.HUMAN.value,
))
print(features)
['5S_RRNA' '5_8S_RRNA' '7SK' ... 'AL121908.1' 'AP000527.1' 'AL035681.1']

5.3. Get metadata dataframe

[32]:
metadata = connector.get_metadata(
  study_id='f6f4c94460af44fabaa07ac77087351c',
  species=Species.HUMAN.value
)
metadata.iloc[:5, :5]
[32]:
   Barcodes                    Batches  Batches (1)  Batches (2)  Number of genes
0  spatial_TATGGCAGACTTTCGA-1  spatial  spatial      spatial      6782
1  spatial_CTTCGTGCCCGCATCG-1  spatial  spatial      spatial      6948
2  spatial_AAACGGGTTGGTATCC-1  spatial  spatial      spatial      6972
3  spatial_TGCAAACCCACATCAA-1  spatial  spatial      spatial      8065
4  spatial_GACGGGATGTCTTATG-1  spatial  spatial      spatial      6229

5.4. Get embeddings

5.4.1. List all embeddings

[34]:
embeddings = connector.list_all_custom_embeddings(
  study_id='f6f4c94460af44fabaa07ac77087351c',
  species=Species.HUMAN.value,
)
embeddings
[34]:
[{'embedding_id': '8e31785c43b6458c8fdc7aa06d2e1028',
  'embedding_name': 'PCA (no batch corrected)'},
 {'embedding_id': '1a23d7b23f164d258bbd24d83658f194',
  'embedding_name': 'tSNE (perplexity=30)'}]

5.4.2. Access an embedding

[35]:
chosen_embedding = connector.retrieve_custom_embedding(
  study_id='f6f4c94460af44fabaa07ac77087351c',
  species=Species.HUMAN.value,
  embedding_id='8e31785c43b6458c8fdc7aa06d2e1028',
)
chosen_embedding
[35]:
array([[-24.439453  ,   1.8610998 ,  -4.453333  , ...,  -0.23137376,
          0.70885265,   0.513127  ],
       [-25.617645  ,   2.945625  ,  -5.5363693 , ...,   0.13122909,
         -0.7750866 ,   0.21110779],
       [-25.639389  ,   1.7325156 ,  -3.724022  , ...,   0.02746161,
         -0.19130976,  -0.55235636],
       ...,
       [ 27.012297  , -13.96414   ,   2.4462044 , ...,  -0.89074665,
          2.0481367 ,   1.2320619 ],
       [ 26.676434  ,  -9.607573  ,   2.836241  , ...,  -1.4937118 ,
         -3.8412411 ,   3.404403  ],
       [ 26.823132  ,  -0.6082494 ,   8.160352  , ...,   0.42766818,
         -3.8642507 ,   4.8920965 ]], dtype=float32)

5.5. Query genes

Parameters:
----
group_id: str
    ID of the group to submit the data to.
study_id: str
    ID of the study (uuid)
gene_names: List[str], default=[]
    If the value array is empty, the return value will be the whole matrix
unit: str
    Support:
          StudyUnit.UNIT_RAW.value
          StudyUnit.UNIT_LOGNORM.value
[36]:
gene_exp = connector.query_genes(
  species=Species.HUMAN.value,
  study_id='f6f4c94460af44fabaa07ac77087351c',
  gene_names=['CD3D', 'CD8A'],
  unit=StudyUnit.UNIT_RAW.value,
)
gene_exp
[36]:
<7495x2 sparse matrix of type '<class 'numpy.float32'>'
        with 7006 stored elements in Compressed Sparse Column format>
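
Before querying, it may be useful to check the requested names against the feature list from step 5.2. The sketch below matches names exactly, which could be stricter than what the server accepts. `split_known_genes` is a hypothetical helper and the feature list is a short illustrative stand-in.

```python
def split_known_genes(requested, features):
    """Split requested gene names into (found, missing), preserving order."""
    known = set(features)
    found = [gene for gene in requested if gene in known]
    missing = [gene for gene in requested if gene not in known]
    return found, missing


# Illustrative stand-in for the array returned by connector.get_features(...).
sample_features = ["5S_RRNA", "5_8S_RRNA", "7SK", "CD3D", "CD8A"]
found, missing = split_known_genes(["CD3D", "cd8a"], sample_features)
print(found, missing)
```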

6. Standardize your metadata

NOTE: Get group_id and study_id (uuid) from step “2. List groups, studies and s3”

6.1. Retrieve ontology tree

Returns
----------
Ontologies tree : Dict[Dict]
  In which:
    'name': name of the node, which will be used in further steps
[ ]:
connector.get_ontologies_tree(
    species=Species.HUMAN.value,
    group_id='bioturing_public_studies'
)
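
The return description only says that each node carries a `'name'`. The nesting layout is not documented here, so the sketch below walks any combination of dicts and lists and collects every string stored under `'name'`. The sample tree is entirely hypothetical, including its `'children'` key, and `collect_names` is not an SDK function.

```python
def collect_names(node):
    """Collect every string stored under a 'name' key, depth first."""
    names = []
    if isinstance(node, dict):
        name = node.get("name")
        if isinstance(name, str):
            names.append(name)
        for value in node.values():
            names.extend(collect_names(value))
    elif isinstance(node, list):
        for item in node:
            names.extend(collect_names(item))
    return names


# Hypothetical tree shape, for illustration only.
sample_tree = {
    "tissue": {"name": "tissue", "children": [{"name": "lung"}]},
    "cell type": {"name": "cell type", "children": [{"name": "gamma-delta T cell"}]},
}
print(collect_names(sample_tree))
```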

6.2. Assign standardized terms

Parameters
-----
species: str
      Species of the study.
      Support:  Species.HUMAN.value
                Species.MOUSE.value
                Species.NON_HUMAN_PRIMATE.value
                Species.OTHERS.value
group_id: str
      ID of the group to submit the data to.
study_id: str
      ID of the study (uuid)
metadata_field: str
      column name of meta dataframe in platform (eg: author's tissue)
metadata_value: str
      metadata value within the metadata field (eg: normal lung)
root_name: str
      name of root in btr ontologies tree (eg: tissue)
leaf_name: str
      name of leaf in btr ontologies tree (eg: lung)
[ ]:
# This function is only usable in a group (not 'personal')

connector.assign_standardized_meta(
    species=Species.HUMAN.value,
    group_id='bioturing_public_studies',
    study_id='a1558f8ed6064095be86a091a4118c4a',
    metadata_field='Cell type',
    metadata_value='TCRV delta 1 gamma-delta T cell',
    root_name='cell type',
    leaf_name='gamma-delta T cell',
)
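
When several values in one metadata field need standardized terms, the keyword arguments for each call can be prepared up front. The following sketch only builds the argument dictionaries and makes no requests. `build_assignments` is a hypothetical helper, and the second mapping entry is an invented example, not a term taken from the ontology.

```python
def build_assignments(group_id, study_id, metadata_field, root_name, value_to_leaf):
    """Return one keyword-argument dict per metadata value."""
    return [
        {
            "group_id": group_id,
            "study_id": study_id,
            "metadata_field": metadata_field,
            "metadata_value": value,
            "root_name": root_name,
            "leaf_name": leaf,
        }
        for value, leaf in value_to_leaf.items()
    ]


assignments = build_assignments(
    group_id="bioturing_public_studies",
    study_id="a1558f8ed6064095be86a091a4118c4a",
    metadata_field="Cell type",
    root_name="cell type",
    value_to_leaf={
        "TCRV delta 1 gamma-delta T cell": "gamma-delta T cell",
        "example author label": "example leaf",  # invented placeholder pair
    },
)
print(len(assignments))
```

Each dictionary could then be unpacked into a call such as `connector.assign_standardized_meta(species=Species.HUMAN.value, **kwargs)`.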