CHEESE Inference
The CLI tool supports running CHEESE inference on your custom database with the command cheese inference.
To list the available options, run cheese inference --help
Usage: -c inference [OPTIONS]
Run CHEESE Inference for an input file.
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --input_file TEXT                               The input file in CSV format. Please provide a CSV    │
│                                                   file in the following format : 'SMILES,ID'            │
│                                                   [default: None]                                       │
│                                                   [required]                                            │
│   --dest TEXT                                     Destination folder where to save the results. Will be │
│                                                   inside your source folder                             │
│                                                   [default: output]                                     │
│   --index_type TEXT                               Index type : clustered, in_memory, auto               │
│                                                   [default: auto]                                       │
│   --gpu_devices TEXT                              List of GPU devices on which to run computation : e.g │
│                                                   '0,3,2'                                               │
│                                                   [default: 0]                                          │
│   --validate_smiles --no-validate_smiles          Whether to validate the SMILES of the input file      │
│                                                   [default: no-validate_smiles]                         │
│   --canonicalize_smiles --no-canonicalize_smiles  Whether to canonicalize the SMILES of the input file  │
│                                                   [default: no-canonicalize_smiles]                     │
│   --help                                          Show this message and exit.                           │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────╯
The input file should contain one molecule per line, written as a SMILES string followed by its ID, in the following format: SMILES,ID. Here is an example of an input CSV file:
smiles,id
C[C@H](NC(=O)N1CC2(CCC2)C1c1ccc(F)cc1)C1CC1,Z5348285396
CC(NC(=O)N1CC2(CCC2)C1c1ccc(F)cc1)C1CC1,Z5348285396
C[C@@H](NC(=O)N1CC2(CCC2)C1c1ccc(F)cc1)C1CC1,Z5348285396
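If the molecules live in a Python program rather than in a file, an input file in this SMILES,ID layout could be produced along the following lines. This is a minimal sketch using only the standard library; the function name write_cheese_input is hypothetical, and the sketch does not check that the SMILES strings are chemically valid.

```python
import csv
from pathlib import Path


def write_cheese_input(molecules, path):
    """Write (smiles, id) pairs to a CSV file with a 'smiles,id' header row.

    Hypothetical helper, not part of CHEESE. It only checks that neither
    field is empty; it does not parse or validate the SMILES.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        # Plain "\n" line endings, matching the example file above.
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["smiles", "id"])
        for smiles, mol_id in molecules:
            smiles, mol_id = smiles.strip(), mol_id.strip()
            if not smiles or not mol_id:
                raise ValueError(f"empty SMILES or ID in row: {(smiles, mol_id)!r}")
            writer.writerow([smiles, mol_id])
    return path
```

For example, write_cheese_input([("CCO", "mol-1"), ("c1ccccc1", "mol-2")], "mydb.csv") would write a header line followed by one line per molecule; the IDs here are placeholders.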
Please note that the index type is chosen automatically by default (--index_type auto). If the input file exceeds 1 GB in size, the script runs the clustered inference; otherwise, it runs the in_memory inference.
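The size rule can be mirrored in a few lines to predict which mode the automatic setting would select for a given file. This is an illustrative sketch, not the tool's own code: the function name expected_index_type is hypothetical, and reading 1 GB as 2**30 bytes is an assumption (the tool may use 10**9 bytes or treat the boundary differently).

```python
import os

# Assumption: "1 GB" is read as 2**30 bytes; the tool may use 10**9 instead.
ONE_GB = 1024 ** 3


def expected_index_type(input_file, threshold_bytes=ONE_GB):
    """Return the index type the documented size rule would pick for a file.

    Hypothetical helper that mirrors the rule stated in this section:
    files larger than the threshold map to 'clustered', all others to
    'in_memory'.
    """
    size = os.path.getsize(input_file)
    return "clustered" if size > threshold_bytes else "in_memory"
```

Passing the result to --index_type explicitly, rather than relying on the default, makes the chosen mode visible in scripts and logs.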
Example
cheese inference --input_file '/data/mydb.csv' --dest /path/to/my_output --index_type in_memory
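When inference is launched from a Python script, the same command can be assembled from the options listed in the help output. The sketch below is a convenience wrapper written for illustration, not part of CHEESE: build_inference_command and run_inference are hypothetical names, and the wrapper assumes the cheese executable is available on the PATH.

```python
import subprocess

VALID_INDEX_TYPES = {"clustered", "in_memory", "auto"}


def build_inference_command(input_file, dest="output", index_type="auto",
                            gpu_devices=(0,), validate_smiles=False,
                            canonicalize_smiles=False):
    """Assemble the argument list for 'cheese inference' (hypothetical helper).

    Defaults follow the values shown in the help output above.
    """
    if index_type not in VALID_INDEX_TYPES:
        raise ValueError(f"unknown index type: {index_type!r}")
    command = [
        "cheese", "inference",
        "--input_file", str(input_file),
        "--dest", str(dest),
        "--index_type", index_type,
        # The help text shows GPU devices as a comma-separated list, e.g. '0,3,2'.
        "--gpu_devices", ",".join(str(device) for device in gpu_devices),
    ]
    command.append("--validate_smiles" if validate_smiles else "--no-validate_smiles")
    command.append(
        "--canonicalize_smiles" if canonicalize_smiles else "--no-canonicalize_smiles"
    )
    return command


def run_inference(**options):
    """Run the assembled command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_inference_command(**options), check=True)
```

Building the command as a list rather than a single string avoids shell quoting issues with paths that contain spaces.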
Multiple GPUs inference speed benchmark