CHEESE CLI
Once CHEESE is installed, on-prem users have access to a CLI tool. To check that the installation works, run cheese
to display the available commands.
Usage: -c [OPTIONS] COMMAND [ARGS]...
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ embeddings Run CHEESE embeddings CPU computation on an input file. │
│ embeddings_gpu Run CHEESE embeddings GPU computation for an input file. │
│ generate_license_key Generate a license key for CHEESE │
│ inference Run CHEESE Inference for an input file.' │
│ search Run CHEESE Search on a file of your choice, and save the search outputs to an output file. │
│ start_app Start the CHEESE APP │
│ update Get the latest CHEESE version │
│ visualize Visualize molecules in 2D from an input file. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
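The installation check described above can also be scripted. The following is a minimal sketch, assuming only that the cheese executable is on your PATH after installation; the helper name cheese_help is hypothetical and not part of CHEESE.

```python
import shutil
import subprocess
from typing import Optional


def cheese_help(executable: str = "cheese") -> Optional[str]:
    """Return the output of `<executable> --help`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # check=False: we only want the text, not an exception on a non-zero exit code.
    result = subprocess.run(
        [path, "--help"], capture_output=True, text=True, check=False
    )
    return result.stdout


help_text = cheese_help()
if help_text is None:
    print("cheese was not found on PATH; check the installation.")
else:
    print(help_text)
```

If the tool is installed, this should print the same command list as shown above.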
Updating CHEESE
To get the latest CHEESE version you can run the command cheese update
CHEESE license file
- Run the command cheese generate_license_key to generate a license key. Note that the license key is environment-specific, i.e., you will need another license file if you want to run CHEESE on another host machine.
- Copy the license key and send it to us.
- We will send you a JSON license file. Save it at the path that was defined in the LICENSE_FILE environment variable during installation.
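Before running other commands, it may help to confirm that the license file is where CHEESE expects it. The sketch below assumes only what is stated above: that LICENSE_FILE holds the path and that the file is JSON. The function check_license_file is a hypothetical helper, and it does not validate the license contents, whose structure is not documented here.

```python
import json
import os
from pathlib import Path


def check_license_file(env=None):
    """Return the license path if LICENSE_FILE points at a file that parses as JSON."""
    env = os.environ if env is None else env
    raw = env.get("LICENSE_FILE")
    if not raw:
        raise RuntimeError("LICENSE_FILE is not set")
    path = Path(raw)
    if not path.is_file():
        raise FileNotFoundError(f"LICENSE_FILE points to a missing file: {path}")
    with path.open(encoding="utf-8") as handle:
        json.load(handle)  # raises json.JSONDecodeError if the file is not valid JSON
    return path
```

Calling check_license_file() with no arguments reads the current process environment.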
CHEESE CPU Embeddings computation
CHEESE CLI supports large-scale embedding computation on CPU using CHEESE models through the command cheese embeddings. You can supply an input file of molecules, a destination folder for the embeddings, and the search type. To list the available options, run cheese embeddings --help:
Usage: -c embeddings [OPTIONS]
Run CHEESE embeddings CPU computation on an input file.
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --input_file TEXT The input file in one of the following formats : .csv, .sdf, .smi or .txt [default: None] │
│ [required] │
│ --dest TEXT Destination path to save embeddings [default: /data/computed_embeddings] │
│ --search_type TEXT Search type. Can be : 'morgan', 'espsim_shape','espsim_electrostatic', │
│ 'active_pairs' │
│ [default: morgan] │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Example
cheese embeddings --input_file '/data/mydb.smi' --dest /data/my_embeddings --search_type active_pairs
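When the command is launched from a script, checking the arguments against the help text first can catch typos before a long computation starts. This is a sketch under that assumption; build_embeddings_cmd is a hypothetical helper, and the allowed values are copied from the help output above.

```python
import shlex
from pathlib import Path

# Values taken from the `cheese embeddings --help` output.
SEARCH_TYPES = {"morgan", "espsim_shape", "espsim_electrostatic", "active_pairs"}
INPUT_SUFFIXES = {".csv", ".sdf", ".smi", ".txt"}


def build_embeddings_cmd(input_file, dest="/data/computed_embeddings", search_type="morgan"):
    """Build the argument list for `cheese embeddings` after basic validation."""
    if Path(input_file).suffix.lower() not in INPUT_SUFFIXES:
        raise ValueError(f"unsupported input format: {input_file}")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search_type}")
    return [
        "cheese", "embeddings",
        "--input_file", str(input_file),
        "--dest", str(dest),
        "--search_type", search_type,
    ]


cmd = build_embeddings_cmd("/data/mydb.smi", "/data/my_embeddings", "active_pairs")
print(shlex.join(cmd))  # the shell command equivalent to the example above
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where CHEESE is installed.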
CHEESE GPU Embeddings computation
CHEESE CLI supports large-scale embedding computation on GPU using CHEESE models through the command cheese embeddings_gpu. You can supply an input CSV file of molecules and a destination folder for the embeddings. By default (--search_type all), this command computes embeddings from all available CHEESE models. To list the available options, run cheese embeddings_gpu --help:
Run CHEESE embeddings GPU computation for an input file.
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --input_file TEXT The input file in CSV format. Please provide a CSV file in the following format : 'SMILES,ID │
│ [default: None] │
│ [required] │
│ --search_type TEXT Type of embeddings : morgan, espsim_shape, espsim_electrostatic, active_pairs, all │
│ [default: all] │
│ --gpu_devices TEXT List of GPU devices on which to run computation : e.g '0,3,2' [default: 0] │
│ --save_format TEXT Save format of the embeddings. Can be 'parquet' or 'numpy' [default: numpy] │
│ --dest TEXT Destination folder. Will be inside your source folder. [default: computed_embeddings] │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Example
cheese embeddings_gpu --input_file '/data/mydb.csv' --dest /data/my_embeddings --save_format parquet
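Unlike the CPU command, the GPU command accepts only CSV input. If your molecules are in a .smi file, a conversion along the following lines may be useful. This is a sketch with two assumptions that the help text does not confirm: that the CSV needs a SMILES,ID header row, and that the .smi file is whitespace-separated with an optional name in the second column. The function smi_to_csv and the generated mol_N identifiers are hypothetical.

```python
import csv
from pathlib import Path


def smi_to_csv(smi_path, csv_path):
    """Convert a whitespace-separated .smi file to a SMILES,ID CSV; return rows written."""
    rows = 0
    with Path(smi_path).open(encoding="utf-8") as src, \
            Path(csv_path).open("w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["SMILES", "ID"])  # assumed header, see the note above
        for line_number, line in enumerate(src, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            smiles = parts[0]
            # Fall back to a generated ID when the line carries no name.
            mol_id = parts[1] if len(parts) > 1 else f"mol_{line_number}"
            writer.writerow([smiles, mol_id])
            rows += 1
    return rows
```

For example, smi_to_csv("/data/mydb.smi", "/data/mydb.csv") would produce a file suitable for the --input_file option above, provided the assumptions hold for your data.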
CHEESE Search
CHEESE CLI supports searching in your available databases through the command cheese search. You can supply an input file of molecules and an output CSV file for the search results, together with other search parameters. To list the available options, run cheese search --help:
Usage: -c search [OPTIONS]
Run CHEESE Search on a file of your choice, and save the search outputs to an output file.
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --input_file TEXT The input file in one of the following formats : .csv, .sdf, .smi or .txt │
│ [default: None] │
│ [required] │
│ * --output_file TEXT The output file in CSV format [default: None] [required] │
│ --db_names TEXT Databases to search in separated by ','. e.g 'ENAMINE-REAL,ZINC15' │
│ [default: ENAMINE-REAL] │
│ --search_type TEXT Search type. Can be : 'morgan', 'espsim_shape','espsim_electrostatic', 'active_pairs' │
│ [default: morgan] │
│ --search_quality TEXT Search quality. Can be : 'fast', 'accurate','very accurate' [default: fast] │
│ --n_neighbors INTEGER Number of results to retrieve. [default: 30] │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Example
cheese search --input_file '/data/myqueries.smi' --output_file '/data/results.csv' --db_names 'ZINC15,CUSTOM_DB' --search_type morgan --search_quality accurate --n_neighbors 100
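One detail worth noting is that the 'very accurate' quality value contains a space, so it must be quoted on the command line. Passing arguments as a list from a script avoids the quoting problem. The sketch below illustrates this; build_search_cmd is a hypothetical helper, and the defaults and allowed values are copied from the help output above.

```python
import shlex

# Values taken from the `cheese search --help` output.
SEARCH_QUALITIES = {"fast", "accurate", "very accurate"}


def build_search_cmd(input_file, output_file, db_names=("ENAMINE-REAL",),
                     search_type="morgan", search_quality="fast", n_neighbors=30):
    """Build the argument list for `cheese search`."""
    if search_quality not in SEARCH_QUALITIES:
        raise ValueError(f"unknown search quality: {search_quality}")
    if n_neighbors < 1:
        raise ValueError("n_neighbors must be a positive integer")
    return [
        "cheese", "search",
        "--input_file", input_file,
        "--output_file", output_file,
        "--db_names", ",".join(db_names),  # databases are comma-separated
        "--search_type", search_type,
        "--search_quality", search_quality,
        "--n_neighbors", str(n_neighbors),
    ]


cmd = build_search_cmd(
    "/data/myqueries.smi", "/data/results.csv",
    db_names=("ZINC15", "CUSTOM_DB"),
    search_quality="very accurate", n_neighbors=100,
)
# shlex.join adds the quotes that a shell would need around 'very accurate'.
print(shlex.join(cmd))
```

Running subprocess.run(cmd, check=True) with this list passes 'very accurate' as a single argument without any manual quoting.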
CHEESE Visualization
CHEESE CLI supports visualizing molecules in 2D through the command cheese visualize. You can supply an input file of molecules, a destination folder for the coordinates, and the visualization method (PCA or UMAP). To list the available options, run cheese visualize --help:
Usage: -c visualize [OPTIONS]
Visualize molecules in 2D from an input file.
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * --input_file TEXT The input file in one of the following formats : .csv, .sdf, .smi or .txt [default: None] │
│ [required] │
│ --dest TEXT Destination path to save embeddings [default: computed_coordinates] │
│ --sim_name TEXT Similarity type. Can be : 'morgan', 'espsim_shape','espsim_electrostatic', 'active_pairs' │
│ [default: morgan] │
│ --method TEXT Visualization method. Can be : 'umap' or 'pca' [default: umap] │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Example
cheese visualize --input_file '/data/myqueries.smi' --dest '/data/mymols_viz' --sim_name 'espsim_shape' --method pca
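To compare projections, you may want one run per similarity type and method. The sketch below generates those command lines; the values come from the help output above, while visualize_cmds and the one-subfolder-per-combination layout under the destination are assumptions made for illustration.

```python
import itertools
import shlex

# Values taken from the `cheese visualize --help` output.
SIM_NAMES = ("morgan", "espsim_shape", "espsim_electrostatic", "active_pairs")
METHODS = ("umap", "pca")


def visualize_cmds(input_file, dest_root):
    """Return one `cheese visualize` argument list per (sim_name, method) pair."""
    cmds = []
    for sim_name, method in itertools.product(SIM_NAMES, METHODS):
        # Hypothetical layout: a separate destination folder for each combination.
        dest = f"{dest_root}/{sim_name}_{method}"
        cmds.append([
            "cheese", "visualize",
            "--input_file", input_file,
            "--dest", dest,
            "--sim_name", sim_name,
            "--method", method,
        ])
    return cmds


for cmd in visualize_cmds("/data/myqueries.smi", "/data/mymols_viz"):
    print(shlex.join(cmd))
```

Each printed line can be run in a shell as-is, or each list can be passed to subprocess.run on a machine where CHEESE is installed.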