biothings.cli¶

biothings.cli.main()[source]¶: The main entry point for running the BioThings CLI to test your local data plugins.

biothings.cli.setup_config()[source]¶: Set up the config module needed to launch the CLI.

biothings.cli.dataplugin¶

biothings.cli.dataplugin.clean_data(dump: bool | None = False, upload: bool | None = False, clean_all: bool | None = False)[source]¶: clean command for deleting all dumped files and/or dropping uploaded source tables

biothings.cli.dataplugin.create_data_plugin(name: str = '', multi_uploaders: bool | None = False, parallelizer: bool | None = False)[source]¶: create command for creating a new data plugin from the template

biothings.cli.dataplugin.dump_and_upload()[source]¶: dump_and_upload command for downloading source data files to local storage, then converting them into JSON documents and uploading them to the source database. Two steps in one command.

biothings.cli.dataplugin.dump_data()[source]¶: dump command for downloading source data files to local storage

biothings.cli.dataplugin.inspect_source(sub_source_name: str | None = '', mode: str | None = 'type,stats', limit: int | None = None, output: str | None = None)[source]¶: inspect command for giving detailed information about the structure of documents coming from the parser after the upload step

biothings.cli.dataplugin.listing(dump: bool | None = False, upload: bool | None = False, hubdb: bool | None = False)[source]¶: list command for listing dumped files and/or uploaded sources

biothings.cli.dataplugin.serve(host: str | None = 'localhost', port: int | None = 9999)[source]¶

serve command runs a simple API server for serving documents from the source database.

For example, after running 'dump_and_upload', we have a source_name = "test" with a document structure like this:

doc = {"_id": "123", "key": {"a": {"b": "1"}, "x": [{"y": "3", "z": "4"}, "5"]}}

An API server will run at http://host:port/<your source name>/, like http://localhost:9999/test/:

You can see all available sources on the index page: http://localhost:9999/

You can list all docs: http://localhost:9999/test/ (default is to return the first 10 docs)

You can paginate doc list: http://localhost:9999/test/?start=10&limit=10

You can retrieve a doc by id: http://localhost:9999/test/123

You can filter docs by one or more fielded terms:

http://localhost:9999/test/?q=key.a.b:1 (query by any field with dot notation like key.a.b=1)

http://localhost:9999/test/?q=key.a.b:1%20AND%20key.x.y:3 (find all docs that match both fields)

http://localhost:9999/test/?q=key.x.z:4* (field value can contain wildcard * or ?)

http://localhost:9999/test/?q=key.x:5&start=10&limit=10 (pagination also works)
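The URL patterns above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library. `build_query_url` is a hypothetical helper (it is not part of biothings), and it assumes the server runs at the default host and port with a source named "test".

```python
from urllib.parse import quote, urlencode


def build_query_url(source, q=None, start=None, limit=None,
                    host="localhost", port=9999):
    """Build a URL for the local serve API (hypothetical helper, not part of biothings)."""
    base = f"http://{host}:{port}/{quote(source)}/"
    params = {}
    if q is not None:
        params["q"] = q
    if start is not None:
        params["start"] = start
    if limit is not None:
        params["limit"] = limit
    if not params:
        return base
    # quote_via=quote encodes spaces as %20, matching the URLs shown above;
    # ":" and "*" are kept readable through the safe argument.
    return base + "?" + urlencode(params, quote_via=quote, safe=":*")


print(build_query_url("test"))
print(build_query_url("test", start=10, limit=10))
print(build_query_url("test", q="key.a.b:1 AND key.x.z:4*"))
```

The resulting strings can be pasted into a browser or passed to any HTTP client once the serve command is running.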

biothings.cli.dataplugin.upload_source(batch_limit: int | None = None)[source]¶: upload command for converting downloaded data from the dump step into JSON documents and uploading them to the source database. A local sqlite database is used to store the uploaded data.

biothings.cli.dataplugin_hub¶

biothings.cli.dataplugin_hub.clean_data(plugin_name: str = '', dump: bool | None = False, upload: bool | None = False, clean_all: bool | None = False)[source]¶: clean command for deleting all dumped files and/or dropping uploaded source tables

biothings.cli.dataplugin_hub.create_data_plugin(name: str = '', multi_uploaders: bool | None = False, parallelizer: bool | None = False)[source]¶: create command for creating a new data plugin from the template

biothings.cli.dataplugin_hub.dump_and_upload(plugin_name: str = '')[source]¶: dump_and_upload command for downloading source data files to local storage, then converting them into JSON documents and uploading them to the source database. Two steps in one command.

biothings.cli.dataplugin_hub.dump_data(plugin_name: str = '')[source]¶: dump command for downloading source data files to local storage

biothings.cli.dataplugin_hub.inspect_source(plugin_name: str = '', sub_source_name: str | None = '', mode: str | None = 'type,stats', limit: int | None = None, output: str | None = None)[source]¶: inspect command for giving detailed information about the structure of documents coming from the parser after the upload step

biothings.cli.dataplugin_hub.listing(plugin_name: str = '', dump: bool | None = False, upload: bool | None = False, hubdb: bool | None = False)[source]¶: list command for listing dumped files and/or uploaded sources

biothings.cli.dataplugin_hub.serve(plugin_name: str = '', host: str | None = 'localhost', port: int | None = 9999)[source]¶

serve command runs a simple API server for serving documents from the source database.

For example, after running 'dump_and_upload', we have a source_name = "test" with a document structure like this:

doc = {"_id": "123", "key": {"a": {"b": "1"}, "x": [{"y": "3", "z": "4"}, "5"]}}

An API server will run at http://host:port/<your source name>/, like http://localhost:9999/test/:

You can see all available sources on the index page: http://localhost:9999/

You can list all docs: http://localhost:9999/test/ (default is to return the first 10 docs)

You can paginate doc list: http://localhost:9999/test/?start=10&limit=10

You can retrieve a doc by id: http://localhost:9999/test/123

You can filter docs by one or more fielded terms:

http://localhost:9999/test/?q=key.a.b:1 (query by any field with dot notation like key.a.b=1)

http://localhost:9999/test/?q=key.a.b:1%20AND%20key.x.y:3 (find all docs that match both fields)

http://localhost:9999/test/?q=key.x.z:4* (field value can contain wildcard * or ?)

http://localhost:9999/test/?q=key.x:5&start=10&limit=10 (pagination also works)
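To make the dot notation concrete, here is a small sketch of how a dotted field path such as key.x.z can be resolved against the example document, descending into list elements along the way. `resolve_path` and `matches` are hypothetical names. The sketch illustrates the query semantics described above and is not the server's actual implementation, which queries the local source database. Wildcard matching is approximated with the standard library's `fnmatch`.

```python
from fnmatch import fnmatchcase


def resolve_path(doc, path):
    """Return every value reachable from doc via a dotted path, descending into lists."""
    current = [doc]
    for part in path.split("."):
        found = []
        for node in current:
            # A list is searched element by element.
            items = node if isinstance(node, list) else [node]
            for item in items:
                if isinstance(item, dict) and part in item:
                    found.append(item[part])
        current = found
    # Flatten a trailing list so that key.x can match its scalar members.
    values = []
    for node in current:
        values.extend(node if isinstance(node, list) else [node])
    return values


def matches(doc, term):
    """Check one "field:value" term; the value may contain * or ? wildcards."""
    field, pattern = term.split(":", 1)
    return any(
        isinstance(value, str) and fnmatchcase(value, pattern)
        for value in resolve_path(doc, field)
    )


doc = {"_id": "123", "key": {"a": {"b": "1"}, "x": [{"y": "3", "z": "4"}, "5"]}}
print(matches(doc, "key.a.b:1"))
print(matches(doc, "key.x.z:4*"))
print(matches(doc, "key.x:5"))
print(matches(doc, "key.a.b:2"))
```

The first three terms match the example document and the last one does not, which mirrors the behavior the example URLs describe.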

biothings.cli.dataplugin_hub.upload_source(plugin_name: str = '', batch_limit: int | None = None)[source]¶: upload command for converting downloaded data from the dump step into JSON documents and uploading them to the source database. A local sqlite database is used to store the uploaded data.

biothings.cli.utils¶

biothings.cli.utils.do_clean(plugin_name=None, dump=False, upload=False, clean_all=False, logger=None)[source]¶: Clean the dumped files, uploaded sources, or both.

biothings.cli.utils.do_clean_dumped_files(data_folder, plugin_name)[source]¶: Remove all dumped files by a data plugin in the data folder.

biothings.cli.utils.do_clean_uploaded_sources(working_dir, plugin_name)[source]¶: Remove all uploaded sources by a data plugin in the working directory.

biothings.cli.utils.do_create(name, multi_uploaders=False, parallelizer=False, logger=None)[source]¶: Create a new data plugin from the template

biothings.cli.utils.do_dump(plugin_name=None, show_dumped=True, logger=None)[source]¶: Perform dump for the given plugin

biothings.cli.utils.do_dump_and_upload(plugin_name, logger=None)[source]¶: Perform both dump and upload for the given plugin

biothings.cli.utils.do_inspect(plugin_name=None, sub_source_name=None, mode='type,stats', limit=None, merge=False, output=None, logger=None)[source]¶: Perform inspection on a data plugin.

biothings.cli.utils.do_list(plugin_name=None, dump=False, upload=False, hubdb=False, logger=None)[source]¶: List the dumped files, uploaded sources, or hubdb content.

biothings.cli.utils.do_serve(plugin_name=None, host='localhost', port=9999, logger=None)[source]¶

biothings.cli.utils.do_upload(plugin_name=None, show_uploaded=True, logger=None)[source]¶: Perform upload for the given list of uploader_classes

biothings.cli.utils.get_logger(name=None)[source]¶: Get a logger with the given name. If name is None, return the root logger.

biothings.cli.utils.get_manifest_content(working_dir)[source]¶: return the manifest content of the data plugin in the working directory

biothings.cli.utils.get_plugin_name(plugin_name=None, with_working_dir=True)[source]¶: Return a valid plugin name (the name of the folder that contains a data plugin). When plugin_name is None, the current working folder is used. When with_working_dir is True, a (plugin_name, working_dir) tuple is returned.

biothings.cli.utils.get_uploaded_collections(src_db, uploaders)[source]¶: A helper function to get the uploaded collections in the source database

biothings.cli.utils.get_uploaders(working_dir: Path)[source]¶: A helper function to get the uploaders from the manifest file in the working directory, used in show_uploaded_sources function below

biothings.cli.utils.is_valid_data_plugin_dir(data_plugin_dir)[source]¶: Return True if the given folder is a valid data plugin folder (contains either manifest.yaml or manifest.json), otherwise False
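The validity check described above can be pictured with a short standard-library sketch. `looks_like_data_plugin_dir` is a hypothetical stand-in written for illustration; it only assumes the rule stated in the description (a manifest.yaml or manifest.json file is present) and may differ from the library's own implementation.

```python
import tempfile
from pathlib import Path


def looks_like_data_plugin_dir(folder):
    """Return True if folder contains manifest.yaml or manifest.json (illustrative only)."""
    folder = Path(folder)
    return any((folder / name).is_file() for name in ("manifest.yaml", "manifest.json"))


with tempfile.TemporaryDirectory() as tmp:
    print(looks_like_data_plugin_dir(tmp))  # checked while the folder is still empty
    (Path(tmp) / "manifest.json").write_text("{}")
    print(looks_like_data_plugin_dir(tmp))  # checked after a manifest file is added
```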

biothings.cli.utils.load_plugin(plugin_name=None, dumper=True, uploader=True, logger=None)[source]¶

Return a plugin object for the given plugin_name. If dumper is True, include a dumper instance in the plugin object. If uploader is True, include uploader_classes in the plugin object.

If <plugin_name> is not valid, raise the proper error and exit.

biothings.cli.utils.load_plugin_managers(plugin_path, plugin_name=None, data_folder=None)[source]¶: Load a data plugin from <plugin_path>, and return a tuple of (dumper_manager, upload_manager)

biothings.cli.utils.process_inspect(source_name, mode, limit, merge, logger, do_validate, output=None)[source]¶: Perform inspect for the given source. It is used by the do_inspect function.

biothings.cli.utils.remove_files_in_folder(folder_path)[source]¶: Remove all files in a folder.

biothings.cli.utils.run_sync_or_async_job(func, *args, **kwargs)[source]¶: Call func correctly whether it is defined as a normal or an async function/method, and return the result. An async function/method is run with CLIJobManager.
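The dispatch idea behind this helper can be sketched with the standard library alone. In the sketch below, `run_sync_or_async` is a hypothetical name, and `asyncio.run` is swapped in as a stand-in for the CLIJobManager that the description mentions, so this shows the general technique and not the library's code.

```python
import asyncio
import inspect


def run_sync_or_async(func, *args, **kwargs):
    """Call func and return its result, whether it is a normal or an async callable."""
    if inspect.iscoroutinefunction(func):
        # Assumption: asyncio.run stands in for the CLIJobManager used by biothings.
        return asyncio.run(func(*args, **kwargs))
    return func(*args, **kwargs)


def add(a, b):
    return a + b


async def add_async(a, b):
    await asyncio.sleep(0)  # yield to the event loop once, as real async work would
    return a + b


print(run_sync_or_async(add, 1, 2))
print(run_sync_or_async(add_async, 1, 2))
```

Callers get the same return value either way and do not need to know which kind of callable they were handed.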

biothings.cli.utils.serve(host, port, plugin_name, table_space)[source]¶: Serve a simple API server to query the data plugin source.

biothings.cli.utils.show_dumped_files(data_folder, plugin_name)[source]¶: A helper function to show the dumped files in the data folder

biothings.cli.utils.show_hubdb_content()[source]¶: Output hubdb content in a pretty format.

biothings.cli.utils.show_uploaded_sources(working_dir, plugin_name)[source]¶: A helper function to show the uploaded sources from given plugin.

biothings.cli.web_app¶

class biothings.cli.web_app.Application(db, table_space, **settings)[source]¶

Bases: Application

The main application class, which defines the routes and handlers.

class biothings.cli.web_app.BaseHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]¶

Bases: RequestHandler

set_default_headers()[source]¶

Override this to set HTTP headers at the beginning of the request.

For example, this is the place to set a custom Server header. Note that setting such headers in the normal flow of request processing may not do what you want, since headers may be reset during error handling.

class biothings.cli.web_app.DocHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]¶

Bases: BaseHandler

The handler for the detail view of a document, e.g. /<source>/<doc_id>/

async get(slug, item_id)[source]¶

class biothings.cli.web_app.HomeHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]¶

Bases: BaseHandler

The handler for the landing page, which lists all available routes

async get()[source]¶

exception biothings.cli.web_app.NoResultError[source]¶: Bases: Exception

class biothings.cli.web_app.QueryHandler(application: Application, request: HTTPServerRequest, **kwargs: Any)[source]¶

Bases: BaseHandler

The handler for returning a list of docs matching the query terms passed to the “q” parameter, e.g. /<source>/?q=<query>

async get(slug)[source]¶

async biothings.cli.web_app.get_available_routes(db, table_space)[source]¶: Return a list of available URLs/routes based on the table_space and the actual collections in the database

biothings.cli.web_app.get_example_queries(db, table_space)[source]¶: Populate example queries for a given table_space

async biothings.cli.web_app.main(host, port, db, table_space)[source]¶: The main function, which starts the server.