vantage_sdk.workbench.inference_endpoint
SDK helpers for cluster inference endpoint operations.
Classes
InferenceEndpointSDK()
SDK for cluster-scoped inference endpoint operations.
Methods
create(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str', kind: 'str' = 'predictive', model_source_type: 'str' = 'model_registry', model_id: 'str | None' = None, storage_uri: 'str | None' = None, image: 'str | None' = None, runtime: 'str | None' = None, compute_pool: 'str | None' = None, cpu: 'str' = '2', memory: 'str' = '8Gi', gpu_count: 'int' = 0, min_replicas: 'int' = 1, max_replicas: 'int' = 1, tensor_parallel: 'int' = 1, framework: 'str | None' = None, protocol_version: 'str | None' = None, credentials_secret: 'str | None' = None) -> 'ClusterServiceResponse': No documentation provided.
delete(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str') -> 'ClusterServiceResponse': No documentation provided.
get(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str') -> 'ClusterServiceResponse': No documentation provided.
list(self, ctx: 'typer.Context', *, cluster_name: 'str') -> 'ClusterServiceResponse': No documentation provided.
list_runtimes(self, ctx: 'typer.Context', *, cluster_name: 'str') -> 'ClusterServiceResponse': No documentation provided.
logs(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str', lines: 'int' = 200, container: 'str' = 'kserve-container') -> 'ClusterServiceResponse': No documentation provided.
start(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str') -> 'ClusterServiceResponse': No documentation provided.
stop(self, ctx: 'typer.Context', *, cluster_name: 'str', name: 'str') -> 'ClusterServiceResponse': No documentation provided.
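Since the methods carry no docstrings, the following is a minimal sketch of how the keyword-only arguments for create() might be assembled before a call. The helper build_create_kwargs, the CREATE_DEFAULTS and OPTIONAL_KEYS tables, and the example values ("demo-cluster", "sentiment", "my-model") are hypothetical and not part of vantage_sdk; only the parameter names and default values are taken from the signature listed above. The real call is shown in a comment because it needs a live typer.Context and the installed SDK.

```python
from typing import Any

# Defaults copied from the documented create() signature.
CREATE_DEFAULTS: dict[str, Any] = {
    "kind": "predictive",
    "model_source_type": "model_registry",
    "cpu": "2",
    "memory": "8Gi",
    "gpu_count": 0,
    "min_replicas": 1,
    "max_replicas": 1,
    "tensor_parallel": 1,
}

# Optional parameters that default to None in the documented signature.
OPTIONAL_KEYS = {
    "model_id", "storage_uri", "image", "runtime", "compute_pool",
    "framework", "protocol_version", "credentials_secret",
}


def build_create_kwargs(cluster_name: str, name: str, **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for create().

    Rejects names that do not appear in the documented signature so that
    typos surface before the SDK call is made.
    """
    unknown = set(overrides) - (set(CREATE_DEFAULTS) | OPTIONAL_KEYS)
    if unknown:
        raise TypeError(f"unexpected keyword(s): {sorted(unknown)}")
    # Later entries win, so overrides replace the copied defaults.
    return {"cluster_name": cluster_name, "name": name, **CREATE_DEFAULTS, **overrides}


kwargs = build_create_kwargs("demo-cluster", "sentiment", model_id="my-model", gpu_count=1)

# Assumed usage inside a Typer command, where ctx is the command's typer.Context:
#   from vantage_sdk.workbench.inference_endpoint import InferenceEndpointSDK
#   response = InferenceEndpointSDK().create(ctx, **kwargs)
```

Every parameter after ctx is keyword-only (note the bare * in each signature), so passing a dictionary with ** matches the required call style. The remaining methods take the same ctx and cluster_name pair, plus name for the endpoint-specific operations.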