vantage_sdk.workbench.inference_endpoint

SDK helpers for cluster inference endpoint operations.

Classes

InferenceEndpointSDK()

SDK for cluster-scoped inference endpoint operations.

Methods

  • create(self, ctx: typer.Context, *, cluster_name: str, name: str, kind: str = 'predictive', model_source_type: str = 'model_registry', model_id: str | None = None, storage_uri: str | None = None, image: str | None = None, runtime: str | None = None, compute_pool: str | None = None, cpu: str = '2', memory: str = '8Gi', gpu_count: int = 0, min_replicas: int = 1, max_replicas: int = 1, tensor_parallel: int = 1, framework: str | None = None, protocol_version: str | None = None, credentials_secret: str | None = None) -> ClusterServiceResponse: No documentation provided.
  • delete(self, ctx: typer.Context, *, cluster_name: str, name: str) -> ClusterServiceResponse: No documentation provided.
  • get(self, ctx: typer.Context, *, cluster_name: str, name: str) -> ClusterServiceResponse: No documentation provided.
  • list(self, ctx: typer.Context, *, cluster_name: str) -> ClusterServiceResponse: No documentation provided.
  • list_runtimes(self, ctx: typer.Context, *, cluster_name: str) -> ClusterServiceResponse: No documentation provided.
  • logs(self, ctx: typer.Context, *, cluster_name: str, name: str, lines: int = 200, container: str = 'kserve-container') -> ClusterServiceResponse: No documentation provided.
  • start(self, ctx: typer.Context, *, cluster_name: str, name: str) -> ClusterServiceResponse: No documentation provided.
  • stop(self, ctx: typer.Context, *, cluster_name: str, name: str) -> ClusterServiceResponse: No documentation provided.
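
Since the methods above carry no documentation, the following is a minimal sketch of how the keyword-only signatures might be used, not a description of the SDK's actual behavior. It does not import vantage_sdk. RecordingSDK is a hypothetical stand-in that only records calls, build_create_kwargs is a hypothetical helper, and the cluster, endpoint, and model names are invented for illustration. The defaults are copied from the create() signature listed above. The check that min_replicas does not exceed max_replicas is an assumption, not something the source states.

```python
from typing import Any

# Defaults copied from the documented create() signature.
CREATE_DEFAULTS: dict[str, Any] = {
    "kind": "predictive",
    "model_source_type": "model_registry",
    "model_id": None,
    "storage_uri": None,
    "image": None,
    "runtime": None,
    "compute_pool": None,
    "cpu": "2",
    "memory": "8Gi",
    "gpu_count": 0,
    "min_replicas": 1,
    "max_replicas": 1,
    "tensor_parallel": 1,
    "framework": None,
    "protocol_version": None,
    "credentials_secret": None,
}


def build_create_kwargs(cluster_name: str, name: str, **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge overrides into the documented create() defaults."""
    unknown = sorted(set(overrides) - set(CREATE_DEFAULTS))
    if unknown:
        raise TypeError(f"unknown create() arguments: {unknown}")
    kwargs = {"cluster_name": cluster_name, "name": name, **CREATE_DEFAULTS, **overrides}
    # Assumption: replicas must satisfy min <= max; the source does not say so.
    if kwargs["min_replicas"] > kwargs["max_replicas"]:
        raise ValueError("min_replicas must not exceed max_replicas")
    return kwargs


class RecordingSDK:
    """Hypothetical stand-in for InferenceEndpointSDK; records calls, contacts nothing."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def _record(self, method: str, **kwargs: Any) -> dict[str, Any]:
        self.calls.append((method, kwargs))
        return {"method": method, **kwargs}

    def create(self, ctx: Any, **kwargs: Any) -> dict[str, Any]:
        return self._record("create", **kwargs)

    def logs(self, ctx: Any, *, cluster_name: str, name: str,
             lines: int = 200, container: str = "kserve-container") -> dict[str, Any]:
        return self._record("logs", cluster_name=cluster_name, name=name,
                            lines=lines, container=container)

    def stop(self, ctx: Any, *, cluster_name: str, name: str) -> dict[str, Any]:
        return self._record("stop", cluster_name=cluster_name, name=name)

    def delete(self, ctx: Any, *, cluster_name: str, name: str) -> dict[str, Any]:
        return self._record("delete", cluster_name=cluster_name, name=name)


sdk = RecordingSDK()
ctx = None  # placeholder; the documented methods take a typer.Context here

# Invented names: "demo-cluster", "sentiment", "model-123".
create_kwargs = build_create_kwargs(
    "demo-cluster", "sentiment", model_id="model-123", gpu_count=1, max_replicas=3
)
sdk.create(ctx, **create_kwargs)
sdk.logs(ctx, cluster_name="demo-cluster", name="sentiment", lines=50)
sdk.stop(ctx, cluster_name="demo-cluster", name="sentiment")
sdk.delete(ctx, cluster_name="demo-cluster", name="sentiment")

print([method for method, _ in sdk.calls])
```

Every argument after ctx is keyword-only in the listed signatures, so collecting them in a dictionary and unpacking with ** keeps call sites short. With the real class, the same dictionary could presumably be passed as InferenceEndpointSDK().create(ctx, **create_kwargs) from inside a Typer command that receives ctx, though that usage is not confirmed by the source.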