Buckets API Reference¶
Buckets¶
- class neuro_sdk.Buckets¶
Blob storage buckets subsystem, available as
Client.buckets
.
The subsystem gives access to the basic functionality common to the Blob Storage solutions of different cloud providers. For AWS that is S3, for GCP it is Cloud Storage, and so on.
- async-for list(cluster_name: Optional[str] = None) AsyncContextManager[AsyncIterator[Bucket]] [source]¶
List user’s buckets, async iterator. Yields
Bucket
instances.
- Parameters
cluster_name (str) – cluster to list buckets. Default is current cluster.
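As a minimal sketch of consuming this listing, the helper below follows the signature above: the call is entered as an async context manager, which provides the iterator. The helper name `collect_buckets` is hypothetical, and `client` is assumed to be an already initialized neuro_sdk client.

```python
from typing import Any, List, Optional


async def collect_buckets(client: Any, cluster_name: Optional[str] = None) -> List[Any]:
    """Return all buckets visible to the user as a list."""
    found: List[Any] = []
    # list() is entered as an async context manager that provides the iterator.
    async with client.buckets.list(cluster_name=cluster_name) as buckets:
        async for bucket in buckets:
            found.append(bucket)
    return found
```

Collecting into a list is convenient for small accounts; for many buckets it is usually better to process each item inside the loop.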
- coroutine create(name: Optional[str], cluster_name: Optional[str] = None, org_name: Optional[str] = None) Bucket [source]¶
Create a new bucket.
- coroutine import_external(provider: Bucket.Provider, provider_bucket_name: str, credentials: Mapping[str, str], name: Optional[str] = None, cluster_name: Optional[str] = None, org_name: Optional[str] = None) Bucket [source]¶
Import a new bucket.
- Parameters
provider (Bucket.Provider) – Provider type of imported bucket.
provider_bucket_name (str) – Name of external bucket inside the provider.
credentials (Mapping[str, str]) – Raw credentials to access bucket provider.
name (Optional[str]) – Name of the bucket. Should be unique among all user’s buckets.
cluster_name (str) – cluster to import a bucket. Default is current cluster.
org_name (str) – org to import a bucket. Default is current org.
- Returns
Newly imported bucket info (
Bucket
)
- coroutine get(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) Bucket [source]¶
Get a bucket with id or name bucket_id_or_name.
- coroutine rm(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Delete a bucket with id or name bucket_id_or_name.
- coroutine request_tmp_credentials(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) BucketCredentials [source]¶
Get temporary provider credentials for the bucket with id or name bucket_id_or_name.
- Parameters
- Returns
Bucket credentials info (
BucketCredentials
)
- coroutine set_public_access(bucket_id_or_name: str, public_access: bool, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) Bucket [source]¶
Enable or disable public (anonymous) read access to the bucket.
- Parameters
- Returns
Bucket info (
Bucket
)
- coroutine head_blob(bucket_id_or_name: str, key: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) BucketEntry [source]¶
Look up the blob and return its metadata.
- Parameters
- Returns
BucketEntry
object.
- Raises
ResourceNotFound
if key does not exist.
- coroutine put_blob(bucket_id_or_name: str, key: str, body: Union[AsyncIterator[bytes], bytes], cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Create or replace the blob identified by
key
in the bucket, e.g.:
from pathlib import Path

large_file = Path("large_file.dat")

async def body_stream():
    # Open in binary mode so that each chunk is a bytes object.
    with large_file.open("rb") as f:
        while chunk := f.read(1024 * 1024):
            yield chunk

await client.buckets.put_blob(
    bucket_id_or_name="my_bucket",
    key="large_file.dat",
    body=body_stream(),  # pass the async iterator, not the function
)
- Parameters
bucket_id_or_name (str) – bucket’s id or name.
key (str) – Key of the blob.
body (Union[AsyncIterator[bytes], bytes]) – Body of the blob. Can be passed either as
bytes
or as an
AsyncIterator[bytes]
.
cluster_name (str) – cluster to look for a bucket. Default is current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by its name. Default is current user.
- coroutine fetch_blob(bucket_id_or_name: str, key: str, offset: int = 0, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) AsyncIterator[bytes] [source]¶
Look up the blob and return only its body content. The content is streamed using an asynchronous iterator, e.g.:
async with client.buckets.fetch_blob("my_bucket", key="file.txt") as content:
    async for data in content:
        print("Next chunk of data:", data)
- Parameters
bucket_id_or_name (str) – bucket’s id or name.
key (str) – Key of the blob.
offset (int) – Position in blob from which to read.
cluster_name (str) – cluster to look for a bucket. Default is current cluster.
bucket_owner (str) – bucket owner’s username. Used only when looking up the bucket by its name. Default is current user.
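The offset parameter makes it possible to resume an interrupted transfer. The sketch below is one way to do that, assuming the usage pattern from the example above (`async with ... as content`); the helper name `resume_fetch` is hypothetical, and plain blocking file writes are used for brevity.

```python
from pathlib import Path
from typing import Any


async def resume_fetch(client: Any, bucket: str, key: str, dst: Path) -> int:
    """Append the missing tail of a blob to dst; return the bytes written."""
    # Resume from the current size of the local file (0 if it does not exist).
    offset = dst.stat().st_size if dst.exists() else 0
    written = 0
    with dst.open("ab") as out:
        async with client.buckets.fetch_blob(bucket, key=key, offset=offset) as content:
            async for chunk in content:
                out.write(chunk)
                written += len(chunk)
    return written
```

This sketch does not verify that the local prefix matches the remote blob, so it only suits blobs that are not modified between attempts.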
- coroutine delete_blob(bucket_id_or_name: str, key: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) None [source]¶
Remove blob from the bucket.
- coroutine list_blobs(uri: URL, recursive: bool = False, limit: int = 10000) AsyncContextManager[AsyncIterator[BucketEntry]] [source]¶
List blobs in the bucket. You can filter by prefix and, if
recursive=False
is provided, get results resembling a folder structure.
- Parameters
uri (URL) – URL that specifies bucket and prefix to list blobs, e.g.
yarl.URL("blob:bucket_name/path/in/bucket")
.
recursive (bool) – If
True
, the listing contains all keys filtered by prefix; with
False
, only keys up to the next
/
are returned. Keys that lie deeper are not listed individually: they are grouped under their common prefix and returned as
BlobCommonPrefix
.
limit (int) – Maximum number of
BucketEntry
objects returned.
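To illustrate what a non-recursive listing returns, the following pure-Python sketch folds a flat set of keys the way the description above suggests. It is not the SDK's implementation; `fold_keys` and the sample keys are hypothetical and only model the semantics.

```python
from typing import List, Tuple


def fold_keys(keys: List[str], prefix: str) -> Tuple[List[str], List[str]]:
    """Split keys under prefix into direct blobs and common prefixes."""
    blobs: List[str] = []
    common: List[str] = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition("/")
        if sep:
            # Key lies deeper than the next "/": report only the shared prefix.
            folded = prefix + head + "/"
            if folded not in common:
                common.append(folded)
        else:
            blobs.append(key)
    return blobs, common


keys = ["data/a.txt", "data/img/1.png", "data/img/2.png", "logs/x.log"]
print(fold_keys(keys, "data/"))  # prints (['data/a.txt'], ['data/img/'])
```

In this model the first list corresponds to entries present directly in storage and the second to the "folder"-like entries described by BlobCommonPrefix.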
- coroutine glob_blobs(uri: URL) AsyncContextManager[AsyncIterator[BucketEntry]] [source]¶
Glob search for blobs in the bucket:
async with client.buckets.glob_blobs(
    uri=URL("blob:my_bucket/folder1/**/*.txt")
) as blobs:
    async for blob in blobs:
        print(blob.key)
Similar to
Storage.glob()
, the
“**”
pattern means “this directory and all sub-directories, recursively”.
- Parameters
uri (URL) – URL that specifies bucket and pattern to glob blobs, e.g.
yarl.URL("blob:bucket_name/path/**/*.bin")
.
- coroutine upload_file(src: URL, dst: URL, *, update: bool = False, progress: Optional[AbstractFileProgress] = None) None [source]¶
Similarly to
Storage.upload_file()
, uploads the local file src to the bucket URL dst.
- Parameters
src (URL) – path to the file to upload on local disk, e.g.
yarl.URL("file:///home/andrew/folder/file.txt")
.
dst (URL) – URL that specifies the bucket and key to upload the file to, e.g.
yarl.URL("blob:bucket_name/folder/file.txt")
.
update (bool) – if true, upload only when the source file is newer than the destination file or when the destination file is missing.
progress (AbstractFileProgress) – a callback interface for reporting uploading progress,
None
for no progress report (default).
- coroutine download_file(src: URL, dst: URL, *, update: bool = False, continue_: bool = False, progress: Optional[AbstractFileProgress] = None) None [source]¶
Similarly to
Storage.download_file()
, downloads the remote file src to the local path dst.
- Parameters
src (URL) – URL that specifies the bucket and blob key to download, e.g.
yarl.URL("blob:bucket_name/folder/file.bin")
.
dst (URL) – local path to save the downloaded file, e.g.
yarl.URL("file:///home/andrew/folder/file.bin")
.
update (bool) – if true, download only when the source file is newer than the destination file or when the destination file is missing.
continue_ (bool) – if true, download only the part of the source file past the end of the destination file and append it to the destination file, provided the destination file is newer and not longer than the source file. Otherwise download and overwrite the whole file.
progress (AbstractFileProgress) – a callback interface for reporting downloading progress,
None
for no progress report (default).
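The continue_ rule can be read as a small decision function. The sketch below models only that rule as stated in the parameter description; it is not taken from the SDK source. The name `resume_offset` is hypothetical, and treating "newer" as a strictly greater modification time is an assumption.

```python
from typing import Optional


def resume_offset(
    src_size: int,
    src_mtime: float,
    dst_size: Optional[int],
    dst_mtime: Optional[float],
) -> int:
    """Return the byte offset to resume from, or 0 for a full download."""
    if dst_size is None or dst_mtime is None:
        return 0  # destination is missing: download everything
    # Assumption: "newer" means a strictly greater modification time.
    if dst_mtime > src_mtime and dst_size <= src_size:
        return dst_size  # append only the part past the end of dst
    return 0  # otherwise overwrite the whole file
```

A destination that is longer than the source, or older than it, cannot be a partial copy of the current blob, which is why both cases fall back to a full download.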
- coroutine upload_dir(src: URL, dst: URL, *, update: bool = False, filter: Optional[Callable[[str], Awaitable[bool]]] = None, ignore_file_names: AbstractSet[str] = frozenset(), progress: Optional[AbstractRecursiveFileProgress] = None) None [source]¶
Similarly to
Storage.upload_dir()
, recursively uploads the local directory src to the Blob Storage URL dst.
- Parameters
src (URL) – path to the directory to upload on local disk, e.g.
yarl.URL("file:///home/andrew/folder")
.
dst (URL) – path on Blob Storage to save the uploaded directory to, e.g.
yarl.URL("blob:bucket_name/folder/")
.
update (bool) – if true, upload only when the source file is newer than the destination file or when the destination file is missing.
filter (Callable[[str], Awaitable[bool]]) – a callback function for determining which files and subdirectories should be uploaded. It is called with the relative path of a file or directory, and if the result is false the file or directory is skipped.
ignore_file_names (AbstractSet[str]) – a set of names of files which specify filters for skipping files and subdirectories. The format of ignore files is the same as
.gitignore
.
progress (AbstractRecursiveFileProgress) – a callback interface for reporting uploading progress,
None
for no progress report (default).
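A filter callback matching the `Callable[[str], Awaitable[bool]]` type could look like the sketch below. The function name and the skip rules are hypothetical, and it assumes that relative paths use "/" as the separator.

```python
import asyncio
from pathlib import PurePosixPath


async def skip_hidden_and_caches(rel_path: str) -> bool:
    """Return False for paths that should not be uploaded."""
    parts = PurePosixPath(rel_path).parts
    # Skip anything that is, or lies inside, a hidden or __pycache__ entry.
    return not any(p.startswith(".") or p == "__pycache__" for p in parts)


# Hypothetical call site, assuming an initialized client and yarl.URL values:
# await client.buckets.upload_dir(src, dst, filter=skip_hidden_and_caches)
print(asyncio.run(skip_hidden_and_caches("pkg/__pycache__/mod.pyc")))  # prints False
```

Because the callback is awaited, it may also perform asynchronous checks, although a purely computational rule like this one is the simplest case.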
- coroutine download_dir(src: URL, dst: URL, *, update: bool = False, continue_: bool = False, filter: Optional[Callable[[str], Awaitable[bool]]] = None, progress: Optional[AbstractRecursiveFileProgress] = None) None [source]¶
Similarly to
Storage.download_dir()
, recursively downloads the remote directory src to the local path dst.
- Parameters
src (URL) – path on Blob Storage to download a directory from, e.g.
yarl.URL("blob:bucket_name/folder/")
.
dst (URL) – local path to save the downloaded directory, e.g.
yarl.URL("file:///home/andrew/folder")
.
update (bool) – if true, download only when the source file is newer than the destination file or when the destination file is missing.
continue_ (bool) – if true, download only the part of the source file past the end of the destination file and append it to the destination file, provided the destination file is newer and not longer than the source file. Otherwise download and overwrite the whole file.
filter (Callable[[str], Awaitable[bool]]) – a callback function for determining which files and subdirectories should be downloaded. It is called with the relative path of a file or directory, and if the result is false the file or directory is skipped.
progress (AbstractRecursiveFileProgress) – a callback interface for reporting downloading progress,
None
for no progress report (default).
- coroutine blob_is_dir(uri: URL) bool [source]¶
Check whether uri specifies a “folder” blob in a bucket.
- Parameters
uri (URL) – URL that specifies the bucket and blob key, e.g.
yarl.URL("blob:bucket_name/folder/sub_folder")
.
- coroutine blob_rm(uri: URL, *, recursive: bool = False, progress: Optional[AbstractDeleteProgress] = None) None [source]¶
Remove blobs from bucket.
- Parameters
uri (URL) – URL that specifies bucket and blob key e.g.
yarl.URL("blob:bucket_name/folder/sub_folder")
.
recursive (bool) – remove a directory recursively with all nested files and folders if
True
(
False
by default).
progress (AbstractDeleteProgress) – a callback interface for reporting delete progress,
None
for no progress report (default).
- Raises
IsADirectoryError
if uri points to a directory and the recursive flag is not set.
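One way to handle the IsADirectoryError case is to make recursive removal an explicit opt-in. The sketch below assumes an initialized client; the helper name `remove_blob` and its `allow_recursive` flag are hypothetical.

```python
from typing import Any


async def remove_blob(client: Any, uri: Any, allow_recursive: bool = False) -> bool:
    """Remove uri; return False if it is a directory and recursion is not allowed."""
    try:
        await client.buckets.blob_rm(uri)
        return True
    except IsADirectoryError:
        if not allow_recursive:
            return False
        # The caller opted in, so retry and remove the whole "folder".
        await client.buckets.blob_rm(uri, recursive=True)
        return True
```

Trying the non-recursive call first means that a mistyped URI pointing at a "folder" never deletes nested blobs unless the caller asked for it.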
- coroutine make_signed_url(uri: URL, expires_in_seconds: int = 3600) URL [source]¶
Generate a signed URL that allows temporary access to the blob.
- coroutine get_disk_usage(bucket_id_or_name: str, cluster_name: Optional[str] = None, bucket_owner: Optional[str] = None) AsyncContextManager[AsyncIterator[BucketUsage]] [source]¶
Get the disk space usage of a given bucket. The iterator yields partial results, because the calculation for the whole bucket can take time.
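Since results arrive incrementally, a caller that only needs the final figure can drain the iterator and keep the last item. This is a sketch under the assumption that each yielded BucketUsage supersedes the previous one, which the description above does not state explicitly; the helper name `final_disk_usage` is hypothetical.

```python
from typing import Any, Optional


async def final_disk_usage(client: Any, bucket: str) -> Optional[Any]:
    """Consume all partial results and return the last one (None if empty)."""
    last = None
    async with client.buckets.get_disk_usage(bucket) as usage:
        async for partial in usage:
            last = partial  # assumption: each item supersedes the previous one
    return last
```

An interactive tool would more likely report each partial result as it arrives rather than wait for the last one.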
- async-for persistent_credentials_list(cluster_name: Optional[str] = None) AsyncContextManager[AsyncIterator[PersistentBucketCredentials]] [source]¶
List user’s bucket persistent credentials, async iterator. Yields
PersistentBucketCredentials
instances.
- Parameters
cluster_name (str) – cluster to list persistent credentials. Default is current cluster.
- coroutine persistent_credentials_create(bucket_ids: Iterable[str], name: Optional[str], read_only: Optional[bool] = False, cluster_name: Optional[str] = None) PersistentBucketCredentials [source]¶
Create new persistent credentials for the given set of buckets.
- Parameters
bucket_ids (Iterable[str]) – Iterable of bucket ids to create credentials for.
name (Optional[str]) – Name of the persistent credentials. Should be unique among all user’s bucket persistent credentials.
read_only (bool) – Allow only read-only access using the created credentials.
False
by default.
cluster_name (str) – cluster to create persistent credentials in. Default is current cluster.
- Returns
Newly created credentials info (
PersistentBucketCredentials
)
- coroutine persistent_credentials_get(credential_id_or_name: str, cluster_name: Optional[str] = None) PersistentBucketCredentials [source]¶
Get persistent credentials with id or name credential_id_or_name.
- Parameters
- Returns
Credentials info (
PersistentBucketCredentials
)
Bucket¶
- class neuro_sdk.Bucket¶
Read-only
dataclass
for describing a single bucket.
- provider¶
Blob storage provider this bucket belongs to,
Bucket.Provider
.
BucketCredentials¶
Bucket.Provider¶
PersistentBucketCredentials¶
- class neuro_sdk.PersistentBucketCredentials¶
Read-only
dataclass
for describing persistent credentials to some set of buckets, created at the user’s request.
- name¶
The credentials name set by user, unique among all user’s bucket credentials,
str
or
None
if no name was set.
- credentials¶
List of per bucket credentials,
List[BucketCredentials]
BucketEntry¶
- class neuro_sdk.BucketEntry¶
An abstract
dataclass
for describing bucket content entries.
- created_at¶
Blob creation timestamp,
datetime
or
None
if the underlying blob engine does not store such information.
BlobObject¶
- class neuro_sdk.BlobObject¶
A descendant of
BucketEntry
used for keys that are present directly in the underlying blob storage.
BlobCommonPrefix¶
- class neuro_sdk.BlobCommonPrefix¶
A descendant of
BucketEntry
for describing common prefixes for blobs in non-recursive listing. You can treat it as a kind of folder on Blob Storage.