# Stable Structures

## TLDR
- 96 GiB of stable memory
- Persistent across upgrades
- Familiar API
- Must specify memory id
- Must specify maximum key size
- Must specify maximum value size
- No migrations per memory id
Stable structures are data structures with familiar APIs that allow access to stable memory. Stable memory is a separate memory location from the heap that currently allows up to 96 GiB of storage. Stable memory persists automatically across upgrades.
Persistence on the Internet Computer (IC) is very important to understand. When a canister is upgraded (its code is changed after being initially deployed), its heap is wiped. This includes all global variables.
On the other hand, anything stored in stable memory will be preserved. Writing and reading to and from stable memory can be done with a low-level API, but it is generally easier and preferable to use stable structures.
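To make the contrast concrete, here's a minimal sketch (the counter names and the generous size arguments are illustrative, not part of any API): a plain global lives on the heap and is reset by an upgrade, while a value kept in a `StableBTreeMap` survives it.

```python
from kybra import nat64, update, StableBTreeMap

# Heap state: wiped on every upgrade, starting over at 0.
heap_counter: int = 0

# Stable state: persists across upgrades.
stable_counter = StableBTreeMap[nat64, nat64](
    memory_id=0, max_key_size=100, max_value_size=100
)

@update
def increment() -> nat64:
    global heap_counter
    heap_counter += 1  # lost on the next upgrade

    count = (stable_counter.get(0) or 0) + 1
    stable_counter.insert(0, count)  # survives the next upgrade

    return count
```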
Kybra currently provides one stable structure called `StableBTreeMap`. It's similar to a Python dictionary or a map that you'd find in most languages, and has most of the common operations you'd expect, such as reading, inserting, and removing values.
Here's how to define a simple `StableBTreeMap`. Each `StableBTreeMap` must be defined in the global scope (not within any function, class, or other local scope):
```python
from kybra import nat8, StableBTreeMap

map = StableBTreeMap[nat8, str](memory_id=0, max_key_size=100, max_value_size=1_000)
```
This is a `StableBTreeMap` with a key of type `nat8` and a value of type `str`. Key and value types can be any Candid type.
This `StableBTreeMap` also has a `memory_id` of `0`, a maximum key size of `100` bytes, and a maximum value size of `1_000` bytes. You must specify the `memory_id`, maximum key size, and maximum value size statically (they cannot be variables). The `memory_id` can be a number between `0` and `254`. We hope to reduce the complexity around `memory_id`s in the future, and perhaps remove the need to specify them entirely.
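The static requirement means the arguments must appear as literals in the source. A sketch of what presumably is and isn't accepted at build time (the rejection behavior is our reading of the rule above, not an error we've catalogued):

```python
from kybra import nat8, StableBTreeMap

# OK: literal arguments that Kybra can read at build time.
map = StableBTreeMap[nat8, str](memory_id=0, max_key_size=100, max_value_size=1_000)

# Not OK: values computed or stored in variables first.
# some_id = 0
# map = StableBTreeMap[nat8, str](memory_id=some_id, max_key_size=100, max_value_size=1_000)
```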
Each `StableBTreeMap` instance must have a unique `memory_id`. Once a `memory_id` has been allocated, it cannot be used with a different `StableBTreeMap`: you can't create another `StableBTreeMap` using the same `memory_id`, and you can't change the key or value types of an existing `StableBTreeMap`. We hope to address this limitation in the future.
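Since key and value types can be any Candid type, a second map can hold richer values. A sketch with a hypothetical `Person` record (the names and sizes are illustrative); note the distinct `memory_id` for each map:

```python
from kybra import nat8, Record, StableBTreeMap

class Person(Record):
    name: str
    age: nat8

map = StableBTreeMap[nat8, str](memory_id=0, max_key_size=100, max_value_size=1_000)
people = StableBTreeMap[str, Person](memory_id=1, max_key_size=50, max_value_size=200)
```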
Here's an example showing all of the basic `StableBTreeMap` operations:
```python
from kybra import (
    Alias,
    nat64,
    nat8,
    Opt,
    query,
    StableBTreeMap,
    Tuple,
    update,
    Vec,
)

Key = Alias[nat8]
Value = Alias[str]

map = StableBTreeMap[Key, Value](memory_id=0, max_key_size=100, max_value_size=1_000)

@query
def contains_key(key: Key) -> bool:
    return map.contains_key(key)

@query
def get(key: Key) -> Opt[Value]:
    return map.get(key)

@update
def insert(key: Key, value: Value) -> Opt[Value]:
    return map.insert(key, value)

@query
def is_empty() -> bool:
    return map.is_empty()

@query
def items() -> Vec[Tuple[Key, Value]]:
    return map.items()

@query
def keys() -> Vec[Key]:
    return map.keys()

@query
def len() -> nat64:
    return map.len()

@update
def remove(key: Key) -> Opt[Value]:
    return map.remove(key)

@query
def values() -> Vec[Value]:
    return map.values()
```
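As the `Opt[Value]` return types suggest, `insert` and `remove` presumably report the value previously stored at the key, mirroring the semantics of the underlying Rust `BTreeMap`. A hypothetical call sequence:

```python
map.insert(1, "a")   # None: key 1 was empty
map.insert(1, "b")   # "a": the previous value is returned
map.contains_key(1)  # True
map.remove(1)        # "b": the removed value is returned
map.get(1)           # None
```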
With these basic operations you can build more complex CRUD database applications:
```python
import secrets

from kybra import (
    blob,
    ic,
    nat64,
    Opt,
    Principal,
    query,
    Record,
    StableBTreeMap,
    update,
    Variant,
    Vec,
)

class User(Record):
    id: Principal
    created_at: nat64
    recording_ids: Vec[Principal]
    username: str

class Recording(Record):
    id: Principal
    audio: blob
    created_at: nat64
    name: str
    user_id: Principal

users = StableBTreeMap[Principal, User](
    memory_id=0, max_key_size=38, max_value_size=100_000
)
recordings = StableBTreeMap[Principal, Recording](
    memory_id=1, max_key_size=38, max_value_size=5_000_000
)

@update
def create_user(username: str) -> User:
    id = generate_id()
    user: User = {
        "id": id,
        "created_at": ic.time(),
        "recording_ids": [],
        "username": username,
    }

    users.insert(user["id"], user)

    return user

@query
def read_users() -> Vec[User]:
    return users.values()

@query
def read_user_by_id(id: Principal) -> Opt[User]:
    return users.get(id)

class DeleteUserResult(Variant, total=False):
    Ok: User
    Err: "DeleteUserErr"

class DeleteUserErr(Variant, total=False):
    UserDoesNotExist: Principal

@update
def delete_user(id: Principal) -> DeleteUserResult:
    user = users.get(id)

    if user is None:
        return {"Err": {"UserDoesNotExist": id}}

    for recording_id in user["recording_ids"]:
        recordings.remove(recording_id)

    users.remove(user["id"])

    return {"Ok": user}

class CreateRecordingResult(Variant, total=False):
    Ok: Recording
    Err: "CreateRecordingErr"

class CreateRecordingErr(Variant, total=False):
    UserDoesNotExist: Principal

@update
def create_recording(
    audio: blob, name: str, user_id: Principal
) -> CreateRecordingResult:
    user = users.get(user_id)

    if user is None:
        return {"Err": {"UserDoesNotExist": user_id}}

    id = generate_id()
    recording: Recording = {
        "id": id,
        "audio": audio,
        "created_at": ic.time(),
        "name": name,
        "user_id": user_id,
    }

    recordings.insert(recording["id"], recording)

    updated_user: User = {
        "id": user["id"],
        "created_at": user["created_at"],
        "username": user["username"],
        "recording_ids": [*user["recording_ids"], recording["id"]],
    }

    users.insert(updated_user["id"], updated_user)

    return {"Ok": recording}

@query
def read_recordings() -> Vec[Recording]:
    return recordings.values()

@query
def read_recording_by_id(id: Principal) -> Opt[Recording]:
    return recordings.get(id)

class DeleteRecordingResult(Variant, total=False):
    Ok: Recording
    Err: "DeleteRecordingError"

class DeleteRecordingError(Variant, total=False):
    RecordingDoesNotExist: Principal
    UserDoesNotExist: Principal

@update
def delete_recording(id: Principal) -> DeleteRecordingResult:
    recording = recordings.get(id)

    if recording is None:
        return {"Err": {"RecordingDoesNotExist": id}}

    user = users.get(recording["user_id"])

    if user is None:
        return {"Err": {"UserDoesNotExist": recording["user_id"]}}

    updated_user: User = {
        "id": user["id"],
        "created_at": user["created_at"],
        "username": user["username"],
        "recording_ids": list(
            filter(
                lambda recording_id: recording_id.to_str() != recording["id"].to_str(),
                user["recording_ids"],
            )
        ),
    }

    users.insert(updated_user["id"], updated_user)

    recordings.remove(id)

    return {"Ok": recording}

def generate_id() -> Principal:
    random_bytes = secrets.token_bytes(29)
    return Principal.from_hex(random_bytes.hex())
```
The example above shows a very basic audio recording backend application. There are two types of entities that need to be stored, `User` and `Recording`. These are represented as Candid records.
Each entity gets its own `StableBTreeMap`:
```python
from kybra import blob, nat64, Principal, Record, StableBTreeMap, Vec

class User(Record):
    id: Principal
    created_at: nat64
    recording_ids: Vec[Principal]
    username: str

class Recording(Record):
    id: Principal
    audio: blob
    created_at: nat64
    name: str
    user_id: Principal

users = StableBTreeMap[Principal, User](
    memory_id=0, max_key_size=38, max_value_size=100_000
)
recordings = StableBTreeMap[Principal, Recording](
    memory_id=1, max_key_size=38, max_value_size=5_000_000
)
```
Notice that each `StableBTreeMap` has a unique `memory_id`. The maximum key and value sizes are also set according to the expected application usage.
You can figure out the appropriate maximum key and value sizes by reasoning about your application and engaging in some trial and error with the `insert` method. Calling `insert` on a `StableBTreeMap` with a key or value that exceeds the configured maximum will throw an error, and in some cases that error carries the information you need to determine the correct maximum key or value size.
If you attempt to insert a key or value that is too large, the `KeyTooLarge` and `ValueTooLarge` errors will show you the size of the key or value that you attempted to insert. You can increase the maximum key or value size based on the information in those errors and try inserting again.
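For example, a probe like the following can surface those messages (a sketch that assumes only that an oversized `insert` raises an ordinary Python exception whose message reports the attempted size; the function name is ours):

```python
@update
def probe_value_size(value: str) -> str:
    try:
        map.insert(0, value)
        return "inserted"
    except Exception as e:
        # For an oversized value, the message should mention
        # ValueTooLarge along with the size that was attempted.
        return str(e)
```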
Thus, through some trial and error, you can whittle your way to a correct solution. In some cases all of your values will have an obvious static maximum size. In the audio recording example, trial and error revealed that a `Principal` is most likely always 38 bytes, so the maximum key size is set to 38.
Maximum value sizes can be trickier to figure out, especially if the values are records or variants with dynamic fields such as arrays. `User` has one such dynamic field, `recording_ids`. Since each recording id is a `Principal`, we know that each will take up 38 bytes. The other fields on `User` shouldn't take up too many bytes, so we'll ignore them in our analysis.
We've set the maximum value size of `User` to 100_000 bytes. Dividing 100_000 by 38 gives ~2_631, so each user will be able to store around that many recordings. That's acceptable for our example, so we'll go with it.
As for `Recording`, the largest dynamic field is `audio`, which holds the actual bytes of the audio recording. We've set the maximum value size here to 5_000_000 bytes, which should allow for recordings of roughly 5 MB. That seems reasonable for our example, so we'll go with it.
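The back-of-the-envelope arithmetic behind both numbers, in plain Python:

```python
PRINCIPAL_BYTES = 38  # observed size of a Principal key

# User: dominated by recording_ids, a list of Principals.
print(100_000 // PRINCIPAL_BYTES)  # 2631 -> roughly 2_631 recordings per user

# Recording: dominated by the raw audio bytes.
print(5_000_000 / 1_000_000)  # 5.0 -> roughly 5 MB of audio per recording
```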
As you can see, finding the correct maximum key and value sizes is a bit of an art right now. Combining some trial and error with reasoning about your specific application should get you a working solution in most cases. It's our hope that the need to specify maximum key and value sizes will be removed in the future.
## Caveats

### Keys
You should be wary of using `float64`, `float32`, `service`, or `func` in any type that serves as a key for a stable structure. These types cannot be strictly ordered in all cases: `service` and `func` have no order at all, and `float64` and `float32` treat `NaN` as less than any other value. These caveats may impact key performance.
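The float caveat is easy to see in plain Python: `NaN` is unordered with respect to everything, including itself, so floats don't form a total order on their own and the stable structure has to impose one (hence treating `NaN` as less than everything else):

```python
nan = float("nan")

print(nan < 1.0)   # False
print(nan > 1.0)   # False
print(nan == nan)  # False: NaN isn't even equal to itself
```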