View

A View is a map/reduce-powered method of quickly accessing information inside of a Collection. A View can only belong to one Collection.

Views define two important associated types: a Key type and a Value type. You can think of these as the equivalent entries in a map/dictionary-like collection that supports more than one entry for each Key. The Key is used to filter the View's results, and the Value is used by your application or the reduce() function.

Views are a powerful, yet abstract concept. Let's look at a concrete example: blog posts with categories.

#[derive(Serialize, Deserialize, Debug)]
pub struct BlogPost {
    pub title: String,
    pub body: String,
    pub category: Option<String>,
}

While category should be an enum, let's first explore using String and upgrade to an enum at the end (it requires one additional step). Let's implement a View that will allow users to find blog posts by their category as well as count the number of posts in each category.

#[derive(Debug)]
pub struct BlogPostsByCategory;

impl View for BlogPostsByCategory {
    type Collection = BlogPost;
    type Key = Option<String>;
    type Value = u32;

    fn map(&self, document: &Document<'_>) -> MapResult<Self::Key, Self::Value> {
        let post = document.contents::<BlogPost>()?;
        Ok(Some(document.emit_key_and_value(post.category.clone(), 1)))
    }

    fn reduce(
        &self,
        mappings: &[MappedValue<Self::Key, Self::Value>],
        _rereduce: bool,
    ) -> Result<Self::Value, Error> {
        Ok(mappings.iter().map(|mapping| mapping.value).sum())
    }
}

Map

The first line of the map function calls Document::contents() to deserialize the stored BlogPost. The second line returns an emitted Key and Value -- in our case a clone of the post's category and the value 1_u32. With the map function, we're able to use query() and query_with_docs():

    let rust_posts = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(String::from("Rust")))
        .query_with_docs().await?;

The above queries the Database for all documents in the BlogPost Collection that emitted a Key of Some("Rust").
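To make the relationship between map() and a keyed query more tangible, here is a minimal sketch that models a view index with only the standard library. It is not how PliantDb stores its data; StoredPost, Entry, build_index, and documents_with_key are hypothetical names invented for this illustration.

```rust
use std::collections::BTreeMap;

/// A simplified, hypothetical stand-in for a stored document.
pub struct StoredPost {
    pub id: u64,
    pub category: Option<String>,
}

/// One emitted view entry: the source document's ID and the emitted value.
pub type Entry = (u64, u32);

/// Mirrors the `map()` function above: emit the category as the key and 1 as the value.
pub fn map_post(post: &StoredPost) -> (Option<String>, u32) {
    (post.category.clone(), 1)
}

/// Builds a sorted index in which each key may hold several entries.
pub fn build_index(posts: &[StoredPost]) -> BTreeMap<Option<String>, Vec<Entry>> {
    let mut index: BTreeMap<Option<String>, Vec<Entry>> = BTreeMap::new();
    for post in posts {
        let (key, value) = map_post(post);
        index.entry(key).or_default().push((post.id, value));
    }
    index
}

/// Models a keyed query: return the IDs of the documents that emitted `key`.
pub fn documents_with_key(
    index: &BTreeMap<Option<String>, Vec<Entry>>,
    key: &Option<String>,
) -> Vec<u64> {
    index
        .get(key)
        .map(|entries| entries.iter().map(|(id, _)| *id).collect::<Vec<u64>>())
        .unwrap_or_default()
}
```

In this model, looking up Some("Rust") returns the IDs of every post that emitted that key, and a key that no document emitted returns an empty list.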

Reduce

The second function to learn about is the reduce() function. It is responsible for turning an array of Key/Value pairs into a single Value. In some cases, PliantDb might need to call reduce() with values that have already been reduced one time. If this is the case, rereduce is set to true.

In this example, we use the built-in Iterator::sum() function to add up the emitted values of 1_u32, producing a single u32 that represents the total number of matching documents.

    let rust_post_count = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(String::from("Rust")))
        .reduce().await?;

Understanding Re-reduce

Let's examine this data set:

| Document ID | BlogPost Category |
|-------------|-------------------|
| 1           | Some("Rust")      |
| 2           | Some("Rust")      |
| 3           | Some("Cooking")   |
| 4           | None              |

When updating views, each view entry is reduced and the value is cached. These are the view entries:

| View Entry ID   | Reduced Value |
|-----------------|---------------|
| Some("Rust")    | 2             |
| Some("Cooking") | 1             |
| None            | 1             |

When a reduce query is issued for a single key, the cached value can be returned without further processing. However, if the reduce query matches multiple keys, the View's reduce() function is called with the already-reduced values and with rereduce set to true. For example, retrieving the total count of blog posts:

    let total_post_count = db
        .view::<BlogPostsByCategory>()
        .reduce().await?;

Once PliantDb has gathered each key's reduced value, it needs to further reduce that list into a single value. To accomplish this, the View's reduce() function is invoked with rereduce set to true, and with mappings containing:

| Key             | Value |
|-----------------|-------|
| Some("Rust")    | 2     |
| Some("Cooking") | 1     |
| None            | 1     |

This produces a final value of 4.
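The two passes described above can be sketched with the standard library alone. This is only a model of the idea, not PliantDb's implementation; CategoryKey, reduce_each_key, and rereduce_all are hypothetical names, and the keys are simplified to Option<&'static str>.

```rust
use std::collections::BTreeMap;

/// Simplified key type for this sketch.
pub type CategoryKey = Option<&'static str>;

/// Mirrors the view's `reduce()`. Summing gives the right answer for both raw
/// mappings and already-reduced values, so `rereduce` can be ignored here.
/// A reduce that counted `values.len()` instead would have to branch on it.
pub fn reduce(values: &[u32], _rereduce: bool) -> u32 {
    values.iter().sum()
}

/// First pass: reduce the mapped values of each key and cache the result.
pub fn reduce_each_key(mappings: &[(CategoryKey, u32)]) -> BTreeMap<CategoryKey, u32> {
    let mut grouped: BTreeMap<CategoryKey, Vec<u32>> = BTreeMap::new();
    for (key, value) in mappings {
        grouped.entry(*key).or_default().push(*value);
    }
    grouped
        .into_iter()
        .map(|(key, values)| (key, reduce(&values, false)))
        .collect()
}

/// Second pass: reduce the cached per-key values into a single value.
pub fn rereduce_all(cached: &BTreeMap<CategoryKey, u32>) -> u32 {
    let values: Vec<u32> = cached.values().copied().collect();
    reduce(&values, true)
}
```

Feeding the four documents from the data set through reduce_each_key yields the cached values 2, 1, and 1, and rereduce_all then combines them into 4.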

How does PliantDb make this efficient?

When saving Documents, PliantDb does not immediately update related views. It instead notes what documents have been updated since the last time the View was indexed.

When a View is accessed, the queries include an AccessPolicy. If you aren't overriding it, UpdateBefore is used. This means that when the query is evaluated, PliantDb will first check if the index is out of date due to any updated data. If it is, it will update the View before evaluating the query.

If you want results quickly and are willing to accept data that might be out of date, the access policies UpdateAfter and NoUpdate can be used depending on your needs.

If multiple simultaneous queries are being evaluated for the same View and the View is outdated, PliantDb ensures that only a single view indexer executes while all of the queries wait for it to complete.
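The lazy indexing and access policies described above can be modeled in a few lines. The sketch below is a single-threaded toy, so it does not show the shared indexer, and it runs the UpdateAfter refresh synchronously rather than in the background. ToyDb and its methods are hypothetical, and the AccessPolicy enum here is a local stand-in that only borrows the variant names from the text.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Local stand-in for the access policies named above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AccessPolicy {
    UpdateBefore,
    UpdateAfter,
    NoUpdate,
}

/// A toy database with a single count-by-category view.
#[derive(Default)]
pub struct ToyDb {
    documents: BTreeMap<u64, Option<String>>,
    /// IDs of documents changed since the view was last indexed.
    invalidated: BTreeSet<u64>,
    /// The view index: document ID -> emitted key.
    index: BTreeMap<u64, Option<String>>,
}

impl ToyDb {
    /// Saving does not touch the index; it only notes that the document changed.
    pub fn save(&mut self, id: u64, category: Option<String>) {
        self.documents.insert(id, category);
        self.invalidated.insert(id);
    }

    /// Re-indexes only the documents that changed since the last update.
    fn update_index(&mut self) {
        for id in std::mem::take(&mut self.invalidated) {
            if let Some(category) = self.documents.get(&id) {
                self.index.insert(id, category.clone());
            }
        }
    }

    fn count_indexed(&self, key: &Option<String>) -> usize {
        self.index.values().filter(|k| *k == key).count()
    }

    /// Counts the entries for `key`, updating the index according to `policy`.
    pub fn count(&mut self, key: &Option<String>, policy: AccessPolicy) -> usize {
        match policy {
            AccessPolicy::UpdateBefore => {
                self.update_index();
                self.count_indexed(key)
            }
            AccessPolicy::UpdateAfter => {
                let result = self.count_indexed(key);
                self.update_index();
                result
            }
            AccessPolicy::NoUpdate => self.count_indexed(key),
        }
    }
}
```

In this model, a query with NoUpdate or UpdateAfter issued right after a save still sees the old index, while UpdateBefore always reflects the latest saved documents.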

Using arbitrary types as a View Key

In our previous example, we used Option<String> for the Key type. The reason is important: Keys must be sortable by our underlying storage engine, which means special care must be taken. Most serialization formats do not guarantee that the binary sort order of encoded values matches the order of the values themselves. Rather than relying on a serialization format, PliantDb exposes the Key trait. On that documentation page, you can see that PliantDb implements Key for many built-in types.
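A small standard-library experiment shows why the byte encoding matters. The helper names below are hypothetical and exist only for this illustration; the sketch sorts numbers by their encoded bytes, as a byte-ordered storage engine would, and compares a big-endian encoding with a little-endian one.

```rust
/// Encodes a number so that comparing the bytes matches comparing the numbers.
pub fn big_endian_key(value: u32) -> [u8; 4] {
    value.to_be_bytes()
}

/// Little-endian encoding, shown only as a counterexample.
pub fn little_endian_key(value: u32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Returns the values ordered by the byte-wise order of their encoded keys.
pub fn sort_by_encoded_key(values: &[u32], encode: fn(u32) -> [u8; 4]) -> Vec<u32> {
    let mut sorted = values.to_vec();
    sorted.sort_by_key(|value| encode(*value));
    sorted
}
```

Sorting 256, 1, and 2 by big_endian_key yields numeric order, whereas little_endian_key places 256 first because its lowest byte is 0.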

Using an enum as a View Key

The easiest way to expose an enum is to derive num_traits::FromPrimitive and num_traits::ToPrimitive using num-derive, and add an impl EnumKey line:

#[derive(Serialize, Deserialize, Debug, num_derive::FromPrimitive, num_derive::ToPrimitive)]
pub enum Category {
    Rust,
    Cooking,
}

impl EnumKey for Category {}

The View code remains unchanged, although the associated Key type can now be set to Option<Category>. The queries can now use the enum instead of a String:

    let rust_post_count = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(Category::Rust))
        .reduce().await?;

PliantDb will convert the enum to a u64 and use that value as the Key. A u64 was chosen to ensure fairly wide compatibility even with some extreme usages of bitmasks. If you wish to customize this behavior, you can implement Key directly.
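As a rough illustration of that conversion, the sketch below turns an enum into a u64 and then into big-endian bytes using only the standard library. It assumes the default discriminants (Rust = 0, Cooking = 1); the two helper functions are hypothetical and stand in for what the derived num_traits conversions and the Key implementation do together.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Rust,
    Cooking,
}

/// Enum -> u64 -> big-endian bytes.
pub fn category_to_key_bytes(category: Category) -> [u8; 8] {
    (category as u64).to_be_bytes()
}

/// Big-endian bytes -> u64 -> enum; values with no matching variant are rejected.
pub fn category_from_key_bytes(bytes: [u8; 8]) -> Option<Category> {
    match u64::from_be_bytes(bytes) {
        0 => Some(Category::Rust),
        1 => Some(Category::Cooking),
        _ => None,
    }
}
```

Because the key is derived from the discriminant, reordering or inserting variants would change the stored keys unless explicit discriminant values are assigned.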

Implementing the Key trait

The Key trait declares two functions: as_big_endian_bytes() and from_big_endian_bytes(). The intention is to convert numerical types to bytes in network byte order (big-endian), and to store non-numerical types as bytes in a binary-sortable order.

Here is how PliantDb implements Key for every type that implements EnumKey:

impl<T> Key for T
where
    T: EnumKey,
{
    fn as_big_endian_bytes(&self) -> anyhow::Result<Cow<'_, [u8]>> {
        self.to_u64()
            .ok_or_else(|| anyhow::anyhow!("Primitive::to_u64() returned None"))?
            .as_big_endian_bytes()
            .map(|bytes| Cow::Owned(bytes.to_vec()))
    }

    fn from_big_endian_bytes(bytes: &[u8]) -> anyhow::Result<Self> {
        let primitive = u64::from_big_endian_bytes(bytes)?;
        Self::from_u64(primitive)
            .ok_or_else(|| anyhow::anyhow!("Primitive::from_u64() returned None"))
    }
}

By implementing Key you can take full control of converting your view keys.
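As a sketch of what a hand-written conversion might look like, the example below encodes a composite key by concatenating fixed-width big-endian fields, which keeps the byte order aligned with the field-by-field order. To stay self-contained it uses a hypothetical SortableKey trait that only mirrors the two functions described above, with Vec<u8> and Result<_, String> in place of the Cow and anyhow::Result types; PublishedOn is likewise invented for illustration.

```rust
/// Hypothetical stand-in that mirrors the two functions of the Key trait.
pub trait SortableKey: Sized {
    fn as_big_endian_bytes(&self) -> Vec<u8>;
    fn from_big_endian_bytes(bytes: &[u8]) -> Result<Self, String>;
}

/// A composite key: ordered by year, then by day of the year.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct PublishedOn {
    pub year: u16,
    pub day_of_year: u16,
}

impl SortableKey for PublishedOn {
    fn as_big_endian_bytes(&self) -> Vec<u8> {
        // Most significant field first, each field in big-endian order.
        let mut bytes = Vec::with_capacity(4);
        bytes.extend_from_slice(&self.year.to_be_bytes());
        bytes.extend_from_slice(&self.day_of_year.to_be_bytes());
        bytes
    }

    fn from_big_endian_bytes(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() != 4 {
            return Err(format!("expected 4 bytes, got {}", bytes.len()));
        }
        Ok(Self {
            year: u16::from_be_bytes([bytes[0], bytes[1]]),
            day_of_year: u16::from_be_bytes([bytes[2], bytes[3]]),
        })
    }
}
```

The design choice worth noting is that every field has a fixed width. Variable-length fields, such as strings, would need a terminator or length-aware scheme to remain binary-sortable when concatenated.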