Programming

Aggregating trees with RavenDB

We got an interesting question in the RavenDB discussion group:How to do aggregation on a tree structure?The task is to build a Work Breakdown Structure, where you have:ProjectsMajor deliverablesSub-deliverablesWork packagesThe idea is to be able to track EstimatedHours and CompletedHours across the entire tree. For example, let’s say that I have the following:Project: Bee Keeper Chronicle AppMajor deliverable: App DesignSub-deliverable: Wireframes all screensWork Package: Login page wireframeUsers can add the EstimatedHours and CompletedHours at any level, and we want to be able to aggregate the data upward. So the Project: “Bee Keeper Chronicle App” should have a total estimated time and the number of hours that were worked on.The question is how to model & track that in RavenDB. Here is what I think the document structure should look like:{ "Name": "Login page wire frame", "Parent": { "Type": "Subs", "Id": "subs/0000000000000000009-A" }, "EsimatedHours": 8, "CompletedHours": 3, "@metadata": { "@collection": "WorkPackages" } } { "Name": "Wire frames all screens", "Parent": { "Type": "Majors", "Id": "major/0000000000000000008-A" }, "EsimatedHours": 20, "CompletedHours": 7, "@metadata": { "@collection": "Subs" } } { "Name": "App Design", "Parent": { "Type": "Projects", "Id": "projects/0000000000000000011-A" }, "EsimatedHours": 50, "CompletedHours": 12, "@metadata": { "@collection": "Majors" } } { "Name": "Bee Keeper Chronicle App", "EsimatedHours": 34, "CompletedHours": 21, "@metadata": { "@collection": "Projects" } }I used a Parent relationship, since that is the most flexible (it allows you to move each item to a completely different part of the tree easily). Now, we need to aggregate the values for all of those items using a Map-Reduce index. Because of the similar structure, I created the following JS function:function processWorkBreakdownHours(doc) { let hours = { EsimatedHours: doc.EsimatedHours, CompletedHours: doc.CompletedHours }; let results = [Object.assign({ Scope: id(doc) }, hours)]; let current = doc; while (current.Parent) { current = load(current.Parent.Id, current.Parent.Type); results.push(Object.assign({ Scope: id(current) }, hours)); } return results; }This will scan over the hierarchy and add the number of estimated and completed hours to all the levels. Now we just need to add this file as Additional Sources to an index and call it for all the relevant collections, like this:map("WorkPackages",processWorkBreakdownHours); map("Subs",processWorkBreakdownHours); map("Majors",processWorkBreakdownHours); map("Projects",processWorkBreakdownHours);And the last step is to aggregate across all of them in the reduce function: groupBy(x => x.Scope).aggregate(g => { return { Scope: g.key, EsimatedHours: g.values.reduce((c, val) => val.EsimatedHours + c, 0), CompletedHours: g.values.reduce((c, val) => val.CompletedHours + c, 0) }; })You can see the full index definition here.The end result is automatic aggregation at all levels. Change one item, and it will propagate upward.Considerations: I’m using load() here, which creates a reference from the parent to the child. The idea is that if we move a Work Package from one Sub-deliverable to another (in the same or a different Major & Project), this index will automatically re-index what is required and get you the right result.However, that also means that whenever the Major document changes, we’ll have to re-index everything below it (because it might have changed the Project). There are two ways to handle that. Instead of using load(), we’ll use noTracking.load(), which tells RavenDB that when the referenced document changes, we should not re-index. Provide the relevant scopes at the document level, like this:{ "Name": "Login page wire frame", "Scope": [ "subs/0000000000000000009-A", "major/0000000000000000008-A", "projects/0000000000000000011-A" ], "EsimatedHours": 8, "CompletedHours": 3, "@metadata": { "@collection": "WorkPackages" } }Note that in this case, changing the root will be more complex because you have to scan / touch everything if you move between parts of the tree. In most cases, that is such a rare event that it shouldn’t be a consideration, but it depends largely on your context. And there you have it, a simple Map-Reduce index that can aggregate across an entire hierarchy with ease.

Jan 14, 2025 - 13:36

We got an interesting question in the RavenDB discussion group:How to do aggregation on a tree structure?

The task is to build a Work Breakdown Structure, where you have:

Projects

Major deliverables

Sub-deliverables

Work packages

The idea is to be able to track EstimatedHours and CompletedHours across the entire tree. For example, let’s say that I have the following:

Project: Bee Keeper Chronicle App

Major deliverable: App Design

Sub-deliverable: Wireframes all screens

Work Package: Login page wireframe

Users can add the EstimatedHours and CompletedHours at any level, and we want to be able to aggregate the data upward. So the Project: “Bee Keeper Chronicle App” should have a total estimated time and the number of hours that were worked on.

The question is how to model & track that in RavenDB. Here is what I think the document structure should look like:

{
    "Name": "Login page wire frame",
    "Parent": {
        "Type": "Subs",
        "Id": "subs/0000000000000000009-A"
    },
    "EsimatedHours": 8,
    "CompletedHours": 3,
    "@metadata": {
        "@collection": "WorkPackages"
    }
}


{
    "Name": "Wire frames all screens",
    "Parent": {
        "Type": "Majors",
        "Id": "major/0000000000000000008-A"
    },
    "EsimatedHours": 20,
    "CompletedHours": 7,
    "@metadata": {
        "@collection": "Subs"
    }
}


{
    "Name": "App Design",
    "Parent": {
        "Type": "Projects",
        "Id": "projects/0000000000000000011-A"
    },
    "EsimatedHours": 50,
    "CompletedHours": 12,
    "@metadata": {
        "@collection": "Majors"
    }
}


{
    "Name": "Bee Keeper Chronicle App",
    "EsimatedHours": 34,
    "CompletedHours": 21,
    "@metadata": {
        "@collection": "Projects"
    }
}

I used a Parent relationship, since that is the most flexible (it allows you to move each item to a completely different part of the tree easily). Now, we need to aggregate the values for all of those items using a Map-Reduce index.

Because of the similar structure, I created the following JS function:

function processWorkBreakdownHours(doc) {
    let hours = {
        EsimatedHours: doc.EsimatedHours,
        CompletedHours: doc.CompletedHours
    };
    let results = [Object.assign({
        Scope: id(doc)
    }, hours)];


    let current = doc;
    while (current.Parent) {
        current = load(current.Parent.Id, current.Parent.Type);
        results.push(Object.assign({
            Scope: id(current)
        }, hours));
    }
    return results;
}

This will scan over the hierarchy and add the number of estimated and completed hours to all the levels. Now we just need to add this file as Additional Sources to an index and call it for all the relevant collections, like this:

map("WorkPackages",processWorkBreakdownHours);
map("Subs",processWorkBreakdownHours);
map("Majors",processWorkBreakdownHours);
map("Projects",processWorkBreakdownHours);

And the last step is to aggregate across all of them in the reduce function:

groupBy(x => x.Scope).aggregate(g => {
    return {
        Scope: g.key,
        EsimatedHours: g.values.reduce((c, val) => val.EsimatedHours + c, 0),
        CompletedHours: g.values.reduce((c, val) => val.CompletedHours + c, 0)
    };
})

You can see the full index definition here.

The end result is automatic aggregation at all levels. Change one item, and it will propagate upward.

Considerations:

I’m using load() here, which creates a reference from the parent to the child. The idea is that if we move a Work Package from one Sub-deliverable to another (in the same or a different Major & Project), this index will automatically re-index what is required and get you the right result.

However, that also means that whenever the Major document changes, we’ll have to re-index everything below it (because it might have changed the Project). There are two ways to handle that.

Instead of using load(), we’ll use noTracking.load(), which tells RavenDB that when the referenced document changes, we should not re-index.
Provide the relevant scopes at the document level, like this:

{
    "Name": "Login page wire frame",
    "Scope": [
       "subs/0000000000000000009-A",
       "major/0000000000000000008-A",
       "projects/0000000000000000011-A"
    ],
    "EsimatedHours": 8,
    "CompletedHours": 3,
    "@metadata": {
        "@collection": "WorkPackages"
    }
}

Note that in this case, changing the root will be more complex because you have to scan / touch everything if you move between parts of the tree.

In most cases, that is such a rare event that it shouldn’t be a consideration, but it depends largely on your context.

And there you have it, a simple Map-Reduce index that can aggregate across an entire hierarchy with ease.

Windows 上 VSCode 的 C/C++ 延伸模組處理編碼的...

Vijana Wawekeze Katika Uhandisi wa Programu: ...

How to Resolve the 'Permission Denied' Error ...

Resolving Auto-Scroll issues for overflow con...

KDE vs GNOME vs Others: Choosing the Best Lin...

Jonathan Williams Joins Leya For Legal Ops Push

AOS’s Fuse Incubator Calls For Applications

Redefining Single-Channel Speech Enhancement:...

Revolutionizing AI with Multimodal Large Lang...

Beyond Passwords: A Multimodal Approach to Bi...

Nice PDF, But Can It Run DOOM? Yup!

Android 16 might give the volume UI a visuall...

The MacBook Air Turns 17 Today

Extremely loud, the JBL Xtreme 4 drops to a n...

China considers selling TikTok to Elon Musk

Aggregating trees with RavenDB

Considerations:

Tags:

Performance discovery: IOPS vs. IOPS

Performance discovery: Managed vs. Unmanaged memory

Understanding the Fundamentals of AI

How we think about Threads’ iOS performance

Самые необычные технологические новинки, представленные...

Popular Posts

Introducing vulne-soldier: A Modern AWS EC2 Vulner...

Nice PDF, But Can It Run DOOM? Yup!

Android 16 might give the volume UI a visually ref...

The MacBook Air Turns 17 Today

Extremely loud, the JBL Xtreme 4 drops to a new al...

11 Must-Know Websites Every Developer Should Bookmark

Spicychat Alternatives

Outcome-Refining Process Supervision: Advancing Co...

Aggregating trees with RavenDB

Considerations:

Tags:

Related Posts

Popular Posts