Accidental complexity: A tale of two GUIDs

Jan 15, 2025 - 14:48

For a new feature in RavenDB, I needed to associate each transaction with a source ID. The underlying idea is that I can aggregate transactions from multiple sources in a single location, but I need to be able to distinguish between transactions from A and B.

Luckily, I had the foresight to reserve space in the Transaction Header, so I had a whole 16 bytes available. Separately, each Voron database (Voron is the underlying storage engine that we use) has a unique Guid identifier. And a Guid is 16 bytes… so everything is pretty awesome.
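To make the "16 bytes fits a Guid" point concrete, here is a minimal sketch in Python rather than RavenDB's own code. The header layout (an 8-byte transaction number, a 4-byte flags field, and a 16-byte reserved slot) and the names `HEADER`, `pack_header`, and `unpack_header` are assumptions made up for illustration; Voron's real Transaction Header is not shown in this post.

```python
import struct
import uuid

# Hypothetical layout, NOT Voron's real format:
# little-endian 8-byte transaction number, 4-byte flags, 16 reserved bytes.
HEADER = struct.Struct("<QI16s")


def pack_header(tx_number: int, flags: int, source_id: uuid.UUID) -> bytes:
    # A UUID/Guid is always exactly 16 bytes, so it fills the reserved slot.
    return HEADER.pack(tx_number, flags, source_id.bytes)


def unpack_header(raw: bytes) -> tuple[int, int, uuid.UUID]:
    tx_number, flags, source = HEADER.unpack(raw)
    return tx_number, flags, uuid.UUID(bytes=source)
```

Reading a header back yields the same source ID that was written, which is all that is needed to tell transactions from source A apart from those of source B.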

There was just one issue. I needed to be able to read transactions as part of the recovery of the database, but we stored the database ID inside the database itself. I figured out that I could also put a copy of the database ID in the global file header and was able to move forward.

This is part of a much larger change, so I was going full steam ahead when I realized something pretty awful. The database Guid that I was relying on was already being used as the physical identifier of the storage, as part of the way RavenDB distributes data. The reason it matters is that under certain circumstances, we may need to change that ID.

If we change the database ID, we lose the association with the transactions for that database, leading to a whole big mess. I started sketching out a design for figuring out that the database ID has changed, re-writing all the transactions in storage, and… a colleague said: why don’t we use another ID?

It hit me like a ton of bricks. I was using the existing database Guid because it was already there, so it seemed natural to want to reuse it. But there was no benefit in doing that. Instead, it added a lot more complexity because I was adding (many) additional responsibilities to the value that it didn’t have before.

Creating a Guid is pretty easy, after all, and I was able to dedicate one, which I called the Journal ID, to this purpose. The existing Database ID is still there, and it is completely unrelated to the Journal ID. Changing the Database ID has no impact on the Journal ID, so the problem space is radically simplified.
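The separation can be sketched as follows. This is an illustrative Python model under stated assumptions, not RavenDB's implementation: the `Storage` class, its fields, and its methods are hypothetical names, and an in-memory list stands in for the journal.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Storage:
    # Two independent identifiers, each with a single responsibility.
    database_id: uuid.UUID = field(default_factory=uuid.uuid4)  # physical identity
    journal_id: uuid.UUID = field(default_factory=uuid.uuid4)   # transaction source
    transactions: list[tuple[uuid.UUID, bytes]] = field(default_factory=list)

    def write(self, payload: bytes) -> None:
        # Each transaction is stamped with the journal ID, never the database ID.
        self.transactions.append((self.journal_id, payload))

    def change_database_id(self) -> None:
        # Only the database ID changes; stored transactions are untouched.
        self.database_id = uuid.uuid4()

    def own_transactions(self) -> list[bytes]:
        return [p for (source, p) in self.transactions if source == self.journal_id]
```

In this model, calling `change_database_id()` requires no rewrite of stored transactions, because nothing in the journal refers to the database ID.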

I was able to throw away heaps of complexity because of a single comment. I used the Database ID because it was there, never considering having a dedicated value for this purpose. That single suggestion led to a better, simpler design and faster delivery.

It is funny how you can sometimes be so focused on the problem at hand, when a step back will give you a much wider view and a better path to the solution.
