Developers Keep Learning

Welcome to Keep Learning, an inspiring online publication created by developers, for developers.

Twitter snowflake approach is cool

Atakan Demircioğlu
Developers Keep Learning
2 min read · Dec 18, 2022

I was researching a solution to generate unique IDs and I liked the Twitter snowflake approach. These are my notes about this approach.

What is Twitter’s snowflake approach?

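It is a way to generate unique IDs in distributed systems without a central ID server. Twitter uses this approach for Tweets, DMs, Lists, and so on.

  • IDs are unique and sortable.
  • IDs embed a timestamp, so sorting by ID roughly orders them by creation date.
  • IDs fit in 64 bits. Because the top bit is always 0, they also fit in a signed 64-bit integer.
  • IDs contain only numerical values.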

The 64 bits are split into four fields:

Sign bit (1 bit): Reserved bit (it is always 0). Keeping it at 0 means the ID stays positive when stored as a signed 64-bit integer, and the bit can be reserved for future use.

Timestamp (41 bits): Milliseconds elapsed since a custom epoch (Snowflake’s default epoch is Nov 04, 2010, 01:42:54 UTC).

Machine ID (10 bits): Accommodates up to 1024 (2¹⁰) machines.

Sequence number (12 bits): A local counter on each machine that increments by 1 for every ID generated. It resets to 0 every millisecond. Theoretically, a machine can generate a maximum of 4096 (2¹²) new IDs per millisecond.
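
To make the layout concrete, here is a minimal Python sketch of a generator that packs the three fields into one integer, plus a helper that splits an ID back apart. It is not Twitter's actual implementation: the `SnowflakeGenerator` class, the `decode` function, the injectable `clock` parameter, and the choice to raise an error when the clock moves backwards are all assumptions made for illustration. The epoch constant is the default epoch mentioned above, expressed in milliseconds.

```python
import threading
import time

# Bit widths taken from the layout described above.
TIMESTAMP_BITS = 41
MACHINE_ID_BITS = 10
SEQUENCE_BITS = 12

# Snowflake's default epoch (Nov 04, 2010, 01:42:54 UTC) in Unix milliseconds.
TWITTER_EPOCH_MS = 1288834974657

MAX_MACHINE_ID = (1 << MACHINE_ID_BITS) - 1  # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1      # 4095


class SnowflakeGenerator:
    """Hypothetical, simplified generator for illustration only."""

    def __init__(self, machine_id, epoch_ms=TWITTER_EPOCH_MS, clock=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        # clock() returns Unix time in milliseconds; injectable so it can be faked.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Assumption: refuse to issue IDs if the clock moved backwards.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                # New millisecond: the counter starts again at 0.
                self._sequence = 0
            self._last_ms = now
            elapsed = now - self.epoch_ms
            if elapsed < 0 or elapsed >> TIMESTAMP_BITS:
                raise OverflowError("timestamp does not fit in 41 bits")
            return (
                (elapsed << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


def decode(snowflake, epoch_ms=TWITTER_EPOCH_MS):
    """Split an ID back into (unix_ms, machine_id, sequence)."""
    sequence = snowflake & MAX_SEQUENCE
    machine_id = (snowflake >> SEQUENCE_BITS) & MAX_MACHINE_ID
    unix_ms = (snowflake >> (MACHINE_ID_BITS + SEQUENCE_BITS)) + epoch_ms
    return unix_ms, machine_id, sequence
```

Because the timestamp occupies the highest bits, comparing two IDs as plain integers compares their creation times first, then machine ID, then sequence number. That is where the sortable property comes from.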

Advantages & Disadvantages of the Twitter Snowflake Approach

  • IDs are 64 bits long, half the size of a 128-bit UUID.
  • Scalable (it can accommodate 1024 machines).
  • Highly available (each machine generates IDs on its own, up to 4096 unique IDs per millisecond).
  • Some UUID versions do not include a timestamp. Compared with those, Twitter Snowflake IDs have the advantage of being sortable.
  • The design requires ZooKeeper (disadvantage).
  • The generated IDs are not random like UUIDs. Future IDs can be predicted.
  • A 41-bit millisecond timestamp covers only about 69 years from the chosen epoch. A solution is needed after that :)
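
The capacity figures in this list follow directly from the bit widths. A quick sketch of the arithmetic, assuming an average year of 365.25 days (the variable names are only illustrative):

```python
TIMESTAMP_BITS = 41
MACHINE_ID_BITS = 10
SEQUENCE_BITS = 12

# Average year length in milliseconds (an approximation using 365.25 days).
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

timestamp_range_years = (2 ** TIMESTAMP_BITS) / MS_PER_YEAR
machines = 2 ** MACHINE_ID_BITS
ids_per_ms_per_machine = 2 ** SEQUENCE_BITS
# Theoretical ceiling if every machine issued IDs at full rate.
ids_per_second_total = machines * ids_per_ms_per_machine * 1000

print(f"timestamp range: {timestamp_range_years:.1f} years")
print(f"machines: {machines}")
print(f"IDs per millisecond per machine: {ids_per_ms_per_machine}")
print(f"theoretical IDs per second across all machines: {ids_per_second_total:,}")
```

With the default epoch in late 2010, a range of roughly 69 years means the timestamp field would run out around 2080.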

Usage Notes

  • Discord uses snowflakes, with their epoch set to the first second of the year 2015.
  • Instagram uses a modified version of the format, with 41 bits for a timestamp, 13 bits for a shard ID, and 10 bits for a sequence number.
  • Mastodon’s modified format has 48 bits for a millisecond-level timestamp measured from the UNIX epoch. The remaining 16 bits are for sequence data.
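
Since these variants differ only in the epoch and in how many bits sit below the timestamp, a single helper can recover the creation time from any of them. The sketch below is illustrative: `snowflake_time` is a hypothetical function, the example IDs are constructed by hand rather than taken from real services, and the 22-bit shift for Discord assumes it keeps the same number of low bits as Twitter's layout. Instagram's epoch is not given here, so it is left out.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Discord's epoch: the first second of 2015, in Unix milliseconds.
DISCORD_EPOCH_MS = int(datetime(2015, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)


def snowflake_time(snowflake, timestamp_shift, epoch_ms):
    """Return the creation time embedded in an ID as an aware UTC datetime.

    timestamp_shift is the number of bits stored below the timestamp field.
    """
    unix_ms = (snowflake >> timestamp_shift) + epoch_ms
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


# Hand-built Discord-style ID: 5000 ms after the 2015 epoch, low 22 bits arbitrary.
# Assumption: 22 bits below the timestamp, as in Twitter's 10 + 12 layout.
discord_like_id = (5000 << 22) | 123
print(snowflake_time(discord_like_id, 22, DISCORD_EPOCH_MS))

# Hand-built Mastodon-style ID: 48-bit Unix-epoch timestamp above 16 sequence bits.
mastodon_like_id = (1_700_000_000_000 << 16) | 42
print(snowflake_time(mastodon_like_id, 16, 0))
```

The same call would work for Instagram's layout with a shift of 23 bits (13 shard bits plus 10 sequence bits), given its epoch.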
