Thesis: Partial State in Dataflow-Based Materialized Views
This is my PhD dissertation presentation, which I gave at MIT (virtually) on October 22nd, 2020. It was immediately followed by my thesis defense, which I passed subject to the usual revisions the committee asked for.
The person who introduces me in the video is Robert Morris, my thesis advisor. The Q&A has been cut out, as I want to edit it a bit and mix in questions from the public presentation I gave on YouTube the day before. I will link the Q&A video here once it is out. The slides are available at https://jon.thesquareplanet.com/slides/thesis.pdf and at https://docs.google.com/presentation/d/1w2PlmqUIeue8VcNhBqOC0V3GTKQs9zBccrhd9tfAang/edit?usp=sharing. The thesis is available at https://jon.thesquareplanet.com/papers/phd-thesis.pdf.
What follows is the thesis abstract:
This thesis proposes a practical database system that lowers latency and increases supported load for read-heavy applications by using incrementally-maintained materialized views to cache query results. As opposed to state-of-the-art materialized view systems, the presented system builds the cache on demand, and evicts cache entries in response to a shifting workload.
The enabling technique the thesis introduces is partially stateful materialization, which allows entries in materialized views to be missing. The thesis proposes upqueries as a mechanism to fill such missing state on demand using dataflow, and implements them in the materialized view system Noria. The thesis then discusses additional mechanisms needed to establish eventual consistency for partially stateful dataflow.
Noria with partial materialization saves application developers from implementing their own ad hoc caching mechanisms to speed up their database accesses. Instead, the caching is built into the database, and is transparent to the application. Experimental results suggest that the presented system increases supported application load by up to 20x over MySQL and performs similarly to an optimized key-value store cache. Partial state also reduces memory use by up to 2/3 compared to traditional materialized views.
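To make the idea in the abstract concrete, here is a minimal sketch of a partially stateful view, written in Python for brevity. It is not Noria's API or implementation: the class `PartialCountView`, its methods, and the vote-counting example are all hypothetical, and the "upquery" here is a plain recomputation from an in-memory base table rather than a query through a dataflow graph. It also ignores concurrency, which is where the consistency mechanisms mentioned in the abstract come in.

```python
class PartialCountView:
    """Toy partially stateful view: number of votes per article.

    Hypothetical illustration only; names and structure are assumptions,
    not taken from Noria.
    """

    def __init__(self):
        self.base = {}      # base table: article_id -> list of voter ids
        self.view = {}      # materialized counts; an absent key is missing state
        self.upqueries = 0  # number of misses that had to be filled on demand

    def write(self, article_id, user_id):
        """Record a vote and incrementally maintain the view."""
        self.base.setdefault(article_id, []).append(user_id)
        if article_id in self.view:
            # The entry is materialized, so apply the delta in place.
            self.view[article_id] += 1
        # If the entry is missing, the update is dropped at the view;
        # a later upquery recomputes the entry from the base table.

    def read(self, article_id):
        """Return the count, filling missing state on demand."""
        if article_id not in self.view:
            # Stand-in for an upquery: recompute this one key from base data.
            self.upqueries += 1
            self.view[article_id] = len(self.base.get(article_id, []))
        return self.view[article_id]

    def evict(self, article_id):
        """Drop a cached entry, for example when the workload shifts."""
        self.view.pop(article_id, None)


if __name__ == "__main__":
    v = PartialCountView()
    v.write("a", 1)
    v.write("a", 2)
    print(v.read("a"), v.upqueries)  # first read misses and fills the entry
    v.write("a", 3)
    print(v.read("a"), v.upqueries)  # served from the view, no new upquery
```

In this sketch, writes to keys nobody has read cost nothing at the view, and evicting a key only means the next read pays for one more fill. That is the trade the abstract describes: memory is spent only on entries the workload actually asks for.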
Other Videos By Jon Gjengset
2021-06-13 | Lock-Free to Wait-Free Simulation in Rust (part 2)
2021-05-22 | Lock-Free to Wait-Free Simulation in Rust
2021-04-30 | Crust of Rust: Dispatch and Fat Pointers
2021-04-02 | Crust of Rust: Atomics and Memory Ordering
2021-03-13 | Crust of Rust: The Drop Check
2021-02-20 | Crust of Rust: Subtyping and Variance
2021-01-23 | Q&A January 2021 (now with cat)
2020-12-12 | The Unsafe Chronicles: Exhibit A: Aliasing Boxes
2020-11-21 | A Cool Generic Concurrency Primitive in Rust
2020-11-14 | Crust of Rust: Sorting Algorithms
2020-10-23 | Thesis: Partial State in Dataflow-Based Materialized Views
2020-08-19 | Q&A August #2 2020
2020-08-09 | Q&A August 2020
2020-08-05 | Crust of Rust: Channels
2020-07-25 | Thesis Talk: The Evaluation Chapter
2020-06-17 | Crust of Rust: Smart Pointers and Interior Mutability
2020-05-27 | Crust of Rust: Iterators
2020-04-29 | Crust of Rust: Declarative Macros
2020-04-01 | (Partially) fixing a bug in a Rust research database
2020-03-06 | Considering Rust
2020-01-19 | Porting Java's ConcurrentHashMap to Rust (part 3)