Twitter's GCP Architecture for Its Petabyte-Scale Data Storage in GCS (Cloud Next '19)

Video Link: https://www.youtube.com/watch?v=rBNFwdVDlyo



Duration: 38:57
11,136 views


Twitter collects petabytes of data every day and supports large-scale data processing for its engineers and data scientists with a hybrid on-premises and cloud model. In this talk, we will look at Twitter's GCP architecture and resource hierarchy. We will dive deep into the storage design, which uses Google Cloud Storage to organize petabytes of data replicated from on-premises HDFS clusters. We will look at how the user-management tooling was designed to create and manage access for thousands of accounts (human and service accounts) at Twitter. We will discuss how the design handles security for accounts and tooling systems running in GCP, as well as the complexities of dataset permissions. Finally, we will share the challenges we faced in designing the system at scale, along with what we learned and the solutions we adopted.

Data Storage and User Identity → http://bit.ly/2TXooT0

Watch more:
Next '19 Security Sessions here → https://bit.ly/Next19Security
Next '19 All Sessions playlist → https://bit.ly/Next19AllSessions

Subscribe to the Google Cloud Channel → https://bit.ly/GoogleCloud1


Speaker(s): Vrushali Channapattan, James Duke


Session ID: SEC302
Product: Cloud Identity and Access Management (Cloud IAM)




Other Videos By Google Cloud


2019-05-09 American Eagle: Building a multi-terabyte marketing data warehouse
2019-05-09 How the American Cancer Society uses Cloud ML Engine to save lives
2019-05-09 Why Keller Williams migrated to Google Cloud
2019-05-07 Metro saved costs and improved operational efficiency by running SAP on Google Cloud
2019-05-07 McKesson keeps patients healthier with digital transformation of running SAP on Google Cloud
2019-05-03 JPMorgan Chase - Google Cloud Next '19
2019-05-02 Focus on Your Customers. Salesforce and Google Can Make it Seamless (Cloud Next '19)
2019-05-02 Tera migrates its business systems to Google Cloud & achieves a 6-fold increase in customer service
2019-04-25 Women of Cloud: How to Grow our Clout 2.0 (Cloud Next '19)
2019-04-23 How Chrome Enterprise can empower your frontline employees
2019-04-17 Twitter's GCP Architecture for Its Petabyte-Scale Data Storage in GCS (Cloud Next '19)
2019-04-12 Customer Stories: AI in Financial Services (Cloud Next '19)
2019-04-12 Fireside Chat with Betty Reid Soskin (Cloud Next '19)
2019-04-12 How Google's AutoML Vision helps AES fight climate change
2019-04-12 Python 2 to 3: Migration Patterns & Motivators (Cloud Next '19)
2019-04-12 Preventing Data Exfiltration on GCP (Cloud Next '19)
2019-04-11 Infrastructure as a Code with Deployment Manager (Cloud Next '19)
2019-04-11 Google Cloud AI and Robotic Process Automation – Next '19
2019-04-11 Comprehensive Protection of PII in GCP (Cloud Next '19)
2019-04-11 Making Chrome Your Primary Browser (Cloud Next '19)
2019-04-11 Reinventing Retail with AI (Cloud Next '19)



Tags:
Next 2018
data analysts
data scientists
SQL
structured query language
Booking.com
data quality
cloud data
cloud data warehouse
terabytes
petabytes
Google Cloud Next
Next
2019
Next19
Google Cloud Platform
talks
session
keynote
Twitter
Twitter and GCP