TTMsql
Tiktok Interview Prepare
OLTP 和 OLAP
OLTP
Online Transaction Processing
Data processing system for managing daily business operations
Support a large number of concurrency of transactional applications (in the case of simultaneous multi-user operation)
OLAP
Online Analytical Processing
数据仓库中支持复杂的分析操作,使得用户可以通过各种视图(如汇总、分类、聚合)快速、有效地访问数据进行分析和决策。
Complex analytical operations
Various views (such as summary, classification, aggregation)
to access data and make decisions.
Eg:
Suppose we have a large online retail store that needs to process thousands of customer orders every day. This system needs to not only be able to quickly process these orders, but also manage inventory, process payments and update customer account information.
RDBMS
Relational Database Management System
Table
Primary Key
Foreign Key
Relations
SQL
Select
update
insert
delete
create table
Alter table
transaction management
ACID
atomicity
consistency
isolation
durability
Concurrency control
When multiple users access the database at the same time, the relational database manages concurrent operations through locking mechanisms and transaction isolation levels to prevent data conflicts and inconsistencies.
optimzer
The goal of the optimizer is to minimize query response time and resource usage.
Data integrity
entity integrity: primary key
reference integrity: foreign key
user defination integrity: check constrains
Middleware
the software layer beteween clients and servers
focus on simplify and manage data transmission, business logic, message services, etc.
Types Middleware
Communication middleware
消息队列(如RabbitMQ、Apache Kafka)
provide Asynchronous communication mechanism
Database middleware
Provision of database connectivity services
Connection Pool
Optimize database access and manage multi-user database access
Transaction middleware
Manage transactions across multiple databases, messaging services and other resources.
Make sure the ACID attribution of transaction
Object Request Broker (ORB)
in OOD, ORB support remote interaction among Object.
it allows the Object request in different internet
eg:
在分布式计算环境中,当一个客户端应用程序想要调用远程服务器上的对象方法时,ORB充当中介。客户端不需要知道服务具体在哪台服务器上,也不需要知道如何在网络上找到它和如何与之通信。客户端只需要知道它需要调用的方法和必要的参数。
Application Server
Provide an execution environment for enterprise-level applications
Web Middleware
handle http request, manage web content display and interaction
eg: web service, CMS system
Sharding
What is database sharding?
the data are divided into mutiple parts or pieces.
There are two kind of sharding:
horizontal sharding
Divide the rows in the table into multiple fragments,
and each fragment stores different rows of the table
but maintains the same table structure.
Eg:
we have a table including all the user in the world
We can do horizontal sharding based on user’s location
the user in China will be stored in the China database.
the user in USA will be stored in USA database/
vertical sharding
split the table by columns
with different segements storing different columns of the table.
Eg:
we have a user information table
we store the basic information of user and description information into two parts.
bastic information: name, address
description informationL performance, historical records
How does database shrding works?
- Chose sharding key
- Data allocation
- Query Routing
- Independent operation
Eg:
we are working in a large retail company.
the sales table record all the sales info of stores.
in order to optimization the speed and report generate speed, we want to sharde our database.
Range partitioning
Based on sale date to partition
The sales record for 2020 is stored in the sales_2020 partition
Sales records for 2021 are stored in the sales_2021 section
List partitioning
Based on sale region to partition
The sales records of the eastern region are stored in the sales_east section.
The sales records of the western region are stored in the sales_west section.
Hash partitioning
using hash function to evenly distribute the data to mutiple partition.
Hash the sales table according to the hash value of customer_id
When inserting a record, the system applies a hash function to customer_id.
According to the hash results, the record is assigned to a specific partition.
How to make sure the data is evenly distributed in database?
we can use hash cycle to solve it.
Consistent hash sharding method —> hash cycle
Eg:
In an e-commercial platform
we want to do sharding by order_id
The order number is increasing sequentially.
Steps:
- Create a hash cycle
assume the range is beteween 0 and 2^32 - 1.
- Node mapping
map each sharding node to the hash ring.
the sharding node is calculated its hash value by the hash function.
- Data allocation
for each user_id, if we use hash function to calcuate a hash value.
The data for user_id will be stored on the first node found clockwise on the hash ring.
It means that each node is responsible for all hash values between its position on the ring
and the postion of the next node.
example
we have thre sharding node: node1, node2, node3.
node1 in 150
node2 in 500
Node3 in 1000
when a new user coming,
we assume the user_id is 12345
we get the hash value is 650
it will store in the clockwise closest node on the hash ring —> node3
Sharding
Sharding is the process of dividing data horizontally into subsets,
each stored on a different database server or slice.
Replication
copying data from a database server to one or more servers
Cluster
A collection of database servers
work together and appear as a single database system externally.
Working Experience in Database Sharding
Situation:
previous work
participant in a database optimization project of large e-commerical platform.
Task:
the data searching speed becomes slow when facing dramatic increase.
Especially,
During peak sales, the database performance will become the bottleneck.
We want to increase our database handling ability and query response time.
My role:
Design and implement the database shiding strategy
ensure the smooth execution of the whole process
Sharding strategy: horizontial sharding based on users’ geographical location.
we use mysql database, it has its own sharding function.
we also need to do data migration.
-
set up mutiple copies for each slice
-
use master-slave replication to ensure data consistency and avaliability.
-
implement a monitoring system to monitor the performance and health status of each slice.
1 | Situation |
1 | Task |
1 | Action |
1 | Result |
Transaction mechanism
MVCC
Query optimization
Master-slave replication
Backup and recovery
Storage engine
HA(High Availability)
Raft Project
隶属于 MIT 6824 Distributed System
what is Distribution System? —> How to distribute data into diff meachines/services.
当数据量 和 计算量 都非常大的时候, 我们需要将计算机 Scale.
vertical and horizontial
vertical: improve the performance of single PC
CPU cache
Instruction prediction
the speed of chip execution instructions
improve RAM capacity
improve hard disk read and write speed
what is RAM capacity?
Random Access Memory
store data for CPU in operation system.
DRAM: maintained by periodically refreshing the charge(电荷).
SRAM: quicker than DRAM, not need refreshing the charge.
How to improve the hard disk read and write speed?
upgrade to SSD(固态硬盘) solid state disk (compare to HDD)
using RAID configuration
RAID0 和 RAID1
RAID0, 图书馆存储一本书,书的每一页都放在不同的书架上。当需要借阅的时候,图书管理员会同时从多个书架上获取所有page。 可以加速查找和组装书。因为可以有多个图书管理员。
RAID1, 图书馆的每本书都放在 不同的位置。 但是都有一个copy放在不同位置。当书本丢失可以复制。
Optimize disk defragmentation
horizontal: Spreading the computation over many computers ----> Dstribution System
tips
vertical optimization
The CMU textbook CSAPP provides a number of examples.
horizontial optimization
use NGINX to do loadbalance
based on Map Reduce to do distribute calculation
Redis Cluster
ZooKeeper
Etcd: KV store
what is etcd?
Key-value distributed store.
used in Raft, for the log replication.
the information table in a large shopping mail.
Whenever a new store opens or the old store closes, the map and list of the information desk will be updated.
Pre-Knowledge
what is distribute System?
what is redis cluster?
using sharding and replication to distribute data into diff nodes. —> improve performance and data redundancy.
what is zookeeper?
waht is etcd
Real Question
Raft and Pixo protocols —> Consensus Agreement
They are all algorithms used to solve problems in distributed systems.
确保 多个分布式节点 能够 就 某个值 一致
Consensus Agreement 共识协议
Raft ----> Russia 集权
Mast and slave structure
Leader selection
Log copying: once the leader has been selected, all the modifcation will be send to master firstly. and leader will add these changes into their logs. and then send repelica to others.
Consistency confirmation: when most of slaves finish logs copy from the master. the leader notifies all servers to submit the entities.
Paxos —> America 分权
perfect in theory but complicated to implement.
there gonna be four basic components:
propose
Each node can propose an porposal
promise
After receiving proposal, the node will check the serial number of proposal
if the serial number is the largest number they have even seen
they will send promise to the proposed node
accept
After receving the promise of most mode, the proposed node would send accept to other nodes.
Learn
Once the propose is accept by most modes, others node will update its own state
and when node updating its own state, we call it learn.
Using Two Example to Exlpain Raft Protocol and Paxos Protocol
Eg:
Raft Protocol
Online Voting System
we have an online voting system
users can vote on different issues
this system consists mutiple servers to ensure reliability and availability.
Leader Selection:
when the system starts, on of the five servers trigger the leader election through a timeout(超时器).
the server would become a candiate.
sending requests votes to other servers.
Handling Voting:
Every vote submmited by users would be send to leader firstly.
the leader will record the vote as a log and send the log to other servers.
Log Replication and Submition:
when most servers have confired it has record the logs, the leader will record this log submmited status and apply it in System.
the leader will notifies other servers this commit and
Other servers will update their status.
there are also other detial conception in Raft.
Heartbeat Mechanism:
int Raft, the leader will send heartbeat signals to all followers.
the heartbeat usually be empty additional log entries.
Election Trigger:
if the follower does not recevie the leader’s heartbeat for a certain time(election time out)
they would turn their status into a candidate and start a new roud of election.
Term Number:
the candidate will increase their term number,
which is used to prevent the old leader from regaining power.
request Voting:
the vote that candidate send to other services will include current term number and candidate’s information(latest date of updating logs and term number)
Voting Strategy:
each server can only vote once in a term.
they will only vote for candidate whose log is at least as new as themself.
New:
the update and integrity of the log.
Election Result:
once canciadte recevie most vote from other service(including itself)
it will become the new leader
New leader will immedicately send heartbeat mechanism to others and prevent others from initiating elections.
Paxos Protocol
Eg:
Distributed Bookkeeping System 分布式记账系统
we have five node: A, B, C, D, E.
the new record: “Transfer 100 from account1 to account2”
Prepare Phase:
Assume it is node A to start the new proposal
it generate a proposal number n
n is greater than the previous proposal number
Promise Phase:
after other nodes recevied n, they will compare it with the largest known proposal number.
if n is the largest number, node send commit message to proposal node A.
Accept Phase:
if proposal node A receive most promise from the nodes group, it will make the final decision based on the content of proposal.
and A send acceptance request to other nodes and finalized the content.
Learn Phase:
accept the request and update in the account books.
The Different beteween Raft and Paxos
Raft
simplified leader election
one leader and many slaves
heatbeat
failure recovery: reelection
Paxos
Muti-stage consensus process
decentralization
example to undersand the different beteween Raft and Paxos
1 | Raft 协议的类比:选举班长方式 |
1 | Raft 类似于有一个明确的领导者(班长),他或她负责做出决策并保持班级的信息同步。这使得决策过程更快,但依赖于领导者的能力和可用性。 |
What is the difference beteween key and index?
key is only one unique identifier
primary key
foreign key
index is used to find target quickly
example to understand the key and index?
we have a student table, including student id, student name, class and birthdate.
the key can be student id, make sure each student is identiftied unique.
the index can be based on class or birthdate.
if we often use class field, we can chose class to build index.
when we need to search all the students belongs to 10 class, it gonna be quicker.
How does index to achieve quick search and query?
using B tree and Hash Index
B tree is a self balance tree
we normal use B+ tree as the index in database.
举个例子解释一下
Books tables
Book_id
Title
Autohr
Genre(type类型)
Public_date(出版日期)
without index, if we want to search a book name:
we need a full table scan
when using index
1
2
3
4
5 [60]
/ \
[50] [70, 80]
/ \ / \
[30, 40, 45] [55, 58] [65, 68] [75, 90]we have a student table, the field including id, name, grade and score.
Internal node: the root node of a tree is 60.
it split the tree into two parts.
the left subtree is small than 60.
the right subree is equal or greater than or equal 60.
when searching a student who’s score is 68.
from root start, we compare 68 and 60 —> 68 is greater than 60 —> go right
in the next layer, we compare 68 with 70 and 80 —> 68 is greater than 70 and 68 is smaller than 80 —> choose the middle one
in the leaft node find 68.
we have an online library system
we want to use B+ tree to implement the index of our database.
We have a reservations table
revervation_id
customer_name
restaurant_name
reservation_date
Num_of_Guests
using index
for example, we want to search a restaurant which name is Olive Garden
without index in the restaurant_name
We need to iterate through the whole table
setting restaurant_name as the index
B+tree structure
we do not need to iterate through the who table, we can jump into the record including ‘Olive Garden’ through index
search all the record, whose date is 2023-07-15
start from root node of B+Tree
chosing suitable sub branch
untill find the 2023-07-15 leaf node
in the leaf node, it including all the record information related with the 2023-07-15
the leaf node are composed by two parts:
Key: the reservation date
Value: the pointer, which pointed to a complete row of data
summary
within th helper of B plus tree, the Mysql Database can quickly find the actually row data by using the pointer in the leaf node tree.
why we need primary key in mysql DataBase?
-
unique identifier
it is the unique idetifier in talbe
we can quickly find the record in our table.
-
data integrity
make sure there is not repeatable row in table
-
build the connection among relational database
in RD, primary key is used to establish a relationship with foreign key in other tables.
These relationships enable data to be effectively associated with different tables.
-
Query performance optimization
Most database management System would create index for primary key.
DataStructure of DataBase
LSM tree
Log-Structured Merge-tree
commonly used in Apache Cassandra, RocksDB, LevelDB.
focus on handling large amount of writing
Used in database and index system
Core ideas of LSMT
- optimization of write
Put all the insertion, update and delete operation records into a structure in memory(memTable)
- compaction
when memTable achieve to a certain size, it will be converted into a unchangeable disk.
Call SSTable(Sorted Strings Table)
SSTable will be merged and compressed. it called compaction.
memTable data structure: jump table or red-black tree
What is jump table
also called branch table
by using if else condition to hdanling the data
create a arry, which strore mutiple function
when the new request/transaction coming in
according to the state in the array to call corresponding function to handle data.
eg:
it looks like a hashtable
key is the status
value is the function’s address
Real example:
1
2
3
4
5
6 order_status_to_function = {
'placed': handle_placed,
'shipped': handle_shipped,
'delivered': handle_delivered,
'canceled': handle_canceled
}
1
2
3
4
5
6
7 def update_order_status(order_id, status):
# 调用对应状态的处理函数
if status in order_status_to_function:
handler = order_status_to_function[status]
handler(order_id)
else:
print("Unknown status")What is black-red tree
High speed response —> 可以确保 insert data 到 MemTable中,每次时间复杂度都为 log(n)
Black Red Tree
self balance tree
- each node either red or black
- root node is black
- all leaf node is red
- the children of red must be black
- the total num of black node beteween a random node and each leaf node is equal.
Black Red Tree
节点颜色:每个节点要么是红色,要么是黑色。
根节点是黑色:红黑树的根节点总是黑色的。
所有叶子节点都是黑色:这里的叶子节点是指树尾端的空(NIL)节点,它们都是黑色的。
红色节点的子节点必须是黑色:如果一个节点是红色的,那么它的两个子节点都必须是黑色的。这条规则防止了路径上出现连续的红色节点。
从任一节点到其每个叶子的所有路径都包含相同数目的黑色节点:这个属性保证了没有任何一条路径会比其他路径长出两倍以上,因此树大致是平衡的。
Eg:
1 | 40(B) |
- The path from the root node to the leaf node:
40 -> 20 -> 10 (black -> black -> red)
40 -> 20 -> 30 (black -> black -> red)
40 -> 60 -> 50 (black -> black -> red)
40 -> 60 -> 70 (black -> black -> red)
- The path from the node 20 to the leaf node:
- 20 -> 10: on this path, only node 20 is black.
- 20 -> 30: again, only node 20 is black.
B+ tree
Self-balance tree
focus on frequently used database.
Logs stroage
Time serious database
NoSQL database
Cassandra and RocksDB
Summary
frequently writen —> use LSM Tree(Logs structure Merge Tree)
frequently read —> use B+ Tree
LLM combined with DataBase
Mainly used for enhancing user interface data analysis, and system maintance
User Interface
users can use natural language to do complex searching without using SQL
Eg:
Which region grew the fastest in sales in the last quarter?
The model is responsible for analyzing this problem and converting it into an appropriate database query.
Data Consistency Monitor and Control
eg:
for a Ditributed Database System
we have 3 different Node:
Node1, Node2, Node3
we want these tree node remian synchronized.
we can design a LLM. it receive snapshots from three nodes regually.
Analyse the snapshots and compare data diffreences among them.
Step 1: Data collection
Each node regularly generates data snapshots containing a summary of transaction records and sends these data snapshots to the central monitoring system.
Step 2: Data analysis
In the central monitoring system, a well-trained large language model has been deployed. This model is specially trained to understand and analyze database status reports and log files.
- Model task: Compare data snapshots generated by different nodes and identify the problem of inconsistent data.
- Analysis method: The model will check the key fields, such as customer ID, transaction amount and timestamp, and compare the record differences between different nodes.
Step 3: Report Generation
Once the model detects data inconsistency, it will automatically generate a detailed report.
-
Report content
-
There are data inconsistencies between which nodes.
-
Analysis of specific examples and possible causes of inconsistent data.
-
Provide repair suggestions based on past experience of successfully solving similar problems.
Step 4: Response and Optimization
-
Response measures: The database administrator takes measures according to the report of the model, such as manually triggering data synchronization or adjusting the configuration of automatic synchronization.
-
Continuous optimization: The model learns and adjusts according to the results of each event to improve the accuracy and usefulness of future reports.
Sample report
Assume that the model finds inconsistencies between North American and Asian nodes in processing the transaction records of a high-frequency customer. The report may include:
-
Problem description: The North American node shows that the customer’s transaction is $1,000, while the Asian node shows $1,200.
-
Analysis: Preliminary analysis shows that time difference and concurrent transactions may lead to update delays.
-
Recommendation: Immediately check the last synchronization time of the two nodes, confirm the network connection status, and manually trigger a forced synchronization.
In this way, the large language model helps to automate and simplify the process of monitoring and maintaining the consistency of distributed databases, reduces the need for manual intervention, and accelerates the diagnosis and response of problems. This not only improves the overall reliability of the database system, but also optimizes the work efficiency of the operation and maintenance team.
C++技术栈