A good database specification helps to reduce the complexity of software implementation and reduce communication costs.
1. The iron law of building a warehouse
- | Iron law | Level | remark | character set | Use UTF-8. If the emoji is stored, use utf8mb4 for storage. | forced |
| | Sorting rules | Use utf8_general_ci | forced | |
2. The iron law of table construction
- | Iron law | Level | remark | exegesis | Be sure to have field annotations. | forced |
| | encode | Use UTF-8. If the emoji is stored, use utf8mb4 for storage. | forced |
| | whether the field is conceptual | It must be named with is_xx, and the data type is unsigned tinyint(1 yes, 0 no), e.g. is_deleted(1 delete, 0 not deleted). | forced | Any field must be unsigned if it is not negative | Table name, field name | Only lowercase letters, underscores or numbers can be used; It is forbidden to start with an underline or number; Only numbers are prohibited between two underscores; Disable reserved words; The use of plural nouns is prohibited in table names. | forced |
| | The naming of the database name and table name | The database name should be consistent with the application name, and the table name should be named with Business Name_Role of the table. | forced |
| | Index naming | The primary key index uses pk_ field name; Unique index with uk_ field name; Normal indexes use idx_ field names. | forced | pk_ is the primary key; uk_ is unique key; idx_ is index | Decimal type | The data type is decimal, and the use of float and double is prohibited, float and double have precision loss, and if the stored data range exceeds the range of decimal, it is recommended to split the data into integers and decimals and store them separately. | forced |
| | varchar type | varchar is a variable long string, no storage space is allocated in advance, the length should not exceed 5000 characters, if the length is greater than 5000, apply text (create a separate table, use the primary key to correspond, to avoid affecting the indexing efficiency of other fields). | forced |
| | There must be three fields in the table name | id (data type is unsigned bigint, single table increment, step size is 1), gmt_create, gmt_modified (active creation time, passive update time, data type is datetime). | forced |
| | Field redundancy | Fields allow appropriate redundancy, but data consistency must be considered, and redundant fields should have 1) infrequent modifications; 2) Not a varchar super long field, let alone a text field. | recommend |
| | Divide the database and tables | Partitioning is recommended only when the number of rows in a single table exceeds 5 million rows or the capacity of a single table exceeds 2 GB. | recommend | |
Setting the appropriate character storage length not only saves database table space and index storage, but more importantly, improves retrieval speed.
3. Establish an index iron law
- | Iron law | Level | remark | Unique index | Fields with unique characteristics in the business, even if they are a combination of fields, must be uniquely indexed. Although the unique index affects the insert speed, this loss is negligible, but it significantly improves query speed. In addition, even if the application layer has very complete check control, as long as there is no unique index, according to Murphy's Law, dirty data will inevitably be generated. | forced |
| | join | More than three tables prohibit joining, fields that require join, and the data types must be consistent; When multiple tables are associated with queries, ensure that the associated fields need to have an index. Even if you have a double table join, pay attention to table indexing, SQL performance. | forced |
| | varcharfield | The index length must be specified, and there is no need to index all fields, just determine the index length according to the actual text distinction. Index length and distinction are a pair of contradictions, generally for string type data, indexes with a length of 20 will have a distinction degree of more than 90%, which can be determined by the distinction degree of count(distinct left(column name, index length))/count(*). | forced |
| | Blurring is prohibited in page search | Page search prohibits blurring or full blurring, if necessary, please go to the search engine to solve it. Prohibition Reason: The index file has the leftmost prefix matching property of the B-Tree, and if the value on the left is not determined, then this index cannot be used. | forced |
| | order by | If there is an order by scenario, pay attention to the orderliness of the index. The last field of order by is part of the combined index and is placed at the end of the index combination order to avoid file_sort and affect query performance. Example: where a=? and b=? order by c; The index should be built as a_b_c; Counterexample: If there is a range lookup in the index, then the index orderliness cannot be utilized, such as where a>10 order by b; Index a_b cannot be sorted. | recommend | |
4. Write SQL iron rules
- | Iron law | Level | remark | count(*) | Don't use count(column name) or count(constant) instead of count(*), which is the syntax for the standard count of rows defined by SQL92, independent of the database, and independent of NULL and non-NULL. count(*) counts rows with a NULL value, while count(column name) does not count rows with this column NULL. | forced |
| | count(distinct col) | Counts the number of unique rows in the column except for NULL. Note that count(distinct col1, col2), if one of the columns is all NULL, then it returns 0 even if the other column has a different value. | forced |
| | sum(col) | When the values of a column are all NULL, count(col) returns 0, but sum(col) returns NULL, so you need to be aware of NPE issues when using sum(). NPE problems can be avoided in the following ways: select if(isnull(sum(g)), 0, sum(g)) from table; | forced |
| | isnull | Use isnull() to determine if it is a NULL value. NULL is NULL compared to any value. | forced |
| | Pagination query logic | If the count is 0, it should be returned directly to avoid executing the subsequent pagination statement. | forced |
| | Outer keys and cascades | The use of foreign keys and cascading is prohibited, and all foreign key concepts must be solved at the application layer. Reason: Foreign keys and cascades are not suitable for distributed, high-concurrency clusters, cascading updates are strong blocking, there is a risk of database update storms, and foreign keys affect the insertion speed of the database. | forced |
| | Stored procedures | Stored procedures are prohibited, and stored procedures are difficult to debug and scale, and are not portable. | forced |
| | Data correction | When correcting data (especially deleting or modifying records), select first to avoid accidental deletion, and only execute the update statement after confirming that it is correct. | forced |
| | in | If it can't be avoided, the number of set elements after in should be controlled within 1000. | recommend |
| | truncate table | It is forbidden to use truncate table, which is faster than delete and uses fewer system and log resources, but truncate is transaction-free and does not trigger triggers, which may cause accidents, so do not use this statement in development code. | reference |
|
5. ORM maps iron laws
| - | Iron law | Level | remark | table query | The list of fields that are prohibited from using * for queries must be clear which fields are required. | forced |
| | POJO | The Boolean attribute of the POJO class cannot be added to is, while the database field must be added to is, requiring mapping between fields and attributes in the resultMap. | forced |
| | Return parameters | It is forbidden to use resultClass as a return parameter, even if all class attribute names correspond to database fields one by one, they need to be defined; In turn, each table must have an attribute corresponding to it. Reason: Configure the mapping relationship to couple the field with the DO class for easy maintenance. | forced |
| | Return parameters | It is forbidden to directly use HashMap and HashTable as the output of the query result set. Reason: The type of attribute value is uncontrollable. | forced |
| | sql.xml Configure parameters | sql.xml Use #{}, #param# for configuration parameters, and do not use ${}, as ${} is prone to SQL injection. | forced |
| | queryForList | The use of queryForList(String statementName, int start, int size) that comes with Mybatis is prohibited. Reason: It is implemented by retrieving all records of the SQL statement corresponding to statementName in the database, and then using the subList to get a subset of start, size. | forced |
| | Update time | When updating a database table record, you must update the modification time of the record at the same time. | forced |
| | Update database table records | Don't write a large and full data update interface (passed in as a POJO class). When executing SQL, do not update unchanged fields due to error-prone, inefficient, and increased binlog storage. | recommend |
| | @Transactional | @Transactional Don't abuse transactions. Transactions affect the QPS of the database. In addition, where you use transactions, you need to consider various aspects of rollback schemes, including cache rollback, search engine rollback, message compensation, statistical correction, etc. | reference |
| | Mybatis dynamic sql tags | < compareValue in isEqual> is a constant compared to the attribute value, usually a number, indicating that the corresponding SQL statement is executed when equal; < isNotEmpty> indicates that it is executed when it is not empty and not null; < isNotNull> indicates that it is executed when it is not null. | reference | |
|