mysql数据库优化 | 冰霜时节

1.先说说数据库三范式

1.1 第一范式(确保每列保持原子性)

第一范式是最基本的范式。如果数据库表中的所有字段值都是不可分解的原子值，就说明该数据库表满足了第一范式。

1.2 第二范式(确保表中的每列都和主键相关)

第二范式在第一范式的基础之上更进一层。第二范式需要确保数据库表中的每一列都和主键相关，而不能只与主键的某一部分相关（主要针对联合主键而言）。也就是说在一个数据库表中，一个表中只能保存一种数据，不可以把多种数据保存在同一张数据库表中。

1.3 第三范式(确保每列都和主键列直接相关,而不是间接相关)

第三范式需要确保数据表中的每一列数据都和主键直接相关，而不能间接相关。

2.准备测试数据库

2.1 下载sakila数据库

2.2 开启慢查询日志

#查询是否开启慢查日志
show variables like 'slow_query_log';

set global slow_query_log=on;

#参数设置
show variables like 'log%';

#记录没有索引的
set global log_queries_not_using_indexes=on;

#设置要记录的查询时间
set long_query_time=0;

#查询慢查日志的位置
show variables like 'slow%';

3.慢查日志分析工具

3.1 mysqldumpslow

查前三条

mysqldumpslow -t 3 /www/wdlinux/mysql-5.1.69/var/oracledb-slow.log | more

3.2 pt-query-digest

percona-toolkit的rpm安装方式
- 注意：需要安装Term::ReadKey 包，否则会报perl(Term::ReadKey) >= 2.10 is needed by percona-toolkit-2.1.1-1.noarch错误 **
  1
  2
  rpm -ivh perl-TermReadKey-2.30-1.el5.rf.x86_64.rpm
  rpm -ivh percona-toolkit-2.1.1-1.noarch.rpm
percona-toolkit的编译安装方式

下载安装包 **

tar xzvf percona-toolkit-2.1.1.tar.gz
cd percona-toolkit-2.1.1
perl Makefile.PL
make && make install

查询日志

pt-query-digest /www/wdlinux/mysql-5.1.69/var/oracledb-slow.log | more

4.SQL优化

4.1索引优化

#查询最大值：max(),建该字段索引
explain select max(payment_date) from payment \G

#建索引
create index idx_paydate on payment(payment_date);

#在一条sql中查出2006和2007年的电影数量【count()函数优化】
#Count(*)和count(id)的区别:count(*)具体的行数，包括空值！
select count(release_year = '2006' or null )as '2006年',count(release_year = '2007' or null) as '2007年' from film;

#索引优化：在where , order by , group by 等从句中出现的列，索引字段越小越好，离散度大的列放到联合索引前面。
    #1.判断离散度：查找该字段的唯一值，越大离散度越好！
    select count(distinct customer_id),count(distinct staff_id) from payment;
    #建立联合索引
    index(customer_id,staff_id)
    #去掉重复及冗余索引,第三方分析工具：Pt-duplicate-key-checker
    pt-duplicate-key-checker -uroot -p'6n7d50' -h 127.0.0.1
    #索引使用情况分析：通过慢查日志配合pt-index-usage工具来进行索引使用情况分析
    pt-index-usage  -uroot -p'6n7d50' /www/wdlinux/mysql-5.1.69/var/oracledb-slow.log

4.2 子查询优化

#在t1表再插入一条数据已存在数据，那么t对t1表存在一对多的关系，子查询不会产生重复数据，但关联查询会产生重复数据，用distinct考虑去重！
#数据量小的时候适合用子查询，数据量大的时候适合连表表查询！
select * from t where t.id in (select t1.id from t1);
select t.id from t join t1 on t.id=t1.tid;

4.3 group by优化

#原SQL
explain select actor.first_name,actor.last_name,count(*) from film_actor inner join actor using(actor_id) group by film_actor.actor_id \G

#优化后
explain select actor.first_name,actor.last_name,c.cnt from actor inner join (select actor_id,count(*) as cnt from film_actor group by actor_id) as c using(actor_id) \G

4.4 limit优化

#原SQL
explain select film_id ,description from sakila.film order by title limit 50,5 \G

#优化：避免数据量大时，过多的扫描！
    #1.用有索引的字段或主键进行order by 进行操作
    explain select film_id ,description from film order by film_id limit 50,5 \G
    #2.主键如果顺序排序，连续的！可以记录上次返回的主键，在下次查询时使用主键过滤！
    explain select film_id,description from film where film_id >55 and film_id < 60 order by film_id limit 1,5;

5.数据库优化

5.1数据库结构优化

选择合适的数据类型:int 类型比varchar更优，字段尽量设置为not null，并给默认值。

使用int来存储日期时间，利用from_unixtime(),unix_timestamp()两个函数来进行转换。

1 2	Create table test(id int auto_increment not null ,timestr int ,primary key (id) ); Insert into test(timestr) values (unix_timestamp(‘2015-04-24 15:36:12’));

用bigint存IP地址：利用inet_aton(),inet_ntoa()两个函数来进行转换

1
2
3

Create table sessions( id int auto_increment not null, ip bigint, primary key (id));
Insert into sessions (ip) values (inet_aton(‘182.92.66.106’));
Select inet_ntoa(ip) from sessions;

5.2表的垂直拆分

把一个有很多列的表拆分成多个表，解决表的宽度问题。
1.把不常用的字段单独存放到一个表。
2.把大字段独立存放到一个表。
3.把经常使用的字段放到一起。

5.3表的水平拆分

解决单表数据量过大问题，水平拆分的表结构一样。
1.对表的某个字段（如：id）进行hash运算，如果要拆分成5个表则使用mod(id,5)取出0-4个值！
2.针对不同的hashID把数据存到不同的表中。