60 million entries, selecting entries from a specific month: how to optimize the database?

Views: 86  Date: 2024-03-14


Take advantage of InnoDB clustered primary key indexes.

http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html

This will be very performant:

    create table datasources
    (
      year_id smallint unsigned not null,
      month_id tinyint unsigned not null,
      datasource_id tinyint unsigned not null,
      id int unsigned not null, -- needed for uniqueness
      data int unsigned not null default 0,
      primary key (year_id, month_id, datasource_id, id)
    )
    engine=innodb;

    select * from datasources where year_id = 2011 and month_id between 1 and 3;

    select * from datasources where year_id = 2011 and month_id = 4 and datasource_id = 100;

    -- etc..
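To connect this schema to the table described in the question, the sketch below shows one way the existing rows might be copied across, deriving year_id and month_id from the timestamp with MySQL's YEAR() and MONTH() functions. The source column names (ID, DataSourceID, Data, time) are assumptions based on the question's description, and the data column is assumed to fit an unsigned int; adjust both to the real table.

```sql
-- Sketch only. Assumed source table and column names (not confirmed by the question):
--   Entries(ID, DataSourceID, Data, time)
insert into datasources (year_id, month_id, datasource_id, id, data)
select year(e.time), month(e.time), e.DataSourceID, e.ID, e.Data
from Entries e;

-- The question's "April 2010" date-range query then becomes a lookup
-- on the leading columns of the clustered primary key:
select * from datasources where year_id = 2010 and month_id = 4;
```

For a table of this size, a single insert ... select is likely to run for a long time, so copying one month at a time may be more practical.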

I forgot that I was running the first test script against 3 months of data. Here are the results for a single month: 0.34 and 0.69 seconds.

    select d.* from datasources d
    where d.year_id = 2010 and d.month_id = 3 and datasource_id = 100
    order by d.id desc limit 10;

    +---------+----------+---------------+---------+-------+
    | year_id | month_id | datasource_id | id      | data  |
    +---------+----------+---------------+---------+-------+
    |    2010 |        3 |           100 | 3290330 | 38434 |
    |    2010 |        3 |           100 | 3290329 |  9988 |
    |    2010 |        3 |           100 | 3290328 | 25680 |
    |    2010 |        3 |           100 | 3290327 | 17627 |
    |    2010 |        3 |           100 | 3290326 | 64508 |
    |    2010 |        3 |           100 | 3290325 | 14257 |
    |    2010 |        3 |           100 | 3290324 | 45950 |
    |    2010 |        3 |           100 | 3290323 | 49986 |
    |    2010 |        3 |           100 | 3290322 |  2459 |
    |    2010 |        3 |           100 | 3290321 | 52971 |
    +---------+----------+---------------+---------+-------+
    10 rows in set (0.34 sec)

    select d.* from datasources d
    where d.year_id = 2010 and d.month_id = 3
    order by d.id desc limit 10;

    +---------+----------+---------------+---------+-------+
    | year_id | month_id | datasource_id | id      | data  |
    +---------+----------+---------------+---------+-------+
    |    2010 |        3 |           116 | 3450346 | 42455 |
    |    2010 |        3 |           116 | 3450345 | 64039 |
    |    2010 |        3 |           116 | 3450344 | 27046 |
    |    2010 |        3 |           116 | 3450343 | 23730 |
    |    2010 |        3 |           116 | 3450342 | 52380 |
    |    2010 |        3 |           116 | 3450341 | 35700 |
    |    2010 |        3 |           116 | 3450340 | 20195 |
    |    2010 |        3 |           116 | 3450339 | 21758 |
    |    2010 |        3 |           116 | 3450338 | 51378 |
    |    2010 |        3 |           116 | 3450337 | 34687 |
    +---------+----------+---------------+---------+-------+
    10 rows in set (0.69 sec)

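I decided to test with approx. 60 million rows spread over 3 years. Each query is run cold, i.e. each one is run separately, after which MySQL is restarted, clearing all buffers, and with no query caching.

The full test script can be found here: http://pastie.org/1723506 or below…

As you can see, even on my humble desktop, it is a very performant schema :)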

    select count(*) from datasources;
    +----------+
    | count(*) |
    +----------+
    | 60306030 |
    +----------+

    select count(*) from datasources where year_id = 2010;
    +----------+
    | count(*) |
    +----------+
    | 16691669 |
    +----------+

    select
     year_id, month_id, count(*) as counter
    from
     datasources
    where
     year_id = 2010
    group by
     year_id, month_id;
    +---------+----------+---------+
    | year_id | month_id | counter |
    +---------+----------+---------+
    |    2010 |        1 | 1080108 |
    |    2010 |        2 | 1210121 |
    |    2010 |        3 | 1160116 |
    |    2010 |        4 | 1300130 |
    |    2010 |        5 | 1860186 |
    |    2010 |        6 | 1220122 |
    |    2010 |        7 | 1250125 |
    |    2010 |        8 | 1460146 |
    |    2010 |        9 | 1730173 |
    |    2010 |       10 | 1490149 |
    |    2010 |       11 | 1570157 |
    |    2010 |       12 | 1360136 |
    +---------+----------+---------+
    12 rows in set (5.92 sec)

    select
     count(*) as counter
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100;
    +---------+
    | counter |
    +---------+
    |   30003 |
    +---------+
    1 row in set (1.04 sec)

    explain
    select
     d.*
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
    order by
     d.id desc limit 10;
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra                       |
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 4       | NULL | 4451372 | Using where; Using filesort |
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    1 row in set (0.00 sec)

    select
     d.*
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3 and datasource_id = 100
    order by
     d.id desc limit 10;
    +---------+----------+---------------+---------+-------+
    | year_id | month_id | datasource_id | id      | data  |
    +---------+----------+---------------+---------+-------+
    |    2010 |        3 |           100 | 3290330 | 38434 |
    |    2010 |        3 |           100 | 3290329 |  9988 |
    |    2010 |        3 |           100 | 3290328 | 25680 |
    |    2010 |        3 |           100 | 3290327 | 17627 |
    |    2010 |        3 |           100 | 3290326 | 64508 |
    |    2010 |        3 |           100 | 3290325 | 14257 |
    |    2010 |        3 |           100 | 3290324 | 45950 |
    |    2010 |        3 |           100 | 3290323 | 49986 |
    |    2010 |        3 |           100 | 3290322 |  2459 |
    |    2010 |        3 |           100 | 3290321 | 52971 |
    +---------+----------+---------------+---------+-------+
    10 rows in set (0.98 sec)

    select
     count(*) as counter
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3;
    +---------+
    | counter |
    +---------+
    | 3450345 |
    +---------+
    1 row in set (1.64 sec)

    explain
    select
     d.*
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3
    order by
     d.id desc limit 10;
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    | id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows    | Extra                       |
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    |  1 | SIMPLE      | d     | range | PRIMARY       | PRIMARY | 3       | NULL | 6566916 | Using where; Using filesort |
    +----+-------------+-------+-------+---------------+---------+---------+------+---------+-----------------------------+
    1 row in set (0.00 sec)

    select
     d.*
    from
     datasources d
    where
     d.year_id = 2010 and d.month_id between 1 and 3
    order by
     d.id desc limit 10;
    +---------+----------+---------------+---------+-------+
    | year_id | month_id | datasource_id | id      | data  |
    +---------+----------+---------------+---------+-------+
    |    2010 |        3 |           116 | 3450346 | 42455 |
    |    2010 |        3 |           116 | 3450345 | 64039 |
    |    2010 |        3 |           116 | 3450344 | 27046 |
    |    2010 |        3 |           116 | 3450343 | 23730 |
    |    2010 |        3 |           116 | 3450342 | 52380 |
    |    2010 |        3 |           116 | 3450341 | 35700 |
    |    2010 |        3 |           116 | 3450340 | 20195 |
    |    2010 |        3 |           116 | 3450339 | 21758 |
    |    2010 |        3 |           116 | 3450338 | 51378 |
    |    2010 |        3 |           116 | 3450337 | 34687 |
    +---------+----------+---------------+---------+-------+
    10 rows in set (1.98 sec)
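Both EXPLAIN plans above report Using filesort: when the range covers several months, the matching rows are not stored in id order, so MySQL has to sort them before applying the limit. The key_len values are consistent with the column sizes in the primary key (smallint 2 bytes + tinyint 1 byte + tinyint 1 byte = 4 for the first plan, 2 + 1 = 3 for the second). The sketch below is one way to check whether the sort goes away when the month and data source are both fixed by equality, since rows under that key prefix are already ordered by id. Whether the optimizer actually skips the sort depends on the MySQL version and table statistics, so treat this as something to verify, not a guaranteed plan.

```sql
-- Sketch only: run this and inspect the Extra column for "Using filesort".
-- With year_id, month_id and datasource_id all fixed by equality, the remaining
-- primary key column is id, so the index can in principle be read in reverse
-- to satisfy "order by d.id desc" without a separate sort.
explain
select d.*
from datasources d
where d.year_id = 2010
  and d.month_id = 3
  and d.datasource_id = 100
order by d.id desc
limit 10;
```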

Hope this helps :)

Question

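I have a database with 60 million entries.

Each entry contains:

ID
DataSourceID
Some Data
DateTime

I need to select entries from a specific month. Each month contains approx. 2 million entries.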

    select *
    from Entries
    where time between "2010-04-01 00:00:00" and "2010-05-01 00:00:00"

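(This query takes approx. 1.5 minutes.)

I would also like to select data for a specific month from a given DataSourceID. (This takes approx. 20 seconds.)

There are about 50-100 different DataSourceIDs.

Is there a way to make this faster? What are my options? How can I optimize this database/query?

EDIT: There are approx. 60-100 inserts per second!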
