Big MySQL Table? Watch the keyfile Length
I made some large tables (around 3-4 billion records each) and then tried to create indexes on one column in each of them. I found that MySQL stopped at a seemingly arbitrary maximum index file size (.MYI file) for both of them. When I tried to open either table, I received an error message “ERROR 1016 (HY000): Can’t open file: ‘mytable.MYI’ (errno: 144)” indicating that the index file was crashed.
CHECK TABLE mytable QUICK reported ‘warning : Keyfile is almost full, 17179867136 of 17179867136 used’ Then myisamchk -dv mytable reported similar information under ‘Max keyfile length,” and reported “Keyfile pointer (bytes): 3′
After a little digging around, I realized that a 3-byte keyfile pointer only allows the index file to hold 2^24 blocks (about 16 million blocks of 1024 bytes each). My index file was holding about 110 keys per block (e.g., each record needed 4 bytes for the key value, plus 4 bytes for the datafile pointer, and maybe one byte for null/housekeeping, and then each block kept a little more empty space). So the 3-byte keyfile pointer only allowed MySQL to index about 1.8 billion records. Then it gave up, leaving a corrupted index file.
I have found no documentation for how MySQL decides how big a keyfile pointer to use in the .MYI file. I had no trouble adding 3-4 billion records to the MYD file, so it didn’t occur to me that there would be a problem with the MAX_ROWS clause of the CREATE TABLE command (which I hadn’t specified). But that turned out to be the problem.
I recovered my data table without the index file, then issued ALTER TABLE and now it looks like everything will be OK.
Here’s how I got my table back, without the index file (similar to what’s described above):
- make a backup of the data file (mytable.myd), just in case
- create a new empty table with the same structure as the existing table, but no index keys.
- copy the .FRM and .MYI files from the new (empty) table over the existing mytable.frm and mytable.myi files.
- issue
REPAIR TABLE mytable USE_FRM;It is very important to include the USE_FRM clause. That way, MySQL knows to make a new (empty) .MYI file that matches the .MYD file, rather than truncating the .MYD file to match the .MYI file. You will eventually get a message along the lines of “Number of rows changed from 0 to xxxx”. - Now you have a working table, with no indexes. Then you can set MAX_ROWS to something appropriate, and try again to make the indexes.
MySQL config and iptables
This post is for reference.
Of note, there is a good example of using iptables, and a description of how to bind your local mysql client to a remote mysql server.
http://www.cyberciti.biz/tips/how-do-i-enable-remote-access-to-mysql-database-server.html
Configuring MySQL
The following is how I set up MySQL on wharf.
- First, ensure mysql is installed:
ls /etc/init.d/my* - Now, make mysql start on boot:
sudo /sbin/chkconfig --level 35 mysqld on - Start the mysql server:
sudo /etc/init.d/mysqld start - Make MySQL more secure:
sudo /usr/bin/mysql_secure_installation - Log into mysql as root:
mysql -u root -p [to be entered by prompt] - Add another mysql user:
mysql> CREATE USER 'newuser'@'%' IDENTIFIED BY 'newuserpass';
mysql> GRANT SELECT,INSERT,UPDATE,DELETE,CREATE,DROP,ALTER ON *.* TO 'newuser'@'%';
mysql> FLUSH PRIVILEGES; - Exit mysql:
mysql> exit - Test:
mysql -u <newuser> -p <newuserpass>
Things that Kill MySQL Performance
- Failing to plan for scaling out, scaling up
- Not using EXPLAIN (learn from your mistakes)
- Using the wrong data types (hint: use smallest fixed-size)
- Using persistent connections in PHP (leads to zombie processes)
- Using a heavy DB abstraction layer (hint: use PDO or custom)
- Using the wrong storage engine (MEMORY, ARCHIVE, InnoDB, etc.)
- Using indexes improperly (hint: select in the order of the index; hint: keep primary key small)
- Issuing SQL queries that can’t be cached easily; having too large of a query cache
- Using stored procedures when prepared statements are better
- Using functions on indexed columns in the WHERE clause
- Not indexing important columns; having redundant indexes
- Using sub-queries when joins would be better
- Issuing avoidable deep scans
- Doing SELECT COUNT(*) without WHERE on InnoDB table.
- Failing to profile/benchmark your SQL
- Not using AUTO_INCREMENT if applicable
- Not using ON DUPLICATE KEY UPDATE (1 query vs 2)
Reference:
MySQL DNS Caching
On December 12, 2008, SCCOOS metadata system was broken. The error was cryptic, but the message was clear: Connection to [IP] denied.
Resolution
Using the MySQL command-line tool:
mysql -u admin -p [-h
mysql> flush hosts;
That’s it!
Chain of Events
- UCSD’s DNS was polluted, identifying 132.239.123.144 as ridge2000.org for the past few weeks.
- Sandbar was restarted last week, freeing up both the OS and MySQL DNS caches, which meant all new connections will need to query a DNS server to identify whether it’s an allowed machine or not (do you see a security vulnerability?).
- Both the OS and MySQL cached the hostname, but MySQL further rejected all connections from alfredo because MySQL thought that alfredo was ridge2000 instead.
- The mysql database, user table identifies user sccoos can connect to sandbar from alfredo.ucsd.edu.
- The OS cache probably cleared, but since so few machines access sandbar, the MySQL cache hadn’t filled up yet.
- Just last night (2008-12-22), the UCSD DNS purged the ridge2000.org name from its list, so UCSD’s DNS is clean, but any caches may still be polluted.
- Windows machines cycle their caches regularly, so the problem isn’t more wide-spread or commonly understood.
- The long-lasting MySQL DNS cache is disturbing, but alternatives would cripple the imperative nature of a database.