<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>I want to be free &#187; db</title>
	<atom:link href="http://i1t2b3.com/category/db/feed/" rel="self" type="application/rss+xml" />
	<link>http://i1t2b3.com</link>
	<description>Any fool can make things bigger and more complex</description>
	<lastBuildDate>Wed, 28 Jul 2010 14:34:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=2346</generator>
		<item>
		<title>Convert latin1 to utf8</title>
		<link>http://i1t2b3.com/2010/04/14/convert-latin1-to-utf8/</link>
		<comments>http://i1t2b3.com/2010/04/14/convert-latin1-to-utf8/#comments</comments>
		<pubDate>Wed, 14 Apr 2010 12:51:35 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[development]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=598</guid>
		<description><![CDATA[If you import Unicode text into a latin1 database column, the characters get garbled: Russian text turns into something like &#8220;Ð·Ð°Ð»Ð¾Ð³Ð¾Ð²Ñ‹Ð¹ Ð´ÐµÐ¿Ð¾Ð·Ð¸Ñ‚&#8221;. To convert a short sample quickly (as a test), you can use Lebedev&#8217;s decoder. To convert the whole table, do the following (thanks to dull.ru): The most important step: dump such data in [...]]]></description>
			<content:encoded><![CDATA[<p>If you import Unicode text into a latin1 database column, the characters get garbled: Russian text turns into something like &#8220;Ð·Ð°Ð»Ð¾Ð³Ð¾Ð²Ñ‹Ð¹ Ð´ÐµÐ¿Ð¾Ð·Ð¸Ñ‚&#8221;.</p>
<p><img class="aligncenter size-full wp-image-599" title="extras" src="/wp-content/uploads/2010/04/extras.png" alt="" width="484" height="218" /></p>
<p>To convert a short sample quickly (as a test), you can use <a rel="nofollow" href="http://www.artlebedev.ru/tools/decoder/">Lebedev&#8217;s decoder</a>.</p>
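<p>What the garbling looks like, and why it is reversible, can be sketched in a few lines of Python. This is only an illustration and not part of the original recipe; it assumes that MySQL&#8217;s latin1 behaves like Windows-1252, which is why the <code>cp1252</code> codec is used.</p>

```python
# Assumption: MySQL's "latin1" is treated as Windows-1252 (cp1252) here.
original = "залоговый депозит"

# What happens on import: the UTF-8 bytes are read as if they were latin1 text.
garbled = original.encode("utf-8").decode("cp1252")

# The repair is the same trip in reverse: back to bytes, then decode as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

<p>The dump-and-reimport steps below do the same round trip for a whole table: the dump writes the stored bytes out unchanged, and the import reads them back as utf8.</p>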
<p>To convert the whole table, do the following (thanks to <a rel="nofollow" href="http://dull.ru/2008/11/06/konvertirovat_bazu_v_utf8/">dull.ru</a>):</p>
<ol>
<li>The most important step: dump the data to a file with <code>mysqldump</code>:
<pre><code>mysqldump -u user -p <strong>--default-character-set=latin1 </strong>--skip-set-charset --no-create-info --extended-insert --complete-insert dbname table &gt; dbname.sql</code></pre>
<p>If only some of the rows are bad, create another table (<code>CREATE TABLE t2 LIKE t1</code>), copy the bad rows into it and dump that table instead.</p></li>
<li>Replace <code>latin1</code> with <code>utf8</code> in the file.<br />
This can be done manually in your editor or with this command:
<pre><code>sed -r 's/latin1/utf8/g' dbname.sql &gt; dbname_utf.sql</code></pre>
</li>
<li>Convert your latin1 table to utf8 (truncate it first if you are going to re-import every row, so that the import does not create duplicates):
<pre><code>ALTER TABLE `table` CONVERT TO CHARACTER SET 'utf8';</code></pre>
</li>
<li>Import the utf8 data back to the table:
<pre><code>mysql -u user -p --default-character-set=utf8 dbname &lt; dbname_utf.sql</code></pre>
</li>
</ol>
<p>That&#8217;s it.</p>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2010/04/14/convert-latin1-to-utf8/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Federated engine &#8211; MySQL table as symlink</title>
		<link>http://i1t2b3.com/2010/01/29/federated-table-as-symlink/</link>
		<comments>http://i1t2b3.com/2010/01/29/federated-table-as-symlink/#comments</comments>
		<pubDate>Fri, 29 Jan 2010 13:54:38 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[development]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=570</guid>
		<description><![CDATA[Are you aware of the Federated engine in MySQL (apart from MyISAM and InnoDB)? This engine allows you to define a table that pulls its data from another table, even on a remote server. The table definitions must be identical. I use it for the following: Every time I rebuild the project, I have to wait for [...]]]></description>
			<content:encoded><![CDATA[<p>Are you aware of the <a href="http://dev.mysql.com/doc/refman/5.1/en/federated-storage-engine.html">Federated engine</a> in MySQL (apart from MyISAM and InnoDB)?</p>
<p>This engine allows you to define a table that pulls its data from another table, even on a remote server. The table definitions must be identical.</p>
<p>I use it for the following:</p>
<ol>
<li>Every time I rebuild the project, I have to wait 15 minutes while two big tables are created and filled with data: the geo data tables (world cities, regions, etc., 4 million records) and the <acronym title="Points Of Interest">POI</acronym> table (2 million records). With Federated tables, I keep these tables in two separate databases and just link them into my project.</li>
<li>These tables are shared between several environments (dev, test and live) on the same server.</li>
</ol>
<p>To check whether your MySQL server supports the Federated engine, you can use phpMyAdmin: go to the home page of your phpMyAdmin installation (click the Home icon), open the Engines tab and look for it there.</p>
<p>If it&#8217;s not enabled (grayed out), open your <em>my.ini</em> file, find the &#8220;<em>[mysqld]</em>&#8221; section and make it look like this:</p>
<pre><code>[mysqld]
federated</code></pre>
<p>P.S. If a Federated table definition contains an error, phpMyAdmin shows your database as empty. To fix this, log in via the mysql console and run a SELECT against the badly defined table; the error message you get tells you what to fix.</p>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2010/01/29/federated-table-as-symlink/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Currency exchange in your application</title>
		<link>http://i1t2b3.com/2009/09/28/currency-exchange/</link>
		<comments>http://i1t2b3.com/2009/09/28/currency-exchange/#comments</comments>
		<pubDate>Mon, 28 Sep 2009 10:53:17 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[zend]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=555</guid>
		<description><![CDATA[It&#8217;s easy; you need two things: Fresh currency exchange rates Some way to exchange an amount from one currency to another. This is how I did it: I get the rates from the European Central Bank (ECB) for step #1 and wrote a MySQL stored function for step #2. Here is how to import currency rates from the ECB (EUR is [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s easy; you need two things:</p>
<ol>
<li>Fresh currency exchange rates.</li>
<li>Some way to exchange an amount from one currency to another.</li>
</ol>
<p>This is how I did it: I get the rates from the European Central Bank (ECB) for step #1 and wrote a MySQL stored function for step #2.</p>
<p>Here is how to import currency rates from the ECB (EUR is the base currency, and I add its own rate as 1:1). First I create this database table:</p>
<pre><code class="sql">CREATE TABLE IF NOT EXISTS `currency` (
  `code` char(3) NOT NULL DEFAULT '',
  `rate` decimal(10,5) NOT NULL COMMENT 'Rate to EUR got from www.ecb.int',
  PRIMARY KEY (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Currency rates (regularly updated)';
</code></pre>
<p>Now let&#8217;s fill it with rates:</p>
<pre><code class="php">&lt;?php

class CurrencyController extends Controller_Ajax_Action {

  public function importAction() {

    $db = Zend_Registry::get('db');
    $db-&gt;beginTransaction();

    $url = 'http://www.ecb.int/stats/eurofxref/eurofxref.zip?1c7a343768baab4322620e3498553b5a';
    try {
      $contents = file_get_contents($url);
      $contents = archive::unzip($contents);
      $contents = explode(&quot;\n&quot;, $contents);

      $names = explode(',', $contents[0]);
      $rates = explode(',', $contents[1]);

      $names[] = 'EUR';
      $rates[] = 1;

      // index 0 is the Date column of the ECB CSV, so start from 1
      for ($i = 1; $i &lt; sizeof($names); $i++) {
        if (!(float) $rates[$i]) continue;
        $db-&gt;query( sprintf('INSERT INTO `currency`(`code`, `rate`)
              VALUES (&quot;%s&quot;, %10.5f)
              ON DUPLICATE KEY UPDATE `rate`=VALUES(`rate`)',
             trim( $names[$i] ),
             trim( $rates[$i] )
        ) );
      }

      $db-&gt;commit();
    } catch ( Exception $O_o ) {
      error_log( $O_o-&gt;getMessage() );
      $db-&gt;rollback();
    }

  }
}</code></pre>
<p>Now let&#8217;s create an SQL function for convenient conversions. Create a <code>udf.sql</code> file with the following contents:</p>
<pre><code class="sql">DELIMITER //

DROP FUNCTION IF EXISTS EXCHANGE //
CREATE FUNCTION EXCHANGE( amount DOUBLE, cFrom CHAR(3), cTo CHAR(3) ) RETURNS DOUBLE READS SQL DATA DETERMINISTIC
    COMMENT 'converts money amount from one currency to another'
BEGIN
    -- start from NULL: SELECT ... INTO leaves a variable unchanged when no row matches
    DECLARE rateFrom DOUBLE DEFAULT NULL;
    DECLARE rateTo DOUBLE DEFAULT NULL;

    SELECT `rate` INTO rateFrom FROM `currency` WHERE `code` = cFrom;
    SELECT `rate` INTO rateTo   FROM `currency` WHERE `code` = cTo;

    IF ISNULL( rateFrom ) OR ISNULL( rateTo ) THEN
        RETURN NULL;
    END IF;

    RETURN amount * rateTo / rateFrom;
END; //

DELIMITER ;</code></pre>
<p>and run this command in your shell:</p>
<pre><code>mysql --user=USER --password=PASS DATABASE &lt; udf.sql</code></pre>
<p>This is how you can use the function, for example to convert 10 US dollars to Canadian dollars:</p>
<pre><code class="sql">SELECT EXCHANGE( 10, 'USD', 'CAD')</code></pre>
<p>which returned 10.93 when I ran it, i.e. 10 US dollars = 10.93 Canadian dollars.</p>
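<p>The arithmetic inside <code>EXCHANGE()</code> is a cross rate through the base currency: dividing by the source rate gives the amount in EUR, and multiplying by the target rate gives the result. A minimal Python sketch of the same formula follows; the rates in it are made-up round numbers standing in for the <code>currency</code> table, not real ECB values.</p>

```python
from decimal import Decimal

# Hypothetical rates per 1 EUR (illustrative values only, not ECB data).
RATES = {
    "EUR": Decimal("1"),
    "USD": Decimal("1.25"),
    "CAD": Decimal("1.50"),
}


def exchange(amount, c_from, c_to, rates=RATES):
    """Convert via the EUR base: amount / rate_from is EUR, times rate_to is the target."""
    rate_from = rates.get(c_from)
    rate_to = rates.get(c_to)
    if rate_from is None or rate_to is None:
        return None  # unknown currency, like the RETURN NULL branch of the SQL function
    return Decimal(amount) * rate_to / rate_from


print(exchange(10, "USD", "CAD"))
```

<p>With these sample rates, 10 USD is 8 EUR, which is 12 CAD.</p>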
<p>P.S. Consider adding a call to the currency import action to your cron scripts.</p>
<p>P.P.S. A function to unzip the data file can be found at <noindex><a rel="nofollow" href="http://ua2.php.net/manual/en/ref.zip.php">php.net</a></noindex>.</p>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2009/09/28/currency-exchange/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>XML to CSV conversion</title>
		<link>http://i1t2b3.com/2009/09/03/xml-to-csv/</link>
		<comments>http://i1t2b3.com/2009/09/03/xml-to-csv/#comments</comments>
		<pubDate>Thu, 03 Sep 2009 15:18:10 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[ideas]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=497</guid>
		<description><![CDATA[Data feeds often come in XML format, so your application must be able to deal with that format. As I already wrote, data in CSV format (comma separated values) can be loaded to database extremely fast. So my idea was to convert XML data files to CSV and then use bulk load to database. My [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-504" title="MySQL" src="http://i1t2b3.com/wp-content/uploads/2009/09/mysql-doplhins.png" alt="MySQL" width="180" height="176" />Data feeds often come in XML format, so your application must be able to deal with that format.</p>
<p>As I already <a href="/2009/01/14/quick-csv-import-with-mapping/">wrote</a>, data in CSV (comma-separated values) format can be loaded into a database extremely fast. So my idea was to convert XML data files to CSV and then bulk load them into the database. My tests showed that this is 10-100 times faster than one-by-one inserts.</p>
<p>Yesterday I decided to write a generalized solution for this, and it turned out that there is no need: <a rel="nofollow" href="http://dev.mysql.com/doc/refman/6.0/en/load-xml.html">MySQL 6 will have such a feature</a>!</p>
<p>How it works: you create a table whose columns are named exactly like the XML nodes or attributes, and the MySQL server loads the data accordingly.</p>
<p>Example: you downloaded a POI (Points of Interest) list file called <code>poi.xml</code> that looks like this:</p>
<pre><code class="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;gpx&gt;
    &lt;wpt lat="58.691931900" lon="11.253125962"&gt;
        &lt;name&gt;Parking&lt;/name&gt;
    &lt;/wpt&gt;
    &lt;wpt lat="58.315525000" lon="12.305828000"&gt;
        &lt;name&gt;Fast Food Restaurant:Max i trollhattan&lt;/name&gt;
    &lt;/wpt&gt;
    &lt;wpt lat="57.717958100" lon="11.880860600"&gt;
        &lt;name&gt;Picnic spot&lt;/name&gt;
    &lt;/wpt&gt;
&lt;/gpx&gt;</code></pre>
<p>You create a MySQL table:</p>
<pre><code class="sql">CREATE TABLE IF NOT EXISTS `poi` (
  `lat` varchar(255) NOT NULL,
  `lon` varchar(255) NOT NULL,
  `name` varchar(255) NOT NULL
) DEFAULT CHARSET=utf8;
</code></pre>
<p>OK, now you load the XML data to your table:</p>
<pre><code class="sql">LOAD XML INFILE '\\path\\to\\poi.xml'
INTO TABLE `poi`
ROWS IDENTIFIED BY '&lt;wpt&gt;'</code></pre>
<p>Voila!</p>
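<p>On a server without <code>LOAD XML</code>, the original XML-to-CSV idea can still be done by hand. The sketch below is a rough Python illustration of the same mapping, where each column is taken from an attribute or a child element of the row tag. The function names are hypothetical, and the sample is a shortened copy of <code>poi.xml</code>.</p>

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the poi.xml sample, embedded so the sketch is self-contained.
POI_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<gpx>
    <wpt lat="58.691931900" lon="11.253125962">
        <name>Parking</name>
    </wpt>
    <wpt lat="57.717958100" lon="11.880860600">
        <name>Picnic spot</name>
    </wpt>
</gpx>"""


def wpt_rows(xml_bytes, columns=("lat", "lon", "name")):
    """Yield one row per <wpt>; each column comes from an attribute or a child element."""
    root = ET.fromstring(xml_bytes)
    for wpt in root.iter("wpt"):
        row = []
        for column in columns:
            value = wpt.get(column)  # attribute, e.g. lat="..."
            if value is None:
                value = wpt.findtext(column, default="")  # child element, e.g. <name>
            row.append(value.strip())
        yield row


def to_csv(xml_bytes):
    """Render the rows as CSV text, ready for a bulk load."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(wpt_rows(xml_bytes))
    return buffer.getvalue()


print(to_csv(POI_XML))
```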
<p>The good thing is that <a rel="nofollow" href="http://mysql.west.mirrors.airband.net/Downloads/MySQL-6.0/">MySQL 6 is already available</a> as an alpha version, which is good enough for development purposes. I gave it a try: it takes 5 seconds to load 4.8 MB of data in 19 files.</p>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2009/09/03/xml-to-csv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sharing work between agents</title>
		<link>http://i1t2b3.com/2009/07/06/zend-db-table-select-with-lock/</link>
		<comments>http://i1t2b3.com/2009/07/06/zend-db-table-select-with-lock/#comments</comments>
		<pubDate>Sun, 05 Jul 2009 21:08:04 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[php]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=405</guid>
		<description><![CDATA[There are situations when you need to split the processing of a large amount of data among several &#8220;agents&#8221;, e.g.: a long list of websites that your web crawlers must check for being alive (404 error check); a queue of photos to be resized or videos to be converted; articles that your editors must [...]]]></description>
			<content:encoded><![CDATA[<p>There are situations when you need to split the processing of a large amount of data among several &#8220;agents&#8221;, e.g.:</p>
<ul>
<li>a long list of websites that your web crawlers must check for being alive (404 error check);</li>
<li>a queue of photos to be resized or videos to be converted;</li>
<li>articles that your editors must review;</li>
<li>a catalogue of blog feeds that your system must import posts from;</li>
<li>etc.</li>
</ul>
<p>The idea is simple:</p>
<ol>
<li>Give a small piece of the big job to an agent.</li>
<li>Mark this piece as given to that agent (so that nobody else starts the same job) and remember either the time when the job was given or the time when the assignment becomes obsolete (this agent is dead, let&#8217;s give the job to someone else).</li>
<li>When the work is done, go to step #1.</li>
<li>After some period of time (1 hour), check all the time stamps, and if some agents didn&#8217;t cope with their jobs, mark those jobs as free so that others can start working on them.</li>
</ol>
<p>The problem lies between steps #1 and #2: after you have given a job to Agent 1 but before you have marked it as taken, Agent 2 may be given the same job. If you have many agents, this really does happen. This situation is called a <em>race condition</em> (a concurrent read/write).</p>
<p>To overcome this, a lock can be used.</p>
<p>In this article I will explain how to use locks in a Zend Framework project with a MySQL database.</p>
<p>First of all, the <a title="SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE Locking Reads" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.4/en/innodb-locking-reads.html">MySQL documentation says</a> that <code>SELECT .. FOR UPDATE</code> can be used for that purpose. The first step is to select the records with that statement, and the second step is to mark them as locked. The requirements are to use <a title="Creating and Using InnoDB Tables" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/using-innodb-tables.html">InnoDB storage</a> and to wrap these two statements in a transaction.</p>
<p>Luckily, Zend_Db_Table_Select has a special method, <code>forUpdate()</code>, that implements the <code>SELECT .. FOR UPDATE</code> statement. Zend_Db can handle transactions as well. Let&#8217;s try it!</p>
<p>To lock a record, we need two fields:</p>
<ol>
<li>one to remember the ID of the agent that is processing the record (let&#8217;s call this column <code>locked_by</code>);</li>
<li>another one to know the time when the lock becomes obsolete (let&#8217;s call this column <code>expires_at</code>).</li>
</ol>
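<p>Before the Zend class, here is a minimal stand-alone sketch of the select-then-mark flow with these two columns. It is not the author&#8217;s implementation. It uses Python with an in-memory SQLite database instead of MySQL, so <code>BEGIN IMMEDIATE</code> (a database-wide write lock) stands in for <code>SELECT .. FOR UPDATE</code>, and expired locks are picked up directly in the select instead of by a separate release job. The table name <code>job</code> and the function names are assumptions made for the example.</p>

```python
import sqlite3


def make_queue(job_count):
    """In-memory stand-in for a job table with the two lock columns."""
    # isolation_level=None: autocommit mode, transactions are managed by hand below
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE job ("
        " id INTEGER PRIMARY KEY,"
        " locked_by TEXT,"
        " expires_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO job (id) VALUES (?)",
        [(i,) for i in range(1, job_count + 1)],
    )
    return conn


def claim_jobs(conn, agent_id, now, ttl=3600, limit=2):
    """Select free or expired jobs and mark them as taken inside one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM job"
            " WHERE locked_by IS NULL OR expires_at <= ?"
            " ORDER BY id LIMIT ?",
            (now, limit),
        )]
        conn.executemany(
            "UPDATE job SET locked_by = ?, expires_at = ? WHERE id = ?",
            [(agent_id, now + ttl, job_id) for job_id in ids],
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return ids
```

<p>Because the read and the update happen under the same write lock, two agents calling <code>claim_jobs()</code> at the same moment cannot receive the same job.</p>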
<p>I wrote a class that inherits from Zend_Db_Table and fetches records while locking them.</p>
<pre><code class="php">&lt;?php

class Koodix_Db_Table_Lockable extends Zend_Db_Table
{
    protected $_lockedByField = 'locked_by';
    protected $_expiresAtField = 'expires_at';
    protected $_TTL = '1 HOUR'; //time to live for lock

    public function fetchLocked( Zend_Db_Table_Select $select,
        $lockerID ) {

        $db = $this-&gt;getAdapter();
        $db-&gt;beginTransaction();

        $column = $db-&gt;quoteIdentifier( $this-&gt;_lockedByField );
        $select-&gt;forUpdate()
             -&gt;where(&quot;$column=? OR $column IS NULL&quot;, $lockerID);

        $data = $this-&gt;fetchAll($select);
        if( count($data) == 0 ) {
            // nothing to lock: close the transaction before returning
            $db-&gt;rollBack();
            return null;
        }

        $expiresAt = new Zend_Db_Expr('DATE_ADD( NOW(),
            INTERVAL ' . $this-&gt;_TTL . ')');
        if( sizeof($this-&gt;_primary) &gt; 1 ) {
            foreach( $data as $item ) {
                $item-&gt;{$this-&gt;_lockedByField} = $lockerID;
                $item-&gt;{$this-&gt;_expiresAtField} = $expiresAt;

                $item-&gt;save();
            }
        }
        else {
            $primaryColumn = current($this-&gt;_primary);
            $arrIds = array();
            foreach( $data as $item ) {
                $arrIds[] = $item-&gt;{$primaryColumn};
            }

            $this-&gt;update(
                array(
                    $this-&gt;_lockedByField =&gt; $lockerID,
                    $this-&gt;_expiresAtField =&gt; $expiresAt,
                ),
                // quoteInto() expands the array into a quoted, comma-separated list
                $db-&gt;quoteInto(
                    $db-&gt;quoteIdentifier($primaryColumn) . ' IN (?)', $arrIds
                )
            );
        }

        $db-&gt;commit();
        return $data;
    }

    public function releaseLocks( ) {

        $db = $this-&gt;getAdapter();
        $column = $db-&gt;quoteIdentifier( $this-&gt;_expiresAtField );

        // free every record whose lock has expired
        return $this-&gt;update(
            array(
                $this-&gt;_lockedByField =&gt; null,
                $this-&gt;_expiresAtField =&gt; null,
            ),
            &quot;$column &lt;= NOW()&quot;
        );
    }
}</code></pre>
<p>If the table has a composite primary key (more than one column), the ActiveRecord approach is used: <code>save()</code> is called for every record. That is simple, but the drawback is multiple UPDATE queries. Otherwise, if it is an ordinary table with a single ID column as the primary key, the IDs are collected into a list and all records are updated by a single statement with <code>IN</code> in the WHERE clause, which is much faster.</p>
<p>TTL (&#8216;<em>Time to Live</em>&#8217;) is the period for which a lock stays valid. In my application the default is one hour. The format of the TTL value is described in the <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/4.1/en/date-and-time-functions.html#function_date-add">MySQL documentation</a></noindex>.</p>
<p>And now how to use it. </p>
<p>Let&#8217;s imagine you have several editors who share a big list of articles and review them. My model class has a method <code>fetchForUser()</code> that returns no more than 5 articles for the current user (by the given user ID).</p>
<p>This is the Article table model, inherited from the class above. Such classes are usually located at</p>
<pre><code>application/default/models/ArticleTable.php</code></pre>
<pre><code class="php">&lt;?php
class ArticleTable extends Koodix_Db_Table_Lockable
{
    protected $_name = 'article';

    public function fetchForUser( $userId, $count=5 ) {

        $select = $this-&gt;select()
            -&gt;where('reviewed = 0')
            -&gt;order('expires_at DESC')
            -&gt;order('date_imported DESC')
            -&gt;limit( $count );

        return $this-&gt;fetchLocked($select, $userId);
    }
}</code></pre>
<p>Note: if the editor refreshes the page, the <code>expires_at</code> field is recalculated from the current time as well.</p>
<p>As for step four of our algorithm (releasing all obsolete locks): create an action in your backend controller, call your table model&#8217;s <code>releaseLocks()</code> method in it, and call that action periodically from cron.</p>
<p>To speed up the lock releasing, create an index on the <code>expires_at</code> column. (This is why I rejected a <code>locked_since</code> column in favor of <code>expires_at</code>.)</p>
<p>P.S. In my database, date/time columns have the DATETIME type. If you use INT to store timestamps, convert the values to Unix time and back.</p>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2009/07/06/zend-db-table-select-with-lock/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>32 tips to speed up your queries</title>
		<link>http://i1t2b3.com/2009/01/19/quick-queries/</link>
		<comments>http://i1t2b3.com/2009/01/19/quick-queries/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 18:55:00 +0000</pubDate>
		<dc:creator>Skakunov Alexander</dc:creator>
				<category><![CDATA[db]]></category>

		<guid isPermaLink="false">http://i1t2b3.com/?p=160</guid>
		<description><![CDATA[As a certified MySQL developer (yes, I&#8217;m listed on mysql.com!!! ), I would like to share some experience I gained while preparing for the certification. Today I will tell you how to speed up your queries. Use persistent connections to the database to avoid connection overhead. Check that all tables have PRIMARY KEYs on columns with high [...]]]></description>
			<content:encoded><![CDATA[<p>As a certified MySQL developer (yes, I&#8217;m listed on <noindex><a rel="nofollow" href="http://www-it.mysql.com/certification/candidates.php?exam=1#Ukraine">mysql.com</a></noindex>!!! <img src='http://i1t2b3.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ), I would like to share some experience I gained while preparing for the certification. Today I will tell you how to speed up your queries.</p>
<ol>
<li>Use persistent connections to the database to avoid connection overhead.</li>
<li>Check that all tables have <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/constraint-primary-key.html"><code>PRIMARY KEY</code></a></noindex>s on columns with high cardinality (few rows match each key value). For example, a <code>gender</code> column has low cardinality (selectivity), while a unique user id column has high cardinality and is a good candidate for a primary key.</li>
<li>All references between different tables should usually be done with indices (which also means they must have identical data types so that joins based on the corresponding columns will be faster). Also check that fields that you often need to search in (those that appear frequently in <code>WHERE</code>, <code>ORDER BY</code> or <code>GROUP BY</code> clauses) have indices, but don&#8217;t add too many: the worst thing you can do is to add an index on every column of a table <img src='http://i1t2b3.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  (I haven&#8217;t seen a table that needed more than 5 indices, even one with 20-30 columns). If you never refer to a column in comparisons, there&#8217;s no need to index it.</li>
<li>Using simpler permissions when you issue <code>GRANT</code> statements enables MySQL to reduce permission-checking overhead when clients execute statements.</li>
<li>Use less RAM per row by declaring columns only as large as they need to be to hold the values stored in them.</li>
<li>Use the <noindex><a rel="nofollow" href="http://www.devshed.com/c/a/MySQL/Optimizing-for-Query-Speed/5/">leftmost index prefix</a></noindex>: in MySQL you can define an index on several columns, and the leftmost part of that index can be used as an index of its own, so you need fewer indices.</li>
<li>When your index consists of many columns, why not create a hash column that is short, reasonably unique, and indexed? Compute the hash from the values you are searching for, so that the index on the hash column can be used. Your query will then look like:
<pre><code class="sql">SELECT *
FROM `table`
WHERE hash_column = MD5( CONCAT('aaa', 'bbb') )
AND col1='aaa' AND col2='bbb';</code></pre>
</li>
<li>Consider running <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html"><code>ANALYZE TABLE</code></a></noindex> (or <code>myisamchk --analyze</code> from command line) on a table after it has been loaded with data to help MySQL better optimize queries.</li>
<li>Use the <code>CHAR</code> type when possible (instead of <code>VARCHAR</code>, <code>BLOB</code> or <code>TEXT</code>), i.e. when the values of a column have constant length: an MD5 hash (32 symbols), an ICAO or IATA airport code (4 and 3 symbols), an ISO 4217 currency code (3 symbols), etc. Data in <code>CHAR</code> columns can be found faster than in columns of variable-length data types.</li>
<li>Don&#8217;t split a table if you just have too many columns. In accessing a row, the biggest performance hit is the disk seek needed to find the first byte of the row.</li>
<li>Declare a column as <code>NOT NULL</code> if it really never holds NULL; this speeds up table traversal a bit.</li>
<li>If you usually retrieve rows in the same order, like <code>expr1, expr2, ...</code>, run <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/alter-table.html"><code>ALTER TABLE ... ORDER BY expr1, expr2, ...</code></a></noindex> to optimize the table.</li>
<li>Don&#8217;t use a PHP loop to fetch rows from the database one by one just because you can <img src='http://i1t2b3.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  ; use <code>IN</code> instead, e.g.
<pre><code class="sql">SELECT *
FROM `table`
WHERE `id` IN (1,7,13,42);</code></pre>
</li>
<li>Use column default values, and insert only those values that differ from the default. This reduces the query parsing time.</li>
<li>Use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/insert-delayed.html"><code>INSERT DELAYED</code></a></noindex> or <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/insert.html"><code>INSERT LOW_PRIORITY</code></a></noindex> (for MyISAM) to write to your change log table. Also, if it&#8217;s MyISAM, you can add the <code>DELAY_KEY_WRITE=1</code> option: it makes index updates faster because they are not flushed to disk until the table is closed.</li>
<li>Think of storing user session data (or any other non-critical data) in a <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html"><code>MEMORY</code></a></noindex> table; it&#8217;s very fast.</li>
<li>For your web application, images and other binary assets should normally be stored as files. That is, store only a reference to the file rather than the file itself in the database.</li>
<li>If you have to store large amounts of textual data, consider using a <code>BLOB</code> column to hold compressed data (MySQL&#8217;s COMPRESS() seems to be slow, so gzipping on the PHP side may help) and decompressing the contents on the application server. Either way, benchmark it.</li>
<li>If you often need to calculate <code>COUNT</code> or <code>SUM</code> based on information from a lot of rows (articles rating, poll votes, user registrations count, etc.), it makes sense to create a separate table and update the counter in real time, which is much faster. If you need to collect statistics from huge log tables, take advantage of using a summary table instead of scanning the entire log table every time.</li>
<li>Don&#8217;t use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/replace.html"><code>REPLACE</code></a></noindex> (which is <code>DELETE+INSERT</code> and wastes ids): use <code><noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html">INSERT ... ON DUPLICATE KEY UPDATE</a></noindex></code> instead (an <code>INSERT</code> that turns into an <code>UPDATE</code> when a key conflict occurs). The same technique can be used where you would otherwise first run a <code>SELECT</code> to find out whether the data is already in the database, and then run either <code>INSERT</code> or <code>UPDATE</code>. There is no need to decide yourself: let the database do it.</li>
<li>Tune MySQL caching: allocate enough memory for the buffer (e.g. <code>SET GLOBAL <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_query_cache_size">query_cache_size</a></noindex> = 1000000</code>) and define <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_query_cache_min_res_unit"><code>query_cache_min_res_unit</code></a></noindex> depending on average query resultset size.</li>
<li>Divide complex queries into several simpler ones: they are more likely to be cached, so they will be quicker.</li>
<li>Group several similar <code>INSERT</code>s in one long <code>INSERT</code> with multiple <code>VALUES</code> lists to insert several rows at a time: the query is quicker because connecting, sending and parsing a query takes 5-7 times longer than the actual data insertion (depending on row size). If that is not possible, use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/commit.html"><code>START TRANSACTION</code> and <code>COMMIT</code></a></noindex> if your database is InnoDB, otherwise use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html"><code>LOCK TABLES</code></a></noindex>. This benefits performance because the index buffer is flushed to disk only once, after all <code>INSERT</code> statements have completed; in this case unlock your tables every 1000 rows or so to allow other threads access to the table.</li>
<li>When loading a table from a text file, use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/load-data.html"><code>LOAD DATA INFILE</code></a></noindex> (or <noindex><a rel="nofollow" title="Quick CSV import with visual mapping" href="http://i1t2b3.com/2009/01/14/quick-csv-import-with-mapping/">my tool</a></noindex> for that), it&#8217;s 20-100 times faster.</li>
<li><noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html">Log slow queries</a></noindex> on your dev/beta environment and investigate them. This way you can catch queries whose execution time is high, those that don&#8217;t use indexes, and also slow administrative statements (like <code>OPTIMIZE TABLE</code> and <code>ANALYZE TABLE</code>).</li>
<li>Tune your database <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/server-parameters.html">server parameters</a></noindex>: for example, increase buffer sizes.</li>
<li>If you have lots of <code>DELETE</code>s in your application, or updates that make dynamic-format rows of your MyISAM table longer (a row has dynamic format if the table has a <code>VARCHAR</code>, <code>BLOB</code> or <code>TEXT</code> column), which may split the rows, schedule an <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/optimize-table.html"><code>OPTIMIZE TABLE</code></a></noindex> query to run every weekend from cron. This defragments the table, which makes queries faster. If you don&#8217;t use replication, add the <code>LOCAL</code> keyword to make it faster.</li>
<li>Don&#8217;t use <code>ORDER BY RAND()</code> to fetch several random rows. Fetch 10-20 entries (the latest by time added or by ID) and call <code>array_rand()</code> on the PHP side. There are also <noindex><a rel="nofollow" href="http://jan.kneschke.de/projects/mysql/order-by-rand/">other solutions</a></noindex>.</li>
<li>Consider avoiding the <code>HAVING</code> clause; it&#8217;s rather slow.</li>
<li>In most cases, a <code>DISTINCT</code> clause can be considered as a special case of <code>GROUP BY</code>; so the <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/group-by-optimization.html">optimizations applicable to <code>GROUP BY</code> queries</a></noindex> can be also applied to queries with a <code>DISTINCT</code> clause. Also, if you use <code>DISTINCT</code>, try to use <code>LIMIT</code> (MySQL stops as soon as it finds row_count unique rows) and avoid <code>ORDER BY</code> (it requires a temporary table in many cases).</li>
<li>When I read &#8220;Building scalable web sites&#8221;, I found that it is sometimes worth de-normalising some tables (Flickr does this), i.e. duplicating some data in several tables to avoid <code>JOIN</code>s, which are expensive. You can maintain data integrity with foreign keys or triggers.</li>
<li>If you want to test a specific MySQL function or expression, use <noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_benchmark"><code>BENCHMARK</code></a></noindex> function to do that.</li>
</ol>
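<p>The hash column tip can be tried out without a MySQL server. The sketch below is a small Python illustration with an in-memory SQLite table; the table, index and function names are assumptions made for the example. The hash is computed on the application side from the values being searched for, and the original columns stay in the condition to filter out rows whose hash merely collides.</p>

```python
import hashlib
import sqlite3


def row_hash(col1, col2):
    """Short fixed-length key computed from the long multi-column key."""
    return hashlib.md5((col1 + col2).encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, hash_column TEXT)")
conn.execute("CREATE INDEX idx_hash ON t (hash_column)")

# ("aaa", "bbb") and ("aa", "abbb") concatenate to the same string,
# so they deliberately share one hash value.
rows = [("aaa", "bbb"), ("aa", "abbb"), ("ccc", "ddd")]
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(a, b, row_hash(a, b)) for a, b in rows],
)


def find(col1, col2):
    # The hash narrows the search through the index; the col1/col2 checks
    # drop rows that only share the hash.
    return conn.execute(
        "SELECT col1, col2 FROM t"
        " WHERE hash_column = ? AND col1 = ? AND col2 = ?",
        (row_hash(col1, col2), col1, col2),
    ).fetchall()


print(find("aaa", "bbb"))
```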
<p>Some of these hints are inapplicable if you use a framework, because hand-written queries are not welcome there. In that case, focus on optimizing the database itself: tune indexes and server parameters.</p>
<p>More on queries optimization:</p>
<ul>
<li><noindex><a rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/query-speed.html">Optimizing SELECT and Other Statements</a></noindex></li>
<li><noindex><a rel="nofollow" href="http://www.devshed.com/c/a/MySQL/Optimizing-for-Query-Speed/">Optimizing for Query Speed</a></noindex> (more about good index creating practices).</li>
<li><noindex><a rel="nofollow" href="http://jan.kneschke.de/projects/mysql/order-by-rand/">ORDER BY RAND()</a></noindex></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://i1t2b3.com/2009/01/19/quick-queries/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
