My article about the queues

development, php, php classes Comments Off on My article about the queues

I wrote an article how queues can speed up your application and make your users happier.

Queues are everywhere. When you delete a photo on your iPhone, it goes to the Bin and will be deleted from there after 30 days. That is a queue. Without it you will not be able to recover a picture removed by mistake.

Read the full article here.

Grab a list of emails from a website with paging

development, ideas Comments Off on Grab a list of emails from a website with paging

A one-liner to grab a list of emails.

wget -q -O - http://server.com/?page={1..42} | grep -ioE '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b' | sort -ui > emails.txt

Just replace the page URL and define the start-finish numbers of paging: the {1..42} part means paging from page #1 to page #42 — of course, this is what you should investigate to form a proper final URL.

The sorted results are in emails.txt file.

Yes, there is no phone numbers or first\last names parsed. Fast and easy solution.

MySQL stored procedures debugging

development, mysql Comments Off on MySQL stored procedures debugging

It seems there are no built-in ways to debug user-defined functions or procedures in MySQL. So if the function behaves bad, it’s hard to find out why.

Here is what I do in order to trace a function. Yes, it’s a bit ugly, but better than nothing.

1. Run this in your MySQL command line (or PhpMySQL with // as delimiter, there is a field for it below the SQL window).

DELIMITER //

DROP PROCEDURE IF EXISTS Debug; //
CREATE PROCEDURE Debug(Message TEXT)
BEGIN
    CREATE TABLE IF NOT EXISTS _debug (
        `id` int(10) unsigned NOT NULL auto_increment,
        `msg` TEXT DEFAULT NULL,
        `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY  (`id`)
    );
    INSERT INTO _debug(`msg`)  VALUES(Message);
END; //

DROP PROCEDURE IF EXISTS ClearDebugMessages; //
CREATE PROCEDURE ClearDebugMessages()
BEGIN
    TRUNCATE TABLE _debug;
END; //

2. In your function, this can be made:

CALL Debug('Debug message goes here');

3. There gonna be _debug table in your database that contains debug messages with date of creation of each.

4. If you want to clear all the debug messages, make a call:

CALL ClearDebugMessages();

Alternative is to truncate the _debug table with PhpMyAdmin.

MySQL random function

db, development, mysql Comments Off on MySQL random function

MySQL RAND() function returns values between 0 and 1. At times you need a random value between A and B, then you can use this function:

DELIMITER //
DROP  FUNCTION IF EXISTS RANDOM;//
CREATE FUNCTION RANDOM(minimum INT, maximum INT)
    RETURNS INT NO SQL NOT DETERMINISTIC
    COMMENT 'integer random value in the bounds given'
RETURN minimum + 
  ROUND(1000000 * RAND() % (maximum-minimum) );//
DELIMITER ;

You can call it like this to get a random integer value between 0 and 100:

SELECT RANDOM(0, 100);

Sphinx DocumentID for complex queries

development Comments Off on Sphinx DocumentID for complex queries

Every document in Sphinx has to have an unique Document ID. Usually it’s the database table’s primary key value.

But what if your table doesn’t have a primary key or is a very complex query?

In that case we used this approach.
1. Create a table called _sequence with just one integer field called id.
2. Run this query:

DELIMITER //
DROP  FUNCTION IF EXISTS GET_NEW_ID; //
CREATE FUNCTION GET_NEW_ID( ) RETURNS INT READS SQL DATA
BEGIN
    UPDATE _sequence SET id = LAST_INSERT_ID(id+1);
    SET @id = LAST_INSERT_ID();
    IF 0 = @id THEN
        INSERT INTO _sequence VALUES (0); -- fix if there were no data to increment
    END IF;
    RETURN @id;
END; //
DELIMITER ;

3. In your Sphinx queries you can use this function to generate an unique document ID:

SELECT GET_NEW_ID() as id, title, ...

Ant build script to optimize images

ant, development, phpunit Comments Off on Ant build script to optimize images

My PHP-projects are built with help of Apache Ant tool. It just runs a set of predefined applications for many different reasons you define: to make a smoke test of your code, to prepare files for commit, grab code stats, etc.

So, Ant without params runs a default ruleset, which can be a simple smoke test: PHP syntax check of modified files + PHPUnit tests.

One more reason I use it is the recommendation from Google PageSpeed Insights to optimize the images — exclude unnecessary technical information (EXIF) from all your JPGs and PNGs pictures.

Here is how I did it.

First, they recommend to install optipng to optimize the PNG files and the jpegoptim for JPEG (look in “Tools and parameter tuning” section).

So, here is my Ant script:

<?xml version="1.0" encoding="UTF-8"?>
<!-- http://ant.apache.org/manual/using.html -->

<project name="yasno.tv" default="build" basedir=".">
  <target name="build" depends="..., optimg" />
  ...

  <target name="optimg" description="Optimizes images in public folder">
    <apply executable="optipng" failonerror="true" description="Optimizes PNG files">
      <arg value="-o7" />
      <fileset dir="${basedir}/public/img/">
        <include name="**/*.png" />
        <modified>
          <param name="cache.cachefile" value="${basedir}/etc/build/cache.properties"/>
        </modified>
      </fileset>  
    </apply>

    <apply executable="jpegoptim" failonerror="true" description="Optimizes JPG files">
      <arg value="--strip-all" />
      <fileset dir="${basedir}/public/img/">
        <include name="**/*.jpg" />
        <modified>
          <param name="cache.cachefile" value="${basedir}/etc/build/cache.properties"/>
        </modified>
      </fileset>  
    </apply>
  </target>
    
</project>

It optimizes all image files from public/img folder.

The etc/build/cache.properties is the file that contains a file name and the last modification time, so the script optimizes only the changed files.

As result, the image files usually loose 10-20% of their size.

The command to do is:

ant optimg

or just

ant

if you want to run the whole ruleset.

Specifying CSS class of a Zend_Navigation li element

delete your code, development, php, zend Comments Off on Specifying CSS class of a Zend_Navigation li element

Zend Navigation is a really nice tool to handle menus.

You just specify an array of your menu items:

$pages = array(
    array(
        'label'      => 'Privacy Statement',
        'controller' => 'terms',
        'action'     => 'privacy-statement',
        'class'      => 'firstNav',
    ),
    array(
        'label'      => 'General Terms of Use',
        'controller' => 'terms',
        'action'     => 'general-terms-of-use',
    ),
);

and initialize the navigation object with it — and it works:

$container = new Zend_Navigation($pages);
$this->view->navigation($container)->menu()
    ->setUlClass('nav')
    ->setActiveClass('active');

Result is the following:

<ul class="nav">
    <li class="active">
        <a class="firstNav" href="/terms/privacy-statement">Privacy Statement</a>
    </li>
    <li>
        <a href="/terms/general-terms-of-use">General Terms of Use</a>
    </li>
</ul>

There are options to specify the CSS class of the whole UL tag. If you specify CSS of a menu item (“firstNav” in my example), it’s added to to the A tag, not the LI tag as required by sliced design I have.

Googling shows that people are trying to work-around that by jQuery fixes.

It seems there is a proper way to solve this; you just need to add this option:

$this->view->navigation()->menu()->addPageClassToLi(true);

Enjoy!

Domain Driven Design as billiards game

DDD, development, fun Comments Off on Domain Driven Design as billiards game

I am becoming a fan of Domain Driven Design (DDD).

Here is a fun idea how DDD could look if you come to a pub to play a billiards game.

You are asked which game exactly you are going to play; so you say “pool”. As result you get a pool table and a pool rules agreement. This is an abstract factory.

The balls are numbered and colored. Black ball has a custom behavior. Since they all have an identity, they are entities.

As for the billiard cue — it’s a bit trickier. If the cues have different size, and you prefer to play by your own one, it’s an entity. Although, if you are drunk and don’t care about the stick you are playing with, it’s a value object.

A small pieces of chalk is a value object — any one works. The adjustment triangle is a value object too.

The rules of pool is a domain service.

The kick, the chalk break, the balls setup — all these form a infrastructure service.

A wall shelf for the balls is a repository.

Using non-alphanumeric characters in Sitemap URLs

development, encoding, google, links, php Comments Off on Using non-alphanumeric characters in Sitemap URLs

This article in Google Help explains how to deal with special characters in Sitemaps that you can submit to Webmaster tools in order to increase the number of indexed pages of your website.

The main point is: the URLs must contain ASCII symbols only.

It can be done this way:

  • (obvious) ampersand, both quotes and <> symbols must be encoded,
  • Unicode symbols must be encoded, eg. ü must be converted to %FC sequence,
  • URLs that you submit must follow the  RFC-3986

If you use PHP, pay attention to one thing: it seems rawurlencode should be used instead of the usual urlencode since it’s follows the RFC-3986 as stated in PHP documentation.

Install Sphinx on Mac

development, mac, server, sphinx Comments Off on Install Sphinx on Mac

To install Sphinx search on Mac, I had to find pieces of information here and there.

Here is my list how to sort it out:

1. Download a fresh stable source from Sphinx website to your /tmp folder. It’s a tar.gz file.

2. Go to /tmp folder and run this command to untar the arhieve:

tar -zxvf sphinx-2.0.6-release.tar.gz

Fix the file name since the version can change.

3. Go into the “sphinx-2.0.6-release” folder.

4. Fix this command by my comments below and then run:

./configure --enable-id64  --prefix=/usr/local --with-mysql=/usr/local/mysql-5.1.63-osx10.6-x86/ LDFLAGS="-arch i386"
  • --enable-id64 mean you want to support really long integers as document IDs, depends on your application needs
  • --with-mysql – fix the path to the file

5. Run this command:

sudo make

Check the response, there must be no errors. Mine ends like this:

...[scary looking C commands]...
Making all in test
make[1]: Nothing to be done for `all'.
Making all in doc
All docs are already pre-built by developer.
If you want to rebuild them, install docbook-xsl
and xsltproc and then run 'make docs' instead of simple 'make'.
make[1]: Nothing to be done for `all-am'.

6. Run this:

sudo make install

7. Test if all is OK. If you run this, you must get a list of parameters of this tool:

indexer

Good luck!

GIS: parent-child auto-detection. Part I. Data.

db, development, geo, gis, mysql, postgres Comments Off on GIS: parent-child auto-detection. Part I. Data.

In our project, we had to solve a task of definition of administrative districts nesting.

The task was to organize the administrative centers in a clear hierarchy like a Russian doll (nested principle).

Examples could be:

  • Spain → Costa del Sol → Marbella
  • Ukraine → Crimea → South Coast → Yalta

There are touristic destinations in bold, not the political areas.

russian doll

Also we wanted to correct the nesting errors in the current database.

In this article I’ll show you how I solved this problem by using the KML-files of borders polygons and Postgres database + Postgis extension.

The geo data, which we used for our project, is not a commercial one  — it’s user generated open source, and full of mistakes. For example, the most frequent case — there were lots of cities assigned directly to a country and not belonging to any of its regions or areas; we call those the orphaned cities.

Plus, our business is tourist destinations, so that administrative and political fragmentation of countries is not always appropriate, and sometimes we have to  manually add the tourist regions. For example, such an administrative region like “Southern Coast of Crimea”. Although there is no such thing on a political map, that’s what people google for: “home on the South Coast” and not “a house in Yalta, Gaspra, Gursuf, somewhere there“.

The question is how to automatically find a parent for such administrative region (Ukraine for Crimea), and nest all its children (like the city of Yalta and Sudak in Southern Coast of Crimea area).

By the way: a country is made up of regions, regions consist of areas, the areas are filled with subareas, subareas contain cities. A russian doll, really.

To solve this problem it was decided to use the data of the boundaries of the regions, and then to compare parents borders against children borders.

Preparing data

The database of boundaries of regions can be found free on the Internet. I used the files for Google Earth — they are in KML format (XML with coordinates), or KMZ (zipped KML), since it is convenient to preview in Google Earth.

Automatical convert of KMZ  into KML did not work — the Unicode-characters in the names got broken, so I opened the KMZ-files in Google Earth and saved as KML manually. Yes-yes, “It is unworthy of excellent men to lose hours like slaves in thelabour of calculationwhich could safelybe relegated to anyone else if machines were used“, but if was just faster to re-save it manually rather than find a tool to automate.

The other way to get the boundaries file, especially if you need a custom region like the South Coast — is to draw it manually in Google Earth and save as a KML file.

The files I got had boundary data for every region of every country, but there were no relationships between parents and children. For example, one file was called “all regions of Spain” and the other was “all areas of Spain”, and no relation between the regions and areas. In fact, this is the challenge that we have to solve, and now we have all the data prepared to crack the nut.

In next chapter we will talk about the database platform and the data loading. Stay tuned.

(full article in Russian — http://habrahabr.ru/post/164997/)

Fix “Danish to go” player position

development Comments Off on Fix “Danish to go” player position

You can always have the player visible in 3 easy steps.

1. Create a bookmark of current page in your browser (Google Chrome in my example, but it doesn’t really matter).
2. Right-click on the bookmark and choose “Edit…” menu item.
3. Replace the bookmark URL with this snippet of code and click Save:

javascript: $('#playerViewer').css('position', 'fixed').css('top', '5px').css('right', '100px');

Looks like this:

Edit the bookmark

So now, when you start to listen to audio on the Danish to go page, click the bookmark, and the player will be shown in top right position of your browser, no matter how fast you scroll. Magic ;]

Problems with download in InternetExplorer

development, php Comments Off on Problems with download in InternetExplorer

Don’t forget to include this simple header, otherwise Internet Explorer users over SSL will get a weird error message that the download cannot be done:

header('Pragma: public');

It’s better to make a filename nice and simple, avoid slashes in it and don’t quote it — again in respect to IE users.

The best way to go is just copy the list of headers to implement the download of a fil from PHP documentation page.

Delete Your Code #8: Use the default values

db, delete your code, development, mysql Comments Off on Delete Your Code #8: Use the default values

Default values — how trivial this hint might seem…

Although, simple things make life easier. If you have a huge amount of SQL updates in your system, having correct default values helps to build smaller queries.

Also, when playing with data in your PhpMyAdmin, you can afford to focus on what is really important, not on repeating the boring staff. Every time when you have a big table and need to paste a couple of new records manually, you or the pear developer will be thankful that the majority the fields can be just ommited in order to have the proper and meaningful value.

It can be zero for numeric types, current date and time value for timestamp, a most frequent and meaningful value for the ENUM field or a “magic constant” of default value if you have to use one in your application.

Email address RFC

development, fun Comments Off on Email address RFC

It’s was entertaining to dig into email address format while working on a corresponding task.

Here are a few facts that I didn’t expect to be allowed for email address:

  • the local part of an email address can contain spaces, and it must be quoted and escaped by a back slash like “\ “
  • the local part of an email address can contain comments! It’s put in parentheses and can be omitted. Example: “john(comment).smith@example.com” equals to “john.smith@example.com
  • domain part can have IP instead of domain. To do that, it must be put in square braces like “john@[192.168.1.1]”

Here are examples of VALID email addresses:

  • '@[10.10.10.10]
  • user@[IPv6:2001:db8:1ff::a0b:dbd0]
  • "much.more\ unusual"@example.com
  • "very.unusual.@.unusual.com"@example.com
  • "very.(),:;<>[]\".VERY.\"very@\\\ \"very\".unusual"@strange.example.com
  • 0@a
  • !#$%&'*+-/=?^_`{}|~@example.org
  • "()<>[]:;@,\\\"!#$%&'*+-/=?^_`{}|\ \ \ \ \ ~\ \ \ \ \ \ \ ?\ \ \ \ \ \ \ \ \ \ \ \ ^_`{}|~.a"@example.org
  • ""@example.org
WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in