MySQL stored procedures debugging

development, mysql Comments Off on MySQL stored procedures debugging

It seems there are no built-in ways to debug user-defined functions or procedures in MySQL. So if the function behaves bad, it’s hard to find out why.

Here is what I do in order to trace a function. Yes, it’s a bit ugly, but better than nothing.

1. Run this in your MySQL command line (or PhpMySQL with // as delimiter, there is a field for it below the SQL window).

DELIMITER //

DROP PROCEDURE IF EXISTS Debug; //
CREATE PROCEDURE Debug(Message TEXT)
BEGIN
    CREATE TABLE IF NOT EXISTS _debug (
        `id` int(10) unsigned NOT NULL auto_increment,
        `msg` TEXT DEFAULT NULL,
        `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY  (`id`)
    );
    INSERT INTO _debug(`msg`)  VALUES(Message);
END; //

DROP PROCEDURE IF EXISTS ClearDebugMessages; //
CREATE PROCEDURE ClearDebugMessages()
BEGIN
    TRUNCATE TABLE _debug;
END; //

2. In your function, this can be made:

CALL Debug('Debug message goes here');

3. There gonna be _debug table in your database that contains debug messages with date of creation of each.

4. If you want to clear all the debug messages, make a call:

CALL ClearDebugMessages();

Alternative is to truncate the _debug table with PhpMyAdmin.

MySQL random function

db, development, mysql Comments Off on MySQL random function

MySQL RAND() function returns values between 0 and 1. At times you need a random value between A and B, then you can use this function:

DELIMITER //
DROP  FUNCTION IF EXISTS RANDOM;//
CREATE FUNCTION RANDOM(minimum INT, maximum INT)
    RETURNS INT NO SQL NOT DETERMINISTIC
    COMMENT 'integer random value in the bounds given'
RETURN minimum + 
  ROUND(1000000 * RAND() % (maximum-minimum) );//
DELIMITER ;

You can call it like this to get a random integer value between 0 and 100:

SELECT RANDOM(0, 100);

Major problems with geo in MySQL

boo, geo, mysql, postgres Comments Off on Major problems with geo in MySQL

Many development projects are started with MySQL on board — it’s free, stable and scalable.

At some point your project might need Geo features (a.k.a. “spatial calculations“): a great example would be to count a distance from where your user is to closest airport, Eiffel tower or hotel.

If you’re at this point, you are in a gray zone. Think twice if you want to stay with MySQL. You can save lots of hours and rude words if you make a right decision now.

Reason is simple: Geo calculations in MySQL are not implemented the way they should. OK, ok, count a distance between 2 points is an easy task and can be solved by a single function, but anything more complex just don’t work. For example, distinguishing if a point is inside a boundary, or calculation of two boundaries overlapping square.

That’s what MySQL website says:

MySQL originally implemented these functions such that they used object bounding rectangles and returned the same result as the corresponding MBR-based functions.

What does it mean? Look at this picture of an area in Egypt called Al Jizah.

Al Jizah

The shape is quite complex, but MySQL cannot process it, so instead of this a Minimal Bounding Rectangle is used — which is exactly the rectangle shape of the picture itself. I cannot find exact words to express how error-prone that is.

You can try to reinvent a wheel and write your own functions (like I did…), but they are very slow — iterating through 4 Kb of points of a polygon takes 1-2 seconds, so if you have hundreds of polygons to compare — say bye-bye to the product performance.

This is fixed only in versions after 5.6.1.

What to do? Use Postgres + special spatial extension called PostGis. It’s super-fast, works in multiple dimensions and does it RIGHT.

Database choice

complain, db, mysql, postgres Comments Off on Database choice

If you consider a database engine for the new project between MySQL and Postgres, choose Postgres.

Reasons are simple.

1. Postgres has stricter datatypes. If a field is integer, you cannot assign a string value to it. In the long run it makes the database data less error-prone. Frameworks+ORMs will make the code transparent anyway.

2. It supports JSON data type.

3. Postgres supports Geo calculations, why MySQL has a very limited support. Really. Many services and websites need geo calculations to count distances, proximity, etc. MySQL supports the most simple operations, but any sophisticated geo math is just not implemented in MySQL.

GIS: parent-child auto-detection. Part I. Data.

db, development, geo, gis, mysql, postgres Comments Off on GIS: parent-child auto-detection. Part I. Data.

In our project, we had to solve a task of definition of administrative districts nesting.

The task was to organize the administrative centers in a clear hierarchy like a Russian doll (nested principle).

Examples could be:

  • Spain → Costa del Sol → Marbella
  • Ukraine → Crimea → South Coast → Yalta

There are touristic destinations in bold, not the political areas.

russian doll

Also we wanted to correct the nesting errors in the current database.

In this article I’ll show you how I solved this problem by using the KML-files of borders polygons and Postgres database + Postgis extension.

The geo data, which we used for our project, is not a commercial one  — it’s user generated open source, and full of mistakes. For example, the most frequent case — there were lots of cities assigned directly to a country and not belonging to any of its regions or areas; we call those the orphaned cities.

Plus, our business is tourist destinations, so that administrative and political fragmentation of countries is not always appropriate, and sometimes we have to  manually add the tourist regions. For example, such an administrative region like “Southern Coast of Crimea”. Although there is no such thing on a political map, that’s what people google for: “home on the South Coast” and not “a house in Yalta, Gaspra, Gursuf, somewhere there“.

The question is how to automatically find a parent for such administrative region (Ukraine for Crimea), and nest all its children (like the city of Yalta and Sudak in Southern Coast of Crimea area).

By the way: a country is made up of regions, regions consist of areas, the areas are filled with subareas, subareas contain cities. A russian doll, really.

To solve this problem it was decided to use the data of the boundaries of the regions, and then to compare parents borders against children borders.

Preparing data

The database of boundaries of regions can be found free on the Internet. I used the files for Google Earth — they are in KML format (XML with coordinates), or KMZ (zipped KML), since it is convenient to preview in Google Earth.

Automatical convert of KMZ  into KML did not work — the Unicode-characters in the names got broken, so I opened the KMZ-files in Google Earth and saved as KML manually. Yes-yes, “It is unworthy of excellent men to lose hours like slaves in thelabour of calculationwhich could safelybe relegated to anyone else if machines were used“, but if was just faster to re-save it manually rather than find a tool to automate.

The other way to get the boundaries file, especially if you need a custom region like the South Coast — is to draw it manually in Google Earth and save as a KML file.

The files I got had boundary data for every region of every country, but there were no relationships between parents and children. For example, one file was called “all regions of Spain” and the other was “all areas of Spain”, and no relation between the regions and areas. In fact, this is the challenge that we have to solve, and now we have all the data prepared to crack the nut.

In next chapter we will talk about the database platform and the data loading. Stay tuned.

(full article in Russian — http://habrahabr.ru/post/164997/)

Delete Your Code #8: Use the default values

db, delete your code, development, mysql Comments Off on Delete Your Code #8: Use the default values

Default values — how trivial this hint might seem…

Although, simple things make life easier. If you have a huge amount of SQL updates in your system, having correct default values helps to build smaller queries.

Also, when playing with data in your PhpMyAdmin, you can afford to focus on what is really important, not on repeating the boring staff. Every time when you have a big table and need to paste a couple of new records manually, you or the pear developer will be thankful that the majority the fields can be just ommited in order to have the proper and meaningful value.

It can be zero for numeric types, current date and time value for timestamp, a most frequent and meaningful value for the ENUM field or a “magic constant” of default value if you have to use one in your application.

Delete Your Code #7: Right encoding

db, delete your code, development, encoding, mysql, php, server Comments Off on Delete Your Code #7: Right encoding

You might be surprised, but a right choise of the project text encoding can affect the project file size and amount of bugs.

To avoid bugs of wrong presentation of text on your page, make sure that all database entities and the application server (PHP) use the same encoding. That helps to forget about issues connected with text presentation.

On database side, make sure you set the correct encoding to:

  • database,
  • tables,
  • columns,
  • import-export tools parameters (a big source of wrong encoding bugs),
  • corresponding SQL server variables

On application server it’s usually just one query –

SET NAMES utf8

Pay attention, that utf8 might be not the best choise for your project: every non-English character needs 2-6 bytes of memory, so if you built a one-language (local) project with lots of database data, consider using a 1 byte encoding like windows-1251 and save about half of the space on server file system.

MySQL Proxy

db, development, mysql Comments Off on MySQL Proxy

At times it’s necessary to know which queries the server runs right now.

Of course, you can try this statement:

SHOW PROCESSLIST

but if you need a more poweful tool, try MySQL Proxy.

Although it’s still an alpha version, it’s a handy tool to monitor, filter and manipulate your queries since there is Lua scripting language interpreter supported which is quite obvious in usage.

The MySQL Proxy (as can be understood from its title) can help you to see communication between a few MySQL servers and a few clients.

My usage pattern is that the Proxy listens to 4040 port on my local machine, and if I change the port setting in connection string of any application (my job project or even PhpMyAdmin), I can see the queries logged in a console window.

You can track lots of characteristics of queries — just take a look at example Lua scripts in the package that you’ve downloaded.

Bulk SQL loading

db, development, ideas, mysql Comments Off on Bulk SQL loading

If you want to load a list of SQL files into your database on Windows, you can create a .cmd file with such content and run it.
Place it in the same folder where your .sql files are.

@echo off
SET DATABASE=my_database
SET USER=root
SET PASS=1
SET FORMAT=*.sql

FOR /F "usebackq" %%i IN (`dir /on /b %FORMAT%`) DO echo %%i && mysql --user=%USER% --password=%PASS% --default-character-set=utf8 %DATABASE% < %%i

Make sure, mysql command can be run. If no, either set the full path to mysql command tool, or add the path to PATH environment variable.

Reverse geocoding in your app

db, development, mysql 1 Comment »

Reverse geocoding is a process of getting toponym name (city name) by its coordinates.

To make this, you can use a remote service like I have already described, or implement it yourself.

To do so you need two things:

  1. A database of cities with coordinates (having a population value at least for major cities is a big plus)
  2. A method to find a closest city

#1 is left for you, let’s talk about #2.

In the previous post I introduced the DISTANCE function (SQL) to count a distance between 2 points by coordinates. Let’s use it to find the biggest city closest to a given point:

DELIMITER //
DROP FUNCTION IF EXISTS REVERSE_GEOCODE;//
CREATE FUNCTION REVERSE_GEOCODE( latitude DOUBLE, longitude DOUBLE, radius DOUBLE ) 
    RETURNS DOUBLE READS SQL DATA DETERMINISTIC
    COMMENT 'Finds a city by coordinates given. Returns City ID.'
BEGIN
    DECLARE cityID INT(10) DEFAULT 0;
    DECLARE d DOUBLE DEFAULT 0; /* useless var */
    
    SELECT id, DISTANCE(`lat`, `lng`, latitude, longitude) AS dist
    INTO cityID, d
    FROM `geo_cities`
    WHERE lat BETWEEN FLOOR(latitude - 0.5) 
                  AND CEIL(latitude  + 0.5)
      AND lng BETWEEN FLOOR(longitude- 0.5)
                  AND CEIL(longitude + 0.5)
    HAVING dist < radius
    ORDER BY IF(population > 0, population, 1) * (1/IF(dist > 0, dist, 1)) DESC
    LIMIT 1;
    
    RETURN cityID;
END; //
DELIMITER ;

Well, this function orders cities by (population and distance) function — so that the biggest closest city is returned.

WHERE clause is supposed to shorten the number of cities which are looked through; otherwise we would have to search through whole world’s cities. In my solution the search is limited by a reasonably small rectangle.

In the next post I’ll tell you how to find a biggest city depending on current Google Maps zoom, so that it won’t give you a small city near Paris if you click on France at a world-level map view.

SQL function to count distance between two points

db, development, mysql 4 Comments »

Useful function.

DELIMITER //

DROP FUNCTION IF EXISTS DISTANCE; //

CREATE FUNCTION DISTANCE( lat1 DOUBLE, lon1 DOUBLE, lat2 DOUBLE, lon2 DOUBLE ) 
    RETURNS DOUBLE NO SQL DETERMINISTIC
    COMMENT 'counts distance (km) between 2 points on Earth surface'
BEGIN
    DECLARE dtor DOUBLE DEFAULT 57.295800;

    RETURN (6371 * acos(sin( lat1/dtor) * sin(lat2/dtor) +
        cos(lat1/dtor) * cos(lat2/dtor) * 
        cos(lon2/dtor - lon1/dtor)));
END; //

DELIMITER ;

In next post I will explain how I used this function to implement a reverse geocoding — it’s useful when you need to find a closest city by given coordinates.

Preferrable sorting in MySQL

db, development, mysql Comments Off on Preferrable sorting in MySQL

At times, some records in a recordset are more important than others — e.g. a featured product in a products list. In such a case you can make 2 requests to get the featured item first, then to load other N-1 items, and then display them.

If you want to make it by one move, you can use a feature of preferrable sorting — you can define what value and in which field goes first if such value is present.

For instance, you want to show an item with item_id=15 first:

SELECT * 
FROM items 
ORDER BY item_id IN (15) DESC

Which returns something like (featured item goes first):

+----------+------------+
| item_id  | name       |
+----------+------------+
| 15       | Product 15 |
| 32       | Product 32 |
| 31       | Product 31 |
| 30       | Product 30 |
| 29       | Product 29 |
| ...      | ...        |
+----------+------------+

Convert latin1 to utf8

db, development, mysql Comments Off on Convert latin1 to utf8

If you import unicode text into latin1 database column, the symbols would be screwed up — russian symbols become a shit like “залоговый депозит”.

To convert such quickly (as a test) you can use Lebedev’s convertor.

To convert the whole table do the following (thanks to dull.ru):

  1. The most important step: dump such data in a file with mysqldump
    mysqldump -u user -p --default-character-set=latin1 --skip-set-charset --no-create-info --extended-insert --complete-insert dbname table > dbname.sql

    If not the whole table data is bad, create another table (CREATE TABLE t2 LIKE t1) and copy wrong rows to the new table.

  2. Replace latin1 by utf8 in the file
    This can be done manually in your editor or by this command:

    sed -r 's/latin1/utf8/g' dbname.sql > dbname_utf.sql
  3. Convert your latin1-table to utf8 (and maybe truncate it):
    ALTER TABLE `table` CONVERT TO CHARACTER SET 'utf8';
  4. Import the utf8 data back to the table:
    mysql -u user -p --default-character-set=utf8 dbname < dbname_utf.sql

That’s it.

WP Theme & Icons by N.Design Studio
Entries RSS Comments RSS Log in