Grab a list of emails from a website with paging


A one-liner to grab a list of emails.

wget -q -O - "http://server.com/?page="{1..42} | grep -ioE '\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b' | sort -uf > emails.txt

Just replace the page URL and set the first and last page numbers: the {1..42} part means pages 1 through 42. Check how the site builds its paging URLs to form the right final URL.

The sorted results end up in the emails.txt file.

Yes, no phone numbers or first/last names are parsed. It is a fast and easy solution.
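
For readers who prefer a script to a shell pipeline, the same idea can be sketched in Python. This is only an illustration: the function names are mine, the regular expression is copied from the grep call, and the URL template with a {page} placeholder is an assumption about how the target site pages its results.

```python
import re
import urllib.request

# Same pattern as the grep call in the one-liner, matched case-insensitively.
EMAIL_RE = re.compile(r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)


def extract_emails(text):
    """Return the unique addresses found in text, lowercased and sorted."""
    return sorted({match.lower() for match in EMAIL_RE.findall(text)})


def grab_emails(url_template, first_page, last_page):
    """Download every page in the range and collect the addresses found."""
    found = set()
    for page in range(first_page, last_page + 1):
        url = url_template.format(page=page)
        with urllib.request.urlopen(url) as response:
            html = response.read().decode("utf-8", errors="replace")
        found.update(extract_emails(html))
    return sorted(found)
```

Calling grab_emails("http://server.com/?page={page}", 1, 42) would then cover the same pages as the one-liner; server.com is the placeholder host from the post.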

How to earn $50 on a bookmark


I’m going to tell the story of how I sold a browser bookmark for $50.

A client contacted me. He wanted a Firefox add-on that would do a simple but important thing: he needed to see contact details on a website, and to do so he had to click multiple “Show contact details” buttons. Each button loaded the contact details with an AJAX call.

So he needed a way to click all of these buttons on the same page at once.

The second requirement was to make the feature password protected. Once entered, the password would be “cached” for a long time. The client wanted this so that other people couldn’t reuse his code, and he knew that those users were not technical.

So, the budget was $50.

I told him that I can make a solution that would work not only in Firefox, but in any browser. He agreed.

So I created a bookmark and edited the URL.

As you may know, a bookmark’s URL can hold JavaScript code instead of a regular HTTP link.

So instead of “http://server.com/my/link” you can type “javascript:alert('this is a message');”.

[Screenshot: the $50 bookmark]

Clicking such a bookmark shows a JavaScript alert.

That’s what I used. The password was requested with the prompt() function and saved to a cookie. Then every button with a given title was clicked, one after another.

The client is happy. $50 for a bookmark.

Delete Your Code #4: Ambiguous terms


Every application operates in a knowledge domain.

A file browser works with files, directories and drives. Library software deals with books and editions. A geo tool must be aware of coordinates.

The idea is that developers can save time and lines of code if they agree on the terms of the knowledge domain and stick to them strictly.

Of course, all of us use some terms. The rule is to use a well-defined set of terms and to avoid synonyms.

Examples of violations of this rule:

  • “folder” vs. “directory”;
  • “date” or “timestamp”;
  • “partner” a.k.a. “affiliate”;
  • “DNS Provider” as “product supplier”;
  • short for “longitude” is “lon” or “lng”?
  • does “website address” differ from “URL”?
  • “region”, “area” and “subarea” — in contrast to ADM1, ADM2 and ADM3;
  • the application I deal with at my job has: “brand”, “model”, “market” and “platform”, which are also often called “manufacturer”, “vehicle”, “locale” and “client” respectively.

If you apply this rule, you will not waste time creating multiple mapper classes. More reuse, more OOP inheritance, more rapport. More motivation to work on a solid, well-defined project with people who speak the same language as you. Isn’t this nice?

Delete Your Code #2: less OOP visibility keywords


Hey! I’m going to describe the tricks I use to write less code. Why is it important to have less code, you ask? Less code means fewer bugs, less support, and less wasted developer brainpower.

Today’s trick is extremely simple: when you have a long list of public, protected or private class properties, drop the visibility keyword from each declaration and state it once, followed by a comma-separated list of the properties.

The same applies to class constants.

Example:

// Before:
class Cat {
    const KINGDOM = 'Animalia';
    const PHYLUM  = 'Chordata';
    const FAMILY  = 'Felidae';

    public $tail;
    public $whisker = '\/';
    public $head;
    public $legs = array(1,2,3,4);
}

// After:
class Cat {
    const
        KINGDOM = 'Animalia',
        PHYLUM  = 'Chordata',
        FAMILY  = 'Felidae';

    public
        $tail,
        $whisker  = '\/',
        $head,
        $legs = array(1,2,3,4);
}

One benefit is that the code looks cleaner and is much easier to scan than to read.

The second benefit shows when you need to change a property’s visibility: you don’t have to edit the keyword next to the property name (an error-prone process when you are tired). You just move the line up or down, which usually has a shortcut in an IDE.

A known disadvantage is that most documentation generators don’t support this feature.

6 points against writing an own CMS


Once upon a time I was asked at my job to work on the company website. I could choose among existing CMSs to drive the website, or I could think about implementing something of my own. But you know what? I didn’t feel comfortable starting a CMS from scratch. Now I know exactly why.

I found this nice article called Homebrew CMS (by Seth Gottlieb), which lists six aspects of a modern content management system that you as an architect should consider before you start:

  1. Versioning: it makes the data model much more sophisticated;
  2. Localization: should images be translated, what is your default language, etc.;
  3. Preview: the content being previewed must be connected with the whole website and not be visible to others;
  4. Deployment: dependency management, such as images that are pasted into the content;
  5. Usability: probably the most common reason why companies abandon their home-grown CMS;
  6. Access control: your system must control access not only to functions, but also to the data.

All in all, an open source free CMS might be the best choice for you too. Spare your energy for something else.

Web application check list: geo data


Things you will likely want to check in your application. The first story is about its geo and localization features.

  1. Detect the browser languages, find the first one your application supports and set it as the application’s current language. If none is supported, auto-detect the country first (see below) and then use that country’s main language.
  2. Auto-detect the country: first by IP (a free database of country IP addresses can be downloaded here), then by browser locale (en-US means a user from the USA), then by the auto-detected language (ru will mean Russia. Yes, the user may be located in Ukraine or Belarus rather than Russia, but let’s face it, a close guess is much better than nothing).
  3. Once you have detected the country, you can preset the currency for the user.
  4. Save this data in the user’s profile when he/she registers in your application. If your application cares about other languages the user can speak, you can parse the rest of the browser languages and save them too.
  5. If entities of your application can be located on a map, set geo meta tags to tell search engines where each entity is (success story). Find out more about geotagging on Wikipedia.

A list of countries with the supported languages and default currency code for each one can be obtained from GeoNames (countries info dump).
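
The browser-side part of steps 1 and 2 can be sketched in Python as follows. It is a rough illustration under my own assumptions: the function names are invented, the IP lookup is left out because it needs an external database, and the language-to-country table is something you would supply yourself (for example from the GeoNames dump).

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into language tags, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Higher quality first; ties keep the order used in the header.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, supported, country_language=None):
    """Return the first supported browser language, else the country's language."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0].lower()
        if primary in supported:
            return primary
    return country_language


def guess_country(header, language_to_country):
    """Guess the country from a locale such as en-US, then from the bare language."""
    tags = parse_accept_language(header)
    for tag in tags:
        for subtag in tag.split("-")[1:]:
            if len(subtag) == 2 and subtag.isalpha():
                return subtag.upper()
    for tag in tags:
        country = language_to_country.get(tag.split("-")[0].lower())
        if country:
            return country
    return None
```

In a real application the IP-based guess would come first, and guess_country would only be the fallback when the IP database has no answer.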

Tech improvements would be funny to have

  • a password field with auto-suggest feature
  • a message telling you that ‘Such password is already taken‘ (especially indicating by which user)
  • an authentication system that recognizes you not by the username and password you’ve typed, but by the speed of typing and the delays between characters
  • case-sensitive domain names
  • a hyperlink that opens multiple windows when clicked
  • a page advert analyzing your face via webcam and rotating banners when you blink
  • porn sites that capture the video of you via your webcam while you surf them (maybe it exists already? beware!)

Open source project as a CV


I think it’s a nice idea to contribute to an open source project, if only to highlight this fact in your CV/resume when you are looking for a job.

Benefits:

  • you show your level of commitment to something
  • money is not the only motivator for you
  • the code you wrote is publicly visible: you don’t need to explain which major design patterns you know, which technologies you are familiar with and how well you usually polish your code; all of that can be seen

I think it can be compared with marriage. Remember that Alec Baldwin quote from the movie “The Departed”?

Marriage is an important part of getting ahead: lets people know you’re not a homo; married guy seems more stable; people see the ring, they think at least somebody can stand the son of a bitch; ladies see the ring, they know immediately you must have some cash or your cock must work.

10 reasons not to use Assembla


Although Assembla is one of the best ticket systems, I hate it at times. The main reason is that their team plays with fonts and colors, changing the site’s appearance, while ignoring the major issues.

Here is the list of things that irritate me the most:

  1. When you create a ticket, the editor doesn’t work properly. For example, if you highlight a URL intending to mark it as a link, it still asks you for the URL in a prompt (a month ago it would even just replace it with default markup text like “[[http://server.com | URL TEXT]]”)
  2. The same goes for the wiki, which is based on TinyMCE: it fails so often that 50% of the time I have to open the HTML source of the content and fix things manually
  3. Affiliate system: it pays you back only if you have referred 3+ new users. I have just 2 leads, and I lose $80 every month…
  4. Some time ago they had an offer to blog about them and get a $5 reward. I wrote a post, and their marketing guy Ryan told me that they would pay me only after a referral paid a bill, which was not mentioned in the offer.
  5. If you want to export ticket data (for example, to compute some stats on development performance), you will have to deal with their internal semi-JSON format with no documentation (only not-so-self-explanatory field names)
  6. Previously the project space URL contained the project name; now it uses an unreadable hash. If you work with more than one project, you will feel how unhandy that is
  7. There are 3 (three!) different SVN repository types (SVN+Trac, External SVN, Source/SVN) in Tools, which is confusing. Each one has its own set of features: you have to choose what is more vital for you, you cannot have it all.
  8. The current filter choice is lost if you leave the Assembla page for 5-10 minutes, and you have to select it all over again. The choice is saved in an HttpOnly cookie (which means you cannot change it with JavaScript). Damn, why?
  9. They don’t show total hours for tickets, no matter how you group them. It’s one of the most important things for developers, guys.
  10. If you just want to get the content of the project, you cannot. I mean not an SVN checkout, but a kind of SVN export: downloading all folders/files without an SVN client. They had a button that allowed you to download the project as a ZIP archive, but now it’s not there. I have a few PHP projects which many people download and use, and they often complain about this missing feature.

All in all, Assembla still has a long way to go to maturity.

UPDATE [Feb 7, 2012]: previously, when you clicked a milestone name on a ticket page, you went to the milestone and could see all its tickets. Now you get the ugly built-in JavaScript prompt to rename the milestone. WAT?!%$??

Bulk SQL loading


If you want to load a set of SQL files into your database on Windows, you can create a .cmd file with the following content and run it.
Place it in the same folder as your .sql files.

@echo off
SET DATABASE=my_database
SET USER=root
SET PASS=1
SET FORMAT=*.sql

FOR /F "usebackq delims=" %%i IN (`dir /on /b %FORMAT%`) DO echo %%i && mysql --user=%USER% --password=%PASS% --default-character-set=utf8 %DATABASE% < "%%i"

Make sure the mysql command can be run. If it cannot, either use the full path to the mysql command-line tool or add its folder to the PATH environment variable.
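
If you are not on Windows, the same loop could be sketched in Python. This is an assumption-laden equivalent rather than a drop-in replacement: the function names are mine, and it expects the mysql client to be reachable through PATH, just like the .cmd script.

```python
import glob
import os
import subprocess


def build_command(database, user, password):
    """Build the mysql client call used for every file (same options as the .cmd script)."""
    return [
        "mysql",
        f"--user={user}",
        f"--password={password}",
        "--default-character-set=utf8",
        database,
    ]


def load_sql_files(folder, database, user, password, pattern="*.sql"):
    """Feed every matching file to mysql in name order and return the files loaded."""
    files = sorted(glob.glob(os.path.join(folder, pattern)), key=str.lower)
    for path in files:
        print(os.path.basename(path))
        with open(path, "rb") as sql_file:
            # check=True stops the run at the first file mysql rejects.
            subprocess.run(
                build_command(database, user, password),
                stdin=sql_file,
                check=True,
            )
    return files
```

Calling load_sql_files(".", "my_database", "root", "1") mirrors the values set at the top of the script.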

How to rate Points Of Interest automatically


During my work on sunnyrentals.com I got the task of adding the 2 closest airports to every property an owner creates.

I implemented it quite fast since we have a database of airports with coordinates.

While playing with this feature, we found out that it was not good enough: the real task was to add the 2 closest and biggest airports. The problem is that our airports DB has no data from which to guess how big or famous a particular airport is.

So we need to rate every airport somehow…

The solution we found was simple: google the airport name and take the number of search results. The count can be treated as the rating: London Heathrow airport has 2.33 million results while Kiev Zhulyany airport has only 0.77 million, which looks fair.

Several things to pay attention to:

  • the query we formed was [city name] + [airport name] + ‘ airport’
  • if this query gives zero results, I omit the city name; at times it helps
  • we put the query in quotes to google the exact phrase, otherwise London City airport gets the highest rating because “city” is a general term
  • if the airport name includes the city name (Melbourne Intl), we omit the city name: “Melbourne Intl airport” is better than “Melbourne Melbourne Intl airport”
  • in addition to the previous point, if the airport name sounds like the city name, we omit the city name as well. Example: Narsarsuaq airport in the city of Narssarssuaq. I used the soundex function for this comparison; it is available in both PHP and MySQL.
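
The query-building rules above can be sketched in Python like this. Treat it as an illustration only: Python has no built-in soundex, so the helper below is my own simplified version of the classic algorithm and may differ from the PHP and MySQL functions in edge cases, and build_query with its include_city switch is a hypothetical name and signature.

```python
_CODES = {}
for _digit, _letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for _letter in _letters:
        _CODES[_letter] = str(_digit)


def soundex(word):
    """Simplified classic Soundex: first letter plus three digits."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for c in letters[1:]:
        code = _CODES.get(c, "")
        if code and code != previous:
            result += code
        if c not in "HW":  # H and W do not separate equal codes
            previous = code
    return (result + "000")[:4]


def build_query(city, airport, include_city=True):
    """Form the exact-phrase query: "[city] [airport] airport"."""
    city_in_name = city.lower() in airport.lower()
    sounds_alike = any(soundex(word) == soundex(city) for word in airport.split())
    parts = []
    if include_city and city and not (city_in_name or sounds_alike):
        parts.append(city)
    parts.extend([airport, "airport"])
    return '"' + " ".join(parts) + '"'
```

When the first query returns zero results, calling build_query again with include_city=False gives the fallback query without the city name.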

To get the google results you can use the Google Search API:

$queryTemplate = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=%s';
$airportQuery = '"London Heathrow airport"';
$query = sprintf( $queryTemplate, urlencode( $airportQuery ) );
$json = json_decode( file_get_contents( $query ), 1 );
$rating = (int)$json['responseData']['cursor']['estimatedResultCount'];

Currency exchange in your application


That’s easy, you need 2 things:

  1. Fresh currency exchange rates
  2. A way to convert an amount from one currency to another.

This is how I did it: I took the rates from the European Central Bank (ECB) for step #1 and wrote a MySQL stored function for step #2.

Here is how to import currency rates from the ECB (EUR is the base currency, and I add its own rate as 1:1). First I create this database table:

CREATE TABLE IF NOT EXISTS `currency` (
  `code` char(3) NOT NULL DEFAULT '',
  `rate` decimal(10,5) NOT NULL COMMENT 'Rate to EUR got from www.ecb.int',
  PRIMARY KEY (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Currency rates (regularly updated)';

Now let’s fill it with rates:

<?php

class CurrencyController extends Controller_Ajax_Action {

  public function importAction() {
  
    $db = Zend_Registry::get('db');
    $db->beginTransaction();
    
    $url = 'http://www.ecb.int/stats/eurofxref/eurofxref.zip?1c7a343768baab4322620e3498553b5a';
    try {
      $contents = file_get_contents($url);
      $contents = archive::unzip($contents);
      $contents = explode("\n", $contents);
      
      $names = explode(',', $contents[0]);
      $rates = explode(',', $contents[1]);
      
      $names[] = 'EUR';
      $rates[] = 1;
    
      for ($i = 1; $i < sizeof($names); $i++) {
        if (!(float) $rates[$i]) continue;
        $db->query( sprintf('INSERT INTO `currency`(`code`, `rate`)
              VALUES ("%s", %10.5f)
              ON DUPLICATE KEY UPDATE `rate`=VALUES(`rate`)', 
             trim( $names[$i] ), 
             trim( $rates[$i] )
        ) );
      }
      
      $db->commit();
    } catch ( Exception $O_o ) {
      error_log( $O_o->getMessage() );
      $db->rollback();
    }
    
  }
}
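
The parsing step of the import can be tried without a database. The sketch below assumes, as the PHP code does, that the unzipped file has a header line of currency codes and a second line of rates, with the date in the first column; the function name and the sample figures in the usage note are made up for illustration.

```python
def parse_rates(header_line, values_line):
    """Turn the two CSV lines into a {code: rate} dict with EUR added as 1:1."""
    names = [name.strip() for name in header_line.split(",")]
    values = [value.strip() for value in values_line.split(",")]
    rates = {"EUR": 1.0}
    # Skip the first column, which holds the date rather than a currency.
    for name, value in list(zip(names, values))[1:]:
        try:
            rate = float(value)
        except ValueError:
            continue  # empty trailing column or a missing rate
        if name and rate:
            rates[name] = rate
    return rates
```

For example, parse_rates("Date, USD, JPY, ", "01 January 2000, 1.25, 130.5, ") yields EUR, USD and JPY entries; those numbers are invented, not real ECB rates.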

Now let’s create an SQL function for handy conversions. Create a udf.sql file with this content:

DELIMITER //

DROP FUNCTION IF EXISTS EXCHANGE //
CREATE FUNCTION EXCHANGE( amount DOUBLE, cFrom CHAR(3), cTo CHAR(3) ) RETURNS DOUBLE READS SQL DATA DETERMINISTIC
    COMMENT 'converts money amount from one currency to another'
BEGIN
    -- NULL defaults let the ISNULL check below catch an unknown currency code
    DECLARE rateFrom DOUBLE DEFAULT NULL;
    DECLARE rateTo DOUBLE DEFAULT NULL;
    
    
    SELECT `rate` INTO rateFrom FROM `currency` WHERE `code` = cFrom;
    SELECT `rate` INTO rateTo   FROM `currency` WHERE `code` = cTo;
    
    IF ISNULL( rateFrom ) OR ISNULL( rateTo ) THEN
        RETURN NULL;
    END IF;
    
    RETURN amount * rateTo / rateFrom;
END; //

DELIMITER ;

and run this command in your shell:

mysql --user=USER --password=PASS DATABASE < udf.sql

This is how you can use the function, for example to convert 10 US dollars to Canadian dollars:

SELECT EXCHANGE( 10, 'USD', 'CAD')

which gives 10.93 Canadian dollars for 10 US dollars.

P.S. Consider adding a call to the currency import action to your cron scripts.

P.P.S. A function to unzip the data file can be found at php.net
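
The arithmetic inside EXCHANGE is easy to check outside the database. Here is a minimal Python sketch of the same formula, amount * rateTo / rateFrom, with every rate expressed relative to EUR; the rates in the usage note are invented for the example and are not real ECB figures.

```python
def exchange(amount, currency_from, currency_to, rates):
    """Convert amount between two currencies using rates relative to EUR."""
    rate_from = rates.get(currency_from)
    rate_to = rates.get(currency_to)
    if not rate_from or rate_to is None:
        return None  # unknown currency, same outcome as the SQL function
    return amount * rate_to / rate_from
```

With rates of {"EUR": 1.0, "USD": 1.25, "CAD": 1.5}, exchange(10, "USD", "CAD", rates) divides by the USD rate to get euros and multiplies by the CAD rate, giving 12.0.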

Ajax controller in Zend project


I want to share a couple of features I use to handle AJAX requests in projects based on Zend Framework.

1. AJAX request handling

What: some parts of your application need not be loaded if the current request is an AJAX one.

Why: you don’t need views, templates or some of the routes, so you can add an AJAX check to your Initializer or bootstrap file and skip loading unnecessary things.

How: the Zend Request object has an isXmlHttpRequest method that tells whether the request is an AJAX one. It is based on the ‘X-Requested-With’ header, which is sent by the jQuery, Prototype, Scriptaculous, YUI and MochiKit frameworks.

2. AJAX Controller

Most AJAX controller methods I have seen had an exit() inside to keep Zend’s template from being output, which is a workaround. The proper way is to tell Zend not to render anything. One step further is to create an abstract controller class and inherit all your AJAX classes from it:
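
Outside Zend, the same header check is only a few lines. The Python sketch below shows the idea and is not Zend's implementation: the function name is mine, and it compares the value case-insensitively, which is my own choice.

```python
def is_ajax_request(headers):
    """Report whether the X-Requested-With header marks the request as AJAX."""
    for name, value in headers.items():
        if name.lower() == "x-requested-with":
            return value.strip().lower() == "xmlhttprequest"
    return False
```

Keep in mind that a client can send or omit this header at will, so the check is good for skipping unnecessary work but not for security decisions.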

/library/Koodix/Controller/Ajax/Action.php:

<?php

require_once 'Zend/Controller/Action.php';

abstract class Koodix_Controller_Ajax_Action 
  extends Zend_Controller_Action 
{
    // data to be sent back to the client as JSON
    public $json;

    public function init() {
        // disable the standard layout and view output
        $this->_helper->layout()->disableLayout();
        $this->_helper->viewRenderer->setNoRender();
    }

    public function postDispatch() {
        // encode the json field and output it
        if( !empty( $this->json ) ) {
            echo json_encode( $this->json );
        }
    }
}

/application/modules/default/controllers/AjaxController.php:

<?php

class AjaxController extends Koodix_Controller_Ajax_Action 
{
 // bla-bla-bla
}

Take a look at the postDispatch method: the idea behind it is to convert anything assigned to the json field of your controller to JSON and output it. If you want to send the JSON data in a special header (and not in the body, as in my example), you can do that in this method.

Flash uploader and HTTP password protection


You are working on a project and you want to protect the beta version with a password so that only allowed people (beta testers) can access it.

You decide not to reinvent the wheel and to use standard HTTP authentication.

The first idea is to let your Apache web server do this, so you write something like this in the .htaccess file:

AuthName "Private zone"
AuthType Basic
AuthUserFile /path/to/.htpasswd
Require valid-user

This solution is simple, and that is why it is good.

A problem arises when a Flash file uploader is added to your project: usually it cannot “log in” to your site, i.e. users are not able to use the Flash file uploader behind the beta login.

This is how I solved it.

It’s not the web server (Apache) that must solve this, it’s the application (PHP). So remove the lines above from .htaccess and use Zend_Auth_Adapter_Http instead; it is Zend’s HTTP authentication adapter.

As for the Flash uploader: it sends ‘Shockwave Flash’ as the value of the ‘User-Agent’ request header. So in your Initializer or Bootstrap file (where you load Zend_Auth_Adapter_Http) check this header value, and if it’s not Flash’s, go for HTTP authentication.

P.S. Hackers can guess this and fake the header to access your site. To cope with that, use an additional secret request variable (Flash uploaders allow this) and check it on the server side.
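
The decision itself can be sketched in a few lines of Python. This shows only the logic, not the Zend code: the function name, the upload_token parameter name and the way the secret is passed in are all assumptions made for the example.

```python
import hmac

FLASH_USER_AGENT = "Shockwave Flash"


def needs_http_auth(headers, params, upload_secret):
    """Return True unless the request is a Flash upload carrying the right secret."""
    if headers.get("User-Agent", "") != FLASH_USER_AGENT:
        return True  # regular browsers go through HTTP authentication
    token = params.get("upload_token", "")
    # Constant-time comparison so the check does not leak how much matched.
    return not hmac.compare_digest(token.encode(), upload_secret.encode())
```

A request that claims to be Flash but carries a wrong or missing token falls back to HTTP authentication, which is the point of the P.S. above.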

XML to CSV conversion


Data feeds often come in XML format, so your application must be able to deal with that format.

As I already wrote, data in CSV format (comma-separated values) can be loaded into a database extremely fast. So my idea was to convert XML data files to CSV and then bulk load them into the database. My tests showed that this is 10-100 times faster than one-by-one inserts.

Yesterday I decided to write a generalized solution for this, and it turned out that there is no need: MySQL 6 will have this feature!

How it works: you create a table and name its columns exactly like the XML nodes or attributes, and the MySQL server loads the data accordingly.

Example — you downloaded a POI list file (Points of Interest) called poi.xml that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<gpx>
    <wpt lat="58.691931900" lon="11.253125962">
        <name>Parking</name>
    </wpt>
    <wpt lat="58.315525000" lon="12.305828000">
          <name>Fast Food Restaurant:Max i trollhattan</name>
    </wpt>
    <wpt lat="57.717958100" lon="11.880860600">
          <name>Picnic spot</name>
    </wpt>
</gpx>

You create a MySQL table:

CREATE TABLE IF NOT EXISTS `poi` (
  `lat` varchar(255) NOT NULL,
  `lon` varchar(255) NOT NULL,
  `name` varchar(255) NOT NULL
) DEFAULT CHARSET=utf8;

OK, now you load the XML data to your table:

LOAD XML INFILE '\\path\\to\\poi.xml'
INTO TABLE `poi`
ROWS IDENTIFIED BY '<wpt>';

Voila!

The good thing is that MySQL 6 is already available as an alpha version, which is good enough for development purposes. I gave it a try: it takes 5 seconds to load 4.8 MB of data in 19 files.
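
If your MySQL version has no LOAD XML, the original plan of converting XML to CSV first still works. Here is a minimal Python sketch for a file shaped like the poi.xml example; the function name is mine, and the column order lat, lon, name simply mirrors the table above.

```python
import csv
import io
import xml.etree.ElementTree as ET


def gpx_to_csv(xml_text):
    """Convert <wpt> elements into CSV rows of lat, lon, name."""
    root = ET.fromstring(xml_text)
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for wpt in root.iter("wpt"):
        name = wpt.findtext("name", default="").strip()
        writer.writerow([wpt.get("lat"), wpt.get("lon"), name])
    return output.getvalue()
```

The resulting text can be saved to a file and bulk loaded with LOAD DATA INFILE, using a comma as the field terminator.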
