Compress Text Data Using Letter Frequency Information


Khaled Al-Sham'aa
This class compresses text strings to roughly 70% of their original size by using compact codes for the most frequent letters in a given language. Because the algorithm is tied to the language of the text, there are 6 separate classes, one for each of the following languages: Arabic, English, French, German, Italian, and Spanish.
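
As a rough illustration of why compact codes for frequent letters save space, the following Python sketch estimates the size ratio of a toy scheme. It is not the class's PHP code. The 4-bit/12-bit code lengths, the function name estimate_ratio, and building the table from the input itself (rather than from per-language statistics, as the class description implies) are all assumptions made for the example.

```python
from collections import Counter


def estimate_ratio(text: str, short_codes: int = 15) -> float:
    """Estimate compressed/original size for a toy 4-bit / 12-bit code."""
    data = text.encode("utf-8")
    if not data:
        return 1.0
    counts = Counter(data)
    # Hypothetical table: the most frequent byte values get a 4-bit code.
    frequent = {value for value, _ in counts.most_common(short_codes)}
    # Every other byte costs a 4-bit escape marker plus its raw 8 bits.
    bits = sum(4 if value in frequent else 12 for value in data)
    return bits / (8 * len(data))


sample = "this is a test sentence that uses a lot of the same letters"
print(round(estimate_ratio(sample), 2))
```

The more of the text that falls inside the frequent set, the closer the ratio gets to 0.5 in this toy scheme; text dominated by rare characters can even grow. This is why a table tuned to one language would be expected to perform poorly on another.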

 

Benefits of this compression algorithm include:

1- It is written in pure PHP, so no PHP extensions are needed to use it.
2- You can search the compressed string directly, without uncompressing the text first.
3- You can get the original string length directly, without uncompressing the compressed text.
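
Benefits 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the class's actual format or API: the FREQUENT table, the 4-byte length header, the nibble-based codes, and every function name here are hypothetical.

```python
import struct

# Hypothetical table: 15 byte values assumed frequent in English text.
FREQUENT = b" etaoinshrdlucm"
ESCAPE = 15  # nibble announcing that a raw byte follows in two nibbles


def to_nibbles(data: bytes) -> list:
    """Encode bytes as 4-bit codes; rare bytes become ESCAPE + 2 nibbles."""
    out = []
    for value in data:
        index = FREQUENT.find(bytes([value]))
        if index >= 0:
            out.append(index)
        else:
            out.extend((ESCAPE, value >> 4, value & 0x0F))
    return out


def compress(text: str) -> bytes:
    data = text.encode("utf-8")
    nibbles = to_nibbles(data)
    if len(nibbles) % 2:
        nibbles.append(ESCAPE)  # padding; readers stop by stored length
    body = bytes((hi << 4) | lo for hi, lo in zip(nibbles[::2], nibbles[1::2]))
    # The header stores the original byte length (assumed layout).
    return struct.pack(">I", len(data)) + body


def original_length(blob: bytes) -> int:
    """Read the original length from the header, with no decoding."""
    return struct.unpack(">I", blob[:4])[0]


def unpack_nibbles(blob: bytes) -> list:
    out = []
    for byte in blob[4:]:
        out.extend((byte >> 4, byte & 0x0F))
    return out


def contains(blob: bytes, needle: str) -> bool:
    """Search by comparing codes at code boundaries, never rebuilding text."""
    target = to_nibbles(needle.encode("utf-8"))
    nibbles = unpack_nibbles(blob)
    pos = 0
    for _ in range(original_length(blob)):
        if nibbles[pos:pos + len(target)] == target:
            return True
        pos += 3 if nibbles[pos] == ESCAPE else 1
    return not target  # an empty needle matches anything


def decompress(blob: bytes) -> str:
    nibbles = unpack_nibbles(blob)
    out = bytearray()
    pos = 0
    for _ in range(original_length(blob)):
        if nibbles[pos] == ESCAPE:
            out.append((nibbles[pos + 1] << 4) | nibbles[pos + 2])
            pos += 3
        else:
            out.append(FREQUENT[nibbles[pos]])
            pos += 1
    return out.decode("utf-8")
```

In this sketch the search encodes the needle with the same table and compares codes only at code boundaries, so a match can never start in the middle of an escaped byte. The length lookup reads four header bytes and touches nothing else.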

Note:
Unfortunately, text compressed with this algorithm loses the structure that standard zip algorithms rely on, so applying ZLib functions to the compressed text yields a reduced benefit.

Another drawback is that this algorithm works only on text in a given language; it does not work well on binary files such as images or PDFs.