source: OpenRLabs-Git/deploy/rlabs-docker/web2py-rlabs/gluon/contrib/pyuca/README.markmin

Last change on this file was 42bd667, checked in by David Fuertes <dfuertes@…>, 4 years ago
Clean history

# pyuca: Python Unicode Collation Algorithm implementation
(http://jtauber.com/blog/2006/01/27/python_unicode_collation_algorithm/)

This is my preliminary attempt at a Python implementation of the
[Unicode Collation Algorithm (UCA)](http://unicode.org/reports/tr10/).
I originally posted it to my blog in 2006, but it seems to get enough
usage that it really belongs here (and on PyPI).

What do you use it for? In short, sorting non-English strings properly.

The core of the algorithm involves multi-level comparison. For example,
``café`` comes before ``caff`` because at the primary level the accent
is ignored and the first word is treated as if it were ``cafe``.
The secondary level (which considers accents) then applies only to words
that are equivalent at the primary level.

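The two-level idea can be sketched with the standard library alone. This is a toy illustration, not pyuca's implementation: ``two_level_key`` is a hypothetical helper that uses Unicode decomposition to separate base letters (primary level) from accents (secondary level), whereas the real algorithm looks up weights in ``allkeys.txt``.

```python
import unicodedata


def two_level_key(word):
    """Toy two-level sort key (an assumption, not how pyuca works internally)."""
    primary = []    # base letters, accents ignored
    secondary = []  # accents attached to each base letter
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the accent to the preceding base letter, if there is one.
            if secondary:
                secondary[-1] += ch
        else:
            primary.append(ch.casefold())
            secondary.append("")
    # Tuples compare element by element, so the secondary level only
    # matters when the primary levels are identical.
    return (tuple(primary), tuple(secondary))


words = ["caff", "café", "cafe"]
print(sorted(words))                     # code point order: cafe, caff, café
print(sorted(words, key=two_level_key))  # two-level order: cafe, café, caff
```

Because the key is a tuple of (primary, secondary), Python's ordinary tuple comparison gives the "later levels only break ties" behaviour described above.
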
The Unicode Collation Algorithm and pyuca also support contraction and
expansion. Contraction is where multiple letters are treated as a
single unit. In traditional Spanish ordering, ``ch`` is treated as a letter
coming between ``c`` and ``d`` so that, for example, words beginning with
``ch`` sort after all other words beginning with ``c``. Expansion is where
a single letter is treated as though it were multiple letters. In German
phonebook ordering, ``ä`` is sorted as if it were ``ae``, i.e. after ``ad``
but before ``af``.

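Contraction and expansion can be illustrated with two small sort keys. This is only a sketch under simplifying assumptions (lowercase input, one rule per language); ``spanish_key`` and ``german_key`` are hypothetical helpers and do not reflect how pyuca or ``allkeys.txt`` encode these rules.

```python
def spanish_key(word):
    """Toy contraction: 'ch' is one unit ordered between 'c' and 'd'."""
    units = []
    i = 0
    while i < len(word):
        if word[i:i + 2] == "ch":
            # Rank 1 places 'ch' after every plain 'c' (rank 0) but before 'd'.
            units.append(("c", 1))
            i += 2
        else:
            units.append((word[i], 0))
            i += 1
    return units


def german_key(word):
    """Toy expansion: 'ä' is compared as if it were 'ae'."""
    return word.replace("ä", "ae")


print(sorted(["chico", "cuna", "dama", "casa"], key=spanish_key))
# casa, cuna, chico, dama
print(sorted(["af", "ä", "ad"], key=german_key))
# ad, ä, af
```

In the contraction case two characters produce one sort unit; in the expansion case one character produces two. A full collation table handles both with the same lookup mechanism.
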
## Here is how to use the ``pyuca`` module:
``
git clone https://github.com/jtauber/pyuca.git
cd pyuca
# install from the checkout that was just cloned
pip install .
``

Usage example:
``
from pyuca import Collator

# Load the collation element table from a local copy of allkeys.txt.
c = Collator("allkeys.txt")

words = ["caff", "café", "cafe"]  # example input
sorted_words = sorted(words, key=c.sort_key)
``

``allkeys.txt`` (1 MB) is available at

http://www.unicode.org/Public/UCA/latest/allkeys.txt

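If it helps, the table can be fetched once and reused. The following is a minimal sketch using only the standard library; ``ensure_allkeys`` is a hypothetical helper name, and the default path is an assumption.

```python
import os
import urllib.request

ALLKEYS_URL = "http://www.unicode.org/Public/UCA/latest/allkeys.txt"


def ensure_allkeys(path="allkeys.txt", url=ALLKEYS_URL):
    """Download allkeys.txt to `path` unless a file is already there."""
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path
```

The returned path can then be passed to the collator, for example ``Collator(ensure_allkeys())``.
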
