Snap! Websites
An Open Source CMS System in C++
libtld 1.4.22
The libtld project is a library that gives you the capability to determine the TLD part of any Internet URI or email address.
The main function of the library, tld(), takes a URI string and a tld_info structure. From that information it computes the position where the TLD starts in the URI. For email addresses (see the tld_email_list C++ object, or the tld_email.cpp file for the C functions), it breaks down a full list of emails and verifies their syntax as defined in RFC 5322.
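To make the idea of "the position where the TLD starts" concrete, here is a minimal sketch in C. It is not the library's implementation and does not call tld(): the function name toy_tld_offset and the four-entry suffix table are hypothetical, chosen only for illustration, and the sketch works on a bare host name rather than a full URI. It assumes the offset points at the period that begins the longest matching suffix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, tiny suffix table used for illustration only.
 * The real library derives its data from the XML file of TLDs. */
static const char *const toy_suffixes[] = { ".co.uk", ".com", ".org", ".uk" };

/* Return the offset in `host` where the longest known suffix starts,
 * or -1 when no suffix of the toy table matches.
 * The comparison is case sensitive, which is a simplification. */
int toy_tld_offset(const char *host)
{
    assert(host != NULL);

    size_t host_len = strlen(host);
    size_t best_len = 0;
    size_t count = sizeof(toy_suffixes) / sizeof(toy_suffixes[0]);

    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(toy_suffixes[i]);
        /* Require at least one character before the suffix and
         * keep only the longest suffix that matches the end of host. */
        if (len < host_len
            && len > best_len
            && strcmp(host + host_len - len, toy_suffixes[i]) == 0) {
            best_len = len;
        }
    }

    return best_len == 0 ? -1 : (int)(host_len - best_len);
}
```

With this toy table, toy_tld_offset("www.example.co.uk") returns 11, the index of the period in front of "co.uk", because the longer ".co.uk" entry wins over ".uk". A host with no known suffix, such as "localhost", yields -1.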
The C functions that you are expected to use are listed here:
For C++ users, please make use of these tld classes:
In C++, you may also use tld_version() to check the current version of the library.
To check whether the version is valid for your tool, you may look at the version handling of the libdebpackages library of the wpkg project. The libtld version is always a Debian-compatible version.
http://windowspackager.org/documentation/implementation-details/debian-version-api
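As a rough illustration of such a version check, the sketch below compares two dotted numeric version strings such as "1.4.22". It is a simplification and not the libdebpackages API: the function name toy_version_compare is hypothetical, and the sketch ignores the parts of Debian version syntax that go beyond plain numbers separated by periods (epochs, revisions, tildes, letters).

```c
#include <assert.h>
#include <stdlib.h>

/* Compare two dotted numeric versions, for example "1.4.22" and "1.4.3".
 * Returns -1 if a < b, 0 if they are equal, and 1 if a > b.
 * Simplified on purpose: this does not implement full Debian ordering. */
int toy_version_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    while (*a != '\0' || *b != '\0') {
        char *end_a = NULL;
        char *end_b = NULL;

        /* A missing component counts as 0, so "1.4" equals "1.4.0". */
        unsigned long na = strtoul(a, &end_a, 10);
        unsigned long nb = strtoul(b, &end_b, 10);
        if (na != nb) {
            return na < nb ? -1 : 1;
        }

        /* Stop when neither side advanced (unexpected character). */
        if (end_a == a && end_b == b) {
            break;
        }

        /* Skip the period that separates two components. */
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }

    return 0;
}
```

A tool could pass the string returned by the library's version function as the first argument and its minimum supported version as the second, then refuse to run when the result is negative. Components are compared as numbers, so "1.4.22" sorts after "1.4.3".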
At this point I do not have a very good environment to recompile everything for PHP. The main reason is that the library is compiled with CMake, as opposed to the automake toolchain that Zend expects.
That said, the php directory includes all you need to make use of the library under PHP. It works like a charm for me, and there should be no reason for you not to be able to do the same with the library.
The way I rebuild everything for PHP:
The build script will copy the resulting php_libtld.so file where it needs to go using sudo. Your system (Red Hat, Mandrake, etc.) may use su instead. Update the script as required.
Note that libtld is linked statically into php_libtld.so, so you do not have to install the libtld environment for everything to work as expected.
The resulting functions added to PHP via this extension are:
For information about these functions, check out the php/php_libtld.c file, which describes each function, its parameters, and its results in great detail.
We can successfully compile the library under MS-Windows with Cygwin and the Microsoft IDE. To do so, we use the CMakeLists.txt file found under the dev directory. Overwrite the CMakeLists.txt file in the main directory with it before configuring and you'll get a library without having to first compile Qt4.
At this point this configuration only compiles the library. It gives you a shared (.DLL) and a static (.lib) version. With the IDE you may create a debug and a release version.
Later we'll look into having a single CMakeLists.txt so you do not have to make this copy.
We offer a file named example.c that shows you how to use the library in C. It is very simple, with a single main() function, so it is easy to get started with libtld.
For a C++ example, check out src/validate_tld.cpp, a command line tool that ships with the libtld library.
If you want to work on the library, there are certainly things to enhance. We could for example offer more offsets in the info string, or functions to clearly define each part of the URI.
However, the most important part of this library is the XML file which defines all the TLDs. Maintaining that file is what will help the most. It includes all the TLDs known at this point (as defined in different places such as Wikipedia and the respective authorities in that area). The file is easy to read, so you can easily find whether your extension is defined and, if not, let us know.
The library doesn't need anything special. It's a few C functions.
The library also offers C++ classes. You do not need a C++ compiler to use the library, but if you do program in C++, you can use tld_object and tld_email_list instead of the C functions. It makes things a lot easier!
Also, if you are programming in PHP, the library includes a PHP extension so you can check URIs and emails directly from PHP without trying to create crazy regular expressions (that most often do not work right!)
To compile the library, you'll need CMake, a C++ compiler for some parts, and the Qt library (Qt4), as we use QtXml and QtCore. The QtXml library is used to parse the XML file (tld_data.xml) which defines all the TLDs, worldwide.
To regenerate the documentation we use Doxygen. It is optional, though.
To recompile the PHP extension, the Zend environment is required. Under a Debian or Ubuntu system you can install the php5-dev package.
We have the following tests at this time:
This document is part of the Snap! Websites Project.
Copyright by Made to Order Software Corp.