The URLs found in a page are added to the links table (a specific content type) so we can manage all the links.
To characterize a link, one can add a rel="..." attribute. This is valid for both <link ...> and <a ...> tags. Indicating relations can often be a great help to search engines. For example, a link to the home page can be marked with rel="home". This is particularly useful if the home page is not at "/". Similarly, the author of a page can be indicated with an author link, as in rel="author". Further, an author can mark an external page as his own with the rel="me" indicator.
Editing of links will include the necessary dropdown to select the type of link the user adds. Also, links created by code should all have a rel="..." attribute when they are to appear on public pages. There are many already-registered rel values that we can use.
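A minimal sketch of the idea that code-generated links always carry an explicit rel value (the helper name is hypothetical, not the plugin's actual API):

```cpp
#include <string>

// Hypothetical helper: wrap an anchor with an explicit rel="..." value
// so links generated by code always state their relation on public pages.
std::string make_anchor(std::string const & href,
                        std::string const & text,
                        std::string const & rel)
{
    return "<a href=\"" + href + "\" rel=\"" + rel + "\">" + text + "</a>";
}
```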
Source: http://microformats.org/wiki/existing-rel-values
Our software should be smart enough to create local links even when the user entered a full URL. That way we can optimize the pages (i.e. instead of href="http://www.example.com/foo/bar/page.html" we could have just "page".)
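The conversion could look like the following sketch (function and parameter names are assumptions): if the URL's host is our own, keep only the path.

```cpp
#include <string>

// Sketch: if a URL entered by the user points at this very website,
// strip the scheme and host so the link becomes a local path.
std::string localize_url(std::string const & url, std::string const & site_host)
{
    for (std::string const scheme : { "http://", "https://" })
    {
        std::string const prefix(scheme + site_host);
        if (url.compare(0, prefix.length(), prefix) == 0)
        {
            std::string const path(url.substr(prefix.length()));
            return path.empty() ? "/" : path;
        }
    }
    return url; // external link, keep as-is
}
```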
Note: this is certainly very good in general; however, when generating feeds such as those of the Feed feature [core] (Atom, RSS 2.0, etc.), we need absolute links to make sure they work as expected.
Local links, when clearly created as such, can make use of the title of the destination page, as I do on Drupal with the {node:126 link} syntax (which actually uses '[' and ']' instead of '{' and '}'). This way, when a destination page's title changes, any other page with a link to that page can be updated by the background filter mechanism.
Such tokens generate Dependencies between pages.
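A sketch of how such token expansion could work (the token syntax comes from the text above; the function, the titles map, and the /node/<id> path are illustrative assumptions):

```cpp
#include <map>
#include <regex>
#include <string>

// Sketch: expand "[node:<id> link]" tokens into anchors whose label is
// the destination page's current title, so a renamed page is picked up
// everywhere the next time the background filter runs.
std::string expand_node_links(std::string const & text,
                              std::map<int, std::string> const & titles)
{
    static std::regex const token("\\[node:([0-9]+) link\\]");
    std::string result;
    auto last(text.cbegin());
    for (auto it = std::sregex_iterator(text.begin(), text.end(), token);
         it != std::sregex_iterator(); ++it)
    {
        result.append(last, text.cbegin() + it->position());
        int const id(std::stoi(it->str(1)));
        auto const t(titles.find(id));
        std::string const title(t != titles.end() ? t->second : "[broken link]");
        result += "<a href=\"/node/" + std::to_string(id) + "\">" + title + "</a>";
        last = text.cbegin() + it->position() + it->length();
    }
    result.append(last, text.cend());
    return result;
}
```

The map stands in for a lookup in the links table; a real implementation would also record the dependency so the source page gets refiltered when the destination changes.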
In some situations, a local link breaks because the destination page moves (gets renamed). In that case, the source anchors should be fixed to link to the new page. (The content backend process can do that work as required whenever something changes; assuming we have back links for all pages, we know which pages to mark as modified on such an event.)
A local link to a page that gets deleted may require a little help from the editor of the website. Those links get added to the list of broken links.
There are cases when we should not even have to bother marking a local link to a newly deleted page as a broken link. This happens, for example, with a link to a product in an invoice. If the product gets deleted, then it is gone and thus the link is dead. Period. Such links should be marked by the e-Commerce feature plugin as "automatically remove on delete".
Users are given the possibility to organize their content so as to form a menu. The pages have next/previous/child/parent references that can be used to build a structured menu.
However, we need to make sure that the same link can still be used in two different menus. The primary feature is to list all the links, once each.
It is possible to write a parser that transforms what looks like a URL into an active, clickable URL. (It can also manage email addresses and, as such, transform them with JavaScript, etc.)
This feature must skip all example.<extension> URLs since http://www.example.com/ does not exist (and thus does not need to become clickable). The user should be able to add other exceptions.
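A minimal sketch of such a filter (the regex and the substring test for reserved example domains are deliberate simplifications, not the real parser):

```cpp
#include <regex>
#include <string>

// Sketch of the "make URLs clickable" filter: bare http(s) URLs become
// anchors, except those under a reserved example.<tld> domain, which
// never resolve and are therefore left as plain text.
std::string linkify(std::string const & text)
{
    static std::regex const url_re("https?://[^\\s<]+");
    std::string result;
    auto last(text.cbegin());
    for (auto it = std::sregex_iterator(text.begin(), text.end(), url_re);
         it != std::sregex_iterator(); ++it)
    {
        std::string const url(it->str());
        result.append(last, text.cbegin() + it->position());
        if (url.find("example.") != std::string::npos)
        {
            result += url;    // reserved example domain: keep as plain text
        }
        else
        {
            result += "<a href=\"" + url + "\">" + url + "</a>";
        }
        last = text.cbegin() + it->position() + it->length();
    }
    result.append(last, text.cend());
    return result;
}
```

The user-defined exception list mentioned above would replace the hard-coded "example." test.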
The code to transform URLs may be part of the glossary code which also creates links to other pages and possibly websites.
This process can occur at the time the content of a page is saved (i.e. once.)
Also, we can have this process happen while the user edits his text (i.e. the editor is then the smart one!)
Finally, we want the opposite behavior as well. Someone may be copying a document from another page that has a link using the example.<extension> type of URL; those should be unlinked because they are not real links. However, the user could still choose to keep a <span> tag with a class that makes the text look like a link without being clickable (i.e. the class can set up the cursor and colors as if it were a link; we could even add a title, maybe using something like overLIB.)
The feature should allow a user to mark a page so the filter does not get applied.
Similarly, it should be possible to mark a whole set of pages, as defined by their type or a tag, so the filter is ignored on all of them.
We may want to check what the default should be (i.e. we could offer the opposite: mark what you want to be parsed rather than what should not be parsed.)
As we find the HTTP URLs, we can also detect URLs of any other protocol (we should then have a list of accepted protocols.)
We can also transform email addresses into links (mailto:someone@example.com). And with the server and JavaScript, we can scramble email addresses so that hackers have a much harder time stealing them.
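One classic scrambling trick is sketched below (an illustration, not the plugin's actual code): emit every character of the address as a numeric HTML entity, so naive harvesters scanning the source for "@" find nothing, while browsers render the address normally.

```cpp
#include <string>

// Sketch: encode each character of an email address as a numeric HTML
// entity ("a" becomes "&#97;", "@" becomes "&#64;", and so on).
std::string scramble_email(std::string const & email)
{
    std::string out;
    for (unsigned char const c : email)
    {
        out += "&#" + std::to_string(static_cast<int>(c)) + ";";
    }
    return out;
}
```

JavaScript-based assembly at load time (as mentioned above) is stronger than entity encoding alone, since some harvesters decode entities.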
We offer a link shortener as well; the plugin is called shorturl.
This is used to generate a short URL (shorter than what the site offers by default) to access a page.
The short URL can be generated on the site itself using the /s path and a counter that counts in base 36.
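The base-36 encoding of the counter can be sketched as follows (digits then lowercase letters; the function name is an assumption):

```cpp
#include <string>

// Sketch: encode the page counter in base 36 for the /s/<id> short path.
// Counter 46655 (36^3 - 1) encodes as "zzz".
std::string to_base36(unsigned long long n)
{
    char const digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (n == 0)
    {
        return "0";
    }
    std::string out;
    while (n > 0)
    {
        out.insert(out.begin(), digits[n % 36]);
        n /= 36;
    }
    return out;
}
```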
The plugin will offer the use of an external shortener tool such as TinyURL and similar websites.
Note that there is no need for, nor is it a good idea to create, more than one short URL per page. So we offer a selection of shorteners that we support, but we do not use more than one per page (that being said, you may change shorteners at any time, and old short URLs are not lost in that case.) The reason for the single short URL is search engines: to verify the short URL, search engines are given the short URL in a link in the page and/or in the HTTP header. Either way, we can only offer one such link.
The link is saved in the Cassandra database and integrated in the HTML HEAD using the LINK tag with the rel type shortlink.
The shorturl plugin allows certain pages to not receive a short URL. For example, all pages that are private (not accessible publicly) are automatically ignored by the plugin. Similarly, some pages such as the Terms & Conditions probably do not need short URLs attached. Certain lists and intermediate pages probably do not need a short version either. The plugin offers a signal which is used to know whether a short link should be created. At times, it can be a lot faster to test a path than permissions (i.e. any data under /ignore should not have a short URL.)
The plugin also counts the number of times a shorturl is followed. It uses our statistics tools for the purpose of counting only what we consider valid clicks (i.e. eliminating robots.)
Google "shortlink" proposal: https://code.google.com/archive/p/shortlink/ (It looks like this one is the current official version for Google to be happy.)
The "shorturl" proposal: https://sites.google.com/a/snaplog.com/wiki/short_url
A French shortener: https://lc.cx/api
URLs to external websites should be marked as "nofollow" (see also the Per Page/Type Settings). On the other hand, if we want to count the number of clicks, all the links should go through us anyway, and then we can emit a 302 (with AJAX, we could send a POST to our server to count the click and then do the redirect ourselves, avoiding the 302; it shouldn't make much difference, except that the 302 can be used to better control the destination, especially if the destination disappears or becomes infected: we can then tell the user what happened.)
Such redirections can also be handy for making changes in one place instead of many (i.e. if 100 pages have a link to ABC and ABC moved to XYZ, only the redirect needs to be changed.)
Administrators can be given a dropdown whenever they hover a link. That dropdown can then be used to administrate links without having to edit the page.
Links should be shown with a warning (redirect) or an error (broken) so the user can quickly see that there is a problem. The page should have a fixed warning window that tells the user there are problems and gives him/her buttons/links to go to the locations of concern. This is not specific to links, since other things could be shown there (i.e. a bad word detected in a page, a missing resource such as an image, etc.)
This dropdown could include different types of information, as follows:
See also the Spam extension: when links to other websites are expected to be reciprocal in some way, we may want to consider a link as bad/ugly/unwanted and quickly warn the administrator of such links.
See: Anti-Spam feature
It is quite annoying to not be able to create anchors on the fly and then link to them.
The system should allow us to add anchors on headers (H1, H2, etc.), save the link in some form of temporary buffer (a cookie?), and then pretty much automatically create a link using that anchor (after all, we have all the info: title, page URI, anchor name...)
If the anchor already exists on the header, then a Copy button should appear (along with the usual Edit and, possibly in this case, a Delete.)
On the other hand, it can at times be useful to remove all the links in a page, even those entered using the editor (i.e. the author lost the permission to include anchors, but some existed in older posts.)
A link can be completely removed (the whole anchor, <a ...>...</a>) or just the anchor tags (the <a ...> and </a> tags are removed, but the text in between is kept.)
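The two removal modes can be sketched as follows (the regexes are a simplification for illustration; real HTML should go through a proper parser, as discussed in "Leaving HTML behind"):

```cpp
#include <regex>
#include <string>

// Mode 1: drop the <a ...> and </a> tags but keep the inner text.
std::string unlink_anchors(std::string const & html)
{
    static std::regex const tags("</?a\\b[^>]*>", std::regex::icase);
    return std::regex_replace(html, tags, "");
}

// Mode 2: drop the whole anchor, inner text included.
std::string remove_anchors(std::string const & html)
{
    static std::regex const anchor("<a\\b[^>]*>.*?</a>", std::regex::icase);
    return std::regex_replace(html, anchor, "");
}
```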
See: Leaving HTML behind
We now have an administration menu that can be built by assigning specific tags to pages. It still requires some help so we can group the items. We also want support for user-defined menus (lists of manually entered links.)
Snap! Websites
An Open Source CMS System in C++