We expect you to use a firewall that blocks all connections to your web server except those on ports 80 and 443 (and any other port Apache is configured to answer on). Leaving other ports open can cause issues, and you are responsible for them.
The Snap! Server, if given permission, will be able to drive iptables to block users who are detected flooding the server and who do not slow down when asked to.
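As a minimal sketch of what "dealing with iptables" could look like, the server might build and run an iptables invocation that drops all traffic from a flooding IP. The chain name `snap_firewall` and the helper names are assumptions for illustration, not part of the project:

```python
import subprocess

def build_block_command(ip: str) -> list:
    """Return the iptables invocation that drops every packet from `ip`.

    The chain name "snap_firewall" is hypothetical; a real setup would
    create that chain and jump to it from INPUT.
    """
    return ["iptables", "-I", "snap_firewall", "-s", ip, "-j", "DROP"]

def block_ip(ip: str, dry_run: bool = True) -> list:
    """Block `ip` at the firewall level; returns the command used.

    With dry_run=True the command is only built, not executed
    (running it for real requires root privileges).
    """
    cmd = build_block_command(ip)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

A real implementation would also persist the banned list so the rules survive a firewall reload.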
At this point we intend to support only Apache as a web server. Apache is used by many and is known to be quite safe to use as a web server.
The Snap! Server will ensure that only public files appear in public folders, although you are free to add files there yourself and nothing will prevent you from making some file public... You still have to use caution when working on a live server.
The HSTS header, supported by many browsers, tells the browser that the plain HTTP protocol is not to be used. If a request is about to be made and the protocol is not HTTPS, the browser forces HTTPS anyway.
See also: OWASP HTTP Strict Transport Security
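Assuming Apache with mod_headers enabled, the header could be set along these lines in the HTTPS virtual host (the max-age value, one year here, is only an example):

```
# Requires: a2enmod headers
# Tell browsers to use HTTPS only, for this domain and its sub-domains.
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
```

Note that the header is only honored by browsers when it is received over an HTTPS connection.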
It is suggested, although not required, that you make use of an HTTP firewall mechanism such as ModSecurity. These systems can catch all sorts of requests that should never happen and ban the IP address of the host sending them (although be careful with blocking addresses when you have to be PCI compliant.)
The blocking of overly fast connection requests is provided by this part of the server stack, and it is a global feature (i.e. the same person is not going to look at 20 of our websites at once, except in one case: when they open a browser session that points at 20 of our websites at once... argh!)
Here we can very quickly check a text database and see whether someone is hitting the server too many times in a row (we have to be careful with AJAX, images, CSS, and JS scripts, which are all attached to one HTML file; the loading of one HTML file should be a signal to reset the counters, or something like that.)
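The counter-reset idea above can be sketched as a small per-IP sliding window; the thresholds and the in-memory structure are assumptions (the text above talks about a text database), but the logic is the same:

```python
import time
from collections import defaultdict, deque
from typing import Optional

class HitCounter:
    """Sketch of the fast-connection check: count recent hits per IP and
    reset the counter whenever a full HTML page is loaded, so the images,
    CSS, JS and AJAX attached to that page do not trip the limit."""

    def __init__(self, max_hits: int = 30, window: float = 1.0):
        self.max_hits = max_hits        # allowed hits per window (assumed value)
        self.window = window            # window length in seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent hits

    def hit(self, ip: str, is_html: bool, now: Optional[float] = None) -> bool:
        """Record one hit; return True when the IP should be blocked."""
        now = time.time() if now is None else now
        if is_html:
            self.hits[ip].clear()       # a new page load resets the counter
        q = self.hits[ip]
        q.append(now)
        while q and q[0] < now - self.window:
            q.popleft()                 # drop hits that fell out of the window
        return len(q) > self.max_hits
```

In practice the result of this check would be what gets handed to the firewall-blocking feature described earlier.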
We may also implement systems that block (via the firewall) stupid robots, although that logic may be part of the Snap! Server and relayed to snap.cgi via a text file compatible with the fast connection blocking system.
If we create a cache that Apache can access directly, then we will miss many hits at this level unless we have a snap-logger.cgi or something similar.
Here we check and manage user credentials and sessions on top of everything else. The snap.cgi and Apache may have some knowledge of whether someone is logged in and can thus be a little more relaxed about some of the security features (i.e. many AJAX hits from a logged in user are much more likely.)
User credentials are checked as defined in the User feature.
Consider implementing the server side Content Security Policy (CSP), which at least Firefox supports. This would allow us to control the URIs that can be used by the different systems we're accessing (i.e. a frame from Google.com or Facebook.com, but scripts only from m2osw.com, etc.)
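Assuming Apache with mod_headers, the example policy above could be expressed along these lines (the exact source list is illustrative; also note that very old Firefox versions used the experimental `X-Content-Security-Policy` header name before the standard `Content-Security-Policy` settled):

```
# Everything defaults to this site; scripts also allowed from m2osw.com;
# frames allowed from Google and Facebook.
Header set Content-Security-Policy "default-src 'self'; \
    script-src 'self' https://m2osw.com; \
    frame-src https://www.google.com https://www.facebook.com"
```

The browser then refuses to load any script or frame from a source not listed in the policy.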
There are security risks linked with forms in a browser. One is that someone starts entering data, such as their login information, without submitting it immediately; that information then remains available until the page is closed. Similarly, a credit card number entered by a user is visible in the clear until the user submits the form. Although in both cases the user will submit the data and stay at the computer until things are through 99% of the time, it can happen that the user leaves the computer with such valuable information hanging.
We have two solutions for such problems:
1) Auto-clear the form after X amount of total inactivity (although we may want to ignore mouse hover events.) This means the data is lost, but it only existed on the user's computer and the delay made it obsolete, or at least dangerous to keep as is. We should not need to warn the user.
2) Auto-save the form, then clear it / log out / move to another page. In this case we do not penalize the user too much, but we still clear his data. When the user is logged in, the data is saved so that on the next login he can finish the work on that data (i.e. finish the edit and properly save it as a post, comment, etc.)
The problem with (2) is that it doesn't work well for users who are not logged in. This means anonymous users are either not affected by this concept or their data gets cleared as in (1).
Note that (1) is a problem even for password fields, since the password appears in clear text somewhere in memory. So if someone gains access to the computer while the password is still in the form, it is still in memory somewhere.
How to implement this? At times legitimate people do poke at folders listed in robots.txt, so we may want to be careful about blocking everyone who does; we probably want to time how fast the host goes from reading robots.txt to visiting the forbidden folder. We could also put a note in robots.txt letting people know that they'll have to wait 1h before going anywhere else.
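The trap could look something like this in robots.txt (the folder name is hypothetical; the comments are the "note to humans" mentioned above):

```
# robots.txt
# Note to humans: /do-not-enter/ is a trap. Visiting it blocks
# your IP address for 1h. Well-behaved robots never go there.
User-agent: *
Disallow: /do-not-enter/
```

A well-behaved crawler honors the Disallow line; a malicious one that reads robots.txt specifically to find "interesting" folders walks straight into the trap.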
The concept is very simple. Some idiots create robots that just post to forms with totally random data. Robots cannot read, so even if you tell normal users NOT to post on that form, robots still will; as they do, you can collect their IP addresses and block them forever.
There should be only one way for people to get their IP address released: sending us an email with an explanation, etc.
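The honeypot-form idea above boils down to a few lines; the field name `do_not_fill` and the in-memory ban set are assumptions for illustration:

```python
banned = set()  # IP addresses caught posting to the trap form

def handle_post(ip: str, form: dict) -> bool:
    """Process one form POST; return True when the POST is accepted.

    `do_not_fill` is the hypothetical trap field humans are told
    (and, being hidden, are unable) to fill in. Any POST that fills
    it can only come from a robot, so its IP gets banned forever.
    """
    if form.get("do_not_fill"):
        banned.add(ip)
    return ip not in banned
```

In the real system the ban would of course be persisted and fed to the firewall rather than kept in a Python set.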
A form can be marked with the special attribute autocomplete="off". This tells the browser to never save the form content locally. This is especially important for banks; browsers that do not respect that attribute are likely to be banned by your bank.
This needs to be supported by all our forms, especially those that include a password field. Programmers should be able to decide whether a form has that attribute or not.
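For reference, the attribute simply sits on the form tag (it can also be set on individual input fields); the action URL and field names here are placeholders:

```html
<!-- Login form asking the browser not to save or autofill its content -->
<form action="/login" method="post" autocomplete="off">
  <input type="text" name="user"/>
  <input type="password" name="password"/>
  <input type="submit" value="Log In"/>
</form>
```

Keep in mind that some browsers deliberately ignore `autocomplete="off"` in certain cases (notably for saved passwords), so it is a hint, not a guarantee.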
There is another project called ADsafe that offers a similar capability and may prove quite valuable in getting our own system to work well. See the ADsafe website for details.
Someone looking for a page will generally try ONCE. Some people will try different spellings, maybe 4 or 5 times (some weirdly persistent person?!)
After 3 times, we should warn the user that he's about to get banned and offer the use of our search feature instead.
One drawback is that most search engines (such as Google) will test many URLs that used to exist (and if you made a mistake with a domain or sub-domain and it pointed to a different system... that's many URLs!!!) So we have to be careful, because we probably don't want to ban those. On the other hand, under normal operation we should return a redirect (301) on any page that was deleted or moved, so we should not have to ban search engines.
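The warn-then-ban progression described above can be sketched as a per-IP 404 counter; the thresholds are assumptions (the text only fixes the warning at 3):

```python
from collections import Counter

WARN_AFTER = 3   # from the text: warn after 3 missed pages
BAN_AFTER = 5    # assumed ban threshold, not fixed by the design above

misses = Counter()  # ip -> number of 404s served so far

def page_not_found(ip: str) -> str:
    """Record one 404 for `ip` and return the action to take.

    Deleted or moved pages should answer with a 301 redirect instead of
    a 404, so well-behaved search engines never reach this counter.
    """
    misses[ip] += 1
    if misses[ip] >= BAN_AFTER:
        return "ban"
    if misses[ip] >= WARN_AFTER:
        return "warn"   # point the user at our search feature
    return "ok"
```

The counter would also need to decay over time so an occasional typo months apart does not accumulate toward a ban.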
Some users will change browsers while browsing your site. For example, they could be testing whether the site reacts differently for Internet Explorer, Opera, Chrome, and Firefox. However, that should be rare, and we could offer such testers a spot on a white list. Others would get blocked.
We have to make sure that this doesn't trigger for systems such as AOL, where two different users browse our site through two different browsers but from the same IP address. Other than that, seeing connections from the same IP with many different browsers within seconds is clearly a sign that a robot is working its way in.
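A sketch of that detection: track the distinct User-Agent strings seen per IP inside a short window, with the white list handling testers and AOL-style shared proxies. The window length and threshold are assumptions:

```python
from collections import defaultdict

WHITELIST = set()   # testers and shared proxies (e.g. AOL) go here
MAX_AGENTS = 3      # assumed: more distinct browsers than this is suspicious
WINDOW = 30.0       # seconds ("within seconds" in the text above)

seen = defaultdict(list)  # ip -> [(timestamp, user_agent), ...]

def suspicious_agent_switch(ip: str, agent: str, now: float) -> bool:
    """Return True when one IP used too many different User-Agents
    within WINDOW seconds; whitelisted IPs are never flagged."""
    if ip in WHITELIST:
        return False
    entries = [(t, a) for t, a in seen[ip] if t > now - WINDOW]
    entries.append((now, agent))
    seen[ip] = entries
    return len({a for _, a in entries}) > MAX_AGENTS
```

A flagged IP would then go through the same warn/ban path as the other detections rather than being blocked outright.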
See Also: Coverity Scan of Snap!