DansGuardian - Web content filtering
http://www.dansguardian.org
DansGuardian is a content filter that works together with an existing proxy server. Although simple to install and configure, it is powerful: it scans URLs and page content, checks PICS ratings, and more. I installed it on my home network.
Here are the basic steps I followed, as well as a plain English description of the path a web request follows. Some knowledge of basic TCP/IP networking, HTTP, proxies, and firewalling with the Linux kernel may help. Some of these steps are based on the Netmaster GG-Blade (http://www.netmaster.com) but should work in a variety of environments.
Installation & Configuration
- follow the instructions in the DansGuardian INSTALL file to compile and install it.
- modify the default Squid ACLs to allow access only from localhost and from any host(s) that should be able to bypass the filter but still use Squid for caching
- edit /etc/dansguardian/dansguardian.conf, and set the proxy host and port to wherever Squid is running
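As a rough illustration of the last two steps, the fragments below show what the relevant settings might look like. They are a sketch, not my exact files: the ACL name bypass_hosts and the address 192.168.20.10 are made up for the example, and the stock squid.conf is assumed to already define the localhost and all ACLs (newer Squid releases build them in).

```
# squid.conf (fragment)
# Hypothetical host that may use Squid directly, bypassing the filter.
acl bypass_hosts src 192.168.20.10/32

# DansGuardian runs on the same machine, so it connects from localhost.
http_access allow localhost
http_access allow bypass_hosts
http_access deny all
```

```
# /etc/dansguardian/dansguardian.conf (fragment)
# Port DansGuardian listens on for browser (or redirected) requests.
filterport = 8080
# Where Squid is running.
proxyip = 127.0.0.1
proxyport = 3128
```

The order of the http_access lines matters, since Squid stops at the first rule that matches.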
Transparent Proxy configuration
In my setup, I have a Linux firewall (http://www.netmaster.com) that only provides firewalling services. Squid and DansGuardian are running on a separate Linux server on the internal network. The ultimate goal of setting up content filtering is to have everybody use it, without being able to get around it. One way to do this is to block all outgoing web (port 80) requests, and only allow them from the proxy server. This forces every user to set the proxy server and port in their browser configuration, if their browser supports it. An easier method is to set up some firewall rules:
- make sure transparent proxy support is compiled into the Linux kernel on the firewall
- at the top of the firewall rules/chains, insert a rule to allow access from your proxy server
- at the bottom of the firewall rules/chains, add a rule to redirect all outgoing web requests to a local port: ipchains -A input -p tcp -d 0.0.0.0/0 80 -j REDIRECT 8081 -l
- use 'tproxyd' or 'redir' to do the redirection: redir --lport=8081 --laddr=192.168.20.1 --cport=8080 --caddr=192.168.20.3
- do not use the --transproxy flag with redir in this scenario; it slows requests by 3-4 seconds.
- redir is needed because ipchains can only redirect to local ports, not to ports on other systems.
- in the notes above, 192.168.20.1 is the firewall, 192.168.20.3 is the proxy server, port 3128 is Squid, port 8080 is DansGuardian, and port 8081 is the local redirection port on the firewall.
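Put together, the firewall side might look something like the script below. It is a dry-run sketch that only prints the commands for review. The REDIRECT rule and the redir invocation are the ones given above; the exact form of the early ACCEPT rule is my assumption of how "allow access from your proxy server" could be written with ipchains, and the variable and function names are just for this example.

```shell
#!/bin/sh
# Dry run: prints the commands for review instead of executing them.
FIREWALL_IP=192.168.20.1   # firewall / default gateway
PROXY_IP=192.168.20.3      # server running Squid and DansGuardian
DG_PORT=8080               # port DansGuardian listens on
REDIR_PORT=8081            # local redirection port on the firewall

firewall_commands() {
    # Assumed form of the early ACCEPT rule: inserted at position 1 so
    # web requests coming from the proxy server are never redirected.
    echo "ipchains -I input 1 -p tcp -s $PROXY_IP -d 0.0.0.0/0 80 -j ACCEPT"
    # Redirect every other outgoing web request to the local port (-l logs it).
    echo "ipchains -A input -p tcp -d 0.0.0.0/0 80 -j REDIRECT $REDIR_PORT -l"
    # redir forwards connections on the local port to DansGuardian.
    echo "redir --lport=$REDIR_PORT --laddr=$FIREWALL_IP --cport=$DG_PORT --caddr=$PROXY_IP"
}

firewall_commands
```

Once the printed commands look right for your network, they can be run by hand as root or pasted into the firewall's startup script.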
In plain English
... here is what will happen to outgoing HTTP requests if a proxy server is not set in your browser:
- User types in address in browser (e.g. IE)
- Computer (e.g. 192.168.20.4) creates TCP/IP packet and sends it to the default gateway (e.g. 192.168.20.1)
- The gateway sees this outgoing request, and sends it to the local port 8081
- The 'redir' tool is listening on port 8081 on the gateway (192.168.20.1:8081)
- redir rewrites the packet and sends it to 192.168.20.3:8080, which is DansGuardian
- DansGuardian filters the URL. If the URL is ok and passes PICS ratings, it sends the request to localhost:3128 which is Squid
- Squid requests the page from the Internet. Its request does not get redirected, because that would cause an infinite loop of redirection. The early ACCEPT rule in the firewall rules takes care of that
- Squid returns page to DG
- DG filters page for bad words
- DG returns page to browser
- Browser shows the "Denied!" page or the normal web page
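The reason the loop never happens comes down to rule order: the firewall uses the first rule that matches. The toy script below models just that decision for a port 80 request, using the addresses from the notes above. It is only an illustration of the logic (trace_request is a made-up name) and does not touch the firewall.

```shell
#!/bin/sh
PROXY_IP=192.168.20.3   # server running Squid and DansGuardian

# Print the path a port 80 request takes, given its source address.
trace_request() {
    src="$1"
    if [ "$src" = "$PROXY_IP" ]; then
        # The early ACCEPT rule matches first, so Squid's own request
        # goes straight out and is never redirected back to itself.
        echo "$src -> firewall: ACCEPT -> Internet"
    else
        # Everything else falls through to the REDIRECT rule.
        echo "$src -> firewall: REDIRECT 8081 -> redir -> $PROXY_IP:8080 (DansGuardian) -> localhost:3128 (Squid)"
    fi
}

trace_request 192.168.20.4   # an ordinary workstation
trace_request 192.168.20.3   # the proxy server itself
```

If the ACCEPT rule came after the REDIRECT rule, the second case would take the same path as the first and Squid's request would be sent back to DansGuardian over and over.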