Squid FAQ
Contents
1 About Squid, this FAQ, and other Squid information resources ..... 15
1.1 What is Squid? ..... 15
1.2 What is Internet object caching? ..... 15
1.3 Why is it called Squid? ..... 16
1.4 What is the latest version of Squid? ..... 16
1.5 Who is responsible for Squid? ..... 16
1.6 Where can I get Squid? ..... 16
1.7 What Operating Systems does Squid support? ..... 16
1.8 Does Squid run on Windows NT? ..... 17
1.9 What Squid mailing lists are available? ..... 17
1.10 I can't figure out how to unsubscribe from your mailing list. ..... 17
1.11 What other Squid-related documentation is available? ..... 18
1.12 Does Squid support SSL/HTTPS/TLS? ..... 18
1.13 What's the legal status of Squid? ..... 19
1.14 Is Squid year-2000 compliant? ..... 19
1.15 Can I pay someone for Squid support? ..... 19
1.16 Squid FAQ contributors ..... 19
1.17 About This Document ..... 21
1.17.1 Want to contribute? Please write in SGML... ..... 21
2 Getting and Compiling Squid ..... 22
2.1 Which file do I download to get Squid? ..... 22
2.2 How do I compile Squid? ..... 22
2.3 What kind of compiler do I need? ..... 22
2.4 What else do I need to compile Squid? ..... 22
2.5 Do you have pre-compiled binaries available? ..... 23
2.6 How do I apply a patch or a diff? ..... 23
2.7 configure options ..... 23
2.8 undefined reference to inet_ntoa ..... 24
2.9 How can I get true DNS TTL info into Squid's IP cache? ..... 25
2.10 My platform is BSD/OS or BSDI and I can't compile Squid ..... 26
2.11 Problems compiling libmiscutil.a on Solaris ..... 27
2.12 I have problems compiling Squid on Platform Foo. ..... 27
2.13 I see a lot of warnings while compiling Squid. ..... 27
2.14 Building Squid on OS/2 ..... 27
3 Installing and Running Squid ..... 28
3.1 How big of a system do I need to run Squid? ..... 28
3.2 How do I install Squid? ..... 29
3.3 What does the squid.conf file do? ..... 29
3.4 Do you have a squid.conf example? ..... 29
3.5 How do I start Squid? ..... 29
3.6 How do I start Squid automatically when the system boots? ..... 30
3.6.1 From inittab ..... 31
3.6.2 From rc.local ..... 31
3.6.3 From init.d ..... 32
3.7 How do I tell if Squid is running? ..... 32
3.8 squid command line options ..... 32
3.9 How do I see how Squid works? ..... 34
3.10 Can Squid benefit from SMP systems? ..... 34
3.11 Is it okay to use separate drives and RAID on Squid? ..... 34
4 Configuration issues ..... 35
4.1 How do I join a cache hierarchy? ..... 35
4.2 How do I join NLANR's cache hierarchy? ..... 35
4.3 Why should I want to join NLANR's cache hierarchy? ..... 36
4.4 How do I register my cache with NLANR's registration service? ..... 36
4.5 How do I find other caches close to me and arrange parent/child/sibling relationships with them? ..... 36
4.6 My cache registration is not appearing in the Tracker database. ..... 36
4.7 What is the httpd-accelerator mode? ..... 36
4.8 How do I configure Squid to work behind a firewall? ..... 36
4.9 How do I configure Squid to forward all requests to another proxy? ..... 37
4.10 I have dnsserver processes that aren't being used, should I lower the number in squid.conf? ..... 37
4.11 My dnsserver average/median service time seems high, how can I reduce it? ..... 38
4.12 How can I easily change the default HTTP port? ..... 38
4.13 Is it possible to control how big each cache_dir is? ..... 38
4.14 What cache_dir size should I use? ..... 38
4.15 I'm adding a new cache_dir. Will I lose my cache? ..... 39
4.16 Squid and http-gw from the TIS toolkit. ..... 39
4.16.1 Firewall configuration: ..... 39
4.16.2 Squid configuration: ..... 39
4.17 What is "HTTP_X_FORWARDED_FOR"? Why does squid provide it to WWW servers, and how can I stop it? ..... 40
4.18 Can Squid anonymize HTTP requests? ..... 41
4.18.1 Squid 2.2 ..... 41
4.19 Can I make Squid go direct for some sites? ..... 41
4.20 Can I make Squid proxy only, without caching anything? ..... 41
4.21 Can I prevent users from downloading large files? ..... 42
5 Communication between browsers and Squid ..... 42
5.1 Netscape manual configuration ..... 42
5.2 Netscape automatic configuration ..... 42
5.3 Lynx and Mosaic configuration ..... 44
5.4 Redundant Proxy Auto-Configuration ..... 44
5.5 Proxy Auto-Configuration with URL Hashing ..... 45
5.6 Microsoft Internet Explorer configuration ..... 45
5.7 Netmanage Internet Chameleon WebSurfer configuration ..... 46
5.8 Opera 2.12 proxy configuration ..... 46
5.9 How do I tell Squid to use a specific username for FTP URLs? ..... 46
5.10 Configuring Browsers for WPAD ..... 46
5.11 Configuring Browsers for WPAD with DHCP ..... 48
5.12 IE 5.0x crops trailing slashes from FTP URLs ..... 48
5.13 IE 6.0 SP1 fails when using authentication ..... 48
6 Squid Log Files ..... 49
6.1 squid.out ..... 49
6.2 cache.log ..... 49
6.3 useragent.log ..... 49
6.4 store.log ..... 50
6.5 hierarchy.log ..... 51
6.6 access.log ..... 51
6.6.1 The common log file format ..... 51
6.6.2 The native log file format ..... 52
6.6.3 access.log native format in detail ..... 52
6.7 Squid result codes ..... 54
6.8 HTTP status codes ..... 56
6.9 Request methods ..... 57
6.10 Hierarchy Codes ..... 57
6.11 cache/log (Squid-1.x) ..... 59
6.12 swap.state (Squid-2.x) ..... 60
6.13 Which log files can I delete safely? ..... 60
6.14 How can I disable Squid's log files? ..... 60
6.15 My log files get very big! ..... 61
6.16 I want to use another tool to maintain the log files. ..... 61
6.17 Managing log files ..... 61
6.18 Why do I get ERR_NO_CLIENTS_BIG_OBJ messages so often? ..... 62
6.19 What does ERR_LIFETIME_EXP mean? ..... 62
6.20 Retrieving "lost" files from the cache ..... 62
6.21 Can I use store.log to figure out if a response was cachable? ..... 63
7 Operational issues ..... 63
7.1 How do I see system level Squid statistics? ..... 63
7.2 How can I find the biggest objects in my cache? ..... 63
7.3 I want to restart Squid with a clean cache ..... 63
7.4 How can I proxy/cache Real Audio? ..... 64
7.5 How can I purge an object from my cache? ..... 65
7.6 Using ICMP to Measure the Network ..... 65
7.6.1 Supporting ICMP in your Squid cache ..... 66
7.6.2 Utilizing your parents database ..... 66
7.6.3 Inspecting the database ..... 66
7.7 Why are so few requests logged as TCP_IMS_MISS? ..... 67
7.8 How can I make Squid NOT cache some servers or URLs? ..... 67
7.9 How can I delete and recreate a cache directory? ..... 68
7.10 Why can't I run Squid as root? ..... 69
7.11 Can you tell me a good way to upgrade Squid with minimal downtime? ..... 69
7.12 Can Squid listen on more than one HTTP port? ..... 69
7.13 Can I make origin servers see the client's IP address when going through Squid? ..... 69
8 Memory ..... 69
8.1 Why does Squid use so much memory!? ..... 69
8.2 How can I tell how much memory my Squid process is using? ..... 70
8.3 My Squid process grows without bounds. ..... 71
8.4 I set cache_mem to XX, but the process grows beyond that! ..... 71
8.5 How do I analyze memory usage from the cache manager output? ..... 71
8.6 The "Total memory accounted" value is less than the size of my Squid process. ..... 72
8.7 xmalloc: Unable to allocate 4096 bytes! ..... 72
8.7.1 BSD/OS ..... 73
8.7.2 FreeBSD (2.2.X) ..... 74
8.7.3 OSF, Digital Unix ..... 75
8.8 fork: (12) Cannot allocate memory ..... 75
8.9 What can I do to reduce Squid's memory usage? ..... 75
8.10 Using an alternate malloc library. ..... 76
8.10.1 Using GNU malloc ..... 76
8.10.2 dlmalloc ..... 76
8.11 How much memory do I need in my Squid server? ..... 77
9 Cache Manager ..... 77
9.1 What is the cache manager? ..... 77
9.2 How do you set it up? ..... 77
9.3 Cache manager configuration for CERN httpd 3.0 ..... 78
9.4 Cache manager configuration for Apache ..... 78
9.5 Cache manager configuration for Roxen 2.0 and later ..... 79
9.6 Cache manager ACLs in squid.conf ..... 79
9.7 Why does it say I need a password and a URL? ..... 80
9.8 I want to shutdown the cache remotely. What's the password? ..... 80
9.9 How do I make the cache host default to my cache? ..... 80
9.10 What's the difference between Squid TCP connections and Squid UDP connections? ..... 81
9.11 It says the storage expiration will happen in 1970! ..... 81
9.12 What do the Meta Data entries mean? ..... 81
9.13 In the utilization section, what is Other? ..... 81
9.14 In the utilization section, why is the Transfer KB/sec column always zero? ..... 81
9.15 In the utilization section, what is the Object Count? ..... 82
9.16 In the utilization section, what is the Max/Current/Min KB? ..... 82
9.17 What is the I/O section about? ..... 82
9.18 What is the Objects section for? ..... 82
9.19 What is the VM Objects section for? ..... 82
9.20 What does AVG RTT mean? ..... 82
9.21 In the IP cache section, what's the difference between a hit, a negative hit and a miss? ..... 82
9.22 What do the IP cache contents mean anyway? ..... 82
9.23 What is the fqdncache and how is it different from the ipcache? ..... 83
9.24 What does "Page faults with physical i/o: 4897" mean? ..... 83
9.24.1 Ok, so what is unusually high? ..... 85
9.25 What does the IGNORED field mean in the 'cache server list'? ..... 85
10 Access Controls ..... 85
10.1 Introduction ..... 85
10.1.1 ACL elements ..... 86
10.1.2 Access Lists ..... 87
10.2 How do I allow my clients to use the cache? ..... 88
10.3 How do I configure Squid not to cache a specific server? ..... 88
10.4 How do I implement an ACL ban list? ..... 88
10.5 How do I block specific users or groups from accessing my cache? ..... 89
10.5.1 Ident ..... 89
10.5.2 Proxy Authentication ..... 89
10.6 Do you have a CGI program which lets users change their own proxy passwords? ..... 89
10.7 Is there a way to do ident lookups only for a certain host and compare the result with a userlist in squid.conf? ..... 89
10.8 Common Mistakes ..... 90
10.8.1 And/Or logic ..... 90
10.8.2 allow/deny mixups ..... 90
10.8.3 Differences between src and srcdomain ACL types. ..... 91
10.9 I set up my access controls, but they don't work! Why? ..... 91
10.10 Proxy-authentication and neighbor caches ..... 92
10.11 Is there an easy way of banning all Destination addresses except one? ..... 92
10.12 Does anyone have a ban list of porn sites and such? ..... 93
10.13 Squid doesn't match my subdomains ..... 93
10.14 Why does Squid deny some port numbers? ..... 94
10.15 Does Squid support the use of a database such as mySQL for storing the ACL list? ..... 94
10.16 How can I allow a single address to access a specific URL? ..... 94
10.17 How can I allow some clients to use the cache at specific times? ..... 94
10.18 How can I allow some users to use the cache at specific times? ..... 95
10.19 Problems with IP ACLs that have complicated netmasks ..... 95
10.20 Can I set up ACLs based on MAC address rather than IP? ..... 95
10.21 Debugging ACLs ..... 96
10.22 Can I limit the number of connections from a client? ..... 96
10.23 I'm trying to deny foo.com, but it's not working. ..... 96
10.24 I want to customize, or make my own error messages. ..... 96
11 Troubleshooting ..... 97
11.1 Why am I getting "Proxy Access Denied?" ..... 97
11.2 I can't get local_domain to work; Squid is caching the objects from my local servers. ..... 97
11.3 I get Connection Refused when the cache tries to retrieve an object located on a sibling, even though the sibling thinks it delivered the object to my cache. ..... 97
11.4 Running out of filedescriptors ..... 98
11.4.1 Linux ..... 98
11.4.2 Solaris ..... 98
11.4.3 FreeBSD ..... 99
11.4.4 General BSD ..... 99
11.4.5 Reconfigure afterwards ..... 100
11.5 What are these strange lines about removing objects? ..... 100
11.6 Can I change a Windows NT FTP server to list directories in Unix format? ..... 100
11.7 Why am I getting "Ignoring MISS from non-peer x.x.x.x?" ..... 101
11.8 DNS lookups for domain names with underscores (_) always fail. ..... 101
11.9 Why does Squid say: "Illegal character in hostname; underscores are not allowed?" ..... 101
11.10 Why am I getting access denied from a sibling cache? ..... 102
11.11 Cannot bind socket FD NN to *:8080 (125) Address already in use ..... 103
11.12 icpDetectClientClose: ERROR xxx.xxx.xxx.xxx: (32) Broken pipe ..... 103
11.13 icpDetectClientClose: FD 135, 255 unexpected bytes ..... 103
11.14 Does Squid work with NTLM Authentication? ..... 103
11.15 The default parent option isn't working! ..... 104
11.16 "Hot Mail" complains about: Intrusion Logged. Access denied. ..... 104
11.17 My Squid becomes very slow after it has been running for some time. ..... 105
11.18 WARNING: Failed to start 'dnsserver' ..... 105
11.19 Sending in Squid bug reports ..... 105
11.19.1 crashes and core dumps ..... 106
11.20 Debugging Squid ..... 108
11.21 FATAL: ipcache_init: DNS name lookup tests failed ..... 109
11.22 FATAL: Failed to make swap directory /var/spool/cache: (13) Permission denied ..... 109
11.23 FATAL: Cannot open HTTP Port ..... 110
11.24 FATAL: All redirectors have exited! ..... 110
11.25 FATAL: file_map_allocate: Exceeded filemap limit ..... 110
11.26 FATAL: You've run out of swap file numbers. ..... 110
11.27 I am using up over 95% of the filemap bits?!! ..... 111
11.28 FATAL: Cannot open /usr/local/squid/logs/access.log: (13) Permission denied ..... 111
11.29 When using a username and password, I can not access some files. ..... 111
11.30 pingerOpen: icmp_sock: (13) Permission denied ..... 112
11.31 What is a forwarding loop? ..... 112
11.32 accept failure: (71) Protocol error ..... 113
11.33 storeSwapInFileOpened: ... Size mismatch ..... 113
11.34 Why do I get fwdDispatch: Cannot retrieve 'https://www.buy.com/corp/ordertracking.asp' ..... 113
11.35 Squid can't access URLs like http://3626046468/ab2/cybercards/moreinfo.html ..... 114
11.36 I get a lot of "URI has whitespace" error messages in my cache log, what should I do? ..... 114
11.37 commBind: Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address ..... 115
11.38 Unknown cache_dir type '/var/squid/cache' ..... 115
11.39 unrecognized: 'cache_dns_program /usr/local/squid/bin/dnsserver' ..... 115
11.40 Is dns_defnames broken in Squid-2.3 and later ..... 115
11.41 What does sslReadClient: FD 14: read failure: (104) Connection reset by peer mean? ..... 115
11.42 What does Connection refused mean? ..... 116
11.43 squid: ERROR: no running copy ..... 116
11.44 FATAL: getgrnam failed to find groupid for effective group 'nogroup' ..... 117
11.45 "Unsupported Request Method and Protocol" for https URLs. ..... 117
11.46 Squid uses 100% CPU ..... 117
11.47 Webmin's cachemgr.cgi crashes the operating system ..... 117
11.48 Segment Violation at startup or upon first request ..... 117
11.49 urlParse: Illegal character in hostname 'proxy.mydomain.com:8080proxy.mydomain.com' ..... 118
11.50 Requests for international domain names do not work ..... 118
11.51 Why do I sometimes get "Zero Sized Reply"? ..... 119
12 How does Squid work? ..... 120
12.1 What are cachable objects? ..... 120
12.2 What is the ICP protocol? ..... 120
12.3 What is the dnsserver? ..... 121
12.4 What is the ftpget program for? ..... 121
12.5 FTP PUT's don't work! ..... 121
12.6 What is a cache hierarchy? What are parents and siblings? ..... 121
12.7 What is the Squid cache resolution algorithm? ..... 121
12.8 What features are Squid developers currently working on? ..... 122
12.9 Tell me more about Internet traffic workloads ..... 122
12.10 What are the tradeoffs of caching with the NLANR cache system? ..... 122
12.11 Where can I find out more about firewalls? ..... 122
12.12 What is the "Storage LRU Expiration Age?" ..... 122
12.13 What is "Failure Ratio at 1.01; Going into hit-only-mode for 5 minutes"? ..... 123
12.14 Does squid periodically re-read its configuration file? ..... 123
12.15 How does unlinkd work? ..... 123
12.16 What is an icon URL? ..... 123
12.17 Can I make my regular FTP clients use a Squid cache? ..... 124
12.18 Why is the select loop average time so high? ..... 124
12.19 How does Squid deal with Cookies? ..... 124
12.20 How does Squid decide when to refresh a cached object? ..... 125
12.20.1 Squid-1.1 and Squid-1.NOVM algorithm ..... 125
12.20.2 Squid-2 algorithm ..... 126
12.21 What exactly is a deferred read? ..... 126
12.22 Why is my cache's inbound traffic equal to the outbound traffic? ..... 126
12.23 How come some objects do not get cached? ..... 127
12.24 What does keep-alive ratio mean? ..... 128
12.25 How does Squid's cache replacement algorithm work? ..... 128
12.25.1 Squid 1.1 ..... 129
12.25.2 Squid 2 ..... 129
12.26 What are private and public keys? ..... 129
12.27 What is FORW_VIA_DB for? ..... 130
12.28 Does Squid send packets to port 7 (echo)? If so, why? ..... 130
12.29 What does "WARNING: Reply from unknown nameserver [a.b.c.d]" mean? ..... 130
12.30 How does Squid distribute cache files among the available directories? ..... 131
12.31 Why do I see negative byte hit ratio? ..... 131
12.32 What does "Disabling use of private keys" mean? ..... 131
12.33 What is a half-closed filedescriptor? ..... 132
12.34 What does --enable-heap-replacement do? ..... 132
12.35 Why is actual filesystem space used greater than what Squid thinks? ..... 132
12.36 How do positive_dns_ttl and negative_dns_ttl work? ..... 133
12.37 What does swapin MD5 mismatch mean? ..... 133
12.38 What does failed to unpack swapfile meta data mean? ..... 134
12.39 Why doesn't Squid make ident lookups in interception mode? ..... 134
12.40 dnsSubmit: queue overload, rejecting blah ..... 134
12.41 What are FTP passive connections? ..... 134
13 Multicast
135
CONTENTS
10
13.2 How do I know if my network has multicast? . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 13.3 Should I be using Multicast ICP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 13.4 How do I congure Squid to send Multicast ICP queries? . . . . . . . . . . . . . . . . . . . . 136 13.5 How do I know what Multicast TTL to use? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 13.6 How do I congure Squid to receive and respond to Multicast ICP? . . . . . . . . . . . . . . . 137
14 System-Dependent Weirdnesses 137

14.1 Solaris 137
14.1.1 TCP incompatibility? 137
14.1.2 select() 137
14.1.3 malloc 138
14.1.4 DNS lookups and nscd 138
14.1.5 DNS lookups and /etc/nsswitch.conf 138
14.1.6 DNS lookups and NIS 138
14.1.7 Tuning 139
14.1.8 disk write error: (28) No space left on device 139
14.1.9 Solaris X86 and IPFilter 140
14.1.10 Changing the directory lookup cache size 140
14.1.11 The priority paging algorithm 140
14.2 FreeBSD 141
14.2.1 T/TCP bugs 141
14.2.2 mbuf size 141
14.2.3 Dealing with NIS 142
14.2.4 FreeBSD 3.3: The lo0 (loop-back) device is not configured on startup 142
14.2.5 FreeBSD 3.x or newer: Speed up disk writes using Softupdates 143
14.2.6 Internal DNS problems with jail environment 143
14.3 OSF1/3.2 143
14.4 BSD/OS 143
14.4.1 gcc/yacc 143
14.4.2 process priority 144
14.5 Linux 144
14.5.1 Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address 144
14.5.2 FATAL: Don't run Squid as root, set 'cache_effective_user'! 144
14.5.3 Large ACL lists make Squid slow 145
14.5.4 gethostbyname() leaks memory in RedHat 6.0 with glibc 2.1.1. 145
14.5.5 assertion failed: StatHist.c:91: `statHistBin(H, max) == H->capacity - 1' on Alpha system. 145
14.5.6 tools.c:605: storage size of `rl' isn't known 145
14.5.7 Can't connect to some sites through Squid 145
14.6 HP-UX 146
14.6.1 StatHist.c:74: failed assertion `statHistBin(H, min) == 0' 146
14.7 IRIX 146
14.7.1 dnsserver always returns 255.255.255.255 146
14.8 SCO-UNIX 146
14.9 AIX 147
14.9.1 "shmat failed" errors with diskd 147
14.9.2 Core dumps when squid process grows to 256MB 147
15 Redirectors 147

15.1 What is a redirector? 147
15.2 Why use a redirector? 147
15.3 How does it work? 147
15.4 Do you have any examples? 148
15.5 Can I use the redirector to return HTTP redirect messages? 148
15.6 FATAL: All redirectors have exited! 148
15.7 Redirector interface is broken re IDENT values 148
16 Cache Digests 149

16.1 What is a Cache Digest? 149
16.2 How and why are they used? 149
16.3 What is the theory behind Cache Digests? 149
16.3.1 Adding a Key 150
16.3.2 Querying a Key 150
16.3.3 Deleting a Key 150
16.4 How is the size of the Cache Digest in Squid determined? 150
16.5 What hash functions (and how many of them) does Squid use? 151
16.6 How are objects added to the Cache Digest in Squid? 151
16.7 Does Squid support deletions in Cache Digests? What are diffs/deltas? 151
16.8 When and how often is the local digest built? 152
16.9 How are Cache Digests transferred between peers? 152
16.10 How and where are Cache Digests stored? 152
16.10.1 Cache Digest built locally 152
16.10.2 Cache Digest fetched from peer 153
16.11 How are the Cache Digest statistics in the Cache Manager to be interpreted? 153
16.12 What are False Hits and how should they be handled? 154
16.13 How can Cache Digest related activity be traced/debugged? 155
16.13.1 Enabling Cache Digests 155
16.13.2 What do the access.log entries look like? 155
16.13.3 What does a False Hit look like? 155
16.13.4 How is the cause of a False Hit determined? 155
16.13.5 Use The Source 156
16.14 What about ICP? 156
16.15 Is there a Cache Digest Specification? 156
16.16 Would it be possible to stagger the timings when cache digests are retrieved from peers? 156
17 Interception Caching/Proxying 157

17.1 Interception caching for Solaris, SunOS, and BSD systems 158
17.1.1 Install IP Filter 158
17.1.2 Configure ipnat 158
17.1.3 Configure Squid 158
17.2 Interception caching with Linux 2.0 and ipfwadm 159
17.3 Interception caching with Linux 2.2 and ipchains 162
17.4 Interception caching with Linux 2.4 and netfilter 163
17.5 Interception caching with Cisco routers 164
17.5.1 possible bugs 164
17.6 Interception caching with LINUX 2.0.29 and CISCO IOS 11.1 165
17.7 The cache is trying to connect to itself... 166
17.8 Interception caching with FreeBSD 167
17.9 Interception caching with ACC Tigris digital access server 168
17.10 "Connection reset by peer" and Cisco policy routing 169
17.11 WCCP - Web Cache Coordination Protocol 169
17.11.1 Does Squid support WCCP? 169
17.11.2 Configuring your Router 169
17.11.3 IOS 12.x problems 170
17.11.4 Configuring FreeBSD 170
17.11.5 Configuring Linux 2.2 171
17.11.6 Configuring Others 172
17.12 Can someone tell me what version of cisco IOS WCCP is added in? 172
17.13 What about WCCPv2? 173
17.14 Interception caching with Foundry L4 switches 173
17.15 Can I use proxy auth with interception? 174
18 SNMP 174

18.1 Does Squid support SNMP? 174
18.2 Enabling SNMP in Squid 174
18.3 Configuring Squid 2.2 174
18.4 Configuring Squid 2.1 175
18.5 How can I query the Squid SNMP Agent 176
18.6 What can I use SNMP and Squid for? 176
18.7 How can I use SNMP with Squid? 176
18.8 Where can I get more information/discussion about Squid and SNMP? 176
18.9 Monitoring Squid with MRTG 177
19 Squid version 2 177

19.1 What are the new features? 177
19.2 How do I configure 'ssl_proxy' now? 178
19.3 Logfile rotation doesn't work with Async I/O 178
19.4 Adding a new cache disk 178
19.5 Squid 2 performs badly on Linux 178
19.6 How do I configure proxy authentication with Squid-2? 179
19.7 Why does proxy-auth reject all users after upgrading from Squid-2.1 or earlier? 179
19.8 Delay Pools 180
19.8.1 How can I limit Squid's total bandwidth to, say, 512 Kbps? 181
19.8.2 How to limit a single connection to 128 Kbps? 181
19.8.3 How do you personally use delay pools? 182
19.8.4 Where else can I find out about delay pools? 183
19.9 Can I preserve my cache when upgrading from 1.1 to 2? 186
19.10 Customizable Error Messages 186
19.11 My squid.conf from version 1.1 doesn't work! 188
20 httpd-accelerator mode 189

20.1 What is the httpd-accelerator mode? 189
20.2 How do I set it up? 190
20.3 When using an httpd-accelerator, the port number for redirects is wrong 190
21 Related Software 191

21.1 Clients 191
21.1.1 Wget 191
21.1.2 echoping 191
21.2 Logfile Analysis 191
21.3 Configuration Tools 191
21.3.1 3Dhierarchy.pl 191
21.4 Squid add-ons 191
21.4.1 transproxy 191
21.4.2 Iain's redirector package 191
21.4.3 Junkbusters 192
21.4.4 Squirm 192
21.4.5 chpasswd.cgi 192
21.4.6 jesred 192
21.4.7 squidGuard 192
21.4.8 Central Squid Server 192
21.4.9 Cerberian content filter (subscription service) 193
21.5 Ident Servers 193
22 DISKD 193

22.1 What is DISKD? 193
22.2 Does it perform better? 193
22.3 How do I use it? 193
22.4 FATAL: Unknown cache_dir type 'diskd' 193
22.5 If I use DISKD, do I have to wipe out my current cache? 193
22.6 How do I configure message queues? 194
22.6.1 FreeBSD 194
22.6.2 OpenBSD 195
22.6.3 Digital Unix 195
22.6.4 Linux 195
22.6.5 Solaris 196
22.7 How do I configure shared memory? 196
22.7.1 FreeBSD 197
22.7.2 OpenBSD 197
22.7.3 Digital Unix 197
22.7.4 Linux 198
22.7.5 Solaris 198
22.8 Sometimes shared memory and message queues aren't released when Squid exits. 198
22.9 What are the Q1 and Q2 parameters? 198
23 Authentication 199

23.1 How does Proxy Authentication work in Squid? 199
23.2 How do I use authentication in access controls? 200
23.3 Does Squid cache authentication lookups? 200
23.4 Are passwords stored in clear text or encrypted? 200
23.5 How do I use the Winbind authenticators? 201
23.5.1 Supported Samba Releases 201
23.5.2 Configure Samba 201
23.5.3 Configure Squid 203
25 Security Concerns 205

25.1 Open-access proxies 205
25.2 Mail relaying 205

You can download the FAQ as PDF <FAQ.pdf>, compressed Postscript <FAQ.ps.gz>, plain text <FAQ.txt>, linuxdoc SGML source <FAQ.sgml> or as a compressed tar of HTML <FAQ.tar.gz>.
Linux
FreeBSD
NetBSD
OpenBSD
BSDI
Mac OS/X
OSF/Digital Unix/Tru64
IRIX
SunOS/Solaris
NeXTStep
SCO Unix
AIX
HP-UX
For more specific information, please see platforms.html <http://www.squid-cache.org/platforms.html>. If you encounter any platform-specific problems, please let us know by registering an entry in our bug database <http://www.squid-cache.org/bugs/>.
Guido Serassio <http://www.acmeconsulting.it/SquidNT/> maintains the native NT port of Squid and is actively working on having the needed changes integrated into the standard Squid distribution. Partially based on earlier NT port by Romeo Anghelache <http://www.phys-iasi.ro/users/romeo/squidnt.htm>. LogiSense <http://www.logisense.com/> has ported Squid to Windows NT and sells a supported version. You can also download the source from their FTP site <ftp://ftp.logisense.com/cachexpress/>. Thanks to LogiSense for making the code available as required by the GPL terms.
1.9 What Squid mailing lists are available?

[email protected]: general discussions about the Squid cache software. Subscribe via [email protected]. Previous messages are available for browsing at <http://www.squid-cache.org/mail-archive/squid-users/> and at <http://marc.theaimsgroup.com/?l=squid-users&r=1&w=2>.
[email protected]: A closed list for sending us bug reports. Bug reports received here are given priority over those mentioned on squid-users.

[email protected]: A closed list for sending us feedback and ideas.

[email protected]: A closed list for sending us feedback, updates, and additions to the Squid FAQ.

We also have a few other mailing lists which are not strictly Squid-related.
[email protected] : A public list for discussion of Web Caching and SNMP issues and developments. Eventually we hope to put forth a standard Web Caching MIB. It may be resurrected some day, you never know!
[email protected] : Mostly-idle mailing list for the nonexistent ICP Working Group within the IETF.
1.10 I can't figure out how to unsubscribe from your mailing list.

All of our mailing lists have "-subscribe" and "-unsubscribe" addresses that you must use for subscribe and unsubscribe requests. To unsubscribe from the squid-users list, you send a message to [email protected].
1.11 What other Squid-related documentation is available?

The Squid home page <http://www.squid-cache.org/> for information on the Squid software.

Squid: The Definitive Guide <http://squidbook.org/>, to be published by O'Reilly and Associates <http://www.oreilly.com/catalog/squid/>, January 2004.

The IRCache Mesh <http://www.ircache.net/> gives information on our operational mesh of caches.

The Squid FAQ <http://www.squid-cache.org/Doc/FAQ/> (uh, you're reading it).

Oskar's Squid Users Guide <http://squid-docs.sourceforge.net/latest/html/book1.html>.

Visolve's Configuration Guide <http://www.visolve.com/squidman/Configuration Guide.html>.

Squid documentation in German <http://www.squid-handbuch.de/>, Turkish <http://istanbul.linux.org.tr/~ilkerg/squid/elkitabi.html>, Italian <http://merlino.merlinobbs.net/Squid-Book/>, Brazilian Portugese <http://www.linuxman.pro.br/squid/>, and another in Brazilian Portugese <http://www.geocities.com/glasswalk3r/linux/squidnomicon.html>.

The Squid Programmers Guide <http://www.squid-cache.org/Doc/Prog-Guide/prog-guide.html>. Yeah, it's extremely incomplete. I assure you this is the most recent version.
Web Caching Resources <http://www.web-cache.com>

Squid-1.0 Release Notes </Versions/1.0/Release-Notes-1.0.txt>

Squid-1.1 Release Notes </Versions/1.1/Release-Notes-1.1.txt>

Tutorial on Configuring Hierarchical Squid Caches <http://www.squid-cache.org/Doc/Hierarchy-Tutorial/>

RFC 2186 <ftp://ftp.isi.edu/in-notes/rfc2186.txt> ICPv2 - Protocol
RFC 2187 <ftp://ftp.isi.edu/in-notes/rfc2187.txt> ICPv2 - Application
RFC 1016 <ftp://ftp.isi.edu/in-notes/rfc1016.txt>
GNU General Public License
cache.log timestamps use 4-digit years instead of just 2 digits. parse_rfc1123() assumes years less than "70" are after 2000. parse_iso3307_time() checks all four year digits.
Year-2000 fixes were applied to the following Squid versions:
squid-2.1 </Versions/v2/2.1/>: Year parsing bug fixed for dates in the "Wed Jun 9 01:29:59 1993 GMT" format (Richard Kettlewell).
squid-1.1.22: Fixed likely year-2000 bug in ftpget's timestamp parsing (Henrik Nordstrom).
squid-1.1.20: Misc fixes (Arjan de Vet).
Patches:
Richard's lib/rfc1123.c patch <../Y2K/patch3>. If you are still running 1.1.X, then you should apply
this patch to your source and recompile.
Squid-2.2 and earlier versions have a New Year bug <http://www.squid-cache.org/Versions/v2/2.2/bugs/index.html#sq This is not strictly a Year-2000 bug; it would happen on the first day of any year.
Cord Beermann <mailto:[email protected]>
Tony Sterrett <mailto:[email protected]>
Gerard Hynes <mailto:[email protected]>
Katayama, Takeo <mailto:[email protected]>
Duane Wessels <mailto:[email protected]>
K Clay <mailto:[email protected]>
Paul Southworth <mailto:[email protected]>
Oskar Pearson <mailto:[email protected]>
Ong Beng Hui <mailto:[email protected]>
Torsten Sturm <mailto:[email protected]>
James R Grinter <mailto:[email protected]>
Rodney van den Oever <mailto:[email protected]>
Kolics Bertold <mailto:[email protected]>
Carson Gaspar <mailto:[email protected]>
Michael O'Reilly <mailto:[email protected]>
Hume Smith <mailto:[email protected]>
Richard Ayres <mailto:[email protected]>
John Saunders <mailto:[email protected]>
Miquel van Smoorenburg <mailto:[email protected]>
David J N Begley <mailto:[email protected]>
Kevin Sartorelli <mailto:[email protected]>
Andreas Doering <mailto:[email protected]>
Mark Visser <mailto:[email protected]>
tom minchin <mailto:[email protected]>
Jens-S. Vöckler <mailto:[email protected]>
Andre Albsmeier <mailto:[email protected]>
Doug Nazar <mailto:[email protected]>
Henrik Nordstrom <mailto:[email protected]>
Mark Reynolds <mailto:[email protected]>
Arjan de Vet <mailto:[email protected]>
Peter Wemm <mailto:[email protected]>
John Line <mailto:[email protected]>
Jason Armistead <mailto:[email protected]>
Chris Tilbury <mailto:[email protected]>
Jeff Madison <mailto:[email protected]>
Mike Batchelor <mailto:[email protected]>
Bill Bogstad <mailto:[email protected]>
Radu Greab <mailto:radu at netsoft dot ro>
F.J. Bosscha <mailto:[email protected]>
Brian Feeny <mailto:[email protected]>
Martin Lyons <mailto:[email protected]>
David Luyer <mailto:[email protected]>
Chris Foote <mailto:[email protected]>
Jens Elkner <mailto:[email protected]>
Simon White <mailto:[email protected]>
Jerry Murdock <mailto:jmurdoc at itraktech dot com>
Gerard Eviston <mailto:geviston at bigpond dot net dot au>
Rob Poe <mailto:rob at poeweb dot com>

Please send corrections, updates, and comments to: <mailto:[email protected]>.
This document was written in linuxdoc SGML and formatted with the SGML-Tools package.
The most current version of this document can always be found at <http://www.squid-cache.org/Doc/FAQ/> in HTML, Plain Text, Postscript and SGML formats.
For Squid-2 you must run the configure script yourself before running make:

% tar xzf squid-2.0.RELEASE-src.tar.gz
% cd squid-2.0.RELEASE
% ./configure
% make
After the patch has been applied, you must rebuild Squid from the very beginning, i.e.:
make distclean
./configure ...
make
make install
If your patch program seems to complain or refuses to work, you should get a more recent version, from the GNU FTP site <ftp://ftp.gnu.ai.mit.edu/pub/gnu/>, for example.
Type
% ./configure --help
to see all available options. You will need to specify some of these options to enable or disable certain features. Some options which are used often include:
--prefix=PREFIX                  install architecture-independent files in PREFIX [/usr/local/squid]
--enable-dlmalloc[=LIB]          Compile & use the malloc package by Doug Lea
--enable-gnuregex                Compile GNUregex
--enable-splaytree               Use SPLAY trees to store ACL lists
--enable-xmalloc-debug           Do some simple malloc debugging
--enable-xmalloc-debug-trace     Detailed trace of memory allocations
--enable-xmalloc-statistics      Show malloc statistics in status page
--enable-carp                    Enable CARP support
--enable-async-io                Do ASYNC disk I/O using threads
--enable-icmp                    Enable ICMP pinging
--enable-delay-pools             Enable delay pools to limit bandwidth usage
--enable-mem-gen-trace           Do trace of memory stuff
--enable-useragent-log           Enable logging of User-Agent header
--enable-kill-parent-hack        Kill parent on shutdown
--enable-snmp                    Enable SNMP monitoring
--enable-cachemgr-hostname[=hostname]  Make cachemgr.cgi default to this host
--enable-arp-acl                 Enable use of ARP ACL lists (ether address)
--enable-htcp                    Enable HTCP protocol
--enable-forw-via-db             Enable Forw/Via database
--enable-cache-digests           Use Cache Digests, see http://www.squid-cache.org/Doc/FAQ/FAQ-16.html
--enable-err-language=lang       Select language for Error pages (see errors dir)
2.8 undefined reference to _inet_ntoa

(Answer provided by Andreas Doering, among others.)
Probably you've recently installed bind 8.x. There is a mismatch between the header files and DNS library that Squid has found. There are a couple of things you can try. First, try adding -lbind to XTRA_LIBS in src/Makefile. If -lresolv is already there, remove it. If that doesn't seem to work, edit your arpa/inet.h file and comment out the following:
#define inet_addr       __inet_addr
#define inet_aton       __inet_aton
#define inet_lnaof      __inet_lnaof
#define inet_makeaddr   __inet_makeaddr
#define inet_neta       __inet_neta
#define inet_netof      __inet_netof
#define inet_network    __inet_network
#define inet_net_ntop   __inet_net_ntop
2.9 How can I get true DNS TTL info into Squid's IP cache?
If you have source for BIND, you can modify it as indicated in the diff below. It causes the global variable _dns_ttl_ to be set with the TTL of the most recent lookup. Then, when you compile Squid, the configure script will look for the _dns_ttl_ symbol in libresolv.a. If found, dnsserver will return the TTL value for every lookup. This hack was contributed by Endre Balint Nagy <mailto:[email protected]>.
diff -ru bind-4.9.4-orig/res/gethnamaddr.c bind-4.9.4/res/gethnamaddr.c
--- bind-4.9.4-orig/res/gethnamaddr.c   Mon Aug  5 02:31:35 1996
+++ bind-4.9.4/res/gethnamaddr.c        Tue Aug 27 15:33:11 1996
@@ -133,6 +133,7 @@
 } align;
 
 extern int h_errno;
+int _dns_ttl_;
 
 #ifdef DEBUG
 static void
@@ -223,6 +224,7 @@
 	host.h_addr_list = h_addr_ptrs;
 	haveanswer = 0;
 	had_error = 0;
+	_dns_ttl_ = -1;
 	while (ancount-- > 0 && cp < eom && !had_error) {
 		n = dn_expand(answer->buf, eom, cp, bp, buflen);
 		if ((n < 0) || !(*name_ok)(bp)) {
@@ -232,8 +234,11 @@
 		cp += n;			/* name */
 		type = _getshort(cp);
 		cp += INT16SZ;			/* type */
-		class = _getshort(cp);
-		cp += INT16SZ + INT32SZ;	/* class, TTL */
+		class = _getshort(cp);
+		cp += INT16SZ;			/* class */
+		if (qtype == T_A && type == T_A)
+			_dns_ttl_ = _getlong(cp);
+		cp += INT32SZ;			/* TTL */
 		n = _getshort(cp);
 		cp += INT16SZ;			/* len */
 		if (class != C_IN) {
Note on the second line the /usr/bin/false. This is supposed to be a path to the ar program. If configure cannot find ar on your system, then it substitutes false. To fix this you either need to:

Add /usr/ccs/bin to your PATH. This is where the ar command should be. You need to install SUNWbtool if ar is not there. Otherwise,

Install the binutils package from the GNU FTP site <ftp://ftp.gnu.org/gnu/binutils>. This package includes programs such as ar, as, and ld.
You will need to run scripts/convert.configure.to.os2 (in the Squid source distribution) to modify the configure script so that it can search for the various programs. Next, you need to set a few environment variables (see EMX docs for meaning):
export EMXOPT="-h256 -c"
export LDFLAGS="-Zexe -Zbin -s"
Compile everything:
make
This will, by default, install into /usr/local/squid. If you wish to install somewhere else, see the --prefix option for configure. Now, don't forget to set EMXOPT before running squid each time. I recommend using the -Y and -N options.
If you have enabled the ICMP features (see section 7.6) then you will also want to type
% su
# make install-pinger
After installing, you will want to edit and customize the squid.conf file. By default, this file is located at /usr/local/squid/etc/squid.conf. Also, a QUICKSTART guide has been included with the source distribution. Please see the directory where you unpacked the source archive.
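As a rough sketch of what such a customization might look like (directive names as in Squid-2.x; the port, path, size, and user values below are examples only, not recommendations), a minimal squid.conf could contain:

```
# Hypothetical minimal configuration -- adjust values to your installation
http_port 3128
cache_dir ufs /usr/local/squid/var/cache 100 16 256
cache_effective_user squid
```

See the QUICKSTART guide and the comments in the default squid.conf for the full set of directives.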
You can check the configuration file for syntax errors by running squid with the -k parse option. If this outputs any errors then these are syntax errors or other fatal misconfigurations and they need to be corrected before you continue. If it is silent and immediately gives back the command prompt then your squid.conf is syntactically correct and could be understood by Squid. After you've finished editing the configuration file, you can start Squid for the first time. The procedure depends a little bit on which version you are using. First, you must create the swap directories. Do this by running Squid with the -z option:
% /usr/local/squid/sbin/squid -z
NOTE: If you run Squid as root then you may need to first create /usr/local/squid/var/logs and your cache_dir directories and assign ownership of these to the cache effective user configured in your squid.conf. Once the creation of the cache directories completes, you can start Squid and try it out. Probably the best thing to do is run it from your terminal and watch the debugging output. Use this command:
% /usr/local/squid/sbin/squid -NCd1
If you want to run squid in the background, as a daemon process, just leave off all options:
% /usr/local/squid/sbin/squid
NOTE: depending on which http_port you select you may need to start squid as root (http_port < 1024). NOTE: In Squid-2.4 and earlier Squid was installed in bin by default, not sbin.
Squid will automatically background itself and then spawn a child process. In your syslog messages file, you should see something like this:
Sep 23 23:55:58 kitty squid[14616]: Squid Parent: child process 14617 started
That means that process ID 14616 is the parent process which monitors the child process (pid 14617). The child process is the one that does all of the work. The parent process just waits for the child process to exit. If the child process exits unexpectedly, the parent will automatically start another child process. In that case, syslog shows:
Sep 23 23:56:02 kitty squid[14616]: Squid Parent: child process 14617 exited with status 1 Sep 23 23:56:05 kitty squid[14616]: Squid Parent: child process 14619 started
If there is some problem, and Squid can not start, the parent process will give up after a while. Your syslog will show:
Sep 23 23:56:12 kitty squid[14616]: Exiting due to repeated, frequent failures
When this happens you should check your syslog messages and cache.log file for error messages. When you look at a process (ps command) listing, you'll see two squid processes:
24353  ??  Ss     0:00.00 /usr/local/squid/bin/squid
24354  ??  R      0:03.39 (squid) (squid)
The first is the parent process, and the child process is the one called "(squid)". Note that if you accidentally kill the parent process, the child process will not notice. If you want to run Squid from your terminal and prevent it from backgrounding and spawning a child process, use the -N command line option.
/usr/local/squid/bin/squid -N
We recommend using a squid.sh shell script, but you could instead call Squid directly with the -N option and other options you may require. A sample squid.sh script is shown below:
#!/bin/sh
C=/usr/local/squid
PATH=/usr/bin:$C/bin
TZ=PST8PDT
export PATH TZ

# User to notify on restarts
notify="root"

# Squid command line options
opts=""

cd $C
umask 022
sleep 10
while [ -f /var/run/nosquid ]; do
	sleep 1
done
/usr/bin/tail -20 $C/logs/cache.log \
	| Mail -s "Squid restart on `hostname` at `date`" $notify
exec bin/squid -N $opts
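The `while [ -f /var/run/nosquid ]` loop in the script implements a simple pause convention: create the flag file to hold off a restart, remove it to let the loop proceed. A minimal sketch of the same check, using a temporary file instead of /var/run/nosquid so it can run anywhere:

```shell
# Stand-in for /var/run/nosquid; mktemp avoids touching /var/run here
flag=$(mktemp)

# While the flag file exists, a restart loop like squid.sh would wait
if [ -f "$flag" ]; then state="paused"; else state="running"; fi

# Removing the flag releases the loop and Squid gets restarted
rm -f "$flag"
if [ -f "$flag" ]; then state2="paused"; else state2="running"; fi

echo "$state -> $state2"
```

With the real script you would `touch /var/run/nosquid` before maintenance and `rm /var/run/nosquid` afterwards.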
Squid ships with an init.d type startup script in contrib/squid.rc which works on most init.d type systems. Or you can write your own using any normal init.d script found on your system as a template and add the start/stop fragments shown below. Start:
/usr/local/squid/sbin/squid
Stop:
/usr/local/squid/sbin/squid -k shutdown
n=120
while /usr/local/squid/sbin/squid -k check && [ $n -gt 0 ]; do
    sleep 1
    echo -n .
    n=`expr $n - 1`
done
There are other command-line HTTP client programs available as well. Two that you may find useful are wget <ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/> and echoping <ftp://ftp.internatif.org/pub/unix/echoping/>. Another way is to use Squid itself to see if it can signal a running Squid process:
% squid -k check
And then check the shell's exit status variable. Also, check the log files, most importantly the access.log and cache.log files.
-a
Specify an alternate port number for incoming HTTP requests. Useful for testing a configuration file on a non-standard port.
-d
Debugging level for "stderr" messages. If you use this option, then debugging messages up to the specified level will also be written to stderr.
-f
Specify an alternate squid.conf file instead of the pathname compiled into the executable.
-k reconfigure
Sends a HUP signal, which causes Squid to re-read its configuration files.
-k rotate
Sends a USR1 signal, which causes Squid to rotate its log files. Note, if logfile_rotate is set to zero, Squid still closes and re-opens all log files.
-k shutdown
Sends a TERM signal, which causes Squid to wait briefly for current connections to finish and then exit. The amount of time to wait is specified with shutdown_lifetime.
-k interrupt
Sends an INT signal, which causes Squid to shut down immediately, without waiting for current connections.
-k kill
Sends a KILL signal, which causes the Squid process to exit immediately, without closing any connections or log files. Use this only as a last resort.
-k debug
Sends a USR2 signal, which causes Squid to generate full debugging messages until the next USR2 signal is received. Obviously very useful for debugging problems.
-k check
Sends a "ZERO" signal to the Squid process. This simply checks whether or not the process is actually running.
-s
Send debugging (level 0 only) messages to syslog.
-u
Specify an alternate port number for ICP messages. Useful for testing a configuration file on a non-standard port.
-v
Prints the Squid version.
-z
Creates disk swap directories. You must use this option when installing Squid for the first time, or when you add or modify the cache_dir configuration.
-D
Do not make initial DNS tests. Normally, Squid looks up some well-known DNS hostnames to ensure that your DNS name resolution service is working properly.
-F
If the swap.state logs are clean, then the cache is rebuilt in the "foreground" before any requests are served. This will decrease the time required to rebuild the cache, but HTTP requests will not be satisfied during this time.
-R
Do not set the SO_REUSEADDR option on sockets.
-V
Enable virtual host support for the httpd-accelerator mode. This is identical to writing httpd_accel_host virtual in the config file.
-X
Enable full debugging while parsing the config file.
-Y
Return ICP_OP_MISS_NOFETCH instead of ICP_OP_MISS while the swap.state file is being read. If your cache has mostly child caches which use ICP, this will allow your cache to rebuild faster.
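The shutdown_lifetime option mentioned under -k shutdown above is set in squid.conf; a minimal sketch (the 30-second value is only an illustration, pick a period that suits your users):

```
shutdown_lifetime 30 seconds
```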
3.9 How do I see how Squid works?

Check the cache.log file in your logs directory. It logs interesting (and boring) things as a part of its normal operation.
4 Configuration issues
4.1 How do I join a cache hierarchy?
To place your cache in a hierarchy, use the cache_peer directive in squid.conf to specify the parent and sibling nodes. For example, the following squid.conf file on childcache.example.com configures its cache to retrieve data from one parent cache and two sibling caches:
# squid.conf - On the host: childcache.example.com
#
#  Format is: hostname  type  http_port  udp_port
#
cache_peer parentcache.example.com parent  3128 3130
cache_peer childcache2.example.com sibling 3128 3130
cache_peer childcache3.example.com sibling 3128 3130
The cache_peer_domain directive allows you to specify that certain caches are siblings or parents only for certain domains:
# squid.conf - On the host: sv.cache.nlanr.net
#
#  Format is: hostname  type  http_port  udp_port
#
cache_peer electraglide.geog.unsw.edu.au parent 3128 3130
cache_peer cache1.nzgate.net.nz parent 3128 3130
cache_peer pb.cache.nlanr.net parent 3128 3130
cache_peer it.cache.nlanr.net parent 3128 3130
cache_peer sd.cache.nlanr.net parent 3128 3130
cache_peer uc.cache.nlanr.net sibling 3128 3130
cache_peer bo.cache.nlanr.net sibling 3128 3130
cache_peer_domain electraglide.geog.unsw.edu.au .au
cache_peer_domain cache1.nzgate.net.nz .au .aq .fj .nz
cache_peer_domain pb.cache.nlanr.net .uk .de .fr .no .se .it
cache_peer_domain it.cache.nlanr.net .uk .de .fr .no .se .it
cache_peer_domain sd.cache.nlanr.net .mx .za .mu .zm
The configuration above indicates that the cache will use pb.cache.nlanr.net and it.cache.nlanr.net for domains uk, de, fr, no, se and it, sd.cache.nlanr.net for domains mx, za, mu and zm, and cache1.nzgate.net.nz for domains au, aq, fj, and nz.
NOTE: announcing your cache is not the same thing as joining the NLANR cache hierarchy. You can join the NLANR cache hierarchy without registering, and you can register without joining the NLANR cache hierarchy.
4.5 How do I find other caches close to me and arrange parent/child/sibling relationships with them?
Visit the NLANR cache registration database <http://www.ircache.net/Cache/Tracker/> to discover other caches near you. Keep in mind that just because a cache is registered in the database does not mean they are willing to be your parent/sibling/child. But it can't hurt to ask...
4.6 My cache registration is not appearing in the Tracker database.

Your site will not be listed if your cache IP address does not have a DNS PTR record. If we can't map the IP address back to a domain name, it will be listed as "Unknown."
The registration messages are sent with UDP. We may not be receiving your announcement message due to firewalls which block UDP, or dropped packets due to congestion.
You could also specify internal servers by IP address
acl INSIDE_IP dst 1.2.3.0/24
always_direct allow INSIDE_IP
never_direct allow all
Note, however, that when you use IP addresses, Squid must perform a DNS lookup to convert URL hostnames to an address. Your internal DNS servers may not be able to look up external domains. If you use never_direct and you have multiple parent caches, then you probably will want to mark one of them as a default choice in case Squid can't decide which one to use. That is done with the default keyword on a cache_peer line. For example:
cache_peer xyz.mydomain.com parent 3128 0 default
Note, with this configuration, if the parent cache fails or becomes unreachable, then every request will result in an error message. In case you want to be able to use direct connections when all the parents go down you should use a different approach:
cache_peer parentcache.foo.com parent 3128 0 no-query
prefer_direct off
The default behaviour of Squid in the absence of positive ICP, HTCP, etc. replies is to connect to the origin server instead of using parents. The prefer_direct off directive tells Squid to try parents first.
4.10 I have dnsserver processes that aren't being used, should I lower the number in squid.conf ?
The dnsserver processes are used by squid because the gethostbyname(3) library routines used to convert web site names to their internet addresses block until the function returns (i.e., the process that calls them has to wait for a reply). Since there is only one squid process, everyone who uses the cache would have to wait each time the routine was called. This is why the dnsserver is a separate process, so that these processes can block without causing blocking in squid. It's very important that there are enough dnsserver processes to cope with every access you will need, otherwise squid will stop occasionally. A good rule of thumb is to make sure you have at least the maximum number of dnsservers squid has ever needed on your system, and probably add two to be on the safe side. In other words, if you have only ever seen at most three dnsserver processes in use, make at least five. Remember that a dnsserver is small and, if unused, will be swapped out.
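Following that rule of thumb, if you have never seen more than three dnsserver processes busy, a squid.conf sketch would be:

```
# three observed at peak, plus two for safety
dns_children 5
```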
4.11 My dnsserver average/median service time seems high, how can I reduce it?
First, find out if you have enough dnsserver processes running by looking at the Cachemanager dns output. Ideally, you should see that the first dnsserver handles a lot of requests, the second one less than the first, etc. The last dnsserver should have serviced relatively few requests. If there is not an obvious decreasing trend, then you need to increase the number of dns_children in the configuration file. If the last dnsserver has zero requests, then you definitely have enough. Another factor which affects the dnsserver service time is the proximity of your DNS resolver. Normally we do not recommend running Squid and named on the same host. Instead you should try to use a DNS resolver (named) on a different host, but on the same LAN. If your DNS traffic must pass through one or more routers, this could be causing unnecessary delays.
It's better to start out conservative. After the cache becomes full, look at the disk usage. If you think there is plenty of unused space, then increase the cache_dir setting a little. If you're getting "disk full" write errors, then you definitely need to decrease your cache size.
I increased the watermark to 100 because a lot of people run into problems with the default value. Make sure you have at least the following line in /usr/local/etc/netperm-table :
http-gw: hosts 127.0.0.1
You could add the IP-address of your own workstation to this rule and make sure the http-gw by itself works, like:
http-gw: hosts 127.0.0.1 10.0.0.1
cache_peer localhost.home.nl parent 8080 0 default
acl HOME dstdomain .home.nl
always_direct allow HOME
never_direct allow all
This tells Squid to use the parent for all domains other than home.nl . Below, access.log entries show what happens if you do a reload on the Squid-homepage:
872739961.631 1566 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/ - DEFAULT_PARENT/
872739962.976 1266 10.0.0.21 TCP_CLIENT_REFRESH/304 88 GET http://www.nlanr.net/Images/cache_now.gif
872739963.007 1299 10.0.0.21 ERR_CLIENT_ABORT/304 83 GET http://www.squid-cache.org/Icons/squidnow.gif
872739963.061 1354 10.0.0.21 TCP_CLIENT_REFRESH/304 83 GET http://www.squid-cache.org/Icons/Squidlogo2
permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
log host=localhost/127.0.0.1 protocol=HTTP cmd=dir dest=www.squid-ca
exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration
permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-ca
permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.squid-ca
permit host=localhost/127.0.0.1 use of gateway (V2.0beta)
log host=localhost/127.0.0.1 protocol=HTTP cmd=get dest=www.nlanr.ne
exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration
exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration
exit host=localhost/127.0.0.1 cmds=1 in=0 out=0 user=unauth duration
To summarize: Advantages:
http-gw allows you to selectively block ActiveX and Java, and its primary design goal is security.
The firewall doesn't need to run large applications like Squid.
The internal Squid-server still gives you the benefit of caching.
Disadvantages:
The internal Squid proxyserver can't (and shouldn't) work with other parent or neighbor caches.
Initial requests are slower because these go through http-gw; http-gw also does reverse lookups. Run a nameserver on the firewall or use an internal nameserver.

--Rodney van den Oever <mailto:[email protected]>
4.17 What is "HTTP_X_FORWARDED_FOR"? Why does squid provide it to WWW servers, and how can I stop it?
When a proxy-cache is used, a server does not see the connection coming from the originating client. Many people like to implement access controls based on the client address. To accommodate these people, Squid adds its own request header called "X-Forwarded-For" which looks like this:
X-Forwarded-For: 128.138.243.150, unknown, 192.52.106.30
Entries are always IP addresses, or the word unknown if the address could not be determined or if it has been disabled with the forwarded_for configuration option. We must note that access controls based on this header are extremely weak and simple to fake. Anyone may hand-enter a request with any IP address whatsoever. This is perhaps the reason why client IP addresses have been omitted from the HTTP/1.1 specification.
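Since each proxy appends an entry, the left-most address is the originating client. A shell sketch of pulling it out of a header value (the header shown is the example above; a real deployment would of course read it from the request):

```shell
# Example X-Forwarded-For header from the text above
hdr="X-Forwarded-For: 128.138.243.150, unknown, 192.52.106.30"

# Drop the header name, then everything from the first comma on,
# leaving the left-most (originating client) entry
client=$(printf '%s\n' "$hdr" | sed 's/^[^:]*: *//; s/ *,.*$//')
echo "$client"
```

As the text warns, this value is trivially spoofable and should not be trusted for access control.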
Because of the weakness of this header, support for access controls based on X-Forwarded-For is not yet available in any officially released version of squid. However, unofficial patches are available from the follow_xff <http://devel.squid-cache.org/follow_xff/index.html> Squid development project and may be integrated into later versions of Squid once a suitable trust model has been developed.
With Squid-2.4 and later you can use the "null" storage module to avoid having a cache directory:
cache_dir null /tmp
Note: a null cache_dir does not disable caching, but it does save you from creating a cache structure if you have disabled caching with no_cache.
Note: the directory (e.g., /tmp) must exist so that squid can chdir to it, unless you also use the coredump_dir option. To configure Squid for the "null" storage module, specify it on the configure command line:
./configure --enable-storeio=ufs,null ...
For Lynx you can also edit the lynx.cfg file to configure proxy usage. This has the added benefit of causing all Lynx users on a system to access the proxy without making environment variable changes for each user. For example:
http_proxy:http://mycache.example.com:3128/
ftp_proxy:http://mycache.example.com:3128/
gopher_proxy:http://mycache.example.com:3128/
The clients just refer to 'http://proxy/proxy.pac'. This script looks like this:
// First try proxy1 then proxy2. One server mostly caches '.com'
// to make sure both servers are not caching the same data in
// the normal situation. The other server caches the other
// domains normally. If one of 'm is down the client will try
// the other server.
else if (shExpMatch(host, "*.com"))
        return "PROXY proxy1.domain.com:8080; PROXY proxy2.domain.com:8081; DIRECT";
return "PROXY proxy2.domain.com:8081; PROXY proxy1.domain.com:8080; DIRECT";
}
I made sure every client domain has the appropriate 'proxy' entry. The clients are automatically configured with two nameservers using DHCP.

--Rodney van den Oever <mailto:[email protected]>
Opera 2.12 doesn't support gopher on its own, but requires a proxy; therefore Squid's gopher proxying
can extend the utility of your Opera immensely.
Unfortunately, Opera 2.12 chokes on some HTTP requests, for example abuse.net
<http://spam.abuse.net/spam/>.
At the moment I think it has something to do with cookies. If you have trouble with a site, try disabling the HTTP proxying by unchecking that protocol in the Preferences|Proxy Servers... dialogue. Opera will remember the address, so reenabling is easy.

--Hume Smith <mailto:[email protected]>
5.9 How do I tell Squid to use a specific username for FTP urls?
Insert your username in the host part of the URL, for example:
ftp://[email protected]/
Squid should then prompt you for your account password. Alternatively, you can specify both your username and password in the URL itself:
ftp://joecool:[email protected]/
However, we certainly do not recommend this, as it could be very easy for someone to see or grab your password.
You may like to start by reading the Expired Internet-Draft <http://www.web-cache.com/Writings/Internet-Drafts/draf that describes WPAD.
After reading the 8 steps below, if you don't understand any of the terms or methods mentioned, you probably shouldn't be doing this. Implementing wpad requires you to fully understand:

1. web server installations and modifications.
2. squid proxy server (or others) installation etc.
3. Domain Name System maintenance etc.

Please don't bombard the squid list with web server or dns questions. See your system administrator, or do some more research on those topics.

This is not a recommendation for any product or version. As far as I know IE5 is the only browser out now implementing wpad. I think wpad is an excellent feature that will return several hours of life per month. Hopefully, all browser clients will implement it as well. But it will take years for all the older browsers to fade away though. I have only focused on the domain name method, to the exclusion of the DHCP method. I think the dns method might be easier for most people. I don't currently, and may never, fully understand wpad and IE5, but this method worked for me. It may work for you. But if you'd rather just have a go ...

1. Create a standard proxy auto-configuration script as described in section 5.2. The sample provided there is more than adequate to get you going. No doubt all the other load balancing and backup scripts will be fine also.

2. Store the resultant file in the document root directory of a handy web server as wpad.dat (not proxy.pac as you may have previously done). Andrei Ivanov <mailto:ira at racoon.riga.lv> notes that you should be able to use an HTTP redirect if you want to store the wpad.dat file somewhere else. You can probably even redirect wpad.dat to proxy.pac:
Redirect /wpad.dat http://racoon.riga.lv/proxy.pac
3. If you do nothing more, a url like http://www.your.domain.name/wpad.dat should bring up the script text in your browser window.

4. Insert the following entry into your web server mime.types file. Maybe in addition to your pac file type, if you've done this before.
application/x-ns-proxy-autoconfig dat
And then restart your web server, for the new mime type to work.

5. Assuming Internet Explorer 5, under Tools, Internet Options, Connections, Settings or Lan Settings, set ONLY Use Automatic Configuration Script to be the URL for where your new wpad.dat file can be found, i.e. http://www.your.domain.name/wpad.dat Test that all works as per your script and network. There's no point continuing until this works ...

6. Create/install/implement a DNS record so that wpad.your.domain.name resolves to the host above where you have a functioning auto-config script running. You should now be able to use http://wpad.your.domain.name/wpad.dat as the Auto Config Script location in step 5 above.

7. And finally, go back to the setup screen detailed in 5 above, and choose nothing but the Automatically Detect Settings option, turning everything else off. Best to restart IE5, as you normally do with any Microsoft product... And it should all work. Did for me anyway.
8. One final question might be 'Which domain name does the client (IE5) use for the wpad... lookup?' It uses the hostname from the control panel setting. It starts the search by adding the hostname "WPAD" to the current fully-qualified domain name. For instance, a client in a.b.Microsoft.com would search for a WPAD server at wpad.a.b.microsoft.com. If it could not locate one, it would remove the bottom-most domain and try again; for instance, it would try wpad.b.microsoft.com next. IE 5 would stop searching when it found a WPAD server or reached the third-level domain, wpad.microsoft.com.

Anybody using these steps to install and test, please feel free to make notes, corrections or additions for improvements, and post back to the squid list... There are probably many more tricks and tips which hopefully will be detailed here in the future. Things like wpad.dat files being served from the proxy servers themselves, maybe with a round robin dns setup for the WPAD host.
Replace the hostname with the name or address of your own server. Ilja Pavkovic notes that the DHCP mode does not work reliably with every version of Internet Explorer. The DNS name method to find wpad.dat is more reliable. Another user adds that IE 6.01 seems to strip the last character from the URL. By adding a trailing newline, he is able to make it work with both IE 5.0 and 6.0:
option wpad "http://www.example.com/proxy.pac\n";
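For context, the line above is an ISC dhcpd option; a fuller dhcpd.conf sketch, assuming the option must be declared as DHCP option code 252 and that www.example.com is a placeholder for your own server:

```
# Declare the WPAD option (DHCP option code 252, type text)
option wpad code 252 = text;
# Hand out the auto-config URL; the trailing \n works around the
# IE 6.01 last-character issue mentioned above
option wpad "http://www.example.com/proxy.pac\n";
```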
There is a knowledgebase article (KB 331906 <http://support.microsoft.com/default.aspx?id=kb;en-us;331906>) regarding this issue, which contains a link to a downloadable "hot fix." They do warn that this code is not "regression tested" but so far there have not been any reports of this breaking anything else. The problematic file is wininet.dll. Please note that this hotfix is included in the latest security update.

Lloyd Parkes notes that the article references another article, KB 312176 <http://support.microsoft.com/default.aspx?scid=kb;EN-US;312176>. He says that you must not have the registry entry that KB 312176 encourages users to add to their registry.
According to Joao Coutinho, this simple solution also corrects the problem:
Go to Tools/Internet Options/Advanced and UNSELECT "Show friendly HTTP error messages" under Browsing.
Another possible workaround to these problems is to make the ERR_CACHE_ACCESS_DENIED page larger than 1460 bytes. This should trigger IE to handle the authentication in a slightly different manner.
6.1 squid.out
If you run your Squid from the RunCache script, a file squid.out contains the Squid startup times, and also all fatal errors, e.g. as produced by an assert() failure. If you are not using RunCache, you will not see such a file.
6.2 cache.log
The cache.log file contains the debug and error messages that Squid generates. If you start your Squid using the default RunCache script, or start it with the -s command line option, a copy of certain messages will go into your syslog facilities. It is a matter of personal preference to use a separate file for the squid log data. From the area of automatic log file analysis, the cache.log file does not have much to offer. You will usually look into this file for automated error reports, when programming Squid, testing new features, or searching for reasons of a perceived misbehaviour, etc.
6.3 useragent.log
The user agent log file is only maintained if you configured Squid with the --enable-useragent-log option and pointed the useragent_log configuration option at a file.
6.4 store.log
The store.log file covers the objects currently kept on disk or removed ones. As a kind of transaction log it is usually used for debugging purposes. A definitive statement, whether an object resides on your disks, is only possible after analysing the complete log file. The release (deletion) of an object may be logged at a later time than the swap out (save to disk). The store.log file may be of interest to log file analysis which looks into the objects on your disks and the time they spend there, or how many times a hot object was accessed. The latter may be covered by another log file, too. With knowledge of the cache_dir configuration option, this log file allows for a URL to filename mapping without recursing your cache disks. However, the Squid developers recommend to treat store.log primarily as a debug file, and so should you, unless you know what you are doing. The print format for a store log entry (one line) consists of eleven space-separated columns, compare with the storeLog() function in file src/store_log.c:
"%9d.%03d %-7s %08X %4d %9d %9d %9d %s %d/%d %s %s\n"
time
The timestamp when the line was logged in UTC with a millisecond fraction.
action
The action the object was submitted to, compare with src/store_log.c:
RELEASE The object was removed from the cache (see also 6.4).
SWAPOUT The object was saved to disk.
file number
The file number for the object storage file. Please note that the path to this file is calculated according to your cache_dir configuration. A file number of FFFFFFFF denominates "memory only" objects. Any action code for such a file number refers to an object which existed only in memory, not on disk. For instance, if a RELEASE code was logged with file number FFFFFFFF, the object existed only in memory, and was released from memory.
SWAPIN The object existed on disk and was read into memory.
status
The HTTP reply status code.
datehdr
The value of the HTTP "Date: " reply header.
lastmod
The value of the HTTP "Last-Modified: " reply header.
type
The HTTP "Content-Type" major value, or "unknown" if it cannot be determined.
sizes
This column consists of two slash separated fields:

1. The advertised content length from the HTTP "Content-Length: " reply header.
2. The size actually read.

If the advertised (or expected) length is missing, it will be set to zero. If the advertised length is not zero, but not equal to the real length, the object will be released from the cache.
method
The request method for the object, e.g. GET .
key
The key to the object, usually the URL. The timestamps in the datehdr and lastmod columns are expressed in UTC seconds. The actual values are parsed from the HTTP reply headers. An unparsable header is represented by a value of -1, and a missing header is represented by a value of -2. The key column usually contains just the URL of the object. Some objects though will never become public. Thus the key is said to include a unique integer number and the request method in addition to the URL.
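Putting the columns together, a shell sketch that splits a store.log entry and applies the sizes check described above; the log line itself is made up for illustration:

```shell
# A hypothetical store.log entry (eleven space-separated columns)
line='944843838.691 RELEASE 00000002 200 944843838 944843800 944844000 text/html 1028/1028 GET http://example.com/index.html'

# Column 2 is the action, column 9 the advertised/actual sizes
action=$(printf '%s\n' "$line" | awk '{print $2}')
sizes=$(printf '%s\n' "$line" | awk '{print $9}')

advertised=${sizes%/*}
actual=${sizes#*/}

# A non-zero advertised length that differs from the real length
# means the object would be released from the cache
if [ "$advertised" -ne 0 ] && [ "$advertised" -ne "$actual" ]; then
    verdict="mismatch"
else
    verdict="consistent"
fi
echo "$action $verdict"
```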
6.5 hierarchy.log
This logfile exists for Squid-1.0 only. The format is
[date] URL peerstatus peerhost
6.6 access.log
Most log file analysis programs are based on the entries in access.log. Currently, there are two file formats possible for the log file, depending on your configuration for the emulate_httpd_log option. By default, Squid will log in its native log file format. If the above option is enabled, Squid will log in the common log file format as defined by the CERN web daemon. The common log file format contains different information than the native log file, and less of it. The native format contains more information for the admin interested in cache evaluation.
It is parsable by a variety of tools. The common format contains different information than the native log file format. The HTTP version is logged, which is not logged in native log file format.
For Squid-1.1, the information from the hierarchy.log was moved into access.log . The format is:
time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost type
For Squid-2 the columns stay the same, though the content within may change a little. The native log file format logs more and different information than the common log file format: the request duration, some timeout information, the next upstream server address, and the content type. There exist tools which convert one file format into the other. Please mind that even though the log formats share most information, both formats contain information which is not part of the other format, and thus this part of the information is lost when converting. Especially converting back and forth is not possible without loss.
squid2common.pl is a conversion utility which converts any of the squid log file formats into the old CERN proxy style output. There exist tools to analyse, evaluate and graph results from that format.
Therefore, an access.log entry usually consists of (at least) 10 columns separated by one or more spaces:
time
A Unix timestamp as UTC seconds with a millisecond resolution. You can convert Unix timestamps into something more human readable using this short perl script:
#! /usr/bin/perl -p
s/^\d+\.\d+/localtime $&/e;
duration
The elapsed time considers how many milliseconds the transaction busied the cache. It differs in interpretation between TCP and UDP:
For HTTP/1.0, this is basically the time between accept() and close().
For persistent connections, this ought to be the time between scheduling the reply and finishing sending it.
For ICP, this is the time between scheduling a reply and actually sending it.

Please note that the entries are logged after the reply finished being sent, not during the lifetime of the transaction.
The IP address of the requesting instance, the client IP address. The client_netmask configuration option can distort the clients for data protection reasons, but it makes analysis more difficult. Often it is better to use one of the log file anonymizers. Also, the log_fqdn configuration option may log the fully qualified domain name of the client instead of the dotted quad. The use of that option is discouraged due to its performance impact.
result codes
This column is made up of two entries separated by a slash and encodes the transaction result:

1. The cache result of the request contains information on the kind of request, how it was satisfied, or in what way it failed. Please refer to section 6.7 for valid symbolic result codes. Several codes from older versions are no longer available, were renamed, or split. Especially the ERR codes do not seem to appear in the log file any more. Also refer to section 6.7 for details on the codes no longer available in Squid-2. The NOVM versions and Squid-2 also rely on the Unix buffer cache, thus you will see fewer TCP_MEM_HITs than with a Squid-1. Basically, the NOVM feature relies on read() to obtain an object, but due to the kernel buffer cache, no disk activity is needed. Only small objects (below 8 KByte) are kept in Squid's part of main memory.

2. The status part contains the HTTP result codes with some Squid-specific extensions. Squid uses a subset of the RFC-defined error codes for HTTP. Refer to section 6.8 for details of the status codes recognized by Squid-2.
bytes
The size is the amount of data delivered to the client. Mind that this does not constitute the net object size, as headers are also counted. Also, failed requests may deliver an error page, the size of which is also logged here.
request method
The request method used to obtain an object. Please refer to section 6.9 for available methods. If you turned off log_icp_queries in your configuration, you will not see (and thus cannot analyse) ICP exchanges. The PURGE method is only available if you have an ACL for "method purge" enabled in your configuration file.
URL
This column contains the URL requested. Please note that the log file may contain whitespace in the URI. The default configuration for uri_whitespace denies whitespace, though.
rfc931
The eighth column may contain the result of ident lookups for the requesting client. Since ident lookups have a performance impact, the default configuration turns ident lookups off. If turned off, or if no ident information is available, a "-" will be logged.
hierarchy code
The hierarchy information consists of three items:

1. Any hierarchy tag may be prefixed with TIMEOUT_, if the timeout occurred waiting for all ICP replies to return from the neighbours. The timeout is either dynamic, if icp_query_timeout was not set, or the time configured there has run up.
2. A code that explains how the request was handled, e.g. by forwarding it to a peer, or going straight to the source. Refer to section 6.10 for details on hierarchy codes and removed hierarchy codes.

3. The IP address or hostname where the request (if a miss) was forwarded. For requests sent to origin servers, this is the origin server's IP address. For requests sent to a neighbor cache, this is the neighbor's hostname. NOTE: older versions of Squid would put the origin server hostname here.
type
The content type of the object as seen in the HTTP reply header. Please note that ICP exchanges usually don't have any content type, and thus are logged "-". Also, some weird replies have content types ":" or even empty ones. There may be two more columns in the access.log if the (debug) option log_mime_headers is enabled. In this case, the HTTP request headers are logged between a "[" and a "]", and the HTTP reply headers are also logged between "[" and "]". All control characters like CR and LF are URL-escaped, but spaces are not escaped! Parsers should watch out for this.
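To make the column layout concrete, here is a made-up native-format log line (the timestamp, client address, size and hostnames are all invented for illustration) pulled apart with awk; the field positions follow the list above.

```shell
# One invented access.log line; columns: time, duration, client address,
# result codes, bytes, request method, URL, rfc931, hierarchy code, type.
line='852857271.260 1204 10.0.0.1 TCP_MISS/200 10836 GET http://www.example.com/ - DIRECT/www.example.com text/html'

# Pick out the result codes (column 4), size in bytes (5) and URL (7):
echo "$line" | awk '{print $4, $5, $7}'
# prints: TCP_MISS/200 10836 http://www.example.com/
```

Because fields are separated by runs of spaces and never contain spaces themselves (unless uri_whitespace allows it), simple whitespace splitting like this is usually sufficient.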
TCP_HIT
A valid copy of the requested object was in the cache.
TCP_MISS
The requested object was not in the cache.
TCP_NEGATIVE_HIT
Request for a negatively cached object, e.g. "404 not found", which the cache believes to be inaccessible. Also refer to the explanations for negative_ttl in your squid.conf file.
TCP_DENIED
Access was denied for this request.
UDP_HIT
A valid copy of the requested object was in the cache.
UDP_MISS
The requested object is not in this cache.
UDP_DENIED
Access was denied for this request.
UDP_INVALID
An invalid request was received.
NONE
Seen with errors and cachemgr requests. The following codes are no longer available in Squid-2:
ERR_*
Errors are now contained in the status code.
TCP_SWAPFAIL
See: 6.7.
UDP_RELOADING
See: 6.7.
Squid recognizes several request methods as defined in RFC 2616, plus some Squid-specific and WebDAV (RFC 2518) extensions:

method     defined in  meaning
---------  ----------  -------------------------------------------
GET        rfc2616     object retrieval and simple searches.
HEAD       rfc2616     metadata retrieval.
POST       rfc2616     submit data (to a program).
PUT        rfc2616     upload data (e.g. to a file).
DELETE     rfc2616     remove resource (e.g. file).
TRACE      rfc2616     appl. layer trace of request route.
OPTIONS    rfc2616     request available comm. options.
CONNECT    rfc2616     tunnel SSL connection.
ICP_QUERY  Squid       used for ICP based exchanges.
PURGE      Squid       remove object from cache.
PROPFIND   rfc2518     retrieve properties of an object.
PROPPATCH  rfc2518     change properties of an object.
MKCOL      rfc2518     create a new collection.
COPY       rfc2518     create a duplicate of src in dst.
MOVE       rfc2518     atomically move src to dst.
LOCK       rfc2518     lock an object against modifications.
UNLOCK     rfc2518     unlock an object.
NONE
For TCP_HIT, TCP failures, cachemgr requests and all UDP requests, there is no hierarchy information.
DIRECT
The object was fetched from the origin server.
SIBLING_HIT
The object was fetched from a sibling cache which replied with UDP_HIT.
PARENT_HIT
The object was requested from a parent cache which replied with UDP_HIT.
DEFAULT_PARENT
No ICP queries were sent. This parent was chosen because it was marked "default" in the config file.
SINGLE_PARENT
The object was requested from the only parent appropriate for the given URL.
FIRST_UP_PARENT
The object was fetched from the first parent in the list of parents.
NO_PARENT_DIRECT
The object was fetched from the origin server, because no parents existed for the given URL.
CLOSEST_PARENT
The parent selection was based on our own RTT measurements.
CLOSEST_DIRECT
Our own RTT measurements returned a shorter time than any parent.
NO_DIRECT_FAIL
The object could not be requested because of a firewall configuration (see also the related configuration material), and no parents were available.
SOURCE_FASTEST
The origin site was chosen because the source ping arrived fastest.
ROUNDROBIN_PARENT
No ICP replies were received from any parent. The parent was chosen because it was marked for round robin in the config file and had the lowest usage count.
CD_PARENT_HIT
The parent was chosen because the cache digest predicted a hit.
CD_SIBLING_HIT
The sibling was chosen because the cache digest predicted a hit.
CARP
The peer was selected by CARP.
ANY_PARENT
part of src/peer_select.c:hier_strings[].
INVALID_CODE
part of src/peer_select.c:hier_strings[].

Almost any of these may be preceded by 'TIMEOUT_' if the two-second (default) timeout occurs waiting for all ICP replies to arrive from neighbors; see also the icp_query_timeout configuration option. The following hierarchy codes were removed from Squid-2:
code                  meaning
-------------------   -------------------------------------------------
PARENT_UDP_HIT_OBJ    hit objects are no longer available.
SIBLING_UDP_HIT_OBJ   hit objects are no longer available.
SSL_PARENT_MISS       SSL can now be handled by squid.
FIREWALL_IP_DIRECT    no special logging for hosts inside the firewall.
LOCAL_IP_DIRECT       no special logging for local networks.
This will disrupt service, but at least you will have your swap log back. Alternatively, you can tell Squid to rotate its log files. This also causes a clean swap log to be written:
% squid -k rotate
For Squid-1.1, there are six fields:

1. fileno: The swap file number holding the object data. This is mapped to a pathname on your filesystem.

2. timestamp: This is the time when the object was last verified to be current. The time is a hexadecimal representation of Unix time.

3. expires: This is the value of the Expires header in the HTTP reply. If an Expires header was not present, this will be -2 or fffffffe. If the Expires header was present but invalid (unparsable), this will be -1 or ffffffff.

4. lastmod: Value of the HTTP reply Last-Modified header. If missing it will be -2, if invalid it will be -1.

5. size: Size of the object, including headers.

6. url: The URL naming this object.
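Since the timestamp field is a hexadecimal representation of Unix time, you can convert a value to the usual decimal seconds with printf; the value below is invented for illustration, not taken from a real log.

```shell
# Convert a hexadecimal swap-log timestamp to decimal Unix seconds.
printf '%d\n' 0x341a3487
# prints: 874132615
```

The decimal result can then be fed to the localtime conversion shown earlier for the access.log timestamps.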
Alternatively, you can tell Squid to shut down and it will rewrite this file before it exits. If you remove swap.state while Squid is not running, you will not lose your entire cache. In this case, Squid will scan all of the cache directories and read each swap file to rebuild the cache. This can take a very long time, so you'll have to be patient. By default the swap.state file is stored in the top-level of each cache_dir. You can move the logs to a different location with the cache_swap_log option.
For example, use this cron entry to rotate the logs at midnight:
0 0 * * * /usr/local/squid/bin/squid -k rotate
To disable store.log :
cache_store_log none
To disable cache.log :
cache_log /dev/null
Note: It is a bad idea to disable the cache.log because this file contains many important status and debugging messages. However, if you really want to, you can; you then risk Squid rotating away /dev/null, making it a plain log file.
Tip: Instead of disabling the log files, it is advisable to use a smaller value for logfile_rotate and to properly rotate Squid's log files from your cron. That way, your log files are more controllable and self-maintained by your system.
you might have around 1 GB of uncompressed log information per day and busy cache. Look into your cache manager info page to make an educated guess on the size of your log files.

The EU project DESIRE <http://www.desire.org/> developed some basic rules <http://www.uninett.no/prosjekt/desire/arneberg/statistics.html> to obey when handling and processing log files:
- Respect the privacy of your clients when publishing results. Keep logs unavailable unless anonymized. Most countries have laws on privacy protection, and some even on how long you are legally allowed to keep certain kinds of information.
- Rotate and process log files at least once a day. Even if you don't process the log files, they will grow quite large, see section 6.15. If you rely on processing the log files, reserve a large enough partition solely for log files.
- Keep the size in mind when processing. It might take longer to process log files than to generate them!
- Limit yourself to the numbers you are interested in. There is data beyond your dreams available in your log file, some quite obvious, others by combination of different views. Here are some examples for figures to watch:
  - The hosts using your cache.
  - The elapsed time for HTTP requests - this is the latency the user sees. Usually, you will want to make a distinction for HITs and MISSes and overall times. Also, medians are preferred over averages.
  - The requests handled per interval (e.g. second, minute or hour).
- It is larger than maximum_object_size.
- It is being fetched from a neighbor which has the proxy-only option set.

Then, find the file by its swap file number, using fileno-to-pathname.pl from the "scripts" directory of the Squid source distribution.
7 Operational issues
7.1 How do I see system level Squid statistics?
The Squid distribution includes a CGI utility called cachemgr.cgi which can be used to view squid statistics with a web browser. This document has a section devoted to cachemgr.cgi usage which you should consult for more information.
The fastest way to restart with an entirely clean cache is to overwrite the swap.state files for each cache_dir in your config file. Note, you can not just remove the swap.state file, or truncate it to zero size. Instead, you should put just one byte of garbage there. For example:
% echo "" > /cache1/swap.state
Repeat that for every cache_dir, then restart Squid. Be sure to leave the swap.state file with the same owner and permissions that it had before! Another way, which takes longer, is to have Squid recreate all the cache_dir directories. But first you must move the existing directories out of the way. For example, you can try this:
% cd /cache1
% mkdir JUNK
% mv ?? swap.state* JUNK
% rm -rf JUNK &
Repeat this for your other cache_dirs, then tell Squid to create new directories:
% squid -z
Rodney van den Oever <mailto:[email protected]>, and James Grinter
Point the RealPlayer at your Squid server's HTTP port (e.g. 3128). Using the Preferences->Transport tab, select "Use specified transports" and, with the Specified Transports button, select "use HTTP Only". The RealPlayer (and RealPlayer Plus) manual states:
Use HTTP Only Select this option if you are behind a firewall and cannot receive data through TCP. All data will be streamed through HTTP. Note: You may not be able to receive some content if you select this option.
Note that the first request is a POST, and the second has a '?' in the URL, so standard Squid configurations would treat it as non-cachable. It also looks rather "magic".
HTTP is an alternative delivery mechanism introduced with version 3 players, and it allows a reasonable approximation to "streaming" data - that is, playing it as you receive it. It isn't available in the general case: only if someone has made the RealAudio file available via an HTTP server, or they're using a version 4 server, they've switched it on, and you're using a version 4 client. If someone has made the file available via their HTTP server, then it'll be cachable. Otherwise, it won't be (as far as we can tell.) The more common RealAudio link connects via their own pnm: method and is transferred using their proprietary protocol (via TCP or UDP), not using HTTP. It can't be cached nor proxied by Squid, and requires something such as the simple proxy that Progressive Networks themselves have made available, if you're in a firewall/no-direct-route situation. Their product does not cache (and I don't know of any software available that does.) Some confusion arises because there is also a configuration option to use an HTTP proxy (such as Squid) with the RealAudio/RealVideo players. This is because the players can fetch the ".ram" file that contains the pnm: reference for the audio/video stream. They fetch that .ram file from an HTTP server, using HTTP.
The above only allows purge requests which come from the local host and denies all other purge requests. To purge an object, you can use the squidclient program:
squidclient -m PURGE http://www.miscreant.com/
If the purge was successful, you will see a "200 OK" response:
HTTP/1.0 200 OK Date: Thu, 17 Jul 1997 16:03:32 GMT Server: Squid/1.1.14
If the object was not found in the cache, you will see a "404 Not Found" response:
HTTP/1.0 404 Not Found Date: Thu, 17 Jul 1997 16:03:22 GMT Server: Squid/1.1.14
It is more important that your parent caches enable the ICMP features. If you are acting as a parent, then you may want to enable ICMP on your cache. Also, if your cache makes RTT measurements, it will fetch objects directly if your cache is closer than any of the parents. If you want your Squid cache to measure RTTs to origin servers, Squid must be compiled with the USE_ICMP option. This is easily accomplished by uncommenting "-DUSE_ICMP=1" in src/Makefile and/or src/Makefile.in. An external program called pinger is responsible for sending and receiving ICMP packets. It must run with root privileges. After Squid has been compiled, the pinger program must be installed separately. A special Makefile target will install pinger with appropriate permissions.
% make install
% su
# make install-pinger
There are three configuration file options for tuning the measurement database on your cache. netdb_low and netdb_high specify low and high water marks for keeping the database to a certain size (e.g. just like with the IP cache). The netdb_ttl option specifies the minimum rate for pinging a site. If netdb_ttl is set to 300 seconds (5 minutes) then an ICMP packet will not be sent to the same site more than once every five minutes. Note that a site is only pinged when an HTTP request for the site is received. Another option, minimum_direct_hops, can be used to try finding servers which are close to your cache. If the measured hop count to the origin server is less than or equal to minimum_direct_hops, the request will be forwarded directly to the origin server.
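As a sketch only, those options might appear together in squid.conf like this (the values are illustrative, not recommendations):

```
netdb_low  900
netdb_high 1000
netdb_ttl  5 minutes
minimum_direct_hops 4
```

With these values the database is trimmed back to roughly 900 entries once it exceeds 1000, and any given site is pinged at most once every five minutes.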
This causes a flag to be set in your outgoing ICP queries. If your parent caches return ICMP RTT measurements, then the ninth column (the hierarchy code) of your access.log will have lines similar to:
CLOSEST_PARENT_MISS/it.cache.nlanr.net
In this case, it means that it.cache.nlanr.net returned the lowest RTT to the origin server. If your cache measured a lower RTT than any of the parents, the request will be logged with
CLOSEST_DIRECT/www.sample.com
Network          recv/sent     RTT  Hops  Hostnames
192.41.10.0          20/   21  82.3   6.0 www.jisedu.org www.dozo.com
    bo.cache.nlanr.net         42.0   7.0
    uc.cache.nlanr.net         48.0  10.0
    pb.cache.nlanr.net         55.0  10.0
    it.cache.nlanr.net        185.0  13.0
This means we have sent 21 pings to both www.jisedu.org and www.dozo.com. The average RTT is 82.3 milliseconds. The next four lines show the measured values from our parent caches. Since bo.cache.nlanr.net has the lowest RTT, it would be selected as the location to forward a request for a www.jisedu.org or www.dozo.com URL.
7.8 How can I make Squid NOT cache some servers or URLs?
In Squid-2, you use the no_cache option to specify uncachable requests. For example, this makes all responses from origin servers in the 10.0.1.0/24 network uncachable:
acl Local dst 10.0.1.0/24 no_cache deny Local
In Squid-1.1, whether or not an object gets cached is controlled by the cache_stoplist and cache_stoplist_pattern options. So, you may add:
cache_stoplist my.domain.com
Specifying uncachable objects by IP address is harder. The 1.1 patch page <../1.1/patches.html> includes a patch called no-cache-local.patch which changes the behaviour of the local_ip and local_domain options so that matching requests are NOT CACHED, in addition to being fetched directly.
If you add a new cache_dir you have to run squid -z to initialize that directory.

3. Remember that you can not delete a cache directory from a running Squid process; you can not simply reconfigure Squid. You must shut down Squid:
squid -k shutdown
4. Once Squid exits, you may immediately start it up again. Since you deleted the old cache_dir from squid.conf, Squid won't try to access that directory. If you use the RunCache script, Squid should start up again automatically.

5. Now Squid is no longer using the cache directory that you removed from the config file. You can verify this by checking "Store Directory" information with the cache manager. From the command line, type:
squidclient mgr:storedir
6. Now that Squid is not using the cache directory, you can rm -rf it, format the disk, build a new filesystem, or whatever.

The procedure to recreate the directory is similar:

1. Edit squid.conf and add a new cache_dir line.

2. Initialize the new directory by running
% squid -z
NOTE: it is safe to run this even if Squid is already running. squid -z will harmlessly try to create all of the subdirectories that already exist.

3. Reconfigure Squid:
squid -k reconfigure
Unlike deleting, you can add new cache directories while Squid is already running.
7.11 Can you tell me a good way to upgrade Squid with minimal downtime?
Here is a technique that was described by Radu Greab <mailto:[email protected]>. Start a second Squid server on an unused HTTP port (say 4128). This instance of Squid probably doesn't need a large disk cache. When this second server has finished reloading the disk store, swap the http_port values in the two squid.conf files. Set the original Squid to use port 5128, and the second one to use 3128. Next, run "squid -k reconfigure" for both Squids. New requests will go to the second Squid, now on port 3128, and the first Squid will finish handling its current requests. After a few minutes, it should be safe to fully shut down the first Squid and upgrade it. Later you can simply repeat this process in reverse.
7.13 Can I make origin servers see the client's IP address when going through Squid?
Normally you cannot. Most TCP/IP stacks do not allow applications to create sockets with the local endpoint assigned to a foreign IP address. However, some folks have some patches to Linux <http://www.balabit.hu/en/downloads/tproxy/> that allow exactly that. In this situation, you must ensure that all HTTP packets destined for the client IP addresses are routed to the Squid box. If the packets take another path, the real clients will send TCP resets to the origin servers, thereby breaking the connections.
8 Memory
8.1 Why does Squid use so much memory!?
Squid uses a lot of memory for performance reasons. It takes much, much longer to read something from disk than it does to read directly from memory. A small amount of metadata for each cached object is kept in memory; this is the StoreEntry data structure. For Squid-2 this is 56 bytes on "small" pointer architectures (Intel, Sparc, MIPS, etc.) and 88 bytes on "large" pointer architectures (Alpha). In addition, there is a 16-byte cache key (MD5 checksum) associated with
each StoreEntry. This means there are 72 or 104 bytes of metadata in memory for every object in your cache. A cache with 1,000,000 objects therefore requires 72 MB of memory for metadata only. In practice it requires much more than that. Squid-1.1 also uses a lot of memory to store in-transit objects. This version stores incoming objects only in memory until the transfer is complete. At that point it decides whether or not to store the object on disk. This means that when users download large files, your memory usage will increase significantly. The squid.conf parameter maximum_object_size determines how much memory an in-transit object can consume before we mark it as uncachable. When an object is marked uncachable, there is no need to keep all of the object in memory, so the memory is freed for the part of the object which has already been written to the client. In other words, lowering maximum_object_size also lowers Squid-1.1 memory usage. Other uses of memory by Squid include:
- Disk buffers for reading and writing
- Network I/O buffers
- IP Cache contents
- FQDN Cache contents
- Netdb ICMP measurement database
- Per-request state information, including full request and reply headers
- Miscellaneous statistics collection
- "Hot objects" which are kept entirely in memory
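The 72 MB metadata figure above is easy to reproduce; this back-of-the-envelope shell calculation assumes the small-pointer case (56-byte StoreEntry plus a 16-byte MD5 key) and decimal megabytes:

```shell
objects=1000000
per_object=$((56 + 16))                 # bytes of metadata per cached object
echo "$(( objects * per_object / 1000000 )) MB"
# prints: 72 MB
```

Substituting 88 + 16 = 104 bytes gives the large-pointer estimate of 104 MB per million objects.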
8.2 How can I tell how much memory my Squid process is using?
One way is to simply look at ps output on your system. For BSD-ish systems, you probably want to use the -u option and look at the VSZ and RSS fields:
wessels 236% ps -axuhm
USER       PID %CPU %MEM    VSZ    RSS  TT  STAT STARTED        TIME COMMAND
squid     9631  4.6 26.4 141204 137852  ??  S    10:13PM    78:22.80 squid -NCYs
For SYSV-ish systems, you probably want to use the -l option. When interpreting the ps output, be sure to check your ps manual page. It may not be obvious whether the reported numbers are kbytes or pages (usually 4 kB). A nicer way to check the memory usage is with a program called top:
last pid: 20128;  load averages:  0.06,  0.12,  0.11          14:10:58
46 processes:  1 running, 45 sleeping
CPU states:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 187M Active, 1884K Inact, 45M Wired, 268M Cache, 8351K Buf, 1296K Free
Swap: 1024M Total, 256K Used, 1024M Free

  PID USERNAME PRI NICE  SIZE   RES STATE    TIME   WCPU    CPU COMMAND
 9631 squid      2    0  138M  135M select  78:45  3.93%  3.93% squid
Finally, you can ask the Squid process to report its own memory usage. This is available on the Cache Manager info page. Your output may vary depending upon your operating system and Squid version, but it looks similar to this:
Resource usage for squid:
        Maximum Resident Size: 137892 KB
Memory usage for squid via mstats():
        Total space in arena:  140144 KB
        Total free:              8153 KB 6%
If your RSS (Resident Set Size) value is much lower than your process size, then your cache performance is most likely suffering; see section 9.24.
8.4 I set cache_mem to XX, but the process grows beyond that!
The cache_mem parameter does NOT specify the maximum size of the process. It only specifies how much memory to use for caching "hot" (very popular) replies. Squid's actual memory usage depends very strongly on your cache size (disk space) and your incoming request load. Reducing cache_mem will usually also reduce the process size, but not necessarily, and there are other ways to reduce Squid's memory usage (see below). See also section 8.11.
8.5 How do I analyze memory usage from the cache manager output?
Note: This information is specific to Squid-1.1 versions.
Look at your cachemgr.cgi Cache Information page. For example:
Memory usage for squid via mallinfo():
        Total space in arena:  94687 KB
        Ordinary blocks:       32019 KB
        Small blocks:          44364 KB
        Holding blocks:            0 KB
        Free Small blocks:      6650 KB
        Free Ordinary blocks:  11652 KB
        Total in use:          76384 KB 81%
        Total free:            18302 KB 19%
Meta Data:
        StoreEntry                                  = 15377 KB
        IPCacheEntry                                =    83 KB
        Hash link                                   =     0 KB
        URL strings                                 = 11422 KB
        Pool MemObject structures  514 x  144 bytes =    72 KB (    70 free)
        Pool for Request structur  516 x 4380 bytes =  2207 KB (  2121 free)
        Pool for in-memory object 6200 x 4096 bytes = 24800 KB ( 22888 free)
        Pool for disk I/O          242 x 8192 bytes =  1936 KB (  1888 free)
        Miscellaneous                               =  2600 KB
        total Accounted                             = 58499 KB
First note that mallinfo() reports 94M in "arena." This is pretty close to what top says (97M). Of that 94M, 81% (76M) is actually being used at the moment. The rest has been freed, or pre-allocated by malloc(3) and not yet used. Of the 76M in use, we can account for 58.5M (76%). There are some calls to malloc(3) for which we can't account. The Meta Data list gives the breakdown of where the accounted memory has gone. 45% has gone to StoreEntry and URL strings. Another 42% has gone to buffering objects held in VM while they are fetched and relayed to the clients (Pool for in-memory object). The pool sizes are specified by squid.conf parameters. In version 1.0, these pools are somewhat broken: we keep a stack of unused pages instead of freeing the block. In the Pool for in-memory object, the unused stack size is 1/2 of cache_mem. The Pool for disk I/O is hardcoded at 200. For MemObject and Request it's 1/8 of your system's FD_SETSIZE value. If you need to lower your process size, we recommend lowering the max object sizes in the 'http', 'ftp' and 'gopher' config lines. You may also want to lower cache_mem to suit your needs. But if you make cache_mem too low, then some objects may not get saved to disk during high-load periods. Newer Squid versions allow you to set memory_pools off to disable the free memory pools.
8.6 The "Total memory accounted" value is less than the size of my Squid process.
We are not able to account for all memory that Squid uses. This would require excessive amounts of code to keep track of every last byte. We do our best to account for the major uses of memory. Also, note that the malloc and free functions have their own overhead. Some additional memory is required to keep track of which chunks are in use, and which are free. Additionally, most operating systems do not allow processes to shrink in size. When a process gives up memory by calling free , the total process size does not shrink. So the process size really represents the maximum size your Squid process has reached.
To tell if it is the second case, first rule out the first case and then monitor the size of the Squid process. If it dies at a certain size with plenty of swap left, then the maximum data segment size has been reached, without a doubt. The data segment size can be limited by two factors:

1. a kernel-imposed maximum, which no user can go above;

2. the size set with ulimit, which the user can control.

When Squid starts it sets data and file ulimits to the hard level. If you manually tune ulimit before starting Squid, make sure that you set the hard limit and not only the soft limit (the default operation of ulimit is to only change the soft limit). root is allowed to raise the soft limit above the hard limit. This command prints the hard limits:
ulimit -aH
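For comparison, you can print the soft and hard limits for just the data segment; in an sh-compatible shell it looks like this (the output is system-dependent, typically a number of kbytes or the word "unlimited"):

```shell
ulimit -S -d   # current soft limit on the data segment
ulimit -H -d   # hard limit; a non-root user cannot raise this
```

If the soft value is lower than the hard one, raising it (up to the hard limit) before starting Squid may be all that is needed.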
8.7.1 BSD/OS
by Arjan de Vet <mailto:[email protected]>. The default kernel limit on BSD/OS for datasize is 64 MB (at least on 3.0, which I'm using). Recompile a kernel with larger datasize settings:
maxusers        128
# Support for large inpcb hash tables, e.g. busy WEB servers.
options         INET_SERVER
# support for large routing tables, e.g. gated with full Internet routing:
options         "KMEMSIZE=\(16*1024*1024\)"
options         "DFLDSIZ=\(128*1024*1024\)"
options         "DFLSSIZ=\(8*1024*1024\)"
options         "SOMAXCONN=128"
options         "MAXDSIZ=\(256*1024*1024\)"
#
# Settings used by /etc/rc and root
# This must be set properly for daemons started as root by inetd as well.
# Be sure to reset these values back to system defaults in the default class!
#
daemon:\
        :path=/bin /usr/bin /sbin /usr/sbin:\
        :widepasswords:\
        :tc=default:
#       :datasize-cur=128M:\
#       :openfiles-cur=256:\
#       :maxproc-cur=256:\
Increase the maximum and default data segment size in your kernel config file, e.g. /sys/conf/i386/CONFIG:

options         "MAXDSIZ=(512*1024*1024)"
options         "DFLDSIZ=(128*1024*1024)"
And, if you have more than 256 MB of physical memory, you probably have to disable BOUNCE_BUFFERS (whatever that is), so comment out this line:
#options BOUNCE_BUFFERS #include support for DMA bounce buffers
        :openfiles-cur@:\
        :stacksize=64M:\
        :tc=default:
And don't forget to run "cap_mkdb /etc/login.conf" after editing that file.
- Try a different malloc library (see section 8.10).
- Reduce the cache_mem parameter in the config file. This controls how many "hot" objects are kept in memory. Reducing this parameter will not significantly affect performance, but you may receive some warnings in cache.log if your cache is busy.
- Turn the memory pools off in the config file. This causes Squid to give up unused memory by calling free() instead of holding on to the chunk for potential future use.
- Reduce the cache_swap parameter in your config file. This will reduce the number of objects Squid keeps. Your overall hit ratio may go down a little, but your cache will perform significantly better.
- Reduce the maximum_object_size parameter (Squid-1.1 only). You won't be able to cache the larger objects, and your byte volume hit ratio may go down, but Squid will perform better overall.
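Taken together, a deliberately memory-lean configuration fragment might look like the following; the directive names are the real squid.conf options mentioned above, but the values are only illustrative, not recommendations:

```
cache_mem 8 MB
memory_pools off
maximum_object_size 1024 KB
```

Tune each value against your own traffic: too small a cache_mem or maximum_object_size trades memory for hit ratio.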
3. Copy libmalloc.a to your system's library directory and be sure to name it libgnumalloc.a.

% su
# cp libmalloc.a /usr/lib/libgnumalloc.a
4. (Optional) Copy the GNU malloc.h to your system's include directory and be sure to name it gnumalloc.h . This step is not required, but if you do this, then Squid will be able to use the mstat() function to report memory usage statistics on the cachemgr info page.
# cp malloc.h /usr/include/gnumalloc.h
Note: in later distributions, 'realclean' has been changed to 'distclean'. As the configure script runs, watch its output. You should find that it locates libgnumalloc.a and optionally gnumalloc.h.
8.10.2 dlmalloc
dlmalloc <http://g.oswego.edu/dl/html/malloc.html> has been written by Doug Lea <mailto:[email protected]>. According to Doug:
This is not the fastest, most space-conserving, most portable, or most tunable malloc ever written. However it is among the fastest while also being among the most space-conserving, portable and tunable.
dlmalloc is included with the Squid-2 source distribution. To use this library, you simply give an option to the configure script:
% ./configure --enable-dlmalloc ...
EDITOR'S NOTE: readers are encouraged to submit instructions for configuration of cachemgr.cgi on other web server platforms, such as Netscape.
After you edit the server configuration files, you will probably need to either restart your web server or send it a SIGHUP signal to tell it to re-read its configuration files. When you're done configuring your web server, you'll connect to the cache manager with a web browser, using a URL such as:
http://www.example.com/Squid/cgi-bin/cachemgr.cgi/
Wildcards are acceptable, IP addresses are acceptable, and others can be added with a comma-separated list of IP addresses. There are many more ways of protection. Your server documentation has details. You also need to add:
Protect /Squid/* MGR-PROT
Exec /Squid/cgi-bin/*.cgi /usr/local/squid/bin/*.cgi
It's probably a bad idea to ScriptAlias the entire /usr/local/squid/bin/ directory where all the Squid executables live. Next, you should ensure that only specified workstations can access the cache manager. That is done in your Apache httpd.conf, not in squid.conf. At the bottom of the httpd.conf file, insert:
<Location /Squid/cgi-bin/cachemgr.cgi>
order allow,deny
allow from workstation.example.com
</Location>
You can have more than one allow line, and you can allow domains or networks. Alternately, cachemgr.cgi can be password-protected. You'd add the following to httpd.conf :
<Location /Squid/cgi-bin/cachemgr.cgi>
AuthUserFile /path/to/password/file
AuthGroupFile /dev/null
AuthName User/Password Required
AuthType Basic
require user cachemanager
</Location>
Consult the Apache documentation for information on using htpasswd to set a password for this "user."
CGI-bin path: set to /Squid/cgi-bin/
Handle *.cgi: set to no
Run user scripts as owner: set to no
Search path: set to the directory containing the cachemgr.cgi file
In section Security , set Patterns to:
allow ip=1.2.3.4
where 1.2.3.4 is the IP address of workstation.example.com. Save the configuration, and you're done.
The first ACL is the most important, as the cache manager program interrogates squid using a special cache object protocol. Try it yourself by doing:
telnet mycache.example.com 3128
GET cache_object://mycache.example.com/info HTTP/1.0
The default ACLs say that if the request is for a cache object, and it isn't the local host, then deny access; otherwise allow access. In fact, only allowing localhost access means that on the initial cachemgr.cgi form you can only specify the cache host as localhost. We recommend the following:
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl example src 123.123.123.123/255.255.255.255
acl all src 0.0.0.0/0.0.0.0
Where 123.123.123.123 is the IP address of your web server. Then modify the rules like this:
http_access allow manager localhost
http_access allow manager example
http_access deny manager
http_access allow all
If you're using miss_access, then don't forget to also add a miss_access rule for the cache manager:
miss_access allow manager
The default ACLs assume that your web server is on the same machine as squid. Remember that the connection from the cache manager program to squid originates at the web server, not the browser. So if your web server lives somewhere else, you should make sure that the IP address of the web server that has cachemgr.cgi installed on it is in the example ACL above. Always be sure to send a SIGHUP signal to squid any time you change the squid.conf file.
Note, if you do this after having already installed Squid, you need to make sure cachemgr.cgi gets recompiled. For example:
% cd src
% rm cachemgr.o cachemgr.cgi
% make cachemgr.cgi
9.10 What's the difference between Squid TCP connections and Squid UDP connections?
Browsers and caches use TCP connections to retrieve web objects from web servers or caches. UDP connections are used when another cache using you as a sibling or parent wants to find out if you have an object in your cache that it's looking for. The UDP connections are ICP queries.
IPCacheEntry
An entry in the DNS cache.
Hash link
Link in the cache hash table structure.
URL strings
The strings of the URLs themselves that map to an object number in the cache, allowing access to the StoreEntry. Basically just like the log file in your cache directory.
Pool MemObject structures
Info about objects currently in memory (e.g., in the process of being transferred).
Pool for Request structures
Information about each request as it happens.
Pool for in-memory object
Space for object data as it is retrieved.
If squid is much smaller than this field, run for cover! Something is very wrong, and you should probably restart squid.
KB/sec
This column contains gross estimations of data transfer rates averaged over the entire time the cache has been running. These numbers are unreliable and mostly useless.
9.16 In the utilization section, what is the Max/Current/Min KB?
These refer to the size all the objects of this type have grown to/currently are/shrunk to.
9.17 What is the I/O section about?
These are histograms of the number of bytes read from the network per read(2) call. Somewhat useful for determining maximum buffer sizes.
9.18 What is the VM Objects section for?
VM Objects are the objects which are in Virtual Memory. These are objects which are currently being retrieved and those which were kept in memory for fast access (accelerator mode).
9.19 What does AVG RTT mean?
Average Round Trip Time. This is how long, on average, after an ICP ping is sent that a reply is received.
9.21 In the IP cache section, what's the difference between a hit, a negative hit and a miss?
A HIT means that the document was found in the cache. A MISS, that it wasn't found in the cache. A negative hit means that it was found in the cache, but it doesn't exist.
C means positively cached.
N means negatively cached.
P means the request is pending being dispatched.
D means the request has been dispatched and we're waiting for an answer.
L means it is a locked entry because it represents a parent or sibling.
The TTL column represents "Time To Live" (i.e., how long the cache entry is valid). (May be negative if the document has expired.) The N column is the number of IP addresses from which the cache has documents. The rest of the line lists all the IP addresses that have been associated with that IP cache entry.
9.23 What is the fqdncache and how is it different from the ipcache?
IPCache contains data for the Hostname to IP-Number mapping, and FQDNCache does it the other way round. For example:
IP Cache Contents:
Hostname                  Flags lstref     TTL N IP-Number
gorn.cc.fh-lippe.de           C      0   21581 1 193.16.112.73
lagrange.uni-paderborn.de     C      6   21594 1 131.234.128.245
www.altavista.digital.com     C     10   21299 4 204.123.2.75 ...
2/ftp.symantec.com           DL   1583 -772855 0
Flags:  C --> Cached
        D --> Dispatched
        N --> Negative Cached
        L --> Locked
lstref: Time since last use
TTL: Time-To-Live until information expires
N: Count of addresses
Flags:  C --> Cached
        D --> Dispatched
        N --> Negative Cached
        L --> Locked
TTL: Time-To-Live until information expires
N: Count of names
9.24 What does "Page faults with physical i/o: 4897" mean?
This question was asked on the squid-users mailing list, to which there were three excellent replies.
by Jonathan Larmour <mailto:[email protected]>
You get a "page fault" when your OS tries to access something in memory which is actually swapped to disk. The term "page fault," while correct at the kernel and CPU level, is a bit deceptive to a user, as there's no actual error - this is a normal feature of operation. Also, this doesn't necessarily mean your squid is swapping by that much. Most operating systems also implement paging for executables, so that only sections of the executable which are actually used are read
from disk into memory. Also, whenever squid needs more memory, the fact that the memory was allocated will show up in the page faults. However, if the number of faults is unusually high, and getting bigger, this could mean that squid is swapping. Another way to verify this is using a program called "vmstat" which is found on most UNIX platforms. If you run this as "vmstat 5" this will update a display every 5 seconds. This can tell you if the system as a whole is swapping a lot (see your local man page for vmstat for more information). It is very bad for squid to swap, as every single request will be blocked until the requested data is swapped in. It is better to tweak the cache_mem and/or memory_pools settings in squid.conf, or switch to the NOVM versions of squid, than allow this to happen.
by Peter Wemm <mailto:[email protected]>
There are two different operations at work, paging and swapping. Paging is when individual pages are shuffled (either discarded or swapped to/from disk), while "swapping" generally means the entire process got sent to/from disk. Needless to say, swapping a process is a pretty drastic event, and usually only reserved for when there's a memory crunch and paging out cannot free enough memory quickly enough. Also, there's some variation in how swapping is implemented in OS's. Some don't do it at all or do a hybrid of paging and swapping instead. As you say, paging out doesn't necessarily involve disk IO, eg: text (code) pages are read-only and can simply be discarded if they are not used (and reloaded if/when needed). Data pages are also discarded if unmodified, and paged out if there's been any changes. Allocated memory (malloc) is always saved to disk since there's no executable file to recover the data from. mmap() memory is variable: if it's backed from a file, it uses the same rules as the data segment of a file - ie: either discarded if unmodified or paged out. There's also "demand zeroing" of pages as well that cause faults.
If you malloc memory and it calls brk()/sbrk() to allocate new pages, the chances are that you are allocated demand zero pages. Ie: the pages are not "really" attached to your process yet, but when you access them for the first time, the page fault causes the page to be connected to the process address space and zeroed - this saves unnecessary zeroing of pages that are allocated but never used. The "page faults with physical IO" comes from the OS via getrusage(). It's highly OS dependent on what it means. Generally, it means that the process accessed a page that was not present in memory (for whatever reason) and there was disk access to fetch it. Many OS's load executables by demand paging as well, so the act of starting squid implicitly causes page faults with disk IO - however, many (but not all) OS's use "read ahead" and "prefault" heuristics to streamline the loading. Some OS's maintain "intent queues" so that pages can be selected as pageout candidates ahead of time. When (say) squid touches a freshly allocated demand zero page and one is needed, the OS can page out one of the candidates on the spot, causing a 'fault with physical IO' with demand zeroing of allocated memory which doesn't happen on many other OS's. (The other OS's generally put the process to sleep while the pageout daemon finds a page for it.) The meaning of "swapping" varies. On FreeBSD for example, swapping out is implemented as unlocking upages, kernel stack, PTD etc for aggressive pageout with the process. The only thing left of the process in memory is the 'struct proc'. The FreeBSD paging system is highly adaptive and can resort to paging in a way that is equivalent to the traditional swapping style operation (ie: entire process). FreeBSD also tries stealing pages from active processes in order to make space for disk cache.
I suspect this is why setting 'memory_pools off' on the non-NOVM squids on FreeBSD is reported to work better - the VM/buffer system could be competing with squid to cache the same pages. It's a pity that squid cannot use mmap() to do file IO on the 4K chunks in its memory pool (I can see that this is not a simple thing to do though, but that won't stop me wishing. :-).
by John Line <mailto:[email protected]>
The comments so far have been about what paging/swapping figures mean in a "traditional" context, but it's worth bearing in mind that on some systems (Sun's Solaris 2, at least), the virtual memory and filesystem handling are unified, and what a user process sees as reading or writing a file, the system simply sees as paging something in from disk or a page being updated so it needs to be paged out. (I suppose you could view it as similar to the operating system memory-mapping the files behind-the-scenes.) The effect of this is that on Solaris 2, paging figures will also include file I/O. Or rather, the figures from vmstat certainly appear to include file I/O, and I presume (but can't quickly test) that figures such as those quoted by Squid will also include file I/O. To confirm the above (which represents an impression from what I've read and observed, rather than 100% certain facts...), using an otherwise idle Sun Ultra 1 system I just tried using cat (small, shouldn't need to page) to copy (a) one file to another, (b) a file to /dev/null, (c) /dev/zero to a file, and (d) /dev/zero to /dev/null (interrupting the last two with control-C after a while!), while watching with vmstat. 300-600 page-ins or page-outs per second when reading or writing a file (rather than a device), essentially zero in other cases (and when not cat-ing). So ... beware assuming that all systems are similar and that paging figures represent *only* program code and data being shuffled to/from disk - they may also include the work in reading/writing all those files you were accessing...
9.25 What does the IGNORED field mean in the 'cache server list'?
This refers to ICP replies which Squid ignored, for one of these reasons:
- The URL in the reply could not be found in the cache at all.
- The URL in the reply was already being fetched. Probably this ICP reply arrived too late.
- The URL in the reply did not have a MemObject associated with it. Either the request is already finished, or the user aborted before the ICP reply arrived.
- The reply came from a multicast-responder, but the cache_peer_access configuration does not allow us to forward this request to that neighbor.
- Source-Echo replies from known neighbors are ignored.
- ICP_OP_DENIED replies are ignored after the first 100.
10 Access Controls
10.1 Introduction
Squid's access control scheme is relatively comprehensive and difficult for some people to understand. There are two different components: ACL elements, and access lists. An access list consists of an allow or deny action followed by a number of ACL elements.
src: source (client) IP addresses
dst: destination (server) IP addresses
myip: the local IP address of a client's connection
srcdomain: source (client) domain name
dstdomain: destination (server) domain name
srcdom_regex: source (client) regular expression pattern matching
dstdom_regex: destination (server) regular expression pattern matching
time: time of day, and day of week
url_regex: URL regular expression pattern matching
urlpath_regex: URL-path regular expression pattern matching, leaves out the protocol and hostname
port: destination (server) port number
myport: local port number that client connected to
proto: transfer protocol (http, ftp, etc)
method: HTTP request method (get, post, etc)
browser: regular expression pattern matching on the request's user-agent header
ident: string matching on the user's name
ident_regex: regular expression pattern matching on the user's name
src_as: source (client) Autonomous System number
dst_as: destination (server) Autonomous System number
proxy_auth: user authentication via external processes
proxy_auth_regex: user authentication via external processes
snmp_community: SNMP community string matching
maxconn: a limit on the maximum number of connections from a single client IP address
req_mime_type: regular expression pattern matching on the request content-type header
arp: Ethernet (MAC) address matching
rep_mime_type: regular expression pattern matching on the reply (downloaded content) content-type header. This is only usable in the http_reply_access directive, not http_access.
external: lookup via external acl helper defined by external_acl_type
Not all of the ACL elements can be used with all types of access lists (described below). For example, snmp_community is only meaningful when used with snmp_access. The src_as and dst_as types are only used in cache_peer_access access lists. The arp ACL requires the special --enable-arp-acl configure option. Furthermore, the ARP ACL code is not portable to all operating systems. It works on Linux, Solaris, and some *BSD variants. The SNMP ACL element and access list require the --enable-snmp configure option. Some ACL elements can cause processing delays. For example, use of srcdomain and srcdom_regex requires a reverse DNS lookup on the client's IP address. This lookup adds some delay to the request. Each ACL element is assigned a unique name. A named ACL element consists of a list of values. When checking for a match, the multiple values use OR logic. In other words, an ACL element is matched when any one of its values is a match. You can't give the same name to two different types of ACL elements. It will generate a syntax error. You can put different values for the same ACL name on different lines. Squid combines them into one list.
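For instance, the following sketch (domain names are illustrative) defines one ACL element whose values are spread over two lines; Squid combines them into a single list, matched with OR logic:

```
acl blocked_sites dstdomain www.example.com
acl blocked_sites dstdomain ads.example.net
http_access deny blocked_sites
```

A request is denied if its destination matches either of the two domains.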
http_access: Allows HTTP clients (browsers) to access the HTTP port. This is the primary access control list.
http_reply_access: Allows HTTP clients (browsers) to receive the reply to their request. This further restricts permissions given by http_access, and is primarily intended to be used together with the rep_mime_type acl type for blocking different content types.
icp_access: Allows neighbor caches to query your cache with ICP.
miss_access: Allows certain clients to forward cache misses through your cache. This further restricts permissions given by http_access, and is primarily intended to be used for enforcing sibling relations by denying siblings from forwarding cache misses through your cache.
no_cache: Defines responses that should not be cached.
redirector_access: Controls which requests are sent through the redirector pool.
ident_lookup_access: Controls which requests need an Ident lookup.
always_direct: Controls which requests should always be forwarded directly to origin servers.
never_direct: Controls which requests should never be forwarded directly to origin servers.
snmp_access: Controls SNMP client access to the cache.
broken_posts: Defines requests for which squid appends an extra CRLF after POST message bodies as required by some broken origin servers.
cache_peer_access: Controls which requests can be forwarded to a given neighbor (peer).
Notes: An access list rule consists of an allow or deny keyword, followed by a list of ACL element names. An access list consists of one or more access list rules.
Access list rules are checked in the order they are written. List searching terminates as soon as one of the rules is a match. If a rule has multiple ACL elements, it uses AND logic. In other words, all ACL elements of the rule must be a match in order for the rule to be a match. This means that it is possible to write a rule that can never be matched. For example, a port number can never be equal to both 80 AND 8000 at the same time. To summarise, the ACL logic can be described as:
http_access allow|deny acl AND acl AND ...
OR
http_access allow|deny acl AND acl AND ...
OR
...
If none of the rules are matched, then the default action is the opposite of the last rule in the list. It's a good idea to be explicit with the default action. The best way is to use the all ACL. For example:
acl all src 0/0
http_access deny all
The url_regex means to search the entire URL for the regular expression you specify. Note that these regular expressions are case-sensitive, so a URL containing "Cooking" would not be denied. Another way is to deny access to specific servers which are known to hold recipes. For example:
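Such a rule might look like the following sketch (the ACL name is illustrative; the domain is the one discussed afterwards):

```
acl GourmetServer dstdomain www.gourmet-chef.com
http_access deny GourmetServer
```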
The dstdomain means to search the hostname in the URL for the string "www.gourmet-chef.com". Note that when IP addresses are used in URLs (instead of domain names), Squid-1.1 implements relaxed access controls. If a domain name for the IP address has been saved in Squid's "FQDN cache," then Squid can compare the destination domain against the access controls. However, if the domain is not immediately available, Squid allows the request and makes a lookup for the IP address so that it may be available for future requests.
10.6 Do you have a CGI program which lets users change their own proxy passwords?
Pedro L Orso <mailto:[email protected]> has adapted Apache's htpasswd into a CGI program called chpasswd.cgi </htpasswd/chpasswd-cgi.tar.gz>.
10.7 Is there a way to do ident lookups only for a certain host and compare the result with a userlist in squid.conf?
You can use the ident_lookup_access directive to control for which hosts Squid will issue ident lookup requests (see RFC 931 <ftp://ftp.isi.edu/in-notes/rfc931.txt>).
Additionally, if you use an ident ACL in squid.conf, then Squid will make sure an ident lookup is performed while evaluating the ACL even if ident_lookup_access does not indicate ident lookups should be performed. However, Squid does not wait for the lookup to complete unless the ACL rules require it. Consider this configuration:
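A configuration matching the behaviour described in the next paragraph could be sketched as (addresses and usernames taken from that description):

```
acl host1 src 10.0.0.1
acl host2 src 10.0.0.2
acl pals ident kim lisa frank joe
http_access allow host1
http_access allow host2 pals
```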
Requests coming from 10.0.0.1 will be allowed immediately because there are no user requirements for that host. However, requests from 10.0.0.2 will be allowed only after the ident lookup completes, and if the username is in the set kim, lisa, frank, or joe.
All elements of an acl entry are OR'ed together. All elements of an access entry are AND'ed together, e.g. http_access and icp_access.
For example, the following access control conguration will never work:
acl ME src 10.0.0.1
acl YOU src 10.0.0.2
http_access allow ME YOU
In order for the request to be allowed, it must match the "ME" acl AND the "YOU" acl. This is impossible because any IP address could only match one or the other. This should instead be rewritten as:
acl ME src 10.0.0.1
acl YOU src 10.0.0.2
http_access allow ME
http_access allow YOU
The intent here is to allow cache manager requests from the localhost and server addresses, and deny all others. This policy has been expressed here:
http_access deny manager !localhost !server
The problem here is that for allowable requests, this access rule is not matched. For example, if the source IP address is localhost, then "!localhost" is false and the access rule is not matched, so Squid continues checking the other rules. Cache manager requests from the server address work because server is a subset of ourhosts and the second access rule will match and allow the request. Also note that this means any cache manager request from ourhosts would be allowed. To implement the desired policy correctly, the access rules should be rewritten as
http_access allow manager localhost
http_access allow manager server
http_access deny manager
http_access allow ourhosts
http_access deny all
If you're using miss_access, then don't forget to also add a miss_access rule for the cache manager:
miss_access allow manager
You may be concerned that having five access rules instead of three may have an impact on cache performance. In our experience this is not the case. Squid is able to handle a moderate amount of access control checking without degrading overall performance. You may like to verify that for yourself, however.
From now on, your cache.log should contain a line for every request that explains whether it was allowed or denied, and which ACL was the last one that it matched. If this does not give you sufficient information to nail down the problem, you can also enable detailed debug information on ACL processing:
debug_options ALL,1 33,2 28,9
Then restart or reconfigure squid as above. From now on, your cache.log should contain detailed traces of all access list processing. Be warned that this can be quite a few lines per request. See also 11.20.
Proxy A sends an ICP query to Proxy B about an object; Proxy B replies with an ICP HIT. Proxy A forwards the HTTP request to Proxy B, but does not pass on the authentication details, so the HTTP GET from Proxy A fails.
Only ONE proxy cache in a chain is allowed to "use" the Proxy-Authentication request header. Once the header is used, it must not be passed on to other proxies. Therefore, you must allow the neighbor caches to request from each other without proxy authentication. This is simply accomplished by listing the neighbor ACL's first in the list of http_access lines. For example:
acl proxy-A src 10.0.0.1
acl proxy-B src 10.0.0.2
acl user_passwords proxy_auth /tmp/user_passwds

http_access allow proxy-A
http_access allow proxy-B
http_access allow user_passwords
http_access deny all
10.11 Is there an easy way of banning all Destination addresses except one?
acl GOOD dst 10.0.0.1
acl BAD dst 0.0.0.0/0.0.0.0
http_access allow GOOD
http_access deny BAD
10.12 Does anyone have a ban list of porn sites and such?
Jason Staudenmayer <http://members.lycos.co.uk/njadmin>
Pedro Lineu Orso's List <http://web.onda.com.br/orso/>
Linux Center Hong Kong's List <http://www.hklc.com/squidblock/>
Snerpa, an ISP in Iceland, operates a DNS-database of IP-addresses of blacklisted sites containing porn, violence, etc., which is utilized using a small perl-script redirector. Information on this is on the INfilter <http://www.snerpa.is/notendur/infilter/infilter-en.phtml> webpage.
The SquidGuard <http://www.squidguard.org/blacklist/> redirector folks provide a blacklist. Bill Stearns maintains the sa-blacklist <http://www.stearns.org/sa-blacklist/> of known spammers. By blocking the spammer web sites in squid, users can no longer use up bandwidth downloading spam images and html. Even more importantly, they can no longer send out requests for things like scripts and gifs that have a unique identifier attached, showing that they opened the email and making their addresses more valuable to the spammer.
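The problematic domain list discussed next would look something like this sketch (the ACL name is illustrative):

```
acl subdomains dstdomain boulder.co.us vail.co.us co.us
```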
In the first place, the above list is simply wrong because the first two (boulder.co.us and vail.co.us) are unnecessary. Any domain name that matches one of the first two will also match the last one (co.us). Ok, but why does this happen? The problem stems from the data structure used to index domain names in an access control list. Squid uses Splay trees for lists of domain names. As with other tree-based data structures, the searching algorithm requires a comparison function that returns -1, 0, or +1 for any pair of keys (domain names). This is similar to the way that strcmp() works. The problem is that it is wrong to say that co.us is greater-than, equal-to, or less-than boulder.co.us. For example, if you said that co.us is LESS than f.co.us, then the Splay tree searching algorithm might never discover co.us as a match for kkk.co.us. Similarly, if you said that co.us is GREATER than f.co.us, then the Splay tree searching algorithm might never discover co.us as a match for bbb.co.us.
The bottom line is that you can't have one entry that is a subdomain of another. Squid-2.2 will warn you if it detects this condition.
The above configuration denies requests when the URL port number is not in the list. The list allows connections to the standard ports for HTTP, FTP, Gopher, SSL, WAIS, and all non-privileged ports. Another approach is to deny dangerous ports. The dangerous port list should look something like:
acl Dangerous_ports port 7 9 19 22 23 25 53 109 110 119
http_access deny Dangerous_ports
...and probably many others. Please consult the /etc/services file on your system for a list of known ports and protocols.
10.15 Does Squid support the use of a database such as mySQL for storing the ACL list?
Note: The information here is current for version 2.2.
No, it does not.
10.17 How can I allow some clients to use the cache at specic times?
Let's say you have two workstations that should only be allowed access to the Internet during working hours (8:30 - 17:30). You can use something like this:
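One way to express that policy, as a sketch (addresses and ACL names are illustrative; MTWHF is Squid's notation for Monday through Friday):

```
acl WORKING time MTWHF 08:30-17:30
acl OFFICE src 10.0.0.1 10.0.0.2
http_access allow OFFICE WORKING
http_access deny OFFICE
```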
10.18 How can I allow some users to use the cache at specic times?
acl USER1 proxy_auth Dick
acl USER2 proxy_auth Jane
acl DAY time 06:00-18:00

http_access allow USER1 DAY
http_access deny USER1
http_access allow USER2 !DAY
http_access deny USER2
The reason is that IP access lists are stored in "splay" tree data structures. These trees require the keys to be sortable. When you use a complicated, or non-standard, netmask (255.0.0.128), it confuses the function that compares two address/mask pairs. The best way to fix this problem is to use separate ACL names for each ACL value. For example, change the above to:
acl restricted1 src 10.0.0.128/255.0.0.128
acl restricted2 src 10.85.0.0/16
Then, of course, you'll have to rewrite your http_access lines as well.
10.20 Can I set up ACL's based on MAC address rather than IP?
Yes, for some operating systems. Squid calls these "ARP ACLs" and they are supported on Linux, Solaris, and probably BSD variants. NOTE: Squid can only determine the MAC address for clients that are on the same subnet. If the client is on a different subnet, then Squid can not find out its MAC address. To use ARP (MAC) access controls, you first need to compile in the optional code. Do this with the --enable-arp-acl configure option:
% ./configure --enable-arp-acl ... % make clean % make
If src/acl.c doesn't compile, then ARP ACLs are probably not supported on your system. If everything compiles, then you can add some ARP ACL lines to your squid.conf :
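Such ARP ACL lines might look like this sketch (the MAC address and ACL name are illustrative):

```
acl winbox arp 00:00:21:55:ed:22
http_access allow winbox
http_access deny all
```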
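The maxconn configuration being referred to could be sketched as (addresses and limit taken from the description that follows; note maxconn 5 matches when a client holds more than 5 connections):

```
acl OVERCONN maxconn 5
acl clients src 1.2.3.0/24
http_access deny clients OVERCONN
```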
Given the above configuration, when a client whose source IP address is in the 1.2.3.0/24 subnet tries to establish 6 or more connections at once, Squid returns an error page. Unless you use the deny_info feature, the error message will just say "access denied." The maxconn ACL requires the client_db feature. If you've disabled client_db (for example with client_db off) then maxconn ACLs will not work. Note, the maxconn ACL type is kind of tricky because it uses a less-than comparison. The ACL is a match when the number of established connections is greater than the value you specify. Because of that, you don't want to use the maxconn ACL with http_access allow. Also note that you could use maxconn in conjunction with a user type (ident, proxy_auth), rather than an IP address type.
feel you've received this message in error, please contact the support staff ([email protected], 555-1234).
11 Troubleshooting
11.1 Why am I getting "Proxy Access Denied?"
You may need to set up the http_access option to allow requests from your IP addresses. Please see 10 for information about that. If squid is in httpd-accelerator mode, it will accept normal HTTP requests and forward them to an HTTP server, but it will not honor proxy requests. If you want your cache to also accept proxy-HTTP requests then you must enable this feature:
httpd_accel_with_proxy on
Alternately, you may have misconfigured one of your ACLs. Check the access.log and squid.conf files for clues.
11.2 I can't get local_domain to work; Squid is caching the objects from my local servers.
The local_domain directive does not prevent local objects from being cached. It prevents the use of sibling caches when fetching local objects. If you want to prevent objects from being cached, use the cache_stoplist or http_stop configuration options (depending on your version).
11.3 I get Connection Refused when the cache tries to retrieve an object located on a sibling, even though the sibling thinks it delivered the object to my cache.
If the HTTP port number is wrong but the ICP port is correct, you will send ICP queries correctly, and the ICP replies will fool your cache into thinking the configuration is correct, but large objects will fail since you don't have the correct HTTP port for the sibling in your squid.conf file. If your sibling changed their http_port, you could have this problem for some time before noticing.
11.4.1 Linux
Henrik has a How to get many filedescriptors on Linux 2.2.X <http://squid.sourceforge.net/hno/linux-lfd.html> page. You also might want to have a look at the filehandle patch <http://www.linux.org.za/oskar/patches/kernel/filehandle/> by Michael O'Reilly <mailto:[email protected]>. If your kernel version is 2.2.x or greater, you can read and write the maximum number of file handles and/or inodes simply by accessing the special files:
/proc/sys/fs/file-max /proc/sys/fs/inode-max
If your kernel version is between 2.0.35 and 2.1.x (?), you can read and write the maximum number of file handles and/or inodes simply by accessing the special files:
/proc/sys/kernel/file-max
/proc/sys/kernel/inode-max
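As a sketch, you can inspect the current limits through procfs before raising them. The write is root-only, so it is shown commented out; the 8192 value is an arbitrary example, not a recommendation from this FAQ:

```shell
# Query the current kernel limits through the procfs paths listed above.
for f in /proc/sys/fs/file-max /proc/sys/fs/inode-max \
         /proc/sys/kernel/file-max /proc/sys/kernel/inode-max; do
  [ -r "$f" ] && echo "$f = $(cat "$f")" || echo "$f: not present on this kernel"
done
# Raising a limit requires root, e.g. (example value only; size it to your load):
# echo 8192 > /proc/sys/fs/file-max
```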
While this does increase the current number of file descriptors, Squid's configure script probably won't figure out the new value unless you also update the include files, specifically the value of OPEN_MAX in /usr/include/linux/limits.h.
11.4.2 Solaris
Add the following to your /etc/system file to increase your maximum file descriptors per process:
set rlim_fd_max = 4096
Next you should re-run the configure script in the top directory so that it finds the new value. If it does not find the new limit, then you might try editing include/autoconf.h and setting #define DEFAULT_FD_SETSIZE by hand. Note that include/autoconf.h is created from autoconf.h.in every time you run configure. Thus, if you edit it by hand, you might lose your changes later on. If you have a very old version of Squid (1.1.X), and you want to use more than 1024 descriptors, then you must edit src/Makefile and enable $(USE_POLL_OPT). Then recompile Squid.
Jens-S. Voeckler <mailto:voeckler at rvs dot uni-hannover dot de> advises that you should NOT change the default soft limit (rlim_fd_cur) to anything larger than 256. It will break other programs, such as the license manager needed for the SUN workshop compiler. Jens-S. also says that it should be safe to raise the limit for the Squid process as high as 16,384, except that there may be problems during reconfigure or logrotate if all of the lower 256 filedescriptors are in use at the time of the rotate/reconfigure.
3. What is the upper limit? I don't think there is a formal upper limit inside the kernel. All the data structures are dynamically allocated. In practice there might be unintended metaphenomena (kernel spending too much time searching tables, for example).
2. FreeBSD (from the 2.1.6 kernel)
Very similar to SunOS, edit /usr/src/sys/conf/param.c and alter the relationship between maxusers and the maxfiles and maxfilesperproc variables:
int maxfiles = NPROC*2;
int maxfilesperproc = NPROC*2;
Where NPROC is defined by:
#define NPROC (20 + 16 * MAXUSERS)
The per-process limit can also be adjusted directly in the kernel configuration file with the following directive:
options OPEN_MAX=128
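To make the arithmetic concrete, here is a worked example of the param.c formulas above (MAXUSERS=64 is an assumed value for illustration, not a recommendation):

```shell
# NPROC = 20 + 16 * MAXUSERS; maxfiles = NPROC * 2 (from param.c above)
maxusers=64
nproc=$((20 + 16 * maxusers))      # 20 + 16*64 = 1044
echo "maxfiles = $((nproc * 2))"   # prints: maxfiles = 2088
```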
3. BSD/OS (from the 2.1 kernel)
Edit /usr/src/sys/conf/param.c and adjust the maxfiles math here:
int maxfiles = 3 * (NPROC + MAXUSERS) + 80;
Where NPROC is defined by:
#define NPROC (20 + 16 * MAXUSERS)
You should also set the OPEN_MAX value in your kernel configuration file to change the per-process limit.
11.4.5 Reconfigure afterwards

NOTE: After you rebuild/reconfigure your kernel with more filedescriptors, you must then recompile Squid.
Squid's configure script determines how many filedescriptors are available, so you must make sure the configure script runs again as well. For example:
cd squid-1.1.x
make realclean
./configure --prefix=/usr/local/squid
make
These log entries are normal and do not indicate that Squid has reached cache_swap_high. Consult your cache information page in cachemgr.cgi for a line like this:
Storage LRU Expiration Age: 364.01 days
Objects which have not been used for that amount of time are removed as a part of the regular maintenance. You can set an upper limit on the LRU Expiration Age value with reference_age in the config file.
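For instance, a squid.conf fragment like the following would cap the expiration age (the one-month value is an arbitrary example, not a recommendation from this FAQ):

```
reference_age 1 month
```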
11.6 Can I change a Windows NT FTP server to list directories in Unix format?
Why, yes you can! Select the following menus:
-- Oskar Pearson <[email protected]>
on your squid.conf:
cache_peer proxy.parent.com parent 3128 3130
2. You can also see this warning when sending ICP queries to multicast addresses. For security reasons, Squid requires your configuration to list all other caches listening on the multicast group address. If an unknown cache listens to that address and sends replies, your cache will log the warning message. To fix this situation, either tell the unknown cache to stop listening on the multicast address, or, if it is legitimate, add it to your configuration file.
11.8 DNS lookups for domain names with underscores (_) always fail.
The standards for naming hosts (RFC 952 <ftp://ftp.isi.edu/in-notes/rfc952.txt>, RFC 1101 <ftp://ftp.isi.edu/in-notes/rfc1101.txt>) do not allow underscores in domain names: A "name" (Net, Host, Gateway, or Domain name) is a text string up to 24 characters drawn from the alphabet (A-Z), digits (0-9), minus sign (-), and period (.). The resolver library that ships with recent versions of BIND enforces this restriction, returning an error for any host with an underscore in the hostname. The best solution is to complain to the hostmaster of the offending site, and ask them to rename their host. See also the comp.protocols.tcp-ip.domains FAQ <http://www.intac.com/~cdp/cptd-faq/section4.html#underscore>. Some people have noticed that RFC 1033 <ftp://ftp.isi.edu/in-notes/rfc1033.txt> implies that underscores are allowed. However, this is an informational RFC with a poorly chosen example, and not a standard by any means.
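A simplified sketch of the character rules above (letters, digits and "-" in dot-separated labels, no "_" anywhere) can be expressed as a grep pattern. The hostnames and the pattern are illustrative only; this is not Squid's or BIND's actual validator:

```shell
# Labels of letters/digits/hyphens separated by ".", ending in an alphabetic TLD.
valid='^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]+$'
for host in www.example.com bad_host.example.com; do
  if printf '%s\n' "$host" | grep -Eq "$valid"; then
    echo "$host: ok"
  else
    echo "$host: rejected"
  fi
done
```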
11.9 Why does Squid say: "Illegal character in hostname; underscores are not allowed"?
See the above question. The underscore character is not valid for hostnames. Some DNS resolvers allow the underscore, so yes, the hostname might work fine when you don't use Squid. To make Squid allow underscores in hostnames, re-run the configure script with this option:
% ./configure --enable-underscores ...
and then recompile:
% make clean
% make
2. Do not use miss_access at all. Promise your sibling cache administrator that your cache is properly configured and that you will not abuse their generosity. The sibling cache administrator can check his log files to make sure you are keeping your word. If neither of these is realistic, then the sibling relationship should not exist.
That will show all sockets in the LISTEN state. You might also try
netstat -naf inet | grep 8080
If you find that some process has bound to your port, but you're not sure which process it is, you might be able to use the excellent lsof <ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/> program. It will show you which processes own every open file descriptor on your system.
In summary, Basic authentication does not require an implicit end-to-end state, and can therefore be used through a proxy server. Windows NT Challenge/Response authentication requires implicit end-to-end state and will not work through a proxy server.
Squid transparently passes the NTLM request and response headers between clients and servers. NTLM relies on a single end-to-end connection (possibly with men-in-the-middle, but a single connection every step of the way). This implies that for NTLM authentication to work at all with proxy caches, the proxy would need to tightly link the client-proxy and proxy-server links, as well as understand the state of the link at any one time. NTLM through a CONNECT might work, but as far as we know that hasn't been implemented by anyone, and it would prevent the pages being cached, removing the value of the proxy. NTLM authentication is carried entirely inside the HTTP protocol, but it is not a true HTTP authentication protocol and is different from Basic and Digest authentication in many ways.
1. It depends on a stateful end-to-end connection, which collides with RFC 2616's requirement that proxy-servers disjoin the client-proxy and proxy-server connections.
2. It takes place only once per connection, not per request. Once the connection is authenticated, all future requests on the same connection inherit the authentication. The connection must be re-established to set up other authentication or re-identify the user. This too collides with RFC 2616, where authentication is defined as a property of the HTTP messages, not connections.
The reasons why it is not implemented in Netscape are probably:
It is very specific to the Windows platform.
It is not defined in any RFC or even an internet draft.
The protocol has several shortcomings, the most apparent being that it cannot be proxied.
There exists an open internet standard which does mostly the same but without the shortcomings or platform dependencies: digest authentication <ftp://ftp.isi.edu/in-notes/rfc2617.txt>.
nothing is sent to the parent; neither UDP packets, nor TCP connections.
Simply adding default to a parent does not force all requests to be sent to that parent. The term default is perhaps a poor choice of words. A default parent is only used as a last resort. If the cache is able to make direct connections, direct will be preferred over default. If you want to force all requests to go to your parent cache(s), use the never_direct option:
acl all src 0.0.0.0/0.0.0.0
never_direct allow all
11.17 My Squid becomes very slow after it has been running for some time.
This is most likely because Squid is using more memory than it should for your system. When the Squid process becomes large, it experiences a lot of paging. This will very rapidly degrade Squid's performance. Memory usage is a complicated problem. There are a number of things to consider. Then, examine the Cache Manager Info output and look at these two lines:
Number of HTTP requests received: 121104
Page faults with physical i/o:    16720
Note, if your system does not have the getrusage() function, then you will not see the page faults line. Divide the number of page faults by the number of connections. In this case 16720/121104 = 0.14. Ideally this ratio should be in the 0.0 - 0.1 range. It may be acceptable to be in the 0.1 - 0.2 range. Above that, however, you will most likely find that Squid's performance is unacceptably slow. If the ratio is too high, you will need to make some changes as described in 8.9. See also 8.11.
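The division above can be scripted once you have copied the two counters out of the cachemgr "info" page; the numbers here are the ones from the example output:

```shell
# Compute the page-fault ratio from the two Cache Manager counters.
requests=121104
page_faults=16720
awk -v pf="$page_faults" -v req="$requests" \
    'BEGIN { printf "page-fault ratio = %.2f\n", pf / req }'
# prints: page-fault ratio = 0.14
```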
The Squid version
Your Operating System type and version
A clear description of the bug symptoms
If your Squid crashes, the report must include the information described in 11.19.1 below
Please note that bug reports are only processed if they can be reproduced or identified in the current STABLE or development versions of Squid. If you are running an older version of Squid, the first response will be to ask you to upgrade, unless the developer who looks at your bug report can immediately identify that the bug also exists in the current versions. It should also be noted that any patches provided by the Squid developer team will be against the current STABLE version, even if you run an older version.
Resource Limits. The shell has limits on the size of a coredump file. You may need to increase the limit.
sysctl options. On FreeBSD, you won't get a coredump from programs that call setuid() and/or setgid() (like Squid sometimes does) unless you enable this option:
# sysctl -w kern.sugid_coredump=1
No debugging symbols. The Squid binary must have debugging symbols in order to get a meaningful coredump.
Threads and Linux. On Linux, threaded applications do not generate core dumps. When you use the aufs cache_dir type, it uses threads and you can't get a coredump.
or
limits coredump unlimited
Debugging Symbols: To see if your Squid binary has debugging symbols, use this command:
% nm /usr/local/squid/bin/squid | head
The binary has debugging symbols if you see gobbledegook like this:
0812abec B AS_tree_head
080a7540 D AclMatchedName
080a73fc D ActionTable
080908a4 r B_BYTES_STR
080908bc r B_GBYTES_STR
080908ac r B_KBYTES_STR
080908b4 r B_MBYTES_STR
080a7550 D Biggest_FD
08097c0c R CacheDigestHashFuncCount
08098f00 r CcAttrs
Debugging symbols may have been removed by your install program. If you look at the squid binary from the source directory, then it might have the debugging symbols.
Coredump Location: The core dump file will be left in one of the following locations:
1. The coredump_dir directory, if you set that option.
2. The first cache_dir directory, if you have used the cache_effective_user option.
3. The current directory when Squid was started.
Recent versions of Squid report their current directory after starting, so look there first:
2000/03/14 00:12:36| Set Current Directory to /usr/local/squid/cache
If you cannot find a core file, then either Squid does not have permission to write in its current directory, or perhaps your shell limits are preventing the core file from being written. Often you can get a coredump if you run Squid from the command line like this (csh shells and clones):
% limit coredumpsize unlimited
% /usr/local/squid/bin/squid -NCd1
Once you have located the core dump le, use a debugger such as dbx or gdb to generate a stack trace:
tirana-wessels squid/src 270% gdb squid /T2/Cache/core
GDB 4.15.1 (hppa1.0-hp-hpux10.10), Copyright 1995 Free Software Foundation, Inc...
GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
Core was generated by `squid'.
Program terminated with signal 6, Aborted.
[...]
(gdb) where
#0 0xc01277a8 in _kill ()
#1 0xc00b2944 in _raise ()
#2 0xc007bb08 in abort ()
#3 0x53f5c in __eprintf (string=0x7b037048 "", expression=0x5f <Address 0x5f out of bounds>, line=8, fi
#4 0x29828 in fd_open (fd=10918, type=3221514150, desc=0x95e4 "HTTP Request") at fd.c:71
#5 0x24f40 in comm_accept (fd=2063838200, peer=0x7b0390b0, me=0x6b) at comm.c:574
#6 0x23874 in httpAccept (sock=33, notused=0xc00467a6) at client_side.c:1691
#7 0x25510 in comm_select_incoming () at comm.c:784
#8 0x25954 in comm_select (sec=29) at comm.c:1052
#9 0x3b04c in main (argc=1073745368, argv=0x40000dd8) at main.c:671
If possible, you might keep the coredump file around for a day or two. It is often helpful if we can ask you to send additional debugger output, such as the contents of some variables. But please note that a core file is only useful if paired with the exact same binary that generated it. If you recompile Squid, then any coredumps from previous versions will be useless unless you have saved the corresponding Squid binaries, and any attempts to analyze such coredumps will most certainly give misleading information about the cause of the crash. If you CANNOT get Squid to leave a core file for you, then one of the following approaches can be used. The first alternative is to start Squid under the control of GDB:
% gdb /path/to/squid
handle SIGPIPE pass nostop noprint
run -DNYCd3
[wait for crash]
backtrace
quit
The drawback of the above is that it isn't really suitable to run on a production system, as Squid then won't restart automatically if it crashes. The good news is that it is fully possible to automate the process above to automatically get the stack trace and then restart Squid. Here is a short automated script that should work:
#!/bin/sh
trap "rm -f $$.gdb" 0
cat <<EOF >$$.gdb
handle SIGPIPE pass nostop noprint
run -DNYCd3
backtrace
quit
EOF
while sleep 2; do
  gdb -x $$.gdb /path/to/squid 2>&1 | tee -a squid.out
done
Other options, if the above cannot be done, are to: a) build Squid with the --enable-stacktraces option, if support exists for your OS (it exists for Linux glibc on Intel, and for Solaris with some extra libraries which seem rather impossible to find these days); or b) run Squid using the "catchsegv" tool (Linux glibc Intel). These approaches do not provide nearly as much detail as using gdb.
alternatively, you may want to copy it to an FTP or HTTP server where we can download it.
It is very simple to enable full debugging on a running squid process. Simply use the -k debug command line option:
% ./squid -k debug
This causes every debug() statement in the source code to write a line in the cache.log file. You use the same command to restore Squid to the normal debugging level. To enable selective debugging (e.g. for one source file only), you need to edit squid.conf and add to the debug_options line. Every Squid source file is assigned a different debugging section. The debugging section assignments can be found by looking at the top of individual source files, or by reading the file doc/debug-levels.txt (renamed to debug-sections.txt for Squid-2). You also specify the debugging level to control the amount of debugging. Higher levels result in more debugging messages. For example, to enable full debugging of Access Control functions, you would use
debug_options ALL,1 28,9
Then you have to restart or reconfigure Squid. Once you have the debugging captured to cache.log, take a look at it yourself and see if you can make sense of the behaviour which you see. If not, please feel free to send your debugging output to the squid-users or squid-bugs lists.
11.22 FATAL: Failed to make swap directory /var/spool/cache: (13) Permission denied
Starting with version 1.1.15, we have required that you first run
squid -z
to create the swap directories on your filesystem. If you have set the cache_effective_user option, then the Squid process takes on the given userid before making the directories. If the cache_dir directory (e.g. /var/spool/cache) does not exist, and the Squid userid does not have permission to create it, then you will get the "permission denied" error. This can be simply fixed by manually creating the cache directory:
# mkdir /var/spool/cache
# chown <userid> <groupid> /var/spool/cache
# squid -z
Alternatively, if the directory already exists, then your operating system may be returning "Permission Denied" instead of "File Exists" on the mkdir() system call. This patch <store.c-mkdir.patch> by Miquel van Smoorenburg <mailto:[email protected]> should fix it.
So, if you exactly specify the correct average object size, Squid should have 50% of its filemap bits free when the cache is full. You can see how many filemap bits are being used by looking at the 'storedir' cache manager page. It looks like this:
Store Directory #0: /usr/local/squid/cache
First level subdirectories: 4
Second level subdirectories: 4
Maximum Size: 1024000 KB
Current Size: 924837 KB
Percent Used: 90.32%
Filemap bits in use: 77308 of 157538 (49%)
Flags:
Now, if you see the "You've run out of swap file numbers" message, then it means one of two things:
1. You've found a Squid bug.
2. Your cache's average file size is much smaller than the store_avg_object_size value.
To check the average file size of objects currently in your cache, look at the cache manager 'info' page, and you will find a line like:
Mean Object Size: 11.96 KB
To make the warning message go away, set store_avg_object_size to that value (or lower) and then restart Squid.
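As a sanity-check sketch of the 50%-headroom rule above: with the numbers from the sample 'storedir' output (1024000 KB maximum size) and a 13 KB average object size (an assumed value matching Squid's usual store_avg_object_size default), the filemap size works out as roughly 2 * cache_size / avg_object_size:

```shell
# Estimate the number of filemap bits Squid allocates for this store directory.
cache_kb=1024000   # Maximum Size from the 'storedir' page
avg_kb=13          # assumed store_avg_object_size, in KB
awk -v c="$cache_kb" -v a="$avg_kb" 'BEGIN { printf "%d filemap bits\n", 2 * c / a }'
# prints: 157538 filemap bits
```

Note that 157538 matches the "Filemap bits in use: 77308 of 157538" line in the sample output.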
ls -l command:
% ls -l /usr/local/squid/logs/access.log
A process is normally owned by the user who starts it. However, Unix sometimes allows a process to change its owner. If you specified a value for the cache_effective_user option in squid.conf, then that will be the process owner. The files must be owned by this same userid. If all this is confusing, then you probably should not be running Squid until you learn some more about Unix. As a reference, I suggest Learning the UNIX Operating System, 4th Edition <http://www.oreilly.com/catalog/lunix4/>.
11.29 When using a username and password, I cannot access some files.
If I try by way of a test, to access
ftp://username:password@ftpserver/somewhere/foo.tar.gz
I get
somewhere/foo.tar.gz: Not a directory.
11. Troubleshooting
Use this URL instead:
ftp://username:password@ftpserver/%2fsomewhere/foo.tar.gz
112
or
# chown root /usr/local/squid/bin/pinger # chmod 4755 /usr/local/squid/bin/pinger
a cache forwards requests to itself. This might happen with interception caching (or server acceleration) configurations.
a pair or group of caches forward requests to each other. This can happen when Squid uses ICP, Cache Digests, or the ICMP RTT database to select a next-hop cache.
Forwarding loops are detected by examining the Via request header. Each cache which "touches" a request must add its hostname to the Via header. If a cache notices its own hostname in this header for an incoming request, it knows there is a forwarding loop somewhere. NOTE: Squid may report a forwarding loop if a request goes through two caches that have the same visible_hostname value. If you want to have multiple machines with the same visible_hostname then you must give each machine a different unique_hostname so that forwarding loops are correctly detected. When Squid detects a forwarding loop, it is logged to the cache.log file with the received Via header. From this header you can determine which cache (the last in the list) forwarded the request to you. One way to reduce forwarding loops is to change a parent relationship to a sibling relationship. Another way is to use cache_peer_access rules. For example:
# Our parent caches
cache_peer A.example.com parent 3128 3130
cache_peer B.example.com parent 3128 3130
cache_peer C.example.com parent 3128 3130
# An ACL list
acl PEERS src A.example.com
acl PEERS src B.example.com
acl PEERS src C.example.com
# Prevent forwarding loops
cache_peer_access A.example.com allow !PEERS
cache_peer_access B.example.com allow !PEERS
cache_peer_access C.example.com allow !PEERS
The above configuration instructs Squid to NOT forward a request to parents A, B, or C when the request was received from any one of those caches.
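The Via-header check described above can be illustrated with a tiny sketch. This is not Squid source code, and the hostnames are invented; it only shows the idea that a cache which finds its own visible_hostname in the incoming Via header has detected a loop:

```shell
# Simulated incoming Via header listing the caches a request has passed through.
via="1.0 cacheA.example.com:3128 (squid), 1.0 cacheB.example.com:3128 (squid)"
me="cacheB.example.com"   # our own visible_hostname
case "$via" in
  *"$me"*) echo "WARNING: forwarding loop detected" ;;
  *)       echo "no loop" ;;
esac
# prints: WARNING: forwarding loop detected
```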
1998/09/23 09:31:30| storeSwapInFileOpened: /var/cache/00/00/00000015: Size mismatch: 776(fstat) != 3785
1998/09/23 09:31:31| storeSwapInFileOpened: /var/cache/00/00/00000017: Size mismatch: 2571(fstat) != 415
Cannot retrieve
These messages are caused by buggy clients, mostly Netscape Navigator. What happens is, Netscape sends an HTTPS/SSL request over a persistent HTTP connection. Normally, when Squid gets an SSL request, it looks like this:
CONNECT www.buy.com:443 HTTP/1.0
Then Squid opens a TCP connection to the destination host and port, and the real request is sent encrypted over this connection. That's the whole point of SSL: all of the information must be sent encrypted. With this client bug, however, Squid receives a request like this:
GET https://www.buy.com/corp/ordertracking.asp HTTP/1.0
Accept: */*
User-agent: Netscape
...
Now, all of the headers, and the message body have been sent, unencrypted to Squid. There is no way for Squid to somehow turn this into an SSL request. The only thing we can do is return the error message. Note, this browser bug does represent a security risk because the browser is sending sensitive information unencrypted over the network.
confuse content filtering rules on proxies, and possibly some browsers' idea of whether they are trusted sites on the local intranet;
confuse whois (?);
make people think they are not IP addresses and unknown domain names, in an attempt to stop them trying to locate and complain to the ISP.
Any browser or proxy that works with them should be considered a security risk.
RFC 1738 <http://www.ietf.org/rfc/rfc1738.txt> has this to say about the hostname part of a URL:
The fully qualified domain name of a network host, or its IP address as a set of four decimal digit groups separated by ".". Fully qualified domain names take the form as described in Section 3.5 of RFC 1034 [13] and Section 2.1 of RFC 1123 [5]: a sequence of domain labels separated by ".", each domain label starting and ending with an alphanumerical character and possibly also containing "-" characters. The rightmost domain label will never start with a digit, though, which syntactically distinguishes all domain names from the IP addresses.
11.36 I get a lot of "URI has whitespace" error messages in my cache log, what should I do?
Whitespace characters (space, tab, newline, carriage return) are not allowed in URIs and URLs. Unfortunately, a number of Web services generate URLs with whitespace. Of course your favorite browser silently accommodates these bad URLs. The servers (or people) that generate these URLs are in violation of Internet standards. The whitespace characters should be encoded. If you want Squid to accept URLs with whitespace, you have to decide how to handle them. There are four choices that you can set with the uri_whitespace option:
1. DENY: The request is denied with an "Invalid Request" message. This is the default.
2. ALLOW: The request is allowed and the URL remains unchanged.
3. ENCODE: The whitespace characters are encoded according to RFC 1738 <http://www.ietf.org/rfc/rfc1738.txt>. This can be considered a violation of the HTTP specification.
4. CHOP: The URL is chopped at the first whitespace character and then processed normally. This also can be considered a violation of HTTP.
11.37 commBind: Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address
This likely means that your system does not have a loopback network device, or that the device is not properly configured. All Unix systems should have a network device named lo0, and it should be configured with the address 127.0.0.1. If not, you may get the above error message. To check your system, run:
% ifconfig lo0
11.41 What does sslReadClient: FD 14: read failure: (104) Connection reset by peer mean?
"Connection reset by peer" is an error code that Unix operating systems sometimes return for read, write, connect, and other system calls. Connection reset means that the other host, the peer, sent us a RESET packet on a TCP connection. A host sends a RESET when it receives an unexpected packet for a nonexistent connection. For example, if one side sends data at the same time that the other side closes a connection, when the other side receives the data it may send a reset back. The fact that these messages appear in Squid's log might indicate a problem, such as a broken origin server or parent cache. On the other hand, they might be "normal," especially since some applications are known to force connection resets rather than a proper close. You probably don't need to worry about them, unless you receive a lot of user complaints relating to SSL sites.
Rick Jones <mailto:raj at cup dot hp dot com> notes that if the server is running a Microsoft TCP stack, clients receive RST segments whenever the listen queue overflows. In other words, if the server is really busy, new connections receive the reset message. This is contrary to rational behaviour, but is unlikely to change.
It happens because there is no server listening for connections on port 12345. When you see this in response to a URL request, it probably means the origin server web site is temporarily down. It may also mean that your parent cache is down, if you have one.
You want the second process id, 83619 in this case. Create the PID le and put the process id number there. For example:
echo 83619 > /usr/local/squid/logs/squid.pid
2. Use the above technique to find the Squid process id. Send the process a HUP signal, which is the same as squid -k reconfigure:
kill -HUP 83619
% cd squid-x.y
% make distclean
% setenv CFLAGS='-g -Wall'
% ./configure ...
To check that you did it right, you can search for AC_CFLAGS in src/Makefile:
% grep AC_CFLAGS src/Makefile
AC_CFLAGS = -g -Wall
NOTE: some people worry that disabling compiler optimization will negatively impact Squid's performance. The impact should be negligible, unless your cache is really busy and already runs at high CPU usage. For most people, the compiler optimization makes little or no difference at all.
and
3. Disable any advanced TCP features on the Squid system. Disable ECN on Linux with echo 0 > /proc/sys/net/ipv4/tcp_ecn.
4. Upgrade to Squid-2.5.STABLE4 or later to work around a Host header related bug in Cisco PIX HTTP inspection. The Cisco PIX firewall wrongly assumes the Host header can be found in the first packet of the request.
If this error causes serious problems for you and the above does not help, Squid developers would be happy to help you uncover the problem. However, we will require high-quality debugging information from you, such as tcpdump output, server IP addresses, operating system versions, and access.log entries with full HTTP headers. If you want to make Squid give the Zero Sized error on demand, you can use the short C program below. Simply compile and start the program on a system that doesn't already have a server running on port 80. Then try to connect to this fake server through Squid:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
12.7 What is the Squid cache resolution algorithm?

Send ICP queries to all appropriate siblings
Wait for all replies to arrive, with a configurable timeout (the default is two seconds).
Begin fetching the object upon receipt of the first HIT reply, or
Fetch the object from the first parent which replied with MISS (subject to weighting values), or
Fetch the object from the source
The algorithm is somewhat more complicated when firewalls are involved. The single_parent_bypass directive can be used to skip the ICP queries if the only appropriate sibling is a parent cache (i.e., if there's only one place you'd fetch the object from, why bother querying?).
12.10 What are the tradeos of caching with the NLANR cache system?
The NLANR root caches are at the NSF supercomputer centers (SCCs), which are interconnected via NSF's high speed backbone service (vBNS). So inter-cache communication between the NLANR root caches does not cross the Internet. The benefits of hierarchical caching (namely, reduced network bandwidth consumption, reduced access latency, and improved resiliency) come at a price. Caches higher in the hierarchy must field the misses of their descendants. If the equilibrium hit rate of a leaf cache is 50%, half of all leaf references have to be resolved through a second level cache rather than directly from the object's source. If this second level cache has most of the documents, it is usually still a win, but if higher level caches often don't have the document, or become overloaded, then they could actually increase access latency, rather than reduce it.
The LRU expiration age is a dynamically-calculated value. Any objects which have not been accessed for this amount of time will be removed from the cache to make room for new, incoming objects. Another way of looking at this is that it would take your cache approximately this many days to go from empty to full at your current traffic levels. As your cache becomes more busy, the LRU age becomes lower so that more objects will be removed to make room for the new ones. Ideally, your cache will have an LRU age value in the range of at least 3 days. If the LRU age is lower than 3 days, then your cache is probably not big enough to handle the volume of requests it receives. By adding more disk space you could increase your cache hit ratio. The configuration parameter reference_age places an upper limit on your cache's LRU expiration age.
12.13 What is "Failure Ratio at 1.01; Going into hit-only-mode for 5 minutes"?
Consider a pair of caches named A and B. It may be the case that A can reach B, and vice-versa, but B has poor reachability to the rest of the Internet. In this case, we would like B to recognize that it has poor reachability and somehow convey this fact to its neighbor caches. Squid will track the ratio of failed-to-successful requests over short time periods. A failed request is one which is logged as ERR_DNS_FAIL, ERR_CONNECT_FAIL, or ERR_READ_ERROR. When the failed-to-successful ratio exceeds 1.0, Squid will return ICP_MISS_NOFETCH instead of ICP_MISS to neighbors. Note, Squid will still return ICP_HIT for cache hits.
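The ratio in the message is simply failed requests divided by successful requests over the measurement window. As a quick computation (the request counts are invented for illustration):

```shell
# 101 failed vs. 100 successful requests trips the > 1.0 threshold.
failed=101
successful=100
awk -v f="$failed" -v s="$successful" \
    'BEGIN { printf "Failure Ratio at %.2f\n", f / s }'
# prints: Failure Ratio at 1.01
```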
In Squid 1.0 and 1.1, we used internal browser icons with names like gopher-internal-image. Unfortunately, these were not very portable. Not all browsers had internal icons, or even used the same names. Perhaps only Netscape and Mosaic used these names. For Squid 2 we include a set of icons in the source distribution. These icon files are loaded by Squid as cached objects at runtime. Thus, every Squid cache now has its own icons to use in Gopher and FTP listings. Just like other objects available on the web, we refer to the icons with Uniform Resource Locators <ftp://ftp.isi.edu/in-notes/rfc1738.txt>, or URLs.
This number is NOT how much time it takes to handle file descriptor I/O. We simply count the number of times select was called, and divide the total process running time by the number of select calls. This means, on average, it takes your cache .714 seconds to check all the open file descriptors once. But this also includes time select() spends in a wait state when there is no I/O on any file descriptors. My relatively idle workstation cache has similar numbers:
Select loop called: 336782 times, 715.938 ms avg
The proper way to deal with Set-Cookie reply headers, according to RFC 2109, is to cache the whole object, EXCEPT the Set-Cookie header lines.
With Squid-1.1, we cannot filter out specific HTTP headers, so Squid-1.1 does not cache any response which contains a Set-Cookie header. With Squid-2, however, we can filter out specific HTTP headers. But instead of filtering them on the receiving-side, we filter them on the sending-side. Thus, Squid-2 does cache replies with Set-Cookie headers, but it filters out the Set-Cookie header itself for cache hits.
OBJ_DATE is the time when the object was given out by the origin server. This is taken from the HTTP Date reply header.

OBJ_LASTMOD is the time when the object was last modified, given by the HTTP Last-Modified reply header.

OBJ_AGE is how much the object has aged since it was retrieved:

OBJ_AGE = NOW - OBJ_DATE

CLIENT_MAX_AGE is the (optional) maximum object age the client will accept, as taken from the HTTP/1.1 Cache-Control request header.

EXPIRES is the (optional) expiry time from the server reply headers.
These values are compared with the parameters of the refresh_pattern rules. The refresh parameters are:

URL regular expression

CONF_MIN : The time (in minutes) an object without an explicit expiry time should be considered fresh. The recommended value is 0; any higher values may cause dynamic applications to be erroneously cached unless the application designer has taken the appropriate actions.

CONF_PERCENT : A percentage of the object's age (time since last modification) for which an object without an explicit expiry time will be considered fresh.

CONF_MAX : An upper limit on how long objects without an explicit expiry time will be considered fresh.

The URL regular expressions are checked in the order listed until a match is found. Then the algorithms below are applied for determining if an object is fresh or stale.
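In squid.conf, each rule is a refresh_pattern line with exactly these parameters; the patterns and values below are illustrative examples only:

# refresh_pattern [-i] regex CONF_MIN CONF_PERCENT CONF_MAX (times in minutes)
refresh_pattern ^ftp:     1440  20%  10080
refresh_pattern ^gopher:  1440   0%   1440
refresh_pattern .            0  20%   4320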
Kolics Bertold <mailto:[email protected]> has made an excellent flow chart diagram <http://www.squid-cache.org/Doc/FAQ/refresh-flowchart.gif> showing this process.
I can't account for the exact behavior you're seeing, but I can offer this advice; whenever you start measuring raw Ethernet or IP traffic on interfaces, you can forget about getting all the numbers to exactly match what Squid reports as the amount of traffic it has sent/received.

Why? Squid is an application - it counts whatever data is sent to, or received from, the lower-level networking functions; at each successively lower layer, additional traffic is involved (such as header overhead, retransmits and fragmentation, unrelated broadcasts/traffic, etc.). The additional traffic is never seen by Squid and thus isn't counted - but if you run MRTG (or any SNMP/RMON measurement tool) against a specific interface, all this additional traffic will "magically appear".

Also remember that an interface has no concept of upper-layer networking (so an Ethernet interface doesn't distinguish between IP traffic that's entirely internal to your organization, and traffic that's to/from the Internet); this means that when you start measuring an interface, you have to be aware of *what* you are measuring before you can start comparing numbers elsewhere.

It is possible (though by no means guaranteed) that you are seeing roughly equivalent input/output because you're measuring an interface that both retrieves data from the outside world (Internet), *and* serves it to end users (internal clients). That wouldn't be the whole answer, but hopefully it gives you a few ideas to start applying to your own circumstance.

To interpret any statistic, you have to first know what you are measuring; for example, an interface counts inbound and outbound bytes - that's it. The interface doesn't distinguish between inbound bytes from external Internet sites or from internal (to the organization) clients (making requests). If you want that, try looking at RMON2.
Also, if you're talking about a 40% hit rate in terms of object requests/counts, then there's absolutely no reason why you should expect a 40% reduction in traffic; after all, not every request/object is going to be the same size, so you may be saving a lot in terms of requests but very little in terms of actual traffic.
200 OK
203 Non-Authoritative Information
300 Multiple Choices
301 Moved Permanently
410 Gone
However, if Squid receives one of these responses from a neighbor cache, it will NOT be cached if ALL of the Date, Last-Modified, and Expires reply headers are missing. This prevents such objects from bouncing back-and-forth between siblings forever.

7. A 302 Moved Temporarily response is cachable ONLY if the response also includes an Expires header.

8. The following HTTP status codes are "negatively cached" for a short amount of time (configurable):
204 No Content
305 Use Proxy
400 Bad Request
403 Forbidden
404 Not Found
405 Method Not Allowed
414 Request-URI Too Large
500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Time-out

9. All other HTTP status codes are NOT cachable, including:

206 Partial Content
303 See Other
304 Not Modified
401 Unauthorized
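The lifetime of the negatively cached responses from item 8 is set with the negative_ttl directive in squid.conf; five minutes is, as far as we know, the default:

negative_ttl 5 minutes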
Squid keeps the cache disk usage between the low and high water marks. By default the low mark is 90%, and the high mark is 95% of the total configured cache size. When the disk usage is close to the low mark, the replacement is less aggressive (fewer objects removed). When the usage is close to the high mark, the replacement is more aggressive (more objects removed).

When selecting objects for removal, Squid examines some number of objects and determines which can be removed and which cannot. A number of factors determine whether or not any given object can be removed. If the object is currently being requested, or retrieved from an upstream site, it will not be removed. If the object is "negatively-cached" it will be removed. If the object has a private cache key, it will be removed (there would be no reason to keep it; because the key is private, it can never be "found" by subsequent requests). Finally, if the time since last access is greater than the LRU threshold, the object is removed.

The LRU threshold value is dynamically calculated based on the current cache size and the low and high marks. The LRU threshold is scaled exponentially between the high and low water marks. When the store swap size is near the low water mark, the LRU threshold is large. When the store swap size is near the high water mark, the LRU threshold is small. The threshold automatically adjusts to the rate of incoming requests. In fact, when your cache size has stabilized, the LRU threshold represents how long it takes to fill (or fully replace) your cache at the current request rate. Typical values for the LRU threshold are 1 to 10 days.

Back to selecting objects for removal. Obviously it is not possible to check every object in the cache every time we need to remove some of them. We can only check a small subset each time. The way in which this is implemented is very different between Squid-1.1 and Squid-2.
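The low and high water marks themselves are set in squid.conf; the values below are the defaults mentioned above:

cache_swap_low 90
cache_swap_high 95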
12.25.2 Squid 2
For Squid-2 we eliminated the need to use qsort() by indexing cached objects into an automatically sorted linked list. Every time an object is accessed, it gets moved to the top of the list. Over time, the least used objects migrate to the bottom of the list. When looking for objects to remove, we only need to check the last 100 or so objects in the list. Unfortunately this approach increases our memory usage because of the need to store three additional pointers per cache object. But for Squid-2 we're still ahead of the game because we also replaced plain-text cache keys with MD5 hashes.
public object may be sent to multiple clients at the same time. In other words, public objects can be located by any cache client. Private keys can only be located by a single client: the one who requested it. Objects are changed from private to public after all of the HTTP reply headers have been received and parsed. In some cases, the reply headers will indicate the object should not be made public. For example, if the no-cache Cache-Control directive is used.
12.29 What does "WARNING: Reply from unknown nameserver [a.b.c.d]" mean?
It means Squid sent a DNS query to one IP address, but the response came back from a different IP address. By default Squid checks that the addresses match. If not, Squid ignores the response.

There are a number of reasons why this would happen: 1. Your DNS name server just works this way, either because it's been configured to, or because it's stupid and doesn't know any better. 2. You have a weird broadcast address, like 0.0.0.0, in your /etc/resolv.conf file. 3. Somebody is trying to send spoofed DNS responses to your cache.

If you recognize the IP address in the warning as one of your name server hosts, then it's probably number (1) or (2). You can make these warnings stop, and allow responses from "unknown" name servers, by setting this configuration option:
ignore_unknown_nameservers off
12.30 How does Squid distribute cache files among the available directories?
Note: The information here is current for version 2.2.
See storeDirMapAllocate() in the source code. When Squid wants to create a new disk file for storing an object, it first selects which cache_dir the object will go into. This is done with the storeDirSelectSwapDir() function. If you have N cache directories, the function identifies the 3N/4 (75%) of them with the most available space. These directories are then used, in order of having the most available space. When Squid has stored one URL to each of the 3N/4 cache_dir's, the process repeats and storeDirSelectSwapDir() finds a new set of 3N/4 cache directories with the most available space. Once the cache_dir has been selected, the next step is to find an available swap file number. This is accomplished by checking the file map, with the file_map_allocate() function. Essentially the swap file numbers are allocated sequentially. For example, if the last number allocated happens to be 1000, then the next one will be the first number after 1000 that is not already being used.
If server bytes is greater than client bytes, you end up with a negative value. The server bytes may be greater than client bytes for a number of reasons, including:
Cache Digests and other internally generated requests. Cache Digest messages are quite large. They are counted in the server bytes, but since they are consumed internally, they do not count in client bytes.

User-aborted requests. If your quick_abort setting allows it, Squid sometimes continues to fetch aborted requests from the server-side, without sending any data to the client-side.
Some range requests, in combination with Squid bugs, can consume more bandwidth on the server-side than on the client-side. In a range request, the client is asking for only some part of the object. Squid may decide to retrieve the whole object anyway, so that it can be used later on. This means downloading more from the server than sending to the client. You can affect this behavior with the range_offset_limit option.
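For example, in squid.conf (0 KB tells Squid never to fetch more than the client actually requested; -1 would make it always fetch the whole object):

range_offset_limit 0 KB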
Not having private cache keys has some important privacy implications. Two users could receive one response that was meant for only one of the users. This response could contain personal, confidential information. You will need to disable the "zero reqnum" neighbor if you want Squid to use private cache keys.
Then, Squid will always close its side of the connection instead of marking it as half-closed.
Then, in squid.conf, you can select different policies with the cache_replacement_policy option. See the squid.conf comments for details.

The LFUDA and GDS replacement code was contributed by John Dilley and others from Hewlett-Packard. Their work is described in these papers:

1. Enhancement and Validation of Squid's Cache Replacement Policy <http://www.hpl.hp.com/techreports/1999/HPL-1999-69.html> (HP Tech Report).

2. Enhancement and Validation of the Squid Cache Replacement Policy.
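For example, to select the heap-based LFUDA policy in squid.conf (assuming a Squid version built with the heap replacement code):

cache_replacement_policy heap LFUDA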
12.35 Why is actual filesystem space used greater than what Squid thinks?
If you compare df output and cachemgr storedir output, you will notice that actual disk usage is greater than what Squid reports. This may be due to a number of reasons:

Squid doesn't keep track of the size of the swap.state file, which normally resides on each cache_dir.

Directory entries also take up filesystem space.

Other applications might be using the same disk partition.
Your filesystem block size might be larger than what Squid thinks. When calculating total disk usage, Squid rounds file sizes up to a whole number of 1024 byte blocks. If your filesystem uses larger blocks, then some "wasted" space is not accounted.
12.36 How do positive_dns_ttl and negative_dns_ttl work?
positive_dns_ttl is how long Squid caches a successful DNS lookup. Similarly, negative_dns_ttl is how long Squid caches a failed DNS lookup. positive_dns_ttl is not always used. It is NOT used in the following cases:
Squid-2.3 and later versions with internal DNS lookups. Internal lookups are the default for Squid-2.3
and later.
If you applied the "DNS TTL" patch (see section 2.9) for BIND. If you are using FreeBSD, then it already has the DNS TTL patch built in.
Let's say you have the following settings:
positive_dns_ttl 1 hours negative_dns_ttl 1 minutes
When Squid looks up a name like www.squid-cache.org, it gets back an IP address like 204.144.128.89. The address is cached for the next hour. That means, when Squid needs to know the address for www.squid-cache.org again, it uses the cached answer for the next hour. After one hour, the cached information expires, and Squid makes a new query for the address of www.squid-cache.org. If you have the DNS TTL patch, or are using internal lookups, then each hostname has its own TTL value, which was set by the domain name administrator. You can see these values in the 'ipcache' cache manager page. For example:
Hostname               Flags lstref     TTL N
www.squid-cache.org    C      73043   12784 1( 0) 204.144.128.89-OK
www.ircache.net        C      73812   10891 1( 0) 192.52.106.12-OK
polygraph.ircache.net  C     241768 -181261 1( 0) 192.52.106.12-OK
The TTL field shows how many seconds until the entry expires. Negative values mean the entry is already expired, and will be refreshed upon next use. The negative_dns_ttl specifies how long to cache failed DNS lookups. When Squid fails to resolve a hostname, you can be pretty sure that it is a real failure, and you are not likely to get a successful answer within a short time period. Squid retries its lookups many times before declaring a lookup has failed. If you like, you can set negative_dns_ttl to zero.
So why bind in that way? If you know you are interception proxying, then why not bind the local endpoint to the host's (intranet) IP address? Why make the masses suffer needlessly?
Because that's just how ident works. Please read RFC 931 <ftp://ftp.isi.edu/in-notes/rfc931.txt>, in particular the RESTRICTIONS section.
In passive mode, when you request a data transfer, the server tells the client "I am listening on <ip address> <port>." Your client then connects to the server on that IP and port and data flows.
13 Multicast
13.1 What is Multicast?
Multicast is essentially the ability to send one IP packet to multiple receivers. Multicast is often used for audio and video conferencing systems.
3. Multicast does not reduce the number of ICP replies being sent around. It does reduce the number of ICP queries sent, but not the number of replies. 4. Multicast exposes your cache to some privacy issues. There are no special permissions required to join a multicast group. Anyone may join your group and eavesdrop on ICP query messages. However, the scope of your multicast traffic can be controlled such that it does not exceed certain boundaries. We only recommend people use Multicast ICP over network infrastructure which they have close control over. In other words, only use Multicast over your local area network, or maybe your wide area network if you are an ISP. We think it is probably a bad idea to use Multicast ICP over congested links or commodity backbones.
224.9.9.9 is a sample multicast group address. multicast indicates that this is a special type of neighbour. The HTTP-port argument (3128) is ignored for multicast peers, but the ICP-port (3130) is very important. The final argument, ttl=64, specifies the multicast TTL value for queries sent to this address. It is probably a good idea to increment the minimum TTL by a few to provide a margin for error and changing conditions. You must also specify which of your neighbours will respond to your multicast queries, since it would be a bad idea to implicitly trust any ICP reply from an unknown address. Note that ICP replies are sent back to unicast addresses; they are NOT multicast, so Squid has no indication whether a reply is from a regular query or a multicast query. To configure your multicast group neighbours, use the cache_peer directive and the multicast-responder option:
cache_peer cache1 sibling 3128 3130 multicast-responder cache_peer cache2 sibling 3128 3130 multicast-responder
Here all fields are relevant. The ICP port number (3130) must be the same as in the cache_peer line defining the multicast peer above. The third field must either be parent or sibling to indicate how Squid should treat replies. With the multicast-responder flag set for a peer, Squid will NOT send ICP queries to it directly (i.e. unicast).
A good way to determine the TTL you need is to run mtrace as shown above and look at the last line. It will show you the minimum TTL required to reach the other host.
If you set your TTL too high, then your ICP messages may travel "too far" and will be subject to eavesdropping by others. If you're only using multicast on your LAN, as we suggest, then your TTL will be quite small, for example ttl=4.
Of course, all members of your Multicast ICP group will need to use the exact same multicast group address.
NOTE: Choose a multicast group address with care! If two organizations happen to choose the same multicast address, then they may find that their groups "overlap" at some point. This will be especially true if one of the querying caches uses a large TTL value. There are two ways to reduce the risk of group overlap:

1. Use a unique group address

2. Limit the scope of multicast messages with TTLs or administrative scoping.

Using a unique address is a good idea, but not without some potential problems. If you choose an address randomly, how do you know that someone else will not also randomly choose the same address? NLANR has been assigned a block of multicast addresses by the IANA for use in situations such as this. If you would like to be assigned one of these addresses, please write to us <mailto:[email protected]>. However, note that neither NLANR nor IANA has any authority to prevent anyone from using an address assigned to you.

Limiting the scope of your multicast messages is probably a better solution. They can be limited with the TTL value discussed above, or with some newer techniques known as administratively scoped addresses. Here you can configure well-defined boundaries for the traffic to a specific address. The Administratively Scoped IP Multicast RFC <ftp://ftp.isi.edu/in-notes/rfc2365.txt> describes this.
14 System-Dependent Weirdnesses
14.1 Solaris
14.1.1 TCP incompatibility?
J.D. Bronson (jb at ktxg dot com) reported that his Solaris box could not talk to certain origin servers, such as moneycentral.msn.com <http://moneycentral.msn.com/> and www.mbnanetaccess.com <http://www.mbnanetaccess.com>. J.D. fixed his problem by setting:
tcp_xmit_hiwat 49152 tcp_xmit_lowat 4096 tcp_recv_hiwat 49152
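On Solaris these parameters are typically applied with ndd as root (a sketch; verify the values suit your system before using them):

ndd -set /dev/tcp tcp_xmit_hiwat 49152
ndd -set /dev/tcp tcp_xmit_lowat 4096
ndd -set /dev/tcp tcp_recv_hiwat 49152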
14.1.2 select()
select(3c) won't handle more than 1024 file descriptors. The configure script should enable poll() by default for Solaris. poll() allows you to use many more file descriptors, probably 8192 or more.
For older Squid versions you can enable poll() manually by changing HAVE_POLL in include/autoconf.h, or by adding -DUSE_POLL=1 to the DEFINES in src/Makefile.
Apparently nscd serializes DNS queries, thus slowing everything down when an application (such as Squid) hits the resolver hard. You may notice something similar if you run a log processor executing many DNS resolver queries - the resolver starts to slow.. right.. down..

According to Andres Kroonmaa <mailto:andre at online dot ee>, users of Solaris starting from version 2.6 and up should NOT completely disable the nscd daemon. nscd should be running and caching passwd and group files, although it is suggested to disable hosts caching as it may interfere with DNS lookups.

Several library calls rely on available free FILE descriptors FD < 256. Systems running without nscd may fail on such calls if the first 256 files are all in use.

Since Solaris 2.6, Sun has changed the way some system calls work and is using the nscd daemon as an implementor of them. To communicate with nscd, Solaris uses undocumented door calls. Basically nscd is used to reduce the memory usage of user-space system libraries that use passwd and group files. Before 2.6, Solaris cached the full passwd file in library memory on the first use, but as this was considered to use up too much RAM on large multiuser systems, Sun decided to move the implementation of these calls out of the libraries and into a single dedicated daemon.
being started with. The 2.6 ypstart script checks to see if there is a resolv.conf file present when it starts ypserv. If there is, then it starts it with the -d option. This has the same effect as putting the YP_INTERDOMAIN key in the hosts table: namely, that failed NIS host lookups are tried against the DNS by the NIS server. This is a bad thing(tm)! If NIS itself tries to resolve names using the DNS, then the requests are serialised through the NIS server, creating a bottleneck (this is the same basic problem that is seen with nscd). Thus, one failing or slow lookup can, if you have NIS before DNS in the service switch file (which is the most common setup), hold up every other lookup taking place.

If you're running in this kind of setup, then you will want to make sure that:

1. ypserv doesn't start with the -d flag.

2. you don't have the YP_INTERDOMAIN key in the hosts table (find the B=-b line in the yp Makefile and change it to B=)

We changed these here, and saw our average lookup times drop by up to an order of magnitude (~150msec for name-ip queries and ~1.5sec for ip-name queries, the latter still so high, I suspect, because more of these fail and timeout since they are not made so often and the entries are frequently non-existent anyway).
14.1.7 Tuning
Solaris 2.x - tuning your TCP/IP stack and more <http://www.rvs.uni-hannover.de/people/voeckler/tune/EN/tune.html> by Jens-S. Vöckler <http://www.rvs.uni-hannover.de/people/voeckler/>
In a nutshell, the UFS filesystem used by Solaris can't cope very well with the workload squid presents to it. The filesystem will end up becoming highly fragmented, until it reaches a point where there are insufficient free blocks left to create files with, and only fragments available. At this point, you'll get this error and squid will revise its idea of how much space is actually available to it. You can do a "fsck -n <raw device>" (no need to unmount, this checks in read only mode) to look at the fragmentation level of the filesystem. It will probably be quite high (>15%).

Sun suggest two solutions to this problem. One costs money, the other is free but may result in a loss of performance (although Sun do claim it shouldn't, given the already highly random nature of squid disk access).

The first is to buy a copy of VxFS, the Veritas Filesystem. This is an extent-based filesystem and it's capable of having online defragmentation performed on mounted filesystems. This costs money, however (VxFS is not very cheap!)

The second is to change certain parameters of the UFS filesystem. Unmount your cache filesystems and use tunefs to change optimization to "space" and to reduce the "minfree" value to 3-5% (under Solaris 2.6 and higher, very large filesystems will almost certainly have a minfree of 2% already and you shouldn't increase this). You should be able to get fragmentation down to around 3% by doing this, with an accompanying increase in the amount of space available.
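The tunefs step might look like this (device and mount point are hypothetical; the filesystem must be unmounted first):

umount /cache1
tunefs -o space -m 3 /dev/dsk/c0t1d0s0
mount /cache1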
or even higher. The kernel variable ufs_ninode - which is the size of the inode cache itself - scales with ncsize in Solaris 2.5.1 and later. Previous versions of Solaris required both to be adjusted independently, but now it is not recommended to adjust ufs_ninode directly on 2.5.1 and later. You can set ncsize quite high, but at some point - dependent on the application - a too-large ncsize will increase the latency of lookups. Defaults are:
Solaris 2.5.1 : (max_nprocs + 16 + maxusers) + 64 Solaris 2.6/Solaris 7 : 4 * (max_nprocs + maxusers) + 320
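To raise ncsize, add an entry to /etc/system and reboot (the value is illustrative):

* /etc/system
set ncsize = 8192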
data last, and filesystem pages first, if you turn it on (set priority_paging = 1 in /etc/system). As you may know, the Solaris buffer cache grows to fill available pages, and under the old VM system, applications could get paged out to make way for the buffer cache, which can lead to swap thrashing and degraded application performance. The new priority paging helps keep application and shared library pages in memory, preventing the buffer cache from paging them out, until memory gets REALLY short. Solaris 2.5.1 requires patch 103640-25 or higher and Solaris 2.6 requires 105181-10 or higher to get priority paging. Solaris 7 needs no patch, but all versions have it turned off by default.
14.2 FreeBSD
14.2.1 T/TCP bugs
We have found that with FreeBSD-2.2.2-RELEASE, there are some bugs with T/TCP. FreeBSD will try to use T/TCP if you've enabled the "TCP Extensions." To disable T/TCP, use sysinstall to disable TCP Extensions, or edit /etc/rc.conf and set
tcp_extensions="NO" # Allow RFC1323 & RFC1544 extensions (or NO).
Interestingly, it is very common to get only 100 bytes on the first read. When two read() calls are required, this adds additional latency to the overall request. On our caches running Digital Unix, the median dnsserver response time was measured at 0.01 seconds. On our FreeBSD cache, however, the median latency was 0.10 seconds. Here is a simple patch to fix the bug:
Another technique which may help, but does not fix the bug, is to increase the kernel's mbuf size. The default is 128 bytes. The MSIZE symbol is defined in /usr/include/machine/param.h. However, to change it we added this line to our kernel configuration file:
options MSIZE="256"
You will want to comment out the B=-b line so that ypserv does not do DNS lookups.
14.2.4 FreeBSD 3.3: The lo0 (loop-back) device is not configured on startup
Squid requires the loopback interface to be up and configured. If it is not, you will get errors such as those in section 11.37. From FreeBSD 3.3 Errata Notes <http://www.freebsd.org/releases/3.3R/errata.html>: Fix: Assuming that you experience this problem at all, edit /etc/rc.conf and search for where the network_interfaces variable is set. In its value, change the word auto to lo0 since the
The filesystem in question MUST NOT be mounted at this time. After that, softupdates are permanently enabled and the filesystem can be mounted normally. To verify that the softupdates code is running, simply issue a mount command and an output similar to the following will appear:
$ mount
/dev/da2a on /usr/local/squid/cache (ufs, local, noatime, soft-updates, writes: sync 70 async 22
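The enabling step described above is done with tunefs while the filesystem is unmounted; for example (device name hypothetical):

tunefs -n enable /dev/da2a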
You can eliminate the problem by putting the jail's network interface address in the udp_outgoing_address configuration option in squid.conf.
14.3 OSF1/3.2
If you compile both libgnumalloc.a and Squid with cc , the mstats() function returns bogus values. However, if you compile libgnumalloc.a with gcc , and Squid with cc , the values are correct.
14.4 BSD/OS
14.4.1 gcc/yacc
Some people report this problem; see section 2.10.
I've noticed that my Squid process seems to stick at a nice value of four, and clicks back to that even after I renice it to a higher priority. However, looking through the Squid source, I can't find any instance of a setpriority() call, or anything else that would seem to indicate Squid's adjusting its own priority.
by Bill Bogstad <mailto:[email protected]>

BSD Unices traditionally have auto-niced non-root processes to 4 after they used a lot (4 minutes???) of CPU time. My guess is that it's BSD/OS, not Squid, that is doing this. I don't know offhand if there is a way to disable this on BSD/OS.

by Arjan de Vet <mailto:[email protected]>

You can get around this by starting Squid with nice-level -4 (or another negative value).

by Bert Driehuis <mailto:bert driehuis at nl dot compuware dot com>

The autonice behavior is a leftover from the history of BSD as a university OS. It penalises CPU bound jobs by nicing them after using 600 CPU seconds. Adding
sysctl -w kern.autonicetime=0
14.5 Linux
14.5.1 Cannot bind socket FD 5 to 127.0.0.1:0: (49) Can't assign requested address
Try a different version of Linux. We have received many reports of this "bug" from people running Linux 2.0.30. The bind(2) system call should NEVER give this error when binding to port 0.
14.5.2 FATAL: Don't run Squid as root, set 'cache_effective_user'!
Some users have reported that setting cache_effective_user to nobody under Linux does not work. However, it appears that using any cache_effective_user other than nobody will succeed. One solution is to create a user account for Squid and set cache_effective_user to that. Alternately you can change the UID for the nobody account from 65535 to 65534. Another problem is that RedHat 5.0 Linux seems to have a broken setresuid() function. There are two ways to fix this. Before running configure:
% setenv ac_cv_func_setresuid no
% ./configure ...
% make clean
% make install
Or after running configure, manually edit include/autoconf.h and change the HAVE_SETRESUID line to:
#define HAVE_SETRESUID 0
Also, some users report this error is due to a NIS configuration problem. By adding compat to the passwd and group lines of /etc/nsswitch.conf, the problem goes away. (Ambrose Li <mailto:[email protected]>).
Russ Mellon <mailto:[email protected]> notes that these problems with cache_effective_user are fixed in version 2.2.x of the Linux kernel.
14.5.3 Large ACL lists make Squid slow
The regular expression library which comes with Linux is known to be very slow. Some people report it entirely fails to work after long periods of time. To fix, use the GNUregex library included with the Squid source code. With Squid-2, use the --enable-gnuregex configure option.
14.5.5 assertion failed: StatHist.c:91: `statHistBin(H, max) == H->capacity - 1' on Alpha system.
by Jamie Raymond <mailto:[email protected]> Some early versions of Linux have a kernel bug that causes this. All that is needed is a recent kernel that doesn't have the mentioned bug.
Found this on the FreeBSD mailing list:

From: Robert Watson

As Bill Fumerola has indicated, and I thought I'd follow up with a bit more detail, the behavior you're seeing is the result of a bug in the FreeBSD IPFW code. FreeBSD did a direct comparison of the TCP header flag field with an internal field in the IPFW rule description structure. Unfortunately, at some point, someone decided to overload the IPFW rule description structure field to add a flag representing "ESTABLISHED". They used a flag value that was previously unused by the TCP protocol (which doesn't make it safer, just less noticeable). Later, when that flag was allocated for ECN (Endpoint Congestion Notification) in TCP, and Linux began using ECN by default, the packets began to match ESTABLISHED rules regardless of the other TCP header flags. This bug was corrected on the RELENG_4 branch, and a security advisory for the bug was released. This was, needless to say, a pretty serious bug, and a good example of why you should be very careful to compare only the bits you really mean to, and should separate packet state from protocol state in management structures, as well as make use of extensive testing to make sure rules actually have the effect you describe.
See also the thread on the NANOG mailing list <http://answerpointe.cctec.com/maillists/nanog/historical/0104/msg RFC3168 "The Addition of Explicit Congestion Notification (ECN) to IP, PROPOSED STANDARD" <ftp://ftp.isi.edu/in-notes/rfc3168.txt>, Sally Floyd's page on ECN and problems related to it <http://www.aciri.org/floyd/ecn.html>, or the ECN Hall of Shame <http://urchin.earth.li/ecn/> for more information.
14.6 HP-UX
14.6.1 StatHist.c:74: failed assertion `statHistBin(H, min) == 0'
This was a very mysterious and unexplainable bug with GCC on HP-UX. Certain functions, when specified as static, would cause math bugs. The compiler also failed to handle implied int-double conversions properly. These bugs should all be handled correctly in Squid version 2.2.
14.7 IRIX
14.7.1 dnsserver always returns 255.255.255.255
There is a problem with GCC (2.8.1 at least) on Irix 6 which causes it to always return the string 255.255.255.255 for ANY address when calling inet_ntoa(). If this happens to you, compile Squid with the native C compiler instead of GCC.
14.8 SCO-UNIX
by F.J. Bosscha <mailto:[email protected]> To make Squid run comfortably on SCO Unix you need to do the following: increase the NOFILES parameter and the NUMSP parameter, and recompile squid. Although Squid reported in the cache.log file that it had 3000 file descriptors, I still saw messages that no more file descriptors were available. After I also increased the NUMSP value, the problems were gone.
One thing left is the number of TCP connections the system can handle. The default is 256, but I increased that as well because of the number of clients we have.
14.9 AIX
14.9.1 "shmat failed" errors with diskd
32-bit processes on AIX are restricted by default to a maximum of 11 shared memory segments. This restriction can be removed on AIX 4.2.1 and later by setting the environment variable EXTSHM=ON in the script or shell which starts squid.
set the LDR_CNTRL environment variable, e.g. LDR_CNTRL="MAXDATA=0x80000000"; or link with -bmaxdata:0x80000000; or patch the squid binary
See IBM's documentation <http://publibn.boulder.ibm.com/doc link/en US/a doc lib/aixprggd/genprogc/lrg prg s on large program support for more information, including how to patch an already-compiled program.
15 Redirectors
15.1 What is a redirector?
Squid has the ability to rewrite requested URLs. Implemented as an external process (similar to a dnsserver), Squid can be configured to pass every incoming URL through a redirector process that returns either a new URL, or a blank line to indicate no change. The redirector program is NOT a standard part of the Squid package. However, some examples are provided below, and in the "contrib/" directory of the source distribution. Since everyone has different needs, it is up to the individual administrators to write their own implementation.
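The protocol described above can be sketched as a small script. This is a hypothetical example, not one of the programs shipped in "contrib/": Squid writes one request per line (URL, client address, ident, method) and expects a replacement URL or a blank line in reply. The rewrite rule and host names here are made up for illustration.

```python
#!/usr/bin/env python
# Minimal Squid redirector sketch (hypothetical example, not part of Squid).
# Squid writes one request per line: "URL client-ip/fqdn ident method";
# the redirector answers with a replacement URL, or a blank line for "no change".
import sys

def rewrite(url):
    # Illustrative rule: send one hypothetical host to a local mirror.
    prefix = "http://example.com/"
    if url.startswith(prefix):
        return "http://mirror.example.com/" + url[len(prefix):]
    return ""  # blank line tells Squid to leave the URL unchanged

def main():
    for line in sys.stdin:
        fields = line.split()
        if not fields:
            continue
        sys.stdout.write(rewrite(fields[0]) + "\n")
        sys.stdout.flush()  # replies must be unbuffered, or Squid will hang

if __name__ == "__main__":
    main()
```

Note the flush after every reply: Squid feeds the redirector one request at a time over a pipe, so a buffered reply stalls the request it belongs to.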
Please see sections 10.3.2 and 10.3.3 of RFC 2068 <ftp://ftp.isi.edu/in-notes/rfc2068.txt> for an explanation of the 301 and 302 HTTP reply codes.
and many of the redirector requests don't have a username in the ident field.
Squid does not delay a request to wait for an ident lookup, unless you use the ident ACLs. Thus, it is very likely that the ident was not available at the time of calling the redirector, but became available by the time the request is complete and logged to access.log. If you want to block requests waiting for ident lookup, try something like this:
acl foo ident REQUIRED
http_access allow foo
16 Cache Digests
Cache Digest FAQs compiled by Niall Doherty <mailto:[email protected]>.
Latency is eliminated and client response time should be improved. Network utilisation may be improved.
Note that the use of Cache Digests (for querying the cache contents of peers) and the generation of a Cache Digest (for retrieval by peers) are independent. So, it is possible for a cache to make a digest available for peers, and not use the functionality itself and vice versa.
A vector (1-dimensional array) of m bits is allocated, with all bits initially set to 0. A number, k, of independent hash functions are chosen, h1, h2, ..., hk, with range {1, ..., m} (i.e. a key hashed with any of these functions gives a value between 1 and m inclusive).
The set of n keys to be operated on are denoted by: A = {a1, a2, a3, ..., an}.
16.3.1 Adding a Key
To add a key, the value of each hash function for that key is calculated. So, if the key is denoted by a, then h1(a), h2(a), ..., hk(a) are calculated. The value of each hash function for that key represents an index into the array and the corresponding bits are set to 1. So, a digest with 6 hash functions would have 6 bits set to 1 for each key added. Note that the addition of a number of different keys could cause one particular bit to be set to 1 multiple times.
If any of the corresponding bits in the array are 0 then the key is not present. If all of the corresponding bits in the array are 1 then the key is likely to be present.
Note the term likely. It is possible that a collision in the digest can occur, whereby the digest incorrectly indicates a key is present. This is the price paid for the compact representation. While the probability of a collision can never be reduced to zero it can be controlled. Larger values for the ratio of the digest size to the number of entries added lower the probability. The number of hash functions chosen also influences the probability.
When adding a key, set appropriate bits to 1 and increment the corresponding counters. When deleting a key, decrement the appropriate counters (while > 0), and if a counter reaches 0 then
the corresponding bit is set to 0.
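The scheme described above (bit array, k hash functions, and the counter variant that makes deletions possible) can be sketched in a few lines. This is a simplified illustration, not Squid code; deriving the k positions from chunks of an MD5 hash is one simple choice, close in spirit to what Squid does.

```python
# Sketch of the Bloom filter scheme described above (simplified, not Squid code):
# an m-position array, k hash functions, plus counters so deletions work.
import hashlib

class CountingBloom:
    def __init__(self, m, k):
        assert 1 <= k <= 4          # four 32-bit chunks fit in one MD5 hash
        self.m, self.k = m, k
        self.counts = [0] * m       # one counter per bit position

    def _positions(self, key):
        # Derive k indexes from chunks of an MD5 hash of the key.
        digest = hashlib.md5(key.encode()).digest()
        return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def add(self, key):
        for pos in self._positions(key):
            self.counts[pos] += 1   # a bit is "1" while its counter is > 0

    def delete(self, key):
        for pos in self._positions(key):
            if self.counts[pos] > 0:
                self.counts[pos] -= 1  # counter reaching 0 clears the bit

    def probably_contains(self, key):
        # All k bits set -> "likely present"; any bit clear -> definitely absent.
        return all(self.counts[pos] > 0 for pos in self._positions(key))
```

A plain (non-counting) digest is the same structure with the counters collapsed to single bits, which is why it cannot support deletions.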
When a digest rebuild occurs, the change in the cache size (capacity) is measured. If the capacity has changed by a large enough amount (10%) then the digest array is freed and reallocated; otherwise the same digest is re-used.
16.5 What hash functions (and how many of them) does Squid use?
The protocol design allows for a variable number of hash functions (k). However, Squid employs a very efficient method using a fixed number - four. Rather than computing a number of independent hash functions over a URL, Squid uses a 128-bit MD5 hash of the key (actually a combination of the URL and the HTTP retrieval method) and then splits this into four equal chunks. Each chunk, modulo the digest size (m), is used as the value for one of the hash functions - i.e. an index into the bit array. Note: As Squid retrieves objects and stores them in its cache on disk, it adds them to the in-RAM index using a lookup key which is an MD5 hash - the very one discussed above. This means that the values for the Cache Digest hash functions are already available and consequently the operations are extremely efficient! Obviously, modifying the code to support a variable number of hash functions would prove a little more difficult and would most likely reduce efficiency.
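The chunk-splitting step above can be shown directly. This is a sketch, not Squid's actual code: the byte order and the exact way method and URL are combined into the store key are assumptions here, but the shape of the computation (128-bit MD5, four 32-bit chunks, each taken modulo m) matches the description.

```python
# How the four hash values are derived, per the description above: take the
# 128-bit MD5 of the store key and split it into four 32-bit chunks, each
# reduced modulo the digest size m. (Sketch only; the byte order and the
# method+URL key construction are assumptions, not Squid's exact code.)
import hashlib

def digest_positions(method, url, m):
    md5 = hashlib.md5((method + url).encode()).digest()    # 16 bytes = 128 bits
    chunks = [md5[i:i + 4] for i in range(0, 16, 4)]       # four 32-bit chunks
    return [int.from_bytes(c, "big") % m for c in chunks]  # four array indexes
```

Because the MD5 of the store key already exists in the in-RAM index, Squid gets all four "hash functions" for the price of a split and four modulo operations.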
16.7 Does Squid support deletions in Cache Digests? What are diffs/deltas?
Squid does not support deletions from the digest. Because of this the digest must, periodically, be rebuilt from scratch to erase stale bits and prevent digest pollution. A more sophisticated option is to use diffs or deltas. These would be created by building a new digest and comparing it with the current/old one. They would essentially consist of aggregated deletions and additions since the previous digest. Since less bandwidth should be required using these it would be possible to have more frequent updates (and hence, more accurate information).
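What such a delta would contain can be sketched as a bitwise comparison of two digests of equal size. This is purely illustrative of the idea above - diffs/deltas are not implemented in Squid:

```python
# Hypothetical sketch of the "diffs/deltas" idea: compare an old and a new
# digest (byte arrays of equal length) and extract which bits were turned on
# and which were turned off since the previous digest. Not implemented in
# Squid; shown only to illustrate what a delta would carry.
def digest_delta(old, new):
    assert len(old) == len(new)
    additions = bytes(n & ~o & 0xFF for o, n in zip(old, new))  # bits 0 -> 1
    deletions = bytes(o & ~n & 0xFF for o, n in zip(old, new))  # bits 1 -> 0
    return additions, deletions

def apply_delta(old, additions, deletions):
    # Peer-side reconstruction: set the added bits, clear the deleted ones.
    return bytes((o | a) & ~d & 0xFF for o, a, d in zip(old, additions, deletions))
```

Only the two (mostly-zero, hence compressible) delta bitmaps would need to cross the network, which is where the bandwidth saving would come from.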
RAM - extra RAM needed to hold two digests while comparisons take place.
CPU - probably a negligible amount.
when store rebuild completes after startup (the cache contents have been indexed in RAM), and periodically thereafter. Currently, it is rebuilt every hour (more data and experience is required before other periods, whether fixed or dynamically varying, can "intelligently" be chosen). The good thing is that the local cache decides on the expiry time and peers must obey (see later). While the [new] digest is being built in RAM the old version (stored on disk) is still valid, and will be returned to any peer requesting it. When the digest has completed building it is then swapped out to disk, overwriting the old version. The rebuild is CPU intensive, but not overly so. Since Squid is programmed using an event-handling model, the approach taken is to split the digest building task into chunks (i.e. chunks of entries to add) and to register each chunk as an event. If CPU load is overly high, it is possible to extend the build period - as long as it is finished before the next rebuild is due! It may prove more efficient to implement the digest building as a separate process/thread in the future...
Approximately the same amount of memory will be (re-)allocated on every rebuild of the digest; the memory requirements are probably quite small (when compared to other requirements of the cache server).
if ongoing updates of the digest are to be supported (e.g. additions/deletions) it will be necessary to perform these operations on a digest in RAM.
if diffs/deltas are to be supported the "old" digest would have to be swapped into RAM anyway for the comparisons.
When the digest is built in RAM, it is then swapped out to disk, where it is stored as a "normal" cache item - which is how peers request it.
Recovery - If stopped and restarted, peer digests can be reused from the local on-disk copy (they will soon be validated using an HTTP IMS request to the appropriate peers as discussed earlier).
Sharing - peer digests are stored as normal objects in the cache. This allows them to be given to neighbour caches.
16.11 How are the Cache Digest statistics in the Cache Manager to be interpreted?
Cache Digest statistics can be seen from the Cache Manager or through the squidclient utility. The following examples show how to use the squidclient utility to request the list of possible operations from the localhost, local digest statistics from the localhost, refresh statistics from the localhost and local digest statistics from another cache, respectively.
squidclient mgr:menu
squidclient mgr:store_digest
squidclient mgr:refresh
squidclient -h peer mgr:store_digest
The available statistics provide a lot of useful debugging information. The refresh statistics include a section for Cache Digests which explains why items were added (or not) to the digest. The following example shows local digest statistics for a 16GB cache in a corporate intranet environment (may be a useful reference for the discussion below).
store digest: size: 768000 bytes
entries: count: 588327 capacity: 1228800 util: 48%
deletion attempts: 0
bits: per entry: 5 on: 1953311 capacity: 6144000 util: 32%
bit-seq: count: 2664350 avg.len: 2.31
added: 588327 rejected: 528703 ( 47.33 %) del-ed: 0
collisions: on add: 0.23 % on rej: 0.23 %
entries:capacity is a measure of how many items "are likely" to be added to the digest. It represents the number of items that were in the local cache at the start of digest creation - however, upper and lower limits currently apply. This value is multiplied by bits: per entry (an arbitrary constant) to give bits:capacity , which is the size of the cache digest in bits. Dividing this by 8 will give store digest: size which is the size in bytes.
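The size arithmetic above can be checked against the example statistics: 1228800 entries of capacity times 5 bits per entry gives the bits capacity, and dividing by 8 gives the digest size in bytes.

```python
# The arithmetic described above, checked against the example cachemgr output:
# bits:capacity = entries:capacity * bits-per-entry, and store digest: size
# is that figure divided by 8 (bits -> bytes).
entries_capacity = 1228800    # entries: capacity from the example
bits_per_entry = 5            # bits: per entry from the example
bits_capacity = entries_capacity * bits_per_entry
digest_size_bytes = bits_capacity // 8
print(bits_capacity, digest_size_bytes)
```

The results (6144000 bits, 768000 bytes) match the bits: capacity and store digest: size figures in the example output.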
The number of items represented in the digest is given by entries:count. This should be equal to added minus deletion attempts. Since (currently) no modifications are made to the digest after the initial build (no additions are made and deletions are not supported) deletion attempts will always be 0 and entries:count should simply be equal to added.
entries:util is not really a significant statistic. At most it gives a measure of how many of the items in the store were deemed suitable for entry into the cache compared to how many were "prepared" for. rej shows how many objects were rejected. Objects will not be added for a number of reasons, the most common being refresh pattern settings. Remember that (currently) the default refresh pattern will be used for checking for entry here and also note that changing this pattern can significantly affect the number of items added to the digest! Too relaxed and False Hits increase, too strict and False Misses increase. Remember also that at time of validation (on the peer) the "real" refresh pattern will be used - so it is wise to keep the default refresh pattern conservative.

bits:on indicates the number of bits in the digest that are set to 1. bits:util gives this figure as a percentage of the total number of bits in the digest. As we saw earlier, a figure of 50% represents the optimal trade-off. Values too high (say > 75%) would cause a larger number of collisions, and hence False Hits, while lower values mean the digest is under-utilised (using unnecessary RAM). Note that low values are normal for caches that are starting to fill up.
A bit sequence is an uninterrupted sequence of bits with the same value. bit-seq: avg.len gives some insight into the quality of the hash functions. Long values indicate a problem, even if bits:util is 50% (> 3 = suspicious, > 10 = very suspicious).
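A rough way to see why avg.len is a quality indicator (an observation of this FAQ sketch, not something Squid computes): for independently random bits where a fraction u are set to 1, the expected average run length is about 1 / (2u(1-u)). At 50% utilisation this gives the ideal figure of 2; at the example's 32% utilisation it gives about 2.30, very close to the reported avg.len of 2.31, suggesting healthy hash functions.

```python
# Expected average bit-run length for independently random bits with a
# fraction u set to 1 (a sanity-check formula, not Squid code): runs of both
# kinds average out to roughly 1 / (2 * u * (1 - u)).
def expected_run_length(u):
    return 1.0 / (2.0 * u * (1.0 - u))

# ~2.0 at 50% utilisation; ~2.30 at the example's 32% utilisation,
# close to the reported avg.len of 2.31.
print(expected_run_length(0.5), expected_run_length(0.32))
```

Values of avg.len well above this prediction mean set bits are clumping, i.e. the hash functions are not spreading keys uniformly.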
16.12 What are False Hits and how should they be handled?
A False Hit occurs when a cache believes a peer has an object and asks the peer for it, but the peer is not able to satisfy the request. Expiring or stale objects on the peer are frequent causes of False Hits. At the time of the query actual refresh patterns are used on the peer and stale entries are marked for revalidation. However, revalidation is prohibited unless the peer is behaving as a parent, or miss_access is enabled. Thus, clients can receive error messages instead of revalidated objects! The frequency of False Hits can be reduced but never eliminated completely, therefore there must be a robust way of handling them when they occur. The philosophy behind the design of Squid is to use lightweight techniques, optimise for the common case, and robustly handle the unusual case (False Hits). Squid will soon support the HTTP only-if-cached header. Requests for objects made to a peer will use this header and if the objects are not available, the peer can reply appropriately allowing Squid to recognise the situation. The following describes what Squid is aiming towards:
Cache Digests used to obtain good estimates of where a requested object is located in a Cache Hierarchy.
Persistent HTTP Connections between peers. There will be no TCP startup overhead and both latency and network load will be similar to ICP (i.e. fast).
HTTP False Hit Recognition using the only-if-cached HTTP header - allowing fall back to another peer or, if no other peers are available with the object, then going direct (or through a parent if behind a firewall).
The HTTP headers of the request are available. Two header types are of particular interest:
X-Cache - this shows whether an object is available or not.
X-Cache-Lookup - this keeps the result of a store table lookup before refresh-causing rules are checked (i.e. it indicates if the object is available before any validation would be attempted).
If A requests the object from B (which it will if the digest lookup indicates B has it - assuming B is closest peer of course :-) then there will be another set of these headers from B. If the X-Cache header from B shows a MISS a False Hit has occurred. This means that A thought B had an object but B tells A it does not have it available for retrieval. The reason why it is not available for retrieval is indicated by the X-Cache-Lookup header. If:
X-Cache-Lookup = MISS then either A's (version of B's) digest is out-of-date or corrupt OR a collision occurred in the digest (very small probability) OR B recently purged the object.
X-Cache-Lookup = HIT then B had the object, but refresh rules (or A's max-age requirements) prevent A from getting a HIT (validation failed).
16.13.5 Use The Source
If there is something else you need to check you can always look at the source code. The main Cache Digest functionality is organised as follows:
CacheDigest.c (debug section 70) Generic Cache Digest routines
store_digest.c (debug section 71) Local Cache Digest routines
peer_digest.c (debug section 72) Peer Cache Digest routines
Note that in the source the term Store Digest refers to the digest created locally. The Cache Digest code is fairly self-explanatory (once you understand how Cache Digests work):
You'll notice the format is similar to an Internet Draft. We decided not to submit this document as a draft because Cache Digests will likely undergo some important changes before we want to try to make it a standard.
16.16 Would it be possible to stagger the timings when cache digests are retrieved from peers?
Note: The information here is current for version 2.2.
Squid already has code to spread the digest updates. The algorithm is currently controlled by a few hardcoded constants in peer_digest.c. For example, the GlobDigestReqMinGap variable determines the minimum interval between two requests for a digest. You may want to try to increase the value of GlobDigestReqMinGap
from 60 seconds to whatever you feel comfortable with (but it should be smaller than hour/number of peers, of course). Note that whatever you do, you still need to give Squid enough time and bandwidth to fetch all the digests. Depending on your environment, that bandwidth may be more or less than ICP would require. Upcoming digest deltas (x10 smaller than the digests themselves) may be the only way to solve the "big scale" problem.
17 Interception Caching/Proxying
How can I make my users' browsers use my cache without conguring the browsers for proxying?
First, it is critical to read the full comments in the squid.conf file! That is the only authoritative source for configuration information. However, the following instructions are correct as of this writing (July 1999). Getting interception caching to work requires four distinct steps:

1. Compile and run a version of Squid which accepts connections for other addresses. For some operating systems, you need to have configured and built a version of Squid which can recognize the hijacked connections and discern the destination addresses. For Linux this seems to work automatically. For *BSD-based systems, you probably have to configure squid with the --enable-ipf-transparent option. (Do a make clean if you previously configured without that option, or the correct settings may not be present.)

2. Configure Squid to accept and process the connections. You have to change the Squid configuration settings to recognize the hijacked connections and discern the destination addresses. Here are the important settings in squid.conf:
http_port 8080
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
3. Get your cache server to accept the packets. You have to configure your cache host to accept the redirected packets - any IP address, on port 80 - and deliver them to your cache application. This is typically done with IP filtering/forwarding features built into the kernel. On Linux this is called iptables (kernel 2.4.x), ipchains (2.2.x) or ipfwadm (2.0.x). On FreeBSD it's called ipfw. Other BSD systems may use ipfilter or ipnat. On most systems, it may require rebuilding the kernel or adding a new loadable kernel module.

4. Get the packets to your cache server. There are several ways to do this. First, if your proxy machine is already in the path of the packets (i.e. it is routing between your proxy users and the Internet) then you don't have to worry about this step. This would be true if you install Squid on a firewall machine, or on a UNIX-based router. If the cache is not in the natural path of the connections, then you have to divert the packets from the normal path to your cache host using a router or switch. You may be able to do this with a Cisco router using their "route maps" feature, depending on your IOS version. You might also use a so-called layer-4 switch, such as the Alteon ACE-director or the Foundry Networks ServerIron. Finally, you might be able to use a stand-alone router/load-balancer type product, or the routing capabilities of an access server.
Notes:
The http_port 8080 in this example assumes you will redirect incoming port 80 packets to port 8080 on your cache machine. If you are running Squid on port 3128 (for example) you can leave it there via http_port 3128, and redirect to that port via your IP filtering or forwarding commands.
In the httpd_accel_host option, virtual is the magic word!
The httpd_accel_with_proxy on is required to enable interception proxy mode; essentially in interception proxy mode Squid thinks it is acting both as an accelerator (hence accepting packets for other IPs on port 80) and a caching proxy (hence serving files out of cache.)
You must use httpd_accel_uses_host_header on to get the cache to work properly in interception mode. This enables the cache to index its stored objects under the true hostname, as is done in a normal proxy, rather than under the IP address. This is especially important if you want to use a parent cache hierarchy, or to share cache data between interception proxy users and non-interception proxy users, which you can do with Squid in this configuration.
Modify your startup scripts to enable ipnat. For example, on FreeBSD it looks something like this:
/sbin/modload /lkm/if_ipl.o
/sbin/ipnat -f /etc/ipnat.rules
chgrp nobody /dev/ipnat
chmod 644 /dev/ipnat
17.1.3 Configure Squid

Squid-2

Squid-2 (after version beta25) has IP filter support built in. Simply enable it when you run configure:
./configure --enable-ipf-transparent
Note, you don't have to use port 8080, but it must match whatever you used in the /etc/ipnat.rules file.
Squid-1.1

Patches for Squid-1.X are available from Quinton Dolan's Squid page <http://www.fan.net.au/~q/squid/>. Add these lines to squid.conf:

http_port 8080
httpd_accel virtual 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
Note: Interception proxying does NOT work with Linux 2.0.30! Linux 2.0.29 is known to work well. If you're using a more recent kernel, like 2.2.X, then you should probably use an ipchains configuration; see section 17.3.
You may also need to enable IP Forwarding. One way to do it is to add this line to your startup scripts:
echo 1 > /proc/sys/net/ipv4/ip_forward
Go to the Linux IP Firewall and Accounting <http://www.xos.nl/linux/ipfwadm/> page, obtain the source distribution to ipfwadm and install it. Older versions of ipfwadm may not work. You might need at least version 2.3.0. You'll use ipfwadm to setup the redirection rules. I added this rule to the script that runs from /etc/rc.d/rc.inet1 (Slackware) which sets up the interfaces at boot-time. The redirection should be done before any other Input-accept rule. To really make sure it worked I disabled the forwarding (masquerading) I normally do.
/etc/rc.d/rc.firewall:
#!/bin/sh
# rc.firewall   Linux kernel firewalling rules
FW=/sbin/ipfwadm

# Flush rules, for testing purposes
for i in I O F # A   # If we enabled accounting too
do
  ${FW} -$i -f
done

# Default policies:
${FW} -I -p rej    # Incoming policy: reject (quick error)
${FW} -O -p acc    # Output policy: accept
${FW} -F -p den    # Forwarding policy: deny

# Input Rules:
# Loopback-interface (local access, eg, to local nameserver):
${FW} -I -a acc -S localhost/32 -D localhost/32

# Local Ethernet-interface:
# Redirect to Squid proxy server:
${FW} -I -a acc -P tcp -D default/0 80 -r 8080
# Accept packets from local network:
${FW} -I -a acc -P all -S localnet/8 -D default/0 -W eth0

# Only required for other types of traffic (FTP, Telnet):
# Forward localnet with masquerading (udp and tcp, no icmp!):
${FW} -F -a m -P tcp -S localnet/8 -D default/0
${FW} -F -a m -P udp -S localnet/8 -D default/0
Here all traffic from the local LAN with any destination gets redirected to the local port 8080. Rules can be viewed like this:
IP firewall input rules, default policy: reject
type  prot source       destination  ports
acc   all  127.0.0.1    127.0.0.1    n/a
acc/r tcp  10.0.0.0/8   0.0.0.0/0    * -> 80 => 8080
acc   all  10.0.0.0/8   0.0.0.0/0    n/a
acc   tcp  0.0.0.0/0    0.0.0.0/0    * -> *
I did some testing on Windows 95 with both Microsoft Internet Explorer 3.01 and Netscape Communicator pre-release and it worked with both browsers with the proxy settings disabled. At one time squid seemed to get into a loop when I pointed the browser to the local port 80. But this could be avoided by adding a reject rule for clients to this address:
${FW} -I -a rej -P tcp -S localnet/8 -D hostname/32 80

IP firewall input rules, default policy: reject
type  prot source       destination
acc   all  127.0.0.1    127.0.0.1
rej   tcp  10.0.0.0/8   10.0.0.1
acc/r tcp  10.0.0.0/8   0.0.0.0/0
acc   all  10.0.0.0/8   0.0.0.0/0
acc   tcp  0.0.0.0/0    0.0.0.0/0
NOTE on resolving names: Instead of just passing the URLs to the proxy server, the browser itself has to resolve the URLs. Make sure the workstations are set up to query a local nameserver, to minimize outgoing traffic.
If you're already running a nameserver at the firewall or proxy server (which is a good idea anyway IMHO) let the workstations use this nameserver. Additional notes from Richard Ayres <mailto:[email protected]>: I'm using such a setup. The only issues so far have been that: 1. It's fairly useless to use my service provider's parent caches (cache-?.www.demon.net) because by proxying squid only sees IP addresses, not host names, and demon aren't generally asked for IP addresses by other users; 2. Linux kernel 2.0.30 is a no-no as interception proxying is broken (I use 2.0.29); 3. Client browsers must do host name lookups themselves, as they don't know they're using a proxy; 4. The Microsoft Network won't authorize its users through a proxy, so I have to specifically *not* redirect those packets (my company is an MSN content provider). Aside from this, I get a 30-40% hit rate on a 50MB cache for 30-40 users and am quite pleased with the results.
You must include IP: always defragment, otherwise it prevents you from using the REDIRECT chain. You can use this script as a template for your own rc.firewall to configure ipchains:
#!/bin/sh
# rc.firewall   Linux kernel firewalling rules
# Leon Brooks (leon at brooks dot fdns dot net)
FW=/sbin/ipchains
ADD="$FW -A"

# Flush rules, for testing purposes
for i in I O F # A   # If we enabled accounting too
do
  ${FW} -F $i
done

# Default policies:
${FW} -P input REJECT     # Incoming policy: reject (quick error)
${FW} -P output ACCEPT    # Output policy: accept
${FW} -P forward DENY     # Forwarding policy: deny

# Input Rules:
# Loopback-interface (local access, eg, to local nameserver):
${ADD} input -j ACCEPT -s localhost/32 -d localhost/32

# Local Ethernet-interface:
Also, Andrew Shipton <mailto:[email protected]> notes that with 2.0.x kernels you don't need to enable packet forwarding, but with the 2.1.x and 2.2.x kernels using ipchains you do. Packet forwarding is enabled with the following command:
echo 1 > /proc/sys/net/ipv4/ip_forward
Networking support
Sysctl support
Network packet filtering
TCP/IP networking
Connection tracking (Under "IP: Netfilter Configuration" in menuconfig)
IP tables support
Full NAT
REDIRECT target support
/proc filesystem support
You must say NO to "Fast switching". After building the kernel, install it and reboot. You may need to enable packet forwarding (e.g. in your startup scripts):
echo 1 > /proc/sys/net/ipv4/ip_forward
Use the iptables command to make your kernel intercept HTTP connections and send them to Squid:
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j REDIRECT --to-port 3128
164
Define an access list to trap HTTP requests. The second line allows the Squid host direct access so a routing loop is not formed. By carefully writing your access list as shown below, common cases are found quickly and this can greatly reduce the load on your router's processor.
!
access-list 110 deny tcp any any neq www
access-list 110 deny tcp host 203.24.133.2 any
access-list 110 permit tcp any any
!
John <mailto:[email protected]> notes that you may be able to get around this bug by carefully writing your access lists. If the last/default rule is to permit then this bug would be a problem, but if the last/default rule was to deny then it won't be a problem. I guess fragments, other than the first, don't have the information available to properly policy-route them. Normally TCP packets should not be fragmented; at least my network runs an MTU of 1500 everywhere to avoid fragmentation. So this would affect UDP and ICMP traffic only.
Basically, you will have to pick between living with the bug or better performance. This set has better performance, but suffers from the bug:
access-list 110 deny tcp any any neq www
access-list 110 deny tcp host 10.1.2.3 any
access-list 110 permit tcp any any
Conversely, this set has worse performance, but works for all protocols:
access-list 110 deny tcp host 10.1.2.3 any
access-list 110 permit tcp any any eq www
access-list 110 deny tcp any any
17.6 Interception caching with LINUX 2.0.29 and CISCO IOS 11.1
Just for kicks, here's an email message posted to squid-users on how to make interception proxying work with a Cisco router and Squid running on Linux. by Brian Feeny <mailto:[email protected]> Here is how I have Interception proxying working for me, in an environment where my router is a Cisco 2501 running IOS 11.1, and Squid machine is running Linux 2.0.33. Many thanks to the following individuals and the squid-users list for helping me get redirection and interception proxying working on my Cisco/Linux box.
So basically from above you can see I added the "route-map" declaration, and an access-list, and then turned the route-map on under int e0 with "ip policy route-map proxy-redir". OK, so the Cisco is taken care of at this point. The host above, 208.206.76.44, is the IP number of my squid host. My squid box runs Linux, so I had to do the following on it: my kernel (2.0.33) config looks like this:
#
# Networking options
#
CONFIG_FIREWALL=y
# CONFIG_NET_ALIAS is not set
CONFIG_INET=y
CONFIG_IP_FORWARD=y
CONFIG_IP_MULTICAST=y
CONFIG_SYN_COOKIES=y
# CONFIG_RST_COOKIES is not set
CONFIG_IP_FIREWALL=y
# CONFIG_IP_FIREWALL_VERBOSE is not set
CONFIG_IP_MASQUERADE=y
# CONFIG_IP_MASQUERADE_IPAUTOFW is not set
CONFIG_IP_MASQUERADE_ICMP=y
CONFIG_IP_TRANSPARENT_PROXY=y
CONFIG_IP_ALWAYS_DEFRAG=y
# CONFIG_IP_ACCT is not set
CONFIG_IP_ROUTER=y
You will need Firewalling and Transparent Proxy turned on at a minimum. Then some ipfwadm stuff:
# Accept all on loopback
ipfwadm -I -a accept -W lo
# Accept my own IP, to prevent loops (repeat for each interface/alias)
ipfwadm -I -a accept -P tcp -D 208.206.76.44 80
# Send all traffic destined to port 80 to Squid on port 3128
ipfwadm -I -a accept -P tcp -D 0/0 80 -r 3128
it accepts packets on port 80 (redirected from the Cisco), and redirects them to 3128 which is the port my squid process is sitting on. I put all this in /etc/rc.d/rc.local. I am using v1.1.20 of Squid </Versions/1.1/1.1.20/> with Henrik's patch <http://devel.squid-cache.org/hno/patches/squid-1.1.20.host_and_virtual.patch> installed. You will want to install this patch if using a setup similar to mine.
Deny Squid from fetching objects from itself (using ACL lists).
Apply a small patch that prevents Squid from looping infinitely (available from Henrik's Squid Patches <http://devel.squid-cache.org/hno/>).
Don't run Squid on port 80, and redirect port 80 not destined for the local machine to Squid (redirection == ipfilter/ipfw/ipfwadm). This avoids the most common loops.
If you are using ipfilter then you should also use transproxyd in front of Squid. Squid does not yet know how to interface to ipfilter (patches are welcome: [email protected]).
Here, 10.0.3.22 is the IP address of the FreeBSD cache machine. Once I have packets going to the FreeBSD box, I need to get the kernel to deliver them to Squid. I started on FreeBSD-2.2.7, and then downloaded IPFilter <ftp://coombs.anu.edu.au/pub/net/ip-filter/>. This was a dead end for me. The IPFilter distribution includes patches to the FreeBSD kernel sources, but many of these had conflicts. Then I noticed that the IPFilter page says "It comes as a part of [FreeBSD-2.2 and later]." Fair enough. Unfortunately, you can't hijack connections with the FreeBSD-2.2.X IPFIREWALL code (ipfw), and you can't (or at least I couldn't) do it with natd either. FreeBSD-3.0 has much better support for connection hijacking, so I suggest you start with that. You need to build a kernel with the following options:
options IPFIREWALL
options IPFIREWALL_FORWARD
Next, it's time to configure the IP firewall rules with ipfw. By default, there are no "allow" rules and all packets are denied. I added these commands to /etc/rc.local just to be able to use the machine on my network:
ipfw add 60000 allow all from any to any
But we're still not hijacking connections. To accomplish that, add these rules:
ipfw add 49 allow tcp from 10.0.3.22 to any
ipfw add 50 fwd 127.0.0.1 tcp from any to any 80
The second line (rule 50) is the one which hijacks the connection. The first line makes sure we never hit rule 50 for traffic originated by the local machine. This prevents forwarding loops. Note that I am not changing the port number here. That is, port 80 packets are simply diverted to Squid on port 80. My Squid configuration is:
http_port 80
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
If you don't want Squid to listen on port 80 (because that requires root privileges) then you can use another port. In that case your ipfw redirect rule looks like:
ipfw add 50 fwd 127.0.0.1,3128 tcp from any to any 80
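With that redirect rule in place, the Squid configuration is the same as before apart from the listening port; a sketch:

```
http_port 3128
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
```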
Step 3 is to set the "APPLICATION ID" on port 80 traffic to 80. This causes all packets matching this filter to have ID 80 instead of the default ID of 0.
SET PROFILE IP FILTER APPLICATION_ID http 80
Step 4 is to create a special route that is used for packets with "APPLICATION ID" set to 80. The routing engine uses the ID to select which routes to use.
ADD IP ROUTE ENTRY 0.0.0.0 0.0.0.0 PROXY-IP 1 SET IP ROUTE APPLICATION_ID 0.0.0.0 0.0.0.0 PROXY-IP 80
With this in place use your RADIUS server to send back the "Framed-Filter-Id = transproxy" key/value pair to the NAS. You can check if the filter is being assigned to logins with the following command:
display profile port table
IOS Version 11.x  It is possible that later versions of IOS 11.x will support V2.0 of the protocol. If that is the case follow the 12.x instructions. Several people have reported that the squid implementation of WCCP does not work with their 11.x routers. If you experience this please mail the debug output from your router to squid-bugs.
IOS Version 12.x Some of the early versions of 12.x do not have the 'ip wccp version' command. You
will need to upgrade your IOS version to use V1.0. You will need to be running at least IOS Software Release 12.0(5)T if you're running the 12.0 T-train. IOS Software Releases 12.0(3)T and 12.0(4)T do not have WCCPv1, but 12.0(5)T does.
conf t
ip wccp version 1
ip wccp web-cache redirect-list 150
!
interface [Interface carrying Outgoing/Incoming Traffic] x/x
ip wccp web-cache redirect out|in
!
CTRL Z
write mem
Replace 150 with an access list number (either standard or extended) which lists IP addresses which you do not wish to be transparently redirected to your cache. Otherwise simply use the word 'redirect' on its own to redirect traffic from all sources to all destinations.
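Such a redirect-list could be sketched as follows; the address is illustrative, and the key point is that a deny entry exempts a source from redirection (here the cache itself, so its own outgoing fetches are not looped back to it):

```
! hypothetical example: never redirect traffic from the cache at 10.0.3.22
access-list 150 deny ip host 10.0.3.22 any
! redirect everyone else
access-list 150 permit ip any any
```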
2. Download gre.c for FreeBSD-3.x <../../WCCP-support/FreeBSD-3.x/gre.c>. Save this file as /usr/src/sys/netinet/gre.c.
3. Add "options GRE" to your kernel config file and rebuild your kernel. Note, the opt_gre.h file is created when you run config. Once your kernel is installed you will need to follow the steps in 17.8.

FreeBSD-4.0 through 4.7  The procedure is nearly identical to the above for 3.x, but the source files are a little different.

1. Apply the most appropriate patch file from the list of <../../WCCP-support/FreeBSD-4.x>.
2. Download gre.c for FreeBSD-3.x <../../WCCP-support/FreeBSD-3.x/gre.c>. Save this file as /usr/src/sys/netinet/gre.c.
3. Add "options GRE" to your kernel config file and rebuild your kernel. Note, the opt_gre.h file is created when you run config. Once your kernel is installed you will need to follow the steps in 17.8.

FreeBSD-4.8 and later  The operating system now comes standard with some GRE support. You need to make a kernel with the GRE code enabled:

pseudo-device gre
And then configure the tunnel so that the router's GRE packets are accepted:
# ifconfig gre0 create
# ifconfig gre0 $squid_ip $router_ip netmask 255.255.255.255 up
# ifconfig gre0 tunnel $squid_ip $router_ip
# route delete $router_ip
Standard Linux GRE Tunnel  Linux 2.2 kernels already support GRE, as long as the GRE module is compiled into the kernel. Ensure that the GRE code is either built as static or as a module by choosing the appropriate option in your kernel config. Then rebuild your kernel. If it is a module you will need to:
The next step is to tell Linux to establish an IP tunnel between the router and your host. Daniele Orlandi reports that you have to give the gre1 interface an address, but any old address seems to work.
iptunnel add gre1 mode gre remote <Router-IP> local <Host-IP> dev <interface>
ifconfig gre1 127.0.0.2 up
<Router-IP> is the IP address of your router that is intercepting the HTTP packets. <Host-IP> is the IP address of your cache, and <interface> is the network interface that receives those packets (probably eth0).
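As a concrete sketch, assuming a router at 192.168.1.1 intercepting traffic and a cache host at 192.168.1.2 on eth0 (both addresses are made up for illustration):

```
iptunnel add gre1 mode gre remote 192.168.1.1 local 192.168.1.2 dev eth0
ifconfig gre1 127.0.0.2 up
```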
Joe Cooper's Patch Joe Cooper has a patch for Linux 2.2.18 kernel on his Squid page
<http://www.swelltech.com/pengies/joe/patches/>.
This module is not part of the standard Linux distribution. It needs to be compiled as a module and loaded on your system to function. Do not attempt to build this in as a static part of your kernel. Download the Linux WCCP module <../../WCCP-support/Linux/ip wccp.c> and compile it as you would any Linux network module. Copy the module to /lib/modules/kernel-version/ipv4/ip_wccp.o. Edit /lib/modules/kernel-version/modules.dep and add:

/lib/modules/kernel-version/ipv4/ip_wccp.o:
Common Steps  The machine should now be stripping the GRE encapsulation from any packets received and requeuing them. The system will also need to be configured for interception proxying, either with 17.2 or with 17.3.
17.12 Can someone tell me what version of cisco IOS WCCP is added in?
IOS releases:
Cisco has published WCCPv2 as an Internet Draft <http://www.web-cache.com/Writings/Internet-Drafts/draft-wilso (expired Jan 2001). There is an ongoing project at the Squid development projects <http://devel.squid-cache.org/> website aiming to add support for WCCPv2, and at the time of writing this patch provides at least the same functionality as WCCPv1.
We will assume you have various workstations, customers, etc., plugged into the switch whose traffic you want intercepted and sent to Squid. The squid caches themselves should be plugged into the switch as well. Only the interface that the router is connected to is important. Where you put the squid caches or other connections does not matter. This example assumes your router is plugged into interface 17 of the switch. If not, adjust the following commands accordingly. 1. Enter configuration mode:
telnet@ServerIron#conf t
telnet@ServerIron(config)#int e 17
telnet@ServerIron(config-if-17)# ip-policy 1
Since all outbound traffic to the Internet goes out interface 17 (the router), and interface 17 has the caching policy applied to it, HTTP traffic is going to be intercepted and redirected to the caches you have configured. The default port to redirect to can be changed. The load balancing algorithm used can be changed (Least Used, Round Robin, etc). Ports can be exempted from caching if needed. Access Lists can be applied so that only certain source IP Addresses are redirected, etc. This information was left out of this document since this was just a quick howto that would apply for most people, not meant to be a comprehensive manual of how to configure a Foundry switch. I can however revise this with any information necessary if people feel it should be included.
18 SNMP
Contributors: Glenn Chisholm <mailto:[email protected]>.
Once the compile is completed and the new binary is installed, the squid.conf file needs to be configured to allow access; the default is to deny all requests. The instructions on how to do this have been broken into two parts, the first for all versions of Squid from 2.2 onwards and the second for 2.1 and below.
acl aclname snmp_community string
For example:
acl snmppublic snmp_community public
acl snmpjoebloggs snmp_community joebloggs
This creates two acl's, with two different communities, public and joebloggs. You can name the acl's and the community strings anything that you like. To specify the port that the agent will listen on, modify the "snmp_port" parameter; it defaults to 3401. The port that the agent will forward requests that can not be fulfilled by this agent to is set by "forward_snmpd_port"; it defaults to off. It must be configured for this to work. Remember that as the requests will be originating from this agent you will need to make sure that you configure your access accordingly. To allow access to Squid's SNMP agent, define an snmp_access ACL with the community strings that you previously defined. For example:
snmp_access allow snmppublic localhost
snmp_access deny all
The above will allow anyone on the localhost who uses the community public to access the agent. It will deny all others access. If you do not define any snmp_access ACL's, then SNMP access is denied by default. Finally, Squid allows you to configure the address that the agent will bind to for incoming and outgoing traffic. These default to 0.0.0.0; changing these will cause the agent to bind to a specific address on the host, rather than the default, which is all.
snmp_incoming_address 0.0.0.0
snmp_outgoing_address 0.0.0.0
Note that for security you are advised to restrict SNMP access to your caches. You can do this easily as follows:
acl snmpmanagementhosts 1.2.3.4/255.255.255.255 1.2.3.0/255.255.255.0
snmp_acl public deny all !snmpmanagementhosts
snmp_acl readwrite deny all
You must follow these instructions for 2.1 and below exactly or you are likely to have problems. The parser has some issues which have been corrected in 2.2.
then it is working ok, and you should be able to make nice statistics out of it. For an explanation of what every string (OID) does, you should refer to the Squid SNMP web pages </SNMP/>.
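One way to query the agent is with an SNMP walk; this sketch assumes the Net-SNMP command line tools, the default agent port of 3401, and the community string public — adjust all three for your setup. The OID shown is the Squid enterprise subtree:

```
snmpwalk -v1 -c public localhost:3401 .1.3.6.1.4.1.3495
```

If the agent is reachable you should see a list of OID/value pairs rather than a timeout.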
18.8 Where can I get more information/discussion about Squid and SNMP?
General Discussion: [email protected] <mailto:[email protected]> These messages are archived <http://www.squid-cache.org/mail-archive/cache-snmp/>. Subscriptions should be sent to: [email protected] <mailto:[email protected]>.
1. Cache Monitoring - How to set up your own monitoring <http://www.cache.dfn.de/DFN-Cache/Development/Monito by DFN-Cache
2. Using MRTG to monitor Squid <http://www.serassio.it/SquidNT/mrtg.htm> by Guido Serassio
3. Squid Configuration Manual - Monitoring Squid <http://squid.visolve.com/related/snmp/monitoringsquid.htm> by Visolve
4. Using MRTG for Squid monitoring <http://www.arnes.si/~matija/utrecht/lecture.html> Desire II caching workshop session by Matija Grabnar
5. How do I monitor my Squid 2 cache using MRT <http://hermes.wwwcache.ja.net/FAQ/FAQ-2.html#mrtg> by The National Janet Web Cache Service

Further examples of Squid MRTG configurations can be found here:

1. MRTG HOWTO Collection / Squid <http://howto.aphroland.de/HOWTO/MRTG/SquidMonitoringWithMRTG> from MRTG
2. using mrtg to monitor Squid <http://people.ee.ethz.ch/~oetiker/webtools/mrtg/squid.html> from MRTG
3. Chris' MRTG Resources <http://www.psychofx.com/chris/unix/mrtg/>
4. MRTG & Squid <http://thproxy.jinr.ru/file-archive/doc/squid/cache-snmp/mrtg-demo/> by Glenn Chisholm
5. Braindump <http://www.braindump.dk/en/wiki/?catid=7&wikipage=ConfigFiles> by Joakim Recht
19 Squid version 2
19.1 What are the new features?

persistent connections.
Lower VM usage; in-transit objects are not held fully in memory.
Totally independent swap directories.
Customizable error texts.
FTP supported internally; no more ftpget.
Asynchronous disk operations (optional, requires pthreads library).
Internal icons for FTP and gopher directories.
snprintf() used everywhere instead of sprintf().
SNMP.
URN support </urn-support.html>
Routing requests based on AS numbers.
Cache Digests <FAQ-16.html>
...and many more!
With this in place, Squid should pick one of your parents to use for SSL requests. If you want it to pick a particular parent, you must use the cache_peer_access configuration:
cache_peer parent1 parent 3128 3130
cache_peer parent2 parent 3128 3130
cache_peer_access parent2 allow !SSL
The above lines tell Squid to NOT use parent2 for SSL, so it should always use parent1 .
2. You will need to compile and install an external authenticator program. Most people will want to use ncsa_auth. The source for this program is included in the source distribution, in the auth_modules/NCSA directory.
% cd auth_modules/NCSA
% make
% make install
You should now have an ncsa_auth program in the same directory where your squid binary lives. 3. You may need to create a password file. If you have been using proxy authentication before, you probably already have such a file. You can get Apache's htpasswd program <../../htpasswd/> from our server. Pick a pathname for your password file. We will assume you will want to put it in the same directory as your squid.conf. 4. Configure the external authenticator in squid.conf. For ncsa_auth you need to give the pathname to the executable and the password file as an argument. For example:
auth_param basic program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
After all that, you should be able to start up Squid. If we left something out, or haven't been clear enough, please let us know ([email protected]).
19.7 Why does proxy-auth reject all users after upgrading from Squid-2.1 or earlier?
The ACL for proxy-authentication has changed from:
acl foo proxy_auth timeout
to:
acl foo proxy_auth username
Please update your ACL appropriately - a username of REQUIRED will permit all valid usernames. The timeout is now specified with the configuration option:
auth_param basic credentialsttl timeout
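Putting the pieces together, a minimal post-upgrade fragment might look like this (the two-hour TTL and the acl name are illustrative values only):

```
auth_param basic credentialsttl 2 hours
acl password proxy_auth REQUIRED
http_access allow password
```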
The information here is current for version 2.2. It is strongly recommended that you use at least Squid 2.2 if you wish to use delay pools.
Delay pools provide a way to limit the bandwidth of certain requests based on any list of criteria. The idea came from a Western Australian university who wanted to restrict student traffic costs (without affecting staff traffic, and still getting cache and local peering hits at full speed). There was some early Squid 1.0 code by Central Network Services at Murdoch University, which I then developed (at the University of Western Australia) into a much more complex patch for Squid 1.0 called "DELAY HACK." I then tried to code it in a much cleaner style and with slightly more generic options than I personally needed, and called this "delay pools" in Squid 2. I almost completely recoded this in Squid 2.2 to provide the greater flexibility requested by people using the feature. To enable delay pools features in Squid 2.2, you must use the --enable-delay-pools configure option before compilation. Terminology for this FAQ entry:
pool
a collection of bucket groups as appropriate to a given class
bucket group
a group of buckets within a pool, such as the per-host bucket group, the per-network bucket group or the aggregate bucket group (the aggregate bucket group is actually a single bucket)
bucket
an individual delay bucket represents a traffic allocation which is replenished at a given rate (up to a given limit) and causes traffic to be delayed when empty
class
the class of a delay pool determines how the delay is applied, ie, whether the different client IPs are treated separately or as a group (or both)
class 1
a class 1 delay pool contains a single unified bucket which is used for all requests from hosts subject to the pool
class 2
a class 2 delay pool contains one unified bucket and 255 buckets, one for each host on an 8-bit network (IPv4 class C)
class 3
contains 255 buckets for the subnets in a 16-bit network, and individual buckets for every host on these networks (IPv4 class B)

Delay pools allow you to limit traffic for clients or client groups, with various features:
can specify peer hosts which aren't affected by delay pools, ie, local peering or other 'free' traffic (with
the no-delay peer option).
delay behavior is selected by ACLs (low and high priority traffic, staff vs students or student vs
authenticated student or so on).
each group of users has a number of buckets; a bucket has an amount coming into it each second and
a maximum amount it can grow to; when it reaches zero, object reads are deferred until one of the object's clients has some traffic allowance.
any number of pools can be configured with a given class, and any set of limits within the pools can be
disabled; for example, you might only want to use the aggregate and per-host bucket groups of class 3, not the per-network one.
This allows options such as creating a number of class 1 delay pools and allowing a certain amount of bandwidth to given object types (by using URL regular expressions or similar), and many other uses I'm sure I haven't even thought of beyond the original fair balancing of a relatively small traffic allocation across a large number of users. There are some limitations of delay pools:
delay pools are incompatible with slow aborts; quick abort should be set fairly low to prevent objects
being retrieved at full speed once there are no clients requesting them (as the traffic allocation is based on the current clients, and when there are no clients attached to the object there is no way to determine the traffic allocation).
delay pools only limit the actual data transferred and are not inclusive of overheads such as TCP
overheads, ICP, DNS, icmp pings, etc.
it is possible for one connection or a small number of connections to take all the bandwidth from a
given bucket and the other connections to be starved completely, which can be a major problem if there are a number of large objects being transferred and the parameters are set in a way that a few large objects will cause all clients to be starved (potentially fixed by a currently experimental patch).
19.8.1 How can I limit Squid's total bandwidth to, say, 512 Kbps?
acl all src 0.0.0.0/0.0.0.0   # might already be defined
delay_pools 1
delay_class 1 1
delay_access 1 allow all
delay_parameters 1 64000/64000
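The 64000 figure comes straight from the arithmetic: delay_parameters takes bytes per second, so 512 kilobits per second divided by 8 bits per byte gives 64000 bytes per second. A quick sketch of the conversion:

```shell
# convert a kilobit/s rate to the bytes/s figure delay_parameters expects
rate_kbps=512
bytes_per_sec=$(( rate_kbps * 1000 / 8 ))
echo "$bytes_per_sec"   # 64000
```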
# Security access checks
http_access [...]
# These people get in for slow cache access
http_access allow virtual_slowcache slownets
http_access deny virtual_slowcache
# Access checks for main cache
http_access [...]
# Delay definitions (read config file for clarification)
delay_pools 2
delay_initial_bucket_level 50
delay_class 1 3
delay_access 1 allow virtual_slowcache !LOCAL-NET !LOCAL-IP !fast_slow
delay_access 1 deny all
delay_parameters 1 8192/131072 1024/65536 256/32768
delay_class 2 2
delay_access 2 allow virtual_slowcache !LOCAL-NET !LOCAL-IP fast_slow
delay_access 2 deny all
delay_parameters 2 2048/65536 512/32768
The same code is also used by some departments using class 2 delay pools to give them more flexibility in giving different performance to different labs or students.
Furthermore, you can rewrite the error message template files if you like. This list describes the tags which Squid will insert into the messages:
%B
URL with FTP %2f hack
%c
Squid error code
%d
seconds elapsed since request received (not yet implemented)
%e
errno
%E
strerror()
%F
FTP reply line
%g
FTP server message
%h
cache hostname
%H
server host name
%i
client IP address
%I
server IP address
%L
contents of the err_html_text config option
%M
Request Method
%m
Error message returned by external auth helper
%p %P
%R
Full HTTP Request
%S
squid default signature
%s
caching proxy software with version
%t
local time
%T
UTC
%U
URL without password
%w
cachemgr email address
%z
dns server error message

The Squid default signature is added automatically unless %S is used in the error page. To change the signature you must manually append the signature to each error page. The default signature reads like:
<BR clear="all"> <HR noshade size="1px"> <ADDRESS> Generated %T by %h (%s) </ADDRESS> </BODY></HTML>
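As a sketch, a custom template might combine several of the tags above; the wording here is illustrative only. Since the default signature is appended automatically and itself ends with </BODY></HTML>, the template can leave those closing tags off:

```
<HTML><HEAD><TITLE>Error</TITLE></HEAD>
<BODY>
<H1>Unable to fetch %U</H1>
<P>The cache %h reported: %E (errno %e) while processing your %M request.</P>
```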
cache host
This is now called cache_peer. The old term does not really describe what you are configuring, but the new name tells you that you are configuring a peer for your cache.
cache stoplist
This directive also has been reimplemented with access control lists. You will use the no_cache option. For example:
acl Uncachable url_regex cgi \?
no_cache deny Uncachable
cache swap
This option used to specify the cache disk size. Now you specify the disk size on each cache_dir line.
cache host acl

This option has been renamed to cache_peer_access and the syntax has changed. Now this option is a true access control list, and you must include an allow or deny keyword. For example:
This example sends requests to your peer thatcache.thatdomain.net only for origin servers in Autonomous System Number 1241.
units
In Squid-1.1 many of the configuration options had implied units associated with them. For example, the connect timeout value may have been in seconds, but the read timeout value had to be given in minutes. With Squid-2, these directives take units after the numbers, and you will get a warning if you leave off the units. For example, you should now write:
connect_timeout 120 seconds
read_timeout 15 minutes
20 httpd-accelerator mode
20.1 What is the httpd-accelerator mode?
Occasionally people have trouble understanding accelerators and proxy caches, usually resulting from mixed-up interpretations of "incoming" and "outgoing" data. I think in terms of requests (i.e., an outgoing request is from the local site out to the big bad Internet). The data received in reply is incoming, of course. Others think in the opposite sense of "a request for incoming data".

An accelerator caches incoming requests for outgoing data (i.e., that which you publish to the world). It takes load away from your HTTP server and internal network. You move the server away from port 80 (or whatever your published port is), and substitute the accelerator, which then pulls the HTTP data from the "real" HTTP server (only the accelerator needs to know where the real server is). The outside world sees no difference (apart from an increase in speed, with luck).

Quite apart from taking the load off a site's normal web server, accelerators can also sit outside firewalls or other network bottlenecks and talk to HTTP servers inside, reducing traffic across the bottleneck and simplifying the configuration. Two or more accelerators communicating via ICP can increase the speed and resilience of a web service to any single failure. The Squid redirector can make one accelerator act as a single front-end for multiple servers. If you need to move parts of your filesystem from one server to another, or if separately administered HTTP servers should logically appear under a single URL hierarchy, the accelerator makes the right thing happen.

If you wish only to cache the "rest of the world" to improve local users' browsing performance, then accelerator mode is irrelevant. Sites which own and publish a URL hierarchy use an accelerator to improve other sites' access to it. Sites wishing to improve their local users' access to other sites' URLs use proxy caches. Many sites, like us, do both and hence run both.
Measurement of the Squid cache and its Harvest counterpart suggest an order of magnitude performance improvement over CERN or other widely available caching software. This order of magnitude performance improvement on hits suggests that the cache can serve as an httpd accelerator, a cache configured to act as a site's primary httpd server (on port 80), forwarding references that miss to the site's real httpd (on port 81).
In such a configuration, the web administrator renames all non-cachable URLs to the httpd's port (81). The cache serves references to cachable objects, such as HTML pages and GIFs, and the true httpd (on port 81) serves references to non-cachable objects, such as queries and cgi-bin programs. If a site's usage characteristics tend toward cachable objects, this configuration can dramatically reduce the site's web workload. Note that it is best not to run a single squid process as both an httpd-accelerator and a proxy cache, since these two modes will have different working sets. You will get better performance by running two separate caches on separate machines. However, for compatibility with how administrators are accustomed to running other servers that provide both proxy and Web serving capability (eg, CERN), Squid supports operation as both a proxy and an accelerator if you set the httpd_accel_with_proxy variable to on inside your squid.conf configuration file.
Next, you need to move your normal HTTP server to another port and/or another machine. If you want to run your HTTP server on the same machine, then it can not also use port 80 (except see the next FAQ entry below). A common choice is port 81. Configure squid as follows:
httpd_accel_host localhost
httpd_accel_port 81
Alternatively, you could move the HTTP server to another machine and leave it on port 80:
httpd_accel_host otherhost.foo.com
httpd_accel_port 80
You should now be able to start Squid and it will serve requests as a HTTP server. If you are using Squid as an accelerator for a virtual host system, then you need to specify
httpd_accel_host virtual
Finally, if you want Squid to also accept proxy requests (like it used to before you turned it into an accelerator), then you need to enable this option:
httpd_accel_with_proxy on
20.3 When using an httpd-accelerator, the port number for redirects is wrong
Yes, this is because you probably moved your real httpd to port 81. When your httpd issues a redirect message (e.g. 302 Moved Temporarily), it knows it is not running on the standard port (80), so it inserts :81 in the redirected URL. Then, when the client requests the redirected URL, it bypasses the accelerator. How can you fix this? One way is to leave your httpd running on port 80, but bind the httpd socket to a specific interface, namely the loopback interface. With Apache <http://www.apache.org/> you can do it like this in httpd.conf:
Port 80
BindAddress 127.0.0.1
Note, you probably also need to add an /etc/hosts entry of 127.0.0.1 for your server hostname. Otherwise, Squid may get stuck in a forwarding loop.
21 Related Software
21.1 Clients
21.1.1 Wget
Wget <ftp://gnjilux.cc.fer.hr/pub/unix/util/wget/> is a command-line Web client. It supports HTTP and FTP URLs, recursive retrievals, and HTTP proxies.
21.1.2 echoping
If you want to test your Squid cache in batch (from a cron command, for instance), you can use the echoping <ftp://ftp.internatif.org/pub/unix/echoping/> program, which will tell you (in plain text or via an exit code) if the cache is up or not, and will indicate the response times.
Junkbusters <http://internet.junkbuster.com> Corp has a copyleft privacy-enhancing, ad-blocking proxy server which you can use in conjunction with Squid.
21.4.4 Squirm
Squirm <http://squirm.foote.com.au/> is a configurable, efficient redirector for Squid by Chris Foote <mailto:[email protected]>. Features:
Very fast
Virtually no memory usage
It can re-read its config files while running by sending it a HUP signal
Interactive test mode for checking new configs
Full regular expression matching and replacement
Config files for patterns and IP addresses.
If you mess up the config file, Squirm runs in Dodo Mode so your squid keeps working :-)
21.4.5 chpasswd.cgi
Pedro L Orso <mailto:[email protected]> has adapted Apache's htpasswd <../../htpasswd/> into a CGI program called chpasswd.cgi <http://web.onda.com.br/orso/chpasswd.html>.
21.4.6 jesred
jesred <http://ivs.cs.uni-magdeburg.de/~elkner/webtools/jesred/> by Jens Elkner <mailto:[email protected]>.
21.4.7 squidGuard
squidGuard <http://www.squidguard.org/> is a free (GPL), flexible and efficient filter and redirector program for squid. It lets you define multiple access rules with different restrictions for different user groups on a squid cache. squidGuard uses Squid's standard redirector interface.
The Cerberian content filter <http://marasystems.com/?section=cerberian> is a very flexible URL rating system with full Squid integration provided by MARA Systems AB <http://marasystems.com/download/cerberian>. The service requires a license (priced by the number of seats) but evaluation licenses are available.
22 DISKD
22.1 What is DISKD?
DISKD refers to some features in Squid-2.4 and later to improve Disk I/O performance. The basic idea is that each cache_dir has its own diskd child process. The diskd process performs all disk I/O operations (open, close, read, write, unlink) for the cache_dir. Message queues are used to send requests and responses between the Squid and diskd processes. Shared memory is used for chunks of data to be read and written.
We benchmarked Squid-2.4 with DISKD at the Second IRCache Bake-Off.
MSGMNB
Maximum number of bytes per message queue.
MSGMNI
Maximum number of message queue identifiers (system wide).
MSGSEG
Maximum number of message segments per queue.
MSGSSZ
Size of a message segment.
MSGTQL
Maximum number of messages (system wide).
MSGMAX
Maximum size of a whole message. On some systems you may need to increase this limit. On other systems, you may not be able to change it. The messages between Squid and diskd are 32 bytes for 32-bit CPUs and 40 bytes for 64-bit CPUs. Thus, MSGSSZ should be 32 or greater. You may want to set it to a larger value, just to be safe. We'll have two queues for each cache_dir -- one in each direction. So, MSGMNI needs to be at least two times the number of cache_dirs. I've found that 75 messages per queue is about the limit of decent performance. If each diskd message consists of just one segment (depending on your value of MSGSSZ), then MSGSEG should be greater than 75. MSGMNB and MSGTQL affect how many messages can be in the queues at one time. Diskd messages shouldn't be more than 40 bytes, but let's use 64 bytes to be safe. MSGMNB should be at least 64*75. I recommend rounding up to the nearest power of two, or 8192. MSGTQL should be at least 75 times the number of cache_dirs that you'll have.
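The sizing rules above are easy to script; this sketch uses the 64-byte safety margin and the 75-messages-per-queue figure from the text to compute the minimum MSGMNB, then rounds it up to the nearest power of two:

```shell
msg_bytes=64        # per-message allowance used in the text
msgs_per_queue=75   # practical per-queue limit from the text
min_mnb=$(( msg_bytes * msgs_per_queue ))   # 4800
mnb=1
while [ "$mnb" -lt "$min_mnb" ]; do mnb=$(( mnb * 2 )); done
echo "$mnb"   # 8192
```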
22.6.1 FreeBSD
Your kernel must have
options SYSVMSG
You can set the parameters in the kernel as follows. This is just an example. Make sure the values are appropriate for your system:
options MSGMNB=8192  # max # of bytes in a queue
options MSGMNI=40    # number of message queue identifiers
options MSGSEG=512   # number of message segments per queue
options MSGSSZ=64    # size of a message segment
options MSGTQL=2048  # max messages in system
22.6.2 OpenBSD
You can set the parameters in the kernel as follows. This is just an example. Make sure the values are appropriate for your system:
option MSGMNB=16384  # max characters per message queue
option MSGMNI=40     # max number of message queue identifiers
option MSGSEG=2048   # max number of message segments in the system
option MSGSSZ=64     # size of a message segment (Must be 2^N)
option MSGTQL=1024   # max amount of messages in the system
22.6.3 Digital Unix

by Brenden Phillips <mailto:B.C.Phillips at massey dot ac dot nz>
If you have a newer version (DU64), then you can probably use sysconfig instead. To see what the current IPC settings are, run
# sysconfig -q ipc
then run
# sysconfigdb -a -f ipc.stanza
22.6.4 Linux
Stefan Köpsell reports that if you compile sysctl support into your kernel, then you can change the following values:
Refer to Demangling Message Queues <http://www.sunworld.com/sunworldonline/swol-11-1997/swol-11-insidesolari in Sunworld Magazine. I don't think the above article really tells you how to set the parameters. You do it in /etc/system with lines like this:
set msgsys:msginfo_msgmax=2048
set msgsys:msginfo_msgmnb=8192
set msgsys:msginfo_msgmni=40
set msgsys:msginfo_msgssz=64
set msgsys:msginfo_msgtql=2048
Of course, changes to /etc/system take effect only after a reboot.
SHMSEG
Maximum number of shared memory segments per process.
SHMMNI
Maximum number of shared memory segments for the whole system.
SHMMAX
Largest shared memory segment size allowed.
SHMALL
Total amount of shared memory that can be used. For Squid and DISKD, SHMSEG and SHMMNI must be greater than or equal to the number of cache_dir's that you have. SHMMAX must be at least 800 kilobytes. SHMALL must be at least 800 kilobytes multiplied by the number of cache_dir's.
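The shared-memory rules above can be sketched the same way; again, the cache_dir count is an assumed example value:

```shell
# Hedged sketch of the shared-memory sizing rules; N_CACHE_DIRS is an example.
N_CACHE_DIRS=4
SHMSEG=$N_CACHE_DIRS                # at least one segment id per cache_dir
SHMMNI=$N_CACHE_DIRS                # likewise, system-wide
SHMMAX=$((800 * 1024))              # largest single segment: >= 800 KB, in bytes
SHMALL=$((SHMMAX * N_CACHE_DIRS))   # total; note some kernels count pages, not bytes
echo "SHMSEG=$SHMSEG SHMMNI=$SHMMNI SHMMAX=$SHMMAX SHMALL=$SHMALL"
```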
22.7.1 FreeBSD
You can set the parameters in the kernel as follows. This is just an example. Make sure the values are appropriate for your system:
options         SHMSEG=16       # max shared mem id's per process
options         SHMMNI=32       # max shared mem id's per system
options         SHMMAX=2097152  # max shared memory segment size (bytes)
options         SHMALL=4096     # max amount of shared memory (pages)
22.7.2 OpenBSD
OpenBSD is similar to FreeBSD, except you must use option instead of options , and SHMMAX is in pages instead of bytes:
option          SHMSEG=16       # max shared mem id's per process
option          SHMMNI=32       # max shared mem id's per system
option          SHMMAX=2048     # max shared memory segment size (pages)
option          SHMALL=4096     # max amount of shared memory (pages)
22.7.3 Digital Unix
by Brenden Phillips <mailto:B.C.Phillips at massey dot ac dot nz> If you have a newer version (DU64), then you can probably use sysconfig instead. To see what the current IPC settings are, run
# sysconfig -q ipc
To change them, put the new values in a stanza file such as ipc.stanza, then run
# sysconfigdb -a -f ipc.stanza
22.7.4 Linux
Winfried Truemper reports: The default values should be large enough for most common cases. You can modify the shared memory configuration by writing to these files:
22.8 Sometimes shared memory and message queues aren't released when Squid exits.
Yes, this is a little problem sometimes. Seems like the operating system gets confused and doesn't always release shared memory and message queue resources when processes exit, especially if they exit abnormally. To fix it you can "manually" clear the resources with the ipcs command. Add this command into your RunCache or squid start script:
ipcs | grep '^[mq]' | awk '{printf "ipcrm -%s %s\n", $1, $2}' | /bin/sh
If there are more than Q1 messages outstanding, then Squid will intentionally fail to open disk files for reading and writing. This is a load-shedding mechanism. If your cache gets really really busy and the disks can not keep up, Squid bypasses the disks until the load goes down again.
If there are more than Q2 messages outstanding, then the main Squid process "blocks" for a little bit until the diskd process services some of the messages and sends back some replies. Q1 should be larger than Q2. You want Squid to get to the "blocking" condition before it gets to the "refuse to open files" condition. Reasonable values for Q1 and Q2 are 72 and 64, respectively.
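Q1 and Q2 are set per cache_dir in squid.conf; a hedged example, where the directory path and size values are placeholders:

```
cache_dir diskd /usr/local/squid/cache 10000 16 256 Q1=72 Q2=64
```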
23 Authentication
23.1 How does Proxy Authentication work in Squid?
Note: The information here is current for version 2.5.
Users will be authenticated if squid is configured to use proxy_auth ACLs (see next question). Browsers send the user's authentication credentials in the Authorization request header. If Squid gets a request and the http_access rule list gets to a proxy_auth ACL, Squid looks for the Authorization header. If the header is present, Squid decodes it and extracts a username and password. If the header is missing, Squid returns an HTTP reply with status 407 (Proxy Authentication Required). The user agent (browser) receives the 407 reply and then prompts the user to enter a name and password. The name and password are encoded, and sent in the Authorization header for subsequent requests to the proxy.
NOTE: The name and password are encoded using "base64" (See section 11.1 of RFC 2616 <ftp://ftp.isi.edu/in-notes/rfc2616.txt>). However, base64 is a binary-to-text encoding only; it does NOT encrypt the information it encodes. This means that the username and password are essentially "cleartext" between the browser and the proxy. Therefore, you probably should not use the same username and password that you would use for your account login.
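Anyone on the wire can reverse the encoding; a quick demonstration with the common base64 utility (the username and password here are made up):

```shell
# base64 is an encoding, not encryption: it decodes with no key at all.
printf 'alice:secret' | base64          # the value a browser would send
printf 'YWxpY2U6c2VjcmV0' | base64 -d   # anyone can recover "alice:secret"
echo
```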
Authentication is actually performed outside of the main Squid process. When Squid starts, it spawns a number of authentication subprocesses. These processes read usernames and passwords on stdin, and reply with "OK" or "ERR" on stdout. This technique allows you to use a number of different authentication schemes, although currently you can only use one scheme at a time. The Squid source code comes with a few authentication processes for Basic authentication. These include:
LDAP: Uses the Lightweight Directory Access Protocol.
NCSA: Uses an NCSA-style username and password file.
MSNT: Uses a Windows NT authentication domain.
PAM: Uses the Linux Pluggable Authentication Modules scheme.
SMB: Uses an SMB server like Windows NT or Samba.
getpwam: Uses the old-fashioned Unix password file.
sasl: Uses SASL libraries.
winbind: Uses Samba to authenticate in a Windows NT domain.
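The stdin/stdout protocol described above is simple enough to sketch as a toy Basic helper. The hard-coded credential pair below is purely illustrative; a real helper would consult a password store:

```shell
# Toy Squid basic-auth helper loop: Squid writes "username password"
# lines to stdin and expects OK or ERR on stdout, one reply per line.
check() {
    if [ "$1" = "alice" ] && [ "$2" = "secret" ]; then
        echo OK
    else
        echo ERR
    fi
}
# Demo input in place of Squid; a real helper reads stdin until EOF.
while read user pass; do
    check "$user" "$pass"
done <<'EOF'
alice secret
bob wrongpass
EOF
```

A real helper script would be wired in with the auth_param basic program directive, just like the supplied modules.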
In addition, Squid also supports the NTLM and Digest authentication schemes, which both provide more secure authentication methods where the password is not exchanged in plain text. Each scheme has its own set of helpers and auth_param settings. You can not mix helpers between the different authentication schemes. For information on how to set up NTLM authentication see 23.5. In order to authenticate users, you need to compile and install one of the supplied authentication modules found in the helpers/basic_auth/ directory, one of the others <http://www.squid-cache.org/related-software.html#auth>, or supply your own. You tell Squid which authentication program to use with the auth_param option in squid.conf. You specify the name of the program, plus any command line options if necessary. For example:
auth_param basic program /usr/local/squid/bin/ncsa_auth /usr/local/squid/etc/passwd
The REQUIRED term means that any authenticated user will match the ACL named foo. Squid allows you to provide fine-grained controls by specifying individual user names. For example:
acl foo proxy_auth REQUIRED
acl bar proxy_auth lisa sarah frank joe
acl daytime time 08:00-17:00
acl all src 0/0
http_access allow bar
http_access allow foo daytime
http_access deny all
In this example, users named lisa, sarah, joe, and frank are allowed to use the proxy at all times. Other users are allowed only during daytime hours.
Squid writes cleartext usernames and passwords when talking to the external authentication processes. Note, however, that this interprocess communication occurs over TCP connections bound to the loopback interface or private UNIX pipes. Thus, it's not possible for processes on other computers, or for local users without root privileges, to "snoop" on the authentication traffic. Each authentication program must select its own scheme for persistent storage of passwords and usernames.
Optionally, if building Samba 2.2.5, apply the smbpasswd.diff <http://www.squid-cache.org/mail-archive/squid-dev/200 patch. See 23.5.2 below to determine if the patch is worthwhile.
workgroup = mydomain
password server = myPDC
security = domain
winbind uid = 10000-20000
winbind gid = 10000-20000
winbind use default domain = yes
2. Join the NT domain as outlined in the winbindd man page for your version of Samba.
3. Test winbindd functionality.
Start nmbd (required to ensure proper operation).
Start winbindd.
Test basic winbindd functionality ("wbinfo -t").
Test winbindd user authentication:
# wbinfo -t
Secret is good
# wbinfo -a mydomain\\myuser%mypasswd
plaintext password authentication succeeded
error code was NT_STATUS_OK (0x0)
challenge/response password authentication succeeded
error code was NT_STATUS_OK (0x0)
NOTE: both plaintext and challenge/response should return "succeeded." If there is no "challenge/response" status returned, then Samba was not built with "--with-winbind-auth-challenge" and cannot support ntlm authentication.
UglySolution.pl <http://www.squid-cache.org/mail-archive/squid-dev/200207/att-0076/02-UglySolution.pl> is a sample Perl script to load smbd, connect to a Samba share using smbclient, and generate enough dummy activity to trigger smbd's machine trust account password change code. smbpasswd.diff <http://www.squid-cache.org/mail-archive/squid-dev/200207/att-0117/01-smbpasswd.diff> is a patch to Samba 2.2.5's smbpasswd utility to allow changing the machine account password at will. It is a minimal patch simply exposing a command line interface to an existing Samba function.
smbpasswd -t DOMAIN -r PDC
Samba 3.x
The Samba team has incorporated functionality to change the machine trust account password in the new "net" command. A simple daily cron job scheduling "net rpc changetrustpw" is all that is needed.
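The daily cron job mentioned above might look like this; the path to the net binary and the schedule are assumptions for your system:

```
# m h dom mon dow command
30 2 * * * /usr/bin/net rpc changetrustpw >/dev/null 2>&1
```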
Samba-3.X
As Samba-3.x has its own authentication helper, there is no need to build any of the Squid authentication helpers for use with Samba-3.x. You do, however, need to enable support for the ntlm scheme if you plan on using this. Also, you may want to use the wbinfo_group helper for group lookups:
--enable-auth="ntlm,basic" --enable-external-acl-helpers="wbinfo_group"
Edit squid.conf
1. Setup the authenticators. Add the following to enable both the winbind basic and ntlm authenticators. IE will use ntlm and everything else basic:
auth_param ntlm program /usr/local/squid/libexec/wb_ntlmauth
auth_param ntlm children 5
auth_param ntlm max_challenge_reuses 0
auth_param ntlm max_challenge_lifetime 2 minutes
auth_param basic program /usr/local/squid/libexec/wb_auth
auth_param basic children 5
auth_param basic realm Squid proxy-caching web server
auth_param basic credentialsttl 2 hours
Note: For Samba-3.x the Samba ntlm_auth helper is used instead of the wb_ntlmauth and wb_auth helpers above. This helper supports all Squid versions and both ntlm and basic schemes via the --helper-protocol= option. See the Samba documentation for details.
2. Add acl entries to require authentication:
acl AuthorizedUsers proxy_auth REQUIRED .. http_access allow all AuthorizedUsers
References
Samba Winbind Overview <http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection.html#WINBIND>
Joining a Domain in Samba 2.2.x <http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection.html#AEN1134>
winbindd man page <http://www.samba.org/samba/docs/man/winbindd.8.html>
wbinfo man page <http://www.samba.org/samba/docs/man/wbinfo.1.html>
nmbd man page <http://www.samba.org/samba/docs/man/nmbd.8.html>
smbd man page <http://www.samba.org/samba/docs/man/smbd.8.html>
smb.conf man page <http://www.samba.org/samba/docs/man/smb.conf.5.html>
smbclient man page <http://www.samba.org/samba/docs/man/smbclient.1.html>
ntlm_auth man page <http://www.samba.org/samba/docs/man/ntlm_auth.1.html>
25 Security Concerns
25.1 Open-access proxies
Squid's default configuration file denies all client requests. It is the administrator's responsibility to configure Squid to allow access only to trusted hosts and/or users. If your proxy allows access from untrusted hosts or users, you can be sure that people will find and abuse your service. Some people will use your proxy to make their browsing anonymous. Others will intentionally use your proxy for transactions that may be illegal (such as credit card fraud). A number of web sites exist simply to provide the world with a list of open-access HTTP proxies. You don't want to end up on this list. Be sure to carefully design your access control scheme. You should also check it from time to time to make sure that it works as you expect.
Do NOT add port 25 to Safe_ports (unless your goal is to end up in the RBL <http://mail-abuse.org/rbl/>). You may want to make a cron job that regularly verifies that your proxy blocks access to port 25.
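For reference, the stock squid.conf implements this idea with a Safe_ports ACL along these lines (the port list here is abbreviated; note that 25 is deliberately absent):

```
acl Safe_ports port 80 21 443 563 70 210 1025-65535
http_access deny !Safe_ports
```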
$Id: FAQ.sgml,v 1.215 2004/02/05 17:06:11 wessels Exp $