These are Internet objects such as a file, a document, or a response to a query for an Internet service such as FTP, HTTP, or gopher. A client requests an Internet object from a caching proxy. If the object is not yet cached, the proxy server fetches it (either from the host specified in the URL or from a parent or sibling cache) and delivers it to the client.
ICP is a protocol used for communication among squid caches. The ICP protocol is defined in two Internet RFCs. RFC 2186 describes the protocol itself, while RFC 2187 describes the application of ICP to hierarchical Web caching.
ICP is primarily used within a cache hierarchy to locate specific objects in sibling caches. If squid does not have a requested document, it sends an ICP query to its sibling cache, and the sibling responds with an ICP reply indicating a ``HIT'' or a ``MISS.'' The cache then uses these replies to choose which cache should resolve its own MISS.
ICP also supports multiplexed transmission of multiple object streams over a single TCP connection. ICP is currently implemented on top of UDP. Current versions of Squid also support ICP via multicast.
dnsserver is a process forked by squid to
resolve domain names to IP addresses. This is necessary because the
gethostbyname(3) function blocks the calling process until the DNS query is completed.
Squid must use non-blocking I/O at all times, so DNS lookups are performed by an external process, separate from the main one. The dnsserver processes do not cache DNS lookups; that is done inside the squid process.
The ftpget program exists only in Squid 1.1 and Squid 1.0.
ftpget is an FTP client used for retrieving files from FTP servers. Because FTP is a complicated protocol, it is easiest to handle it separately from the main squid code.
FTP PUT should work with Squid-2.0 and later versions. If you are using Squid-1.1, you need to upgrade before PUT will work.
A cache hierarchy is a collection of caching proxy servers organized in a logical parent/child and sibling arrangement so that caches closest to Internet gateways (closest to the backbone transit entry-points) act as parents to caches at locations farther from the backbone. The parent caches resolve ``misses'' for their children. In other words, when a cache requests an object from its parent, and the parent does not have the object in its cache, the parent fetches the object, caches it, and delivers it to the child. This ensures that the hierarchy achieves the maximum reduction in bandwidth utilization on the backbone transit links, helps reduce load on Internet information servers outside the network served by the hierarchy, and builds a rich cache on the parents so that the other child caches in the hierarchy will obtain better ``hit'' rates against their parents.
In addition to the parent-child relationships, squid supports the notion of siblings: caches at the same level in the hierarchy, provided to distribute cache server load. Each cache in the hierarchy independently decides whether to fetch the reference from the object's home site or from parent or sibling caches, using a simple resolution protocol. Siblings will not fetch an object for another sibling to resolve a cache ``miss.''
The algorithm is somewhat more complicated when firewalls are involved.
The single_parent_bypass directive can be used to skip
the ICP queries if the only appropriate sibling is a parent cache
(i.e., if there's only one place you'd fetch the object from, why
bother querying?)
There are several open issues for the caching project, namely more automatic load balancing and (both configured and dynamic) selection of parents, routing, multicast cache-to-cache communication, and better recognition of URLs that are not worth caching.
For our other to-do list items, please see our ``TODO'' file in the recent source distributions.
Prospective developers should review the resources available at the Squid developers corner.
Workload can be characterized as the burden a client or group of clients imposes on a system. Understanding the nature of workloads is important to managing system capacity.
If you are interested in Internet traffic workloads then NLANR's Network Analysis activities are a good place to start.
The NLANR root caches are at the NSF supercomputer centers (SCCs), which are interconnected via NSF's high speed backbone service (vBNS). So inter-cache communication between the NLANR root caches does not cross the Internet.
The benefits of hierarchical caching (namely, reduced network bandwidth consumption, reduced access latency, and improved resiliency) come at a price. Caches higher in the hierarchy must field the misses of their descendants. If the equilibrium hit rate of a leaf cache is 50%, half of all leaf references have to be resolved through a second level cache rather than directly from the object's source. If this second level cache has most of the documents, it is usually still a win, but if higher level caches often don't have the document, or become overloaded, then they could actually increase access latency, rather than reduce it.
Consult the Firewalls mailing list and FAQ.
For example:
Storage LRU Expiration Age: 4.31 days
The LRU expiration age is a dynamically-calculated value. Any objects which have not been accessed for this amount of time will be removed from the cache to make room for new, incoming objects. Another way of looking at this is that it would take your cache approximately this many days to go from empty to full at your current traffic levels.
As your cache becomes more busy, the LRU age becomes lower so that more objects will be removed to make room for the new ones. Ideally, your cache will have an LRU age value in the range of at least 3 days. If the LRU age is lower than 3 days, then your cache is probably not big enough to handle the volume of requests it receives. By adding more disk space you could increase your cache hit ratio.
The configuration parameter reference_age places an upper limit on your cache's LRU expiration age.
Consider a pair of caches named A and B. It may be the case that A can reach B, and vice-versa, but B has poor reachability to the rest of the Internet. In this case, we would like B to recognize that it has poor reachability and somehow convey this fact to its neighbor caches.
Squid will track the ratio of failed-to-successful requests over short time periods. A failed request is one which is logged as ERR_DNS_FAIL, ERR_CONNECT_FAIL, or ERR_READ_ERROR. When the failed-to-successful ratio exceeds 1.0, then Squid will return ICP_MISS_NOFETCH instead of ICP_MISS to neighbors. Note, Squid will still return ICP_HIT for cache hits.
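The rule above can be sketched in a few lines. This is only an illustration, not Squid's source: the FailureTracker class and its method names are hypothetical, every log tag other than the three listed is assumed to count as a success, and the handling of a window with no successes at all is an assumption.

```python
# Log tags that the text above counts as failed requests.
FAILURE_TAGS = {"ERR_DNS_FAIL", "ERR_CONNECT_FAIL", "ERR_READ_ERROR"}


class FailureTracker:
    """Hypothetical counter for one short time period."""

    def __init__(self):
        self.failed = 0
        self.successful = 0

    def record(self, log_tag):
        # Assumption: anything that is not a listed failure is a success.
        if log_tag in FAILURE_TAGS:
            self.failed += 1
        else:
            self.successful += 1

    def ratio(self):
        # Assumption: failures with zero successes give an infinite ratio.
        if self.successful == 0:
            return float("inf") if self.failed else 0.0
        return self.failed / self.successful

    def icp_reply(self, is_hit):
        # Cache hits are still reported as hits, whatever the ratio is.
        if is_hit:
            return "ICP_HIT"
        return "ICP_MISS_NOFETCH" if self.ratio() > 1.0 else "ICP_MISS"
```

With one success and one failure the ratio is exactly 1.0, which does not exceed the limit, so a miss is still answered with ICP_MISS; one more failure tips it over to ICP_MISS_NOFETCH.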
No, you must send a HUP signal to make Squid re-read its configuration file, including the access control lists. The simplest way to do this is to use the -k command line option:
squid -k reconfigure
unlinkd is an external process used for unlinking unused cache files. Performing the unlink operation in an external process opens up some race-condition problems for Squid. If we are not careful, the following sequence of events could occur:
So, the problem is, how can we guarantee that unlinkd will not remove a cache file that Squid has recently allocated to a new object? The approach we have taken is to have Squid keep a stack of unused (but not deleted!) swap file numbers. The stack size is hard-coded at 128 entries. We only give unlink requests to unlinkd when the unused file number stack is full. Thus, if we ever have to start unlinking files, we have a pool of 128 file numbers to choose from which we know will not be removed by unlinkd.
In terms of implementation, the only way to send unlink requests to the unlinkd process is via the storePutUnusedFileno function.
Unfortunately there are times when Squid can not use the unlinkd process but must call unlink(2) directly. One of these times is when the cache swap size is over the high water mark. If we push the released file numbers onto the unused file number stack, and the stack is not full, then no files will be deleted, and the actual disk usage will remain unchanged. So, when we exceed the high water mark, we must call unlink(2) directly.
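To make the bookkeeping concrete, here is a rough sketch of the unused file number stack. It is not the real storePutUnusedFileno code: the class, the two callbacks (one standing in for a request to unlinkd, one for a direct unlink(2) call) and the over_high_water flag are all hypothetical names chosen for the illustration.

```python
STACK_SIZE = 128  # the hard-coded stack size mentioned above


class UnusedFilenoStack:
    """Hypothetical model of the unused swap file number stack."""

    def __init__(self, unlink_request, unlink_now):
        self.stack = []
        self.unlink_request = unlink_request  # stand-in: hand the file to unlinkd
        self.unlink_now = unlink_now          # stand-in: call unlink(2) directly

    def put(self, fileno, over_high_water=False):
        if over_high_water:
            # Disk usage must really drop, so delete the file right away.
            self.unlink_now(fileno)
        elif len(self.stack) < STACK_SIZE:
            # Keep the number (and the file) around for reuse.
            self.stack.append(fileno)
        else:
            # Only a full stack produces work for unlinkd.
            self.unlink_request(fileno)

    def get(self):
        # Numbers on the stack were never given to unlinkd, so they are safe.
        return self.stack.pop() if self.stack else None
```

The point of the design is visible in get(): a number taken from the stack cannot be racing with a pending unlink, because numbers only reach unlinkd when they do not fit on the stack.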
One of the most unpleasant things Squid must do is generate HTML pages of Gopher and FTP directory listings. For some strange reason, people like to have little icons next to each listing entry, denoting the type of object to which the link refers (image, text file, etc.).
In Squid 1.0 and 1.1, we used internal browser icons with names like gopher-internal-image. Unfortunately, these were not very portable. Not all browsers had internal icons, or even used the same names. Perhaps only Netscape and Mosaic used these names.
For Squid 2 we include a set of icons in the source distribution. These icon files are loaded by Squid as cached objects at runtime. Thus, every Squid cache now has its own icons to use in Gopher and FTP listings. Just like other objects available on the web, we refer to the icons with Uniform Resource Locators, or URLs.
This is not possible. Squid only accepts HTTP requests. It speaks FTP on the server side, but not on the client side.
The very cool wget will download FTP URLs via Squid (and probably other caching proxies as well).
Is there any way to speed up the time spent dealing with select? Cachemgr shows:
Select loop called: 885025 times, 714.176 ms avg
This number is NOT how much time it takes to handle filedescriptor I/O. We simply count the number of times select was called, and divide the total process running time by the number of select calls.
This means, on average it takes your cache .714 seconds to check all the open file descriptors once. But this also includes time select() spends in a wait state when there is no I/O on any file descriptors. My relatively idle workstation cache has similar numbers:
Select loop called: 336782 times, 715.938 ms avg
But my busy caches have much lower times:
Select loop called: 16940436 times, 10.427 ms avg
Select loop called: 80524058 times, 10.030 ms avg
Select loop called: 10590369 times, 8.675 ms avg
Select loop called: 84319441 times, 9.578 ms avg
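Because the average is just the process running time divided by the number of select calls, you can turn the figures around to see how long the process has been running. The helper below is a hypothetical one-liner written for this explanation; it is not part of Squid or cachemgr.

```python
def total_running_seconds(select_calls, avg_ms):
    """Recover total process running time from the cachemgr select figures."""
    return select_calls * avg_ms / 1000.0


# Figures from the first example above.
uptime = total_running_seconds(885025, 714.176)
uptime_days = uptime / 86400.0
```

For the first example this works out to a little over a week of running time, which fits a lightly loaded cache that spends most of each select call waiting.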
The presence of Cookie headers in requests does not affect whether or not an HTTP reply can be cached. Similarly, the presence of Set-Cookie headers in replies does not affect whether the reply can be cached.
The proper way to deal with Set-Cookie reply headers, according to RFC 2109, is to cache the whole object, EXCEPT the Set-Cookie header lines.
With Squid-1.1, we can not filter out specific HTTP headers, so Squid-1.1 does not cache any response which contains a Set-Cookie header.
With Squid-2, however, we can filter out specific HTTP headers. But instead of filtering them on the receiving-side, we filter them on the sending-side. Thus, Squid-2 does cache replies with Set-Cookie headers, but it filters out the Set-Cookie header itself for cache hits.
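Sending-side filtering can be pictured as follows. This is a minimal sketch under the assumption that the stored reply headers are available as a list of (name, value) pairs; the function name is made up for the example and does not correspond to anything in the Squid source.

```python
def headers_for_cache_hit(stored_headers):
    """Return the stored reply headers without any Set-Cookie lines."""
    return [
        (name, value)
        for name, value in stored_headers
        if name.lower() != "set-cookie"  # header names are case-insensitive
    ]
```

The stored object keeps its original headers; only the copy sent to a client on a cache hit has the Set-Cookie lines removed.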
When checking the object freshness, we calculate these values:
OBJ_AGE = NOW - OBJ_DATE
LM_AGE = OBJ_DATE - OBJ_LASTMOD
LM_FACTOR = OBJ_AGE / LM_AGE
These values are compared with the parameters of the refresh_pattern rules. The refresh parameters are:
The URL regular expressions are checked in the order listed until a match is found. Then the algorithms below are applied for determining if an object is fresh or stale.
if (CLIENT_MAX_AGE)
    if (OBJ_AGE > CLIENT_MAX_AGE)
        return STALE
if (OBJ_AGE <= CONF_MIN)
    return FRESH
if (EXPIRES) {
    if (EXPIRES <= NOW)
        return STALE
    else
        return FRESH
}
if (OBJ_AGE > CONF_MAX)
    return STALE
if (LM_FACTOR < CONF_PERCENT)
    return FRESH
return STALE
Kolics Bertold has made an excellent flow chart diagram showing this process.
For Squid-2 the refresh algorithm has been slightly modified to give the EXPIRES value a higher precedence, and the CONF_MIN value lower precedence:
if (EXPIRES) {
    if (EXPIRES <= NOW)
        return STALE
    else
        return FRESH
}
if (CLIENT_MAX_AGE)
    if (OBJ_AGE > CLIENT_MAX_AGE)
        return STALE
if (OBJ_AGE > CONF_MAX)
    return STALE
if (OBJ_DATE > OBJ_LASTMOD) {
    if (LM_FACTOR < CONF_PERCENT)
        return FRESH
    else
        return STALE
}
if (OBJ_AGE <= CONF_MIN)
    return FRESH
return STALE
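The Squid-2 pseudocode translates almost line for line into a runnable function. The version below is a sketch rather than Squid's own code. It assumes all times are in seconds, that CONF_PERCENT is given as a fraction (0.2 for 20%), and that a missing EXPIRES, CLIENT_MAX_AGE or Last-Modified value is passed as None; the function and parameter names are chosen for this example.

```python
def is_fresh(now, obj_date, obj_lastmod, expires, client_max_age,
             conf_min, conf_percent, conf_max):
    """Return True for FRESH and False for STALE (Squid-2 ordering)."""
    obj_age = now - obj_date

    # An explicit expiry time takes precedence over everything else.
    if expires is not None:
        return expires > now

    if client_max_age is not None and obj_age > client_max_age:
        return False

    if obj_age > conf_max:
        return False

    # The last-modified factor applies only when the object was
    # modified before the date on which it was fetched.
    if obj_lastmod is not None and obj_date > obj_lastmod:
        lm_age = obj_date - obj_lastmod
        lm_factor = obj_age / lm_age
        return lm_factor < conf_percent

    # CONF_MIN is consulted last.
    return obj_age <= conf_min
```

For example, an object fetched 100 seconds ago that was last modified 500 seconds before it was fetched has an LM_FACTOR of 0.2, so with a percent setting of 0.5 it is still considered fresh.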
The cachemanager I/O page lists deferred reads for various server-side protocols.
Sometimes reading on the server-side gets ahead of writing to the client-side, especially if your cache is on a fast network and your clients are connected at modem speeds. Squid-1.1 will read up to 256k (per request) ahead before it starts to defer the server-side reads.
I've been monitoring the traffic on my cache's ethernet adapter and found a behavior I can't explain: the inbound traffic is equal to the outbound traffic. The differences are negligible. The hit ratio reports 40%. Shouldn't the outbound be at least 40% greater than the inbound?
I can't account for the exact behavior you're seeing, but I can offer this advice; whenever you start measuring raw Ethernet or IP traffic on interfaces, you can forget about getting all the numbers to exactly match what Squid reports as the amount of traffic it has sent/received.
Why?
Squid is an application - it counts whatever data is sent to, or received from, the lower-level networking functions; at each successively lower layer, additional traffic is involved (such as header overhead, retransmits and fragmentation, unrelated broadcasts/traffic, etc.). The additional traffic is never seen by Squid and thus isn't counted - but if you run MRTG (or any SNMP/RMON measurement tool) against a specific interface, all this additional traffic will "magically appear".
Also remember that an interface has no concept of upper-layer networking (so an Ethernet interface doesn't distinguish between IP traffic that's entirely internal to your organization, and traffic that's to/from the Internet); this means that when you start measuring an interface, you have to be aware of *what* you are measuring before you can start comparing numbers elsewhere.
It is possible (though by no means guaranteed) that you are seeing roughly equivalent input/output because you're measuring an interface that both retrieves data from the outside world (Internet), *and* serves it to end users (internal clients). That wouldn't be the whole answer, but hopefully it gives you a few ideas to start applying to your own circumstance.
To interpret any statistic, you have to first know what you are measuring; for example, an interface counts inbound and outbound bytes - that's it. The interface doesn't distinguish between inbound bytes from external Internet sites or from internal (to the organization) clients (making requests). If you want that, try looking at RMON2.
Also, if you're talking about a 40% hit rate in terms of object requests/counts then there's absolutely no reason why you should expect a 40% reduction in traffic; after all, not every request/object is going to be the same size so you may be saving a lot in terms of requests but very little in terms of actual traffic.
To determine whether a given object may be cached, Squid takes many things into consideration. The current algorithm (for Squid-2) goes something like this:
The keep-alive ratio shows up in the server_list cache manager page for Squid 2.
This is a mechanism to try detecting neighbor caches which might not be able to deal with HTTP/1.1 persistent connections. Every time we send a proxy-connection: keep-alive request header to a neighbor, we count how many times the neighbor sent us a proxy-connection: keep-alive reply header. Thus, the keep-alive ratio is the ratio of these two counters.
If the ratio stays above 0.5, then we continue to assume the neighbor properly implements persistent connections. Otherwise, we will stop sending the keep-alive request header to that neighbor.
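As a small illustration of the two counters, consider the sketch below. The class and method names are hypothetical, and treating a neighbor with no requests sent yet as healthy is an assumption made for the example, not something taken from the Squid source.

```python
class NeighborKeepAlive:
    """Hypothetical per-neighbor keep-alive bookkeeping."""

    def __init__(self):
        self.sent = 0      # requests sent with proxy-connection: keep-alive
        self.received = 0  # replies received with proxy-connection: keep-alive

    def note_request_sent(self):
        self.sent += 1

    def note_reply_received(self):
        self.received += 1

    def ratio(self):
        # Assumption: before anything has been sent, trust the neighbor.
        return self.received / self.sent if self.sent else 1.0

    def should_send_keep_alive(self):
        return self.ratio() > 0.5
```

A neighbor that answers only one keep-alive request in four drops to a ratio of 0.25, and the keep-alive header is no longer sent to it.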
Squid uses an LRU (least recently used) algorithm to replace old cache objects. This means objects which have not been accessed for the longest time are removed first. In the source code, the StoreEntry->lastref value is updated every time an object is accessed.
Objects are not necessarily removed ``on-demand.'' Instead, a regularly scheduled event runs to periodically remove objects. Normally this event runs every second.
Squid keeps the cache disk usage between the low and high water marks. By default the low mark is 90%, and the high mark is 95% of the total configured cache size. When the disk usage is close to the low mark, the replacement is less aggressive (fewer objects removed). When the usage is close to the high mark, the replacement is more aggressive (more objects removed).
When selecting objects for removal, Squid examines some number of objects and determines which can be removed and which cannot. A number of factors determine whether or not any given object can be removed. If the object is currently being requested, or retrieved from an upstream site, it will not be removed. If the object is ``negatively-cached'' it will be removed. If the object has a private cache key, it will be removed (there would be no reason to keep it -- because the key is private, it can never be ``found'' by subsequent requests). Finally, if the time since last access is greater than the LRU threshold, the object is removed.
The LRU threshold value is dynamically calculated based on the current cache size and the low and high marks. The LRU threshold is scaled exponentially between the high and low water marks. When the store swap size is near the low water mark, the LRU threshold is large. When the store swap size is near the high water mark, the LRU threshold is small. The threshold automatically adjusts to the rate of incoming requests. In fact, when your cache size has stabilized, the LRU threshold represents how long it takes to fill (or fully replace) your cache at the current request rate. Typical values for the LRU threshold are 1 to 10 days.
Back to selecting objects for removal. Obviously it is not possible to check every object in the cache every time we need to remove some of them. We can only check a small subset each time. The way in which this is implemented is very different between Squid-1.1 and Squid-2.
The Squid cache storage is implemented as a hash table with some number of "hash buckets." Squid-1.1 scans one bucket at a time and sorts all the objects in the bucket by their LRU age. Objects with an LRU age over the threshold are removed. The scan rate is adjusted so that it takes approximately 24 hours to scan the entire cache. The store buckets are randomized so that we don't always scan the same buckets at the same time of the day.
This algorithm has some flaws. Because we only scan one bucket, there are going to be better candidates for removal in some of the other 16,000 or so buckets. Also, the qsort() function might take a non-trivial amount of CPU time, depending on how many entries are in each bucket.
For Squid-2 we eliminated the need to use qsort() by indexing cached objects into an automatically sorted linked list. Every time an object is accessed, it gets moved to the top of the list. Over time, the least used objects migrate to the bottom of the list. When looking for objects to remove, we only need to check the last 100 or so objects in the list. Unfortunately this approach increases our memory usage because of the need to store three additional pointers per cache object. But for Squid-2 we're still ahead of the game because we also replaced plain-text cache keys with MD5 hashes.
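The Squid-2 idea of a list that stays sorted by access time can be sketched with Python's OrderedDict, which here stands in for the linked list. This is an illustration only: the LruIndex class is hypothetical, the most recently used entry sits at the end of the dictionary rather than at the "top", and the scan limit of 100 simply mirrors the "last 100 or so objects" mentioned above.

```python
from collections import OrderedDict
from itertools import islice


class LruIndex:
    """Hypothetical LRU index; oldest entries first, newest last."""

    def __init__(self):
        self.entries = OrderedDict()  # cache key -> time of last reference

    def touch(self, key, last_ref):
        # Every access moves the object to the most-recently-used end.
        self.entries[key] = last_ref
        self.entries.move_to_end(key)

    def removal_candidates(self, now, threshold, scan_limit=100):
        # Inspect only the least recently used entries; no sorting needed.
        oldest = islice(self.entries.items(), scan_limit)
        return [key for key, last_ref in oldest if now - last_ref > threshold]
```

Because the order is maintained on every access, finding removal candidates costs a short walk over one end of the list instead of a qsort() over a bucket.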
keys refers to the database keys which Squid uses to index cache objects. Every object in the cache--whether saved on disk or currently being downloaded--has a cache key. For Squid-1.0 and Squid-1.1 the cache key was basically the URL. Squid-2 uses MD5 checksums for cache keys.
The Squid cache uses the notions of private and public cache keys. An object can start out as being private, but may later be changed to public status. Private objects are associated with only a single client whereas a public object may be sent to multiple clients at the same time. In other words, public objects can be located by any cache client. Private keys can only be located by a single client--the one who requested it.
Objects are changed from private to public after all of the HTTP reply headers have been received and parsed. In some cases, the reply headers will indicate the object should not be made public, for example when the no-cache Cache-Control directive is used.
We use it to collect data for Plankton.
Yes, it can. This is an old feature inherited from the Harvest cache. The cache could send an ICP ``SECHO'' message to the echo port of the origin server. If the SECHO message came back before any of the other ICP replies, it could mean that the origin server was closer than any of the neighbor caches. In that case Harvest/Squid sent the request directly to the origin server.
For security reasons, many system administrators filter UDP packets to port 7. The Computer Emergency Response Team (CERT) once issued an advisory (CA-96.01: UDP Port Denial-of-Service Attack) noting that the UDP echo and chargen services can be used for a denial-of-service attack. This made administrators very nervous about any packet hitting port 7 on their systems, and it led to complaints.
source_ping was disabled in Squid-2. If you see a packet arriving on port 7 from a Squid cache (remote port 3130), then you are most likely dealing with a very old version of Squid.
This means that Squid sent a DNS query to one IP address, but the response came from a different IP address. By default Squid checks that the addresses match. If they do not, Squid ignores the response.
There are a number of reasons why this can happen:
If you see the IP address of one of your name servers in these messages, then it is most likely case (1) or (2).
You can make these messages stop by allowing replies from ``unknown'' name servers with this configuration option:
ignore_unknown_nameservers off
Note: The information here is current for version 2.2.
See storeDirMapAllocate() in the source code.
When Squid wants to create a new disk file for storing an object, it first selects which cache_dir the object will go into. This is done with the storeDirSelectSwapDir() function. If you have N cache directories, the function identifies the 3N/4 (75%) of them with the most available space. These directories are then used, in order of having the most available space. When Squid has stored one URL to each of the 3N/4 cache_dir's, the process repeats and storeDirSelectSwapDir() finds a new set of 3N/4 cache directories with the most available space.
Once the cache_dir has been selected, the next step is to find an available swap file number. This is accomplished by checking the file map, with the file_map_allocate() function. Essentially the swap file numbers are allocated sequentially. For example, if the last number allocated happens to be 1000, then the next one will be the first number after 1000 that is not already being used.
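The two selection steps can be approximated as follows. This is a simplified sketch, not storeDirSelectSwapDir() or file_map_allocate() themselves: the function names are invented, the rounding of 3N/4 (integer division, with at least one directory kept) is an assumption, and wrap-around at the end of the file map is left out.

```python
def select_swap_dirs(free_space):
    """Pick the 3N/4 cache directories with the most available space.

    free_space maps a cache_dir name to its available space.
    The result is ordered from most to least available space.
    """
    n = len(free_space)
    keep = max(1, (3 * n) // 4)  # assumption: round down, keep at least one
    ranked = sorted(free_space, key=free_space.get, reverse=True)
    return ranked[:keep]


def allocate_fileno(in_use, last_allocated):
    """Return the first unused swap file number after last_allocated."""
    candidate = last_allocated + 1
    while candidate in in_use:
        candidate += 1
    in_use.add(candidate)  # mark the number as taken in the file map
    return candidate
```

With four cache directories, three are used in each round and the fullest one is skipped until the next selection.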
Byte hit ratio is calculated a bit differently than Request hit ratio. Squid counts the number of bytes read from the network on the server-side, and the number of bytes written to the client-side. The byte hit ratio is calculated as
(client_bytes - server_bytes) / client_bytes
If server_bytes is greater than client_bytes, you end up
with a negative value.
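A quick way to see how the value goes negative is to plug numbers into the formula. The function below is just the formula above wrapped in Python; returning 0.0 when no client bytes have been written is an assumption added to avoid a division by zero.

```python
def byte_hit_ratio(client_bytes, server_bytes):
    """(client_bytes - server_bytes) / client_bytes, as defined above."""
    if client_bytes == 0:
        return 0.0  # assumption: nothing sent to clients yet
    return (client_bytes - server_bytes) / client_bytes
```

Writing 1000 bytes to clients while reading 600 from servers gives a ratio of 0.4, but reading 1200 bytes from servers for the same 1000 client bytes gives -0.2.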
The server_bytes may be greater than client_bytes for a number of reasons, including:
First you need to understand the difference between public and private keys.
When Squid sends ICP queries, it uses the ICP reqnum field to hold the private key data. In other words, when Squid gets an ICP reply, it uses the reqnum value to build the private cache key for the pending object.
Some ICP implementations always set the reqnum field to zero when they send a reply. Squid can not use private cache keys with such neighbor caches because Squid will not be able to locate cache keys for those ICP replies. Thus, if Squid detects a neighbor cache that sends zero reqnum's, it disables the use of private cache keys.
Not having private cache keys has some important privacy implications. Two users could receive one response that was meant for only one of the users. This response could contain personal, confidential information. You will need to disable the ``zero reqnum'' neighbor if you want Squid to use private cache keys.
TCP allows connections to be in a ``half-closed'' state. This is accomplished with the shutdown(2) system call. In Squid, this means that a client has closed its side of the connection for writing, but left it open for reading. Half-closed connections are tricky because Squid cannot tell the difference between a half-closed connection and a fully closed one.
If Squid tries to read from a connection and read() returns 0 while the client has not yet received the whole response, Squid marks the file descriptor as half-closed. Most likely the client has aborted the request and the connection is really closed. However, there is a slight chance that the client used the shutdown() call and can still read the response.
To disable support for half-closed connections, simply put this in squid.conf:
half_closed_clients off
Then Squid will always close its side of the connection instead of
marking it as half-closed.
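The ambiguity is easy to reproduce outside of Squid. The sketch below uses a local socket pair from Python's standard library to play both roles; the function name is made up for the demonstration, and the code only illustrates the TCP-style half-close semantics, not Squid's handling of it.

```python
import socket


def half_close_demo():
    """Show that a half-closed peer looks like EOF but can still read."""
    client, server = socket.socketpair()
    try:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        client.shutdown(socket.SHUT_WR)  # client closes only its write side

        request = server.recv(1024)
        eof = server.recv(1024)          # empty read, same as a full close

        server.sendall(b"reply")         # yet the client can still receive
        reply = client.recv(1024)
        return request, eof, reply
    finally:
        client.close()
        server.close()
```

From the server side, the empty read is all there is to go on: it cannot tell whether the client is still waiting for the reply or has gone away entirely.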
Squid has traditionally used an LRU replacement algorithm. As of version 2.3, you can use other replacement algorithms by enabling them with the --enable-heap-replacement configure option. Currently, the heap replacement code supports two additional algorithms: LFUDA and GDS.
With Squid version 2.4 and later you should use this configure option:
./configure --enable-removal-policies=heap
Then, in squid.conf, you can select a different policy with the cache_replacement_policy directive. See the squid.conf comments for details.
The LFUDA and GDS replacement code was developed by John Dilley and others from Hewlett-Packard. Their work is described in these papers:
If you compare the output of df with the cache manager storedir output, you will notice that more disk space is used than Squid reports. This can happen for the following reasons:
The positive_dns_ttl directive tells Squid how long to cache successful DNS lookups. Similarly, negative_dns_ttl tells Squid how long to cache failed DNS lookups.
positive_dns_ttl is not always used. It is NOT used in the following cases:
Let's say you have the following settings:
positive_dns_ttl 1 hours
negative_dns_ttl 1 minutes
When Squid looks up a name like www.squid-cache.org, it gets back an IP address like 204.144.128.89. The address is cached for one hour. That means that when Squid needs the address for www.squid-cache.org again, it uses the cached answer for the next hour. After one hour, the cached information expires and Squid makes a new query for the address of www.squid-cache.org.
If you have the DNS TTL patch, or are using the internal resolver, then each hostname has its own TTL value, which was set by the domain administrator. You can see these values on the 'ipcache' cache manager page. For example:
Hostname                      Flags lstref     TTL N
www.squid-cache.org               C  73043   12784 1( 0)  204.144.128.89-OK
www.ircache.net                   C  73812   10891 1( 0)   192.52.106.12-OK
polygraph.ircache.net             C 241768 -181261 1( 0)   192.52.106.12-OK
The TTL field shows how many seconds remain until the entry expires. A negative value means the entry has already expired and will be refreshed on the next request.
The negative_dns_ttl directive specifies how long to cache failed DNS lookups. When Squid fails to resolve a hostname, you can be fairly sure that it is a real failure, and you are not likely to get a successful answer within a short time period. Squid retries its lookups many times before declaring a lookup failed. If you like, you can set negative_dns_ttl to zero.
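The interplay of the two directives can be modelled with a tiny cache. This is a toy model and not Squid's ipcache: the DnsCache class is hypothetical, times are plain seconds, a failed lookup is represented by storing None, and per-record TTLs from the DNS reply are deliberately ignored.

```python
class DnsCache:
    """Toy DNS cache with separate positive and negative lifetimes."""

    def __init__(self, positive_ttl, negative_ttl):
        self.positive_ttl = positive_ttl
        self.negative_ttl = negative_ttl
        self.entries = {}  # hostname -> (address or None, expiry time)

    def store(self, name, address, now):
        # address is None for a failed lookup.
        ttl = self.positive_ttl if address is not None else self.negative_ttl
        self.entries[name] = (address, now + ttl)

    def lookup(self, name, now):
        """Return (cached, address); cached is False if a new query is needed."""
        entry = self.entries.get(name)
        if entry is None or now >= entry[1]:
            return False, None
        return True, entry[0]
```

With the settings from the example (3600 and 60 seconds), a successful answer is reused for an hour, while a failure is remembered for only a minute before Squid tries again.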
This means that Squid opened a disk file to serve a cache hit, but found that the stored object does not match what the user requested. Squid stores the MD5 digest of the URL at the start of each disk file. When the file is opened, Squid checks that the MD5 in the disk file matches the MD5 of the URL requested by the user. If they do not match, the warning is printed and Squid forwards the request to the origin server.
You do not need to worry about this kind of warning. It means that Squid is recovering from corrupted cache directory data.
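The check itself amounts to comparing two digests. In the sketch below the key is assumed to be the MD5 of the URL string alone, which is a simplification made for the example; the exact inputs Squid hashes are not spelled out here, and both function names are hypothetical.

```python
import hashlib


def url_md5(url):
    """MD5 digest of a URL (simplified stand-in for the cache key)."""
    return hashlib.md5(url.encode("utf-8")).digest()


def stored_object_matches(stored_digest, requested_url):
    """True if the digest at the start of the disk file fits the request.

    A False result corresponds to the warning above: the request is then
    forwarded to the origin server instead of being served from disk.
    """
    return stored_digest == url_md5(requested_url)
```

A mismatch costs one extra fetch from the origin server, which is why the warning is harmless.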
Each of Squid's disk cache files has a metadata section at the beginning. This header stores the URL MD5, some StoreEntry data, and more. When Squid opens a disk file for reading, it looks for the metadata header and unpacks it.
This warning means that Squid could not unpack the metadata. This is a non-fatal error from which Squid can recover. Perhaps the metadata was simply missing, or the file was corrupted.
You do not need to worry about this warning. It means that Squid is double-checking that the disk file matches what Squid expected to find there, and the check failed. Squid recovers and generates a cache miss in this case.
It's a side-effect of the way interception proxying works.
When Squid is configured for interception proxying, the operating system pretends that it is the origin server. That means that the "local" socket address for intercepted TCP connections is really the origin server's IP address. If you run netstat -n on your interception proxy, you'll see a lot of foreign IP addresses in the Local Address column.
When Squid wants to make an ident query, it creates a new TCP socket and binds the local endpoint to the same IP address as the local end of the client's TCP connection. Since the local address isn't really local (it's some far away origin server's IP address), the bind() system call fails. Squid handles this as a failed ident lookup.
So why bind in that way? If you know you are transparent proxying, then why not bind the local endpoint to the host's (intranet) IP address? Why make the masses suffer needlessly?
Because that's just how ident works. Please read RFC 931, in particular the RESTRICTIONS section.
This means that you are using external dnsserver processes for lookups, all of the processes are busy, and Squid's pending queue is full. Each dnsserver process can handle only one request at a time. When all dnsserver processes are busy, Squid queues up requests, but only up to a certain point.
To alleviate this condition, you need to either increase the number of dnsserver processes by changing the value of the dns_children directive in your configuration file, or switch to the internal DNS resolver.
Note that some versions of Squid limit dns_children to 32. To set a larger value, you would have to edit the source code.
by Colin Campbell
Ftp uses two data streams, one for passing commands around, the other for moving data. The command channel is handled by the ftpd listening on port 21.
The data channel varies depending on whether you ask for passive ftp or not. When you request data in a non-passive environment, your client tells the server ``I am listening on <ip-address> <port>.'' The server then connects FROM port 20 to the ip address and port specified by your client. This requires your "security device" to permit any host outside from port 20 to any host inside on any port > 1023. Somewhat of a hole.
In passive mode, when you request a data transfer, the server tells the client ``I am listening on <ip address> <port>.'' Your client then connects to the server on that IP and port and data flows.