[squid-users] intercepting roku traffic
Brendan Kearney
bpk678 at gmail.com
Thu Mar 10 02:57:42 UTC 2016
On 03/09/2016 06:18 AM, Amos Jeffries wrote:
> On 9/03/2016 4:59 a.m., Brendan Kearney wrote:
>> i have a roku4 device and it constantly has issues causing it to
>> buffer. i want to try intercepting the traffic to see if i can smooth
>> out the rough spots.
> Squid is unlikely to help with this issue.
>
> "Buffering ..." issues are usually caused by:
>
> - broken algorithms on the device consuming data faster than it lets the
> remote endpoint be aware it can process, and/or
> - network level congestion, and/or
> - latency increase from excessive buffer sizes (on device, or network).
>
>
>> i can install squid on the router device i have
>> and intercept the port 80/443 traffic, but i want to push the traffic to
>> my load balanced VIP so the "real" proxies can do the fulfillment work.
> Each level of software you have processing this traffic increases the
> latency delays packets have. Setups like this also add extra bottlenecks
> which can get congested.
>
> Notice how both of those things are items on the problem list. So adding
> a proxy is one of the worst things you can do in this situation.
>
> On the other hand, it *might* help if the problem is lack of a cache
> near the client(s). You need to know that a cache will help though
> before starting.
>
>
> My advice is to read up on "buffer bloat". What the term means and how
> to remove it from your network. Check that you have ICMP and ICMPv6
> working on your network to handle device level issues and congestion
> handling activities.
>
> Then if the problem remains, check your traffic to see how much is
> cacheable. Squid intercepts can usually cache 5%-20% of any network
> traffic if there is no other caching already being done on that traffic
> (excluding browser caches). With attention and tuning it can reach
> somewhere around 50% under certain conditions.
>
> Amos
a bit about my router and network:
router - hp n36l microserver, 1.3 GHz Athlon II Neo CPU, 4 GB RAM, on
board Gb NIC for WAN, HP nc364t 4x1Gb NIC using e1000e driver. the 4
ports on the nc364t card are bonded with 802.3ad and LACP and 9 VLANs
are trunked across.
switch 1 - cisco sg500-52
switch 2 - cisco sg300-28
router is connected to switch 1 with a 4 port bond and switch 1 is
connected to switch 2 with a 4 port bond. all network drops throughout
the house are terminated to a patch panel and patched into the sg500.
all servers are connected to the sg300, and have a 4 port bond for
in-band connections and an IPMI card for out of band mgmt.
the router does firewall, internet gateway/NAT, load balancing, and
routing (locally connected only, no dynamic routing such as ospf via
quagga).
now, what i have done so far:
when i first got the roku4, i found issues with the sling tv app. hulu
worked without issue, and continues without issue even now. i have
looked into QoS, firewall performance tweaks, ring buffer increases, and
kernel tuning for things like packets per second capacity. i also have
roku SE devices, that have no issues in hulu or sling tv at all. having
put up a vm for munin monitoring, i am able to see some details about
the network.
QoS will not be of any value because none of the links i control are
saturated or congested. everything is gig, except for the roku
devices. the 4 is 100 Mb and the SE's are wifi. the only way for me to
have QoS kick in is to artificially congest my links, say with very few
ring buffers. i dont see this as a reasonable option at this point.
i have tuned my firewall policy in several ways. first, i stopped
logging the roku HTTP/HTTPS traffic. very chatty sessions lead to lots
of logs. each log event calls the "logger" binary, and i was paying
penalties for starting a new process thousands of times to log the
access events. i also reject all other traffic from the roku's instead
of dropping the traffic. this helps with the google dns lookups the
devices try, and i no longer pay the dns timeout penalties for that. i
have also stopped the systemd logging and i am not paying the i/o
penalty for writing those logs to disk. since i use rsyslog with RELP
(Reliable Event Logging Protocol), all logging still goes on; i just have
reliable syslog over tcp with receipt acknowledgment, and cascading FIFO
queue to memory and then to disk if need be. i believe this has helped
reclaim i/o, interrupts and contexts, leading to some (minor)
performance gains.
the hp nc364t quad gig nic's are bonded, and i see RX and TX errors on
the bond interface (not the VLAN sub interfaces and not the physical
p1pX interfaces). i increased the ring buffers on all 4 interfaces from
the default of 256 to 512 and then to 1024 and then to 2048, testing
each change along the way. 1024 seems to be the best so far and i dont
think there are any issues with buffer bloat. i have used
http://www.dslreports.com/speedtest to run speed tests since it has a
buffer bloat calculation built in. i have tested using my load balanced
squid instances as well as direct outbound with no proxy. it does not
detect any buffer bloat at all. with the ring buffers set to 1024,
there is a small but perceptible performance gain and anecdotally, i
notice page loads are quicker. from what i have read about buffer bloat,
it is only related to interface buffer queues and not memory/disk caches
such as squid. it seems that any delay seen is at the beginning of a
stream and not during an in-progress conversation.
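
the per-interface error counters described above can be compared side by side. what follows is a minimal sketch in python, assuming the standard linux /proc/net/dev column layout. the sample text, the interface names (bond0, bond0.10, p1p1) and every counter value in it are invented for illustration and are not taken from the router.

```python
# sample text in the /proc/net/dev layout; names and counters are invented
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
 bond0: 1000 10 3 0 0 0 0 0 2000 20 5 0 0 0 0 0
bond0.10: 400 4 0 0 0 0 0 0 800 8 0 0 0 0 0 0
  p1p1: 250 3 0 0 0 0 0 0 500 5 0 0 0 0 0 0
"""


def parse_net_dev(text):
    """Map interface name -> (rx_errs, tx_errs) from /proc/net/dev text."""
    errors = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, _, counters = line.partition(":")
        fields = counters.split()
        if len(fields) < 16:
            continue
        # 8 receive columns come first, then 8 transmit columns;
        # "errs" is the third column of each group
        errors[name.strip()] = (int(fields[2]), int(fields[10]))
    return errors


errs = parse_net_dev(SAMPLE)
# interfaces that report any receive or transmit errors
noisy = sorted(name for name, (rx, tx) in errs.items() if rx or tx)

for name, (rx, tx) in sorted(errs.items()):
    print(f"{name:10s} rx_errs={rx} tx_errs={tx}")
print("interfaces with errors:", noisy)
```

on a real box the sample would be swapped for open("/proc/net/dev").read(). running it before and after a ring buffer change would show which layer the error counters are growing on.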
i am currently endeavoring to tune the kernel on the router box, and
have found some info that seems to have helped in small ways. having
talked to a coworker who is well versed in all things related to the
network stack, he tells me the info found in this blog,
http://www.nateware.com/linux-network-tuning-for-2013.html, is good for
a server that is handling the connections as opposed to a router that is
routing the traffic. i have added the suggested tweaks to my box, as it
is a load balancer (TCP Proxy) and does handle connections in addition
to routing. there is also an article on ars technica about building
your own linux router. the teaser article was a decent read, but the
next installment should have more meat-and-potatoes in it and i am
waiting on that, to see what gaps i have in my setup. the second
article should be dropping soon.
i put together a monitoring vm and have munin pulling snmp stats from
all my infrastructure. i can see the bandwidth usage and found some
patterns. the roku SE's only use about 3 Mb when playing back. the
roku4 uses between 5 and 6 Mb. hulu and sling tv both work fine on the
SE's, and hulu works fine on the roku4. sling tv on the roku4 degrades
in video (pixelates), sound quality fails and goes from stereo to mono,
and ultimately the round ring of buffering destroys the end user
experience. of course i see in the munin graphs when the buffering
occurs, but i have not been able to correlate the buffering event to
anything specific.
i setup a span port on the sg500 switch and recorded a 23 minute session
of me watching tv in a packet capture. the resulting 567 MB pcap has
some interesting data. first, there is no ICMP traffic in it. so the
PMTUD/ICMP Type 3, Code 4/Packet Fragmentation suggestion likely wont
pan out. i still added the rule to my firewall, though, as it is sound
logic to allow it outbound. second, the stream is in the clear and is
downloaded in partial content (HTTP/206) chunks. below are the response
headers for one of the streams in the capture:
HTTP/1.1 206 Partial Content
ETag: "482433a6a462d26fc994b06a1856547f"
Last-Modified: Thu, 25 Feb 2016 01:00:22 GMT
Server: Dynapack/0.1."152-4" Go/go1.5
X-Backend: pcg15dynpak12
XID-Fetch: 150850146
Grace: none
Linear-Cache-Host: cg5-lcache001
Linear-Cache: MISS
XID-Deliver: 150850145
Accept-Ranges: bytes
Cache-Control: public, max-age=604744
Expires: Thu, 03 Mar 2016 00:59:40 GMT
Date: Thu, 25 Feb 2016 01:00:36 GMT
Content-Range: bytes 143805-287607/318480
Content-Length: 143803
Connection: keep-alive
Content-Type: video/MP2T
access-control-max-age: 86400
access-control-allow-credentials: true
access-control-expose-headers: Server,range,hdntl,hdnts
access-control-allow-headers: origin,range,hdntl,hdnts
access-control-allow-methods: GET,POST,OPTIONS
access-control-allow-origin: *
the ETag, Last-Modified, Cache-Control and Expires headers indicate to
me that i would be able to cache this content, so i believe there would
be a benefit to getting squid into the mix here.
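
the freshness and range arithmetic in those headers can be checked mechanically. this is a minimal sketch in python using only the header values from the capture above; the helper names max_age_seconds and range_length are made up for this example. it only shows that max-age and Expires agree and that the range matches Content-Length. whether a particular proxy actually stores 206 responses is a separate question that these headers do not answer.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# header values copied from the captured response above
headers = {
    "Cache-Control": "public, max-age=604744",
    "Expires": "Thu, 03 Mar 2016 00:59:40 GMT",
    "Date": "Thu, 25 Feb 2016 01:00:36 GMT",
    "Content-Range": "bytes 143805-287607/318480",
    "Content-Length": "143803",
}


def max_age_seconds(cache_control):
    """Return the max-age value in seconds, or None if it is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def range_length(content_range):
    """Return (first, last, total, length) from a 'bytes a-b/total' value."""
    _unit, _, spec = content_range.partition(" ")
    span, _, total = spec.partition("/")
    first, _, last = span.partition("-")
    first, last, total = int(first), int(last), int(total)
    return first, last, total, last - first + 1  # byte ranges are inclusive


date = parsedate_to_datetime(headers["Date"])
expires = parsedate_to_datetime(headers["Expires"])
max_age = max_age_seconds(headers["Cache-Control"])
fresh_until = date + timedelta(seconds=max_age)
first, last, total, length = range_length(headers["Content-Range"])

print("fresh until (Date + max-age):", fresh_until)
print("fresh until (Expires):       ", expires)
print("range length:", length, "of", total, "bytes")
```

both freshness calculations land on the same instant, roughly a week after the response date, and the inclusive byte range works out to the advertised Content-Length.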
looking at the IO Graph in wireshark, i can see latency spikes during
the buffering events, but the latency also modulates/undulates throughout
the captured session. i am not sure i know enough about what i am looking
at to make sense of it.
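
as a coarse sanity check on the capture, the average rate over the whole session can be worked out from the two figures given earlier (567 MB, 23 minutes). this is only a sketch: it assumes "MB" means decimal megabytes and that the capture ran for the full 23 minutes, and the function name average_mbps is made up here. the pcap size also includes per-packet capture overhead, so the figure is approximate.

```python
def average_mbps(size_bytes, duration_seconds):
    """Average rate in megabits per second (decimal mega)."""
    return size_bytes * 8 / duration_seconds / 10**6


capture_bytes = 567 * 10**6  # "567 MB" read as decimal megabytes (assumption)
duration_s = 23 * 60         # 23 minute session

rate = average_mbps(capture_bytes, duration_s)
print(f"average capture rate: {rate:.2f} Mb/s")
```

an average like this hides the spikes and stalls that matter for buffering, so it is only useful for comparing against the steady-state figures from the munin graphs.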
with all of this info, i do believe proxying this traffic will improve
the situation. just how much improvement is yet to be seen. given what
i think i need in terms of intercepting the traffic, are there any
glaring holes or pitfalls?
thanks for the assistance,
brendan