End-to-end congestion control cannot avoid latency spikes (2022)

2024-07-09 8:42 · 126 points · 51 comments · blog.apnic.net

What are the fundamental limits of end-to-end congestion control in 5G networks?



Comments

  • By flapjack 2024-07-09 14:28 · 3 replies

    One of the solutions they mention is underutilizing links. This is probably a good time to mention my thesis work, where we showed that streaming video traffic (which is the majority of the traffic on the internet) can pretty readily underutilize links on the internet today, without a downside to video QoE! https://sammy.brucespang.com

    • By aidenn0 2024-07-09 22:22 · 2 replies

      Packet switching won over circuit switching because its cost-per-capacity was so much lower; if you have to over-provision/under-utilize links anyway, why not use circuit switching?

      • By crest 2024-07-09 23:43 · 1 reply

        Because the cost is still a lot lower. Paying for a few percent of overcapacity doesn't change that.

        • By aidenn0 2024-07-09 23:48 · 1 reply

          TFA suggests 900% overcapacity, not a few percent. I just skimmed GP's article, but it seems to suggest ~100% overcapacity for streaming video specifically.

          • By nine_k 2024-07-10 3:11 · 1 reply

            "100% overcapacity" is also known as "50% load factor".

            It's common practice to run API nodes at 20%-60% CPU load, no more, precisely to curb tail latency.
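
            A quick Python sanity check of the conversion (the 900% figure is TFA's, per the comment above):

              # "Overcapacity" vs. "load factor": capacity = load * (1 + over/100),
              # so load factor = 1 / (1 + over/100).
              def load_factor(overcapacity_pct):
                  return 1 / (1 + overcapacity_pct / 100)

              print(load_factor(100))  # 0.5 -> 50% load factor
              print(load_factor(900))  # 0.1 -> the ~10% utilization TFA's 900% implies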

            • By sambazi 2024-07-10 9:10 · 1 reply

              just think about your competitive edge if you could manage to cram more load in w/o affecting latency

              • By jdougan 2024-07-11 21:05

                Depends on what you have to pay (and not just in money) to get there.

      • By nine_k 2024-07-10 3:07 · 1 reply

        A physical circuit costs so much more that it's not even funny.

        You can deploy a 24-fiber optical cable and run many thousands of virtual circuits over it in parallel using packet switching - usually orders of magnitude more when they share bandwidth opportunistically, because the packet streams are not constant in intensity (see the sketch below).

        Running thousands of separate fibers/wires would be much more expensive, and thousands of narrow-band splitters/transceivers would also be massively expensive. Phone networks tried all of that, and gladly jumped off the physical-circuit ship as soon as they could.
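
        A toy Monte Carlo sketch of that multiplexing gain (all numbers are illustrative):

          import random

          # Toy model: 1000 bursty flows, each active 5% of the time at 10 Mb/s peak.
          # Circuits must reserve the peak for every flow; packet switching only needs
          # capacity for the simultaneous peaks (plus headroom for tail events).
          flows, peak_mbps, duty, trials = 1000, 10, 0.05, 10_000
          demand = [sum(peak_mbps for _ in range(flows) if random.random() < duty)
                    for _ in range(trials)]

          print("circuit capacity needed:", flows * peak_mbps, "Mb/s")  # 10000
          print("p99.9 packet demand:", sorted(demand)[int(0.999 * trials)], "Mb/s")  # ~700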

        • By wmf 2024-07-10 4:35 · 2 replies

          These days I think circuit switching means virtual circuits with guaranteed bandwidth at each hop. Even the phone network used FDM/TDM for circuits, not separate wires. Anyway, circuits are more expensive than just running a packet-switched network lightly loaded. See http://www.stuartcheshire.org/rants/Networkdynamics.html and https://www-users.cse.umn.edu/~odlyzko/doc/network.utilizati...

          • By aidenn0 2024-07-10 16:20

            > Anyway, circuits are more expensive than just running a packet-switched network lightly loaded.

            This was undoubtedly true (and not even close) 20 years ago. As technology changes, it can be worth revisiting such axioms to see if they still hold. Since virtual circuits require smart switches along the entire shared path, there are literal network effects making them hard to adopt.

          • By ahartmetz 2024-07-10 7:13

            The old and new standard ways to do virtual circuit switching are ATM (heavily optimized for low-latency voice - 53-byte cells!) and MPLS (which seems to be a sort of flow-labeling extension to "host" protocols such as IP - clever!).
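
            For context, an ATM cell is a 5-byte header plus a 48-byte payload; a quick back-of-the-envelope on what that buys and costs:

              # ATM cell = 5-byte header + 48-byte payload = 53 bytes.
              HEADER, PAYLOAD = 5, 48
              print(f"header overhead ('cell tax'): {HEADER / (HEADER + PAYLOAD):.1%}")  # 9.4%
              # Small fixed cells keep per-hop serialization delay tiny, which is
              # the low-latency-voice optimization: one cell on a 155 Mb/s OC-3 link:
              print(f"cell time at 155 Mb/s: {53 * 8 / 155e6 * 1e6:.2f} us")  # ~2.74 us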

            Both are technologies that one rarely has any contact with as an end user.

            Sources: Things I've read a long time ago + Wikipedia for ATM, Wikipedia for MPLS.

    • By sambazi 2024-07-10 9:07 · 2 replies

      > my thesis work, where we showed that streaming video traffic [...] can pretty readily underutilize links on the internet today, without a downside to video QoE!

      i was slightly at a loss as to what exactly needed to be shown here, until i clicked the link and came to the conclusion that you re-invented(?) pacing.

      https://man7.org/linux/man-pages/man8/tc-fq.8.html
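
      e.g. with fq installed as the qdisc, a sender can also cap its own rate per socket (Linux-specific; Python may not export the constant, so its value is assumed below):

        import socket

        # SO_MAX_PACING_RATE is 47 on Linux; fall back to that if not exported.
        SO_MAX_PACING_RATE = getattr(socket, "SO_MAX_PACING_RATE", 47)

        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Cap this socket to ~5 Mb/s; the option takes bytes per second.
        sock.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE, 5_000_000 // 8)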

      • By flapjack 2024-07-11 17:36 · 1 reply

        I would definitely not say that we re-invented pacing! One version of the question we looked at was: how low a pace rate can you pick for an ABR algorithm, without reducing video QoE? The part which takes work is this "without reducing video QoE" requirement. If you're interested, check out the paper!
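
        To make the question concrete, a toy policy (not what the paper actually does) might look like:

          def pace_rate(bitrate_bps, buffer_s, target_s=30, margin=1.25):
              # Toy rule: once the client buffer is healthy, pace just above
              # the encoded bitrate; while the buffer fills, download faster.
              # All parameters here are illustrative, not from the paper.
              if buffer_s >= target_s:
                  return bitrate_bps * margin
              return bitrate_bps * 3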

        • By sambazi 2024-07-12 9:20

          > One version of the question we looked at was: how low a pace rate can you pick for an ABR algorithm, without reducing video QoE?

          that is certainly an interesting economic optimization problem to reason about, though imho somewhat beyond technical merit, as simply letting the client choose the quality and sending the data at full speed works well enough.

          addition:

          i totally agree that things have to look economical in order to work, and that there are technical edge-cases that need to be handled for good ux, but i don't quite see how client-side buffer occupancy in the seconds range is in the user's interest.

      • By nsteel 2024-07-10 13:32 · 1 reply

        Did you read the linked paper which talks a lot about TCP pacing?

        https://sammy.brucespang.com/sammy.pdf

        • By sambazi 2024-07-11 11:41 · 1 reply

          i did not read that paper, which focuses on adaptive bitrate selection for video streaming services and came out 8 years after the pacing implementation hit the kernel. thx though

          • By nsteel 2024-07-23 14:09

            It uses that implementation, so yes, it came out afterwards. Sections 2 and 3 might help.

    • By clbrmbr 2024-07-09 15:51 · 2 replies

      Can you comment on latency-sensitive video (Meet, Zoom) versus latency-insensitive video (YouTube, Netflix)? Is only the latter “streaming video traffic”?

      • By flapjack 2024-07-09 16:25 · 1 reply

        We looked at latency-insensitive video like YouTube and Netflix (which together were a bit more than 50% of internet traffic last year [1]).

        I'd bet you could do something similar with Meet and Zoom - my understanding is that video bitrates for those services are lower than for e.g. Netflix, which we showed are much lower than network capacities. But it might be tricky because of the latency-sensitivity angle, and we did not look into it in our paper.

        [1] https://www.sandvine.com/hubfs/Sandvine_Redesign_2019/Downlo...

        • By mikepavone 2024-07-10 1:35

          > Meet and Zoom–my understanding is video bitrates for those services are lower than for e.g. Netflix

          For a given quality, bitrate will generally be higher in RTC apps (though quality may be lower overall, depending on the context and network conditions) because of the tradeoff between encoding latency and efficiency. However, RTC apps generally already try to underutilize links, because queuing is bad for latency and latency matters a lot for the RTC case.

      • By sulandor 2024-07-10 7:06

        the term "streaming video" usually refers to the fact that the data is sent slower than the link capacity (but intermittently faster than the content bitrate)

        op presumably used the term to describe "live content", i.e. content whose source material is not available as a whole (because the recording is not finished); this can be considered a subset of "streaming video"

        the sensitivity with regard to transport characteristics stems from the fact that "live content" places an upper bound on the time available for processing and transferring the content bits to the clients (for it to still be considered "live")

  • By westurner 2024-07-09 12:47

    - "MOOC: Reducing Internet Latency: Why and How" (2023) https://news.ycombinator.com/item?id=37285586#37285733 ; sqm-autorate, netperf, iperf, flent, dslreports_8dn

    Bufferbloat > Solutions and mitigations: https://en.wikipedia.org/wiki/Bufferbloat#Solutions_and_miti...

  • By comex 2024-07-09 17:25

    Sounds like a good argument for using a CDN. Or, to phrase it more generally, for an intermediary that is as close as possible to the host experiencing the fluctuating bandwidth bottleneck, while still being on the other side of the bottleneck. That way it can detect bandwidth drops quickly and handle them more intelligently than by dropping packets - for instance, by switching to a lower-quality video stream (or even re-encoding on the fly).
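
    A minimal sketch of that edge-side logic (hypothetical names, illustrative bitrate ladder):

      def pick_rendition(measured_bps, ladder_bps, headroom=0.8):
          # Toy edge-side ABR: pick the highest rendition that fits within
          # a safety fraction of the throughput measured toward the client.
          fitting = [r for r in sorted(ladder_bps) if r <= measured_bps * headroom]
          return fitting[-1] if fitting else min(ladder_bps)

      print(pick_rendition(6_000_000, [1_000_000, 2_500_000, 5_000_000, 8_000_000]))
      # -> 2500000: 5 Mb/s doesn't fit within 80% of a measured 6 Mb/s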

HackerNews