And yet, every programming language/platform builds its own HTTP-handling library, usually several, of widely varying quality and feature support. Again, it would not be as bad if HTTP were a robust format where you could skip recognizing and correctly dealing with the half of the features you don't intend to support, but it is not: even if you don't want to accept, e.g., trailers, you still have to be aware of them. We have OpenSSL, why not also have OpenHTTP (in sans-io style)?
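For anyone unfamiliar with the term, "sans-io" just means the parser never touches a socket: the caller feeds it bytes and gets parsed events back. A rough sketch of the shape, where `RequestHeadParser` and `feed` are names I made up for illustration, and which only handles the request line rather than anything close to full HTTP:

```python
class RequestHeadParser:
    """Sans-io toy: bytes go in via feed(), a parsed request line comes out."""

    def __init__(self):
        self._buf = bytearray()
        self.done = False

    def feed(self, data: bytes):
        """Buffer data; return (method, target, version) once the head is complete."""
        if self.done:
            raise RuntimeError("head already parsed")
        self._buf += data
        end = self._buf.find(b"\r\n\r\n")
        if end == -1:
            return None  # need more bytes; the caller decides how to get them
        request_line = bytes(self._buf[:end]).split(b"\r\n")[0]
        parts = request_line.split(b" ")
        if len(parts) != 3:
            raise ValueError("malformed request line")
        self.done = True
        return tuple(parts)


parser = RequestHeadParser()
first = parser.feed(b"GET / HT")                         # incomplete head
second = parser.feed(b"TP/1.1\r\nHost: x\r\n\r\n")       # head now complete
```

Because there is no I/O inside, the same parser could sit behind blocking sockets, asyncio, or a fuzzer, which is the whole appeal for a shared library.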
I only skimmed very quickly to look for which server setups they found new vulnerabilities for, and it looked like they tested a 2D matrix of popular webservers/caches/reverse-proxies with each other? Which is neat for automation, but in the real world I'm not usually going to be running haproxy behind nginx or vice versa. I'd be much more interested in findings for popular webserver->appserver setups, e.g., nginx in front of gunicorn/django.
The dominant method for request smuggling over the last few years has involved `Content-Length` and `Transfer-Encoding`. As someone who has done web-app assessments, what I found most interesting, and the biggest takeaway, is simply the set of attacks they found to work and cause problems.
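To make the `Content-Length` / `Transfer-Encoding` problem concrete, here is a toy sketch of two framing strategies reading the same bytes. Neither function is real server code; `body_by_content_length` and `body_by_chunked` are hypothetical stand-ins for a front end that trusts one header and a back end that trusts the other.

```python
# A request that carries both framing headers (the classic ambiguous case).
RAW = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.test\r\n"
    b"Content-Length: 6\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n"
    b"\r\n"
    b"G"
)


def split_head(raw: bytes):
    """Return (headers dict with lowercased names, bytes after the blank line)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, rest


def body_by_content_length(raw: bytes):
    """Frame the body using Content-Length only; return (body, leftover)."""
    headers, rest = split_head(raw)
    n = int(headers[b"content-length"])
    return rest[:n], rest[n:]


def body_by_chunked(raw: bytes):
    """Frame the body using chunked encoding only; return (body, leftover)."""
    _, rest = split_head(raw)
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line.split(b";")[0], 16)
        if size == 0:
            # Assume an empty trailer section: consume its terminating CRLF.
            _, _, rest = rest.partition(b"\r\n")
            return body, rest
        body += rest[:size]
        rest = rest[size + 2:]  # skip chunk data plus its CRLF


cl_view = body_by_content_length(RAW)
te_view = body_by_chunked(RAW)
```

The `Content-Length` reader consumes all six bytes as the body, while the chunked reader sees an empty body and is left holding a stray `G` that it would treat as the start of the next request. That leftover is the smuggled prefix.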
I mean, the details about particular server pairs having issues are great information, as is the fuzzing setup (great use of differential fuzzing), but I think it is more important to be able to take the attack avenues they had success with and run them against your own deployments. Given how many applications run their own stacks internally, there is still a lot of room for potential issues. I can imagine people running with some of these for bounties in the near future.
A brief summary of the manipulations they had some success with is on pages 7-8. If you don't feel like reading it, though, the "headline" version of the list gives you a pretty decent idea of what was having an impact:
Request Line Mutations
- Mangled Method
- Distorted Protocol
- Invalid Version
- Manipulated Termination
- Embedded Request Lines

Request Headers Mutations
- Distorted Header Value
- Manipulated Termination
- Expect Header
- Identity Encoding
- V1.0 Chunked Encoding
- Double Transfer-Encoding

Request Body Mutations
- Chunk-Size Chunk-Data Mismatch
- Manipulated Chunk-Size Termination
- Manipulated Chunk-Extension Termination
- Manipulated Chunk-Data Termination
- Mangled Last-Chunk
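
If you want to try the differential idea against your own stack, the core loop is small: feed the same mutated input to two parsers and flag every case where they disagree. This is only a sketch and not the paper's tooling; the two chunk-size parsers and the candidate list are made up for illustration, with the lenient one leaning on the fact that Python's `int()` tolerates signs, a `0x` prefix, surrounding whitespace, and underscores.

```python
import re

STRICT_HEX = re.compile(rb"[0-9A-Fa-f]+")


def strict_chunk_size(line: bytes):
    """Accept only one or more hex digits; return None for anything else."""
    return int(line, 16) if STRICT_HEX.fullmatch(line) else None


def lenient_chunk_size(line: bytes):
    """Pass the line straight to int(); return None only if int() rejects it."""
    try:
        return int(line, 16)
    except ValueError:
        return None


def disagreements(candidates):
    """Return the inputs on which the two parsers produce different results."""
    return [c for c in candidates
            if strict_chunk_size(c) != lenient_chunk_size(c)]


# Hypothetical mutated chunk-size lines (CRLF already stripped).
CANDIDATES = [b"a", b"0xa", b"+a", b" a", b"1_0", b"zz"]
found = disagreements(CANDIDATES)
```

Here `found` ends up as `[b"0xa", b"+a", b" a", b"1_0"]`: four inputs that one parser rejects and the other happily turns into a chunk size. In a real deployment each of those would be a candidate for a framing disagreement between proxy and back end.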
And a bit of self-promotion, but we talked about this paper on Monday on the DAY podcast I cohost (24m53s - 35m30s): https://www.youtube.com/watch?v=GmOuX8nHZuc&t=1497s
The problem is that people keep all of their authn/authz at the boundary and then, once you're past that, it's a free-for-all.
Every service needs to validate authorization of the request.
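A minimal sketch of what that could look like, assuming each internal service holds a key and re-checks a signed claim itself instead of trusting that the proxy already did. The header names (`x-user`, `x-scope`, `x-auth-tag`), the `SECRET` value, and both functions are hypothetical; a real system would more likely use mTLS or a proper token format.

```python
import hashlib
import hmac
import json

SECRET = b"demo-only-secret"  # assumption: a key provisioned to each service


def sign(user: str, scope: str) -> str:
    """Compute an HMAC tag over an unambiguous encoding of (user, scope)."""
    msg = json.dumps([user, scope]).encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def authorize(headers: dict, required_scope: str) -> bool:
    """Re-validate the caller's claim inside the service, not just at the edge."""
    user = headers.get("x-user", "")
    scope = headers.get("x-scope", "")
    tag = headers.get("x-auth-tag", "")
    expected = sign(user, scope)
    # Constant-time comparison on bytes; reject on scope mismatch as well.
    tag_ok = hmac.compare_digest(tag.encode(), expected.encode())
    return tag_ok and scope == required_scope


good = {"x-user": "alice", "x-scope": "orders:read",
        "x-auth-tag": sign("alice", "orders:read")}
smuggled = {"x-user": "admin", "x-scope": "orders:read"}  # no valid tag

allowed = authorize(good, "orders:read")
blocked = authorize(smuggled, "orders:read")
```

With this in place, a request that was smuggled past the front end with forged identity headers still fails at the service, because it cannot produce a valid tag.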