Tide under the microscope: the secret life of requests
Hi there! Welcome back, it has been some time. This will be a different kind of post, since we will be looking under the hood to learn and understand how tide works. For that purpose, we will examine the life cycle of a request.
Tide is a modular web framework, which means it is built by composing different modules (crates, to be precise) that cooperate to give users the features they expect from a web framework (e.g. listeners, routing, extraction and more).
Setup
Minimal example
So, let's start digging into Tide's design by following a request, and to do that we can create a minimal application.
#[async_std::main]
async fn main() -> tide::Result<()> {
    let mut app = tide::new();
    app.at("/").get(|_req| async {
        Ok("Hi there!")
    });
    app.listen("127.0.0.1:8080").await?;
    Ok(())
}
And check the response
$ curl localhost:8080
Hi there!
Great! We have our minimal application working: a server that is listening for connections on port 8080, accepting HTTP requests and producing responses.
Expanding the main macro
Let's now start to examine the building blocks. First you may notice the #[async_std::main] macro, which allows us to write our main function as async. If we expand the macro we can check how the code looks after expansion:
#![feature(prelude_import)]
#[prelude_import]
use std::prelude::v1::*;
#[macro_use]
extern crate std;
fn main() -> tide::Result<()> {
    async fn main() -> tide::Result<()> {
        {
            let mut app = tide::new();
            app.at("/").get(|_req| async { Ok("Hi there!") });
            app.listen("127.0.0.1:8080").await?;
            Ok(())
        }
    }
    async_std::task::block_on(async { main().await })
}
We can see that our main function is wrapped inside another, non-async main function that runs our code inside an async task, blocking the current thread.
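In other words, we could write roughly the same thing by hand, without the macro:

fn main() -> tide::Result<()> {
    // block_on drives the async code to completion on the current thread.
    async_std::task::block_on(async {
        let mut app = tide::new();
        app.at("/").get(|_req| async { Ok("Hi there!") });
        app.listen("127.0.0.1:8080").await?;
        Ok(())
    })
}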
Creating the app
Back to our code: inside our main fn we are creating a new tide application.
let mut app = tide::new();
We call it app, but the actual type is Server, since new returns a Server.
// lib.rs
#[must_use]
pub fn new() -> server::Server<()> {
    Server::new()
}
And Servers are built up as a combination of state, endpoints and middleware:
// server.rs
pub struct Server<State> {
    router: Arc<Router<State>>,
    state: State,
    (...)
    #[allow(clippy::rc_buffer)]
    middleware: Arc<Vec<Arc<dyn Middleware<State>>>>,
}
Where:

- State is defined by users, and tide makes it available as a shared reference in each request.
- router is the server's routing table, kept behind an Arc.
- middleware allows users to extend the default behavior in both the input (request) and output (response) directions. This field in particular holds a vector behind an Arc.
We will talk about State and middleware in depth in the next post, but later on we will look at how the routing decision is made based on the routing table.
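Just to get a quick feel for State before then, here is a minimal sketch of a server with shared state (a hypothetical visit counter, not part of our example):

use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// State must be Clone + Send + Sync + 'static; the Arc makes the
// counter shared across the cloned copies.
#[derive(Clone)]
struct AppState {
    visits: Arc<AtomicU64>,
}

#[async_std::main]
async fn main() -> tide::Result<()> {
    let mut app = tide::with_state(AppState {
        visits: Arc::new(AtomicU64::new(0)),
    });
    app.at("/").get(|req: tide::Request<AppState>| async move {
        let n = req.state().visits.fetch_add(1, Ordering::Relaxed) + 1;
        Ok(format!("visit #{}", n))
    });
    app.listen("127.0.0.1:8080").await?;
    Ok(())
}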
Adding routes
Our next line does a couple of things by chaining the at and get methods.
app.at("/").get(|_req| async { Ok("Hi there!") });
The at function allows users to add a new route (at a given path) to the router and returns the created Route, allowing chaining.
(You can read the official path and segment definitions in the tide server module documentation.)
The path (e.g. /hello/:name) is composed of zero or more segments; each segment represents a non-empty string separated by / in the path. There are two kinds of segments, concrete and wildcard:

- Concrete: matches exactly the respective part of the path (e.g. /hello).
- Wildcard: extracts and parses the respective part of the path of the incoming request to pass it along to the endpoint as an argument. Wildcard segments come in different flavors:
  - named (e.g. /:name), which creates an endpoint parameter called name (see the sketch below).
  - optional (/*:name), which will match to the end of the given path, no matter how many segments are left, even nothing.
  - unnamed (e.g. /:), where the name of the parameter is omitted to define a path that matches the required structure but whose parameters are not required: a lone : will match a segment, and * will match an entire path.
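For example, a minimal sketch of a named wildcard, read in the endpoint with tide's param accessor:

// ":name" is a named wildcard: its value is extracted from the path
// and exposed to the endpoint as the parameter "name".
app.at("/hello/:name").get(|req: tide::Request<()>| async move {
    let name = req.param("name")?; // "world" for GET /hello/world
    Ok(format!("Hi there, {}!", name))
});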
As we said before, the at method returns a new Route, and if we look at the definition of Route:
// route.rs
pub struct Route<'a, State> {
    router: &'a mut Router<State>,
    path: String,
    middleware: Vec<Arc<dyn Middleware<State>>>,
    prefix: bool,
}
The Route holds a reference to the router, has a path and a vector of middleware to apply. Also, there is a prefix flag used to decide whether strip_prefix should be applied or not.
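That flag matters, for example, when nesting one server under another; if I read the nest API correctly, a sketch looks like this:

// nest mounts a whole inner server under a prefix; the prefix is
// stripped before the inner router sees the path.
let mut app = tide::new();
app.at("/api").nest({
    let mut api = tide::new();
    api.at("/hello").get(|_req| async { Ok("Hi from /api/hello") });
    api
});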
But in our example we use the get method to set the endpoint (in our case, the closure to execute when the request arrives). Let's check that method.
/// Add an endpoint for `GET` requests
pub fn get(&mut self, ep: impl Endpoint<State>) -> &mut Self {
    self.method(http_types::Method::Get, ep);
    self
}
Awesome, tide provides methods for each HTTP verb (e.g. get, post, put, etc.) that internally call the method method with the correct HTTP method type as an argument.
Until now we were always looking at tide's own source code, but these methods use the http-types dependency. This crate provides shared types for common HTTP operations.
Let's also look at how the method function is implemented:
pub fn method(&mut self, method: http_types::Method, ep: impl Endpoint<State>) -> &mut Self {
    if self.prefix {
        let ep = StripPrefixEndpoint::new(ep);
        self.router.add(
            &self.path,
            method,
            MiddlewareEndpoint::wrap_with_middleware(ep.clone(), &self.middleware),
        );
        let wildcard = self.at("*--tide-path-rest");
        wildcard.router.add(
            &wildcard.path,
            method,
            MiddlewareEndpoint::wrap_with_middleware(ep, &wildcard.middleware),
        );
    } else {
        self.router.add(
            &self.path,
            method,
            MiddlewareEndpoint::wrap_with_middleware(ep, &self.middleware),
        );
    }
    self
}
For now let's focus on the else branch, since we don't need to strip any prefix. This function adds the route definition (a path, an HTTP verb and an endpoint) to the router, wrapping the endpoint with the middleware that should be executed. Also, notice that it returns Self (a Route), allowing chaining with other methods.
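Since every verb method returns the route again, several endpoints can be chained on the same path, e.g. for a hypothetical /messages resource:

// Two endpoints on the same path, one per HTTP verb.
app.at("/messages")
    .get(|_req| async { Ok("list messages") })
    .post(|_req| async { Ok("create a message") });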
Great! We have already set up our server (a.k.a. app). At this point we have defined a route that:

- should match the / path and the HTTP GET verb.
- should run the defined endpoint, a closure in our case.
But we are not listening for any connections yet, so let's take a look at how tide allows us to listen.
Listening
The next line in our example app is
app.listen("127.0.0.1:8080").await?;
This line sets the listener and starts listening for incoming connections by awaiting (remember that futures are lazy in Rust). Let's take a look at the listen method.
pub async fn listen<L: ToListener<State>>(self, listener: L) -> io::Result<()> {
    let mut listener = listener.to_listener()?;
    listener.bind(self).await?;
    for info in listener.info().iter() {
        log::info!("Server listening on {}", info);
    }
    listener.accept().await?;
    Ok(())
}
Tide has the concept of a Listener, implemented as an async trait that represents an HTTP transport and built via a to_listener implementation. Out of the box tide provides a TCP listener and a Unix socket listener, but you can create your own.
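For example, to_listener also accepts explicit URL forms; a small sketch, assuming the string formats supported by tide's ToListener implementations:

// Explicit TCP URL form, equivalent to "127.0.0.1:8080":
app.listen("http://127.0.0.1:8080").await?;
// Unix socket form (on Unix platforms):
// app.listen("http+unix:///tmp/app.sock").await?;

The Listener trait itself looks like this: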
#[async_trait]
pub trait Listener<State>: Debug + Display + Send + Sync + 'static
where
    State: Send + Sync + 'static,
{
    async fn bind(&mut self, app: Server<State>) -> io::Result<()>;
    async fn accept(&mut self) -> io::Result<()>;
    fn info(&self) -> Vec<ListenInfo>;
}
The listen fn then calls the bind method of the listener, which starts the listening process by opening the necessary network ports. At this point the ports are open but not yet accepting connections; for that, the listen method calls the accept method of the listener.
Awesome! Now we are running our app and listening for network connections; we can easily check that using the netstat command.
$ netstat -nal| grep 8080
tcp4 0 0 127.0.0.1.8080 *.* LISTEN
Examine
Follow the trace
Now that we have the setup in place and our application running, we can start reviewing the life of a request. Let's start with a simple test:
curl -v localhost:8080/
* Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 8080 failed: Connection refused
* Trying fe80::1...
* TCP_NODELAY set
* Connection failed
* connect to fe80::1 port 8080 failed: Connection refused
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-length: 9
< content-type: text/plain;charset=utf-8
< date: Sun, 07 Mar 2021 15:50:13 GMT
<
* Connection #0 to host localhost left intact
Hi there!
Lots of things happen before we get the Hi there! response, so let's dive in...
First, we want to add the logger and set the log level to Trace:
tide::log::with_level(tide::log::LevelFilter::Trace);
Let's run the app again and make the test request to see the log (leaving the async_io and polling entries out).
tide::log::middleware <-- Request received
method GET
path /
tide::log::middleware --> Response sent
method GET
path /
status 200 - OK
duration 76.45µs
async_h1::server wrote 124 response bytes
async_h1::server discarded 0 unread request body bytes
So, we can see the logs from the middleware and also from async_h1, another dependency crate, used to parse HTTP/1.1. And this is something to note: tide currently supports only HTTP/1.1.
# Forcing HTTP/1.0
curl -v -0 localhost:8080/
* Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 8080 failed: Connection refused
* Trying fe80::1...
* TCP_NODELAY set
* Connection failed
* connect to fe80::1 port 8080 failed: Connection refused
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.0
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
Going deeper
Now it is time to examine how the connection is established and follow the path from the listener to the endpoint.
First, going back to our listener (a TCP listener in our case): remember that we need to call accept in order to start accepting connections, so let's take a look there to see the behavior.
// tcp_listener.rs
async fn accept(&mut self) -> io::Result<()> {
    let server = self
        .server
        .take()
        .expect("`Listener::bind` must be called before `Listener::accept`");
    let listener = self
        .listener
        .take()
        .expect("`Listener::bind` must be called before `Listener::accept`");
    let mut incoming = listener.incoming();
    while let Some(stream) = incoming.next().await {
        match stream {
            Err(ref e) if is_transient_error(e) => continue,
            Err(error) => {
                let delay = std::time::Duration::from_millis(500);
                crate::log::error!("Error: {}. Pausing for {:?}.", error, delay);
                task::sleep(delay).await;
                continue;
            }
            Ok(stream) => {
                handle_tcp(server.clone(), stream);
            }
        };
    }
    Ok(())
}
listener.incoming() returns a stream that we can then loop over, calling next to handle each connection by calling handle_tcp with the server and the stream.
// tcp_listener.rs
fn handle_tcp<State: Clone + Send + Sync + 'static>(app: Server<State>, stream: TcpStream) {
    task::spawn(async move {
        let local_addr = stream.local_addr().ok();
        let peer_addr = stream.peer_addr().ok();
        let fut = async_h1::accept(stream, |mut req| async {
            req.set_local_addr(local_addr);
            req.set_peer_addr(peer_addr);
            app.respond(req).await
        });
        if let Err(error) = fut.await {
            log::error!("async-h1 error", { error: error.to_string() });
        }
    });
}
This spawns a new async task, and inside that task it calls async_h1::accept (the HTTP parser) with the stream to parse and a closure to execute.
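Stripped of the tide specifics, this incoming/spawn pattern boils down to something like the following stand-alone async-std sketch:

use async_std::net::TcpListener;
use async_std::prelude::*; // brings StreamExt::next into scope
use async_std::task;

async fn accept_loop() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:9090").await?;
    let mut incoming = listener.incoming();
    // Iterate over incoming connections, spawning one task per stream.
    while let Some(stream) = incoming.next().await {
        let stream = stream?;
        task::spawn(async move {
            // A real server would parse requests from `stream` and write
            // responses back; here we just drop the connection.
            drop(stream);
        });
    }
    Ok(())
}

So, let's follow this request to see how the parser handles it: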
// async_h1
pub async fn accept<RW, F, Fut>(io: RW, endpoint: F) -> http_types::Result<()>
where
    RW: Read + Write + Clone + Send + Sync + Unpin + 'static,
    F: Fn(Request) -> Fut,
    Fut: Future<Output = http_types::Result<Response>>,
{
    Server::new(io, endpoint).accept().await
}
Internally, async_h1 creates a new instance of its own Server type with the io stream and the endpoint, then calls the accept method of that server and returns the future.
The accept method just loops while the connection is kept alive, calling accept_one.
pub async fn accept(&mut self) -> http_types::Result<()> {
    while ConnectionStatus::KeepAlive == self.accept_one().await? {}
    Ok(())
}
And the accept_one method is the one that decodes the incoming request, reads the body and parses the headers. It then passes the request to the endpoint, and encodes and writes the response.
(...)
let mut res = (self.endpoint)(req).await?;
let bytes_written = io::copy(&mut encoder, &mut self.io).await?;
log::trace!("wrote {} response bytes", bytes_written);
(...)
Nice! We followed the whole path: accepting the connection, decoding, calling the endpoint, encoding and writing the response. We can now go deeper and follow the closure...
One level further
After decoding and parsing the headers, the closure passed to async_h1 is executed:
// tcp_listener.rs
(...)
let fut = async_h1::accept(stream, |mut req| async {
req.set_local_addr(local_addr);
req.set_peer_addr(peer_addr);
app.respond(req).await
})
Now it is time to go deeper into the respond method and see how this request is processed inside tide.
pub async fn respond<Req, Res>(&self, req: Req) -> http_types::Result<Res>
where
    Req: Into<http_types::Request>,
    Res: From<http_types::Response>,
{
    let req = req.into();
    let Self {
        router,
        state,
        middleware,
    } = self.clone();
    let method = req.method().to_owned();
    let Selection { endpoint, params } = router.route(&req.url().path(), method);
    let route_params = vec![params];
    let req = Request::new(state, req, route_params);
    let next = Next {
        endpoint,
        next_middleware: &middleware,
    };
    let res = next.run(req).await;
    let res: http_types::Response = res.into();
    Ok(res.into())
}
respond receives a request; first it needs to figure out which endpoint should be called, based on the path and method of the request.
The router's route method tries different strategies to select the endpoint that should be used, and if none matches the request, a 404 endpoint is called to return NOT FOUND to the client.
pub(crate) fn route(&self, path: &str, method: http_types::Method) -> Selection<'_, State> {
    if let Some(Match { handler, params }) = self
        .method_map
        .get(&method)
        .and_then(|r| r.recognize(path).ok())
    {
        Selection {
            endpoint: &**handler,
            params,
        }
    } else if let Ok(Match { handler, params }) = self.all_method_router.recognize(path) {
        Selection {
            endpoint: &**handler,
            params,
        }
    } else if method == http_types::Method::Head {
        // If it is a HTTP HEAD request then check if there is a callback in the endpoints map
        // if not then fallback to the behavior of HTTP GET else proceed as usual
        self.route(path, http_types::Method::Get)
    } else if self
        .method_map
        .iter()
        .filter(|(k, _)| **k != method)
        .any(|(_, r)| r.recognize(path).is_ok())
    {
        // If this `path` can be handled by a callback registered with a different HTTP method
        // should return 405 Method Not Allowed
        Selection {
            endpoint: &method_not_allowed,
            params: Params::new(),
        }
    } else {
        Selection {
            endpoint: &not_found_endpoint,
            params: Params::new(),
        }
    }
}
Once we have the best matching endpoint, tide uses the Next struct to drive the execution of the middleware chain, including the actual endpoint, and calls run to start processing.
// middleware.rs
impl<State: Clone + Send + Sync + 'static> Next<'_, State> {
    /// Asynchronously execute the remaining middleware chain.
    pub async fn run(mut self, req: Request<State>) -> Response {
        if let Some((current, next)) = self.next_middleware.split_first() {
            self.next_middleware = next;
            match current.handle(req, self).await {
                Ok(request) => request,
                Err(err) => err.into(),
            }
        } else {
            match self.endpoint.call(req).await {
                Ok(request) => request,
                Err(err) => err.into(),
            }
        }
    }
}
Notice that we use handle to execute the middleware and call to run the endpoint. That is because a middleware also receives the next as an argument, so it can either continue by calling the next middleware or break the chain with a response.
#[async_trait]
pub trait Middleware<State>: Send + Sync + 'static {
    /// Asynchronously handle the request, and return a response.
    async fn handle(&self, request: Request<State>, next: Next<'_, State>) -> crate::Result;

    /// Set the middleware's name. By default it uses the type signature.
    fn name(&self) -> &str {
        std::any::type_name::<Self>()
    }
}
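As a quick illustration, here is a minimal sketch of a timing middleware (RequestTimer is a made-up name, not part of tide):

use std::time::Instant;
use tide::{Middleware, Next, Request};

#[derive(Debug)]
struct RequestTimer;

#[tide::utils::async_trait]
impl<State: Clone + Send + Sync + 'static> Middleware<State> for RequestTimer {
    async fn handle(&self, req: Request<State>, next: Next<'_, State>) -> tide::Result {
        let start = Instant::now();
        let res = next.run(req).await; // continue with the rest of the chain
        println!("request took {:?}", start.elapsed());
        Ok(res)
    }
}

It would be registered with app.with(RequestTimer);.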
Awesome! We have followed the request all the way to the call of the endpoint.
// endpoint.rs
#[async_trait]
pub trait Endpoint<State: Clone + Send + Sync + 'static>: Send + Sync + 'static {
    /// Invoke the endpoint within the given context
    async fn call(&self, req: Request<State>) -> crate::Result;
}
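Our closure qualifies as an Endpoint thanks to a blanket implementation for async functions, which (roughly, paraphrasing tide's endpoint.rs) looks like this:

// Any Fn taking a Request and returning a future of a Result whose
// value converts into a Response is an Endpoint.
#[async_trait]
impl<State, F, Fut, Res> Endpoint<State> for F
where
    State: Clone + Send + Sync + 'static,
    F: Send + Sync + 'static + Fn(Request<State>) -> Fut,
    Fut: Future<Output = crate::Result<Res>> + Send + 'static,
    Res: Into<Response>,
{
    async fn call(&self, req: Request<State>) -> crate::Result {
        let fut = (self)(req);
        let res = fut.await?;
        Ok(res.into())
    }
}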
Now the response is sent back to the client!
That's all for today. We followed the code (and crates) that allow tide to accept connections, decode and parse the request, decide the best endpoint (route) to use, execute the middleware chain and call the endpoint. There are still lots of topics to cover, like body parsing, parameter extraction and middleware execution in both directions (input/output). In the next notes we will start covering some of those topics.
As always, I write this as a learning journal, so there could be errors or misunderstandings; any feedback is welcome.
Thanks!