This is a personal note on how to inspect async streams in order to hash their
contents without having to store the data and read it a second time. This might
become the start of a series of little code snippets that might be interesting
to others as well.
Code
As an example, let's fetch a URL and compute the SHA-256 hash of its contents:
```rust
use anyhow::anyhow;
use futures::StreamExt;
use sha2::Digest;
use std::path::PathBuf;

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), anyhow::Error> {
    let arg = std::env::args()
        .nth(1)
        .ok_or(anyhow!("no URL given"))?;

    let url = url::Url::parse(&arg)?;
    let path = PathBuf::try_from(url.path())?;

    let filename = path
        .file_name()
        .ok_or(anyhow!("URL does not contain a filename"))?;

    let mut hasher = sha2::Sha256::new();

    let stream = reqwest::get(url)
        .await?
        .bytes_stream()
        // Feed each chunk into the hasher as it passes through the stream …
        .inspect(|bytes| {
            if let Ok(bytes) = bytes {
                hasher.update(bytes);
            }
        })
        // … and map the reqwest error to std::io::Error as required by StreamReader.
        .map(|chunk| chunk.map_err(|err| std::io::Error::new(std::io::ErrorKind::Other, err)));

    let mut reader = tokio_util::io::StreamReader::new(stream);
    let mut file = tokio::io::BufWriter::new(tokio::fs::File::create(filename).await?);

    tokio::io::copy(&mut reader, &mut file).await?;

    let sum = hasher.finalize();
    println!("{} {:?}", hex::encode(&sum), filename);

    Ok(())
}
```
Cargo.toml dependencies
```toml
[dependencies]
anyhow = "1.0.80"
futures = "0.3.30"
hex = "0.4.3"
reqwest = { version = "0.11.24", features = ["stream"] }
sha2 = "0.10.8"
tokio = { version = "1.36.0", features = ["fs", "io-util", "macros", "rt"] }
tokio-util = { version = "0.7.10", features = ["io"] }
url = "2.5.0"
```
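The snippet can then be exercised against any URL whose path ends in a filename
(the URL below is just a placeholder):

```shell
$ cargo run -- https://example.com/archive.tar.gz
```

This downloads the file into the current directory and prints the SHA-256 hash
followed by the filename.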
OpenCL has been with me for more than a decade, back since we decided to use it
in our research project as the foundation for accelerating synchrotron imaging.
Now, as history has shown, OpenCL never really took off, partially because
Apple (the initial sponsor) dropped it but, more importantly, because NVIDIA
was very successful in locking people into its proprietary CUDA solution.
Nevertheless, all major GPU vendors support it to some degree, so software can
be accelerated in a somewhat portable way. The degree to which AMD supports it
is somewhat questionable though: they offer OpenCL either via their open ROCm
stack, but only for select GPUs and with short support windows, or via their
proprietary amdgpu-pro packages. The latter is what I use today to enable
OpenCL in Darktable, but it is a hack because it involves downloading Debian
packages from their website and extracting them correctly.
Fast forward to 2022: Rust is on its way to becoming the premier systems
language, and heroes like Karol Herbst have started writing OpenCL mesa
drivers, completely alleviating the need for the crap AMD is offering (well,
almost). Because building and using it is not very straightforward at the
moment, here are some hints on how to do that. I am assuming an older Ubuntu
20.04 box, so some things could already be in the 22.04 repos.
Add the LLVM apt repos

```
deb http://apt.llvm.org/focal/ llvm-toolchain-focal-15 main
deb-src http://apt.llvm.org/focal/ llvm-toolchain-focal-15 main
```

to /etc/apt/sources.list and run apt update. Then install the toolchain with

```shell
$ apt install clang-15 libclang-15-dev llvm-15 llvm-15-dev llvm-15-tools
```
Ubuntu 20.04 comes with a pretty old version of meson, so let's create a
virtualenv and install it along with mako, which is used by mesa itself:

```shell
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip3 install meson mako
```
We also need bindgen to bind to C functions, but luckily the bindgen program
is sufficient and can be installed easily with

```shell
$ cargo install bindgen-cli
```
Build rusticl
At the moment, the radeonsi changes are not yet merged into the main branch,
hence

```shell
$ git remote add superhero https://gitlab.freedesktop.org/karolherbst/mesa.git
$ git fetch superhero
$ git checkout -t superhero/rusticl/si
```
For some reason, rusticl won't build with the LLVM 15 libraries as is, and we
have to add clangSupport to src/gallium/targets/opencl/meson.build as yet
another Clang library to link against in order to resolve some RISC-V symbols.
It's time to configure the build with meson:
```shell
$ meson .. -Dgallium-rusticl=true -Dllvm=enabled -Drust_std=2021 -Dvalgrind=disabled
```

Note that meson does not check for the existence of Valgrind on Ubuntu and
enables it by default, causing build errors when the development libraries are
not installed. Time to build and install using ninja:

```shell
$ ninja && ninja install
```
Running OpenCL programs
I tend to install mesa into a custom prefix and pre-load it with my old shell
script. In order to have the system-wide ICD loader find the ICD that points to
rusticl, we have to set the OPENCL_VENDOR_PATH environment variable to the
directory containing the .icd, i.e. <some-prefix>/etc/OpenCL/vendors. We also
have to set the RUSTICL_ENABLE environment variable to radeonsi because it is
not enabled by default yet. With that set, clinfo should show a platform with
the name rusticl.
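Put together, and with <some-prefix> standing in for the actual installation
prefix, a session could look like this (a sketch, assuming the default vendors
directory layout):

```shell
# point the ICD loader at the rusticl .icd and enable the radeonsi backend
export OPENCL_VENDOR_PATH="<some-prefix>/etc/OpenCL/vendors"
export RUSTICL_ENABLE=radeonsi
clinfo | grep -i rusticl
```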
Setting up rust-analyzer
If you intend to dig into rusticl itself, you will notice that this is not your
bog-standard Cargo project but one intertwined with meson, which takes care of
building the majority of the C and C++ sources. Because of this, rust-analyzer
is not able to figure out the structure of the rusticl project. Luckily, meson
0.64 produces a rust-project.json file that describes the structure, but
unfortunately the paths in there seem to be a bit messed up. After symlinking
it from the root of the Git repo (so rust-analyzer can find it) and changing
the paths to point to existing directories, rust-analyzer was able to make
sense of the project.
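Assuming the meson build directory is called build (an assumption; use
whatever directory was actually configured), the symlink step from the
repository root might look like this:

```shell
# symlink meson's generated rust-project.json into the repository root,
# where rust-analyzer looks for it
ln -s build/rust-project.json rust-project.json
```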
It has been almost a month already since I released the first major breaking
release of my minimalist pastebin. The main reason to bump the major version
was streamlining the routes, especially dropping the /api ones and adding query
parameters where it made sense. Between my last post and version two, there
have been many other non-breaking changes like correct caching (of course …),
more keybinds, a better-looking user interface, minor fixes and a demo site
hosted here.
Currently, I am preparing everything to make the move to the upcoming breaking
0.6 release of axum. But more importantly, I am investigating ideas on how to
get rid of syntect, the syntax highlighting library. My main issue with that
library is that themes have to be in the Sublime Text theme format, which
leaves a lot of nice light/dark themes on the table. My current approach is a
tree-sitter based library that bundles a bunch of tree-sitter grammars and uses
helix themes to highlight the parsed names. While it works alright,
distributing it as a crate is a pain in the ass because only a fraction of
grammar authors keep publishing updated grammars on crates.io. So, the next
idea is perhaps bundling them via Git submodules. Let's see.
Pastebins are the next step in the evolution of a software developer, right
after finishing hello worlds and static site generators. They are limited in
terms of features (or not … ahem) but require some form of dynamism in order to
receive and store user input and make it available upon request. Of course,
everyone has different ideas about what a pastebin should do and in what
language it should be written. And because I am in no way different, I had to
write my own: the wastebin pastebin that ticks the following boxes:
- Written in Rust for ease of deployment.
- SQLite instead of a full-fledged database server or flat files.
- Paste expiration.
- Minimalist appearance.
- Syntax highlighting.
- Line numbers.
bin – from which wastebin takes huge inspiration in terms of UI – was almost
there, but the lack of expiration and the flat-file storage were a no-go.
Moreover, I sincerely think axum has a more solid foundation than Rocket.
Enough reasons to do it myself.
One of Rust's nice properties is producing statically linked binaries, making
deployment simple and straightforward. In some cases this is not enough and
additional data is required for proper function, for example static assets for
web servers. With dependencies such as include_dir and mime_guess, this is a
piece of cake to integrate into axum though.
Using include_dir, we first declare a variable that represents the data
currently located in the static directory:
```rust
use include_dir::{include_dir, Dir};

static STATIC_DIR: Dir<'_> = include_dir!("$CARGO_MANIFEST_DIR/static");
```
Now we define the static data route, passing *path to denote that we want to
match the entire remaining path:
```rust
use axum::routing::get;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let app = axum::Router::new().route("/static/*path", get(static_path));

    let addr = std::net::SocketAddr::from(([0, 0, 0, 0], 3000));

    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await?;

    Ok(())
}
```
Note that we cannot use the newly added typed path functionality in
axum-extra. Now onto the actual route handler:
```rust
use axum::body::{self, Empty, Full};
use axum::extract::Path;
use axum::http::{header, HeaderValue, StatusCode};
use axum::response::{IntoResponse, Response};

async fn static_path(Path(path): Path<String>) -> impl IntoResponse {
    let path = path.trim_start_matches('/');
    let mime_type = mime_guess::from_path(path).first_or_text_plain();

    match STATIC_DIR.get_file(path) {
        // Unknown path: respond with an empty 404.
        None => Response::builder()
            .status(StatusCode::NOT_FOUND)
            .body(body::boxed(Empty::new()))
            .unwrap(),
        // Found: respond with the guessed MIME type and the embedded contents.
        Some(file) => Response::builder()
            .status(StatusCode::OK)
            .header(
                header::CONTENT_TYPE,
                HeaderValue::from_str(mime_type.as_ref()).unwrap(),
            )
            .body(body::boxed(Full::from(file.contents())))
            .unwrap(),
    }
}
```
As you can see, we first strip the leading slash and then use the mime_guess
crate to guess a MIME type from the path. If we are not able to do so, we just
assume text/plain, however wrong that may be. Then we try to locate the file
and either return a 404 or a 200 with the actual file contents. Easy as pie.
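To sanity-check the route, assuming the server is running locally and a file
like static/style.css was embedded (both assumptions), one could inspect the
response with curl:

```shell
$ curl -i http://localhost:3000/static/style.css
```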