Hello Hooman 👋 !

I'm pretty sure you got here by mistake: this is a personal site full of notes that I use mainly as a brain dump, the result of a huge number of sleepless nights navigating the internet.

As such, it may change a lot, so you should assume its content is completely volatile (including the GitHub repo you can find in the upper right corner of the page, since I have a thing for rebases).

In the future I may consider some parts of it as worth publishing and I will probably dedicate a subdomain of toniogela.dev to them.

If you're in the mood to chat, or want to suggest that I persist specific parts of this mess, ping me on Twitter.

Learning Resources

Books

Blogs

Video & Talks

Scala Macros Resources

Mostly Scala 2 material, as it's difficult to find

Videos

Documentation

Code

Presentations

Blog

Papers

Scala 3

Reading List

Generic

Cats Effects


Getting Started with Rust

Why the developers who use Rust love it so much - from StackOverflow survey, really good quotes

If you want a Rust REPL, check out evcxr.

I highly recommend rust-analyzer to support fast compile checks, references, refactorings, etc. in your editor. VSCode works pretty well - install rust-analyzer and the "Even Better TOML" extension and you should be set.

Easy short intros:

Online resources and help:

Speed without wizardry - how using Rust is safer and better than using hacks in Javascript

Dealing with strings is confusing in Rust, because there are two types: a heap-allocated String and a pointer to a slice of String bytes: &str. Knowing what to use, and defining structures on them, immediately exposes the steep learning curve of ownership.

See the Guide to Strings for some help.
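To make the String vs &str split a bit more concrete, here is a minimal sketch. The names first_word, Owned and Borrowed are made up for illustration; the point is only that a borrowing function can take &str for both kinds of string, while a struct holding &str needs a lifetime parameter and a struct holding String does not.

```rust
/// Borrowing API: accepts both `&String` (via deref coercion) and `&str`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Owning struct (hypothetical): stores a `String`, so no lifetime parameter.
struct Owned {
    name: String,
}

/// Borrowing struct (hypothetical): stores a `&str`, so it needs a lifetime
/// and cannot outlive the string it points into.
struct Borrowed<'a> {
    name: &'a str,
}

fn main() {
    let heap: String = String::from("hello world"); // heap-allocated, growable
    let literal: &str = "static text"; // a view into data baked into the binary

    println!("{}", first_word(&heap)); // &String coerces to &str
    println!("{}", first_word(literal));

    let owned = Owned { name: heap.clone() }; // pays for a copy, owns it
    let borrowed = Borrowed { name: &heap[0..5] }; // no copy, tied to `heap`
    println!("{} / {}", owned.name, borrowed.name);
}
```

A common rule of thumb (not a hard rule) is to take &str in function arguments and store String in long-lived structs until profiling says otherwise.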

Specific topics:

Borrowing and Lifetime Tricks

If you need to borrow multiple items mutably from a Vec/array/SmallVec/etc.:
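One standard-library option (an illustration, not necessarily what this note originally pointed to) is slice::split_at_mut, which splits one mutable slice into two non-overlapping halves so the borrow checker can see the two references are disjoint. The helper name pair_mut is hypothetical.

```rust
/// Returns mutable references to two distinct elements of a slice.
/// Panics if `i == j` or if either index is out of bounds.
fn pair_mut<T>(items: &mut [T], i: usize, j: usize) -> (&mut T, &mut T) {
    assert!(i != j, "indices must differ");
    if i < j {
        // left = items[..j], right = items[j..]; element j is right[0]
        let (left, right) = items.split_at_mut(j);
        (&mut left[i], &mut right[0])
    } else {
        // left = items[..i], right = items[i..]; element i is right[0]
        let (left, right) = items.split_at_mut(i);
        (&mut right[0], &mut left[j])
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4];
    let (a, b) = pair_mut(&mut v, 0, 3);
    std::mem::swap(a, b);
    println!("{:?}", v);
}
```

Since Vec, arrays and SmallVec all deref to slices, the same helper should work for each of them.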

If you have a Trait with an associated type that must deal with lifetimes: https://stackoverflow.com/questions/33734640/how-do-i-specify-lifetime-parameters-in-an-associated-type
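One pattern that is commonly used for this (a sketch under my own assumptions, which may or may not match the linked answer exactly) is to implement the trait for a reference type, so the lifetime comes from the impl header and can be named in the associated types. IntoItems and Bag are hypothetical names.

```rust
/// A hypothetical trait whose associated types may need to borrow from `Self`.
trait IntoItems {
    type Item;
    type Iter: Iterator<Item = Self::Item>;
    fn items(self) -> Self::Iter;
}

struct Bag {
    values: Vec<u32>,
}

// Implementing for `&'a Bag` (not `Bag`) brings `'a` into scope,
// so the associated types are allowed to mention it.
impl<'a> IntoItems for &'a Bag {
    type Item = &'a u32;
    type Iter = std::slice::Iter<'a, u32>;

    fn items(self) -> Self::Iter {
        self.values.iter()
    }
}

fn main() {
    let bag = Bag { values: vec![1, 2, 3] };
    let total: u32 = (&bag).items().sum();
    println!("total = {total}");
}
```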

Macros

I started writing Rust macros and it is not only lots of fun, but pretty essential for writing concise, performant code IMO. Writing Rust has lots of boilerplate sometimes, especially owing to not having real inheritance. I recommend starting with macro_rules! which are fancy templates and really easy. Here are some links to help:

Some crates that may help write macros:

  • spez - match and specialize on the type of an expression. "A trick to do specialization in non-generic contexts on stable Rust"
  • concat-ident - macro to concat multiple identifiers etc. and use the result, perhaps as a struct or method name. Very useful in macros
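To give a feel for macro_rules! as "fancy templates", here is a small sketch that stamps out newtype wrappers plus their From impls. The macro name newtype and the generated types UserId and UserName are invented for this example.

```rust
/// Generates a tuple-struct wrapper and a `From` conversion for each entry.
macro_rules! newtype {
    ($($name:ident($inner:ty)),+ $(,)?) => {
        $(
            #[derive(Debug, Clone, PartialEq)]
            pub struct $name(pub $inner);

            impl From<$inner> for $name {
                fn from(value: $inner) -> Self {
                    $name(value)
                }
            }
        )+
    };
}

// One line per type instead of a struct + impl block each.
newtype!(UserId(u64), UserName(String));

fn main() {
    let id: UserId = 42.into();
    let name = UserName::from(String::from("ferris"));
    println!("{:?} {:?}", id, name);
}
```

The repetition syntax `$( ... ),+` is what replaces the copy-paste: everything inside `$( ... )+` in the body is emitted once per matched entry.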

Cool Rust Projects

NOTE: there's a separate section for Data-related projects.

CLI tools:

  • XSV - a fast CSV parsing and analysis tool
  • zoxide - a supercharged replacement for cd with rank-based search of your most frequently used dirs
    • mcfly - Upgraded, smarter Ctrl-R for bash etc. (note: fish users already have this built in, basically)
  • Ripgrep - insanely fast grep utility, great for code searches. Shows off power of Rust regex library
  • Bat - A super cat with syntax highlighting, git integration, other features
  • Bottom - Cross-platform fancy top in Rust - process/sys mon with graphs, very useful!
  • gitui - awesome, fast Git terminal UI. It will change your life!
  • skim - sk is a general purpose fuzzy-finder; it can work with ripgrep and other utils too
  • zellij - terminal mux/session detach like tmux/screen, but with a pretty UI and plugins
  • pueue - instead of using tmux, queue and manage your background tasks
  • xh - HTTPie clone / much better curl alternative
  • Dust - Rust graphical-text faster and friendlier version of du
  • Diskonaut - another Text-UI folder/file space usage and browsing tool
  • fd - Rust CLI, friendlier and faster replacement for find
  • rustscan - Really fast port scanner, this should easily replace lsof / netstat
  • sd - Easier to use sed. You can search and replace in like all files under subdir with sd old_str new_str **.
  • Nushell - Rust shell that turns all output into tabular data. Pretty cool!
  • delta - git-delta: colorful git diff viewer
  • ruplacer - Source code search and replace tool
  • imagecli - CLI for image batch processing
  • Hyperfine - Rust performance benchmarking CLI
  • Alacritty - GPU accelerated terminal emulator
  • jql - Rust version of popular jq JSON CLI processor, though not as powerful
  • rq - a Record Query/Transform tool, translate CSV, Avro, CBOR, Json etc etc to and from each other
  • htmlq - like jq but for HTML
  • Starship - "The minimal, blazing-fast, and infinitely customizable prompt for any shell!"
  • Kubesql - SQL queries for Kube metadata!
  • grex - CLI tool to create regexes given a set of strings to match!
  • Scaphandre - Metrics agent for collecting power consumption metrics!
  • kdash - Text UI Kubernetes dashboard
  • Josh - Cool git proxy allows you to treat part of a large monorepo like its own smaller git repo!

Wasm:

  • Wasmer - general purpose WASM runtime
  • Krustlet - WebAssembly (instead of containers) runtime on Kubernetes!! Use Rust + wasm + WASI for a truly portable k8s-based deploy!
  • Extism - a universal WASM-based plugin system, multi language, but written in Rust
  • lunatic - Erlang-like server side WASM runtime with supervision and channel-based message passing, plus hot reloading!
  • CosmWasm - Rust/WASM for programming smart contract on Cosmos ecosystem

Others:

  • TabNine - an ML-based autocompleter, written in Rust
  • kiro - a CLI text editor with syntax highlighting, like a friendlier vim
  • ox - another CLI/Text UI lightweight text editor
  • async-std - the standard library with async APIs
  • Convey - Layer 4 load balancer
  • Ockam - End to end secure messaging lib/platform between cloud and IoT devices
  • Parsec - abstraction layer for hardware security and cryptography
  • Gazebo - useful utilities for all apps, by the Facebook Rust team. They also have blog posts such as on Dupe

Do Rust in Turkish, Spanish and other languages! :)

Languages etc.

Rust Error Handling

Error handling survey - really good summary of the Rust error library landscape as of late 2019.

  • Anyhow - streamlined error handling with context....
  • Snafu - adding context to errors
  • Error-stack - I really like the philosophy behind this crate. It makes it easy to "stack" errors - you get not a backtrace, but stacked detailed errors, with the inner error showing through.
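For reference, the "context on top of an inner error" idea that these crates streamline can be sketched with the standard library alone, via Error::source. This is not how any of the crates above are implemented; ConfigError, parse_port and render_chain are hypothetical names.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// An outer error that adds context and keeps the inner error as its source.
#[derive(Debug)]
struct ConfigError {
    key: String,
    source: ParseIntError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid value for config key `{}`", self.key)
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Parses a port number, wrapping any parse failure with the key name.
fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    raw.trim().parse::<u16>().map_err(|source| ConfigError {
        key: "port".to_string(),
        source,
    })
}

/// Walks the `source()` chain and joins every layer into one message.
fn render_chain(err: &dyn Error) -> String {
    let mut parts = vec![err.to_string()];
    let mut current = err.source();
    while let Some(inner) = current {
        parts.push(inner.to_string());
        current = inner.source();
    }
    parts.join(": ")
}

fn main() {
    match parse_port("80a") {
        Ok(port) => println!("port = {port}"),
        Err(err) => println!("{}", render_chain(&err)),
    }
}
```

The boilerplate per error type is exactly what the crates in this list try to remove.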

Rust Concurrency

Shared Data Across Multiple Threads

Sometimes one needs to share a large data structure across threads and several of them must access it.

The most general way to share a data structure is to use Arc<RwLock<...>> or Arc<Mutex<...>>. The Arc keeps the data alive for as long as any thread still holds a handle, which lets different threads exist for different lengths of time, and it is inexpensive since its reference count is usually only touched when it is cloned at thread spawn. The Mutex or RwLock lets different threads mutate the data safely, assuming the data structure is not itself thread-safe.
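A minimal sketch of the Arc<RwLock<...>> pattern, using only the standard library. The function name fill_shared_map and the choice of a HashMap are just for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Spawns `writers` threads that each insert one entry into a shared map,
/// then returns the number of entries once all threads have finished.
fn fill_shared_map(writers: usize) -> usize {
    let shared: Arc<RwLock<HashMap<usize, String>>> =
        Arc::new(RwLock::new(HashMap::new()));

    let handles: Vec<_> = (0..writers)
        .map(|id| {
            // The Arc is cloned once per thread, at spawn time.
            let map = Arc::clone(&shared);
            thread::spawn(move || {
                // Write lock: exclusive access while mutating.
                map.write().unwrap().insert(id, format!("worker-{id}"));
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Read lock: many readers could hold this at the same time.
    let len = shared.read().unwrap().len();
    len
}

fn main() {
    println!("entries: {}", fill_shared_map(4));
}
```

Swapping RwLock for Mutex only changes `write()`/`read()` into `lock()`; RwLock tends to pay off when reads heavily outnumber writes.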

A thread-safe data structure could be used in place of the RwLock or Mutex.

Scoped threads could be used if only one owner will mutate the data structure, and one wants to share immutable refs with other threads for reading. Plain spawned threads do not work for this: rustc by itself has no way of proving the lifetime of a thread or when it will be joined, so immutable refs created from the owner thread cannot be shared with it due to rustc lifetime checks. Scoped threads, such as the ones in the Crossbeam crate, are a way around that, as they give rustc a guarantee that the threads will be joined before the owner goes away.
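Since Rust 1.63 the standard library also ships scoped threads as std::thread::scope, so a sketch of the idea no longer needs an external crate. The function parallel_sum and its chunking scheme are made up for this example.

```rust
use std::thread;

/// Sums a slice by splitting it across scoped threads that borrow it immutably.
fn parallel_sum(data: &[u64], chunk_size: usize) -> u64 {
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_size.max(1)) // chunks(0) would panic
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();
        // Every scoped thread is joined before `scope` returns, which is
        // what lets rustc accept the borrowed `chunk` references.
        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let mut data: Vec<u64> = (1..=100).collect();
    let total = parallel_sum(&data, 10);
    // The owner can mutate again once the scope has ended.
    data.push(total);
    println!("total = {total}, len = {}", data.len());
}
```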

Arc-swap is an alternative to Arc that is designed for occasional updates - enables atomic swapping of the object underneath the Arc, and allows one to read without contention (unlike Mutex/RwLock).

Also see beef - a leaner version of Cow.

Data Processing and Data Structures

  • Are we learning yet? - list of ML Rust crates

    • Linfa - Rust ML framework
  • Timely Dataflow - distributed data-parallel compute engine in Rust!!!

  • Hydroflow - a brand new Rust based optimized streaming dataflow engine, relational data, based on very advanced UCBerkeley research on optimization.

  • DataFusion - a Rust query engine which is part of Apache Arrow!

    • NOTE: there is now a Ballista project that is basically like Spark - distributed Data Fusion.
  • Amadeus - distributed streams / Parquet / big data processing

  • Fluvio - distributed, persistent queuing / stream processing framework using WASM for programmability, written in Rust!

  • Arroyo - another stream processing framework, streaming SQL and Rust pipelines!

  • Weld - Stanford's high-performance runtime for data analytics

  • Cleora - Super fast Rust tool for billion-scale hypergraph vector embedding ML

  • Node crunch - simple lightweight distributed compute framework

  • Project Midas - distributed compute framework and terminal UI using Lua as scripting language

  • Cube Store - Rust and Arrow/DataFusion-based rollup/aggregation/cache layer for SQL datastores, too bad it's mostly for JS

  • Noria - "data-flow for high-performance web apps" - basically a materialized view cache that updates in real time as database data updates

  • polars - super fast and high level DataFrame implementation for both Rust and Python, much faster and higher level than using Arrow itself

  • Bagua - distributed learning/training framework, the very fast communication core is written in Rust

  • Similari - similarity search/computation engine for ML in Rust

  • Toshi - ElasticSearch written in Rust using Tantivy as the engine

  • MeiliDB - fast full-text search engine

  • Quickwit - Log search DB, like Elastic but built on top of Tantivy

  • Datafuse - distributed "Real-Time Data Processing & Analytics DBMS", similar to Clickhouse "but faster"

  • sonic - Fast, very lightweight and schemaless search/text index. NOT a document store, but an index store.

  • Sanakirja - a transactional KV DB engine/local store, claims to be fastest around

  • Sled - an embedded database engine using latch-free Bw-tree on latch-free page cache techniques for speed

  • Skytable - Rust "realtime NoSQL" key-value database

  • IOx - New in-memory columnar InfluxDB engine using Arrow, DataFusion, rust! Persists using parquet. Super awesome stuff.

  • IndraDB - Graph database/library written in Rust! and inspired by Facebook's TAO.

  • TerminusDB-store - a Rust RDF triple data store

  • BonsaiDB - NoSQL document store written in Rust with Rust schemas

  • Vector - high performance observability data pipeline, for transforming, aggregating, routing logs, metrics, traces, etc.

  • Tremor - a simple event processing / log and metric processing and forwarding system, with scripting and streaming query support. Much more capable than Telegraf.

  • MinSQL - interesting POC on lightweight SQL based log search, w automatic field parsing etc.

  • pq - Parse and Query log files as time series, extracting structured records out of common log files

  • plotters - Rust data visualization / graphing library

  • Stateright - distributed protocol/model checker with UI, linearizability checker!

  • Clepsydra - Graydon Hoare working on distributed database protocol - in Rust!

  • crepe - Datalog, declarative logic programs as macros in Rust

JSON Processing

For JSON DOM (IR) processing, using the mimalloc allocator provided me a 2x speedup with serde-json. Then, switching to json-rust provided another 1.8x speedup. The speedup is completely unreal, much faster than JVM. The main reason I guess is that json-rust has a Short DOM class for short strings, which requires no heap allocation.

  • simdjson-rs - SIMD-enabled JSON parser. NOTE: no writing of JSON.
  • pjson - JSON streaming parser
  • streamson - efficient JSON processing for large documents

Cool Data Structures

  • leapfrog - fast, concurrent HashMap, lock-free if types support atomic ops.

    • What's neat about its API is that instead of locking at bucket level, and blocking inserts if a reader is taking too long, it never returns references to data and relies on an atomic API
  • concread - Concurrently Readable (Copy on Write, MVCC) datastructures - "allow multiple readers with transactions to proceed while single writers can operate" - guaranteeing readers same version. There is a hashmap and ARCache.

  • flurry - Rust impl of Java's ConcurrentHashMap. Uses seize for ref-count-based GC.

  • im - Immutable data structures for Rust

  • rust-phf - generate efficient lookup tables at compile time using perfect hash functions!

  • odht - "hash table that can be mapped from disk into memory without need for up-front decoding" - deterministic binary representation, and platform and endianness independent. Sounds sweet!

  • radix-trie

  • Patricia Tree - Radix-tree based map for more compact storage

  • probabilistic-collections - Bloom/Cuckoo/Quotient filters, CountMinSketch, HyperLogLog, streaming approx set membership, etc.

  • priq - "blazing fast" priority queue built using arrays

  • Using Finite State Automata and Rust to quickly index and find data amongst HUGE amount of strings

  • ahash - this seems to be the fastest hash algo for hash keys

  • Metrohash - a really fast hash algorithm

  • IndexMap - O(1) obtain by index, iteration by index order

  • FM-Index, a neat structure that allows for fast exact string indexing and counting while compressing original string data at the same time. There is a Rust crate

  • Heapless - static data structures with fixed size; Vec, heap, map, set, queues

  • dashmap - "Blazing fast concurrent HashMap for Rust". NOTE: I don't recommend this project, I used it in my Ying profiler but it can deadlock in unpredictable ways

  • Easy Persistent Data Structures in Rust - replacing Box with Rc

  • VecMap - map for small integer keys, may use less space
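The "replacing Box with Rc" idea from the persistent data structures entry above can be sketched as an immutable linked list whose tails are shared between versions. This is my own minimal illustration, not code from the linked post; List, push_front and to_vec are hypothetical names.

```rust
use std::rc::Rc;

/// An immutable singly linked list whose tails are shared via `Rc`.
#[derive(Debug)]
enum List<T> {
    Nil,
    Cons(T, Rc<List<T>>),
}

/// Returns a new list with `value` in front; `tail` is shared, not copied.
fn push_front<T>(tail: &Rc<List<T>>, value: T) -> Rc<List<T>> {
    Rc::new(List::Cons(value, Rc::clone(tail)))
}

/// Collects the elements front to back.
fn to_vec<T: Clone>(list: &Rc<List<T>>) -> Vec<T> {
    let mut out = Vec::new();
    let mut node = list;
    while let List::Cons(value, next) = &**node {
        out.push(value.clone());
        node = next;
    }
    out
}

fn main() {
    let empty = Rc::new(List::Nil);
    let base = push_front(&push_front(&empty, 3), 2);
    // Both versions reuse the nodes of `base` instead of copying them.
    let a = push_front(&base, 1);
    let b = push_front(&base, 10);
    println!("{:?} {:?} {:?}", to_vec(&base), to_vec(&a), to_vec(&b));
}
```

With Box each node has exactly one owner, so two versions could not share a tail; Rc trades a reference count per node for that sharing.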

Geospatial and Graph

  • The base Geometry processing crate is geo.

    • Geo does not (as of 0.18) handle intersections, difference, XOR etc. Try geo-booleanop for a Rust-only implementation using Martinez-Rueda algorithm
    • Or use geos based on the C library
  • spatial-join - Spatial joins and proximity maps!

  • Rstar - n-dimensional R*-Tree for geospatial indexing and nearest-neighbor

  • spade - R-trees and Delaunay triangulations

  • Hora Search - Nearest-Neighbor (NN) / geo search library that includes multiple algorithms including HNSW, SSG, PQIVF, etc.

  • Petgraph - Graph data structure for Rust, considered perhaps most mature right now

String Performance

Rust has native UTF8 string processing, which is AWESOME for performance. However, there are two concerns usually:

  1. Small string memory efficiency. The native String type uses three words just for pointer, length and capacity, which might be longer than the string itself;
  2. Minimizing number of heap allocations

Here are some solutions:

  • String - string type with configurable byte storage, including stack byte arrays!
  • Inlinable String - stores strings up to 30 chars inline, automatic promotion to heap string if needed.
  • Also see smallstr
  • flexstr - Enum String type to unify literals, inlined, and heap strings
  • kstring - intended for map keys: immutable, inlined for small keys, and have Ref/Cow types to allow efficient sharing. :)
  • nested - reduce Vec type structures to just two allocations, probably more memory efficient too.
  • tinyset - space efficient sets and maps, can be combined with nested perhaps
  • bumpalo can do really cheap group allocations in a Bump and has custom String and Vec versions. At least lowers allocation overhead.
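To see where the overhead comes from, and what "inlining" means in the crates above, here is a toy sketch. SmallStr and its 22-byte inline buffer are invented for illustration and are far cruder than the real crates; the only claim is the general shape of the technique.

```rust
use std::mem::size_of;

/// A toy small-string type: short strings live inline, long ones on the heap.
/// Hypothetical, for illustration only.
enum SmallStr {
    Inline { len: u8, buf: [u8; 22] },
    Heap(String),
}

impl SmallStr {
    fn new(s: &str) -> Self {
        if s.len() <= 22 {
            let mut buf = [0u8; 22];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SmallStr::Inline { len: s.len() as u8, buf }
        } else {
            SmallStr::Heap(s.to_string())
        }
    }

    fn as_str(&self) -> &str {
        match self {
            // The bytes were copied from a valid &str, so they are valid UTF-8.
            SmallStr::Inline { len, buf } => {
                std::str::from_utf8(&buf[..*len as usize]).unwrap()
            }
            SmallStr::Heap(s) => s.as_str(),
        }
    }

    /// True when the string needed no heap allocation.
    fn is_inline(&self) -> bool {
        matches!(self, SmallStr::Inline { .. })
    }
}

fn main() {
    // Fixed overhead of the handle types, before any character data.
    println!("String: {} bytes", size_of::<String>());
    println!("&str:   {} bytes", size_of::<&str>());
    println!("SmallStr: {} bytes", size_of::<SmallStr>());

    let short = SmallStr::new("id-42");
    println!("{} inline={}", short.as_str(), short.is_inline());
}
```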

Rust and Scala/Java

  • Rust for Java Developers

  • 5 Rust Reflections from Java

  • The presence of true unsigned types is really nice for low-level work. I hit a bug in Scala where I used >> instead of >>>. In Rust you declare a type as unsigned and don't have to worry about this.

  • Immutable byte slices and reference types again are awesome for low-level work.

  • Trait monomorphisation is awesome for ensuring trait methods can be inlined. JVM cannot do this when there is more than one implementation of a trait.

  • Being able to examine assembly directly from compiler output is super nice for low level perf work (compared to examining bytecode and not knowing the final output until runtime)

  • OTOH, rustc is definitely much much stricter (IMO) compared to scalac. Much of this is for good reason though, for example lack of integer/primitive coercion, ownership, etc. gives safety guarantees.
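The unsigned-types point above is easy to show: in Rust the shift behaviour follows the type, so there is no separate >>> to remember. A small sketch (top_byte is a made-up helper):

```rust
/// Extracts the top byte of a 32-bit word.
fn top_byte(word: u32) -> u8 {
    // On an unsigned type `>>` is always a logical shift (zero fill).
    (word >> 24) as u8
}

fn main() {
    let bits: u32 = 0xF000_0000;
    // Same bit pattern viewed as signed: `>>` is now an arithmetic shift
    // and copies the sign bit, like `>>` on a JVM Int.
    let signed = bits as i32;

    println!("unsigned: {:#x}", bits >> 28);
    println!("signed:   {}", signed >> 28);
    println!("top byte: {:#x}", top_byte(bits));
}
```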

Rust and Python

  • PyO3 seems to be a gold standard of Rust-based Python module development.
  • PyOxidizer - a Rust tool to package Python apps, interpreter, and all dependencies as a single binary, by wrapping app in a Rust program with a custom Rust Py module importer. Also helps embed Python code in Rust apps.
  • Oh no, my data science is getting Rusty! - neat post from CrowdStrike on integrating Rust with Python for improved performance AND safety

Rust-OtherLanguage Integration / Rust FFI

  • Calling Rust from Java - especially see the hint for using jnr-ffi
  • There is also j4rs for calling Java from Rust
  • SaferFFI - a neat library to make exposing C-like APIs much safer esp dealing with pointers, nulls, borrowing etc.
  • Exposing a Rust library to C - has some great tips on creating .so's and working with strings
  • cc-rs - C/C++ build integration with Cargo
  • It seems to me Circle CI's support for multiple docker images and explicit manifest style makes it very easy to set up multiple language and dependency support
  • Supporting multiple languages in Travis CI
  • Running LLVM on GraalVM - using GraalVM to embed and run LLVM bitcode! Too bad GraalVM is commercial/Oracle only

CLI and Misc

  • Structopt - define CLI options using a struct!

  • tui-rs - Rust terminal UI for CLI apps. Check out list of projects it refers to also. Lots of options!

  • Hot Reloading in Rust - great article on how to hot-reload dynamic linked libraries in Rust, and on the potential pitfalls, with plenty of links.

IDE/Editor/Tooling

  • EVCXR - a Rust REPL!!! With deps, and tab-completion for methods!!
  • comby-rust - rewrite Rust code using comby
  • rustviz - Visualize borrowing and ownership!
  • no-panics-whatsoever - crate to detect and ensure at compile time there aren't panics in your code
  • cargo-bloat - what's taking up space in my Rust binary
  • cargo-limit - clean up, sort and limit error/warning output. Great for those of us running cargo in shells!
  • mutagen - mutation testing tool for Rust programs. Generates "mutations" in your code to try to break test coverage!
  • cargo-rr - time travel/recording/reverse debugger framework for Rust using rr
  • cargo_hakari - A crate to speed up builds of workspace-hack packages ... for when you have multiple crates or complex builds, and you have duplicate dependencies
  • inkwell - LLVM API, including LLVM IR generation and running LLVM JIT to run snippets in your code

Dependency conflicts? Use cargo tree -i to lookup reverse dependencies for specific packages (which crates are using which deps). For example, cargo tree -i arrow:5.0.0-SNAPSHOT.

  • RustAnalyzer - LSP-based plugin/server for IDE functionality in Sublime/VSCode/Emacs/etc
  • Configuring Rustfmt
  • Godbolt - A "compiler explorer", not Rust specific but neat to play with compiler settings and diff targets.
  • Cargo-play - run Rust scripts without needing to set up a project
    • Also see cargo-eval and runner for diff ways of easily running scripts without projects

BTW for Rust 1.51+ you can speed up MacOS builds with this in your Cargo.toml (see the release notes):

[profile.dev]
split-debuginfo = "unpacked"

Testing and CI/CD

The two standard property testing crates are Quickcheck and proptest. Personally I prefer proptest due to much better control over input generation (without having to define your own type class).

Cross-compilation

A common concern - how do I build different versions of my Rust lib/app for say OSX and also Linux?

  • Easiest way now seems to be to use cross - I tried it and it is literally as easy as cargo install cross and cross build --target ... as long as you have Docker.
    • NOTE: crates with non-Rust code (eg jemalloc, mimalloc) often have trouble
  • Also see rust-musl-builder, another Docker-based solution
  • musl is the best target for Linux as it removes need for G/LIBC dependencies and versioning. Musl creates a single static binary for super easy deploys.
  • For automation, maybe better to create a single Docker image which combines crossbuild (which has a recipe for OSXCross + other targets) with a rustup container like abronan/rust-circleci which allows building both nightly and stable. Use Docker multi-stage builds to make combining multiple images easier

Finally, the Taking Rust everywhere with Rustup blog post has a good guide on how to use rustup to install cross toolchains, but the above steps to install OS-specific linkers are still important.

Performance and Low-Level Stuff

A big part of the appeal of Rust for me is super fast, SAFE, built in UTF8 string processing, access to detailed memory layout, things like SIMD. Basically, to be able to idiomatically, safely, and beautifully (functionally?) do super fast and efficient data processing.

If small binary size is what you're after, check out Min-sized-Rust.

Rust has a super slick asm! inline assembly feature (nightly-only when this note was written, stable since Rust 1.59). The way that it integrates Rust variables/expressions with auto register assignment is super awesome.

NOTE: simplest way to increase perf may be to enable certain CPU instructions: set -x RUSTFLAGS "-C target-feature=+sse3,+sse4.2,+lzcnt,+avx,+avx2"

NOTE2: lazy_static accesses are not cheap. Don't use it in hot code paths.

Perf profiling:

Note: this section is mostly about profiling tools -- detailed breakdowns of bottlenecks, as opposed to benchmarking (which is repeatable, systematic measurement). The two benchmarking tools I recommend are criterion and Iai.

NEW: I've created a Docker image for Linux perf profiling, super easy to use. The best combo is cargo flamegraph followed by perf and asm analysis.

  • cargo-flamegraph -- this is now the easiest way to get a FlameGraph on OSX and profile your Rust binaries. To make it work with bench and Criterion:

    • First run cargo bench to build your bench executable
    • If you haven't already, cargo install flamegraph (recommend at least v0.1.13)
    • sudo flamegraph target/release/bench-aba573ea464f3f67 --profile-time 180 <filter> --bench (replace bench-aba* with the name of your bench executable)
      • The --profile-time is needed for flamegraph to collect enough stats
    • open -a Safari flamegraph.svg
    • NOTE: you need to turn on debug = true in release profile for symbols
    • This method works better for apps than small benchmarks btw, as inlined methods won't show up in the graph.
  • Rust Profiling with Instruments on OSX - but apparently cannot export CSV to FlameGraph :(

    • Note that you can now just install cargo instruments
    • Also useful for heap/memory analysis, including tracking retained vs transient allocations
  • Rust Performance: Perf and Flamegraph - including finding hot assembly instructions

  • samply - used to be called perfrecord, Rust CPU CLI command profiler using Firefox as UI. WIP.

  • Iai - a one-shot Rust profiler that uses Valgrind underneath

  • Top-down Microarchitecture Analysis Method - TMAM is a formal microprocessor perf analysis method from Intel, works with perf to find out what CPU-level bottlenecks are (mem IO? branch predictions? etc.)

  • Rust Profiling with DTrace and FlameGraphs on OSX - probably the best bet (besides Instruments), can handle any native executable too

    • From @blaagh: though the predicate should be "/pid == $target/" rather than using execname.
    • DTrace Guide is probably pretty useful here
  • Hyperfine - Rust performance benchmarking CLI

  • Tools for Profiling Rust - cpuprofiler might possibly work on OSX. It does compile. The cpuprofiler crate requires surrounding blocks of your code though.

  • Rust Performance Profiling on Travis CI

  • Rust Profiling talk - discusses both OSX and Linux, as well as Instruments and Intel VTune

  • 2017 RustConf - Improving Rust Performance through Profiling

  • Flamer - an alternative to generating FlameGraphs if one is willing to instrument code. Warning: might require nightly Rust features.

  • cargo-profiler - only works in Linux :(

  • coz and its Cargo plugin, coz-rs -- "a new kind of profiler that unlocks optimization opportunities missed by traditional profilers. Coz employs a novel technique we call causal profiling that measures optimization potential"

  • Rust Perf Book Profiling Page - lots of good links

cargo-asm can dump out assembly or LLVM/IR output from a particular method. I have found this useful for really low level perf analysis. NOTE: if the method is generic, you need to give a "monomorphised" or filled out method. Also, methods declared inline won't show up.

  • What I like to do with asm output: check if rustc has inlined certain methods. Also you can clearly see where dynamic dispatch happens and how complicated generated code seems. More complicated code usually == slower.
  • llvm-mca - really detailed static analysis and runtime prediction at the machine instruction level

What works on a Mac (but see cargo flamegraph above for easier way):

sudo dtrace -c './target/release/bench-2022f41cf9c87baf --profile-time 120' -o out.stacks -n 'profile-997 /pid == $target/ { @[ustack(100)] = count(); }'
~/src/github/FlameGraph/stackcollapse.pl out.stacks | ~/src/github/FlameGraph/flamegraph.pl >rust-bench.svg
open -a Safari rust-bench.svg

where -c bench.... is the executable output of cargo bench.

I was hoping cargo-with would allow us to run above dtrace command with the name of the bench output, but alas it doesn't seem to work with bench. (NOTE: they are working on a PR to fix this! :)

NOTE: The built-in #[bench] harness behind cargo bench requires nightly Rust; it doesn't work on stable! For benchmarking I highly recommend criterion, which works on stable and has extra features such as gnuplot output, parameterized benchmarking and run-to-run comparisons, as well as being able to run for a longer time to work with profilers such as dtrace.

Memory/Heap Profiling

The options I've tried out:

  • Bytehound - really slick, but only works on Linux.
    • No need to modify apps, uses LD_PRELOAD
    • extracts full stack traces plus every alloc/dealloc, but claims it uses custom unwinding code that's much much faster
    • tracks memory usage over time, as well as leaks explicitly, and memory fragmentation
    • can give you flamegraphs of memory allocations or just leaks!
    • Has a really nice UI/webapp that's bundled together
    • Has many options to write out profiling data to different locations or over network
    • Problems:
      • Creates giant profiling data files. There are options to slim it down though, such as keeping only allocations that live longer than a particular threshold
      • Bundled viewer does not seem to be able to load debug symbols when profiling data does not include them :(
      • It seems the only way to really include full symbols in the profiling info is to run profiling with a debug build. However this blows up the size of the data file even more... hundreds of MBs from just a few minutes of run time!
  • jeprof: If you use jemallocator and install jemalloc as your global allocator, you can get some profiling for free.
    • Jemalloc Heap Profiling
    • How to parse jeprof text output
    • Pros: Jemalloc profiling is sampling based and very lightweight. It can be used in production with minimal perf impact.
    • The profile files are also very small
    • Cons: it's, like, really hard to use. For example, enabling it via environment variable - the instructions are not very clear, and there is no way to write the files to anything other than the current directory
      • Runtime config: set both environment variables MALLOC_CONF and _RJEM_MALLOC_CONF (which one works depends on environment)
      • Compile time config, for jemallocator users: JEMALLOC_SYS_WITH_MALLOC_CONF
    • Con: The stats collected are about total memory allocated, with no differentiation for short/temporal vs long-lived allocations
    • Con: It's not built for Rust and difficult to infer stacktraces. Many symbols are mangled.
    • It is possible to do differential analysis: use one profile as a "base" and then diff vs other profiles. However, the profile files use sequence numbers, so it's hard to tell which profile to use for what time.
    • Also there is no way to sort the output and the options for simplifying the output don't work very well
  • dhat - Swap out the global allocator, will profile your allocations & max heap usage
    • One advantage DHAT has over jeprof/jemalloc is lifetime / allocation length information. This can be used to figure out long-held things
    • DHAT also tracks the entire call graph so it can produce a useful tree
    • Its online viewer is also much easier to use than jeprof
    • Unfortunately DHAT tracks every allocation so it's not good for production use
    • DHAT also crashes on some workloads. This is really annoying.
  • Heaptrack (also see working with Rust) works for Rust, but only on Linux.
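Several of the options above work by swapping out the global allocator. A bare-bones sketch of that mechanism, using only std, looks like the following. It only counts calls and requested bytes; it is not how dhat or jemalloc are implemented, and CountingAlloc and measure are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and counts allocation calls and bytes requested.
struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through CountingAlloc.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result plus the (calls, bytes) allocated meanwhile.
/// Counts are process-wide, so other threads can inflate them.
fn measure<R>(f: impl FnOnce() -> R) -> (R, usize, usize) {
    let calls = ALLOC_CALLS.load(Ordering::Relaxed);
    let bytes = ALLOC_BYTES.load(Ordering::Relaxed);
    let result = f();
    (
        result,
        ALLOC_CALLS.load(Ordering::Relaxed) - calls,
        ALLOC_BYTES.load(Ordering::Relaxed) - bytes,
    )
}

fn main() {
    let (names, calls, bytes) = measure(|| {
        (0..100).map(|i| format!("item-{i}")).collect::<Vec<String>>()
    });
    println!("{} strings: {calls} allocations, {bytes} bytes requested", names.len());
}
```

A real profiler additionally has to capture stack traces and track frees per allocation, which is where most of the cost and complexity discussed above comes from.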

After the above frustrations and investigations, I decided to write my own custom memory profiler - Ying - a sampling profiler, built for rich Rust stack traces including inlined methods, which tracks retained memory and lifetimes. Definitely experimental right now.

Fast String Parsing

  • nom - a direct parser using macros, commonly accepted as fastest generic parser

  • pest is a PEG parser using an external, easy to understand syntax file. Not quite as fast but might be easier to understand and debug. There is also a book.

  • combine is a parser combinator library, supposedly just as fast as nom, syntax seems slightly

  • simdutf8 - SIMD lightning fast UTF-8 validation

Bitpacking, Binary Structures, Serialization

  • bitpacking - insanely fast integer bitpacking library
  • packed_struct - bitfield packing/unpacking; can also pack arrays of bitfields; mixed endianness, etc.
  • rkyv - Zero-copy deserialization, for generic Rust structs, even trait objects. Uses relative pointers.
  • binary-layout - "type-safe, inplace, zero-copy access to structured binary data" including open-ended byte arrays at the end
  • zerovec - Clients upgrading to zerovec benefit from zero heap allocations when deserializing read-only data.
  • Speeding up incoming message parsing using nom - a detailed guide to using nom for deserialization, much faster than Serde

The ideal performance-wise is to not need serialization at all; ie be able to read directly from portions of a binary byte slice. There are some libraries for doing this, such as flatbuffers, or flatdata for which there is a Rust crate; or Cap'n Proto. However, there may be times when you want more control or things like Cap'n Proto are not good enough.

How do we perform low-level byte/bit twiddling and precise memory access? Unfortunately, all structs in Rust basically need to have known sizes. There are dynamically sized types (DSTs), which work like slices: the last field of a struct can be an array of unknown size. However, they are virtually impossible to create and work with, and they only cover some cases anyhow. So we will unfortunately need a combination of techniques. In order of preference:

  • Overall scroll is the best general-purpose struct serialization crate; it helps with reading integers and other fields too, and takes care of endianness. It generates pretty efficient code. It is a bit of a pain working with numeric enums however.
    • num_enum - a way to derive TryFrom for numeric enums helps a little bit.
  • I have found plain works really well. Mark your structs with #[repr(C)]. It only helps with size and alignment, not endianness, so it suits in-memory structures or cases where you are sure the code never has to work across platforms with different endianness. If your structures are not aligned, use #[repr(C, packed)]. Note that #[repr(align(N))] can only raise alignment, never lower it, so it is no substitute for packed.
  • Use a crate such as bytes or scroll to help extract and write structs and primitives to/from buffers. Might need extra copying though. Also see iobuf
  • rel-ptr - small library for relative pointers/offsets, should be super useful for custom file formats and binary/persistent data structures
  • arrayref might help extract fixed size arrays from longer ones.
  • bytemuck for casts
  • Also zerocopy with FromBytes and AsBytes traits for easy transmuting
  • bitmatch could be great for bitfield parsing
  • Allocate a Vec::<u8> and transmute specific portions to/from structs of known size, or convert pointers within regions back to references:
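#![allow(unused)]
#[repr(C)]
struct Foobar { foo: u32, bar: i32 }
fn main() {
    let mut mybytes = vec![0u8; std::mem::size_of::<Foobar>()];
    // as_mut_ptr (not as_ptr) because we hand out a mutable reference below.
    let foobar: *mut Foobar = mybytes.as_mut_ptr() as *mut Foobar;
    assert_eq!((foobar as usize) % std::mem::align_of::<Foobar>(), 0, "buffer not aligned for Foobar");
    let foobar_ref: &mut Foobar = (unsafe { foobar.as_mut() }).expect("Cannot convert foobar to ref");
}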
  • Or structview which offers types for unaligned integers etc.
  • There are some DST crates worth checking out: slice-dst, thin-dst
  • As a last resort, work with raw pointer math using the add/sub/offset methods, but this is REALLY UNSAFE.
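#![allow(unused)]
#[repr(C)]
struct Foobar { foo: u32, bar: i32 }
fn main() {
    let mut mybytes = vec![0u8; std::mem::size_of::<Foobar>()];
    // as_mut_ptr, since writing through a pointer derived from as_ptr is undefined behavior.
    let foobar: *mut Foobar = mybytes.as_mut_ptr() as *mut Foobar;
    assert_eq!((foobar as usize) % std::mem::align_of::<Foobar>(), 0, "buffer not aligned for Foobar");
    unsafe {
      (*foobar).foo = 17;
      (*foobar).bar = -1;
    }
}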
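
The raw-pointer approach can be made a little less fragile by bounds-checking first and using the unaligned read/write functions from `std::ptr`. The sketch below assumes a hypothetical `Foobar` layout and helper names (`read_foobar`, `write_foobar`) invented for this example. Values are stored in native byte order, so it does nothing about endianness.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical record layout used only for this sketch.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Foobar {
    foo: u32,
    bar: i32,
}

/// Copies a `Foobar` out of `buf`, starting at byte `offset`.
fn read_foobar(buf: &[u8], offset: usize) -> Option<Foobar> {
    let end = offset.checked_add(size_of::<Foobar>())?;
    if end > buf.len() {
        return None;
    }
    // SAFETY: offset..end is in bounds (checked above), and `Foobar` holds
    // only integers, so every bit pattern is a valid value.
    // `read_unaligned` places no alignment requirement on the pointer.
    Some(unsafe { ptr::read_unaligned(buf.as_ptr().add(offset) as *const Foobar) })
}

/// Writes `value` into `buf`, starting at byte `offset`.
fn write_foobar(buf: &mut [u8], offset: usize, value: Foobar) -> Option<()> {
    let end = offset.checked_add(size_of::<Foobar>())?;
    if end > buf.len() {
        return None;
    }
    // SAFETY: same bounds argument as in `read_foobar`.
    unsafe { ptr::write_unaligned(buf.as_mut_ptr().add(offset) as *mut Foobar, value) };
    Some(())
}

fn main() {
    let mut mybytes = vec![0u8; 16];
    // Offset 3 is deliberately misaligned for a struct containing a u32.
    write_foobar(&mut mybytes, 3, Foobar { foo: 17, bar: -1 }).expect("out of bounds");
    let back = read_foobar(&mybytes, 3).expect("out of bounds");
    println!("{:?}", back);
}
```

The trade-off is a copy of the struct on every access instead of a reference into the buffer, in exchange for not having to reason about alignment at all.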

Want to zero memory quickly? slice_fill provides the memset optimization; since Rust 1.50 the standard library's slice::fill covers the same need without an extra crate.
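
A minimal sketch of zeroing with the standard library alone, using `slice::fill` (stable since Rust 1.50). The helper names `zero_bytes` and `reset` are made up for this example. For byte buffers an optimized build generally turns this into a memset, but that is an optimizer behaviour rather than a guarantee.

```rust
/// Zeroes a byte buffer in place.
fn zero_bytes(buf: &mut [u8]) {
    buf.fill(0);
}

/// The same call works for any `Clone` element type.
fn reset<T: Clone>(buf: &mut [T], value: T) {
    buf.fill(value);
}

fn main() {
    let mut scratch = vec![0xAAu8; 4096];
    zero_bytes(&mut scratch);
    assert!(scratch.iter().all(|&b| b == 0));

    let mut samples = [1.5f32; 8];
    reset(&mut samples, 0.0);
    println!("{:?}", samples);
}
```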

Also check out the crazy number of crates available under compression, including various interesting radix and trie data structures, and more compression algorithms than one has ever heard of.

Enums, Thin Pointers, Type Wrapping

A frequent problem, especially when working with data, is to have a "union" of different types. Perhaps Option will suffice, but sometimes we need to wrap Vec<A> and Vec<B> together in the same type. We don't want to just use Box<dyn MyTrait> as that allocates and results in dynamic dispatch. Here are some crates and patterns that may help in working with enums, or alternatives:

  • enum_dispatch - macro to implement the dyn MyTrait trait object pattern for enums, so we get fast static dispatch. Basically implements traits for underlying types in enums
  • enum_delegate is an alternative that works with associated types in traits - but not generics
  • strum - derive strings and discriminant enums using macros
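  • You can use std::mem::discriminant, a built-in function, to get an opaque Discriminant value that identifies an enum's variant and can be compared or hashed; for the numeric discriminant of a fieldless enum, cast it with as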
  • Also enum discriminants can be explicitly specified using #[repr(..)], see here - you can then transmute the enum into something explicit
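
To make the enum patterns above concrete, here is a hand-written sketch of the forwarding that a macro like enum_dispatch is described as automating. It is not the crate's generated code. The trait `Column`, the enum `AnyColumn` and the companion `ColumnKind` are hypothetical names chosen for this example.

```rust
use std::mem;

/// Hypothetical trait standing in for `MyTrait`.
trait Column {
    fn rows(&self) -> usize;
    fn total(&self) -> f64;
}

impl Column for Vec<i64> {
    fn rows(&self) -> usize { self.len() }
    fn total(&self) -> f64 { self.iter().map(|&x| x as f64).sum() }
}

impl Column for Vec<f64> {
    fn rows(&self) -> usize { self.len() }
    fn total(&self) -> f64 { self.iter().sum() }
}

/// One type wrapping both vectors: no Box, no vtable.
enum AnyColumn {
    Ints(Vec<i64>),
    Floats(Vec<f64>),
}

// Forwarding impl: every call is a `match`, so dispatch is static.
impl Column for AnyColumn {
    fn rows(&self) -> usize {
        match self {
            AnyColumn::Ints(v) => v.rows(),
            AnyColumn::Floats(v) => v.rows(),
        }
    }
    fn total(&self) -> f64 {
        match self {
            AnyColumn::Ints(v) => v.total(),
            AnyColumn::Floats(v) => v.total(),
        }
    }
}

/// Fieldless companion enum with explicit numeric discriminants.
#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColumnKind {
    Ints = 1,
    Floats = 2,
}

impl AnyColumn {
    fn kind(&self) -> ColumnKind {
        match self {
            AnyColumn::Ints(_) => ColumnKind::Ints,
            AnyColumn::Floats(_) => ColumnKind::Floats,
        }
    }

    /// `mem::discriminant` is opaque: it can be compared, not read as a number.
    fn same_variant(&self, other: &AnyColumn) -> bool {
        mem::discriminant(self) == mem::discriminant(other)
    }
}

fn main() {
    let a = AnyColumn::Ints(vec![1, 2, 3]);
    let b = AnyColumn::Floats(vec![0.5, 0.25]);
    println!("rows={} total={}", a.rows(), a.total());
    println!("kind as u8 = {}", b.kind() as u8);
    println!("same variant? {}", a.same_variant(&b));
}
```

The numeric tag comes from the `as` cast on the fieldless `#[repr(u8)]` enum, while `mem::discriminant` only answers whether two values are the same variant.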

Some non-enum crates that can also help:

  • ptr_union - "Pointer union types the size of a pointer by storing the tag in the alignment bits" :)
  • erasable - "Type-erased thin pointers" - need to see how this is different from std::any::Any

SIMD

There is this great article, Towards fearless SIMD, about why SIMD is hard and how to make it easier, along with pointers to many interesting crates doing SIMD. (There is a built-in module, std::simd, but it is really lacking; however, packed_simd is expected to be merged into it.)

Another great article: Learning SIMD with Rust by finding planets. SIMD is really about parallelism. It is better to do multiple operations in a parallel (vertical) fashion, vector on vector, than to do horizontal operations where the different components of a wide register depend on one another.
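
The vertical versus horizontal distinction can be sketched in plain stable Rust without any SIMD crate. The lane count of 8 and the function names are assumptions for this example, and whether the compiler actually vectorizes the loop depends on the optimizer and the target.

```rust
const LANES: usize = 8;

/// Horizontal style: every addition depends on the previous one.
fn sum_scalar(values: &[f32]) -> f32 {
    values.iter().sum()
}

/// Vertical style: LANES independent running sums, combined once at the end.
fn sum_lanes(values: &[f32]) -> f32 {
    let mut acc = [0.0f32; LANES];
    let chunks = values.chunks_exact(LANES);
    let tail = chunks.remainder();
    for chunk in chunks {
        for lane in 0..LANES {
            // Lane i only ever touches lane i, so the lanes are independent.
            acc[lane] += chunk[lane];
        }
    }
    // A single horizontal reduction, plus the leftover elements.
    acc.iter().sum::<f32>() + tail.iter().sum::<f32>()
}

fn main() {
    let values: Vec<f32> = (1..=20).map(|x| x as f32).collect();
    println!("scalar={} lanes={}", sum_scalar(&values), sum_lanes(&values));
}
```

Because the additions happen in a different order, the two functions can differ in the last bits for general floating-point input; they agree exactly on small integers like the ones used here.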

  • ssimd - an effort to bring std::simd/packed_simd to Rust stable, with auto vectorization (meaning auto detect and implement code paths and fallbacks for when SIMD is not available!)

  • faster - "SIMD for Humans" -- probably my favorite one, very high level translation of numeric map loops into SIMD

  • fearless_simd, the blog post author's crate. Runtime CPU detection and use of the most optimal code, no need for unsafe, but only focused on f32.

  • SIMDeez - abstracts intrinsic SIMD instructions over different instruction sets & vector widths, runtime detection

  • simd_aligned and simd_aligned_rust - work with SIMD and packed_simd using vectors which have guaranteed alignment

  • aligned - newtype with byte alignment, for stack or heap!

  • https://www.rustsim.org/blog/2020/03/23/simd-aosoa-in-nalgebra/
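
The alignment-newtype idea from the list above can be sketched with nothing but `#[repr(align(N))]`. The wrapper name `Aligned32` and the choice of 32 bytes (the width of a 256-bit register) are assumptions for this example, not the API of the aligned or simd_aligned crates.

```rust
use std::mem::align_of;

/// Hypothetical wrapper forcing 32-byte alignment on whatever it contains.
#[repr(C, align(32))]
#[derive(Clone, Copy, Debug)]
struct Aligned32<T>(T);

/// True when `ptr` is a multiple of `align` bytes.
fn is_aligned_to<T>(ptr: *const T, align: usize) -> bool {
    (ptr as usize) % align == 0
}

fn main() {
    // The alignment holds on the stack...
    let on_stack = Aligned32([0.0f32; 8]);
    assert!(is_aligned_to(&on_stack, 32));

    // ...and on the heap, because Box allocates using the type's layout.
    let on_heap = Box::new(Aligned32([0.0f32; 8]));
    assert!(is_aligned_to(&*on_heap, 32));

    println!("align_of = {}", align_of::<Aligned32<[f32; 8]>>());
}
```

Raising the alignment also rounds the size up to a multiple of it, so wrapping small values this way wastes space.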

NOTE: shuffle in packed_simd is not very fast. Replace with native instructions if possible.

🦀 Learning Resources

The first part of this list is a refined and personalized version of this one

Books

Interactive

Videos

Blogs

Learning Rust

Getting Started + Installation | Cheat Sheet

Table of Contents

  1. How Do I Start Learning Rust?!
  2. Ok, I Think I Know The Basics But How Do I Get Better?!
  3. Rustlang Tooling
  4. Helpful References Throughout Your Journey
  5. The Rust Community
  6. Recommended (but not free) Books & Courses

How Do I Start Learning Rust?!

by Reading

  • The Book: start reading now, read it whenever you can, and don't worry if it takes a long time to get through. You will reference it for most of your time using Rust. Or try the Rust Book with Quizzes.

    Affectionately nicknamed “the book,” The Rust Programming Language will give you an overview of the language from first principles. You’ll build a few projects along the way, and by the end, you’ll have a solid grasp of the language.

  • Rust By Example: like 'The Book', with less docs and more sample code

    If reading multiple hundreds of pages about a language isn’t your style, then Rust By Example has you covered. While the book talks about code with a lot of words, RBE shows off a bunch of code, and keeps the talking to a minimum. It also includes exercises!

  • (Blog) Mental Models for Learning Rust. (kerkour)
  • (Blog) A Half Hour to Learn Rust. (fasterthanlime)
  • (Blog Series) Learn Rust the Dangerous Way (cliffle) "Rust features in context for low-level C programmers"

    Existing Rust tutorials are great, but they focus on safe features. This companion tutorial takes an unsafe-first approach that may be more appealing for low-level systems programmers like me.

by Watching

by Doing

  • Tour of Rust: Step-by-step interactive walkthrough of Rust, all in your browser.
  • Rustlings: Rust by Example-style exercises you complete via your own local environment

    Alternatively, Rustlings guides you through downloading and setting up the Rust toolchain, and teaches you the basics of reading and writing Rust syntax, on the command line. It's an alternative to Rust by Example that works with your own environment.

  • Exercism.org: Work through examples in order from "hello world!" to advanced concepts like Doubly-linked lists. Do work in the browser or via your local environment using the exercism CLI, with progress reflected in the web app. Get mentorship and guidance from real people.

    We’re building a place where anyone can learn and master programming for free, without ever feeling lost or stupid. We're here to help everyone get really good at programming, regardless of their background. We want to share our love of programming, and help people upskill as part of their upward social mobility.

  • Comprehensive Rust: From the Google Android team

    This is a four day Rust course developed by the Android team. The course covers the full spectrum of Rust, from basic syntax to advanced topics like generics and error handling. It also includes Android-specific content on the last day. The goal of the course is to teach you Rust. We assume you don’t know anything about Rust and hope to: Give you a comprehensive understanding of the Rust syntax and language. Enable you to modify existing programs and write new programs in Rust. Show you common Rust idioms.

Ok, I Think I Know The Basics But How Do I Get Better?!

by Reading

by Watching

by Doing

  • whorl

    whorl was created to teach you how async executors work in Rust. It is not the fastest executor nor is its API perfect, but it will teach you about them and how they work and where to get started if you wanted to make your own.

  • TP-201: Practical Networked Applications in Rust

    A series of projects that incrementally develop a single Rust project from the ground up into a high-performance, networked, parallel and asynchronous key/value store. Along the way various real-world Rust development subject matter are explored and discussed.

  • https://github.com/dpc/sniper

    Educational Rust implementation of Auction Sniper from Growing Object-Oriented Software, Guided By Tests.

  • Learn Video Codecs by Implementing one in 100 lines of Rust

Rustlang Tooling

Helpful References Throughout Your Journey

The Rust Community

Beginner

Intermediate

🦀 Libraries

GUI

Check Are We Gui Yet?

Serialization

Read here for production ready stuff

🦀 Command Line Tools

Read here for a bootstrap or Rusty Terminal

🦀 Various

Blogs

Other

Command line tools, vim and various

Speed up zsh

  • https://registerspill.thorstenball.com/p/how-fast-is-your-shell
  • https://htr3n.github.io/2018/07/faster-zsh/#macos-optimisations
  • https://gist.github.com/ctechols/ca1035271ad134841284
  • https://blog.jonlu.ca/posts/speeding-up-zsh
  • https://github.com/htr3n/zsh-config/blob/master/zlogin
  • https://askubuntu.com/questions/438150/why-are-scripts-in-etc-profile-d-being-ignored-system-wide-bash-aliases/438170#438170
  • https://askubuntu.com/questions/879364/differentiate-interactive-login-and-non-interactive-non-login-shell/879400#879400
  • https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html
  • https://linux.die.net/man/1/zshcontrib

Split it into categories!

Keybindings for terminal

Open the iTerm preferences (⌘+,), navigate to the Profiles tab and its Keys sub-tab, and enter the following. (The top-level Keys tab can be used too, but adding keybindings to your profile allows you to save the profile and sync it to multiple computers.)

Delete all characters left of the cursor

⌘+←Delete Send Hex Codes:

  • 0x18 0x7f – Less compatible, doesn't work in node and won't work in zsh by default, see below to fix zsh (bash/irb/pry should be fine), performs desired functionality when it does work.
  • 0x15 – More compatible, but typical functionality is to delete the entire line rather than just the characters to the left of the cursor.

Delete all characters right of the cursor

⌘+fn+←Delete or ⌘+Delete→ Send Hex Codes:

  • 0x0b

Delete one word to left of cursor

⌥+←Delete Send Hex Codes:

  • 0x1b 0x08

Delete one word to right of cursor

⌥+fn+←Delete or ⌥+Delete→ Send Hex Codes:

  • 0x1b 0x64

Move cursor to the front of line

⌘+← Send Hex Codes:

  • 0x01

Move cursor to the end of line

⌘+→ Send Hex Codes:

  • 0x05

Move cursor one word left

⌥+← Send Hex Codes:

  • 0x1b 0x62

Move cursor one word right

⌥+→ Send Hex Codes:

  • 0x1b 0x66

Undo

⌘+z Send Hex Codes:

  • 0x1f

Redo

Typically not bound in bash, zsh or readline, so we can set it to an unused hex code which we can then fix in zsh.

⇧+⌘+Z or ⌘+y Send Hex Codes:

  • 0x18 0x1f

Stolen from: http://stackoverflow.com/questions/6205157/iterm2-how-to-get-jump-to-beginning-end-of-line-in-bash-shell#answer-29403520
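
The hex codes above are plain ASCII control bytes, which is why they line up with the usual Emacs-style line-editing keys (0x01 is Ctrl-A, 0x05 is Ctrl-E, 0x1b is Escape). As a small aid for reading them, here is a sketch that prints a code sequence in caret notation. The function names are made up for this example; it only decodes bytes and does not talk to iTerm or the shell.

```rust
/// Renders one byte the way terminals usually name control codes.
fn caret_notation(byte: u8) -> String {
    match byte {
        0x1b => "ESC".to_string(),
        // Control characters map to '^' plus the byte with bit 0x40 flipped.
        0x00..=0x1f => format!("^{}", (byte ^ 0x40) as char),
        0x7f => "^?".to_string(),
        _ => (byte as char).to_string(),
    }
}

/// Renders a whole "Send Hex Codes" sequence, space separated.
fn describe(codes: &[u8]) -> String {
    codes
        .iter()
        .map(|&b| caret_notation(b))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let bindings: [(&str, &[u8]); 5] = [
        ("front of line", &[0x01]),
        ("end of line", &[0x05]),
        ("delete to end of line", &[0x0b]),
        ("one word left", &[0x1b, 0x62]),
        ("redo (custom)", &[0x18, 0x1f]),
    ];
    for (name, codes) in bindings.iter() {
        println!("{:<22} {}", name, describe(codes));
    }
}
```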

LFCS

Various

Docker

Linux, Kernels, Low-Level

Security

TimeMachine on CasaOs

Taken from:

  • https://github.com/IceWhaleTech/CasaOS/issues/1030
  • https://mxnr.net/time-machine-on-zimaboard/amp/
  • https://wiki.samba.org/index.php/Configure_Samba_to_Work_Better_with_Mac_OS_X
  1. ssh casaos@casaos.local
  2. sudo useradd toniogela
  3. sudo smbpasswd -a toniogela
  4. cd /mnt/diskname && sudo mkdir timemachine
  5. sudo chown toniogela:toniogela timemachine
  6. sudo apt install samba-vfs-modules
  7. sudo vi /etc/samba/smb.conf

Add under [global]

   min protocol = SMB2
   ea support = yes
   vfs objects = fruit streams_xattr
   fruit:metadata = stream
   fruit:model = TimeCapsule
   fruit:posix_rename = yes
   fruit:veto_appledouble = no
   fruit:nfs_aces = no
   fruit:wipe_intentionally_left_blank_rfork = yes
   fruit:delete_empty_adfiles = yes

Create this (but maybe it's useless):

[share]
   spotlight backend = elasticsearch

Then, for each user (here toniogela), add at the bottom:

[Time Machine name you will see under MacOs]
   comment = Time Machine name you will see under MacOs
   path = /mnt/diskname/timemachine
   browseable = yes
   writeable = yes
   guest ok = no
   read only = no
   fruit:time machine = yes
   valid users = toniogela
   durable handles = yes
   kernel oplocks = no
   kernel share modes = no
   posix locking = no
   ea support = yes
   inherit acls = yes

Reboot and you should be done.

Low level stuff + Assembly

https://dev.to/frosnerd/writing-my-own-boot-loader-3mld

https://dev.to/visheshpatel/how-c-program-stored-in-ram-memory-3773

https://dev.to/quantumsheep/basics-of-multithreading-in-c-4pam

https://nandgame.com/

https://www.nand2tetris.org/

Turing Complete game on Steam

https://www.turingtumble.com/

https://cs.lmu.edu/~ray/notes/nasmtutorial/

https://www.robertwinkler.com/projects/mips_book/mips_book.html#_chapter_0_hello_world

https://asmtutor.com/#lesson1

https://tmewett.com/c-tips/

Software Development

MacosX

Python

Frontend or Presentations


Gameboy Development

  • https://laroldsjubilantjunkyard.com/
  • https://www.youtube.com/watch?v=rCN-jwYn7Qw
  • https://www.youtube.com/watch?v=eYT9s9bvKYU&list=PLrW43fNmjaQVmjvIj3Ho3rzW46GEw14F9&index=21&t=1814s

Personal And Language Learning

Organizing Tools and Techniques

Gaming and Various

Recipes

Cinnamon Rolls

Ingredients for the dough:

  • 177ml whole (or 2%) milk at 40ºC
  • 24 gr quick rise or active yeast
  • 50 gr granulated sugar
  • 420 gr BREAD flour, not for cakes
  • 1 egg plus 1 egg yolk at room temperature
  • 60 gr butter melted but not hot
  • 3/4 teaspoon salt

For the filling:

  • 400 gr dark brown sugar
  • 1.5 tablespoons ground cinnamon
  • 50 gr butter, softened but not melted

For the cream cheese frosting:

  • Powdered Sugar
  • Milk

Instructions:

  • Warm the milk to ~40ºC and sprinkle the dry yeast on top of it. Wait ~5/10 minutes until it dissolves completely. While you wait, stir the flour and salt together.

  • Mix the milk with the sugar, the eggs and the melted butter. Once done, mix the dry and liquid ingredients together.

  • Work the dough for 8/10 minutes, then let it rise, covered, in an oiled bowl for at least 2 hours. Keep the butter outside the fridge to have it ready for the filling.

  • Roll out the dough to the size of a cutting board, spread the softened butter over it leaving a margin, then rub the cinnamon/brown sugar mixture over it.

  • Roll the dough up tightly along the long side and cut 2 cm sections with a sharp knife. Place the rolls on a greased (or parchment-lined) pan and let them rise for 45/60 minutes.

  • Preheat the oven to 180ºC and bake them for 20/25 minutes. Let them cool for 5/10 minutes before frosting. For the frosting, mix powdered sugar and milk. Spread over the cinnamon rolls and serve immediately.

Coconut and Chocolate Cake (Torta Cocco e Cioccolato)

Ingredients:

  • 100 gr sugar
  • 125 gr butter
  • 200 gr dark chocolate
  • 80 gr coconut flour
  • 4 eggs
  • 2 tablespoons flour

Preparation: Melt the butter and chocolate in a medium-sized pan over a bain-marie, remove from the heat and stir in the sugar. Add the eggs one at a time, mixing well, then the tablespoons of flour and the coconut. Butter and flour a 24-26 cm round cake tin well, pour everything into the tin and bake in a preheated oven at 180 degrees for 20-25 minutes. When it is done, turn the cake upside down to let the moisture evaporate from the underside, and dust with icing sugar.

This is a mess, you should really organize it

Horror Movies To See

See here

  • Leon
  • Pulp fiction
  • Rouge
  • Forrest Gump
  • Chungking Express
  • Shawshank Redemption
  • Once Were Warriors
  • Clerks
  • Blanc
  • Natural Born Killers
  • Tatjana
  • Death and the maiden
  • Before the rain
  • The hudsucker proxy
  • Four Weddings and a Funeral
  • Satantango
  • Priscilla
  • In the mouth of madness
  • True Lies
  • Ed Wood
  • Speed
  • Stargate
  • Ace Ventura / The Mask / Dumb and Dumber
  • Crooklyn - A Spike Lee Joint
  • Bullets over Broadway