Use case: standards compliance.
The best standards describe reality rather than attempting to impose a
new one, i.e. "A good standard should document, not legislate."
Standards which document existing reality tend to be approved by
more than one standards body, such as ANSI and ISO both approving C99. That's why IEEE 1003.1-2008,
the Single Unix Specification version 4, and the Open Group Base Specification
edition 7 are all the same standard from three sources, which most people just
call "posix" (short for "portable operating system that works like unix").
It's available online in full, and may be downloaded as a tarball.
Previous versions (SUSv3 and SUSv2) are also available.
The original Posix was a collection of different standards (POSIX.1
from 1988, POSIX.1b from 1993, and POSIX.1c from 1995). The unified
SUSv2 came out in 1997 and SUSv3 came out in 2001.
Posix 2008 was then reissued in 2013 and 2018. The first reissue was minor
wordsmithing with no behavioral changes. The second renewed a ten year timeout
so it would still be considered a "current standard" by some government
regulations, but isn't officially a new standard. It's still posix-2008/SUSv4/Issue 7.
The endless committee process to produce
"Issue 8" has been ongoing for over 15 years now, with conference
calls on Mondays and Thursdays, mostly to discuss recent bug tracker
entries then publish the minutes of the meeting on the mailing list.
Prominent committee members have died during this time.
Why not just use posix for everything?
Unfortunately, Posix describes an incomplete subset of reality, because
it was designed to. It started with proprietary unix vendors collaborating to
describe the functionality their fragmented APIs could agree on, which was then
incorporated into US federal procurement standards
as a compliance requirement
for things like navy contracts, giving large corporations
like IBM and Microsoft millions of dollars of incentive
to punch holes in the standard big enough to drive
Windows NT and
OS/360 through.
When open source projects like Linux started developing on the internet
(enabled by the 1993 relaxation of the National Science Foundation's
"Acceptable Use Policy" allowing everyone to connect to the internet,
previously restricted to approved government/military/university organizations),
Posix ignored
the upstarts and Linux eventually
returned the favor,
leaving Posix behind.
The result is a "standard" that lacks any mention of commands like
"init" or "mount" required to actually boot a system.
It describes logname but not login. It provides ipcrm
and ipcs, but not ipcmk, so you can use System V IPC resources but not create
them. And widely used real-world commands such as tar and cpio (the basis
of initramfs and RPM) which were present in earlier
versions of the standard have been removed, while obsolete commands like
cksum, compress, sccs and uucp remain with no mention of modern counterparts
like crc32/sha1sum, gzip/xz, svn/git or scp/rsync. Meanwhile posix' descriptions
of the commands themselves are missing dozens of features, and specify silly
things like ebcdic
support in dd or that wc should use %d (not %lld) for byte counts. So
we have to extensively filter posix to get a useful set of recommendations.
Analysis
Starting with the
full "utilities" list,
we first remove generally obsolete
commands (compress ed ex pr uncompress uucp uustat uux), commands for the
pre-CVS "SCCS" source control system (admin delta get prs rmdel sact sccs unget
val what), fortran support (asa fort77), and batch processing support (batch
qalter qdel qhold qmove qmsg qrerun qrls qselect qsig qstat qsub).
Some commands are for a compiler toolchain (ar c99 cflow ctags cxref gencat
iconv lex m4 make nm strings strip yacc) which is out of scope for
toybox and should be supplied externally. (Some of these might be
revisited later, but not for toybox 1.0.)
Some commands are part of a command shell, and can't be implemented as
separate executables (alias bg cd command fc fg getopts hash jobs kill read
type ulimit umask unalias wait). These may be implemented as part of the
built-in toybox shell, but are not exported into $PATH via symlinks and
thus are not part of toybox's main command list. (If you fork a
child process and have it "cd" then exit, you've accomplished nothing.)
Again, what posix lists as "commands" is incomplete: a shell also needs exit, if, while,
for, case, export, set, unset, trap, exec... (And for bash compatibility:
function, source, declare...)
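The point about "cd" can be demonstrated in any Bourne-style shell. This is a minimal sketch in which a subshell stands in for the forked child process (the variable names are made up for illustration): the child changes directory and exits, and the parent's working directory is unchanged.

```shell
# Parentheses run their contents in a subshell, which is a forked child.
# The child's "cd" changes only the child's working directory, and that
# change is discarded when the child exits.
before=$(pwd)
(cd / && pwd)        # the child prints "/"
after=$(pwd)         # the parent never moved

[ "$before" = "$after" ] && echo "parent directory unchanged"
```

This is why cd has to be a shell builtin: only code running inside the shell process itself can change that process's working directory.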
A few other commands are judgement calls, providing command-line
internationalization support (iconv locale localedef), System V inter-process
communication (ipcrm ipcs), and cross-tty communication from the minicomputer
days (talk mesg write). The "pax" utility failed to replace tar,
"mailx" is
a command line email client, and "lp" submits files for printing to... what
exactly? (cups?) The standard defines crontab but not crond. What is
pathchk supposed to be portable _to_? (Linux accepts 255 byte path components
with any char except NUL or / and no max length on the total path, and
EXPLICITLY
doesn't care if it's an invalid utf8 sequence.)
Removing all of that leaves the following commands, which toybox should
implement:
at awk basename bc cal cat chgrp chmod chown cksum cmp comm cp
csplit cut date dd df diff dirname du echo env expand expr false file find
fold fuser getconf grep head id join kill link ln logger logname ls man
mkdir mkfifo more mv newgrp nice nl nohup od paste patch printf ps
pwd renice rm rmdir sed sh sleep sort split stty tabs tail tee test time
touch tput tr true tsort tty uname unexpand uniq unlink uudecode uuencode vi wc
who xargs zcat
One attempt to supplement POSIX towards an actually usable system was the
Linux Standard Base. Unfortunately, the quality of this "standard" is
fairly low, largely due to the Free Standards Group that maintained it
being consumed by the Linux Foundation in 2007.
Where POSIX allowed its standards process to be compromised
by leaving things out (but what
they DID standardize tends to be respected, if sometimes obsolete),
the Linux Standard Base's failure mode was different. They responded to
pressure by including anything their members paid them enough to promote,
such as allowing Red Hat to push
RPM into the standard even though all sorts of distros (Debian, Slackware, Arch,
Gentoo, Android, Alpine...) don't use it and never will. This means anything in the LSB is
at best a suggestion: arbitrary portions of this standard are widely
ignored.
The community perception
seems to be that the Linux Standard Base is
the best standard money can buy: the Linux Foundation is supported by
financial donations from large companies and the LSB
represents the interests
of those donors regardless of technical merit. (The Linux Foundation, which
maintains the LSB, is NOT a 501c3. It's a 501c6, the
same kind of legal entity as the Tobacco Institute and
Microsoft's
old "Don't Copy That Floppy" campaign.) Debian officially
washed its hands of LSB by
refusing to adopt release 5.0 in 2015, and no longer even pretends to support
it (which affects Debian derivatives like Ubuntu and Knoppix). Toybox has
stayed on 4.1 for similar reasons.
That said, Posix by itself isn't enough, and this is the next most
comprehensive standards effort for Linux so far, so we salvage what we can.
A lot of historical effort went into producing the standard before the
Linux Foundation took over.
Analysis
LSB 4.1 specifies a list of command line
utilities:
ar at awk batch bc chfn chsh col cpio crontab df dmesg du echo egrep
fgrep file fuser gettext grep groupadd groupdel groupmod groups
gunzip gzip hostname install install_initd ipcrm ipcs killall lpr ls
lsb_release m4 md5sum mknod mktemp more mount msgfmt newgrp od passwd
patch pidof remove_initd renice sed sendmail seq sh shutdown su sync
tar umount useradd userdel usermod xargs zcat
Where posix specifies one of those commands, LSB's deltas tended to be
accommodations for broken tool versions which weren't up to date with the
standard yet. (See more and xargs
for examples.)
Since we've already committed to using our own judgement to skip bits of
POSIX, and LSB's "judgement" in this regard is purely bug workarounds to declare
various legacy tool implementations "compliant", this means we're mostly
interested in the set of LSB tools that aren't mentioned in posix.
Of these, gettext and msgfmt are internationalization, install_initd and
remove_initd weren't present even in Ubuntu 10.04, lpr is out of scope,
lsb_release just reports information in /etc/os-release, and sendmail's
turned into a pile of cryptographic verification and DNS shenanigans due
to spammers.
This leaves:
chfn chsh dmesg egrep fgrep groupadd groupdel groupmod groups
gunzip gzip hostname install killall md5sum
mknod mktemp mount passwd pidof seq shutdown
su sync tar umount useradd userdel usermod zcat
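The filtering above is mostly a set subtraction: take the LSB command list and drop whatever posix already covers, then remove the out of scope commands by hand. A minimal sketch of that subtraction in portable shell follows. The function name list_minus is made up for illustration, and the two short lists are abbreviated samples rather than the full LSB and posix lists.

```shell
# list_minus "A" "B": print the words in list A that are not in list B,
# sorted and joined on one line. Assumes the words in B are separated
# by single spaces. (Hypothetical helper, not part of toybox.)
list_minus() {
  for cmd in $1; do
    case " $2 " in
      *" $cmd "*) ;;            # also in B: drop it
      *) echo "$cmd" ;;         # only in A: keep it
    esac
  done | sort -u | xargs
}

# Abbreviated samples of the LSB and posix command lists.
lsb_sample="awk dmesg grep mount sed seq"
posix_sample="awk grep sed"

list_minus "$lsb_sample" "$posix_sample"   # prints: dmesg mount seq
```

The same helper works for the other "subtract what we already have" steps in this analysis, as long as both lists fit comfortably in a shell variable.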
They're very nice, but there's thousands of them. The signal to noise
ratio here is terrible.
Discussion of standards wouldn't be complete without the Internet
Engineering Task Force's "Request For Comments" collection and Michael Kerrisk's
Linux man-pages project...
except these aren't standards, they're collections of documentation with
low barriers to inclusion. They're not saying "you should support
X", they're saying "if you do, here's how".
Thus neither really helps us select which commands to include.
The man pages website includes the commands in git, yum, perf, postgres,
flatpak... Great for examining the features of a command you've
already decided to include, useless for deciding _what_ to include.
The RFCs are more about protocols than commands. The noise level is
extremely high: there's thousands of RFCs, many describing a proposed idea
that never took off, and less than 1% of the resulting documents are
currently relevant to toybox. The documents are numbered based on the
order they were received, with no real attempt at coherently indexing
the result. As with man pages they can be long and complicated or
terse and impenetrable,
have developed a certain amount of bureaucracy over the years, and often the easiest way to understand what
they document is to find an earlier version to read first.
(This is an example of the greybeard community problem, where all current
documentation was written by people who don't remember NOT already knowing
this stuff and the resources they originally learned from are long gone.)
That said, RFC documents can be useful (especially for networking protocols)
and the four URL templates provided by the recommended starting files
for new commands (hello.c and skeleton.c in the toys/example directory)
point to example posix, lsb, man, and rfc pages online.
Once upon a time, the following commands were enough to build the Aboriginal Linux development
environment, boot it to a shell prompt, and build Linux From Scratch 6.8 under it.
bzcat cat cp dirname echo env patch rmdir sha1sum sleep sort sync
true uname wc which yes zcat
awk basename chmod chown cmp cut date dd diff
egrep expr fdisk find grep gzip head hostname id install ln ls
mkdir mktemp mv od readlink rm sed sh tail tar touch tr uniq
wget whoami xargs chgrp comm gunzip less logname split
tee test time bunzip2 chgrp chroot comm cpio dmesg
dnsdomainname ftpget ftpput gunzip ifconfig init
logname losetup mdev mount mountpoint nc pgrep pkill
pwd route split stat switch_root tac umount vi
resize2fs tune2fs fsck.ext2 genext2fs mke2fs xzcat
This use case includes running init scripts and other shell scripts, running
configure, make, and install in each package, and providing basic command line
facilities such as a text editor. (It does not include a compiler toolchain or
C library, those are outside the scope of the toybox project, although mkroot
has a potential follow-up project.
For now we use distro toolchains,
musl-cross-make,
and the Android NDK for build testing.)
That build system also installed bash 2.05b as #!/bin/sh and its scripts
required bash extensions not present in shells such as busybox ash.
To replace that, toysh needs to supply several bash extensions _and_ work
when called under the name "bash".
The above command list was collected using a command line recording wrapper
(mkroot/record-commands and toys/example/logpath.c) which mkroot/mkroot.sh
also uses to populate root/build/log/*-commands.txt. Try
awk '{print $1}' root/build/log/*-commands.txt | sort -u | grep -v musl | xargs
after building a mkroot target to see the list of commands called out
of the $PATH during that build.
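To see what each stage of that pipeline does without running a full build, here is a sketch against a made-up log file. The log contents are invented for illustration, including the cross compiler name. The only property assumed about the real logs is the one the awk command relies on: each line starts with the name of the command that was run.

```shell
# Build a throwaway log in the assumed format: command name first,
# then its arguments. (Sample data, not real mkroot output.)
log=$(mktemp)
cat > "$log" << 'EOF'
sed -n s/x/y/p file
cat file
x86_64-linux-musl-cc -c hello.c
sed -i s/a/b/ other
EOF

# Same pipeline as above: keep the first field, deduplicate, drop
# anything with "musl" in its name (here, the sample cross compiler),
# and join the result onto one line.
commands=$(awk '{print $1}' "$log" | sort -u | grep -v musl | xargs)
echo "$commands"    # prints: cat sed
rm -f "$log"
```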
Stages and moving targets
The development environment use case has two stages, achieving:
1) a bootable system that can rebuild itself from source, and 2)
a build environment capable
of bootstrapping up to arbitrary complexity (by building
Linux From Scratch and Beyond Linux From Scratch under the resulting
system, or the Android Open Source Project). To accomplish just the first
goal (a minimal system that can rebuild _itself_ from source), the old
build still needs the following busybox commands for which toybox does
not yet supply adequate replacements:
awk diff expr fdisk gzip less route sh tr unxz vi xzcat
All of those except awk and less have partial implementations
in "pending".
In 2017 Aboriginal Linux development ended, replaced by a much simpler
project ("mkroot") designed to use an existing cross+native toolchain (such as
musl-cross-make
or the Android NDK) instead of building its own cross and native compilers
from source. In 2019 the still-incomplete
mkroot was merged into toybox as the "make root" target (which runs
mkroot/mkroot.sh). This is intended
as a simpler way of providing essentially the same build environment, and doesn't
significantly affect the rest of this analysis (although the "rebuild itself
from source" test should now include building musl-cross-make under either
mkroot or toybox's "make airlock" host environment).
Building Linux From Scratch is not the same as building the
Android Open Source Project,
but after toybox 1.0 we plan to try
modifying the AOSP build
to reduce dependencies. (It's fairly likely we'll have to add at least
a read-only git utility so repo can download the build's source code,
but that's actually not
that hard. We'll probably also need our own "make" at some point after
1.0, which is its own moving target thanks to cmake and ninja and so on.)
The ongoing Android hermetic build work is already advancing
this goal.
Android has a policy against GPL in userspace, so even though BusyBox
predates Android by many years, they couldn't use it. Instead they grabbed
an old version of ash (later replaced by
mksh)
and implemented their own command line utility set
called "toolbox" (which toybox has already mostly replaced).
Toolbox doesn't have its own repository, instead it's part of Android's
system/core
git repository. Android's Native Development Kit (their standalone
downloadable toolchain) has its own
roadmap, and each version has
release
notes.
Toolbox commands:
According to
system/core/toolbox/Android.bp the toolbox directory builds the
following commands:
getevent getprop modprobe setprop start
getprop/setprop/start were in toybox and moved back because they're so
tied to non-public system interfaces. modprobe shares the implementation
used in init. getevent is a board bringup tool built with a python script
that pulls all the constants from the latest kernel headers.
Other Android /system/bin commands
Other than the toolbox links, the currently interesting
binaries in /system/bin are:
- arping - ARP REQUEST tool (iputils)
- blkid - identify block devices (e2fsprogs)
- e2fsck - fsck for ext2/ext3/ext4 (e2fsprogs)
- fsck.f2fs - fsck for f2fs (f2fs-tools)
- fsck_msdos - fsck for FAT (BSD)
- gzip - compression/decompression tool (zlib)
- ip - network routing tool (iproute2)
- iptables/ip6tables - IPv4/IPv6 NAT admin (iptables)
- iw - wireless device config tool (iw)
- logwrapper - redirect stdio to android log (Android)
- make_ext4fs - make ext4 fs (Android)
- make_f2fs - make f2fs fs (f2fs-tools)
- ping/ping6 - ICMP ECHO_REQUEST tool (iputils)
- reboot - reboot (Android)
- resize2fs - resize ext2/ext3/ext4 fs (e2fsprogs)
- sh - mksh (BSD)
- ss - socket statistics (iproute2)
- tc - traffic control (iproute2)
- tracepath/tracepath6 - trace network path (iputils)
- traceroute/traceroute6 - trace network route (iputils)
The names in parentheses are the upstream source of the command.
Analysis
For reference, combining everything listed above that's still "fair game"
for toybox, we get:
arping blkid e2fsck dd fsck.f2fs fsck_msdos gzip ip iptables
ip6tables iw logwrapper make_ext4fs make_f2fs modprobe newfs_msdos ping ping6
reboot resize2fs sh ss tc tracepath tracepath6 traceroute traceroute6
We may eventually implement all of that, but for toybox 1.0 we need to
focus a bit. If Android has an acceptable external package, and the command
isn't needed for system bootstrapping, replacing the external package is
not a priority.
However, several commands toybox plans to implement anyway could potentially
replace existing Android versions, so we should take into account Android's use
cases when doing so. This includes:
getevent gzip modprobe newfs_msdos sh
Update:
external/toybox/Android.bp has symlinks for the following toys out
of "pending". (The toybox modprobe is also built for the device, but
it isn't actually used and is only there for sanity checking against
the libmodprobe-based implementation.) These should be a priority for
cleanup:
diff expr getopt tr brctl getfattr lsof modprobe more stty traceroute vi
Android wishlist:
mtools genvfatfs mke2fs gene2fs
The list of external tools used to build AOSP was
here,
but as they're switched over to toybox they disappear and reappear
here.
awk basename bash bc bzip2 cat chmod cmp comm cp cut date dd diff dirname dlv du
echo egrep env expr find fuser getconf getopt git grep gzip head hexdump
hostname id jar java javap ln ls lsof m4 make md5sum mkdir mktemp mv od openssl
paste patch pgrep pkill ps pstree pwd python python2.7 python3 readlink
realpath rm rmdir rsync sed setsid sh sha1sum sha256sum sha512sum
sleep sort stat tar tail tee touch tr true uname uniq unix2dos unzip
wc which whoami xargs xxd xz zip zipinfo
The following are already in the tree and will be used directly:
awk bc bzip2 jar java javap m4 make python python2.7 python3 xz
Subtracting what's already in toybox (including the following toybox toys
that are still in pending: diff expr gzip lsof tr), that leaves:
bash fuser git hexdump openssl pstree rsync sh unzip zip zipinfo
For AOSP, zip/zipinfo/unzip are likely to be libziparchive based.
git/openssl seem like they should just be brought in to the tree. rsync is
used to work around a Mac cp -Rf
bug with broken symbolic links.
That leaves:
bash fuser hexdump pstree
(Why are fuser and pstree used during the AOSP build? They're used for
diagnostics if something goes wrong. So it's really just bash and hexdump
that are actually used to build.)
A side effect of the Linux Foundation following the money to the
exclusion of all else is they "support" their donors' myriad often
contradictory pet projects with elaborate announcements and press releases.
Long ago when Nokia's Maemo merged
with Intel's Moblin to form MeeGo, there were believable statements
about unifying fragmented vendor efforts. Then MeeGo merged with
LiMo to
form Tizen,
which became a Samsung-only project (that still ships
inside televisions,
but was otherwise subsumed into Android GO).
Along the way, the Tizen project expressed a desire to eliminate GPLv3 software
from its core system, and an interest in installing toybox as
part of this process.
They had a fairly long list of new commands they wanted to see in toybox:
arch base64 users unexpand shred join csplit
hostid nproc runcon sha224sum sha256sum sha384sum sha512sum sha3sum mkfs.vfat fsck.vfat
dosfslabel uname pinky diff3 sdiff zcmp zdiff zegrep zfgrep zless zmore
In addition, they wanted to use several commands then in pending:
tar diff printf wget rsync fdisk vi less tr test stty fold expr dd
Also, Tizen uses a different Linux Security Module called SMACK, so
many of the SELinux options ala ls -Z needed Smack alternatives in an
if/else setup. We added lib/lsm.h to abstract this, but haven't heard
from Tizen in years and have started implementing SELinux support without
Smack support in places like tar.c. At some point, lib/lsm.h may go away
due to lack of expressed interest.
Another project the Linux Foundation is paid to appreciate is Yocto,
which was designed to fix the ongoing proprietary fragmentation problem
(now in Linux build systems instead of vendor unix forks) by being the
build system equivalent of a glue trap. While proclaiming that having the
"minimum level of standardization" contributes to a "strong ecosystem",
Yocto uses a "layered"
design where everybody who touches it is encouraged to add more and more layers
of metadata on top of what came before, until they wind up using repo just to manage
the layers (let alone their contents). But -- and this is the
important bit -- all these disparate forks are called "yocto" and built on
top of giant piles of code the Linux Foundation can take credit for
since they filed the serial numbers off OpenEmbedded. (And THEN users
are encouraged to check the result into their own repository as one
big initial commit, discarding all layers and history.)
Yocto's "core-image-minimal" target (only 3,106 build steps in the 3.3
release, which includes building host versions of gnome packages and
something called
the "uninative binary shim") builds a busybox-based system with the following commands:
addgroup adduser ascii sh awk base32 basename blkid bunzip2 bzcat bzip2 cat
chattr chgrp chmod chown chroot chvt clear cmp cp cpio crc32 cut date dc dd
deallocvt delgroup deluser depmod df diff dirname dmesg dnsdomainname du
dumpkmap dumpleases echo egrep env expr false fbset fdisk fgrep find flock
free fsck fstrim fuser getopt getty grep groups gunzip gzip head hexdump
hostname hwclock id ifconfig ifdown ifup insmod ip kill killall klogd less
ln loadfont loadkmap logger logname logread losetup ls lsmod lzcat md5sum
mesg microcom mkdir mkfifo mknod mkswap mktemp modprobe more mount mountpoint
mv nc netstat nohup nproc nslookup od openvt patch pgrep pidof pivot_root
printf ps pwd rdate readlink realpath reboot renice reset resize rev rfkill
rm rmdir rmmod route run-parts sed seq setconsole setsid sh sha1sum sha256sum
shuf sleep sort start-stop-daemon stat strings stty sulogin swapoff swapon
switch_root sync sysctl syslogd tail tar tee telnet test tftp time top touch
tr true ts tty udhcpc udhcpd umount uname uniq unlink unzip uptime users
usleep vi watch wc wget which who whoami xargs xzcat yes zcat
Nobody seems entirely sure why.
Another standard taken over by the Linux Foundation. (At least the
links to this one didn't go 404 the
instant they took it over.) Of historical interest due to what it
managed to achieve before they chased away the hobbyists maintaining it.
Only one version (3.0 in 2015) has been released since the Linux Foundation
absorbed the FHS. The previous release, Version 2.3, was released in 2004.
The Linux Foundation did not retain earlier versions. The contents of
the relevant sections appear identical between the two versions: in the
11 years between releases the Linux Foundation just added section numbers.
The /bin options include csh but not bash, and ed but not vi.
The /sbin options have "update" which seems obsolete (filesystem
buffers haven't needed a userspace process to flush them for DECADES),
"fastboot" and "fasthalt" (reboot and halt have -nf), and
fsck.* and mkfs.* that don't actually specify any specific filesystems.
Removing that gives us:
klibc:
Long ago some kernel developers came up with a project called
klibc.
After a decade of development it still has no web page or HOWTO,
and nobody's quite sure if the license is BSD or GPL. It inexplicably
requires perl to build, and seems like an ideal candidate for
replacement.
In addition to a C library less general-purpose than old versions of bionic
(let alone musl), klibc builds a random assortment of executables to run init scripts
with. There's no multiplexer command, these are individual executables:
cat chroot cpio dd dmesg false fixdep fstype gunzip gzip halt ipconfig kill
kinit ln losetup ls minips mkdir mkfifo mknodes
mksyntax mount mv nfsmount nuke pivot_root poweroff readlink reboot resume
run-init sh sha1hash sleep sync true umount uname zcat
To get that list, build klibc according to the instructions (I
looked at version
2.0.2 and did cd klibc-*; ln -s /output/of/kernel/make/headers_install
linux; make) then echo $(for i in $(find . -type f); do file $i | grep -q
executable && basename $i; done | grep -v '[.]g$' | sort -u) to find
executables, then eliminate the *.so files and *.shared duplicates.
Some of those binaries are build-time tools that don't get installed,
which removes mknodes, mksyntax, sha1hash, and fixdep from the list.
(And sha1hash is just an unpolished sha1sum anyway.)
The run-init command is more commonly called switch_root, nuke is just
"rm -rf -- $@", and minips is more commonly called "ps": I'm not doing aliases
for these oddball names.
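Scripts written against the klibc names could still be run with shell functions standing in for them. This is only a sketch of the equivalences stated above. It assumes the klibc versions take no options of their own worth preserving, and it quotes "$@" so names containing spaces survive.

```shell
# nuke: delete everything named on the command line, recursively and
# without prompting. The "--" stops a name like "-rf" from being read
# as an option.
nuke() {
  rm -rf -- "$@"
}

# minips: the more common name for the same job is ps.
minips() {
  ps "$@"
}

# Usage: create a scratch directory tree, then remove it.
scratch=$(mktemp -d)
mkdir -p "$scratch/a/b"
nuke "$scratch"
[ -e "$scratch" ] || echo "gone"
```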
The "kinit" command is another gratuitous rename, it's init running as PID 1.
The halt, poweroff, and reboot commands work with it.
Yet more stale forks of dash and gzip got sucked in here (see "dubious
license terms" above).
In theory "blkid" or "file" handle fstype (and df for mounted filesystems),
but we could do fstype. We should also implement nfsmount, and probably smbmount
and p9mount even though this hasn't got one. (The reason these aren't
in the base "mount" command is they interactively query login credentials.)
The ipconfig command here has a built in dhcp client, so it's ifconfig
and dhcpcd and maybe some other stuff.
The resume command is... weird. It finds a swap partition and reads data
from it into a /proc file, something the kernel is capable of doing itself.
(Even though the klibc author
attempted
to remove that capability from the kernel, current kernel/power/hibernate.c
still parses "resume=" on the command line). And yet various distros seem to
make use of klibc for this.
Given the history of swsusp/hibernate (and
TuxOnIce
and kexec jump...) I've lost track
of the current state of the art here. Ah, Documentation/power/userland-swsusp.txt
has the API docs, and here's a better
tool...
This gives us a klibc command list:
cat chroot dmesg false kill ln losetup ls mkdir mkfifo readlink rm switch_root
sleep sync true uname
cpio dd ps mv pivot_root
mount nfsmount fstype umount
sh gunzip gzip zcat
kinit halt poweroff reboot
ipconfig
resume
uClinux
Long ago a hardware developer named Jeff Dionne put together a
nommu Linux distribution, which involved rewriting a lot of command line
utilities that relied on features
unavailable on nommu hardware.
In 2003 Jeff moved to Japan and handed
the project off to people who allowed it to roll to a stop. The website
turned into a mess of 404 links, the navigation indexes stopped being
updated over a decade ago, and the project's CVS repository suffered a
hard drive failure for which there were no backups. The project continued
to put out "releases" through 2014 (you have to scroll down in the "news"
section to find them, the "HTTP download" section in the nav bar on the
left hasn't been updated in over a decade), which were hand-updated tarball
snapshots mostly consisting of software from the 1990's. For example the
2014 release still contained ipfwadm, the package which predated ipchains,
which predated iptables, which is in the process of being replaced by
nftables.
Nevertheless, people still try to use this because the project was viewed
as the place to discuss, develop, and learn about nommu Linux.
The role of uclinux.org as an educational resource kept people coming
to it long after it had collapsed as a Linux distro.
Starting around 0.6.0 toybox began to address nommu support with the goal
of putting uClinux out of its misery.
An analysis of uClinux-dist-20140504 found 312 package
subdirectories under "user".
Taking out the trash
A bunch of packages (inotify-tools, input-event-demon, ipsec-tools, netifd,
keepalived, mobile-broadband-provider-info, nuttp, readline, snort,
snort-barnyard, socat, sqlite, sysklogd, sysstat, tcl, ubus, uci, udev,
unionfs, uqmi, usb_modeswitch, usbutils, util-linux)
are hard to evaluate because
uclinux has directories for them, but their source isn't actually in the
uclinux tree. In some of these the makefiles download a git repo during
the build, so I'm assuming you can build the external package if you really
care. (Even when I know what these packages do, I'm skipping them
because uclinux doesn't actually contain them, and any given snapshot
of the build system will bitrot as external web links change over time.)
Other packages are orphaned, meaning they're not mentioned from any Kconfig
or Makefiles outside of their directory, so uclinux can't actually build
them: mbus is an orphaned i2c test program expecting to run in some sort
of hardwired hardware context, mkeccbin is an orphaned "ECC annotated
binary file" generator (meaning it's half of a flash writer),
wsc_upnp is a "Ralink WPS" driver (some sort of stale wifi chip)...
The majority of the remaining packages are probably not of interest to
toybox due to being so obsolete or special purpose they may not actually be
of interest to anybody anymore. (This list also includes a lot of
special-purpose network back-end stuff that's hard for anybody but
datacenter admins to evaluate the current relevance of.)
arj asterisk boottools bpalogin br2684ctl camserv can4linux cgi_generic
cgihtml clamav clamsmtp conntrack-tools cramfs crypto-tools cxxtest
ddns3-client de2ts-cal debug demo diald discard dnsmasq dnsmasq2
ethattach expat-examples ez-ipupdate fakeidentd
fconfig ferret flatfs flthdr freeradius freeswan frob-led frox fswcert
game gettyd gnugk haserl horch
hostap hping httptunnel ifattach ipchains
ipfwadm ipmasqadm ipportfw ipredir ipset iso_client
jamvm jffs-tools jpegview jquery-ui kendin-config kismet klaxon kmod
l2tpd lcd ledcmd ledcon lha lilo lirc lissa load loattach
lpr lrpstat lrzsz mail mbus mgetty microwin ModemManager msntp musicbox
nooom null openswan openvpn palmbot pam_* pcmcia-cs playrt plugdaemon pop3proxy
potrace qspitest quagga radauth
ramimage readprofile rdate readprofile routed rrdtool rtc-ds1302
sendip ser sethdlc setmac setserial sgutool sigs siproxd slattach
smtpclient snmpd net-snmp snortrules speedtouch squashfs scep sslwrap stp
stunnel tcpblast tcpdump tcpwrappers threaddemos tinylogin tinyproxy
tpt tripwire unrar unzoo version vpnled w3cam xl2tpd zebra
This stuff is all over the place: arj, lha, rar, and zoo are DOS archivers,
ethattach describes itself as just "a network tool",
mail is a textmode smtp mailer literally described as "Some kind of mail
proggy" in uclinux's kconfig (as opposed to clamsmtp and smtpclient and
so on), this gettyd isn't a generic version but specifically a
hardwired ppp dialin utility, mgetty isn't a generic version but is combined
with "sendfax", hostap is an intersil prism driver, wlan-ng is also an
intersil prism driver, null is a program to intentionally dereference a
null pointer (in case you needed one), iso_client is a
"Demo Application for the USB Device Driver", kendin-config is
"for configuring the Micrel Kendin KS8995M over QSPI", speedtouch configures
a specific brand of adsl modem, portmap is part of Anfs,
ferret, linux-igd, and miniupnp are all upnp packages,
lanbypass "can be used to control the LAN
bypass switches on the Advantech x86 based hardware platforms", lcd is
"test of lcddma device driver" (an out-of-tree Coldfire driver apparently
lost to history, the uclinux linux-2.4.x directory has a config symbol for
it, but nothing in the code actually _uses_ it...), qspitest is another
coldfire thing, mii-tool-fec is
"strictly for the FEC Ethernet driver as implemented (and modified) for
the uCdimm5272", rtc-ds1302 and rtc-m41t11 are usermode drivers for specific
clock chips, stunnel is basically "openssl s_client -quiet -connect",
potrace is a bitmap to vector graphic converter, radauth performs command line
authentication against a radius server,
clamav, klaxon, ferret, l7-protocols, and nessus are very old network security
software (it's got a stale snapshot of nmap too), xl2tpd is a PPP over UDP
tunnel (rfc 2661), zebra is the package quagga replaced,
lilo is the x86-only bootloader that predated grub (and whose development was
recently discontinued), lissa is a "framebuffer graphics demo" from
1998, the squashfs package here is the out of tree patches for 2.4 kernels
and such before the filesystem was merged upstream (as opposed to the
squashfs-new package which is a snapshot of the userspace tool from 2011),
load is basically "dd file /dev/spi", version is basically "cat /proc/version",
microwin is a port of the WinCE graphics API to Linux, scep is a 2003
implementation of an IETF draft abandoned in 2010, tpt depends on
Andrew Morton's 15 year old unmerged "timepegs" kernel patch using the pentium
cycle counter, vpnled controls a light that reboots systems (what?),
w3cam is a video4linux 1.0 client (v4l2 showed up during 2.5 and support for
the old v4l1 was removed in 2.6.38 back in 2011), busybox ate tinylogin
over a decade ago, lrpstat is a java network monitor
from 2001, lrzsz is xmodem/ymodem/zmodem, msntp and stp implement rfc2030
meaning they overflow in 2036 (the package was last updated in 2000), rdate
is rfc 868 meaning it also overflows in 2036 (which is why ntp was invented
a few decades back), reiserfsprogs development stopped abruptly after
Hans Reiser was convicted of murdering his wife Nina (denying it on the
stand and then leading them to the body as part of his plea bargain during
sentencing)...
Seriously, there's a lot of crap in there. It's hard to analyze most
of it far enough to prove it _doesn't_ do anything.
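The 2036 date in the rdate and msntp/stp remarks above falls out of simple
arithmetic: both rfc 868 and rfc 2030 count seconds from 1900-01-01 00:00 UTC,
and a 32-bit unsigned seconds field runs out after 2^32 of them. A quick
sketch (the name rollover is made up for this illustration):

```python
from datetime import datetime, timedelta, timezone

# rfc 868 and rfc 2030 both count seconds from this epoch.
EPOCH_1900 = datetime(1900, 1, 1, tzinfo=timezone.utc)


def rollover(bits=32):
    """Return the first instant an unsigned seconds counter of this width cannot represent."""
    return EPOCH_1900 + timedelta(seconds=2 ** bits)


if __name__ == "__main__":
    # For the default 32-bit field this lands in February 2036.
    print(rollover().isoformat())
```

The same helper shows why a wider field pushes the problem out of reach:
rollover(64) does not fit in a Python datetime at all.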
Non-toybox programs
The following software may actually still do something intelligible
(although the package versions tend to be years out of date), but
it's not a direction toybox has chosen to go in.
There are several programming languages (bash, lua, jamvm, tinytcl,
perl, python) in there. Maybe someone somewhere wants a 2008 release of a
java virtual machine tested to work on nommu systems (jamvm), but it's out
of scope for toybox.
A bunch of benchmark programs: cpu, dhrystone, mathtest, nbench, netperf,
netpipe, and whetstone.
A bunch of web servers: appWeb, boa, fnord (via tcpserver), goahead, httpd,
mini_httpd, and thttpd.
A bunch of shells: msh is a clever (I.E. obfuscated) little shell,
nwsh is "new shell" (that's what it called itself in 1999 anyway),
sash is another shell with a bunch of builtins (ls, ps, df, cp, date, reboot,
and shutdown, this roadmap analyzes it elsewhere),
sh is a very old minix shell fork, and tcsh is also a shell.
Also in this category, we have:
dropbear jffs-tools jpegview kexec-tools bind ctorrent
iperf iproute2 ip-sentinel iptables kexec
nmap oggplay openssl oprofile p7zip pppd pptp play vplay
hdparm mp3play at clock
mtd-utils mysql logrotate brcfg bridge-utils flashw
ebtables etherwake ethtool expect gdb gdbserver hostapd
lm_sensors load netflash netstat-nat
radvd recover rootloader resolveip rp-pppoe
rsyslog rsyslogd samba smbmount squashfs-new squid ssh strace tip
uboot-envtools ulogd usbhubctrl vconfig vixie-cron watchdogd
wireless_tools wpa_supplicant
An awful lot of those are borderline: play and vplay are wav file
audio players, there's oprofile _and_ readprofile (which just reads kernel
profiling data from /proc/profile),
radvd is a "router advertisement daemon" (ipv6 stateless autoconf),
ctorrent is a bittorrent client,
lm_sensors is hardware (heat?) monitoring,
resolveip is dig only less so,
rp-pppoe is ppp over ethernet,
ebtables is an ethernet version of iptables (for bridging),
their dropbear is from 2012, and that ssh version is from 2011
(which means it's about nine months too _old_ to have the heartbleed bug).
There's both ulogd and ulogd2 (no idea why), and pppd is version 2.4 but
there's a ppd-2.3 directory also. We used to be interested in ftpd/proftpd
as a way of uploading files out of a vm, but support for that has waned
over the years and there are lots of alternatives.
Lots of flash stuff:
flashw is a flash writer, load is an spi flash loader, netflash writes
to flash via tftp,
recover is also a reflash daemon intended to come up when the system can't boot,
rootloader seems to be another reflash daemon but without dhcp.
Already in roadmap
The following packages contain commands already in the toybox roadmap:
agetty cal cksum cron dhcpcd dhcpcd-new dhcpd dhcp-isc dosfstools e2fsprogs
elvis-tiny levee fdisk fileutils ftp grep hd hwclock inetd init ntp
iputils login module-init-tools netcat shutils ntpdate lspci ping procps
rsync shadow shutils stty sysutils telnet telnetd tftp tftpd traceroute
unzip wget mawk net-tools
There are some duplicates in there, levee is a tiny vi implementation
like elvis-tiny, ntp and ntpdate overlap, etc.
Verdict: We don't really need to do a whole lot special for nommu
systems, just get the existing toybox roadmap working on nommu and
we're good. The uClinux project can rest in peace.
Requests:
The following additional commands have been requested (and often submitted)
by various users. I _really_ need to clean up this section.
Also:
dig freeramdisk getty halt hexdump hwclock klogd modprobe ping ping6 pivot_root
poweroff readahead rev sfdisk sudo syslogd taskset telnet telnetd tracepath
traceroute unzip usleep vconfig zip free login modinfo unshare netcat help w
iwconfig iwlist rdate
dos2unix unix2dos clear
pmap realpath setsid timeout truncate
mkswap swapon swapoff
count oneit fstype
acpi blkid eject pwdx
sulogin rfkill bootchartd
arp makedevs sysctl killall5 crond crontab deluser last mkpasswd watch
blockdev rpm2cpio arping brctl dumpleases fsck
tcpsvd tftpd
factor fallocate fsfreeze inotifyd lspci nbd-client partprobe strings
base32 base64 mix
reset hexedit nsenter shred
fsync insmod ionice lsmod lsusb rmmod vmstat xxd top iotop
lsof ionice compress dhcp dhcpd addgroup delgroup host iconv ip
ipcrm ipcs netstat openvt
deallocvt iorenice
udpsvd adduser
microcom tunctl chrt getfattr setfattr
kexec
ascii crc32 devmem fmt i2cdetect i2cdump i2cget i2cset i2ctransfer mcookie prlimit sntp ulimit uuidgen dhcp6 ipaddr iplink iproute iprule iptunnel cd exit toysh bash traceroute6
blkdiscard rtcwake
watchdog
pwgen readelf unicode
rsync
linux32 hd strace
gpiodetect gpiofind gpioget gpioinfo gpioset httpd uclampset
nbd-server
Other packages
System administrators have asked what other Linux packages toybox commands
replace, so they can annotate alternatives in their package management system.
This section uses the package definitions from Chapter 6 of
Linux From Scratch 9.0. Each package lists what we currently
replace, pending commands [in square brackets], and what we DON'T plan to
implement.
Each "see also" note means the listed package also installs the listed shared
libraries. (While toybox contains equivalent functionality to a lot of these
shared libraries in its lib/ directory, it does not currently provide a shared
library interface.)
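For anyone annotating a package manager from these lists, the notation is
regular enough to split mechanically. The following is a rough sketch, not a
tool shipped with toybox: parse_entry and the dictionary keys are invented
here, and it assumes each entry has already been joined onto a single line.

```python
import re


def parse_entry(line):
    """Split one '- package: ...' entry into replaced, pending, and excluded commands."""
    name, _, rest = line.lstrip("- ").partition(":")

    # The "(not: ...)" or "(but not ...)" group lists what is NOT planned.
    excluded = []
    match = re.search(r"\((?:but )?not:?\s*([^)]*)\)", rest)
    if match:
        # Anything after "see also" names shared libraries, not commands.
        body = match.group(1).split("see also")[0]
        excluded = re.findall(r"[\w.+-]+", body)
        rest = rest[:match.start()] + rest[match.end():]

    # Drop any other parenthetical, such as "(see also: libmagic)".
    rest = re.sub(r"\([^)]*\)", " ", rest)

    # [name] marks a pending command; a lone "[" (the test command) is left alone.
    pending = re.findall(r"\[([^\[\]\s]+)\]", rest)
    rest = re.sub(r"\[[^\[\]\s]+\]", " ", rest)

    return {
        "package": name.strip(),
        "replaced": rest.split(),
        "pending": pending,
        "excluded": excluded,
    }


if __name__ == "__main__":
    print(parse_entry("- findutils: find xargs (not: locate updatedb)"))
```

Entries that wrap across several lines (coreutils, util-linux, shadow) would
need their continuation lines joined before being passed in.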
Packages toybox plans to provide complete-ish replacements for:
- file: file (see also: libmagic)
- m4: [m4]
- bc: [bc] [dc]
- bison: [yacc] (not: bison, see also: liby)
- flex: [lex] (not: flex flex++, see also: libfl)
- make: [make]
- sed: sed
- grep: grep egrep fgrep
- bash: bash sh (not: bashbug)
- diffutils: cmp [diff] [diff3] [sdiff]
- gawk: [awk] (not: gawk gawk-5.0.1)
- findutils: find xargs (not: locate updatedb)
- less: less (not: lessecho lesskey)
- gzip: zcat [gzip] [gunzip] [zcmp] [zdiff] [zegrep] [zfgrep] [zgrep] [zless] [zmore]
(not: gzexe uncompress zforce znew)
- patch: patch
- tar: tar
- procps-ng: free pgrep pidof pkill ps sysctl top uptime vmstat w watch
[pmap] [pwdx] [slabtop]
(not: tload, see also libprocps)
- sysklogd: [klogd] [syslogd]
- sysvinit: [init] halt poweroff reboot killall5 [shutdown]
(not telinit runlevel fstab-decode bootlogd)
- man: man (but not accessdb apropos catman lexgrog mandb manpath whatis,
see also libman libmandb)
- vim: vi xxd (but not ex, rview, rvim, view, vim, vimdiff, vimtutor)
- kmod: insmod lsmod rmmod modinfo [modprobe]
(not: depmod kmod)
- attr: [getfattr] setfattr (not: attr, see also: libattr)
- shadow: [chfn] [chpasswd] [chsh] [groupadd] [groupdel] [groupmod]
[newusers] passwd [su] [useradd] [userdel] [usermod]
[lastlog] [login] [newgidmap] [newuidmap]
(not: chage expiry faillog groupmems grpck logoutd newgrp nologin pwck sg
vigr vipw, grpconv grpunconv pwconv pwunconv, chgpasswd gpasswd)
- psmisc: killall [fuser] [pstree] [peekfd] [prtstat]
(not: pslog pstree.x11)
- inetutils: dnsdomainname [ftp] hostname ifconfig ping ping6 [telnet] [tftp] [traceroute] (not: talk)
- coreutils: [ base32 base64 basename cat chgrp chmod chown chroot cksum comm cp cut date
dd df dirname du echo env expand factor false fmt fold groups head hostid id install
link ln logname ls md5sum mkdir mkfifo mknod mktemp mv nice nl nohup nproc od
paste printenv printf pwd readlink realpath rm rmdir seq sha1sum shred
sleep sort split stat sync tac tail tee test timeout touch true truncate
tty uname uniq unlink wc who whoami yes
[expr] [fold] [join] [numfmt] [runcon] [sha224sum] [sha256sum] [sha384sum]
[sha512sum] [stty] [b2sum] [tr] [unexpand]
(not: basenc chcon csplit dir dircolors pathchk
pinky pr ptx shuf stdbuf sum tsort users vdir, see also libstdbuf)
- util-linux: blkid blockdev cal chrt dmesg eject fallocate flock hwclock
ionice kill logger losetup mcookie mkswap more mount mountpoint nsenter
pivot_root prlimit rename renice rev setsid swapoff swapon switch_root taskset
umount unshare uuidgen
[addpart] [fdisk] [findfs] [findmnt] [fsck] [fsfreeze] [fstrim] [getopt]
[hexdump] [linux32] [linux64] [lsblk] [lscpu] [lsns] [setarch]
(not: agetty blkdiscard blkzone cfdisk chcpu chmem choom col
colcrt colrm column ctrlaltdel delpart fdformat fincore fsck.cramfs
fsck.minix ipcmk ipcrm ipcs isosize last lastb ldattach look lsipc
lslocks lslogins lsmem mesg mkfs mkfs.bfs mkfs.cramfs mkfs.minix namei partx
raw readprofile resizepart rfkill rtcwake script scriptreplay
setterm sfdisk sulogin swaplabel ul
uname26 utmpdump uuidd uuidparse wall wdctl whereis wipefs
i386 x86_64 zramctl)
Commentary: toybox init doesn't do runlevels, man and vim are just the
relevant commands without the piles of strange overgrowth, and if you want
to call a toybox binary by another name you can create a symlink to a
symlink. If somebody really wants to argue for "gzexe" or similar, be
my guest, but there's a lot of obsolete crap in shadow, coreutils,
util-linux...
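To illustrate the symlink remark: a multicall binary picks its command from
the name it was invoked under, so an unfamiliar name can be mapped onto a
familiar one by walking the symlink chain until a link with a known name turns
up. The sketch below is an assumption about how such a lookup could work, not
toybox's actual code; resolve_command, known, and the view -> vi -> toybox
chain are all hypothetical.

```python
import os
import tempfile


def resolve_command(path, known, max_hops=8):
    """Follow a symlink chain until some link's basename is a known command name."""
    for _ in range(max_hops):
        name = os.path.basename(path)
        if name in known:
            return name
        if not os.path.islink(path):
            break
        # Relative link targets are relative to the link's own directory.
        path = os.path.join(os.path.dirname(path), os.readlink(path))
    return None


def demo():
    """Build view -> vi -> toybox in a scratch directory and resolve 'view'."""
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "toybox"), "w").close()
        os.symlink("toybox", os.path.join(d, "vi"))
        os.symlink("vi", os.path.join(d, "view"))
        return resolve_command(os.path.join(d, "view"), known={"vi"})


if __name__ == "__main__":
    print(demo())
```

The hop limit is there so a symlink loop ends the search instead of spinning
forever.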
No idea why LFS is installing inetutils instead of net-tools
(which contains arp route ifconfig mii-tool nameif netstat and rarp that
toybox does or might implement, and plipconfig slattach that it probably won't.)
Packages toybox plans to provide partial replacements for:
Toybox provides replacements for some binaries from these packages,
but there are other useful binaries which this package provides that toybox
currently considers out of scope for the project:
- binutils: strings [ar] [nm] [readelf] [size] [objcopy] [strip]
(not c++filt, dwp, elfedit, gprof. The following commands belong
in qcc: addr2line as ld objdump ranlib)
- bzip2: bunzip2 bzcat [bzcmp] [bzdiff] [bzegrep] [bzfgrep] [bzgrep] [bzless]
[bzmore] (not: bzip2, bzip2recover, see also libbz2)
- xz: [xzcat] [lzcat] [lzcmp] [lzdiff] [lzegrep] [lzfgrep] [lzgrep]
[lzless] [lzmadec] [lzmainfo] [lzmore] [unlzma] [unxz]
[xzcmp] [xzdec] [xzdiff] [xzegrep] [xzfgrep] [xzgrep] [xzless] [xzmore]
(not: compression side, see also: liblzma)
- ncurses: clear reset (not: everything else, see also: libcurses)
- e2fsprogs: chattr lsattr [e2fsck] [mkfs.ext2] [mkfs.ext3]
[fsck.ext2] [fsck.ext3] [e2label] [resize2fs] [tune2fs]
(not badblocks compile_et debugfs dumpe2fs e2freefrag e2image
e2mmpstatus e2scrub e2scrub_all e2undo e4crypt e4defrag filefrag
fsck.ext4 logsave mk_cmds mkfs.ext4 mklost+found)
Toybox provides several decompressors but compresses to a single format
(deflate, a la gzip/zlib). Our e2fsprogs doesn't currently plan to support
ext4 or defrag. The "qcc" reference is to an external project under
consideration as a successor to toybox: glue QEMU's Tiny Code Generator
to Fabrice Bellard's old Tiny C Compiler, making a multicall binary that does
cc/ld/as for all the targets QEMU supports (then use the LLVM C Backend
to compile LLVM itself to C, as a modern replacement for cfront to bootstrap
C++ code). Until then things like objdump -d
(requiring target-specific disassembly for an unbounded number of architectures)
are out of scope for toybox. (This means drawing the line somewhere between
architecture-specific support in file and strace, and including a full
assembler for each architecture.)
Packages from LFS ch6 toybox does NOT plan to replace:
linux-api-headers man-pages glibc zlib readline gmp mpfr mpc gcc pkg-config
ncurses acl libcap psmisc iana-etc libtool gdbm gperf expat perl XML::Parser
intltool autoconf automake gettext libelf libffi openssl python ninja meson
check groff grub libpipeline texinfo
That said, we do implement our own zlib and readline replacements, and
presumably _could_ export them as library bindings. Plus we provide
our own version of a bunch of the section 1 man pages (as command help).
Possibly libcap and acl are interesting?
Misc
The kbd package has over a dozen commands, but we only implement chvt. The
iproute2 package implements over a dozen commands; there's an "ip" in
pending but I'm not a fan (ifconfig and route and such should be extended
to work properly). We don't implement eudev, but toybox's maintainer
created busybox mdev way back when (which replaces it) and plans to do a
new one for toybox as soon as we work out what subset is still needed now that
devtmpfs is available.