Commit Message

Introduce in-kernel headers and other build artifacts, made available as
an archive through proc (/proc/kheaders.tar.xz). This archive makes it
possible to build kernel modules and to run eBPF and other tracing
programs that need to extend the kernel, without depending on the file
system having headers and build artifacts.

On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Raw kernel headers
also cannot be copied into the filesystem like they can be on other
distros, due to licensing and other issues. There's no linux-headers
package on Android. Further, once a different kernel is booted, any
headers stored on the file system will no longer be useful. By storing
the headers as a compressed archive within the kernel, we can avoid these
issues, which have been a hindrance for a long time.

The feature can also be built as a module in case the user does not want
it to be part of the kernel image. This makes it possible to load and
unload the headers on demand. A tracing program or a kernel module
builder can load the module, do its operations, and then unload the
module to save kernel memory. The total memory needed is 3.8MB.

The code to read the headers is based on the /proc/config.gz code and
uses the same technique to embed the headers.

To build a module, the below steps have been tested on an x86 machine:

  modprobe kheaders
  rm -rf $HOME/headers
  mkdir -p $HOME/headers
  tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
  cd my-kernel-module
  make -C $HOME/headers M=$(pwd) modules
  rmmod kheaders
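
The extraction step above could be wrapped in a small helper that fails early when the archive is not exposed. This is only a sketch: extract_kheaders is a hypothetical name that is not part of the patch, and the defaults simply mirror the paths used in the steps above.

```shell
#!/bin/sh
# Sketch only: extract_kheaders is a hypothetical helper, not part of the patch.
# Arguments: [archive path] [destination directory]
extract_kheaders() {
    archive=${1:-/proc/kheaders.tar.xz}
    dest=${2:-$HOME/headers}

    # Fail with a clear message if the archive is not readable,
    # for example because the kheaders module has not been loaded.
    if [ ! -r "$archive" ]; then
        echo "error: $archive is not readable (is kheaders loaded?)" >&2
        return 1
    fi

    # Start from an empty directory so stale headers from another kernel
    # cannot leak into the build.
    rm -rf "$dest" && mkdir -p "$dest" && tar -xf "$archive" -C "$dest"
}

# Possible use when kheaders is built as a module (needs root):
#   modprobe kheaders && extract_kheaders; rmmod kheaders
```

Starting from an empty destination matters because headers left over from a previously booted kernel would silently produce a module that does not match the running one.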

Additional notes:

(1) External modules must be built on the same arch as the host that
built vmlinux. This can be done either in a qemu-emulated chroot on the
target, or natively. This is due to the host arch dependency of kernel
scripts.

(2) A limitation of building modules this way: Module.symvers is not
available in the archive, due to a cyclic dependency with building the
archive into the kernel or module binaries, so modules built using the
archive will not contain symbol versioning (modversions). This is
usually not an issue since the idea of this patch is to build a kernel
module on the fly and load it into the same kernel. The kernel already
prints an appropriate warning to alert the user when a module built
using the archive lacks modversions. For building with modversions, the
user can use traditional header packages. For our tracing use cases, we
build modules on the fly with this, so it is not a concern.

(3) I have left the IKHD_ST and IKHD_ED markers as is to facilitate
future patches that would extract the headers from a kernel or module
image.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
Changes since v3:
- A blank tar was being generated because of a one-line change I
forgot to push. It is updated now.
- Added module.lds since arm64 needs it to build modules.
Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- Cleanups to kheaders.c pointed out by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.
Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet
Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
working.
Other notes:
By the way I still see this error (without the patch) when doing a clean
build: Makefile:594: include/config/auto.conf: No such file or directory
It appears to be because of commit 0a16d2e8cb7e ("kbuild: use 'include'
directive to load auto.conf from top Makefile")
Documentation/dontdiff | 1 +
init/Kconfig | 11 ++++++
kernel/.gitignore | 3 ++
kernel/Makefile | 37 +++++++++++++++++++
kernel/kheaders.c | 72 ++++++++++++++++++++++++++++++++++++
scripts/gen_ikh_data.sh | 78 +++++++++++++++++++++++++++++++++++++++
scripts/strip-comments.pl | 8 ++++
7 files changed, 210 insertions(+)
create mode 100644 kernel/kheaders.c
create mode 100755 scripts/gen_ikh_data.sh
create mode 100755 scripts/strip-comments.pl

On 3/1/19 5:08 PM, Joel Fernandes (Google) wrote:
> [...]
Tested on x86 with eBPF scripts and exporting an alternative
BCC_KERNEL_SOURCE.
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
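
A setup like the one described above might look as follows. This is a sketch under assumptions: HEADERS_DIR is a hypothetical variable introduced here, and the archive is assumed to have been extracted into that directory beforehand.

```shell
#!/bin/sh
# Assumption: /proc/kheaders.tar.xz was already extracted into this directory.
# HEADERS_DIR is a hypothetical override, not something BCC or the patch defines.
headers_dir=${HEADERS_DIR:-$HOME/headers}

# BCC consults BCC_KERNEL_SOURCE when locating kernel sources to compile
# eBPF programs against, so point it at the extracted tree.
export BCC_KERNEL_SOURCE="$headers_dir"
echo "BCC will look for kernel sources in $BCC_KERNEL_SOURCE"
```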

On Mon, Mar 04, 2019 at 11:48:52PM +0100, Dietmar Eggemann wrote:
> On 3/1/19 5:08 PM, Joel Fernandes (Google) wrote:
> > [...]
>
> Tested on x86 with eBPF scripts and exporting an alternative
> BCC_KERNEL_SOURCE.
>
> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

Thank you! Will repost with the tags soon.
- Joel

On Mon, Mar 04, 2019 at 02:00:59PM +0000, Qais Yousef wrote:
> On 03/01/19 11:08, Joel Fernandes (Google) wrote:
> > [...]
>
> I could get the headers using this patch in both built-in and modules options.
>
> You can add my tested-and-reviewed-by: Qais Yousef <qais.yousef@arm.com>
>
> I am not familiar with running kselftests so didn't get a chance to try the
> next patch.

Thanks a lot for testing! Will repost with the tags soon.
- Joel

On Mon, Mar 4, 2019 at 1:15 AM Joel Fernandes <joelaf@google.com> wrote:
>
> This report is for an older version of the patch so ignore it. The
> issue is already resolved.
>
> On Sat, Mar 2, 2019 at 2:00 PM kbuild test robot <lkp@intel.com> wrote:
> >
> > Hi Joel,
> >
> > Thank you for the patch! Yet something to improve:
> >
> > [auto build test ERROR on linus/master]
> > [also build test ERROR on v5.0-rc8]
> > [cannot apply to next-20190301]
> > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> >
> > url: https://github.com/0day-ci/linux/commits/Joel-Fernandes-Google/Provide-in-kernel-headers-for-making-it-easy-to-extend-the-kernel/20190303-014850
> > config: sh-allmodconfig (attached as .config)
> > compiler: sh4-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
> > reproduce:
> >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> >         chmod +x ~/bin/make.cross
> >         # save the attached .config to linux build tree
> >         GCC_VERSION=8.2.0 make.cross ARCH=sh
> >
> > All errors (new ones prefixed by >>):
> >
> > >> find: 'arch/sh/kernel/module.lds': No such file or directory
> > >> find: 'arch/sh/kernel/module.lds': No such file or directory
> >
> > ---
> > 0-DAY kernel test infrastructure                Open Source Technology Center
> > https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Honestly speaking, I am worried about some flaws
in this feature, but anyway I read your code.
Here are just comments from the build system point of view
in case something might be useful.
Please feel free to adopt some of them if you think they are good.
(I do not think you need to re-post a patch just for
adding Reviewed-by/Tested-by tags, but anyway looks like you are
planning to post a new version.)
[1]
Please do not ignore kbuild test robot.
It never makes such a mistake like
replying to a new patch about the test result from old version.
The reports are really addressing this version (v4).
I see the error message even on x86.
[2]
The shell script keeps running
even when an error occurs.
If any line in a shell script fails,
probably it went already wrong.
I highly recommend to add 'set -e'
at the very beginning of your shell script.
It will propagate the error to Make.
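
To illustrate the point, here is a small self-contained sketch. The generated script is a hypothetical stand-in for a build helper (it is not gen_ikh_data.sh); its second command fails, and 'set -e' makes the script stop there and return a non-zero status that a caller such as Make would see.

```shell
#!/bin/sh
# Hypothetical stand-in for a generator script whose second step fails.
script=$(mktemp)
cat > "$script" <<'EOF'
set -e              # abort at the first command that fails
echo "step 1"
false               # simulated failure, e.g. a missing tool
echo "step 2"       # never reached because of set -e
EOF

# Run it the way a build system would: as a child process whose
# exit status decides whether the build continues.
status=0
output=$(sh "$script") || status=$?
rm -f "$script"

echo "output: $output"
echo "exit status: $status"
```

Without the 'set -e' line the same script would print both steps and exit with status 0, hiding the failure from the caller.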
[3]

  targets += kheaders_data.tar.xz
  targets += kheaders_data.h
  targets += kheaders.md5

These three lines are unneeded because
'targets' is necessary just for if_changed or if_changed_rule.

Instead, please add

  clean-files := kheaders_data.tar.xz kheaders_data.h kheaders.md5

[4]

It is pointless to pass 'ikh_file_list'
from Makefile to the shell script.

You can directly describe the following in gen_ikh_data.sh

You do not need to use sed.

  src_file_list="
  include/
  arch/$SRCARCH/Makefile
  arch/$SRCARCH/include/
  arch/$SRCARCH/kernel/module.lds
  scripts/
  Makefile
  "

  obj_file_list="
  scripts/
  include/
  arch/$SRCARCH/include/
  "

  if grep -q "^CONFIG_STACK_VALIDATION=y" $KCONFIG_CONFIG; then
          obj_file_list="$obj_file_list tools/objtool/objtool"
  fi

Be careful about module.lds
so that it will work for all architectures.
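
One way to handle module.lds across architectures is to add it to the list only when the file exists. The following is a sketch, not the fix that was eventually applied: build_src_file_list is a hypothetical helper, and it assumes it runs from the top of the source tree.

```shell
#!/bin/sh
# Sketch only: build_src_file_list is a hypothetical helper name.
# Assumes the current directory is the top of the kernel source tree.
build_src_file_list() {
    srcarch=$1
    list="include/ arch/$srcarch/Makefile arch/$srcarch/include/ scripts/ Makefile"

    # Not every architecture ships kernel/module.lds (sh does not, per the
    # kbuild robot report), so only list it when it is actually there.
    if [ -f "arch/$srcarch/kernel/module.lds" ]; then
        list="$list arch/$srcarch/kernel/module.lds"
    fi

    echo "$list"
}
```

This keeps find from complaining about a missing path while still shipping the linker script on architectures such as arm64 that need it to build modules.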
[5]
strip-comments.pl is short enough,
and I do not assume any other user.
IMHO, it would be cleaner to embed the one-liner perl
into the shell script, for example, like follows:

  find $cpio_dir -type f -print0 |
      xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'

[6]
It might be better to move
scripts/gen_ikh_data.sh to kernel/gen_ikh_data.sh
It will make this feature self-contained in kernel/.
And (more importantly to me), it would reduce my maintenance field.
[7]
I do not understand for what 'tarfile_md5' is used.
Is it necessary?
[8]
Is it more efficient to pipe the header files
to perl script like this?
cat (header) | perl 'do something' > (tmp directory)
Actually, I am not sure if it is more efficient than
stripping comments in-line. I could be wrong...
[9]
I'd prefer avoiding 'pushd && popd' if possible.
Hint: Kbuild already runs at the top directory
of objtree.
It is OK if the change would mess up the script.
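
For reference, one common way to avoid 'pushd && popd' is to confine the directory change to a subshell. This is a generic sketch, not taken from gen_ikh_data.sh; list_headers is a hypothetical helper used only to show the pattern.

```shell
#!/bin/sh
# Sketch only: list_headers is a hypothetical helper showing the pattern.
list_headers() {
    # The parentheses start a subshell, so the 'cd' ends with it and the
    # caller's working directory is untouched. This also works in plain
    # POSIX sh, whereas pushd and popd are bash builtins.
    ( cd "$1" && find . -name '*.h' | sort )
}
```

Tools that accept a directory argument, such as 'tar -C', are another way to get the same effect without changing directory at all.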
[10]
You can embed a binary directly into C file
without producing a giant header file.
I refactored kernel/configs.c
https://lore.kernel.org/patchwork/patch/1042013/
Be careful; my patch has not been merged yet into the mainline.
It has been a while in linux-next,
and I have not received any problem report.
So, I am guessing it will probably be merged
in the current MW.
That's all from me.

Hi Masahiro,
Thanks for review, my replies are inline:
On Wed, Mar 06, 2019 at 09:26:14PM +0900, Masahiro Yamada wrote:
> On Mon, Mar 4, 2019 at 1:15 AM Joel Fernandes <joelaf@google.com> wrote:
> > [...]
>
> Honestly speaking, I am worried about some flaws
> in this feature, but anyway I read your code.

I am not sure what you mean by that :-(. All the comments on the previous
series were addressed, and it has been well tested by me and others. This
looks quite solid to me. Sorry it took so many revisions; this was not that
easy to get right.

> Here are just comments from the build system point of view
> in case something might be useful.
>
> Please feel free to adopt some of them if you think they are good.
>
> [1]
>
> Please do not ignore kbuild test robot.
>
> It never makes such a mistake like
> replying to a new patch about the test result from old version.
>
> The reports are really addressing this version (v4).
>
> I see the error message even on x86.

I did not mean to. You are right that I missed this, and I will correct it
in v5. By the way, the error is just find being noisy; it does not cause
any issues to the feature/functionality at all.

> [2]
>
> The shell script keeps running
> even when an error occurs.
>
> If any line in a shell script fails,
> probably it went already wrong.
>
> I highly recommend to add 'set -e'
> at the very beginning of your shell script.
>
> It will propagate the error to Make.

Ok I will consider making it -e. But please note that, the script itself does
not have any bugs. When you say it this way, it looks a bit bad to the
onlooker, as if there are bugs in the code, but there aren't any that I
know of. All the comments in this round of review from you are just
cosmetic changes.

> [3]
>
> targets += kheaders_data.tar.xz
> targets += kheaders_data.h
> targets += kheaders.md5
>
> These three lines are unneeded because
> 'targets' is necessary just for if_changed or if_changed_rule.
>
> Instead, please add
>
> clean-files := kheaders_data.tar.xz kheaders_data.h kheaders.md5

Ok.
> [4]
>
> It is pointless to pass 'ikh_file_list'
> from Makefile to the shell script.
>
> You can directly describe the following in gen_ikh_data.sh
>
> You do not need to use sed.
>
> src_file_list="
> include/
> arch/$SRCARCH/Makefile
> arch/$SRCARCH/include/
> arch/$SRCARCH/kernel/module.lds
> scripts/
> Makefile
> "
>
> obj_file_list="
> scripts/
> include/
> arch/$SRCARCH/include/
> "

Honestly, I prefer to keep it in Makefile.
> if grep -q "^CONFIG_STACK_VALIDATION=y" $KCONFIG_CONFIG; then
>         obj_file_list="$obj_file_list tools/objtool/objtool"
> fi
>
> Be careful about module.lds
> so that it will work for all architectures.

Yes.
> [5]
> strip-comments.pl is short enough,
> and I do not assume any other user.
>
> IMHO, it would be cleaner to embed the one-liner perl
> into the shell script, for example, like follows:
>
> find $cpio_dir -type f -print0 |
>     xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'

This does not work. I already tried it. But I guess I could generate the
script on the fly.
> [6]
> It might be better to move
> scripts/gen_ikh_data.sh to kernel/gen_ikh_data.sh
>
> It will make this feature self-contained in kernel/.
> And (more importantly to me), it would reduce my maintenance field.

Sure.
> [7]
> I do not understand for what 'tarfile_md5' is used.
> Is it necessary?

It is just a precaution, in case the tar got corrupted and needs to be regenerated.
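
A checksum guard of this kind could be sketched as below. The actual logic in gen_ikh_data.sh is not shown in this thread, so the helper name tarball_is_current and the format of the stored checksum file (a bare md5 hex digest) are assumptions.

```shell
#!/bin/sh
# Sketch only: tarball_is_current is a hypothetical helper. It assumes the
# checksum file holds just the md5 hex digest recorded at generation time.
tarball_is_current() {
    tarball=$1
    md5_file=$2

    # A missing tarball or checksum file means it must be regenerated.
    [ -f "$tarball" ] && [ -f "$md5_file" ] || return 1

    # Recompute the digest and compare it with the stored one; a mismatch
    # indicates the tarball was corrupted or changed behind our back.
    current=$(md5sum "$tarball" | cut -d' ' -f1)
    [ "$current" = "$(cat "$md5_file")" ]
}

# Possible use:
#   tarball_is_current kernel/kheaders_data.tar.xz kernel/kheaders.md5 || regenerate
```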
> [8]
> Is it more efficient to pipe the header files
> to perl script like this?
>
> cat (header) | perl 'do something' > (tmp directory)

I don't think so.
> [9]
>
> I'd prefer avoiding 'pushd && popd' if possible.
>
> Hint: Kbuild already runs at the top directory
> of objtree.
>
> It is OK if the change would mess up the script.

This is needed to make out of tree builds work.
> [10]
>
> You can embed a binary directly into C file
> without producing a giant header file.
>
> I refactored kernel/configs.c
> https://lore.kernel.org/patchwork/patch/1042013/
>
> Be careful; my patch has not been merged yet into the mainline.
>
> It has been a while in linux-next,
> and I have not received any problem report.
>
> So, I am guessing it will probably be merged
> in the current MW.

Ok I will consider doing this.
Thanks for the review!
- Joel

On Thu, Mar 7, 2019 at 5:13 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> [...]
> > [2]
> >
> > The shell script keeps running
> > even when an error occurs.
> >
> > If any line in a shell script fails,
> > probably it went already wrong.
> >
> > I highly recommend to add 'set -e'
> > at the very beginning of your shell script.
> >
> > It will propagate the error to Make.
>
> Ok I will consider making it -e. But please note that, the script itself does
> not have any bugs.

Heh.
I do not know why you are so confident with your code, though.
OK, let's say this script has no bug.
But, the script may fail for a different reason.
For example, what if cpio is not installed on the user's machine?
In such a situation, this script will produce an empty
kernel/kheaders_data.tar.xz, but the build system
will continue running, and succeed.
Don't you think it would be better to let the build fail immediately
than to let the user debug a wrongly created image?
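
A missing tool can also be caught explicitly before any work starts. The following sketch uses a hypothetical require_tools helper; the tool names in the usage comment are taken from this thread, not from a definitive list of the script's dependencies.

```shell
#!/bin/sh
# Sketch only: require_tools is a hypothetical helper name.
# Returns non-zero and names the first tool that is not on PATH.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "error: required tool '$tool' not found" >&2
            return 1
        fi
    done
}

# Possible use at the top of a generator script:
#   require_tools cpio tar xz perl || exit 1
```

Combined with 'set -e', this turns a silently empty archive into an immediate build failure with a message that says what is missing.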

> [...]
> > [5]
> > strip-comments.pl is short enough,
> > and I do not assume any other user.
> >
> > IMHO, it would be cleaner to embed the one-liner perl
> > into the shell script, for example, like follows:
> >
> > find $cpio_dir -type f -print0 |
> >     xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
>
> This does not work. I already tried it. But I guess I could generate the
> script on the fly.

I tested my code, and it worked for me.
perl -e <command_line>
is useful to run short code directly.
I do not know why you say it does not work.
Masahiro

Hi Joel,
On Fri, Mar 1, 2019 at 5:10 PM Joel Fernandes (Google)
<joel@joelfernandes.org> wrote:
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
>
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
>
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.
>
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.
>
> To build a module, the below steps have been tested on an x86 machine:
> modprobe kheaders
> rm -rf $HOME/headers
> mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) modules
> rmmod kheaders
As the usage pattern will be accessing the individual files, what about
implementing a file system that provides read-only access to the internal
kheaders archive?
mount kheaders $HOME/headers -t kheaders
Gr{oetje,eeting}s,
Geert

On Thu, Mar 07, 2019 at 09:58:24AM +0100, Geert Uytterhoeven wrote:
> Hi Joel,
>
> On Fri, Mar 1, 2019 at 5:10 PM Joel Fernandes (Google)
> <joel@joelfernandes.org> wrote:
> > Introduce in-kernel headers and other artifacts which are made available
> > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > it possible to build kernel modules, run eBPF programs, and other
> > tracing programs that need to extend the kernel for tracing purposes
> > without any dependency on the file system having headers and build
> > artifacts.
> >
> > On Android and embedded systems, it is common to switch kernels but not
> > have kernel headers available on the file system. Raw kernel headers
> > also cannot be copied into the filesystem like they can be on other
> > distros, due to licensing and other issues. There's no linux-headers
> > package on Android. Further once a different kernel is booted, any
> > headers stored on the file system will no longer be useful. By storing
> > the headers as a compressed archive within the kernel, we can avoid these
> > issues that have been a hindrance for a long time.
> >
> > The feature is also buildable as a module just in case the user desires
> > it not being part of the kernel image. This makes it possible to load
> > and unload the headers on demand. A tracing program, or a kernel module
> > builder can load the module, do its operations, and then unload the
> > module to save kernel memory. The total memory needed is 3.8MB.
> >
> > The code to read the headers is based on /proc/config.gz code and uses
> > the same technique to embed the headers.
> >
> > To build a module, the below steps have been tested on an x86 machine:
> > modprobe kheaders
> > rm -rf $HOME/headers
> > mkdir -p $HOME/headers
> > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > cd my-kernel-module
> > make -C $HOME/headers M=$(pwd) modules
> > rmmod kheaders
>
> As the usage pattern will be accessing the individual files, what about
> implementing a file system that provides read-only access to the internal
> kheaders archive?
>
> mount kheaders $HOME/headers -t kheaders
I thought about it already. This is easier said than done though. The archive
is compressed from 40MB to 3.6MB. If we leave it uncompressed in RAM, then it
will take up the entire 40MB of RAM and in Android we don't even use
disk-based swap.
So we will need some kind of intra file compressed memory representation that
a filesystem can use for the backing store. I thought of RAM-backed squashfs
but it requires squashfs-tools to be installed at build time (which my host
distro itself didn't have).
It is just so much easier to use tar + xz at build time, and leave the
decompression task to the user. After decompression, the files will live on
the disk and the page-cache mechanism will free memory when/if the files fall
off the LRUs.
WDYT?
thanks,
- Joel

On Thu, Mar 07, 2019 at 10:03:43AM -0500, Joel Fernandes wrote:
> On Thu, Mar 07, 2019 at 09:58:24AM +0100, Geert Uytterhoeven wrote:
> > Hi Joel,
> >
> > On Fri, Mar 1, 2019 at 5:10 PM Joel Fernandes (Google)
> > <joel@joelfernandes.org> wrote:
> > > Introduce in-kernel headers and other artifacts which are made available
> > > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > > it possible to build kernel modules, run eBPF programs, and other
> > > tracing programs that need to extend the kernel for tracing purposes
> > > without any dependency on the file system having headers and build
> > > artifacts.
> > >
> > > On Android and embedded systems, it is common to switch kernels but not
> > > have kernel headers available on the file system. Raw kernel headers
> > > also cannot be copied into the filesystem like they can be on other
> > > distros, due to licensing and other issues. There's no linux-headers
> > > package on Android. Further once a different kernel is booted, any
> > > headers stored on the file system will no longer be useful. By storing
> > > the headers as a compressed archive within the kernel, we can avoid these
> > > issues that have been a hindrance for a long time.
> > >
> > > The feature is also buildable as a module just in case the user desires
> > > it not being part of the kernel image. This makes it possible to load
> > > and unload the headers on demand. A tracing program, or a kernel module
> > > builder can load the module, do its operations, and then unload the
> > > module to save kernel memory. The total memory needed is 3.8MB.
> > >
> > > The code to read the headers is based on /proc/config.gz code and uses
> > > the same technique to embed the headers.
> > >
> > > To build a module, the below steps have been tested on an x86 machine:
> > > modprobe kheaders
> > > rm -rf $HOME/headers
> > > mkdir -p $HOME/headers
> > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > cd my-kernel-module
> > > make -C $HOME/headers M=$(pwd) modules
> > > rmmod kheaders
> >
> > As the usage pattern will be accessing the individual files, what about
> > implementing a file system that provides read-only access to the internal
> > kheaders archive?
> >
> > mount kheaders $HOME/headers -t kheaders
>
> I thought about it already. This is easier said than done though. The archive
> is compressed from 40MB to 3.6MB. If we leave it uncompressed in RAM, then it
> will take up the entire 40MB of RAM and in Android we don't even use
> disk-based swap.
>
> So we will need some kind of intra file compressed memory representation that
> a filesystem can use for the backing store. I thought of RAM-backed squashfs
> but it requires squashfs-tools to be installed at build time (which my host
> distro itself didn't have).
>
> It is just so much easier to use tar + xz at build time, and leave the
> decompression task to the user. After decompression, the files will live on
> the disk and the page-cache mechanism will free memory when/if the files fall
> off the LRUs.
>
> WDYT?
I think the compressed tarball is much simpler/easier overall. If
someone really wants the filesystem, they just uncompress it into a
tmpfs mount. It's much less moving kernel code to worry about.
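The tmpfs approach mentioned here could look roughly like the sketch below. It is an illustration under assumptions, not part of the patch: the helper name extract_headers and the /tmp/kheaders mount point are made up, and it relies on a tar that detects xz compression on its own when reading, as the tar -xvf call in the patch description does. Other tar implementations may need an explicit decompression flag.

```shell
#!/bin/sh
# Hypothetical helper: unpack a kheaders-style archive into a directory.
#   $1 = archive path (on a target this would be /proc/kheaders.tar.xz)
#   $2 = destination directory (assumed location, created if missing)
extract_headers() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # No compression flag is given, mirroring the tar call in the patch
    # description; this assumes tar recognises the format by itself.
    tar -xf "$archive" -C "$dest"
}

# Typical use on a target. Mounting needs root, so these lines are
# left commented out in this sketch:
#   mkdir -p /tmp/kheaders
#   mount -t tmpfs tmpfs /tmp/kheaders
#   extract_headers /proc/kheaders.tar.xz /tmp/kheaders
#   make -C /tmp/kheaders M=$(pwd) modules
#   umount /tmp/kheaders
```

Unmounting the tmpfs afterwards releases the unpacked copy, which is the user-space counterpart of unloading the kheaders module once the build is done.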
thanks,
greg k-h

On Thu, Mar 07, 2019 at 04:23:03PM +0100, Greg KH wrote:
> > > On Fri, Mar 1, 2019 at 5:10 PM Joel Fernandes (Google)
> > > <joel@joelfernandes.org> wrote:
> > > > Introduce in-kernel headers and other artifacts which are made available
> > > > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > > > it possible to build kernel modules, run eBPF programs, and other
> > > > tracing programs that need to extend the kernel for tracing purposes
> > > > without any dependency on the file system having headers and build
> > > > artifacts.
> > > >
> > > > On Android and embedded systems, it is common to switch kernels but not
> > > > have kernel headers available on the file system. Raw kernel headers
> > > > also cannot be copied into the filesystem like they can be on other
> > > > distros, due to licensing and other issues. There's no linux-headers
> > > > package on Android. Further once a different kernel is booted, any
> > > > headers stored on the file system will no longer be useful. By storing
> > > > the headers as a compressed archive within the kernel, we can avoid these
> > > > issues that have been a hindrance for a long time.
> > > >
> > > > The feature is also buildable as a module just in case the user desires
> > > > it not being part of the kernel image. This makes it possible to load
> > > > and unload the headers on demand. A tracing program, or a kernel module
> > > > builder can load the module, do its operations, and then unload the
> > > > module to save kernel memory. The total memory needed is 3.8MB.
> > > >
> > > > The code to read the headers is based on /proc/config.gz code and uses
> > > > the same technique to embed the headers.
> > > >
> > > > To build a module, the below steps have been tested on an x86 machine:
> > > > modprobe kheaders
> > > > rm -rf $HOME/headers
> > > > mkdir -p $HOME/headers
> > > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > > cd my-kernel-module
> > > > make -C $HOME/headers M=$(pwd) modules
> > > > rmmod kheaders
> > >
> > > As the usage pattern will be accessing the individual files, what about
> > > implementing a file system that provides read-only access to the internal
> > > kheaders archive?
> > >
> > > mount kheaders $HOME/headers -t kheaders
> >
> > I thought about it already. This is easier said than done though. The archive
> > is compressed from 40MB to 3.6MB. If we leave it uncompressed in RAM, then it
> > will take up the entire 40MB of RAM and in Android we don't even use
> > disk-based swap.
> >
> > So we will need some kind of intra file compressed memory representation that
> > a filesystem can use for the backing store. I thought of RAM-backed squashfs
> > but it requires squashfs-tools to be installed at build time (which my host
> > distro itself didn't have).
> >
> > It is just so much easier to use tar + xz at build time, and leave the
> > decompression task to the user. After decompression, the files will live on
> > the disk and the page-cache mechanism will free memory when/if the files fall
> > off the LRUs.
> >
> > WDYT?
>
> I think the compressed tarball is much simpler/easier overall. If
> someone really wants the filesystem, they just uncompress it into a
> tmpfs mount. It's much less moving kernel code to worry about.
Agreed, I also feel the same.
thanks,
- Joel

Would you consider this more reliable than make headers_install? I am
also curious about DKMS applications, and whether this should be
preferred over the /lib/modules/kernver/build symlinks.
On Wed, Mar 6, 2019 at 6:48 AM Masahiro Yamada
<yamada.masahiro@socionext.com> wrote:
>
> On Mon, Mar 4, 2019 at 1:15 AM Joel Fernandes <joelaf@google.com> wrote:
> >
> > This report is for an older version of the patch so ignore it. The
> > issue is already resolved.
> >
> > On Sat, Mar 2, 2019 at 2:00 PM kbuild test robot <lkp@intel.com> wrote:
> > >
> > > Hi Joel,
> > >
> > > Thank you for the patch! Yet something to improve:
> > >
> > > [auto build test ERROR on linus/master]
> > > [also build test ERROR on v5.0-rc8]
> > > [cannot apply to next-20190301]
> > > [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> > >
> > > url: https://github.com/0day-ci/linux/commits/Joel-Fernandes-Google/Provide-in-kernel-headers-for-making-it-easy-to-extend-the-kernel/20190303-014850
> > > config: sh-allmodconfig (attached as .config)
> > > compiler: sh4-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
> > > reproduce:
> > > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > chmod +x ~/bin/make.cross
> > > # save the attached .config to linux build tree
> > > GCC_VERSION=8.2.0 make.cross ARCH=sh
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > >> find: 'arch/sh/kernel/module.lds': No such file or directory
> > > >> find: 'arch/sh/kernel/module.lds': No such file or directory
> > >
> > > ---
> > > 0-DAY kernel test infrastructure Open Source Technology Center
> > > https://lists.01.org/pipermail/kbuild-all Intel Corporation
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups "kernel-team" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
> Honestly speaking, I am worried about some flaws
> in this feature, but anyway I read your code.
>
> Here are just comments from the build system point of view
> in case something might be useful.
>
> Please feel free to adopt some of them if you think they are good.
>
> (I do not think you need to re-post a patch just for
> adding Reviewed-by/Tested-by tags, but anyway looks like you are
> planning to post a new version.)
>
> [1]
>
> Please do not ignore kbuild test robot.
>
> It never makes such a mistake like
> replying to a new patch about the test result from old version.
>
> The reports are really addressing this version (v4).
>
> I see the error message even on x86.
>
> [2]
>
> The shell script keeps running
> even when an error occurs.
>
> If any line in a shell script fails,
> probably it went already wrong.
>
> I highly recommend to add 'set -e'
> at the very beginning of your shell script.
>
> It will propagate the error to Make.
>
> [3]
>
> targets += kheaders_data.tar.xz
> targets += kheaders_data.h
> targets += kheaders.md5
>
> These three lines are unneeded because
> 'targets' is necessary just for if_changed or if_changed_rule.
>
> Instead, please add
>
> clean-files := kheaders_data.tar.xz kheaders_data.h kheaders.md5
>
> [4]
>
> It is pointless to pass 'ikh_file_list'
> from Makefile to the shell script.
>
> You can directly describe the following in gen_ikh_data.sh
>
> You do not need to use sed.
>
> src_file_list="
> include/
> arch/$SRCARCH/Makefile
> arch/$SRCARCH/include/
> arch/$SRCARCH/kernel/module.lds
> scripts/
> Makefile
> "
>
> obj_file_list="
> scripts/
> include/
> arch/$SRCARCH/include/
> "
>
> if grep -q "^CONFIG_STACK_VALIDATION=y" $KCONFIG_CONFIG; then
> obj_file_list="$obj_file_list tools/objtool/objtool"
> fi
>
> Be careful about module.lds
> so that it will work for all architectures.
>
> [5]
> strip-comments.pl is short enough,
> and I do not assume any other user.
>
> IMHO, it would be cleaner to embed the one-liner perl
> into the shell script, for example, like follows:
>
> find $cpio_dir -type f -print0 |
> xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
>
> [6]
> It might be better to move
> scripts/gen_ikh_data.sh to kernel/gen_ikh_data.sh
>
> It will make this feature self-contained in kernel/.
> And (more importantly to me), it would reduce my maintenance field.
>
> [7]
> I do not understand for what 'tarfile_md5' is used.
> Is it necessary?
>
> [8]
> Is it more efficient to pipe the header files
> to perl script like this?
>
> cat (header) | perl 'do something' > (tmp directory)
>
> Actually, I am not sure if it is more efficient than
> stripping comments in-line. I could be wrong...
>
> [9]
>
> I'd prefer avoiding 'pushd && popd' if possible.
>
> Hint: Kbuild already runs at the top directory
> of objtree.
>
> It is OK if the change would mess up the script.
>
> [10]
>
> You can embed a binary directly into C file
> without producing a giant header file.
>
> I refactored kernel/configs.c
> https://lore.kernel.org/patchwork/patch/1042013/
>
> Be careful; my patch has not been merged yet into the mainline.
>
> It has been a while in linux-next,
> and I have not received any problem report.
>
> So, I am guessing it will probably be merged
> in the current MW.
>
> That's all from me.
>
> --
> Best Regards
> Masahiro Yamada
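Point [2] of the quoted review recommends 'set -e' so that a failure inside the script reaches Make. The small sketch below illustrates that behaviour with two throwaway snippets; the helper name run_snippet is an assumption for this demo and nothing here is taken from gen_ikh_data.sh itself.

```shell
#!/bin/sh
# Hypothetical helper: run a tiny script in a child shell and print
# the exit status that a caller such as Make would see.
run_snippet() {
    if sh -c "$1" >/dev/null 2>&1; then
        echo 0
    else
        echo $?
    fi
}

# Without 'set -e' the failing 'false' is ignored. The script's status
# is that of its last command ('true'), so the caller sees success.
rc_plain=$(run_snippet 'false; true')

# With 'set -e' the shell stops at the first failing command and
# returns its non-zero status, which Make treats as a failed recipe.
rc_strict=$(run_snippet 'set -e; false; true')

echo "without set -e: exit status $rc_plain"
echo "with set -e:    exit status $rc_strict"
```

The first status is zero and the second is non-zero, which is the difference between a build that silently produces a broken archive and one that stops.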

Hi Joel,
On Thu, Mar 7, 2019 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> On Thu, Mar 07, 2019 at 09:58:24AM +0100, Geert Uytterhoeven wrote:
> > On Fri, Mar 1, 2019 at 5:10 PM Joel Fernandes (Google)
> > <joel@joelfernandes.org> wrote:
> > > Introduce in-kernel headers and other artifacts which are made available
> > > as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> > > it possible to build kernel modules, run eBPF programs, and other
> > > tracing programs that need to extend the kernel for tracing purposes
> > > without any dependency on the file system having headers and build
> > > artifacts.
> > >
> > > On Android and embedded systems, it is common to switch kernels but not
> > > have kernel headers available on the file system. Raw kernel headers
> > > also cannot be copied into the filesystem like they can be on other
> > > distros, due to licensing and other issues. There's no linux-headers
> > > package on Android. Further once a different kernel is booted, any
> > > headers stored on the file system will no longer be useful. By storing
> > > the headers as a compressed archive within the kernel, we can avoid these
> > > issues that have been a hindrance for a long time.
> > >
> > > The feature is also buildable as a module just in case the user desires
> > > it not being part of the kernel image. This makes it possible to load
> > > and unload the headers on demand. A tracing program, or a kernel module
> > > builder can load the module, do its operations, and then unload the
> > > module to save kernel memory. The total memory needed is 3.8MB.
> > >
> > > The code to read the headers is based on /proc/config.gz code and uses
> > > the same technique to embed the headers.
> > >
> > > To build a module, the below steps have been tested on an x86 machine:
> > > modprobe kheaders
> > > rm -rf $HOME/headers
> > > mkdir -p $HOME/headers
> > > tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> > > cd my-kernel-module
> > > make -C $HOME/headers M=$(pwd) modules
> > > rmmod kheaders
> >
> > As the usage pattern will be accessing the individual files, what about
> > implementing a file system that provides read-only access to the internal
> > kheaders archive?
> >
> > mount kheaders $HOME/headers -t kheaders
>
> I thought about it already. This is easier said than done though. The archive
> is compressed from 40MB to 3.6MB. If we leave it uncompressed in RAM, then it
> will take up the entire 40MB of RAM and in Android we don't even use
> disk-based swap.
Sure.
> So we will need some kind of intra file compressed memory representation that
> a filesystem can use for the backing store. I thought of RAM-backed squashfs
Right, I didn't think of squashfs. Having a kernel module that sets up an MTD
that can be mounted with squashfs is IMHO an even better (and more generic)
solution.
> but it requires squashfs-tools to be installed at build time (which my host
> distro itself didn't have).
Squashfs not being part of your host distro is not the hardest problem
to solve...
> It is just so much easier to use tar + xz at build time, and leave the
> decompression task to the user. After decompression, the files will live on
> the disk and the page-cache mechanism will free memory when/if the files fall
> off the LRUs.
I'm also considering how generic and extensible the solution is.
What if people need other build artifacts in the future (e.g. signing key to
load signed modules)?
Thanks!
Gr{oetje,eeting}s,
Geert

Hi Geert,
On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > It is just so much easier to use tar + xz at build time, and leave the
> > decompression task to the user. After decompression, the files will live on
> > the disk and the page-cache mechanism will free memory when/if the files fall
> > off the LRUs.
>
> I'm also considering how generic and extensible the solution is.
> What if people need other build artifacts in the future (e.g. signing key to
> load signed modules)?
That sounds like it could be useful. I don't see any reason off the
top why that would not be possible to add to the list of archived
files in the future. The patch allows populating the list of files
from Kbuild using ikh_file_list variable.
thanks,
- Joel

On 08.03.19 14:42, Joel Fernandes wrote:
Hi folks,
> That sounds like it could be useful. I don't see any reason off the
> top why that would not be possible to add to the list of archived
> files in the future. The patch allows populating the list of files
> from Kbuild using ikh_file_list variable.
It seems the whole thing is heading in a direction where a subtopic of
its own is coming up: a compressed built-in filesystem.
I haven't thought deeply yet about whether existing
filesystems like squashfs, initramfs, etc. are the right thing for that,
or whether something new should be invented, but a generic mechanism for
compiled-in compressed (ro) filesystems could also be interesting
for very small devices that perhaps don't even need any persistence.
Some requirements that come to mind:
1. it shall be possible to have any number of instances - possibly by
separate modules.
2. it shall be possible to use a bootloader/firmware provided image
(so it can serve as initrd)
3. data should stay compressed as long as possible, but uncompressed
data might be cached - do decompression on-demand
4. only needs to be ro (write access could be done via unionfs+friends)
5. it shall be swappable (if swap is enabled)
In that scenario, these in-kernel headers would be just one consumer;
I can imagine lots of others.
--mtx

On Fri, Mar 08, 2019 at 05:42:32AM -0800, Joel Fernandes wrote:
> HI Geert,
>
> On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > It is just so much easier to use tar + xz at build time, and leave the
> > > decompression task to the user. After decompression, the files will live on
> > > the disk and the page-cache mechanism will free memory when/if the files fall
> > > off the LRUs.
> >
> > I'm also considering how generic and extensible the solution is.
> > What if people need other build artifacts in the future (e.g. signing key to
> > load signed modules)?
>
> That sounds like it could be useful. I don't see any reason off the
> top why that would not be possible to add to the list of archived
> files in the future. The patch allows populating the list of files
> from Kbuild using ikh_file_list variable.
Um, no, you don't want the signing key in the kernel itself, as that
totally defeats the purpose of the signing key :)
greg k-h

On Fri, Mar 08, 2019 at 02:57:24PM +0100, Enrico Weigelt, metux IT consult wrote:
> On 08.03.19 14:42, Joel Fernandes wrote:
>
> Hi folks,
>
> > That sounds like it could be useful. I don't see any reason off the
> > top why that would not be possible to add to the list of archived
> > files in the future. The patch allows populating the list of files
> > from Kbuild using ikh_file_list variable.
>
> It seems the whole thing's going into a direction where a whole own
> subtopic is coming up: compressed built-in filesystem.
>
> Haven't had a deeper thought about it yet, whether or not existing
> filesystems like squashfs, initramfs, etc are the right thing for that,
> or something new should be invented, but a generic mechanism for
> compiled-in compressed (ro) filesystems could also be interesting
> for very small devices, that perhaps even dont need any persistence.
>
> Some requirements coming up in mind:
>
> 1. it shall be possible to have any number of instances - possibly by
> separate modules.
> 2. it shall be possible to use an bootloader/firmware provided image
> (so it can serve as initrd)
> 2. data should stay compressed as long as possible, but uncompressed
> data might be cached - do decompression on-demand
> 3. only needs to be ro (write access could be done via unionfs+friends)
> 4. it shall be swappable (if swap is enabled)
>
> In that scenario, these in-kernel headers would just one consumer,
> I can imagine lots of others.
I too want a pony :)
Until then, I think this feature should not be bikeshedded to death. It
fits a real need, is simple (modulo the Kbuild interactions), and is
optional for those that do not wish to waste the memory.
thanks,
greg k-h

On Fri, Mar 08, 2019 at 03:02:51PM +0100, Greg KH wrote:
> On Fri, Mar 08, 2019 at 05:42:32AM -0800, Joel Fernandes wrote:
> > HI Geert,
> >
> > On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > It is just so much easier to use tar + xz at build time, and leave the
> > > > decompression task to the user. After decompression, the files will live on
> > > > the disk and the page-cache mechanism will free memory when/if the files fall
> > > > off the LRUs.
> > >
> > > I'm also considering how generic and extensible the solution is.
> > > What if people need other build artifacts in the future (e.g. signing key to
> > > load signed modules)?
> >
> > That sounds like it could be useful. I don't see any reason off the
> > top why that would not be possible to add to the list of archived
> > files in the future. The patch allows populating the list of files
> > from Kbuild using ikh_file_list variable.
>
> Um, no, you don't want the signing key in the kernel itself, as that
> totally defeats the purpose of the signing key :)
*Bangs self to desk*. You are right, didn't think that one through ;)
By the way, my v5 is ready to send. But I am avoiding sending it due to the
merge window going on. I posted it yesterday in this thread as well:
https://lore.kernel.org/patchwork/comment/1241077/
This is likely to be the final version or so, unless Masahiro or you or other
reviewers have any more comments.
thanks,
- Joel

Hi Greg,
On Fri, Mar 8, 2019 at 6:05 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> On Fri, Mar 08, 2019 at 05:42:32AM -0800, Joel Fernandes wrote:
> > On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > It is just so much easier to use tar + xz at build time, and leave the
> > > > decompression task to the user. After decompression, the files will live on
> > > > the disk and the page-cache mechanism will free memory when/if the files fall
> > > > off the LRUs.
> > >
> > > I'm also considering how generic and extensible the solution is.
> > > What if people need other build artifacts in the future (e.g. signing key to
> > > load signed modules)?
> >
> > That sounds like it could be useful. I don't see any reason off the
> > top why that would not be possible to add to the list of archived
> > files in the future. The patch allows populating the list of files
> > from Kbuild using ikh_file_list variable.
>
> Um, no, you don't want the signing key in the kernel itself, as that
> totally defeats the purpose of the signing key :)
In a loadable module?
He who has the module, can build and sign more modules.
Gr{oetje,eeting}s,
Geert

On Fri, Mar 08, 2019 at 06:59:23PM +0100, Geert Uytterhoeven wrote:
> Hi Greg,
>
> On Fri, Mar 8, 2019 at 6:05 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Fri, Mar 08, 2019 at 05:42:32AM -0800, Joel Fernandes wrote:
> > > On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > It is just so much easier to use tar + xz at build time, and leave the
> > > > > decompression task to the user. After decompression, the files will live on
> > > > > the disk and the page-cache mechanism will free memory when/if the files fall
> > > > > off the LRUs.
> > > >
> > > > I'm also considering how generic and extensible the solution is.
> > > > What if people need other build artifacts in the future (e.g. signing key to
> > > > load signed modules)?
> > >
> > > That sounds like it could be useful. I don't see any reason off the
> > > top why that would not be possible to add to the list of archived
> > > files in the future. The patch allows populating the list of files
> > > from Kbuild using ikh_file_list variable.
> >
> > Um, no, you don't want the signing key in the kernel itself, as that
> > totally defeats the purpose of the signing key :)
>
> In a loadable module?
> He who has the module, can build and sign more modules.
Again, that's pretty foolish.
Signing keys should be kept secure, or better yet, just deleted entirely
after creating and signing with them. That's what I do for my kernels
and I'm pretty sure that some distros also do this. That way there's no
chance that someone else can sign a module and have it loaded without
detection, which is what signing is supposed to prevent from happening.
thanks,
greg k-h

Hi Greg,
On Sat, Mar 9, 2019 at 8:16 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> On Fri, Mar 08, 2019 at 06:59:23PM +0100, Geert Uytterhoeven wrote:
> > On Fri, Mar 8, 2019 at 6:05 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > On Fri, Mar 08, 2019 at 05:42:32AM -0800, Joel Fernandes wrote:
> > > > On Fri, Mar 8, 2019, 3:53 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > > > > > It is just so much easier to use tar + xz at build time, and leave the
> > > > > > decompression task to the user. After decompression, the files will live on
> > > > > > the disk and the page-cache mechanism will free memory when/if the files fall
> > > > > > off the LRUs.
> > > > >
> > > > > I'm also considering how generic and extensible the solution is.
> > > > > What if people need other build artifacts in the future (e.g. signing key to
> > > > > load signed modules)?
> > > >
> > > > That sounds like it could be useful. I don't see any reason off the
> > > > top why that would not be possible to add to the list of archived
> > > > files in the future. The patch allows populating the list of files
> > > > from Kbuild using ikh_file_list variable.
> > >
> > > Um, no, you don't want the signing key in the kernel itself, as that
> > > totally defeats the purpose of the signing key :)
> >
> > In a loadable module?
> > He who has the module, can build and sign more modules.
>
> Again, that's pretty foolish.
>
> Signing keys should be kept secure, or better yet, just deleted entirely
> after creating and signing with them. That's what I do for my kernels
> and I'm pretty sure that some distros also do this. That way there's no
> chance that someone else can sign a module and have it loaded without
> detection, which is what signing is supposed to prevent from happening.
If you want that kind of security, there's no point in allowing the
kernel to be extended by building more kernel modules after deployment.
The more I think about this (embedding a kernel headers archive in the
kernel or a kernel module), the more I feel this is a workaround for a distro
issue.
Files are distributed with the kernel image, e.g. loadable kernel modules,
so distributing more files is not a technical issue (and if it is, working
around that might be as simple as "mv kheaders.tar.xz kheaders.ko",
and letting module install take care of it ;-)
It makes sense to provide /proc/config.gz, as this is unique
configuration info. The kernel headers can easily be derived from this
config, and the kernel sources (which are GPL).
"Raw kernel headers also cannot be copied into the filesystem like they
can be on other distros, due to licensing and other issues. There's no
linux-headers package on Android."
What's the licensing issue? What's the (legal) difference between having
the headers on the file system, and having a kernel module including the
headers on the file system?
Thanks!
Gr{oetje,eeting}s,
Geert

On Sat, Mar 09, 2019 at 12:40:01PM +0100, Geert Uytterhoeven wrote:
> > Signing keys should be kept secure, or better yet, just deleted entirely
> > after creating and signing with them. That's what I do for my kernels
> > and I'm pretty sure that some distros also do this. That way there's no
> > chance that someone else can sign a module and have it loaded without
> > detection, which is what signing is supposed to prevent from happening.
>
> If you want that kind of security, there's no point in allowing to extend the
> kernel by building more kernel modules after deployment.
That's not what these files are for (in the original user's case). They
want these for doing tracing/ebpf stuff, which require kernel headers to
build against.
> "Raw kernel headers also cannot be copied into the filesystem like they
> can be on other distros, due to licensing and other issues. There's no
> linux-headers package on Android."
>
> What's the licensing issue? What's the (legal) difference between having
> the headers on the file system, and having a kernel module including the
> headers on the file system?
There is no licensing issue, see my follow-up comment about that.
It's all in ease-of-use here. You want to build a trace function
against a running kernel, and now you have the header files for that
specific kernel right there in the kernel itself to build against. It
doesn't get easier than that.
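For illustration only (this is not part of the patch): a minimal Python
sketch of the "headers right there" step, assuming the archive is exposed
at /proc/kheaders.tar.xz as the commit message describes. The helper name
extract_kheaders and the destination directory are hypothetical, and the
sketch has not been tested against a real /proc entry.

```python
import tarfile
from pathlib import Path


def extract_kheaders(archive, dest):
    """Unpack an xz-compressed tar archive into dest.

    Returns the list of member names found in the archive.
    """
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # "r:xz" asks tarfile to decompress with lzma while reading.
    with tarfile.open(archive, mode="r:xz") as tar:
        names = tar.getnames()
        tar.extractall(dest_path)
    return names
```

On a kernel with the feature enabled, the call would presumably look like
extract_kheaders("/proc/kheaders.tar.xz", "/tmp/kheaders"), after which a
tracing tool can point its include path at the destination directory.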
thanks,
greg k-h

On 3/9/19 7:11 AM, Greg KH wrote:
> There is no licensing issue, see my follow-up comment about that.
>
> It's all in ease-of-use here. You want to build a trace function
> against a running kernel, and now you have the header files for that
> specific kernel right there in the kernel itself to build against. It
> doesn't get easier than that.
Agreed.
It seems that opinions in this thread are split along two conceptions of
a "Linux system". On one side there's that of the conventional "Linux
distro" where the entity generating the distro has full control over the
entire process and the parts thereof. On the other side there's the
world that has evolved out of the multi-party ecosystem that Android
fostered. In the latter, there isn't a single party that controls all
aspects of the "distro". Instead, there are a multitude of parties
contributing to creating a fully-functional Linux-based system.
In the latter ecosystem, the kernel and the filesystems (*plural*) used
with it do not necessarily come from the same place, are not necessarily
maintained by the same parties, and are not even required to be in
lock-step. For all I know,
the filesystem images are coming from one party and the kernel is at one
point from one party and at another point can be substituted, possibly
for testing purposes, without userspace images ever changing. No
licensing issues at all involved, either with regards to the headers or
the distribution of the kernel itself, which, in as far as I've seen
when speaking of known industry players (those that matter here), are
always ultimately available in source from somewhere.
Instead the issue is that I want to be able to give user-space access to
the headers that were used to build the kernel that they're running
over, regardless of where this kernel came from. And since I have to
assume that those two parts (kernel vs. user-space) are coming from
separate parties and can be arbitrarily replaced, there needs to be some
form of "contract" between them. The kernel's ABI already provides quite
a bit of that and, as Joel pointed out, Google can enforce a few more
things, such as default kernel configs, to "make things work". But one
option that doesn't exist in the world Android evolves in is to
ensure that the filesystem images have kernel-specific
tool-required artifacts for every possible kernel they may run against.
That's just not possible.
That's especially true in the case of modern-day Android where the final
system is made of at least half a dozen filesystem images with each
image possibly coming from a different party. The /vendor partition
(with the hardware enablement), for example, could be coming from the
device vendor while the /system partition (with the Android framework),
for example, could be coming straight from Google in the form of
"Generic System Image" -- an image meant to run as-is on any
Treble-compliant system. /system might have the tools for eBPF, but
/vendor might have modules, while still the "boot" partition might have
the kernel. There's no reason I can't replace the boot partition
*without* changing /vendor. And there's no reason I can't replace
/vendor *without* replacing the boot partition. And all other
combinations with all other images.
That, in my view, is a big part of the problem Joel's patch solves: in a
system whose functionality requires multiple *independent* parties to
work together, I can still get the necessary kernel headers for
user-space tools to properly operate regardless of which part of the
system is being substituted or replaced.
Cheers,

Hi Karim,
Thanks for the explanation!
On Sat, Mar 9, 2019 at 5:52 PM Karim Yaghmour
<karim.yaghmour@opersys.com> wrote:
> On 3/9/19 7:11 AM, Greg KH wrote:
> > There is no licensing issue, see my follow-up comment about that.
> >
> > It's all in ease-of-use here. You want to build a trace function
> > against a running kernel, and now you have the header files for that
> > specific kernel right there in the kernel itself to build against. It
> > doesn't get easier than that.
>
> Agreed.
>
> It seems that opinions in this thread are split along two conceptions of
> a "Linux system". On one side there's that of the conventional "Linux
> distro" where the entity generating the distro has full control over the
> entire process and the parts thereof. On the other side there's the
> world that has evolved out of the multi-party ecosystem that Android
> fostered. In the later, there isn't a single party that controls all
> aspects of the "distro". Instead, there are a multitude of parties
> contributing to creating a fully-functional Linux-based system.
>
> In the latter ecosystem, the kernel and the filesystems (*plural*) used
> with it do not necessarily come from the same place, are maintained by
> the same parties or even required to be in lock-step. For all I know,
> the filesystem images are coming from one party and the kernel is at one
> point from one party and at another point can be substituted, possibly
> for testing purposes, without userspace images ever changing. No
> licensing issues at all involved, either with regards to the headers or
> the distribution of the kernel itself, which, in as far as I've seen
> when speaking of known industry players (those that matter here), are
> always ultimately available in source from somewhere.
>
> Instead the issue is that I want to be able to give user-space access to
> the headers that were used to build the kernel that they're running
> over, regardless of where this kernel came from. And since I have to
> assume that those two parts (kernel vs. user-space) are coming from
> separate parties and can be arbitrarily replaced, there needs to be some
> form of "contract" between them. The kernel's ABI already provides quite
> a bit of that and, as Joel pointed out, Google can enforce a few more
> things, such as default kernel configs, to "make things work". But one
> option that doesn't exist in the world Android evolves in is to somehow
> ensure that the filesystem images somehow have kernel-specific
> tool-required artifacts for every possible kernel they may run against.
> That's just not possible.
>
> That's especially true in the case of modern-day Android where the final
> system is made of at least half a dozen filesystem images with each
> image possibly coming from a different party. The /vendor partition
> (with the hardware enablement), for example, could be coming from the
> device vendor while the /system partition (with the Android framework),
> for example, could be coming straight from Google in the form of
> "Generic System Image" -- an image meant to run as-is on any
> Treble-compliant system. /system might have the tools for eBPF, but
> /vendor might have modules, while still the "boot" partition might have
> the kernel. There's no reason I can't replace the boot partition
> *without* changing /vendor. And there's no reason I can't replace
> /vendor *without* replacing the boot partition. And all other
> combinations with all other images.
So how does this work, with kernel images and kernel modules supplied
by separate parties, not "bound" by the same kernel headers/API, as they
can be replaced separately?
> That, in my view, is a big part of the problem Joel's patch solves: in a
> system whose functionality requires multiple *independent* parties to
> work together, I can still get the necessary kernel headers for
> user-space tools to properly operate regardless of which part of the
> system id being substituted or replaced.
Isn't the need for kernel headers for user-space tools something different,
as this is limited to the uapi versions, which are less (almost not) subject
to change, compared to the kernel headers needed for compiling kernel
modules?
Thanks!
Gr{oetje,eeting}s,
Geert

Hi Geert,
On 3/9/19 2:26 PM, Geert Uytterhoeven wrote:
> Thanks for the explanation!
Happy this was useful :)
> So how does this work, with kernel images and kernel modules supplied
> by separate parties, not "bound" by the same kernel headers/API, as they
> can be replaced separately?
The thing with Android is that there isn't a "one size fits all". Google
provides guidance, policies and at least one example implementation
through the Pixel lines.
Each vendor however is allowed a great degree of latitude with regards
to what they decide to do. Personally, if I was advising a team working
on an Android device where Joel's patch was available as part of their
kernel I would just recommend that they build it in -- i.e. not as a
module. Hence, there would be no module chasing game.
With regards to Google's guidelines for manufacturers, though, Google
states that CONFIG_MODVERSIONS needs to be enabled, see here:
https://source.android.com/devices/architecture/kernel/modular-kernels
FWIW, that doesn't mean that modules are actually used. Devices don't
necessarily have to be using modules.
> Isn't the need for kernel headers for user-space tools something different,
> as this is limited to the uapi versions, which are less (almost not) subject
> to change, compared to the kernel headers needed for compiling kernel
> modules?
Sorry, I should've been clearer. I'm including eBPF/BCC into the
"user-space tools" here. That was in fact my prime motivation in
encouraging Joel at the last LPC to look at this. I've been integrating
the teaching of eBPF into my AOSP debugging and performance analysis
class (see CC courseware here:
http://www.opersys.com/training/android-debug-and-performance), and it
was pretty messy trying to show people how to benefit from such tools
under Android. Joel's present set of patches would obviate this problem.
HTH,

Hi Karim,
On Sat, Mar 9, 2019 at 10:44 PM Karim Yaghmour
<karim.yaghmour@opersys.com> wrote:
> On 3/9/19 2:26 PM, Geert Uytterhoeven wrote:
> > So how does this work, with kernel images and kernel modules supplied
> > by separate parties, not "bound" by the same kernel headers/API, as they
> > can be replaced separately?
>
> The thing with Android is that there isn't a "one size fits all". Google
> provides guidance, policies and at least one example implementation
> through the Pixel lines.
>
> Each vendor however is allowed a great degree of latitude with regards
> to what they decide to do. Personally, if I was advising a team working
> on an Android device where Joel's patch was available as part of their
> kernel I would just recommend that they build it in -- i.e. not as a
> module. Hence, there would be no module chasing game.
OK, that makes sense, if kernel and modules are separate.
So kernel size increase due to including the headers does matter.
> With regards to Google's guidelines for manufacturers, though, Google
> states that CONFIG_MODVERSIONS needs to be enabled, see here:
> https://source.android.com/devices/architecture/kernel/modular-kernels
> FWIW, that doesn't mean that modules are actually used. Devices don't
> necessarily have to be using modules.
Personally, I had mixed experiences with CONFIG_MODVERSIONS.
But that was a looong time ago, before Android.
Things may have improved.
> > Isn't the need for kernel headers for user-space tools something different,
> > as this is limited to the uapi versions, which are less (almost not) subject
> > to change, compared to the kernel headers needed for compiling kernel
> > modules?
>
> Sorry, I should've been clearer. I'm including eBPF/BCC into the
> "user-space tools" here. That was in fact my prime motivation in
> encouraging Joel at the last LPC to look at this. I've been integrating
> the teaching of eBPF into my AOSP debugging and performance analysis
> class (see CC courseware here:
> http://www.opersys.com/training/android-debug-and-performance), and it
> was pretty messy trying to show people how to benefit from such tools
> under Android. Joel's present set of patches would obviate this problem.
OK.
Now about the actual solution: what is your opinion on embedding e.g.
a squashfs image in the kernel instead, which would be a more generic
solution, not adding more ABI to /proc?
Thanks!
Gr{oetje,eeting}s,
Geert

On Sat, 9 Mar 2019 16:44:31 -0500
Karim Yaghmour <karim.yaghmour@opersys.com> wrote:
> Sorry, I should've been clearer. I'm including eBPF/BCC into the
> "user-space tools" here. That was in fact my prime motivation in
> encouraging Joel at the last LPC to look at this. I've been integrating
> the teaching of eBPF into my AOSP debugging and performance analysis
> class (see CC courseware here:
> http://www.opersys.com/training/android-debug-and-performance), and it
> was pretty messy trying to show people how to benefit from such tools
> under Android. Joel's present set of patches would obviate this problem.
I've been reading this thread and staying out for the most part. But I
was thinking about how I could use kernel headers for things like
kprobes, and I want to mention the pony that I would like to have :-)
Are headers really needed, and more importantly, are they enough? What
if userspace is 32bit and the kernel is 64bit. Can ebpf scripts handle
that?
What I would love, is a table that has all structures and their fields
with information about their types, size and signed types. Like the
format fields in the events directory. This way ebpf (and kprobes,
internally in the kernel) could access this table and be able to know
what the data structures of the kernel are.
If you need it for modules, it shouldn't be too hard to take this
information and turn it into headers that building modules could use.
Again, this is my pony, but I'd like to think outside the box here, and
not be so focused on "headers" and see if there's another solution,
that may be even more powerful.
-- Steve

On Mon, Mar 11, 2019 at 4:37 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Sat, 9 Mar 2019 16:44:31 -0500
> Karim Yaghmour <karim.yaghmour@opersys.com> wrote:
>
> > Sorry, I should've been clearer. I'm including eBPF/BCC into the
> > "user-space tools" here. That was in fact my prime motivation in
> > encouraging Joel at the last LPC to look at this. I've been integrating
> > the teaching of eBPF into my AOSP debugging and performance analysis
> > class (see CC courseware here:
> > http://www.opersys.com/training/android-debug-and-performance), and it
> > was pretty messy trying to show people how to benefit from such tools
> > under Android. Joel's present set of patches would obviate this problem.
>
> I've been reading this thread and staying out for the most part. But I
> was thinking about how I could use kernel headers for things like
> kprobes, and I want to mention the pony that I would like to have :-)
>
> Are headers really needed, and more importantly, are they enough? What
> if userspace is 32bit and the kernel is 64bit. Can ebpf scripts handle
> that?
Why would eBPF care about the bit width of userspace? I've thought a
lot about inspecting user state from the kernel and have some weird
ideas in that area (e.g., why not register eBPF programs to unwind
user stacks?) --- but I think that Joel's work is independent of all
that. I'm curious what you think we can do about userspace. I wish we
could just compile the world with frame pointers, but we can't.
> What I would love, is a table that has all structures and their fields
> with information about their types, size and signed types. Like the
> format fields in the events directory. This way ebpf (and kprobes,
> internally in the kernel) could access this table and be able to know
> what the data structures of the kernel is).
Think of the headers as encoding this information and more and the C
compiler as a magical decoder ring. :-) I totally get the desire for a
metadata format a little less messy than C code, but as a practical
matter, a rich C-compilation pipeline already exists, and the
debuginfo you're proposing doesn't, except in the form of DWARF, which
I think would be even more controversial to embed in the kernel memory
image than headers are. A header blob provides a strict superset of
the information your scheme provides.

On Mon, Mar 11, 2019 at 04:58:28PM -0700, Daniel Colascione wrote:
> On Mon, Mar 11, 2019 at 4:37 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Sat, 9 Mar 2019 16:44:31 -0500
> > Karim Yaghmour <karim.yaghmour@opersys.com> wrote:
> >
> > > Sorry, I should've been clearer. I'm including eBPF/BCC into the
> > > "user-space tools" here. That was in fact my prime motivation in
> > > encouraging Joel at the last LPC to look at this. I've been integrating
> > > the teaching of eBPF into my AOSP debugging and performance analysis
> > > class (see CC courseware here:
> > > http://www.opersys.com/training/android-debug-and-performance), and it
> > > was pretty messy trying to show people how to benefit from such tools
> > > under Android. Joel's present set of patches would obviate this problem.
> >
> > I've been reading this thread and staying out for the most part. But I
> > was thinking about how I could use kernel headers for things like
> > kprobes, and I want to mention the pony that I would like to have :-)
> >
> > Are headers really needed, and more importantly, are they enough? What
> > if userspace is 32bit and the kernel is 64bit. Can ebpf scripts handle
> > that?
>
> Why would eBPF care about the bit width of userspace? I've thought a
> lot about inspecting user state from the kernel and have some weird
> ideas in that area (e.g., why not register eBPF programs to unwind
> user stacks?) --- but I think that Joel's work is independent of all
> that. I'm curious what you think we can do about userspace. I wish we
> could just compile the world with frame pointers, but we can't.
Yeah, most of the usecases for eBPF tracing are all instrumenting the kernel
using kprobes. There are some usecases that instrument userspace with
uprobes but those don't need headers.
> > What I would love, is a table that has all structures and their fields
> > with information about their types, size and signed types. Like the
> > format fields in the events directory. This way ebpf (and kprobes,
> > internally in the kernel) could access this table and be able to know
> > what the data structures of the kernel is).
>
> Think of the headers as encoding this information and more and the C
> compiler as a magical decoder ring. :-) I totally get the desire for a
> metadata format a little less messy than C code, but as a practical
> matter, a rich C-compilation pipeline already exists, and the
> debuginfo you're proposing doesn't, except in the form of DWARF, which
> I think would be even more controversial to embed in the kernel memory
> image than headers are. A header blob provides a strict superset of
> the information your scheme provides.
I think even though the kernel headers can't have information about all data
structures, they already contain a lot of the data structure definitions we
need. And anything else that is really needed for kernel extension by something
"external" to the kernel, such as kernel modules or eBPF, can/should arguably
be moved to include/, right?
In any case, a solution such as what Steve suggested still cannot do
what we can with headers, such as build kernel modules on the fly using the
C compiler without any auto-generation of C code from any debug artifacts.
Think systemtap working with the module-backend without any need for
linux-headers package on the file system. So such a solution would still be a
bit orthogonal in scope to what this proposed solution can solve IMO.
thanks,
- Joel

On Mon, 11 Mar 2019 16:58:28 -0700
Daniel Colascione <dancol@google.com> wrote:
> On Mon, Mar 11, 2019 at 4:37 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Sat, 9 Mar 2019 16:44:31 -0500
> > Karim Yaghmour <karim.yaghmour@opersys.com> wrote:
> >
> > > Sorry, I should've been clearer. I'm including eBPF/BCC into the
> > > "user-space tools" here. That was in fact my prime motivation in
> > > encouraging Joel at the last LPC to look at this. I've been integrating
> > > the teaching of eBPF into my AOSP debugging and performance analysis
> > > class (see CC courseware here:
> > > http://www.opersys.com/training/android-debug-and-performance), and it
> > > was pretty messy trying to show people how to benefit from such tools
> > > under Android. Joel's present set of patches would obviate this problem.
> >
> > I've been reading this thread and staying out for the most part. But I
> > was thinking about how I could use kernel headers for things like
> > kprobes, and I want to mention the pony that I would like to have :-)
> >
> > Are headers really needed, and more importantly, are they enough? What
> > if userspace is 32bit and the kernel is 64bit. Can ebpf scripts handle
> > that?
>
> Why would eBPF care about the bit width of userspace? I've thought a
How do you know the size of long if you are running a 64bit kernel
with 32bit userspace? I haven't compiled into ebpf byte
code, so I'm clueless here, but I know other tools (PowerTop) got burnt
when they assumed "long" was 4 bytes when digging into the kernel via
trace events, when the kernel was 64bit and user was 32bit.
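The width problem can be sketched in a few lines of Python. This is only
an illustration, under the assumption that the tool has the text of a
trace event "format" file at hand (field lines carrying offset, size and
signed); the function names and the sample record are hypothetical. The
idea is to take the field width from what the kernel reports instead of
from the tool's own idea of "long".

```python
import re
import struct

# One field line of a trace event "format" file, e.g.
#   field:long id; offset:8; size:8; signed:1;
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);"
    r"\s*size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)

# Fixed-width struct codes; lower case is signed, upper case unsigned.
WIDTH_CODES = {1: "b", 2: "h", 4: "i", 8: "q"}


def parse_fields(format_text):
    """Return {field name: (offset, size, signed)} from format text."""
    fields = {}
    for m in FIELD_RE.finditer(format_text):
        # The last token of the C declaration is the field name;
        # drop any array suffix such as "comm[16]".
        name = m.group("decl").split()[-1].split("[")[0]
        fields[name] = (
            int(m.group("offset")),
            int(m.group("size")),
            m.group("signed") == "1",
        )
    return fields


def decode_field(record, fields, name, byteorder="<"):
    """Decode one integer field using the size the kernel reported."""
    offset, size, signed = fields[name]
    code = WIDTH_CODES[size]
    if not signed:
        code = code.upper()
    # A byte-order prefix selects standard sizes, so the result does not
    # depend on whether the tool itself is a 32bit or 64bit binary.
    return struct.unpack_from(byteorder + code, record, offset)[0]
```

A tool that hard-codes a 4-byte "long" would read only the low half of an
8-byte field; driving the decode from the reported size avoids that
regardless of how the tool was compiled.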
> lot about inspecting user state from the kernel and have some weird
> ideas in that area (e.g., why not register eBPF programs to unwind
> user stacks?) --- but I think that Joel's work is independent of all
> that. I'm curious what you think we can do about userspace. I wish we
> could just compile the world with frame pointers, but we can't.
I don't care about user space here. I'm talking about the tools that
need to dig into the kernel, but the tools are 32bit and the kernel is
64bit. Of course, if the tools know the kernel is 64bit regardless,
then it doesn't matter.
>
> > What I would love, is a table that has all structures and their
> > fields with information about their types, size and signed types.
> > Like the format fields in the events directory. This way ebpf (and
> > kprobes, internally in the kernel) could access this table and be
> > able to know what the data structures of the kernel is).
>
> Think of the headers as encoding this information and more and the C
> compiler as a magical decoder ring. :-) I totally get the desire for a
> metadata format a little less messy than C code, but as a practical
> matter, a rich C-compilation pipeline already exists, and the
> debuginfo you're proposing doesn't, except in the form of DWARF, which
> I think would be even more controversial to embed in the kernel memory
> image than headers are. A header blob provides a strict superset of
> the information your scheme provides.
We already embed dwarf like information in the kernel. It's called ORC,
and is used by bpf (as well as perf, ftrace and stack dumps) when we
need to read functions. A very small subset of dwarf like data would be
used. Just the data structures, nothing else.
-- Steve

On Mon, 11 Mar 2019 20:39:12 -0400
Joel Fernandes <joel@joelfernandes.org> wrote:
> I think even though the kernel-headers can't have information about all data
> structures, they do already contain a lot of data structure definitions we
> need already. And anything needed can/should arguably be moved to include/ if
> they are really needed for kernel extension by something "external" to the
> kernel such as kernel modules or eBPF, right?
That's not my worry. I would like to be able to easily walk data
structures from within the kernel, without having to do a lot of work
in userspace to get that information. The kprobe_events could then be
passed type casts or such to access data fields of arguments to
functions and such.
>
> In any case, such a solution such as what Steve suggested, still cannot do
> what we can with headers - such as build kernel modules on the fly using the
> C-compiler without any auto-generation of C code from any debug artifiacts.
> Think systemtap working with the module-backend without any need for
> linux-headers package on the file system. So such a solution would still be a
> bit orthogonal in scope to what this proposed solution can solve IMO.
>
With the information I would like to have, it would be trivial to read
the data to create the header files needed for modules.
-- Steve

On Mon, Mar 11, 2019 at 6:28 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Mon, 11 Mar 2019 20:39:12 -0400
> Joel Fernandes <joel@joelfernandes.org> wrote:
>
> > I think even though the kernel-headers can't have information about all data
> > structures, they do already contain a lot of data structure definitions we
> > need already. And anything needed can/should arguably be moved to include/ if
> > they are really needed for kernel extension by something "external" to the
> > kernel such as kernel modules or eBPF, right?
>
> That's not my worry. I would like to be able to easily walk data
> structures from within the kernel, without having to do a lot of work
> in userspace to get that information. The kprobe_events could then be
> passed type casts or such to access data fields of arguments to
> functions and such.
Ok.
> > In any case, such a solution such as what Steve suggested, still cannot do
> > what we can with headers - such as build kernel modules on the fly using the
> > C-compiler without any auto-generation of C code from any debug artifiacts.
> > Think systemtap working with the module-backend without any need for
> > linux-headers package on the file system. So such a solution would still be a
> > bit orthogonal in scope to what this proposed solution can solve IMO.
> >
>
> With the information I would like to have, it would be trivial to read
> the data to create the header files needed for modules.
But there are macros and other #define things too. We lose all of them
and can't recreate them from just DWARF (AFAIK). That includes
include/generated/autoconf.h, which #defines the CONFIG options. For
that we either need headers, or the full kernel sources with build
artifacts.
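To make the point about macros concrete: CONFIG options reach C code as
preprocessor defines, which exist only as header text. Below is a small
Python sketch under stated assumptions. The sample text merely imitates
the shape of include/generated/autoconf.h (the option values are made up)
and the function name parse_autoconf is hypothetical.

```python
import re

# Matches lines of the form "#define CONFIG_FOO <value>".
DEFINE_RE = re.compile(
    r"^#define[ \t]+(CONFIG_\w+)[ \t]+(.*)$", re.MULTILINE
)

# Made-up sample in the style of a generated autoconf header.
SAMPLE_AUTOCONF = """\
/* Automatically generated file; illustrative sample only. */
#define CONFIG_HZ 250
#define CONFIG_SMP 1
#define CONFIG_LOCALVERSION ""
"""


def parse_autoconf(text):
    """Return {option name: value text} for every CONFIG_ define."""
    return {name: value.strip() for name, value in DEFINE_RE.findall(text)}
```

A type table derived from debug info would describe the structures that a
given configuration produced, but it would not hand a module build these
defines, which is why the header text (or the full sources) is still
needed for that use case.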
I do see a use case for the debug info you are talking about as you
mentioned for the kprobe_events argument list types, and I already
thought about it. But it does not seem to work for all the use cases I
am referring to here.
thanks,
- Joel

On Mon, Mar 11, 2019 at 09:28:23PM -0400, Steven Rostedt wrote:
> On Mon, 11 Mar 2019 20:39:12 -0400
> Joel Fernandes <joel@joelfernandes.org> wrote:
>
> > I think even though the kernel-headers can't have information about all data
> > structures, they do already contain a lot of data structure definitions we
> > need already. And anything needed can/should arguably be moved to include/ if
> > they are really needed for kernel extension by something "external" to the
> > kernel such as kernel modules or eBPF, right?
>
> That's not my worry. I would like to be able to easily walk data
> structures from within the kernel, without having to do a lot of work
> in userspace to get that information. The kprobe_events could then be
> passed type casts or such to access data fields of arguments to
> functions and such.
>
> >
> > In any case, such a solution such as what Steve suggested, still cannot do
> > what we can with headers - such as build kernel modules on the fly using the
> > C-compiler without any auto-generation of C code from any debug artifiacts.
> > Think systemtap working with the module-backend without any need for
> > linux-headers package on the file system. So such a solution would still be a
> > bit orthogonal in scope to what this proposed solution can solve IMO.
> >
>
> With the information I would like to have, it would be trivial to read
> the data to create the header files needed for modules.
What you're asking for we already have. It's called BTF.
pahole takes vmlinux DWARF and converts it into ~1 MB of BTF that includes
all kernel types.
With gzip it can be compressed further if necessary.
We also have a prototype to generate all_vmlinux_types.h from BTF.
But it's not a substitute for kernel headers.
We've had a long discussion during last LPC regarding this:
http://vger.kernel.org/lpc-bpf2018.html#session-2
tldr: many tracing use cases will be solved with BTF, but kernel headers
are here to stay.

Hi Geert,
On 3/11/19 4:03 AM, Geert Uytterhoeven wrote:
[ snip ]
> OK.
>
> Now about the actual solution: what is your opinion on embedding e.g.
> a squashfs image in the kernel instead, which would be a more generic
> solution, not adding more ABI to /proc?
I'm not familiar enough with the intricacies of squashfs to have an
educated opinion, but I hear that it's got its quirks (need for
user-space tools, etc.) and possibly security issues. Also, I wonder
whether it's a generalized solution that still kicks the ABI can down
the road -- ultimately the kernel would still have a path/format/foo for
making kheaders available in that squashfs image and that convention
would become ABI. The only "benefit" being that said ABI wouldn't appear
under /proc, and, tbh, I'm not sure that that's actually a benefit or is
even idiomatic since config.gz is already under /proc. To an extent,
the precedent set by config.gz favors kheaders to also be available in the
same location using a similar mechanism ... i.e. bonus points for
consistency.
But that's my hand-wavy gut-reaction response to your question. I'm sure
others on this thread have far more informed opinions about the
specifics than I could have. My priority was to clarify the basis for
the need being addressed.
Cheers,

On Mon, 11 Mar 2019 18:45:24 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> what you're asking for we already have. It's called BTF.
Cool.
> pahole takes vmlinux dwarf and convert it into ~1Mbyte of BTF that includes
> all kernel types.
> With gzip it can be compressed further if necessary.
> We also have a prototype to generate all_vmlinux_types.h from BTF.
> But it's not a substitute for kernel headers.
> We've had a long discussion during last LPC regarding this:
> http://vger.kernel.org/lpc-bpf2018.html#session-2
> tldr: many tracing use cases will be solved with BTF, but kernel headers
> are here to stay.
Is this work a result of that session? (I was only able to attend part
of the BPF sessions due to conflicts). I guess I should watch the
videos.
-- Steve

On Mon, 11 Mar 2019 18:38:49 -0700
Joel Fernandes <joelaf@google.com> wrote:
> On Mon, Mar 11, 2019 at 6:28 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Mon, 11 Mar 2019 20:39:12 -0400
> > Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > > I think even though the kernel-headers can't have information about all data
> > > structures, they do already contain a lot of data structure definitions we
> > > need already. And anything needed can/should arguably be moved to include/ if
> > > they are really needed for kernel extension by something "external" to the
> > > kernel such as kernel modules or eBPF, right?
> >
> > That's not my worry. I would like to be able to easily walk data
> > structures from within the kernel, without having to do a lot of work
> > in userspace to get that information. The kprobe_events could then be
> > passed type casts or such to access data fields of arguments to
> > functions and such.
>
> Ok.
>
> > > In any case, such a solution such as what Steve suggested, still cannot do
> > > what we can with headers - such as build kernel modules on the fly using the
> > > C-compiler without any auto-generation of C code from any debug artifiacts.
> > > Think systemtap working with the module-backend without any need for
> > > linux-headers package on the file system. So such a solution would still be a
> > > bit orthogonal in scope to what this proposed solution can solve IMO.
> > >
> >
> > With the information I would like to have, it would be trivial to read
> > the data to create the header files needed for modules.
>
> But there are macros and other #define things too. We lose all of them
> and can't recreate them from just DWARF (AFAIK). Including
> include/generated/autoconf.h which #defines the CONFIG options. For
> that we either need headers, or full kernel's sources with build
> artifacts.
What kind of macros are you concerned about?
> I do see a use case for the debug info you are talking about as you
> mentioned for the kprobe_events argument list types, and I already
> thought about it. But it does not seem to work for all the use cases I
> am referring to here.
But eBPF is based on kprobe-events. What kind of usage do you
expect? (with macros??)
Thank you,

Hi Masami,
Sorry for the late reply; I was traveling.
On Wed, Mar 13, 2019 at 10:18:43AM +0900, Masami Hiramatsu wrote:
> On Mon, 11 Mar 2019 18:38:49 -0700
> Joel Fernandes <joelaf@google.com> wrote:
>
> > On Mon, Mar 11, 2019 at 6:28 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Mon, 11 Mar 2019 20:39:12 -0400
> > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > >
> > > > I think even though the kernel-headers can't have information about all data
> > > > structures, they do already contain a lot of data structure definitions we
> > > > need already. And anything needed can/should arguably be moved to include/ if
> > > > they are really needed for kernel extension by something "external" to the
> > > > kernel such as kernel modules or eBPF, right?
> > >
> > > That's not my worry. I would like to be able to easily walk data
> > > structures from within the kernel, without having to do a lot of work
> > > in userspace to get that information. The kprobe_events could then be
> > > passed type casts or such to access data fields of arguments to
> > > functions and such.
> >
> > Ok.
> >
> > > > In any case, such a solution such as what Steve suggested, still cannot do
> > > > what we can with headers - such as build kernel modules on the fly using the
> > > > C-compiler without any auto-generation of C code from any debug artifiacts.
> > > > Think systemtap working with the module-backend without any need for
> > > > linux-headers package on the file system. So such a solution would still be a
> > > > bit orthogonal in scope to what this proposed solution can solve IMO.
> > > >
> > >
> > > With the information I would like to have, it would be trivial to read
> > > the data to create the header files needed for modules.
> >
> > But there are macros and other #define things too. We lose all of them
> > and can't recreate them from just DWARF (AFAIK). Including
> > include/generated/autoconf.h which #defines the CONFIG options. For
> > that we either need headers, or full kernel's sources with build
> > artifacts.
>
> What kind of macros would you concern?
>
> > I do see a use case for the debug info you are talking about as you
> > mentioned for the kprobe_events argument list types, and I already
> > thought about it. But it does not seem to work for all the use cases I
> > am referring to here.
>
> But the eBPF is based on kprobe-events. What kind of usage would you
> expected? (with macros??)
eBPF C programs are compiled with kernel headers. They can execute inline
functions or refer to macros in the kernel headers. They are similar to
kernel modules where you build a C program that then later is executed in
kernel context. It goes through the whole compiler pipeline. This is slightly
different usage from pure kprobe-events. Also, eBPF kprobe programs need the
LINUX_VERSION_CODE (or similarly named) macro, whose value they pass to the
bpf(2) syscall when loading kprobe programs. This is because the eBPF
implementation in the kernel checks that eBPF programs using kprobes are
being loaded against the right kernel.
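
To make the version check concrete, here is a minimal shell sketch of how
such a version code is typically packed. It assumes the KERNEL_VERSION(a,b,c)
encoding from linux/version.h, that is (a << 16) + (b << 8) + c. The function
name kver_code is hypothetical and is not part of the patch or of any tool.

```shell
#!/bin/sh
# Hypothetical helper (not from the patch): pack a kernel release string
# such as "4.19.0-rc3" into one integer, assuming the encoding
# (major << 16) + (minor << 8) + patch.
kver_code() {
    # Drop everything from the first character that is not a digit or a dot
    # ("-rc3", "-generic", "+", ...), leaving e.g. "4.19.0".
    numeric=${1%%[!0-9.]*}

    # Split the numeric part on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $numeric
    IFS=$old_ifs

    # Missing components default to 0.
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    echo $(( (major << 16) + (minor << 8) + patch ))
}
```

For example, `kver_code 4.19.0` prints 267008, and `kver_code "$(uname -r)"`
gives the value for the running kernel. Without headers (or some other source
for this number), a loader has to guess it from the release string like this.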
thanks,
- Joel

On Thu, 14 Mar 2019 08:27:48 -0400
Joel Fernandes <joel@joelfernandes.org> wrote:
> > But the eBPF is based on kprobe-events. What kind of usage would you
> > expected? (with macros??)
>
> eBPF C programs are compiled with kernel headers. They can execute inline
> functions or refer to macros in the kernel headers. They are similar to
> kernel modules where you build a C program that then later is executed in
> kernel context. It goes through the whole compiler pipeline. This is slightly
> different usage from pure kprobe-events. Also eBPF kprobe programs need
> LINUX_VERSION_CODE (or similarly named) macro which it provides to the bpf(2)
> syscall when loading kprobe programs. This is because eBPF implementation in
> the kernel checks if the eBPF programs that use kprobes are being loaded
> against the right kernel.
Ah, I got it. It's similar to SystemTap. :)
Thank you,

Hi Alan,
On 3/18/19 2:57 PM, Alan Cox wrote:
>> I think the compressed tarball is much simpler/easier overall. If
>> someone really wants the filesystem, they just uncompress it into a
>> tmpfs mount. It's much less moving kernel code to worry about.
>
> There is an even simpler approach. The people who want this for whatever
> strange reason are Android folks. Android lives on flash, so all they have
> to do is put the headers in a flash file system that is updated with the
> kernel and mount it wherever they like. Simple matter of a bit of
> devicetree no ?
The Android use-case may indeed have been the one that best crystallized
the requirement, but that doesn't mean other eBPF use-cases wouldn't
benefit from this. In fact they would; see the instructions here, for
example, on the need for kernel headers:
https://github.com/iovisor/bcc/blob/master/INSTALL.md
Just a quick $PREFERRED_SEARCH_ENGINE search returns interesting
exchanges such as this one taken from a discussion thread on LWN
covering an introduction to eBPF:
https://lwn.net/Articles/741348/
Effectively this person had to be hand-held in understanding that they
needed a good set of kernel headers to make the tool work. Their comment
after being shown that this was needed was:
"It looks like the headers package you need on Ubuntu is
linux-headers-$(uname -r), which contains the entire kernel source tree,
and is specific to the running kernel."
Surely having Joel's patch in the kernel would obviate the issue for all
Linux kernel users, not just Android.
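
As a rough illustration of what a tool could do with this, the sketch below
looks for headers in the usual distro location first and then falls back to
the archive proposed in this patch. It assumes /lib/modules/$(uname -r)/build
as the distro location and /proc/kheaders.tar.xz as the archive path; the
function name find_kheaders and its output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report where kernel headers could come from.
# Both paths can be overridden as arguments, which only exists to make
# the sketch easy to try out.
find_kheaders() {
    build_dir=${1:-/lib/modules/$(uname -r)/build}
    archive=${2:-/proc/kheaders.tar.xz}

    if [ -d "$build_dir" ]; then
        # A headers package (or a full build tree) is installed.
        echo "dir:$build_dir"
    elif [ -r "$archive" ]; then
        # No headers on the file system, but the kernel exports the archive.
        echo "archive:$archive"
    else
        echo "none"
        return 1
    fi
}
```

A caller that gets an "archive:" result would still have to extract the
tarball somewhere (a tmpfs mount, for instance) before pointing a compiler
at it.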
Cheers,