Linking binary data

Took me half a week to figure this out, but here is a quick description of how to link binary data into a library/executable:

We want this to:

  • work with GNU-ld
  • work with GNU-gold
  • work with GNU-libtool (if required)
  • work with cross-compiling
  • work with LLVM
  • not require any external non-standard tools

For now, let's assume we have a file src/mydata.bin that contains the binary data you want to link into your executable as a huge C array.

With GNU-autotools

If you use GNU-autotools, add the following rule to your Makefile.am:

CLEANFILES += src/*.bin.lo src/*.bin.o

src/%.bin.lo: src/%.bin
     $(AM_V_GEN)$(LD) -r -o "src/$*.bin.o" -z noexecstack --format=binary "$<"
     $(AM_V_at)$(OBJCOPY) --rename-section .data=.rodata,alloc,load,readonly,data,contents "src/$*.bin.o"
     $(AM_V_at)echo "# $@ - a libtool object file" >"$@"
     $(AM_V_at)echo "# Generated by $(shell $(LIBTOOL) --version | head -n 1)" >>"$@"
     $(AM_V_at)echo "#" >>"$@"
     $(AM_V_at)echo "# Please DO NOT delete this file!" >>"$@"
     $(AM_V_at)echo "# It is necessary for linking the library." >>"$@"
     $(AM_V_at)echo >>"$@"
     $(AM_V_at)echo "# Name of the PIC object." >>"$@"
     $(AM_V_at)echo "pic_object='$*.bin.o'" >>"$@"
     $(AM_V_at)echo >>"$@"
     $(AM_V_at)echo "# Name of the non-PIC object" >>"$@"
     $(AM_V_at)echo "non_pic_object='$*.bin.o'" >>"$@"
     $(AM_V_at)echo >>"$@"

Now you can simply use src/mydata.bin.lo as a libtool object file in any _LIBADD automake variable. This rule causes $(LD) to link the binary into the relocatable object file src/mydata.bin.o (via partial linking, the -r option). After that we create the corresponding libtool object file src/mydata.bin.lo. Note that the comments in the libtool file are mandatory. You must not remove them!

The objcopy line causes the .data section to be renamed to .rodata and marked as read-only. You can drop it if you require write access.

Without GNU-autotools

Obviously, this is much easier. You can simply use:

$(LD) -r -o "src/mydata.bin.o" -z noexecstack --format=binary "src/mydata.bin"
$(OBJCOPY) --rename-section .data=.rodata,alloc,load,readonly,data,contents "src/mydata.bin.o"

This produces src/mydata.bin.o which you can then link like any other object.

That’s already it!

You can now access the binary data via:

extern const char _binary_<path>_start[];
extern const char _binary_<path>_end[];

In our case this would be:

extern const char _binary_src_mydata_bin_start[];
extern const char _binary_src_mydata_bin_end[];
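To make the mapping from <path> to the symbol name concrete, here is a minimal sketch in C. The helper binary_symbol_name() is hypothetical (it is not part of binutils), and it assumes that the linker replaces every character of the input path that is not an ASCII letter or digit with an underscore, which matches the src/mydata.bin example. Note that the path is taken exactly as it was passed to the linker, so running ld from a different directory yields different names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of binutils): builds the symbol name that
 * the linker is expected to generate for an input file given as `path`,
 * e.g. "src/mydata.bin" + "start" -> "_binary_src_mydata_bin_start".
 * Assumption: every byte that is not an ASCII letter or digit becomes '_'.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int binary_symbol_name(const char *path, const char *suffix,
                              char *out, size_t out_len)
{
    int n;
    size_t i;

    assert(path && suffix && out);

    n = snprintf(out, out_len, "_binary_%s_%s", path, suffix);
    if (n < 0 || (size_t)n >= out_len)
        return -1;

    /* Replace everything outside [A-Za-z0-9] with '_'. */
    for (i = 0; out[i] != '\0'; i++) {
        if (!isalnum((unsigned char)out[i]))
            out[i] = '_';
    }
    return 0;
}
```

If in doubt, checking the generated object with nm is more reliable than guessing the name.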

The linker automatically creates these symbols. It also creates *_size symbols, but I never had to use these. The linker also aligns the start address correctly, but the content isn’t touched, of course.
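Since the *_size symbols are rarely needed, the length can simply be computed from the two addresses. The following is a small sketch of that; blob_size() and blob_sum() are illustrative names of my own choosing, and they take plain pointers so the snippet compiles even without the linked object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of a blob delimited by its start and end addresses. */
static size_t blob_size(const char *start, const char *end)
{
    assert(start && end && end >= start);
    return (size_t)(end - start);
}

/*
 * Sums all bytes of the blob. Purely illustrative: it only shows that the
 * data can be walked like any other array.
 */
static uint32_t blob_sum(const char *start, const char *end)
{
    uint32_t sum = 0;
    const char *p;

    for (p = start; p != end; p++)
        sum += (unsigned char)*p;
    return sum;
}

/*
 * Usage with the linked object (needs src/mydata.bin.o at link time):
 *
 *   extern const char _binary_src_mydata_bin_start[];
 *   extern const char _binary_src_mydata_bin_end[];
 *
 *   size_t n = blob_size(_binary_src_mydata_bin_start,
 *                        _binary_src_mydata_bin_end);
 */
```

Keep the const qualifier on the declarations as long as the data lives in .rodata; writing to it would crash.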

The data is placed in .data by default and marked as read/write. The objcopy command renames it to .rodata and marks it as read-only. Drop it if you require write access.

If you think you know another way to do this, please let me know! I have tried lots of other ways and they all fail in some subtle way:

  • Converting it into a C-source file with a static array. Problem: It works but the C-compiler might take considerably longer to compile the sources. On my machine a 2MB array file took >10min to compile.
  • Using objcopy to convert binary files into object files. Problem: This requires the target architecture via the -B option. Unfortunately, it expects some weird format and doesn’t accept the standard GNU $host_cpu kind of strings. For instance, it requires i386:x86-64 instead of x86_64. I haven’t found a way to get these BFD strings reliably via autoconf.
  • Use objcopy without -B to produce architecture=UNKNOWN object files and link via --accept-unknown-input-arch with GNU-ld. Problem: This doesn’t work with GNU-gold as it doesn’t support this command-line option.
  • Avoid partial-linking and add this to your LDFLAGS: -Wl,--format=binary -Wl,src/mydata.bin -Wl,--format=default Problem: This marks the file as requiring an executable-stack. You can fix this with -Wl,-z,noexecstack but that forces all other linked files to also not require an executable stack. And also, GNU-gold doesn’t support --format=default, it requires --format=elf which itself isn’t supported by GNU-ld. Hilarious, right?

I also got some other hints to use GNU-as or other nice workarounds. However, I wanted a solution that uses the tools that were created for that task. I mean, why should I create an assembler or C source file to link binary data? Seriously, the only reason to do this is to work around crappy binutils interfaces. I want ld and objcopy to do the task and if they can’t, then we should fix them!
