Booting OpenWRT - From the correct address

bzing2

After two days of pain I have finally managed to get OpenWRT booting off the flash! What follows is a quick patch (it's not perfect, but it works for me).

--- arch/mips/kernel/head.S.orig        2008-10-30 16:33:27.000000000 +0000
+++ arch/mips/kernel/head.S     2008-10-30 16:59:47.000000000 +0000
@@ -107,6 +107,16 @@
                __FINIT
                /*
+                * Dummy kernel entry for WGR614v8.  This should be located
+                * at 0x80001400 (may be an idea to check System.map).  This
+                * is required as the CFE (1.5 - Fri Jun  6 15:53:24 CST 2008)
+                * on this board has been hard wired to jump to this address!
+                */
+               EXPORT(dummy_kernel_entry)
+               j       kernel_entry
+               nop
+
+               /*
                 * EJTAG debug exception handler.
                 */
                NESTED(ejtag_debug_handler, PT_SIZE, sp)

For the interested reader, here is some background on the above. The problem I had been struggling with was that I could create .chk files and upload them, but the kernel would not boot from the flash, even though the same kernel booted fine from the network. The background also explains why the TRX file generated by OpenWRT is not enough on its own. So what is really the problem?
Well, after a little digging around in the CFE sources (from tomato I think, and for a different version, but anyway) I found the following lump of code:

static int boot_lzma(void)
{
        unsigned long in         = KERNEL_FLASH_ADDR + sizeof(struct trx_header);
        unsigned long out        = KERNEL_RAM_ADDR;
        unsigned long in_len     = 0x3A0000;
        unsigned long out_len    = 0x3A0000;
        unsigned long start_addr = 0x80001400;  /* addr for 'kernel_entry' */

        if (!lzma_decode((unsigned char *)in, (unsigned char *)out,
                         (unsigned int)in_len, (unsigned int *)&out_len))
                cfe_start(start_addr);
        printf("LZMA boot failed\n");
        return 0;
}

The takeaway here is that CFE jumps to 0x80001400, not 0x80001000 as you would have expected. (The kernel does have a jump at 0x80001000 to the real location of kernel_entry.) Why it has been hard coded this way I have no idea! On the kernel I built, 0x80001400 corresponds to ejtag_debug_handler, which is why it wasn't working. The patch above inserts a jump (plus a nop for the branch delay slot) at that location so that we reach the correct kernel_entry point (which was 0x80168040 in my case).
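To make the jump at 0x80001400 concrete, here is a rough sketch of how a MIPS32 `j` instruction is encoded. The function name mips_encode_j is made up for illustration and is not part of the kernel or CFE; the addresses in the usage note are the ones from my build and will differ on yours.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode a MIPS32 "j target" instruction that will live at address pc.
 * The instruction only carries bits 27..2 of the target; the top four
 * bits are taken from the address of the delay slot (pc + 4), so the
 * target must sit in the same 256 MB region.
 * Returns 0 if the target cannot be reached with a single j.
 */
uint32_t mips_encode_j(uint32_t pc, uint32_t target)
{
    assert((pc & 0x3u) == 0);            /* instructions are word aligned */

    if (target & 0x3u)
        return 0;
    if (((pc + 4) & 0xF0000000u) != (target & 0xF0000000u))
        return 0;

    /* opcode 2 in the top six bits, word index of the target below it */
    return (0x02u << 26) | ((target >> 2) & 0x03FFFFFFu);
}
```

With pc = 0x80001400 and target = 0x80168040 this gives 0x0805A010, which is the word you would expect to find at 0x80001400 in a patched image (followed by 0x00000000 for the nop).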

As an aside, I did manage to put a gzipped ELF version of the kernel into a .chk file, which I could then execute using the following CFE command. This was the aha moment, since ELF files tell you the entry point (I was also booting them via tftp).

boot -z -elf flash0.os:

You may also be interested in how to create the .chk files, seeing as that would be useful. I have been using the addchkhdr tool. Here are the steps I have been using:

cd build_dir/linux-brcm-2.4
addchkhdr -i vmlinux.lzma -i root.squashfs -n "U12H072T00_NETGEAR,image.chk"

 

Nachi

This is one approach. The other approach would be to modify the lzma-loader, adding an offset so that its entry point sits at 0x80001400; you can then use the lzma-loader to boot the kernel.

The other fix you need is to align the trx boundary to 64K, to make sure that the image is not rewritten on the first boot. This avoids a failure on the second boot.

Of course these fixes are required only for the EU version (CFE 1.5) and not for the NA version.

bzing2

I would agree that the lzma-loader would be a better place; I was being lazy as I already had the kernel on the operating table. But in the spirit of things, here is a patch:

--- target/linux/brcm-2.4/image/lzma-loader/src/head.S.orig     2008-10-31 11:47:09.000000000 +0000
+++ target/linux/brcm-2.4/image/lzma-loader/src/head.S  2008-10-31 11:23:59.000000000 +0000
@@ -36,6 +36,7 @@
 #define Index_Writeback_Inv_D   0x01

        .text
+       .space ENTRY_POINT-LOADADDR
        LEAF(startup)
        .set noreorder
        addi    sp, -48
--- target/linux/brcm-2.4/image/lzma-loader/src/Makefile.orig   2008-10-31 11:47:03.000000000 +0000
+++ target/linux/brcm-2.4/image/lzma-loader/src/Makefile        2008-10-31 11:48:22.000000000 +0000
@@ -18,6 +18,7 @@
 #

 TEXT_START     := 0x80001000
+ENTRY_POINT    := 0x80001400
 BZ_TEXT_START  := 0x80400000

 OBJCOPY                := $(CROSS_COMPILE)objcopy -O binary -R .reginfo -R .note -R .comment -R .mdebug -S
@@ -26,7 +27,7 @@
                  -fno-strict-aliasing -fno-common -fomit-frame-pointer -G 0 -mno-abicalls -fno-pic \
                  -ffunction-sections -pipe -mlong-calls -fno-common \
                  -mabi=32 -march=mips32 -Wa,-32 -Wa,-march=mips32 -Wa,-mips32 -Wa,--trap
-CFLAGS         += -DLOADADDR=$(TEXT_START) -D_LZMA_IN_CB
+CFLAGS         += -DLOADADDR=$(TEXT_START) -D_LZMA_IN_CB -DENTRY_POINT=$(ENTRY_POINT)

 ASFLAGS                = $(CFLAGS) -D__ASSEMBLY__ -DBZ_TEXT_START=$(BZ_TEXT_START)

@@ -74,4 +75,4 @@
 mrproper: clean

 clean:
-       rm -f loader.gz loader decompress *.lds *.o *.image
+       rm -f loader.gz loader decompress *.lds *.o *.image loader.elf

Again it's a quick one: I am just inserting some space at the beginning to force the entry point to land at the correct location. Anyone using the above should make sure they lzma the loader (not gzip) and then follow the usual recipes for creating an image.

Now on to the trx boundary issue. Yup, you correctly guessed my issue, and I still have it! What I have done is create a padding file that's big enough to align the trx to a 64K boundary, then create a trx with that file appended (-A) to the end. I then append the jffs2 marker to the end of the trx (i.e. it is not counted in the trx length field). Finally I package that up into a .chk file.
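The size of the padding file is simple arithmetic. Here is a minimal sketch, assuming the goal is just to round the trx length up to the next 64K multiple; pad_to_boundary, trx_pad and TRX_ALIGN are names made up for this post, not part of any OpenWRT tool.

```c
#include <assert.h>
#include <stddef.h>

#define TRX_ALIGN 0x10000u   /* 64K, the boundary discussed above */

/* Bytes of padding needed to round len up to a multiple of align. */
size_t pad_to_boundary(size_t len, size_t align)
{
    assert(align != 0);

    size_t rem = len % align;
    return rem ? align - rem : 0;   /* already aligned means no padding */
}

/* Convenience wrapper for the 64K case. */
size_t trx_pad(size_t trx_len)
{
    return pad_to_boundary(trx_len, TRX_ALIGN);
}
```

For example, a 0x123456 byte trx would need 0xcbaa bytes of padding to end at 0x130000, and a trx that is already a multiple of 64K needs none.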

Unfortunately there is something still broken! What happens is the jffs2 gets created, but I get write errors. Here are a few choice bits!

...
...
sflash: Filesystem type: squashfs, size=0x1833db
Creating 5 MTD partitions on "sflash":
0x00000000-0x00020000 : "cfe"
0x00020000-0x003f0000 : "linux"
0x0009a400-0x00220000 : "rootfs"
0x003f0000-0x00400000 : "nvram"
0x00220000-0x00380000 : "rootfs_data"
....
....
jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
jffs2_build_filesystem(): unlocking the mtd device... done.
jffs2_build_filesystem(): erasing all blocks after the end marker... done.
Write of 68 bytes at 0x0014000c failed. returned 0, retlen 136
Write of 68 bytes at 0x00140050 failed. returned 0, retlen 136
Nachi

The lzma-loader patch looks good to me, although it may affect builds for other routers. We probably need to find a way to apply it only for the WGR614L.

As for the jffs2 write problem, this is an sflash driver issue.

As far as I could check in the kernel, the function sflash_mtd_write() in drivers/mtd/devices/sflash.c is returning unsuccessfully.

It looks like the write itself is going through (the sflash_write(..) call), but sflash_mtd_poll() seems to be failing. It may be a good idea to increase the polling time from HZ/10 to some higher value if the sflash polls are indeed slower. I will update once I make some progress.

UPDATE: The sflash write seems to be alright and the poll is also correct. For some reason sflash reports writing double the data requested in the write command.

Nachi

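The sflash issue is due to incorrect handling of zero byte writes in sflash_mtd_write() in drivers/mtd/devices/sflash.c: *retlen was only reset after the zero-length check, so a zero byte write returned with a stale *retlen.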

Here is the modified function - watch out for an updated patch for OpenWRT trunk.

This fixes the issues in jffs2.

 

static int
sflash_mtd_write(struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf)
{
        struct sflash_mtd *sflash = (struct sflash_mtd *) mtd->priv;
        int bytes, ret = 0;

        if (retlen)
                *retlen = 0;

        /* Check address range */
        if (!len)
                return 0;
        if ((to + len) > mtd->size)
                return -EINVAL;

        down(&sflash->lock);

        while (len) {
                if ((bytes = sflash_write(sflash->cc, (uint) to, len, buf)) < 0) {
                        ret = bytes;
                        break;
                }
                if ((ret = sflash_mtd_poll(sflash, (unsigned int) to, HZ / 10)))
                        break;
                to += (loff_t) bytes;
                len -= bytes;
                buf += bytes;
                if (retlen)
                        *retlen += bytes;

        }

        up(&sflash->lock);

        return ret;
}
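To see how a stale *retlen can turn a 68 byte write into "retlen 136", here is a small user-space model. It is only a sketch under an assumption: that the caller reuses one length variable across several chunks and sums it up, as a generic writev-style helper would. mock_write and total_written are made-up names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for a driver write() that pretends every byte gets written.
 * zero_retlen_first selects the fixed ordering (clear *retlen before
 * the zero-length check) or the old ordering (leave it untouched).
 */
static int mock_write(size_t len, size_t *retlen, int zero_retlen_first)
{
    assert(retlen != NULL);

    if (zero_retlen_first)
        *retlen = 0;
    if (!len)
        return 0;        /* zero byte write: old ordering leaves *retlen stale */
    *retlen = len;
    return 0;
}

/* Caller that sums the reported lengths, reusing a single variable. */
size_t total_written(const size_t *lens, size_t n, int zero_retlen_first)
{
    size_t total = 0, thislen = 0;

    for (size_t i = 0; i < n; i++) {
        if (mock_write(lens[i], &thislen, zero_retlen_first))
            break;
        total += thislen;
    }
    return total;
}
```

With the chunk lengths {68, 0} the old ordering reports 136 bytes in total, while the fixed ordering reports 68.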

bzing2

Excellent, well found! I was not looking forward to a Monday of peering through call traces, so thank you.

I have applied the changes, as you suggested, to sflash_mtd_write and the jffs2 problem is fixed. While in there I also spotted that sflash_mtd_read could suffer from exactly the same problem. I have attached a patch at the end, just in case anyone else stumbles over the same problem.

With respect to the lzma-loader patch, yes indeed that will affect other routers. What I suggest is that we make the ENTRY_POINT variable default to TEXT_START but allow it to be overridden if set in the environment. That way we can rebuild the lzma-loader specifically for our target. target/linux/brcm-2.4/image/Makefile already has some extras required for other boards (specifically Motorola and USR), so I think that would be acceptable.
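On the preprocessor side the same idea could look roughly like this. It is only a sketch: the fallback value for LOADADDR is there so the fragment compiles on its own, and loader_pad_bytes is a made-up helper to show the resulting gap. The Makefile side (picking ENTRY_POINT up from the environment) is not shown.

```c
/*
 * LOADADDR and ENTRY_POINT normally arrive from the Makefile via
 * -DLOADADDR=... and -DENTRY_POINT=...; the fallback below is for
 * illustration only.
 */
#ifndef LOADADDR
#define LOADADDR 0x80001000UL
#endif

/* Default: entry point equals load address, so other boards see no change. */
#ifndef ENTRY_POINT
#define ENTRY_POINT LOADADDR
#endif

#if ENTRY_POINT < LOADADDR
#error "ENTRY_POINT must not be below LOADADDR"
#endif

/* Size of the gap that ".space ENTRY_POINT-LOADADDR" would reserve. */
unsigned long loader_pad_bytes(void)
{
    return ENTRY_POINT - LOADADDR;
}
```

Built without any overrides the gap is zero; built with -DENTRY_POINT=0x80001400 it becomes 0x400 bytes.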

Along those lines a version of the 'packet' tool to create the .chk files will clearly be required. I am happy to implement a clean room version from your docs on the format if required.

All of that brings me to the question of pushing things into OpenWRT. I would prefer to get all of this stuff into OpenWRT! Have you (Netgear) submitted any patches to OpenWRT? Are you planning on doing so? I am happy to lend a hand if required...

Now for that patch:

--- sflash.c.orig       2008-11-03 12:35:52.000000000 +0000
+++ sflash.c    2008-11-03 12:37:46.000000000 +0000
@@ -81,6 +81,9 @@
        struct sflash_mtd *sflash = (struct sflash_mtd *) mtd->priv;
        int bytes, ret = 0;

+       if (retlen)
+               *retlen = 0;
+
        /* Check address range */
        if (!len)
                return 0;
@@ -89,7 +92,6 @@

        down(&sflash->lock);

-       *retlen = 0;
        while (len) {
                if ((bytes = sflash_read(sflash->sbh, sflash->cc, (uint) from, len, buf)) < 0) {
                        ret = bytes;
@@ -112,6 +115,9 @@
        struct sflash_mtd *sflash = (struct sflash_mtd *) mtd->priv;
        int bytes, ret = 0;

+       if (retlen)
+               *retlen = 0;
+
        /* Check address range */
        if (!len)
                return 0;
@@ -120,7 +126,6 @@

        down(&sflash->lock);

-       *retlen = 0;
        while (len) {
                if ((bytes = sflash_write(sflash->sbh, sflash->cc, (uint) to, len, buf)) < 0) {
                        ret = bytes;