Wednesday, 26 December 2012

Updated linker script module-common.lds in kernel 3.0

    If two of your kernel modules depend on each other, that is, one module exports symbols and the other uses them, you have to make sure each .ko is linked with scripts/module-common.lds.
    Starting with the Linux 3.0.0 kernel, modules are linked with this updated linker script. If you don't use it, loading the dependent module may fail with an "Unknown symbol" error.

    Starting with kernel 3.0.0, each __ksymtab and __kcrctab entry is placed in its own section in the object file (___ksymtab+* and ___kcrctab+* respectively). If you look at such a .ko with nm, all of these symbols show the address 0x00000000. In the final .ko they have to be merged into two sections: __ksymtab for all ___ksymtab+* entries and __kcrctab for all ___kcrctab+* entries.
    The linker script scripts/module-common.lds produces exactly what we need. All of this follows from a change to the symbol resolution process in kernels >= 3.0.0, which was made to speed up symbol lookup: the exported symbols are kept sorted so that the kernel can use a binary search instead of a linear search.
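
    To see why sorted symbols help, here is a minimal user-space sketch of the idea, not the kernel's actual code. The struct sym_entry type and the function names are made up for illustration, and the table is sorted here with qsort only to keep the example self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an exported-symbol entry (not the kernel's type). */
struct sym_entry {
    const char *name;
    unsigned long value;
};

/* Order entries by symbol name. */
static int cmp_entry(const void *a, const void *b)
{
    const struct sym_entry *ea = a;
    const struct sym_entry *eb = b;
    return strcmp(ea->name, eb->name);
}

/* Sort the table once so that later lookups can use binary search. */
void sort_symbols(struct sym_entry *tab, size_t n)
{
    qsort(tab, n, sizeof(*tab), cmp_entry);
}

/* Linear lookup: works on an unsorted table, O(n) string comparisons. */
const struct sym_entry *find_linear(const struct sym_entry *tab, size_t n,
                                    const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Binary lookup: requires a sorted table, O(log n) string comparisons. */
const struct sym_entry *find_sorted(const struct sym_entry *tab, size_t n,
                                    const char *name)
{
    struct sym_entry key = { name, 0 };
    return bsearch(&key, tab, n, sizeof(*tab), cmp_entry);
}
```

    find_sorted only gives correct answers if the table really is sorted, which is why the order of the merged __ksymtab section matters.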

    Standard kernel modules are automatically compiled and linked with module-common.lds, but if you are working on a third-party kernel module, make sure you use module-common.lds when linking the object files of your module.


Wednesday, 7 November 2012

Does a 64-bit system use all 64 bits?

      Have you ever wondered how many address bits a running 64-bit system actually uses to manage memory? Does it use all 64 bits?
The answer to this question is no.
      Instead of all 64 bits, only a part of them is used; on current x86-64 processors that is typically 48 bits for virtual addresses (this might change in the future). The reason is that we don't need that much memory. All 64 bits would map to 16 EB of memory, and since nobody needs such a huge amount, managing those addresses would only be overhead. So fewer bits are used to implement the memory structure. With 48 bits we still get 256 TB of usable virtual memory.
      Even though fewer bits are used, the first and last addresses are still 00000000'00000000 and FFFFFFFF'FFFFFFFF respectively. Then how can we say we use fewer than 64 bits? This is handled by canonical form addresses: the unused upper bits must all be copies of the highest implemented bit, which leaves a large hole of invalid addresses in the middle of the 64-bit range.
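
      As a rough illustration, the canonical-form rule can be checked with a few lines of C. This is only a sketch that assumes 48 implemented virtual-address bits; VA_BITS and the function names are my own and do not come from any kernel header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual-address bits. */
#define VA_BITS 48

/* An address is canonical when bits 63..47 are all zeros or all ones,
 * i.e. the upper bits are copies of bit 47. */
bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> (VA_BITS - 1);          /* keep bits 63..47 */
    uint64_t all_ones = UINT64_MAX >> (VA_BITS - 1);
    return top == 0 || top == all_ones;
}

/* Size of the usable virtual address space: 2^48 bytes = 256 TB. */
uint64_t usable_bytes(void)
{
    return (uint64_t)1 << VA_BITS;
}
```

      With these assumptions the lower half ends at 00007FFF'FFFFFFFF and the upper half starts at FFFF8000'00000000; everything in between is non-canonical.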

Friday, 26 October 2012

Solution to problem of module getting marked as [permanent]

Quite often you will see that your own module has been marked as permanent.

$ lsmod
 Module Size Used by
 hello 78567 0 [permanent] 


      Such a module can't be removed until the system is rebooted. You get messages like the following when you try to remove it with rmmod.

ERROR: Module hello is in use by [permanent]
or
ERROR: Removing 'hello': Device or resource busy

      The solution to this problem is quite simple: recompile your hello.ko module with the -DCC_HAVE_ASM_GOTO flag. The problem is that the layout of struct module depends on HAVE_JUMP_LABEL => CC_HAVE_ASM_GOTO => gcc-goto.sh script => gcc version being used.
      When the kernel and the module disagree on this, the kernel reads the module's exit callback (destructor) from the wrong place and sees NULL, which results in the module being marked as permanent. In the following code snippet from kernel/module.c, mod->exit is therefore NULL, and because CONFIG_MODULE_FORCE_UNLOAD is not set in the config file, -EBUSY is returned.

/* If it has an init func, it must have an exit func to unload */
if (mod->init && !mod->exit) {
     forced = try_force_unload(flags);
     if (!forced) {
         /* This module can't be removed */
         ret = -EBUSY;
         goto out;
     }
 } 
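
The layout mismatch behind this can be sketched in user space. The two structs below are hypothetical, heavily simplified stand-ins for struct module as seen with and without the jump label fields; the real structure is much larger, and all names here are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

typedef int  (*init_fn)(void);
typedef void (*exit_fn)(void);

/* Module's view: built WITHOUT the flag, so no jump label fields. */
struct module_without_jump_label {
    init_fn init;
    exit_fn exit;
};

/* Kernel's view: built WITH the flag, extra fields sit before exit. */
struct module_with_jump_label {
    init_fn init;
    void *jump_entries;
    unsigned int num_jump_entries;
    exit_fn exit;
};

static int hello_init(void) { return 0; }
static void hello_exit(void) { }

/* Offset of exit as each side computes it. */
size_t exit_offset_module_view(void)
{
    return offsetof(struct module_without_jump_label, exit);
}

size_t exit_offset_kernel_view(void)
{
    return offsetof(struct module_with_jump_label, exit);
}

/* Lay the data out the module's way, then read exit the kernel's way. */
exit_fn exit_seen_by_kernel(void)
{
    struct module_without_jump_label built = { hello_init, hello_exit };
    struct module_with_jump_label kernel_view = { 0 };

    /* Only the bytes the module actually wrote are copied. */
    memcpy(&kernel_view, &built, sizeof(built));
    return kernel_view.exit;   /* never written by the module, so still NULL */
}
```

In this sketch the module's exit pointer lands in the slot the kernel treats as jump_entries, so the kernel-side exit field stays NULL, which is the condition the snippet above turns into -EBUSY.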


To find out whether a given Linux platform needs this flag, run

sh scripts/gcc-goto.sh gcc

in the kernel header directory. If the output is "y", you must include this flag whenever you compile a module.

Tuesday, 23 October 2012

Linux kernel crash in apply_alternatives function

If you hit a kernel crash while loading a module and the stack trace looks similar to this,

 [<ffffffff810b90c6>] ? crash_kexec+0x66/0x110 
 [<ffffffff810121b8>] ? apply_alternatives+0x328/0x3b0 
 [<ffffffff814f1410>] ? oops_end+0xc0/0x100 
 [<ffffffff8100f2bb>] ? die+0x5b/0x90 
 [<ffffffff814f0d04>] ? do_trap+0xc4/0x160 
 [<ffffffff8100ce75>] ? do_invalid_op+0x95/0xb0 
 [<ffffffff810121b8>] ? apply_alternatives+0x328/0x3b0 
 [<ffffffff8126c0b0>] ? idr_get_empty_slot+0x110/0x2c0 
 [<ffffffff81133229>] ? zone_statistics+0x99/0xc0 
 [<ffffffff8100bf1b>] ? invalid_op+0x1b/0x20 
 [<ffffffff810121b8>] ? apply_alternatives+0x328/0x3b0 
 [<ffffffff81096a5f>] ? up+0x2f/0x50 
 [<ffffffff8106a36f>] ? release_console_sem+0x1cf/0x220 
 [<ffffffff810aca32>] ? each_symbol+0xa2/0x1f0 
 [<ffffffff8106a931>] ? vprintk+0x1d1/0x4f0 
 [<ffffffff814ed360>] ? printk+0x41/0x49 
 [<ffffffff810339ac>] ? module_finalize+0x10c/0x1b0 
 [<ffffffff810af1a2>] ? load_module+0x17c2/0x1ca0 




then it is likely that your module is missing the .rheldata ELF section. Check with the following command
objdump -h module.ko
whether the module has this section or not. To add it to your .ko, you must build your module so that the modpost stage runs (for example through the kernel's kbuild system). This issue is generally seen on RHEL 6, but I have hit it on other Linux distributions as well.
Thanks,
Pritam

Linux source compilation error

When you start compiling the Linux source code, you will often come across the following error,


[pritam@pritam-pc 2.6.32-220.7.1.el6.x86_64]$ sudo make
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
  SYMLINK include/asm -> include/asm-x86
make[1]: *** No rule to make target `missing-syscalls'.  Stop.
make: *** [prepare0] Error 2


    When you hit this, it is very likely that you are trying to compile a tree that contains only header files. To cross-check, do this,


[pritam@pritam-pc 2.6.32-220.7.1.el6.x86_64]$ find . -name '*.h' | wc -l
4606
[pritam@pritam-pc 2.6.32-220.7.1.el6.x86_64]$ find . -name '*.c' | wc -l
55


This is the output of the commands in a directory where compilation was attempted on header files only. The second command found only 55 .c files, whereas a full source tree would contain many thousands.

    Also check whether your Makefile points to the correct source directory. Remember that devel/header rpms get installed into /usr/src/kernels on some Linux distributions, so this path contains header files only, and building there results in the error above.

Thanks,
Pritam