Sunday, 15 August 2010

When (user time + system time) > wall time

In my work with small embedded Linux devices, I take it as given that the amount of CPU time that a process gets in a given period is _always_ less than the wall time that passes in that period (due to other processes in the system). So I was surprised when I saw the following after running a program on my desktop machine:

real 0m2.456s
user 0m4.440s
sys 0m0.460s

How can the user time be greater than the real (wall clock) time?

'Real' time is just that: the amount of time that passes on a wall clock during the execution of my program. User and sys are, however, both CPU time, and there are multiple CPUs in the system, each of which can accumulate CPU time in parallel. So the theoretical maximum available CPU time is (real time * number of CPUs). I never see this in my work because we only have single-CPU ARM-based chips.

This can be illustrated with the following script, which spawns a number of processes in the background (10 in the run below) and lets us examine the script's usage statistics:

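#!/bin/bash

# Script t.sh
# (c) Martin Jackson 2010
# Usage: t.sh <number of processes> <directory>

declare -i procs=$1
declare path=$2

# Checksum every regular file directly inside $path, discarding the output.
# -print0 / -0 keep filenames containing spaces intact.
f()
{
    find "$path" -maxdepth 1 -type f -print0 | xargs -0 md5sum > /dev/null
}

echo "Time for 1 process ..."
time f

echo -e "\n\nTime for $procs processes ..."
time {
    declare -i i
    for ((i = 1; i <= procs; i++))
    do
        f &
    done

    # Block until every background job has finished.
    wait
}

echo -e "\n\n"
# EOF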

martin@dodgecity:~/src/test$ uname -a
Linux dodgecity 2.6.32-24-generic #39-Ubuntu SMP Wed Jul 28 05:14:15 UTC 2010 x86_64 GNU/Linux
martin@dodgecity:~/src/test$ ./wait3 ./t.sh 10 /usr/local/src/
Time for 1 process ...

real 0m0.497s
user 0m0.440s
sys 0m0.050s


Time for 10 processes ...

real 0m2.453s
user 0m4.430s
sys 0m0.450s



Process accounting information for 4092:
  user time = 4s, 870000us
  system time = 0s, 500000us
  maximum resident set size = 1496
  integral shared memory size = 0
  integral unshared data size = 0
  integral unshared stack size = 0
  page reclaims = 13592
  page faults = 0
  swaps = 0
  block input operations = 0
  block output operations = 0
  messages sent = 0
  messages received = 0
  signals received = 0
  voluntary context switches = 94
  involuntary context switches = 879


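Running the script under the wait3 program from the earlier post, as above, shows that the rusage info returned from the Linux kernel (user + sys time) is all in terms of CPU time. The totals it reports (4.87s user, 0.50s system) are the sums of the two timed runs, and together they are well over the wall time that passed.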

Wait3 system call

The wait3(2) system call is like your shell's time(1) command, but returns more and cooler stuff. Particularly interesting is the number of voluntary and involuntary context switches during program execution!

wait3 comes from BSD but is also available in Linux:

/*
 * Print out getrusage info when a process exits
 */

#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
 pid_t f, p;
 int status;
 struct rusage rusage;

 /* argv[0] is our own name, so a command needs at least one more argument */
 if (argc < 2)
  errx(1, "USAGE: wait3 <command line>");

 if ((f = fork()) < 0)
  err(1, "Could not run command");

 if (f == 0) {
  /* Child: replace ourselves with the requested command */
  execvp(argv[1], &argv[1]);
  err(1, "Command failed");
 }

 /* Parent: wait for the child and collect its resource usage */
 p = wait3(&status, 0, &rusage);
 if (p < 0)
  err(1, "wait3(2) on command failed");

 /* All rusage counters are longs; the timeval fields are cast to match */
 printf("Process accounting information for %d:\n", (int)p);
 printf("  user time = %lds, %ldus\n", (long)rusage.ru_utime.tv_sec, (long)rusage.ru_utime.tv_usec);
 printf("  system time = %lds, %ldus\n", (long)rusage.ru_stime.tv_sec, (long)rusage.ru_stime.tv_usec);
 printf("  maximum resident set size = %ld\n", rusage.ru_maxrss);
 printf("  integral shared memory size = %ld\n", rusage.ru_ixrss);
 printf("  integral unshared data size = %ld\n", rusage.ru_idrss);
 printf("  integral unshared stack size = %ld\n", rusage.ru_isrss);
 printf("  page reclaims = %ld\n", rusage.ru_minflt);
 printf("  page faults = %ld\n", rusage.ru_majflt);
 printf("  swaps = %ld\n", rusage.ru_nswap);
 printf("  block input operations = %ld\n", rusage.ru_inblock);
 printf("  block output operations = %ld\n", rusage.ru_oublock);
 printf("  messages sent = %ld\n", rusage.ru_msgsnd);
 printf("  messages received = %ld\n", rusage.ru_msgrcv);
 printf("  signals received = %ld\n", rusage.ru_nsignals);
 printf("  voluntary context switches = %ld\n", rusage.ru_nvcsw);
 printf("  involuntary context switches = %ld\n", rusage.ru_nivcsw);

 exit(EXIT_SUCCESS);
}

e.g.
[martin@somehost ~/src/wait3]$ uname -a
FreeBSD somehost.somedomain 8.0-RELEASE-p3 FreeBSD 8.0-RELEASE-p3 #0: Tue May 25 20:54:11 UTC 2010     root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  amd64
[martin@somehost ~/src/wait3]$ gcc wait3.c -o wait3
[martin@somehost ~/src/wait3]$ ./wait3 md5 wait3.c
MD5 (wait3.c) = f63266e7c311fe8fa3ef3ea7cd079185
Process accounting information for 6501:
  user time = 0s, 0us
  system time = 0s, 2981us
  maximum resident set size = 0
  integral shared memory size = 0
  integral unshared data size = 0
  integral unshared stack size = 0
  page reclaims = 97
  page faults = 0
  swaps = 0
  block input operations = 0
  block output operations = 0
  messages sent = 0
  messages received = 0
  signals received = 0
  voluntary context switches = 1
  involuntary context switches = 0

Friday, 21 November 2008

HOWTO: Flash Lego Mindstorms NXT firmware in Linux

This is a howto for flashing custom firmware into your Lego Mindstorms NXT brick using a Linux development host, without having to use the official NXT LabVIEW host software, which is available for Windows / Mac only.

Update: The method below uses the Atmel SAM-BA tool for flashing. It is much easier (though less educational =) to use the libNXT utilities to flash your NXT brick from Linux. This uses libusb and automagically finds your NXT when you connect it to your computer.



The following was done using Debian Linux (unstable as of 21 Nov 2008).
Disclaimer: You can easily screw up your NXT brick doing this! You will also lose any saved data on your NXT brick. You use this howto entirely at your own risk.

Note that Linux command-line instructions below are shown with the root prompt as '#' and the user prompt as '$'.
  1. You first need a firmware image for the NXT brick. See references for examples of this.
  2. Download the Atmel SAM-BA flashing tool for linux
  3. I unzipped this to /opt/atmel/sam-ba_cdc_2.8.linux_01 and added it to my path by placing a soft link in /usr/local/bin (# cd /usr/local/bin; ln -s /opt/atmel/sam-ba_cdc_2.8.linux_01/sam-ba_cdc_2.8.linux_01 sam-ba)
  4. Make sure that the sam-ba software has its executable flag set (# chmod +x /opt/atmel/sam-ba_cdc_2.8.linux_01/sam-ba_cdc_2.8.linux_01)
  5. Follow the instructions in the README.linux file: install the tcl / tk dependencies (I installed the packages tk, tcl and tclx8.4 with aptitude)
  6. Boot the NXT brick in SAM-BA download mode first by switching on the NXT brick and then by holding the reset button on the back of the brick for over 4 seconds, as described on p74 of the Lego Mindstorms User Guide supplied with the NXT box set. You will hear the brick repeatedly making a clicking noise when successful. This may take a few attempts, depending on what firmware is residing in your NXT brick.
  7. Connect the NXT brick to your computer via USB and check that it has been correctly recognized using the 'lsusb' command. You can tell if this is successful, as you will see a device with the description 'Atmel Corp. at91sam SAMBA bootloader', not 'Lego Group'. If you see 'Lego Group', you did not successfully boot into SAM-BA download mode. Go back to step 6 if this is the case.
  8. Remove and reinsert the usbserial module with the relevant parameters, as in the readme (# modprobe -r usbserial; modprobe usbserial vendor=0x03eb product=0x6124)
  9. Check that the kernel has seen your NXT device by issuing the command 'dmesg'. You should see a line containing the text "generic converter now attached to ttyUSB0"
  10. Start the sam-ba utility (should be started with just $ sam-ba if you followed the earlier installation steps)
  11. Select the "AT91SAM7S256-EK" board from the dropdown list. The tty will probably be autodetected, but make sure it is the same as the one from the earlier dmesg command.
  12. In the "flash" tab, make sure that the address is 0x100000 (1 + 5 zeros). This is the location of the 256 KB flash memory in the AT91SAM7S256 microcontroller.
  13. Select the firmware file that you want to put into flash in the "Send file name" text box. In my case this was "/var/tmp/nxt-lua-beta-16a/nxt-lua-Beta-16a.rfw". Hit the "Send File" button. If the NXT has stopped clicking, it has turned itself off, and you will have to press the reset button and start over from step 6.
  14. When prompted about unlocking / locking flash sectors, reply "yes" both times. This is just a warning to check you are OK about reflashing the device.
  15. It would be nice to then check the data flashed with the "Compare sent file with memory" button. I was unable to get this to work properly, however.
  16. When you are finished, unplug the NXT brick from USB and press the reset button on the back of the NXT for a short duration (less than 4 seconds). Your custom firmware should now be flashed. You know when you have reset the brick correctly, because it stops clicking. Congratulations!
If you screw up, you should _probably_ be able to reset your NXT with the 4-second reset procedure, as described in step 6. I'm too tired to check the schematics / datasheet to see exactly how foolproof this is.

Further info: