Where's the money, Lebowski: bypassing the page cache

Posted in Uncategorized on March 2, 2013 by voline

Into the wild

I had a bit of a scare today.  I've been rearranging some partitions on my laptop hard drive without a backup (I don't have the extra storage).  Not for the faint of heart!  I would have had much more peace of mind using a partition editor like gparted, which I assume has been fairly well tested.  However, gparted won't move a partition unless it contains one of its supported file systems, and it won't move a partition across another partition, both of which I needed to do.  These would be great additions to the tool!

Using ddpt

I've been using a variant of dd called ddpt to do the low-level copying of data blocks.  ddpt can speak to the drive directly via SCSI, so I guessed I could get better read/write rates (it turns out maybe not that much better).  There were several moves to complete and partitions to juggle.  I was on the last move, which was shifting a 500GB+ partition closer to the beginning of the disk.  Before starting, I verified that the destination block address held the content I expected, so I was sure I was overwriting what I thought I was overwriting.  I then set it off and left, planning to come back in 5 hours when it should have completed.

On overlapping partition moves

Incidentally, this process is a bit scarier because it is an overlapping move (and remember, with dd and friends this can only be a shift to the left, else you corrupt your data).  That is to say, some of the input contents will be overwritten as the copy takes place.  If for some reason the copy is interrupted after some of the input has been overwritten, you've got a huge mess on your hands: if you just restart the copy, you will corrupt your data.  The situation is not hopeless, but it's potentially very time consuming to fix.  Basically, you run a moving window, the size of the distance between the read and write offsets, from the beginning, sector by sector.  If the sectors at the beginning are equal (make sure to check the first several sectors), the source hasn't been touched yet and you can start the move from the beginning as before.  Otherwise, if the sectors at the ends of the window are not equal, the copy has started overwriting the original partition (and thus your partition is currently in a useless state).  Continue moving the window down until the sectors start matching, and while they match keep moving it until they stop matching.  The length of the matching run should be equal to the size of the window (by coincidence there could be some extra matching sectors at the ends of the window that make the run longer, but that doesn't matter; it should never be shorter than the window).  When you reach mismatching sectors again, that is the point from which you may resume the move.
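
To make that concrete, here is a rough sketch of the resume-point search in Python.  It is an illustration rather than a tested recovery tool: the device, offsets, and counts are placeholder assumptions, it reads one sector at a time (in practice you would compare large chunks), and, given the rest of this post, you would also want to be sure these reads are not being answered from the page cache.

import os

DEV = "/dev/sdX"      # assumption: the disk being rearranged
BS = 512              # sector size in bytes
DST = 1000000         # assumption: first sector of the new (destination) location
SRC = 3000000         # assumption: first sector of the old (source) location
SHIFT = SRC - DST     # the window size: how far left the data is moving
TOTAL = 5000000       # assumption: number of sectors in the move

def sector(dev, lba):
    dev.seek(lba * BS)
    return dev.read(BS)

with open(DEV, "rb", buffering=0) as dev:
    run_start = None
    resume = None
    for i in range(TOTAL):
        # Compare the two ends of the window: the destination position vs. the
        # source position SHIFT sectors further along.
        match = sector(dev, DST + i) == sector(dev, SRC + i)
        if match and run_start is None:
            # The matching window starts here.  (A real tool should require a
            # few consecutive matches to rule out coincidences.)
            run_start = i
        elif not match and run_start is not None:
            if run_start > 0:
                resume = i    # the interrupted copy stopped at this offset
            break

if resume is not None:
    print("interrupted mid-overwrite; resume the copy at sector offset", resume)
elif run_start == 0:
    print("only the non-overlapping start was copied; safe to restart from the beginning")
else:
    print("no matching window found; the source looks untouched, restart from the beginning")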

Heart attack!

When I came back to the completed move, I immediately ran some sanity tests before assuming everything went well and potentially making things worse.  The first thing I checked was the first destination block, to make sure it had the filesystem header I expected.  This shouldn't really be necessary… but wait, it hasn't changed!  OK, remain calm with your seats in the upright position.  Don't panic.  Let's see if the first sector of the source has changed… it hasn't!  OK, what's going on here?  My first thought was that if nothing has changed I can just restart the move from the beginning.  Don't be too hasty: why didn't ddpt report any errors and exit?  Instead it ran for the full length and exited normally.  So ddpt thinks everything is good, which means the drive must not have errored, and thus the drive must have executed all those writes successfully.

Caching

Then it hit me: caching!  When reading the blocks, I was not running ddpt in pt mode (i.e. not using the SCSI layer), so I was getting blocks from the kernel's page cache, which might still hold those blocks if I'd recently read them (and I had).  When writing, I was using pt mode, which necessarily bypasses the kernel page cache.  Searching through the ddpt documentation, I found the fua and fua_nv bits.  The description wasn't helpful enough for me to fully understand the implications of using them, but I could tell they might be useful.  Time to dust off the SCSI spec (SBC-3 5.8, table 40) and see what it says.  Since I wasn't completely sure that the volatile cache on the disk was right either, I set FUA=0 and FUA_NV=1 to get the blocks from non-volatile cache or directly from the media.  Lo and behold!  The sectors were as they should be, according to the drive!  OK, but where do we go from here?
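
If you don't want to go through ddpt for this kind of check, another way to see what the kernel would actually fetch from the device is to read with O_DIRECT, which skips the page cache entirely (though, unlike FUA/FUA_NV, it says nothing about the drive's own cache).  A rough sketch, Linux-only and needing Python 3.7+ for os.preadv, with the device and sector as placeholder assumptions:

import mmap, os

DEV = "/dev/sdX"   # assumption: the disk to inspect
LBA = 123456789    # assumption: the sector you want to look at
BS = 512           # logical block size; O_DIRECT wants size and offset aligned to it

# An anonymous mmap gives a page-aligned buffer, which O_DIRECT requires.
buf = mmap.mmap(-1, BS)
fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
try:
    os.preadv(fd, [buf], LBA * BS)   # read one sector straight from the device
    print(buf[:32].hex())            # eyeball the header bytes
finally:
    os.close(fd)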

Dumping the cache (the cops are on our tail!)

After a few quick googles, I found that you can tell Linux to drop the pages in its cache (if they aren't dirty!).  My biggest concern now was: what if it causes the blocks to be written to disk before dropping them?  Then you end up with blood all over your money, which makes it completely unusable.  Now, I wouldn't expect pages to get written back to disk from the cache unless they were dirty (if the kernel thinks nothing has changed, why would it write the same data to disk that's already there?).  Some more looking around led me to /proc/sys/vm/dirty_writeback_centisecs, which says roughly how long a dirty page will stay in the cache before it's written back to disk.  By default this is 5 seconds.  So by the time I was running these sanity-check commands, any dirty block should already have been written to disk.  In fact, since the only thing writing to the disk was not going through the page cache, it should have been a very long time since there was a dirty page destined for the sectors I cared about.
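
For reference, both of those knobs live under /proc/sys/vm, and the drop_caches write is documented as non-destructive: writing "1" only frees clean page-cache pages and never writes anything back, so it can't do the damage I was worried about.  A minimal sketch (needs root):

# Show how long dirty pages linger before writeback (the value is in
# centiseconds, so 500 means 5 seconds), then drop clean page-cache pages.
# Dirty pages are left alone, not written out.
with open("/proc/sys/vm/dirty_writeback_centisecs") as f:
    print("writeback interval (centisecs):", f.read().strip())

with open("/proc/sys/vm/drop_caches", "w") as f:
    f.write("1\n")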

Time to pull the trigger.  Done and done.  After telling Linux to drop the pages in the page cache (no need to have it drop inodes or dentries, since there was no filesystem associated with those blocks), the sectors from the disk return what they should, even when read through the page cache.  Mount read-only and run fsck: everything is fine…  Whew!

Looking back…

I don’t think the disk’s volatile cache could have been inconsistent with the media.  So I shouldn’t have needed the FUA or FUA_NV bits.  Running in pt mode should have been sufficient.

 


Connecting to a local address while using torsocks

Posted in Uncategorized on February 21, 2013 by voline

I recently started using torsocks to ensure that all network traffic from certain programs is routed through the Tor network.  I quickly ran into problems when that program was Firefox, because of my setup.  I prefer to have Firefox proxy all connections through privoxy to remove the ads, and have privoxy proxy through to Tor.  If you're wondering why I'd want to use torsocks when all my privoxy'd connections are already proxied through Tor, it's because Firefox plugins need not respect the proxy settings (I haven't sufficiently verified this, so I may be wrong here).  So plugins such as the Google Voice plugin could allow Google to correlate your Tor browsing session with your real IP.

This kind of setup isn't currently possible with torsocks (version 1.3).  You'll get error messages on stdout saying: "Connection is to a local address (127.0.0.1), may be a TCP DNS request to a local DNS server so have to reject to be safe. Please report a bug to http://code.google.com/p/torsocks/issues/entry if this is preventing a program from working properly with torsocks." (see torsocks bug)  Fair enough: torsocks was built to ensure that traffic doesn't escape Tor.  However, in this case I know that everything going to my local privoxy instance IS going through Tor.  (Note: if you've not configured your privoxy instance to use Tor AND resolve DNS names through Tor, you'll shoot yourself in the foot.)  Really, torsocks only needs to block traffic destined for the standard DNS ports, assuming you know there's no DNS server listening on a non-standard port.  But torsocks blocks traffic to all ports of local addresses, i.e. the local privoxy instance.
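
For completeness, the privoxy side of that note boils down to a single forwarding line in privoxy's config (this assumes Tor is listening on its default SOCKS port, 9050); the socks4a forward makes privoxy hand the hostname to Tor, so DNS resolution also happens over the Tor network:

forward-socks4a / 127.0.0.1:9050 .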

Luckily, there's a patch to do just that, along with Ubuntu builds (see this comment).

hibernate/sleep fails on second attempt

Posted in Uncategorized on January 29, 2013 by voline

Problem

I'd never had problems with suspend/sleep on any machine, so imagine my surprise when one day it suddenly broke, and not only that: hibernate broke as well.  I use TuxOnIce for its superior hibernation support, so I thought perhaps there was a bug in that.  But to my dismay, even the stock kernel had the same issues.

What was strange was that it was completely reproducible.  After a regular boot, I could suspend/hibernate and wake up/resume once, but the second time the machine would lock up with a blank screen.  The only way to get back to a usable system was to do a hard reset or let the power run out (after which, of course, there was no usable hibernation image).

What was so infuriating was that I could get absolutely no information about what the problem was.  I tried various tests using /sys/power/pm_test, but everything checked out fine.  Nothing in syslog indicated any problem.  I did many Google searches and read tons of suspend/hibernation issues and fixes; none of them worked or applied to me.  I even tried to use /sys/power/pm_trace to figure out where in the kernel it was hanging.  All to no avail.

I was quickly running out of leads.  Finally, in a desperate attempt at finding the problem, I wrote some scripts to tell me all the packages I'd installed since the problem started occurring.  I culled the list down to the ones that could conceivably affect the kernel.  The list was very short, but cgroup-bin appeared on it, and much to my surprise the problem went away when I went back to cgroup-lite, the default Ubuntu package.  Now that I knew the culprit, I googled some more and found a pre-existing bug on Launchpad for this issue.
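
Something along these lines will produce that list from dpkg's log.  This is a sketch, not the original scripts: the cutoff date is a placeholder, rotated logs (dpkg.log.1, *.gz) are ignored, and it assumes the standard "date time action package old-version new-version" log format.

from datetime import datetime

SINCE = datetime(2013, 1, 1)   # assumption: roughly when the problem appeared

# /var/log/dpkg.log lines look like:
#   2013-01-15 20:14:05 install cgroup-bin:amd64 <none> <version>
installed = set()
with open("/var/log/dpkg.log") as log:
    for line in log:
        parts = line.split()
        if len(parts) >= 4 and parts[2] == "install":
            when = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S")
            if when >= SINCE:
                installed.add(parts[3])

for pkg in sorted(installed):
    print(pkg)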

Solution (Work around)

Uninstall cgroup-bin

* NOTE: It appears that this issue may now be fixed in Ubuntu, as I now (several months later) have cgroup-bin installed and suspend/hibernation working.

Reference

Work around hard to read digikam tooltips in Ubuntu

Posted in Uncategorized on January 29, 2013 by voline

Problem

For some time I'd been living with a rather annoying problem: the tooltips in digikam on every Ubuntu (GNOME/Unity based) release I've used have been virtually unreadable. Basically, the text is dark grey on a black background. After more research than I wanted to do to solve this issue, here's what I came up with, for the benefit of others (and myself, when I need this again).

Solution (work around)

Ultimately a fix needs to come from KDE, Ubuntu, or both, so this is just a work-around. It should work for similar problems in other KDE apps, but I've not tested that. First make sure digikam (or the program with the unreadable tooltips) is running. Then go to the System Settings application and select the "Appearance" settings. In the "Theme" section, switch to a different theme, then back to the original. Voilà! Your tooltips should now be readable. This will need to be done every time the application is started.  Feel free to post better solutions or figure out how to fix this annoying bug!

Alternative Solutions

These may work if the above does not.  I've tried some of them and they didn't solve the problem for me:

Reference Material

LUKS Full disk encryption with Ubuntu 12.04 using the Ubiquity installer.

Posted in Uncategorized on September 1, 2012 by voline

As noted before, there are plenty of articles on installing Ubuntu with full disk encryption, but they all recommend using the alternate install CD, which does not use Ubuntu's Ubiquity installer. If you want an Ubuntu desktop install exactly as Ubuntu's developers intended, as I did, then read on.  Note: this is written based on the 12.04 LTS installer, but much of the process may work with other versions.

  1. Boot into desktop live cd.
  2. apt-get install lvm2
    • If you don't want to stuff your filesystem inside an LVM container, you may skip this (and the vgcreate/lvcreate steps below)
  3. cryptsetup luksFormat -c twofish-xts-plain64 -s 512 <partition>
  4. cryptsetup luksOpen <partition> <luks dm device name>
    • Keep in mind that the LUKS device name must be the same as the one in the crypttab we create later.  Otherwise update-initramfs will not pick it up (that one bit me hard).
  5. vgcreate <vol group name> <luks dm device path>
  6. lvcreate -n <logical vol name> -l 100%VG <vol group name>
  7. mkfs.<desired fs> <logical volume device path>
    • This is actually an important step.  Currently the Ubiquity installer will not install a filesystem on a raw logical volume.  Without this step, you will later be compelled by the installer to put a partition table on the logical volume, which at best is a waste of space and additional unnecessary complexity.
  8. ubiquity [-b]
    • Note that you cannot use the LUKS device as the boot device when you reach the manual partitioning step below.  In most cases you don't want this anyway, because you run into a chicken-and-egg problem at boot (how do you decrypt the boot loader at boot time when you need the boot loader to decrypt the device?).  I get around that by keeping the chicken (an unencrypted /boot) on a USB device.   However, without the '-b' option, Ubiquity forces me to choose a boot device and errors when I set the boot device to the LUKS device (and there is no other device I want the boot partition on).
    1. Choose manual partitioning.  You should see a line with the logical volume with the filesystem on it.
    2. Edit the logical volume, setting the mount point to '/' and making sure that "Use as" is set to the correct filesystem.
    3. Continue with the installer until it finishes, but do not restart.
  9. Chroot into the newly installed target filesystem and prepare for updating grub.
    1. mount --bind /dev /target/dev
    2. sudo chroot /target
    3. apt-get install lvm2 — again if needed
    4. mount /proc
  10. create /etc/crypttab
    • Remember that the mapping name must match the name currently under /dev/mapper (the one you gave to luksOpen); see the example entry just after this list.
  11. update-initramfs -u
  12. update-grub
  13. boot into your new system.
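
For reference, a minimal /etc/crypttab entry for this setup is a single four-field line (mapping name, source device, key file, options) along these lines, using the same placeholders as above; "none" means you will be prompted for the passphrase at boot:

<luks dm device name> <partition> none luks

Using UUID=<uuid of the encrypted partition> (from blkid) instead of the raw partition path is a bit more robust against device names changing between boots.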

LVM is entirely optional here, but I've included it because I find it a more flexible setup.  There are also similar instructions on Ubuntu's help site that predate this post; however, they make no mention of the work-around above for avoiding a partition table on the LUKS/LVM device, and as such they would not work precisely for me.  This issue may be recent as of 12.04.

Updating selected debian/ubuntu repositories

Posted in Uncategorized on August 6, 2012 by voline

I've been dealing with the pains of a slow connection for some time now, and one of the many annoyances is trying to update a single repository (in my case a Launchpad PPA).  There are over 50 repositories that I subscribe to, which isn't much of a problem in itself, since most of them are updated infrequently and contain only a few packages.  But the official repos are large (over 4MB each) and updated frequently, so I almost always need to update them.  Even though the official and unofficial repos are downloaded in parallel, fetching the official ones consumes a significant percentage of my bandwidth.  Also, I want to keep the previously downloaded lists for the repos not being updated (like the official ones).

After a little googling, I found others trying to solve this problem, but the answers weren't really satisfactory.  The simplest solution was to use an infrequently updated mirror.  While that's definitely something I wanted to do, I still don't want to be forced to update everything whenever that mirror gets updated, even if it's only once every two or three weeks.

One suggestion was to use an Ubuntu repository-management tool to deselect the undesired repos and then run the update.  But this is a pain with over 50 repos, most of which I don't want to update.  Also, by default the lists of unselected repos are deleted, and since most third-party repos depend on packages in the official repos, we don't want the official lists removed.  The latter behavior is controlled by an apt config option, but I like that behavior in general (when I have a fast connection).

The best solution I saw overrode apt's config options to make it use a file containing just the desired repo lines.  The original answer didn't include the option '-o APT::Get::List-Cleanup="0"', without which the lists for the existing repos get removed.  Still, this solution didn't allow for multiple files of repo lines or specifying repos on the command line.
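
In its simplest form that approach looks something like the command below, where /path/to/selected.list is a hypothetical file containing only the repo lines you want refreshed:

sudo apt-get update -o Dir::Etc::sourcelist="/path/to/selected.list" -o Dir::Etc::sourceparts="-" -o APT::Get::List-Cleanup="0"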

Still, I wanted a general tool that could be given a series of files with repo lines and even have repo lines specified on the command line.  Well, that and I needed an excuse to see whether I could do what I wanted with python-apt, the Python binding to libapt-pkg.  So here's what I came up with (remember that python-apt must be installed for it to run properly):

It's slightly annoying that there doesn't seem to be a way to add repos to a SourceList object directly (libapt-pkg seems not to support it either), so I have to write a temp file with all the repo lines in it.
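
A minimal sketch of that approach follows.  It is an illustration rather than the original script: it shells out to apt-get with the overrides above instead of driving python-apt directly, and the argument handling is an assumption.  It collects repo lines from files and the command line, writes them to a temporary sources list, and updates only those (run it as root, since apt-get update needs it).

import argparse
import subprocess
import sys
import tempfile

parser = argparse.ArgumentParser(description="apt-get update for selected repos only")
parser.add_argument("-f", "--file", action="append", default=[],
                    help="file containing sources.list lines to update")
parser.add_argument("lines", nargs="*", help="sources.list lines given on the command line")
args = parser.parse_args()

repo_lines = list(args.lines)
for path in args.file:
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if line and not line.startswith("#"):
                repo_lines.append(line)

if not repo_lines:
    sys.exit("no repository lines given")

# There is no obvious way to hand repo lines to apt in memory, so write them
# to a temporary sources.list and point apt at it for this run only.
with tempfile.NamedTemporaryFile(mode="w", suffix=".list") as tmp:
    tmp.write("\n".join(repo_lines) + "\n")
    tmp.flush()
    subprocess.check_call([
        "apt-get", "update",
        "-o", "Dir::Etc::sourcelist=" + tmp.name,
        "-o", "Dir::Etc::sourceparts=-",    # ignore /etc/apt/sources.list.d for this run
        "-o", "APT::Get::List-Cleanup=0",   # keep the lists of repos we are not updating
    ])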

StarCraft + BWAPI + pybw

Posted in Uncategorized on November 30, 2011 by voline

This post is meant to provide a quick way to get up and running with pybw and BWAPI 3.7, instead of spending a lot of time scouring the net for the prereqs.

Prerequisites:

  1. StarCraft: Brood War and patch — this must be installed and patched to at least 1.16.1.
  2. Visual Studio 2008 Express — you only need to install the C++ portion.  Note, however, that VS 2010 currently _will not_ work. (Not needed if none of the downloads are source that needs to be compiled.)
  3. Python 2.6 — pybw should compile against any 2.6.  Python 2.7 may work, but is untested (the project build will need to be modified to include 2.7 instead of 2.6 headers).
  4. Chaoslauncher.zip — The Chaoslauncher can be a little harder to find since the website hosting the project went down.
  5. BWAPI — If you wish you may compile this from source, but the binaries should work just as well
  6. pybw — Checkout the source. Upstream currently only supports BWAPI up to version 3.2 Beta.  Or download precompiled binaries.
  7. vcredist_x86.exe — This may need to be installed.

Installation

Install StarCraft with the 1.16.1 patch.  Then follow steps 1 – 4 (note the section is misnamed, "Build Instructions") for setting up BWAPI to be injected into StarCraft using the Chaoslauncher.  If compiling pybw from source, follow the instructions in its README file.  Otherwise, you may need to install vcredist_x86.exe, and then you should be able to run pybwClient.exe from the binary distribution to start the example AI.  Keep in mind that an AI cannot be started once you're already in a game.  StarCraft should be started from the Chaoslauncher with BWAPI checked.  It will be obvious that the AI is connected at game start, because there will be text on screen indicating the revision of BWAPI being used (not to mention that the workers should automatically start gathering minerals).